Cp4.1LG12g10040 (gene) Cucurbita pepo (Zucchini)

Name: Cp4.1LG12g10040
Type: gene
Organism: Cucurbita pepo (Zucchini)
Description: A/G-specific adenine DNA glycosylase
Location: Cp4.1LG12 : 9693676 .. 9703087 (-)
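
As a rough check on the coordinates, the location string can be parsed to get the length of the annotated span. The sketch below assumes the coordinates are 1-based and inclusive, which this page does not state; `parse_location` is a hypothetical helper name, not part of any database tool.

```python
import re

def parse_location(text):
    """Split a 'chrom : start .. end (strand)' string into its parts."""
    m = re.fullmatch(r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*", text)
    if m is None:
        raise ValueError(f"unrecognised location: {text!r}")
    chrom, start, end, strand = m.groups()
    return chrom, int(start), int(end), strand

chrom, start, end, strand = parse_location("Cp4.1LG12 : 9693676 .. 9703087 (-)")
# Assumption: 1-based, inclusive coordinates, so the span is end - start + 1.
span = end - start + 1
print(chrom, strand, span)
```

Under that assumption the feature covers 9412 bp on the minus strand of Cp4.1LG12.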
The following sequences are available for this feature:

Gene sequence (with intron)

AGCCCAGCCCAACAACTTCCACCCTGTCTTTTGAAGAAACCCTAAATCTCCCTAATTCATTCTTCCATTTCTCAATCTCCCATTTCTCAATCTTAGATCTCTCTCTCATCTCTCCCGAGCAGAGTCTACGGACCCCGTTCCCCGATTGCTTCCCCAACTCACCAGCTCTCGTTTTCAATCCTTCTCAGATCTCTTCCGGTGGTCCGAGATCTCTCTGTCTCCCAACTCCAGCGGCATCTCTTACGAGCATAATCCCCCGTATTAGCGGCCGGCAGCAGCTCTCTCCCACGAACCCACATCGCGGCGGCTTCTTTCCCTCTCGAGCACCCATCCCCGCCGGTAACTCTCTTTCAAAAGCACCTAAGTTTTATCATAAATTCTTGTGTTGTTGGAAATCATATGAAAACTTCCACCAGACCTGCTCGAAAAACATTTTGTTGTTCTGTTGTTTCTCTTTTAAATTTAATTCAATAACCTTTCCCTGCTTAACCACTCATGGAAACGATGGCTTTTCTTGATTCAATGCATTCCCTCATGTATTTTTCTTTTCCCTTTTCCACAGTATTCTCATTTTATTACGATGGCTGATGGTCATGAAACTGATAAGAACATTGAAATATGGAAAATTAAGAAGTTGATTAAAGCACTTGAAGCTGCAAGAGGCAATGGGACTAGCATGATCTCTCTCATCATGCCTCCGCGCGATCAAATATCTCGTGTTACTAAGATGCTTGGTGATGAATTTGGAACTGCTTCGAACATTAAGAGTAGGGTAAATCGCCAATCTGTTTTGGGCGCCATCACGTCTGCTCAGCAGAGACTCAAGTTGTATAACAAGGTTCCTCCAAATGGGCTTGTGCTATATACTGGAACAATCGTGACTGAAGATGGAAAAGAAAAGAAAGTTACAATTGATTTTGAGCCTTTCAGACCTATAAATGCTTCTCTCTATCTCTGCGACAACAAGTTCCATACGGAAGCCCTGAATGAACTTCTAGAATCTGATGACAAGTTTGGCTTCATTGTCATGGATGGTAATGGAACACTTTTTGGGACATTGAGTGGTAACACACGTGAAGTCCTTCACAAATTTAGCGTCGACCTTCCTAAGAAACATGGAAGAGGAGGTCAGTCAGCACTTAGGTTTGCCCGTCTTCGGATGGAGAAACGGCACAACTATGTCAGGAAAACAGCAGAGCTTGCAACCCAGTTCTATATTAATCCGGCCACTAGTCAACCCAATGTTGCAGGATTGATACTGGCTGGATCAGCCGACTTCAAAACAGAGCTCAGCCAGTCCGACATGTTTGATCCTCGTCTTCAGGCTAAAATACTTAATGTGGTTGATGTCTCTTATGGAGGGGAAAATGGATTTAATCAGGCTATTGAATTGTCGTCCGAGATCTTGTCTAATGTAAAATTTATACAGGAGAAGCGTTTGATTGGAAAATACTTTGAAGAGATTAGCCAGGACACGGGGAAATATGTTTTTGGTGTTGACGACACACTGAAAGCTCTGGAGATGGGTGCTGTTGAGATACTCATTGTTTGGGAAAATTTGGATATCAATAGGTACGTATTAAAGAATGTTTCCACTGGTGAGGTTATTATAAAGCACTTGAATAAGGAACAGGAAGCCAATCAGAGCAACTTCCGTGACCCCATCACCGCTGCTGAATTGGAGGTTCAAGAAAAAATGGCCTTGCTGGAATGGTTTGCAAACGAGTACAAGAAGTTTGGTTGTACCCTGGAATTCGTTACGAACAAATCACAGGAAGGATCACAATTCTGCAGAGGTTTTGGTGGTATTGGAGGAATTCTTCGTTACCAGCTTGACATAAGATCGTTTGATGAACTATCCGATGGGGAAGAGTATGGTGATTCTGAATAGCAATCAATCAATCACTCCTCGTTGCTGGTGCAGATGGGGGATAAAGAGCTTGCACCAGCGCCCAGGGTTCTCGGAGGTTAGTTTTGGATTGAACTTTGTGAAGGCTTTA
TATAATGAACTTTGTCTTATGGTTACATGCTCAATAATCTACCTTTTGTACTCGTAGGGGTTATGTCCTTTTGATCGTGATTTGGCGCCTATCTGATAGATCTTCAAGGATGCGTGCTCCCATCTATAGCCACAGTGCTCTGCCCTTGTTCTCGTGGAATGGTTGCTTCCTACAGCAGATACTCAGTTACAAAAATATTTCAAGTATTACAATAATATTCAATGCCGTATGACCTTAAATTTTTGAGTTTGTGTCGTTTTGACTATTTATCGTTGCTGATAAACCATGTCAGTTGGATTTGAATTCATATGCATGTTACAGTGTTTTAAGGTTGTTTTGCTGGGTATGGTTTTTGGATATCTAAAAACATTGTGTTTGGAATTAATAGGAAACGGCATCAGGAACTAGGCTACTTTTTTAAAGATGCATTTAAAGCGTTATTATACTTAGCCCAAGTGTGTTAGGTATTCCTTTAGGAAGTGCTATTGGAAAAGTTTGTTTTAGTTTGACGCATGGATGTGGTTTTGAACCTTTGCCCATGTCGGGTGATAATTTGTTTTGATGAAGTGCTTTTACCAAAGTTATTTACTTTTGGAAGTGCTTATAAATAACAGTCCAAGTGTGACAGTTTGGTGTGACTTTAGAAGTGCATATGGGTGATTTGGAGTACCCTCGTAAAAGTGTATTTAAAATGTTTTAAAGAGTATTTATATCTTATGGAAGTGTTTCAAATTTGCACACCTTTTTTTTAAATCTTTTTTAAATCTTCTCAAAAGGTTTCGAGAAGGACAGCATTGATGAATGAATGGCAATGGAAGAAAACTTCGTTAGCTTATGAAACAGCAATCTACTTGCATTACAAAAACAAGAACTAACTTCGAGATAAAATCTCGAATGGCTGTAAATTCATCCTTTGATTGTTCTTTTAAATCTTTGGATATCATCTTTTTGTCTCTACTATTAGACATTGTCTTCTCATCTATATCTTTTGAGGTATTATCTTTGAAAAGATTCAAATTTGCAGGCAACTGAGCGTCGAGAATTTTTTTTTTTTTTTTTTTTTTAAGAAAAGGAAGCACAACTGTTCTAATCTCGCAACCTCTAGAGTTTAATGGCCGCCTGTAACATGAAGACTTTGGTTAATTGATCACACAACGTAATAATAGGAAGGAATTGACATAATAAACCAATATATTAGGAATGTTCCCATTAATTAAGCATTGAAATCTTAGAACCCCCTAAAATAGATTATTTGATTTACTTTGTTAATAAAATACTACTATTACTAAATTTAACTTCACTTCATACGGTAATTATAGAATGTAGAGACCTTAATACCTATAATCTGCTCCGCAATGACTTCAACTTGTGAATGACGTACAATCTATAGCCTAAGATCTCAGATCTATCATTCATGTAGAAAGAAGTCTTTAAGTACGTTGCCAGCAGATAAAGATGTATGCAGCTTACAGGCGGGGAAGAACATAATTCAAATTTCAGACTACAACAAAACTAACCCAAGAGGAGAAATGGTATGGCCAAGTTATGGATACATCAGAGTAAGGTAAACGAGAAGAAATGAGTTAAGAAATGCATGCCCAAGAGAGTGAAAAAAAGTAAAGAATTGAGAACAGATAGACAACCTCGCAGGTCCAGTTCTCGCTCCTTAATGGCATTGAAGAAATGAGGGCTTTTCCAGATTAAATCCACTCTGGGTCTCACCATGTTCTTCTGACTTGAATTCACTGGACGACGATGGGGTTTCGGGGTTTAAGGGTTTAGGGCGTCGACGACAAACGACAGCAAGCAGCGCTGTGTGGATCGGAACACGAGACGTCCACATTTCGGACCGGACCAACCCCCTTAATGGACTTTCAGACGACCCAAATGAATTCACATTTGGGTCTCTCCATCTATGTGGCTTGTGTATTAAAGTTCTAAACTTTTTAAGAAATTAGACTGAATTCACATTTGGGTTAATTCATTATTCATTTTGTTAG
TGAAAACTAATCGAGTATTAATCTCGTCAAACTGGGATTTCCCGCCCAAATCAGGTTAGGTCGCAGTGTTTACCTTTTCTAACCCAGCTACACTAATAATGGACTGGGTCCTCGTACATCGCTACACATATCTGCCCTACTCGTTTCTTTTCTCTTCTTCTCTCCTTGGCTGGCAACTTTCACCTCGTGGGTAATCATTGAAGAGTCATCAGGGTAGTGGGAGGGGTTGTAACAGCATGAGCGGCGGAGAAAAGAACGAGAACGAGGAGGATGTGAAAAAGAAACCAACGAAGGGGGAAAAGCGCCGGGGCCGAAGTCCGTCCAAAAGGGAACCAATCGCTGACATTGAAGATATTATGTTCAGCATAGATAAAGTTCAGACAATGAGGTCATCGCTATTGGATTGGTACGACCTTAGCCATAGGGACCTTCCTTGGAGGAGGTTGGACAAAGGGCAGCCTCAAACACGGGGTTATGGTGTGTGGGTATCAGAAATAATGCTGCAGCAGACGAGAGTTCAGACCGTCGTTGAATATTACAAGCGTTGGATGCATAAATGGCCCACCGTTCAACATCTCTCTCGTGCTTCTCTCGAGGTTGGTTTTATGATTTAGTTTGAGCTACAATATTGTCTTCTTTACTCTCAGTAAGCAGTTTAGATCTCTGGCAGGAGGTGAATGAAATGTGGGCAGGCTTGGGATACTATAGACGAGCTCGTTTTCTTTTAGAGGTAATCTTTACTCGCTGCACTAATGGAATTTTTGGTCTCTTAATATCTGCCACCACGATCACTCCTTCCAATTTCCTTGGGACATGGTTTAGGTTGGCATTTACTTTTAAATATTAGCTTCGTTTGTTAAGAGGGCAGCTTTTTGTTTTGAATCTTTCTCAAGATAGAGGTAGGAGAGGTGGGAAACTTTTCTACCACAACTGGTTGATTGCTGGGAAAACACACACTGGTTTGCTTTACTATAGCTTATATCACTGTCTCAATCTCCATCTACTCTTTCATGACTTATTACGGCCCTGTCAGCCTAGGCGAAGATGAGGATAAGTAATAAAGAGTTGGACGGTTCAGCGACAACTGTTGGGACTCATATGGGTCTGCTATCCTACCATTTTAACATTAGAATAGTATTTAAGAAAGGTGATGTACAAGATAACCCTAGTATTCAGGGTATTGAAATAACAACCTAACTTGTGGTCTCCCTTCCAACAAGCTTCTCAAACTCCCTTACACCCTATAAGCTCCTTCCTCAAAGACATACTCAGAAAACCCCATGCTCGCTCTTGTTTCCTAAGGCTTCATGATTCTAGGCTTTATAAGACTCCCTTGATGGAGGGAGATTTACTCTTTTATCAAGTGTACTGATTTATTATATGTCTCCTTTTGCTTTGCCTGTGTCGATAAGGAATTTGAAAAGACCATGAGGCACTTTTTTCTCTTGGGTCTCTTTTTGGAAGGTTCCTTGGATTGCAAATCAGTTATTGGCCAATTGATTTCTCATATTATATGAACTTTGATATCAGAAAGATATGTTAGCAGTGAGTCAGTGACATGTCTGGGACAGTTGCTAAGATCATCGAAACTTGAAAGTTTGTATTAAAAGATGTTTAAAAGATGAGGAGTTAACTGGATGTTGAAGCTTTCTTGAGTTATCACGTGGAAAAGTTCTATCTAGAGAGGAAGATTGTAGAAACTAGATATTTCAGGATGAGACTCCGTCAAATCTTTGCTAAAACATTTAGTGGGAGCTGCTGAAATCTTGAAATTTGAAGAAGGTGAATATGTTGTCATGGTGTTTATGGAGAGGGTAGAGCGATCTTGAGGATAGAATGAGAGCGATCCTGTACATGAATGAAACTGGACCACATTCGAATGAGAGCAATCCTGAGGATAGGATGGTTAAAATCGGTTTAAAAGAATTGAAAGTTACTTACCTATACCAACAAGGTGCACTTTCTTTTTCTTTTTGGTGGCTCAATCATAGGAACTCCGA
AGTTAGCAAGCTTTTCTTAGAGCAGTTCTATGTCAGGTGACCTCCTCATAATTTTCCTAGGATGCATGTGAGTGATAACAAAGCATGCTGAAAGGTCCCGTATTGGTCTGTAGGGATAGTCTTCACTCTTAGAAGCAATAAGTAAGCAACATACCCATGTTGTAGGAGTGCAGAGTAATGTCGAGGCACATAGGCGTTACACTTCGACTGTAATGCACACCATATGCATACATGCTAGTGGGTTTTCTTTTCTACTTGTATATTCTTCAGTAAATTGACTGCATGTTACTTATCAAAATATTACAAAATAAAGTACAAAAAGAACTGAAATGAAAAATATATATATATATATCAACATCTCTGGATATTAATCATTGCTATGAAATTTTGAATTACCCTTTCTAATTTTGTTGTTAGGGTGCAAAGTTGATAGTCAAAGAAGGTGGCGAATTTCCTAAAACGGTTTCTGCCCTTCGAAAAATTCCTGGAATTGGAGAGTATACAGCAGGGGCTATTGCCTCCATAGCATTTGATGAAGTGAGTGTTTTTCTCGCCTATTTTTTTCTCTTACTCGTTTTAAAAACCTTGAGGGAAAGTCTAAAGAGGACAATATCTGCTAGCGGTGGGTCTGGGCCGTTACAATGCACTTCTAAATATGTTTCCTGAGCAGGTGGTGCCTGTGGTTGATGGTAATGTGATTCGGGTAATCGCTCGATTAAAGGCTATTTCAGGAAATCCGAAAGACTCAAAGTTGGTTAAGCAAGTTTGGTGAGCATACTTTCTAGTTGTTCTAAAGGCTTTTATAGAATCTTTTTATCCTAATTGTTCTCATTGTTTTTTTTGGGGTGGGGGGTGGATGAGTTAGGAAGGCAGCTGCTCAATTAGTTGATCCTTCCAGGCCTGGGGACTTCAATCAGGCACTCATGGAACTTGGTGCAACTTTATGCACACCAACAAGCCCAAGCTGCTCAACATGCCCTGTGTTTGATCACTGTGAGGCCCTTTCAATCTCAAAGGATGATAGTTCAGTTCTTGTCACAGATTATCCCGCTAAGGGTATAAAGACCAAACAAAGACATGATTACTCTGCTGTTTGTGTGGTTGAGATATTGGAAAATCAGGGCTCATCTGAGTTGAAGCAATCCAGTAGATTTCTTCTTGTAAAGAGGCCTGATGAAGGCTTGCTTGCTGGCCTATGGGAGTTCCCATCTGTCTTGTTGAAGGGAGAAGCTGATTCAAGTACAAGGAGAGAATCCATGAACAGCCTCTTGAGTAAATCCTTTGGACTTGAACCAAAAAAGAATTTTGATATCGTTATTAGAGAAGATGTTGGAGATTTTATCCATGTTTTCTCGCACATCCGTCTCAAGATATATGTTGAACACTTGGTGTTACGTTTAAAAGGTTCGAGTTACTTCTCCTTTCTTGCCTATCTGTGCATTGAGTATGCCATGAACTCTGTGACTAGAAACTTGCTATAAAACTTAACCTATATGAGATTTACTTATCTTATAGGCTCTCAAACATTTTGATGTAGCTATAGTTGTCCTTTATGTTCTTTTGTGTGACACCCCACATCGGTTGGAGAGGGGAACGAAACATTCCTTATAAGGGTGTGGAAACCTCTTCCTAGTAGACGTGTGTTAAAACCTTGAGGGAAAGCCTAAAAGAGGACAATATCTAGTAGCGGTGGGCTTGGATTGTTACAAATGGTATCAGAGGCAGACATTGGGCAGTGTGCCAATGGGGACGTTGGGCCCTAAGGAGAGTGGATTGTGAGATTCCAATTCTTATAAGGGTGTGAAAACCTCTCCCTAATAGACATGTGTTATTTCTTTTCATTTCAATTAATTTGCTTAATCTGTTTGTATGTGTTCTTGAAGGTGAAGGTAGCAAGTTGTTTCGGAAACAGGAGAAGAAATCTATATCATGGAAATGTGTGGACAACAAGGTTATGTCCAGCGTGGGGTTGACGTCCAGTGTGAGGAAGGTAAGCAAAG
ATGGTGCTTTAGATGACTTCTCCTTGTGAAATATTTAACTTGAAGGTTCTATTTTCATCGTTAAACTTGAAAGAAAACTCTTTTTAGATTGCATGATTGCAGACATTTAGTGTAACACCCAAACTCACGGCTAGCAAATATTGTCATTTTTAGGTTTTCCTTTAAGGACTTTCCTTTAAGGTTTTAAGACGCCTCTACTAGAGAGAGGTTTCCACGCCCTTATAAGAAATTCTTCGCTCCCCCCTTCAACCGATGTGAGATCTCACAATCCACCTCCCTTACGGGCCAGTGTCCCCGATGGCACACCACTCGGTGTCTGGCTCTAATATCATTTGTAACAGCTCAAGCCCACCACTAGCAGATATTGCCCACTTTGGCATGTTACGTATCATCGTCAGCCTCACGGTTTTAAAATTTGTCTACTATGGAAAGGGTCTAGCCTTACTCCGACTAGTGCCTTGCACAGTTTGGTGACTGGCTCTGATACCATTTGTAATAGCTCAAGGCCCACCGTTAGCAGATATTGTCCTGTTTAGGCTTTCCCTTCAAGGGTTTCCCCTCAAAGTTTAAAAACGCGTCTACTAGGGAGAGATTTTCACACCCTTACAAGGAATGCTTCGTTCTCCTCTCAAACCAATGTGAGATCTCACATTTAGTATTCCTTCGAACAAACAAGTCAATACTCATATAGTCTCGATGATGACTGGTATGAACCTTAACAACACTTGAGATTCAAAATCATATATACGTTGGATTAAGTGACATTGAGAAAATCAGTAAACAGGGACTTCTCTGATCACATGATCACTCTCTTTGAACCCCCTCGTGCATGCATATTTATTTATGGATTGCATAGAGATTTGATATTTGTAATCCAAGTTCGATGGACGTCAAAATTCACAAACACAGTATCCTTTTACAGTCCATGATATCATAATGAGTATGTAATTATGTATGATCAGGACATCGAGGGAGGTCGATGTACGTTTCGTTTTCTCACTGTATTTATATAGAATTAGTGTTACTAAGCATGATTATACTCAAGAGGTCCAGATGATTAATAATCTGACATCTTTTTGATAGGTGTATGCCATGGTGGAGAAATTTGAGGCAGATAAGATATCTCCCATCCGTGCAGTAGCCACAAAAAAACAGAGAGCTACTTCAACAAACTTGAGCTGCAGGAGCTGTTGACCTTAATCAAAGTAAGCTAATATATCCATTAGAAAAATAGTAAACTATCGGGGTCGAACCGACTGAACCATCGGTGTTTTGTTGGTTTAGTGAACCAATGATAGTATGACTAAAAAAAATTATTTTTTTTTCTTCGGTGACTCGTATATAAAATTAAAAATTTAAAGAGAATTGTTAAATATTTGTAAAGGGATAAAAGAAAAAACGTGAATCACGAT

mRNA sequence

AGCCCAGCCCAACAACTTCCACCCTGTCTTTTGAAGAAACCCTAAATCTCCCTAATTCATTCTTCCATTTCTCAATCTCCCATTTCTCAATCTTAGATCTCTCTCTCATCTCTCCCGAGCAGAGTCTACGGACCCCGTTCCCCGATTGCTTCCCCAACTCACCAGCTCTCGTTTTCAATCCTTCTCAGATCTCTTCCGGTGGTCCGAGATCTCTCTGTCTCCCAACTCCAGCGGCATCTCTTACGAGCATAATCCCCCGTATTAGCGGCCGGCAGCAGCTCTCTCCCACGAACCCACATCGCGGCGGCTTCTTTCCCTCTCGAGCACCCATCCCCGCCGTATTCTCATTTTATTACGATGGCTGATGGTCATGAAACTGATAAGAACATTGAAATATGGAAAATTAAGAAGTTGATTAAAGCACTTGAAGCTGCAAGAGGCAATGGGACTAGCATGATCTCTCTCATCATGCCTCCGCGCGATCAAATATCTCGTGTTACTAAGATGCTTGGTGATGAATTTGGAACTGCTTCGAACATTAAGAGTAGGGTAAATCGCCAATCTGTTTTGGGCGCCATCACGTCTGCTCAGCAGAGACTCAAGTTGTATAACAAGGTTCCTCCAAATGGGCTTGTGCTATATACTGGAACAATCGTGACTGAAGATGGAAAAGAAAAGAAAGTTACAATTGATTTTGAGCCTTTCAGACCTATAAATGCTTCTCTCTATCTCTGCGACAACAAGTTCCATACGGAAGCCCTGAATGAACTTCTAGAATCTGATGACAAGTTTGGCTTCATTGTCATGGATGGTAATGGAACACTTTTTGGGACATTGAGTGGTAACACACGTGAAGTCCTTCACAAATTTAGCGTCGACCTTCCTAAGAAACATGGAAGAGGAGGTCAGTCAGCACTTAGGTTTGCCCGTCTTCGGATGGAGAAACGGCACAACTATGTCAGGAAAACAGCAGAGCTTGCAACCCAGTTCTATATTAATCCGGCCACTAGTCAACCCAATGTTGCAGGATTGATACTGGCTGGATCAGCCGACTTCAAAACAGAGCTCAGCCAGTCCGACATGTTTGATCCTCGTCTTCAGGCTAAAATACTTAATGTGGTTGATGTCTCTTATGGAGGGGAAAATGGATTTAATCAGGCTATTGAATTGTCGTCCGAGATCTTGTCTAATGTAAAATTTATACAGGAGAAGCGTTTGATTGGAAAATACTTTGAAGAGATTAGCCAGGACACGGGGAAATATGTTTTTGGTGTTGACGACACACTGAAAGCTCTGGAGATGGGTGCTGTTGAGATACTCATTGTTTGGGAAAATTTGGATATCAATAGGTACGTATTAAAGAATGTTTCCACTGGTGAGGTTATTATAAAGCACTTGAATAAGGAACAGGAAGCCAATCAGAGCAACTTCCGTGACCCCATCACCGCTGCTGAATTGGAGGTTCAAGAAAAAATGGCCTTGCTGGAATGGTTTGCAAACGAGTACAAGAAGTTTGGTTGTACCCTGGAATTCGTTACGAACAAATCACAGGAAGGATCACAATTCTGCAGAGGTTTTGGTGGTATTGGAGGAATTCTTCGTTACCAGCTTGACATAAGATCGTTTGATGAACTATCCGATGGGGAAGAGTATGGTGATTCTGAATAGCAATCAATCAATCACTCCTCGTTGCTGGTGCAGATGGGGGATAAAGAGCTTGCACCAGCGCCCAGGGTTCTCGGAGGGGTTATGTCCTTTTGATCGTGATTTGGCGCCTATCTGATAGATCTTCAAGGATGCGTGCTCCCATCTATAGCCACAGTGCTCTGCCCTTGTTCTCGTGGAATGAAAAGGAAGCACAACTGTTCTAATCTCGCAACCTCTAGAGTTTAATGGCCGCCTGTAACATGAAGACTTTGGTTAATTGATCACACAACGTAATAATAGGAAGGAATTGACATAATAAACCAATATATTAGGAATGTTCCCATTAATTAAGC
ATTGAAATCTTAGAACCCCCTAAAATAGATTATTTGATTTACTTTGTTAATAAAATACTACTATTACTAAATTTAACTTCACTTCATACGGTAATTATAGAATGTAGAGACCTTAATACCTATAATCTGCTCCGCAATGACTTCAACTTGTGAATGACGTACAATCTATAGCCTAAGATCTCAGATCTATCATTCATGTAGAAAGAAGTCTTTAAGTACGTTGCCAGCAGATAAAGATGTATGCAGCTTACAGGCGGGGAAGAACATAATTCAAATTTCAGACTACAACAAAACTAACCCAAGAGGAGAAATGGTATGGCCAAGTTATGGATACATCAGAGTAAGGTAAACGAGAAGAAATGAGTTAAGAAATGCATGCCCAAGAGAGTGAAAAAAAGTAAAGAATTGAGAACAGATAGACAACCTCGCAGGTCCAGTTCTCGCTCCTTAATGGCATTGAAGAAATGAGGGCTTTTCCAGATTAAATCCACTCTGGGTCTCACCATGTTCTTCTGACTTGAATTCACTGGACGACGATGGGGTTTCGGGGTTTAAGGGTTTAGGGCGTCGACGACAAACGACAGCAAGCAGCGCTGTGTGGATCGGAACACGAGACGTCCACATTTCGGACCGGACCAACCCCCTTAATGGACTTTCAGACGACCCAAATGAATTCACATTTGGGTCTCTCCATCTATGTGGCTTGTGTATTAAAGTTCTAAACTTTTTAAGAAATTAGACTGAATTCACATTTGGGTTAATTCATTATTCATTTTGTTAGTGAAAACTAATCGAGTATTAATCTCGTCAAACTGGGATTTCCCGCCCAAATCAGGTTAGGTCGCAGTGTTTACCTTTTCTAACCCAGCTACACTAATAATGGACTGGGTCCTCGTACATCGCTACACATATCTGCCCTACTCGTTTCTTTTCTCTTCTTCTCTCCTTGGCTGGCAACTTTCACCTCGTGGGTAATCATTGAAGAGTCATCAGGGTAGTGGGAGGGGTTGTAACAGCATGAGCGGCGGAGAAAAGAACGAGAACGAGGAGGATGTGAAAAAGAAACCAACGAAGGGGGAAAAGCGCCGGGGCCGAAGTCCGTCCAAAAGGGAACCAATCGCTGACATTGAAGATATTATGTTCAGCATAGATAAAGTTCAGACAATGAGGTCATCGCTATTGGATTGGTACGACCTTAGCCATAGGGACCTTCCTTGGAGGAGGTTGGACAAAGGGCAGCCTCAAACACGGGGTTATGGTGTGTGGGTATCAGAAATAATGCTGCAGCAGACGAGAGTTCAGACCGTCGTTGAATATTACAAGCGTTGGATGCATAAATGGCCCACCGTTCAACATCTCTCTCGTGCTTCTCTCGAGGAGGTGAATGAAATGTGGGCAGGCTTGGGATACTATAGACGAGCTCGTTTTCTTTTAGAGGGTGCAAAGTTGATAGTCAAAGAAGGTGGCGAATTTCCTAAAACGGTTTCTGCCCTTCGAAAAATTCCTGGAATTGGAGAGAAGGCAGCTGCTCAATTAGTTGATCCTTCCAGGCCTGGGGACTTCAATCAGGCACTCATGGAACTTGGTGCAACTTTATGCACACCAACAAGCCCAAGCTGCTCAACATGCCCTGTGTTTGATCACTGTGAGGCCCTTTCAATCTCAAAGGATGATAGTTCAGTTCTTGTCACAGATTATCCCGCTAAGGGTATAAAGACCAAACAAAGACATGATTACTCTGCTGTTTGTGTGGTTGAGATATTGGAAAATCAGGGCTCATCTGAGTTGAAGCAATCCAGTAGATTTCTTCTTGTAAAGAGGCCTGATGAAGGCTTGCTTGCTGGCCTATGGGAGTTCCCATCTGTCTTGTTGAAGGGAGAAGCTGATTCAAGTACAAGGAGAGAATCCATGAACAGCCTCTTGAGTAAATCCTTTGGACTTGAACCAAAAAAGAATTTTGATATCGTTATTAGAGAAGATGTTGGAGATTTTATCCATGT
TTTCTCGCACATCCGTCTCAAGATATATGTTGAACACTTGGTGTTACGTTTAAAAGGTGAAGGTAGCAAGTTGTTTCGGAAACAGGAGAAGAAATCTATATCATGGAAATGTGTGGACAACAAGGTTATGTCCAGCGTGGGGTTGACGTCCAGTGTGAGGAAGGTGTATGCCATGGTGGAGAAATTTGAGGCAGATAAGATATCTCCCATCCGTGCAGTAGCCACAAAAAAACAGAGAGCTACTTCAACAAACTTGAGCTGCAGGAGCTGTTGACCTTAATCAAAGTAAGCTAATATATCCATTAGAAAAATAGTAAACTATCGGGGTCGAACCGACTGAACCATCGGTGTTTTGTTGGTTTAGTGAACCAATGATAGTATGACTAAAAAAAATTATTTTTTTTTCTTCGGTGACTCGTATATAAAATTAAAAATTTAAAGAGAATTGTTAAATATTTGTAAAGGGATAAAAGAAAAAACGTGAATCACGAT

Coding sequence (CDS)

ATGAGCGGCGGAGAAAAGAACGAGAACGAGGAGGATGTGAAAAAGAAACCAACGAAGGGGGAAAAGCGCCGGGGCCGAAGTCCGTCCAAAAGGGAACCAATCGCTGACATTGAAGATATTATGTTCAGCATAGATAAAGTTCAGACAATGAGGTCATCGCTATTGGATTGGTACGACCTTAGCCATAGGGACCTTCCTTGGAGGAGGTTGGACAAAGGGCAGCCTCAAACACGGGGTTATGGTGTGTGGGTATCAGAAATAATGCTGCAGCAGACGAGAGTTCAGACCGTCGTTGAATATTACAAGCGTTGGATGCATAAATGGCCCACCGTTCAACATCTCTCTCGTGCTTCTCTCGAGGAGGTGAATGAAATGTGGGCAGGCTTGGGATACTATAGACGAGCTCGTTTTCTTTTAGAGGGTGCAAAGTTGATAGTCAAAGAAGGTGGCGAATTTCCTAAAACGGTTTCTGCCCTTCGAAAAATTCCTGGAATTGGAGAGAAGGCAGCTGCTCAATTAGTTGATCCTTCCAGGCCTGGGGACTTCAATCAGGCACTCATGGAACTTGGTGCAACTTTATGCACACCAACAAGCCCAAGCTGCTCAACATGCCCTGTGTTTGATCACTGTGAGGCCCTTTCAATCTCAAAGGATGATAGTTCAGTTCTTGTCACAGATTATCCCGCTAAGGGTATAAAGACCAAACAAAGACATGATTACTCTGCTGTTTGTGTGGTTGAGATATTGGAAAATCAGGGCTCATCTGAGTTGAAGCAATCCAGTAGATTTCTTCTTGTAAAGAGGCCTGATGAAGGCTTGCTTGCTGGCCTATGGGAGTTCCCATCTGTCTTGTTGAAGGGAGAAGCTGATTCAAGTACAAGGAGAGAATCCATGAACAGCCTCTTGAGTAAATCCTTTGGACTTGAACCAAAAAAGAATTTTGATATCGTTATTAGAGAAGATGTTGGAGATTTTATCCATGTTTTCTCGCACATCCGTCTCAAGATATATGTTGAACACTTGGTGTTACGTTTAAAAGGTGAAGGTAGCAAGTTGTTTCGGAAACAGGAGAAGAAATCTATATCATGGAAATGTGTGGACAACAAGGTTATGTCCAGCGTGGGGTTGACGTCCAGTGTGAGGAAGGTGTATGCCATGGTGGAGAAATTTGAGGCAGATAAGATATCTCCCATCCGTGCAGTAGCCACAAAAAAACAGAGAGCTACTTCAACAAACTTGAGCTGCAGGAGCTGTTGA

Protein sequence

MSGGEKNENEEDVKKKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDLSHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLEEVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGEKAAAQLVDPSRPGDFNQALMELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSVGLTSSVRKVYAMVEKFEADKISPIRAVATKKQRATSTNLSCRSC
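
The protein sequence above appears to be the CDS translated with the standard genetic code. The sketch below is one way to check this on short excerpts; it assumes the standard code (NCBI translation table 1), and `translate`, `cds_start` and `cds_end` are illustrative names. The two excerpts are the first 39 and last 15 nucleotides of the CDS listed above.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(cds):
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[pos:pos + 3].upper()]
        if aa == "*":  # stop codon: end of the protein
            break
        protein.append(aa)
    return "".join(protein)

cds_start = "ATGAGCGGCGGAGAAAAGAACGAGAACGAGGAGGATGTG"  # first 13 codons of the CDS
cds_end = "TGCAGGAGCTGTTGA"                            # last 5 codons, including TGA
print(translate(cds_start))
print(translate(cds_end))
```

The two excerpts translate to the first 13 residues (MSGGEKNENEEDV) and the last four residues (CRSC) of the listed protein. Passing the full CDS string to `translate` would allow the same comparison over the whole sequence.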
BLAST of Cp4.1LG12g10040 vs. Swiss-Prot
Match: MUTYH_ARATH (Adenine DNA glycosylase OS=Arabidopsis thaliana GN=MYH PE=3 SV=1)

HSP 1 Score: 354.4 bits (908), Expect = 1.7e-96
Identity = 206/448 (45.98%), Positives = 279/448 (62.28%), Query Frame = 1

Query: 5   EKNENEEDVKKKPTKGEKRRGRSPSKREPIA-DIEDIMFSIDKVQTMRSSLLDWYDLSHR 64
           +K E EE+ +++  + E+    + ++ E +  DIED+ FS ++ Q +R  LLDWYD++ R
Sbjct: 89  DKEEAEEESEEEEEEEEEE---AEAEEEALGGDIEDL-FSENETQKIRMGLLDWYDVNKR 148

Query: 65  DLPWR-RLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLE-- 124
           DLPWR R  + + + R Y VWVSEIMLQQTRVQTV++YYKRWM KWPT+  L +ASLE  
Sbjct: 149 DLPWRNRRSESEKERRAYEVWVSEIMLQQTRVQTVMKYYKRWMQKWPTIYDLGQASLENL 208

Query: 125 -----------------EVNEMWAGLGYYRRAR---------------FLLEGAKLIVKE 184
                            EVNEMWAGLGYYRRAR               F  + + L+  +
Sbjct: 209 IVSRSRELSFLRGNEKKEVNEMWAGLGYYRRARFLLEGAKMVVAGTEGFPNQASSLMKVK 268

Query: 185 G-GEFP------------------KTVSALRKIPGIGE------------KAAAQLVDPS 244
           G G++                     +  L ++  I              K AAQLVDPS
Sbjct: 269 GIGQYTAGAIASIAFNEAVPVVDGNVIRVLARLKAISANPKDRLTARNFWKLAAQLVDPS 328

Query: 245 RPGDFNQALMELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQR 304
           RPGDFNQ+LMELGATLCT + PSCS+CPV   C A S+S+++ ++ VTDYP K IK K R
Sbjct: 329 RPGDFNQSLMELGATLCTVSKPSCSSCPVSSQCRAFSLSEENRTISVTDYPTKVIKAKPR 388

Query: 305 HDYSAVCVVEILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRES 364
           HD+  VCV+EI         +   RF+LVKRP++GLLAGLWEFPSV+L  EADS+TRR +
Sbjct: 389 HDFCCVCVLEI---HNLERNQSGGRFVLVKRPEQGLLAGLWEFPSVILNEEADSATRRNA 448

Query: 365 MNSLLSKS--FGLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRK 384
           +N  L ++  F +E KK   IV RE++G+F+H+F+HIR K+YVE LV++L G    LF+ 
Sbjct: 449 INVYLKEAFRFHVELKKACTIVSREELGEFVHIFTHIRRKVYVELLVVQLTGGTEDLFKG 508
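
The percentages in an HSP header are the identical and positive-scoring positions divided by the alignment length, gaps included. The sketch below recomputes them from the header text; `hsp_percentages` is a hypothetical helper, and it assumes the first two `count/length` pairs on the line are identities and positives, in that order.

```python
import re

def hsp_percentages(header):
    """Return (identity %, positives %) recomputed from an HSP header line."""
    pairs = re.findall(r"(\d+)/(\d+)", header)
    if len(pairs) < 2:
        raise ValueError("expected two count/length pairs")
    (ident, length), (pos, _) = pairs[0], pairs[1]
    # Both counts are taken over the same alignment length.
    return tuple(round(100 * int(n) / int(length), 2) for n in (ident, pos))

header = "Identity = 206/448 (45.98%), Positives = 279/448 (62.28%), Query Frame = 1"
print(hsp_percentages(header))
```

For the MUTYH_ARATH hit this reproduces the reported 45.98% identity and 62.28% positives over 448 aligned columns.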

BLAST of Cp4.1LG12g10040 vs. Swiss-Prot
Match: MUTY_BACSU (Adenine DNA glycosylase OS=Bacillus subtilis (strain 168) GN=mutY PE=2 SV=1)

HSP 1 Score: 152.1 bits (383), Expect = 1.3e-35
Identity = 110/383 (28.72%), Positives = 169/383 (44.13%), Query Frame = 1

Query: 47  VQTMRSSLLDWYDLSHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMH 106
           +Q  R  L+ W++   R LPWR           Y VWVSE+MLQQTRV+TV+ Y+ R++ 
Sbjct: 13  IQQFRDDLISWFEREQRVLPWRE------DQDPYKVWVSEVMLQQTRVETVIPYFLRFVE 72

Query: 107 KWPTVQHLSRASLEEVNEMWAGLGYYRRARFL---------------------------- 166
           ++PTV+ L+ A  E+V + W GLGYY R R L                            
Sbjct: 73  QFPTVEALADADEEKVLKAWEGLGYYSRVRNLQSAVKEVKQEYGGIVPPDEKDFGGLKGV 132

Query: 167 ---LEGAKLIVKEGGEFPKTVSALRKIPG----------------IGEKAAAQLVDPSRP 226
               +GA L +      P     + ++                  I E A    +   +P
Sbjct: 133 GPYTKGAVLSIAYNKPIPAVDGNVMRVMSRILSIWDDIAKPKTRTIFEDAIRAFISKEKP 192

Query: 227 GDFNQALMELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHD 286
            +FNQ LMELGA +CTP SPSC  CPV  HC A     +    + +     GIKT     
Sbjct: 193 SEFNQGLMELGALICTPKSPSCLLCPVQQHCSAFEEGTERELPVKSKKKKPGIKT----- 252

Query: 287 YSAVCVVEILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMN 346
            +A+ + +           +  +  + KRP +GLLA LWEFP+  L+ +    T RE + 
Sbjct: 253 MAAIVLTD-----------EDGQVYIHKRPSKGLLANLWEFPN--LETQKGIKTEREQLI 312

Query: 347 SLLSKSFGLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFR--KQE 381
           + L   +G++        I +  G   HVF+H+   I V    ++   + SKL +  K+E
Sbjct: 313 AFLENEYGIQAD------ISDLQGVVEHVFTHLVWNISVFFGKVKQVSDTSKLKKVTKEE 365

BLAST of Cp4.1LG12g10040 vs. Swiss-Prot
Match: MUTYH_HUMAN (Adenine DNA glycosylase OS=Homo sapiens GN=MUTYH PE=1 SV=1)

HSP 1 Score: 147.9 bits (372), Expect = 2.5e-34
Identity = 72/134 (53.73%), Positives = 91/134 (67.91%), Query Frame = 1

Query: 44  IDKVQTMRSSLLDWYDLSHRDLPWRRL--DKGQPQTRGYGVWVSEIMLQQTRVQTVVEYY 103
           + +V   R SLL WYD   RDLPWRR   D+     R Y VWVSE+MLQQT+V TV+ YY
Sbjct: 87  VAEVTAFRGSLLSWYDQEKRDLPWRRRAEDEMDLDRRAYAVWVSEVMLQQTQVATVINYY 146

Query: 104 KRWMHKWPTVQHLSRASLEEVNEMWAGLGYYRRARFLLEGAKLIVKE-GGEFPKTVSALR 163
             WM KWPT+Q L+ ASLEEVN++WAGLGYY R R L EGA+ +V+E GG  P+T   L+
Sbjct: 147 TGWMQKWPTLQDLASASLEEVNQLWAGLGYYSRGRRLQEGARKVVEELGGHMPRTAETLQ 206

Query: 164 K-IPGIGEKAAAQL 174
           + +PG+G   A  +
Sbjct: 207 QLLPGVGRYTAGAI 220

BLAST of Cp4.1LG12g10040 vs. Swiss-Prot
Match: MUTYH_MOUSE (Adenine DNA glycosylase OS=Mus musculus GN=Mutyh PE=2 SV=2)

HSP 1 Score: 147.5 bits (371), Expect = 3.2e-34
Identity = 81/183 (44.26%), Positives = 109/183 (59.56%), Query Frame = 1

Query: 14  KKKPTKGEKRRGRSPS---------------KREPIADIE----DIMFSIDKVQTMRSSL 73
           KK+P   ++RR R+ S               KRE +         +   +  V   RS+L
Sbjct: 12  KKQPANHKRRRTRALSSSQAKPSSLDGLAKQKREELLQASVSPYHLFSDVADVTAFRSNL 71

Query: 74  LDWYDLSHRDLPWRRLDKGQPQT--RGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQ 133
           L WYD   RDLPWR L K +  +  R Y VWVSE+MLQQT+V TV++YY RWM KWP +Q
Sbjct: 72  LSWYDQEKRDLPWRNLAKEEANSDRRAYAVWVSEVMLQQTQVATVIDYYTRWMQKWPKLQ 131

Query: 134 HLSRASLEEVNEMWAGLGYYRRARFLLEGAKLIVKE-GGEFPKTVSALRK-IPGIGEKAA 174
            L+ ASLEEVN++W+GLGYY R R L EGA+ +V+E GG  P+T   L++ +PG+G   A
Sbjct: 132 DLASASLEEVNQLWSGLGYYSRGRRLQEGARKVVEELGGHMPRTAETLQQLLPGVGRYTA 191

BLAST of Cp4.1LG12g10040 vs. Swiss-Prot
Match: MUTYH_RAT (Adenine DNA glycosylase OS=Rattus norvegicus GN=Mutyh PE=2 SV=1)

HSP 1 Score: 142.9 bits (359), Expect = 7.9e-33
Identity = 75/169 (44.38%), Positives = 103/169 (60.95%), Query Frame = 1

Query: 9   NEEDVKKKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDLSHRDLPWR 68
           +    K     G  ++ R    + P++    +   I  V   R +LL WYD   RDLPWR
Sbjct: 27  SSSQAKPSGLDGLAKQKREELLKTPVSPYH-LFSDIADVTAFRRNLLSWYDQEKRDLPWR 86

Query: 69  RLDKGQPQT--RGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLEEVNEMW 128
           +  K +     R Y VWVSE+MLQQT+V TV++YY RWM KWPT+Q L+ ASLEEVN++W
Sbjct: 87  KRVKEETNLDRRAYAVWVSEVMLQQTQVATVIDYYTRWMQKWPTLQDLASASLEEVNQLW 146

Query: 129 AGLGYYRRARFLLEGAKLIVKE-GGEFPKTVSALRK-IPGIGEKAAAQL 174
           +GLGYY R R L EGA+ +V+E GG  P+T   L++ +PG+G   A  +
Sbjct: 147 SGLGYYSRGRRLQEGARKVVEELGGHVPRTAETLQQLLPGVGRYTAGAI 194

BLAST of Cp4.1LG12g10040 vs. TrEMBL
Match: A0A0A0KC27_CUCSA (Uncharacterized protein OS=Cucumis sativus GN=Csa_6G088720 PE=4 SV=1)

HSP 1 Score: 565.5 bits (1456), Expect = 5.5e-158
Identity = 311/464 (67.03%), Positives = 345/464 (74.35%), Query Frame = 1

Query: 1   MSGGEKNENEEDVKK--------KPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRS 60
           MS GEKNEN+E +KK        KPT   KRRGRSPSK E + DIEDIMFSID VQT+R+
Sbjct: 55  MSDGEKNENDEYMKKNTDFRRKKKPTTERKRRGRSPSKSEAVVDIEDIMFSIDNVQTIRA 114

Query: 61  SLLDWYDLSHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQ 120
           SLLDWYD S RDLPWR LDKG+P+TR YGVWVSEIMLQQTRVQTVV++Y RWM KWPTVQ
Sbjct: 115 SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 174

Query: 121 HLSRASLEEV------------------------------------NEMWAGLGYYRRAR 180
           HLSRASLEEV                                         G+G Y    
Sbjct: 175 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPRTVSSLRKIPGIGEYTAGA 234

Query: 181 FL-LEGAKLIVKEGGEFPKTVSALRKIPGIGE---------KAAAQLVDPSRPGDFNQAL 240
              +   +++    G   + ++ L+ I G  +         KAAAQLVD SRPGDFNQAL
Sbjct: 235 IASIAFGEVVPVVDGNVIRVIARLKAISGNPKDPKLIKQVWKAAAQLVDLSRPGDFNQAL 294

Query: 241 MELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300
           MELGATLCTPT+PSCSTCPVFDHCEALSISK DSSVLVTDYPAKGIK KQRHDYSAVCVV
Sbjct: 295 MELGATLCTPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKIKQRHDYSAVCVV 354

Query: 301 EILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSF 360
           EILE+QG+ EL QSSRFLLVKRPDEGLLAGLWEFPSV L GEAD STRRES+NSLLSK+F
Sbjct: 355 EILESQGTPELGQSSRFLLVKRPDEGLLAGLWEFPSVSLDGEADLSTRRESINSLLSKNF 414

Query: 361 GLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCV 409
           GLE KKNF+IV REDVGDFIH+F+HIRLKIYVEHLVL LKGEGSKLFRKQEKKSI WKCV
Sbjct: 415 GLEAKKNFEIVNREDVGDFIHIFTHIRLKIYVEHLVLCLKGEGSKLFRKQEKKSILWKCV 474

BLAST of Cp4.1LG12g10040 vs. TrEMBL
Match: E5GB45_CUCME (A/G-specific adenine DNA glycosylase OS=Cucumis melo subsp. melo PE=4 SV=1)

HSP 1 Score: 495.4 bits (1274), Expect = 7.0e-137
Identity = 271/401 (67.58%), Positives = 297/401 (74.06%), Query Frame = 1

Query: 1   MSGGEKNENEEDVKKK--------PTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRS 60
           MS GEKNENEE+VKKK        PT   KRR RSPSK E + DIEDIMFSID VQT+R+
Sbjct: 1   MSDGEKNENEENVKKKTDFRRKKKPTTKRKRRSRSPSKSEAVVDIEDIMFSIDNVQTIRA 60

Query: 61  SLLDWYDLSHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQ 120
           SLLDWYD S RDLPWR LDKG+P+TR YGVWVSEIMLQQTRVQTVV++Y RWM KWPTVQ
Sbjct: 61  SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 120

Query: 121 HLSRASLEEV------------------------------------NEMWAGLGYYRRAR 180
           HLSRASLEEV                                         G+G Y    
Sbjct: 121 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPKTVSSLRKIPGIGEYTAGA 180

Query: 181 FL-LEGAKLIVKEGGEFPKTVSALRKIPGIGE---------KAAAQLVDPSRPGDFNQAL 240
              +   +++    G   + ++ L+ I G  +         KAAAQLVD SRPGDFNQAL
Sbjct: 181 IASIAFGEVVPVVDGNVIRVIARLKAISGNPKDPKLIKQVWKAAAQLVDLSRPGDFNQAL 240

Query: 241 MELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300
           MELGATLCTPT+PSCSTCPVFDHCEALSISK DSSVLVTDYPAKGIKTKQRHDYSAVCVV
Sbjct: 241 MELGATLCTPTNPSCSTCPVFDHCEALSISKRDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300

Query: 301 EILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSF 348
           EILE+QG+SEL QSSRFLLVKRPDEGLLAGLWEFPSV L GEADSSTRRES++SLLSK+F
Sbjct: 301 EILESQGTSELGQSSRFLLVKRPDEGLLAGLWEFPSVSLDGEADSSTRRESIDSLLSKNF 360

BLAST of Cp4.1LG12g10040 vs. TrEMBL
Match: A0A0D2LX00_GOSRA (Uncharacterized protein OS=Gossypium raimondii GN=B456_001G100200 PE=4 SV=1)

HSP 1 Score: 407.9 bits (1047), Expect = 1.5e-110
Identity = 227/432 (52.55%), Positives = 280/432 (64.81%), Query Frame = 1

Query: 23  RRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDLSHRDLPWRRLDKG--------- 82
           +R +   + E I DIED+ FS +    +R+SLL+WYD + RDLPWR   K          
Sbjct: 8   KRPQLIKQEEQIGDIEDL-FSEEDTHKIRASLLEWYDKNQRDLPWRTSTKKSENGENVQE 67

Query: 83  --QPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLEEVNEMWAGLGY 142
             + + R YGVWVSE+MLQQTRVQTV++YY RWM KWPT+QHLS+ASLEEVNEMWAGLGY
Sbjct: 68  EEEEEKRAYGVWVSEVMLQQTRVQTVIDYYNRWMLKWPTLQHLSQASLEEVNEMWAGLGY 127

Query: 143 YRRARFLLEGAKLI----------------VKEGGEFP------------------KTVS 202
           YRRARFLLEGAK+I                V   G++                     V 
Sbjct: 128 YRRARFLLEGAKMIVAEGSEFPNTVFALRKVPGIGDYTAGAIASIAFKQVVPVVDGNVVR 187

Query: 203 ALRKIPGIGE------------KAAAQLVDPSRPGDFNQALMELGATLCTPTSPSCSTCP 262
            L ++  I              K AAQLVDPSRPGDFNQ+LMELGATLCTP +P+C++CP
Sbjct: 188 VLARLKAISANPKDKTTVKNFWKLAAQLVDPSRPGDFNQSLMELGATLCTPLNPNCTSCP 247

Query: 263 VFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGS-SELKQSSRFL 322
           V   C AL  S++D SV+V DYP K +KTKQR+D+S V VVEI  +Q    + K +SR L
Sbjct: 248 VSSQCRALHNSRNDESVMVMDYPMKVVKTKQRNDFSTVSVVEISRSQDRLQQTKSNSRVL 307

Query: 323 LVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNFDIVIREDVGD 382
           LVKRPDEGLLAGLWEFP V L  EAD S RR+ ++ LL KSF L P KN +++ RE VG+
Sbjct: 308 LVKRPDEGLLAGLWEFPCVTLDEEADLSMRRKLIDQLLKKSFKLNPPKNCNVISRELVGE 367

Query: 383 FIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSVGLTSSVRKVY 397
           F+HVFSHIR KIYVE LVL LKG    LF + +  +  WK +D + +S +GLTSSVRKVY
Sbjct: 368 FVHVFSHIRRKIYVELLVLHLKGGKHVLFEEDDINATDWKLLDCEAVSRMGLTSSVRKVY 427

BLAST of Cp4.1LG12g10040 vs. TrEMBL
Match: V4UAI5_9ROSI (Uncharacterized protein OS=Citrus clementina GN=CICLE_v10015195mg PE=4 SV=1)

HSP 1 Score: 405.2 bits (1040), Expect = 9.6e-110
Identity = 226/460 (49.13%), Positives = 291/460 (63.26%), Query Frame = 1

Query: 8   ENEEDVKKKPTKG-EKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDLSHRDLP 67
           +NE   KKK  +   +++   P + E   DIED+ FS  +V+ +R SLL WYD + R+LP
Sbjct: 2   DNERKTKKKKERQLPEKKTALPLEEE---DIEDL-FSEKEVKKIRQSLLQWYDKNQRELP 61

Query: 68  WRRLDKG----QPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLEEV 127
           WR   +     + + R YGVWVSE+MLQQTRVQTV++YY RWM KWPT+ HL++ASLEEV
Sbjct: 62  WRERSESDKEEEKEKRAYGVWVSEVMLQQTRVQTVIDYYNRWMTKWPTIHHLAKASLEEV 121

Query: 128 NEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGEKAAAQL-------VD 187
           NEMWAGLGYYRRARFLLEGAK+IV EG  FP TVS LRK+PGIG   A  +       V 
Sbjct: 122 NEMWAGLGYYRRARFLLEGAKMIVAEGDGFPNTVSDLRKVPGIGNYTAGAIASIAFKEVV 181

Query: 188 PSRPGDFNQALMELGATLC---------------------------------------TP 247
           P   G+  + L  L A                                          TP
Sbjct: 182 PVVDGNVIRVLARLKAISANPKDTSTVKNFWKLATQLVDSCRPGDFNQSLMELGAVICTP 241

Query: 248 TSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGSSE 307
            +P+C++CPV D C+A S+SK D+SVLVT YP K +K +QRHD SA CVVEIL     SE
Sbjct: 242 LNPNCTSCPVSDKCQAYSMSKCDNSVLVTSYPMKVLKARQRHDVSAACVVEILGGNDESE 301

Query: 308 LKQ-SSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNFD 367
             Q    F+LVKR DEGLLAGLWEFPS++L GE D +TRRE+    L KSF L+P+ N  
Sbjct: 302 RTQPDGVFILVKRRDEGLLAGLWEFPSIILDGETDITTRREAAECFLKKSFNLDPRNNCS 361

Query: 368 IVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSVG 416
           I++REDVG+F+H+FSHIRLK++VE LVLR+KG   K   KQ+K ++SWKCVD   ++S+G
Sbjct: 362 IILREDVGEFVHIFSHIRLKVHVELLVLRIKGGIDKWVEKQDKGTLSWKCVDGGTLASMG 421

BLAST of Cp4.1LG12g10040 vs. TrEMBL
Match: A0A067LD77_JATCU (Uncharacterized protein OS=Jatropha curcas GN=JCGZ_15038 PE=4 SV=1)

HSP 1 Score: 404.4 bits (1038), Expect = 1.6e-109
Identity = 226/447 (50.56%), Positives = 296/447 (66.22%), Query Frame = 1

Query: 14  KKKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDLSHRDLPWRRL--- 73
           KK+  + +K+R     + + I DIED+ FS  ++Q +R SLLDWYD + R LPWRR    
Sbjct: 9   KKRNVQQKKKRKLVNEEEKTIPDIEDL-FSDKEIQKIRESLLDWYDHNQRVLPWRRKNTN 68

Query: 74  -----DKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLEEVNEM 133
                ++ +   R YGVWVSE+MLQQTRVQTV++YY RWM KWPT+++L+ ASLEEVNEM
Sbjct: 69  PLEIEEEEEKGKRAYGVWVSEVMLQQTRVQTVIDYYNRWMLKWPTLENLALASLEEVNEM 128

Query: 134 WAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGEKAAAQL-------VDPSR 193
           WAGLGYYRRARFLLEGAK+IV EGG FP TVS+LRK+PGIG   A  +       V P  
Sbjct: 129 WAGLGYYRRARFLLEGAKMIVAEGGGFPSTVSSLRKVPGIGNYTAGAIASIAFGEVVPVV 188

Query: 194 PGDFNQAL-------------------MELGATLCTPTSPS------------------- 253
            G+  + L                    +L A L  P  P                    
Sbjct: 189 DGNVIRVLARLKAISTNPKNLVAIKNFWKLAAQLVDPCRPGDFNQSLMELGATVCTPSNP 248

Query: 254 -CSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGSSELKQ 313
            CS CPV + C ALSIS +D SVLVTDYPAK +K KQR+++SAVCVVEIL +QG ++  Q
Sbjct: 249 NCSLCPVSNQCRALSIS-EDKSVLVTDYPAKVVKVKQRNEFSAVCVVEILGSQGPTDGDQ 308

Query: 314 S-SRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNFDIVI 373
           S S FLLVKRPD+GLLAGLWEFP+V+L  EAD + R + +N  L K+F ++P++   IV+
Sbjct: 309 SESGFLLVKRPDDGLLAGLWEFPTVMLDKEADLTKRTKEINQFLKKTFKIDPQRTCSIVL 368

Query: 374 REDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSVGLTS 406
           RED+G+F+H+FSHIRLK+YVE LV+ LKG  ++LF + +K++ SWK V+ K +S++GLTS
Sbjct: 369 REDIGEFVHIFSHIRLKVYVELLVICLKGGTTELFSEHKKEATSWKYVNKKALSNLGLTS 428

BLAST of Cp4.1LG12g10040 vs. TAIR10
Match: AT4G12740.1 (AT4G12740.1 HhH-GPD base excision DNA repair family protein)

HSP 1 Score: 354.4 bits (908), Expect = 9.8e-98
Identity = 206/448 (45.98%), Positives = 279/448 (62.28%), Query Frame = 1

Query: 5   EKNENEEDVKKKPTKGEKRRGRSPSKREPIA-DIEDIMFSIDKVQTMRSSLLDWYDLSHR 64
           +K E EE+ +++  + E+    + ++ E +  DIED+ FS ++ Q +R  LLDWYD++ R
Sbjct: 89  DKEEAEEESEEEEEEEEEE---AEAEEEALGGDIEDL-FSENETQKIRMGLLDWYDVNKR 148

Query: 65  DLPWR-RLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLE-- 124
           DLPWR R  + + + R Y VWVSEIMLQQTRVQTV++YYKRWM KWPT+  L +ASLE  
Sbjct: 149 DLPWRNRRSESEKERRAYEVWVSEIMLQQTRVQTVMKYYKRWMQKWPTIYDLGQASLENL 208

Query: 125 -----------------EVNEMWAGLGYYRRAR---------------FLLEGAKLIVKE 184
                            EVNEMWAGLGYYRRAR               F  + + L+  +
Sbjct: 209 IVSRSRELSFLRGNEKKEVNEMWAGLGYYRRARFLLEGAKMVVAGTEGFPNQASSLMKVK 268

Query: 185 G-GEFP------------------KTVSALRKIPGIGE------------KAAAQLVDPS 244
           G G++                     +  L ++  I              K AAQLVDPS
Sbjct: 269 GIGQYTAGAIASIAFNEAVPVVDGNVIRVLARLKAISANPKDRLTARNFWKLAAQLVDPS 328

Query: 245 RPGDFNQALMELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQR 304
           RPGDFNQ+LMELGATLCT + PSCS+CPV   C A S+S+++ ++ VTDYP K IK K R
Sbjct: 329 RPGDFNQSLMELGATLCTVSKPSCSSCPVSSQCRAFSLSEENRTISVTDYPTKVIKAKPR 388

Query: 305 HDYSAVCVVEILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRES 364
           HD+  VCV+EI         +   RF+LVKRP++GLLAGLWEFPSV+L  EADS+TRR +
Sbjct: 389 HDFCCVCVLEI---HNLERNQSGGRFVLVKRPEQGLLAGLWEFPSVILNEEADSATRRNA 448

Query: 365 MNSLLSKS--FGLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRK 384
           +N  L ++  F +E KK   IV RE++G+F+H+F+HIR K+YVE LV++L G    LF+ 
Sbjct: 449 INVYLKEAFRFHVELKKACTIVSREELGEFVHIFTHIRRKVYVELLVVQLTGGTEDLFKG 508

BLAST of Cp4.1LG12g10040 vs. NCBI nr
Match: gi|659119956|ref|XP_008459934.1| (PREDICTED: A/G-specific adenine DNA glycosylase [Cucumis melo])

HSP 1 Score: 575.5 bits (1482), Expect = 7.7e-161
Identity = 316/464 (68.10%), Positives = 349/464 (75.22%), Query Frame = 1

Query: 1   MSGGEKNENEEDVKKK--------PTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRS 60
           MS GEKNENEE+VKKK        PT   KRR RSPSK E + DIEDIMFSID VQT+R+
Sbjct: 1   MSDGEKNENEENVKKKTDFRRKKKPTTKRKRRSRSPSKSEAVVDIEDIMFSIDNVQTIRA 60

Query: 61  SLLDWYDLSHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQ 120
           SLLDWYD S RDLPWR LDKG+P+TR YGVWVSEIMLQQTRVQTVV++Y RWM KWPTVQ
Sbjct: 61  SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 120

Query: 121 HLSRASLEEV------------------------------------NEMWAGLGYYRRAR 180
           HLSRASLEEV                                         G+G Y    
Sbjct: 121 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPKTVSSLRKIPGIGEYTAGA 180

Query: 181 FL-LEGAKLIVKEGGEFPKTVSALRKIPGIGE---------KAAAQLVDPSRPGDFNQAL 240
              +   +++    G   + ++ L+ I G  +         KAAAQLVD SRPGDFNQAL
Sbjct: 181 IASIAFGEVVPVVDGNVIRVIARLKAISGNPKDPKLIKQVWKAAAQLVDLSRPGDFNQAL 240

Query: 241 MELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300
           MELGATLCTPT+PSCSTCPVFDHCEALSISK DSSVLVTDYPAKGIKTKQRHDYSAVCVV
Sbjct: 241 MELGATLCTPTNPSCSTCPVFDHCEALSISKRDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300

Query: 301 EILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSF 360
           EILE+QG+SEL QSSRFLLVKRPDEGLLAGLWEFPSV L GEADSSTRRES++SLLSK+F
Sbjct: 301 EILESQGTSELGQSSRFLLVKRPDEGLLAGLWEFPSVSLDGEADSSTRRESIDSLLSKNF 360

Query: 361 GLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCV 409
           GLEPKKNF+IV REDVGDFIHVF+HIRLKIYVEHLVL LKGEGSKLFRKQEKKSI WKCV
Sbjct: 361 GLEPKKNFEIVNREDVGDFIHVFTHIRLKIYVEHLVLCLKGEGSKLFRKQEKKSILWKCV 420

BLAST of Cp4.1LG12g10040 vs. NCBI nr
Match: gi|778711687|ref|XP_004140565.2| (PREDICTED: A/G-specific adenine DNA glycosylase [Cucumis sativus])

HSP 1 Score: 565.5 bits (1456), Expect = 7.9e-158
Identity = 311/464 (67.03%), Positives = 345/464 (74.35%), Query Frame = 1

Query: 1   MSGGEKNENEEDVKK--------KPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRS 60
           MS GEKNEN+E +KK        KPT   KRRGRSPSK E + DIEDIMFSID VQT+R+
Sbjct: 1   MSDGEKNENDEYMKKNTDFRRKKKPTTERKRRGRSPSKSEAVVDIEDIMFSIDNVQTIRA 60

Query: 61  SLLDWYDLSHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQ 120
           SLLDWYD S RDLPWR LDKG+P+TR YGVWVSEIMLQQTRVQTVV++Y RWM KWPTVQ
Sbjct: 61  SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 120

Query: 121 HLSRASLEEV------------------------------------NEMWAGLGYYRRAR 180
           HLSRASLEEV                                         G+G Y    
Sbjct: 121 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPRTVSSLRKIPGIGEYTAGA 180

Query: 181 FL-LEGAKLIVKEGGEFPKTVSALRKIPGIGE---------KAAAQLVDPSRPGDFNQAL 240
              +   +++    G   + ++ L+ I G  +         KAAAQLVD SRPGDFNQAL
Sbjct: 181 IASIAFGEVVPVVDGNVIRVIARLKAISGNPKDPKLIKQVWKAAAQLVDLSRPGDFNQAL 240

Query: 241 MELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300
           MELGATLCTPT+PSCSTCPVFDHCEALSISK DSSVLVTDYPAKGIK KQRHDYSAVCVV
Sbjct: 241 MELGATLCTPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKIKQRHDYSAVCVV 300

Query: 301 EILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSF 360
           EILE+QG+ EL QSSRFLLVKRPDEGLLAGLWEFPSV L GEAD STRRES+NSLLSK+F
Sbjct: 301 EILESQGTPELGQSSRFLLVKRPDEGLLAGLWEFPSVSLDGEADLSTRRESINSLLSKNF 360

Query: 361 GLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCV 409
           GLE KKNF+IV REDVGDFIH+F+HIRLKIYVEHLVL LKGEGSKLFRKQEKKSI WKCV
Sbjct: 361 GLEAKKNFEIVNREDVGDFIHIFTHIRLKIYVEHLVLCLKGEGSKLFRKQEKKSILWKCV 420

BLAST of Cp4.1LG12g10040 vs. NCBI nr
Match: gi|700191190|gb|KGN46394.1| (hypothetical protein Csa_6G088720 [Cucumis sativus])

HSP 1 Score: 565.5 bits (1456), Expect = 7.9e-158
Identity = 311/464 (67.03%), Positives = 345/464 (74.35%), Query Frame = 1

Query: 1   MSGGEKNENEEDVKK--------KPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRS 60
           MS GEKNEN+E +KK        KPT   KRRGRSPSK E + DIEDIMFSID VQT+R+
Sbjct: 55  MSDGEKNENDEYMKKNTDFRRKKKPTTERKRRGRSPSKSEAVVDIEDIMFSIDNVQTIRA 114

Query: 61  SLLDWYDLSHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQ 120
           SLLDWYD S RDLPWR LDKG+P+TR YGVWVSEIMLQQTRVQTVV++Y RWM KWPTVQ
Sbjct: 115 SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 174

Query: 121 HLSRASLEEV------------------------------------NEMWAGLGYYRRAR 180
           HLSRASLEEV                                         G+G Y    
Sbjct: 175 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPRTVSSLRKIPGIGEYTAGA 234

Query: 181 FL-LEGAKLIVKEGGEFPKTVSALRKIPGIGE---------KAAAQLVDPSRPGDFNQAL 240
              +   +++    G   + ++ L+ I G  +         KAAAQLVD SRPGDFNQAL
Sbjct: 235 IASIAFGEVVPVVDGNVIRVIARLKAISGNPKDPKLIKQVWKAAAQLVDLSRPGDFNQAL 294

Query: 241 MELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300
           MELGATLCTPT+PSCSTCPVFDHCEALSISK DSSVLVTDYPAKGIK KQRHDYSAVCVV
Sbjct: 295 MELGATLCTPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKIKQRHDYSAVCVV 354

Query: 301 EILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSF 360
           EILE+QG+ EL QSSRFLLVKRPDEGLLAGLWEFPSV L GEAD STRRES+NSLLSK+F
Sbjct: 355 EILESQGTPELGQSSRFLLVKRPDEGLLAGLWEFPSVSLDGEADLSTRRESINSLLSKNF 414

Query: 361 GLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCV 409
           GLE KKNF+IV REDVGDFIH+F+HIRLKIYVEHLVL LKGEGSKLFRKQEKKSI WKCV
Sbjct: 415 GLEAKKNFEIVNREDVGDFIHIFTHIRLKIYVEHLVLCLKGEGSKLFRKQEKKSILWKCV 474

BLAST of Cp4.1LG12g10040 vs. NCBI nr
Match: gi|307135815|gb|ADN33687.1| (A/G-specific adenine DNA glycosylase [Cucumis melo subsp. melo])

HSP 1 Score: 495.4 bits (1274), Expect = 1.0e-136
Identity = 271/401 (67.58%), Positives = 297/401 (74.06%), Query Frame = 1

Query: 1   MSGGEKNENEEDVKKK--------PTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRS 60
           MS GEKNENEE+VKKK        PT   KRR RSPSK E + DIEDIMFSID VQT+R+
Sbjct: 1   MSDGEKNENEENVKKKTDFRRKKKPTTKRKRRSRSPSKSEAVVDIEDIMFSIDNVQTIRA 60

Query: 61  SLLDWYDLSHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQ 120
           SLLDWYD S RDLPWR LDKG+P+TR YGVWVSEIMLQQTRVQTVV++Y RWM KWPTVQ
Sbjct: 61  SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 120

Query: 121 HLSRASLEEV------------------------------------NEMWAGLGYYRRAR 180
           HLSRASLEEV                                         G+G Y    
Sbjct: 121 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPKTVSSLRKIPGIGEYTAGA 180

Query: 181 FL-LEGAKLIVKEGGEFPKTVSALRKIPGIGE---------KAAAQLVDPSRPGDFNQAL 240
              +   +++    G   + ++ L+ I G  +         KAAAQLVD SRPGDFNQAL
Sbjct: 181 IASIAFGEVVPVVDGNVIRVIARLKAISGNPKDPKLIKQVWKAAAQLVDLSRPGDFNQAL 240

Query: 241 MELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300
           MELGATLCTPT+PSCSTCPVFDHCEALSISK DSSVLVTDYPAKGIKTKQRHDYSAVCVV
Sbjct: 241 MELGATLCTPTNPSCSTCPVFDHCEALSISKRDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300

Query: 301 EILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSF 348
           EILE+QG+SEL QSSRFLLVKRPDEGLLAGLWEFPSV L GEADSSTRRES++SLLSK+F
Sbjct: 301 EILESQGTSELGQSSRFLLVKRPDEGLLAGLWEFPSVSLDGEADSSTRRESIDSLLSKNF 360

BLAST of Cp4.1LG12g10040 vs. NCBI nr
Match: gi|763741238|gb|KJB08737.1| (hypothetical protein B456_001G100200 [Gossypium raimondii])

HSP 1 Score: 407.9 bits (1047), Expect = 2.1e-110
Identity = 227/432 (52.55%), Positives = 280/432 (64.81%), Query Frame = 1

Query: 23  RRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDLSHRDLPWRRLDKG--------- 82
           +R +   + E I DIED+ FS +    +R+SLL+WYD + RDLPWR   K          
Sbjct: 8   KRPQLIKQEEQIGDIEDL-FSEEDTHKIRASLLEWYDKNQRDLPWRTSTKKSENGENVQE 67

Query: 83  --QPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLEEVNEMWAGLGY 142
             + + R YGVWVSE+MLQQTRVQTV++YY RWM KWPT+QHLS+ASLEEVNEMWAGLGY
Sbjct: 68  EEEEEKRAYGVWVSEVMLQQTRVQTVIDYYNRWMLKWPTLQHLSQASLEEVNEMWAGLGY 127

Query: 143 YRRARFLLEGAKLI----------------VKEGGEFP------------------KTVS 202
           YRRARFLLEGAK+I                V   G++                     V 
Sbjct: 128 YRRARFLLEGAKMIVAEGSEFPNTVFALRKVPGIGDYTAGAIASIAFKQVVPVVDGNVVR 187

Query: 203 ALRKIPGIGE------------KAAAQLVDPSRPGDFNQALMELGATLCTPTSPSCSTCP 262
            L ++  I              K AAQLVDPSRPGDFNQ+LMELGATLCTP +P+C++CP
Sbjct: 188 VLARLKAISANPKDKTTVKNFWKLAAQLVDPSRPGDFNQSLMELGATLCTPLNPNCTSCP 247

Query: 263 VFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGS-SELKQSSRFL 322
           V   C AL  S++D SV+V DYP K +KTKQR+D+S V VVEI  +Q    + K +SR L
Sbjct: 248 VSSQCRALHNSRNDESVMVMDYPMKVVKTKQRNDFSTVSVVEISRSQDRLQQTKSNSRVL 307

Query: 323 LVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNFDIVIREDVGD 382
           LVKRPDEGLLAGLWEFP V L  EAD S RR+ ++ LL KSF L P KN +++ RE VG+
Sbjct: 308 LVKRPDEGLLAGLWEFPCVTLDEEADLSMRRKLIDQLLKKSFKLNPPKNCNVISRELVGE 367

Query: 383 FIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSVGLTSSVRKVY 397
           F+HVFSHIR KIYVE LVL LKG    LF + +  +  WK +D + +S +GLTSSVRKVY
Sbjct: 368 FVHVFSHIRRKIYVELLVLHLKGGKHVLFEEDDINATDWKLLDCEAVSRMGLTSSVRKVY 427
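
The Identity and Positives percentages in the HSP header lines above are the ratio of matching alignment columns to alignment length. As a consistency check, the following minimal Python sketch recomputes them from a header line. It assumes the header text is available as a plain string; `parse_hsp_stats` and `percent` are hypothetical helper names chosen for illustration, not part of any BLAST tool.

```python
import re

# Matches the statistics line of an HSP header as printed on this page.
# "Posi?tives" tolerates the "Postives" spelling that some report pages emit.
HSP_STATS = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"Posi?tives = (\d+)/(\d+) \(([\d.]+)%\)"
)


def parse_hsp_stats(line):
    """Extract match counts and reported percentages from an HSP stats line."""
    m = HSP_STATS.search(line)
    if m is None:
        raise ValueError("not an HSP statistics line: %r" % line)
    ident, length, ident_pct, pos, _pos_length, pos_pct = m.groups()
    return {
        "identities": int(ident),
        "positives": int(pos),
        "length": int(length),
        "identity_pct": float(ident_pct),
        "positive_pct": float(pos_pct),
    }


def percent(count, length):
    """Percentage of alignment columns, rounded to two decimals."""
    return round(100.0 * count / length, 2)


# Usage with the TAIR10 hit (AT4G12740.1) reported above.
stats = parse_hsp_stats(
    "Identity = 206/448 (45.98%), Positives = 279/448 (62.28%), Query Frame = 1"
)
print(percent(stats["identities"], stats["length"]))  # prints 45.98
```

Comparing the recomputed value with the reported one is a quick way to catch a mis-split column when these records are parsed from flattened text.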

The following BLAST results are available for this feature:
Match Name   E-value  Identity  Description
MUTYH_ARATH  1.7e-96  45.98     Adenine DNA glycosylase OS=Arabidopsis thaliana GN=MYH PE=3 SV=1
MUTY_BACSU   1.3e-35  28.72     Adenine DNA glycosylase OS=Bacillus subtilis (strain 168) GN=mutY PE=2 SV=1
MUTYH_HUMAN  2.5e-34  53.73     Adenine DNA glycosylase OS=Homo sapiens GN=MUTYH PE=1 SV=1
MUTYH_MOUSE  3.2e-34  44.26     Adenine DNA glycosylase OS=Mus musculus GN=Mutyh PE=2 SV=2
MUTYH_RAT    7.9e-33  44.38     Adenine DNA glycosylase OS=Rattus norvegicus GN=Mutyh PE=2 SV=1

Match Name        E-value   Identity  Description
A0A0A0KC27_CUCSA  5.5e-158  67.03     Uncharacterized protein OS=Cucumis sativus GN=Csa_6G088720 PE=4 SV=1
E5GB45_CUCME      7.0e-137  67.58     A/G-specific adenine DNA glycosylase OS=Cucumis melo subsp. melo PE=4 SV=1
A0A0D2LX00_GOSRA  1.5e-110  52.55     Uncharacterized protein OS=Gossypium raimondii GN=B456_001G100200 PE=4 SV=1
V4UAI5_9ROSI      9.6e-110  49.13     Uncharacterized protein OS=Citrus clementina GN=CICLE_v10015195mg PE=4 SV=1
A0A067LD77_JATCU  1.6e-109  50.56     Uncharacterized protein OS=Jatropha curcas GN=JCGZ_15038 PE=4 SV=1

Match Name   E-value  Identity  Description
AT4G12740.1  9.8e-98  45.98     HhH-GPD base excision DNA repair family protein

Match Name                        E-value   Identity  Description
gi|659119956|ref|XP_008459934.1|  7.7e-161  68.10     PREDICTED: A/G-specific adenine DNA glycosylase [Cucumis melo]
gi|778711687|ref|XP_004140565.2|  7.9e-158  67.03     PREDICTED: A/G-specific adenine DNA glycosylase [Cucumis sativus]
gi|700191190|gb|KGN46394.1|       7.9e-158  67.03     hypothetical protein Csa_6G088720 [Cucumis sativus]
gi|307135815|gb|ADN33687.1|       1.0e-136  67.58     A/G-specific adenine DNA glycosylase [Cucumis melo subsp. melo]
gi|763741238|gb|KJB08737.1|       2.1e-110  52.55     hypothetical protein B456_001G100200 [Gossypium raimondii]

The following terms have been associated with this gene:
Vocabulary: Molecular Function
Term        Definition
GO:0016787  hydrolase activity
GO:0003824  catalytic activity
Vocabulary: Biological Process
Term        Definition
GO:0006281  DNA repair
GO:0006284  base-excision repair
Vocabulary: INTERPRO
Term       Definition
IPR023170  HTH_base_excis_C
IPR015797  NUDIX_hydrolase-like_dom_sf
IPR011257  DNA_glycosylase
IPR003265  HhH-GPD_domain
GO Assignments
This gene is annotated with the following GO terms.
Category            Term Accession  Term Name
biological_process  GO:0006284      base-excision repair
biological_process  GO:0006306      DNA methylation
biological_process  GO:0055114      oxidation-reduction process
biological_process  GO:0006281      DNA repair
cellular_component  GO:0016021      integral component of membrane
cellular_component  GO:0005575      cellular_component
molecular_function  GO:0051539      4 iron, 4 sulfur cluster binding
molecular_function  GO:0003677      DNA binding
molecular_function  GO:0019104      DNA N-glycosylase activity
molecular_function  GO:0016798      hydrolase activity, acting on glycosyl bonds
molecular_function  GO:0003824      catalytic activity
molecular_function  GO:0016787      hydrolase activity
molecular_function  GO:0005488      binding
molecular_function  GO:0020037      heme binding
molecular_function  GO:0005506      iron ion binding
molecular_function  GO:0004497      monooxygenase activity
molecular_function  GO:0016705      oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen

The following mRNA feature(s) are a part of this gene:

Feature Name       Unique Name        Type
Cp4.1LG12g10040.1  Cp4.1LG12g10040.1  mRNA


Analysis Name: InterPro Annotations of Cucurbita pepo
Date Performed: 2017-12-02
IPR Term   IPR Description                                         Source   Source Term         Source Description                                 Alignment
IPR003265  HhH-GPD domain                                          PFAM     PF00730             HhH-GPD                                            coord: 84..175; score: 6.9
IPR003265  HhH-GPD domain                                          SMART    SM00478             endo3end                                           coord: 88..192; score: 7.
IPR011257  DNA glycosylase                                         GENE3D   G3DSA:1.10.340.30   -                                                  coord: 75..169; score: 8.4
IPR011257  DNA glycosylase                                         unknown  SSF48150            DNA-glycosylase                                    coord: 47..213; score: 3.77
IPR015797  NUDIX hydrolase domain-like                             GENE3D   G3DSA:3.90.79.10    -                                                  coord: 258..390; score: 4.1
IPR015797  NUDIX hydrolase domain-like                             unknown  SSF55811            Nudix                                              coord: 228..343; score: 1.78
IPR023170  Helix-turn-helix, base-excision DNA repair, C-terminal  GENE3D   G3DSA:1.10.1670.10  -                                                  coord: 170..214; score: 1.1
None       No IPR available                                        PANTHER  PTHR10359           A/G-SPECIFIC ADENINE GLYCOSYLASE/ENDONUCLEASE III  coord: 44..392; score: 8.4