Cp4.1LG12g10040 (gene) Cucurbita pepo (MU‐CU‐16) v4.1

Overview
Name: Cp4.1LG12g10040
Type: gene
Organism: Cucurbita pepo (Cucurbita pepo (MU‐CU‐16) v4.1)
Description: Adenine DNA glycosylase
Location: Cp4.1LG12: 9693676 .. 9703087 (-)
RNA-Seq Expression: Cp4.1LG12g10040
Synteny: Cp4.1LG12g10040
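
The location field gives a start and end coordinate on the minus strand of Cp4.1LG12. As a minimal sketch, assuming the coordinates are 1-based and inclusive (an assumption, not stated on this page), the genomic span of the feature could be computed as below. The helper names `parse_location` and `span_length` are hypothetical and used only for illustration.

```python
import re

def parse_location(text):
    """Parse a location string such as 'Cp4.1LG12: 9693676 .. 9703087 (-)'.

    Returns (sequence_id, start, end, strand).
    """
    m = re.match(r"\s*(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)", text)
    if m is None:
        raise ValueError("unrecognised location string: %r" % text)
    seqid, start, end, strand = m.groups()
    return seqid, int(start), int(end), strand

def span_length(start, end):
    # Assumption: coordinates are 1-based and inclusive at both ends.
    return end - start + 1

seqid, start, end, strand = parse_location("Cp4.1LG12: 9693676 .. 9703087 (-)")
print(seqid, strand, span_length(start, end))
```

Under the inclusive-coordinate assumption the span is 9,412 bp; with half-open coordinates it would be one base shorter.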
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

AGCCCAGCCCAACAACTTCCACCCTGTCTTTTGAAGAAACCCTAAATCTCCCTAATTCATTCTTCCATTTCTCAATCTCCCATTTCTCAATCTTAGATCTCTCTCTCATCTCTCCCGAGCAGAGTCTACGGACCCCGTTCCCCGATTGCTTCCCCAACTCACCAGCTCTCGTTTTCAATCCTTCTCAGATCTCTTCCGGTGGTCCGAGATCTCTCTGTCTCCCAACTCCAGCGGCATCTCTTACGAGCATAATCCCCCGTATTAGCGGCCGGCAGCAGCTCTCTCCCACGAACCCACATCGCGGCGGCTTCTTTCCCTCTCGAGCACCCATCCCCGCCGGTAACTCTCTTTCAAAAGCACCTAAGTTTTATCATAAATTCTTGTGTTGTTGGAAATCATATGAAAACTTCCACCAGACCTGCTCGAAAAACATTTTGTTGTTCTGTTGTTTCTCTTTTAAATTTAATTCAATAACCTTTCCCTGCTTAACCACTCATGGAAACGATGGCTTTTCTTGATTCAATGCATTCCCTCATGTATTTTTCTTTTCCCTTTTCCACAGTATTCTCATTTTATTACGATGGCTGATGGTCATGAAACTGATAAGAACATTGAAATATGGAAAATTAAGAAGTTGATTAAAGCACTTGAAGCTGCAAGAGGCAATGGGACTAGCATGATCTCTCTCATCATGCCTCCGCGCGATCAAATATCTCGTGTTACTAAGATGCTTGGTGATGAATTTGGAACTGCTTCGAACATTAAGAGTAGGGTAAATCGCCAATCTGTTTTGGGCGCCATCACGTCTGCTCAGCAGAGACTCAAGTTGTATAACAAGGTTCCTCCAAATGGGCTTGTGCTATATACTGGAACAATCGTGACTGAAGATGGAAAAGAAAAGAAAGTTACAATTGATTTTGAGCCTTTCAGACCTATAAATGCTTCTCTCTATCTCTGCGACAACAAGTTCCATACGGAAGCCCTGAATGAACTTCTAGAATCTGATGACAAGTTTGGCTTCATTGTCATGGATGGTAATGGAACACTTTTTGGGACATTGAGTGGTAACACACGTGAAGTCCTTCACAAATTTAGCGTCGACCTTCCTAAGAAACATGGAAGAGGAGGTCAGTCAGCACTTAGGTTTGCCCGTCTTCGGATGGAGAAACGGCACAACTATGTCAGGAAAACAGCAGAGCTTGCAACCCAGTTCTATATTAATCCGGCCACTAGTCAACCCAATGTTGCAGGATTGATACTGGCTGGATCAGCCGACTTCAAAACAGAGCTCAGCCAGTCCGACATGTTTGATCCTCGTCTTCAGGCTAAAATACTTAATGTGGTTGATGTCTCTTATGGAGGGGAAAATGGATTTAATCAGGCTATTGAATTGTCGTCCGAGATCTTGTCTAATGTAAAATTTATACAGGAGAAGCGTTTGATTGGAAAATACTTTGAAGAGATTAGCCAGGACACGGGGAAATATGTTTTTGGTGTTGACGACACACTGAAAGCTCTGGAGATGGGTGCTGTTGAGATACTCATTGTTTGGGAAAATTTGGATATCAATAGGTACGTATTAAAGAATGTTTCCACTGGTGAGGTTATTATAAAGCACTTGAATAAGGAACAGGAAGCCAATCAGAGCAACTTCCGTGACCCCATCACCGCTGCTGAATTGGAGGTTCAAGAAAAAATGGCCTTGCTGGAATGGTTTGCAAACGAGTACAAGAAGTTTGGTTGTACCCTGGAATTCGTTACGAACAAATCACAGGAAGGATCACAATTCTGCAGAGGTTTTGGTGGTATTGGAGGAATTCTTCGTTACCAGCTTGACATAAGATCGTTTGATGAACTATCCGATGGGGAAGAGTATGGTGATTCTGAATAGCAATCAATCAATCACTCCTCGTTGCTGGTGCAGATGGGGGATAAAGAGCTTGCACCAGCGCCCAGGGTTCTCGGAGGTTAGTTTTGGATTGAACTTTGTGAAGGCTTTA
TATAATGAACTTTGTCTTATGGTTACATGCTCAATAATCTACCTTTTGTACTCGTAGGGGTTATGTCCTTTTGATCGTGATTTGGCGCCTATCTGATAGATCTTCAAGGATGCGTGCTCCCATCTATAGCCACAGTGCTCTGCCCTTGTTCTCGTGGAATGGTTGCTTCCTACAGCAGATACTCAGTTACAAAAATATTTCAAGTATTACAATAATATTCAATGCCGTATGACCTTAAATTTTTGAGTTTGTGTCGTTTTGACTATTTATCGTTGCTGATAAACCATGTCAGTTGGATTTGAATTCATATGCATGTTACAGTGTTTTAAGGTTGTTTTGCTGGGTATGGTTTTTGGATATCTAAAAACATTGTGTTTGGAATTAATAGGAAACGGCATCAGGAACTAGGCTACTTTTTTAAAGATGCATTTAAAGCGTTATTATACTTAGCCCAAGTGTGTTAGGTATTCCTTTAGGAAGTGCTATTGGAAAAGTTTGTTTTAGTTTGACGCATGGATGTGGTTTTGAACCTTTGCCCATGTCGGGTGATAATTTGTTTTGATGAAGTGCTTTTACCAAAGTTATTTACTTTTGGAAGTGCTTATAAATAACAGTCCAAGTGTGACAGTTTGGTGTGACTTTAGAAGTGCATATGGGTGATTTGGAGTACCCTCGTAAAAGTGTATTTAAAATGTTTTAAAGAGTATTTATATCTTATGGAAGTGTTTCAAATTTGCACACCTTTTTTTTAAATCTTTTTTAAATCTTCTCAAAAGGTTTCGAGAAGGACAGCATTGATGAATGAATGGCAATGGAAGAAAACTTCGTTAGCTTATGAAACAGCAATCTACTTGCATTACAAAAACAAGAACTAACTTCGAGATAAAATCTCGAATGGCTGTAAATTCATCCTTTGATTGTTCTTTTAAATCTTTGGATATCATCTTTTTGTCTCTACTATTAGACATTGTCTTCTCATCTATATCTTTTGAGGTATTATCTTTGAAAAGATTCAAATTTGCAGGCAACTGAGCGTCGAGAATTTTTTTTTTTTTTTTTTTTTTAAGAAAAGGAAGCACAACTGTTCTAATCTCGCAACCTCTAGAGTTTAATGGCCGCCTGTAACATGAAGACTTTGGTTAATTGATCACACAACGTAATAATAGGAAGGAATTGACATAATAAACCAATATATTAGGAATGTTCCCATTAATTAAGCATTGAAATCTTAGAACCCCCTAAAATAGATTATTTGATTTACTTTGTTAATAAAATACTACTATTACTAAATTTAACTTCACTTCATACGGTAATTATAGAATGTAGAGACCTTAATACCTATAATCTGCTCCGCAATGACTTCAACTTGTGAATGACGTACAATCTATAGCCTAAGATCTCAGATCTATCATTCATGTAGAAAGAAGTCTTTAAGTACGTTGCCAGCAGATAAAGATGTATGCAGCTTACAGGCGGGGAAGAACATAATTCAAATTTCAGACTACAACAAAACTAACCCAAGAGGAGAAATGGTATGGCCAAGTTATGGATACATCAGAGTAAGGTAAACGAGAAGAAATGAGTTAAGAAATGCATGCCCAAGAGAGTGAAAAAAAGTAAAGAATTGAGAACAGATAGACAACCTCGCAGGTCCAGTTCTCGCTCCTTAATGGCATTGAAGAAATGAGGGCTTTTCCAGATTAAATCCACTCTGGGTCTCACCATGTTCTTCTGACTTGAATTCACTGGACGACGATGGGGTTTCGGGGTTTAAGGGTTTAGGGCGTCGACGACAAACGACAGCAAGCAGCGCTGTGTGGATCGGAACACGAGACGTCCACATTTCGGACCGGACCAACCCCCTTAATGGACTTTCAGACGACCCAAATGAATTCACATTTGGGTCTCTCCATCTATGTGGCTTGTGTATTAAAGTTCTAAACTTTTTAAGAAATTAGACTGAATTCACATTTGGGTTAATTCATTATTCATTTTGTTAG
TGAAAACTAATCGAGTATTAATCTCGTCAAACTGGGATTTCCCGCCCAAATCAGGTTAGGTCGCAGTGTTTACCTTTTCTAACCCAGCTACACTAATAATGGACTGGGTCCTCGTACATCGCTACACATATCTGCCCTACTCGTTTCTTTTCTCTTCTTCTCTCCTTGGCTGGCAACTTTCACCTCGTGGGTAATCATTGAAGAGTCATCAGGGTAGTGGGAGGGGTTGTAACAGCATGAGCGGCGGAGAAAAGAACGAGAACGAGGAGGATGTGAAAAAGAAACCAACGAAGGGGGAAAAGCGCCGGGGCCGAAGTCCGTCCAAAAGGGAACCAATCGCTGACATTGAAGATATTATGTTCAGCATAGATAAAGTTCAGACAATGAGGTCATCGCTATTGGATTGGTACGACCTTAGCCATAGGGACCTTCCTTGGAGGAGGTTGGACAAAGGGCAGCCTCAAACACGGGGTTATGGTGTGTGGGTATCAGAAATAATGCTGCAGCAGACGAGAGTTCAGACCGTCGTTGAATATTACAAGCGTTGGATGCATAAATGGCCCACCGTTCAACATCTCTCTCGTGCTTCTCTCGAGGTTGGTTTTATGATTTAGTTTGAGCTACAATATTGTCTTCTTTACTCTCAGTAAGCAGTTTAGATCTCTGGCAGGAGGTGAATGAAATGTGGGCAGGCTTGGGATACTATAGACGAGCTCGTTTTCTTTTAGAGGTAATCTTTACTCGCTGCACTAATGGAATTTTTGGTCTCTTAATATCTGCCACCACGATCACTCCTTCCAATTTCCTTGGGACATGGTTTAGGTTGGCATTTACTTTTAAATATTAGCTTCGTTTGTTAAGAGGGCAGCTTTTTGTTTTGAATCTTTCTCAAGATAGAGGTAGGAGAGGTGGGAAACTTTTCTACCACAACTGGTTGATTGCTGGGAAAACACACACTGGTTTGCTTTACTATAGCTTATATCACTGTCTCAATCTCCATCTACTCTTTCATGACTTATTACGGCCCTGTCAGCCTAGGCGAAGATGAGGATAAGTAATAAAGAGTTGGACGGTTCAGCGACAACTGTTGGGACTCATATGGGTCTGCTATCCTACCATTTTAACATTAGAATAGTATTTAAGAAAGGTGATGTACAAGATAACCCTAGTATTCAGGGTATTGAAATAACAACCTAACTTGTGGTCTCCCTTCCAACAAGCTTCTCAAACTCCCTTACACCCTATAAGCTCCTTCCTCAAAGACATACTCAGAAAACCCCATGCTCGCTCTTGTTTCCTAAGGCTTCATGATTCTAGGCTTTATAAGACTCCCTTGATGGAGGGAGATTTACTCTTTTATCAAGTGTACTGATTTATTATATGTCTCCTTTTGCTTTGCCTGTGTCGATAAGGAATTTGAAAAGACCATGAGGCACTTTTTTCTCTTGGGTCTCTTTTTGGAAGGTTCCTTGGATTGCAAATCAGTTATTGGCCAATTGATTTCTCATATTATATGAACTTTGATATCAGAAAGATATGTTAGCAGTGAGTCAGTGACATGTCTGGGACAGTTGCTAAGATCATCGAAACTTGAAAGTTTGTATTAAAAGATGTTTAAAAGATGAGGAGTTAACTGGATGTTGAAGCTTTCTTGAGTTATCACGTGGAAAAGTTCTATCTAGAGAGGAAGATTGTAGAAACTAGATATTTCAGGATGAGACTCCGTCAAATCTTTGCTAAAACATTTAGTGGGAGCTGCTGAAATCTTGAAATTTGAAGAAGGTGAATATGTTGTCATGGTGTTTATGGAGAGGGTAGAGCGATCTTGAGGATAGAATGAGAGCGATCCTGTACATGAATGAAACTGGACCACATTCGAATGAGAGCAATCCTGAGGATAGGATGGTTAAAATCGGTTTAAAAGAATTGAAAGTTACTTACCTATACCAACAAGGTGCACTTTCTTTTTCTTTTTGGTGGCTCAATCATAGGAACTCCGA
AGTTAGCAAGCTTTTCTTAGAGCAGTTCTATGTCAGGTGACCTCCTCATAATTTTCCTAGGATGCATGTGAGTGATAACAAAGCATGCTGAAAGGTCCCGTATTGGTCTGTAGGGATAGTCTTCACTCTTAGAAGCAATAAGTAAGCAACATACCCATGTTGTAGGAGTGCAGAGTAATGTCGAGGCACATAGGCGTTACACTTCGACTGTAATGCACACCATATGCATACATGCTAGTGGGTTTTCTTTTCTACTTGTATATTCTTCAGTAAATTGACTGCATGTTACTTATCAAAATATTACAAAATAAAGTACAAAAAGAACTGAAATGAAAAATATATATATATATATCAACATCTCTGGATATTAATCATTGCTATGAAATTTTGAATTACCCTTTCTAATTTTGTTGTTAGGGTGCAAAGTTGATAGTCAAAGAAGGTGGCGAATTTCCTAAAACGGTTTCTGCCCTTCGAAAAATTCCTGGAATTGGAGAGTATACAGCAGGGGCTATTGCCTCCATAGCATTTGATGAAGTGAGTGTTTTTCTCGCCTATTTTTTTCTCTTACTCGTTTTAAAAACCTTGAGGGAAAGTCTAAAGAGGACAATATCTGCTAGCGGTGGGTCTGGGCCGTTACAATGCACTTCTAAATATGTTTCCTGAGCAGGTGGTGCCTGTGGTTGATGGTAATGTGATTCGGGTAATCGCTCGATTAAAGGCTATTTCAGGAAATCCGAAAGACTCAAAGTTGGTTAAGCAAGTTTGGTGAGCATACTTTCTAGTTGTTCTAAAGGCTTTTATAGAATCTTTTTATCCTAATTGTTCTCATTGTTTTTTTTGGGGTGGGGGGTGGATGAGTTAGGAAGGCAGCTGCTCAATTAGTTGATCCTTCCAGGCCTGGGGACTTCAATCAGGCACTCATGGAACTTGGTGCAACTTTATGCACACCAACAAGCCCAAGCTGCTCAACATGCCCTGTGTTTGATCACTGTGAGGCCCTTTCAATCTCAAAGGATGATAGTTCAGTTCTTGTCACAGATTATCCCGCTAAGGGTATAAAGACCAAACAAAGACATGATTACTCTGCTGTTTGTGTGGTTGAGATATTGGAAAATCAGGGCTCATCTGAGTTGAAGCAATCCAGTAGATTTCTTCTTGTAAAGAGGCCTGATGAAGGCTTGCTTGCTGGCCTATGGGAGTTCCCATCTGTCTTGTTGAAGGGAGAAGCTGATTCAAGTACAAGGAGAGAATCCATGAACAGCCTCTTGAGTAAATCCTTTGGACTTGAACCAAAAAAGAATTTTGATATCGTTATTAGAGAAGATGTTGGAGATTTTATCCATGTTTTCTCGCACATCCGTCTCAAGATATATGTTGAACACTTGGTGTTACGTTTAAAAGGTTCGAGTTACTTCTCCTTTCTTGCCTATCTGTGCATTGAGTATGCCATGAACTCTGTGACTAGAAACTTGCTATAAAACTTAACCTATATGAGATTTACTTATCTTATAGGCTCTCAAACATTTTGATGTAGCTATAGTTGTCCTTTATGTTCTTTTGTGTGACACCCCACATCGGTTGGAGAGGGGAACGAAACATTCCTTATAAGGGTGTGGAAACCTCTTCCTAGTAGACGTGTGTTAAAACCTTGAGGGAAAGCCTAAAAGAGGACAATATCTAGTAGCGGTGGGCTTGGATTGTTACAAATGGTATCAGAGGCAGACATTGGGCAGTGTGCCAATGGGGACGTTGGGCCCTAAGGAGAGTGGATTGTGAGATTCCAATTCTTATAAGGGTGTGAAAACCTCTCCCTAATAGACATGTGTTATTTCTTTTCATTTCAATTAATTTGCTTAATCTGTTTGTATGTGTTCTTGAAGGTGAAGGTAGCAAGTTGTTTCGGAAACAGGAGAAGAAATCTATATCATGGAAATGTGTGGACAACAAGGTTATGTCCAGCGTGGGGTTGACGTCCAGTGTGAGGAAGGTAAGCAAAG
ATGGTGCTTTAGATGACTTCTCCTTGTGAAATATTTAACTTGAAGGTTCTATTTTCATCGTTAAACTTGAAAGAAAACTCTTTTTAGATTGCATGATTGCAGACATTTAGTGTAACACCCAAACTCACGGCTAGCAAATATTGTCATTTTTAGGTTTTCCTTTAAGGACTTTCCTTTAAGGTTTTAAGACGCCTCTACTAGAGAGAGGTTTCCACGCCCTTATAAGAAATTCTTCGCTCCCCCCTTCAACCGATGTGAGATCTCACAATCCACCTCCCTTACGGGCCAGTGTCCCCGATGGCACACCACTCGGTGTCTGGCTCTAATATCATTTGTAACAGCTCAAGCCCACCACTAGCAGATATTGCCCACTTTGGCATGTTACGTATCATCGTCAGCCTCACGGTTTTAAAATTTGTCTACTATGGAAAGGGTCTAGCCTTACTCCGACTAGTGCCTTGCACAGTTTGGTGACTGGCTCTGATACCATTTGTAATAGCTCAAGGCCCACCGTTAGCAGATATTGTCCTGTTTAGGCTTTCCCTTCAAGGGTTTCCCCTCAAAGTTTAAAAACGCGTCTACTAGGGAGAGATTTTCACACCCTTACAAGGAATGCTTCGTTCTCCTCTCAAACCAATGTGAGATCTCACATTTAGTATTCCTTCGAACAAACAAGTCAATACTCATATAGTCTCGATGATGACTGGTATGAACCTTAACAACACTTGAGATTCAAAATCATATATACGTTGGATTAAGTGACATTGAGAAAATCAGTAAACAGGGACTTCTCTGATCACATGATCACTCTCTTTGAACCCCCTCGTGCATGCATATTTATTTATGGATTGCATAGAGATTTGATATTTGTAATCCAAGTTCGATGGACGTCAAAATTCACAAACACAGTATCCTTTTACAGTCCATGATATCATAATGAGTATGTAATTATGTATGATCAGGACATCGAGGGAGGTCGATGTACGTTTCGTTTTCTCACTGTATTTATATAGAATTAGTGTTACTAAGCATGATTATACTCAAGAGGTCCAGATGATTAATAATCTGACATCTTTTTGATAGGTGTATGCCATGGTGGAGAAATTTGAGGCAGATAAGATATCTCCCATCCGTGCAGTAGCCACAAAAAAACAGAGAGCTACTTCAACAAACTTGAGCTGCAGGAGCTGTTGACCTTAATCAAAGTAAGCTAATATATCCATTAGAAAAATAGTAAACTATCGGGGTCGAACCGACTGAACCATCGGTGTTTTGTTGGTTTAGTGAACCAATGATAGTATGACTAAAAAAAATTATTTTTTTTTCTTCGGTGACTCGTATATAAAATTAAAAATTTAAAGAGAATTGTTAAATATTTGTAAAGGGATAAAAGAAAAAACGTGAATCACGAT

mRNA sequence

AGCCCAGCCCAACAACTTCCACCCTGTCTTTTGAAGAAACCCTAAATCTCCCTAATTCATTCTTCCATTTCTCAATCTCCCATTTCTCAATCTTAGATCTCTCTCTCATCTCTCCCGAGCAGAGTCTACGGACCCCGTTCCCCGATTGCTTCCCCAACTCACCAGCTCTCGTTTTCAATCCTTCTCAGATCTCTTCCGGTGGTCCGAGATCTCTCTGTCTCCCAACTCCAGCGGCATCTCTTACGAGCATAATCCCCCGTATTAGCGGCCGGCAGCAGCTCTCTCCCACGAACCCACATCGCGGCGGCTTCTTTCCCTCTCGAGCACCCATCCCCGCCGTATTCTCATTTTATTACGATGGCTGATGGTCATGAAACTGATAAGAACATTGAAATATGGAAAATTAAGAAGTTGATTAAAGCACTTGAAGCTGCAAGAGGCAATGGGACTAGCATGATCTCTCTCATCATGCCTCCGCGCGATCAAATATCTCGTGTTACTAAGATGCTTGGTGATGAATTTGGAACTGCTTCGAACATTAAGAGTAGGGTAAATCGCCAATCTGTTTTGGGCGCCATCACGTCTGCTCAGCAGAGACTCAAGTTGTATAACAAGGTTCCTCCAAATGGGCTTGTGCTATATACTGGAACAATCGTGACTGAAGATGGAAAAGAAAAGAAAGTTACAATTGATTTTGAGCCTTTCAGACCTATAAATGCTTCTCTCTATCTCTGCGACAACAAGTTCCATACGGAAGCCCTGAATGAACTTCTAGAATCTGATGACAAGTTTGGCTTCATTGTCATGGATGGTAATGGAACACTTTTTGGGACATTGAGTGGTAACACACGTGAAGTCCTTCACAAATTTAGCGTCGACCTTCCTAAGAAACATGGAAGAGGAGGTCAGTCAGCACTTAGGTTTGCCCGTCTTCGGATGGAGAAACGGCACAACTATGTCAGGAAAACAGCAGAGCTTGCAACCCAGTTCTATATTAATCCGGCCACTAGTCAACCCAATGTTGCAGGATTGATACTGGCTGGATCAGCCGACTTCAAAACAGAGCTCAGCCAGTCCGACATGTTTGATCCTCGTCTTCAGGCTAAAATACTTAATGTGGTTGATGTCTCTTATGGAGGGGAAAATGGATTTAATCAGGCTATTGAATTGTCGTCCGAGATCTTGTCTAATGTAAAATTTATACAGGAGAAGCGTTTGATTGGAAAATACTTTGAAGAGATTAGCCAGGACACGGGGAAATATGTTTTTGGTGTTGACGACACACTGAAAGCTCTGGAGATGGGTGCTGTTGAGATACTCATTGTTTGGGAAAATTTGGATATCAATAGGTACGTATTAAAGAATGTTTCCACTGGTGAGGTTATTATAAAGCACTTGAATAAGGAACAGGAAGCCAATCAGAGCAACTTCCGTGACCCCATCACCGCTGCTGAATTGGAGGTTCAAGAAAAAATGGCCTTGCTGGAATGGTTTGCAAACGAGTACAAGAAGTTTGGTTGTACCCTGGAATTCGTTACGAACAAATCACAGGAAGGATCACAATTCTGCAGAGGTTTTGGTGGTATTGGAGGAATTCTTCGTTACCAGCTTGACATAAGATCGTTTGATGAACTATCCGATGGGGAAGAGTATGGTGATTCTGAATAGCAATCAATCAATCACTCCTCGTTGCTGGTGCAGATGGGGGATAAAGAGCTTGCACCAGCGCCCAGGGTTCTCGGAGGGGTTATGTCCTTTTGATCGTGATTTGGCGCCTATCTGATAGATCTTCAAGGATGCGTGCTCCCATCTATAGCCACAGTGCTCTGCCCTTGTTCTCGTGGAATGAAAAGGAAGCACAACTGTTCTAATCTCGCAACCTCTAGAGTTTAATGGCCGCCTGTAACATGAAGACTTTGGTTAATTGATCACACAACGTAATAATAGGAAGGAATTGACATAATAAACCAATATATTAGGAATGTTCCCATTAATTAAGC
ATTGAAATCTTAGAACCCCCTAAAATAGATTATTTGATTTACTTTGTTAATAAAATACTACTATTACTAAATTTAACTTCACTTCATACGGTAATTATAGAATGTAGAGACCTTAATACCTATAATCTGCTCCGCAATGACTTCAACTTGTGAATGACGTACAATCTATAGCCTAAGATCTCAGATCTATCATTCATGTAGAAAGAAGTCTTTAAGTACGTTGCCAGCAGATAAAGATGTATGCAGCTTACAGGCGGGGAAGAACATAATTCAAATTTCAGACTACAACAAAACTAACCCAAGAGGAGAAATGGTATGGCCAAGTTATGGATACATCAGAGTAAGGTAAACGAGAAGAAATGAGTTAAGAAATGCATGCCCAAGAGAGTGAAAAAAAGTAAAGAATTGAGAACAGATAGACAACCTCGCAGGTCCAGTTCTCGCTCCTTAATGGCATTGAAGAAATGAGGGCTTTTCCAGATTAAATCCACTCTGGGTCTCACCATGTTCTTCTGACTTGAATTCACTGGACGACGATGGGGTTTCGGGGTTTAAGGGTTTAGGGCGTCGACGACAAACGACAGCAAGCAGCGCTGTGTGGATCGGAACACGAGACGTCCACATTTCGGACCGGACCAACCCCCTTAATGGACTTTCAGACGACCCAAATGAATTCACATTTGGGTCTCTCCATCTATGTGGCTTGTGTATTAAAGTTCTAAACTTTTTAAGAAATTAGACTGAATTCACATTTGGGTTAATTCATTATTCATTTTGTTAGTGAAAACTAATCGAGTATTAATCTCGTCAAACTGGGATTTCCCGCCCAAATCAGGTTAGGTCGCAGTGTTTACCTTTTCTAACCCAGCTACACTAATAATGGACTGGGTCCTCGTACATCGCTACACATATCTGCCCTACTCGTTTCTTTTCTCTTCTTCTCTCCTTGGCTGGCAACTTTCACCTCGTGGGTAATCATTGAAGAGTCATCAGGGTAGTGGGAGGGGTTGTAACAGCATGAGCGGCGGAGAAAAGAACGAGAACGAGGAGGATGTGAAAAAGAAACCAACGAAGGGGGAAAAGCGCCGGGGCCGAAGTCCGTCCAAAAGGGAACCAATCGCTGACATTGAAGATATTATGTTCAGCATAGATAAAGTTCAGACAATGAGGTCATCGCTATTGGATTGGTACGACCTTAGCCATAGGGACCTTCCTTGGAGGAGGTTGGACAAAGGGCAGCCTCAAACACGGGGTTATGGTGTGTGGGTATCAGAAATAATGCTGCAGCAGACGAGAGTTCAGACCGTCGTTGAATATTACAAGCGTTGGATGCATAAATGGCCCACCGTTCAACATCTCTCTCGTGCTTCTCTCGAGGAGGTGAATGAAATGTGGGCAGGCTTGGGATACTATAGACGAGCTCGTTTTCTTTTAGAGGGTGCAAAGTTGATAGTCAAAGAAGGTGGCGAATTTCCTAAAACGGTTTCTGCCCTTCGAAAAATTCCTGGAATTGGAGAGAAGGCAGCTGCTCAATTAGTTGATCCTTCCAGGCCTGGGGACTTCAATCAGGCACTCATGGAACTTGGTGCAACTTTATGCACACCAACAAGCCCAAGCTGCTCAACATGCCCTGTGTTTGATCACTGTGAGGCCCTTTCAATCTCAAAGGATGATAGTTCAGTTCTTGTCACAGATTATCCCGCTAAGGGTATAAAGACCAAACAAAGACATGATTACTCTGCTGTTTGTGTGGTTGAGATATTGGAAAATCAGGGCTCATCTGAGTTGAAGCAATCCAGTAGATTTCTTCTTGTAAAGAGGCCTGATGAAGGCTTGCTTGCTGGCCTATGGGAGTTCCCATCTGTCTTGTTGAAGGGAGAAGCTGATTCAAGTACAAGGAGAGAATCCATGAACAGCCTCTTGAGTAAATCCTTTGGACTTGAACCAAAAAAGAATTTTGATATCGTTATTAGAGAAGATGTTGGAGATTTTATCCATGT
TTTCTCGCACATCCGTCTCAAGATATATGTTGAACACTTGGTGTTACGTTTAAAAGGTGAAGGTAGCAAGTTGTTTCGGAAACAGGAGAAGAAATCTATATCATGGAAATGTGTGGACAACAAGGTTATGTCCAGCGTGGGGTTGACGTCCAGTGTGAGGAAGGTGTATGCCATGGTGGAGAAATTTGAGGCAGATAAGATATCTCCCATCCGTGCAGTAGCCACAAAAAAACAGAGAGCTACTTCAACAAACTTGAGCTGCAGGAGCTGTTGACCTTAATCAAAGTAAGCTAATATATCCATTAGAAAAATAGTAAACTATCGGGGTCGAACCGACTGAACCATCGGTGTTTTGTTGGTTTAGTGAACCAATGATAGTATGACTAAAAAAAATTATTTTTTTTTCTTCGGTGACTCGTATATAAAATTAAAAATTTAAAGAGAATTGTTAAATATTTGTAAAGGGATAAAAGAAAAAACGTGAATCACGAT

Coding sequence (CDS)

ATGAGCGGCGGAGAAAAGAACGAGAACGAGGAGGATGTGAAAAAGAAACCAACGAAGGGGGAAAAGCGCCGGGGCCGAAGTCCGTCCAAAAGGGAACCAATCGCTGACATTGAAGATATTATGTTCAGCATAGATAAAGTTCAGACAATGAGGTCATCGCTATTGGATTGGTACGACCTTAGCCATAGGGACCTTCCTTGGAGGAGGTTGGACAAAGGGCAGCCTCAAACACGGGGTTATGGTGTGTGGGTATCAGAAATAATGCTGCAGCAGACGAGAGTTCAGACCGTCGTTGAATATTACAAGCGTTGGATGCATAAATGGCCCACCGTTCAACATCTCTCTCGTGCTTCTCTCGAGGAGGTGAATGAAATGTGGGCAGGCTTGGGATACTATAGACGAGCTCGTTTTCTTTTAGAGGGTGCAAAGTTGATAGTCAAAGAAGGTGGCGAATTTCCTAAAACGGTTTCTGCCCTTCGAAAAATTCCTGGAATTGGAGAGAAGGCAGCTGCTCAATTAGTTGATCCTTCCAGGCCTGGGGACTTCAATCAGGCACTCATGGAACTTGGTGCAACTTTATGCACACCAACAAGCCCAAGCTGCTCAACATGCCCTGTGTTTGATCACTGTGAGGCCCTTTCAATCTCAAAGGATGATAGTTCAGTTCTTGTCACAGATTATCCCGCTAAGGGTATAAAGACCAAACAAAGACATGATTACTCTGCTGTTTGTGTGGTTGAGATATTGGAAAATCAGGGCTCATCTGAGTTGAAGCAATCCAGTAGATTTCTTCTTGTAAAGAGGCCTGATGAAGGCTTGCTTGCTGGCCTATGGGAGTTCCCATCTGTCTTGTTGAAGGGAGAAGCTGATTCAAGTACAAGGAGAGAATCCATGAACAGCCTCTTGAGTAAATCCTTTGGACTTGAACCAAAAAAGAATTTTGATATCGTTATTAGAGAAGATGTTGGAGATTTTATCCATGTTTTCTCGCACATCCGTCTCAAGATATATGTTGAACACTTGGTGTTACGTTTAAAAGGTGAAGGTAGCAAGTTGTTTCGGAAACAGGAGAAGAAATCTATATCATGGAAATGTGTGGACAACAAGGTTATGTCCAGCGTGGGGTTGACGTCCAGTGTGAGGAAGGTGTATGCCATGGTGGAGAAATTTGAGGCAGATAAGATATCTCCCATCCGTGCAGTAGCCACAAAAAAACAGAGAGCTACTTCAACAAACTTGAGCTGCAGGAGCTGTTGA

Protein sequence

MSGGEKNENEEDVKKKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDLSHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLEEVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGEKAAAQLVDPSRPGDFNQALMELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSVGLTSSVRKVYAMVEKFEADKISPIRAVATKKQRATSTNLSCRSC
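
The protein sequence above corresponds to the coding sequence read codon by codon. As a minimal sketch, assuming the standard nuclear genetic code applies (the page does not state which translation table was used), the relationship can be checked on the first and last codons of the CDS. The `translate` function is a hypothetical helper written for this illustration, not part of any database tooling.

```python
# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[pos:pos + 3]]
        if aa == "*":  # stop codon: end of the polypeptide
            break
        protein.append(aa)
    return "".join(protein)

# First ten codons and last six codons of the CDS listed above.
print(translate("ATGAGCGGCGGAGAAAAGAACGAGAACGAG"))
print(translate("AGCTGCAGGAGCTGTTGA"))
```

The two fragments translate to `MSGGEKNENE` and `SCRSC`, which match the start and end of the protein sequence shown above.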
Homology
BLAST of Cp4.1LG12g10040 vs. ExPASy Swiss-Prot
Match: F4JRF4 (Adenine DNA glycosylase OS=Arabidopsis thaliana OX=3702 GN=MYH PE=3 SV=1)

HSP 1 Score: 385.2 bits (988), Expect = 9.5e-106
Identity = 216/448 (48.21%), Positives = 284/448 (63.39%), Query Frame = 0

Query: 5   EKNENEEDVKKKPTKGEKRRGRSPSKREPI-ADIEDIMFSIDKVQTMRSSLLDWYDLSHR 64
           +K E EE+ +++    E+    + ++ E +  DIED +FS ++ Q +R  LLDWYD++ R
Sbjct: 89  DKEEAEEESEEEE---EEEEEEAEAEEEALGGDIED-LFSENETQKIRMGLLDWYDVNKR 148

Query: 65  DLPWR-RLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLE-- 124
           DLPWR R  + + + R Y VWVSEIMLQQTRVQTV++YYKRWM KWPT+  L +ASLE  
Sbjct: 149 DLPWRNRRSESEKERRAYEVWVSEIMLQQTRVQTVMKYYKRWMQKWPTIYDLGQASLENL 208

Query: 125 -----------------EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIP 184
                            EVNEMWAGLGYYRRARFLLEGAK++V     FP   S+L K+ 
Sbjct: 209 IVSRSRELSFLRGNEKKEVNEMWAGLGYYRRARFLLEGAKMVVAGTEGFPNQASSLMKVK 268

Query: 185 GIGE----------------------------------------------KAAAQLVDPS 244
           GIG+                                              K AAQLVDPS
Sbjct: 269 GIGQYTAGAIASIAFNEAVPVVDGNVIRVLARLKAISANPKDRLTARNFWKLAAQLVDPS 328

Query: 245 RPGDFNQALMELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQR 304
           RPGDFNQ+LMELGATLCT + PSCS+CPV   C A S+S+++ ++ VTDYP K IK K R
Sbjct: 329 RPGDFNQSLMELGATLCTVSKPSCSSCPVSSQCRAFSLSEENRTISVTDYPTKVIKAKPR 388

Query: 305 HDYSAVCVVEILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRES 364
           HD+  VCV+EI         +   RF+LVKRP++GLLAGLWEFPSV+L  EADS+TRR +
Sbjct: 389 HDFCCVCVLEI---HNLERNQSGGRFVLVKRPEQGLLAGLWEFPSVILNEEADSATRRNA 448

Query: 365 MNSLLSKS--FGLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRK 384
           +N  L ++  F +E KK   IV RE++G+F+H+F+HIR K+YVE LV++L G    LF+ 
Sbjct: 449 INVYLKEAFRFHVELKKACTIVSREELGEFVHIFTHIRRKVYVELLVVQLTGGTEDLFKG 508
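
The identity and positives percentages in each HSP header are the match counts divided by the alignment length. A minimal sketch of that arithmetic, using the counts reported for the F4JRF4 hit above, follows. The function name `percent` is hypothetical, and rounding to two decimals is an assumption made to mirror the displayed values.

```python
def percent(matches, alignment_length):
    """Return matches as a percentage of the alignment length, to 2 decimals."""
    if alignment_length <= 0:
        raise ValueError("alignment length must be positive")
    return round(100.0 * matches / alignment_length, 2)

# Counts taken from the HSP header of the F4JRF4 match.
identity = percent(216, 448)
positives = percent(284, 448)
print(identity, positives)
```

Positives count identical residues plus conservative substitutions, so the positives percentage is never lower than the identity percentage for the same HSP.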

BLAST of Cp4.1LG12g10040 vs. ExPASy Swiss-Prot
Match: Q99P21 (Adenine DNA glycosylase OS=Mus musculus OX=10090 GN=Mutyh PE=1 SV=2)

HSP 1 Score: 236.1 bits (601), Expect = 7.1e-61
Identity = 161/431 (37.35%), Positives = 208/431 (48.26%), Query Frame = 0

Query: 14  KKKPTKGEKRRGRSPS---------------KREPIADIE----DIMFSIDKVQTMRSSL 73
           KK+P   ++RR R+ S               KRE +         +   +  V   RS+L
Sbjct: 12  KKQPANHKRRRTRALSSSQAKPSSLDGLAKQKREELLQASVSPYHLFSDVADVTAFRSNL 71

Query: 74  LDWYDLSHRDLPWRRLDKGQPQT--RGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQ 133
           L WYD   RDLPWR L K +  +  R Y VWVSE+MLQQT+V TV++YY RWM KWP +Q
Sbjct: 72  LSWYDQEKRDLPWRNLAKEEANSDRRAYAVWVSEVMLQQTQVATVIDYYTRWMQKWPKLQ 131

Query: 134 HLSRASLEEVNEMWAGLGYYRRARFLLEGAKLIVKE-GGEFPKTVSALRK-IPGIGE--- 193
            L+ ASLEEVN++W+GLGYY R R L EGA+ +V+E GG  P+T   L++ +PG+G    
Sbjct: 132 DLASASLEEVNQLWSGLGYYSRGRRLQEGARKVVEELGGHMPRTAETLQQLLPGVGRYTA 191

Query: 194 -------------------------------------------KAAAQLVDPSRPGDFNQ 253
                                                        A QLVDP+RPGDFNQ
Sbjct: 192 GAIASIAFDQVTGVVDGNVLRVLCRVRAIGADPTSTLVSHHLWNLAQQLVDPARPGDFNQ 251

Query: 254 ALMELGATLCTPTSPSCSTCPVFDHCEA-------------------------------- 313
           A MELGAT+CTP  P CS CPV   C A                                
Sbjct: 252 AAMELGATVCTPQRPLCSHCPVQSLCRAYQRVQRGQLSALPGRPDIEECALNTRQCQLCL 311

Query: 314 LSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGSSELKQSSRFLLVKRPDEG 344
            S S  D S+ V ++P K  +   R +YSA CVVE     G   +      LLV+RPD G
Sbjct: 312 TSSSPWDPSMGVANFPRKASRRPPREEYSATCVVEQPGAIGGPLV------LLVQRPDSG 371

BLAST of Cp4.1LG12g10040 vs. ExPASy Swiss-Prot
Match: Q9UIF7 (Adenine DNA glycosylase OS=Homo sapiens OX=9606 GN=MUTYH PE=1 SV=1)

HSP 1 Score: 231.1 bits (588), Expect = 2.3e-59
Identity = 150/385 (38.96%), Positives = 191/385 (49.61%), Query Frame = 0

Query: 44  IDKVQTMRSSLLDWYDLSHRDLPWRRL--DKGQPQTRGYGVWVSEIMLQQTRVQTVVEYY 103
           + +V   R SLL WYD   RDLPWRR   D+     R Y VWVSE+MLQQT+V TV+ YY
Sbjct: 87  VAEVTAFRGSLLSWYDQEKRDLPWRRRAEDEMDLDRRAYAVWVSEVMLQQTQVATVINYY 146

Query: 104 KRWMHKWPTVQHLSRASLEEVNEMWAGLGYYRRARFLLEGAKLIVKE-GGEFPKTVSALR 163
             WM KWPT+Q L+ ASLEEVN++WAGLGYY R R L EGA+ +V+E GG  P+T   L+
Sbjct: 147 TGWMQKWPTLQDLASASLEEVNQLWAGLGYYSRGRRLQEGARKVVEELGGHMPRTAETLQ 206

Query: 164 K-IPGIGEKAAA----------------------------------------------QL 223
           + +PG+G   A                                               QL
Sbjct: 207 QLLPGVGRYTAGAIASIAFGQATGVVDGNVARVLCRVRAIGADPSSTLVSQQLWGLAQQL 266

Query: 224 VDPSRPGDFNQALMELGATLCTPTSPSCSTCPVFDHCEA--------------LSISKD- 283
           VDP+RPGDFNQA MELGAT+CTP  P CS CPV   C A              LS S D 
Sbjct: 267 VDPARPGDFNQAAMELGATVCTPQRPLCSQCPVESLCRARQRVEQEQLLASGSLSGSPDV 326

Query: 284 --------------------DSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGSSELK 343
                               D ++ V ++P K  +   R + SA CV   LE  G+    
Sbjct: 327 EECAPNTGQCHLCLPPSEPWDQTLGVVNFPRKASRKPPREESSATCV---LEQPGA---- 386

BLAST of Cp4.1LG12g10040 vs. ExPASy Swiss-Prot
Match: Q8R5G2 (Adenine DNA glycosylase OS=Rattus norvegicus OX=10116 GN=Mutyh PE=2 SV=1)

HSP 1 Score: 223.8 bits (569), Expect = 3.7e-57
Identity = 160/463 (34.56%), Positives = 221/463 (47.73%), Query Frame = 0

Query: 9   NEEDVKKKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDLSHRDLPWR 68
           +    K     G  ++ R    + P++    +   I  V   R +LL WYD   RDLPWR
Sbjct: 27  SSSQAKPSGLDGLAKQKREELLKTPVSPYH-LFSDIADVTAFRRNLLSWYDQEKRDLPWR 86

Query: 69  RLDKGQP--QTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLEEVNEMW 128
           +  K +     R Y VWVSE+MLQQT+V TV++YY RWM KWPT+Q L+ ASLEEVN++W
Sbjct: 87  KRVKEETNLDRRAYAVWVSEVMLQQTQVATVIDYYTRWMQKWPTLQDLASASLEEVNQLW 146

Query: 129 AGLGYYRRARFLLEGAKLIVKE-GGEFPKTVSALRK-IPGIGEKAAA------------- 188
           +GLGYY R R L EGA+ +V+E GG  P+T   L++ +PG+G   A              
Sbjct: 147 SGLGYYSRGRRLQEGARKVVEELGGHVPRTAETLQQLLPGVGRYTAGAIASIAFDQVTGV 206

Query: 189 ---------------------------------QLVDPSRPGDFNQALMELGATLCTPTS 248
                                            QLVDP+RPGDFNQA MELGAT+CTP  
Sbjct: 207 VDGNVIRVLCRVRAIGADPTSSFVSHHLWDLAQQLVDPARPGDFNQAAMELGATVCTPQR 266

Query: 249 PSCSTCPVFDHC-----------EALSISKD---------------------DSSVLVTD 308
           P C+ CPV   C            AL  S D                     D ++ V +
Sbjct: 267 PLCNHCPVQSLCRAHQRVGQGRLSALPGSPDIEECALNTRQCQLCLPSTNPWDPNMGVVN 326

Query: 309 YPAKGIKTKQRHDYSAVCVVEILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLK 368
           +P K  +   R +YSA CVVE     G   +      LLV+RP+ GLLAGLWEFPSV L 
Sbjct: 327 FPRKASRRPPREEYSATCVVEQPGATGGPLI------LLVQRPNSGLLAGLWEFPSVTL- 386

Query: 369 GEADSSTRRESMNSLLSKSFGLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLK 390
              + S + +    L        P     +   + +G+ IHVFSHI+L   V  L   L+
Sbjct: 387 ---EPSGQHQHKALLQELQHWSAPLPTTPL---QHLGEVIHVFSHIKLTYQVYSLA--LE 446

BLAST of Cp4.1LG12g10040 vs. ExPASy Swiss-Prot
Match: Q10159 (Adenine DNA glycosylase OS=Schizosaccharomyces pombe (strain 972 / ATCC 24843) OX=284812 GN=myh1 PE=1 SV=1)

HSP 1 Score: 168.7 bits (426), Expect = 1.4e-40
Identity = 135/459 (29.41%), Positives = 206/459 (44.88%), Query Frame = 0

Query: 46  KVQTMRSSLLDWYDLSHRDLPWRRL------------DKGQPQTRGYGVWVSEIMLQQTR 105
           +V+  R SL+ +YD + R LPWR+             D  QP  R Y V VSEIMLQQTR
Sbjct: 17  EVERFRESLIQFYDKTKRILPWRKKECIPPSEDSPLEDWEQPVQRLYEVLVSEIMLQQTR 76

Query: 106 VQTVVEYYKRWMHKWPTVQHLSRASLE-EVNEMWAGLGYYRRARFLLEGAKLIVK-EGGE 165
           V+TV  YY +WM   PT++  + A    +V  +W+G+G+Y R + L +  + + K    E
Sbjct: 77  VETVKRYYTKWMETLPTLKSCAEAEYNTQVMPLWSGMGFYTRCKRLHQACQHLAKLHPSE 136

Query: 166 FPKTVSALRK-IPGIGE------------------------------------------- 225
            P+T     K IPG+G                                            
Sbjct: 137 IPRTGDEWAKGIPGVGPYTAGAVLSIAWKQPTGIVDGNVIRVLSRALAIHSDCSKGKANA 196

Query: 226 ---KAAAQLVDPSRPGDFNQALMELGATLCTPTSPSCSTCPVFDHCEAL---SISKDDSS 285
              K A +LVDP RPGDFNQALMELGA  CTP SP CS CP+ + C+A    ++ +D ++
Sbjct: 197 LIWKLANELVDPVRPGDFNQALMELGAITCTPQSPRCSVCPISEICKAYQEQNVIRDGNT 256

Query: 286 V-------------------------LVTDYPAKGIKTKQRHDYSAVCVVEILENQGSSE 345
           +                         +V  YP    KTKQR + + V +      Q +  
Sbjct: 257 IKYDIEDVPCNICITDIPSKEDLQNWVVARYPVHPAKTKQREERALVVIF-----QKTDP 316

Query: 346 LKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNFDI 405
             +   FL+ KRP  GLLAGLW+FP++    E    +  + M++   KS       +   
Sbjct: 317 STKEKFFLIRKRPSAGLLAGLWDFPTI----EFGQESWPKDMDAEFQKSIAQWISNDSRS 376

Query: 406 VIR--EDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSV 407
           +I+  +  G ++H+FSHIR   +V + +         +   ++   IS   +++  M  +
Sbjct: 377 LIKKYQSRGRYLHIFSHIRKTSHVFYAI-----ASPDIVTNEDFFWISQSDLEHVGMCEL 436

BLAST of Cp4.1LG12g10040 vs. NCBI nr
Match: XP_023547668.1 (adenine DNA glycosylase [Cucurbita pepo subsp. pepo])

HSP 1 Score: 806 bits (2081), Expect = 4.88e-293
Identity = 418/464 (90.09%), Positives = 418/464 (90.09%), Query Frame = 0

Query: 1   MSGGEKNENEEDVKKKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDL 60
           MSGGEKNENEEDVKKKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDL
Sbjct: 1   MSGGEKNENEEDVKKKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDL 60

Query: 61  SHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLE 120
           SHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLE
Sbjct: 61  SHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLE 120

Query: 121 EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGE------------- 180
           EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGE             
Sbjct: 121 EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGEYTAGAIASIAFDE 180

Query: 181 ---------------------------------KAAAQLVDPSRPGDFNQALMELGATLC 240
                                            KAAAQLVDPSRPGDFNQALMELGATLC
Sbjct: 181 VVPVVDGNVIRVIARLKAISGNPKDSKLVKQVWKAAAQLVDPSRPGDFNQALMELGATLC 240

Query: 241 TPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGS 300
           TPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGS
Sbjct: 241 TPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGS 300

Query: 301 SELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNF 360
           SELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNF
Sbjct: 301 SELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNF 360

Query: 361 DIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSV 418
           DIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSV
Sbjct: 361 DIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSV 420

BLAST of Cp4.1LG12g10040 vs. NCBI nr
Match: XP_022953220.1 (adenine DNA glycosylase [Cucurbita moschata])

HSP 1 Score: 779 bits (2012), Expect = 1.59e-282
Identity = 401/464 (86.42%), Positives = 411/464 (88.58%), Query Frame = 0

Query: 1   MSGGEKNENEEDVKKKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDL 60
           MSGGEK+ENEEDVKKKPTKGEKRRGRSPSKREPI DIEDIMFSIDKVQTMRS LLDWYDL
Sbjct: 1   MSGGEKSENEEDVKKKPTKGEKRRGRSPSKREPITDIEDIMFSIDKVQTMRSPLLDWYDL 60

Query: 61  SHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLE 120
           SHRDLPWRRLDKGQP+TRGYGVWVSEIMLQQTRVQTVVEYYKRWMH+WPTVQHLSRASLE
Sbjct: 61  SHRDLPWRRLDKGQPETRGYGVWVSEIMLQQTRVQTVVEYYKRWMHRWPTVQHLSRASLE 120

Query: 121 EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGE------------- 180
           EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGE             
Sbjct: 121 EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGEYTAGAIASIAFDE 180

Query: 181 ---------------------------------KAAAQLVDPSRPGDFNQALMELGATLC 240
                                            KAAAQLVDPSRPGDFNQALMELGATLC
Sbjct: 181 VVPVVDGNVIRVIARLKAISRNPKDSKLVKQVWKAAAQLVDPSRPGDFNQALMELGATLC 240

Query: 241 TPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGS 300
           TPTSPSCSTCPVFDHCEALS SKDDSSVLVTDYPAKG+KTKQRHDYSAVCVVEILENQG+
Sbjct: 241 TPTSPSCSTCPVFDHCEALSSSKDDSSVLVTDYPAKGVKTKQRHDYSAVCVVEILENQGT 300

Query: 301 SELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNF 360
           SELKQSSRFLLVKRPDEGLLAGLWEFPSVLL GEADSSTRRES+NSLLSKSFGLEPKKNF
Sbjct: 301 SELKQSSRFLLVKRPDEGLLAGLWEFPSVLLNGEADSSTRRESINSLLSKSFGLEPKKNF 360

Query: 361 DIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSV 418
           +IVIREDVGDF+HVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSS+
Sbjct: 361 EIVIREDVGDFVHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSM 420

BLAST of Cp4.1LG12g10040 vs. NCBI nr
Match: KAG6576094.1 (Adenine DNA glycosylase, partial [Cucurbita argyrosperma subsp. sororia])

HSP 1 Score: 776 bits (2004), Expect = 2.63e-281
Identity = 400/464 (86.21%), Positives = 411/464 (88.58%), Query Frame = 0

Query: 1   MSGGEKNENEEDVKKKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDL 60
           MSGGEK+ENEEDVK KPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDL
Sbjct: 1   MSGGEKSENEEDVKTKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDL 60

Query: 61  SHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLE 120
           SHRDLPWRRLDKGQP+TRGYGVWVSEIMLQQTRVQTVVEYYKRWMH+WPTVQHLSRASLE
Sbjct: 61  SHRDLPWRRLDKGQPETRGYGVWVSEIMLQQTRVQTVVEYYKRWMHRWPTVQHLSRASLE 120

Query: 121 EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGE------------- 180
           EVNEMWAGLGYYRRARFLLEGAKLIV+EGGEFPKTVSALRKIPGIGE             
Sbjct: 121 EVNEMWAGLGYYRRARFLLEGAKLIVEEGGEFPKTVSALRKIPGIGEYTAGAIASIAFDE 180

Query: 181 ---------------------------------KAAAQLVDPSRPGDFNQALMELGATLC 240
                                            KAAAQLVDPSRPGDFNQALMELGATLC
Sbjct: 181 VVPVVDGNVIRVIARLKAISGNPKDSKLVKQVWKAAAQLVDPSRPGDFNQALMELGATLC 240

Query: 241 TPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGS 300
           TPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQ +
Sbjct: 241 TPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQST 300

Query: 301 SELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNF 360
           SELKQSSRFLLVKRPDEGLLAGLWEFPSVLL GEADSSTRRES+NSLLSKSFGLEPKKNF
Sbjct: 301 SELKQSSRFLLVKRPDEGLLAGLWEFPSVLLNGEADSSTRRESINSLLSKSFGLEPKKNF 360

Query: 361 DIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSV 418
           +IVIREDVGDF+HVFSHIRLKIYVEHLV+RLKGEGSKLFRKQEKKSISWKCVD KVMSS+
Sbjct: 361 EIVIREDVGDFVHVFSHIRLKIYVEHLVIRLKGEGSKLFRKQEKKSISWKCVDKKVMSSM 420

BLAST of Cp4.1LG12g10040 vs. NCBI nr
Match: XP_022991840.1 (adenine DNA glycosylase [Cucurbita maxima])

HSP 1 Score: 775 bits (2001), Expect = 7.52e-281
Identity = 399/464 (85.99%), Positives = 409/464 (88.15%), Query Frame = 0

Query: 1   MSGGEKNENEEDVKKKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDL 60
           MSGGEKNEN EDVKKKPTKGEKRRGRSPSKREPI DIEDIMFSIDKVQTMRSSLLDWYDL
Sbjct: 1   MSGGEKNENHEDVKKKPTKGEKRRGRSPSKREPIVDIEDIMFSIDKVQTMRSSLLDWYDL 60

Query: 61  SHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLE 120
           SHRDLPWRRLDKGQP+TRGYGVWVSEIMLQQTRVQTVVEYYKRWMH+WPTVQHLSRASLE
Sbjct: 61  SHRDLPWRRLDKGQPETRGYGVWVSEIMLQQTRVQTVVEYYKRWMHRWPTVQHLSRASLE 120

Query: 121 EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGE------------- 180
           EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTV  LRKIPGIGE             
Sbjct: 121 EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVPDLRKIPGIGEYTAGAIASIAFDE 180

Query: 181 ---------------------------------KAAAQLVDPSRPGDFNQALMELGATLC 240
                                            KAAAQLVDPSRPGDFNQALMELGATLC
Sbjct: 181 VVPVVDGNVIRVIARLKAISGNPKDSKLVKQVWKAAAQLVDPSRPGDFNQALMELGATLC 240

Query: 241 TPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGS 300
           TPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVE+LEN+G+
Sbjct: 241 TPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEMLENRGT 300

Query: 301 SELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNF 360
           SELKQ SRFLLVKRPDEGLLAGLWEFPSVLL GEADSSTRRES+NSLLSKSFGLEPKKNF
Sbjct: 301 SELKQCSRFLLVKRPDEGLLAGLWEFPSVLLNGEADSSTRRESINSLLSKSFGLEPKKNF 360

Query: 361 DIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSV 418
           +IVIREDVGDF+HVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSS+
Sbjct: 361 EIVIREDVGDFVHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSM 420

BLAST of Cp4.1LG12g10040 vs. NCBI nr
Match: KAG7014611.1 (Adenine DNA glycosylase [Cucurbita argyrosperma subsp. argyrosperma])

HSP 1 Score: 769 bits (1987), Expect = 2.55e-279
Identity = 391/427 (91.57%), Positives = 403/427 (94.38%), Query Frame = 0

Query: 1   MSGGEKNENEEDVKKKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDL 60
           MSGGEK+ENEEDVK KPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDW+DL
Sbjct: 1   MSGGEKSENEEDVKTKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWFDL 60

Query: 61  SHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLE 120
           SHRDLPWRRLDKGQP+TRGYGVWVSEIMLQQTRVQTVVEYYKRWMH+WPTVQHLSRASLE
Sbjct: 61  SHRDLPWRRLDKGQPETRGYGVWVSEIMLQQTRVQTVVEYYKRWMHRWPTVQHLSRASLE 120

Query: 121 EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGEKAAAQLVDPSRPG 180
           EVNEMWAGLGYYRRARFLLEGAKLIV+EGGEFPKTVSALRKIPGIGEKAAAQLVDPSRPG
Sbjct: 121 EVNEMWAGLGYYRRARFLLEGAKLIVEEGGEFPKTVSALRKIPGIGEKAAAQLVDPSRPG 180

Query: 181 DFNQALMELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDY 240
           DFNQALMELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDY
Sbjct: 181 DFNQALMELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDY 240

Query: 241 SAVCVVEILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNS 300
           SAVCVVEILENQ +SELKQSSRFLLVKRPDEGLLAGLWEFPSVLL GEADSSTRRES+NS
Sbjct: 241 SAVCVVEILENQSTSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLNGEADSSTRRESINS 300

Query: 301 LLSKSFGLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLF------- 360
           LLSKSFGLEPKKNF+IVIREDVGDF+HVFSHIRLKIYVEHLV+RLKG     F       
Sbjct: 301 LLSKSFGLEPKKNFEIVIREDVGDFVHVFSHIRLKIYVEHLVIRLKGLSYFSFLAYLCIE 360

Query: 361 --RKQEKKSISWKCVDNKVMSSVGLTSSVRKVYAMVEKFEADKISPIRAVATKKQRATST 418
                EKKSISWKCVD KVMSS+GLTSSVRKVYAMVEKFEA+KISP  AVATKKQR TST
Sbjct: 361 YAMNSEKKSISWKCVDKKVMSSMGLTSSVRKVYAMVEKFEAEKISPSPAVATKKQRPTST 420

BLAST of Cp4.1LG12g10040 vs. ExPASy TrEMBL
Match: A0A6J1GP17 (Adenine DNA glycosylase OS=Cucurbita moschata OX=3662 GN=LOC111455831 PE=3 SV=1)

HSP 1 Score: 779 bits (2012), Expect = 7.68e-283
Identity = 401/464 (86.42%), Positives = 411/464 (88.58%), Query Frame = 0

Query: 1   MSGGEKNENEEDVKKKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDL 60
           MSGGEK+ENEEDVKKKPTKGEKRRGRSPSKREPI DIEDIMFSIDKVQTMRS LLDWYDL
Sbjct: 1   MSGGEKSENEEDVKKKPTKGEKRRGRSPSKREPITDIEDIMFSIDKVQTMRSPLLDWYDL 60

Query: 61  SHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLE 120
           SHRDLPWRRLDKGQP+TRGYGVWVSEIMLQQTRVQTVVEYYKRWMH+WPTVQHLSRASLE
Sbjct: 61  SHRDLPWRRLDKGQPETRGYGVWVSEIMLQQTRVQTVVEYYKRWMHRWPTVQHLSRASLE 120

Query: 121 EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGE------------- 180
           EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGE             
Sbjct: 121 EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGEYTAGAIASIAFDE 180

Query: 181 ---------------------------------KAAAQLVDPSRPGDFNQALMELGATLC 240
                                            KAAAQLVDPSRPGDFNQALMELGATLC
Sbjct: 181 VVPVVDGNVIRVIARLKAISRNPKDSKLVKQVWKAAAQLVDPSRPGDFNQALMELGATLC 240

Query: 241 TPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGS 300
           TPTSPSCSTCPVFDHCEALS SKDDSSVLVTDYPAKG+KTKQRHDYSAVCVVEILENQG+
Sbjct: 241 TPTSPSCSTCPVFDHCEALSSSKDDSSVLVTDYPAKGVKTKQRHDYSAVCVVEILENQGT 300

Query: 301 SELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNF 360
           SELKQSSRFLLVKRPDEGLLAGLWEFPSVLL GEADSSTRRES+NSLLSKSFGLEPKKNF
Sbjct: 301 SELKQSSRFLLVKRPDEGLLAGLWEFPSVLLNGEADSSTRRESINSLLSKSFGLEPKKNF 360

Query: 361 DIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSV 418
           +IVIREDVGDF+HVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSS+
Sbjct: 361 EIVIREDVGDFVHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSM 420

BLAST of Cp4.1LG12g10040 vs. ExPASy TrEMBL
Match: A0A6J1JU23 (Adenine DNA glycosylase OS=Cucurbita maxima OX=3661 GN=LOC111488366 PE=3 SV=1)

HSP 1 Score: 775 bits (2001), Expect = 3.64e-281
Identity = 399/464 (85.99%), Positives = 409/464 (88.15%), Query Frame = 0

Query: 1   MSGGEKNENEEDVKKKPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRSSLLDWYDL 60
           MSGGEKNEN EDVKKKPTKGEKRRGRSPSKREPI DIEDIMFSIDKVQTMRSSLLDWYDL
Sbjct: 1   MSGGEKNENHEDVKKKPTKGEKRRGRSPSKREPIVDIEDIMFSIDKVQTMRSSLLDWYDL 60

Query: 61  SHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLE 120
           SHRDLPWRRLDKGQP+TRGYGVWVSEIMLQQTRVQTVVEYYKRWMH+WPTVQHLSRASLE
Sbjct: 61  SHRDLPWRRLDKGQPETRGYGVWVSEIMLQQTRVQTVVEYYKRWMHRWPTVQHLSRASLE 120

Query: 121 EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGE------------- 180
           EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTV  LRKIPGIGE             
Sbjct: 121 EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVPDLRKIPGIGEYTAGAIASIAFDE 180

Query: 181 ---------------------------------KAAAQLVDPSRPGDFNQALMELGATLC 240
                                            KAAAQLVDPSRPGDFNQALMELGATLC
Sbjct: 181 VVPVVDGNVIRVIARLKAISGNPKDSKLVKQVWKAAAQLVDPSRPGDFNQALMELGATLC 240

Query: 241 TPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEILENQGS 300
           TPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVE+LEN+G+
Sbjct: 241 TPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVVEMLENRGT 300

Query: 301 SELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSFGLEPKKNF 360
           SELKQ SRFLLVKRPDEGLLAGLWEFPSVLL GEADSSTRRES+NSLLSKSFGLEPKKNF
Sbjct: 301 SELKQCSRFLLVKRPDEGLLAGLWEFPSVLLNGEADSSTRRESINSLLSKSFGLEPKKNF 360

Query: 361 DIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSV 418
           +IVIREDVGDF+HVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSS+
Sbjct: 361 EIVIREDVGDFVHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCVDNKVMSSM 420

BLAST of Cp4.1LG12g10040 vs. ExPASy TrEMBL
Match: A0A1S3CBT2 (Adenine DNA glycosylase OS=Cucumis melo OX=3656 GN=LOC103498904 PE=3 SV=1)

HSP 1 Score: 660 bits (1704), Expect = 6.07e-236
Identity = 350/464 (75.43%), Positives = 375/464 (80.82%), Query Frame = 0

Query: 1   MSGGEKNENEEDVKKK--------PTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRS 60
           MS GEKNENEE+VKKK        PT   KRR RSPSK E + DIEDIMFSID VQT+R+
Sbjct: 1   MSDGEKNENEENVKKKTDFRRKKKPTTKRKRRSRSPSKSEAVVDIEDIMFSIDNVQTIRA 60

Query: 61  SLLDWYDLSHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQ 120
           SLLDWYD S RDLPWR LDKG+P+TR YGVWVSEIMLQQTRVQTVV++Y RWM KWPTVQ
Sbjct: 61  SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 120

Query: 121 HLSRASLEEVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGE----- 180
           HLSRASLEEVNEMWAGLGYYRRARFL EGAK+IVKEGG FPKTVS+LRKIPGIGE     
Sbjct: 121 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPKTVSSLRKIPGIGEYTAGA 180

Query: 181 -----------------------------------------KAAAQLVDPSRPGDFNQAL 240
                                                    KAAAQLVD SRPGDFNQAL
Sbjct: 181 IASIAFGEVVPVVDGNVIRVIARLKAISGNPKDPKLIKQVWKAAAQLVDLSRPGDFNQAL 240

Query: 241 MELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300
           MELGATLCTPT+PSCSTCPVFDHCEALSISK DSSVLVTDYPAKGIKTKQRHDYSAVCVV
Sbjct: 241 MELGATLCTPTNPSCSTCPVFDHCEALSISKRDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300

Query: 301 EILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSF 360
           EILE+QG+SEL QSSRFLLVKRPDEGLLAGLWEFPSV L GEADSSTRRES++SLLSK+F
Sbjct: 301 EILESQGTSELGQSSRFLLVKRPDEGLLAGLWEFPSVSLDGEADSSTRRESIDSLLSKNF 360

Query: 361 GLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCV 408
           GLEPKKNF+IV REDVGDFIHVF+HIRLKIYVEHLVL LKGEGSKLFRKQEKKSI WKCV
Sbjct: 361 GLEPKKNFEIVNREDVGDFIHVFTHIRLKIYVEHLVLCLKGEGSKLFRKQEKKSILWKCV 420

BLAST of Cp4.1LG12g10040 vs. ExPASy TrEMBL
Match: A0A5A7T8X3 (Adenine DNA glycosylase OS=Cucumis melo var. makuwa OX=1194695 GN=E6C27_scaffold122G001490 PE=3 SV=1)

HSP 1 Score: 654 bits (1686), Expect = 8.25e-233
Identity = 350/489 (71.57%), Positives = 376/489 (76.89%), Query Frame = 0

Query: 1   MSGGEKNENEEDVKKK--------PTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRS 60
           MS GEKNENEE+VKKK        PT   KRRGRSPSK E + DIEDIMFSID VQT+R+
Sbjct: 1   MSDGEKNENEENVKKKTDFRRKKKPTTERKRRGRSPSKSEAVVDIEDIMFSIDNVQTIRA 60

Query: 61  SLLDWYDLSHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQ 120
           SLLDWYD S RDLPWR LDKG+P+TR YGVWVSEIMLQQTRVQTVV++Y RWM KWPTVQ
Sbjct: 61  SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 120

Query: 121 HLSRASLEEVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGE----- 180
           HLSRASLEEVNEMWAGLGYYRRARFL EGAK+IVKEGG FPKTVS+LRKIPGIGE     
Sbjct: 121 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPKTVSSLRKIPGIGEYTAGA 180

Query: 181 ------------------------------------------------------------ 240
                                                                       
Sbjct: 181 IASIAFGEVSAFLVYFFSILNSQGTLLNMFPNQVVPVVDGNVIRVIARLKAISGNPKDPK 240

Query: 241 ------KAAAQLVDPSRPGDFNQALMELGATLCTPTSPSCSTCPVFDHCEALSISKDDSS 300
                 KAAAQLVDPSRPGDFNQALMELGATLCTPT+PSCSTCPVFDHCEALSISK DSS
Sbjct: 241 LIKQVWKAAAQLVDPSRPGDFNQALMELGATLCTPTNPSCSTCPVFDHCEALSISKRDSS 300

Query: 301 VLVTDYPAKGIKTKQRHDYSAVCVVEILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFP 360
           VLVTDYPAKGIKTKQRHDYSAVCVVEILE+QG+SEL QSSRFLLVKRPDEGLLAGLWEFP
Sbjct: 301 VLVTDYPAKGIKTKQRHDYSAVCVVEILESQGTSELGQSSRFLLVKRPDEGLLAGLWEFP 360

Query: 361 SVLLKGEADSSTRRESMNSLLSKSFGLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHL 408
           SV L GEAD +TRRES++SLLSK+FGLEPKKNF+IV REDVGDFIHVF+HIRLKIYVEHL
Sbjct: 361 SVSLDGEADLTTRRESIDSLLSKNFGLEPKKNFEIVNREDVGDFIHVFTHIRLKIYVEHL 420

BLAST of Cp4.1LG12g10040 vs. ExPASy TrEMBL
Match: A0A0A0KC27 (Adenine DNA glycosylase OS=Cucumis sativus OX=3659 GN=Csa_6G088720 PE=3 SV=1)

HSP 1 Score: 649 bits (1675), Expect = 1.11e-230
Identity = 344/464 (74.14%), Positives = 371/464 (79.96%), Query Frame = 0

Query: 1   MSGGEKNENEEDVKK--------KPTKGEKRRGRSPSKREPIADIEDIMFSIDKVQTMRS 60
           MS GEKNEN+E +KK        KPT   KRRGRSPSK E + DIEDIMFSID VQT+R+
Sbjct: 55  MSDGEKNENDEYMKKNTDFRRKKKPTTERKRRGRSPSKSEAVVDIEDIMFSIDNVQTIRA 114

Query: 61  SLLDWYDLSHRDLPWRRLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQ 120
           SLLDWYD S RDLPWR LDKG+P+TR YGVWVSEIMLQQTRVQTVV++Y RWM KWPTVQ
Sbjct: 115 SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 174

Query: 121 HLSRASLEEVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIPGIGE----- 180
           HLSRASLEEVNEMWAGLGYYRRARFL EGAK+IVKEGG FP+TVS+LRKIPGIGE     
Sbjct: 175 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPRTVSSLRKIPGIGEYTAGA 234

Query: 181 -----------------------------------------KAAAQLVDPSRPGDFNQAL 240
                                                    KAAAQLVD SRPGDFNQAL
Sbjct: 235 IASIAFGEVVPVVDGNVIRVIARLKAISGNPKDPKLIKQVWKAAAQLVDLSRPGDFNQAL 294

Query: 241 MELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300
           MELGATLCTPT+PSCSTCPVFDHCEALSISK DSSVLVTDYPAKGIK KQRHDYSAVCVV
Sbjct: 295 MELGATLCTPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKIKQRHDYSAVCVV 354

Query: 301 EILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRESMNSLLSKSF 360
           EILE+QG+ EL QSSRFLLVKRPDEGLLAGLWEFPSV L GEAD STRRES+NSLLSK+F
Sbjct: 355 EILESQGTPELGQSSRFLLVKRPDEGLLAGLWEFPSVSLDGEADLSTRRESINSLLSKNF 414

Query: 361 GLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRKQEKKSISWKCV 408
           GLE KKNF+IV REDVGDFIH+F+HIRLKIYVEHLVL LKGEGSKLFRKQEKKSI WKCV
Sbjct: 415 GLEAKKNFEIVNREDVGDFIHIFTHIRLKIYVEHLVLCLKGEGSKLFRKQEKKSILWKCV 474

BLAST of Cp4.1LG12g10040 vs. TAIR 10
Match: AT4G12740.1 (HhH-GPD base excision DNA repair family protein )

HSP 1 Score: 385.2 bits (988), Expect = 6.8e-107
Identity = 216/448 (48.21%), Positives = 284/448 (63.39%), Query Frame = 0

Query: 5   EKNENEEDVKKKPTKGEKRRGRSPSKREPI-ADIEDIMFSIDKVQTMRSSLLDWYDLSHR 64
           +K E EE+ +++    E+    + ++ E +  DIED +FS ++ Q +R  LLDWYD++ R
Sbjct: 89  DKEEAEEESEEEE---EEEEEEAEAEEEALGGDIED-LFSENETQKIRMGLLDWYDVNKR 148

Query: 65  DLPWR-RLDKGQPQTRGYGVWVSEIMLQQTRVQTVVEYYKRWMHKWPTVQHLSRASLE-- 124
           DLPWR R  + + + R Y VWVSEIMLQQTRVQTV++YYKRWM KWPT+  L +ASLE  
Sbjct: 149 DLPWRNRRSESEKERRAYEVWVSEIMLQQTRVQTVMKYYKRWMQKWPTIYDLGQASLENL 208

Query: 125 -----------------EVNEMWAGLGYYRRARFLLEGAKLIVKEGGEFPKTVSALRKIP 184
                            EVNEMWAGLGYYRRARFLLEGAK++V     FP   S+L K+ 
Sbjct: 209 IVSRSRELSFLRGNEKKEVNEMWAGLGYYRRARFLLEGAKMVVAGTEGFPNQASSLMKVK 268

Query: 185 GIGE----------------------------------------------KAAAQLVDPS 244
           GIG+                                              K AAQLVDPS
Sbjct: 269 GIGQYTAGAIASIAFNEAVPVVDGNVIRVLARLKAISANPKDRLTARNFWKLAAQLVDPS 328

Query: 245 RPGDFNQALMELGATLCTPTSPSCSTCPVFDHCEALSISKDDSSVLVTDYPAKGIKTKQR 304
           RPGDFNQ+LMELGATLCT + PSCS+CPV   C A S+S+++ ++ VTDYP K IK K R
Sbjct: 329 RPGDFNQSLMELGATLCTVSKPSCSSCPVSSQCRAFSLSEENRTISVTDYPTKVIKAKPR 388

Query: 305 HDYSAVCVVEILENQGSSELKQSSRFLLVKRPDEGLLAGLWEFPSVLLKGEADSSTRRES 364
           HD+  VCV+EI         +   RF+LVKRP++GLLAGLWEFPSV+L  EADS+TRR +
Sbjct: 389 HDFCCVCVLEI---HNLERNQSGGRFVLVKRPEQGLLAGLWEFPSVILNEEADSATRRNA 448

Query: 365 MNSLLSKS--FGLEPKKNFDIVIREDVGDFIHVFSHIRLKIYVEHLVLRLKGEGSKLFRK 384
           +N  L ++  F +E KK   IV RE++G+F+H+F+HIR K+YVE LV++L G    LF+ 
Sbjct: 449 INVYLKEAFRFHVELKKACTIVSREELGEFVHIFTHIRRKVYVELLVVQLTGGTEDLFKG 508

The following BLAST results are available for this feature:
Match Name  E-value   Identity  Description
F4JRF4      9.5e-106  48.21     Adenine DNA glycosylase OS=Arabidopsis thaliana OX=3702 GN=MYH PE=3 SV=1
Q99P21      7.1e-61   37.35     Adenine DNA glycosylase OS=Mus musculus OX=10090 GN=Mutyh PE=1 SV=2
Q9UIF7      2.3e-59   38.96     Adenine DNA glycosylase OS=Homo sapiens OX=9606 GN=MUTYH PE=1 SV=1
Q8R5G2      3.7e-57   34.56     Adenine DNA glycosylase OS=Rattus norvegicus OX=10116 GN=Mutyh PE=2 SV=1
Q10159      1.4e-40   29.41     Adenine DNA glycosylase OS=Schizosaccharomyces pombe (strain 972 / ATCC 24843) O...

Match Name      E-value    Identity  Description
XP_023547668.1  4.88e-293  90.09     adenine DNA glycosylase [Cucurbita pepo subsp. pepo]
XP_022953220.1  1.59e-282  86.42     adenine DNA glycosylase [Cucurbita moschata]
KAG6576094.1    2.63e-281  86.21     Adenine DNA glycosylase, partial [Cucurbita argyrosperma subsp. sororia]
XP_022991840.1  7.52e-281  85.99     adenine DNA glycosylase [Cucurbita maxima]
KAG7014611.1    2.55e-279  91.57     Adenine DNA glycosylase [Cucurbita argyrosperma subsp. argyrosperma]

Match Name  E-value    Identity  Description
A0A6J1GP17  7.68e-283  86.42     Adenine DNA glycosylase OS=Cucurbita moschata OX=3662 GN=LOC111455831 PE=3 SV=1
A0A6J1JU23  3.64e-281  85.99     Adenine DNA glycosylase OS=Cucurbita maxima OX=3661 GN=LOC111488366 PE=3 SV=1
A0A1S3CBT2  6.07e-236  75.43     Adenine DNA glycosylase OS=Cucumis melo OX=3656 GN=LOC103498904 PE=3 SV=1
A0A5A7T8X3  8.25e-233  71.57     Adenine DNA glycosylase OS=Cucumis melo var. makuwa OX=1194695 GN=E6C27_scaffold...
A0A0A0KC27  1.11e-230  74.14     Adenine DNA glycosylase OS=Cucumis sativus OX=3659 GN=Csa_6G088720 PE=3 SV=1

Match Name   E-value   Identity  Description
AT4G12740.1  6.8e-107  48.21     HhH-GPD base excision DNA repair family protein

InterPro
Analysis Name: InterPro Annotations of Cucurbita pepo (Zucchini) v1
Date Performed: 2021-10-25
IPR Term   IPR Description                          Source       Source Term  Source Description                            Alignment
IPR003265  HhH-GPD domain                           SMART        SM00478      endo3end                                      coord: 88..192; e-value: 7.3E-8; score: 42.1
IPR003265  HhH-GPD domain                           PFAM         PF00730      HhH-GPD                                       coord: 84..175; e-value: 5.7E-14; score: 52.5
IPR003265  HhH-GPD domain                           CDD          cd00056      ENDO3c                                        coord: 80..190; e-value: 5.2324E-29; score: 109.253
None       No IPR available                         GENE3D       3.90.79.10   Nucleoside Triphosphate Pyrophosphohydrolase  coord: 233..394; e-value: 4.4E-32; score: 112.5
None       No IPR available                         GENE3D       1.10.340.30  Hypothetical protein; domain 2                coord: 64..175; e-value: 1.1E-37; score: 131.1
None       No IPR available                         MOBIDB_LITE  mobidb-lite  disorder_prediction                           coord: 1..31
None       No IPR available                         MOBIDB_LITE  mobidb-lite  disorder_prediction                           coord: 1..18
IPR029119  MutY, C-terminal                         PFAM         PF14815      NUDIX_4                                       coord: 259..383; e-value: 2.3E-13; score: 50.0
IPR029119  MutY, C-terminal                         CDD          cd03431      DNA_Glycosylase_C                             coord: 237..387; e-value: 1.11541E-19; score: 81.9995
IPR044298  Adenine/Thymine-DNA glycosylase          PANTHER      PTHR42944    ADENINE DNA GLYCOSYLASE                       coord: 25..167; coord: 168..402
IPR011257  DNA glycosylase                          SUPERFAMILY  48150        DNA-glycosylase                               coord: 47..213
IPR015797  NUDIX hydrolase-like domain superfamily  SUPERFAMILY  55811        Nudix                                         coord: 228..343

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name       Unique Name        Type
Cp4.1LG12g10040.1  Cp4.1LG12g10040.1  mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category            Term Accession  Term Name
biological_process  GO:0006281      DNA repair
biological_process  GO:0006284      base-excision repair
cellular_component  GO:0016021      integral component of membrane
molecular_function  GO:0051539      4 iron, 4 sulfur cluster binding
molecular_function  GO:0003677      DNA binding
molecular_function  GO:0046872      metal ion binding
molecular_function  GO:0000701      purine-specific mismatch base pair DNA N-glycosylase activity
molecular_function  GO:0003824      catalytic activity
molecular_function  GO:0016798      hydrolase activity, acting on glycosyl bonds