Cla007083 (gene) Watermelon (97103) v1

Name: Cla007083
Type: gene
Organism: Citrullus lanatus (Watermelon (97103) v1)
Description: A/G-specific adenine glycosylase (AHRD V1 *-*- F4JRF4_ARATH); contains InterPro domain(s) IPR005760 A/G-specific adenine glycosylase MutY, bacterial form
Location: Chr6 : 1585469 .. 1597670 (+)
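
The location field can be read programmatically. The following is a minimal sketch, not part of the database record: the helper names `parse_location` and `span_length` are hypothetical, and it assumes the coordinates are 1-based and inclusive (as in GFF-style annotation), which the page itself does not state.

```python
import re

def parse_location(text):
    """Parse a string like 'Chr6 : 1585469 .. 1597670 (+)'.

    Returns (chromosome, start, end, strand). The format is inferred from
    this record only; other records may differ.
    """
    m = re.match(r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)", text)
    if m is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    chrom, start, end, strand = m.groups()
    return chrom, int(start), int(end), strand

def span_length(start, end):
    """Length of the genomic span, assuming 1-based inclusive coordinates."""
    return end - start + 1

chrom, start, end, strand = parse_location("Chr6 : 1585469 .. 1597670 (+)")
print(chrom, strand, span_length(start, end))
```

Under that assumption the feature spans 12,202 bp on the forward strand of Chr6. If the coordinates were instead half-open, the span would be one base shorter.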
The following sequences are available for this feature:

Gene sequence (with intron)

ATGAGCGGCGGAGAAAAGAACGAGAACGAGGAGTTTGTGAAGCAAAATACTGATTTTCGTCGGGAAAAGAAACCAACGAAGGAACGAAAACGGCGGGGTCGAAGTCCGTCTAAAAGGGAAGCAGATGTTGACATTGAAGATATTATGTTCAGCATAGACAATGTTCAGATAATCCGGGCATCGTTATTGGAATGGTACGACCGTAGCTGCAGGGACCTTCCATGGAGGAGATTGGACAAAGGACAACCTGAAACACGGGCTTACGGTGTGTGGGTTTCAGAAATAATGCTGCAGCAGACGAGAGTTCAGACCGTCGTTCACTTTTACAACCGTTGGATGCTTAGATGGCCCACCGTTCAACATCTCTCTCGTGCTTCTCTTGAGGTTTGTTTTTCTTTGAGCTACATTACTGTCTTCTTTACTCTTGGTAAATGAATTTCGATATCTGCCAGGAAGTTAATGAAATGTGGGCAGGCTTGGGGTACTACAGACGAGCTCGTTTTCTTTTGGAGGTAATCGTTATTCATTTCAGTTACCTTGGACATGATAGGTTTGATACACATTTAAATATGAGTTTCTGCATTTCACTTCTTTATTCGAACTGCAAGACAATGAACTTTTAGAAACAAAGTGTGAATAAGATAGGAGCAAAAGTGTTTCGGGAAACTTGGAGTGTCATTGAGTCATTCATTTCATTTGATAGGAGCAATAAAAGCTAAGTTGCAAAAGAAGAAAGAGCAAAGGCAAGGAATACCACCCTAAAACTTATATAAGTTGCTCTTTCAAAAAGGCTATCTAGGACGGTATGTGTTTACCATGAGATGAGGTCGAAGATTACAAAGGAACGAATAATAGAGAGAACTCGTATGCATTTCTTCAAAATTATGCTTCACTTTCCCAAGAAACACATGATTTTTTCCTCTCTATTTTCTTTCTTAACAAGTTTTGCATCCAGATATCCCCCATTTGCCAGAGCATATAAGAGAAAGTGAGTTAATGAGTTGACTTAGTATGGTCGTGTTTTGTATCTGTGTTTAAGTGTGAGTCCACTGCATATTAGTAAATAGTTAGTTGGCACAGAAATGAGGGAGCAAATATATGGGTATTCCTTTTTAAGAGGTCACACCTTTTGCTTTTAATATTTTTCAAGAAAGTTGAACATTTTTTTTTGTAAGTTACGGTTGCAAGTTTGAGGTGTCTTACGATTTGTGTTCTTCAGCTGCAATGCACACAGTAAGCATACATCCTAGAGGGATTTTCTTTCTACTCTTGTATATCTTTAGTAAATTGGTTGCTTTCAACTTTTCAAATATCAATGGAAGTTCATGTTTCTTGTGAAAATATATTAGTAAAATACAAAATAACTGAAGTGAAAATAATAGCTATAATAGGCATCTCGGGATAATTAATCATTGGTTATAAAATTCTGGATACTCCTTTCTTATTTTGTTGCTAGGGTGCAAAGATGATAGTCAAAGAAGGTGGTAGATTTCCTAAAACGGTTTCTGCCCTTCGAAAAATTCCTGGAATTGGAGAATATACAGCAGGGGCTATTGCCTCCATAGCATTCGATGAAGTGAGTGCTTTTCTGGCCTAATTTTTTTCCCTACTCCCAAGGAGCACTTCTAAATATGTTTCCTGAGCAGGTGGTGCCTGTGGTCGATGGTAATGTGATAAGGGTAATTGCTCGATTGAAGGCCATTCTAGGAAATCCAAAAGACCCAAAGTTGAACAAGCAAGTTTGGTGAGCTTTCCTCATTATTAGGATAACTGACTTTTTGACCACGTAACAAAAATGCAAGACTAGAGTCTGTCTTATGGGGGGAGGGGGGGCATGAAACTTTTATAGAATCTTTACATCCTAGTCATTCTTATTATCTTCATCTTGTCTTGATAACTTAAAAATTTATAAATATGTTTGATCGACTTGAGAGAAAATTTTAAAATGTTATTTTTTCCTTGCATTACCATCGTTTTTGTATTTGTTAATAATGCTGAAA
ATGATAAAATAGAACTTTTTTATATATTTGATTTATTCTAGTTTGAAGTTCATGCTCATTTAACCATTCTAATATAATTACCTTATCTCATCTCTCTCTATTTGTTGGGGGAGAGGAATTAGGAAGGCAGCTGCTCAATTAGTTGATCCTTCCAGGCCTGGGGACTTCAATCAGGCACTCATGGAACTTGGTGCTACTTTATGCAGTCCTACAAACCCAAGCTGCTCAACATGCCCTGTGTTTGATCACTGCGAGGCCCTTTCAATCTCAAAGCATGATAGTTCAGTTCTTGTCACAGATTATCCTGCTAAGGGGATAAAGACCAAACAAAGACATGATTATTCTGCTGTAAGTGTGGTTGAGATATTGGAAAACCAGGGTACATCTAAGTTAGAGCAATCTAGTAGATTTCTTCTTGTAAAGAGGCCTGATGAAGGTTTGCTTGCTGGTCTATGGGAGTTTCCATCCGTCTTGTTGGACGGAGAAGCTGATTTAAGTACAAGGAGAGAATCCATTAATAGCCTCTTGAGTAAATACTTTGGACTTGAACCAAAAAAGAATTTTGAAATAGTTATTAGAGAAGAGGTTGGAGATTTTATCCATGTTTTCACCCACATCCGTCTCAAGATATATGTTGAGCACTTGGTGTTATGTTTAAAAGGTTTGAGTTACTTCTCCTTTCATCCCTATCTGTGCATTAAGTATTCATAAACCAATGACTTGAACTTGCTATAAAACTTAACCTATATGAGATCTACTTATCTTATAGGCTGTCAAACGTCGTGTGCATGTTTATTATCTACTTCTATGTCACCTGTCGCGGATGGTATATTGTCAATTGTCATAGGTGGAAAACTATTACTGAAACCGGTCTATGAGGAAAACCATTAAATGAAGTTTTGATTTTTGAAACATTACCCTGTATGATTTTTAAGCTTCCAAATTCCAATTATCATATTAGTTAAGGATTGAATTAGTATTTTGAGTTTCTCATCGGTTGCCCTCTTAAATTTGTTTTGTCCTGCTTCTTCATTGATGCCTATTGTGCTATTTTCTGTGGCCTCCCTTGGCTTGAATGGGAGGTATATATTATTTTTCTCTCCTACAATTCTATTTTTCTTTCAAATGAAGTTTCTATTCCCTACNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCTACACATATAAAATAGGAATAAATAACTTCATAAATTTACATTCTAATCACCATTGAAATTAAATTGTATTCTGTTCATATTAGTGCTTGAAACTTCGTGGCTCTTAATTGTCAATAGTGTAACAAAGAAGCACTTGATGGTTTATTGTTTCTCGCATTGCGGTTACACTGTTTAAATTGCCTGAAACAAAACATCGATTAGTCCTTAGAATTCAGCTGCAACTTTTTACCTTGCCTGTTGCATCTGCGACATGTAACATCATATTGTTAAATACTTTTGAATATTTTAATTTTAAAATGGAATAGGTCATTGTGATTAGTATGGTAATTCATGATCTAAATTCTAATTTTCATTGTAAATTGACATCAACCACAATTTATTAGATTTAAAATGATTCTATTTTCATATTTATAGTAAAGGGTAATGTTGATGTCTAAAATTCATACAAGCACAATAGATTAATCTACTATAGAATAGGTACCGCTAGTTATTTCTCAATATTTTTAATTTTTAATGTCATCAATTCTTGTAAGTGGATTACTTCGAAAAAATTCTTAGTAAGTGATGAGAAGAATTATTAGATAGTATATTTGAGCTAATATTTATATAGGTGTTAATTTAGATTAATTAATGATTAATTAGTTATTGTCTTATTTTTTTGTATAAATAGCCTCTTTTGGGTTGTAATAAGAAG
CTTTTGGATATCCTTTGAAATAGAACTTTTGTTCTTTGGAGAACTTTCTCCCAATTTAGAGATGGATTTCCCTAACTTGGCCATGGAATTGGCCTAATACTTTTACATGTTGATCATGATGTCAACCTCTTAGATCATTTTGCAGCATTTCAAGTTGATAGTAATGAAAGCTGCTTTCAAGTTGATAACAATGAAAGCTGCTTACTTTCGTTTGGAATTGATTTTCATCAGTTTTCTTGAATCTTATGCTGTAGATAGACATGTTTTTCTGTGACTATTCAAAATACTTTGTGCATCAAAAGCTATGGATGACTATGGATTTTTTTTTTTCTTTGTTTTGGAAATTTTTGAATTGTGGTCCTGTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTCATTTCAGTTCAGTTATTTTACTTTCATGTATGTTTGCGTTCTTAAAGGTGAAGGTAGCAAGTTGTTTCAAAAACAGGAGAAGAAATCTATATTATGGAAATGTGTAGACAGCGAGGTTATGTCAAGCATGGGATTGACGTCCAGTGTGAGGAAGGTAAGCACAGATGGTGCGTTAGATGACTTCTCGTTGTGTTACTTCAATATTACTTTCATAATTATTTAGTCCCATGGGAACATAATTATATCAACCCTATAGTTCTAGTAAGTTCTGGCATCAGAACCAAATACTGTCAATTTCTCCACATTTTTCTCCTTCTCTCAGTCACATACACCCAAAGATCCATTTCTTCCCCCTGTTTTCTTGGGGGTTTTTTTTCTTCTGGGCACGCGGTTGAGATAGTGTTGAGATAAGGGAGTAGGCATTTCTTTTTTTTTACTTACTGCTGTTGCAATTGACATGATGGAGGTCAAGATTCGCAAATACAATTTCCCTTCATTAGCCTTAATTGGGACAGAATTACACATTGGTTATAACCATGCTTCTGAAGTTTAAGAATAATGAAGATGTGACTTCTGCACGGTCAGATGATTGAAGGCGGCCTACGTATATCGTTTGCTATTTATATTTAGAAAAAGAAAGAAAAAAAAATAGTGAAATTTAACTTGAGAAACGTCAATTTGGTTAATAATACGAAACTGAGGTATGAGTTCGTTCAACCATCTGTTGTATACACCAATTAGCAACATTCAATGTCAACTTTTCTCAATTAGTAGATCTTCCACTTGTTATATCTGTCAATGCAATCACAAATCTCCTTGATCATCCTTGTTAGTTTAGTTATTAAATGTGGGTTGTTTATGTATTGTTTAATTTAAGTATTTATGTTTTCGTAAGAGATTTGTTTTCTAATCATGTAAATATGTAAAGGGCAATCAGTTATTAGCAACTAATTTGACATGTTTTTGATAGGCCTATGCCATGGTCGAGAAATTTCAGGCAGAGAAGACATCTTCTAGCCGTGCAGTCCCCAGAAAAAAACAGAAAGCTTGAATTGCAGGAGCTGTTGACATTTACGATAAGTTTATTTGATGCTTTTCCCATTCGTCGATTGTTGTGTTTGTCCCACTAGTAATCGACAGCAGGACAACCTATTTTCATGAAAGTTGTAGATGATGAGGGAAAATAGGAGGGGAGATTCATGGGTCAGTTTTGTGTGGATGGGGATCAAACATTCTATCTCTAGAGAGGAAGACCACGTCAGTTATCATTTAACTAAGCTCACTTTGGTAGTTATGTTTTGGTTTTTCTTTGATAGGTAAGAATTAGATAAATAGATATTGTTGAATGACAAAAAGTTCAAGAATGTACCCATCCAGAATCCTCATACACCAAAATATTTGAACTCACTGGACTTTGGTTGTAAAGAAATAGTAATTTGATAAAACTAATATGCAAAGCATATCTAGGCTGTCAAATAGGCTTTGTAAATTGAGTTATGTACTCAAGTTAGTGGCTATACAAGGCGCATATGGAGTTTCATTAGCTTGGAATATTGTTCAAATGAAATCATGGAAAAAAAACTTGGAACTTTTATCTTT
TTTTTGTTTAAAATATTGGAGTTGGGATCCAATGAGCTAATGGAAGATTTTAGTGAATTTTCTATATTTTTATCATTTGAAGCAATCAGAAGATTATCTCCCATTTGCGTGGAAGAACACAGAAGGAAGAAGTATGGTAGAGTCAAATGTCTTTTCTAATCATTCAAAAGAAGCTCTAATGGAGTGGGAGTGCTTATAACAAAAAAAAAACTATATTTTGCATTGTAGGTCCCACAATTGTACAACTCTTTAAATTTAACAACTTCACTTGTTCATTCATTGCTTCAAAGTCCCTCCAATGTCTCATAAAGCTCAATAAACATAAATGACGAAAACTTCATTTTTACTTAAATTAAATAGATTAAGCATGATGATTCAACCAAGTTAACTAAAAAGTCCAATAATTAAGCTTAGTTTTAGAGTTAAATAACCCACATAAACAAGAGTTATCAATCTATCATTGATTGATGAACCCTTAAAATTGTGCCAAATTTATAAATACTTTAAGACTATGTAAGATTTGCTATTATTTATCATAAATTGTCATGCATGCAATTTTGTCTTATTTCTTAAACATTGTTCGTGCTCTTAACAAGCTTTTTGTTGATCTCTCTTAAGCGAATAACGTGGAAGATGACTGTGTGAAAATCACTTGACGCTCGTAGGAGTTGTCTCCTTTATAATGTGTAATCGGTCTTGTAACTTTGTCACTCACTCTTTGTCTTGATCAGCTTATGATGTGTAAATGCTCCTGTAACTTTGTATTTTTTTAACGAATTTTATTTTCTTAAAAAAAAAAAGAAAAAGAAAAAAAAAAAAGGCCACGGATAAGAGAGGAAGACCCAAATGTGAATTTATTTGGGCCGTGTGGAAGTCCATTAATGGGTTGGTTCGATCTGGAAATGCAATTTGGAATACCGTCTTCTATTGTTTCGACCCACGCAGCGTTGTCTTGCTGTCGCTTCTCGTCGACGCGTTCCATTCTCTTCCGCCTAAAGCCCTAAACCCCGTCGTCGTCGTCGTCGAAGAACATGGTGAGACTTACAGTCGATTTAATCTGGAAAAGCCCTCATTTCTTCAATGCCATTAAAGAGCGAGAACTGGACCTTCGAGGTTGTCTATCCTTTCTCAACTCTTGCTTCTTTCACTCTCTCGACCACACATTTCTTAACTCCTTTCTTCTCGTCTCCAGGAAACAAGATAGCAGTGATAGAAAACCTAGGTGCCACCGAGGTGAGTGATTTCCACCGTCTTATATGTCTTGTTTTTATATCTCTGGAACTCCAACATCTGTTAAGTCCGTTTCCGGGGTCTTCTTACTCTACTATGATGAATCCCTAACTTAGAGCATGCCATTTCTCCTCCCGGGTTGCTTTTGTTGTAGTCTGAAATTTAAATTATATTCTTTACATACTGGCGAGAATCTCTCCACAGAATTCAACTTCACTTCAAATGAGTAATGACAGTTCTCATTAATCTGATTAGTTTGTTTTGCTACAATCTGTTATACCATTATGAAGGTCCTTTGAAGTTTCATTTTCCTTGAAGACAGCCTAAAGAACTCCAATTTGAGGCCATATGCTGCTGTACATTTTCCCTTGAGTCATAAAAAGTTTTAAATTTAACATATTGAGGATCCAATTAACCCTTTTAACCATAAGGATCAGATCTGAGCAGTAATAGCATTTTTTTCCCATAGAATTATGGCTATATTCTTGAACTGCCTACCCGAGAGAACCAAGTTCACATTGAATGAAGAGGTGGTTCATAGTTTTTACATCCTTTCTACATGAATGATAGATATGAGATCTGAGGCTATATAGAAAGAGATGGCCTCCTTTGCCTTCTATCTCGAGTAATAAGACTTCCTTTAGGGCCATAAATAAACTCAGACCTGTTAAGGAACTTGCTGGTGTTAACCAATTAGGAGTATCTTTTTATTGAATAAAAAATCTGCCCCTTACAAATAGGACTAGTGTTTATACTAATGTCTCTCTTAA
CCAAAACTTACAAACTTAACAAACTACTCCAACAGCTAACAACTAACTTGCCCAATAGCTAGTTCAACCAAAAAATTACTCAATGACATCACTTACCTTACAAAACAAGCTCCACGAGCGCAGAACACAACTTGGACTAATTAGAGTTGAGCTTGAAGTTAAAGGAAATATGAGAAAATAGGCCATTTGAATAAAGTTTCCAAAGCAGTCTATCCTTATCATTCCCTAGCTTTATCAGATTGAGCTTAAGAGAAAAAGGGATCCACTCTTTAATTATCTATCTCTTTGTCTCCTCCTCAAGCCTACATCCCATAGACTCCTCTTCCACAAATTCTTGATGAGGCTCCTTCTGTTTTTGGAACATCATTATTTCGAAAAGAAACAGGAGTGGCTATCATCAAACTAACTATCTTCCAAAGATGTAATTGTCTCTCATGCCAACTTTGTCTAAAGTACAAATTATCAACGGACCTTAGGTTTTGATTACTCCATGGCTAAAAATTAACTCCATGACCACGACCTCCTACATTTGTGAGCCACCTGATGGGATAAGAACCACATTTACTCTGTAAAACTTTCTCCTAGGCCCTCATGAAAGTATCTACAACAGAGCCTCTGATCATTAAGGGTTCTTTCTTGTATCTCATGTATCCCTTTTCCAATCCACCTAAAGAAATAGGGACTAAAGGAGGGCAGCTTACTTTCACAGTTTCACTTGATTGAGTGACAACCGATATCAAAGAAACGACATACTCATTCTTTTGTGTCAAGCTTCTAGAGTTGATTCCTCATTACTCTGTATTACTTTGTCTTTATTCAAAATGAAATGATATTTCTTGATAAGAAAGGTTTTGTCCAAGGCTCTTGCAATATTGAGTTGTAGTGGTGTTTGTTCACCAATTTGACCAAAGAGCACCCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAGAAAAAAGAAAAGTGCAGCCTCATTTCATTATAAGGTTAGGTGGGTTGGCTTGTGAGATTAGTCGAGATGTGCACAAGTTGGTCTAGACATGCACGAGTCACGAATAATTGAAAAAAAAAAGAGATGACCAAGAATACCCCTCATTTCCTATTAAAATTGACAAAATACACCAAGAAGGTGTCAGTAATACACAGTAGGTTAAAACTTTGTTAATTTTTTTTTTTTTTTAATAAACGGAACGGAACTTTTCATTGATATAATAAAAAGAGCAATAAATCAATTACAAAGAGACTAAATATCCAAAATAGACAAAAAAGAACTGAGGATCAGGAGGTGCACTTCAGCGGTTAAAGTTTTGTTAATATTTCTTATTTAAAGTACTTTTTCTGTTTGTTTGAGGATAGAATGGTTGACAATTTATTTGGCTTTCCCAAAAATGAAAGAAAGTGATGTCTATCCTTGGTTTTCTTTTATTTTCTAGCCATGTAATGGTAGATTTTTCTTGGCTAAATTTCAGGACCAATTTGATGCCATTGATTTGTCTGATAATGAGATTGTGAAACTGGAAAATATGCCATATCTTAATCGATTGGGCACATTGCTGATCAATAATAATAGAATCACTCGCATCAATCCAAATATTGGAGGTGAGCTTTCCATTATGCTTTTACGGGATTAGTTAAACTCCATTTGATAACCATTTGCTTTCGAAAATTATGCTTGTTTAAACACTCACAATTTTTTTTGGGTGAGTTTTCCACTTGAAAACACATTTAATATTTTAAACCAAGTTTCAAAAACAAAATAAGTTTAAAAAACTACTCTTTTTAGTTTTAGATTTTGGCTTGGTTTTTGAAAATGTTTTTAAAATATAGATGATAAAACACAAAAATTATTTGTGGAAGTAGTGTTTACAATCTTAATTTTCAAAAACAGAAAACTAAAAAACAAAATGGTTATCAAATGGGACTTTAACAATCATGTTTTTGGAAGTATTTTTTCTTAAATAAAATTGTCTTTGTTCTTCTCAGAGTTCTTGCCAAAATTACATA
CACTAGTTCTTACAAACAACAGACTTGTGAACTTGGTAGAGATCGACCCGCTGGCATCCCTTCCAAAACTTCAGTTTCTTAGTTTGTTGGATAACAATATTACGAAGAAGCCAAACTATAGATTGTATGTTATTCACAAGTTAAAGTCAGTCCGGGTGCTTGATTTCAAGAAAGTCAGAAACAAGGTAACTTTTTCCTGCATTCAGACATTATTTAAGGAGTGGCTGTTATTCTTGGCTTGTTTTTGCCTTTTGCTTTTTACAATGCCCAGAAACAGTTAGAAGACAATTTCAAGATTATTTCAATTTTAGAGTCTTATGGTGTATTTTGAAAATGTCGTAACTCAACTTTTATGTTTCTCTTAAGGATGTAGCATTAGTTTGAATTCTTTTTATTGTATTTAGTATGATATTTGAATCTAAGTTGGGTTTTGATTGTGCTATTTCGGTAAGATCTTAAACTTCTTGTTGTTTTTAAAAATACATATATTATGCTATTTATGTAAATACCTCTATTTTTTGTTAGAGAAGAAATGTGTTTTTTGCATATCAAGAATGTGCATATGCTCATATGTTAATACCTAAGGTTTCGTTTAGTAACAATTTCGGTTTTTGTTTTTGTTTTTTTAAAATTAAGCCTATAGACACTACTTCCACCTCTAAATTTCTTCCTTTGTTGCCTAGTTTTTACCAACGGTTTAAAAAACCAAGTCAAAATTTGAAAACTAAGAAGAGTAGTTTTTAAAAATTTGTTTTTGTTTTTGGAATTTGGCTAAAAATTCAACTATTGTACTTAAGAAAGATGCAACCATTGTAAGAAATGTGGAGGAAATAGGCTTAATTTTCAAAAACCAAAAATCAAAAAACAAAATAGTTATCAAATGGGACCTAAAGAATTCAAATGCATCCATTCAGTTTAATGTCCAGCTATTTATTCATTAACAATCATGTGCCCGAGAATTAGTGGCTGAACAAGGAATGAAAAGTAAAATTATAAATAGAGTTGGAATGTCCGTGGACATCTAGGATGTGCACATGTTTAGGGGGTTCCACTCAACTTCATTGAACTCCAATTCGATGTGTCTGGCTTAACTCTTGATCAACAGTTGGACCTTGGAGCTGTGTGTTTCTATTTTCCCCCTGTTCATGGAGTCCTCTTTTTGTTTGTTCTTTTTCTGCAAAACTTTTATGTGCAGGATATATTTTAGCATATGTTCTAACATTCTTGGATTTCGTCTGATAACAAATTTGTTTTTGTTATGAAAGTTATTTTTAAAATTTTAGCCAATTTGTTAAACAACTTATAGTTTTATATACGATACAGTTATTGATCATTTCCAAATGTCTTATACCAGGAGAGATTGGAGGCTAGGAATTTATTTTCATCAAAAGAAGTTGAAGAAGAGGCAAAGAAGGAATCTGTGAGGACGTTCATTCCAGGTGAGGCAGAGAATGCATCCAAACCTGTCGAGGAGAAACAAACTTCAAATGTGTCTGCACCAACACCAGAGCAGATTATAGCTATTAAGGTGTGTACATTCTAATAGCATAAAGCTTAAGAATCACTTTGATTGAAATATATGTTTGCAAGTTTTCAAGATCTTTATTTTGTTCCTGTTTAACTTCTTAGAAATTTTTAAGGTTGCCTCAGCTGGTTTCAGCCATTTATGCCACCTTTTCAAATCCACGACATTAATTCCCTTAGTTTGATCTGCATTGGTGTGTTTCGTCTAGATAAGCTCCTAAAGATTATTTGCAAAGGCTTGTAAAAAATTTATCACCACCATATGTTTCCTATTACATCGTGTGATCAATTAATTCTTTATGTTACAGGCAGCCATTGTTAACTCCCAAACTCTTGAAGAGGTCGCAAGATTAGAACAGGTGTGCTTCGTTGTTTTGTTGTGTACAGTCACCCCTCCTACCGTAAAAAGGGAGAAAAATCTCTATTTAAATGTTATGGGAATAATAACGTTGGAAAATGCTTCTTTGTTGCTGGCA
GGCGCTTAAGTCAGGTCAGTTACCTGCAGATTTGAATCTTTTGGAAGATAATACCGTGCCAAACACCACAAAAGATACAGAGGATAAGACAATGTCCGATAGTGGAGACCAAGAAAATGTATCCAAGGATGTGAAAGAGCAATCGAATGATGAATCTACACCTATGGAGCAGGTACCACAAACCCTTTTGCTATTTGAGTGA

mRNA sequence

ATGAGCGGCGGAGAAAAGAACGAGAACGAGGAGTTTGTGAAGCAAAATACTGATTTTCGTCGGGAAAAGAAACCAACGAAGGAACGAAAACGGCGGGGTCGAAGTCCGTCTAAAAGGGAAGCAGATGTTGACATTGAAGATATTATGTTCAGCATAGACAATGTTCAGATAATCCGGGCATCGTTATTGGAATGGTACGACCGTAGCTGCAGGGACCTTCCATGGAGGAGATTGGACAAAGGACAACCTGAAACACGGGCTTACGGTGTGTGGGTTTCAGAAATAATGCTGCAGCAGACGAGAGTTCAGACCGTCGTTCACTTTTACAACCGTTGGATGCTTAGATGGCCCACCGTTCAACATCTCTCTCGTGCTTCTCTTGAGGAAGTTAATGAAATGTGGGCAGGCTTGGGGTACTACAGACGAGCTCGTTTTCTTTTGGAGGGTGCAAAGATGATAGTCAAAGAAGGTGGTAGATTTCCTAAAACGGTTTCTGCCCTTCGAAAAATTCCTGGAATTGGAGAATATACAGCAGGGGCTATTGCCTCCATAGCATTCGATGAAGTGGTGCCTGTGGTCGATGGTAATGTGATAAGGGTAATTGCTCGATTGAAGGCCATTCTAGGAAATCCAAAAGACCCAAAGTTGAACAAGCAAGTTTGGAAGGCAGCTGCTCAATTAGTTGATCCTTCCAGGCCTGGGGACTTCAATCAGGCACTCATGGAACTTGGTGCTACTTTATGCAGTCCTACAAACCCAAGCTGCTCAACATGCCCTGTGTTTGATCACTGCGAGGCCCTTTCAATCTCAAAGCATGATAGTTCAGTTCTTGTCACAGATTATCCTGCTAAGGGGATAAAGACCAAACAAAGACATGATTATTCTGCTGTAAGTGTGGTTGAGATATTGGAAAACCAGGGTACATCTAAGTTAGAGCAATCTAGTAGATTTCTTCTTGTAAAGAGGCCTGATGAAGGTTTGCTTGCTGGTCTATGGGAGTTTCCATCCGTCTTGTTGGACGGAGAAGCTGATTTAAGTACAAGGAGAGAATCCATTAATAGCCTCTTGAGTAAATACTTTGGACTTGAACCAAAAAAGAATTTTGAAATAGTTATTAGAGAAGAGGTTGGAGATTTTATCCATGTTTTCACCCACATCCGTCTCAAGATATATGTTGAGCACTTGGTGTTATGTTTAAAAGGAAACAAGATAGCAGTGATAGAAAACCTAGGTGCCACCGAGGACCAATTTGATGCCATTGATTTGTCTGATAATGAGATTGTGAAACTGGAAAATATGCCATATCTTAATCGATTGGGCACATTGCTGATCAATAATAATAGAATCACTCGCATCAATCCAAATATTGGAGAGTTCTTGCCAAAATTACATACACTAGTTCTTACAAACAACAGACTTGTGAACTTGGTAGAGATCGACCCGCTGGCATCCCTTCCAAAACTTCAGTTTCTTAGTTTGTTGGATAACAATATTACGAAGAAGCCAAACTATAGATTGTATGTTATTCACAAGTTAAAGTCAGTCCGGGTGCTTGATTTCAAGAAAGTCAGAAACAAGGAGAGATTGGAGGCTAGGAATTTATTTTCATCAAAAGAAGTTGAAGAAGAGGCAAAGAAGGAATCTGTGAGGACGTTCATTCCAGGTGAGGCAGAGAATGCATCCAAACCTGTCGAGGAGAAACAAACTTCAAATGTGTCTGCACCAACACCAGAGCAGATTATAGCTATTAAGGCAGCCATTGTTAACTCCCAAACTCTTGAAGAGGTCGCAAGATTAGAACAGGCGCTTAAGTCAGGTCAGTTACCTGCAGATTTGAATCTTTTGGAAGATAATACCGTGCCAAACACCACAAAAGATACAGAGGATAAGACAATGTCCGATAGTGGAGACCAAGAAAATGTATCCAAGGATGTGAAAGAGCAATCGAATGATGAATCTACACCTATGGAGCAGGTACCACAAACCCTTTTGCTATTTGA
GTGA

Coding sequence (CDS)

ATGAGCGGCGGAGAAAAGAACGAGAACGAGGAGTTTGTGAAGCAAAATACTGATTTTCGTCGGGAAAAGAAACCAACGAAGGAACGAAAACGGCGGGGTCGAAGTCCGTCTAAAAGGGAAGCAGATGTTGACATTGAAGATATTATGTTCAGCATAGACAATGTTCAGATAATCCGGGCATCGTTATTGGAATGGTACGACCGTAGCTGCAGGGACCTTCCATGGAGGAGATTGGACAAAGGACAACCTGAAACACGGGCTTACGGTGTGTGGGTTTCAGAAATAATGCTGCAGCAGACGAGAGTTCAGACCGTCGTTCACTTTTACAACCGTTGGATGCTTAGATGGCCCACCGTTCAACATCTCTCTCGTGCTTCTCTTGAGGAAGTTAATGAAATGTGGGCAGGCTTGGGGTACTACAGACGAGCTCGTTTTCTTTTGGAGGGTGCAAAGATGATAGTCAAAGAAGGTGGTAGATTTCCTAAAACGGTTTCTGCCCTTCGAAAAATTCCTGGAATTGGAGAATATACAGCAGGGGCTATTGCCTCCATAGCATTCGATGAAGTGGTGCCTGTGGTCGATGGTAATGTGATAAGGGTAATTGCTCGATTGAAGGCCATTCTAGGAAATCCAAAAGACCCAAAGTTGAACAAGCAAGTTTGGAAGGCAGCTGCTCAATTAGTTGATCCTTCCAGGCCTGGGGACTTCAATCAGGCACTCATGGAACTTGGTGCTACTTTATGCAGTCCTACAAACCCAAGCTGCTCAACATGCCCTGTGTTTGATCACTGCGAGGCCCTTTCAATCTCAAAGCATGATAGTTCAGTTCTTGTCACAGATTATCCTGCTAAGGGGATAAAGACCAAACAAAGACATGATTATTCTGCTGTAAGTGTGGTTGAGATATTGGAAAACCAGGGTACATCTAAGTTAGAGCAATCTAGTAGATTTCTTCTTGTAAAGAGGCCTGATGAAGGTTTGCTTGCTGGTCTATGGGAGTTTCCATCCGTCTTGTTGGACGGAGAAGCTGATTTAAGTACAAGGAGAGAATCCATTAATAGCCTCTTGAGTAAATACTTTGGACTTGAACCAAAAAAGAATTTTGAAATAGTTATTAGAGAAGAGGTTGGAGATTTTATCCATGTTTTCACCCACATCCGTCTCAAGATATATGTTGAGCACTTGGTGTTATGTTTAAAAGGAAACAAGATAGCAGTGATAGAAAACCTAGGTGCCACCGAGGACCAATTTGATGCCATTGATTTGTCTGATAATGAGATTGTGAAACTGGAAAATATGCCATATCTTAATCGATTGGGCACATTGCTGATCAATAATAATAGAATCACTCGCATCAATCCAAATATTGGAGAGTTCTTGCCAAAATTACATACACTAGTTCTTACAAACAACAGACTTGTGAACTTGGTAGAGATCGACCCGCTGGCATCCCTTCCAAAACTTCAGTTTCTTAGTTTGTTGGATAACAATATTACGAAGAAGCCAAACTATAGATTGTATGTTATTCACAAGTTAAAGTCAGTCCGGGTGCTTGATTTCAAGAAAGTCAGAAACAAGGAGAGATTGGAGGCTAGGAATTTATTTTCATCAAAAGAAGTTGAAGAAGAGGCAAAGAAGGAATCTGTGAGGACGTTCATTCCAGGTGAGGCAGAGAATGCATCCAAACCTGTCGAGGAGAAACAAACTTCAAATGTGTCTGCACCAACACCAGAGCAGATTATAGCTATTAAGGCAGCCATTGTTAACTCCCAAACTCTTGAAGAGGTCGCAAGATTAGAACAGGCGCTTAAGTCAGGTCAGTTACCTGCAGATTTGAATCTTTTGGAAGATAATACCGTGCCAAACACCACAAAAGATACAGAGGATAAGACAATGTCCGATAGTGGAGACCAAGAAAATGTATCCAAGGATGTGAAAGAGCAATCGAATGATGAATCTACACCTATGGAGCAGGTACCACAAACCCTTTTGCTATTTGA
GTGA

Protein sequence

MSGGEKNENEEFVKQNTDFRREKKPTKERKRRGRSPSKREADVDIEDIMFSIDNVQIIRASLLEWYDRSCRDLPWRRLDKGQPETRAYGVWVSEIMLQQTRVQTVVHFYNRWMLRWPTVQHLSRASLEEVNEMWAGLGYYRRARFLLEGAKMIVKEGGRFPKTVSALRKIPGIGEYTAGAIASIAFDEVVPVVDGNVIRVIARLKAILGNPKDPKLNKQVWKAAAQLVDPSRPGDFNQALMELGATLCSPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKTKQRHDYSAVSVVEILENQGTSKLEQSSRFLLVKRPDEGLLAGLWEFPSVLLDGEADLSTRRESINSLLSKYFGLEPKKNFEIVIREEVGDFIHVFTHIRLKIYVEHLVLCLKGNKIAVIENLGATEDQFDAIDLSDNEIVKLENMPYLNRLGTLLINNNRITRINPNIGEFLPKLHTLVLTNNRLVNLVEIDPLASLPKLQFLSLLDNNITKKPNYRLYVIHKLKSVRVLDFKKVRNKERLEARNLFSSKEVEEEAKKESVRTFIPGEAENASKPVEEKQTSNVSAPTPEQIIAIKAAIVNSQTLEEVARLEQALKSGQLPADLNLLEDNTVPNTTKDTEDKTMSDSGDQENVSKDVKEQSNDESTPMEQVPQTLLLFE
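
The protein sequence corresponds to a translation of the coding sequence. As a rough illustration, the sketch below translates a DNA string with the standard genetic code; the function name `translate` is hypothetical, and using the standard nuclear code is an assumption rather than something the record states. The two short inputs are the first seven codons and the last six codons of the CDS listed in this record.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}

def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon.

    Codons containing characters other than A, C, G, T become 'X'.
    Trailing bases that do not fill a codon are ignored.
    """
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(translate("ATGAGCGGCGGAGAAAAGAAC"))  # start of the CDS
print(translate("CTTTTGCTATTTGAGTGA"))     # end of the CDS, including the stop codon
```

The two fragments translate to `MSGGEKN` and `LLLFE`, which agree with the first and last residues of the protein sequence shown in this record.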
BLAST of Cla007083 vs. Swiss-Prot
Match: MUTYH_ARATH (Adenine DNA glycosylase OS=Arabidopsis thaliana GN=MYH PE=3 SV=1)

HSP 1 Score: 434.1 bits (1115), Expect = 2.8e-120
Identity = 233/426 (54.69%), Positives = 285/426 (66.90%), Query Frame = 1
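
The percentages on the identity line of each HSP are the match counts divided by the alignment length. A minimal sketch of that arithmetic, with the hypothetical helper name `percent`, using the counts reported for this HSP:

```python
def percent(matches, aligned_length):
    """Return matches / aligned_length as a percentage string with two decimals."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return f"{100.0 * matches / aligned_length:.2f}"

# Counts taken from the first HSP against MUTYH_ARATH.
identity = percent(233, 426)   # identical residues / alignment length
positives = percent(285, 426)  # identical plus conservative substitutions

print(identity, positives)
```

This reproduces the 54.69% identity and 66.90% positives figures reported above. The denominator is the alignment length (426 columns, including gaps), not the length of either protein.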

Query: 5   EKNENEEFVKQNTDFRREKKPTKERKRRGRSPSKREADV-------DIEDIMFSIDNVQI 64
           E+   EE   +  +   +K+  +E         + EA+        DIED+ FS +  Q 
Sbjct: 72  EREAEEEEKAEEAEAEADKEEAEEESEEEEEEEEEEAEAEEEALGGDIEDL-FSENETQK 131

Query: 65  IRASLLEWYDRSCRDLPWR-RLDKGQPETRAYGVWVSEIMLQQTRVQTVVHFYNRWMLRW 124
           IR  LL+WYD + RDLPWR R  + + E RAY VWVSEIMLQQTRVQTV+ +Y RWM +W
Sbjct: 132 IRMGLLDWYDVNKRDLPWRNRRSESEKERRAYEVWVSEIMLQQTRVQTVMKYYKRWMQKW 191

Query: 125 PTVQHLSRASLE-------------------EVNEMWAGLGYYRRARFLLEGAKMIVKEG 184
           PT+  L +ASLE                   EVNEMWAGLGYYRRARFLLEGAKM+V   
Sbjct: 192 PTIYDLGQASLENLIVSRSRELSFLRGNEKKEVNEMWAGLGYYRRARFLLEGAKMVVAGT 251

Query: 185 GRFPKTVSALRKIPGIGEYTAGAIASIAFDEVVPVVDGNVIRVIARLKAILGNPKDPKLN 244
             FP   S+L K+ GIG+YTAGAIASIAF+E VPVVDGNVIRV+ARLKAI  NPKD    
Sbjct: 252 EGFPNQASSLMKVKGIGQYTAGAIASIAFNEAVPVVDGNVIRVLARLKAISANPKDRLTA 311

Query: 245 KQVWKAAAQLVDPSRPGDFNQALMELGATLCSPTNPSCSTCPVFDHCEALSISKHDSSVL 304
           +  WK AAQLVDPSRPGDFNQ+LMELGATLC+ + PSCS+CPV   C A S+S+ + ++ 
Sbjct: 312 RNFWKLAAQLVDPSRPGDFNQSLMELGATLCTVSKPSCSSCPVSSQCRAFSLSEENRTIS 371

Query: 305 VTDYPAKGIKTKQRHDYSAVSVVEILENQGTSKLEQSSRFLLVKRPDEGLLAGLWEFPSV 364
           VTDYP K IK K RHD+  V V+EI       + +   RF+LVKRP++GLLAGLWEFPSV
Sbjct: 372 VTDYPTKVIKAKPRHDFCCVCVLEI---HNLERNQSGGRFVLVKRPEQGLLAGLWEFPSV 431

Query: 365 LLDGEADLSTRRESINSLLSK--YFGLEPKKNFEIVIREEVGDFIHVFTHIRLKIYVEHL 402
           +L+ EAD +TRR +IN  L +   F +E KK   IV REE+G+F+H+FTHIR K+YVE L
Sbjct: 432 ILNEEADSATRRNAINVYLKEAFRFHVELKKACTIVSREELGEFVHIFTHIRRKVYVELL 491

BLAST of Cla007083 vs. Swiss-Prot
Match: RU2A_ARATH (U2 small nuclear ribonucleoprotein A' OS=Arabidopsis thaliana GN=At1g09760 PE=2 SV=2)

HSP 1 Score: 319.3 bits (817), Expect = 9.9e-86
Identity = 167/216 (77.31%), Positives = 182/216 (84.26%), Query Frame = 1

Query: 397 LCLKGNKIAVIENLGATEDQFDAIDLSDNEIVKLENMPYLNRLGTLLINNNRITRINPNI 456
           L L+GNKI VIENLGATEDQFD IDLSDNEIVKLEN PYLNRLGTLLINNNRITRINPN+
Sbjct: 24  LDLRGNKIPVIENLGATEDQFDTIDLSDNEIVKLENFPYLNRLGTLLINNNRITRINPNL 83

Query: 457 GEFLPKLHTLVLTNNRLVNLVEIDPLASLPKLQFLSLLDNNITKKPNYRLYVIHKLKSVR 516
           GEFLPKLH+LVLTNNRLVNLVEIDPLAS+PKLQ+LSLLDNNITKK NYRLYVIHKLKS+R
Sbjct: 84  GEFLPKLHSLVLTNNRLVNLVEIDPLASIPKLQYLSLLDNNITKKANYRLYVIHKLKSLR 143

Query: 517 VLDFKKVRNKERLEARNLFSSKEVEEEAKKESVRTFIPGEAENASKPVEEKQTSNVSAPT 576
           VLDF K++ KER EA +LFSSKE EEE KK S       E +  S+  E  +T  V APT
Sbjct: 144 VLDFIKIKAKERAEAASLFSSKEAEEEVKKVSRE-----EVKKVSETAENPETPKVVAPT 203

Query: 577 PEQIIAIKAAIVNSQTLEEVARLEQALKSGQLPADL 613
            EQI+AIKAAI+NSQT+EE+ARLEQALK GQ+PA L
Sbjct: 204 AEQILAIKAAIINSQTIEEIARLEQALKFGQVPAGL 234

BLAST of Cla007083 vs. Swiss-Prot
Match: MUTYH_MOUSE (Adenine DNA glycosylase OS=Mus musculus GN=Mutyh PE=2 SV=2)

HSP 1 Score: 308.5 bits (789), Expect = 1.7e-82
Identity = 183/431 (42.46%), Positives = 243/431 (56.38%), Query Frame = 1

Query: 22  EKKPTKERKRRGRSPS---------------KRE----ADVDIEDIMFSIDNVQIIRASL 81
           +K+P   ++RR R+ S               KRE    A V    +   + +V   R++L
Sbjct: 12  KKQPANHKRRRTRALSSSQAKPSSLDGLAKQKREELLQASVSPYHLFSDVADVTAFRSNL 71

Query: 82  LEWYDRSCRDLPWRRLDKGQPET--RAYGVWVSEIMLQQTRVQTVVHFYNRWMLRWPTVQ 141
           L WYD+  RDLPWR L K +  +  RAY VWVSE+MLQQT+V TV+ +Y RWM +WP +Q
Sbjct: 72  LSWYDQEKRDLPWRNLAKEEANSDRRAYAVWVSEVMLQQTQVATVIDYYTRWMQKWPKLQ 131

Query: 142 HLSRASLEEVNEMWAGLGYYRRARFLLEGAKMIVKE-GGRFPKTVSALRKI-PGIGEYTA 201
            L+ ASLEEVN++W+GLGYY R R L EGA+ +V+E GG  P+T   L+++ PG+G YTA
Sbjct: 132 DLASASLEEVNQLWSGLGYYSRGRRLQEGARKVVEELGGHMPRTAETLQQLLPGVGRYTA 191

Query: 202 GAIASIAFDEVVPVVDGNVIRVIARLKAILGNPKDPKLNKQVWKAAAQLVDPSRPGDFNQ 261
           GAIASIAFD+V  VVDGNV+RV+ R++AI  +P    ++  +W  A QLVDP+RPGDFNQ
Sbjct: 192 GAIASIAFDQVTGVVDGNVLRVLCRVRAIGADPTSTLVSHHLWNLAQQLVDPARPGDFNQ 251

Query: 262 ALMELGATLCSPTNPSCSTCPVFDHCEA-------------------------------- 321
           A MELGAT+C+P  P CS CPV   C A                                
Sbjct: 252 AAMELGATVCTPQRPLCSHCPVQSLCRAYQRVQRGQLSALPGRPDIEECALNTRQCQLCL 311

Query: 322 LSISKHDSSVLVTDYPAKGIKTKQRHDYSAVSVVEILENQGTSKLEQSSRFLLVKRPDEG 381
            S S  D S+ V ++P K  +   R +YSA  VVE     G          LLV+RPD G
Sbjct: 312 TSSSPWDPSMGVANFPRKASRRPPREEYSATCVVEQPGAIG------GPLVLLVQRPDSG 371

Query: 382 LLAGLWEFPSVLLDGEADLSTRRESINSLLSKYFGLEPKKNFEIVIREEVGDFIHVFTHI 398
           LLAGLWEFPSV L  E     + +++   L ++ G  P      +  + +G+ IH+F+HI
Sbjct: 372 LLAGLWEFPSVTL--EPSEQHQHKALLQELQRWCGPLP-----AIRLQHLGEVIHIFSHI 429

BLAST of Cla007083 vs. Swiss-Prot
Match: MUTYH_RAT (Adenine DNA glycosylase OS=Rattus norvegicus GN=Mutyh PE=2 SV=1)

HSP 1 Score: 298.5 bits (763), Expect = 1.8e-79
Identity = 184/440 (41.82%), Positives = 239/440 (54.32%), Query Frame = 1

Query: 14  KQNTDFRREKKPTKERKRRGR--------SPS--------KRE----ADVDIEDIMFSID 73
           K     R  KK     KRRG+         PS        KRE      V    +   I 
Sbjct: 3   KLRASVRSHKKQPANHKRRGKCALSSSQAKPSGLDGLAKQKREELLKTPVSPYHLFSDIA 62

Query: 74  NVQIIRASLLEWYDRSCRDLPWRRLDKGQP--ETRAYGVWVSEIMLQQTRVQTVVHFYNR 133
           +V   R +LL WYD+  RDLPWR+  K +   + RAY VWVSE+MLQQT+V TV+ +Y R
Sbjct: 63  DVTAFRRNLLSWYDQEKRDLPWRKRVKEETNLDRRAYAVWVSEVMLQQTQVATVIDYYTR 122

Query: 134 WMLRWPTVQHLSRASLEEVNEMWAGLGYYRRARFLLEGAKMIVKE-GGRFPKTVSALRKI 193
           WM +WPT+Q L+ ASLEEVN++W+GLGYY R R L EGA+ +V+E GG  P+T   L+++
Sbjct: 123 WMQKWPTLQDLASASLEEVNQLWSGLGYYSRGRRLQEGARKVVEELGGHVPRTAETLQQL 182

Query: 194 -PGIGEYTAGAIASIAFDEVVPVVDGNVIRVIARLKAILGNPKDPKLNKQVWKAAAQLVD 253
            PG+G YTAGAIASIAFD+V  VVDGNVIRV+ R++AI  +P    ++  +W  A QLVD
Sbjct: 183 LPGVGRYTAGAIASIAFDQVTGVVDGNVIRVLCRVRAIGADPTSSFVSHHLWDLAQQLVD 242

Query: 254 PSRPGDFNQALMELGATLCSPTNPSCSTCPVFDHCEA----------------------- 313
           P+RPGDFNQA MELGAT+C+P  P C+ CPV   C A                       
Sbjct: 243 PARPGDFNQAAMELGATVCTPQRPLCNHCPVQSLCRAHQRVGQGRLSALPGSPDIEECAL 302

Query: 314 ---------LSISKHDSSVLVTDYPAKGIKTKQRHDYSAVSVVEILENQGTSKLEQSSRF 373
                     S +  D ++ V ++P K  +   R +YSA  VVE     G          
Sbjct: 303 NTRQCQLCLPSTNPWDPNMGVVNFPRKASRRPPREEYSATCVVEQPGATG------GPLI 362

Query: 374 LLVKRPDEGLLAGLWEFPSVLLDGEADLSTRRESINSLLSKYFGLEPKKNFEIVIREEVG 398
           LLV+RP+ GLLAGLWEFPSV L  E     + +++   L  +    P         + +G
Sbjct: 363 LLVQRPNSGLLAGLWEFPSVTL--EPSGQHQHKALLQELQHWSAPLPTTPL-----QHLG 422

BLAST of Cla007083 vs. Swiss-Prot
Match: MUTYH_HUMAN (Adenine DNA glycosylase OS=Homo sapiens GN=MUTYH PE=1 SV=1)

HSP 1 Score: 258.5 bits (659), Expect = 2.1e-67
Identity = 127/231 (54.98%), Positives = 161/231 (69.70%), Query Frame = 1

Query: 40  EADVDIEDIMFSIDNVQIIRASLLEWYDRSCRDLPWRRL--DKGQPETRAYGVWVSEIML 99
           +A V    +   +  V   R SLL WYD+  RDLPWRR   D+   + RAY VWVSE+ML
Sbjct: 75  QASVSSYHLFRDVAEVTAFRGSLLSWYDQEKRDLPWRRRAEDEMDLDRRAYAVWVSEVML 134

Query: 100 QQTRVQTVVHFYNRWMLRWPTVQHLSRASLEEVNEMWAGLGYYRRARFLLEGAKMIVKE- 159
           QQT+V TV+++Y  WM +WPT+Q L+ ASLEEVN++WAGLGYY R R L EGA+ +V+E 
Sbjct: 135 QQTQVATVINYYTGWMQKWPTLQDLASASLEEVNQLWAGLGYYSRGRRLQEGARKVVEEL 194

Query: 160 GGRFPKTVSALRKI-PGIGEYTAGAIASIAFDEVVPVVDGNVIRVIARLKAILGNPKDPK 219
           GG  P+T   L+++ PG+G YTAGAIASIAF +   VVDGNV RV+ R++AI  +P    
Sbjct: 195 GGHMPRTAETLQQLLPGVGRYTAGAIASIAFGQATGVVDGNVARVLCRVRAIGADPSSTL 254

Query: 220 LNKQVWKAAAQLVDPSRPGDFNQALMELGATLCSPTNPSCSTCPVFDHCEA 267
           +++Q+W  A QLVDP+RPGDFNQA MELGAT+C+P  P CS CPV   C A
Sbjct: 255 VSQQLWGLAQQLVDPARPGDFNQAAMELGATVCTPQRPLCSQCPVESLCRA 305


HSP 2 Score: 155.2 bits (391), Expect = 2.5e-36
Identity = 127/400 (31.75%), Positives = 183/400 (45.75%), Query Frame = 1

Query: 40  EADVDIEDIMFSIDNVQIIRASLLEWYDRSCRDLPWRRL--DKGQPETRAYGVWVSEIML 99
           +A V    +   +  V   R SLL WYD+  RDLPWRR   D+   + RAY VWVSE+ML
Sbjct: 75  QASVSSYHLFRDVAEVTAFRGSLLSWYDQEKRDLPWRRRAEDEMDLDRRAYAVWVSEVML 134

Query: 100 QQTRVQTVVHFYNRWMLRWPTVQHLSRASLEEVNEMWAGLGY-------YRRARFLLE-- 159
           QQT+V TV+++Y  WM +WPT+Q L+ ASLEEVN++WAGLGY          AR ++E  
Sbjct: 135 QQTQVATVINYYTGWMQKWPTLQDLASASLEEVNQLWAGLGYYSRGRRLQEGARKVVEEL 194

Query: 160 GAKM---------IVKEGGRFPKTVSALRKIPGIGEYTA---GAIASIAFDEVVPVVDGN 219
           G  M         ++   GR+  T  A+  I   G+ T    G +A +         D +
Sbjct: 195 GGHMPRTAETLQQLLPGVGRY--TAGAIASI-AFGQATGVVDGNVARVLCRVRAIGADPS 254

Query: 220 VIRVIARLKAILGNPKDPK----LNKQVWKAAAQLVDPSRP---------------GDFN 279
              V  +L  +     DP      N+   +  A +  P RP                   
Sbjct: 255 STLVSQQLWGLAQQLVDPARPGDFNQAAMELGATVCTPQRPLCSQCPVESLCRARQRVEQ 314

Query: 280 QALMELGATLCSPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKTKQRHDYSAV 339
           + L+  G+   SP    C+      H         D ++ V ++P K  +   R + SA 
Sbjct: 315 EQLLASGSLSGSPDVEECAPNTGQCHLCLPPSEPWDQTLGVVNFPRKASRKPPREESSAT 374

Query: 340 SVVEILENQGTSKLEQSSRFLLVKRPDEGLLAGLWEFPSVLLDGEADLSTRRESINSLLS 398
            V   LE  G       ++ LLV+RP+ GLLAGLWEFPSV  +    L  +R+++   L 
Sbjct: 375 CV---LEQPGAL----GAQILLVQRPNSGLLAGLWEFPSVTWEPSEQL--QRKALLQELQ 434

BLAST of Cla007083 vs. TrEMBL
Match: A0A0A0KC27_CUCSA (Uncharacterized protein OS=Cucumis sativus GN=Csa_6G088720 PE=4 SV=1)

HSP 1 Score: 733.0 bits (1891), Expect = 3.2e-208
Identity = 365/401 (91.02%), Positives = 380/401 (94.76%), Query Frame = 1

Query: 1   MSGGEKNENEEFVKQNTDFRREKKPTKERKRRGRSPSKREADVDIEDIMFSIDNVQIIRA 60
           MS GEKNEN+E++K+NTDFRR+KKPT ERKRRGRSPSK EA VDIEDIMFSIDNVQ IRA
Sbjct: 55  MSDGEKNENDEYMKKNTDFRRKKKPTTERKRRGRSPSKSEAVVDIEDIMFSIDNVQTIRA 114

Query: 61  SLLEWYDRSCRDLPWRRLDKGQPETRAYGVWVSEIMLQQTRVQTVVHFYNRWMLRWPTVQ 120
           SLL+WYDRS RDLPWR LDKG+PETRAYGVWVSEIMLQQTRVQTVV FYNRWML+WPTVQ
Sbjct: 115 SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 174

Query: 121 HLSRASLEEVNEMWAGLGYYRRARFLLEGAKMIVKEGGRFPKTVSALRKIPGIGEYTAGA 180
           HLSRASLEEVNEMWAGLGYYRRARFL EGAKMIVKEGGRFP+TVS+LRKIPGIGEYTAGA
Sbjct: 175 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPRTVSSLRKIPGIGEYTAGA 234

Query: 181 IASIAFDEVVPVVDGNVIRVIARLKAILGNPKDPKLNKQVWKAAAQLVDPSRPGDFNQAL 240
           IASIAF EVVPVVDGNVIRVIARLKAI GNPKDPKL KQVWKAAAQLVD SRPGDFNQAL
Sbjct: 235 IASIAFGEVVPVVDGNVIRVIARLKAISGNPKDPKLIKQVWKAAAQLVDLSRPGDFNQAL 294

Query: 241 MELGATLCSPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKTKQRHDYSAVSVV 300
           MELGATLC+PTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIK KQRHDYSAV VV
Sbjct: 295 MELGATLCTPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKIKQRHDYSAVCVV 354

Query: 301 EILENQGTSKLEQSSRFLLVKRPDEGLLAGLWEFPSVLLDGEADLSTRRESINSLLSKYF 360
           EILE+QGT +L QSSRFLLVKRPDEGLLAGLWEFPSV LDGEADLSTRRESINSLLSK F
Sbjct: 355 EILESQGTPELGQSSRFLLVKRPDEGLLAGLWEFPSVSLDGEADLSTRRESINSLLSKNF 414

Query: 361 GLEPKKNFEIVIREEVGDFIHVFTHIRLKIYVEHLVLCLKG 402
           GLE KKNFEIV RE+VGDFIH+FTHIRLKIYVEHLVLCLKG
Sbjct: 415 GLEAKKNFEIVNREDVGDFIHIFTHIRLKIYVEHLVLCLKG 455

BLAST of Cla007083 vs. TrEMBL
Match: E5GB45_CUCME (A/G-specific adenine DNA glycosylase OS=Cucumis melo subsp. melo PE=4 SV=1)

HSP 1 Score: 728.4 bits (1879), Expect = 7.9e-207
Identity = 366/401 (91.27%), Positives = 378/401 (94.26%), Query Frame = 1

Query: 1   MSGGEKNENEEFVKQNTDFRREKKPTKERKRRGRSPSKREADVDIEDIMFSIDNVQIIRA 60
           MS GEKNENEE VK+ TDFRR+KKPT +RKRR RSPSK EA VDIEDIMFSIDNVQ IRA
Sbjct: 1   MSDGEKNENEENVKKKTDFRRKKKPTTKRKRRSRSPSKSEAVVDIEDIMFSIDNVQTIRA 60

Query: 61  SLLEWYDRSCRDLPWRRLDKGQPETRAYGVWVSEIMLQQTRVQTVVHFYNRWMLRWPTVQ 120
           SLL+WYDRS RDLPWR LDKG+PETRAYGVWVSEIMLQQTRVQTVV FYNRWML+WPTVQ
Sbjct: 61  SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 120

Query: 121 HLSRASLEEVNEMWAGLGYYRRARFLLEGAKMIVKEGGRFPKTVSALRKIPGIGEYTAGA 180
           HLSRASLEEVNEMWAGLGYYRRARFL EGAKMIVKEGGRFPKTVS+LRKIPGIGEYTAGA
Sbjct: 121 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPKTVSSLRKIPGIGEYTAGA 180

Query: 181 IASIAFDEVVPVVDGNVIRVIARLKAILGNPKDPKLNKQVWKAAAQLVDPSRPGDFNQAL 240
           IASIAF EVVPVVDGNVIRVIARLKAI GNPKDPKL KQVWKAAAQLVD SRPGDFNQAL
Sbjct: 181 IASIAFGEVVPVVDGNVIRVIARLKAISGNPKDPKLIKQVWKAAAQLVDLSRPGDFNQAL 240

Query: 241 MELGATLCSPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKTKQRHDYSAVSVV 300
           MELGATLC+PTNPSCSTCPVFDHCEALSISK DSSVLVTDYPAKGIKTKQRHDYSAV VV
Sbjct: 241 MELGATLCTPTNPSCSTCPVFDHCEALSISKRDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300

Query: 301 EILENQGTSKLEQSSRFLLVKRPDEGLLAGLWEFPSVLLDGEADLSTRRESINSLLSKYF 360
           EILE+QGTS+L QSSRFLLVKRPDEGLLAGLWEFPSV LDGEAD STRRESI+SLLSK F
Sbjct: 301 EILESQGTSELGQSSRFLLVKRPDEGLLAGLWEFPSVSLDGEADSSTRRESIDSLLSKNF 360

Query: 361 GLEPKKNFEIVIREEVGDFIHVFTHIRLKIYVEHLVLCLKG 402
           GLEPKKNFEIV RE+VGDFIHVFTHIRLKIYVEHLVLCLKG
Sbjct: 361 GLEPKKNFEIVNREDVGDFIHVFTHIRLKIYVEHLVLCLKG 401

BLAST of Cla007083 vs. TrEMBL
Match: B9IAY3_POPTR (Uncharacterized protein OS=Populus trichocarpa GN=POPTR_0014s17120g PE=4 SV=2)

HSP 1 Score: 507.7 bits (1306), Expect = 2.2e-140
Identity = 259/409 (63.33%), Positives = 315/409 (77.02%), Query Frame = 1

Query: 9   NEEFVKQNTDFRREKKPTKERKRRGRSPSKREADVDIEDIMFSIDNVQIIRASLLEWYDR 68
           +EE +++ +  +R     K +++R  S SK++   DIED+ FS    Q IRASLLEWYD 
Sbjct: 2   DEEGIEKPSKRKRNAAIAKPKEQRQHS-SKKQVVADIEDL-FSDKETQKIRASLLEWYDH 61

Query: 69  SCRDLPWRRL--------------DKGQPETRAYGVWVSEIMLQQTRVQTVVHFYNRWML 128
           + RDLPWRR+              ++ + E RAYGVWVSE+MLQQTRVQTV+ +YNRWML
Sbjct: 62  NQRDLPWRRITQTKETPFKEEEEEEEEEEERRAYGVWVSEVMLQQTRVQTVIDYYNRWML 121

Query: 129 RWPTVQHLSRASLEEVNEMWAGLGYYRRARFLLEGAKMIVKEGGRFPKTVSALRKIPGIG 188
           +WPT+ HL++ASLEEVNE WAGLGYYRRARFLLEGAKMIV  G  FPK VS+LRK+PGIG
Sbjct: 122 KWPTLHHLAQASLEEVNEKWAGLGYYRRARFLLEGAKMIVAGGDGFPKIVSSLRKVPGIG 181

Query: 189 EYTAGAIASIAFDEVVPVVDGNVIRVIARLKAILGNPKDPKLNKQVWKAAAQLVDPSRPG 248
           +YTAGAIASIAF EVVPVVDGNVIRV+ARLKAI  NPKD    K+ WK AAQLVDP RPG
Sbjct: 182 DYTAGAIASIAFKEVVPVVDGNVIRVLARLKAISANPKDKVTVKKFWKLAAQLVDPHRPG 241

Query: 249 DFNQALMELGATLCSPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKTKQRHDY 308
           DFNQ+LMELGATLC+P NPSCS+CPV   C AL+ISK D  VL+TDYPAK IK KQRH++
Sbjct: 242 DFNQSLMELGATLCTPVNPSCSSCPVSGQCRALTISKLDKLVLITDYPAKSIKLKQRHEF 301

Query: 309 SAVSVVEILENQGTSKLEQSSR-FLLVKRPDEGLLAGLWEFPSVLLDGEADLSTRRESIN 368
           SAV  VEI   Q   + +QSS  FLLVKRPDEGLLAGLWEFPSV+L  EAD++ RR+ +N
Sbjct: 302 SAVCAVEITGRQDLIEGDQSSSVFLLVKRPDEGLLAGLWEFPSVMLGKEADMTRRRKEMN 361

Query: 369 SLLSKYFGLEPKKNFEIVIREEVGDFIHVFTHIRLKIYVEHLVLCLKGN 403
             L K F L+P+K   +++RE++G+FIH+FTHIRLK+YVE L++ LKG+
Sbjct: 362 RFLKKSFRLDPQKTCSVLLREDIGEFIHIFTHIRLKVYVELLIVHLKGD 408

BLAST of Cla007083 vs. TrEMBL
Match: A0A0D2LX00_GOSRA (Uncharacterized protein OS=Gossypium raimondii GN=B456_001G100200 PE=4 SV=1)

HSP 1 Score: 502.3 bits (1292), Expect = 9.2e-139
Identity = 263/408 (64.46%), Positives = 307/408 (75.25%), Query Frame = 1

Query: 27  KERKRRGRSPSKREADV-DIEDIMFSIDNVQIIRASLLEWYDRSCRDLPWRRLDKG---- 86
           K  K +     K+E  + DIED+ FS ++   IRASLLEWYD++ RDLPWR   K     
Sbjct: 3   KTNKTKRPQLIKQEEQIGDIEDL-FSEEDTHKIRASLLEWYDKNQRDLPWRTSTKKSENG 62

Query: 87  -------QPETRAYGVWVSEIMLQQTRVQTVVHFYNRWMLRWPTVQHLSRASLEEVNEMW 146
                  + E RAYGVWVSE+MLQQTRVQTV+ +YNRWML+WPT+QHLS+ASLEEVNEMW
Sbjct: 63  ENVQEEEEEEKRAYGVWVSEVMLQQTRVQTVIDYYNRWMLKWPTLQHLSQASLEEVNEMW 122

Query: 147 AGLGYYRRARFLLEGAKMIVKEGGRFPKTVSALRKIPGIGEYTAGAIASIAFDEVVPVVD 206
           AGLGYYRRARFLLEGAKMIV EG  FP TV ALRK+PGIG+YTAGAIASIAF +VVPVVD
Sbjct: 123 AGLGYYRRARFLLEGAKMIVAEGSEFPNTVFALRKVPGIGDYTAGAIASIAFKQVVPVVD 182

Query: 207 GNVIRVIARLKAILGNPKDPKLNKQVWKAAAQLVDPSRPGDFNQALMELGATLCSPTNPS 266
           GNV+RV+ARLKAI  NPKD    K  WK AAQLVDPSRPGDFNQ+LMELGATLC+P NP+
Sbjct: 183 GNVVRVLARLKAISANPKDKTTVKNFWKLAAQLVDPSRPGDFNQSLMELGATLCTPLNPN 242

Query: 267 CSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKTKQRHDYSAVSVVEILENQG-TSKLEQ 326
           C++CPV   C AL  S++D SV+V DYP K +KTKQR+D+S VSVVEI  +Q    + + 
Sbjct: 243 CTSCPVSSQCRALHNSRNDESVMVMDYPMKVVKTKQRNDFSTVSVVEISRSQDRLQQTKS 302

Query: 327 SSRFLLVKRPDEGLLAGLWEFPSVLLDGEADLSTRRESINSLLSKYFGLEPKKNFEIVIR 386
           +SR LLVKRPDEGLLAGLWEFP V LD EADLS RR+ I+ LL K F L P KN  ++ R
Sbjct: 303 NSRVLLVKRPDEGLLAGLWEFPCVTLDEEADLSMRRKLIDQLLKKSFKLNPPKNCNVISR 362

Query: 387 EEVGDFIHVFTHIRLKIYVEHLVLCLKGNKIAVIENLGATEDQFDAID 422
           E VG+F+HVF+HIR KIYVE LVL LKG K  + E     ED  +A D
Sbjct: 363 ELVGEFVHVFSHIRRKIYVELLVLHLKGGKHVLFE-----EDDINATD 404

BLAST of Cla007083 vs. TrEMBL
Match: A0A0B0MDJ1_GOSAR (Tm9sf4 OS=Gossypium arboreum GN=F383_16864 PE=4 SV=1)

HSP 1 Score: 501.1 bits (1289), Expect = 2.0e-138
Identity = 264/413 (63.92%), Positives = 311/413 (75.30%), Query Frame = 1

Query: 29  RKRRGRSPSKREADVDIEDIMFSIDNVQIIRASLLEWYDRSCRDLPWR----------RL 88
           +K+R +   + E   DIED+ FS ++   IRASLLEWYD++ RDLPWR           +
Sbjct: 6   KKKRPQLIKQEEQIGDIEDL-FSEEDTYKIRASLLEWYDKNQRDLPWRTSTKKSENGENV 65

Query: 89  DKGQPETRAYGVWVSEIMLQQTRVQTVVHFYNRWMLRWPTVQHLSRASLEEVNEMWAGLG 148
            + + E RAYGVWVSE+MLQQTRVQTV+ +YNRWML+WPT+QHLS+ASLEEVNEMWAGLG
Sbjct: 66  QEEEEEKRAYGVWVSEVMLQQTRVQTVIDYYNRWMLKWPTLQHLSQASLEEVNEMWAGLG 125

Query: 149 YYRRARFLLEGAKMIVKEGGRFPKTVSALRKIPGIGEYTAGAIASIAFDEVVPVVDGNVI 208
           YYRRARFLLEGAKMIV EG  FP TVSALRK+PGIG+YTAGAIASIAF +VVPVVDGNV+
Sbjct: 126 YYRRARFLLEGAKMIVAEGIEFPNTVSALRKVPGIGDYTAGAIASIAFKQVVPVVDGNVV 185

Query: 209 RVIARLKAILGNPKDPKLNKQVWKAAAQLVDPSRPGDFNQALMELGATLCSPTNPSCSTC 268
           RV+ARLKAI  NPKD    K  WK AAQLVDPSRPGD NQ+LMELGATLC+P NP+C++C
Sbjct: 186 RVLARLKAISANPKDKTTVKSFWKLAAQLVDPSRPGDLNQSLMELGATLCTPLNPNCTSC 245

Query: 269 PVFDHCEALSISKHDSSVLVTDYPAKGIKTKQRHDYSAVSVVEILENQGTSKLEQS-SRF 328
           PV   C AL  S++D  V+VTDYP K +K KQR D+S VSVVEI  +Q  S+  +S SR 
Sbjct: 246 PVSSQCRALHNSRNDELVMVTDYPMKVVKAKQRIDFSTVSVVEISRSQDRSQQTKSNSRV 305

Query: 329 LLVKRPDEGLLAGLWEFPSVLLDGEADLSTRRESINSLLSKYFGLEPKKNFEIVIREEVG 388
           LLVKRPDEGLLAGLWEFP V LD EADLS RR+ I+ LL K F L P KN  ++ RE VG
Sbjct: 306 LLVKRPDEGLLAGLWEFPCVTLDEEADLSMRRKLIDQLLKKSFKLNPPKNCNVISRELVG 365

Query: 389 DFIHVFTHIRLKIYVEHLVLCLKGNKIAVIENLGATEDQFDAID--LSDNEIV 429
           +F+HVF+HIR KIYVE LVL LKG K  + E     ED  +  D  L D+E +
Sbjct: 366 EFVHVFSHIRRKIYVELLVLHLKGGKHVLFE-----EDDINTTDWKLLDSEAI 412

BLAST of Cla007083 vs. NCBI nr
Match: gi|778711687|ref|XP_004140565.2| (PREDICTED: A/G-specific adenine DNA glycosylase [Cucumis sativus])

HSP 1 Score: 733.0 bits (1891), Expect = 4.6e-208
Identity = 365/401 (91.02%), Positives = 380/401 (94.76%), Query Frame = 1

Query: 1   MSGGEKNENEEFVKQNTDFRREKKPTKERKRRGRSPSKREADVDIEDIMFSIDNVQIIRA 60
           MS GEKNEN+E++K+NTDFRR+KKPT ERKRRGRSPSK EA VDIEDIMFSIDNVQ IRA
Sbjct: 1   MSDGEKNENDEYMKKNTDFRRKKKPTTERKRRGRSPSKSEAVVDIEDIMFSIDNVQTIRA 60

Query: 61  SLLEWYDRSCRDLPWRRLDKGQPETRAYGVWVSEIMLQQTRVQTVVHFYNRWMLRWPTVQ 120
           SLL+WYDRS RDLPWR LDKG+PETRAYGVWVSEIMLQQTRVQTVV FYNRWML+WPTVQ
Sbjct: 61  SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 120

Query: 121 HLSRASLEEVNEMWAGLGYYRRARFLLEGAKMIVKEGGRFPKTVSALRKIPGIGEYTAGA 180
           HLSRASLEEVNEMWAGLGYYRRARFL EGAKMIVKEGGRFP+TVS+LRKIPGIGEYTAGA
Sbjct: 121 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPRTVSSLRKIPGIGEYTAGA 180

Query: 181 IASIAFDEVVPVVDGNVIRVIARLKAILGNPKDPKLNKQVWKAAAQLVDPSRPGDFNQAL 240
           IASIAF EVVPVVDGNVIRVIARLKAI GNPKDPKL KQVWKAAAQLVD SRPGDFNQAL
Sbjct: 181 IASIAFGEVVPVVDGNVIRVIARLKAISGNPKDPKLIKQVWKAAAQLVDLSRPGDFNQAL 240

Query: 241 MELGATLCSPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKTKQRHDYSAVSVV 300
           MELGATLC+PTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIK KQRHDYSAV VV
Sbjct: 241 MELGATLCTPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKIKQRHDYSAVCVV 300

Query: 301 EILENQGTSKLEQSSRFLLVKRPDEGLLAGLWEFPSVLLDGEADLSTRRESINSLLSKYF 360
           EILE+QGT +L QSSRFLLVKRPDEGLLAGLWEFPSV LDGEADLSTRRESINSLLSK F
Sbjct: 301 EILESQGTPELGQSSRFLLVKRPDEGLLAGLWEFPSVSLDGEADLSTRRESINSLLSKNF 360

Query: 361 GLEPKKNFEIVIREEVGDFIHVFTHIRLKIYVEHLVLCLKG 402
           GLE KKNFEIV RE+VGDFIH+FTHIRLKIYVEHLVLCLKG
Sbjct: 361 GLEAKKNFEIVNREDVGDFIHIFTHIRLKIYVEHLVLCLKG 401

BLAST of Cla007083 vs. NCBI nr
Match: gi|700191190|gb|KGN46394.1| (hypothetical protein Csa_6G088720 [Cucumis sativus])

HSP 1 Score: 733.0 bits (1891), Expect = 4.6e-208
Identity = 365/401 (91.02%), Positives = 380/401 (94.76%), Query Frame = 1

Query: 1   MSGGEKNENEEFVKQNTDFRREKKPTKERKRRGRSPSKREADVDIEDIMFSIDNVQIIRA 60
           MS GEKNEN+E++K+NTDFRR+KKPT ERKRRGRSPSK EA VDIEDIMFSIDNVQ IRA
Sbjct: 55  MSDGEKNENDEYMKKNTDFRRKKKPTTERKRRGRSPSKSEAVVDIEDIMFSIDNVQTIRA 114

Query: 61  SLLEWYDRSCRDLPWRRLDKGQPETRAYGVWVSEIMLQQTRVQTVVHFYNRWMLRWPTVQ 120
           SLL+WYDRS RDLPWR LDKG+PETRAYGVWVSEIMLQQTRVQTVV FYNRWML+WPTVQ
Sbjct: 115 SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 174

Query: 121 HLSRASLEEVNEMWAGLGYYRRARFLLEGAKMIVKEGGRFPKTVSALRKIPGIGEYTAGA 180
           HLSRASLEEVNEMWAGLGYYRRARFL EGAKMIVKEGGRFP+TVS+LRKIPGIGEYTAGA
Sbjct: 175 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPRTVSSLRKIPGIGEYTAGA 234

Query: 181 IASIAFDEVVPVVDGNVIRVIARLKAILGNPKDPKLNKQVWKAAAQLVDPSRPGDFNQAL 240
           IASIAF EVVPVVDGNVIRVIARLKAI GNPKDPKL KQVWKAAAQLVD SRPGDFNQAL
Sbjct: 235 IASIAFGEVVPVVDGNVIRVIARLKAISGNPKDPKLIKQVWKAAAQLVDLSRPGDFNQAL 294

Query: 241 MELGATLCSPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKTKQRHDYSAVSVV 300
           MELGATLC+PTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIK KQRHDYSAV VV
Sbjct: 295 MELGATLCTPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKIKQRHDYSAVCVV 354

Query: 301 EILENQGTSKLEQSSRFLLVKRPDEGLLAGLWEFPSVLLDGEADLSTRRESINSLLSKYF 360
           EILE+QGT +L QSSRFLLVKRPDEGLLAGLWEFPSV LDGEADLSTRRESINSLLSK F
Sbjct: 355 EILESQGTPELGQSSRFLLVKRPDEGLLAGLWEFPSVSLDGEADLSTRRESINSLLSKNF 414

Query: 361 GLEPKKNFEIVIREEVGDFIHVFTHIRLKIYVEHLVLCLKG 402
           GLE KKNFEIV RE+VGDFIH+FTHIRLKIYVEHLVLCLKG
Sbjct: 415 GLEAKKNFEIVNREDVGDFIHIFTHIRLKIYVEHLVLCLKG 455

BLAST of Cla007083 vs. NCBI nr
Match: gi|659119956|ref|XP_008459934.1| (PREDICTED: A/G-specific adenine DNA glycosylase [Cucumis melo])

HSP 1 Score: 728.4 bits (1879), Expect = 1.1e-206
Identity = 366/401 (91.27%), Positives = 378/401 (94.26%), Query Frame = 1

Query: 1   MSGGEKNENEEFVKQNTDFRREKKPTKERKRRGRSPSKREADVDIEDIMFSIDNVQIIRA 60
           MS GEKNENEE VK+ TDFRR+KKPT +RKRR RSPSK EA VDIEDIMFSIDNVQ IRA
Sbjct: 1   MSDGEKNENEENVKKKTDFRRKKKPTTKRKRRSRSPSKSEAVVDIEDIMFSIDNVQTIRA 60

Query: 61  SLLEWYDRSCRDLPWRRLDKGQPETRAYGVWVSEIMLQQTRVQTVVHFYNRWMLRWPTVQ 120
           SLL+WYDRS RDLPWR LDKG+PETRAYGVWVSEIMLQQTRVQTVV FYNRWML+WPTVQ
Sbjct: 61  SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 120

Query: 121 HLSRASLEEVNEMWAGLGYYRRARFLLEGAKMIVKEGGRFPKTVSALRKIPGIGEYTAGA 180
           HLSRASLEEVNEMWAGLGYYRRARFL EGAKMIVKEGGRFPKTVS+LRKIPGIGEYTAGA
Sbjct: 121 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPKTVSSLRKIPGIGEYTAGA 180

Query: 181 IASIAFDEVVPVVDGNVIRVIARLKAILGNPKDPKLNKQVWKAAAQLVDPSRPGDFNQAL 240
           IASIAF EVVPVVDGNVIRVIARLKAI GNPKDPKL KQVWKAAAQLVD SRPGDFNQAL
Sbjct: 181 IASIAFGEVVPVVDGNVIRVIARLKAISGNPKDPKLIKQVWKAAAQLVDLSRPGDFNQAL 240

Query: 241 MELGATLCSPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKTKQRHDYSAVSVV 300
           MELGATLC+PTNPSCSTCPVFDHCEALSISK DSSVLVTDYPAKGIKTKQRHDYSAV VV
Sbjct: 241 MELGATLCTPTNPSCSTCPVFDHCEALSISKRDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300

Query: 301 EILENQGTSKLEQSSRFLLVKRPDEGLLAGLWEFPSVLLDGEADLSTRRESINSLLSKYF 360
           EILE+QGTS+L QSSRFLLVKRPDEGLLAGLWEFPSV LDGEAD STRRESI+SLLSK F
Sbjct: 301 EILESQGTSELGQSSRFLLVKRPDEGLLAGLWEFPSVSLDGEADSSTRRESIDSLLSKNF 360

Query: 361 GLEPKKNFEIVIREEVGDFIHVFTHIRLKIYVEHLVLCLKG 402
           GLEPKKNFEIV RE+VGDFIHVFTHIRLKIYVEHLVLCLKG
Sbjct: 361 GLEPKKNFEIVNREDVGDFIHVFTHIRLKIYVEHLVLCLKG 401

BLAST of Cla007083 vs. NCBI nr
Match: gi|307135815|gb|ADN33687.1| (A/G-specific adenine DNA glycosylase [Cucumis melo subsp. melo])

HSP 1 Score: 728.4 bits (1879), Expect = 1.1e-206
Identity = 366/401 (91.27%), Positives = 378/401 (94.26%), Query Frame = 1

Query: 1   MSGGEKNENEEFVKQNTDFRREKKPTKERKRRGRSPSKREADVDIEDIMFSIDNVQIIRA 60
           MS GEKNENEE VK+ TDFRR+KKPT +RKRR RSPSK EA VDIEDIMFSIDNVQ IRA
Sbjct: 1   MSDGEKNENEENVKKKTDFRRKKKPTTKRKRRSRSPSKSEAVVDIEDIMFSIDNVQTIRA 60

Query: 61  SLLEWYDRSCRDLPWRRLDKGQPETRAYGVWVSEIMLQQTRVQTVVHFYNRWMLRWPTVQ 120
           SLL+WYDRS RDLPWR LDKG+PETRAYGVWVSEIMLQQTRVQTVV FYNRWML+WPTVQ
Sbjct: 61  SLLDWYDRSRRDLPWRSLDKGEPETRAYGVWVSEIMLQQTRVQTVVQFYNRWMLKWPTVQ 120

Query: 121 HLSRASLEEVNEMWAGLGYYRRARFLLEGAKMIVKEGGRFPKTVSALRKIPGIGEYTAGA 180
           HLSRASLEEVNEMWAGLGYYRRARFL EGAKMIVKEGGRFPKTVS+LRKIPGIGEYTAGA
Sbjct: 121 HLSRASLEEVNEMWAGLGYYRRARFLFEGAKMIVKEGGRFPKTVSSLRKIPGIGEYTAGA 180

Query: 181 IASIAFDEVVPVVDGNVIRVIARLKAILGNPKDPKLNKQVWKAAAQLVDPSRPGDFNQAL 240
           IASIAF EVVPVVDGNVIRVIARLKAI GNPKDPKL KQVWKAAAQLVD SRPGDFNQAL
Sbjct: 181 IASIAFGEVVPVVDGNVIRVIARLKAISGNPKDPKLIKQVWKAAAQLVDLSRPGDFNQAL 240

Query: 241 MELGATLCSPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKTKQRHDYSAVSVV 300
           MELGATLC+PTNPSCSTCPVFDHCEALSISK DSSVLVTDYPAKGIKTKQRHDYSAV VV
Sbjct: 241 MELGATLCTPTNPSCSTCPVFDHCEALSISKRDSSVLVTDYPAKGIKTKQRHDYSAVCVV 300

Query: 301 EILENQGTSKLEQSSRFLLVKRPDEGLLAGLWEFPSVLLDGEADLSTRRESINSLLSKYF 360
           EILE+QGTS+L QSSRFLLVKRPDEGLLAGLWEFPSV LDGEAD STRRESI+SLLSK F
Sbjct: 301 EILESQGTSELGQSSRFLLVKRPDEGLLAGLWEFPSVSLDGEADSSTRRESIDSLLSKNF 360

Query: 361 GLEPKKNFEIVIREEVGDFIHVFTHIRLKIYVEHLVLCLKG 402
           GLEPKKNFEIV RE+VGDFIHVFTHIRLKIYVEHLVLCLKG
Sbjct: 361 GLEPKKNFEIVNREDVGDFIHVFTHIRLKIYVEHLVLCLKG 401

BLAST of Cla007083 vs. NCBI nr
Match: gi|566204725|ref|XP_002321221.2| (hypothetical protein POPTR_0014s17120g [Populus trichocarpa])

HSP 1 Score: 507.7 bits (1306), Expect = 3.1e-140
Identity = 259/409 (63.33%), Positives = 315/409 (77.02%), Query Frame = 1

Query: 9   NEEFVKQNTDFRREKKPTKERKRRGRSPSKREADVDIEDIMFSIDNVQIIRASLLEWYDR 68
           +EE +++ +  +R     K +++R  S SK++   DIED+ FS    Q IRASLLEWYD 
Sbjct: 2   DEEGIEKPSKRKRNAAIAKPKEQRQHS-SKKQVVADIEDL-FSDKETQKIRASLLEWYDH 61

Query: 69  SCRDLPWRRL--------------DKGQPETRAYGVWVSEIMLQQTRVQTVVHFYNRWML 128
           + RDLPWRR+              ++ + E RAYGVWVSE+MLQQTRVQTV+ +YNRWML
Sbjct: 62  NQRDLPWRRITQTKETPFKEEEEEEEEEEERRAYGVWVSEVMLQQTRVQTVIDYYNRWML 121

Query: 129 RWPTVQHLSRASLEEVNEMWAGLGYYRRARFLLEGAKMIVKEGGRFPKTVSALRKIPGIG 188
           +WPT+ HL++ASLEEVNE WAGLGYYRRARFLLEGAKMIV  G  FPK VS+LRK+PGIG
Sbjct: 122 KWPTLHHLAQASLEEVNEKWAGLGYYRRARFLLEGAKMIVAGGDGFPKIVSSLRKVPGIG 181

Query: 189 EYTAGAIASIAFDEVVPVVDGNVIRVIARLKAILGNPKDPKLNKQVWKAAAQLVDPSRPG 248
           +YTAGAIASIAF EVVPVVDGNVIRV+ARLKAI  NPKD    K+ WK AAQLVDP RPG
Sbjct: 182 DYTAGAIASIAFKEVVPVVDGNVIRVLARLKAISANPKDKVTVKKFWKLAAQLVDPHRPG 241

Query: 249 DFNQALMELGATLCSPTNPSCSTCPVFDHCEALSISKHDSSVLVTDYPAKGIKTKQRHDY 308
           DFNQ+LMELGATLC+P NPSCS+CPV   C AL+ISK D  VL+TDYPAK IK KQRH++
Sbjct: 242 DFNQSLMELGATLCTPVNPSCSSCPVSGQCRALTISKLDKLVLITDYPAKSIKLKQRHEF 301

Query: 309 SAVSVVEILENQGTSKLEQSSR-FLLVKRPDEGLLAGLWEFPSVLLDGEADLSTRRESIN 368
           SAV  VEI   Q   + +QSS  FLLVKRPDEGLLAGLWEFPSV+L  EAD++ RR+ +N
Sbjct: 302 SAVCAVEITGRQDLIEGDQSSSVFLLVKRPDEGLLAGLWEFPSVMLGKEADMTRRRKEMN 361

Query: 369 SLLSKYFGLEPKKNFEIVIREEVGDFIHVFTHIRLKIYVEHLVLCLKGN 403
             L K F L+P+K   +++RE++G+FIH+FTHIRLK+YVE L++ LKG+
Sbjct: 362 RFLKKSFRLDPQKTCSVLLREDIGEFIHIFTHIRLKVYVELLIVHLKGD 408
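
The percentages in each HSP summary line follow from the reported counts: 259 identical positions over an alignment length of 409, for example, gives 63.33%. The following is a minimal Python sketch that recomputes those values as a consistency check. It assumes the summary-line layout seen in this listing; the regular expression and the helper names `parse_hsp_summary` and `percent` are illustrative and not part of any BLAST tool.

```python
import re

# Assumed layout of the HSP summary line, inferred from this listing only.
# "Posi?tives" accepts either spelling of the positives label.
HSP_LINE = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"Posi?tives = (\d+)/(\d+) \(([\d.]+)%\)"
)


def parse_hsp_summary(line):
    """Extract counts and reported percentages from one HSP summary line."""
    match = HSP_LINE.search(line)
    if match is None:
        raise ValueError("not an HSP summary line: %r" % line)
    ident, length, ident_pct, pos, pos_length, pos_pct = match.groups()
    if length != pos_length:
        raise ValueError("identity and positives use different lengths")
    return {
        "identical": int(ident),
        "positives": int(pos),
        "length": int(length),
        "identity_pct_reported": float(ident_pct),
        "positives_pct_reported": float(pos_pct),
    }


def percent(count, length):
    """Percentage of aligned positions, rounded to two decimals."""
    return round(100.0 * count / length, 2)


if __name__ == "__main__":
    hsp = parse_hsp_summary(
        "Identity = 259/409 (63.33%), Positives = 315/409 (77.02%), Query Frame = 1"
    )
    print(percent(hsp["identical"], hsp["length"]), hsp["identity_pct_reported"])
    print(percent(hsp["positives"], hsp["length"]), hsp["positives_pct_reported"])
```

Comparing the recomputed and reported values is a quick way to spot summary lines that were damaged during extraction.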

The following BLAST results are available for this feature:
Match Name                        E-value    Identity (%)  Description
MUTYH_ARATH                       2.8e-120   54.69         Adenine DNA glycosylase OS=Arabidopsis thaliana GN=MYH PE=3 SV=1
RU2A_ARATH                        9.9e-86    77.31         U2 small nuclear ribonucleoprotein A' OS=Arabidopsis thaliana GN=At1g09760 PE=2 ...
MUTYH_MOUSE                       1.7e-82    42.46         Adenine DNA glycosylase OS=Mus musculus GN=Mutyh PE=2 SV=2
MUTYH_RAT                         1.8e-79    41.82         Adenine DNA glycosylase OS=Rattus norvegicus GN=Mutyh PE=2 SV=1
MUTYH_HUMAN                       2.1e-67    54.98         Adenine DNA glycosylase OS=Homo sapiens GN=MUTYH PE=1 SV=1

Match Name                        E-value    Identity (%)  Description
A0A0A0KC27_CUCSA                  3.2e-208   91.02         Uncharacterized protein OS=Cucumis sativus GN=Csa_6G088720 PE=4 SV=1
E5GB45_CUCME                      7.9e-207   91.27         A/G-specific adenine DNA glycosylase OS=Cucumis melo subsp. melo PE=4 SV=1
B9IAY3_POPTR                      2.2e-140   63.33         Uncharacterized protein OS=Populus trichocarpa GN=POPTR_0014s17120g PE=4 SV=2
A0A0D2LX00_GOSRA                  9.2e-139   64.46         Uncharacterized protein OS=Gossypium raimondii GN=B456_001G100200 PE=4 SV=1
A0A0B0MDJ1_GOSAR                  2.0e-138   63.92         Tm9sf4 OS=Gossypium arboreum GN=F383_16864 PE=4 SV=1

Match Name                        E-value    Identity (%)  Description
gi|778711687|ref|XP_004140565.2|  4.6e-208   91.02         PREDICTED: A/G-specific adenine DNA glycosylase [Cucumis sativus]
gi|700191190|gb|KGN46394.1|       4.6e-208   91.02         hypothetical protein Csa_6G088720 [Cucumis sativus]
gi|659119956|ref|XP_008459934.1|  1.1e-206   91.27         PREDICTED: A/G-specific adenine DNA glycosylase [Cucumis melo]
gi|307135815|gb|ADN33687.1|       1.1e-206   91.27         A/G-specific adenine DNA glycosylase [Cucumis melo subsp. melo]
gi|566204725|ref|XP_002321221.2|  3.1e-140   63.33         hypothetical protein POPTR_0014s17120g [Populus trichocarpa]

The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term        Definition
IPR000445   HhH_motif
IPR001611   Leu-rich_rpt
IPR003265   HhH-GPD_domain
IPR003603   U2A'_phosphoprotein32A_C
IPR004036   Endonuclease-III-like_CS2
IPR011257   DNA_glycosylase
IPR015797   NUDIX_hydrolase-like_dom_sf
IPR023170   HTH_base_excis_C

Vocabulary: Molecular Function
Term        Definition
GO:0003677  DNA binding
GO:0005515  protein binding
GO:0003824  catalytic activity
GO:0016787  hydrolase activity

Vocabulary: Biological Process
Term        Definition
GO:0006284  base-excision repair
GO:0006281  DNA repair

GO Assignments
This gene is annotated with the following GO terms.
Category            Term Accession  Term Name
biological_process  GO:0006284      base-excision repair
biological_process  GO:0006298      mismatch repair
biological_process  GO:0008150      biological_process
biological_process  GO:0009409      response to cold
biological_process  GO:0006281      DNA repair
biological_process  GO:0000398      mRNA splicing, via spliceosome
biological_process  GO:0001510      RNA methylation
biological_process  GO:0006306      DNA methylation
cellular_component  GO:0030529      intracellular ribonucleoprotein complex
cellular_component  GO:0005634      nucleus
cellular_component  GO:0005654      nucleoplasm
cellular_component  GO:0044444      cytoplasmic part
cellular_component  GO:0019013      viral nucleocapsid
cellular_component  GO:0005730      nucleolus
cellular_component  GO:0005829      cytosol
cellular_component  GO:0009507      chloroplast
cellular_component  GO:0015030      Cajal body
cellular_component  GO:0005575      cellular_component
molecular_function  GO:0032357      oxidized purine DNA binding
molecular_function  GO:0046872      metal ion binding
molecular_function  GO:0035485      adenine/guanine mispair binding
molecular_function  GO:0034039      8-oxo-7,8-dihydroguanine DNA N-glycosylase activity
molecular_function  GO:0051539      4 iron, 4 sulfur cluster binding
molecular_function  GO:0019104      DNA N-glycosylase activity
molecular_function  GO:0003677      DNA binding
molecular_function  GO:0016798      hydrolase activity, acting on glycosyl bonds
molecular_function  GO:0005488      binding
molecular_function  GO:0003674      molecular_function
molecular_function  GO:0005515      protein binding
molecular_function  GO:0003824      catalytic activity
molecular_function  GO:0016787      hydrolase activity
molecular_function  GO:0000701      purine-specific mismatch base pair DNA N-glycosylase activity
This gene is associated with the following unigenes:
Unigene Name  Analysis Name                    Sequence type in Unigene
WMU31097      watermelon unigene v2 vs TrEMBL  transcribed_cluster
WMU32973      watermelon unigene v2 vs TrEMBL  transcribed_cluster

The following mRNA feature(s) are a part of this gene:

Feature Name  Unique Name  Type
Cla007083     Cla007083.1  mRNA


The following transcribed_cluster feature(s) are associated with this gene:

Feature Name  Unique Name  Type
WMU32973      WMU32973     transcribed_cluster
WMU31097      WMU31097     transcribed_cluster


Analysis Name: InterPro Annotations of watermelon (97103)
Date Performed: 2016-09-28
IPR Term   IPR Description                                         Source   Source Term         Source Description                                 Alignment
IPR000445  Helix-hairpin-helix motif                               PFAM     PF00633             HHH                                                coord: 157..184, score: 1.
IPR001611  Leucine-rich repeat                                     PROFILE  PS51450             LRR                                                coord: 438..459, score: 5.771; coord: 487..508, score: 6.588; coord: 416..437, score: 6.226; coord: 394..414, score: 5.294; coord: 462..483, score:
IPR003265  HhH-GPD domain                                          PFAM     PF00730             HhH-GPD                                            coord: 92..223, score: 1.3
IPR003265  HhH-GPD domain                                          SMART    SM00478             endo3end                                           coord: 96..246, score: 2.8
IPR003603  U2A'/phosphoprotein 32 family A, C-terminal             SMART    SM00446             LRRcap_2                                           coord: 501..519, score: 0.
IPR004036  Endonuclease III-like, conserved site-2                 PROSITE  PS01155             ENDONUCLEASE_III_2                                 coord: 158..187, score:
IPR011257  DNA glycosylase                                         GENE3D   G3DSA:1.10.340.30   -                                                  coord: 83..178, score: 3.3
IPR011257  DNA glycosylase                                         unknown  SSF48150            DNA-glycosylase                                    coord: 55..267, score: 3.77
IPR015797  NUDIX hydrolase domain-like                             GENE3D   G3DSA:3.90.79.10    -                                                  coord: 312..396, score: 6.5
IPR015797  NUDIX hydrolase domain-like                             unknown  SSF55811            Nudix                                              coord: 313..401, score: 7.91
IPR023170  Helix-turn-helix, base-excision DNA repair, C-terminal  GENE3D   G3DSA:1.10.1670.10  -                                                  coord: 179..268, score: 1.3
None       No IPR available                                        PANTHER  PTHR10359           A/G-SPECIFIC ADENINE GLYCOSYLASE/ENDONUCLEASE III  coord: 5..395, score: 6.6E
None       No IPR available                                        PANTHER  PTHR10359:SF1       A/G-SPECIFIC ADENINE DNA GLYCOSYLASE               coord: 5..395, score: 6.6E
None       No IPR available                                        PFAM     PF14580             LRR_9                                              coord: 397..546, score: 1.3
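
The `coord: start..end` entries above can be compared to see which annotated regions lie inside one another or do not overlap at all. The following is a minimal Python sketch. It assumes the coordinates are 1-based, inclusive residue positions, which this listing does not state; the helper names `parse_coords`, `span_length` and `overlaps` are illustrative.

```python
import re

# Matches entries such as "coord: 157..184" as they appear in this listing.
COORD = re.compile(r"coord:\s*(\d+)\.\.(\d+)")


def parse_coords(text):
    """Return every (start, end) pair found in an alignment cell."""
    return [(int(start), int(end)) for start, end in COORD.findall(text)]


def span_length(start, end):
    """Residues covered, assuming 1-based inclusive coordinates."""
    return end - start + 1


def overlaps(a, b):
    """True when two inclusive (start, end) ranges share a position."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    hhh = parse_coords("coord: 157..184")[0]      # PF00633 row
    hhh_gpd = parse_coords("coord: 92..223")[0]   # PF00730 row
    lrr_9 = parse_coords("coord: 397..546")[0]    # PF14580 row
    print(span_length(*hhh), overlaps(hhh, hhh_gpd), overlaps(hhh_gpd, lrr_9))
```

Under that assumption, the helix-hairpin-helix motif at 157..184 falls inside the HhH-GPD domain at 92..223, while the LRR_9 match at 397..546 does not overlap that domain.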