CSPI01G10200 (gene) Wild cucumber (PI 183967)

Name: CSPI01G10200
Type: gene
Organism: Cucumis sativus (Wild cucumber (PI 183967))
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Location: Chr1 : 6384121 .. 6392436 (-)
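As a minimal sketch of how a location string in this form could be read programmatically, the Python snippet below extracts the chromosome, coordinates, and strand, and derives the span length. The function name `parse_location` and the regular expression are illustrative assumptions, not part of the database's own tooling; the length calculation also assumes the coordinates are 1-based and inclusive.

```python
import re

def parse_location(text):
    """Parse a string such as 'Chr1 : 6384121 .. 6392436 (-)' (hypothetical helper)."""
    m = re.search(r"(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)", text)
    if m is None:
        raise ValueError(f"unrecognized location: {text!r}")
    start, end = int(m.group(2)), int(m.group(3))
    return {
        "chrom": m.group(1),
        "start": start,
        "end": end,
        "strand": m.group(4),
        # Assumes 1-based, inclusive coordinates.
        "length": end - start + 1,
    }

loc = parse_location("Chr1 : 6384121 .. 6392436 (-)")
print(loc["chrom"], loc["strand"], loc["length"])  # Chr1 - 8316
```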
The following sequences are available for this feature:

Gene sequence (with intron)

ATGGAACAACAACGAGTAGAGGTTGTTGTTGAACACAACTTCATTTCACCACCGTCATATACGGTTGACGATGCCGAATCGGAACCAACTTCTGCCGCCCACTTGTCCAATTGCACGACACAGGAAATAAGCCGCACAATAGTCTCCGATTATGAACTGGATTCGGACCCCATGGAGAAAAATCGAGGTAATTAAAATTTTATACTCTCGTTTAATTTAAATATAATTGTTTAACTGATTTTCATAACATTTTGAGGTCTGGGATTTGAGTTAGATTAGTCTTTTTGGTTTCAAAATCAAATAATTAAGAACGGCAGCGATTGTGGGTGGGGAAATGATGAAGGGACAATGGAAGGGAAGAGGTTGGGGCCTGGGGGATGAGAGAGAAAAAATCTTTTCTTTTTTTTTTTTAGCAAATTACATCAAGAACCTCTTTATACAATATAGTTGTAATAGCAAATAAGAAACTTTTAATAGTTAATGCTCGAATTTATTAAAATTGGACTCTTAAACTTATATAAGGATAGAATTTTATCGTTTTGAATCAAATTTCTATAATTGTTGTAAAGATTGAGAATTCATTTGGATAATTATAATCGATGACCATTTTGAGGAATAATAATCAAGGATATAACAACATTTAAGAAAATTGCAAATATAGAAAACTATCGCTGATAAACTTGTATTACTGATAAGACTCGTATGGTCTATTAGTCAATATTTACAACATGGTCTATTGATGATAAACTTATATCATTGATAGAATTTGACAAATTATTTTATATTTACAAATTTTTTAAAATGTTGCTATATACTTAATTATTTTGAATCTAATAGCTAAATTTGCAACTATCCCAATCCATTTTTACAATTGAAAGCCTGGAGATATATCTGCACGCAACTTAGGATCATGATTTGTCAAGGTGATTTTTGCAATTTGCCATTATTTTTAAAATTTTAATATAGTTTTATTTTAATAAGAGTACAGATATAGAATCCATCTGATATATCTCGACTAGGTAGACACTTACACAACAACATTAATATAGTACTCATTATTAAAATTATTTTTCAAATGCAACAACATTAAACTGTAAGAAGAATACTTAGGTGTATCGTGGGTAACACGTATGAATGAGAATTTGTGTGTGTGCATAGAAATACTTAGGTACTTTATGTTCAACTATTTTAGGTATTTTATGCAACTTGCTGGGATACGGATTGAAATACTTTTGGTGTATTCTGGACAACATTTATATAATAAAATATTTGTAGGATGTACCAAAATATTACCTGGTCATTTAAAATGCTACAATATTGTAATATTTGTTGAGTTGAAATTTATGAGACAAATTGAAATCTAAAATGAATTATATGAACAATGGAATTTGAAAATTGAATGTCACTGTTACACAAATAACAGTTATTGAAATTTAATAAAAGTATGTATGAATATATATGTTGATCAGCTGAGACATCAAGACGACTTCTTTTGTACAAATCGGCATTGAAGGGTGATTGGAAAAGAGCTGAATTAGTCCTAAATGATTACCCACATTACGTTCGTTGTGCAATAACAAGAAACAAAGAGACTGTTCTTCATGTTGCTGCCGGAGCGAAGCAGAGTGTGTTCGTGGAGGAGTTAGTGAATCGAATGACCCGAAAAGACATGGCTTTGCGAGACAAATATGGAAATACTGCCCTTTGCTTTGCTGCTACATCAAGAATTGTCAAAATTGCTAAACTAATGGTGGAAAAAAATCATGAACTTCCTCTTATTCGTACTTTCAGAGAAGGAACTCCACTTCTCATTGCAGTTTCTTATAAAAGTAGAGATATGATTTCTTACCTTTTGTCTGTCACTGATCTTAGCCAGCTAACTGCCCAAGAACGGATTGAGCTTCTTATTGCCACGATCCATAGCGATTTTCTTGGTAAGTCATTCTATTTCTTCACCAAAGTCATTCTATTTCTTCACAAACTTAACTTGCATTAATGTT
TTTCAAATATTGAAGACCCTGTTTGAATTATAAATGAATGAAAAAGTTTCTGAAAGTGCATTTATATTTAAAGAGATTGTTCCTTGACTGTAGTCTTTGAAGAGACAATTTTGGGGGTAGTAGTTGATCATTTTTGTATATTATGAAATTTTGAGTGGATTCCAATTCTACATGATTTAACAACTTTGTCTTTTGTTTTTTCCTTAAAACGTAATTTCGAGTGTTACACTCATGATCTTGGTGTATGCACCGTTGAAAATTAAATATGCTTTAAATAAATAAGTTATATGTGAATTGATTTATGAGAGTTAGAAAAAGTTAAAGGTTGAGAGTTAAAAGGTTTAAAGTTAATTTAGTATGAATTTTCATATTAATTAAGTGTAGGTTTTAAATTGTTGTGTTTAATTTGTGGTATGAGTTATGCTCATGAGAATCGTATAAATAGGATTGAAATGTTGTAATCTAAATTATCCAAGTGTGAGTAGAATACACATAAAAAAGTTAGATTCTTCAAATAAAGCTTGTGTTTTCTTCTTAAATATTCTATTTGGTAATTATCTATTTGAGGTGTTCATTGTACGCTTCCAACAAAGTGGTATCAGAGCTTGAGTTCGGAATATAAAAAAATTGTTTTTTTTTTATTTTGCAAATGGCAAACAACAATTTAGTTCCCTTCCAAGTACCTCGACTTACGAAAGAAAATTATAGCAGCTGGTGTATTCGAGTGAAAGCTCTACTTGGTTCACAAGATGTGTGGGACATTGTTAATAATGGTTATGAAGAACCAGAAAGTGATACAGCTTTGAGTCAAGCTCAACGAGAAAATTTACAAAATACAAAGAAAAGAGATCAAAAGGCTCTCACCATCATTCATCAAGCCATTGATGATTCAAATTTTGAGAAAATTTCTGGAGCAACTACTGCACATCAAGCATGACAAATTTTGGAGAATACGTATAAAGGAGTAGATCGAGTCAAGAAGGTTCGCCTTCAAAAATTGAGAGGTGATTATGAATCACTACATATGAAGGAGTCTGAATCGGTTTCGGATTATACGTCAAGATTGTTAGCAGCAGTAAATGAAATGAAAAGATATGGTGAGACAATAAGTGATGAGCAAGTAGTAGAAAAGATACTTCGCTCATTGGACGAAAAATTCAATTTCATCGTTGTAGCTATTGAAGAATCAAAGTATTTGAGTACAATGTCCATTGATCAACTTATGGGTTCTTTACAAGCCCACGAAGAGAAGCTTCTTAAGAAGAACAAGCAGACGACTGAGCAACTTTTTCAGTCAAAGTTGAATTTAAAAGACAAGGAAGACAGCCTAGAAAAAGGCAATCGAGGTCGAAGACGTGGTGGTAATCGTGGACATGGTGATTTCAGAGATCATGGTTGAGGAAACTTTGGTCAAAGAAAATTTGATGAGAGTAATTCAAACTCAAATTCATCAAAGGGTCGTGGAAGACAACATTATTCAAGGTTAAATGGAGAAAGATCAAATAATGACAAGAGGTATGACAAAAGACAGGCTGAATGTTATAATTGTCATAAATTTGGCCAATATTCCTGGGAATGCAAAAATAGAGTTGAAGAAAATGCAAATTATGCTGAGAAAGACGAAGAAAGAGGTAATTCATCATTGCTTCTAGCATGTAAAGGTGTGGAAACATGTGAAAACAATGCATGGTATCTTGATAGTGGTGCAAGCAATCATATGTGTGGATGTAAATCAATGTTCATGGAGCTTGATGAATCTGTTGGTGGTGATACTGTATTTGGTGATGCAACGAAAATTCCAGTTAGAGGAAAAGGTAAAATTTTGATCAATTTGAAGAATGGAAAGCATGAGTTTATCTCTAATGTTTATTATGTGCCTGATATGAAGAACAACATTTTGAGTTTGGGACAACTCTTAGAGAAAGGCTATAATATTTTGATGAAGGATTATAGTCGTCTTTTGATAAGAGATAATCATGACAATATGATTGCTAAAGTGC
AAATGACGAAAAATGGAATGTTTTTATTGAACATTCAAACTGATGTAGCGAAATGTTTAAAGTCATGTTTGAAAGATCCAAACTGGATTTGGCACTTGAGATTTGGGCATTTAAACTTTGATGGCTTGAGACTATTAGCCAAGAAGAACATGGTGAAAGGGTTGCCATATGTTAAACATCCAGACCAATTTTGTGAAGGTTGTCTTTATGGCAAACAATCAAGGAAGAGTTTTCCACAAGAATCATCTTGGAAAGCAAGGAAACCACTAAAGTTGGTTCACATTGATCTTTGTGGACCGATCAAACCAAGTTCTTTCGGTAAGAATAATTATTTCTTATTATTTATTGATGATTTCAGCCGAAAAACTTGGATTTATTTTGTCAAAGAGAAATCAGAAGTATTTGGCATGTTCAAGAGATTTAGAGCTCTTGTTGAAAAAGAAAGTGGTTATTACATAAAAGCATTGAGATCAGATAGGGGAGGTGAATTCACTTCAAATGAATTCAAAAAAATTTGCGCAGAAAATGGAATTCGTCGACCTATGACAGTTTCATTTACTCCTCCACAAAATGGTGTTGTTGAGAGGAAGAATCGAACAATATTTAACATGGCTCGAAGCATGTTGAAGAGCAAGAAGATGCCAAAAGAATTTTGGGCACAAGTTGTTGAATGTGCAATATACTTGTCAAATTGTTCCCCTACTAGAAGCTTGTGGAACAAAACTCCTCAACAAGCATGGACAGGAAGAAAACCATCCATTGCTCATTTGAGAGTATTTGGATGTATGACTTATGTGCATATACCAGATCAAAAGTGAACCTTTGAATTTTGAAAAAGCTTCGCAAAATGACAAATGGAAGATTGCTATGGATGAAGAGATAAAAGCTATAAAAAAGAATGATACGGGGGAACTTTCTACTCTTCCAAATGGAAAGAAAGCGGTAGGTGTCAAATGAGTGTTCAAGATAAAAAGAAATGAAAAAGGAGAATTGGAGAGATACAAAGCAAGATTAGTTGCAAAAGGATATTCTCAAAGAAAAGGCATTGATTACAATGAAGTGTTTGCTCCAGTTGCTCGTTTGGAAACCATAAGATTGTTAATTGCGATTGCTGCTCAAAATAATTGGAAGATCTCTTATATGGATGTCAAATCAGCATTTTTGAATGGATATCTAGAAGAAGAAGTCTACTTAGAACAACCTCTTGGTTATTCTGTGAAAGGTCAAGAGGATAAAGTTCTAAAATTGAAGAATGTATTATACGGATTGAAACAAGCACCAAGAATGTGGAGTAGCAGAATCAACAAATATTTCCTTGATAATGGGTATTTGAGGTGCCCTTATGAACACTCCCTTTATATTAAGACTAATGATCATGGAGATATTTTGGTTATTTGTGTGTACGTGAATGACTTAATTTTTATAGGAAATCCTGCAAGTATGTTTGAAGATCTCAAGAAGGCGATGACCCAAGAATTTGAAATGACAGATATAGGGCTGATGTCATATTATCTTGGTATTGAGGTGAAGCAAACAAATGAAGGTATTTTATCTCTCAAGAACAATATACTAGAGAAATTCTAGAGAAGTTCAATATGATTAATTCTAAGCCTGTTACAACTCCGATTGAAACGGGGACCAAATTGTCCAAATATGAAGAAGGAGATGTTGATCATTCATATTTCAAAAGTTTGGTTGGGAGTTTGAGATATTTAACTTGCACACGACCATATATTCTTTTCAGTGTTGGATTGGTGGGTCGATTTATGGAATCTCCTACAACTACTCATTTGAAAGTGGCAAAGAGAATTTTTCATTACCTCAAAGGTACGCTTGACTATGGGTTATTTTATTCTTCATCTAAAGAATTCAAGCTTGAAGGCTATTGTGATAGTGACTGGGCTGGAGATACTAATGATCGAAAGAGCACTAGTGGATTTGTATTCTTCATTGGTAATACTGCATTTACTTGGAGTTCTAAGAAACAACCTATTGTGACA
TTATCCACTTGTGAAGCCGAATACATTGCTGCAGCTTCATGTGCTTGTCATGCAGTTTGGTTAAGAAATTTGTTAAAGACAATTGGAATTTTGCAAGATGATCCAACTGTGATCCATGTAGATAATAAGTCAACAATTGCTCTGGCAAAGCATCATGTGTTCCATGATCGTAGCAAACACATTGATACTAGATTTCATTTCATCCGAGATTCCATTTCAAGGAAGGAGGTTCATGTTGAATATGTGAAGACTGAAGATCAAATTGCAGATATTTTCACGAAGCCACTCAAAGTTAATGTGTTTAATAAGTTAAGAAGTTCGCTTGGAGTTTTTTTTTTTAAAAAAAACATGTTTAAGGGAGGATGTTGAAAATTAAACAATGCTTTAAATAAATAAGTTTGTGGTATGAGTTATGCTCATGAAAATCCTATAAATAGGATTGAAATGTTGTAATCTAAATTATCCAACTGTAGAATACACATAAAAAAGTTAGATTCTTCTAACAAAGCTTGTTTTTTCTTCTTAAGTATTCTATTTGGTAATTATCTATTTGTGGTGTTTGTTGTACGCTTCCAACATGCACCTCCTAGTTCGTCTCCTCTAGTATCTAAAAGTGTTATACTCATCACTCATTTTCTAAAGTCATTTTAATTAATGATTATGTCTTCTTTTTGCAGATTTAAGCTTGTGGATTTTGAAACTATATCCGGAGTTAGCGGTAATGAAAGATACCAAGAACAATAATGAAACTGCATTACATGTATTGGCCAGAAAACCATCTGCCATGGATAGCACAAAGCAGCTACAAAATTTGAAAATGCGCATCAACTCTTGGAGATTTAATTCCAAACTATTCATATCACCTTGGAAACTTATTAATGAAATTTTGGCTTCTCTAATTTTGCCCTCCAATAGTAATAAAGATGTGACCAAAACATTGGCTCATCAATTAGTTGAATTCCTATGGAGGTATGTTGTATATGAGCTTCCACAAAAAGAGATGCTCGAGTTTATTAAACATCCCACAAGTTTGCTGAATGATGCTGCAGGGGCAGGTAATGTTGAGTTTTTGATTGTGCTCATTTGTGAATTTCCAGATATTTTATGGGGTGACGACGACAATGACGATAGTAAGAGTATATTTCATGTAGCTGTTGAAAATCGACTTGAGAATGTGTTTAATCTCATAAATGAGATTGGTAAGCTCAATGAATTCTCAACTAAATATAGAACTTTCAAGGGAAAATATAGCATCCTGCATTTGGCTGGAAATCTAGCTGCTCCAAACCATCTTAATAGAGTATCAGGAGCTGCTCTTCAAATGCAACGTGAAATGCTTTGGTTCAAGGTATGTACTTAGCTTAATTAAACACACCATTCCCATCTGTTAAACTTTTTCTTATTTCATTCCTTCTGGGTCAAATTAAATAAATTTTGTTCACGATTATTATAGAATAATAACAAATGTCATATAACTCGAGAATTGTTAAAGACTTATATTTGCATTAATTAAAAAAAGTCTAAGAATGCTAAATATTTTGAATCTAATTGCAACTTAAATAATTAAATACAAAACGAACTCAAAATCTTGGACAAGCATTAGAATTACCTGTAACTTGTTAGATAGATTATTTGTATTTAATTTCTCATACAACATGACATTTCAATTGATTAATGAAGGAAGTGGAGAAGATAGTTTTGCCTTCTCAACTTGAAGTGAAATCCAACGATCCAGATCCATCTATACCCAAGTTGACACCGCGCCAATTATTCACCGAAAAACACAAACGTCTACGTAAAGAGGGCGAAGAGTGGATGAAAAACACAGCAAACTCATGCATGCTCGTCGCTACTTTGATATCCACTGTAGTTTTTGCTGCAGCCTTCACAGTTCCTGGTGGCAATGATGATAATACAGGAACTCCTATTTTCCAAAATAAGTTTTGGTTTGCAATGTTTGTAGTGTCAGATGCAATTGCCTTGTTTTCGTCTTCTACGTC
TATTTTAATGTTCTTGTCGATATTAACTTCACGCTATGCAGAAGAAGATTTCTTACACTCACTGCCCTCGAAATTGCTGTTTGGACTCGCATCGCTCTTCATCTCCATTGTGTTCATGGCTGTAGCTTTTAGTTCTACCTTCTTTTTGATCTACCACAACGCTAATATCTCCATCCCAACCATGGTCACTGCAATGGCCATTATTCCCATTACTTGTTTTTGTTTACTTCAGTTTACATTGTGGATTGATATTTTTCACAACACTTACTCCTCTAGATTTCTTTTTAATCCCAATAATCCACGCAAATTGTTTTAA

mRNA sequence

ATGGAACAACAACGAGTAGAGGTTGTTGTTGAACACAACTTCATTTCACCACCGTCATATACGGTTGACGATGCCGAATCGGAACCAACTTCTGCCGCCCACTTGTCCAATTGCACGACACAGGAAATAAGCCGCACAATAGTCTCCGATTATGAACTGGATTCGGACCCCATGGAGAAAAATCGAGCTGAGACATCAAGACGACTTCTTTTGTACAAATCGGCATTGAAGGGTGATTGGAAAAGAGCTGAATTAGTCCTAAATGATTACCCACATTACGTTCGTTGTGCAATAACAAGAAACAAAGAGACTGTTCTTCATGTTGCTGCCGGAGCGAAGCAGAGTGTGTTCGTGGAGGAGTTAGTGAATCGAATGACCCGAAAAGACATGGCTTTGCGAGACAAATATGGAAATACTGCCCTTTGCTTTGCTGCTACATCAAGAATTGTCAAAATTGCTAAACTAATGGTGGAAAAAAATCATGAACTTCCTCTTATTCGTACTTTCAGAGAAGGAACTCCACTTCTCATTGCAGTTTCTTATAAAAGTAGAGATATGATTTCTTACCTTTTGTCTGTCACTGATCTTAGCCAGCTAACTGCCCAAGAACGGATTGAGCTTCTTATTGCCACGATCCATAGCGATTTTCTTGGAGTAGATCGAGTCAAGAAGGTTCGCCTTCAAAAATTGAGAGGTGATTATGAATCACTACATATGAAGGAGTCTGAATCGGTTTCGGATTATACGTCAAGATTGTTAGCAGCAGTAAATGAAATGAAAAGATATGGTGAGACAATAAGTGATGAGCAAGTAGTAGAAAAGATACTTCGCTCATTGGACGAAAAATTCAATTTCATCGTTGTAGCTATTGAAGAATCAAAGTATTTGAGTACAATGTCCATTGATCAACTTATGGGTTCTTTACAAGCCCACGAAGAGAAGCTTCTTAAGAAGAACAAGCAGACGACTGAGCAACTTTTTCAGTCAAAGTTGAATTTAAAAGACAAGGAAGACAGCCTAGAAAAAGGCAATCGAGGTCGAAGACGTGGTGGTAATCGTGGACATGGTGATTTCAGAGATCATGAAAGATCAAATAATGACAAGAGGTATGACAAAAGACAGGCTGAATGTTATAATTGTCATAAATTTGGCCAATATTCCTGGGAATGCAAAAATAGAGTTGAAGAAAATGCAAATTATGCTGAGAAAGACGAAGAAAGAGGTAATTCATCATTGCTTCTAGCATGTAAAGGTGTGGAAACATGTGAAAACAATGCATGGTATCTTGATAGTGGTGCAAGCAATCATATGTGTGGATGTAAATCAATGTTCATGGAGCTTGATGAATCTGTTGGTGGTGATACTGTATTTGGTGATGCAACGAAAATTCCAGTTAGAGGAAAAGGTAAAATTTTGATCAATTTGAAGAATGGAAAGCATGAGTTTATCTCTAATGTTTATTATGTGCCTGATATGAAGAACAACATTTTGAGTTTGGGACAACTCTTAGAGAAAGGCTATAATATTTTGATGAAGGATTATAGTCGTCTTTTGATAAGAGATAATCATGACAATATGATTGCTAAAGTGCAAATGACGAAAAATGGAATGTTTTTATTGAACATTCAAACTGATGTAGCGAAATGTTTAAAGTCATGTTTGAAAGATCCAAACTGGATTTGGCACTTGAGATTTGGGCATTTAAACTTTGATGGCTTGAGACTATTAGCCAAGAAGAACATGGTGAAAGGGTTGCCATATGTTAAACATCCAGACCAATTTTGTGAAGGTTGTCTTTATGGCAAACAATCAAGGAAGAGTTTTCCACAAGAATCATCTTGGAAAGCAAGGAAACCACTAAAGTTGGTTCACATTGATCTTTGTGGACCGATCAAACCAAGTTCTTTCGGTAAGAATAATTATTTCTTATTATTTATTGATGATTTCAGCCGAAAAACTTGGATTTATTTTGTCAAAGAGAAATCAGAAGTATTTGGCAT
GTTCAAGAGATTTAGAGCTCTTGTTGAAAAAGAAAGTGGTTATTACATAAAAGCATTGAGATCAGATAGGGGAGGTGAATTCACTTCAAATGAATTCAAAAAAATTTGCGCAGAAAATGGAATTCGTCGACCTATGACAGTTTCATTTACTCCTCCACAAAATGGTGTTGTTGAGAGGAAGAATCGAACAATATTTAACATGGCTCGAAGCATGTTGAAGAGCAAGAAGATGCCAAAAGAATTTTGGGCACAAGTTGTTGAATGTGCAATATACTTGTCAAATTGTTCCCCTACTAGAAGCTTGTGGAACAAAACTCCTCAACAAGCATGGACAGGAAGAAAACCATCCATTGCTCATTTGAGAATCAAAAGTGAACCTTTGAATTTTGAAAAAGCTTCGCAAAATGACAAATGGAAGATTGCTATGGATGAAGAGATAAAAGCTATAAAAAAGAATGATACGGGGGAACTTTCTACTCTTCCAAATGGAAAGAAAGCGATAAAAAGAAATGAAAAAGGAGAATTGGAGAGATACAAAGCAAGATTAGTTGCAAAAGGATATTCTCAAAGAAAAGGCATTGATTACAATGAAGTGTTTGCTCCAGTTGCTCGTTTGGAAACCATAAGATTGTTAATTGCGATTGCTGCTCAAAATAATTGGAAGATCTCTTATATGGATGTCAAATCAGCATTTTTGAATGGATATCTAGAAGAAGAAGTCTACTTAGAACAACCTCTTGGTTATTCTGTGAAAGGTCAAGAGGATAAAGTTCTAAAATTGAAGAATGTATTATACGGATTGAAACAAGCACCAAGAATGTGGAGTAGCAGAATCAACAAATATTTCCTTGATAATGGGTATTTGAGGTGCCCTTATGAACACTCCCTTTATATTAAGACTAATGATCATGGAGATATTTTGGTTATTTGTGTGTACGTGAATGACTTAATTTTTATAGGAAATCCTGCAAGTATGTTTGAAGATCTCAAGAAGGCGATGACCCAAGAATTTGAAATGACAGATATAGGGCTGATGTCATATTATCTTGGTATTGAGCCTGTTACAACTCCGATTGAAACGGGGACCAAATTGTCCAAATATGAAGAAGGAGATGTTGATCATTCATATTTCAAAAGTTTGGTTGGGAGTTTGAGATATTTAACTTGCACACGACCATATATTCTTTTCAGTGTTGGATTGGTGGGTCGATTTATGGAATCTCCTACAACTACTCATTTGAAAGTGGCAAAGAGAATTTTTCATTACCTCAAAGGTACGCTTGACTATGGGTTATTTTATTCTTCATCTAAAGAATTCAAGCTTGAAGGCTATTGTGATAGTGACTGGGCTGGAGATACTAATGATCGAAAGAGCACTAGTGGATTTGTATTCTTCATTGGTAATACTGCATTTACTTGGAGTTCTAAGAAACAACCTATTGTGACATTATCCACTTGTGAAGCCGAATACATTGCTGCAGCTTCATGTGCTTGTCATGCAGTTTGGTTAAGAAATTTGTTAAAGACAATTGGAATTTTGCAAGATGATCCAACTGTGATCCATGTAGATAATAAGTCAACAATTGCTCTGGCAAAGCATCATGTGTTCCATGATCGTAGCAAACACATTGATACTAGATTTCATTTCATCCGAGATTCCATTTCAAGGAAGGAGGTTCATGTTGAATATGTGAAGACTGAAGATCAAATTGCAGATATTTTCACGAAGCCACTCAAAGTTAATGTGTTTAATAAGTTAAGAAATTTAAGCTTGTGGATTTTGAAACTATATCCGGAGTTAGCGGTAATGAAAGATACCAAGAACAATAATGAAACTGCATTACATGTATTGGCCAGAAAACCATCTGCCATGGATAGCACAAAGCAGCTACAAAATTTGAAAATGCGCATCAACTCTTGGAGATTTAATTCCAAACTATTCATATCACCTTGGAAACTTATTAATGAAATTTTGGCTTCTCTAATTTTGCCCTCCAATA
GTAATAAAGATGTGACCAAAACATTGGCTCATCAATTAGTTGAATTCCTATGGAGGTATGTTGTATATGAGCTTCCACAAAAAGAGATGCTCGAGTTTATTAAACATCCCACAAGTTTGCTGAATGATGCTGCAGGGGCAGGTAATGTTGAGTTTTTGATTGTGCTCATTTGTGAATTTCCAGATATTTTATGGGGTGACGACGACAATGACGATAGTAAGAGTATATTTCATGTAGCTGTTGAAAATCGACTTGAGAATGTGTTTAATCTCATAAATGAGATTGGTAAGCTCAATGAATTCTCAACTAAATATAGAACTTTCAAGGGAAAATATAGCATCCTGCATTTGGCTGGAAATCTAGCTGCTCCAAACCATCTTAATAGAGTATCAGGAGCTGCTCTTCAAATGCAACGTGAAATGCTTTGGTTCAAGGAAGTGGAGAAGATAGTTTTGCCTTCTCAACTTGAAGTGAAATCCAACGATCCAGATCCATCTATACCCAAGTTGACACCGCGCCAATTATTCACCGAAAAACACAAACGTCTACGTAAAGAGGGCGAAGAGTGGATGAAAAACACAGCAAACTCATGCATGCTCGTCGCTACTTTGATATCCACTGTAGTTTTTGCTGCAGCCTTCACAGTTCCTGGTGGCAATGATGATAATACAGGAACTCCTATTTTCCAAAATAAGTTTTGGTTTGCAATGTTTGTAGTGTCAGATGCAATTGCCTTGTTTTCGTCTTCTACGTCTATTTTAATGTTCTTGTCGATATTAACTTCACGCTATGCAGAAGAAGATTTCTTACACTCACTGCCCTCGAAATTGCTGTTTGGACTCGCATCGCTCTTCATCTCCATTGTGTTCATGGCTGTAGCTTTTAGTTCTACCTTCTTTTTGATCTACCACAACGCTAATATCTCCATCCCAACCATGGTCACTGCAATGGCCATTATTCCCATTACTTGTTTTTGTTTACTTCAGTTTACATTGTGGATTGATATTTTTCACAACACTTACTCCTCTAGATTTCTTTTTAATCCCAATAATCCACGCAAATTGTTTTAA

Coding sequence (CDS)

ATGGAACAACAACGAGTAGAGGTTGTTGTTGAACACAACTTCATTTCACCACCGTCATATACGGTTGACGATGCCGAATCGGAACCAACTTCTGCCGCCCACTTGTCCAATTGCACGACACAGGAAATAAGCCGCACAATAGTCTCCGATTATGAACTGGATTCGGACCCCATGGAGAAAAATCGAGCTGAGACATCAAGACGACTTCTTTTGTACAAATCGGCATTGAAGGGTGATTGGAAAAGAGCTGAATTAGTCCTAAATGATTACCCACATTACGTTCGTTGTGCAATAACAAGAAACAAAGAGACTGTTCTTCATGTTGCTGCCGGAGCGAAGCAGAGTGTGTTCGTGGAGGAGTTAGTGAATCGAATGACCCGAAAAGACATGGCTTTGCGAGACAAATATGGAAATACTGCCCTTTGCTTTGCTGCTACATCAAGAATTGTCAAAATTGCTAAACTAATGGTGGAAAAAAATCATGAACTTCCTCTTATTCGTACTTTCAGAGAAGGAACTCCACTTCTCATTGCAGTTTCTTATAAAAGTAGAGATATGATTTCTTACCTTTTGTCTGTCACTGATCTTAGCCAGCTAACTGCCCAAGAACGGATTGAGCTTCTTATTGCCACGATCCATAGCGATTTTCTTGGAGTAGATCGAGTCAAGAAGGTTCGCCTTCAAAAATTGAGAGGTGATTATGAATCACTACATATGAAGGAGTCTGAATCGGTTTCGGATTATACGTCAAGATTGTTAGCAGCAGTAAATGAAATGAAAAGATATGGTGAGACAATAAGTGATGAGCAAGTAGTAGAAAAGATACTTCGCTCATTGGACGAAAAATTCAATTTCATCGTTGTAGCTATTGAAGAATCAAAGTATTTGAGTACAATGTCCATTGATCAACTTATGGGTTCTTTACAAGCCCACGAAGAGAAGCTTCTTAAGAAGAACAAGCAGACGACTGAGCAACTTTTTCAGTCAAAGTTGAATTTAAAAGACAAGGAAGACAGCCTAGAAAAAGGCAATCGAGGTCGAAGACGTGGTGGTAATCGTGGACATGGTGATTTCAGAGATCATGAAAGATCAAATAATGACAAGAGGTATGACAAAAGACAGGCTGAATGTTATAATTGTCATAAATTTGGCCAATATTCCTGGGAATGCAAAAATAGAGTTGAAGAAAATGCAAATTATGCTGAGAAAGACGAAGAAAGAGGTAATTCATCATTGCTTCTAGCATGTAAAGGTGTGGAAACATGTGAAAACAATGCATGGTATCTTGATAGTGGTGCAAGCAATCATATGTGTGGATGTAAATCAATGTTCATGGAGCTTGATGAATCTGTTGGTGGTGATACTGTATTTGGTGATGCAACGAAAATTCCAGTTAGAGGAAAAGGTAAAATTTTGATCAATTTGAAGAATGGAAAGCATGAGTTTATCTCTAATGTTTATTATGTGCCTGATATGAAGAACAACATTTTGAGTTTGGGACAACTCTTAGAGAAAGGCTATAATATTTTGATGAAGGATTATAGTCGTCTTTTGATAAGAGATAATCATGACAATATGATTGCTAAAGTGCAAATGACGAAAAATGGAATGTTTTTATTGAACATTCAAACTGATGTAGCGAAATGTTTAAAGTCATGTTTGAAAGATCCAAACTGGATTTGGCACTTGAGATTTGGGCATTTAAACTTTGATGGCTTGAGACTATTAGCCAAGAAGAACATGGTGAAAGGGTTGCCATATGTTAAACATCCAGACCAATTTTGTGAAGGTTGTCTTTATGGCAAACAATCAAGGAAGAGTTTTCCACAAGAATCATCTTGGAAAGCAAGGAAACCACTAAAGTTGGTTCACATTGATCTTTGTGGACCGATCAAACCAAGTTCTTTCGGTAAGAATAATTATTTCTTATTATTTATTGATGATTTCAGCCGAAAAACTTGGATTTATTTTGTCAAAGAGAAATCAGAAGTATTTGGCAT
GTTCAAGAGATTTAGAGCTCTTGTTGAAAAAGAAAGTGGTTATTACATAAAAGCATTGAGATCAGATAGGGGAGGTGAATTCACTTCAAATGAATTCAAAAAAATTTGCGCAGAAAATGGAATTCGTCGACCTATGACAGTTTCATTTACTCCTCCACAAAATGGTGTTGTTGAGAGGAAGAATCGAACAATATTTAACATGGCTCGAAGCATGTTGAAGAGCAAGAAGATGCCAAAAGAATTTTGGGCACAAGTTGTTGAATGTGCAATATACTTGTCAAATTGTTCCCCTACTAGAAGCTTGTGGAACAAAACTCCTCAACAAGCATGGACAGGAAGAAAACCATCCATTGCTCATTTGAGAATCAAAAGTGAACCTTTGAATTTTGAAAAAGCTTCGCAAAATGACAAATGGAAGATTGCTATGGATGAAGAGATAAAAGCTATAAAAAAGAATGATACGGGGGAACTTTCTACTCTTCCAAATGGAAAGAAAGCGATAAAAAGAAATGAAAAAGGAGAATTGGAGAGATACAAAGCAAGATTAGTTGCAAAAGGATATTCTCAAAGAAAAGGCATTGATTACAATGAAGTGTTTGCTCCAGTTGCTCGTTTGGAAACCATAAGATTGTTAATTGCGATTGCTGCTCAAAATAATTGGAAGATCTCTTATATGGATGTCAAATCAGCATTTTTGAATGGATATCTAGAAGAAGAAGTCTACTTAGAACAACCTCTTGGTTATTCTGTGAAAGGTCAAGAGGATAAAGTTCTAAAATTGAAGAATGTATTATACGGATTGAAACAAGCACCAAGAATGTGGAGTAGCAGAATCAACAAATATTTCCTTGATAATGGGTATTTGAGGTGCCCTTATGAACACTCCCTTTATATTAAGACTAATGATCATGGAGATATTTTGGTTATTTGTGTGTACGTGAATGACTTAATTTTTATAGGAAATCCTGCAAGTATGTTTGAAGATCTCAAGAAGGCGATGACCCAAGAATTTGAAATGACAGATATAGGGCTGATGTCATATTATCTTGGTATTGAGCCTGTTACAACTCCGATTGAAACGGGGACCAAATTGTCCAAATATGAAGAAGGAGATGTTGATCATTCATATTTCAAAAGTTTGGTTGGGAGTTTGAGATATTTAACTTGCACACGACCATATATTCTTTTCAGTGTTGGATTGGTGGGTCGATTTATGGAATCTCCTACAACTACTCATTTGAAAGTGGCAAAGAGAATTTTTCATTACCTCAAAGGTACGCTTGACTATGGGTTATTTTATTCTTCATCTAAAGAATTCAAGCTTGAAGGCTATTGTGATAGTGACTGGGCTGGAGATACTAATGATCGAAAGAGCACTAGTGGATTTGTATTCTTCATTGGTAATACTGCATTTACTTGGAGTTCTAAGAAACAACCTATTGTGACATTATCCACTTGTGAAGCCGAATACATTGCTGCAGCTTCATGTGCTTGTCATGCAGTTTGGTTAAGAAATTTGTTAAAGACAATTGGAATTTTGCAAGATGATCCAACTGTGATCCATGTAGATAATAAGTCAACAATTGCTCTGGCAAAGCATCATGTGTTCCATGATCGTAGCAAACACATTGATACTAGATTTCATTTCATCCGAGATTCCATTTCAAGGAAGGAGGTTCATGTTGAATATGTGAAGACTGAAGATCAAATTGCAGATATTTTCACGAAGCCACTCAAAGTTAATGTGTTTAATAAGTTAAGAAATTTAAGCTTGTGGATTTTGAAACTATATCCGGAGTTAGCGGTAATGAAAGATACCAAGAACAATAATGAAACTGCATTACATGTATTGGCCAGAAAACCATCTGCCATGGATAGCACAAAGCAGCTACAAAATTTGAAAATGCGCATCAACTCTTGGAGATTTAATTCCAAACTATTCATATCACCTTGGAAACTTATTAATGAAATTTTGGCTTCTCTAATTTTGCCCTCCAATA
GTAATAAAGATGTGACCAAAACATTGGCTCATCAATTAGTTGAATTCCTATGGAGGTATGTTGTATATGAGCTTCCACAAAAAGAGATGCTCGAGTTTATTAAACATCCCACAAGTTTGCTGAATGATGCTGCAGGGGCAGGTAATGTTGAGTTTTTGATTGTGCTCATTTGTGAATTTCCAGATATTTTATGGGGTGACGACGACAATGACGATAGTAAGAGTATATTTCATGTAGCTGTTGAAAATCGACTTGAGAATGTGTTTAATCTCATAAATGAGATTGGTAAGCTCAATGAATTCTCAACTAAATATAGAACTTTCAAGGGAAAATATAGCATCCTGCATTTGGCTGGAAATCTAGCTGCTCCAAACCATCTTAATAGAGTATCAGGAGCTGCTCTTCAAATGCAACGTGAAATGCTTTGGTTCAAGGAAGTGGAGAAGATAGTTTTGCCTTCTCAACTTGAAGTGAAATCCAACGATCCAGATCCATCTATACCCAAGTTGACACCGCGCCAATTATTCACCGAAAAACACAAACGTCTACGTAAAGAGGGCGAAGAGTGGATGAAAAACACAGCAAACTCATGCATGCTCGTCGCTACTTTGATATCCACTGTAGTTTTTGCTGCAGCCTTCACAGTTCCTGGTGGCAATGATGATAATACAGGAACTCCTATTTTCCAAAATAAGTTTTGGTTTGCAATGTTTGTAGTGTCAGATGCAATTGCCTTGTTTTCGTCTTCTACGTCTATTTTAATGTTCTTGTCGATATTAACTTCACGCTATGCAGAAGAAGATTTCTTACACTCACTGCCCTCGAAATTGCTGTTTGGACTCGCATCGCTCTTCATCTCCATTGTGTTCATGGCTGTAGCTTTTAGTTCTACCTTCTTTTTGATCTACCACAACGCTAATATCTCCATCCCAACCATGGTCACTGCAATGGCCATTATTCCCATTACTTGTTTTTGTTTACTTCAGTTTACATTGTGGATTGATATTTTTCACAACACTTACTCCTCTAGATTTCTTTTTAATCCCAATAATCCACGCAAATTGTTTTAA
BLAST of CSPI01G10200 vs. Swiss-Prot
Match: POLX_TOBAC (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum PE=2 SV=1)

HSP 1 Score: 320.5 bits (820), Expect = 1.1e-85
Identity = 179/496 (36.09%), Positives = 286/496 (57.66%), Query Frame = 1

Query: 808  AMDEEIKAIKKNDTGELSTLPNGKKAIK--------RNEKGELERYKARLVAKGYSQRKG 867
            AM EE+++++KN T +L  LP GK+ +K        ++   +L RYKARLV KG+ Q+KG
Sbjct: 829  AMQEEMESLQKNGTYKLVELPKGKRPLKCKWVFKLKKDGDCKLVRYKARLVVKGFEQKKG 888

Query: 868  IDYNEVFAPVARLETIRLLIAIAAQNNWKISYMDVKSAFLNGYLEEEVYLEQPLGYSVKG 927
            ID++E+F+PV ++ +IR ++++AA  + ++  +DVK+AFL+G LEEE+Y+EQP G+ V G
Sbjct: 889  IDFDEIFSPVVKMTSIRTILSLAASLDLEVEQLDVKTAFLHGDLEEEIYMEQPEGFEVAG 948

Query: 928  QEDKVLKLKNVLYGLKQAPRMWSSRINKYFLDNGYLRCPYEHSLYIKTNDHGDILVICVY 987
            ++  V KL   LYGLKQAPR W  + + +     YL+   +  +Y K     + +++ +Y
Sbjct: 949  KKHMVCKLNKSLYGLKQAPRQWYMKFDSFMKSQTYLKTYSDPCVYFKRFSENNFIILLLY 1008

Query: 988  VNDLIFIGNPASMFEDLKKAMTQEFEMTDIGLMSYYLGIE-------------------- 1047
            V+D++ +G    +   LK  +++ F+M D+G     LG++                    
Sbjct: 1009 VDDMLIVGKDKGLIAKLKGDLSKSFDMKDLGPAQQILGMKIVRERTSRKLWLSQEKYIER 1068

Query: 1048 -----------PVTTPIETGTKLSKY-------EEGDVDHSYFKSLVGSLRY-LTCTRPY 1107
                       PV+TP+    KLSK        E+G++    + S VGSL Y + CTRP 
Sbjct: 1069 VLERFNMKNAKPVSTPLAGHLKLSKKMCPTTVEEKGNMAKVPYSSAVGSLMYAMVCTRPD 1128

Query: 1108 ILFSVGLVGRFMESPTTTHLKVAKRIFHYLKGTLDYGLFYSSSKEFKLEGYCDSDWAGDT 1167
            I  +VG+V RF+E+P   H +  K I  YL+GT    L +  S    L+GY D+D AGD 
Sbjct: 1129 IAHAVGVVSRFLENPGKEHWEAVKWILRYLRGTTGDCLCFGGSDPI-LKGYTDADMAGDI 1188

Query: 1168 NDRKSTSGFVFFIGNTAFTWSSKKQPIVTLSTCEAEYIAAASCACHAVWLRNLLKTIGIL 1227
            ++RKS++G++F     A +W SK Q  V LST EAEYIAA       +WL+  L+ +G+ 
Sbjct: 1189 DNRKSSTGYLFTFSGGAISWQSKLQKCVALSTTEAEYIAATETGKEMIWLKRFLQELGLH 1248

Query: 1228 QDDPTVIHVDNKSTIALAKHHVFHDRSKHIDTRFHFIRDSISRKEVHVEYVKTEDQIADI 1257
            Q +  V++ D++S I L+K+ ++H R+KHID R+H+IR+ +  + + V  + T +  AD+
Sbjct: 1249 QKE-YVVYCDSQSAIDLSKNSMYHARTKHIDVRYHWIREMVDDESLKVLKISTNENPADM 1308
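
The percentages on the HSP summary line follow directly from the reported counts (matching positions divided by alignment length). A minimal Python sketch, assuming the summary line has the layout shown for this hit, recomputes them; `parse_hsp_stats` is a hypothetical helper name, and the pattern accepts both the "Positives" and "Postives" spellings of the label.

```python
import re

def parse_hsp_stats(line):
    """Recompute identity/positive percentages from a BLAST HSP summary line."""
    ident = re.search(r"Identity\s*=\s*(\d+)/(\d+)", line)
    posit = re.search(r"Posi?tives\s*=\s*(\d+)/(\d+)", line)
    if ident is None or posit is None:
        raise ValueError("line does not look like an HSP summary")
    id_n, id_len = int(ident.group(1)), int(ident.group(2))
    pos_n, pos_len = int(posit.group(1)), int(posit.group(2))
    return {
        "identity_pct": round(100.0 * id_n / id_len, 2),
        "positives_pct": round(100.0 * pos_n / pos_len, 2),
        "alignment_length": id_len,
    }

stats = parse_hsp_stats(
    "Identity = 179/496 (36.09%), Positives = 286/496 (57.66%), Query Frame = 1"
)
print(stats["identity_pct"], stats["positives_pct"])  # 36.09 57.66
```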

BLAST of CSPI01G10200 vs. Swiss-Prot
Match: COPIA_DROME (Copia protein OS=Drosophila melanogaster GN=GIP PE=1 SV=3)

HSP 1 Score: 276.9 bits (707), Expect = 1.4e-72
Identity = 176/541 (32.53%), Positives = 285/541 (52.68%), Query Frame = 1

Query: 785  AHLRIKSEPLNFEKASQND---KWKIAMDEEIKAIKKNDTGELSTLPNGKK--------A 844
            AH      P +F++    D    W+ A++ E+ A K N+T  ++  P  K         +
Sbjct: 883  AHTIFNDVPNSFDEIQYRDDKSSWEEAINTELNAHKINNTWTITKRPENKNIVDSRWVFS 942

Query: 845  IKRNEKGELERYKARLVAKGYSQRKGIDYNEVFAPVARLETIRLLIAIAAQNNWKISYMD 904
            +K NE G   RYKARLVA+G++Q+  IDY E FAPVAR+ + R ++++  Q N K+  MD
Sbjct: 943  VKYNELGNPIRYKARLVARGFTQKYQIDYEETFAPVARISSFRFILSLVIQYNLKVHQMD 1002

Query: 905  VKSAFLNGYLEEEVYLEQPLGYSVKGQEDKVLKLKNVLYGLKQAPRMWSSRINKYFLDNG 964
            VK+AFLNG L+EE+Y+  P G S     D V KL   +YGLKQA R W     +   +  
Sbjct: 1003 VKTAFLNGTLKEEIYMRLPQGISCNS--DNVCKLNKAIYGLKQAARCWFEVFEQALKECE 1062

Query: 965  YLRCPYEHSLYIKTNDHGDI---LVICVYVNDLIFIGNPASMFEDLKKAMTQEF------ 1024
            ++    +  +YI   D G+I   + + +YV+D++      +   + K+ + ++F      
Sbjct: 1063 FVNSSVDRCIYIL--DKGNINENIYVLLYVDDVVIATGDMTRMNNFKRYLMEKFRMTDLN 1122

Query: 1025 ----------EMTDIGL---MSYYL----------GIEPVTTPIETGTKLSKYEEGDVDH 1084
                      EM +  +    S Y+              V+TP+ +          +  +
Sbjct: 1123 EIKHFIGIRIEMQEDKIYLSQSAYVKKILSKFNMENCNAVSTPLPSKINYELLNSDEDCN 1182

Query: 1085 SYFKSLVGSLRYLT-CTRPYILFSVGLVGRFMESPTTTHLKVAKRIFHYLKGTLDYGLFY 1144
            +  +SL+G L Y+  CTRP +  +V ++ R+     +   +  KR+  YLKGT+D  L +
Sbjct: 1183 TPCRSLIGCLMYIMLCTRPDLTTAVNILSRYSSKNNSELWQNLKRVLRYLKGTIDMKLIF 1242

Query: 1145 SSSKEF--KLEGYCDSDWAGDTNDRKSTSGFVFFIGN-TAFTWSSKKQPIVTLSTCEAEY 1204
              +  F  K+ GY DSDWAG   DRKST+G++F + +     W++K+Q  V  S+ EAEY
Sbjct: 1243 KKNLAFENKIIGYVDSDWAGSEIDRKSTTGYLFKMFDFNLICWNTKRQNSVAASSTEAEY 1302

Query: 1205 IAAASCACHAVWLRNLLKTIGILQDDPTVIHVDNKSTIALAKHHVFHDRSKHIDTRFHFI 1264
            +A       A+WL+ LL +I I  ++P  I+ DN+  I++A +   H R+KHID ++HF 
Sbjct: 1303 MALFEAVREALWLKFLLTSINIKLENPIKIYEDNQGCISIANNPSCHKRAKHIDIKYHFA 1362

Query: 1265 RDSISRKEVHVEYVKTEDQIADIFTKPLKVNVFNKLRNLSLWILKLYPELAVMKDTKNNN 1279
            R+ +    + +EY+ TE+Q+ADIFTKPL    F +LR+          +L +++D ++N 
Sbjct: 1363 REQVQNNVICLEYIPTENQLADIFTKPLPAARFVELRD----------KLGLLQDDQSNA 1409

BLAST of CSPI01G10200 vs. Swiss-Prot
Match: M810_ARATH (Uncharacterized mitochondrial protein AtMg00810 OS=Arabidopsis thaliana GN=AtMg00810 PE=4 SV=1)

HSP 1 Score: 124.8 bits (312), Expect = 9.1e-27
Identity = 61/154 (39.61%), Positives = 93/154 (60.39%), Query Frame = 1

Query: 1016 LGIEPVTTPIETGTKLSKYEEGDVDHSYFKSLVGSLRYLTCTRPYILFSVGLVGRFMESP 1075
            L  +P++TP+      S       D S F+S+VG+L+YLT TRP I ++V +V + M  P
Sbjct: 72   LDCKPMSTPLPLKLNSSVSTAKYPDPSDFRSIVGALQYLTLTRPDISYAVNIVCQRMHEP 131

Query: 1076 TTTHLKVAKRIFHYLKGTLDYGLFYSSSKEFKLEGYCDSDWAGDTNDRKSTSGFVFFIGN 1135
            T     + KR+  Y+KGT+ +GL+   + +  ++ +CDSDWAG T+ R+ST+GF  F+G 
Sbjct: 132  TLADFDLLKRVLRYVKGTIFHGLYIHKNSKLNVQAFCDSDWAGCTSTRRSTTGFCTFLGC 191

Query: 1136 TAFTWSSKKQPIVTLSTCEAEYIAAASCACHAVW 1170
               +WS+K+QP V+ S+ E EY A A  A    W
Sbjct: 192  NIISWSAKRQPTVSRSSTETEYRALALTAAELTW 225

BLAST of CSPI01G10200 vs. Swiss-Prot
Match: YD15B_YEAST (Transposon Ty1-DR6 Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=TY1B-DR6 PE=3 SV=1)

HSP 1 Score: 108.6 bits (270), Expect = 6.7e-22
Identity = 112/465 (24.09%), Positives = 198/465 (42.58%), Query Frame = 1

Query: 845  YKARLVAKGYSQRKGIDYNEVFAPVARLETIRLLIAIAAQNNWKISYMDVKSAFLNGYLE 904
            +KAR VA+G  Q      + + +       +   +++A  NN+ I+ +D+ SA+L   ++
Sbjct: 1298 HKARFVARGDIQHPDTYDSGMQSNTVHHYALMTSLSLALDNNYYITQLDISSAYLYADIK 1357

Query: 905  EEVYLEQPLGYSVKGQEDKVLKLKNVLYGLKQAPRMWSSRINKYFLDNGYLRCPYEHSLY 964
            EE+Y+  P      G  DK+++LK  LYGLKQ+   W   I  Y +    +      S  
Sbjct: 1358 EELYIRPPPHL---GMNDKLIRLKKSLYGLKQSGANWYETIKSYLIQQCGMEEVRGWSCV 1417

Query: 965  IKTNDHGDILVICVYVNDLIF----IGNPASMFEDLKKAMT----------QEFEMTDIG 1024
             K +     + IC++V+D++     + +   + E LK              +E +   +G
Sbjct: 1418 FKNSQ----VTICLFVDDMVLFSKNLNSNKRIIEKLKMQYDTKIINLGESDEEIQYDILG 1477

Query: 1025 LMSYY-------LGIEPVTT--------PIET-GTKLSK---------YEEGDVDHSYFK 1084
            L   Y       LG+E   T        P+   G KLS           +E ++D   +K
Sbjct: 1478 LEIKYQRGKYMKLGMENSLTEKIPKLNVPLNPKGRKLSAPGQPGLYIDQDELEIDEDEYK 1537

Query: 1085 SLVGSLRYLTCTRPYI--------LFSVGLVGRFMESPTTTHLKVAKRIFHYLKGTLDYG 1144
              V  ++ L     Y+        L+ +  + + +  P+   L +   +  ++  T D  
Sbjct: 1538 EKVHEMQKLIGLASYVGYKFRFDLLYYINTLAQHILFPSRQVLDMTYELIQFMWDTRDKQ 1597

Query: 1145 LFYSSSK----EFKLEGYCDSDWAGDTNDRKSTSGFVFFIGNTAFTWSSKKQPIVTLSTC 1204
            L +  +K    + KL    D+ + G+    KS  G +F +        S K  +   ST 
Sbjct: 1598 LIWHKNKPTKPDNKLVAISDASY-GNQPYYKSQIGNIFLLNGKVIGGKSTKASLTCTSTT 1657

Query: 1205 EAEYIAAASCACHAVWLRNLLKTIGILQDDPTV--IHVDNKSTIALAKH-HVFHDRSKHI 1256
            EAE  A +        L NL   +  L   P +  +  D++STI++ K  +    R++  
Sbjct: 1658 EAEIHAVSEAI---PLLNNLSHLVQELNKKPIIKGLLTDSRSTISIIKSTNEEKFRNRFF 1717

BLAST of CSPI01G10200 vs. Swiss-Prot
Match: YL14B_YEAST (Transposon Ty1-LR4 Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=TY1B-LR4 PE=5 SV=1)

HSP 1 Score: 108.6 bits (270), Expect = 6.7e-22
Identity = 112/465 (24.09%), Positives = 198/465 (42.58%), Query Frame = 1

Query: 845  YKARLVAKGYSQRKGIDYNEVFAPVARLETIRLLIAIAAQNNWKISYMDVKSAFLNGYLE 904
            +KAR VA+G  Q      + + +       +   +++A  NN+ I+ +D+ SA+L   ++
Sbjct: 1298 HKARFVARGDIQHPDTYDSGMQSNTVHHYALMTSLSLALDNNYYITQLDISSAYLYADIK 1357

Query: 905  EEVYLEQPLGYSVKGQEDKVLKLKNVLYGLKQAPRMWSSRINKYFLDNGYLRCPYEHSLY 964
            EE+Y+  P      G  DK+++LK  LYGLKQ+   W   I  Y +    +      S  
Sbjct: 1358 EELYIRPPPHL---GMNDKLIRLKKSLYGLKQSGANWYETIKSYLIQQCGMEEVRGWSCV 1417

Query: 965  IKTNDHGDILVICVYVNDLIF----IGNPASMFEDLKKAMT----------QEFEMTDIG 1024
             K +     + IC++V+D++     + +   + E LK              +E +   +G
Sbjct: 1418 FKNSQ----VTICLFVDDMVLFSKNLNSNKRIIEKLKMQYDTKIINLGESDEEIQYDILG 1477

Query: 1025 LMSYY-------LGIEPVTT--------PIET-GTKLSK---------YEEGDVDHSYFK 1084
            L   Y       LG+E   T        P+   G KLS           +E ++D   +K
Sbjct: 1478 LEIKYQRGKYMKLGMENSLTEKIPKLNVPLNPKGRKLSAPGQPGLYIDQDELEIDEDEYK 1537

Query: 1085 SLVGSLRYLTCTRPYI--------LFSVGLVGRFMESPTTTHLKVAKRIFHYLKGTLDYG 1144
              V  ++ L     Y+        L+ +  + + +  P+   L +   +  ++  T D  
Sbjct: 1538 EKVHEMQKLIGLASYVGYKFRFDLLYYINTLAQHILFPSRQVLDMTYELIQFMWDTRDKQ 1597

Query: 1145 LFYSSSK----EFKLEGYCDSDWAGDTNDRKSTSGFVFFIGNTAFTWSSKKQPIVTLSTC 1204
            L +  +K    + KL    D+ + G+    KS  G +F +        S K  +   ST 
Sbjct: 1598 LIWHKNKPTKPDNKLVAISDASY-GNQPYYKSQIGNIFLLNGKVIGGKSTKASLTCTSTT 1657

Query: 1205 EAEYIAAASCACHAVWLRNLLKTIGILQDDPTV--IHVDNKSTIALAKH-HVFHDRSKHI 1256
            EAE  A +        L NL   +  L   P +  +  D++STI++ K  +    R++  
Sbjct: 1658 EAEIHAVSEAI---PLLNNLSHLVQELNKKPIIKGLLTDSRSTISIIKSTNEEKFRNRFF 1717

BLAST of CSPI01G10200 vs. TrEMBL
Match: A0A151TPU3_CAJCA (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cajanus cajan GN=KK1_022691 PE=4 SV=1)

HSP 1 Score: 713.0 bits (1839), Expect = 8.7e-202
Identity = 365/596 (61.24%), Positives = 451/596 (75.67%), Query Frame = 1

Query: 216 FLGVDRVKKVRLQKLRGDYESLHMKESESVSDYTSRLLAAVNEMKRYGETISDEQVVEKI 275
           + G D+VKKVRLQ LRG++E+LHMKE E VSDY SR+L   N +KR GE + D +++EKI
Sbjct: 102 YKGADQVKKVRLQTLRGEFEALHMKEGELVSDYFSRVLTVTNNLKRNGEKLDDVRIMEKI 161

Query: 276 LRSLDEKFNFIVVAIEESKYLSTMSIDQLMGSLQAHEEKLLKKNKQTTEQLFQSKLNLKD 335
           LRSLD KF  IV   EE+K L  MSI+QL+GSLQA+EEK  KK ++  EQ+F++ ++ + 
Sbjct: 162 LRSLDPKFEHIVTITEETKDLEAMSIEQLLGSLQAYEEKK-KKKEEIVEQVFKAHVDSRK 221

Query: 336 KEDSLEKGNR----------------GRRRGGN--RGHGDFRDHERSNNDKRYDKRQAEC 395
           +E++  +  R                GRR   N  RG    R   R N + RYDK + +C
Sbjct: 222 EENAHNQSRRSYSQEQGRGRAYGHGQGRRPNNNNQRGESSNRGRGRGNPNSRYDKSRIKC 281

Query: 396 YNCHKFGQYSWEC----KNRVEENANYAEKDEERGNSSLLLACKGVETCENNAWYLDSGA 455
           YNC+KFG Y+ EC    KN+VEE ANYAE+  +  + +LLLA KG +  E+N WYLDSGA
Sbjct: 282 YNCNKFGHYASECRAPNKNKVEEKANYAEERCQE-DGTLLLAYKGQDKGEDNQWYLDSGA 341

Query: 456 SNHMCGCKSMFMELDESVGGDTVFGDATKIPVRGKGKILINLKNGKHEFISNVYYVPDMK 515
           SNHMCG +SMF+ELDESV G+  FGD +K+ V GKG +LI LKNG+H+FISNVYYVP MK
Sbjct: 342 SNHMCGKRSMFVELDESVKGNVAFGDESKVAVEGKGNVLIRLKNGEHQFISNVYYVPSMK 401

Query: 516 NNILSLGQLLEKGYNILMKDYSRLLIRDNHDNMIAKVQMTKNGMFLLNIQTDVAKCLKSC 575
           +NILSLGQLLEKGY+I +K+ + L IRDN    IAKV MT+N MF+LNIQ+D  +CLK C
Sbjct: 402 SNILSLGQLLEKGYDIQLKN-NNLSIRDNTSRFIAKVPMTRNRMFVLNIQSDGPQCLKMC 461

Query: 576 LKDPNWIWHLRFGHLNFDGLRLLAKKNMVKGLPYVKHPDQFCEGCLYGKQSRKSFPQESS 635
            KD +W+WHLRFGHLNF GL LL+KK MV+GLP + HP+Q CEGCL GKQ R SFP+ES 
Sbjct: 462 YKDQSWLWHLRFGHLNFKGLELLSKKAMVRGLPCITHPNQVCEGCLLGKQFRLSFPKESD 521

Query: 636 WKARKPLKLVHIDLCGPIKPSSFGKNNYFLLFIDDFSRKTWIYFVKEKSEVFGMFKRFRA 695
            +A+KPL+L+H D+CGPIKP S GK+NYFLLFIDDFSRKTW+YF+KEKSEVF  FK+F+A
Sbjct: 522 SRAQKPLELIHTDVCGPIKPRSLGKSNYFLLFIDDFSRKTWVYFLKEKSEVFENFKKFKA 581

Query: 696 LVEKESGYYIKALRSDRGGEFTSNEFKKICAENGIRRPMTVSFTPPQNGVVERKNRTIFN 755
            VEKESG  IKALRSDRGGEFTS EF+K C +NGIRR +TV  +P QNGV ERKNRTI  
Sbjct: 582 HVEKESGLLIKALRSDRGGEFTSKEFQKYCEDNGIRRQLTVPRSPQQNGVAERKNRTILE 641

Query: 756 MARSMLKSKKMPKEFWAQVVECAIYLSNCSPTRSLWNKTPQQAWTGRKPSIAHLRI 790
           MARSMLKSKK+PKEFWA+ V CA+YL+N SPTRS+  KTPQ+AW+GRKP I+HLR+
Sbjct: 642 MARSMLKSKKLPKEFWAEAVACAVYLTNRSPTRSVSGKTPQEAWSGRKPGISHLRV 694

BLAST of CSPI01G10200 vs. TrEMBL
Match: A0A151RPT4_CAJCA (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cajanus cajan GN=KK1_033991 PE=4 SV=1)

HSP 1 Score: 713.0 bits (1839), Expect = 8.7e-202
Identity = 365/596 (61.24%), Positives = 451/596 (75.67%), Query Frame = 1

Query: 216 FLGVDRVKKVRLQKLRGDYESLHMKESESVSDYTSRLLAAVNEMKRYGETISDEQVVEKI 275
           + G D+VKKVRLQ LRG++E+LHMKE E VSDY SR+L   N +KR GE + D +++EKI
Sbjct: 102 YKGADQVKKVRLQTLRGEFEALHMKEGELVSDYFSRVLTVTNNLKRNGEKLDDVRIMEKI 161

Query: 276 LRSLDEKFNFIVVAIEESKYLSTMSIDQLMGSLQAHEEKLLKKNKQTTEQLFQSKLNLKD 335
           LRSLD KF  IV   EE+K L  MSI+QL+GSLQA+EEK  KK ++  EQ+F++ ++ + 
Sbjct: 162 LRSLDPKFEHIVTITEETKDLEAMSIEQLLGSLQAYEEKK-KKKEEIVEQVFKAHVDSRK 221

Query: 336 KEDSLEKGNR----------------GRRRGGN--RGHGDFRDHERSNNDKRYDKRQAEC 395
           +E++  +  R                GRR   N  RG    R   R N + RYDK + +C
Sbjct: 222 EENAHNQSRRSYSQEQGRGRAYGHGQGRRPNNNNQRGESSNRGRGRGNPNSRYDKSRIKC 281

Query: 396 YNCHKFGQYSWEC----KNRVEENANYAEKDEERGNSSLLLACKGVETCENNAWYLDSGA 455
           YNC+KFG Y+ EC    KN+VEE ANYAE+  +  + +LLLA KG +  E+N WYLDSGA
Sbjct: 282 YNCNKFGHYASECRAPNKNKVEEKANYAEERCQE-DGTLLLAYKGQDKGEDNQWYLDSGA 341

Query: 456 SNHMCGCKSMFMELDESVGGDTVFGDATKIPVRGKGKILINLKNGKHEFISNVYYVPDMK 515
           SNHMCG +SMF+ELDESV G+  FGD +K+ V GKG +LI LKNG+H+FISNVYYVP MK
Sbjct: 342 SNHMCGKRSMFVELDESVKGNVAFGDESKVAVEGKGNVLIRLKNGEHQFISNVYYVPSMK 401

Query: 516 NNILSLGQLLEKGYNILMKDYSRLLIRDNHDNMIAKVQMTKNGMFLLNIQTDVAKCLKSC 575
           +NILSLGQLLEKGY+I +K+ + L IRDN    IAKV MT+N MF+LNIQ+D  +CLK C
Sbjct: 402 SNILSLGQLLEKGYDIQLKN-NNLSIRDNTSRFIAKVPMTRNRMFVLNIQSDGPQCLKMC 461

Query: 576 LKDPNWIWHLRFGHLNFDGLRLLAKKNMVKGLPYVKHPDQFCEGCLYGKQSRKSFPQESS 635
            KD +W+WHLRFGHLNF GL LL+KK MV+GLP + HP+Q CEGCL GKQ R SFP+ES 
Sbjct: 462 YKDQSWLWHLRFGHLNFKGLELLSKKAMVRGLPCITHPNQVCEGCLLGKQFRLSFPKESD 521

Query: 636 WKARKPLKLVHIDLCGPIKPSSFGKNNYFLLFIDDFSRKTWIYFVKEKSEVFGMFKRFRA 695
            +A+KPL+L+H D+CGPIKP S GK+NYFLLFIDDFSRKTW+YF+KEKSEVF  FK+F+A
Sbjct: 522 SRAQKPLELIHTDVCGPIKPRSLGKSNYFLLFIDDFSRKTWVYFLKEKSEVFENFKKFKA 581

Query: 696 LVEKESGYYIKALRSDRGGEFTSNEFKKICAENGIRRPMTVSFTPPQNGVVERKNRTIFN 755
            VEKESG  IKALRSDRGGEFTS EF+K C +NGIRR +TV  +P QNGV ERKNRTI  
Sbjct: 582 HVEKESGLLIKALRSDRGGEFTSKEFQKYCEDNGIRRQLTVPRSPQQNGVAERKNRTILE 641

Query: 756 MARSMLKSKKMPKEFWAQVVECAIYLSNCSPTRSLWNKTPQQAWTGRKPSIAHLRI 790
           MARSMLKSKK+PKEFWA+ V CA+YL+N SPTRS+  KTPQ+AW+GRKP I+HLR+
Sbjct: 642 MARSMLKSKKLPKEFWAEAVACAVYLTNRSPTRSVSGKTPQEAWSGRKPGISHLRV 694

BLAST of CSPI01G10200 vs. TrEMBL
Match: A0A151TGP0_CAJCA (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cajanus cajan GN=KK1_012507 PE=4 SV=1)

HSP 1 Score: 708.0 bits (1826), Expect = 2.8e-200
Identity = 362/596 (60.74%), Positives = 449/596 (75.34%), Query Frame = 1

Query: 216 FLGVDRVKKVRLQKLRGDYESLHMKESESVSDYTSRLLAAVNEMKRYGETISDEQVVEKI 275
           + G D+VKKVRLQ LRG++E+LHMKE E VSDY SR+L   N +KR GE + D +++EKI
Sbjct: 91  YKGADQVKKVRLQTLRGEFEALHMKEGELVSDYFSRVLTVTNNLKRNGEKLDDVRIMEKI 150

Query: 276 LRSLDEKFNFIVVAIEESKYLSTMSIDQLMGSLQAHEEKLLKKNKQTTEQLFQSKLNLKD 335
           LRSLD KF  IV   EE+K L  MSI+QL+GSLQA+EEK  KK ++  EQ+F++ ++ + 
Sbjct: 151 LRSLDPKFEHIVTITEETKDLEAMSIEQLLGSLQAYEEKK-KKKEEIVEQVFKAHVDSRK 210

Query: 336 KEDSLEKGNR----------------GRRRGGN--RGHGDFRDHERSNNDKRYDKRQAEC 395
           +E++  +  R                GRR   N  RG    R   R N + RYDK + +C
Sbjct: 211 EENAHNQSRRSYSQEQGRGRAYGHGQGRRPNNNNQRGESSNRGRGRGNPNSRYDKSRIKC 270

Query: 396 YNCHKFGQYSWEC----KNRVEENANYAEKDEERGNSSLLLACKGVETCENNAWYLDSGA 455
           YNC+KFG Y+ EC    KN+VEE ANYAE+  +  + +LLLA KG +  E+N WYLDSGA
Sbjct: 271 YNCNKFGHYASECRAPNKNKVEEKANYAEERCQE-DGTLLLAYKGQDKGEDNQWYLDSGA 330

Query: 456 SNHMCGCKSMFMELDESVGGDTVFGDATKIPVRGKGKILINLKNGKHEFISNVYYVPDMK 515
           SNHMCG +SMF+ELDESV G+  FGD +K+ V GKG +LI LKNG+H+FISN+YYVP MK
Sbjct: 331 SNHMCGKRSMFVELDESVKGNVAFGDESKVAVEGKGNVLIQLKNGEHQFISNIYYVPSMK 390

Query: 516 NNILSLGQLLEKGYNILMKDYSRLLIRDNHDNMIAKVQMTKNGMFLLNIQTDVAKCLKSC 575
           +NILSLGQLLEKGY+I +K+ + L IRDN    I KV M +N MF+LNIQ+D  +CLK C
Sbjct: 391 SNILSLGQLLEKGYDIQLKN-NNLSIRDNTSRFITKVPMMRNRMFVLNIQSDGPQCLKMC 450

Query: 576 LKDPNWIWHLRFGHLNFDGLRLLAKKNMVKGLPYVKHPDQFCEGCLYGKQSRKSFPQESS 635
            KD +W+WHLRFGHLNF GL LL+KK MV+GLP + HP+Q CEGCL GKQ R SFP+ES 
Sbjct: 451 YKDQSWLWHLRFGHLNFKGLDLLSKKAMVRGLPCITHPNQVCEGCLLGKQFRLSFPKESD 510

Query: 636 WKARKPLKLVHIDLCGPIKPSSFGKNNYFLLFIDDFSRKTWIYFVKEKSEVFGMFKRFRA 695
            +A+KPL+L+H D+CGPIKP S GK+NYFLLFIDDFSRKTW+YF+KEKSEVF  FK+F+A
Sbjct: 511 SRAQKPLELIHTDVCGPIKPRSLGKSNYFLLFIDDFSRKTWVYFLKEKSEVFENFKKFKA 570

Query: 696 LVEKESGYYIKALRSDRGGEFTSNEFKKICAENGIRRPMTVSFTPPQNGVVERKNRTIFN 755
            VEKESG  IKALRSDRGGEFTS EF+K C +NGIRR +TV  +P QNGV ERKNRTI  
Sbjct: 571 HVEKESGLLIKALRSDRGGEFTSKEFQKYCEDNGIRRQLTVPRSPQQNGVAERKNRTILE 630

Query: 756 MARSMLKSKKMPKEFWAQVVECAIYLSNCSPTRSLWNKTPQQAWTGRKPSIAHLRI 790
           MARSMLKSKK+PKEFWA+ V CA+YL+N SPTRS+  KTPQ+AW+GRKP I+HLR+
Sbjct: 631 MARSMLKSKKLPKEFWAEAVACAVYLTNRSPTRSVSGKTPQEAWSGRKPGISHLRV 683

BLAST of CSPI01G10200 vs. TrEMBL
Match: Q9SXB2_ARATH (T28P6.8 protein OS=Arabidopsis thaliana GN=T28P6.8 PE=4 SV=1)

HSP 1 Score: 693.7 bits (1789), Expect = 5.4e-196
Identity = 362/666 (54.35%), Positives = 464/666 (69.67%), Query Frame = 1

Query: 152 IAKLMVEKNHELPLIRTFREGTPLLIAVSYKSRDMISYLLSVTDLSQLTAQERIELLIAT 211
           + K  +E  +E  L +T ++G         + RD  +  L    L + T ++ +E   A 
Sbjct: 38  VEKGFIEPENEGSLSQTQKDGLR-----DSRKRDKKALCLIYQGLDEDTFEKVVEATSAK 97

Query: 212 -----IHSDFLGVDRVKKVRLQKLRGDYESLHMKESESVSDYTSRLLAAVNEMKRYGETI 271
                + + + G D+VKKVRLQ LRG++E+L MKE E VSDY SR+L   N +KR GE +
Sbjct: 98  EAWEKLRTSYKGADQVKKVRLQTLRGEFEALQMKEGELVSDYFSRVLTVTNNLKRNGEKL 157

Query: 272 SDEQVVEKILRSLDEKFNFIVVAIEESKYLSTMSIDQLMGSLQAHEEKLLKKNKQTTEQL 331
            D +++EK+LRSLD KF  IV  IEE+K L  M+I+QL+GSLQA+EEK  KK +   EQ+
Sbjct: 158 DDVRIMEKVLRSLDLKFEHIVTVIEETKDLEAMTIEQLLGSLQAYEEKK-KKKEDIVEQV 217

Query: 332 FQSKLNLKDKEDSLEKGN----RGRRRGGNRGHGDFRDHERSNNDK-------------- 391
              ++  ++   S ++      RGR RGG      +R HE + N +              
Sbjct: 218 LNMQITKEENGQSYQRRGGGQVRGRGRGGYGNGRGWRPHEDNTNQRGENSSRGRGKGHPK 277

Query: 392 -RYDKRQAECYNCHKFGQYSWECK----NRVEENANYAEKDEERGNSSLLLACKGVETCE 451
            RYDK   +CYNC KFG Y+ ECK     + EE ANY E+  +  +  L+ + K  E  E
Sbjct: 278 SRYDKSSVKCYNCGKFGHYASECKAPSNKKFEEKANYVEEKIQEEDMLLMASYKKDEQKE 337

Query: 452 NNAWYLDSGASNHMCGCKSMFMELDESVGGDTVFGDATKIPVRGKGKILINLKNGKHEFI 511
           N+ WYLDSGASNHMCG KSMF ELDESV G+   GD +K+ V+GKG ILI LKNG H+FI
Sbjct: 338 NHKWYLDSGASNHMCGRKSMFAELDESVRGNVALGDESKMEVKGKGNILIRLKNGDHQFI 397

Query: 512 SNVYYVPDMKNNILSLGQLLEKGYNILMKDYSRLLIRDNHDNMIAKVQMTKNGMFLLNIQ 571
           SNVYY+P MK NILSLGQLLEKGY+I +KD + L IRD   N+I KV M+KN MF+LNI+
Sbjct: 398 SNVYYIPSMKTNILSLGQLLEKGYDIRLKD-NNLSIRDQESNLITKVPMSKNRMFVLNIR 457

Query: 572 TDVAKCLKSCLKDPNWIWHLRFGHLNFDGLRLLAKKNMVKGLPYVKHPDQFCEGCLYGKQ 631
            D+A+CLK C K+ +W+WHLRFGHLNF GL LL++K MV+GLP + HP+Q CEGCL GKQ
Sbjct: 458 NDIAQCLKMCYKEESWLWHLRFGHLNFGGLELLSRKEMVRGLPCINHPNQVCEGCLLGKQ 517

Query: 632 SRKSFPQESSWKARKPLKLVHIDLCGPIKPSSFGKNNYFLLFIDDFSRKTWIYFVKEKSE 691
            + SFP+ESS +A+KPL+L+H D+CGPIKP S GK+NYFLLFIDDFSRKTW+YF+KEKSE
Sbjct: 518 FKMSFPKESSSRAQKPLELIHTDVCGPIKPKSLGKSNYFLLFIDDFSRKTWVYFLKEKSE 577

Query: 692 VFGMFKRFRALVEKESGYYIKALRSDRGGEFTSNEFKKICAENGIRRPMTVSFTPPQNGV 751
           VF +FK+F+A VEKESG  IK +RSDRGGEFTS EF K C +NGIRR +TV  +P QNGV
Sbjct: 578 VFEIFKKFKAHVEKESGLVIKTMRSDRGGEFTSKEFLKYCEDNGIRRQLTVPRSPQQNGV 637

Query: 752 VERKNRTIFNMARSMLKSKKMPKEFWAQVVECAIYLSNCSPTRSLWNKTPQQAWTGRKPS 790
           VERKNRTI  MARSMLKSK++PKE WA+ V CA+YL N SPT+S+  KTPQ+AW+GRKP 
Sbjct: 638 VERKNRTILEMARSMLKSKRLPKELWAEAVACAVYLLNRSPTKSVSGKTPQEAWSGRKPG 696

BLAST of CSPI01G10200 vs. TrEMBL
Match: Q9C536_ARATH (Copia-type polyprotein, putative OS=Arabidopsis thaliana GN=T18I24.5 PE=4 SV=1)

HSP 1 Score: 691.8 bits (1784), Expect = 2.1e-195
Identity = 361/666 (54.20%), Positives = 463/666 (69.52%), Query Frame = 1

Query: 152 IAKLMVEKNHELPLIRTFREGTPLLIAVSYKSRDMISYLLSVTDLSQLTAQERIELLIAT 211
           + K  +E  +E  L +T ++G         + RD  +  L    L + T ++ +E   A 
Sbjct: 38  VEKGFIEPENEGSLSQTQKDGLR-----DSRKRDKKALCLIYQGLDEDTFEKVVEATSAK 97

Query: 212 -----IHSDFLGVDRVKKVRLQKLRGDYESLHMKESESVSDYTSRLLAAVNEMKRYGETI 271
                + + + G D+VKKVRLQ LRG++E+L MKE E VSDY SR+L   N +KR GE +
Sbjct: 98  EAWEKLRTSYKGADQVKKVRLQTLRGEFEALQMKEGELVSDYFSRVLTVTNNLKRNGEKL 157

Query: 272 SDEQVVEKILRSLDEKFNFIVVAIEESKYLSTMSIDQLMGSLQAHEEKLLKKNKQTTEQL 331
            D +++EK+LRSLD KF  IV  IEE+K L  M+I+QL+GSLQA+EEK  KK +   EQ+
Sbjct: 158 DDVRIMEKVLRSLDLKFEHIVTVIEETKDLEAMTIEQLLGSLQAYEEKK-KKKEDIVEQV 217

Query: 332 FQSKLNLKDKEDSLEKGN----RGRRRGGNRGHGDFRDHERSNNDK-------------- 391
              ++  ++   S ++      RGR RGG      +R HE + N +              
Sbjct: 218 LNMQITKEENGQSYQRRGGGQVRGRGRGGYGNGRGWRPHEDNTNQRGENSSRGRGKGHPK 277

Query: 392 -RYDKRQAECYNCHKFGQYSWECK----NRVEENANYAEKDEERGNSSLLLACKGVETCE 451
            RYDK   +CYNC KFG Y+ ECK     + EE ANY E+  +  +  L+ + K  E  E
Sbjct: 278 SRYDKSSVKCYNCGKFGHYASECKAPSNKKFEEKANYVEEKIQEEDMLLMASYKKDEQEE 337

Query: 452 NNAWYLDSGASNHMCGCKSMFMELDESVGGDTVFGDATKIPVRGKGKILINLKNGKHEFI 511
           N+ WYLDSGASNHMCG KSMF ELDESV G+   GD +K+ V+GKG ILI LKNG H+FI
Sbjct: 338 NHKWYLDSGASNHMCGRKSMFAELDESVRGNVALGDESKMEVKGKGNILIRLKNGDHQFI 397

Query: 512 SNVYYVPDMKNNILSLGQLLEKGYNILMKDYSRLLIRDNHDNMIAKVQMTKNGMFLLNIQ 571
           SNVYY+P MK NILSLGQLLEKGY+I +KD + L IRD   N+I KV M+KN MF+LNI+
Sbjct: 398 SNVYYIPSMKTNILSLGQLLEKGYDIRLKD-NNLSIRDQESNLITKVPMSKNRMFVLNIR 457

Query: 572 TDVAKCLKSCLKDPNWIWHLRFGHLNFDGLRLLAKKNMVKGLPYVKHPDQFCEGCLYGKQ 631
            D+A+CLK C K+ +W+WHLRFGHLNF GL LL++K MV+GLP + HP+Q CEGCL GKQ
Sbjct: 458 NDIAQCLKMCYKEESWLWHLRFGHLNFGGLELLSRKEMVRGLPCINHPNQVCEGCLLGKQ 517

Query: 632 SRKSFPQESSWKARKPLKLVHIDLCGPIKPSSFGKNNYFLLFIDDFSRKTWIYFVKEKSE 691
            + SFP+ESS +A+KPL+L+H D+CGPIKP S GK+NYFLLFIDDFSRKTW+YF+KEKSE
Sbjct: 518 FKMSFPKESSSRAQKPLELIHTDVCGPIKPKSLGKSNYFLLFIDDFSRKTWVYFLKEKSE 577

Query: 692 VFGMFKRFRALVEKESGYYIKALRSDRGGEFTSNEFKKICAENGIRRPMTVSFTPPQNGV 751
           VF +FK+F+A VEKESG  IK +RSDRGGEFTS EF K C +NGIRR +TV  +P QNGV
Sbjct: 578 VFEIFKKFKAHVEKESGLVIKTMRSDRGGEFTSKEFLKYCEDNGIRRQLTVPRSPQQNGV 637

Query: 752 VERKNRTIFNMARSMLKSKKMPKEFWAQVVECAIYLSNCSPTRSLWNKTPQQAWTGRKPS 790
            ERKNRTI  MARSMLKSK++PKE WA+ V CA+YL N SPT+S+  KTPQ+AW+GRKP 
Sbjct: 638 AERKNRTILEMARSMLKSKRLPKELWAEAVACAVYLLNRSPTKSVSGKTPQEAWSGRKPG 696

BLAST of CSPI01G10200 vs. TAIR10
Match: AT5G35810.1 (AT5G35810.1 Ankyrin repeat family protein)

HSP 1 Score: 287.7 bits (735), Expect = 4.6e-77
Identity = 157/347 (45.24%), Positives = 227/347 (65.42%), Query Frame = 1

Query: 1340 KTLAHQLVEFLWRYVVYELPQKEMLEFIKHPTSLLNDAAGAGNVEFLIVLICEFPDILWG 1399
            +TLAH +VE LW +V+ +LP +E+ +F+     LL DAA +GN+E L++LI  +PD++W 
Sbjct: 2    RTLAHMVVEELWSFVI-KLPVEEISQFVGSSPMLLFDAAQSGNLELLLILIRSYPDLIWT 61

Query: 1400 DDDNDDSKSIFHVAVENRLENVFNLINEIGKLNEFSTKYRTFKGKYSILHLAGNLAAPNH 1459
             D    ++S+FH+A  NR E +FN I E+G + +    Y+  +   ++LHL   L  PN 
Sbjct: 62   VDHK--NQSLFHIAAINRHEKIFNRIYELGAIKDLIAMYKEKESNDNLLHLVARLPPPNR 121

Query: 1460 LNRVSGAALQMQREMLWFKEVEKIVLPSQLEVKSNDPDPSIPKLTPRQLFTEKHKRLRKE 1519
            L  VSGAALQMQRE+LW+K V++IV    ++ K+   +          LFT++H  LRKE
Sbjct: 122  LQVVSGAALQMQREILWYKAVKEIVPRVYIKTKNKKEE------VAHDLFTKEHDNLRKE 181

Query: 1520 GEEWMKNTANSCMLVATLISTVVFAAAFTVPGGNDDN-----TGTPIFQNKFWFAMFVVS 1579
            GE+WMK TA +C+LV+TLI+TVVFAAAFT+PGGND +      G P F+ +FWF +F++S
Sbjct: 182  GEKWMKETATACILVSTLIATVVFAAAFTLPGGNDTSGDIKTLGFPTFRKEFWFEVFIIS 241

Query: 1580 DAIALFSSSTSILMFLSILTSRYAEEDFLHSLPSKLLFGLASLFISIVFMAVAFSSTFFL 1639
            D++AL SS TSI++FLSILTSRYAE  F  +LP+KL+ GL +LF+SI+ M +AF++T  L
Sbjct: 242  DSVALLSSVTSIMIFLSILTSRYAEASFQTTLPTKLMLGLLALFVSIISMVLAFTATLIL 301

Query: 1640 IYHNANISIPTMVTAMAIIPITCFCLLQFTLWIDIFHNTYSSRFLFN 1682
            I          ++  +A      F +L F LW D   + Y S+FLF+
Sbjct: 302  IRDQEPKWSLILLVYVASATALSFVVLHFQLWFDTLRSAYLSKFLFH 339

BLAST of CSPI01G10200 vs. TAIR10
Match: AT4G23160.1 (AT4G23160.1 cysteine-rich RLK (RECEPTOR-like protein kinase) 8)

HSP 1 Score: 279.6 bits (714), Expect = 1.2e-74
Identity = 160/470 (34.04%), Positives = 262/470 (55.74%), Query Frame = 1

Query: 792  EPLNFEKASQNDKWKIAMDEEIKAIKKNDTGELSTLPNGKKAI--------KRNEKGELE 851
            EP  + +A +   W  AMD+EI A++   T E+ TLP  KK I        K N  G +E
Sbjct: 85   EPSTYNEAKEFLVWCGAMDDEIGAMETTHTWEICTLPPNKKPIGCKWVYKIKYNSDGTIE 144

Query: 852  RYKARLVAKGYSQRKGIDYNEVFAPVARLETIRLLIAIAAQNNWKISYMDVKSAFLNGYL 911
            RYKARLVAKGY+Q++GID+ E F+PV +L +++L++AI+A  N+ +  +D+ +AFLNG L
Sbjct: 145  RYKARLVAKGYTQQEGIDFIETFSPVCKLTSVKLILAISAIYNFTLHQLDISNAFLNGDL 204

Query: 912  EEEVYLEQPLGYSVKGQE----DKVLKLKNVLYGLKQAPRMWSSRINKYFLDNGYLRCPY 971
            +EE+Y++ P GY+ +  +    + V  LK  +YGLKQA R W  + +   +  G+++   
Sbjct: 205  DEEIYMKLPPGYAARQGDSLPPNAVCYLKKSIYGLKQASRQWFLKFSVTLIGFGFVQSHS 264

Query: 972  EHSLYIKTNDHGDILVICVYVNDLIFIGNPASMFEDLKKAMTQEFEMTDI-------GLM 1031
            +H+ ++K       L + VYV+D+I   N  +  ++LK  +   F++ D+       GL 
Sbjct: 265  DHTYFLKITATL-FLCVLVYVDDIIICSNNDAAVDELKSQLKSCFKLRDLGPLKYFLGLE 324

Query: 1032 ----------------------SYYLGIEPVTTPIETGTKLSKYEEGD-VDHSYFKSLVG 1091
                                  +  LG +P + P++     S +  GD VD   ++ L+G
Sbjct: 325  IARSAAGINICQRKYALDLLDETGLLGCKPSSVPMDPSVTFSAHSGGDFVDAKAYRRLIG 384

Query: 1092 SLRYLTCTRPYILFSVGLVGRFMESPTTTHLKVAKRIFHYLKGTLDYGLFYSSSKEFKLE 1151
             L YL  TR  I F+V  + +F E+P   H +   +I HY+KGT+  GLFYSS  E +L+
Sbjct: 385  RLMYLQITRLDISFAVNKLSQFSEAPRLAHQQAVMKILHYIKGTVGQGLFYSSQAEMQLQ 444

Query: 1152 GYCDSDWAGDTNDRKSTSGFVFFIGNTAFTWSSKKQPIVTLSTCEAEYIAAASCACHAVW 1211
             + D+ +    + R+ST+G+  F+G +  +W SKKQ +V+ S+ EAEY A +      +W
Sbjct: 445  VFSDASFQSCKDTRRSTNGYCMFLGTSLISWKSKKQQVVSKSSAEAEYRALSFATDEMMW 504

Query: 1212 LRNLLKTIGILQDDPTVIHVDNKSTIALAKHHVFHDRSKHIDTRFHFIRD 1220
            L    + + +    PT++  DN + I +A + VFH+R+KHI++  H +R+
Sbjct: 505  LAQFFRELQLPLSKPTLLFCDNTAAIHIATNAVFHERTKHIESDCHSVRE 553

BLAST of CSPI01G10200 vs. TAIR10
Match: AT3G54070.1 (AT3G54070.1 Ankyrin repeat family protein)

HSP 1 Score: 253.4 bits (646), Expect = 9.5e-67
Identity = 147/315 (46.67%), Positives = 206/315 (65.40%), Query Frame = 1

Query: 1373 LLNDAAGAGNVEFLIVLICEFPDILWGDDDNDDSKSIFHVAVENRLENVFNLINEIGKLN 1432
            LL DAA  GNVE L++LI    D+LW  D+N+  +++FHVA   R EN+F+LI E+G + 
Sbjct: 258  LLFDAAELGNVEILVILIRSHLDLLWIVDNNN--RTLFHVAALYRHENIFSLIYELGGIK 317

Query: 1433 EFSTKYRTFKGKYSILHLAGNLAAPNHLNRVSGAALQMQREMLWFKEVEKIVLPSQLEVK 1492
            +    Y+  + K ++LHL   L   N     SGAAL MQ+E+LWFK V++IV  S +E K
Sbjct: 318  DLIASYKEKQSKDTLLHLVARLPPMNRQQVGSGAALHMQKELLWFKAVKEIVPRSYIETK 377

Query: 1493 SNDPDPSIPKLTPRQLFTEKHKRLRKEGEEWMKNTANSCMLVATLISTVVFAAAFTVPGG 1552
            +   + +        +FTE+H+ LRKEGE WMK TA +CML ATLI+TVVFAAA T+PGG
Sbjct: 378  NTKGELA------HDIFTEQHENLRKEGERWMKETATACMLGATLIATVVFAAAITIPGG 437

Query: 1553 NDDN------TGTPIFQNKFWFAMFVVSDAIALFSSSTSILMFLSILTSRYAEEDFLHSL 1612
            NDD+       G P F+ +  F +F +SD++ALFSS  SI++FLSI TSRYAEEDF + L
Sbjct: 438  NDDSGDKANTLGFPNFRKRLLFDIFTLSDSVALFSSMMSIVIFLSIFTSRYAEEDFRYDL 497

Query: 1613 PSKLLFGLASLFISIVFMAVAFSSTFFLI-YHNANISIPTMVTAMAIIPITCFCLLQFTL 1672
            P+KL+FGL++LFISI+ M +AF+ +  LI    A++S+  +++ +A +    F  L F L
Sbjct: 498  PTKLMFGLSALFISIISMILAFTFSMILIRVEKASLSL-VLISCLASLTALTFAYLYFHL 557

Query: 1673 WIDIFHNTYSSRFLF 1681
            W +   + Y S FLF
Sbjct: 558  WFNTLRSVYISMFLF 563

BLAST of CSPI01G10200 vs. TAIR10
Match: AT3G18670.1 (AT3G18670.1 Ankyrin repeat family protein)

HSP 1 Score: 218.8 bits (556), Expect = 2.6e-56
Identity = 132/310 (42.58%), Positives = 192/310 (61.94%), Query Frame = 1

Query: 1381 GNVEFLIVLICEFPDILWGDDDNDDSKSIFHVAVENRLENVFNLINEIG-KLNEFSTKYR 1440
            G VE++  ++  +PDI+W    N    +IF  AV  R E +F+LI  IG K N  +T + 
Sbjct: 300  GIVEYIEEMMRHYPDIVWSK--NSSGLNIFFYAVSQRQEKIFSLIYNIGAKKNILATNWD 359

Query: 1441 TFKGKYSILHLAGNLAAPNHLNRVSGAALQMQREMLWFKEVEKIVLPSQLEVKSNDPDPS 1500
             F    ++LH A   A  + LN + GAALQMQRE+ WFKEVEK+V P   ++ +      
Sbjct: 360  IFHN--NMLHHAAYRAPASRLNLIPGAALQMQRELQWFKEVEKLVQPKHRKMVNLKQ--- 419

Query: 1501 IPKLTPRQLFTEKHKRLRKEGEEWMKNTANSCMLVATLISTVVFAAAFTVPGGNDDNTGT 1560
              K TP+ LFT++HK L ++GE+WMK TA SC +VA LI+T++F++AFTVPGG   + G 
Sbjct: 420  --KKTPKALFTDQHKDLVEQGEKWMKETATSCTVVAALITTMMFSSAFTVPGGYRSD-GM 479

Query: 1561 PIFQNKFWFAMFVVSDAIALFSSSTSILMFLSILTSRYAEEDFLHSLPSKLLFGLASLFI 1620
            P++ ++  F +F++SDAI+LF+S  S+LMFL IL SRY EEDFL SLP+KL+ GL +LF+
Sbjct: 480  PLYIHQHRFKIFLISDAISLFTSCMSLLMFLGILKSRYREEDFLRSLPTKLIVGLLALFL 539

Query: 1621 SIVFMAVAFSSTFFLIYHNANISIPTMVTAMAIIPITCFCLLQFTLWIDIFHNTYSSRFL 1680
            S+  M V F  T   +       +      +A+IP+  F +LQF + ++IF  TY     
Sbjct: 540  SMATMIVTFVVTLMTLVGEKISWVSAQFMFLAVIPLGMFVVLQFPVLLEIFRATYCPNVF 596

Query: 1681 FNPNNPRKLF 1690
               + PR++F
Sbjct: 600  ---DKPRRVF 596

BLAST of CSPI01G10200 vs. TAIR10
Match: AT5G04730.1 (AT5G04730.1 Ankyrin-repeat containing protein)

HSP 1 Score: 217.6 bits (553), Expect = 5.8e-56
Identity = 157/442 (35.52%), Positives = 234/442 (52.94%), Query Frame = 1

Query: 1245 LKVNVFNKLRNLSLWILKLYPELAVMKDTKNNNETALHVLARKPSAMD--STKQLQNLKM 1304
            +K ++F    N   W   +Y  + V ++ + N +  L  +    S +     KQ  +LK 
Sbjct: 204  VKPDLFRSHCNFGFWRHLIYSCIRVSENPRPNRDNRLFCMTLPQSLLKWFGIKQTYDLKK 263

Query: 1305 RINSWRFNSKLFISPWKLINEILASL--ILPSNSNKDVTKTLAHQLVEFLWRYVVYELPQ 1364
            R +  +          KL+ ++  SL  I+  N              E  W+  VYE   
Sbjct: 264  RHSQAQ----------KLLKQMCTSLRDIMAKN--------------EIRWKETVYEA-- 323

Query: 1365 KEMLEFIKHPTSLLNDAAGAGNVEFLIVLICEFPDILWGDDDNDDSKSIFHVAVENRLEN 1424
                         L +AA +GN +F I +I     +LW  +     +++F +AVE + E 
Sbjct: 324  -------------LLEAAKSGNRDFFIEIIKCNSQLLWILNPTS-GRNLFQLAVEFKKEK 383

Query: 1425 VFNLINEIGKLNEFSTKYRTF-KGKYSILHLAGNLAAPNHLNRVSGAALQMQREMLWFKE 1484
            +FNLI+  G  +   T  R++ KG  +ILH+AG L+ P+ L+++SGAAL+MQRE  WFKE
Sbjct: 384  IFNLIH--GLDDRKVTLLRSYDKGNNNILHIAGRLSTPDQLSKISGAALKMQRESQWFKE 443

Query: 1485 VEKIVLPSQLEVKSNDPDPSIPKLTPRQLFTEKHKRLRKEGEEWMKNTANSCMLVATLIS 1544
            VE +V   ++  K+ D        TPRQ+F   H+ LRKEGEEWMK TA +C  VA LI+
Sbjct: 444  VESLVSEREVVQKNKD------NKTPRQIFEHYHEHLRKEGEEWMKYTATACSFVAALIA 503

Query: 1545 TVVFAAAFTVPGGNDDNTGTPIFQNKFWFAMFVVSDAIALFSSSTSILMFLSILTSRYAE 1604
            TV F A FTVPGG D  +G+P+  N   F  F+ +D +A F+S  S+L+FLSILTSRY+ 
Sbjct: 504  TVTFQAIFTVPGGIDGTSGSPLILNDLHFRAFIFTDTLAFFASCISVLIFLSILTSRYSF 563

Query: 1605 EDFLHSLPSKLLFGLASLFISIVFMAVAF-SSTFFLIYHNANISIPTMVTAMAIIPITCF 1664
            +DF+ SLP K++ G + LFISI  M VAF +S    + H   +  P  +  +A  P   F
Sbjct: 564  DDFIVSLPRKMILGQSILFISIASMLVAFITSLSASMRHKPALVYP--LKPLASFPSLLF 595

Query: 1665 CLLQFTLWIDIFHNTYSSRFLF 1681
             +LQ+ L  ++  +TY  R  +
Sbjct: 624  LMLQYPLLKEMISSTYGKRLFY 595

BLAST of CSPI01G10200 vs. NCBI nr
Match: gi|449454915|ref|XP_004145199.1| (PREDICTED: uncharacterized protein LOC101215460 [Cucumis sativus])

HSP 1 Score: 856.7 bits (2212), Expect = 7.0e-245
Identity = 435/445 (97.75%), Positives = 438/445 (98.43%), Query Frame = 1

Query: 1245 LKVNVFNKLRNLSLWILKLYPELAVMKDTKNNNETALHVLARKPSAMDSTKQLQNLKMRI 1304
            L   + +   +LSLWILKLYPELAVMKDTKNNNETALHVLARKPSAMDSTKQLQNLKMRI
Sbjct: 208  LIATIHSDFLDLSLWILKLYPELAVMKDTKNNNETALHVLARKPSAMDSTKQLQNLKMRI 267

Query: 1305 NSWRFNSKLFISPWKLINEILASLILPSNSNKDVTKTLAHQLVEFLWRYVVYELPQKEML 1364
            NSWRFNSKLFISPWKLINEILASLILPSNSNKDVTKTLAHQLVEFLWRYVVYELPQKEML
Sbjct: 268  NSWRFNSKLFISPWKLINEILASLILPSNSNKDVTKTLAHQLVEFLWRYVVYELPQKEML 327

Query: 1365 EFIKHPTSLLNDAAGAGNVEFLIVLICEFPDILWGDDDNDDSKSIFHVAVENRLENVFNL 1424
            EFIKHPTSLLNDAAGAGNVEFLIVLICEFPDILWGDDDNDDSKSIFHVAVENRLENVFNL
Sbjct: 328  EFIKHPTSLLNDAAGAGNVEFLIVLICEFPDILWGDDDNDDSKSIFHVAVENRLENVFNL 387

Query: 1425 INEIGKLNEFSTKYRTFKGKYSILHLAGNLAAPNHLNRVSGAALQMQREMLWFKEVEKIV 1484
            INEIGKLNEFSTKYRTFKGKYSILHLAGNLAAPNHLNRVSGAALQMQREMLWFKEVEKIV
Sbjct: 388  INEIGKLNEFSTKYRTFKGKYSILHLAGNLAAPNHLNRVSGAALQMQREMLWFKEVEKIV 447

Query: 1485 LPSQLEVKSNDPDPSIPKLTPRQLFTEKHKRLRKEGEEWMKNTANSCMLVATLISTVVFA 1544
            LPSQLEVKSNDPDPSIPKLTPRQLFTEKHKRLRKEGEEWMKNTANSCMLVATLISTVVFA
Sbjct: 448  LPSQLEVKSNDPDPSIPKLTPRQLFTEKHKRLRKEGEEWMKNTANSCMLVATLISTVVFA 507

Query: 1545 AAFTVPGGNDDNTGTPIFQNKFWFAMFVVSDAIALFSSSTSILMFLSILTSRYAEEDFLH 1604
            AAFTVPGGNDDNTGTPIFQNKFWFAMFVVSDAIALFSSSTSILMFLSILTSRYAEEDFLH
Sbjct: 508  AAFTVPGGNDDNTGTPIFQNKFWFAMFVVSDAIALFSSSTSILMFLSILTSRYAEEDFLH 567

Query: 1605 SLPSKLLFGLASLFISIVFMAVAFSSTFFLIYHNANISIPTMVTAMAIIPITCFCLLQFT 1664
            SLPSKLLFGLASLFISIVFMAVAFSSTFFLIYHNANISIPTMVTAMAIIPITCFCLLQFT
Sbjct: 568  SLPSKLLFGLASLFISIVFMAVAFSSTFFLIYHNANISIPTMVTAMAIIPITCFCLLQFT 627

Query: 1665 LWIDIFHNTYSSRFLFNPNNPRKLF 1690
            LWIDIFHNTYSSRFLFNPNNPRKLF
Sbjct: 628  LWIDIFHNTYSSRFLFNPNNPRKLF 652

BLAST of CSPI01G10200 vs. NCBI nr
Match: gi|659067663|ref|XP_008440640.1| (PREDICTED: uncharacterized protein LOC103484989 isoform X2 [Cucumis melo])

HSP 1 Score: 833.2 bits (2151), Expect = 8.2e-238
Identity = 427/447 (95.53%), Positives = 436/447 (97.54%), Query Frame = 1

Query: 1245 LKVNVFNKLRNLSLWILKLYPELAVMKDTKNNNETALHVLARKPSAMDSTKQLQNLKMRI 1304
            L   + +   +LSLWILKLYPELAVMKDTKNNNETALHVLARKPSAMDSTKQLQN KMRI
Sbjct: 143  LIATIHSDFHDLSLWILKLYPELAVMKDTKNNNETALHVLARKPSAMDSTKQLQNWKMRI 202

Query: 1305 NSWR-FNSKLFISPWKLINEILASLILPSNSNKDVTKTLAHQLVEFLWRYVVYELPQKEM 1364
            NSWR FNSKLFISPW+LI+EILASLILPSNSNKDVTKTLAHQLVEFLWRYVVYELPQKEM
Sbjct: 203  NSWRSFNSKLFISPWELIDEILASLILPSNSNKDVTKTLAHQLVEFLWRYVVYELPQKEM 262

Query: 1365 LEFIKHPTSLLNDAAGAGNVEFLIVLICEFPDILWGDDDN-DDSKSIFHVAVENRLENVF 1424
            LEFI+HPTSLLNDAAGAGNVEFLIVLI E+PDILWGDDD+ DDSKSIFHVAVENRLENVF
Sbjct: 263  LEFIRHPTSLLNDAAGAGNVEFLIVLIREYPDILWGDDDDEDDSKSIFHVAVENRLENVF 322

Query: 1425 NLINEIGKLNEFSTKYRTFKGKYSILHLAGNLAAPNHLNRVSGAALQMQREMLWFKEVEK 1484
            NLINEIGKLNEFSTKYRTFKGKYSILHLAGNLAAPNHLNRVSGAALQMQREMLWFKEVEK
Sbjct: 323  NLINEIGKLNEFSTKYRTFKGKYSILHLAGNLAAPNHLNRVSGAALQMQREMLWFKEVEK 382

Query: 1485 IVLPSQLEVKSNDPDPSIPKLTPRQLFTEKHKRLRKEGEEWMKNTANSCMLVATLISTVV 1544
            IVLPSQLEVKSNDPDPSIPKLTPRQLFTEKHKRLRKEGEEWMKNTANSCMLVATLISTVV
Sbjct: 383  IVLPSQLEVKSNDPDPSIPKLTPRQLFTEKHKRLRKEGEEWMKNTANSCMLVATLISTVV 442

Query: 1545 FAAAFTVPGGNDDNTGTPIFQNKFWFAMFVVSDAIALFSSSTSILMFLSILTSRYAEEDF 1604
            FAAAFTVPGGNDDNTGTPIFQNKFWFAMFVVSDAIALFSSSTSILMFLSILTSRYAEEDF
Sbjct: 443  FAAAFTVPGGNDDNTGTPIFQNKFWFAMFVVSDAIALFSSSTSILMFLSILTSRYAEEDF 502

Query: 1605 LHSLPSKLLFGLASLFISIVFMAVAFSSTFFLIYHNANISIPTMVTAMAIIPITCFCLLQ 1664
            LHSLPSKLLFGLASLFISIVFMA+AFSSTFFLIYHNANISIPTMVTAMAIIPITCFCLLQ
Sbjct: 503  LHSLPSKLLFGLASLFISIVFMAIAFSSTFFLIYHNANISIPTMVTAMAIIPITCFCLLQ 562

Query: 1665 FTLWIDIFHNTYSSRFLFNPNNPRKLF 1690
            FTLWIDIFHNTYSSRFLFNPNNPRKLF
Sbjct: 563  FTLWIDIFHNTYSSRFLFNPNNPRKLF 589

BLAST of CSPI01G10200 vs. NCBI nr
Match: gi|659067661|ref|XP_008440631.1| (PREDICTED: uncharacterized protein LOC103484989 isoform X1 [Cucumis melo])

HSP 1 Score: 833.2 bits (2151), Expect = 8.2e-238
Identity = 427/447 (95.53%), Positives = 436/447 (97.54%), Query Frame = 1

Query: 1245 LKVNVFNKLRNLSLWILKLYPELAVMKDTKNNNETALHVLARKPSAMDSTKQLQNLKMRI 1304
            L   + +   +LSLWILKLYPELAVMKDTKNNNETALHVLARKPSAMDSTKQLQN KMRI
Sbjct: 208  LIATIHSDFHDLSLWILKLYPELAVMKDTKNNNETALHVLARKPSAMDSTKQLQNWKMRI 267

Query: 1305 NSWR-FNSKLFISPWKLINEILASLILPSNSNKDVTKTLAHQLVEFLWRYVVYELPQKEM 1364
            NSWR FNSKLFISPW+LI+EILASLILPSNSNKDVTKTLAHQLVEFLWRYVVYELPQKEM
Sbjct: 268  NSWRSFNSKLFISPWELIDEILASLILPSNSNKDVTKTLAHQLVEFLWRYVVYELPQKEM 327

Query: 1365 LEFIKHPTSLLNDAAGAGNVEFLIVLICEFPDILWGDDDN-DDSKSIFHVAVENRLENVF 1424
            LEFI+HPTSLLNDAAGAGNVEFLIVLI E+PDILWGDDD+ DDSKSIFHVAVENRLENVF
Sbjct: 328  LEFIRHPTSLLNDAAGAGNVEFLIVLIREYPDILWGDDDDEDDSKSIFHVAVENRLENVF 387

Query: 1425 NLINEIGKLNEFSTKYRTFKGKYSILHLAGNLAAPNHLNRVSGAALQMQREMLWFKEVEK 1484
            NLINEIGKLNEFSTKYRTFKGKYSILHLAGNLAAPNHLNRVSGAALQMQREMLWFKEVEK
Sbjct: 388  NLINEIGKLNEFSTKYRTFKGKYSILHLAGNLAAPNHLNRVSGAALQMQREMLWFKEVEK 447

Query: 1485 IVLPSQLEVKSNDPDPSIPKLTPRQLFTEKHKRLRKEGEEWMKNTANSCMLVATLISTVV 1544
            IVLPSQLEVKSNDPDPSIPKLTPRQLFTEKHKRLRKEGEEWMKNTANSCMLVATLISTVV
Sbjct: 448  IVLPSQLEVKSNDPDPSIPKLTPRQLFTEKHKRLRKEGEEWMKNTANSCMLVATLISTVV 507

Query: 1545 FAAAFTVPGGNDDNTGTPIFQNKFWFAMFVVSDAIALFSSSTSILMFLSILTSRYAEEDF 1604
            FAAAFTVPGGNDDNTGTPIFQNKFWFAMFVVSDAIALFSSSTSILMFLSILTSRYAEEDF
Sbjct: 508  FAAAFTVPGGNDDNTGTPIFQNKFWFAMFVVSDAIALFSSSTSILMFLSILTSRYAEEDF 567

Query: 1605 LHSLPSKLLFGLASLFISIVFMAVAFSSTFFLIYHNANISIPTMVTAMAIIPITCFCLLQ 1664
            LHSLPSKLLFGLASLFISIVFMA+AFSSTFFLIYHNANISIPTMVTAMAIIPITCFCLLQ
Sbjct: 568  LHSLPSKLLFGLASLFISIVFMAIAFSSTFFLIYHNANISIPTMVTAMAIIPITCFCLLQ 627

Query: 1665 FTLWIDIFHNTYSSRFLFNPNNPRKLF 1690
            FTLWIDIFHNTYSSRFLFNPNNPRKLF
Sbjct: 628  FTLWIDIFHNTYSSRFLFNPNNPRKLF 654

BLAST of CSPI01G10200 vs. NCBI nr
Match: gi|1012333128|gb|KYP44533.1| (Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cajanus cajan])

HSP 1 Score: 713.0 bits (1839), Expect = 1.2e-201
Identity = 365/596 (61.24%), Positives = 451/596 (75.67%), Query Frame = 1

Query: 216 FLGVDRVKKVRLQKLRGDYESLHMKESESVSDYTSRLLAAVNEMKRYGETISDEQVVEKI 275
           + G D+VKKVRLQ LRG++E+LHMKE E VSDY SR+L   N +KR GE + D +++EKI
Sbjct: 102 YKGADQVKKVRLQTLRGEFEALHMKEGELVSDYFSRVLTVTNNLKRNGEKLDDVRIMEKI 161

Query: 276 LRSLDEKFNFIVVAIEESKYLSTMSIDQLMGSLQAHEEKLLKKNKQTTEQLFQSKLNLKD 335
           LRSLD KF  IV   EE+K L  MSI+QL+GSLQA+EEK  KK ++  EQ+F++ ++ + 
Sbjct: 162 LRSLDPKFEHIVTITEETKDLEAMSIEQLLGSLQAYEEKK-KKKEEIVEQVFKAHVDSRK 221

Query: 336 KEDSLEKGNR----------------GRRRGGN--RGHGDFRDHERSNNDKRYDKRQAEC 395
           +E++  +  R                GRR   N  RG    R   R N + RYDK + +C
Sbjct: 222 EENAHNQSRRSYSQEQGRGRAYGHGQGRRPNNNNQRGESSNRGRGRGNPNSRYDKSRIKC 281

Query: 396 YNCHKFGQYSWEC----KNRVEENANYAEKDEERGNSSLLLACKGVETCENNAWYLDSGA 455
           YNC+KFG Y+ EC    KN+VEE ANYAE+  +  + +LLLA KG +  E+N WYLDSGA
Sbjct: 282 YNCNKFGHYASECRAPNKNKVEEKANYAEERCQE-DGTLLLAYKGQDKGEDNQWYLDSGA 341

Query: 456 SNHMCGCKSMFMELDESVGGDTVFGDATKIPVRGKGKILINLKNGKHEFISNVYYVPDMK 515
           SNHMCG +SMF+ELDESV G+  FGD +K+ V GKG +LI LKNG+H+FISNVYYVP MK
Sbjct: 342 SNHMCGKRSMFVELDESVKGNVAFGDESKVAVEGKGNVLIRLKNGEHQFISNVYYVPSMK 401

Query: 516 NNILSLGQLLEKGYNILMKDYSRLLIRDNHDNMIAKVQMTKNGMFLLNIQTDVAKCLKSC 575
           +NILSLGQLLEKGY+I +K+ + L IRDN    IAKV MT+N MF+LNIQ+D  +CLK C
Sbjct: 402 SNILSLGQLLEKGYDIQLKN-NNLSIRDNTSRFIAKVPMTRNRMFVLNIQSDGPQCLKMC 461

Query: 576 LKDPNWIWHLRFGHLNFDGLRLLAKKNMVKGLPYVKHPDQFCEGCLYGKQSRKSFPQESS 635
            KD +W+WHLRFGHLNF GL LL+KK MV+GLP + HP+Q CEGCL GKQ R SFP+ES 
Sbjct: 462 YKDQSWLWHLRFGHLNFKGLELLSKKAMVRGLPCITHPNQVCEGCLLGKQFRLSFPKESD 521

Query: 636 WKARKPLKLVHIDLCGPIKPSSFGKNNYFLLFIDDFSRKTWIYFVKEKSEVFGMFKRFRA 695
            +A+KPL+L+H D+CGPIKP S GK+NYFLLFIDDFSRKTW+YF+KEKSEVF  FK+F+A
Sbjct: 522 SRAQKPLELIHTDVCGPIKPRSLGKSNYFLLFIDDFSRKTWVYFLKEKSEVFENFKKFKA 581

Query: 696 LVEKESGYYIKALRSDRGGEFTSNEFKKICAENGIRRPMTVSFTPPQNGVVERKNRTIFN 755
            VEKESG  IKALRSDRGGEFTS EF+K C +NGIRR +TV  +P QNGV ERKNRTI  
Sbjct: 582 HVEKESGLLIKALRSDRGGEFTSKEFQKYCEDNGIRRQLTVPRSPQQNGVAERKNRTILE 641

Query: 756 MARSMLKSKKMPKEFWAQVVECAIYLSNCSPTRSLWNKTPQQAWTGRKPSIAHLRI 790
           MARSMLKSKK+PKEFWA+ V CA+YL+N SPTRS+  KTPQ+AW+GRKP I+HLR+
Sbjct: 642 MARSMLKSKKLPKEFWAEAVACAVYLTNRSPTRSVSGKTPQEAWSGRKPGISHLRV 694

BLAST of CSPI01G10200 vs. NCBI nr
Match: gi|1012357856|gb|KYP69041.1| (Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cajanus cajan])

HSP 1 Score: 713.0 bits (1839), Expect = 1.2e-201
Identity = 365/596 (61.24%), Positives = 451/596 (75.67%), Query Frame = 1

Query: 216 FLGVDRVKKVRLQKLRGDYESLHMKESESVSDYTSRLLAAVNEMKRYGETISDEQVVEKI 275
           + G D+VKKVRLQ LRG++E+LHMKE E VSDY SR+L   N +KR GE + D +++EKI
Sbjct: 102 YKGADQVKKVRLQTLRGEFEALHMKEGELVSDYFSRVLTVTNNLKRNGEKLDDVRIMEKI 161

Query: 276 LRSLDEKFNFIVVAIEESKYLSTMSIDQLMGSLQAHEEKLLKKNKQTTEQLFQSKLNLKD 335
           LRSLD KF  IV   EE+K L  MSI+QL+GSLQA+EEK  KK ++  EQ+F++ ++ + 
Sbjct: 162 LRSLDPKFEHIVTITEETKDLEAMSIEQLLGSLQAYEEKK-KKKEEIVEQVFKAHVDSRK 221

Query: 336 KEDSLEKGNR----------------GRRRGGN--RGHGDFRDHERSNNDKRYDKRQAEC 395
           +E++  +  R                GRR   N  RG    R   R N + RYDK + +C
Sbjct: 222 EENAHNQSRRSYSQEQGRGRAYGHGQGRRPNNNNQRGESSNRGRGRGNPNSRYDKSRIKC 281

Query: 396 YNCHKFGQYSWEC----KNRVEENANYAEKDEERGNSSLLLACKGVETCENNAWYLDSGA 455
           YNC+KFG Y+ EC    KN+VEE ANYAE+  +  + +LLLA KG +  E+N WYLDSGA
Sbjct: 282 YNCNKFGHYASECRAPNKNKVEEKANYAEERCQE-DGTLLLAYKGQDKGEDNQWYLDSGA 341

Query: 456 SNHMCGCKSMFMELDESVGGDTVFGDATKIPVRGKGKILINLKNGKHEFISNVYYVPDMK 515
           SNHMCG +SMF+ELDESV G+  FGD +K+ V GKG +LI LKNG+H+FISNVYYVP MK
Sbjct: 342 SNHMCGKRSMFVELDESVKGNVAFGDESKVAVEGKGNVLIRLKNGEHQFISNVYYVPSMK 401

Query: 516 NNILSLGQLLEKGYNILMKDYSRLLIRDNHDNMIAKVQMTKNGMFLLNIQTDVAKCLKSC 575
           +NILSLGQLLEKGY+I +K+ + L IRDN    IAKV MT+N MF+LNIQ+D  +CLK C
Sbjct: 402 SNILSLGQLLEKGYDIQLKN-NNLSIRDNTSRFIAKVPMTRNRMFVLNIQSDGPQCLKMC 461

Query: 576 LKDPNWIWHLRFGHLNFDGLRLLAKKNMVKGLPYVKHPDQFCEGCLYGKQSRKSFPQESS 635
            KD +W+WHLRFGHLNF GL LL+KK MV+GLP + HP+Q CEGCL GKQ R SFP+ES 
Sbjct: 462 YKDQSWLWHLRFGHLNFKGLELLSKKAMVRGLPCITHPNQVCEGCLLGKQFRLSFPKESD 521

Query: 636 WKARKPLKLVHIDLCGPIKPSSFGKNNYFLLFIDDFSRKTWIYFVKEKSEVFGMFKRFRA 695
            +A+KPL+L+H D+CGPIKP S GK+NYFLLFIDDFSRKTW+YF+KEKSEVF  FK+F+A
Sbjct: 522 SRAQKPLELIHTDVCGPIKPRSLGKSNYFLLFIDDFSRKTWVYFLKEKSEVFENFKKFKA 581

Query: 696 LVEKESGYYIKALRSDRGGEFTSNEFKKICAENGIRRPMTVSFTPPQNGVVERKNRTIFN 755
            VEKESG  IKALRSDRGGEFTS EF+K C +NGIRR +TV  +P QNGV ERKNRTI  
Sbjct: 582 HVEKESGLLIKALRSDRGGEFTSKEFQKYCEDNGIRRQLTVPRSPQQNGVAERKNRTILE 641

Query: 756 MARSMLKSKKMPKEFWAQVVECAIYLSNCSPTRSLWNKTPQQAWTGRKPSIAHLRI 790
           MARSMLKSKK+PKEFWA+ V CA+YL+N SPTRS+  KTPQ+AW+GRKP I+HLR+
Sbjct: 642 MARSMLKSKKLPKEFWAEAVACAVYLTNRSPTRSVSGKTPQEAWSGRKPGISHLRV 694
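
The percentages in each HSP summary line follow directly from the counts: for example, 365 identical positions over an alignment length of 596 gives 365/596, about 61.24%. Below is a minimal Python sketch that parses one such line and recomputes both percentages, assuming the summary lines keep the layout shown in this record. The function name `parse_hsp_summary` and the dictionary keys are hypothetical names chosen for illustration; they are not part of BLAST or of this database.

```python
import re

# Pattern for the per-HSP summary line. The label is matched with or without
# the second "i" because some reports spell it "Postives".
HSP_RE = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"Posi?tives = (\d+)/(\d+) \(([\d.]+)%\)"
)


def parse_hsp_summary(line):
    """Return counts and recomputed percentages, or None if the line does not match."""
    m = HSP_RE.search(line)
    if m is None:
        return None
    ident, length, ident_pct, pos, pos_length, pos_pct = m.groups()
    return {
        "identical": int(ident),
        "positive": int(pos),
        "length": int(length),
        # Recomputed from the counts, rounded to two decimals like the report.
        "identity_pct": round(100 * int(ident) / int(length), 2),
        "positive_pct": round(100 * int(pos) / int(pos_length), 2),
        # Values as printed in the report, kept for cross-checking.
        "reported_identity_pct": float(ident_pct),
        "reported_positive_pct": float(pos_pct),
    }


example = "Identity = 365/596 (61.24%), Postives = 451/596 (75.67%), Query Frame = 1"
print(parse_hsp_summary(example))
```

Comparing the recomputed and reported values is a cheap consistency check when alignment blocks are copied out of a page like this one.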

The following BLAST results are available for this feature:
Match Name	E-value	Identity (%)	Description
POLX_TOBAC	1.1e-85	36.09	Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum...
COPIA_DROME	1.4e-72	32.53	Copia protein OS=Drosophila melanogaster GN=GIP PE=1 SV=3
M810_ARATH	9.1e-27	39.61	Uncharacterized mitochondrial protein AtMg00810 OS=Arabidopsis thaliana GN=AtMg0...
YD15B_YEAST	6.7e-22	24.09	Transposon Ty1-DR6 Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC ...
YL14B_YEAST	6.7e-22	24.09	Transposon Ty1-LR4 Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC ...

Match Name	E-value	Identity (%)	Description
A0A151TPU3_CAJCA	8.7e-202	61.24	Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cajanus cajan GN=...
A0A151RPT4_CAJCA	8.7e-202	61.24	Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cajanus cajan GN=...
A0A151TGP0_CAJCA	2.8e-200	60.74	Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cajanus cajan GN=...
Q9SXB2_ARATH	5.4e-196	54.35	T28P6.8 protein OS=Arabidopsis thaliana GN=T28P6.8 PE=4 SV=1
Q9C536_ARATH	2.1e-195	54.20	Copia-type polyprotein, putative OS=Arabidopsis thaliana GN=T18I24.5 PE=4 SV=1

Match Name	E-value	Identity (%)	Description
AT5G35810.1	4.6e-77	45.24	Ankyrin repeat family protein
AT4G23160.1	1.2e-74	34.04	cysteine-rich RLK (RECEPTOR-like protein kinase) 8
AT3G54070.1	9.5e-67	46.67	Ankyrin repeat family protein
AT3G18670.1	2.6e-56	42.58	Ankyrin repeat family protein
AT5G04730.1	5.8e-56	35.52	Ankyrin-repeat containing protein

Match Name	E-value	Identity (%)	Description
gi|449454915|ref|XP_004145199.1|	7.0e-245	97.75	PREDICTED: uncharacterized protein LOC101215460 [Cucumis sativus]
gi|659067663|ref|XP_008440640.1|	8.2e-238	95.53	PREDICTED: uncharacterized protein LOC103484989 isoform X2 [Cucumis melo]
gi|659067661|ref|XP_008440631.1|	8.2e-238	95.53	PREDICTED: uncharacterized protein LOC103484989 isoform X1 [Cucumis melo]
gi|1012333128|gb|KYP44533.1|	1.2e-201	61.24	Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cajanus cajan]
gi|1012357856|gb|KYP69041.1|	1.2e-201	61.24	Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cajanus cajan]

The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term	Definition
IPR001584	Integrase_cat-core
IPR002110	Ankyrin_rpt
IPR012337	RNaseH-like_sf
IPR013103	RVT_2
IPR020683	Ankyrin_rpt-contain_dom
IPR025724	GAG-pre-integrase_dom
IPR026961	PGG_dom
Vocabulary: Biological Process
Term	Definition
GO:0015074	DNA integration
Vocabulary: Molecular Function
Term	Definition
GO:0005515	protein binding
GO:0003676	nucleic acid binding
GO Assignments
This gene is annotated with the following GO terms.
Category	Term Accession	Term Name
biological_process	GO:0015074	DNA integration
cellular_component	GO:0005575	cellular_component
molecular_function	GO:0046872	metal ion binding
molecular_function	GO:0003676	nucleic acid binding
molecular_function	GO:0005515	protein binding

The following mRNA feature(s) are a part of this gene:

Feature Name	Unique Name	Type
CSPI01G10200.1	CSPI01G10200.1	mRNA


Analysis Name: InterPro Annotations of cucumber (PI183967)
Date Performed: 2017-01-17
IPR Term	IPR Description	Source	Source Term	Source Description	Alignment
IPR001584	Integrase, catalytic core	PFAM	PF00665	rve	coord: 618..731, score: 1.6
IPR001584	Integrase, catalytic core	PROFILE	PS50994	INTEGRASE	coord: 615..781, score: 23
IPR002110	Ankyrin repeat	SMART	SM00248	ANK_2a	coord: 1276..1306, score: 770.0; coord: 101..130, score: 130.0; coord: 1405..1433, score: 850.0; coord: 136..165, score: 320.0; coord: 170..198, score: 9
IPR012337	Ribonuclease H-like domain	GENE3D	G3DSA:3.30.420.10	-	coord: 613..773, score: 8.5
IPR012337	Ribonuclease H-like domain	unknown	SSF53098	Ribonuclease H-like	coord: 618..788, score: 9.23
IPR013103	Reverse transcriptase, RNA-dependent DNA polymerase	PFAM	PF07727	RVT_2	coord: 834..1034, score: 7.0
IPR020683	Ankyrin repeat-containing domain	GENE3D	G3DSA:1.25.40.20	-	coord: 1266..1304, score: 1.5E-15; coord: 71..202, score: 1.5E-15; coord: 1182..1183, score: 1.5
IPR020683	Ankyrin repeat-containing domain	PROFILE	PS50297	ANK_REP_REGION	coord: 101..192, score: 12
IPR020683	Ankyrin repeat-containing domain	unknown	SSF48403	Ankyrin repeat	coord: 71..193, score: 1.71
IPR025724	GAG-pre-integrase domain	PFAM	PF13976	gag_pre-integrs	coord: 549..603, score: 1.1
IPR026961	PGG domain	PFAM	PF13962	PGG	coord: 1521..1633, score: 1.0
None	No IPR available	unknown	Coil	Coil	coord: 301..321, scor
None	No IPR available	PANTHER	PTHR11439	GAG-POL-RELATED RETROTRANSPOSON	coord: 184..1186, score:
None	No IPR available	PANTHER	PTHR11439:SF127	SUBFAMILY NOT NAMED	coord: 184..1186, score:
None	No IPR available	PFAM	PF14223	Retrotran_gag_2	coord: 184..316, score: 8.5
None	No IPR available	unknown	SSF56672	DNA/RNA polymerases	coord: 834..1215, score: 7.56

The following gene(s) are orthologous to this gene:
Gene	Orthologue	Organism	Block
CSPI01G10200	Cla019345	Watermelon (97103) v1	cpiwmB010
CSPI01G10200	ClCG03G004310	Watermelon (Charleston Gray)	cpiwcgB038
The following gene(s) are paralogous to this gene:

None