CSPI04G12200 (gene) Cucumber (PI 183967) v1

Overview
Name: CSPI04G12200
Type: gene
Organism: Cucumis sativus var. hardwickii cv. PI 183967 (Cucumber (PI 183967) v1)
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Location: Chr4: 10524864 .. 10529371 (-)
RNA-Seq Expression: CSPI04G12200
Synteny: CSPI04G12200
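
The Location field packs chromosome, coordinates and strand into one string. A minimal Python sketch of how it could be split apart is given here. The function name `parse_location` is hypothetical, and the length calculation assumes the coordinates are 1-based and inclusive, which this page does not state.

```python
import re

def parse_location(text):
    """Split a 'Chr: start .. end (strand)' string into its parts.

    Assumption: coordinates are 1-based and inclusive, so the span
    length is end - start + 1.
    """
    pattern = r"\s*(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*"
    match = re.fullmatch(pattern, text)
    if match is None:
        raise ValueError(f"unrecognised location: {text!r}")
    start, end = int(match.group(2)), int(match.group(3))
    return {
        "chrom": match.group(1),
        "start": start,
        "end": end,
        "strand": match.group(4),
        "length": end - start + 1,
    }

loc = parse_location("Chr4: 10524864 .. 10529371 (-)")
print(loc["chrom"], loc["strand"], loc["length"])
```

Under the inclusive-coordinate assumption, the feature spans 4508 bases on the minus strand of Chr4.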
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

ATGGAGAGAGACCATATCTTTAGACCCATTTCTACCATACTTGATGGTACAAATTATATCACATGGGCAAATCAAATGAAAAGTTTTCTTATTGGAAGAAAACTATGGCGCATTGTAACTGGCGACATCACCAAACCAACTGCAAAGGAAGGGGATAACACATTCATTGAACGTCTCGAAGATTGGGACAGTAAAAATCATCAAATTATCACCTGGCTTGGTAACACCTCTATTCCCGCTATACATGCACAATTTGATGCTTTTGATGATGCTAAACAATTATGAGATTTTTTGTCTACACGCTTCAAGTCTATTGGTTTAGCCCATTATTACCAACTGCACAGTACACTTGTAAATCTAAATCAAGATATTGGTCAGTCTGTTAATGAATATTTGGCAGTTCTTCAACCCATTTGGACTCAACTTGACCAAGCGAACATCAGCAAAGATCATCTTCGCCTTATTAAAGTCCTTATGGGATTACGTCCAGAATATGAATCTGTTAGAGCTGCTTTACTACACCGGAATCCCTTACCCTCATTAGATGCAGCTATTCAAGAAATTCTGTTTGAAGAAAAGCGTCTTGGCATCAACTCTACTAAACAATCTGATGTTGTCCTTGCTAGCACATACACTCCCAACAGAGCCGCAAATATGTTTTGTAAGAATTGTAAGCTCTCTGGTCACAAATTTAGTAACTGTCCTAAAATAGAGTGCAGGTACTGCCATAAACATGGCCACATTCTGGATAACTGCCCTACCAGACCACCCCGACCTCCTGGCACTTCCACAAAAGAGAAAATTTTTACCAAACATGGTTCCTCATCTGTTGTTGCTGCGACCTCGGATGATTCATCCCTCATTCAGATAAGTGATCTTCAGAGCTTATTGAATCAACTAATTTCATCATCCTCCACTCTTGCTGTTTCCTCAGGTAATCGATGGCTTCTTGATTCTGCCTGTTGTAATCATATGACCTCTGACTCTTCTCTTATGTCTACTTCTAGCCCTACAAAATCTTTACCTCCTATTTATGCTGCTGATGGTAATTGTATGAACATCTCTCATACTGGTACCATTGATACTCCCAGTGTACATCTTCCCCATACTTACTGTGTTCCTAACCTGACCTTTAATCTAGTGTCTGTTGGTCAATTATGTGATCTTGGCTTAAATGTTTCATTTTCTCCCAATGGTTGTCAGGTTCAGGATCCGCAGACGGGACAGACGATTGGAACGGGTCGCAAAGTGGGAAGATTGTTTGAGCTCACATCACTTCGGGTTTCATCTCCTTCTTCCATCTCTGCTTCGGTCACTGATTCTGACACATATCAGTGGCATCTTCGTCTTGGTCATGCTTCCTCTGAAAAACTTCGTCATTTAATTTCTGTTAACAATTTGACTAATCTTACTAAATTTGTTCCTTTTAATTGTTTGAATTGCAAACTTGCTAAACAACCTGCCTTATCTTTTTCTCAATCCATCTCTAATTGTGATAAACCTTTTGATTTAGTGCATTCTGATATTTGGGGTCCTGCCCCAATTACTACTGTTCATGGTTATCGCTACTATGTTTTATTCATTGATGACTACTCTCGATTTACATGGATTTACTTTCTAAAACATCGTTCTGAATTATCTCGCACATATATTGAGTTTGCTAACATGATTCGCACTCAATTTTCCTCTCCCATCAAAATTCTTCGCACTGATAATGTTTTGGAATATAAAGATTCCATCCTTCTTTCTTTTCTTTCCCAACAGGGCACTATTGTTCAGCGCTCTTGCCCTCATACCTCTCAACAAAATGGACGTGCTGAGCGCAAACATCGTCACATTCTTGACTCAGTACGTGCCCTCCTTCTTTCTGCCTCTTGTCCAGAAAAATTCTGGGGTGAAGCTGCCCTTACATCAGTATATACAATCAATCGTCTCCCTTCTTCTGTTCTTCAAAACACCTCTCCATTTGAAAAACTATATGGTATTTCTCCCGACTATTC
TAAACTCAAAGTTTTTGGTAGTGCCTGCTTCGTTCTGTTACATCCTCATGAACACAATAAACTTGAACCACGTGCCCGTCTCTGTTGCTTCCTTGGCTATGGCACCGAACACAAAGGATTTCGTTGTTGGGACCCTCTTTCCAACCGACTCCGGATATCTCGGCATGTCACTTTTTGGGAACACACTATGTTCTCTCGTTTGTCCTCCTTCCACACCTCCTTCTCTAGTCCTCAATCTTTCTTTACAAACACATCTGTTGACCTTTTTCCTCTCTCTGAACCCACCTTGGATACTGAGCTTGCACAATCTTCACCTGCTACTGCAAATCTGGATCCACCGTCTGTCTTCGATGATGTTCCTGAATCGCCACCTGCTACTCCTCTTCGTCGCTCTACCCGGGTAAGAGAACCTCCCTCTCATCTCACTCAGTATATACAATCAATCGTCTCCCTTCTTCTGTTCTTCAAAACACCTCTCCATTTGAAAAACTATATGGTATTTCTCCCGACTATTCTAAACTCAAAGTTTTTGGTAGTGCCTGCTTCGTTCTGTTACATCCTCATGAACACAATAAACTTGAACCACGTGCCCGTCTCTGTTGCTTCCTTGGCTATGGCACCGAACACAAAGGATTTCGTTGTTGGGACCCTCTTTCCAACCGACTCCGGATATCTCGGCATGTCACTTTTTGGGAACACACTATGTTCTCTCGTTTGTCCTCCTTCCACACCTCTTTCTCTAGTCCTCAATCTTTCTTTACAAATACATCTGTTGACCTTTTTCCTCTCTCTGAACCCACCTTGGATACTGAGCTTGCACAATCTTCACCTGCTACTGCAAATCTGGATCCACCGTCTGTCTCCGATGATGTTCCTGAATCGTCACCTGCTACTCCTCTTCGTCGCTCTACCCGGGTAAGAGAACCTCCCCCTCATCTCACTGATTACCATTGTTTTTCTACCATTGTTTCCCTTGTTGAACCCACCTCTTATCAAGAGGCCAGTATTAACCCAGTATGGCAGAAAGCAATGGATGAAGAATTACAGGCTCTTGAAAAGACGCACACTTGGGACTATGTTGATTTACCTCCCGGTAAAAGACCCATTGGTTGCAAATGGATTTACAAAATCAAAACTCACTCTGATGGAACTATTGAACGTTATAAAGCTCGGCTTGTTGCAAAAGGATACTCACAAGAATATGGGATTGACTATGAAGAAACATTTGCCCCTGTTGCCCGGATGACATCTGTTCGCAGCTTGTTAGCTGTTGCTGCTGCCAAACAGTGGCCTCTTTTTCAGATGGATGTCAAAAATGCATTTCTTAACGGCAACCTATCTGAAGAAGTGTATATGAAGCCACCTCAGGGAACTTCTCCTCCTCCCAACAAGGTGTGTCTCCTTCGTCGCGCTCTATACGGTCTAAAACAGGCTCCACGAGCTTGGTTTGCCACGTTTAGCTCCACCATTACTCAACTTGGATTTACCTCCAGCTCTCACGACAATGCCCTTTTTACACGACAGACAACTCATGGTATTGTTCTTCTCCTTCTTTATGTTGATGATATGATTATTACTGGTAATGATCAACAGGCCATATCCGACCTACAACAATATCTTGGTCAACATTTTGAGATGAAAGACCTTGGATCTCTCAATTACTTTCTCGGTCTTGAAGTCTCTCACCGTTCAGATGGTTATCTGTTATCTCAAGCGAAATATGCATCTGATCTAATAGCACGCTCAGGAATTACAGACTCCACCACATCTTCAACACCGTTAGATCCTCATGTCCATCTAACTCCGTTTGATGGTGTTCCTCTTGACGATGCAAGCTTGTATCGGCAACTTGTTGGCAGTCTTATATACCTAACAGTAACTCGCCCAGATATTGCATATGCTGTTCATATTGTCAGTCAATTTATGGCTGCTCCTCGAACAATTCATTTCACTGCTGTTCTACGCATACTTCGCTATGTCAAAGGCACCTTGGGACATGG
TCTTCAATTCTCATCTCAGTCTTCCCTTGTGTTGTCGGGATATTCTGATGCTGATTGGGCGGGGGATCCTACTGATCGACGATCCACTACAGGATACTGTTTTTACTTAGGTGATTCTCTCATCTCATGGCGTAGTAAGAAACAAAGTGTTATATCTCGTTCCAGTACGGAATCTGAATATCGTGCTCTGGCTGATGCTACAGCTGAACTTATATGGCTTCGGTGGCTCCTTGCCGATATGGGTGTCCCTCAACAGGGTCCTACCCTCCTCCATTGTGACAATCGTAGTGCCATTCAGATTGCTCACAATGATGTGTTTCATGAACGTACAAAACACATTGAAAATGACTGTCACTTTGTTCGTCACCACCTCCTAAGCAACACCCTCCTCTTACGTTCTGTTTCTACTATTGAACAACCTGCGGATATCTTCACCAAAGCCTTGCCATCTAATCGATTCTGTCACTTACTTACCAAACTCAAGTTGATCGCTACTCTACCACCTTGA

mRNA sequence

ATGGAGAGAGACCATATCTTTAGACCCATTTCTACCATACTTGATGGTACAAATTATATCACATGGGCAAATCAAATGAAAAGTTTTCTTATTGGAAGAAAACTATGGCGCATTGTAACTGGCGACATCACCAAACCAACTGCAAAGGAAGGGGATAACACATTCATTGAACGTCTCGAAGATTGGGACAGTAAAAATCATCAAATTATCACCTGGCTTGCCCATTATTACCAACTGCACAGTACACTTGTAAATCTAAATCAAGATATTGGTCAGTCTGTTAATGAATATTTGGCAGTTCTTCAACCCATTTGGACTCAACTTGACCAAGCGAACATCAGCAAAGATCATCTTCGCCTTATTAAAGTCCTTATGGGATTACGTCCAGAATATGAATCTGTTAGAGCTGCTTTACTACACCGGAATCCCTTACCCTCATTAGATGCAGCTATTCAAGAAATTCTGTTTGAAGAAAAGCGTCTTGGCATCAACTCTACTAAACAATCTGATGTTGTCCTTGCTAGCACATACACTCCCAACAGAGCCGCAAATATGTTTTGTAAGAATTGTAAGCTCTCTGGTCACAAATTTAGTAACTGTCCTAAAATAGAGTGCAGGTACTGCCATAAACATGGCCACATTCTGGATAACTGCCCTACCAGACCACCCCGACCTCCTGGCACTTCCACAAAAGAGAAAATTTTTACCAAACATGGTTCCTCATCTGTTGTTGCTGCGACCTCGGATGATTCATCCCTCATTCAGATAAGTGATCTTCAGAGCTTATTGAATCAACTAATTTCATCATCCTCCACTCTTGCTGTTTCCTCAGGTAATCGATGGCTTCTTGATTCTGCCTGTTGTAATCATATGACCTCTGACTCTTCTCTTATGTCTACTTCTAGCCCTACAAAATCTTTACCTCCTATTTATGCTGCTGATGGTAATTGTATGAACATCTCTCATACTGGTACCATTGATACTCCCAGTGTACATCTTCCCCATACTTACTGTGTTCCTAACCTGACCTTTAATCTAGTGTCTGTTGGTCAATTATGTGATCTTGGCTTAAATGTTTCATTTTCTCCCAATGGTTGTCAGGTTCAGGATCCGCAGACGGGACAGACGATTGGAACGGGTCGCAAAGTGGGAAGATTGTTTGAGCTCACATCACTTCGGGTTTCATCTCCTTCTTCCATCTCTGCTTCGGTCACTGATTCTGACACATATCAGTGGCATCTTCGTCTTGGTCATGCTTCCTCTGAAAAACTTCGTCATTTAATTTCTGTTAACAATTTGACTAATCTTACTAAATTTGTTCCTTTTAATTGTTTGAATTGCAAACTTGCTAAACAACCTGCCTTATCTTTTTCTCAATCCATCTCTAATTGTGATAAACCTTTTGATTTAGTGCATTCTGATATTTGGGGTCCTGCCCCAATTACTACTGTTCATGGTTATCGCTACTATGTTTTATTCATTGATGACTACTCTCGATTTACATGGATTTACTTTCTAAAACATCGTTCTGAATTATCTCGCACATATATTGAGTTTGCTAACATGATTCGCACTCAATTTTCCTCTCCCATCAAAATTCTTCGCACTGATAATGTTTTGGAATATAAAGATTCCATCCTTCTTTCTTTTCTTTCCCAACAGGGCACTATTGTTCAGCGCTCTTGCCCTCATACCTCTCAACAAAATGGACGTGCTGAGCGCAAACATCGTCACATTCTTGACTCAGTACGTGCCCTCCTTCTTTCTGCCTCTTGTCCAGAAAAATTCTGGGGTGAAGCTGCCCTTACATCAGTATATACAATCAATCGTCTCCCTTCTTCTGTTCTTCAAAACACCTCTCCATTTGAAAAACTATATGGTATTTCTCCCGACTATTCTAAACTCAAAGTTTTTGGTAGTGCCTGCTTCGTTCTGTTACATCCTCATGAACACAATAAACTTGAACCACGTGCCCGTCTCTGTTGCTTCCTTGGCTATGG
CACCGAACACAAAGGATTTCGTTGTTGGGACCCTCTTTCCAACCGACTCCGGATATCTCGGCATGTCACTTTTTGGGAACACACTATGTTCTCTCGTTTGTCCTCCTTCCACACCTCCTTCTCTAGTCCTCAATCTTTCTTTACAAACACATCTGTTGACCTTTTTCCTCTCTCTGAACCCACCTTGGATACTGAGCTTGCACAATCTTCACCTGCTACTGCAAATCTGGATCCACCGTCTGTCTTCGATGATGTTCCTGAATCGCCACCTGCTACTCCTCTTCGTCGCTCTACCCGGGTAAGAGAACCTCCCTCTCATCTCACTCATGCCTGCTTCGTTCTGTTACATCCTCATGAACACAATAAACTTGAACCACGTGCCCGTCTCTGTTGCTTCCTTGGCTATGGCACCGAACACAAAGGATTTCGTTGTTGGGACCCTCTTTCCAACCGACTCCGGATATCTCGGCATGTCACTTTTTGGGAACACACTATGTTCTCTCGTTTGTCCTCCTTCCACACCTCTTTCTCTAGTCCTCAATCTTTCTTTACAAATACATCTGTTGACCTTTTTCCTCTCTCTGAACCCACCTTGGATACTGAGCTTGCACAATCTTCACCTGCTACTGCAAATCTGGATCCACCGTCTGTCTCCGATGATGTTCCTGAATCGTCACCTGCTACTCCTCTTCGTCGCTCTACCCGGGTAAGAGAACCTCCCCCTCATCTCACTGATTACCATTGTTTTTCTACCATTGTTTCCCTTGTTGAACCCACCTCTTATCAAGAGGCCAGTATTAACCCAGTATGGCAGAAAGCAATGGATGAAGAATTACAGGCTCTTGAAAAGACGCACACTTGGGACTATGTTGATTTACCTCCCGGTAAAAGACCCATTGGTTGCAAATGGATTTACAAAATCAAAACTCACTCTGATGGAACTATTGAACGTTATAAAGCTCGGCTTGTTGCAAAAGGATACTCACAAGAATATGGGATTGACTATGAAGAAACATTTGCCCCTGTTGCCCGGATGACATCTGTTCGCAGCTTGTTAGCTGTTGCTGCTGCCAAACAGTGGCCTCTTTTTCAGATGGATGTCAAAAATGCATTTCTTAACGGCAACCTATCTGAAGAAGTGTATATGAAGCCACCTCAGGGAACTTCTCCTCCTCCCAACAAGGTGTGTCTCCTTCGTCGCGCTCTATACGGTCTAAAACAGGCTCCACGAGCTTGGTTTGCCACGTTTAGCTCCACCATTACTCAACTTGGATTTACCTCCAGCTCTCACGACAATGCCCTTTTTACACGACAGACAACTCATGGTATTGTTCTTCTCCTTCTTTATGTTGATGATATGATTATTACTGGTAATGATCAACAGGCCATATCCGACCTACAACAATATCTTGGTCAACATTTTGAGATGAAAGACCTTGGATCTCTCAATTACTTTCTCGGTCTTGAAGTCTCTCACCGTTCAGATGGTTATCTGTTATCTCAAGCGAAATATGCATCTGATCTAATAGCACGCTCAGGAATTACAGACTCCACCACATCTTCAACACCGTTAGATCCTCATGTCCATCTAACTCCGTTTGATGGTGTTCCTCTTGACGATGCAAGCTTGTATCGGCAACTTGTTGGCAGTCTTATATACCTAACAGTAACTCGCCCAGATATTGCATATGCTGTTCATATTGTCAGTCAATTTATGGCTGCTCCTCGAACAATTCATTTCACTGCTGTTCTACGCATACTTCGCTATGTCAAAGGCACCTTGGGACATGGTCTTCAATTCTCATCTCAGTCTTCCCTTGTGTTGTCGGGATATTCTGATGCTGATTGGGCGGGGGATCCTACTGATCGACGATCCACTACAGGATACTGTTTTTACTTAGGTGATTCTCTCATCTCATGGCGTAGTAAGAAACAAAGTGTTATATCTCGTTCCAGTACGGAATCTGAATATCGTGCTCTGGCTGATGCTACAGCTGAAC
TTATATGGCTTCGGTGGCTCCTTGCCGATATGGGTGTCCCTCAACAGGGTCCTACCCTCCTCCATTGTGACAATCGTAGTGCCATTCAGATTGCTCACAATGATGTGTTTCATGAACGTACAAAACACATTGAAAATGACTGTCACTTTGTTCGTCACCACCTCCTAAGCAACACCCTCCTCTTACGTTCTGTTTCTACTATTGAACAACCTGCGGATATCTTCACCAAAGCCTTGCCATCTAATCGATTCTGTCACTTACTTACCAAACTCAAGTTGATCGCTACTCTACCACCTTGA

Coding sequence (CDS)

ATGGAGAGAGACCATATCTTTAGACCCATTTCTACCATACTTGATGGTACAAATTATATCACATGGGCAAATCAAATGAAAAGTTTTCTTATTGGAAGAAAACTATGGCGCATTGTAACTGGCGACATCACCAAACCAACTGCAAAGGAAGGGGATAACACATTCATTGAACGTCTCGAAGATTGGGACAGTAAAAATCATCAAATTATCACCTGGCTTGCCCATTATTACCAACTGCACAGTACACTTGTAAATCTAAATCAAGATATTGGTCAGTCTGTTAATGAATATTTGGCAGTTCTTCAACCCATTTGGACTCAACTTGACCAAGCGAACATCAGCAAAGATCATCTTCGCCTTATTAAAGTCCTTATGGGATTACGTCCAGAATATGAATCTGTTAGAGCTGCTTTACTACACCGGAATCCCTTACCCTCATTAGATGCAGCTATTCAAGAAATTCTGTTTGAAGAAAAGCGTCTTGGCATCAACTCTACTAAACAATCTGATGTTGTCCTTGCTAGCACATACACTCCCAACAGAGCCGCAAATATGTTTTGTAAGAATTGTAAGCTCTCTGGTCACAAATTTAGTAACTGTCCTAAAATAGAGTGCAGGTACTGCCATAAACATGGCCACATTCTGGATAACTGCCCTACCAGACCACCCCGACCTCCTGGCACTTCCACAAAAGAGAAAATTTTTACCAAACATGGTTCCTCATCTGTTGTTGCTGCGACCTCGGATGATTCATCCCTCATTCAGATAAGTGATCTTCAGAGCTTATTGAATCAACTAATTTCATCATCCTCCACTCTTGCTGTTTCCTCAGGTAATCGATGGCTTCTTGATTCTGCCTGTTGTAATCATATGACCTCTGACTCTTCTCTTATGTCTACTTCTAGCCCTACAAAATCTTTACCTCCTATTTATGCTGCTGATGGTAATTGTATGAACATCTCTCATACTGGTACCATTGATACTCCCAGTGTACATCTTCCCCATACTTACTGTGTTCCTAACCTGACCTTTAATCTAGTGTCTGTTGGTCAATTATGTGATCTTGGCTTAAATGTTTCATTTTCTCCCAATGGTTGTCAGGTTCAGGATCCGCAGACGGGACAGACGATTGGAACGGGTCGCAAAGTGGGAAGATTGTTTGAGCTCACATCACTTCGGGTTTCATCTCCTTCTTCCATCTCTGCTTCGGTCACTGATTCTGACACATATCAGTGGCATCTTCGTCTTGGTCATGCTTCCTCTGAAAAACTTCGTCATTTAATTTCTGTTAACAATTTGACTAATCTTACTAAATTTGTTCCTTTTAATTGTTTGAATTGCAAACTTGCTAAACAACCTGCCTTATCTTTTTCTCAATCCATCTCTAATTGTGATAAACCTTTTGATTTAGTGCATTCTGATATTTGGGGTCCTGCCCCAATTACTACTGTTCATGGTTATCGCTACTATGTTTTATTCATTGATGACTACTCTCGATTTACATGGATTTACTTTCTAAAACATCGTTCTGAATTATCTCGCACATATATTGAGTTTGCTAACATGATTCGCACTCAATTTTCCTCTCCCATCAAAATTCTTCGCACTGATAATGTTTTGGAATATAAAGATTCCATCCTTCTTTCTTTTCTTTCCCAACAGGGCACTATTGTTCAGCGCTCTTGCCCTCATACCTCTCAACAAAATGGACGTGCTGAGCGCAAACATCGTCACATTCTTGACTCAGTACGTGCCCTCCTTCTTTCTGCCTCTTGTCCAGAAAAATTCTGGGGTGAAGCTGCCCTTACATCAGTATATACAATCAATCGTCTCCCTTCTTCTGTTCTTCAAAACACCTCTCCATTTGAAAAACTATATGGTATTTCTCCCGACTATTCTAAACTCAAAGTTTTTGGTAGTGCCTGCTTCGTTCTGTTACATCCTCATGAACACAATAAACTTGAACCACGTGCCCGTCTCTGTTGCTTCCTTGGCTATGG
CACCGAACACAAAGGATTTCGTTGTTGGGACCCTCTTTCCAACCGACTCCGGATATCTCGGCATGTCACTTTTTGGGAACACACTATGTTCTCTCGTTTGTCCTCCTTCCACACCTCCTTCTCTAGTCCTCAATCTTTCTTTACAAACACATCTGTTGACCTTTTTCCTCTCTCTGAACCCACCTTGGATACTGAGCTTGCACAATCTTCACCTGCTACTGCAAATCTGGATCCACCGTCTGTCTTCGATGATGTTCCTGAATCGCCACCTGCTACTCCTCTTCGTCGCTCTACCCGGGTAAGAGAACCTCCCTCTCATCTCACTCATGCCTGCTTCGTTCTGTTACATCCTCATGAACACAATAAACTTGAACCACGTGCCCGTCTCTGTTGCTTCCTTGGCTATGGCACCGAACACAAAGGATTTCGTTGTTGGGACCCTCTTTCCAACCGACTCCGGATATCTCGGCATGTCACTTTTTGGGAACACACTATGTTCTCTCGTTTGTCCTCCTTCCACACCTCTTTCTCTAGTCCTCAATCTTTCTTTACAAATACATCTGTTGACCTTTTTCCTCTCTCTGAACCCACCTTGGATACTGAGCTTGCACAATCTTCACCTGCTACTGCAAATCTGGATCCACCGTCTGTCTCCGATGATGTTCCTGAATCGTCACCTGCTACTCCTCTTCGTCGCTCTACCCGGGTAAGAGAACCTCCCCCTCATCTCACTGATTACCATTGTTTTTCTACCATTGTTTCCCTTGTTGAACCCACCTCTTATCAAGAGGCCAGTATTAACCCAGTATGGCAGAAAGCAATGGATGAAGAATTACAGGCTCTTGAAAAGACGCACACTTGGGACTATGTTGATTTACCTCCCGGTAAAAGACCCATTGGTTGCAAATGGATTTACAAAATCAAAACTCACTCTGATGGAACTATTGAACGTTATAAAGCTCGGCTTGTTGCAAAAGGATACTCACAAGAATATGGGATTGACTATGAAGAAACATTTGCCCCTGTTGCCCGGATGACATCTGTTCGCAGCTTGTTAGCTGTTGCTGCTGCCAAACAGTGGCCTCTTTTTCAGATGGATGTCAAAAATGCATTTCTTAACGGCAACCTATCTGAAGAAGTGTATATGAAGCCACCTCAGGGAACTTCTCCTCCTCCCAACAAGGTGTGTCTCCTTCGTCGCGCTCTATACGGTCTAAAACAGGCTCCACGAGCTTGGTTTGCCACGTTTAGCTCCACCATTACTCAACTTGGATTTACCTCCAGCTCTCACGACAATGCCCTTTTTACACGACAGACAACTCATGGTATTGTTCTTCTCCTTCTTTATGTTGATGATATGATTATTACTGGTAATGATCAACAGGCCATATCCGACCTACAACAATATCTTGGTCAACATTTTGAGATGAAAGACCTTGGATCTCTCAATTACTTTCTCGGTCTTGAAGTCTCTCACCGTTCAGATGGTTATCTGTTATCTCAAGCGAAATATGCATCTGATCTAATAGCACGCTCAGGAATTACAGACTCCACCACATCTTCAACACCGTTAGATCCTCATGTCCATCTAACTCCGTTTGATGGTGTTCCTCTTGACGATGCAAGCTTGTATCGGCAACTTGTTGGCAGTCTTATATACCTAACAGTAACTCGCCCAGATATTGCATATGCTGTTCATATTGTCAGTCAATTTATGGCTGCTCCTCGAACAATTCATTTCACTGCTGTTCTACGCATACTTCGCTATGTCAAAGGCACCTTGGGACATGGTCTTCAATTCTCATCTCAGTCTTCCCTTGTGTTGTCGGGATATTCTGATGCTGATTGGGCGGGGGATCCTACTGATCGACGATCCACTACAGGATACTGTTTTTACTTAGGTGATTCTCTCATCTCATGGCGTAGTAAGAAACAAAGTGTTATATCTCGTTCCAGTACGGAATCTGAATATCGTGCTCTGGCTGATGCTACAGCTGAAC
TTATATGGCTTCGGTGGCTCCTTGCCGATATGGGTGTCCCTCAACAGGGTCCTACCCTCCTCCATTGTGACAATCGTAGTGCCATTCAGATTGCTCACAATGATGTGTTTCATGAACGTACAAAACACATTGAAAATGACTGTCACTTTGTTCGTCACCACCTCCTAAGCAACACCCTCCTCTTACGTTCTGTTTCTACTATTGAACAACCTGCGGATATCTTCACCAAAGCCTTGCCATCTAATCGATTCTGTCACTTACTTACCAAACTCAAGTTGATCGCTACTCTACCACCTTGA

Protein sequence

MERDHIFRPISTILDGTNYITWANQMKSFLIGRKLWRIVTGDITKPTAKEGDNTFIERLEDWDSKNHQIITWLAHYYQLHSTLVNLNQDIGQSVNEYLAVLQPIWTQLDQANISKDHLRLIKVLMGLRPEYESVRAALLHRNPLPSLDAAIQEILFEEKRLGINSTKQSDVVLASTYTPNRAANMFCKNCKLSGHKFSNCPKIECRYCHKHGHILDNCPTRPPRPPGTSTKEKIFTKHGSSSVVAATSDDSSLIQISDLQSLLNQLISSSSTLAVSSGNRWLLDSACCNHMTSDSSLMSTSSPTKSLPPIYAADGNCMNISHTGTIDTPSVHLPHTYCVPNLTFNLVSVGQLCDLGLNVSFSPNGCQVQDPQTGQTIGTGRKVGRLFELTSLRVSSPSSISASVTDSDTYQWHLRLGHASSEKLRHLISVNNLTNLTKFVPFNCLNCKLAKQPALSFSQSISNCDKPFDLVHSDIWGPAPITTVHGYRYYVLFIDDYSRFTWIYFLKHRSELSRTYIEFANMIRTQFSSPIKILRTDNVLEYKDSILLSFLSQQGTIVQRSCPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNTSPFEKLYGISPDYSKLKVFGSACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEPTLDTELAQSSPATANLDPPSVFDDVPESPPATPLRRSTRVREPPSHLTHACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEPTLDTELAQSSPATANLDPPSVSDDVPESSPATPLRRSTRVREPPPHLTDYHCFSTIVSLVEPTSYQEASINPVWQKAMDEELQALEKTHTWDYVDLPPGKRPIGCKWIYKIKTHSDGTIERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLFQMDVKNAFLNGNLSEEVYMKPPQGTSPPPNKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSSHDNALFTRQTTHGIVLLLLYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEVSHRSDGYLLSQAKYASDLIARSGITDSTTSSTPLDPHVHLTPFDGVPLDDASLYRQLVGSLIYLTVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDADWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVISRSSTESEYRALADATAELIWLRWLLADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFVRHHLLSNTLLLRSVSTIEQPADIFTKALPSNRFCHLLTKLKLIATLPP*
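
The protein sequence corresponds to a codon-by-codon translation of the coding sequence. The sketch below illustrates that relationship with the standard genetic code, which is an assumption here because the page does not say which translation table was used. `translate` is a hypothetical helper, not part of any database tooling.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(cds):
    """Translate a coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

# First six and last three codons of the CDS listed above.
print(translate("ATGGAGAGAGACCATATC"))  # MERDHI
print(translate("CCACCTTGA"))           # PP*
```

The two fragments reproduce the start (MERDHI) and the end (PP*) of the protein sequence shown above.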
Homology
BLAST of CSPI04G12200 vs. ExPASy Swiss-Prot
Match: Q94HW2 (Retrovirus-related Pol polyprotein from transposon RE1 OS=Arabidopsis thaliana OX=3702 GN=RE1 PE=2 SV=1)

HSP 1 Score: 704.1 bits (1816), Expect = 3.2e-201
Identity = 506/1494 (33.87%), Positives = 749/1494 (50.13%), Query Frame = 0

Query: 12   TILDGTNYITWANQMKSFLIGRKLWRIVTGDITKPTAKEGDNTFIERLED---WDSKNHQ 71
            T L  TNY+ W+ Q+ +   G +L   + G  T P A  G +       D   W  ++  
Sbjct: 24   TKLTSTNYLMWSRQVHALFDGYELAGFLDGSTTMPPATIGTDAAPRVNPDYTRWKRQDKL 83

Query: 72   IIT-----------------------W-----------LAHYYQLHSTLVNLNQDIGQSV 131
            I +                       W             H  QL + L    +   +++
Sbjct: 84   IYSAVLGAISMSVQPAVSRATTAAQIWETLRKIYANPSYGHVTQLRTQLKQWTKGT-KTI 143

Query: 132  NEYLAVLQPIWTQLDQANISKDHLRLI-KVLMGLRPEYESVRAALLHRNPLPSLDAAIQE 191
            ++Y+  L   + QL       DH   + +VL  L  EY+ V   +  ++  P+L    + 
Sbjct: 144  DDYMQGLVTRFDQLALLGKPMDHDEQVERVLENLPEEYKPVIDQIAAKDTPPTLTEIHER 203

Query: 192  IL-FEEKRLGINSTKQSDVVLASTYTPNRAANMFCKNCKLSGHKFSNCPKIECRYCHKHG 251
            +L  E K L ++S      V+  T       N    N   +G++ +       RY +++ 
Sbjct: 204  LLNHESKILAVSSA----TVIPITANAVSHRNTTTTNNNNNGNRNN-------RYDNRN- 263

Query: 252  HILDNCPTRPPRPPGTS---TKEKIFTKHGSSSVVAATSDDSSLIQISDLQSLLNQLISS 311
               +N  ++P +   T+      +     G   +       +   + S LQ  L+ + S 
Sbjct: 264  ---NNNNSKPWQQSSTNFHPNNNQSKPYLGKCQICGVQGHSAK--RCSQLQHFLSSVNSQ 323

Query: 312  S-----------STLAVS---SGNRWLLDSACCNHMTSDSSLMSTSSPTKSLPPIYAADG 371
                        + LA+    S N WLLDS   +H+TSD + +S   P      +  ADG
Sbjct: 324  QPPSPFTPWQPRANLALGSPYSSNNWLLDSGATHHITSDFNNLSLHQPYTGGDDVMVADG 383

Query: 372  NCMNISHTG--TIDTPS--VHLPHTYCVPNLTFNLVSVGQLCDL-GLNVSFSPNGCQVQD 431
            + + ISHTG  ++ T S  ++L +   VPN+  NL+SV +LC+  G++V F P   QV+D
Sbjct: 384  STIPISHTGSTSLSTKSRPLNLHNILYVPNIHKNLISVYRLCNANGVSVEFFPASFQVKD 443

Query: 432  PQTGQTIGTGRKVGRLFELTSLRVSSPSSISASVTDSDTY-QWHLRLGHASSEKLRHLIS 491
              TG  +  G+    L+E   +  S P S+ AS +   T+  WH RLGH +   L  +IS
Sbjct: 444  LNTGVPLLQGKTKDELYE-WPIASSQPVSLFASPSSKATHSSWHARLGHPAPSILNSVIS 503

Query: 492  VNNLTNLTKFVPF-NCLNCKLAKQPALSFSQSISNCDKPFDLVHSDIWGPAPITTVHGYR 551
              +L+ L     F +C +C + K   + FSQS  N  +P + ++SD+W  +PI +   YR
Sbjct: 504  NYSLSVLNPSHKFLSCSDCLINKSNKVPFSQSTINSTRPLEYIYSDVWS-SPILSHDNYR 563

Query: 552  YYVLFIDDYSRFTWIYFLKHRSELSRTYIEFANMIRTQFSSPIKILRTDNVLEYKDSILL 611
            YYV+F+D ++R+TW+Y LK +S++  T+I F N++  +F + I    +DN  E+    L 
Sbjct: 564  YYVIFVDHFTRYTWLYPLKQKSQVKETFITFKNLLENRFQTRIGTFYSDNGGEF--VALW 623

Query: 612  SFLSQQGTIVQRSCPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTI 671
             + SQ G     S PHT + NG +ERKHRHI+++   LL  AS P+ +W  A   +VY I
Sbjct: 624  EYFSQHGISHLTSPPHTPEHNGLSERKHRHIVETGLTLLSHASIPKTYWPYAFAVAVYLI 683

Query: 672  NRLPSSVLQNTSPFEKLYGISPDYSKLKVFGSACFVLLHPHEHNKLEPRARLCCFLGYGT 731
            NRLP+ +LQ  SPF+KL+G SP+Y KL+VFG AC+  L P+  +KL+ ++R C FLGY  
Sbjct: 684  NRLPTPLLQLESPFQKLFGTSPNYDKLRVFGCACYPWLRPYNQHKLDDKSRQCVFLGYSL 743

Query: 732  EHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEPT 791
                + C    ++RL ISRHV F E+      S++  + S  Q     +S    P    T
Sbjct: 744  TQSAYLCLHLQTSRLYISRHVRFDENCF--PFSNYLATLSPVQEQRRESSCVWSP--HTT 803

Query: 792  LDTELAQSSPATANLDPPSVFDDVPESPPATPLRRSTRVREPPSHLTHACFVLLHPHEHN 851
            L T      PA +  DP       P S P+ P R S    +  S    + F    P    
Sbjct: 804  LPTR-TPVLPAPSCSDPHHA--ATPPSSPSAPFRNS----QVSSSNLDSSFSSSFPSSPE 863

Query: 852  KLEPRARLCCFLGYGTEHKGFRCWDPLSNRLRI-SRHVTFWEHTMFSRLSSFHTSFSSPQ 911
               PR           ++       P   + +  S   T   +      S    S S+P 
Sbjct: 864  PTAPR-----------QNGPQPTTQPTQTQTQTHSSQNTSQNNPTNESPSQLAQSLSTPA 923

Query: 912  SFFTNTSVDLFPLSEPTLDTELAQSSPATANL---DPPSVSDDVPESSPATPLRRSTRVR 971
               +++         PT     + +SP   ++    PP ++  V  ++ A     S   R
Sbjct: 924  QSSSSS-------PSPTTSASSSSTSPTPPSILIHPPPPLAQIVNNNNQAPLNTHSMGTR 983

Query: 972  EPPPHLTDYHCFSTIVSLV---EPTSYQEASINPVWQKAMDEELQALEKTHTWDYVDLPP 1031
                 +     +S  VSL    EP +  +A  +  W+ AM  E+ A    HTWD V  PP
Sbjct: 984  AKAGIIKPNPKYSLAVSLAAESEPRTAIQALKDERWRNAMGSEINAQIGNHTWDLVPPPP 1043

Query: 1032 GKRPI-GCKWIYKIKTHSDGTIERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLA 1091
                I GC+WI+  K +SDG++ RYKARLVAKGY+Q  G+DY ETF+PV + TS+R +L 
Sbjct: 1044 SHVTIVGCRWIFTKKYNSDGSLNRYKARLVAKGYNQRPGLDYAETFSPVIKSTSIRIVLG 1103

Query: 1092 VAAAKQWPLFQMDVKNAFLNGNLSEEVYMKPPQG--TSPPPNKVCLLRRALYGLKQAPRA 1151
            VA  + WP+ Q+DV NAFL G L+++VYM  P G      PN VC LR+ALYGLKQAPRA
Sbjct: 1104 VAVDRSWPIRQLDVNNAFLQGTLTDDVYMSQPPGFIDKDRPNYVCKLRKALYGLKQAPRA 1163

Query: 1152 WFATFSSTITQLGFTSSSHDNALFTRQTTHGIVLLLLYVDDMIITGNDQQAISDLQQYLG 1211
            W+    + +  +GF +S  D +LF  Q    IV +L+YVDD++ITGND   + +    L 
Sbjct: 1164 WYVELRNYLLTIGFVNSVSDTSLFVLQRGKSIVYMLVYVDDILITGNDPTLLHNTLDNLS 1223

Query: 1212 QHFEMKDLGSLNYFLGLEVSHRSDGYLLSQAKYASDLIARSGITDSTTSSTPLDPHVHLT 1271
            Q F +KD   L+YFLG+E      G  LSQ +Y  DL+AR+ +  +   +TP+ P   L+
Sbjct: 1224 QRFSVKDHEELHYFLGIEAKRVPTGLHLSQRRYILDLLARTNMITAKPVTTPMAPSPKLS 1283

Query: 1272 PFDGVPLDDASLYRQLVGSLIYLTVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVK 1331
             + G  L D + YR +VGSL YL  TRPDI+YAV+ +SQFM  P   H  A+ RILRY+ 
Sbjct: 1284 LYSGTKLTDPTEYRGIVGSLQYLAFTRPDISYAVNRLSQFMHMPTEEHLQALKRILRYLA 1343

Query: 1332 GTLGHGLQFSSQSSLVLSGYSDADWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVISRS 1391
            GT  HG+     ++L L  YSDADWAGD  D  ST GY  YLG   ISW SKKQ  + RS
Sbjct: 1344 GTPNHGIFLKKGNTLSLHAYSDADWAGDKDDYVSTNGYIVYLGHHPISWSSKKQKGVVRS 1403

Query: 1392 STESEYRALADATAELIWLRWLLADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIE 1433
            STE+EYR++A+ ++E+ W+  LL ++G+    P +++CDN  A  +  N VFH R KHI 
Sbjct: 1404 STEAEYRSVANTSSEMQWICSLLTELGIRLTRPPVIYCDNVGATYLCANPVFHSRMKHIA 1463
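
The identity and positives percentages in the HSP summary for the Q94HW2 hit follow directly from the reported counts over the alignment length. A small sketch of that arithmetic is given here. The helper name `percent` is hypothetical, and rounding to two decimals is assumed from the way the figures are printed.

```python
def percent(matches, aligned_length):
    """Return matches / aligned_length as a percentage, rounded to 2 decimals."""
    return round(100.0 * matches / aligned_length, 2)

# Counts taken from the Q94HW2 HSP summary line.
identity = percent(506, 1494)
positives = percent(749, 1494)
print(identity, positives)  # 33.87 50.13
```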

BLAST of CSPI04G12200 vs. ExPASy Swiss-Prot
Match: Q9ZT94 (Retrovirus-related Pol polyprotein from transposon RE2 OS=Arabidopsis thaliana OX=3702 GN=RE2 PE=4 SV=1)

HSP 1 Score: 702.2 bits (1811), Expect = 1.2e-200
Identity = 504/1508 (33.42%), Positives = 744/1508 (49.34%), Query Frame = 0

Query: 12   TILDGTNYITWANQMKSFLIGRKLWRIVTGDITKPTAKEGDNTFIERLEDWDSKNHQIIT 71
            T L  TNY+ W+ Q+ +   G +L   + G    P A  G +  + R+      N     
Sbjct: 24   TKLTSTNYLMWSRQVHALFDGYELAGFLDGSTPMPPATIGTDA-VPRV------NPDYTR 83

Query: 72   WLAHYYQLHSTLVN-LNQDIGQSVNEYLAVLQPIWTQLDQ--ANISKDH---LRLI---- 131
            W      ++S ++  ++  +  +V+      Q IW  L +  AN S  H   LR I    
Sbjct: 84   WRRQDKLIYSAILGAISMSVQPAVSRATTAAQ-IWETLRKIYANPSYGHVTQLRFITRFD 143

Query: 132  ----------------KVLMGLRPEYESVRAALLHRNPLPSLDAAIQEIL-FEEKRLGIN 191
                            +VL  L  +Y+ V   +  ++  PSL    + ++  E K L +N
Sbjct: 144  QLALLGKPMDHDEQVERVLENLPDDYKPVIDQIAAKDTPPSLTEIHERLINRESKLLALN 203

Query: 192  STK----QSDVVLASTYTPNRAANMFCKNCKL--------------SGHKFSN-CPKI-- 251
            S +     ++VV       NR  N    N                 SG +  N  PK   
Sbjct: 204  SAEVVPITANVVTHRNTNTNRNQNNRGDNRNYNNNNNRSNSWQPSSSGSRSDNRQPKPYL 263

Query: 252  -ECRYCHKHGHILDNCPTRPPRPPGTSTKEKIFTKHGSSSVVAATSDDSSLIQISDLQSL 311
              C+ C   GH    CP                                   Q+   QS 
Sbjct: 264  GRCQICSVQGHSAKRCP-----------------------------------QLHQFQST 323

Query: 312  LNQLISSS--------STLAVSS---GNRWLLDSACCNHMTSDSSLMSTSSPTKSLPPIY 371
             NQ  S+S        + LAV+S    N WLLDS   +H+TSD + +S   P      + 
Sbjct: 324  TNQQQSTSPFTPWQPRANLAVNSPYNANNWLLDSGATHHITSDFNNLSFHQPYTGGDDVM 383

Query: 372  AADGNCMNISHTGTIDTP----SVHLPHTYCVPNLTFNLVSVGQLCDLG-LNVSFSPNGC 431
             ADG+ + I+HTG+   P    S+ L     VPN+  NL+SV +LC+   ++V F P   
Sbjct: 384  IADGSTIPITHTGSASLPTSSRSLDLNKVLYVPNIHKNLISVYRLCNTNRVSVEFFPASF 443

Query: 432  QVQDPQTGQTIGTGRKVGRLFELTSLRVSSPSSISASVTDSDTYQWHLRLGHASSEKLRH 491
            QV+D  TG  +  G+    L+E       + S  ++  + +    WH RLGH S   L  
Sbjct: 444  QVKDLNTGVPLLQGKTKDELYEWPIASSQAVSMFASPCSKATHSSWHSRLGHPSLAILNS 503

Query: 492  LISVNNLTNLT-KFVPFNCLNCKLAKQPALSFSQSISNCDKPFDLVHSDIWGPAPITTVH 551
            +IS ++L  L       +C +C + K   + FS S     KP + ++SD+W  +PI ++ 
Sbjct: 504  VISNHSLPVLNPSHKLLSCSDCFINKSHKVPFSNSTITSSKPLEYIYSDVWS-SPILSID 563

Query: 552  GYRYYVLFIDDYSRFTWIYFLKHRSELSRTYIEFANMIRTQFSSPIKILRTDNVLEYKDS 611
             YRYYV+F+D ++R+TW+Y LK +S++  T+I F +++  +F + I  L +DN  E+   
Sbjct: 564  NYRYYVIFVDHFTRYTWLYPLKQKSQVKDTFIIFKSLVENRFQTRIGTLYSDNGGEF--V 623

Query: 612  ILLSFLSQQGTIVQRSCPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSV 671
            +L  +LSQ G     S PHT + NG +ERKHRHI++    LL  AS P+ +W  A   +V
Sbjct: 624  VLRDYLSQHGISHFTSPPHTPEHNGLSERKHRHIVEMGLTLLSHASVPKTYWPYAFSVAV 683

Query: 672  YTINRLPSSVLQNTSPFEKLYGISPDYSKLKVFGSACFVLLHPHEHNKLEPRARLCCFLG 731
            Y INRLP+ +LQ  SPF+KL+G  P+Y KLKVFG AC+  L P+  +KLE +++ C F+G
Sbjct: 684  YLINRLPTPLLQLQSPFQKLFGQPPNYEKLKVFGCACYPWLRPYNRHKLEDKSKQCAFMG 743

Query: 732  YGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLS 791
            Y      + C    + RL  SRHV F E       ++F  S S  Q    + S   +P S
Sbjct: 744  YSLTQSAYLCLHIPTGRLYTSRHVQFDERCFPFSTTNFGVSTSQEQ---RSDSAPNWP-S 803

Query: 792  EPTLDTELAQSSPATANLDPPSVFDDVPESPPATPLRRSTRVRE---PPSHLTHACFVLL 851
              TL T      PA   L P    D  P  P +     +T+V     P S ++       
Sbjct: 804  HTTLPT-TPLVLPAPPCLGPH--LDTSPRPPSSPSPLCTTQVSSSNLPSSSISSPSSSEP 863

Query: 852  HPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTS 911
                HN  +P A+                            H T   ++    L++ + +
Sbjct: 864  TAPSHNGPQPTAQ---------------------------PHQTQNSNSNSPILNNPNPN 923

Query: 912  FSSPQSFFTNTSVDLFPLSE---PTLDTELAQ-SSPATANLDPP------------SVSD 971
              SP S   N+ +   P+S    PT  T +++ +SP++++   P             V+ 
Sbjct: 924  SPSPNSPNQNSPLPQSPISSPHIPTPSTSISEPNSPSSSSTSTPPLPPVLPAPPIIQVNA 983

Query: 972  DVPESSPATPLRRSTRVREPPPHLTDYHCFSTIVSLVEPTSYQEASINPVWQKAMDEELQ 1031
              P ++ +   R    +R+P      Y   +++ +  EP +  +A  +  W++AM  E+ 
Sbjct: 984  QAPVNTHSMATRAKDGIRKPN---QKYSYATSLAANSEPRTAIQAMKDDRWRQAMGSEIN 1043

Query: 1032 ALEKTHTWDYVDLPPGKRPI-GCKWIYKIKTHSDGTIERYKARLVAKGYSQEYGIDYEET 1091
            A    HTWD V  PP    I GC+WI+  K +SDG++ RYKARLVAKGY+Q  G+DY ET
Sbjct: 1044 AQIGNHTWDLVPPPPPSVTIVGCRWIFTKKFNSDGSLNRYKARLVAKGYNQRPGLDYAET 1103

Query: 1092 FAPVARMTSVRSLLAVAAAKQWPLFQMDVKNAFLNGNLSEEVYMKPPQG--TSPPPNKVC 1151
            F+PV + TS+R +L VA  + WP+ Q+DV NAFL G L++EVYM  P G      P+ VC
Sbjct: 1104 FSPVIKSTSIRIVLGVAVDRSWPIRQLDVNNAFLQGTLTDEVYMSQPPGFVDKDRPDYVC 1163

Query: 1152 LLRRALYGLKQAPRAWFATFSSTITQLGFTSSSHDNALFTRQTTHGIVLLLLYVDDMIIT 1211
             LR+A+YGLKQAPRAW+    + +  +GF +S  D +LF  Q    I+ +L+YVDD++IT
Sbjct: 1164 RLRKAIYGLKQAPRAWYVELRTYLLTVGFVNSISDTSLFVLQRGRSIIYMLVYVDDILIT 1223

Query: 1212 GNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEVSHRSDGYLLSQAKYASDLIARSGITD 1271
            GND   +      L Q F +K+   L+YFLG+E      G  LSQ +Y  DL+AR+ +  
Sbjct: 1224 GNDTVLLKHTLDALSQRFSVKEHEDLHYFLGIEAKRVPQGLHLSQRRYTLDLLARTNMLT 1283

Query: 1272 STTSSTPLDPHVHLTPFDGVPLDDASLYRQLVGSLIYLTVTRPDIAYAVHIVSQFMAAPR 1331
            +   +TP+     LT   G  L D + YR +VGSL YL  TRPD++YAV+ +SQ+M  P 
Sbjct: 1284 AKPVATPMATSPKLTLHSGTKLPDPTEYRGIVGSLQYLAFTRPDLSYAVNRLSQYMHMPT 1343

Query: 1332 TIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDADWAGDPTDRRSTTGYCFYLGDS 1391
              H+ A+ R+LRY+ GT  HG+     ++L L  YSDADWAGD  D  ST GY  YLG  
Sbjct: 1344 DDHWNALKRVLRYLAGTPDHGIFLKKGNTLSLHAYSDADWAGDTDDYVSTNGYIVYLGHH 1403

Query: 1392 LISWRSKKQSVISRSSTESEYRALADATAELIWLRWLLADMGVPQQGPTLLHCDNRSAIQ 1432
             ISW SKKQ  + RSSTE+EYR++A+ ++EL W+  LL ++G+    P +++CDN  A  
Sbjct: 1404 PISWSSKKQKGVVRSSTEAEYRSVANTSSELQWICSLLTELGIQLSHPPVIYCDNVGATY 1448

BLAST of CSPI04G12200 vs. ExPASy Swiss-Prot
Match: P10978 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum OX=4097 PE=2 SV=1)

HSP 1 Score: 516.5 bits (1329), Expect = 9.4e-145
Identity = 422/1452 (29.06%), Positives = 667/1452 (45.94%), Query Frame = 0

Query: 19   YITWANQMKSFLIGRKLWRIVTGDITKP-TAKEGDNTFIE-------RLEDWDSKNHQII 78
            + TW  +M+  LI + L +++  D  KP T K  D   ++       RL   D   + II
Sbjct: 17   FSTWQRRMRDLLIQQGLHKVLDVDSKKPDTMKAEDWADLDERAASAIRLHLSDDVVNNII 76

Query: 79   -------TW-----------LAHYYQLHSTLVNLNQDIGQSVNEYLAVLQPIWTQLDQAN 138
                    W           L +   L   L  L+   G +   +L V   + TQL    
Sbjct: 77   DEDTARGIWTRLESLYMSKTLTNKLYLKKQLYALHMSEGTNFLSHLNVFNGLITQLANLG 136

Query: 139  IS-KDHLRLIKVLMGLRPEYESVRAALLHRNPLPSLDAAIQEILFEEK-RLGINSTKQSD 198
            +  ++  + I +L  L   Y+++   +LH      L      +L  EK R    +  Q+ 
Sbjct: 137  VKIEEEDKAILLLNSLPSSYDNLATTILHGKTTIELKDVTSALLLNEKMRKKPENQGQAL 196

Query: 199  VVLASTYTPNRAANMFCKNCKLSGHKFSNCPKI-ECRYCHKHGHILDNCPTRPPRPPGTS 258
            +      +  R++N + ++      K  +  ++  C  C++ GH   +CP  P +  G +
Sbjct: 197  ITEGRGRSYQRSSNNYGRSGARGKSKNRSKSRVRNCYNCNQPGHFKRDCP-NPRKGKGET 256

Query: 259  TKEKIFTKHGSSSVVAATSDDSSLIQISDLQSLLNQLISSSSTLAVSS-GNRWLLDSACC 318
            + +K              +DD++   + +  +++  +      + +S   + W++D+A  
Sbjct: 257  SGQK--------------NDDNTAAMVQNNDNVVLFINEEEECMHLSGPESEWVVDTAAS 316

Query: 319  NHMTSDSSLM---------STSSPTKSLPPIYAADGNCMNISHTGTIDTPSVHLPHTYCV 378
            +H T    L          +      S   I      C+  +   T+    V       V
Sbjct: 317  HHATPVRDLFCRYVAGDFGTVKMGNTSYSKIAGIGDICIKTNVGCTLVLKDVR-----HV 376

Query: 379  PNLTFNLVSVGQLCDLGLNVSFSPNGCQVQDPQTGQTIGTGRKVGRLFELTSLRVSSPSS 438
            P+L  NL+S   L   G    F+    ++   +    I  G   G L+  T+  +     
Sbjct: 377  PDLRMNLISGIALDRDGYESYFANQKWRL--TKGSLVIAKGVARGTLYR-TNAEICQ-GE 436

Query: 439  ISASVTDSDTYQWHLRLGHASSEKLRHLISVNNLTNLTKFVPFNCLNCKLAKQPALSFSQ 498
            ++A+  +     WH R+GH S + L+ L   + ++         C  C   KQ  +SF  
Sbjct: 437  LNAAQDEISVDLWHKRMGHMSEKGLQILAKKSLISYAKGTTVKPCDYCLFGKQHRVSFQT 496

Query: 499  SISNCDKPFDLVHSDIWGPAPITTVHGYRYYVLFIDDYSRFTWIYFLKHRSELSRTYIEF 558
            S        DLV+SD+ GP  I ++ G +Y+V FIDD SR  W+Y LK + ++ + + +F
Sbjct: 497  SSERKLNILDLVYSDVCGPMEIESMGGNKYFVTFIDDASRKLWVYILKTKDQVFQVFQKF 556

Query: 559  ANMIRTQFSSPIKILRTDNVLEYKDSILLSFLSQQGTIVQRSCPHTSQQNGRAERKHRHI 618
              ++  +    +K LR+DN  EY       + S  G   +++ P T Q NG AER +R I
Sbjct: 557  HALVERETGRKLKRLRSDNGGEYTSREFEEYCSSHGIRHEKTVPGTPQHNGVAERMNRTI 616

Query: 619  LDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNTSPFEKLYGISPDYSKLKVFG 678
            ++ VR++L  A  P+ FWGEA  T+ Y INR PS  L    P          YS LKVFG
Sbjct: 617  VEKVRSMLRMAKLPKSFWGEAVQTACYLINRSPSVPLAFEIPERVWTNKEVSYSHLKVFG 676

Query: 679  SACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSR 738
               F  +   +  KL+ ++  C F+GYG E  G+R WDP+  ++  SR V F E  +   
Sbjct: 677  CRAFAHVPKEQRTKLDDKSIPCIFIGYGDEEFGYRLWDPVKKKVIRSRDVVFRESEV--- 736

Query: 739  LSSFHTSFSSPQSFFTNTSVDLFPLSEPTLDTELAQSSPATAN--LDPPSVFDDVPESPP 798
                             T+ D+    +  +      + P+T+N      S  D+V E   
Sbjct: 737  ----------------RTAADMSEKVKNGIIPNFV-TIPSTSNNPTSAESTTDEVSEQG- 796

Query: 799  ATPLRRSTRVREPPSHLTHACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSN 858
                 +   V E    L      + HP +  +               +H+  R     S 
Sbjct: 797  ----EQPGEVIEQGEQLDEGVEEVEHPTQGEE---------------QHQPLR----RSE 856

Query: 859  RLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEPTLDTELAQSSPATA 918
            R R+                                    +P +E  L            
Sbjct: 857  RPRVESR--------------------------------RYPSTEYVL------------ 916

Query: 919  NLDPPSVSDDVPESSPATPLRRSTRVREPPPHLTDYHCFSTIVSLVEPTSYQEASINPVW 978
                  +SDD                REP              SL E  S+ E +     
Sbjct: 917  ------ISDD----------------REPE-------------SLKEVLSHPEKN---QL 976

Query: 979  QKAMDEELQALEKTHTWDYVDLPPGKRPIGCKWIYKIKTHSDGTIERYKARLVAKGYSQE 1038
             KAM EE+++L+K  T+  V+LP GKRP+ CKW++K+K   D  + RYKARLV KG+ Q+
Sbjct: 977  MKAMQEEMESLQKNGTYKLVELPKGKRPLKCKWVFKLKKDGDCKLVRYKARLVVKGFEQK 1036

Query: 1039 YGIDYEETFAPVARMTSVRSLLAVAAAKQWPLFQMDVKNAFLNGNLSEEVYMKPPQGTSP 1098
             GID++E F+PV +MTS+R++L++AA+    + Q+DVK AFL+G+L EE+YM+ P+G   
Sbjct: 1037 KGIDFDEIFSPVVKMTSIRTILSLAASLDLEVEQLDVKTAFLHGDLEEEIYMEQPEGFEV 1096

Query: 1099 PPNK--VCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSSHDNAL-FTRQTTHGIVLLL 1158
               K  VC L ++LYGLKQAPR W+  F S +    +  +  D  + F R + +  ++LL
Sbjct: 1097 AGKKHMVCKLNKSLYGLKQAPRQWYMKFDSFMKSQTYLKTYSDPCVYFKRFSENNFIILL 1156

Query: 1159 LYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEV--SHRSDGYLLSQAKYA 1218
            LYVDDM+I G D+  I+ L+  L + F+MKDLG     LG+++     S    LSQ KY 
Sbjct: 1157 LYVDDMLIVGKDKGLIAKLKGDLSKSFDMKDLGPAQQILGMKIVRERTSRKLWLSQEKYI 1216

Query: 1219 SDLIARSGITDSTTSSTPLDPHVHL------TPFDGVPLDDASLYRQLVGSLIYLTV-TR 1278
              ++ R  + ++   STPL  H+ L      T  +         Y   VGSL+Y  V TR
Sbjct: 1217 ERVLERFNMKNAKPVSTPLAGHLKLSKKMCPTTVEEKGNMAKVPYSSAVGSLMYAMVCTR 1276

Query: 1279 PDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDADWAG 1338
            PDIA+AV +VS+F+  P   H+ AV  ILRY++GT G  L F   S  +L GY+DAD AG
Sbjct: 1277 PDIAHAVGVVSRFLENPGKEHWEAVKWILRYLRGTTGDCLCFGG-SDPILKGYTDADMAG 1316

Query: 1339 DPTDRRSTTGYCFYLGDSLISWRSKKQSVISRSSTESEYRALADATAELIWLRWLLADMG 1398
            D  +R+S+TGY F      ISW+SK Q  ++ S+TE+EY A  +   E+IWL+  L ++G
Sbjct: 1337 DIDNRKSSTGYLFTFSGGAISWQSKLQKCVALSTTEAEYIAATETGKEMIWLKRFLQELG 1316

Query: 1399 VPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFVRHHLLSNTLLLRSVSTIEQPA 1418
            + Q+   +++CD++SAI ++ N ++H RTKHI+   H++R  +   +L +  +ST E PA
Sbjct: 1397 LHQK-EYVVYCDSQSAIDLSKNSMYHARTKHIDVRYHWIREMVDDESLKVLKISTNENPA 1316

BLAST of CSPI04G12200 vs. ExPASy Swiss-Prot
Match: P04146 (Copia protein OS=Drosophila melanogaster OX=7227 GN=GIP PE=1 SV=3)

HSP 1 Score: 490.3 bits (1261), Expect = 7.2e-137
Identity = 407/1483 (27.44%), Positives = 683/1483 (46.06%), Query Frame = 0

Query: 15   DGTNYITWANQMKSFLIGRKLWRIVTG-------DITKPTAKEGDNTFIERLED------ 74
            DG  Y  W  ++++ L  + + ++V G       D  K   +   +T IE L D      
Sbjct: 12   DGEKYAIWKFRIRALLAEQDVLKVVDGLMPNEVDDSWKKAERCAKSTIIEYLSDSFLNFA 71

Query: 75   -WDSKNHQIITWLAHYYQ---------LHSTLVNLNQDIGQSVNEYLAVLQPIWTQLDQA 134
              D    QI+  L   Y+         L   L++L      S+  +  +   + ++L  A
Sbjct: 72   TSDITARQILENLDAVYERKSLASQLALRKRLLSLKLSSEMSLLSHFHIFDELISELLAA 131

Query: 135  NISKDHLRLIKVLMGLRPE-YESVRAAL--LHRNPLPSLDAAIQEILFEEKRLGINSTKQ 194
                + +  I  L+   P  Y+ +  A+  L    L +L      +L +E ++  +    
Sbjct: 132  GAKIEEMDKISHLLITLPSCYDGIITAIETLSEENL-TLAFVKNRLLDQEIKIKNDHNDT 191

Query: 195  SDVVL-------ASTYTPNRAANMFCKNCKLSGHKFSNCPKIECRYCHKHGHILDNCPTR 254
            S  V+        +TY  N   N   K  K+   K ++  K++C +C + GHI  +C   
Sbjct: 192  SKKVMNAIVHNNNNTYKNNLFKNRVTKPKKI--FKGNSKYKVKCHHCGREGHIKKDC-FH 251

Query: 255  PPRPPGTSTKEKIFTKHGSSSVVAATSDDSSLIQISDLQSLLNQLISSSSTLAVSSGNRW 314
              R      KE          V  ATS   +             ++   +  +V     +
Sbjct: 252  YKRILNNKNKE------NEKQVQTATSHGIAF------------MVKEVNNTSVMDNCGF 311

Query: 315  LLDSACCNHMTSDSSLMSTSSPTKSLPPI---YAADGNCMNISHTGTIDTPSVH---LPH 374
            +LDS   +H+ +D SL + S   + +PP+    A  G  +  +  G +   + H   L  
Sbjct: 312  VLDSGASDHLINDESLYTDS--VEVVPPLKIAVAKQGEFIYATKRGIVRLRNDHEITLED 371

Query: 375  TYCVPNLTFNLVSVGQLCDLGLNVSFSPNGCQVQDPQTGQTIGTGRKVGRLFELTSLRVS 434
                     NL+SV +L + G+++ F  +G  +          +G        L ++ V 
Sbjct: 372  VLFCKEAAGNLMSVKRLQEAGMSIEFDKSGVTISKNGLMVVKNSGM-------LNNVPVI 431

Query: 435  SPSSISASVTDSDTYQ-WHLRLGHASSEKL-----RHLISVNNLTNLTKFVPFNCLNCKL 494
            +  + S +    + ++ WH R GH S  KL     +++ S  +L N  +     C  C  
Sbjct: 432  NFQAYSINAKHKNNFRLWHERFGHISDGKLLEIKRKNMFSDQSLLNNLELSCEICEPCLN 491

Query: 495  AKQPALSFSQ--SISNCDKPFDLVHSDIWGPAPITTVHGYRYYVLFIDDYSRFTWIYFLK 554
             KQ  L F Q    ++  +P  +VHSD+ GP    T+    Y+V+F+D ++ +   Y +K
Sbjct: 492  GKQARLPFKQLKDKTHIKRPLFVVHSDVCGPITPVTLDDKNYFVIFVDQFTHYCVTYLIK 551

Query: 555  HRSELSRTYIEFANMIRTQFSSPIKILRTDNVLEYKDSILLSFLSQQGTIVQRSCPHTSQ 614
            ++S++   + +F       F+  +  L  DN  EY  + +  F  ++G     + PHT Q
Sbjct: 552  YKSDVFSMFQDFVAKSEAHFNLKVVYLYIDNGREYLSNEMRQFCVKKGISYHLTVPHTPQ 611

Query: 615  QNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNTS--PFEKL 674
             NG +ER  R I +  R ++  A   + FWGEA LT+ Y INR+PS  L ++S  P+E  
Sbjct: 612  LNGVSERMIRTITEKARTMVSGAKLDKSFWGEAVLTATYLINRIPSRALVDSSKTPYEMW 671

Query: 675  YGISPDYSKLKVFGSACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSNRLRI 734
            +   P    L+VFG+  +V +  ++  K + ++    F+GY  E  GF+ WD ++ +  +
Sbjct: 672  HNKKPYLKHLRVFGATVYVHI-KNKQGKFDDKSFKSIFVGY--EPNGFKLWDAVNEKFIV 731

Query: 735  SRHVTFWEHTMF-SRLSSFHTSFSSPQSFFTNTSVDLFPL-SEPTLDTELAQSSPATANL 794
            +R V   E  M  SR   F T F        N +   FP  S   + TE    S    N+
Sbjct: 732  ARDVVVDETNMVNSRAVKFETVFLKDSKESENKN---FPNDSRKIIQTEFPNESKECDNI 791

Query: 795  DPPSVFDDVPESPPATPLRRSTRVREPPSHLTHACFVLLHPHEHNKLEPRARLCCFLGYG 854
                   D  ES                            P++  K+             
Sbjct: 792  ---QFLKDSKESENKN-----------------------FPNDSRKI-----------IQ 851

Query: 855  TEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEP 914
            TE           N  +   ++ F + +  S     + S    +    N S      +E 
Sbjct: 852  TE---------FPNESKECDNIQFLKDSKESNKYFLNESKKRKRDDHLNESKGSGNPNE- 911

Query: 915  TLDTELAQSSPATANLDPPSVSDDVPESSPATPLRRSTRVREPPP-------------HL 974
            + ++E A+       +D P+ +D +   +     RRS R++  P               L
Sbjct: 912  SRESETAEHL-KEIGIDNPTKNDGIEIIN-----RRSERLKTKPQISYNEEDNSLNKVVL 971

Query: 975  TDYHCFSTIVSLVEPTSYQEASINPVWQKAMDEELQALEKTHTWDYVDLPPGKRPIGCKW 1034
              +  F+ + +  +   Y++   +  W++A++ EL A +  +TW     P  K  +  +W
Sbjct: 972  NAHTIFNDVPNSFDEIQYRDDKSS--WEEAINTELNAHKINNTWTITKRPENKNIVDSRW 1031

Query: 1035 IYKIKTHSDGTIERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLF 1094
            ++ +K +  G   RYKARLVA+G++Q+Y IDYEETFAPVAR++S R +L++       + 
Sbjct: 1032 VFSVKYNELGNPIRYKARLVARGFTQKYQIDYEETFAPVARISSFRFILSLVIQYNLKVH 1091

Query: 1095 QMDVKNAFLNGNLSEEVYMKPPQGTSPPPNKVCLLRRALYGLKQAPRAWFATFSSTITQL 1154
            QMDVK AFLNG L EE+YM+ PQG S   + VC L +A+YGLKQA R WF  F   + + 
Sbjct: 1092 QMDVKTAFLNGTLKEEIYMRLPQGISCNSDNVCKLNKAIYGLKQAARCWFEVFEQALKEC 1151

Query: 1155 GFTSSSHDNALF--TRQTTHGIVLLLLYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGS 1214
             F +SS D  ++   +   +  + +LLYVDD++I   D   +++ ++YL + F M DL  
Sbjct: 1152 EFVNSSVDRCIYILDKGNINENIYVLLYVDDVVIATGDMTRMNNFKRYLMEKFRMTDLNE 1211

Query: 1215 LNYFLGLEVSHRSDGYLLSQAKYASDLIARSGITDSTTSSTPLDPHVHLTPFDGVPLDDA 1274
            + +F+G+ +  + D   LSQ+ Y   ++++  + +    STPL   ++    +    D  
Sbjct: 1212 IKHFIGIRIEMQEDKIYLSQSAYVKKILSKFNMENCNAVSTPLPSKINYELLNS-DEDCN 1271

Query: 1275 SLYRQLVGSLIYLTV-TRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQF 1334
            +  R L+G L+Y+ + TRPD+  AV+I+S++ +   +  +  + R+LRY+KGT+   L F
Sbjct: 1272 TPCRSLIGCLMYIMLCTRPDLTTAVNILSRYSSKNNSELWQNLKRVLRYLKGTIDMKLIF 1331

Query: 1335 SSQSSL--VLSGYSDADWAGDPTDRRSTTGYCFYLGD-SLISWRSKKQSVISRSSTESEY 1394
                +    + GY D+DWAG   DR+STTGY F + D +LI W +K+Q+ ++ SSTE+EY
Sbjct: 1332 KKNLAFENKIIGYVDSDWAGSEIDRKSTTGYLFKMFDFNLICWNTKRQNSVAASSTEAEY 1391

Query: 1395 RALADATAELIWLRWLLADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFV 1428
             AL +A  E +WL++LL  + +  + P  ++ DN+  I IA+N   H+R KHI+   HF 
Sbjct: 1392 MALFEAVREALWLKFLLTSINIKLENPIKIYEDNQGCISIANNPSCHKRAKHIDIKYHFA 1401

BLAST of CSPI04G12200 vs. ExPASy Swiss-Prot
Match: P92519 (Uncharacterized mitochondrial protein AtMg00810 OS=Arabidopsis thaliana OX=3702 GN=AtMg00810 PE=4 SV=1)

HSP 1 Score: 199.9 bits (507), Expect = 1.9e-49
Identity = 105/224 (46.88%), Positives = 144/224 (64.29%), Query Frame = 0

Query: 1113 LLLYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEVSHRSDGYLLSQAKYA 1172
            LLLYVDD+++TG+    ++ L   L   F MKDLG ++YFLG+++     G  LSQ KYA
Sbjct: 3    LLLYVDDILLTGSSNTLLNMLIFQLSSTFSMKDLGPVHYFLGIQIKTHPSGLFLSQTKYA 62

Query: 1173 SDLIARSGITDSTTSSTPLDPHVHLTPFDGVPLDDASLYRQLVGSLIYLTVTRPDIAYAV 1232
              ++  +G+ D    STPL   ++ +        D S +R +VG+L YLT+TRPDI+YAV
Sbjct: 63   EQILNNAGMLDCKPMSTPLPLKLN-SSVSTAKYPDPSDFRSIVGALQYLTLTRPDISYAV 122

Query: 1233 HIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDADWAGDPTDRRS 1292
            +IV Q M  P    F  + R+LRYVKGT+ HGL     S L +  + D+DWAG  + RRS
Sbjct: 123  NIVCQRMHEPTLADFDLLKRVLRYVKGTIFHGLYIHKNSKLNVQAFCDSDWAGCTSTRRS 182

Query: 1293 TTGYCFYLGDSLISWRSKKQSVISRSSTESEYRALADATAELIW 1337
            TTG+C +LG ++ISW +K+Q  +SRSSTE+EYRALA   AEL W
Sbjct: 183  TTGFCTFLGCNIISWSAKRQPTVSRSSTETEYRALALTAAELTW 225

BLAST of CSPI04G12200 vs. ExPASy TrEMBL
Match: A0A5A7VIT8 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cucumis melo var. makuwa OX=1194695 GN=E5676_scaffold1017G00050 PE=4 SV=1)

HSP 1 Score: 2217.2 bits (5744), Expect = 0.0e+00
Identity = 1134/1471 (77.09%), Positives = 1194/1471 (81.17%), Query Frame = 0

Query: 1    MERDHIFRPISTILDGTNYITWANQMKSFLIGRKLWRIVTGDITKPTA--KEGDNTFIER 60
            ME+  I RPISTILDG+NYITWANQMKSFLIGRKLWRIVTGDITKPT   KE D+ FIER
Sbjct: 1    MEKSEIARPISTILDGSNYITWANQMKSFLIGRKLWRIVTGDITKPTKQDKEDDSKFIER 60

Query: 61   LEDWDSKNHQIITW----------------------------------LAHYYQLHSTLV 120
            LE+WDSKNHQIITW                                  LAHYYQLH++LV
Sbjct: 61   LEEWDSKNHQIITWLSNTSIPAIHTQFDAFENAKELWNFLSTRFKSVGLAHYYQLHNSLV 120

Query: 121  NLNQDIGQSVNEYLAVLQPIWTQLDQANISKDHLRLIKVLMGLRPEYESVRAALLHRNPL 180
            NLNQ+ GQSVNEYLAVLQPIWTQLDQA ISKDHLRLIKVLMGLRPEYESVRAALLHR+PL
Sbjct: 121  NLNQEAGQSVNEYLAVLQPIWTQLDQATISKDHLRLIKVLMGLRPEYESVRAALLHRSPL 180

Query: 181  PSLDAAIQEILFEEKRLGINSTKQSDVVLASTYTPNRAANMFCKNCKLSGHKFSNCPKIE 240
            PSLDAAIQEILFEE+RLGIN +K SD VLASTY+P  A++ FCKNCKL+GHKF NCPKIE
Sbjct: 181  PSLDAAIQEILFEERRLGINLSKHSDAVLASTYSPPGASSTFCKNCKLTGHKFINCPKIE 240

Query: 241  CRYCHKHGHILDNCPTRPPRPPGTSTKEKIFTKHGSSSVVAATSDDSSL--IQISDLQSL 300
            CRYCHK GHILDNCP +PPRP   ST+ K FTK  +SS      D S++   QISDLQSL
Sbjct: 241  CRYCHKPGHILDNCPIKPPRPRNYSTRAKNFTKPSNSSAATVAPDKSTIPQFQISDLQSL 300

Query: 301  LNQLISS-SSTLAVSSGNRWLLDSACCNHMTSDSSLMSTSSPTKSLPPIYAADGNCMNIS 360
            LNQLISS SS LAVS GNRWLLDS CCNHMTSD SLM+T SPTKSLPPIYAADGNCMNI+
Sbjct: 301  LNQLISSPSSALAVSPGNRWLLDSGCCNHMTSDYSLMNTPSPTKSLPPIYAADGNCMNIT 360

Query: 361  HTGTIDTPSVHLPHTYCVPNLTFNLVSVGQLCDLGLNVSFSPNGCQVQDPQTGQTIGTGR 420
            H GTI+TPS++LPHTYCVPNLTFNLVSVGQLCDLG  VSFS NGCQVQDPQTGQTIGTGR
Sbjct: 361  HMGTINTPSLNLPHTYCVPNLTFNLVSVGQLCDLGFTVSFSSNGCQVQDPQTGQTIGTGR 420

Query: 421  KVGRLFELTSLRVSSPSSISASVTDSDTYQWHLRLGHASSEKLRHLISVNNLTNLTKFVP 480
            KVGRLFEL SL+V SP SISA VTDSDTYQWHLRLGHAS EKLRHLIS+NNL ++TKFVP
Sbjct: 421  KVGRLFELLSLQVPSP-SISAPVTDSDTYQWHLRLGHASPEKLRHLISINNLNSITKFVP 480

Query: 481  FNCLNCKLAKQPALSFSQSISNCDKPFDLVHSDIWGPAPITTVHGYRYYVLFIDDYSRFT 540
            FNCLNCKLAKQPALSFS S S CDKPFDL+HSDIWGPAP +TVHGYRYYVLFIDD+SRFT
Sbjct: 481  FNCLNCKLAKQPALSFSTSTSICDKPFDLIHSDIWGPAPTSTVHGYRYYVLFIDDFSRFT 540

Query: 541  WIYFLKHRSELSRTYIEFANMIRTQFSSPIKILRTDNVLEYKDSILLSFLSQQGTIVQRS 600
            WIYFLKHRSELSRTYIEFANMIRTQFS PIK LRTDN LEYKDS LLSFLSQQGT+VQRS
Sbjct: 541  WIYFLKHRSELSRTYIEFANMIRTQFSCPIKTLRTDNALEYKDSTLLSFLSQQGTLVQRS 600

Query: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNTSP 660
            CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQN SP
Sbjct: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNISP 660

Query: 661  FEKLYGISPDYSKLKVFGSACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720
            FE+LYG  P+YS LKVFG ACFVLL PHEH KLEPRARLCCFLGYGTEHKGFRCWDPLSN
Sbjct: 661  FERLYGTPPNYSNLKVFGCACFVLLQPHEHTKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720

Query: 721  RLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEPTLDTELAQSSPATA 780
            RLRISRHVTFWEHTMFSRLSSFH SFSSPQSFFT+TS+DLFPLSE T   ELAQS+P +A
Sbjct: 721  RLRISRHVTFWEHTMFSRLSSFHASFSSPQSFFTDTSIDLFPLSESTPGNELAQSAPTSA 780

Query: 781  NLDPPSVFDDVPESPPATPLRRSTRVREPPSHLTHACFVLLHPHEHNKLEPRARLCCFLG 840
              D  S+ D  P+ PP                                            
Sbjct: 781  TSDQSSISDGNPDPPP-------------------------------------------- 840

Query: 841  YGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLS 900
                                                                        
Sbjct: 841  ------------------------------------------------------------ 900

Query: 901  EPTLDTELAQSSPATANLDPPSVSDDVPESSPATPLRRSTRVREPPPHLTDYHCFSTIVS 960
                                     D+P        RRSTRVREPP HL DYHCFSTIVS
Sbjct: 901  -------------------------DIPP-------RRSTRVREPPIHLQDYHCFSTIVS 960

Query: 961  LVEPTSYQEASINPVWQKAMDEELQALEKTHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020
            L+EPTSYQEAS +P+WQKAM++ELQALEK HTWDYVDLPPGKRPIGCKWIYKIKTHSDGT
Sbjct: 961  LIEPTSYQEASTDPLWQKAMNDELQALEKMHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020

Query: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLFQMDVKNAFLNG 1080
            IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPL QMDVKNAFLNG
Sbjct: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLLQMDVKNAFLNG 1080

Query: 1081 NLSEEVYMKPPQGTSPPPNKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSSHDNAL 1140
             LSEEVYMKPP GTS PP+KVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSS HD AL
Sbjct: 1081 TLSEEVYMKPPPGTSSPPHKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSPHDTAL 1140

Query: 1141 FTRQTTHGIVLLLLYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEVSHRS 1200
            FTR T  GIVLLLLYVDDMIITGND  AISDLQ YLGQHFEMKDLGSLNYFLGLEVS RS
Sbjct: 1141 FTRHTPQGIVLLLLYVDDMIITGNDPHAISDLQHYLGQHFEMKDLGSLNYFLGLEVSRRS 1200

Query: 1201 DGYLLSQAKYASDLIARSGITDSTTSSTPLDPHVHLTPFDGVPLDDASLYRQLVGSLIYL 1260
            DGYLLSQAKYASDL+ARSGITDS T+STPLDP+VHLTP+DGVPL++ SLYRQLVGSLIYL
Sbjct: 1201 DGYLLSQAKYASDLLARSGITDSNTASTPLDPNVHLTPYDGVPLENVSLYRQLVGSLIYL 1260

Query: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320
            TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA
Sbjct: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320

Query: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVISRSSTESEYRALADATAELIWLRWLL 1380
            DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSV+SRSSTESEYRALADATAEL+WLRWLL
Sbjct: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVVSRSSTESEYRALADATAELLWLRWLL 1334

Query: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFVRHHLLSNTLLLRSVSTI 1433
            ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHF+RHHLLS+TLLL+S+ST 
Sbjct: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFIRHHLLSHTLLLQSISTT 1334

BLAST of CSPI04G12200 vs. ExPASy TrEMBL
Match: A0A5A7T8F2 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cucumis melo var. makuwa OX=1194695 GN=E6C27_scaffold141G00410 PE=4 SV=1)

HSP 1 Score: 2215.7 bits (5740), Expect = 0.0e+00
Identity = 1133/1471 (77.02%), Positives = 1194/1471 (81.17%), Query Frame = 0

Query: 1    MERDHIFRPISTILDGTNYITWANQMKSFLIGRKLWRIVTGDITKPTA--KEGDNTFIER 60
            ME+  I RPISTILDG+NYITWANQMKSFLIGRKLWRIVTGDITKPT   KE D+ FIER
Sbjct: 1    MEKSEIARPISTILDGSNYITWANQMKSFLIGRKLWRIVTGDITKPTKQDKEDDSKFIER 60

Query: 61   LEDWDSKNHQIITW----------------------------------LAHYYQLHSTLV 120
            LE+WDSKNHQIITW                                  LAHYYQLH++LV
Sbjct: 61   LEEWDSKNHQIITWLSNTSIPAIHTQFDAFENAKELWNFLSTRFKSVGLAHYYQLHNSLV 120

Query: 121  NLNQDIGQSVNEYLAVLQPIWTQLDQANISKDHLRLIKVLMGLRPEYESVRAALLHRNPL 180
            NLNQ+ GQSVNEYLAVLQPIWTQLDQA ISKDHLRLIKVLMGLRPEYESVRAALLHR+PL
Sbjct: 121  NLNQEAGQSVNEYLAVLQPIWTQLDQATISKDHLRLIKVLMGLRPEYESVRAALLHRSPL 180

Query: 181  PSLDAAIQEILFEEKRLGINSTKQSDVVLASTYTPNRAANMFCKNCKLSGHKFSNCPKIE 240
            PSLDAAIQEILFEE+RLGIN +K SD VLASTY+P  A++ FCKNCKL+GHKF NCPKIE
Sbjct: 181  PSLDAAIQEILFEERRLGINLSKHSDAVLASTYSPPGASSTFCKNCKLTGHKFINCPKIE 240

Query: 241  CRYCHKHGHILDNCPTRPPRPPGTSTKEKIFTKHGSSSVVAATSDDSSL--IQISDLQSL 300
            CRYCHK GHILDNCP +PPRP   ST+ K FTK  +SS      D S++   QISDLQSL
Sbjct: 241  CRYCHKPGHILDNCPIKPPRPRNYSTRAKNFTKPSNSSAATVALDKSTIPQFQISDLQSL 300

Query: 301  LNQLISS-SSTLAVSSGNRWLLDSACCNHMTSDSSLMSTSSPTKSLPPIYAADGNCMNIS 360
            LNQLISS SS LAVS GNRWLLDS CCNHMTSD SLM+T SPTKSLPPIYAADGNCMNI+
Sbjct: 301  LNQLISSPSSALAVSPGNRWLLDSGCCNHMTSDYSLMNTPSPTKSLPPIYAADGNCMNIT 360

Query: 361  HTGTIDTPSVHLPHTYCVPNLTFNLVSVGQLCDLGLNVSFSPNGCQVQDPQTGQTIGTGR 420
            H GTI+TPS++LPHTYCVPNLTFNLVSVGQLCDLG  VSFS NGCQVQDPQTGQTIGTGR
Sbjct: 361  HMGTINTPSLNLPHTYCVPNLTFNLVSVGQLCDLGFTVSFSSNGCQVQDPQTGQTIGTGR 420

Query: 421  KVGRLFELTSLRVSSPSSISASVTDSDTYQWHLRLGHASSEKLRHLISVNNLTNLTKFVP 480
            KVGRLFE+ SL+V SP SISA VTDSDTYQWHLRLGHAS EKLRHLIS+NNL ++TKFVP
Sbjct: 421  KVGRLFEVLSLQVPSP-SISAPVTDSDTYQWHLRLGHASPEKLRHLISINNLNSITKFVP 480

Query: 481  FNCLNCKLAKQPALSFSQSISNCDKPFDLVHSDIWGPAPITTVHGYRYYVLFIDDYSRFT 540
            FNCLNCKLAKQPALSFS S S CDKPFDL+HSDIWGPAP +TVHGYRYYVLFIDD+SRFT
Sbjct: 481  FNCLNCKLAKQPALSFSTSTSICDKPFDLIHSDIWGPAPTSTVHGYRYYVLFIDDFSRFT 540

Query: 541  WIYFLKHRSELSRTYIEFANMIRTQFSSPIKILRTDNVLEYKDSILLSFLSQQGTIVQRS 600
            WIYFLKHRSELSRTYIEFANMIRTQFS PIK LRTDN LEYKDS LLSFLSQQGT+VQRS
Sbjct: 541  WIYFLKHRSELSRTYIEFANMIRTQFSCPIKTLRTDNALEYKDSTLLSFLSQQGTLVQRS 600

Query: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNTSP 660
            CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQN SP
Sbjct: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNISP 660

Query: 661  FEKLYGISPDYSKLKVFGSACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720
            FE+LYG  P+YS LKVFG ACFVLL PHEH KLEPRARLCCFLGYGTEHKGFRCWDPLSN
Sbjct: 661  FERLYGTPPNYSNLKVFGCACFVLLQPHEHTKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720

Query: 721  RLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEPTLDTELAQSSPATA 780
            RLRISRHVTFWEHTMFSRLSSFH SFSSPQSFFT+TS+DLFPLSE T   ELAQS+P +A
Sbjct: 721  RLRISRHVTFWEHTMFSRLSSFHASFSSPQSFFTDTSIDLFPLSESTPGNELAQSAPTSA 780

Query: 781  NLDPPSVFDDVPESPPATPLRRSTRVREPPSHLTHACFVLLHPHEHNKLEPRARLCCFLG 840
              D  S+ D  P+ PP                                            
Sbjct: 781  TSDQSSISDGNPDPPP-------------------------------------------- 840

Query: 841  YGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLS 900
                                                                        
Sbjct: 841  ------------------------------------------------------------ 900

Query: 901  EPTLDTELAQSSPATANLDPPSVSDDVPESSPATPLRRSTRVREPPPHLTDYHCFSTIVS 960
                                     D+P        RRSTRVREPP HL DYHCFSTIVS
Sbjct: 901  -------------------------DIPP-------RRSTRVREPPIHLQDYHCFSTIVS 960

Query: 961  LVEPTSYQEASINPVWQKAMDEELQALEKTHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020
            L+EPTSYQEAS +P+WQKAM++ELQALEK HTWDYVDLPPGKRPIGCKWIYKIKTHSDGT
Sbjct: 961  LIEPTSYQEASTDPLWQKAMNDELQALEKMHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020

Query: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLFQMDVKNAFLNG 1080
            IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPL QMDVKNAFLNG
Sbjct: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLLQMDVKNAFLNG 1080

Query: 1081 NLSEEVYMKPPQGTSPPPNKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSSHDNAL 1140
             LSEEVYMKPP GTS PP+KVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSS HD AL
Sbjct: 1081 TLSEEVYMKPPPGTSSPPHKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSPHDTAL 1140

Query: 1141 FTRQTTHGIVLLLLYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEVSHRS 1200
            FTR T  GIVLLLLYVDDMIITGND  AISDLQ YLGQHFEMKDLGSLNYFLGLEVS RS
Sbjct: 1141 FTRHTPQGIVLLLLYVDDMIITGNDPHAISDLQHYLGQHFEMKDLGSLNYFLGLEVSRRS 1200

Query: 1201 DGYLLSQAKYASDLIARSGITDSTTSSTPLDPHVHLTPFDGVPLDDASLYRQLVGSLIYL 1260
            DGYLLSQAKYASDL+ARSGITDS T+STPLDP+VHLTP+DGVPL++ SLYRQLVGSLIYL
Sbjct: 1201 DGYLLSQAKYASDLLARSGITDSNTASTPLDPNVHLTPYDGVPLENVSLYRQLVGSLIYL 1260

Query: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320
            TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA
Sbjct: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320

Query: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVISRSSTESEYRALADATAELIWLRWLL 1380
            DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSV+SRSSTESEYRALADATAEL+WLRWLL
Sbjct: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVVSRSSTESEYRALADATAELLWLRWLL 1334

Query: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFVRHHLLSNTLLLRSVSTI 1433
            ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHF+RHHLLS+TLLL+S+ST 
Sbjct: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFIRHHLLSHTLLLQSISTT 1334

BLAST of CSPI04G12200 vs. ExPASy TrEMBL
Match: A0A5D3C0D7 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cucumis melo var. makuwa OX=1194695 GN=E5676_scaffold68G00020 PE=4 SV=1)

HSP 1 Score: 2215.7 bits (5740), Expect = 0.0e+00
Identity = 1133/1471 (77.02%), Positives = 1194/1471 (81.17%), Query Frame = 0

Query: 1    MERDHIFRPISTILDGTNYITWANQMKSFLIGRKLWRIVTGDITKPTA--KEGDNTFIER 60
            ME+  I RPISTILDG+NYITWANQMKSFLIGRKLWRIVTGDITKPT   KE D+ FIER
Sbjct: 1    MEKSEIARPISTILDGSNYITWANQMKSFLIGRKLWRIVTGDITKPTKQDKEDDSKFIER 60

Query: 61   LEDWDSKNHQIITW----------------------------------LAHYYQLHSTLV 120
            LE+WDSKNHQIITW                                  LAHYYQLH++LV
Sbjct: 61   LEEWDSKNHQIITWLSNTSIPAIHTQFDAFENAKELWNFLSTRFKSVGLAHYYQLHNSLV 120

Query: 121  NLNQDIGQSVNEYLAVLQPIWTQLDQANISKDHLRLIKVLMGLRPEYESVRAALLHRNPL 180
            NLNQ+ GQSVNEYLAVLQPIWTQLDQA ISKDHLRLIKVLMGLRPEYESVRAALLHR+PL
Sbjct: 121  NLNQEAGQSVNEYLAVLQPIWTQLDQATISKDHLRLIKVLMGLRPEYESVRAALLHRSPL 180

Query: 181  PSLDAAIQEILFEEKRLGINSTKQSDVVLASTYTPNRAANMFCKNCKLSGHKFSNCPKIE 240
            PSLDAAIQEILFEE+RLGIN +K SD VLASTY+P  A++ FCKNCKL+GHKF NCPKIE
Sbjct: 181  PSLDAAIQEILFEERRLGINLSKHSDAVLASTYSPPGASSTFCKNCKLTGHKFINCPKIE 240

Query: 241  CRYCHKHGHILDNCPTRPPRPPGTSTKEKIFTKHGSSSVVAATSDDSSL--IQISDLQSL 300
            CRYCHK GHILDNCP +PPRP   ST+ K FTK  +SS      D S++   QISDLQSL
Sbjct: 241  CRYCHKPGHILDNCPIKPPRPRNYSTRAKNFTKPSNSSAATVAPDKSTIPQFQISDLQSL 300

Query: 301  LNQLISS-SSTLAVSSGNRWLLDSACCNHMTSDSSLMSTSSPTKSLPPIYAADGNCMNIS 360
            LNQLISS SS LAVS GNRWLLDS CCNHMTSD SLM+T SPTKSLPPIYAADGNCMNI+
Sbjct: 301  LNQLISSPSSALAVSPGNRWLLDSGCCNHMTSDYSLMNTPSPTKSLPPIYAADGNCMNIT 360

Query: 361  HTGTIDTPSVHLPHTYCVPNLTFNLVSVGQLCDLGLNVSFSPNGCQVQDPQTGQTIGTGR 420
            H GTI+TPS++LPHTYCVPNLTFNLVSVGQLCDLG  VSFS NGCQVQDPQTGQTIGTGR
Sbjct: 361  HMGTINTPSLNLPHTYCVPNLTFNLVSVGQLCDLGFTVSFSSNGCQVQDPQTGQTIGTGR 420

Query: 421  KVGRLFELTSLRVSSPSSISASVTDSDTYQWHLRLGHASSEKLRHLISVNNLTNLTKFVP 480
            KVGRLFEL SL+V SP SISA VTDSDTYQWHLRLGHAS EKLRHLIS+NNL ++TKFVP
Sbjct: 421  KVGRLFELLSLQVPSP-SISAPVTDSDTYQWHLRLGHASPEKLRHLISINNLNSITKFVP 480

Query: 481  FNCLNCKLAKQPALSFSQSISNCDKPFDLVHSDIWGPAPITTVHGYRYYVLFIDDYSRFT 540
            FNCLNCKLAKQPALSFS S S CDKPFDL+HSDIWGPAP +TVHGYRYYVLFIDD+SRFT
Sbjct: 481  FNCLNCKLAKQPALSFSTSTSICDKPFDLIHSDIWGPAPTSTVHGYRYYVLFIDDFSRFT 540

Query: 541  WIYFLKHRSELSRTYIEFANMIRTQFSSPIKILRTDNVLEYKDSILLSFLSQQGTIVQRS 600
            WIYFLKHRSELSRTYIEFANMIRTQFS PIK LRTDN LEYKDS LLSFLSQQGT+VQRS
Sbjct: 541  WIYFLKHRSELSRTYIEFANMIRTQFSCPIKTLRTDNALEYKDSTLLSFLSQQGTLVQRS 600

Query: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNTSP 660
            CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQN SP
Sbjct: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNISP 660

Query: 661  FEKLYGISPDYSKLKVFGSACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720
            FE+LYG  P+YS LKVFG ACFVLL PHEH KLEPRARLCCFLGYGTEHKGFRCWDPLSN
Sbjct: 661  FERLYGTPPNYSNLKVFGCACFVLLQPHEHTKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720

Query: 721  RLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEPTLDTELAQSSPATA 780
            RLRISRHVTFWEHTMFSRLSSFH SFSSPQSFFT+TS+DLFPLSE T   ELAQS+P +A
Sbjct: 721  RLRISRHVTFWEHTMFSRLSSFHASFSSPQSFFTDTSIDLFPLSESTPGNELAQSAPTSA 780

Query: 781  NLDPPSVFDDVPESPPATPLRRSTRVREPPSHLTHACFVLLHPHEHNKLEPRARLCCFLG 840
              D  S+ D  P+ PP                                            
Sbjct: 781  TSDQSSISDGNPDPPP-------------------------------------------- 840

Query: 841  YGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLS 900
                                                                        
Sbjct: 841  ------------------------------------------------------------ 900

Query: 901  EPTLDTELAQSSPATANLDPPSVSDDVPESSPATPLRRSTRVREPPPHLTDYHCFSTIVS 960
                                     D+P        RRSTRVREPP HL DYHCFSTIVS
Sbjct: 901  -------------------------DIPP-------RRSTRVREPPIHLQDYHCFSTIVS 960

Query: 961  LVEPTSYQEASINPVWQKAMDEELQALEKTHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020
            L+EPTSYQEAS +P+WQKAM++ELQALEK HTWDYVDLPPGKRPIGCKWIYKIKTHSDGT
Sbjct: 961  LIEPTSYQEASTDPLWQKAMNDELQALEKMHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020

Query: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLFQMDVKNAFLNG 1080
            IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPL QMDVKNAFLNG
Sbjct: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLLQMDVKNAFLNG 1080

Query: 1081 NLSEEVYMKPPQGTSPPPNKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSSHDNAL 1140
             LSEEVYMKPP GTS PP+KVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSS HD AL
Sbjct: 1081 TLSEEVYMKPPPGTSSPPHKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSPHDTAL 1140

Query: 1141 FTRQTTHGIVLLLLYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEVSHRS 1200
            FTR T  GIVLLLLYVDDMIITGND  AISDLQ YLGQHFEMKDLGSLNYFLGLEVS RS
Sbjct: 1141 FTRHTPQGIVLLLLYVDDMIITGNDPHAISDLQHYLGQHFEMKDLGSLNYFLGLEVSRRS 1200

Query: 1201 DGYLLSQAKYASDLIARSGITDSTTSSTPLDPHVHLTPFDGVPLDDASLYRQLVGSLIYL 1260
            DGYLLSQAKYASDL+ARSGITDS T+STPLDP+VHLTP+DGVPL++ SLYRQLVGSLIYL
Sbjct: 1201 DGYLLSQAKYASDLLARSGITDSNTASTPLDPNVHLTPYDGVPLENVSLYRQLVGSLIYL 1260

Query: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320
            TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA
Sbjct: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320

Query: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVISRSSTESEYRALADATAELIWLRWLL 1380
            DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSV+SRSSTESEYRALADATAEL+WLRWLL
Sbjct: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVVSRSSTESEYRALADATAELLWLRWLL 1334

Query: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFVRHHLLSNTLLLRSVSTI 1433
            ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHF+RHHLLS+TLLL+S+ST 
Sbjct: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFIRHHLLSHTLLLQSISTT 1334

BLAST of CSPI04G12200 vs. ExPASy TrEMBL
Match: A0A5D3D7V8 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cucumis melo var. makuwa OX=1194695 GN=E5676_scaffold242G00480 PE=4 SV=1)

HSP 1 Score: 2159.0 bits (5593), Expect = 0.0e+00
Identity = 1110/1471 (75.46%), Positives = 1170/1471 (79.54%), Query Frame = 0

Query: 1    MERDHIFRPISTILDGTNYITWANQMKSFLIGRKLWRIVTGDITKPTA--KEGDNTFIER 60
            ME+  I RPISTILDG+NYITWANQMKSFLIGRKLWRIVTGDITKPT   KE D+ FIER
Sbjct: 1    MEKSEIARPISTILDGSNYITWANQMKSFLIGRKLWRIVTGDITKPTKQDKEDDSKFIER 60

Query: 61   LEDWDSKNHQIITW----------------------------------LAHYYQLHSTLV 120
            LE+WDSKNHQIITW                                  LAHYYQLH++LV
Sbjct: 61   LEEWDSKNHQIITWLSNTSIPAIHTQFDAFENAKELWNFLSTRFKSVGLAHYYQLHNSLV 120

Query: 121  NLNQDIGQSVNEYLAVLQPIWTQLDQANISKDHLRLIKVLMGLRPEYESVRAALLHRNPL 180
            NLNQ+ GQSVNEYLAVLQPIWTQLDQA ISKDHLRLIKVLMGLRPEYESVRAALLHR+PL
Sbjct: 121  NLNQEAGQSVNEYLAVLQPIWTQLDQATISKDHLRLIKVLMGLRPEYESVRAALLHRSPL 180

Query: 181  PSLDAAIQEILFEEKRLGINSTKQSDVVLASTYTPNRAANMFCKNCKLSGHKFSNCPKIE 240
            PSLDAAIQEILFEE+RLGIN +K SD VLASTY+P  A++ FCKNCKL+GHKF NCPKIE
Sbjct: 181  PSLDAAIQEILFEERRLGINLSKHSDAVLASTYSPPGASSTFCKNCKLTGHKFINCPKIE 240

Query: 241  CRYCHKHGHILDNCPTRPPRPPGTSTKEKIFTKHGSSSVVAATSDDSSL--IQISDLQSL 300
            CRYCHK GHILDNCP +PPRP   ST+ K FTK  +SS      D S++   QISDLQSL
Sbjct: 241  CRYCHKPGHILDNCPIKPPRPRNYSTRAKNFTKPSNSSAATVAPDKSTIPQFQISDLQSL 300

Query: 301  LNQLISS-SSTLAVSSGNRWLLDSACCNHMTSDSSLMSTSSPTKSLPPIYAADGNCMNIS 360
            LNQLISS SS LAVS GNRWLLDS CCNHMTSD SLM+T SPTKSLPPIYAADGNCMNI+
Sbjct: 301  LNQLISSPSSALAVSPGNRWLLDSGCCNHMTSDYSLMNTPSPTKSLPPIYAADGNCMNIT 360

Query: 361  HTGTIDTPSVHLPHTYCVPNLTFNLVSVGQLCDLGLNVSFSPNGCQVQDPQTGQTIGTGR 420
            H GTI+TPS++LPHTYCVPNLTFNLVSVGQLCDLG  VSFS NGCQVQDPQTGQTIGTGR
Sbjct: 361  HMGTINTPSLNLPHTYCVPNLTFNLVSVGQLCDLGFTVSFSSNGCQVQDPQTGQTIGTGR 420

Query: 421  KVGRLFELTSLRVSSPSSISASVTDSDTYQWHLRLGHASSEKLRHLISVNNLTNLTKFVP 480
            KVGRLFEL SL+V SP SISA VTDSDTYQWHLRLGHAS EKLRHLIS+NNL ++TKFVP
Sbjct: 421  KVGRLFELLSLQVPSP-SISAPVTDSDTYQWHLRLGHASPEKLRHLISINNLNSITKFVP 480

Query: 481  FNCLNCKLAKQPALSFSQSISNCDKPFDLVHSDIWGPAPITTVHGYRYYVLFIDDYSRFT 540
            FNCLNCKLAKQPALSFS S S CDKPFDL+HSDIWGPAP +TVHGYRYYVLFIDD+SRFT
Sbjct: 481  FNCLNCKLAKQPALSFSTSTSICDKPFDLIHSDIWGPAPTSTVHGYRYYVLFIDDFSRFT 540

Query: 541  WIYFLKHRSELSRTYIEFANMIRTQFSSPIKILRTDNVLEYKDSILLSFLSQQGTIVQRS 600
            WIYFLKHRSELSRTYIEFANMIRTQFS PIK LRTDN LEYKDS LLSFLSQQGT+VQRS
Sbjct: 541  WIYFLKHRSELSRTYIEFANMIRTQFSCPIKTLRTDNALEYKDSTLLSFLSQQGTLVQRS 600

Query: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNTSP 660
            CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQN SP
Sbjct: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNISP 660

Query: 661  FEKLYGISPDYSKLKVFGSACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720
            FE+LYG  P+YS LKVFG ACFVLL PHEH KLEPRARLCCFLGYGTEHKGFRCWDPLSN
Sbjct: 661  FERLYGTPPNYSNLKVFGCACFVLLQPHEHTKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720

Query: 721  RLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEPTLDTELAQSSPATA 780
            RLRISRHVTFWEHTMFSRLSSFH SFSSPQSFFT+TS+DLFPLSE T   ELAQS+P +A
Sbjct: 721  RLRISRHVTFWEHTMFSRLSSFHASFSSPQSFFTDTSIDLFPLSESTPGNELAQSAPTSA 780

Query: 781  NLDPPSVFDDVPESPPATPLRRSTRVREPPSHLTHACFVLLHPHEHNKLEPRARLCCFLG 840
              D  S+ D  P+ PP                                            
Sbjct: 781  TSDQSSISDGNPDPPP-------------------------------------------- 840

Query: 841  YGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLS 900
                                                                        
Sbjct: 841  ------------------------------------------------------------ 900

Query: 901  EPTLDTELAQSSPATANLDPPSVSDDVPESSPATPLRRSTRVREPPPHLTDYHCFSTIVS 960
                                     D+P        RRSTRVREPP HL DYHCFSTIVS
Sbjct: 901  -------------------------DIPP-------RRSTRVREPPIHLQDYHCFSTIVS 960

Query: 961  LVEPTSYQEASINPVWQKAMDEELQALEKTHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020
            L+EPTSYQEAS +P+WQKAM++ELQALEK HTWDYVDLPPGKRPIGCKWIYKIKTHSDGT
Sbjct: 961  LIEPTSYQEASTDPLWQKAMNDELQALEKMHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020

Query: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLFQMDVKNAFLNG 1080
            IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPL QMDVKNAFLNG
Sbjct: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLLQMDVKNAFLNG 1080

Query: 1081 NLSEEVYMKPPQGTSPPPNKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSSHDNAL 1140
             LSEEVYMKPP GTS PP+KVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSS HD AL
Sbjct: 1081 TLSEEVYMKPPPGTSSPPHKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSPHDTAL 1140

Query: 1141 FTRQTTHGIVLLLLYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEVSHRS 1200
            FTR T  GIVLLLLYVDDMIITGND  AISDLQ YLGQHFEMKDLGSLNYFLGLEVS RS
Sbjct: 1141 FTRHTPQGIVLLLLYVDDMIITGNDPHAISDLQHYLGQHFEMKDLGSLNYFLGLEVSRRS 1200

Query: 1201 DGYLLSQAKYASDLIARSGITDSTTSSTPLDPHVHLTPFDGVPLDDASLYRQLVGSLIYL 1260
            DGYLLSQAKYASDL+ARSGITDS T+STPLDP+VHLTP+DGVPL++ SLYRQLVGSLIYL
Sbjct: 1201 DGYLLSQAKYASDLLARSGITDSNTASTPLDPNVHLTPYDGVPLENVSLYRQLVGSLIYL 1260

Query: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320
            TVTRPDIAYAVHI                        GTLGHGLQFSSQSSLVLSGYSDA
Sbjct: 1261 TVTRPDIAYAVHI------------------------GTLGHGLQFSSQSSLVLSGYSDA 1310

Query: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVISRSSTESEYRALADATAELIWLRWLL 1380
            DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSV+SRSSTESEYRALADATAEL+WLRWLL
Sbjct: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVVSRSSTESEYRALADATAELLWLRWLL 1310

Query: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFVRHHLLSNTLLLRSVSTI 1433
            ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHF+RHHLLS+TLLL+S+ST 
Sbjct: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFIRHHLLSHTLLLQSISTT 1310

BLAST of CSPI04G12200 vs. ExPASy TrEMBL
Match: A0A5D3DT06 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cucumis melo var. makuwa OX=1194695 GN=E5676_scaffold861G00190 PE=4 SV=1)

HSP 1 Score: 2156.3 bits (5586), Expect = 0.0e+00
Identity = 1108/1471 (75.32%), Positives = 1169/1471 (79.47%), Query Frame = 0

Query: 1    MERDHIFRPISTILDGTNYITWANQMKSFLIGRKLWRIVTGDITKPTA--KEGDNTFIER 60
            ME+  I RPISTILDG+NYITWANQMKSFLIGRKLWRIVTGDITKPT   KE D+ FIER
Sbjct: 1    MEKSEIARPISTILDGSNYITWANQMKSFLIGRKLWRIVTGDITKPTKQDKEDDSKFIER 60

Query: 61   LEDWDSKNHQIITW----------------------------------LAHYYQLHSTLV 120
            LE+WDSKNHQIITW                                  LAHYYQLH++LV
Sbjct: 61   LEEWDSKNHQIITWLSNTSIPAIHTQFDAFENAKELWNFLSTRFKSVGLAHYYQLHNSLV 120

Query: 121  NLNQDIGQSVNEYLAVLQPIWTQLDQANISKDHLRLIKVLMGLRPEYESVRAALLHRNPL 180
            NLNQ+ GQSVNEYLAVLQPIWTQLDQA ISKDHLRLIKVLMGLRPEYESVRAALLHR+PL
Sbjct: 121  NLNQEAGQSVNEYLAVLQPIWTQLDQATISKDHLRLIKVLMGLRPEYESVRAALLHRSPL 180

Query: 181  PSLDAAIQEILFEEKRLGINSTKQSDVVLASTYTPNRAANMFCKNCKLSGHKFSNCPKIE 240
            PSLDAAIQEILFEE+RLGIN +K SD VLASTY+P  A++ FCKNCKL+GHKF NCPKIE
Sbjct: 181  PSLDAAIQEILFEERRLGINLSKHSDAVLASTYSPPGASSTFCKNCKLTGHKFINCPKIE 240

Query: 241  CRYCHKHGHILDNCPTRPPRPPGTSTKEKIFTKHGSSSVVAATSDDSSL--IQISDLQSL 300
            CRYCHK GHILDNCP +PPRP   ST+ K FTK  +SS      D S++   QISDLQSL
Sbjct: 241  CRYCHKPGHILDNCPIKPPRPRNYSTRAKNFTKPSNSSAATVAPDKSTIPQFQISDLQSL 300

Query: 301  LNQLISS-SSTLAVSSGNRWLLDSACCNHMTSDSSLMSTSSPTKSLPPIYAADGNCMNIS 360
            LNQLISS SS LAVS GNRWLLDS CCNHMTSD SLM+T SPTKSLPPIYAADGNCMNI+
Sbjct: 301  LNQLISSPSSALAVSPGNRWLLDSGCCNHMTSDYSLMNTPSPTKSLPPIYAADGNCMNIT 360

Query: 361  HTGTIDTPSVHLPHTYCVPNLTFNLVSVGQLCDLGLNVSFSPNGCQVQDPQTGQTIGTGR 420
            H GTI+TPS++LPHTYCVPNLTFNLVSVGQLCDLG  VSFS NGCQVQDPQTGQTIGTGR
Sbjct: 361  HMGTINTPSLNLPHTYCVPNLTFNLVSVGQLCDLGFTVSFSSNGCQVQDPQTGQTIGTGR 420

Query: 421  KVGRLFELTSLRVSSPSSISASVTDSDTYQWHLRLGHASSEKLRHLISVNNLTNLTKFVP 480
            KVGRLFEL SL+V SP SISA VTDSDTYQWHLRLGHAS EKLRHLIS+NNL ++TKFVP
Sbjct: 421  KVGRLFELLSLQVPSP-SISAPVTDSDTYQWHLRLGHASPEKLRHLISINNLNSITKFVP 480

Query: 481  FNCLNCKLAKQPALSFSQSISNCDKPFDLVHSDIWGPAPITTVHGYRYYVLFIDDYSRFT 540
            FNCLNCKLAKQPALSFS S S CDKPFDL+HSDIWGPAP +TVHGYRYYVLFIDD+SRFT
Sbjct: 481  FNCLNCKLAKQPALSFSTSTSICDKPFDLIHSDIWGPAPTSTVHGYRYYVLFIDDFSRFT 540

Query: 541  WIYFLKHRSELSRTYIEFANMIRTQFSSPIKILRTDNVLEYKDSILLSFLSQQGTIVQRS 600
            WIYFLKHRSELSRTYIEFANMIRTQFS PIK LRTDN LEYKDS LLSFLSQQGT+VQRS
Sbjct: 541  WIYFLKHRSELSRTYIEFANMIRTQFSCPIKTLRTDNALEYKDSTLLSFLSQQGTLVQRS 600

Query: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNTSP 660
            CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQN SP
Sbjct: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNISP 660

Query: 661  FEKLYGISPDYSKLKVFGSACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720
            FE+LYG  P+YS LKVFG ACFVLL PHEH KLEPRARLCCFLGYGTEHKGFRCWDPLSN
Sbjct: 661  FERLYGTPPNYSNLKVFGCACFVLLQPHEHTKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720

Query: 721  RLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEPTLDTELAQSSPATA 780
            RLRISRHVTFWEHTMFSRLSSFH SFSSPQSFFT+TS+DLFPLSE T   ELAQS+P +A
Sbjct: 721  RLRISRHVTFWEHTMFSRLSSFHASFSSPQSFFTDTSIDLFPLSESTPGNELAQSAPTSA 780

Query: 781  NLDPPSVFDDVPESPPATPLRRSTRVREPPSHLTHACFVLLHPHEHNKLEPRARLCCFLG 840
              D  S+ D  P+ PP                                            
Sbjct: 781  TSDQSSISDGNPDPPP-------------------------------------------- 840

Query: 841  YGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLS 900
                                                                        
Sbjct: 841  ------------------------------------------------------------ 900

Query: 901  EPTLDTELAQSSPATANLDPPSVSDDVPESSPATPLRRSTRVREPPPHLTDYHCFSTIVS 960
                                     D+P        RRSTRVREPP HL DYHCFSTIVS
Sbjct: 901  -------------------------DIPP-------RRSTRVREPPIHLQDYHCFSTIVS 960

Query: 961  LVEPTSYQEASINPVWQKAMDEELQALEKTHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020
            L+EPTSYQEAS +P+WQKAM++ELQALEK HTWDYVDLPPGKRPIGCKWIYKIKTHSDGT
Sbjct: 961  LIEPTSYQEASTDPLWQKAMNDELQALEKMHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020

Query: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLFQMDVKNAFLNG 1080
            IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPL QMDVKNAFLNG
Sbjct: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLLQMDVKNAFLNG 1080

Query: 1081 NLSEEVYMKPPQGTSPPPNKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSSHDNAL 1140
             LSEEVYMKPP GTS PP+KVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSS HD AL
Sbjct: 1081 TLSEEVYMKPPPGTSSPPHKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSPHDTAL 1140

Query: 1141 FTRQTTHGIVLLLLYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEVSHRS 1200
            FTR T  GIVLLLLYVDDMIITGND  AISDLQ YLGQHFEMKDLGS NYFLGLEVS RS
Sbjct: 1141 FTRHTPQGIVLLLLYVDDMIITGNDPHAISDLQHYLGQHFEMKDLGSFNYFLGLEVSRRS 1200

Query: 1201 DGYLLSQAKYASDLIARSGITDSTTSSTPLDPHVHLTPFDGVPLDDASLYRQLVGSLIYL 1260
            DGYLLSQAKYASDL+ARSGITDS T+STPLDP+VHLTP+DGVPL++ SLYRQLVGSLIYL
Sbjct: 1201 DGYLLSQAKYASDLLARSGITDSNTASTPLDPNVHLTPYDGVPLENVSLYRQLVGSLIYL 1260

Query: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320
            TVTRPDIAYAVHI                        GTLGHGLQFSSQSSLVLSGYSDA
Sbjct: 1261 TVTRPDIAYAVHI------------------------GTLGHGLQFSSQSSLVLSGYSDA 1310

Query: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVISRSSTESEYRALADATAELIWLRWLL 1380
            DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSV+SRSSTESEYRALADATAEL+WLRWLL
Sbjct: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVVSRSSTESEYRALADATAELLWLRWLL 1310

Query: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFVRHHLLSNTLLLRSVSTI 1433
            ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHF+RHHLLS+TLLL+S+ST 
Sbjct: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFIRHHLLSHTLLLQSISTT 1310

BLAST of CSPI04G12200 vs. NCBI nr
Match: KAA0041601.1 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cucumis melo var. makuwa] >KAA0065379.1 Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cucumis melo var. makuwa] >TYK12000.1 Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cucumis melo var. makuwa])

HSP 1 Score: 2217.2 bits (5744), Expect = 0.0e+00
Identity = 1134/1471 (77.09%), Positives = 1194/1471 (81.17%), Query Frame = 0

Query: 1    MERDHIFRPISTILDGTNYITWANQMKSFLIGRKLWRIVTGDITKPTA--KEGDNTFIER 60
            ME+  I RPISTILDG+NYITWANQMKSFLIGRKLWRIVTGDITKPT   KE D+ FIER
Sbjct: 1    MEKSEIARPISTILDGSNYITWANQMKSFLIGRKLWRIVTGDITKPTKQDKEDDSKFIER 60

Query: 61   LEDWDSKNHQIITW----------------------------------LAHYYQLHSTLV 120
            LE+WDSKNHQIITW                                  LAHYYQLH++LV
Sbjct: 61   LEEWDSKNHQIITWLSNTSIPAIHTQFDAFENAKELWNFLSTRFKSVGLAHYYQLHNSLV 120

Query: 121  NLNQDIGQSVNEYLAVLQPIWTQLDQANISKDHLRLIKVLMGLRPEYESVRAALLHRNPL 180
            NLNQ+ GQSVNEYLAVLQPIWTQLDQA ISKDHLRLIKVLMGLRPEYESVRAALLHR+PL
Sbjct: 121  NLNQEAGQSVNEYLAVLQPIWTQLDQATISKDHLRLIKVLMGLRPEYESVRAALLHRSPL 180

Query: 181  PSLDAAIQEILFEEKRLGINSTKQSDVVLASTYTPNRAANMFCKNCKLSGHKFSNCPKIE 240
            PSLDAAIQEILFEE+RLGIN +K SD VLASTY+P  A++ FCKNCKL+GHKF NCPKIE
Sbjct: 181  PSLDAAIQEILFEERRLGINLSKHSDAVLASTYSPPGASSTFCKNCKLTGHKFINCPKIE 240

Query: 241  CRYCHKHGHILDNCPTRPPRPPGTSTKEKIFTKHGSSSVVAATSDDSSL--IQISDLQSL 300
            CRYCHK GHILDNCP +PPRP   ST+ K FTK  +SS      D S++   QISDLQSL
Sbjct: 241  CRYCHKPGHILDNCPIKPPRPRNYSTRAKNFTKPSNSSAATVAPDKSTIPQFQISDLQSL 300

Query: 301  LNQLISS-SSTLAVSSGNRWLLDSACCNHMTSDSSLMSTSSPTKSLPPIYAADGNCMNIS 360
            LNQLISS SS LAVS GNRWLLDS CCNHMTSD SLM+T SPTKSLPPIYAADGNCMNI+
Sbjct: 301  LNQLISSPSSALAVSPGNRWLLDSGCCNHMTSDYSLMNTPSPTKSLPPIYAADGNCMNIT 360

Query: 361  HTGTIDTPSVHLPHTYCVPNLTFNLVSVGQLCDLGLNVSFSPNGCQVQDPQTGQTIGTGR 420
            H GTI+TPS++LPHTYCVPNLTFNLVSVGQLCDLG  VSFS NGCQVQDPQTGQTIGTGR
Sbjct: 361  HMGTINTPSLNLPHTYCVPNLTFNLVSVGQLCDLGFTVSFSSNGCQVQDPQTGQTIGTGR 420

Query: 421  KVGRLFELTSLRVSSPSSISASVTDSDTYQWHLRLGHASSEKLRHLISVNNLTNLTKFVP 480
            KVGRLFEL SL+V SP SISA VTDSDTYQWHLRLGHAS EKLRHLIS+NNL ++TKFVP
Sbjct: 421  KVGRLFELLSLQVPSP-SISAPVTDSDTYQWHLRLGHASPEKLRHLISINNLNSITKFVP 480

Query: 481  FNCLNCKLAKQPALSFSQSISNCDKPFDLVHSDIWGPAPITTVHGYRYYVLFIDDYSRFT 540
            FNCLNCKLAKQPALSFS S S CDKPFDL+HSDIWGPAP +TVHGYRYYVLFIDD+SRFT
Sbjct: 481  FNCLNCKLAKQPALSFSTSTSICDKPFDLIHSDIWGPAPTSTVHGYRYYVLFIDDFSRFT 540

Query: 541  WIYFLKHRSELSRTYIEFANMIRTQFSSPIKILRTDNVLEYKDSILLSFLSQQGTIVQRS 600
            WIYFLKHRSELSRTYIEFANMIRTQFS PIK LRTDN LEYKDS LLSFLSQQGT+VQRS
Sbjct: 541  WIYFLKHRSELSRTYIEFANMIRTQFSCPIKTLRTDNALEYKDSTLLSFLSQQGTLVQRS 600

Query: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNTSP 660
            CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQN SP
Sbjct: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNISP 660

Query: 661  FEKLYGISPDYSKLKVFGSACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720
            FE+LYG  P+YS LKVFG ACFVLL PHEH KLEPRARLCCFLGYGTEHKGFRCWDPLSN
Sbjct: 661  FERLYGTPPNYSNLKVFGCACFVLLQPHEHTKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720

Query: 721  RLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEPTLDTELAQSSPATA 780
            RLRISRHVTFWEHTMFSRLSSFH SFSSPQSFFT+TS+DLFPLSE T   ELAQS+P +A
Sbjct: 721  RLRISRHVTFWEHTMFSRLSSFHASFSSPQSFFTDTSIDLFPLSESTPGNELAQSAPTSA 780

Query: 781  NLDPPSVFDDVPESPPATPLRRSTRVREPPSHLTHACFVLLHPHEHNKLEPRARLCCFLG 840
              D  S+ D  P+ PP                                            
Sbjct: 781  TSDQSSISDGNPDPPP-------------------------------------------- 840

Query: 841  YGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLS 900
                                                                        
Sbjct: 841  ------------------------------------------------------------ 900

Query: 901  EPTLDTELAQSSPATANLDPPSVSDDVPESSPATPLRRSTRVREPPPHLTDYHCFSTIVS 960
                                     D+P        RRSTRVREPP HL DYHCFSTIVS
Sbjct: 901  -------------------------DIPP-------RRSTRVREPPIHLQDYHCFSTIVS 960

Query: 961  LVEPTSYQEASINPVWQKAMDEELQALEKTHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020
            L+EPTSYQEAS +P+WQKAM++ELQALEK HTWDYVDLPPGKRPIGCKWIYKIKTHSDGT
Sbjct: 961  LIEPTSYQEASTDPLWQKAMNDELQALEKMHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020

Query: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLFQMDVKNAFLNG 1080
            IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPL QMDVKNAFLNG
Sbjct: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLLQMDVKNAFLNG 1080

Query: 1081 NLSEEVYMKPPQGTSPPPNKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSSHDNAL 1140
             LSEEVYMKPP GTS PP+KVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSS HD AL
Sbjct: 1081 TLSEEVYMKPPPGTSSPPHKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSPHDTAL 1140

Query: 1141 FTRQTTHGIVLLLLYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEVSHRS 1200
            FTR T  GIVLLLLYVDDMIITGND  AISDLQ YLGQHFEMKDLGSLNYFLGLEVS RS
Sbjct: 1141 FTRHTPQGIVLLLLYVDDMIITGNDPHAISDLQHYLGQHFEMKDLGSLNYFLGLEVSRRS 1200

Query: 1201 DGYLLSQAKYASDLIARSGITDSTTSSTPLDPHVHLTPFDGVPLDDASLYRQLVGSLIYL 1260
            DGYLLSQAKYASDL+ARSGITDS T+STPLDP+VHLTP+DGVPL++ SLYRQLVGSLIYL
Sbjct: 1201 DGYLLSQAKYASDLLARSGITDSNTASTPLDPNVHLTPYDGVPLENVSLYRQLVGSLIYL 1260

Query: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320
            TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA
Sbjct: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320

Query: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVISRSSTESEYRALADATAELIWLRWLL 1380
            DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSV+SRSSTESEYRALADATAEL+WLRWLL
Sbjct: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVVSRSSTESEYRALADATAELLWLRWLL 1334

Query: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFVRHHLLSNTLLLRSVSTI 1433
            ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHF+RHHLLS+TLLL+S+ST 
Sbjct: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFIRHHLLSHTLLLQSISTT 1334

BLAST of CSPI04G12200 vs. NCBI nr
Match: KAA0037745.1 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cucumis melo var. makuwa])

HSP 1 Score: 2215.7 bits (5740), Expect = 0.0e+00
Identity = 1133/1471 (77.02%), Positives = 1194/1471 (81.17%), Query Frame = 0

Query: 1    MERDHIFRPISTILDGTNYITWANQMKSFLIGRKLWRIVTGDITKPTA--KEGDNTFIER 60
            ME+  I RPISTILDG+NYITWANQMKSFLIGRKLWRIVTGDITKPT   KE D+ FIER
Sbjct: 1    MEKSEIARPISTILDGSNYITWANQMKSFLIGRKLWRIVTGDITKPTKQDKEDDSKFIER 60

Query: 61   LEDWDSKNHQIITW----------------------------------LAHYYQLHSTLV 120
            LE+WDSKNHQIITW                                  LAHYYQLH++LV
Sbjct: 61   LEEWDSKNHQIITWLSNTSIPAIHTQFDAFENAKELWNFLSTRFKSVGLAHYYQLHNSLV 120

Query: 121  NLNQDIGQSVNEYLAVLQPIWTQLDQANISKDHLRLIKVLMGLRPEYESVRAALLHRNPL 180
            NLNQ+ GQSVNEYLAVLQPIWTQLDQA ISKDHLRLIKVLMGLRPEYESVRAALLHR+PL
Sbjct: 121  NLNQEAGQSVNEYLAVLQPIWTQLDQATISKDHLRLIKVLMGLRPEYESVRAALLHRSPL 180

Query: 181  PSLDAAIQEILFEEKRLGINSTKQSDVVLASTYTPNRAANMFCKNCKLSGHKFSNCPKIE 240
            PSLDAAIQEILFEE+RLGIN +K SD VLASTY+P  A++ FCKNCKL+GHKF NCPKIE
Sbjct: 181  PSLDAAIQEILFEERRLGINLSKHSDAVLASTYSPPGASSTFCKNCKLTGHKFINCPKIE 240

Query: 241  CRYCHKHGHILDNCPTRPPRPPGTSTKEKIFTKHGSSSVVAATSDDSSL--IQISDLQSL 300
            CRYCHK GHILDNCP +PPRP   ST+ K FTK  +SS      D S++   QISDLQSL
Sbjct: 241  CRYCHKPGHILDNCPIKPPRPRNYSTRAKNFTKPSNSSAATVALDKSTIPQFQISDLQSL 300

Query: 301  LNQLISS-SSTLAVSSGNRWLLDSACCNHMTSDSSLMSTSSPTKSLPPIYAADGNCMNIS 360
            LNQLISS SS LAVS GNRWLLDS CCNHMTSD SLM+T SPTKSLPPIYAADGNCMNI+
Sbjct: 301  LNQLISSPSSALAVSPGNRWLLDSGCCNHMTSDYSLMNTPSPTKSLPPIYAADGNCMNIT 360

Query: 361  HTGTIDTPSVHLPHTYCVPNLTFNLVSVGQLCDLGLNVSFSPNGCQVQDPQTGQTIGTGR 420
            H GTI+TPS++LPHTYCVPNLTFNLVSVGQLCDLG  VSFS NGCQVQDPQTGQTIGTGR
Sbjct: 361  HMGTINTPSLNLPHTYCVPNLTFNLVSVGQLCDLGFTVSFSSNGCQVQDPQTGQTIGTGR 420

Query: 421  KVGRLFELTSLRVSSPSSISASVTDSDTYQWHLRLGHASSEKLRHLISVNNLTNLTKFVP 480
            KVGRLFE+ SL+V SP SISA VTDSDTYQWHLRLGHAS EKLRHLIS+NNL ++TKFVP
Sbjct: 421  KVGRLFEVLSLQVPSP-SISAPVTDSDTYQWHLRLGHASPEKLRHLISINNLNSITKFVP 480

Query: 481  FNCLNCKLAKQPALSFSQSISNCDKPFDLVHSDIWGPAPITTVHGYRYYVLFIDDYSRFT 540
            FNCLNCKLAKQPALSFS S S CDKPFDL+HSDIWGPAP +TVHGYRYYVLFIDD+SRFT
Sbjct: 481  FNCLNCKLAKQPALSFSTSTSICDKPFDLIHSDIWGPAPTSTVHGYRYYVLFIDDFSRFT 540

Query: 541  WIYFLKHRSELSRTYIEFANMIRTQFSSPIKILRTDNVLEYKDSILLSFLSQQGTIVQRS 600
            WIYFLKHRSELSRTYIEFANMIRTQFS PIK LRTDN LEYKDS LLSFLSQQGT+VQRS
Sbjct: 541  WIYFLKHRSELSRTYIEFANMIRTQFSCPIKTLRTDNALEYKDSTLLSFLSQQGTLVQRS 600

Query: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNTSP 660
            CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQN SP
Sbjct: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNISP 660

Query: 661  FEKLYGISPDYSKLKVFGSACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720
            FE+LYG  P+YS LKVFG ACFVLL PHEH KLEPRARLCCFLGYGTEHKGFRCWDPLSN
Sbjct: 661  FERLYGTPPNYSNLKVFGCACFVLLQPHEHTKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720

Query: 721  RLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEPTLDTELAQSSPATA 780
            RLRISRHVTFWEHTMFSRLSSFH SFSSPQSFFT+TS+DLFPLSE T   ELAQS+P +A
Sbjct: 721  RLRISRHVTFWEHTMFSRLSSFHASFSSPQSFFTDTSIDLFPLSESTPGNELAQSAPTSA 780

Query: 781  NLDPPSVFDDVPESPPATPLRRSTRVREPPSHLTHACFVLLHPHEHNKLEPRARLCCFLG 840
              D  S+ D  P+ PP                                            
Sbjct: 781  TSDQSSISDGNPDPPP-------------------------------------------- 840

Query: 841  YGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLS 900
                                                                        
Sbjct: 841  ------------------------------------------------------------ 900

Query: 901  EPTLDTELAQSSPATANLDPPSVSDDVPESSPATPLRRSTRVREPPPHLTDYHCFSTIVS 960
                                     D+P        RRSTRVREPP HL DYHCFSTIVS
Sbjct: 901  -------------------------DIPP-------RRSTRVREPPIHLQDYHCFSTIVS 960

Query: 961  LVEPTSYQEASINPVWQKAMDEELQALEKTHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020
            L+EPTSYQEAS +P+WQKAM++ELQALEK HTWDYVDLPPGKRPIGCKWIYKIKTHSDGT
Sbjct: 961  LIEPTSYQEASTDPLWQKAMNDELQALEKMHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020

Query: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLFQMDVKNAFLNG 1080
            IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPL QMDVKNAFLNG
Sbjct: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLLQMDVKNAFLNG 1080

Query: 1081 NLSEEVYMKPPQGTSPPPNKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSSHDNAL 1140
             LSEEVYMKPP GTS PP+KVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSS HD AL
Sbjct: 1081 TLSEEVYMKPPPGTSSPPHKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSPHDTAL 1140

Query: 1141 FTRQTTHGIVLLLLYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEVSHRS 1200
            FTR T  GIVLLLLYVDDMIITGND  AISDLQ YLGQHFEMKDLGSLNYFLGLEVS RS
Sbjct: 1141 FTRHTPQGIVLLLLYVDDMIITGNDPHAISDLQHYLGQHFEMKDLGSLNYFLGLEVSRRS 1200

Query: 1201 DGYLLSQAKYASDLIARSGITDSTTSSTPLDPHVHLTPFDGVPLDDASLYRQLVGSLIYL 1260
            DGYLLSQAKYASDL+ARSGITDS T+STPLDP+VHLTP+DGVPL++ SLYRQLVGSLIYL
Sbjct: 1201 DGYLLSQAKYASDLLARSGITDSNTASTPLDPNVHLTPYDGVPLENVSLYRQLVGSLIYL 1260

Query: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320
            TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA
Sbjct: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320

Query: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVISRSSTESEYRALADATAELIWLRWLL 1380
            DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSV+SRSSTESEYRALADATAEL+WLRWLL
Sbjct: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVVSRSSTESEYRALADATAELLWLRWLL 1334

Query: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFVRHHLLSNTLLLRSVSTI 1433
            ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHF+RHHLLS+TLLL+S+ST 
Sbjct: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFIRHHLLSHTLLLQSISTT 1334

BLAST of CSPI04G12200 vs. NCBI nr
Match: TYK04714.1 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cucumis melo var. makuwa])

HSP 1 Score: 2215.7 bits (5740), Expect = 0.0e+00
Identity = 1133/1471 (77.02%), Positives = 1194/1471 (81.17%), Query Frame = 0

Query: 1    MERDHIFRPISTILDGTNYITWANQMKSFLIGRKLWRIVTGDITKPTA--KEGDNTFIER 60
            ME+  I RPISTILDG+NYITWANQMKSFLIGRKLWRIVTGDITKPT   KE D+ FIER
Sbjct: 1    MEKSEIARPISTILDGSNYITWANQMKSFLIGRKLWRIVTGDITKPTKQDKEDDSKFIER 60

Query: 61   LEDWDSKNHQIITW----------------------------------LAHYYQLHSTLV 120
            LE+WDSKNHQIITW                                  LAHYYQLH++LV
Sbjct: 61   LEEWDSKNHQIITWLSNTSIPAIHTQFDAFENAKELWNFLSTRFKSVGLAHYYQLHNSLV 120

Query: 121  NLNQDIGQSVNEYLAVLQPIWTQLDQANISKDHLRLIKVLMGLRPEYESVRAALLHRNPL 180
            NLNQ+ GQSVNEYLAVLQPIWTQLDQA ISKDHLRLIKVLMGLRPEYESVRAALLHR+PL
Sbjct: 121  NLNQEAGQSVNEYLAVLQPIWTQLDQATISKDHLRLIKVLMGLRPEYESVRAALLHRSPL 180

Query: 181  PSLDAAIQEILFEEKRLGINSTKQSDVVLASTYTPNRAANMFCKNCKLSGHKFSNCPKIE 240
            PSLDAAIQEILFEE+RLGIN +K SD VLASTY+P  A++ FCKNCKL+GHKF NCPKIE
Sbjct: 181  PSLDAAIQEILFEERRLGINLSKHSDAVLASTYSPPGASSTFCKNCKLTGHKFINCPKIE 240

Query: 241  CRYCHKHGHILDNCPTRPPRPPGTSTKEKIFTKHGSSSVVAATSDDSSL--IQISDLQSL 300
            CRYCHK GHILDNCP +PPRP   ST+ K FTK  +SS      D S++   QISDLQSL
Sbjct: 241  CRYCHKPGHILDNCPIKPPRPRNYSTRAKNFTKPSNSSAATVAPDKSTIPQFQISDLQSL 300

Query: 301  LNQLISS-SSTLAVSSGNRWLLDSACCNHMTSDSSLMSTSSPTKSLPPIYAADGNCMNIS 360
            LNQLISS SS LAVS GNRWLLDS CCNHMTSD SLM+T SPTKSLPPIYAADGNCMNI+
Sbjct: 301  LNQLISSPSSALAVSPGNRWLLDSGCCNHMTSDYSLMNTPSPTKSLPPIYAADGNCMNIT 360

Query: 361  HTGTIDTPSVHLPHTYCVPNLTFNLVSVGQLCDLGLNVSFSPNGCQVQDPQTGQTIGTGR 420
            H GTI+TPS++LPHTYCVPNLTFNLVSVGQLCDLG  VSFS NGCQVQDPQTGQTIGTGR
Sbjct: 361  HMGTINTPSLNLPHTYCVPNLTFNLVSVGQLCDLGFTVSFSSNGCQVQDPQTGQTIGTGR 420

Query: 421  KVGRLFELTSLRVSSPSSISASVTDSDTYQWHLRLGHASSEKLRHLISVNNLTNLTKFVP 480
            KVGRLFEL SL+V SP SISA VTDSDTYQWHLRLGHAS EKLRHLIS+NNL ++TKFVP
Sbjct: 421  KVGRLFELLSLQVPSP-SISAPVTDSDTYQWHLRLGHASPEKLRHLISINNLNSITKFVP 480

Query: 481  FNCLNCKLAKQPALSFSQSISNCDKPFDLVHSDIWGPAPITTVHGYRYYVLFIDDYSRFT 540
            FNCLNCKLAKQPALSFS S S CDKPFDL+HSDIWGPAP +TVHGYRYYVLFIDD+SRFT
Sbjct: 481  FNCLNCKLAKQPALSFSTSTSICDKPFDLIHSDIWGPAPTSTVHGYRYYVLFIDDFSRFT 540

Query: 541  WIYFLKHRSELSRTYIEFANMIRTQFSSPIKILRTDNVLEYKDSILLSFLSQQGTIVQRS 600
            WIYFLKHRSELSRTYIEFANMIRTQFS PIK LRTDN LEYKDS LLSFLSQQGT+VQRS
Sbjct: 541  WIYFLKHRSELSRTYIEFANMIRTQFSCPIKTLRTDNALEYKDSTLLSFLSQQGTLVQRS 600

Query: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNTSP 660
            CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQN SP
Sbjct: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNISP 660

Query: 661  FEKLYGISPDYSKLKVFGSACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720
            FE+LYG  P+YS LKVFG ACFVLL PHEH KLEPRARLCCFLGYGTEHKGFRCWDPLSN
Sbjct: 661  FERLYGTPPNYSNLKVFGCACFVLLQPHEHTKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720

Query: 721  RLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEPTLDTELAQSSPATA 780
            RLRISRHVTFWEHTMFSRLSSFH SFSSPQSFFT+TS+DLFPLSE T   ELAQS+P +A
Sbjct: 721  RLRISRHVTFWEHTMFSRLSSFHASFSSPQSFFTDTSIDLFPLSESTPGNELAQSAPTSA 780

Query: 781  NLDPPSVFDDVPESPPATPLRRSTRVREPPSHLTHACFVLLHPHEHNKLEPRARLCCFLG 840
              D  S+ D  P+ PP                                            
Sbjct: 781  TSDQSSISDGNPDPPP-------------------------------------------- 840

Query: 841  YGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLS 900
                                                                        
Sbjct: 841  ------------------------------------------------------------ 900

Query: 901  EPTLDTELAQSSPATANLDPPSVSDDVPESSPATPLRRSTRVREPPPHLTDYHCFSTIVS 960
                                     D+P        RRSTRVREPP HL DYHCFSTIVS
Sbjct: 901  -------------------------DIPP-------RRSTRVREPPIHLQDYHCFSTIVS 960

Query: 961  LVEPTSYQEASINPVWQKAMDEELQALEKTHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020
            L+EPTSYQEAS +P+WQKAM++ELQALEK HTWDYVDLPPGKRPIGCKWIYKIKTHSDGT
Sbjct: 961  LIEPTSYQEASTDPLWQKAMNDELQALEKMHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020

Query: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLFQMDVKNAFLNG 1080
            IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPL QMDVKNAFLNG
Sbjct: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLLQMDVKNAFLNG 1080

Query: 1081 NLSEEVYMKPPQGTSPPPNKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSSHDNAL 1140
             LSEEVYMKPP GTS PP+KVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSS HD AL
Sbjct: 1081 TLSEEVYMKPPPGTSSPPHKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSPHDTAL 1140

Query: 1141 FTRQTTHGIVLLLLYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEVSHRS 1200
            FTR T  GIVLLLLYVDDMIITGND  AISDLQ YLGQHFEMKDLGSLNYFLGLEVS RS
Sbjct: 1141 FTRHTPQGIVLLLLYVDDMIITGNDPHAISDLQHYLGQHFEMKDLGSLNYFLGLEVSRRS 1200

Query: 1201 DGYLLSQAKYASDLIARSGITDSTTSSTPLDPHVHLTPFDGVPLDDASLYRQLVGSLIYL 1260
            DGYLLSQAKYASDL+ARSGITDS T+STPLDP+VHLTP+DGVPL++ SLYRQLVGSLIYL
Sbjct: 1201 DGYLLSQAKYASDLLARSGITDSNTASTPLDPNVHLTPYDGVPLENVSLYRQLVGSLIYL 1260

Query: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320
            TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA
Sbjct: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320

Query: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVISRSSTESEYRALADATAELIWLRWLL 1380
            DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSV+SRSSTESEYRALADATAEL+WLRWLL
Sbjct: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVVSRSSTESEYRALADATAELLWLRWLL 1334

Query: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFVRHHLLSNTLLLRSVSTI 1433
            ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHF+RHHLLS+TLLL+S+ST 
Sbjct: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFIRHHLLSHTLLLQSISTT 1334

BLAST of CSPI04G12200 vs. NCBI nr
Match: TYK19656.1 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cucumis melo var. makuwa])

HSP 1 Score: 2159.0 bits (5593), Expect = 0.0e+00
Identity = 1110/1471 (75.46%), Positives = 1170/1471 (79.54%), Query Frame = 0

Query: 1    MERDHIFRPISTILDGTNYITWANQMKSFLIGRKLWRIVTGDITKPTA--KEGDNTFIER 60
            ME+  I RPISTILDG+NYITWANQMKSFLIGRKLWRIVTGDITKPT   KE D+ FIER
Sbjct: 1    MEKSEIARPISTILDGSNYITWANQMKSFLIGRKLWRIVTGDITKPTKQDKEDDSKFIER 60

Query: 61   LEDWDSKNHQIITW----------------------------------LAHYYQLHSTLV 120
            LE+WDSKNHQIITW                                  LAHYYQLH++LV
Sbjct: 61   LEEWDSKNHQIITWLSNTSIPAIHTQFDAFENAKELWNFLSTRFKSVGLAHYYQLHNSLV 120

Query: 121  NLNQDIGQSVNEYLAVLQPIWTQLDQANISKDHLRLIKVLMGLRPEYESVRAALLHRNPL 180
            NLNQ+ GQSVNEYLAVLQPIWTQLDQA ISKDHLRLIKVLMGLRPEYESVRAALLHR+PL
Sbjct: 121  NLNQEAGQSVNEYLAVLQPIWTQLDQATISKDHLRLIKVLMGLRPEYESVRAALLHRSPL 180

Query: 181  PSLDAAIQEILFEEKRLGINSTKQSDVVLASTYTPNRAANMFCKNCKLSGHKFSNCPKIE 240
            PSLDAAIQEILFEE+RLGIN +K SD VLASTY+P  A++ FCKNCKL+GHKF NCPKIE
Sbjct: 181  PSLDAAIQEILFEERRLGINLSKHSDAVLASTYSPPGASSTFCKNCKLTGHKFINCPKIE 240

Query: 241  CRYCHKHGHILDNCPTRPPRPPGTSTKEKIFTKHGSSSVVAATSDDSSL--IQISDLQSL 300
            CRYCHK GHILDNCP +PPRP   ST+ K FTK  +SS      D S++   QISDLQSL
Sbjct: 241  CRYCHKPGHILDNCPIKPPRPRNYSTRAKNFTKPSNSSAATVAPDKSTIPQFQISDLQSL 300

Query: 301  LNQLISS-SSTLAVSSGNRWLLDSACCNHMTSDSSLMSTSSPTKSLPPIYAADGNCMNIS 360
            LNQLISS SS LAVS GNRWLLDS CCNHMTSD SLM+T SPTKSLPPIYAADGNCMNI+
Sbjct: 301  LNQLISSPSSALAVSPGNRWLLDSGCCNHMTSDYSLMNTPSPTKSLPPIYAADGNCMNIT 360

Query: 361  HTGTIDTPSVHLPHTYCVPNLTFNLVSVGQLCDLGLNVSFSPNGCQVQDPQTGQTIGTGR 420
            H GTI+TPS++LPHTYCVPNLTFNLVSVGQLCDLG  VSFS NGCQVQDPQTGQTIGTGR
Sbjct: 361  HMGTINTPSLNLPHTYCVPNLTFNLVSVGQLCDLGFTVSFSSNGCQVQDPQTGQTIGTGR 420

Query: 421  KVGRLFELTSLRVSSPSSISASVTDSDTYQWHLRLGHASSEKLRHLISVNNLTNLTKFVP 480
            KVGRLFEL SL+V SP SISA VTDSDTYQWHLRLGHAS EKLRHLIS+NNL ++TKFVP
Sbjct: 421  KVGRLFELLSLQVPSP-SISAPVTDSDTYQWHLRLGHASPEKLRHLISINNLNSITKFVP 480

Query: 481  FNCLNCKLAKQPALSFSQSISNCDKPFDLVHSDIWGPAPITTVHGYRYYVLFIDDYSRFT 540
            FNCLNCKLAKQPALSFS S S CDKPFDL+HSDIWGPAP +TVHGYRYYVLFIDD+SRFT
Sbjct: 481  FNCLNCKLAKQPALSFSTSTSICDKPFDLIHSDIWGPAPTSTVHGYRYYVLFIDDFSRFT 540

Query: 541  WIYFLKHRSELSRTYIEFANMIRTQFSSPIKILRTDNVLEYKDSILLSFLSQQGTIVQRS 600
            WIYFLKHRSELSRTYIEFANMIRTQFS PIK LRTDN LEYKDS LLSFLSQQGT+VQRS
Sbjct: 541  WIYFLKHRSELSRTYIEFANMIRTQFSCPIKTLRTDNALEYKDSTLLSFLSQQGTLVQRS 600

Query: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNTSP 660
            CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQN SP
Sbjct: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNISP 660

Query: 661  FEKLYGISPDYSKLKVFGSACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720
            FE+LYG  P+YS LKVFG ACFVLL PHEH KLEPRARLCCFLGYGTEHKGFRCWDPLSN
Sbjct: 661  FERLYGTPPNYSNLKVFGCACFVLLQPHEHTKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720

Query: 721  RLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEPTLDTELAQSSPATA 780
            RLRISRHVTFWEHTMFSRLSSFH SFSSPQSFFT+TS+DLFPLSE T   ELAQS+P +A
Sbjct: 721  RLRISRHVTFWEHTMFSRLSSFHASFSSPQSFFTDTSIDLFPLSESTPGNELAQSAPTSA 780

Query: 781  NLDPPSVFDDVPESPPATPLRRSTRVREPPSHLTHACFVLLHPHEHNKLEPRARLCCFLG 840
              D  S+ D  P+ PP                                            
Sbjct: 781  TSDQSSISDGNPDPPP-------------------------------------------- 840

Query: 841  YGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLS 900
                                                                        
Sbjct: 841  ------------------------------------------------------------ 900

Query: 901  EPTLDTELAQSSPATANLDPPSVSDDVPESSPATPLRRSTRVREPPPHLTDYHCFSTIVS 960
                                     D+P        RRSTRVREPP HL DYHCFSTIVS
Sbjct: 901  -------------------------DIPP-------RRSTRVREPPIHLQDYHCFSTIVS 960

Query: 961  LVEPTSYQEASINPVWQKAMDEELQALEKTHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020
            L+EPTSYQEAS +P+WQKAM++ELQALEK HTWDYVDLPPGKRPIGCKWIYKIKTHSDGT
Sbjct: 961  LIEPTSYQEASTDPLWQKAMNDELQALEKMHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020

Query: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLFQMDVKNAFLNG 1080
            IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPL QMDVKNAFLNG
Sbjct: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLLQMDVKNAFLNG 1080

Query: 1081 NLSEEVYMKPPQGTSPPPNKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSSHDNAL 1140
             LSEEVYMKPP GTS PP+KVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSS HD AL
Sbjct: 1081 TLSEEVYMKPPPGTSSPPHKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSPHDTAL 1140

Query: 1141 FTRQTTHGIVLLLLYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEVSHRS 1200
            FTR T  GIVLLLLYVDDMIITGND  AISDLQ YLGQHFEMKDLGSLNYFLGLEVS RS
Sbjct: 1141 FTRHTPQGIVLLLLYVDDMIITGNDPHAISDLQHYLGQHFEMKDLGSLNYFLGLEVSRRS 1200

Query: 1201 DGYLLSQAKYASDLIARSGITDSTTSSTPLDPHVHLTPFDGVPLDDASLYRQLVGSLIYL 1260
            DGYLLSQAKYASDL+ARSGITDS T+STPLDP+VHLTP+DGVPL++ SLYRQLVGSLIYL
Sbjct: 1201 DGYLLSQAKYASDLLARSGITDSNTASTPLDPNVHLTPYDGVPLENVSLYRQLVGSLIYL 1260

Query: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320
            TVTRPDIAYAVHI                        GTLGHGLQFSSQSSLVLSGYSDA
Sbjct: 1261 TVTRPDIAYAVHI------------------------GTLGHGLQFSSQSSLVLSGYSDA 1310

Query: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVISRSSTESEYRALADATAELIWLRWLL 1380
            DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSV+SRSSTESEYRALADATAEL+WLRWLL
Sbjct: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVVSRSSTESEYRALADATAELLWLRWLL 1310

Query: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFVRHHLLSNTLLLRSVSTI 1433
            ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHF+RHHLLS+TLLL+S+ST 
Sbjct: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFIRHHLLSHTLLLQSISTT 1310

BLAST of CSPI04G12200 vs. NCBI nr
Match: TYK26360.1 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cucumis melo var. makuwa])

HSP 1 Score: 2156.3 bits (5586), Expect = 0.0e+00
Identity = 1108/1471 (75.32%), Positives = 1169/1471 (79.47%), Query Frame = 0

Query: 1    MERDHIFRPISTILDGTNYITWANQMKSFLIGRKLWRIVTGDITKPTA--KEGDNTFIER 60
            ME+  I RPISTILDG+NYITWANQMKSFLIGRKLWRIVTGDITKPT   KE D+ FIER
Sbjct: 1    MEKSEIARPISTILDGSNYITWANQMKSFLIGRKLWRIVTGDITKPTKQDKEDDSKFIER 60

Query: 61   LEDWDSKNHQIITW----------------------------------LAHYYQLHSTLV 120
            LE+WDSKNHQIITW                                  LAHYYQLH++LV
Sbjct: 61   LEEWDSKNHQIITWLSNTSIPAIHTQFDAFENAKELWNFLSTRFKSVGLAHYYQLHNSLV 120

Query: 121  NLNQDIGQSVNEYLAVLQPIWTQLDQANISKDHLRLIKVLMGLRPEYESVRAALLHRNPL 180
            NLNQ+ GQSVNEYLAVLQPIWTQLDQA ISKDHLRLIKVLMGLRPEYESVRAALLHR+PL
Sbjct: 121  NLNQEAGQSVNEYLAVLQPIWTQLDQATISKDHLRLIKVLMGLRPEYESVRAALLHRSPL 180

Query: 181  PSLDAAIQEILFEEKRLGINSTKQSDVVLASTYTPNRAANMFCKNCKLSGHKFSNCPKIE 240
            PSLDAAIQEILFEE+RLGIN +K SD VLASTY+P  A++ FCKNCKL+GHKF NCPKIE
Sbjct: 181  PSLDAAIQEILFEERRLGINLSKHSDAVLASTYSPPGASSTFCKNCKLTGHKFINCPKIE 240

Query: 241  CRYCHKHGHILDNCPTRPPRPPGTSTKEKIFTKHGSSSVVAATSDDSSL--IQISDLQSL 300
            CRYCHK GHILDNCP +PPRP   ST+ K FTK  +SS      D S++   QISDLQSL
Sbjct: 241  CRYCHKPGHILDNCPIKPPRPRNYSTRAKNFTKPSNSSAATVAPDKSTIPQFQISDLQSL 300

Query: 301  LNQLISS-SSTLAVSSGNRWLLDSACCNHMTSDSSLMSTSSPTKSLPPIYAADGNCMNIS 360
            LNQLISS SS LAVS GNRWLLDS CCNHMTSD SLM+T SPTKSLPPIYAADGNCMNI+
Sbjct: 301  LNQLISSPSSALAVSPGNRWLLDSGCCNHMTSDYSLMNTPSPTKSLPPIYAADGNCMNIT 360

Query: 361  HTGTIDTPSVHLPHTYCVPNLTFNLVSVGQLCDLGLNVSFSPNGCQVQDPQTGQTIGTGR 420
            H GTI+TPS++LPHTYCVPNLTFNLVSVGQLCDLG  VSFS NGCQVQDPQTGQTIGTGR
Sbjct: 361  HMGTINTPSLNLPHTYCVPNLTFNLVSVGQLCDLGFTVSFSSNGCQVQDPQTGQTIGTGR 420

Query: 421  KVGRLFELTSLRVSSPSSISASVTDSDTYQWHLRLGHASSEKLRHLISVNNLTNLTKFVP 480
            KVGRLFEL SL+V SP SISA VTDSDTYQWHLRLGHAS EKLRHLIS+NNL ++TKFVP
Sbjct: 421  KVGRLFELLSLQVPSP-SISAPVTDSDTYQWHLRLGHASPEKLRHLISINNLNSITKFVP 480

Query: 481  FNCLNCKLAKQPALSFSQSISNCDKPFDLVHSDIWGPAPITTVHGYRYYVLFIDDYSRFT 540
            FNCLNCKLAKQPALSFS S S CDKPFDL+HSDIWGPAP +TVHGYRYYVLFIDD+SRFT
Sbjct: 481  FNCLNCKLAKQPALSFSTSTSICDKPFDLIHSDIWGPAPTSTVHGYRYYVLFIDDFSRFT 540

Query: 541  WIYFLKHRSELSRTYIEFANMIRTQFSSPIKILRTDNVLEYKDSILLSFLSQQGTIVQRS 600
            WIYFLKHRSELSRTYIEFANMIRTQFS PIK LRTDN LEYKDS LLSFLSQQGT+VQRS
Sbjct: 541  WIYFLKHRSELSRTYIEFANMIRTQFSCPIKTLRTDNALEYKDSTLLSFLSQQGTLVQRS 600

Query: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNTSP 660
            CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQN SP
Sbjct: 601  CPHTSQQNGRAERKHRHILDSVRALLLSASCPEKFWGEAALTSVYTINRLPSSVLQNISP 660

Query: 661  FEKLYGISPDYSKLKVFGSACFVLLHPHEHNKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720
            FE+LYG  P+YS LKVFG ACFVLL PHEH KLEPRARLCCFLGYGTEHKGFRCWDPLSN
Sbjct: 661  FERLYGTPPNYSNLKVFGCACFVLLQPHEHTKLEPRARLCCFLGYGTEHKGFRCWDPLSN 720

Query: 721  RLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLSEPTLDTELAQSSPATA 780
            RLRISRHVTFWEHTMFSRLSSFH SFSSPQSFFT+TS+DLFPLSE T   ELAQS+P +A
Sbjct: 721  RLRISRHVTFWEHTMFSRLSSFHASFSSPQSFFTDTSIDLFPLSESTPGNELAQSAPTSA 780

Query: 781  NLDPPSVFDDVPESPPATPLRRSTRVREPPSHLTHACFVLLHPHEHNKLEPRARLCCFLG 840
              D  S+ D  P+ PP                                            
Sbjct: 781  TSDQSSISDGNPDPPP-------------------------------------------- 840

Query: 841  YGTEHKGFRCWDPLSNRLRISRHVTFWEHTMFSRLSSFHTSFSSPQSFFTNTSVDLFPLS 900
                                                                        
Sbjct: 841  ------------------------------------------------------------ 900

Query: 901  EPTLDTELAQSSPATANLDPPSVSDDVPESSPATPLRRSTRVREPPPHLTDYHCFSTIVS 960
                                     D+P        RRSTRVREPP HL DYHCFSTIVS
Sbjct: 901  -------------------------DIPP-------RRSTRVREPPIHLQDYHCFSTIVS 960

Query: 961  LVEPTSYQEASINPVWQKAMDEELQALEKTHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020
            L+EPTSYQEAS +P+WQKAM++ELQALEK HTWDYVDLPPGKRPIGCKWIYKIKTHSDGT
Sbjct: 961  LIEPTSYQEASTDPLWQKAMNDELQALEKMHTWDYVDLPPGKRPIGCKWIYKIKTHSDGT 1020

Query: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLFQMDVKNAFLNG 1080
            IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPL QMDVKNAFLNG
Sbjct: 1021 IERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVAAAKQWPLLQMDVKNAFLNG 1080

Query: 1081 NLSEEVYMKPPQGTSPPPNKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSSHDNAL 1140
             LSEEVYMKPP GTS PP+KVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSS HD AL
Sbjct: 1081 TLSEEVYMKPPPGTSSPPHKVCLLRRALYGLKQAPRAWFATFSSTITQLGFTSSPHDTAL 1140

Query: 1141 FTRQTTHGIVLLLLYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEVSHRS 1200
            FTR T  GIVLLLLYVDDMIITGND  AISDLQ YLGQHFEMKDLGS NYFLGLEVS RS
Sbjct: 1141 FTRHTPQGIVLLLLYVDDMIITGNDPHAISDLQHYLGQHFEMKDLGSFNYFLGLEVSRRS 1200

Query: 1201 DGYLLSQAKYASDLIARSGITDSTTSSTPLDPHVHLTPFDGVPLDDASLYRQLVGSLIYL 1260
            DGYLLSQAKYASDL+ARSGITDS T+STPLDP+VHLTP+DGVPL++ SLYRQLVGSLIYL
Sbjct: 1201 DGYLLSQAKYASDLLARSGITDSNTASTPLDPNVHLTPYDGVPLENVSLYRQLVGSLIYL 1260

Query: 1261 TVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDA 1320
            TVTRPDIAYAVHI                        GTLGHGLQFSSQSSLVLSGYSDA
Sbjct: 1261 TVTRPDIAYAVHI------------------------GTLGHGLQFSSQSSLVLSGYSDA 1310

Query: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVISRSSTESEYRALADATAELIWLRWLL 1380
            DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSV+SRSSTESEYRALADATAEL+WLRWLL
Sbjct: 1321 DWAGDPTDRRSTTGYCFYLGDSLISWRSKKQSVVSRSSTESEYRALADATAELLWLRWLL 1310

Query: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFVRHHLLSNTLLLRSVSTI 1433
            ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHF+RHHLLS+TLLL+S+ST 
Sbjct: 1381 ADMGVPQQGPTLLHCDNRSAIQIAHNDVFHERTKHIENDCHFIRHHLLSHTLLLQSISTT 1310

BLAST of CSPI04G12200 vs. TAIR 10
Match: AT4G23160.1 (cysteine-rich RLK (RECEPTOR-like protein kinase) 8 )

HSP 1 Score: 500.7 bits (1288), Expect = 3.8e-141
Identity = 268/580 (46.21%), Positives = 364/580 (62.76%), Query Frame = 0

Query: 866  DTELAQSSPATANLDPPSVSDDVPESSPATPLRRSTRVREPPPHLTDYHCFST------- 925
            D + + SS +   +   ++ +DVPE S  T  RR+ +    P +L DY+C S        
Sbjct: 4    DADASTSSSSIDIMPSANIQNDVPEPSVHTSHRRTRK----PAYLQDYYCHSVASLTIHD 63

Query: 926  --------------------IVSLVEPTSYQEASINPVWQKAMDEELQALEKTHTWDYVD 985
                                I    EP++Y EA    VW  AMD+E+ A+E THTW+   
Sbjct: 64   ISQFLSYEKVSPLYHSFLVCIAKAKEPSTYNEAKEFLVWCGAMDDEIGAMETTHTWEICT 123

Query: 986  LPPGKRPIGCKWIYKIKTHSDGTIERYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSL 1045
            LPP K+PIGCKW+YKIK +SDGTIERYKARLVAKGY+Q+ GID+ ETF+PV ++TSV+ +
Sbjct: 124  LPPNKKPIGCKWVYKIKYNSDGTIERYKARLVAKGYTQQEGIDFIETFSPVCKLTSVKLI 183

Query: 1046 LAVAAAKQWPLFQMDVKNAFLNGNLSEEVYMKPP------QGTSPPPNKVCLLRRALYGL 1105
            LA++A   + L Q+D+ NAFLNG+L EE+YMK P      QG S PPN VC L++++YGL
Sbjct: 184  LAISAIYNFTLHQLDISNAFLNGDLDEEIYMKLPPGYAARQGDSLPPNAVCYLKKSIYGL 243

Query: 1106 KQAPRAWFATFSSTITQLGFTSSSHDNALFTRQTTHGIVLLLLYVDDMIITGNDQQAISD 1165
            KQA R WF  FS T+   GF  S  D+  F + T    + +L+YVDD+II  N+  A+ +
Sbjct: 244  KQASRQWFLKFSVTLIGFGFVQSHSDHTYFLKITATLFLCVLVYVDDIIICSNNDAAVDE 303

Query: 1166 LQQYLGQHFEMKDLGSLNYFLGLEVSHRSDGYLLSQAKYASDLIARSGITDSTTSSTPLD 1225
            L+  L   F+++DLG L YFLGLE++  + G  + Q KYA DL+  +G+     SS P+D
Sbjct: 304  LKSQLKSCFKLRDLGPLKYFLGLEIARSAAGINICQRKYALDLLDETGLLGCKPSSVPMD 363

Query: 1226 PHVHLTPFDGVPLDDASLYRQLVGSLIYLTVTRPDIAYAVHIVSQFMAAPRTIHFTAVLR 1285
            P V  +   G    DA  YR+L+G L+YL +TR DI++AV+ +SQF  APR  H  AV++
Sbjct: 364  PSVTFSAHSGGDFVDAKAYRRLIGRLMYLQITRLDISFAVNKLSQFSEAPRLAHQQAVMK 423

Query: 1286 ILRYVKGTLGHGLQFSSQSSLVLSGYSDADWAGDPTDRRSTTGYCFYLGDSLISWRSKKQ 1345
            IL Y+KGT+G GL +SSQ+ + L  +SDA +      RRST GYC +LG SLISW+SKKQ
Sbjct: 424  ILHYIKGTVGQGLFYSSQAEMQLQVFSDASFQSCKDTRRSTNGYCMFLGTSLISWKSKKQ 483

Query: 1346 SVISRSSTESEYRALADATAELIWLRWLLADMGVPQQGPTLLHCDNRSAIQIAHNDVFHE 1405
             V+S+SS E+EYRAL+ AT E++WL     ++ +P   PTLL CDN +AI IA N VFHE
Sbjct: 484  QVVSKSSAEAEYRALSFATDEMMWLAQFFRELQLPLSKPTLLFCDNTAAIHIATNAVFHE 543

Query: 1406 RTKHIENDCHFVRHHLLSNTLLLRSVSTIEQPADIFTKAL 1413
            RTKHIE+DCH VR   +    L  S    ++  D FT+ L
Sbjct: 544  RTKHIESDCHSVRERSVYQATLSYSFQAYDE-QDGFTEYL 578
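The HSP summary lines above report match counts alongside percentages, so the percentages can be recomputed from the counts as a consistency check. The following is a minimal sketch, not part of the database's own tooling: the function name `parse_hsp` and the dictionary keys are hypothetical, and the regular expressions assume the line layout shown on this page (including the page's variant spelling of the positives label).

```python
import re

# Patterns assume the two-line HSP summary layout used on this page.
HSP_SCORE = re.compile(r"Score:\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)")
# "Posi?tives" accepts both "Positives" and the spelling "Postives".
HSP_COUNTS = re.compile(r"(Identity|Posi?tives)\s*=\s*(\d+)/(\d+)\s*\(([\d.]+)%\)")


def parse_hsp(score_line, stats_line):
    """Parse one HSP summary (hypothetical helper) and recompute percentages."""
    match = HSP_SCORE.search(score_line)
    if match is None:
        raise ValueError(f"not an HSP score line: {score_line!r}")
    hsp = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "expect": float(match.group(3)),
    }
    for label, hits, length, pct in HSP_COUNTS.findall(stats_line):
        key = "identity" if label == "Identity" else "positives"
        hsp[key] = {
            "count": int(hits),
            "length": int(length),
            "reported_pct": float(pct),
            # Percentage recomputed from the counts, rounded like the report.
            "computed_pct": round(100 * int(hits) / int(length), 2),
        }
    return hsp


hsp = parse_hsp(
    "HSP 1 Score: 500.7 bits (1288), Expect = 3.8e-141",
    "Identity = 268/580 (46.21%), Postives = 364/580 (62.76%), Query Frame = 0",
)
print(hsp["identity"], hsp["positives"])
```

For the AT4G23160.1 HSP, the recomputed values agree with the reported 46.21% identity and 62.76% positives over the 580-column alignment.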

BLAST of CSPI04G12200 vs. TAIR 10
Match: ATMG00810.1 (DNA/RNA polymerases superfamily protein )

HSP 1 Score: 199.9 bits (507), Expect = 1.4e-50
Identity = 105/224 (46.88%), Positives = 144/224 (64.29%), Query Frame = 0

Query: 1113 LLLYVDDMIITGNDQQAISDLQQYLGQHFEMKDLGSLNYFLGLEVSHRSDGYLLSQAKYA 1172
            LLLYVDD+++TG+    ++ L   L   F MKDLG ++YFLG+++     G  LSQ KYA
Sbjct: 3    LLLYVDDILLTGSSNTLLNMLIFQLSSTFSMKDLGPVHYFLGIQIKTHPSGLFLSQTKYA 62

Query: 1173 SDLIARSGITDSTTSSTPLDPHVHLTPFDGVPLDDASLYRQLVGSLIYLTVTRPDIAYAV 1232
              ++  +G+ D    STPL   ++ +        D S +R +VG+L YLT+TRPDI+YAV
Sbjct: 63   EQILNNAGMLDCKPMSTPLPLKLN-SSVSTAKYPDPSDFRSIVGALQYLTLTRPDISYAV 122

Query: 1233 HIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGYSDADWAGDPTDRRS 1292
            +IV Q M  P    F  + R+LRYVKGT+ HGL     S L +  + D+DWAG  + RRS
Sbjct: 123  NIVCQRMHEPTLADFDLLKRVLRYVKGTIFHGLYIHKNSKLNVQAFCDSDWAGCTSTRRS 182

Query: 1293 TTGYCFYLGDSLISWRSKKQSVISRSSTESEYRALADATAELIW 1337
            TTG+C +LG ++ISW +K+Q  +SRSSTE+EYRALA   AEL W
Sbjct: 183  TTGFCTFLGCNIISWSAKRQPTVSRSSTETEYRALALTAAELTW 225

BLAST of CSPI04G12200 vs. TAIR 10
Match: ATMG00820.1 (Reverse transcriptase (RNA-dependent DNA polymerase) )

HSP 1 Score: 105.1 bits (261), Expect = 4.6e-22
Identity = 50/99 (50.51%), Positives = 67/99 (67.68%), Query Frame = 0

Query: 924  EPTSYQEASINPVWQKAMDEELQALEKTHTWDYVDLPPGKRPIGCKWIYKIKTHSDGTIE 983
            EP S   A  +P W +AM EEL AL +  TW  V  P  +  +GCKW++K K HSDGT++
Sbjct: 27   EPKSVIFALKDPGWCQAMQEELDALSRNKTWILVPPPVNQNILGCKWVFKTKLHSDGTLD 86

Query: 984  RYKARLVAKGYSQEYGIDYEETFAPVARMTSVRSLLAVA 1023
            R KARLVAKG+ QE GI + ET++PV R  ++R++L VA
Sbjct: 87   RLKARLVAKGFHQEEGIYFVETYSPVVRTATIRTILNVA 125

BLAST of CSPI04G12200 vs. TAIR 10
Match: ATMG00240.1 (Gag-Pol-related retrotransposon family protein )

HSP 1 Score: 90.1 bits (222), Expect = 1.5e-17
Identity = 41/79 (51.90%), Positives = 57/79 (72.15%), Query Frame = 0

Query: 1219 IYLTVTRPDIAYAVHIVSQFMAAPRTIHFTAVLRILRYVKGTLGHGLQFSSQSSLVLSGY 1278
            +YLT+TRPD+ +AV+ +SQF +A RT    AV ++L YVKGT+G GL +S+ S L L  +
Sbjct: 1    MYLTITRPDLTFAVNRLSQFSSASRTAQMQAVYKVLHYVKGTVGQGLFYSATSDLQLKAF 60

Query: 1279 SDADWAGDPTDRRSTTGYC 1298
            +D+DWA  P  RRS TG+C
Sbjct: 61   ADSDWASCPDTRRSVTGFC 79

The following BLAST results are available for this feature:
Match Name | E-value | Identity (%) | Description
Q94HW2 | 3.2e-201 | 33.87 | Retrovirus-related Pol polyprotein from transposon RE1 OS=Arabidopsis thaliana O...
Q9ZT94 | 1.2e-200 | 33.42 | Retrovirus-related Pol polyprotein from transposon RE2 OS=Arabidopsis thaliana O...
P10978 | 9.4e-145 | 29.06 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum...
P04146 | 7.2e-137 | 27.44 | Copia protein OS=Drosophila melanogaster OX=7227 GN=GIP PE=1 SV=3
P92519 | 1.9e-49 | 46.88 | Uncharacterized mitochondrial protein AtMg00810 OS=Arabidopsis thaliana OX=3702 ...

Match Name | E-value | Identity (%) | Description
A0A5A7VIT8 | 0.0e+00 | 77.09 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cucumis melo var....
A0A5A7T8F2 | 0.0e+00 | 77.02 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cucumis melo var....
A0A5D3C0D7 | 0.0e+00 | 77.02 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cucumis melo var....
A0A5D3D7V8 | 0.0e+00 | 75.46 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cucumis melo var....
A0A5D3DT06 | 0.0e+00 | 75.32 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cucumis melo var....

Match Name | E-value | Identity (%) | Description
KAA0041601.1 | 0.0e+00 | 77.09 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cucumis melo var. m...
KAA0037745.1 | 0.0e+00 | 77.02 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cucumis melo var. m...
TYK04714.1 | 0.0e+00 | 77.02 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cucumis melo var. m...
TYK19656.1 | 0.0e+00 | 75.46 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cucumis melo var. m...
TYK26360.1 | 0.0e+00 | 75.32 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cucumis melo var. m...

Match Name | E-value | Identity (%) | Description
AT4G23160.1 | 3.8e-141 | 46.21 | cysteine-rich RLK (RECEPTOR-like protein kinase) 8
ATMG00810.1 | 1.4e-50 | 46.88 | DNA/RNA polymerases superfamily protein
ATMG00820.1 | 4.6e-22 | 50.51 | Reverse transcriptase (RNA-dependent DNA polymerase)
ATMG00240.1 | 1.5e-17 | 51.90 | Gag-Pol-related retrotransposon family protein
InterPro
Analysis Name: InterPro Annotations of Cucumber (PI 183967) v1
Date Performed: 2021-10-25
IPR Term | IPR Description | Source | Source Term | Source Description | Alignment
IPR001878 | Zinc finger, CCHC-type | SMART | SM00343 | c2hcfinal6 | coord: 186..202; e-value: 0.38; score: 13.8
IPR001878 | Zinc finger, CCHC-type | SMART | SM00343 | c2hcfinal6 | coord: 204..220; e-value: 0.098; score: 17.4
IPR036397 | Ribonuclease H superfamily | GENE3D | 3.30.420.10 | - | coord: 459..638; e-value: 8.7E-37; score: 128.3
IPR025314 | Domain of unknown function DUF4219 | PFAM | PF13961 | DUF4219 | coord: 14..39; e-value: 1.9E-6; score: 27.5
IPR001584 | Integrase, catalytic core | PFAM | PF00665 | rve | coord: 466..565; e-value: 7.3E-9; score: 35.9
IPR001584 | Integrase, catalytic core | PROSITE | PS50994 | INTEGRASE | coord: 463..629; score: 18.886889
IPR025724 | GAG-pre-integrase domain | PFAM | PF13976 | gag_pre-integrs | coord: 391..452; e-value: 2.0E-10; score: 40.4
IPR013103 | Reverse transcriptase, RNA-dependent DNA polymerase | PFAM | PF07727 | RVT_2 | coord: 952..1191; e-value: 2.1E-75; score: 253.5
None | No IPR available | GENE3D | 4.10.60.10 | - | coord: 165..224; e-value: 5.0E-7; score: 31.4
None | No IPR available | MOBIDB_LITE | mobidb-lite | disorder_prediction | coord: 733..768
None | No IPR available | MOBIDB_LITE | mobidb-lite | disorder_prediction | coord: 869..909
None | No IPR available | PANTHER | PTHR45895 | FAMILY NOT NAMED | coord: 786..1310
None | No IPR available | PANTHER | PTHR45895 | FAMILY NOT NAMED | coord: 336..701
None | No IPR available | CDD | cd09272 | RNase_HI_RT_Ty1 | coord: 1277..1414; e-value: 1.01713E-77; score: 250.848
IPR036875 | Zinc finger, CCHC-type superfamily | SUPERFAMILY | 57756 | Retrovirus zinc finger-like domains | coord: 179..222
IPR012337 | Ribonuclease H-like superfamily | SUPERFAMILY | 53098 | Ribonuclease H-like | coord: 465..638
IPR043502 | DNA/RNA polymerase superfamily | SUPERFAMILY | 56672 | DNA/RNA polymerases | coord: 951..1382
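
The `coord` fields in the InterPro annotations above give the position of each domain hit on the predicted protein. As a rough illustration, the sketch below parses a few of those fields, computes each span's length, and lists the hits in order along the sequence. It assumes the coordinates are 1-based, inclusive residue positions (the page does not state this), and the helper names `parse_coord`, `span_length`, and `ordered_by_start` are hypothetical.

```python
import re

COORD = re.compile(r"coord:\s*(\d+)\.\.(\d+)")


def parse_coord(text):
    """Return (start, end) from a 'coord: start..end' field."""
    match = COORD.search(text)
    if match is None:
        raise ValueError(f"no coord field in {text!r}")
    return int(match.group(1)), int(match.group(2))


def span_length(text):
    """Length of the span, assuming 1-based inclusive coordinates."""
    start, end = parse_coord(text)
    return end - start + 1


def ordered_by_start(hits):
    """Names of the hits sorted by start coordinate."""
    return sorted(hits, key=lambda name: parse_coord(hits[name])[0])


# A subset of the Pfam and CDD hits listed for this gene.
PFAM_CDD_HITS = {
    "PF07727 RVT_2": "coord: 952..1191",
    "PF00665 rve": "coord: 466..565",
    "cd09272 RNase_HI_RT_Ty1": "coord: 1277..1414",
    "PF13976 gag_pre-integrs": "coord: 391..452",
}

for name in ordered_by_start(PFAM_CDD_HITS):
    print(name, PFAM_CDD_HITS[name], span_length(PFAM_CDD_HITS[name]))
```

Sorted this way, the hits run from the GAG-pre-integrase domain through the integrase core and reverse transcriptase to the RNase H domain.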

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name | Unique Name | Type
CSPI04G12200.1 | CSPI04G12200.1 | mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category | Term Accession | Term Name
biological_process | GO:0015074 | DNA integration
molecular_function | GO:0003676 | nucleic acid binding
molecular_function | GO:0008270 | zinc ion binding
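
For downstream use, the GO rows above can be grouped by category. This is a minimal sketch under the assumption that each row holds a category, an accession, and a term name in that order; the function name `group_go_terms` is hypothetical, and the parser tolerates either whitespace or `|` as the field separator.

```python
from collections import defaultdict


def group_go_terms(lines):
    """Group 'category accession name' rows by GO category."""
    grouped = defaultdict(list)
    for line in lines:
        # Treat '|' like whitespace so both row layouts parse the same way.
        category, accession, name = line.replace("|", " ").split(maxsplit=2)
        grouped[category].append((accession, name.strip()))
    return dict(grouped)


GO_ROWS = [
    "biological_process GO:0015074 DNA integration",
    "molecular_function GO:0003676 nucleic acid binding",
    "molecular_function GO:0008270 zinc ion binding",
]

for category, terms in group_go_terms(GO_ROWS).items():
    print(category, terms)
```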