CmoCh04G021120 (gene) Cucurbita moschata (Rifu)

Name: CmoCh04G021120
Type: gene
Organism: Cucurbita moschata (Cucurbita moschata (Rifu))
Description: DNA/RNA polymerases superfamily protein, putative
Location: Cmo_Chr04 : 13609323 .. 13611246 (+)
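
The location field can be read programmatically. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (the usual GFF-style convention, which this page does not state explicitly); the function name `parse_location` and the returned dictionary keys are illustrative, not part of any database API.

```python
import re

# Matches strings such as "Cmo_Chr04 : 13609323 .. 13611246 (+)".
LOCATION = re.compile(r"(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)")


def parse_location(text):
    """Split a location string into sequence id, start, end, strand and length."""
    match = LOCATION.search(text)
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    start, end = int(match.group(2)), int(match.group(3))
    return {
        "seqid": match.group(1),
        "start": start,
        "end": end,
        "strand": match.group(4),
        # Assumption: both coordinates are 1-based and inclusive.
        "length": end - start + 1,
    }


loc = parse_location("Cmo_Chr04 : 13609323 .. 13611246 (+)")
print(loc["seqid"], loc["strand"], loc["length"])
```

Under that assumption the feature spans 1924 bases on the forward strand.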
The following sequences are available for this feature:

Gene sequence (with intron)

ATGTCTTTTGGCCTTACCAATGCACCAGCTGTTTTCATGGAACTAATGAACAGGGTGTTTAAGGAATTCTTAGACACCTTTGTCTTAGTGTTCATCGACGACATTCTGGTATACTCTAAGTCAGAGGCAGATCATGAAATACACCTCAGAAAAGTCTTGACAATACTAAGAGCTCAGCAGTTGTATGCCAAGTTCTCTAAGTGTGAGTTTTGGTTGTCTGAAGTTGCGTTTCTGGGTCACGTGGTGTCAAGCAAGGGGATCACAGTGGACCCAGCTAAGATAGAAGCAGTGATGAGGTGGCCACAGCCGACCACAGTCACAGAGGTGAGGAGTTTTCTTGGGCTAGCTGGTTGTTACAGAAGGTTTGTTCAGGATTTCTCCAAAATTTCCTCGGCGCTGACTCAGCTAACCAAGAAGGGCAAGCCCTTTGCTTGGACTCCAGTCTGTGAACAGAGTTTCCAGGAACTCAAGAAGAGGTTGGTAACTGCACCAGTCCTTACGGTTCCAGATGGGTCAGGTAATCTCGTGGTGTACAGTGATGCATCAGGGAAAGGCTTGGGGTGTGTGCTCATGCAGAAAGGTAAGGTGATAGCGTATGCTTCTCGACAATTGAAAGAATATGAACGAAACTACCCCACGCATGATCTCGAGTTAGCAGCGGTAGTATTCGCTCTAAAAACGTGGCGGCACTACCTGTATGGGGAAAAAGTACAAGTCTTCACTGATCATAAGAGCCTCAAGTACTTATTCACGCAGAAGGAGCTCAATATGAGACAGAGGCGATGGTTGGAGCTGGTAAAGGATTATGACATAGAGATTCTGTACCATCCAGGCAAAGCCAGCGTGGTAGCTGATGCATTGAGCAGGAAGGCTGTGCATACTTCTGTGATGATCACCACACAGGAAAAACTACAAGATGAGATGAAGAGGGCTGGGATAGACGTGGTGATTAAAGGTGGTAATGTTCAGATAGCACAGTTAACTATACAGCCTACCCTACGAAAGAAAGTTATCGACGCTCAGAGGTCTGATGAACACCTCAGTAAAGTGTGGAGTCAGATTGAGACAGAGAGGCCAGTAGGGTATTCTATCTCCTCAGACGGGGGTCTGCTATGGCAAAACCGCCTGTGCGTTCCCCGAGACGAGGGAATCTTAAAAGATATTATGACCGAAGCCCACGATACATCTTATATGTTCCACCCTGGAAGTACAAAGATGTATCAGGATCTGAAGAGGTTTTACTGGTGGTCCGGAATGAAGAGGGACATAGCGGATTTCGTAAGCCGTTGCTTGACCTGCCAGCAGGTGAAGGCCCCGAGGCAGCGCCCAGCGGGATTGCTACAGCCCCTGAGCGTCCCTCGTGTGGAAATGGGAAGCAGTCTGTATGGATTTCTTTTCGGGTTTGCCAAAGACAAAGCAGGGTTTCAACGTCGTATGGGTAATTGTAGACAGACTGACTAAGACAGCCCACTTCATTCCAGGAAAGTTCACGTATCGAGTAGACCGGTGGGCTCAGTTATATATCAAGGAGATAGTACGCCTGCACGGGGTACCAGTGTCCATAGTATCAGACCGGGACACCAGGTTCACCTCTCAGTTCTGGAGGAGTCTTCAGAAGGCACTAGGAACTCAGTTGAGGTTCAGTACAGCATTCCATCCTCAAACGGACGGACAGACCGAAAGGCTGAATCAGGTTTTAGAGGACATGTTGCGAGCCTGCTCCTTAGATTTCGCTGGGTGTTGGGACGAACATCTGCCTTTAATGGAGTTTGCCTACAACAATAGTTATCAAGCGACCATTCAGATGGCCCCCTTCGAGGCACTGTATGGGCGTAGGTGTCGAACACCAGTGTTTTGGGAAGAGGTAGGCACGCAGCAACTAATGGGACCAGAGTTGGTCCAGGTCACCAACGCAGCGGTGTAG

mRNA sequence

ATGTCTTTTGGCCTTACCAATGCACCAGCTGTTTTCATGGAACTAATGAACAGGGTGTTTAAGGAATTCTTAGACACCTTTGTCTTAGTGTTCATCGACGACATTCTGGTATACTCTAAGTCAGAGGCAGATCATGAAATACACCTCAGAAAAGTCTTGACAATACTAAGAGCTCAGCAGTTGTATGCCAAGTTCTCTAAGTGTGAGTTTTGGTTGTCTGAAGTTGCGTTTCTGGGTCACGTGGTGTCAAGCAAGGGGATCACAGTGGACCCAGCTAAGATAGAAGCAGTGATGAGGTGGCCACAGCCGACCACAGTCACAGAGGTGAGGAGTTTTCTTGGGCTAGCTGGTTGTTACAGAAGGTTTGTTCAGGATTTCTCCAAAATTTCCTCGGCGCTGACTCAGCTAACCAAGAAGGGCAAGCCCTTTGCTTGGACTCCAGTCTGTGAACAGAGTTTCCAGGAACTCAAGAAGAGGTTGGTAACTGCACCAGTCCTTACGGTTCCAGATGGGTCAGGTAATCTCGTGGTGTACAGTGATGCATCAGGGAAAGGCTTGGGGTGTGTGCTCATGCAGAAAGGTAAGGTGATAGCGTATGCTTCTCGACAATTGAAAGAATATGAACGAAACTACCCCACGCATGATCTCGAGTTAGCAGCGGTAGTATTCGCTCTAAAAACGTGGCGGCACTACCTGTATGGGGAAAAAGTACAAGTCTTCACTGATCATAAGAGCCTCAAGTACTTATTCACGCAGAAGGAGCTCAATATGAGACAGAGGCGATGGTTGGAGCTGGTAAAGGATTATGACATAGAGATTCTGTACCATCCAGGCAAAGCCAGCGTGGTAGCTGATGCATTGAGCAGGAAGGCTGTGCATACTTCTGTGATGATCACCACACAGGAAAAACTACAAGATGAGATGAAGAGGGCTGGGATAGACGTGGTGATTAAAGGTGGTAATGTTCAGATAGCACAGTTAACTATACAGCCTACCCTACGAAAGAAAGTTATCGACGCTCAGAGGTCTGATGAACACCTCAGTAAAGTGTGGAGTCAGATTGAGACAGAGAGGCCAGTAGGGTATTCTATCTCCTCAGACGGGGGTCTGCTATGGCAAAACCGCCTGTGCGTTCCCCGAGACGAGGGAATCTTAAAAGATATTATGACCGAAGCCCACGATACATCTTATATGTTCCACCCTGGAAGTACAAAGATGTATCAGGATCTGAAGAGGTTTTACTGGTGGTCCGGAATGAAGAGGGACATAGCGGATTTCGTAAGCCGTTGCTTGACCTGCCAGCAGGTGAAGGCCCCGAGGCAGCGCCCAGCGGGATTGCTACAGCCCCTGAGCGTCCCTCGTGTGGAAATGGGAAGCAGTCTGTATGGATTTCTTTTCGGGTTTGCCAAAGACAAAGCAGGGTTTCAACGTCGTATGGTGTCCATAGTATCAGACCGGGACACCAGGTTCACCTCTCAGTTCTGGAGGAGTCTTCAGAAGGCACTAGGAACTCAGTTGAGGTTCAGTACAGCATTCCATCCTCAAACGGACGGACAGACCGAAAGGCTGAATCAGGTTTTAGAGGACATGTTGCGAGCCTGCTCCTTAGATTTCGCTGGGTGTTGGGACGAACATCTGCCTTTAATGGAGTTTGCCTACAACAATAGTTATCAAGCGACCATTCAGATGGCCCCCTTCGAGGCACTGTATGGGCGTAGGTGTCGAACACCAGTGTTTTGGGAAGAGGTAGGCACGCAGCAACTAATGGGACCAGAGTTGGTCCAGGTCACCAACGCAGCGGTGTAG
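
The gene and mRNA sequences above appear to differ by a single internal block, which would be the intron. A minimal sketch of how that block could be located is given below, assuming exactly one intron and an mRNA that otherwise matches the gene base for base; `infer_single_intron` is a hypothetical helper, and the short sequences are toy data rather than the sequences on this page.

```python
def infer_single_intron(gene, mrna):
    """Return (start, end) of the one block present in gene but not in mrna.

    Coordinates are 0-based and half-open, so gene[start:end] is the intron.
    Assumes a single intron and no other differences between the sequences.
    """
    if len(mrna) > len(gene):
        raise ValueError("mRNA is longer than the gene sequence")
    # Extend the shared prefix as far as it goes.
    prefix = 0
    while prefix < len(mrna) and gene[prefix] == mrna[prefix]:
        prefix += 1
    # Whatever is left of the mRNA must match the tail of the gene.
    suffix = len(mrna) - prefix
    if gene[len(gene) - suffix:] != mrna[prefix:]:
        raise ValueError("sequences differ by more than one internal block")
    return prefix, len(gene) - suffix


# Toy data: two exons flanking a GT...AG intron.
toy_gene = "ATGGCC" + "GTAAGTTTTCAG" + "TTTTAA"
toy_mrna = "ATGGCC" + "TTTTAA"
start, end = infer_single_intron(toy_gene, toy_mrna)
print(start, end, toy_gene[start:end])
```

When bases at an exon boundary also occur at the edge of the intron, the split point is ambiguous; this sketch resolves it by extending the left exon as far as possible.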

Coding sequence (CDS)

ATGTCTTTTGGCCTTACCAATGCACCAGCTGTTTTCATGGAACTAATGAACAGGGTGTTTAAGGAATTCTTAGACACCTTTGTCTTAGTGTTCATCGACGACATTCTGGTATACTCTAAGTCAGAGGCAGATCATGAAATACACCTCAGAAAAGTCTTGACAATACTAAGAGCTCAGCAGTTGTATGCCAAGTTCTCTAAGTGTGAGTTTTGGTTGTCTGAAGTTGCGTTTCTGGGTCACGTGGTGTCAAGCAAGGGGATCACAGTGGACCCAGCTAAGATAGAAGCAGTGATGAGGTGGCCACAGCCGACCACAGTCACAGAGGTGAGGAGTTTTCTTGGGCTAGCTGGTTGTTACAGAAGGTTTGTTCAGGATTTCTCCAAAATTTCCTCGGCGCTGACTCAGCTAACCAAGAAGGGCAAGCCCTTTGCTTGGACTCCAGTCTGTGAACAGAGTTTCCAGGAACTCAAGAAGAGGTTGGTAACTGCACCAGTCCTTACGGTTCCAGATGGGTCAGGTAATCTCGTGGTGTACAGTGATGCATCAGGGAAAGGCTTGGGGTGTGTGCTCATGCAGAAAGGTAAGGTGATAGCGTATGCTTCTCGACAATTGAAAGAATATGAACGAAACTACCCCACGCATGATCTCGAGTTAGCAGCGGTAGTATTCGCTCTAAAAACGTGGCGGCACTACCTGTATGGGGAAAAAGTACAAGTCTTCACTGATCATAAGAGCCTCAAGTACTTATTCACGCAGAAGGAGCTCAATATGAGACAGAGGCGATGGTTGGAGCTGGTAAAGGATTATGACATAGAGATTCTGTACCATCCAGGCAAAGCCAGCGTGGTAGCTGATGCATTGAGCAGGAAGGCTGTGCATACTTCTGTGATGATCACCACACAGGAAAAACTACAAGATGAGATGAAGAGGGCTGGGATAGACGTGGTGATTAAAGGTGGTAATGTTCAGATAGCACAGTTAACTATACAGCCTACCCTACGAAAGAAAGTTATCGACGCTCAGAGGTCTGATGAACACCTCAGTAAAGTGTGGAGTCAGATTGAGACAGAGAGGCCAGTAGGGTATTCTATCTCCTCAGACGGGGGTCTGCTATGGCAAAACCGCCTGTGCGTTCCCCGAGACGAGGGAATCTTAAAAGATATTATGACCGAAGCCCACGATACATCTTATATGTTCCACCCTGGAAGTACAAAGATGTATCAGGATCTGAAGAGGTTTTACTGGTGGTCCGGAATGAAGAGGGACATAGCGGATTTCGTAAGCCGTTGCTTGACCTGCCAGCAGGTGAAGGCCCCGAGGCAGCGCCCAGCGGGATTGCTACAGCCCCTGAGCGTCCCTCGTGTGGAAATGGGAAGCAGTCTGTATGGATTTCTTTTCGGGTTTGCCAAAGACAAAGCAGGGTTTCAACGTCGTATGGTGTCCATAGTATCAGACCGGGACACCAGGTTCACCTCTCAGTTCTGGAGGAGTCTTCAGAAGGCACTAGGAACTCAGTTGAGGTTCAGTACAGCATTCCATCCTCAAACGGACGGACAGACCGAAAGGCTGAATCAGGTTTTAGAGGACATGTTGCGAGCCTGCTCCTTAGATTTCGCTGGGTGTTGGGACGAACATCTGCCTTTAATGGAGTTTGCCTACAACAATAGTTATCAAGCGACCATTCAGATGGCCCCCTTCGAGGCACTGTATGGGCGTAGGTGTCGAACACCAGTGTTTTGGGAAGAGGTAGGCACGCAGCAACTAATGGGACCAGAGTTGGTCCAGGTCACCAACGCAGCGGTGTAG
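
The coding sequence can be translated to check it against the protein query used in the BLAST reports. The sketch below assumes the standard genetic code (NCBI translation table 1) and reading frame 1; the `translate` function is illustrative and handles only unambiguous uppercase A/C/G/T input.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a DNA coding sequence in frame 1; '*' marks a stop codon."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# First ten codons of the CDS shown above.
cds_start = "ATGTCTTTTGGCCTTACCAATGCACCAGCT"
print(translate(cds_start))  # MSFGLTNAPA
```

The result matches the first ten residues of the query line in the alignments that follow.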
BLAST of CmoCh04G021120 vs. Swiss-Prot
Match: TF212_SCHPO (Transposon Tf2-12 polyprotein OS=Schizosaccharomyces pombe (strain 972 / ATCC 24843) GN=Tf2-12 PE=3 SV=1)

HSP 1 Score: 292.4 bits (747), Expect = 1.2e-77
Identity = 187/619 (30.21%), Positives = 311/619 (50.24%), Query Frame = 1

Query: 1    MSFGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQ 60
            M +G++ APA F   +N +  E  ++ V+ ++DDIL++SKSE++H  H++ VL  L+   
Sbjct: 534  MPYGISTAPAHFQYFINTILGEAKESHVVCYMDDILIHSKSESEHVKHVKDVLQKLKNAN 593

Query: 61   LYAKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYR 120
            L    +KCEF  S+V F+G+ +S KG T     I+ V++W QP    E+R FLG     R
Sbjct: 594  LIINQAKCEFHQSQVKFIGYHISEKGFTPCQENIDKVLQWKQPKNRKELRQFLGSVNYLR 653

Query: 121  RFVQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSD 180
            +F+   S+++  L  L KK   + WTP   Q+ + +K+ LV+ PVL   D S  +++ +D
Sbjct: 654  KFIPKTSQLTHPLNNLLKKDVRWKWTPTQTQAIENIKQCLVSPPVLRHFDFSKKILLETD 713

Query: 181  ASGKGLGCVLMQKGK-----VIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYG- 240
            AS   +G VL QK        + Y S ++ + + NY   D E+ A++ +LK WRHYL   
Sbjct: 714  ASDVAVGAVLSQKHDDDKYYPVGYYSAKMSKAQLNYSVSDKEMLAIIKSLKHWRHYLEST 773

Query: 241  -EKVQVFTDHKSLKYLFTQKEL--NMRQRRWLELVKDYDIEILYHPGKASVVADALSRKA 300
             E  ++ TDH++L    T +    N R  RW   ++D++ EI Y PG A+ +ADALSR  
Sbjct: 774  IEPFKILTDHRNLIGRITNESEPENKRLARWQLFLQDFNFEINYRPGSANHIADALSR-- 833

Query: 301  VHTSVMITTQEKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVW 360
                 ++   E +  + +   I+ V         Q++I    + +V+    +D  L  + 
Sbjct: 834  -----IVDETEPIPKDSEDNSINFV--------NQISITDDFKNQVVTEYTNDTKLLNLL 893

Query: 361  SQIETERPVGYSISSDGGLLWQNR--LCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQD 420
            +    ++ V  +I    GLL  ++  + +P D  + + I+ + H+   + HPG   +   
Sbjct: 894  NN--EDKRVEENIQLKDGLLINSKDQILLPNDTQLTRTIIKKYHEEGKLIHPGIELLTNI 953

Query: 421  LKRFYWWSGMKRDIADFVSRCLTCQQVKAPRQRPAG----------LLQPLSVPRVEM-- 480
            + R + W G+++ I ++V  C TCQ  K+   +P G            + LS+  +    
Sbjct: 954  ILRRFTWKGIRKQIQEYVQNCHTCQINKSRNHKPYGPLQPIPPSERPWESLSMDFITALP 1013

Query: 481  GSSLYGFLF----GFAK---------------DKAGFQRRMVS-------IVSDRDTRFT 540
             SS Y  LF     F+K                   F +R+++       I++D D  FT
Sbjct: 1014 ESSGYNALFVVVDRFSKMAILVPCTKSITAEQTARMFDQRVIAYFGNPKEIIADNDHIFT 1073

Query: 541  SQFWRSLQKALGTQLRFSTAFHPQTDGQTERLNQVLEDMLRACSLDFAGCWDEHLPLMEF 571
            SQ W+         ++FS  + PQTDGQTER NQ +E +LR         W +H+ L++ 
Sbjct: 1074 SQTWKDFAHKYNFVMKFSLPYRPQTDGQTERTNQTVEKLLRCVCSTHPNTWVDHISLVQQ 1133
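
The percentages on an HSP statistics line can be rechecked from the raw counts. This is a small sketch, assuming the `label = count/total (percent%)` layout seen in the reports on this page; `check_stats` is a hypothetical helper, not part of any BLAST tool.

```python
import re

# Captures e.g. "Identity = 187/619 (30.21%)" as label, count, total, percent.
STAT = re.compile(r"(\w+)\s*=\s*(\d+)/(\d+)\s*\((\d+\.\d+)%\)")


def check_stats(line):
    """Map each label to (count, total, reported percent, recomputed percent)."""
    result = {}
    for label, count, total, reported in STAT.findall(line):
        recomputed = round(100.0 * int(count) / int(total), 2)
        result[label] = (int(count), int(total), float(reported), recomputed)
    return result


stats = check_stats(
    "Identity = 187/619 (30.21%), Positives = 311/619 (50.24%), Query Frame = 1"
)
for label, (count, total, reported, recomputed) in stats.items():
    print(f"{label}: {count}/{total} reported {reported}% recomputed {recomputed}%")
```

Here both reported values agree with the recomputed ones: 30.21% identity and 50.24% positives over 619 aligned columns.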

BLAST of CmoCh04G021120 vs. Swiss-Prot
Match: TF21_SCHPO (Transposon Tf2-1 polyprotein OS=Schizosaccharomyces pombe (strain 972 / ATCC 24843) GN=Tf2-1 PE=3 SV=1)

HSP 1 Score: 292.4 bits (747), Expect = 1.2e-77
Identity = 187/619 (30.21%), Positives = 311/619 (50.24%), Query Frame = 1

Query: 1    MSFGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQ 60
            M +G++ APA F   +N +  E  ++ V+ ++DDIL++SKSE++H  H++ VL  L+   
Sbjct: 534  MPYGISTAPAHFQYFINTILGEAKESHVVCYMDDILIHSKSESEHVKHVKDVLQKLKNAN 593

Query: 61   LYAKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYR 120
            L    +KCEF  S+V F+G+ +S KG T     I+ V++W QP    E+R FLG     R
Sbjct: 594  LIINQAKCEFHQSQVKFIGYHISEKGFTPCQENIDKVLQWKQPKNRKELRQFLGSVNYLR 653

Query: 121  RFVQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSD 180
            +F+   S+++  L  L KK   + WTP   Q+ + +K+ LV+ PVL   D S  +++ +D
Sbjct: 654  KFIPKTSQLTHPLNNLLKKDVRWKWTPTQTQAIENIKQCLVSPPVLRHFDFSKKILLETD 713

Query: 181  ASGKGLGCVLMQKGK-----VIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYG- 240
            AS   +G VL QK        + Y S ++ + + NY   D E+ A++ +LK WRHYL   
Sbjct: 714  ASDVAVGAVLSQKHDDDKYYPVGYYSAKMSKAQLNYSVSDKEMLAIIKSLKHWRHYLEST 773

Query: 241  -EKVQVFTDHKSLKYLFTQKEL--NMRQRRWLELVKDYDIEILYHPGKASVVADALSRKA 300
             E  ++ TDH++L    T +    N R  RW   ++D++ EI Y PG A+ +ADALSR  
Sbjct: 774  IEPFKILTDHRNLIGRITNESEPENKRLARWQLFLQDFNFEINYRPGSANHIADALSR-- 833

Query: 301  VHTSVMITTQEKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVW 360
                 ++   E +  + +   I+ V         Q++I    + +V+    +D  L  + 
Sbjct: 834  -----IVDETEPIPKDSEDNSINFV--------NQISITDDFKNQVVTEYTNDTKLLNLL 893

Query: 361  SQIETERPVGYSISSDGGLLWQNR--LCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQD 420
            +    ++ V  +I    GLL  ++  + +P D  + + I+ + H+   + HPG   +   
Sbjct: 894  NN--EDKRVEENIQLKDGLLINSKDQILLPNDTQLTRTIIKKYHEEGKLIHPGIELLTNI 953

Query: 421  LKRFYWWSGMKRDIADFVSRCLTCQQVKAPRQRPAG----------LLQPLSVPRVEM-- 480
            + R + W G+++ I ++V  C TCQ  K+   +P G            + LS+  +    
Sbjct: 954  ILRRFTWKGIRKQIQEYVQNCHTCQINKSRNHKPYGPLQPIPPSERPWESLSMDFITALP 1013

Query: 481  GSSLYGFLF----GFAK---------------DKAGFQRRMVS-------IVSDRDTRFT 540
             SS Y  LF     F+K                   F +R+++       I++D D  FT
Sbjct: 1014 ESSGYNALFVVVDRFSKMAILVPCTKSITAEQTARMFDQRVIAYFGNPKEIIADNDHIFT 1073

Query: 541  SQFWRSLQKALGTQLRFSTAFHPQTDGQTERLNQVLEDMLRACSLDFAGCWDEHLPLMEF 571
            SQ W+         ++FS  + PQTDGQTER NQ +E +LR         W +H+ L++ 
Sbjct: 1074 SQTWKDFAHKYNFVMKFSLPYRPQTDGQTERTNQTVEKLLRCVCSTHPNTWVDHISLVQQ 1133

BLAST of CmoCh04G021120 vs. Swiss-Prot
Match: TF22_SCHPO (Transposon Tf2-2 polyprotein OS=Schizosaccharomyces pombe (strain 972 / ATCC 24843) GN=Tf2-2 PE=3 SV=1)

HSP 1 Score: 292.4 bits (747), Expect = 1.2e-77
Identity = 187/619 (30.21%), Positives = 311/619 (50.24%), Query Frame = 1

Query: 1    MSFGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQ 60
            M +G++ APA F   +N +  E  ++ V+ ++DDIL++SKSE++H  H++ VL  L+   
Sbjct: 534  MPYGISTAPAHFQYFINTILGEAKESHVVCYMDDILIHSKSESEHVKHVKDVLQKLKNAN 593

Query: 61   LYAKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYR 120
            L    +KCEF  S+V F+G+ +S KG T     I+ V++W QP    E+R FLG     R
Sbjct: 594  LIINQAKCEFHQSQVKFIGYHISEKGFTPCQENIDKVLQWKQPKNRKELRQFLGSVNYLR 653

Query: 121  RFVQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSD 180
            +F+   S+++  L  L KK   + WTP   Q+ + +K+ LV+ PVL   D S  +++ +D
Sbjct: 654  KFIPKTSQLTHPLNNLLKKDVRWKWTPTQTQAIENIKQCLVSPPVLRHFDFSKKILLETD 713

Query: 181  ASGKGLGCVLMQKGK-----VIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYG- 240
            AS   +G VL QK        + Y S ++ + + NY   D E+ A++ +LK WRHYL   
Sbjct: 714  ASDVAVGAVLSQKHDDDKYYPVGYYSAKMSKAQLNYSVSDKEMLAIIKSLKHWRHYLEST 773

Query: 241  -EKVQVFTDHKSLKYLFTQKEL--NMRQRRWLELVKDYDIEILYHPGKASVVADALSRKA 300
             E  ++ TDH++L    T +    N R  RW   ++D++ EI Y PG A+ +ADALSR  
Sbjct: 774  IEPFKILTDHRNLIGRITNESEPENKRLARWQLFLQDFNFEINYRPGSANHIADALSR-- 833

Query: 301  VHTSVMITTQEKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVW 360
                 ++   E +  + +   I+ V         Q++I    + +V+    +D  L  + 
Sbjct: 834  -----IVDETEPIPKDSEDNSINFV--------NQISITDDFKNQVVTEYTNDTKLLNLL 893

Query: 361  SQIETERPVGYSISSDGGLLWQNR--LCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQD 420
            +    ++ V  +I    GLL  ++  + +P D  + + I+ + H+   + HPG   +   
Sbjct: 894  NN--EDKRVEENIQLKDGLLINSKDQILLPNDTQLTRTIIKKYHEEGKLIHPGIELLTNI 953

Query: 421  LKRFYWWSGMKRDIADFVSRCLTCQQVKAPRQRPAG----------LLQPLSVPRVEM-- 480
            + R + W G+++ I ++V  C TCQ  K+   +P G            + LS+  +    
Sbjct: 954  ILRRFTWKGIRKQIQEYVQNCHTCQINKSRNHKPYGPLQPIPPSERPWESLSMDFITALP 1013

Query: 481  GSSLYGFLF----GFAK---------------DKAGFQRRMVS-------IVSDRDTRFT 540
             SS Y  LF     F+K                   F +R+++       I++D D  FT
Sbjct: 1014 ESSGYNALFVVVDRFSKMAILVPCTKSITAEQTARMFDQRVIAYFGNPKEIIADNDHIFT 1073

Query: 541  SQFWRSLQKALGTQLRFSTAFHPQTDGQTERLNQVLEDMLRACSLDFAGCWDEHLPLMEF 571
            SQ W+         ++FS  + PQTDGQTER NQ +E +LR         W +H+ L++ 
Sbjct: 1074 SQTWKDFAHKYNFVMKFSLPYRPQTDGQTERTNQTVEKLLRCVCSTHPNTWVDHISLVQQ 1133

BLAST of CmoCh04G021120 vs. Swiss-Prot
Match: TF23_SCHPO (Transposon Tf2-3 polyprotein OS=Schizosaccharomyces pombe (strain 972 / ATCC 24843) GN=Tf2-3 PE=1 SV=1)

HSP 1 Score: 292.4 bits (747), Expect = 1.2e-77
Identity = 187/619 (30.21%), Positives = 311/619 (50.24%), Query Frame = 1

Query: 1    MSFGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQ 60
            M +G++ APA F   +N +  E  ++ V+ ++DDIL++SKSE++H  H++ VL  L+   
Sbjct: 534  MPYGISTAPAHFQYFINTILGEAKESHVVCYMDDILIHSKSESEHVKHVKDVLQKLKNAN 593

Query: 61   LYAKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYR 120
            L    +KCEF  S+V F+G+ +S KG T     I+ V++W QP    E+R FLG     R
Sbjct: 594  LIINQAKCEFHQSQVKFIGYHISEKGFTPCQENIDKVLQWKQPKNRKELRQFLGSVNYLR 653

Query: 121  RFVQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSD 180
            +F+   S+++  L  L KK   + WTP   Q+ + +K+ LV+ PVL   D S  +++ +D
Sbjct: 654  KFIPKTSQLTHPLNNLLKKDVRWKWTPTQTQAIENIKQCLVSPPVLRHFDFSKKILLETD 713

Query: 181  ASGKGLGCVLMQKGK-----VIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYG- 240
            AS   +G VL QK        + Y S ++ + + NY   D E+ A++ +LK WRHYL   
Sbjct: 714  ASDVAVGAVLSQKHDDDKYYPVGYYSAKMSKAQLNYSVSDKEMLAIIKSLKHWRHYLEST 773

Query: 241  -EKVQVFTDHKSLKYLFTQKEL--NMRQRRWLELVKDYDIEILYHPGKASVVADALSRKA 300
             E  ++ TDH++L    T +    N R  RW   ++D++ EI Y PG A+ +ADALSR  
Sbjct: 774  IEPFKILTDHRNLIGRITNESEPENKRLARWQLFLQDFNFEINYRPGSANHIADALSR-- 833

Query: 301  VHTSVMITTQEKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVW 360
                 ++   E +  + +   I+ V         Q++I    + +V+    +D  L  + 
Sbjct: 834  -----IVDETEPIPKDSEDNSINFV--------NQISITDDFKNQVVTEYTNDTKLLNLL 893

Query: 361  SQIETERPVGYSISSDGGLLWQNR--LCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQD 420
            +    ++ V  +I    GLL  ++  + +P D  + + I+ + H+   + HPG   +   
Sbjct: 894  NN--EDKRVEENIQLKDGLLINSKDQILLPNDTQLTRTIIKKYHEEGKLIHPGIELLTNI 953

Query: 421  LKRFYWWSGMKRDIADFVSRCLTCQQVKAPRQRPAG----------LLQPLSVPRVEM-- 480
            + R + W G+++ I ++V  C TCQ  K+   +P G            + LS+  +    
Sbjct: 954  ILRRFTWKGIRKQIQEYVQNCHTCQINKSRNHKPYGPLQPIPPSERPWESLSMDFITALP 1013

Query: 481  GSSLYGFLF----GFAK---------------DKAGFQRRMVS-------IVSDRDTRFT 540
             SS Y  LF     F+K                   F +R+++       I++D D  FT
Sbjct: 1014 ESSGYNALFVVVDRFSKMAILVPCTKSITAEQTARMFDQRVIAYFGNPKEIIADNDHIFT 1073

Query: 541  SQFWRSLQKALGTQLRFSTAFHPQTDGQTERLNQVLEDMLRACSLDFAGCWDEHLPLMEF 571
            SQ W+         ++FS  + PQTDGQTER NQ +E +LR         W +H+ L++ 
Sbjct: 1074 SQTWKDFAHKYNFVMKFSLPYRPQTDGQTERTNQTVEKLLRCVCSTHPNTWVDHISLVQQ 1133

BLAST of CmoCh04G021120 vs. Swiss-Prot
Match: TF29_SCHPO (Transposon Tf2-9 polyprotein OS=Schizosaccharomyces pombe (strain 972 / ATCC 24843) GN=Tf2-9 PE=3 SV=1)

HSP 1 Score: 292.4 bits (747), Expect = 1.2e-77
Identity = 187/619 (30.21%), Positives = 311/619 (50.24%), Query Frame = 1

Query: 1    MSFGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQ 60
            M +G++ APA F   +N +  E  ++ V+ ++DDIL++SKSE++H  H++ VL  L+   
Sbjct: 534  MPYGISTAPAHFQYFINTILGEAKESHVVCYMDDILIHSKSESEHVKHVKDVLQKLKNAN 593

Query: 61   LYAKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYR 120
            L    +KCEF  S+V F+G+ +S KG T     I+ V++W QP    E+R FLG     R
Sbjct: 594  LIINQAKCEFHQSQVKFIGYHISEKGFTPCQENIDKVLQWKQPKNRKELRQFLGSVNYLR 653

Query: 121  RFVQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSD 180
            +F+   S+++  L  L KK   + WTP   Q+ + +K+ LV+ PVL   D S  +++ +D
Sbjct: 654  KFIPKTSQLTHPLNNLLKKDVRWKWTPTQTQAIENIKQCLVSPPVLRHFDFSKKILLETD 713

Query: 181  ASGKGLGCVLMQKGK-----VIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYG- 240
            AS   +G VL QK        + Y S ++ + + NY   D E+ A++ +LK WRHYL   
Sbjct: 714  ASDVAVGAVLSQKHDDDKYYPVGYYSAKMSKAQLNYSVSDKEMLAIIKSLKHWRHYLEST 773

Query: 241  -EKVQVFTDHKSLKYLFTQKEL--NMRQRRWLELVKDYDIEILYHPGKASVVADALSRKA 300
             E  ++ TDH++L    T +    N R  RW   ++D++ EI Y PG A+ +ADALSR  
Sbjct: 774  IEPFKILTDHRNLIGRITNESEPENKRLARWQLFLQDFNFEINYRPGSANHIADALSR-- 833

Query: 301  VHTSVMITTQEKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVW 360
                 ++   E +  + +   I+ V         Q++I    + +V+    +D  L  + 
Sbjct: 834  -----IVDETEPIPKDSEDNSINFV--------NQISITDDFKNQVVTEYTNDTKLLNLL 893

Query: 361  SQIETERPVGYSISSDGGLLWQNR--LCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQD 420
            +    ++ V  +I    GLL  ++  + +P D  + + I+ + H+   + HPG   +   
Sbjct: 894  NN--EDKRVEENIQLKDGLLINSKDQILLPNDTQLTRTIIKKYHEEGKLIHPGIELLTNI 953

Query: 421  LKRFYWWSGMKRDIADFVSRCLTCQQVKAPRQRPAG----------LLQPLSVPRVEM-- 480
            + R + W G+++ I ++V  C TCQ  K+   +P G            + LS+  +    
Sbjct: 954  ILRRFTWKGIRKQIQEYVQNCHTCQINKSRNHKPYGPLQPIPPSERPWESLSMDFITALP 1013

Query: 481  GSSLYGFLF----GFAK---------------DKAGFQRRMVS-------IVSDRDTRFT 540
             SS Y  LF     F+K                   F +R+++       I++D D  FT
Sbjct: 1014 ESSGYNALFVVVDRFSKMAILVPCTKSITAEQTARMFDQRVIAYFGNPKEIIADNDHIFT 1073

Query: 541  SQFWRSLQKALGTQLRFSTAFHPQTDGQTERLNQVLEDMLRACSLDFAGCWDEHLPLMEF 571
            SQ W+         ++FS  + PQTDGQTER NQ +E +LR         W +H+ L++ 
Sbjct: 1074 SQTWKDFAHKYNFVMKFSLPYRPQTDGQTERTNQTVEKLLRCVCSTHPNTWVDHISLVQQ 1133

BLAST of CmoCh04G021120 vs. TrEMBL
Match: Q84KB0_CUCME (Pol protein OS=Cucumis melo subsp. melo PE=4 SV=1)

HSP 1 Score: 830.9 bits (2145), Expect = 1.0e-237
Identity = 414/641 (64.59%), Positives = 495/641 (77.22%), Query Frame = 1

Query: 1   MSFGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQ 60
           MSFGLTNAPAVFM+LMNRVF+EFLDTFV+VFIDDIL+YSK+EA+HE HLR VL  LR  +
Sbjct: 115 MSFGLTNAPAVFMDLMNRVFREFLDTFVIVFIDDILIYSKTEAEHEEHLRMVLQTLRDNK 174

Query: 61  LYAKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYR 120
           LYAKFSKCEFWL +V+FLGHVVS  G++VDPAKIEAV  W +P+TV+EVRSFLGLAG YR
Sbjct: 175 LYAKFSKCEFWLKQVSFLGHVVSKAGVSVDPAKIEAVTGWTRPSTVSEVRSFLGLAGYYR 234

Query: 121 RFVQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSD 180
           RFV++FS+I++ LTQLT+KG PF W+  CE SFQ LK++LVTAPVLTVPDGSGN V+YSD
Sbjct: 235 RFVENFSRIATPLTQLTRKGAPFVWSKACEDSFQTLKQKLVTAPVLTVPDGSGNFVIYSD 294

Query: 181 ASGKGLGCVLMQKGKVIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYGEKVQVF 240
           AS KGLGCVLMQ+GKV+AYASRQLK +E+NYPTHDLELAAVVFALK WRHYLYGEK+Q+F
Sbjct: 295 ASKKGLGCVLMQQGKVVAYASRQLKSHEQNYPTHDLELAAVVFALKIWRHYLYGEKIQIF 354

Query: 241 TDHKSLKYLFTQKELNMRQRRWLELVKDYDIEILYHPGKASVVADALSRKAVHTSVMITT 300
           TDHKSLKY FTQKELNMRQRRWLELVKDYD EILYHPGKA+VVADALSRK  H++ +IT 
Sbjct: 355 TDHKSLKYFFTQKELNMRQRRWLELVKDYDCEILYHPGKANVVADALSRKVSHSAALITR 414

Query: 301 QEKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVWSQIETERPV 360
           Q  L  +++RA I V++    +Q+AQLT+QPTLR+++IDAQ +D +L +     E  +  
Sbjct: 415 QAPLHRDLERAEIAVLVGAVTMQLAQLTVQPTLRQRIIDAQSNDPYLVEKRGLAEAGQTA 474

Query: 361 GYSISSDGGLLWQNRLCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQDLKR-FYWWSGM 420
            +S+SSDGGLL++ RLCVP D  +  ++++EAH + +  HPGST+     +  F     M
Sbjct: 475 EFSLSSDGGLLFERRLCVPSDSAVKTELLSEAHSSPFSMHPGSTEDVSGPEAGFIGGRNM 534

Query: 421 KRDIADFVSRCLTCQQVKAPRQRPAGLLQPLSVPRVEMGSSLYGFLFGFAKDKAGF---- 480
           KR++A+FVS+CL CQQVKAPRQ+PAGLLQPLS+P  +  +    F+ G  +   GF    
Sbjct: 535 KREVAEFVSKCLVCQQVKAPRQKPAGLLQPLSIPEWKWENVSMDFITGLPRTLRGFTVIW 594

Query: 481 -------------------------QRRM----------VSIVSDRDTRFTSQFWRSLQK 540
                                    Q  M          VSIVSDRD RFTS+FW+ LQ 
Sbjct: 595 VVVDRLTKSAHFVPGKSTYTASKWAQLYMSEIVRLHGVPVSIVSDRDARFTSKFWKGLQT 654

Query: 541 ALGTQLRFSTAFHPQTDGQTERLNQVLEDMLRACSLDFAGCWDEHLPLMEFAYNNSYQAT 600
           A+GT+L FSTAFHPQTDGQTERLNQVLEDMLRAC+L+F G WD HL LMEFAYNNSYQAT
Sbjct: 655 AMGTRLDFSTAFHPQTDGQTERLNQVLEDMLRACALEFPGSWDSHLHLMEFAYNNSYQAT 714

Query: 601 IQMAPFEALYGRRCRTPVFWEEVGTQQLMGPELVQVTNAAV 602
           I MAPFEALYGR CR+PV W EVG Q+LMGPELVQ TN A+
Sbjct: 715 IGMAPFEALYGRCCRSPVCWGEVGEQRLMGPELVQSTNEAI 755

BLAST of CmoCh04G021120 vs. TrEMBL
Match: A0A061EE03_THECC (Retrotransposon protein, putative OS=Theobroma cacao GN=TCM_017700 PE=4 SV=1)

HSP 1 Score: 716.8 bits (1849), Expect = 2.1e-203
Identity = 351/599 (58.60%), Positives = 443/599 (73.96%), Query Frame = 1

Query: 1    MSFGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQ 60
            MSFGLTNAPA FM+LMNRVFK +LD FV+VFIDDIL+YSKS  +HE HL+ VL ILR  +
Sbjct: 765  MSFGLTNAPAAFMDLMNRVFKPYLDKFVVVFIDDILIYSKSREEHEQHLKIVLQILREHR 824

Query: 61   LYAKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYR 120
            LYAKFSKCEFWL  VAFLGHVVS +GI VD  KIEAV +WP+ T+VTE+RSF+GLAG YR
Sbjct: 825  LYAKFSKCEFWLERVAFLGHVVSREGIQVDTKKIEAVEKWPRSTSVTEIRSFVGLAGYYR 884

Query: 121  RFVQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSD 180
            RFV+DFSKI + LT+LT+K   F W+  CE SF++LK  L TAPVL++  G+G   V+ D
Sbjct: 885  RFVKDFSKIVALLTKLTRKDTKFEWSDACENSFEKLKACLTTAPVLSLLQGTGGYTVFCD 944

Query: 181  ASGKGLGCVLMQKGKVIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYGEKVQVF 240
            ASG GLGCVLMQ GKVIAYASRQLK +E+NYP HDLE+AA+VFALK WRHYLYGE  +++
Sbjct: 945  ASGVGLGCVLMQHGKVIAYASRQLKRHEQNYPIHDLEMAAIVFALKIWRHYLYGETCEIY 1004

Query: 241  TDHKSLKYLFTQKELNMRQRRWLELVKDYDIEILYHPGKASVVADALSRKAV----HTSV 300
            TDHKSLKY+F Q++LN+RQ RW+EL+KDYD  ILYHPGKA+VVADALSRK++    H S+
Sbjct: 1005 TDHKSLKYIFQQRDLNLRQHRWMELLKDYDCTILYHPGKANVVADALSRKSMGSLAHISI 1064

Query: 301  MITTQEKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVWSQIET 360
               +  +    +   G+ + +   N  +A   ++P L  ++ +AQ  DE + K     + 
Sbjct: 1065 GRRSLVREIHSLGDIGVRLEVAETNALLAHFRVRPILMDRIKEAQSKDEFVIKALEDPQG 1124

Query: 361  ERPVGYSISSDGGLLWQNRLCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQDLKRFYWW 420
             +   ++  +DG L +  RL VP  +G+ ++I+ EAH  +Y+ HPG+ KMYQDLK  YWW
Sbjct: 1125 RKGKMFTKGTDGVLRYGTRLYVPDGDGLRREILEEAHMAAYVVHPGALKMYQDLKGVYWW 1184

Query: 421  SGMKRDIADFVSRCLTCQQVKAPRQRPAGLLQPLSVPRVEMGSSLYGFLFGFAKDKAGFQ 480
             G+KRD+A+FVS+CL CQQVKA  Q+PAGLLQPL VPRVE+G+  YG   G A       
Sbjct: 1185 EGLKRDVAEFVSKCLVCQQVKAEHQKPAGLLQPLPVPRVEVGTYCYGLCNGGA------- 1244

Query: 481  RRMVSIVSDRDTRFTSQFWRSLQKALGTQLRFSTAFHPQTDGQTERLNQVLEDMLRACSL 540
                        +FTS+FW  LQ+ALGT+  FSTAFHPQTDGQ+ER  Q LEDMLRAC +
Sbjct: 1245 ------------QFTSRFWGKLQEALGTKFDFSTAFHPQTDGQSERTIQTLEDMLRACVI 1304

Query: 541  DFAGCWDEHLPLMEFAYNNSYQATIQMAPFEALYGRRCRTPVFWEEVGTQQLMGPELVQ 596
            D    W+++LPL+EFAYNNS+Q +IQMAPFEALYGRRCR+P+ W EVG ++L+GPELVQ
Sbjct: 1305 DLGVRWEQYLPLVEFAYNNSFQTSIQMAPFEALYGRRCRSPIGWLEVGERKLLGPELVQ 1344

BLAST of CmoCh04G021120 vs. TrEMBL
Match: A0A061FQY7_THECC (DNA/RNA polymerases superfamily protein, putative OS=Theobroma cacao GN=TCM_044877 PE=4 SV=1)

HSP 1 Score: 710.7 bits (1833), Expect = 1.5e-201
Identity = 353/606 (58.25%), Positives = 437/606 (72.11%), Query Frame = 1

Query: 3    FGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQLY 62
            FGLTNAPA FM+LMNRVF  +L  FV+VFIDDILVYS+   +H  HLR VL  LR +QLY
Sbjct: 555  FGLTNAPAAFMDLMNRVFHPYLGKFVIVFIDDILVYSRDNDEHAAHLRIVLQTLREKQLY 614

Query: 63   AKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYRRF 122
            AKFSKCEFWL EV FLGHVVS  GI VDP K+EA+++W QP TVTE+RSFLGLAG YRRF
Sbjct: 615  AKFSKCEFWLQEVVFLGHVVSRTGIYVDPKKVEAILQWEQPKTVTEIRSFLGLAGYYRRF 674

Query: 123  VQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSDAS 182
            VQ FS I++ LT+LT+KG  F    VCE  FQELK RL +APVLT+P      VVYSDAS
Sbjct: 675  VQGFSLIAAPLTRLTRKGVKFVCDDVCENRFQELKNRLTSAPVLTLPVNGKGFVVYSDAS 734

Query: 183  GKGLGCVLMQKGKVIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYGEKVQVFTD 242
              GLGCVLMQ  KV+AYASRQLK +E NYPTHDLELAAVVFALK WRHYLYGE  ++FTD
Sbjct: 735  KLGLGCVLMQDEKVVAYASRQLKRHEANYPTHDLELAAVVFALKIWRHYLYGEHCRIFTD 794

Query: 243  HKSLKYLFTQKELNMRQRRWLELVKDYDIEILYHPGKASVVADALSRKAVHT-SVMITTQ 302
            HKSLKYL TQKELN+RQRRWLEL+KDYD+ I YHPGKA+VVADALSRK+  + + + +  
Sbjct: 795  HKSLKYLLTQKELNLRQRRWLELIKDYDLVIDYHPGKANVVADALSRKSSSSLAALQSCY 854

Query: 303  EKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVWSQIETERPVG 362
                 EMK  G+ +        +A   ++P+L  ++ D QRSD+ L K   ++       
Sbjct: 855  FSALIEMKSLGVQLRNGEDGSVLANFIVRPSLLNQIKDIQRSDDELRKEIQKLTDGGVSE 914

Query: 363  YSISSDGGLLWQNRLCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQDLKRFYWWSGMKR 422
            +    D  L++++R+CVP    + + IM EAH ++Y  +PGSTKMY+ ++  YWW GMKR
Sbjct: 915  FRFGEDNVLMFRDRVCVPEGNQLRQTIMEEAHSSAYALNPGSTKMYRTIRENYWWPGMKR 974

Query: 423  DIADFVSRCLTCQQVKAPRQRPAGLLQPLSVPRVEMGSSLYGFLFGFAKDKAG------- 482
            D+A+FV++CL CQQVKA  QRP G  Q L V   +       F+ G  + + G       
Sbjct: 975  DVAEFVAKCLVCQQVKAEHQRPVGTFQSLPVLEWKWEHVTMDFVLGLPRTQRGKDAIYEI 1034

Query: 483  --FQRRMVSIVSDRDTRFTSQFWRSLQKALGTQLRFSTAFHPQTDGQTERLNQVLEDMLR 542
                  +VSIVSDRD RFTS+FW   Q+ALGT+L+FSTAFHPQTDGQ+ER  Q LEDMLR
Sbjct: 1035 VRLHGVLVSIVSDRDPRFTSRFWPKFQEALGTKLKFSTAFHPQTDGQSERTIQTLEDMLR 1094

Query: 543  ACSLDFAGCWDEHLPLMEFAYNNSYQATIQMAPFEALYGRRCRTPVFWEEVGTQQLMGPE 599
            AC +DF G WD HLPL+EFAYNNS+Q++I MAP+EALY R+CRTP+ W+EVG ++L+  E
Sbjct: 1095 ACVIDFIGSWDRHLPLVEFAYNNSFQSSIGMAPYEALYERKCRTPLCWDEVGERKLVSVE 1154

BLAST of CmoCh04G021120 vs. TrEMBL
Match: Q7X726_ORYSJ (OSJNBa0089N06.1 protein OS=Oryza sativa subsp. japonica GN=OSJNBa0089N06.1 PE=4 SV=1)

HSP 1 Score: 710.3 bits (1832), Expect = 2.0e-201
Identity = 347/635 (54.65%), Positives = 449/635 (70.71%), Query Frame = 1

Query: 1    MSFGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQ 60
            MSFGLTNAPA FM LMN++F E+LD FV+VFIDDIL+YSK+E +H  HLR ++  LR  Q
Sbjct: 1047 MSFGLTNAPAFFMNLMNKIFMEYLDQFVVVFIDDILIYSKNEEEHAEHLRLIMEKLRDHQ 1106

Query: 61   LYAKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYR 120
            L+AKFSKCEFWL  VAFLGHV+SS G+ VDP+K+EAV+ W  P  V+E+RSFLGLAG YR
Sbjct: 1107 LFAKFSKCEFWLDRVAFLGHVISSNGVEVDPSKVEAVLAWNPPKNVSEIRSFLGLAGYYR 1166

Query: 121  RFVQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSD 180
            RF++ FSK++  +T+L KK K F W+  CE SFQE+KKRL TAPVLT+PD   +  ++ D
Sbjct: 1167 RFIEGFSKLARPMTELLKKEKKFQWSTACEDSFQEMKKRLTTAPVLTLPDIRKDFEIFCD 1226

Query: 181  ASGKGLGCVLMQKGKVIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYGEKVQVF 240
            AS +GLGCVLMQ+ KV+AYASRQL+ +E NYPTHDLELAAV+ ALK WRHYL G + +V+
Sbjct: 1227 ASRQGLGCVLMQERKVVAYASRQLRPHEVNYPTHDLELAAVIHALKIWRHYLIGNRCEVY 1286

Query: 241  TDHKSLKYLFTQKELNMRQRRWLELVKDYDIEILYHPGKASVVADALSRKAVHTSVMI-T 300
            TDHKSLKY+FTQ ELNMRQRRWLEL+KDYD+ I YHPGKA+VVADALSRKA      I  
Sbjct: 1287 TDHKSLKYIFTQTELNMRQRRWLELIKDYDLGIHYHPGKANVVADALSRKAYCNIAQIRP 1346

Query: 301  TQEKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVWSQIETERP 360
             Q+ L  E+++  + VV  G     A LT+QPTL  ++ +AQ+ DE + ++  +I+ ++ 
Sbjct: 1347 DQDHLCRELEKLRLTVVQSG---VPASLTVQPTLESQIREAQKDDEGIKELIKRIQEKKD 1406

Query: 361  VGYSISSDGGLLWQNRLCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQDLKRFYWWSGM 420
              +SI   G +    R+CVP  + +   I+ EAH+++Y  HPGSTKMYQD+K ++WW+GM
Sbjct: 1407 TNFSIDDQGTIWCGPRICVPAKKELRNLILKEAHESAYSIHPGSTKMYQDIKAYFWWAGM 1466

Query: 421  KRDIADFVSRCLTCQQVKAPRQRPAGLLQPLSVPRVEMGSSLYGFLFGFAKDKAGFQRRM 480
            KRD+A++V+ C  CQ+VKA  QRPAGLLQPL +P  +       F+ G  +  +G+    
Sbjct: 1467 KRDVAEYVALCDICQRVKAEHQRPAGLLQPLPIPEWKWEEIGMDFITGLPRTPSGYDSIW 1526

Query: 481  V---------------------------------------SIVSDRDTRFTSQFWRSLQK 540
            V                                        IVSDR T+FTS+FW+ L +
Sbjct: 1527 VIVDRLTKSAHFVPVKTTYDGKKLAELYMTHVVCRFGCPKKIVSDRGTQFTSRFWKQLHE 1586

Query: 541  ALGTQLRFSTAFHPQTDGQTERLNQVLEDMLRACSLDFAGCWDEHLPLMEFAYNNSYQAT 596
            ALGT L FSTA+HPQTDGQTER+NQ+LEDMLRAC+LDF G WD  LP  EF+YNNSYQA+
Sbjct: 1587 ALGTDLNFSTAYHPQTDGQTERVNQILEDMLRACALDFEGTWDRCLPYAEFSYNNSYQAS 1646

BLAST of CmoCh04G021120 vs. TrEMBL
Match: Q7XPS2_ORYSJ (OSJNBa0065O17.9 protein OS=Oryza sativa subsp. japonica GN=OSJNBa0065O17.9 PE=4 SV=2)

HSP 1 Score: 708.4 bits (1827), Expect = 7.6e-201
Identity = 347/641 (54.13%), Positives = 449/641 (70.05%), Query Frame = 1

Query: 1    MSFGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQ 60
            MSFGLTNAPA FM LMN++F E+LD FV+VFIDDIL+YSK+E +H  HLR ++  LR  Q
Sbjct: 1047 MSFGLTNAPAFFMNLMNKIFMEYLDQFVVVFIDDILIYSKNEEEHAEHLRLIMEKLRDHQ 1106

Query: 61   LYAKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYR 120
            L+AKFSKCEFWL  VAFLGHV+SS G+ VDP+K+EAV+ W  PT V+E+RSFLGLAG YR
Sbjct: 1107 LFAKFSKCEFWLDRVAFLGHVISSNGVEVDPSKVEAVLAWNPPTNVSEIRSFLGLAGYYR 1166

Query: 121  RFVQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSD 180
            RF++ FSK++  +T+L KK K F W+  CE SFQE+KKRL TAPVLT+PD   +  ++ D
Sbjct: 1167 RFIEGFSKLARPMTELLKKEKKFQWSVACEDSFQEMKKRLTTAPVLTLPDIRKDFEIFCD 1226

Query: 181  ASGKGLGCVLMQKGKVIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYGEKVQVF 240
            AS +GLGCVLMQ+ KV+AYASRQL+ +E NYPTHDLELAAVV ALK WRHYL G + +V+
Sbjct: 1227 ASRQGLGCVLMQERKVVAYASRQLRPHEVNYPTHDLELAAVVHALKIWRHYLIGNRCEVY 1286

Query: 241  TDHKSLKYLFTQKELNMRQRRWLELVKDYDIEILYHPGKASVVADALSRKAVHTSVMI-T 300
            TDHKSLKY+FTQ ELNMRQRRWLEL+KDYD+ I YHPGKA+VVADALSRK       I  
Sbjct: 1287 TDHKSLKYIFTQTELNMRQRRWLELIKDYDLGIHYHPGKANVVADALSRKVYCNVAQIWP 1346

Query: 301  TQEKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVWSQIETERP 360
             Q++L  E+++  + VV  G     A LT+QPTL  ++ +AQ+ DE + ++  +I+ ++ 
Sbjct: 1347 DQDRLCRELEKLRLTVVQSG---VPASLTVQPTLESQIREAQKDDEGIKELIKRIQEKKD 1406

Query: 361  VGYSISSDGGLLWQNRLCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQDLKRFYWWSGM 420
              +SI   G +    R+CVP  + +   I+ EAH+++Y  HPGSTKMYQD+K ++WW+GM
Sbjct: 1407 TNFSIDDQGTVWCGPRICVPAKKELRDLILKEAHESAYSIHPGSTKMYQDIKAYFWWAGM 1466

Query: 421  KRDIADFVSRCLTCQQVKAPRQRPAGLLQPLSVPRVEMGSSLYGFLFGFAKDKAGFQRRM 480
            KRD+A++V+ C  CQ+VKA  Q+PAGLLQPL +P  +       F+ G  +  +G+    
Sbjct: 1467 KRDVAEYVALCDVCQRVKAEHQKPAGLLQPLPIPEWKWEEIGMDFITGLPRTPSGYDSIW 1526

Query: 481  V---------------------------------------SIVSDRDTRFTSQFWRSLQK 540
            V                                        IVSDR T+FTS+FW  L +
Sbjct: 1527 VIVDRLTKSAHFVPVKTTFDGKKLAELYMTRVVCRFGCPKKIVSDRGTQFTSRFWNQLHE 1586

Query: 541  ALGTQLRFSTAFHPQTDGQTERLNQVLEDMLRACSLDFAGCWDEHLPLMEFAYNNSYQAT 600
            ALGT L FSTA+HPQTDGQTER+NQVLEDMLRAC+LDF G WD  LP  EF+YNNSYQA+
Sbjct: 1587 ALGTDLNFSTAYHPQTDGQTERVNQVLEDMLRACALDFEGTWDRCLPYAEFSYNNSYQAS 1646

Query: 601  IQMAPFEALYGRRCRTPVFWEEVGTQQLMGPELVQVTNAAV 602
            IQM+P EA++GR+CRTP+ W E G   + GP++++     V
Sbjct: 1647 IQMSPNEAMFGRKCRTPLCWNEAGEALVFGPDILKTAEEQV 1684

BLAST of CmoCh04G021120 vs. TAIR10
Match: ATMG00860.1 (ATMG00860.1 DNA/RNA polymerases superfamily protein)

HSP 1 Score: 117.5 bits (293), Expect = 2.9e-26
Identity = 57/125 (45.60%), Positives = 79/125 (63.20%), Query Frame = 1

Query: 48  HLRKVLTILRAQQLYAKFSKCEFWLSEVAFLGH--VVSSKGITVDPAKIEAVMRWPQPTT 107
           HL  VL I    Q YA   KC F   ++A+LGH  ++S +G++ DPAK+EA++ WP+P  
Sbjct: 3   HLGMVLQIWEQHQFYANRKKCAFGQPQIAYLGHRHIISGEGVSADPAKLEAMVGWPEPKN 62

Query: 108 VTEVRSFLGLAGCYRRFVQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPV 167
            TE+R FLGL G YRRFV+++ KI   LT+L KK     WT +   +F+ LK  + T PV
Sbjct: 63  TTELRGFLGLTGYYRRFVKNYGKIVRPLTELLKKNS-LKWTEMAALAFKALKGAVTTLPV 122

Query: 168 LTVPD 171
           L +PD
Sbjct: 123 LALPD 126
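
The percentages in an HSP summary line are the identical (or positive-scoring) column counts divided by the alignment length. As a minimal sketch, assuming the summary line has the layout shown in these BLAST blocks, the Python below recomputes them from the raw counts. The function name parse_hsp_stats is hypothetical and is not part of any BLAST tool; the pattern accepts both the "Postives" and "Positives" spellings of the label.

```python
import re

# Matches count/length pairs such as "Identity = 57/125 (45.60%)".
# "Posi?tives" accepts both the "Postives" and "Positives" label spellings.
_STAT = re.compile(r"(Identity|Posi?tives)\s*=\s*(\d+)/(\d+)")


def parse_hsp_stats(line):
    """Return {name: (count, length, percent)} recomputed from the raw counts.

    Hypothetical helper for illustration; the percent is derived from
    count/length rather than read from the parenthesised value.
    """
    stats = {}
    for label, count, length in _STAT.findall(line):
        name = "identity" if label == "Identity" else "positives"
        count, length = int(count), int(length)
        stats[name] = (count, length, round(100.0 * count / length, 2))
    return stats


# Summary line of the TAIR10 hit ATMG00860.1, copied from this page.
line = "Identity = 57/125 (45.60%), Postives = 79/125 (63.20%), Query Frame = 1"
stats = parse_hsp_stats(line)
```

Recomputing from the counts rather than trusting the printed percentage gives a quick consistency check when such lines are scraped from a web page.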

BLAST of CmoCh04G021120 vs. NCBI nr
Match: gi|28558781|gb|AAO45752.1| (pol protein [Cucumis melo subsp. melo])

HSP 1 Score: 830.9 bits (2145), Expect = 1.5e-237
Identity = 414/641 (64.59%), Positives = 495/641 (77.22%), Query Frame = 1

Query: 1   MSFGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQ 60
           MSFGLTNAPAVFM+LMNRVF+EFLDTFV+VFIDDIL+YSK+EA+HE HLR VL  LR  +
Sbjct: 115 MSFGLTNAPAVFMDLMNRVFREFLDTFVIVFIDDILIYSKTEAEHEEHLRMVLQTLRDNK 174

Query: 61  LYAKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYR 120
           LYAKFSKCEFWL +V+FLGHVVS  G++VDPAKIEAV  W +P+TV+EVRSFLGLAG YR
Sbjct: 175 LYAKFSKCEFWLKQVSFLGHVVSKAGVSVDPAKIEAVTGWTRPSTVSEVRSFLGLAGYYR 234

Query: 121 RFVQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSD 180
           RFV++FS+I++ LTQLT+KG PF W+  CE SFQ LK++LVTAPVLTVPDGSGN V+YSD
Sbjct: 235 RFVENFSRIATPLTQLTRKGAPFVWSKACEDSFQTLKQKLVTAPVLTVPDGSGNFVIYSD 294

Query: 181 ASGKGLGCVLMQKGKVIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYGEKVQVF 240
           AS KGLGCVLMQ+GKV+AYASRQLK +E+NYPTHDLELAAVVFALK WRHYLYGEK+Q+F
Sbjct: 295 ASKKGLGCVLMQQGKVVAYASRQLKSHEQNYPTHDLELAAVVFALKIWRHYLYGEKIQIF 354

Query: 241 TDHKSLKYLFTQKELNMRQRRWLELVKDYDIEILYHPGKASVVADALSRKAVHTSVMITT 300
           TDHKSLKY FTQKELNMRQRRWLELVKDYD EILYHPGKA+VVADALSRK  H++ +IT 
Sbjct: 355 TDHKSLKYFFTQKELNMRQRRWLELVKDYDCEILYHPGKANVVADALSRKVSHSAALITR 414

Query: 301 QEKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVWSQIETERPV 360
           Q  L  +++RA I V++    +Q+AQLT+QPTLR+++IDAQ +D +L +     E  +  
Sbjct: 415 QAPLHRDLERAEIAVLVGAVTMQLAQLTVQPTLRQRIIDAQSNDPYLVEKRGLAEAGQTA 474

Query: 361 GYSISSDGGLLWQNRLCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQDLKR-FYWWSGM 420
            +S+SSDGGLL++ RLCVP D  +  ++++EAH + +  HPGST+     +  F     M
Sbjct: 475 EFSLSSDGGLLFERRLCVPSDSAVKTELLSEAHSSPFSMHPGSTEDVSGPEAGFIGGRNM 534

Query: 421 KRDIADFVSRCLTCQQVKAPRQRPAGLLQPLSVPRVEMGSSLYGFLFGFAKDKAGF---- 480
           KR++A+FVS+CL CQQVKAPRQ+PAGLLQPLS+P  +  +    F+ G  +   GF    
Sbjct: 535 KREVAEFVSKCLVCQQVKAPRQKPAGLLQPLSIPEWKWENVSMDFITGLPRTLRGFTVIW 594

Query: 481 -------------------------QRRM----------VSIVSDRDTRFTSQFWRSLQK 540
                                    Q  M          VSIVSDRD RFTS+FW+ LQ 
Sbjct: 595 VVVDRLTKSAHFVPGKSTYTASKWAQLYMSEIVRLHGVPVSIVSDRDARFTSKFWKGLQT 654

Query: 541 ALGTQLRFSTAFHPQTDGQTERLNQVLEDMLRACSLDFAGCWDEHLPLMEFAYNNSYQAT 600
           A+GT+L FSTAFHPQTDGQTERLNQVLEDMLRAC+L+F G WD HL LMEFAYNNSYQAT
Sbjct: 655 AMGTRLDFSTAFHPQTDGQTERLNQVLEDMLRACALEFPGSWDSHLHLMEFAYNNSYQAT 714

Query: 601 IQMAPFEALYGRRCRTPVFWEEVGTQQLMGPELVQVTNAAV 602
           I MAPFEALYGR CR+PV W EVG Q+LMGPELVQ TN A+
Sbjct: 715 IGMAPFEALYGRCCRSPVCWGEVGEQRLMGPELVQSTNEAI 755

BLAST of CmoCh04G021120 vs. NCBI nr
Match: gi|702455653|ref|XP_010026793.1| (PREDICTED: uncharacterized protein LOC104417177 [Eucalyptus grandis])

HSP 1 Score: 722.6 bits (1864), Expect = 5.6e-205
Identity = 362/642 (56.39%), Positives = 455/642 (70.87%), Query Frame = 1

Query: 1    MSFGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQ 60
            M FGLTNAPA FM+LMNRVFKE+LD FV+VFIDDILVYS+S  DHE HLR VL  LR  +
Sbjct: 689  MPFGLTNAPAAFMDLMNRVFKEYLDQFVIVFIDDILVYSRSSEDHEKHLRIVLQTLRDHE 748

Query: 61   LYAKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYR 120
            LYAKFSKCEFWL+ VAFLGHV+S +GI+VDPAKIEAV+ WP+PTTVTE+RSFLGLAG YR
Sbjct: 749  LYAKFSKCEFWLTRVAFLGHVISGEGISVDPAKIEAVINWPRPTTVTEIRSFLGLAGYYR 808

Query: 121  RFVQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSD 180
            RFV+ FS+++S +T+L KK + F WT  CE SFQELK +L TAPVLT+P G G   +YSD
Sbjct: 809  RFVEGFSRLASPMTRLLKKEEKFVWTDKCENSFQELKHKLTTAPVLTIPSGPGGFEIYSD 868

Query: 181  ASGKGLGCVLMQKGKVIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYGEKVQVF 240
            AS KGLGCVLMQ G+V+AYASRQL+ +E NYPTHDLELAA++FALK WRHYL GE+ Q+F
Sbjct: 869  ASFKGLGCVLMQHGRVVAYASRQLRLHELNYPTHDLELAAIIFALKIWRHYLCGERFQIF 928

Query: 241  TDHKSLKYLFTQKELNMRQRRWLELVKDYDIEILYHPGKASVVADALSRKAVHTSVMITT 300
            TDH+SLKYLF+QKELNMRQRRW+EL+KDYD EILYHPGKA+ VADALSRK+   + M+  
Sbjct: 929  TDHQSLKYLFSQKELNMRQRRWMELLKDYDCEILYHPGKANKVADALSRKS-SVAQMVLK 988

Query: 301  QEKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVWSQIETERPV 360
            +  L +  + +     +   +  +A L I+P ++ K+   Q+ D  + K+  +   +R  
Sbjct: 989  EWGLIERARDSDFKFEVGHLSNLVATLRIEPEVQVKIRTLQQMDSDVQKILQEDAEKRKA 1048

Query: 361  GYSISSDGGLLWQNRLCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQDLKRFYWWSGMK 420
             + IS DG L +Q RL VP D  + ++I++EAH ++Y  HPGSTKMYQ+L++ YWW GMK
Sbjct: 1049 DFQISEDGTLRFQGRLVVPDDVELREEILSEAHRSNYSIHPGSTKMYQNLRQHYWWCGMK 1108

Query: 421  RDIADFVSRCLTCQQVKAPRQRPAGLLQPLSVPRVEMGSSLYGFLFGFAKDKAGFQRRMV 480
             DIA  V++CLTCQQVKA   +P GLL+PL +P  +       F+ G  + + G     +
Sbjct: 1109 ADIAKHVAKCLTCQQVKAQHCKPGGLLRPLEIPEWKWEHITMDFVTGLPRSQRG--NDSI 1168

Query: 481  SIVSDRDTRFT-----------------------------------------SQFWRSLQ 540
             +V DR T+                                           + FW+SLQ
Sbjct: 1169 WVVVDRLTKSAHFIAVRRDLSLDRLADLYVRQVVRMHGVPVTITSDRDPRFTAAFWKSLQ 1228

Query: 541  KALGTQLRFSTAFHPQTDGQTERLNQVLEDMLRACSLDFAGCWDEHLPLMEFAYNNSYQA 600
             ALGT+L++STA+HPQTDGQ+ER  Q LEDMLRAC LDF G W+E L L+EFAYNNSYQ 
Sbjct: 1229 SALGTKLQYSTAYHPQTDGQSERTIQTLEDMLRACVLDFKGSWEEQLHLVEFAYNNSYQQ 1288

Query: 601  TIQMAPFEALYGRRCRTPVFWEEVGTQQLMGPELVQVTNAAV 602
            +IQMAPFEALYGR CRTPV W+EVG +++ GPELVQ +  AV
Sbjct: 1289 SIQMAPFEALYGRACRTPVCWDEVGERKITGPELVQQSVEAV 1327

BLAST of CmoCh04G021120 vs. NCBI nr
Match: gi|590648676|ref|XP_007032220.1| (Retrotransposon protein, putative [Theobroma cacao])

HSP 1 Score: 716.8 bits (1849), Expect = 3.1e-203
Identity = 351/599 (58.60%), Positives = 443/599 (73.96%), Query Frame = 1

Query: 1    MSFGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQ 60
            MSFGLTNAPA FM+LMNRVFK +LD FV+VFIDDIL+YSKS  +HE HL+ VL ILR  +
Sbjct: 765  MSFGLTNAPAAFMDLMNRVFKPYLDKFVVVFIDDILIYSKSREEHEQHLKIVLQILREHR 824

Query: 61   LYAKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYR 120
            LYAKFSKCEFWL  VAFLGHVVS +GI VD  KIEAV +WP+ T+VTE+RSF+GLAG YR
Sbjct: 825  LYAKFSKCEFWLERVAFLGHVVSREGIQVDTKKIEAVEKWPRSTSVTEIRSFVGLAGYYR 884

Query: 121  RFVQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSD 180
            RFV+DFSKI + LT+LT+K   F W+  CE SF++LK  L TAPVL++  G+G   V+ D
Sbjct: 885  RFVKDFSKIVALLTKLTRKDTKFEWSDACENSFEKLKACLTTAPVLSLLQGTGGYTVFCD 944

Query: 181  ASGKGLGCVLMQKGKVIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYGEKVQVF 240
            ASG GLGCVLMQ GKVIAYASRQLK +E+NYP HDLE+AA+VFALK WRHYLYGE  +++
Sbjct: 945  ASGVGLGCVLMQHGKVIAYASRQLKRHEQNYPIHDLEMAAIVFALKIWRHYLYGETCEIY 1004

Query: 241  TDHKSLKYLFTQKELNMRQRRWLELVKDYDIEILYHPGKASVVADALSRKAV----HTSV 300
            TDHKSLKY+F Q++LN+RQ RW+EL+KDYD  ILYHPGKA+VVADALSRK++    H S+
Sbjct: 1005 TDHKSLKYIFQQRDLNLRQHRWMELLKDYDCTILYHPGKANVVADALSRKSMGSLAHISI 1064

Query: 301  MITTQEKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVWSQIET 360
               +  +    +   G+ + +   N  +A   ++P L  ++ +AQ  DE + K     + 
Sbjct: 1065 GRRSLVREIHSLGDIGVRLEVAETNALLAHFRVRPILMDRIKEAQSKDEFVIKALEDPQG 1124

Query: 361  ERPVGYSISSDGGLLWQNRLCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQDLKRFYWW 420
             +   ++  +DG L +  RL VP  +G+ ++I+ EAH  +Y+ HPG+ KMYQDLK  YWW
Sbjct: 1125 RKGKMFTKGTDGVLRYGTRLYVPDGDGLRREILEEAHMAAYVVHPGALKMYQDLKGVYWW 1184

Query: 421  SGMKRDIADFVSRCLTCQQVKAPRQRPAGLLQPLSVPRVEMGSSLYGFLFGFAKDKAGFQ 480
             G+KRD+A+FVS+CL CQQVKA  Q+PAGLLQPL VPRVE+G+  YG   G A       
Sbjct: 1185 EGLKRDVAEFVSKCLVCQQVKAEHQKPAGLLQPLPVPRVEVGTYCYGLCNGGA------- 1244

Query: 481  RRMVSIVSDRDTRFTSQFWRSLQKALGTQLRFSTAFHPQTDGQTERLNQVLEDMLRACSL 540
                        +FTS+FW  LQ+ALGT+  FSTAFHPQTDGQ+ER  Q LEDMLRAC +
Sbjct: 1245 ------------QFTSRFWGKLQEALGTKFDFSTAFHPQTDGQSERTIQTLEDMLRACVI 1304

Query: 541  DFAGCWDEHLPLMEFAYNNSYQATIQMAPFEALYGRRCRTPVFWEEVGTQQLMGPELVQ 596
            D    W+++LPL+EFAYNNS+Q +IQMAPFEALYGRRCR+P+ W EVG ++L+GPELVQ
Sbjct: 1305 DLGVRWEQYLPLVEFAYNNSFQTSIQMAPFEALYGRRCRSPIGWLEVGERKLLGPELVQ 1344

BLAST of CmoCh04G021120 vs. NCBI nr
Match: gi|590568718|ref|XP_007010875.1| (DNA/RNA polymerases superfamily protein, putative [Theobroma cacao])

HSP 1 Score: 710.7 bits (1833), Expect = 2.2e-201
Identity = 353/606 (58.25%), Positives = 437/606 (72.11%), Query Frame = 1

Query: 3    FGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQLY 62
            FGLTNAPA FM+LMNRVF  +L  FV+VFIDDILVYS+   +H  HLR VL  LR +QLY
Sbjct: 555  FGLTNAPAAFMDLMNRVFHPYLGKFVIVFIDDILVYSRDNDEHAAHLRIVLQTLREKQLY 614

Query: 63   AKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYRRF 122
            AKFSKCEFWL EV FLGHVVS  GI VDP K+EA+++W QP TVTE+RSFLGLAG YRRF
Sbjct: 615  AKFSKCEFWLQEVVFLGHVVSRTGIYVDPKKVEAILQWEQPKTVTEIRSFLGLAGYYRRF 674

Query: 123  VQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSDAS 182
            VQ FS I++ LT+LT+KG  F    VCE  FQELK RL +APVLT+P      VVYSDAS
Sbjct: 675  VQGFSLIAAPLTRLTRKGVKFVCDDVCENRFQELKNRLTSAPVLTLPVNGKGFVVYSDAS 734

Query: 183  GKGLGCVLMQKGKVIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYGEKVQVFTD 242
              GLGCVLMQ  KV+AYASRQLK +E NYPTHDLELAAVVFALK WRHYLYGE  ++FTD
Sbjct: 735  KLGLGCVLMQDEKVVAYASRQLKRHEANYPTHDLELAAVVFALKIWRHYLYGEHCRIFTD 794

Query: 243  HKSLKYLFTQKELNMRQRRWLELVKDYDIEILYHPGKASVVADALSRKAVHT-SVMITTQ 302
            HKSLKYL TQKELN+RQRRWLEL+KDYD+ I YHPGKA+VVADALSRK+  + + + +  
Sbjct: 795  HKSLKYLLTQKELNLRQRRWLELIKDYDLVIDYHPGKANVVADALSRKSSSSLAALQSCY 854

Query: 303  EKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVWSQIETERPVG 362
                 EMK  G+ +        +A   ++P+L  ++ D QRSD+ L K   ++       
Sbjct: 855  FSALIEMKSLGVQLRNGEDGSVLANFIVRPSLLNQIKDIQRSDDELRKEIQKLTDGGVSE 914

Query: 363  YSISSDGGLLWQNRLCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQDLKRFYWWSGMKR 422
            +    D  L++++R+CVP    + + IM EAH ++Y  +PGSTKMY+ ++  YWW GMKR
Sbjct: 915  FRFGEDNVLMFRDRVCVPEGNQLRQTIMEEAHSSAYALNPGSTKMYRTIRENYWWPGMKR 974

Query: 423  DIADFVSRCLTCQQVKAPRQRPAGLLQPLSVPRVEMGSSLYGFLFGFAKDKAG------- 482
            D+A+FV++CL CQQVKA  QRP G  Q L V   +       F+ G  + + G       
Sbjct: 975  DVAEFVAKCLVCQQVKAEHQRPVGTFQSLPVLEWKWEHVTMDFVLGLPRTQRGKDAIYEI 1034

Query: 483  --FQRRMVSIVSDRDTRFTSQFWRSLQKALGTQLRFSTAFHPQTDGQTERLNQVLEDMLR 542
                  +VSIVSDRD RFTS+FW   Q+ALGT+L+FSTAFHPQTDGQ+ER  Q LEDMLR
Sbjct: 1035 VRLHGVLVSIVSDRDPRFTSRFWPKFQEALGTKLKFSTAFHPQTDGQSERTIQTLEDMLR 1094

Query: 543  ACSLDFAGCWDEHLPLMEFAYNNSYQATIQMAPFEALYGRRCRTPVFWEEVGTQQLMGPE 599
            AC +DF G WD HLPL+EFAYNNS+Q++I MAP+EALY R+CRTP+ W+EVG ++L+  E
Sbjct: 1095 ACVIDFIGSWDRHLPLVEFAYNNSFQSSIGMAPYEALYERKCRTPLCWDEVGERKLVSVE 1154

BLAST of CmoCh04G021120 vs. NCBI nr
Match: gi|32489661|emb|CAE04240.1| (OSJNBa0089N06.1 [Oryza sativa Japonica Group])

HSP 1 Score: 710.3 bits (1832), Expect = 2.9e-201
Identity = 347/635 (54.65%), Positives = 449/635 (70.71%), Query Frame = 1

Query: 1    MSFGLTNAPAVFMELMNRVFKEFLDTFVLVFIDDILVYSKSEADHEIHLRKVLTILRAQQ 60
            MSFGLTNAPA FM LMN++F E+LD FV+VFIDDIL+YSK+E +H  HLR ++  LR  Q
Sbjct: 1047 MSFGLTNAPAFFMNLMNKIFMEYLDQFVVVFIDDILIYSKNEEEHAEHLRLIMEKLRDHQ 1106

Query: 61   LYAKFSKCEFWLSEVAFLGHVVSSKGITVDPAKIEAVMRWPQPTTVTEVRSFLGLAGCYR 120
            L+AKFSKCEFWL  VAFLGHV+SS G+ VDP+K+EAV+ W  P  V+E+RSFLGLAG YR
Sbjct: 1107 LFAKFSKCEFWLDRVAFLGHVISSNGVEVDPSKVEAVLAWNPPKNVSEIRSFLGLAGYYR 1166

Query: 121  RFVQDFSKISSALTQLTKKGKPFAWTPVCEQSFQELKKRLVTAPVLTVPDGSGNLVVYSD 180
            RF++ FSK++  +T+L KK K F W+  CE SFQE+KKRL TAPVLT+PD   +  ++ D
Sbjct: 1167 RFIEGFSKLARPMTELLKKEKKFQWSTACEDSFQEMKKRLTTAPVLTLPDIRKDFEIFCD 1226

Query: 181  ASGKGLGCVLMQKGKVIAYASRQLKEYERNYPTHDLELAAVVFALKTWRHYLYGEKVQVF 240
            AS +GLGCVLMQ+ KV+AYASRQL+ +E NYPTHDLELAAV+ ALK WRHYL G + +V+
Sbjct: 1227 ASRQGLGCVLMQERKVVAYASRQLRPHEVNYPTHDLELAAVIHALKIWRHYLIGNRCEVY 1286

Query: 241  TDHKSLKYLFTQKELNMRQRRWLELVKDYDIEILYHPGKASVVADALSRKAVHTSVMI-T 300
            TDHKSLKY+FTQ ELNMRQRRWLEL+KDYD+ I YHPGKA+VVADALSRKA      I  
Sbjct: 1287 TDHKSLKYIFTQTELNMRQRRWLELIKDYDLGIHYHPGKANVVADALSRKAYCNIAQIRP 1346

Query: 301  TQEKLQDEMKRAGIDVVIKGGNVQIAQLTIQPTLRKKVIDAQRSDEHLSKVWSQIETERP 360
             Q+ L  E+++  + VV  G     A LT+QPTL  ++ +AQ+ DE + ++  +I+ ++ 
Sbjct: 1347 DQDHLCRELEKLRLTVVQSG---VPASLTVQPTLESQIREAQKDDEGIKELIKRIQEKKD 1406

Query: 361  VGYSISSDGGLLWQNRLCVPRDEGILKDIMTEAHDTSYMFHPGSTKMYQDLKRFYWWSGM 420
              +SI   G +    R+CVP  + +   I+ EAH+++Y  HPGSTKMYQD+K ++WW+GM
Sbjct: 1407 TNFSIDDQGTIWCGPRICVPAKKELRNLILKEAHESAYSIHPGSTKMYQDIKAYFWWAGM 1466

Query: 421  KRDIADFVSRCLTCQQVKAPRQRPAGLLQPLSVPRVEMGSSLYGFLFGFAKDKAGFQRRM 480
            KRD+A++V+ C  CQ+VKA  QRPAGLLQPL +P  +       F+ G  +  +G+    
Sbjct: 1467 KRDVAEYVALCDICQRVKAEHQRPAGLLQPLPIPEWKWEEIGMDFITGLPRTPSGYDSIW 1526

Query: 481  V---------------------------------------SIVSDRDTRFTSQFWRSLQK 540
            V                                        IVSDR T+FTS+FW+ L +
Sbjct: 1527 VIVDRLTKSAHFVPVKTTYDGKKLAELYMTHVVCRFGCPKKIVSDRGTQFTSRFWKQLHE 1586

Query: 541  ALGTQLRFSTAFHPQTDGQTERLNQVLEDMLRACSLDFAGCWDEHLPLMEFAYNNSYQAT 596
            ALGT L FSTA+HPQTDGQTER+NQ+LEDMLRAC+LDF G WD  LP  EF+YNNSYQA+
Sbjct: 1587 ALGTDLNFSTAYHPQTDGQTERVNQILEDMLRACALDFEGTWDRCLPYAEFSYNNSYQAS 1646

The following BLAST results are available for this feature:
Match Name   E-value  Identity  Description
TF212_SCHPO  1.2e-77  30.21     Transposon Tf2-12 polyprotein OS=Schizosaccharomyces pombe (strain 972 / ATCC 24...
TF21_SCHPO   1.2e-77  30.21     Transposon Tf2-1 polyprotein OS=Schizosaccharomyces pombe (strain 972 / ATCC 248...
TF22_SCHPO   1.2e-77  30.21     Transposon Tf2-2 polyprotein OS=Schizosaccharomyces pombe (strain 972 / ATCC 248...
TF23_SCHPO   1.2e-77  30.21     Transposon Tf2-3 polyprotein OS=Schizosaccharomyces pombe (strain 972 / ATCC 248...
TF29_SCHPO   1.2e-77  30.21     Transposon Tf2-9 polyprotein OS=Schizosaccharomyces pombe (strain 972 / ATCC 248...

Match Name        E-value   Identity  Description
Q84KB0_CUCME      1.0e-237  64.59     Pol protein OS=Cucumis melo subsp. melo PE=4 SV=1
A0A061EE03_THECC  2.1e-203  58.60     Retrotransposon protein, putative OS=Theobroma cacao GN=TCM_017700 PE=4 SV=1
A0A061FQY7_THECC  1.5e-201  58.25     DNA/RNA polymerases superfamily protein, putative OS=Theobroma cacao GN=TCM_0448...
Q7X726_ORYSJ      2.0e-201  54.65     OSJNBa0089N06.1 protein OS=Oryza sativa subsp. japonica GN=OSJNBa0089N06.1 PE=4 ...
Q7XPS2_ORYSJ      7.6e-201  54.13     OSJNBa0065O17.9 protein OS=Oryza sativa subsp. japonica GN=OSJNBa0065O17.9 PE=4 ...

Match Name   E-value  Identity  Description
ATMG00860.1  2.9e-26  45.60     ATMG00860.1 DNA/RNA polymerases superfamily protein

Match Name                        E-value   Identity  Description
gi|28558781|gb|AAO45752.1|        1.5e-237  64.59     pol protein [Cucumis melo subsp. melo]
gi|702455653|ref|XP_010026793.1|  5.6e-205  56.39     PREDICTED: uncharacterized protein LOC104417177 [Eucalyptus grandis]
gi|590648676|ref|XP_007032220.1|  3.1e-203  58.60     Retrotransposon protein, putative [Theobroma cacao]
gi|590568718|ref|XP_007010875.1|  2.2e-201  58.25     DNA/RNA polymerases superfamily protein, putative [Theobroma cacao]
gi|32489661|emb|CAE04240.1|       2.9e-201  54.65     OSJNBa0089N06.1 [Oryza sativa Japonica Group]

The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term       Definition
IPR000477  RT_dom
IPR001584  Integrase_cat-core
IPR012337  RNaseH-like_sf
Vocabulary: Biological Process
Term        Definition
GO:0015074  DNA integration
Vocabulary: Molecular Function
Term        Definition
GO:0003676  nucleic acid binding
GO Assignments
This gene is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0015074 DNA integration
biological_process GO:0006310 DNA recombination
biological_process GO:0006508 proteolysis
biological_process GO:0006278 RNA-dependent DNA biosynthetic process
cellular_component GO:0005575 cellular_component
molecular_function GO:0004190 aspartic-type endopeptidase activity
molecular_function GO:0003677 DNA binding
molecular_function GO:0046872 metal ion binding
molecular_function GO:0003723 RNA binding
molecular_function GO:0003964 RNA-directed DNA polymerase activity
molecular_function GO:0003676 nucleic acid binding

The following mRNA feature(s) are a part of this gene:

Feature Name      Unique Name       Type
CmoCh04G021120.1  CmoCh04G021120.1  mRNA


Analysis Name: InterPro Annotations of Cucurbita moschata
Date Performed: 2017-05-19
IPR Term   IPR Description               Source   Source Term        Source Description   Alignment
IPR000477  Reverse transcriptase domain  PFAM     PF00078            RVT_1                coord: 1..81, score: 4.4
IPR000477  Reverse transcriptase domain  PROFILE  PS50878            RT_POL               coord: 1..82, score: 9
IPR001584  Integrase, catalytic core     PROFILE  PS50994            INTEGRASE            coord: 481..573, score: 1
IPR012337  Ribonuclease H-like domain    GENE3D   G3DSA:3.30.420.10                       coord: 479..577, score: 5.0E-20; coord: 173..267, score: 2.
IPR012337  Ribonuclease H-like domain    unknown  SSF53098           Ribonuclease H-like  coord: 480..567, score: 8.19
None       No IPR available              GENE3D   G3DSA:3.30.70.270                       coord: 2..81, score: 7.
None       No IPR available              PANTHER  PTHR24559          FAMILY NOT NAMED     coord: 1..590, score:
None       No IPR available              PANTHER  PTHR24559:SF207    SUBFAMILY NOT NAMED  coord: 1..590, score:
None       No IPR available              unknown  SSF56672           DNA/RNA polymerases  coord: 1..275, score: 6.67E

The following gene(s) are orthologous to this gene:

None

The following gene(s) are paralogous to this gene:

None