CmoCh20G011190 (gene) Cucurbita moschata (Rifu)

Name: CmoCh20G011190
Type: gene
Organism: Cucurbita moschata (Rifu)
Description: Retrotransposon protein, putative, Ty3-gypsy subclass
Location: Cmo_Chr20 : 10676309 .. 10678249 (+)
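
As a rough illustration of how the location line can be read, the sketch below parses the chromosome, coordinates and strand and derives the feature length. It assumes the coordinates are 1-based and inclusive, which this page does not state; `parse_location` is a hypothetical helper, not part of any database API.

```python
import re

def parse_location(text):
    """Parse 'chrom : start .. end (strand)' into its parts.

    Assumption: start and end are 1-based and inclusive.
    """
    m = re.fullmatch(r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*", text)
    if m is None:
        raise ValueError(f"unrecognized location: {text!r}")
    start, end = int(m.group(2)), int(m.group(3))
    return {
        "chrom": m.group(1),
        "start": start,
        "end": end,
        "strand": m.group(4),
        "length": end - start + 1,  # inclusive span
    }

loc = parse_location("Cmo_Chr20 : 10676309 .. 10678249 (+)")
print(loc["chrom"], loc["strand"], loc["length"])
```

Under that assumption the feature spans 1941 bp, that is 647 codons, which is consistent with the BLAST query coordinates on this page ending at residue 646, followed by a stop codon.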
The following sequences are available for this feature:

Gene sequence (with intron)

ATGTATGTGACAGATATGGACGTAGTTAAGAATTTAGACCCTGGTACATCCAATGTGGACTTGGGAAATGAGGGGCAGCCTGTAGAGGAAATTGCTCCAGCGGAGGCGGTTCCGGAGCCTGCTGCTCAGTCGGCATCCAGAGATCAGCCGACTGTTGTGATTACTTTGGAAGCATTACAATCATTGATTGAGAGTCGAGTAGATCAGGCAATGCAGAGCCGGGTGGATCAAGCGGTTCAGGCAGCCCTTGTTGGTCTTGGAAGCCAGGCGGCTCCAACAGTACCTGTATCGGGCCAGACGACATTGGTGTCTGAAGCACCAGGAGTAGGTGTTCAGACAGTAATACCTCCAACACGGTTGACAGAACTACCTGGTACAGCTGTGGTGACAGAGGCACCATCGCGGGTAGTAACTTATGGCCGACGATGTATGACAGAAGAGAGTGAGTACATACGAGATTTCATGAAACTTGGCCCGCCAACTTTTGGAGGAAAGGGGACTGATCCGGAGGCAGCTGAATGGTGGTTGGAATGTGTTGAAACAAAATTTACATTCTACAACTGCCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTACCTGTAGTGAATGAATTCCTGGATGTTTTTCCTGATAAATTACCTGGTTTACCACCAGAGCGGGAAGTCAACTTTGGTATTGAACTAGAACCAAGGACAACACCTATCTCTAAAGCTTCTTATAGGATGGCTCCCGCGGAATTGAAAGAGTTGAAGTTGCAATTACAGGAATTGTTGAACCAGGGATTCATACGACCTAGTGTGTCACCATGGGGAGCTCCAGTGTTGTTTGTGAAAAAGAAGGATGGCACACTTCGTCTCTGCATTGACTATAGGGAGCTGAATAAGGTGACCATAAAAAATAAGTATCCCTTGCCACGAATTGATGACTTATTTGACCAGCTTCAAGGGGCAGCGGTATTTTCGAAGATTGATCTTCGTTCTGGTTATCACCAGATAAGAGTCAAAGAAGATGACGTACCGAAGACAGCTTTTCGTACTCGGTATGGGCATTATGAATTTGTCGTGATGTCTTTTGGCTTGACTAATGCCCCTGCAGTGTTTATGGAGCTGATGAATCGGGTATTCCAGGATTTTCTGGATTCTTTTGTCATTGTGTTCATTGATGATATCTTGGTTTATTCCAAGACAAACGATGAACATGCAGAACATTTGAGGAAAGTTTTGTGGGTTCTACGTAAACAAAGATTATATGCCAAGTTCTCAAAATGCGAGTTTTGGCTTCAAAAGGTAGTATTCCTTGGTCATGTGGTATCCAAGGATGGTATAACTGTTGATCCAGCAAAGGTGGAGGCAGTTATAGGTTGGGTTCGACCAACTACAATTACTGAGGTGAGAAGTTTTCTGGGTTTAGCCGGATATTACAGGTGCTTTATTAAAGACTTTGCAAGGATTGCTGCACCACTGACTCAGTTAACCCGAAAAGGTAAGAAATTTGATTGGAGTCGAGCTTGTGAAAGTAGTTTTCAGGAACTCAAGGAAAGATTAGCGTCAGCCCCAGTGCTTATTGTACCTGACGGTACTGGGAACTTAGTAATTTATAGTGATGCTTCTAAGCATGGGTTGGGGTGCGTACTTATGCAAAACGGGAGAGTTATTGCTTATGCCTCTCGGCAATTAAAGGATTATGAACGCAATTACCCAACTCATGATTTAGAATTAGCTGCTGTGGTGTTTGCTCTGAAGATATGGAGACATTATCTGTACGGTGAGAGGATACAAGTATATACGGATCATAAGCTTTAG

mRNA sequence

ATGTATGTGACAGATATGGACGTAGTTAAGAATTTAGACCCTGGTACATCCAATGTGGACTTGGGAAATGAGGGGCAGCCTGTAGAGGAAATTGCTCCAGCGGAGGCGGTTCCGGAGCCTGCTGCTCAGTCGGCATCCAGAGATCAGCCGACTGTTGTGATTACTTTGGAAGCATTACAATCATTGATTGAGAGTCGAGTAGATCAGGCAATGCAGAGCCGGGTGGATCAAGCGGTTCAGGCAGCCCTTGTTGGTCTTGGAAGCCAGGCGGCTCCAACAGTACCTGTATCGGGCCAGACGACATTGGTGTCTGAAGCACCAGGAGTAGGTGTTCAGACAGTAATACCTCCAACACGGTTGACAGAACTACCTGGTACAGCTGTGGTGACAGAGGCACCATCGCGGGTAGTAACTTATGGCCGACGATGTATGACAGAAGAGAGTGAGTACATACGAGATTTCATGAAACTTGGCCCGCCAACTTTTGGAGGAAAGGGGACTGATCCGGAGGCAGCTGAATGGTGGTTGGAATGTGTTGAAACAAAATTTACATTCTACAACTGCCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTACCTGTAGTGAATGAATTCCTGGATGTTTTTCCTGATAAATTACCTGGTTTACCACCAGAGCGGGAAGTCAACTTTGGTATTGAACTAGAACCAAGGACAACACCTATCTCTAAAGCTTCTTATAGGATGGCTCCCGCGGAATTGAAAGAGTTGAAGTTGCAATTACAGGAATTGTTGAACCAGGGATTCATACGACCTAGTGTGTCACCATGGGGAGCTCCAGTGTTGTTTGTGAAAAAGAAGGATGGCACACTTCGTCTCTGCATTGACTATAGGGAGCTGAATAAGGTGACCATAAAAAATAAGTATCCCTTGCCACGAATTGATGACTTATTTGACCAGCTTCAAGGGGCAGCGGTATTTTCGAAGATTGATCTTCGTTCTGGTTATCACCAGATAAGAGTCAAAGAAGATGACGTACCGAAGACAGCTTTTCGTACTCGGTATGGGCATTATGAATTTGTCGTGATGTCTTTTGGCTTGACTAATGCCCCTGCAGTGTTTATGGAGCTGATGAATCGGGTATTCCAGGATTTTCTGGATTCTTTTGTCATTGTGTTCATTGATGATATCTTGGTTTATTCCAAGACAAACGATGAACATGCAGAACATTTGAGGAAAGTTTTGTGGGTTCTACGTAAACAAAGATTATATGCCAAGTTCTCAAAATGCGAGTTTTGGCTTCAAAAGGTAGTATTCCTTGGTCATGTGGTATCCAAGGATGGTATAACTGTTGATCCAGCAAAGGTGGAGGCAGTTATAGGTTGGGTTCGACCAACTACAATTACTGAGGTGAGAAGTTTTCTGGGTTTAGCCGGATATTACAGGTGCTTTATTAAAGACTTTGCAAGGATTGCTGCACCACTGACTCAGTTAACCCGAAAAGGTAAGAAATTTGATTGGAGTCGAGCTTGTGAAAGTAGTTTTCAGGAACTCAAGGAAAGATTAGCGTCAGCCCCAGTGCTTATTGTACCTGACGGTACTGGGAACTTAGTAATTTATAGTGATGCTTCTAAGCATGGGTTGGGGTGCGTACTTATGCAAAACGGGAGAGTTATTGCTTATGCCTCTCGGCAATTAAAGGATTATGAACGCAATTACCCAACTCATGATTTAGAATTAGCTGCTGTGGTGTTTGCTCTGAAGATATGGAGACATTATCTGTACGGTGAGAGGATACAAGTATATACGGATCATAAGCTTTAG

Coding sequence (CDS)

ATGTATGTGACAGATATGGACGTAGTTAAGAATTTAGACCCTGGTACATCCAATGTGGACTTGGGAAATGAGGGGCAGCCTGTAGAGGAAATTGCTCCAGCGGAGGCGGTTCCGGAGCCTGCTGCTCAGTCGGCATCCAGAGATCAGCCGACTGTTGTGATTACTTTGGAAGCATTACAATCATTGATTGAGAGTCGAGTAGATCAGGCAATGCAGAGCCGGGTGGATCAAGCGGTTCAGGCAGCCCTTGTTGGTCTTGGAAGCCAGGCGGCTCCAACAGTACCTGTATCGGGCCAGACGACATTGGTGTCTGAAGCACCAGGAGTAGGTGTTCAGACAGTAATACCTCCAACACGGTTGACAGAACTACCTGGTACAGCTGTGGTGACAGAGGCACCATCGCGGGTAGTAACTTATGGCCGACGATGTATGACAGAAGAGAGTGAGTACATACGAGATTTCATGAAACTTGGCCCGCCAACTTTTGGAGGAAAGGGGACTGATCCGGAGGCAGCTGAATGGTGGTTGGAATGTGTTGAAACAAAATTTACATTCTACAACTGCCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTACCTGTAGTGAATGAATTCCTGGATGTTTTTCCTGATAAATTACCTGGTTTACCACCAGAGCGGGAAGTCAACTTTGGTATTGAACTAGAACCAAGGACAACACCTATCTCTAAAGCTTCTTATAGGATGGCTCCCGCGGAATTGAAAGAGTTGAAGTTGCAATTACAGGAATTGTTGAACCAGGGATTCATACGACCTAGTGTGTCACCATGGGGAGCTCCAGTGTTGTTTGTGAAAAAGAAGGATGGCACACTTCGTCTCTGCATTGACTATAGGGAGCTGAATAAGGTGACCATAAAAAATAAGTATCCCTTGCCACGAATTGATGACTTATTTGACCAGCTTCAAGGGGCAGCGGTATTTTCGAAGATTGATCTTCGTTCTGGTTATCACCAGATAAGAGTCAAAGAAGATGACGTACCGAAGACAGCTTTTCGTACTCGGTATGGGCATTATGAATTTGTCGTGATGTCTTTTGGCTTGACTAATGCCCCTGCAGTGTTTATGGAGCTGATGAATCGGGTATTCCAGGATTTTCTGGATTCTTTTGTCATTGTGTTCATTGATGATATCTTGGTTTATTCCAAGACAAACGATGAACATGCAGAACATTTGAGGAAAGTTTTGTGGGTTCTACGTAAACAAAGATTATATGCCAAGTTCTCAAAATGCGAGTTTTGGCTTCAAAAGGTAGTATTCCTTGGTCATGTGGTATCCAAGGATGGTATAACTGTTGATCCAGCAAAGGTGGAGGCAGTTATAGGTTGGGTTCGACCAACTACAATTACTGAGGTGAGAAGTTTTCTGGGTTTAGCCGGATATTACAGGTGCTTTATTAAAGACTTTGCAAGGATTGCTGCACCACTGACTCAGTTAACCCGAAAAGGTAAGAAATTTGATTGGAGTCGAGCTTGTGAAAGTAGTTTTCAGGAACTCAAGGAAAGATTAGCGTCAGCCCCAGTGCTTATTGTACCTGACGGTACTGGGAACTTAGTAATTTATAGTGATGCTTCTAAGCATGGGTTGGGGTGCGTACTTATGCAAAACGGGAGAGTTATTGCTTATGCCTCTCGGCAATTAAAGGATTATGAACGCAATTACCCAACTCATGATTTAGAATTAGCTGCTGTGGTGTTTGCTCTGAAGATATGGAGACATTATCTGTACGGTGAGAGGATACAAGTATATACGGATCATAAGCTTTAG
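
The sequences above contain a run of N characters, which typically marks an unresolved stretch of the assembly. The sketch below shows one way such runs might be located before translating or aligning a sequence. The helper name `find_n_runs` and the short `toy` sequence are illustrative assumptions; the toy string is not the full sequence from this page.

```python
import re

def find_n_runs(seq):
    """Return (start, end, length) for each run of N, 1-based inclusive."""
    return [
        (m.start() + 1, m.end(), m.end() - m.start())
        for m in re.finditer(r"N+", seq.upper())
    ]

# Toy example: 16 known bases, a 12-base gap, then 9 known bases.
toy = "ATGTTCTACAACTGCC" + "N" * 12 + "TACCTGTAG"
print(find_n_runs(toy))
```

Codons that overlap such a run cannot be translated unambiguously, so alignments against this feature are most informative outside the gap.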

BLAST of CmoCh20G011190 vs. Swiss-Prot
Match: POL3_DROME (Retrovirus-related Pol polyprotein from transposon 17.6 OS=Drosophila melanogaster GN=pol PE=3 SV=1)

HSP 1 Score: 308.5 bits (789), Expect = 1.7e-82
Identity = 156/370 (42.16%), Positives = 234/370 (63.24%), Query Frame = 1

Query: 282 SKASYRMAPAELKELKLQLQELLNQGFIRPSVSPWGAPVLFVKKKDGT-----LRLCIDY 341
           SK SY  A  +  E++ Q+Q++LNQG IR S SP+ +P+  V KK         R+ IDY
Sbjct: 211 SKYSYPQAYEQ--EVESQIQDMLNQGIIRTSNSPYNSPIWVVPKKQDASGKQKFRIVIDY 270

Query: 342 RELNKVTIKNKYPLPRIDDLFDQLQGAAVFSKIDLRSGYHQIRVKEDDVPKTAFRTRYGH 401
           R+LN++T+ +++P+P +D++  +L     F+ IDL  G+HQI +  + V KTAF T++GH
Sbjct: 271 RKLNEITVGDRHPIPNMDEILGKLGRCNYFTTIDLAKGFHQIEMDPESVSKTAFSTKHGH 330

Query: 402 YEFVVMSFGLTNAPAVFMELMNRVFQDFLDSFVIVFIDDILVYSKTNDEHAEHLRKVLWV 461
           YE++ M FGL NAPA F   MN + +  L+   +V++DDI+V+S + DEH + L  V   
Sbjct: 331 YEYLRMPFGLKNAPATFQRCMNDILRPLLNKHCLVYLDDIIVFSTSLDEHLQSLGLVFEK 390

Query: 462 LRKQRLYAKFSKCEFWLQKVVFLGHVVSKDGITVDPAKVEAVIGWVRPTTITEVRSFLGL 521
           L K  L  +  KCEF  Q+  FLGHV++ DGI  +P K+EA+  +  PT   E+++FLGL
Sbjct: 391 LAKANLKLQLDKCEFLKQETTFLGHVLTPDGIKPNPEKIEAIQKYPIPTKPKEIKAFLGL 450

Query: 522 AGYYRCFIKDFARIAAPLTQLTRKGKKFDWSR-ACESSFQELKERLASAPVLIVPDGTGN 581
            GYYR FI +FA IA P+T+  +K  K D +    +S+F++LK  ++  P+L VPD T  
Sbjct: 451 TGYYRKFIPNFADIAKPMTKCLKKNMKIDTTNPEYDSAFKKLKYLISEDPILKVPDFTKK 510

Query: 582 LVIYSDASKHGLGCVLMQNGRVIAYASRQLKDYERNYPTHDLELAAVVFALKIWRHYLYG 641
             + +DAS   LG VL Q+G  ++Y SR L ++E NY T + EL A+V+A K +RHYL G
Sbjct: 511 FTLTTDASDVALGAVLSQDGHPLSYISRTLNEHEINYSTIEKELLAIVWATKTFRHYLLG 570

Query: 642 ERIQVYTDHK 646
              ++ +DH+
Sbjct: 571 RHFEISSDHQ 578
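
The percentages on each HSP summary line are the stated counts divided by the alignment length. A minimal sketch of that arithmetic, using the first Swiss-Prot hit as input, is given below; `parse_hsp_stats` is a hypothetical helper written for this page's line layout, not a function from any BLAST toolkit.

```python
import re

def parse_hsp_stats(line):
    """Extract 'Name = n/d' pairs and return each as a percentage (2 decimals)."""
    pairs = re.findall(r"(\w+)\s*=\s*(\d+)/(\d+)", line)
    return {name.lower(): round(100.0 * int(n) / int(d), 2) for name, n, d in pairs}

stats = parse_hsp_stats(
    "Identity = 156/370 (42.16%), Positives = 234/370 (63.24%), Query Frame = 1"
)
print(stats)
```

Fields without a fraction, such as the query frame, are ignored by the pattern, so only the identity and positives ratios are returned.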

BLAST of CmoCh20G011190 vs. Swiss-Prot
Match: POL2_DROME (Retrovirus-related Pol polyprotein from transposon 297 OS=Drosophila melanogaster GN=pol PE=3 SV=1)

HSP 1 Score: 302.0 bits (772), Expect = 1.6e-80
Identity = 152/373 (40.75%), Positives = 227/373 (60.86%), Query Frame = 1

Query: 279 TPISKASYRMAPAELKELKLQLQELLNQGFIRPSVSPWGAPVLFVKKKDGT-----LRLC 338
           +PI    Y +A     E++ Q+QE+LNQG IR S SP+ +P   V KK         R+ 
Sbjct: 205 SPIYSKQYPLAQTHEIEVENQVQEMLNQGLIRESNSPYNSPTWVVPKKPDASGANKYRVV 264

Query: 339 IDYRELNKVTIKNKYPLPRIDDLFDQLQGAAVFSKIDLRSGYHQIRVKEDDVPKTAFRTR 398
           IDYR+LN++TI ++YP+P +D++  +L     F+ IDL  G+HQI + E+ + KTAF T+
Sbjct: 265 IDYRKLNEITIPDRYPIPNMDEILGKLGKCQYFTTIDLAKGFHQIEMDEESISKTAFSTK 324

Query: 399 YGHYEFVVMSFGLTNAPAVFMELMNRVFQDFLDSFVIVFIDDILVYSKTNDEHAEHLRKV 458
            GHYE++ M FGL NAPA F   MN + +  L+   +V++DDI+++S +  EH   ++ V
Sbjct: 325 SGHYEYLRMPFGLRNAPATFQRCMNNILRPLLNKHCLVYLDDIIIFSTSLTEHLNSIQLV 384

Query: 459 LWVLRKQRLYAKFSKCEFWLQKVVFLGHVVSKDGITVDPAKVEAVIGWVRPTTITEVRSF 518
              L    L  +  KCEF  ++  FLGH+V+ DGI  +P KV+A++ +  PT   E+R+F
Sbjct: 385 FTKLADANLKLQLDKCEFLKKEANFLGHIVTPDGIKPNPIKVKAIVSYPIPTKDKEIRAF 444

Query: 519 LGLAGYYRCFIKDFARIAAPLTQLTRKGKKFDWSR-ACESSFQELKERLASAPVLIVPDG 578
           LGL GYYR FI ++A IA P+T   +K  K D  +     +F++LK  +   P+L +PD 
Sbjct: 445 LGLTGYYRKFIPNYADIAKPMTSCLKKRTKIDTQKLEYIEAFEKLKALIIRDPILQLPDF 504

Query: 579 TGNLVIYSDASKHGLGCVLMQNGRVIAYASRQLKDYERNYPTHDLELAAVVFALKIWRHY 638
               V+ +DAS   LG VL QNG  I++ SR L D+E NY   + EL A+V+A K +RHY
Sbjct: 505 EKKFVLTTDASNLALGAVLSQNGHPISFISRTLNDHELNYSAIEKELLAIVWATKTFRHY 564

Query: 639 LYGERIQVYTDHK 646
           L G +  + +DH+
Sbjct: 565 LLGRQFLIASDHQ 577

BLAST of CmoCh20G011190 vs. Swiss-Prot
Match: POL5_DROME (Retrovirus-related Pol polyprotein from transposon opus OS=Drosophila melanogaster GN=pol PE=3 SV=1)

HSP 1 Score: 278.9 bits (712), Expect = 1.4e-73
Identity = 162/420 (38.57%), Positives = 231/420 (55.00%), Query Frame = 1

Query: 247 VVNEFLDVFPDKLPGLPPEREVNFGIELEPRTTPISKASYRMAPAELKELKLQLQELLNQ 306
           ++ EF  +F   L G+  E  V   I    +  PI   SY        E++ Q+ ELL  
Sbjct: 91  LLGEFPRIFEPPLSGMSVETAVKAEIRTNTQD-PIYAKSYPYPVNMRGEVERQIDELLQD 150

Query: 307 GFIRPSVSPWGAPVLFVKKK-----DGTLRLCIDYRELNKVTIKNKYPLPRIDDLFDQLQ 366
           G IRPS SP+ +P+  V KK     +   R+ +D++ LN VTI + YP+P I+     L 
Sbjct: 151 GIIRPSNSPYNSPIWIVPKKPKPNGEKQYRMVVDFKRLNTVTIPDTYPIPDINATLASLG 210

Query: 367 GAAVFSKIDLRSGYHQIRVKEDDVPKTAFRTRYGHYEFVVMSFGLTNAPAVFMELMNRVF 426
            A  F+ +DL SG+HQI +KE D+PKTAF T  G YEF+ + FGL NAPA+F  +++ + 
Sbjct: 211 NAKYFTTLDLTSGFHQIHMKESDIPKTAFSTLNGKYEFLRLPFGLKNAPAIFQRMIDDIL 270

Query: 427 QDFLDSFVIVFIDDILVYSKTNDEHAEHLRKVLWVLRKQRLYAKFSKCEFWLQKVVFLGH 486
           ++ +     V+IDDI+V+S+  D H ++LR VL  L K  L     K  F   +V FLG+
Sbjct: 271 REHIGKVCYVYIDDIIVFSEDYDTHWKNLRLVLASLSKANLQVNLEKSHFLDTQVEFLGY 330

Query: 487 VVSKDGITVDPAKVEAVIGWVRPTTITEVRSFLGLAGYYRCFIKDFARIAAPLTQLTR-- 546
           +V+ DGI  DP KV A+     PT++ E++ FLG+  YYR FI+D+A++A PLT LTR  
Sbjct: 331 IVTADGIKADPKKVRAISEMPPPTSVKELKRFLGMTSYYRKFIQDYAKVAKPLTNLTRGL 390

Query: 547 ---------KGKKFDWSRACESSFQELKERLASAPVLIVPDGTGNLVIYSDASKHGLGCV 606
                                 SF +LK  L S+ +L  P  T    + +DAS   +G V
Sbjct: 391 YANIKSSQSSKVPITLDETALQSFNDLKSILCSSEILAFPCFTKPFHLTTDASNWAIGAV 450

Query: 607 LMQN----GRVIAYASRQLKDYERNYPTHDLELAAVVFALKIWRHYLYGE-RIQVYTDHK 646
           L Q+     R IAY SR L   E NY T + E+ A++++L   R YLYG   I+VYTDH+
Sbjct: 451 LSQDDQGRDRPIAYISRSLNKTEENYATIEKEMLAIIWSLDNLRAYLYGAGTIKVYTDHQ 509

BLAST of CmoCh20G011190 vs. Swiss-Prot
Match: YG31B_YEAST (Transposon Ty3-G Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=TY3B-G PE=1 SV=3)

HSP 1 Score: 271.6 bits (693), Expect = 2.3e-71
Identity = 149/404 (36.88%), Positives = 230/404 (56.93%), Query Frame = 1

Query: 250 EFLDVFPDKLPGLPPERE---VNFGIELEPRTTPISKASYRMAPAELKELKLQLQELLNQ 309
           ++ ++  + LP  P +     V   IE++P         Y +     +E+   +Q+LL+ 
Sbjct: 563 KYREIIRNDLPPRPADINNIPVKHDIEIKPGARLPRLQPYHVTEKNEQEINKIVQKLLDN 622

Query: 310 GFIRPSVSPWGAPVLFVKKKDGTLRLCIDYRELNKVTIKNKYPLPRIDDLFDQLQGAAVF 369
            FI PS SP  +PV+ V KKDGT RLC+DYR LNK TI + +PLPRID+L  ++  A +F
Sbjct: 623 KFIVPSKSPCSSPVVLVPKKDGTFRLCVDYRTLNKATISDPFPLPRIDNLLSRIGNAQIF 682

Query: 370 SKIDLRSGYHQIRVKEDDVPKTAFRTRYGHYEFVVMSFGLTNAPAVFMELMNRVFQDFLD 429
           + +DL SGYHQI ++  D  KTAF T  G YE+ VM FGL NAP+ F   M   F+D   
Sbjct: 683 TTLDLHSGYHQIPMEPKDRYKTAFVTPSGKYEYTVMPFGLVNAPSTFARYMADTFRDL-- 742

Query: 430 SFVIVFIDDILVYSKTNDEHAEHLRKVLWVLRKQRLYAKFSKCEFWLQKVVFLGHVVSKD 489
            FV V++DDIL++S++ +EH +HL  VL  L+ + L  K  KC+F  ++  FLG+ +   
Sbjct: 743 RFVNVYLDDILIFSESPEEHWKHLDTVLERLKNENLIVKKKKCKFASEETEFLGYSIGIQ 802

Query: 490 GITVDPAKVEAVIGWVRPTTITEVRSFLGLAGYYRCFIKDFARIAAPLTQLTRKGKKFDW 549
            I     K  A+  +  P T+ + + FLG+  YYR FI + ++IA P+        K  W
Sbjct: 803 KIAPLQHKCAAIRDFPTPKTVKQAQRFLGMINYYRRFIPNCSKIAQPIQLFI--CDKSQW 862

Query: 550 SRACESSFQELKERLASAPVLIVPDGTGNLVIYSDASKHGLGCVLMQNGR------VIAY 609
           +   + +  +LK+ L ++PVL+  +   N  + +DASK G+G VL +         V+ Y
Sbjct: 863 TEKQDKAIDKLKDALCNSPVLVPFNNKANYRLTTDASKDGIGAVLEEVDNKNKLVGVVGY 922

Query: 610 ASRQLKDYERNYPTHDLELAAVVFALKIWRHYLYGERIQVYTDH 645
            S+ L+  ++NYP  +LEL  ++ AL  +R+ L+G+   + TDH
Sbjct: 923 FSKSLESAQKNYPAGELELLGIIKALHHFRYMLHGKHFTLRTDH 962

BLAST of CmoCh20G011190 vs. Swiss-Prot
Match: YI31B_YEAST (Transposon Ty3-I Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=TY3B-I PE=3 SV=2)

HSP 1 Score: 271.2 bits (692), Expect = 3.0e-71
Identity = 149/404 (36.88%), Positives = 230/404 (56.93%), Query Frame = 1

Query: 250 EFLDVFPDKLPGLPPERE---VNFGIELEPRTTPISKASYRMAPAELKELKLQLQELLNQ 309
           ++ ++  + LP  P +     V   IE++P         Y +     +E+   +Q+LL+ 
Sbjct: 589 KYREIIRNDLPPRPADINNIPVKHDIEIKPGARLPRLQPYHVTEKNEQEINKIVQKLLDN 648

Query: 310 GFIRPSVSPWGAPVLFVKKKDGTLRLCIDYRELNKVTIKNKYPLPRIDDLFDQLQGAAVF 369
            FI PS SP  +PV+ V KKDGT RLC+DYR LNK TI + +PLPRID+L  ++  A +F
Sbjct: 649 KFIVPSKSPCSSPVVLVPKKDGTFRLCVDYRTLNKATISDPFPLPRIDNLLSRIGNAQIF 708

Query: 370 SKIDLRSGYHQIRVKEDDVPKTAFRTRYGHYEFVVMSFGLTNAPAVFMELMNRVFQDFLD 429
           + +DL SGYHQI ++  D  KTAF T  G YE+ VM FGL NAP+ F   M   F+D   
Sbjct: 709 TTLDLHSGYHQIPMEPKDRYKTAFVTPSGKYEYTVMPFGLVNAPSTFARYMADTFRDL-- 768

Query: 430 SFVIVFIDDILVYSKTNDEHAEHLRKVLWVLRKQRLYAKFSKCEFWLQKVVFLGHVVSKD 489
            FV V++DDIL++S++ +EH +HL  VL  L+ + L  K  KC+F  ++  FLG+ +   
Sbjct: 769 RFVNVYLDDILIFSESPEEHWKHLDTVLERLKNENLIVKKKKCKFASEETEFLGYSIGIQ 828

Query: 490 GITVDPAKVEAVIGWVRPTTITEVRSFLGLAGYYRCFIKDFARIAAPLTQLTRKGKKFDW 549
            I     K  A+  +  P T+ + + FLG+  YYR FI + ++IA P+        K  W
Sbjct: 829 KIAPLQHKCAAIRDFPTPKTVKQAQRFLGMINYYRRFIPNCSKIAQPIQLFI--CDKSQW 888

Query: 550 SRACESSFQELKERLASAPVLIVPDGTGNLVIYSDASKHGLGCVLMQNGR------VIAY 609
           +   + + ++LK  L ++PVL+  +   N  + +DASK G+G VL +         V+ Y
Sbjct: 889 TEKQDKAIEKLKAALCNSPVLVPFNNKANYRLTTDASKDGIGAVLEEVDNKNKLVGVVGY 948

Query: 610 ASRQLKDYERNYPTHDLELAAVVFALKIWRHYLYGERIQVYTDH 645
            S+ L+  ++NYP  +LEL  ++ AL  +R+ L+G+   + TDH
Sbjct: 949 FSKSLESAQKNYPAGELELLGIIKALHHFRYMLHGKHFTLRTDH 988

BLAST of CmoCh20G011190 vs. TrEMBL
Match: M5WLY8_PRUPE (Uncharacterized protein OS=Prunus persica GN=PRUPE_ppa021229mg PE=4 SV=1)

HSP 1 Score: 635.6 bits (1638), Expect = 6.7e-179
Identity = 297/400 (74.25%), Positives = 348/400 (87.00%), Query Frame = 1

Query: 246 PVVNEFLDVFPDKLPGLPPEREVNFGIELEPRTTPISKASYRMAPAELKELKLQLQELLN 305
           PV+ +F DVFP+ LPGLPP RE+ F IEL P T PIS+A YRMAPAEL+ELK QLQEL++
Sbjct: 231 PVIQDFPDVFPEDLPGLPPHREIEFVIELAPGTNPISQAPYRMAPAELRELKTQLQELVD 290

Query: 306 QGFIRPSVSPWGAPVLFVKKKDGTLRLCIDYRELNKVTIKNKYPLPRIDDLFDQLQGAAV 365
           +GFIRPS SPWGAPVLFVKKKDGT+RLC+DYR+LNK+T++N+YPLPRIDDLFDQL+GA V
Sbjct: 291 KGFIRPSFSPWGAPVLFVKKKDGTMRLCVDYRQLNKITVRNRYPLPRIDDLFDQLKGAKV 350

Query: 366 FSKIDLRSGYHQIRVKEDDVPKTAFRTRYGHYEFVVMSFGLTNAPAVFMELMNRVFQDFL 425
           FSKIDLRSGYHQ+RV+E+D+PKTAFRTRYGHYEF+VM FGLTNAPA FM+LMNRVF+ +L
Sbjct: 351 FSKIDLRSGYHQLRVREEDMPKTAFRTRYGHYEFLVMPFGLTNAPAAFMDLMNRVFRRYL 410

Query: 426 DSFVIVFIDDILVYSKTNDEHAEHLRKVLWVLRKQRLYAKFSKCEFWLQKVVFLGHVVSK 485
           D FVIVFIDDILVYSK+   H +HL  VL  LR+++LYAKFSKC+FWL +V FLGHV+S 
Sbjct: 411 DRFVIVFIDDILVYSKSQKAHMKHLNLVLRTLRRRQLYAKFSKCQFWLDRVSFLGHVISA 470

Query: 486 DGITVDPAKVEAVIGWVRPTTITEVRSFLGLAGYYRCFIKDFARIAAPLTQLTRKGKKFD 545
           +GI VDP K+EAV+ W+RPT++TE+RSFLGLAGYYR F++ F+ IAAPLT LTRKG KF 
Sbjct: 471 EGIYVDPQKIEAVVNWLRPTSVTEIRSFLGLAGYYRRFVEGFSTIAAPLTYLTRKGVKFV 530

Query: 546 WSRACESSFQELKERLASAPVLIVPDGTGNLVIYSDASKHGLGCVLMQNGRVIAYASRQL 605
           WS  CE SF ELK RL +APVL +PD +GN VIYSDAS+ GLGCVLMQ+GRVIAYASRQL
Sbjct: 531 WSDKCEESFIELKTRLTTAPVLALPDDSGNFVIYSDASQQGLGCVLMQHGRVIAYASRQL 590

Query: 606 KDYERNYPTHDLELAAVVFALKIWRHYLYGERIQVYTDHK 646
           K +E NYP HDLELAAVVFALKIWRHYLYGE  Q++TDHK
Sbjct: 591 KKHELNYPVHDLELAAVVFALKIWRHYLYGETCQIFTDHK 630

BLAST of CmoCh20G011190 vs. TrEMBL
Match: A0A061E6T4_THECC (DNA/RNA polymerases superfamily protein OS=Theobroma cacao GN=TCM_009549 PE=4 SV=1)

HSP 1 Score: 630.9 bits (1626), Expect = 1.7e-177
Identity = 297/399 (74.44%), Positives = 345/399 (86.47%), Query Frame = 1

Query: 247 VVNEFLDVFPDKLPGLPPEREVNFGIELEPRTTPISKASYRMAPAELKELKLQLQELLNQ 306
           +V+EF DVFPD LPGLPP+RE+ F I+L P T PIS   YRMAP ELKELK+QLQEL+++
Sbjct: 544 IVSEFPDVFPDDLPGLPPDRELEFPIDLLPGTAPISIPPYRMAPTELKELKVQLQELVDK 603

Query: 307 GFIRPSVSPWGAPVLFVKKKDGTLRLCIDYRELNKVTIKNKYPLPRIDDLFDQLQGAAVF 366
           GFIRPS+SPWGAP+LFVKKKDGTLRLCID R+LN++TIKNKYPLPRIDDLFDQLQGA VF
Sbjct: 604 GFIRPSISPWGAPILFVKKKDGTLRLCIDCRQLNRMTIKNKYPLPRIDDLFDQLQGATVF 663

Query: 367 SKIDLRSGYHQIRVKEDDVPKTAFRTRYGHYEFVVMSFGLTNAPAVFMELMNRVFQDFLD 426
           SK+DLRSGYHQ+R+KE DVPKTAFRTRYGHYEF+VM FGLTNAPA FM+LMNRVF  +LD
Sbjct: 664 SKVDLRSGYHQLRIKEQDVPKTAFRTRYGHYEFLVMPFGLTNAPAAFMDLMNRVFHPYLD 723

Query: 427 SFVIVFIDDILVYSKTNDEHAEHLRKVLWVLRKQRLYAKFSKCEFWLQKVVFLGHVVSKD 486
            FVIVFIDDILVYS+ NDEHA HLR VL  LR+++LYAKFSKCEFWLQ+VVFLGH+VS+ 
Sbjct: 724 KFVIVFIDDILVYSRDNDEHAAHLRIVLQTLRERQLYAKFSKCEFWLQEVVFLGHIVSRT 783

Query: 487 GITVDPAKVEAVIGWVRPTTITEVRSFLGLAGYYRCFIKDFARIAAPLTQLTRKGKKFDW 546
           GI VDP KVEA++ W +P T+TE+RSFLGLAGYYR F++ F+ +AAPLT+LTRKG KF W
Sbjct: 784 GIYVDPKKVEAILQWEQPKTVTEIRSFLGLAGYYRRFVQGFSLVAAPLTRLTRKGVKFVW 843

Query: 547 SRACESSFQELKERLASAPVLIVPDGTGNLVIYSDASKHGLGCVLMQNGRVIAYASRQLK 606
              CE+ FQELK RL SAPVL +P      ++YSDASK GLGCVLMQ+ +V+AYASRQLK
Sbjct: 844 DDVCENRFQELKNRLTSAPVLTLPVNGKGFIVYSDASKLGLGCVLMQDEKVVAYASRQLK 903

Query: 607 DYERNYPTHDLELAAVVFALKIWRHYLYGERIQVYTDHK 646
            +E NYPTHDLELAAVVFALKIWRHYLYGE  +++TDHK
Sbjct: 904 RHEANYPTHDLELAAVVFALKIWRHYLYGEHCRIFTDHK 942

BLAST of CmoCh20G011190 vs. TrEMBL
Match: A0A061FHZ8_THECC (DNA/RNA polymerases superfamily protein OS=Theobroma cacao GN=TCM_033345 PE=4 SV=1)

HSP 1 Score: 629.0 bits (1621), Expect = 6.3e-177
Identity = 292/399 (73.18%), Positives = 349/399 (87.47%), Query Frame = 1

Query: 247  VVNEFLDVFPDKLPGLPPEREVNFGIELEPRTTPISKASYRMAPAELKELKLQLQELLNQ 306
            VV EF+DVFP++LPGLPPERE+ F I+L P T PIS   YRMAPAELKELK QL++LL++
Sbjct: 605  VVKEFVDVFPEELPGLPPEREIEFCIDLIPDTRPISIPPYRMAPAELKELKDQLEDLLDK 664

Query: 307  GFIRPSVSPWGAPVLFVKKKDGTLRLCIDYRELNKVTIKNKYPLPRIDDLFDQLQGAAVF 366
            GFIRPSVSPWGAPVLFVKKKDG+LRLCIDYR+LNKVT+KNKYPLPRIDDLFDQLQGA  F
Sbjct: 665  GFIRPSVSPWGAPVLFVKKKDGSLRLCIDYRQLNKVTVKNKYPLPRIDDLFDQLQGAQCF 724

Query: 367  SKIDLRSGYHQIRVKEDDVPKTAFRTRYGHYEFVVMSFGLTNAPAVFMELMNRVFQDFLD 426
            SKIDLRSGYHQ+R++ +D+PK AF+TRYGHYEF+VMSFGLTNAPA FM+LMNRVF+ +LD
Sbjct: 725  SKIDLRSGYHQLRIRNEDIPKIAFQTRYGHYEFLVMSFGLTNAPAAFMDLMNRVFKPYLD 784

Query: 427  SFVIVFIDDILVYSKTNDEHAEHLRKVLWVLRKQRLYAKFSKCEFWLQKVVFLGHVVSKD 486
             FV+VFIDDIL+YSK+ +EH +HL+ VL +LR+ RLYAKFSKCEFWL+ V FLGHVVSK+
Sbjct: 785  KFVVVFIDDILIYSKSREEHEQHLKIVLQILREHRLYAKFSKCEFWLESVAFLGHVVSKE 844

Query: 487  GITVDPAKVEAVIGWVRPTTITEVRSFLGLAGYYRCFIKDFARIAAPLTQLTRKGKKFDW 546
            GI VD  K+EAV  W RPT++TE+RSF+GLAGYYR F+KDF++I APLT+LTRK  KF+W
Sbjct: 845  GIQVDTKKIEAVEKWPRPTSVTEIRSFVGLAGYYRRFVKDFSKIVAPLTKLTRKDTKFEW 904

Query: 547  SRACESSFQELKERLASAPVLIVPDGTGNLVIYSDASKHGLGCVLMQNGRVIAYASRQLK 606
            S ACE+SF++LK  L +APVL +P GTG  +++ DAS  GLGCVLMQ+G+VIAYASRQLK
Sbjct: 905  SDACENSFEKLKACLTTAPVLSLPQGTGGYMVFCDASGVGLGCVLMQHGKVIAYASRQLK 964

Query: 607  DYERNYPTHDLELAAVVFALKIWRHYLYGERIQVYTDHK 646
             +E NYP HDLE+AA+VFALKIWRHYLYGE  ++YTDHK
Sbjct: 965  RHEHNYPIHDLEMAAIVFALKIWRHYLYGETCEIYTDHK 1003

BLAST of CmoCh20G011190 vs. TrEMBL
Match: A0A061EEG7_THECC (DNA/RNA polymerases superfamily protein OS=Theobroma cacao GN=TCM_018243 PE=4 SV=1)

HSP 1 Score: 628.6 bits (1620), Expect = 8.2e-177
Identity = 293/399 (73.43%), Positives = 349/399 (87.47%), Query Frame = 1

Query: 247 VVNEFLDVFPDKLPGLPPEREVNFGIELEPRTTPISKASYRMAPAELKELKLQLQELLNQ 306
           VV EF+DVFP++LP LPPEREV F I+L P T PIS   YRMAPAELKELK QL++LL++
Sbjct: 513 VVKEFVDVFPEELPSLPPEREVEFCIDLIPDTRPISIPPYRMAPAELKELKDQLEDLLDK 572

Query: 307 GFIRPSVSPWGAPVLFVKKKDGTLRLCIDYRELNKVTIKNKYPLPRIDDLFDQLQGAAVF 366
           GFIRPSVSPWGAPVLFVKKKDG+LRLCIDYR+LNKVT+KNKYPLPRIDDLFDQLQGA  F
Sbjct: 573 GFIRPSVSPWGAPVLFVKKKDGSLRLCIDYRQLNKVTVKNKYPLPRIDDLFDQLQGAQCF 632

Query: 367 SKIDLRSGYHQIRVKEDDVPKTAFRTRYGHYEFVVMSFGLTNAPAVFMELMNRVFQDFLD 426
           SKIDLRSGYHQ+R++ +D+PKTAFRTRYGHYEF+VMSFGLTNAPA FM+LMNRVF+ +LD
Sbjct: 633 SKIDLRSGYHQLRIRNEDIPKTAFRTRYGHYEFLVMSFGLTNAPAAFMDLMNRVFKPYLD 692

Query: 427 SFVIVFIDDILVYSKTNDEHAEHLRKVLWVLRKQRLYAKFSKCEFWLQKVVFLGHVVSKD 486
            FV+VFIDDIL+YSK+ +EH +HL+ VL +LR+ RLYAKFSKCEFWL+ V FLGHVVSK+
Sbjct: 693 KFVVVFIDDILIYSKSREEHEQHLKIVLQILREHRLYAKFSKCEFWLESVAFLGHVVSKE 752

Query: 487 GITVDPAKVEAVIGWVRPTTITEVRSFLGLAGYYRCFIKDFARIAAPLTQLTRKGKKFDW 546
           GI VD  K+EAV  W RPT+++E+RSF+GLAGYYR F+KDF++I APLT+LTRK  KF+W
Sbjct: 753 GIRVDTKKIEAVEKWPRPTSVSEIRSFVGLAGYYRRFVKDFSKIVAPLTKLTRKDTKFEW 812

Query: 547 SRACESSFQELKERLASAPVLIVPDGTGNLVIYSDASKHGLGCVLMQNGRVIAYASRQLK 606
           S ACE+SF++LK  L +APVL +P GTG   ++ DAS  GLGCVLMQ+G+VIAYASRQLK
Sbjct: 813 SDACENSFEKLKACLTTAPVLSLPQGTGGYTMFCDASGVGLGCVLMQHGKVIAYASRQLK 872

Query: 607 DYERNYPTHDLELAAVVFALKIWRHYLYGERIQVYTDHK 646
            +E+NYP HDLE+AA+VFALKIWRHYLYGE  ++YTDHK
Sbjct: 873 RHEQNYPIHDLEMAAIVFALKIWRHYLYGETCEIYTDHK 911

BLAST of CmoCh20G011190 vs. TrEMBL
Match: A0A061DW51_THECC (DNA/RNA polymerases superfamily protein OS=Theobroma cacao GN=TCM_003698 PE=4 SV=1)

HSP 1 Score: 625.9 bits (1613), Expect = 5.3e-176
Identity = 297/400 (74.25%), Positives = 344/400 (86.00%), Query Frame = 1

Query: 246 PVVNEFLDVFPDKLPGLPPEREVNFGIELEPRTTPISKASYRMAPAELKELKLQLQELLN 305
           P+V+EF DVFPD LPGLPP+RE+ F I+L P T PIS   YRMAPAELKELK+QLQEL++
Sbjct: 506 PIVSEFPDVFPDDLPGLPPDRELEFPIDLLPGTAPISIPPYRMAPAELKELKVQLQELVD 565

Query: 306 QGFIRPSVSPWGAPVLFVKKKDGTLRLCIDYRELNKVTIKNKYPLPRIDDLFDQLQGAAV 365
           +GFIRPS+SPWGAP+LFVKKKDGTLRLCIDYR+LN++TIKNKYPLPRIDD+FDQLQGA V
Sbjct: 566 KGFIRPSISPWGAPILFVKKKDGTLRLCIDYRQLNRMTIKNKYPLPRIDDIFDQLQGATV 625

Query: 366 FSKIDLRSGYHQIRVKEDDVPKTAFRTRYGHYEFVVMSFGLTNAPAVFMELMNRVFQDFL 425
           FSK++LRSGYHQ+R+KE DV KT FRTRYGHYEF+VM FGLTNAPA FM+LM+RVF  +L
Sbjct: 626 FSKVNLRSGYHQLRIKEQDVLKTEFRTRYGHYEFLVMPFGLTNAPATFMDLMSRVFHPYL 685

Query: 426 DSFVIVFIDDILVYSKTNDEHAEHLRKVLWVLRKQRLYAKFSKCEFWLQKVVFLGHVVSK 485
           D FVIVFIDDILVY + NDEHA HLR VL  LR+++LYAKFSKCEFWLQ+VVFLGHVVS+
Sbjct: 686 DKFVIVFIDDILVYLRDNDEHAAHLRIVLQTLRERQLYAKFSKCEFWLQEVVFLGHVVSR 745

Query: 486 DGITVDPAKVEAVIGWVRPTTITEVRSFLGLAGYYRCFIKDFARIAAPLTQLTRKGKKFD 545
            GI VDP KVEA++ W +P T+TE+RSFLGLAGYYR F++ F+ IAAPLT+LTRKG KF 
Sbjct: 746 TGIYVDPKKVEAILQWEQPKTVTEIRSFLGLAGYYRRFVQGFSLIAAPLTRLTRKGVKFV 805

Query: 546 WSRACESSFQELKERLASAPVLIVPDGTGNLVIYSDASKHGLGCVLMQNGRVIAYASRQL 605
           W   CE+ FQELK RL  APVL +P      V+YSDASK GLGCVLMQ+ +V+AYASRQL
Sbjct: 806 WDDVCENRFQELKNRLTFAPVLTLPVNGKGFVVYSDASKLGLGCVLMQDEKVVAYASRQL 865

Query: 606 KDYERNYPTHDLELAAVVFALKIWRHYLYGERIQVYTDHK 646
           K +E NYPTHDLELAAVVFALKIWRHYLYGE  Q++TDHK
Sbjct: 866 KRHEANYPTHDLELAAVVFALKIWRHYLYGEHCQIFTDHK 905

BLAST of CmoCh20G011190 vs. TAIR10
Match: ATMG00860.1 (ATMG00860.1 DNA/RNA polymerases superfamily protein)

HSP 1 Score: 110.9 bits (276), Expect = 2.9e-24
Identity = 51/125 (40.80%), Positives = 78/125 (62.40%), Query Frame = 1

Query: 449 HLRKVLWVLRKQRLYAKFSKCEFWLQKVVFLGH--VVSKDGITVDPAKVEAVIGWVRPTT 508
           HL  VL +  + + YA   KC F   ++ +LGH  ++S +G++ DPAK+EA++GW  P  
Sbjct: 3   HLGMVLQIWEQHQFYANRKKCAFGQPQIAYLGHRHIISGEGVSADPAKLEAMVGWPEPKN 62

Query: 509 ITEVRSFLGLAGYYRCFIKDFARIAAPLTQLTRKGKKFDWSRACESSFQELKERLASAPV 568
            TE+R FLGL GYYR F+K++ +I  PLT+L +K     W+     +F+ LK  + + PV
Sbjct: 63  TTELRGFLGLTGYYRRFVKNYGKIVRPLTELLKKNS-LKWTEMAALAFKALKGAVTTLPV 122

Query: 569 LIVPD 572
           L +PD
Sbjct: 123 LALPD 126

BLAST of CmoCh20G011190 vs. NCBI nr
Match: gi|595885005|ref|XP_007213082.1| (hypothetical protein PRUPE_ppa021229mg [Prunus persica])

HSP 1 Score: 635.6 bits (1638), Expect = 9.6e-179
Identity = 297/400 (74.25%), Positives = 348/400 (87.00%), Query Frame = 1

Query: 246 PVVNEFLDVFPDKLPGLPPEREVNFGIELEPRTTPISKASYRMAPAELKELKLQLQELLN 305
           PV+ +F DVFP+ LPGLPP RE+ F IEL P T PIS+A YRMAPAEL+ELK QLQEL++
Sbjct: 231 PVIQDFPDVFPEDLPGLPPHREIEFVIELAPGTNPISQAPYRMAPAELRELKTQLQELVD 290

Query: 306 QGFIRPSVSPWGAPVLFVKKKDGTLRLCIDYRELNKVTIKNKYPLPRIDDLFDQLQGAAV 365
           +GFIRPS SPWGAPVLFVKKKDGT+RLC+DYR+LNK+T++N+YPLPRIDDLFDQL+GA V
Sbjct: 291 KGFIRPSFSPWGAPVLFVKKKDGTMRLCVDYRQLNKITVRNRYPLPRIDDLFDQLKGAKV 350

Query: 366 FSKIDLRSGYHQIRVKEDDVPKTAFRTRYGHYEFVVMSFGLTNAPAVFMELMNRVFQDFL 425
           FSKIDLRSGYHQ+RV+E+D+PKTAFRTRYGHYEF+VM FGLTNAPA FM+LMNRVF+ +L
Sbjct: 351 FSKIDLRSGYHQLRVREEDMPKTAFRTRYGHYEFLVMPFGLTNAPAAFMDLMNRVFRRYL 410

Query: 426 DSFVIVFIDDILVYSKTNDEHAEHLRKVLWVLRKQRLYAKFSKCEFWLQKVVFLGHVVSK 485
           D FVIVFIDDILVYSK+   H +HL  VL  LR+++LYAKFSKC+FWL +V FLGHV+S 
Sbjct: 411 DRFVIVFIDDILVYSKSQKAHMKHLNLVLRTLRRRQLYAKFSKCQFWLDRVSFLGHVISA 470

Query: 486 DGITVDPAKVEAVIGWVRPTTITEVRSFLGLAGYYRCFIKDFARIAAPLTQLTRKGKKFD 545
           +GI VDP K+EAV+ W+RPT++TE+RSFLGLAGYYR F++ F+ IAAPLT LTRKG KF 
Sbjct: 471 EGIYVDPQKIEAVVNWLRPTSVTEIRSFLGLAGYYRRFVEGFSTIAAPLTYLTRKGVKFV 530

Query: 546 WSRACESSFQELKERLASAPVLIVPDGTGNLVIYSDASKHGLGCVLMQNGRVIAYASRQL 605
           WS  CE SF ELK RL +APVL +PD +GN VIYSDAS+ GLGCVLMQ+GRVIAYASRQL
Sbjct: 531 WSDKCEESFIELKTRLTTAPVLALPDDSGNFVIYSDASQQGLGCVLMQHGRVIAYASRQL 590

Query: 606 KDYERNYPTHDLELAAVVFALKIWRHYLYGERIQVYTDHK 646
           K +E NYP HDLELAAVVFALKIWRHYLYGE  Q++TDHK
Sbjct: 591 KKHELNYPVHDLELAAVVFALKIWRHYLYGETCQIFTDHK 630

BLAST of CmoCh20G011190 vs. NCBI nr
Match: gi|1021486231|ref|XP_016186119.1| (PREDICTED: LOW QUALITY PROTEIN: uncharacterized protein LOC107627811 [Arachis ipaensis])

HSP 1 Score: 632.1 bits (1629), Expect = 1.1e-177
Identity = 296/400 (74.00%), Positives = 343/400 (85.75%), Query Frame = 1

Query: 246  PVVNEFLDVFPDKLPGLPPEREVNFGIELEPRTTPISKASYRMAPAELKELKLQLQELLN 305
            P+V EF DVFPD+L G+PP+REV F IEL P   P+S   YRMAP EL+ELK+QL+++L 
Sbjct: 662  PIVREFPDVFPDELLGMPPDREVEFSIELAPGVQPVSIPPYRMAPTELRELKVQLEDMLE 721

Query: 306  QGFIRPSVSPWGAPVLFVKKKDGTLRLCIDYRELNKVTIKNKYPLPRIDDLFDQLQGAAV 365
            +GFIRPS SPWGAPVLFVKKKDGT+RLC+DYR+LNK+T++NKYPLPRIDDLFDQLQGA  
Sbjct: 722  KGFIRPSTSPWGAPVLFVKKKDGTMRLCVDYRQLNKITVRNKYPLPRIDDLFDQLQGATC 781

Query: 366  FSKIDLRSGYHQIRVKEDDVPKTAFRTRYGHYEFVVMSFGLTNAPAVFMELMNRVFQDFL 425
            FSKIDLRSGYHQ+++KE+D+PKTAFRTRYGHYEF+VMSFGLTNAPA FM+LMNRVF+ FL
Sbjct: 782  FSKIDLRSGYHQLKIKEEDIPKTAFRTRYGHYEFLVMSFGLTNAPAAFMDLMNRVFKPFL 841

Query: 426  DSFVIVFIDDILVYSKTNDEHAEHLRKVLWVLRKQRLYAKFSKCEFWLQKVVFLGHVVSK 485
            D FVIVFIDDILVYSK+  EH  HLR VL  L+  +LYAKFSKCEFWL +V FLGHVVSK
Sbjct: 842  DRFVIVFIDDILVYSKSAAEHEYHLRIVLQTLKDHKLYAKFSKCEFWLDQVTFLGHVVSK 901

Query: 486  DGITVDPAKVEAVIGWVRPTTITEVRSFLGLAGYYRCFIKDFARIAAPLTQLTRKGKKFD 545
            DGI VDP KVEAV  W RPTT+TE+RSFLGLAGYYR FIKDF+RI+APLT+LT+K  KF 
Sbjct: 902  DGIMVDPKKVEAVQKWPRPTTVTEIRSFLGLAGYYRRFIKDFSRISAPLTKLTQKNVKFQ 961

Query: 546  WSRACESSFQELKERLASAPVLIVPDGTGNLVIYSDASKHGLGCVLMQNGRVIAYASRQL 605
            WS ACE SFQ LK  L SAPVL++P G+G   ++ DAS+ GLGCVLM +GRVIAYASRQ 
Sbjct: 962  WSEACEESFQTLKACLTSAPVLVLPSGSGGFSVFCDASRIGLGCVLMXHGRVIAYASRQP 1021

Query: 606  KDYERNYPTHDLELAAVVFALKIWRHYLYGERIQVYTDHK 646
            K +E+NYPTHDLE+AAVVFALKIWRHYLYGE  ++YTDHK
Sbjct: 1022 KKHEQNYPTHDLEMAAVVFALKIWRHYLYGETCEIYTDHK 1061

BLAST of CmoCh20G011190 vs. NCBI nr
Match: gi|590693137|ref|XP_007044250.1| (DNA/RNA polymerases superfamily protein [Theobroma cacao])

HSP 1 Score: 630.9 bits (1626), Expect = 2.4e-177
Identity = 297/399 (74.44%), Positives = 345/399 (86.47%), Query Frame = 1

Query: 247 VVNEFLDVFPDKLPGLPPEREVNFGIELEPRTTPISKASYRMAPAELKELKLQLQELLNQ 306
           +V+EF DVFPD LPGLPP+RE+ F I+L P T PIS   YRMAP ELKELK+QLQEL+++
Sbjct: 544 IVSEFPDVFPDDLPGLPPDRELEFPIDLLPGTAPISIPPYRMAPTELKELKVQLQELVDK 603

Query: 307 GFIRPSVSPWGAPVLFVKKKDGTLRLCIDYRELNKVTIKNKYPLPRIDDLFDQLQGAAVF 366
           GFIRPS+SPWGAP+LFVKKKDGTLRLCID R+LN++TIKNKYPLPRIDDLFDQLQGA VF
Sbjct: 604 GFIRPSISPWGAPILFVKKKDGTLRLCIDCRQLNRMTIKNKYPLPRIDDLFDQLQGATVF 663

Query: 367 SKIDLRSGYHQIRVKEDDVPKTAFRTRYGHYEFVVMSFGLTNAPAVFMELMNRVFQDFLD 426
           SK+DLRSGYHQ+R+KE DVPKTAFRTRYGHYEF+VM FGLTNAPA FM+LMNRVF  +LD
Sbjct: 664 SKVDLRSGYHQLRIKEQDVPKTAFRTRYGHYEFLVMPFGLTNAPAAFMDLMNRVFHPYLD 723

Query: 427 SFVIVFIDDILVYSKTNDEHAEHLRKVLWVLRKQRLYAKFSKCEFWLQKVVFLGHVVSKD 486
            FVIVFIDDILVYS+ NDEHA HLR VL  LR+++LYAKFSKCEFWLQ+VVFLGH+VS+ 
Sbjct: 724 KFVIVFIDDILVYSRDNDEHAAHLRIVLQTLRERQLYAKFSKCEFWLQEVVFLGHIVSRT 783

Query: 487 GITVDPAKVEAVIGWVRPTTITEVRSFLGLAGYYRCFIKDFARIAAPLTQLTRKGKKFDW 546
           GI VDP KVEA++ W +P T+TE+RSFLGLAGYYR F++ F+ +AAPLT+LTRKG KF W
Sbjct: 784 GIYVDPKKVEAILQWEQPKTVTEIRSFLGLAGYYRRFVQGFSLVAAPLTRLTRKGVKFVW 843

Query: 547 SRACESSFQELKERLASAPVLIVPDGTGNLVIYSDASKHGLGCVLMQNGRVIAYASRQLK 606
              CE+ FQELK RL SAPVL +P      ++YSDASK GLGCVLMQ+ +V+AYASRQLK
Sbjct: 844 DDVCENRFQELKNRLTSAPVLTLPVNGKGFIVYSDASKLGLGCVLMQDEKVVAYASRQLK 903

Query: 607 DYERNYPTHDLELAAVVFALKIWRHYLYGERIQVYTDHK 646
            +E NYPTHDLELAAVVFALKIWRHYLYGE  +++TDHK
Sbjct: 904 RHEANYPTHDLELAAVVFALKIWRHYLYGEHCRIFTDHK 942

BLAST of CmoCh20G011190 vs. NCBI nr
Match: gi|702455653|ref|XP_010026793.1| (PREDICTED: uncharacterized protein LOC104417177 [Eucalyptus grandis])

HSP 1 Score: 630.2 bits (1624), Expect = 4.1e-177
Identity = 292/399 (73.18%), Positives = 351/399 (87.97%), Query Frame = 1

Query: 247 VVNEFLDVFPDKLPGLPPEREVNFGIELEPRTTPISKASYRMAPAELKELKLQLQELLNQ 306
           VV EF DVFP +LPGLPPERE+ F IEL P T PISKA YRMA +ELKELK+Q+QELL++
Sbjct: 534 VVREFPDVFPKELPGLPPEREIEFVIELAPGTEPISKAPYRMALSELKELKVQMQELLDK 593

Query: 307 GFIRPSVSPWGAPVLFVKKKDGTLRLCIDYRELNKVTIKNKYPLPRIDDLFDQLQGAAVF 366
           GFIRPS SPWGAPVLFVKKKDG+LRLCIDYR+LN+VTIKNKYPLPRIDDLFDQLQGA++F
Sbjct: 594 GFIRPSASPWGAPVLFVKKKDGSLRLCIDYRQLNQVTIKNKYPLPRIDDLFDQLQGASIF 653

Query: 367 SKIDLRSGYHQIRVKEDDVPKTAFRTRYGHYEFVVMSFGLTNAPAVFMELMNRVFQDFLD 426
           SKIDLR+GYHQ+R+K++D+PK+AFRTRYGHYEF VM FGLTNAPA FM+LMNRVF+++LD
Sbjct: 654 SKIDLRTGYHQLRIKKEDIPKSAFRTRYGHYEFTVMPFGLTNAPAAFMDLMNRVFKEYLD 713

Query: 427 SFVIVFIDDILVYSKTNDEHAEHLRKVLWVLRKQRLYAKFSKCEFWLQKVVFLGHVVSKD 486
            FVIVFIDDILVYS+++++H +HLR VL  LR   LYAKFSKCEFWL +V FLGHV+S +
Sbjct: 714 QFVIVFIDDILVYSRSSEDHEKHLRIVLQTLRDHELYAKFSKCEFWLTRVAFLGHVISGE 773

Query: 487 GITVDPAKVEAVIGWVRPTTITEVRSFLGLAGYYRCFIKDFARIAAPLTQLTRKGKKFDW 546
           GI+VDPAK+EAVI W RPTT+TE+RSFLGLAGYYR F++ F+R+A+P+T+L +K +KF W
Sbjct: 774 GISVDPAKIEAVINWPRPTTVTEIRSFLGLAGYYRRFVEGFSRLASPMTRLLKKEEKFVW 833

Query: 547 SRACESSFQELKERLASAPVLIVPDGTGNLVIYSDASKHGLGCVLMQNGRVIAYASRQLK 606
           +  CE+SFQELK +L +APVL +P G G   IYSDAS  GLGCVLMQ+GRV+AYASRQL+
Sbjct: 834 TDKCENSFQELKHKLTTAPVLTIPSGPGGFEIYSDASFKGLGCVLMQHGRVVAYASRQLR 893

Query: 607 DYERNYPTHDLELAAVVFALKIWRHYLYGERIQVYTDHK 646
            +E NYPTHDLELAA++FALKIWRHYL GER Q++TDH+
Sbjct: 894 LHELNYPTHDLELAAIIFALKIWRHYLCGERFQIFTDHQ 932

BLAST of CmoCh20G011190 vs. NCBI nr
Match: gi|1012113262|ref|XP_015960510.1| (PREDICTED: LOW QUALITY PROTEIN: uncharacterized protein LOC107484454 [Arachis duranensis])

HSP 1 Score: 629.0 bits (1621), Expect = 9.0e-177
Identity = 292/400 (73.00%), Positives = 340/400 (85.00%), Query Frame = 1

Query: 246 PVVNEFLDVFPDKLPGLPPEREVNFGIELEPRTTPISKASYRMAPAELKELKLQLQELLN 305
           P+V EF DVFPD+LPG+PP REV F IEL P   P+S   YRMAP EL+ELK+QL+++L 
Sbjct: 427 PIVREFPDVFPDELPGMPPNREVEFSIELAPGVQPVSIPPYRMAPTELRELKVQLEDMLE 486

Query: 306 QGFIRPSVSPWGAPVLFVKKKDGTLRLCIDYRELNKVTIKNKYPLPRIDDLFDQLQGAAV 365
           +GFIRPS SPWGAPVLFVKKKDGT+RLC+DYR+LNK+T++NKYPLPRIDDLFDQLQGA  
Sbjct: 487 KGFIRPSTSPWGAPVLFVKKKDGTMRLCVDYRQLNKITVRNKYPLPRIDDLFDQLQGATC 546

Query: 366 FSKIDLRSGYHQIRVKEDDVPKTAFRTRYGHYEFVVMSFGLTNAPAVFMELMNRVFQDFL 425
           FSKIDLRSGYHQ+++KE+D+PKT FRTRY HYEF+VMSF LTNAPA FM+LMNRVF+ FL
Sbjct: 547 FSKIDLRSGYHQLKIKEEDIPKTTFRTRYRHYEFLVMSFCLTNAPAAFMDLMNRVFKPFL 606

Query: 426 DSFVIVFIDDILVYSKTNDEHAEHLRKVLWVLRKQRLYAKFSKCEFWLQKVVFLGHVVSK 485
           D FVI+FIDDILVYSK+  EH  HLR VL  LR  +LYAKFSKCEFWL +V FLGHV+SK
Sbjct: 607 DRFVIIFIDDILVYSKSATEHEYHLRIVLQTLRDHKLYAKFSKCEFWLDQVTFLGHVISK 666

Query: 486 DGITVDPAKVEAVIGWVRPTTITEVRSFLGLAGYYRCFIKDFARIAAPLTQLTRKGKKFD 545
           DGI VDP KVEAV  W RPTT+TE+RSFLGLAGYYR FIKDF+RI+ PLT+LT+K  KF 
Sbjct: 667 DGIMVDPKKVEAVQKWPRPTTVTEIRSFLGLAGYYRRFIKDFSRISTPLTKLTQKNVKFQ 726

Query: 546 WSRACESSFQELKERLASAPVLIVPDGTGNLVIYSDASKHGLGCVLMQNGRVIAYASRQL 605
           WS ACE  FQ LK  L SAPVL++P G+G   ++ DAS+ GLGCVLMQ+GRVIAYASRQL
Sbjct: 727 WSEACEEGFQTLKACLTSAPVLVLPSGSGGFSVFCDASRIGLGCVLMQHGRVIAYASRQL 786

Query: 606 KDYERNYPTHDLELAAVVFALKIWRHYLYGERIQVYTDHK 646
           K +E+NYPTHD+E+AAVVFALKIWRHYLYGE  ++YTDHK
Sbjct: 787 KKHEQNYPTHDMEMAAVVFALKIWRHYLYGETCEIYTDHK 826
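
The percentages in each HSP summary line are the identical and positive-scoring column counts divided by the alignment length. A minimal sketch for re-deriving them from a summary line is given here; the helper name hsp_percentages and the regular expression are illustrative assumptions, not part of the database output.

```python
import re

# Hypothetical helper (an assumption for illustration): parse one HSP
# summary line of the form shown in the alignments above.
# "Pos\w+" tolerates spelling variants of the "Positives" label.
HSP_RE = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), Pos\w+ = (\d+)/(\d+) \(([\d.]+)%\)"
)

def hsp_percentages(line):
    """Return (identity %, positives %) recomputed from the raw counts."""
    m = HSP_RE.search(line)
    if m is None:
        raise ValueError("not an HSP summary line")
    ident, ident_len, _, pos, pos_len, _ = m.groups()
    return (
        round(100 * int(ident) / int(ident_len), 2),
        round(100 * int(pos) / int(pos_len), 2),
    )

line = "Identity = 292/400 (73.00%), Positives = 340/400 (85.00%), Query Frame = 1"
print(hsp_percentages(line))
```

Recomputing from the counts is a quick consistency check against the percentages printed in the report.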

The following BLAST results are available for this feature:
Match NameE-valueIdentityDescription
POL3_DROME    1.7e-82    42.16     Retrovirus-related Pol polyprotein from transposon 17.6 OS=Drosophila melanogast...
POL2_DROME    1.6e-80    40.75     Retrovirus-related Pol polyprotein from transposon 297 OS=Drosophila melanogaste...
POL5_DROME    1.4e-73    38.57     Retrovirus-related Pol polyprotein from transposon opus OS=Drosophila melanogast...
YG31B_YEAST   2.3e-71    36.88     Transposon Ty3-G Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 20...
YI31B_YEAST   3.0e-71    36.88     Transposon Ty3-I Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 20...

Match Name          E-value    Identity  Description
M5WLY8_PRUPE        6.7e-179   74.25     Uncharacterized protein OS=Prunus persica GN=PRUPE_ppa021229mg PE=4 SV=1
A0A061E6T4_THECC    1.7e-177   74.44     DNA/RNA polymerases superfamily protein OS=Theobroma cacao GN=TCM_009549 PE=4 SV...
A0A061FHZ8_THECC    6.3e-177   73.18     DNA/RNA polymerases superfamily protein OS=Theobroma cacao GN=TCM_033345 PE=4 SV...
A0A061EEG7_THECC    8.2e-177   73.43     DNA/RNA polymerases superfamily protein OS=Theobroma cacao GN=TCM_018243 PE=4 SV...
A0A061DW51_THECC    5.3e-176   74.25     DNA/RNA polymerases superfamily protein OS=Theobroma cacao GN=TCM_003698 PE=4 SV...

Match Name    E-value    Identity  Description
ATMG00860.1   2.9e-24    40.80     ATMG00860.1 DNA/RNA polymerases superfamily protein

Match Name                           E-value    Identity  Description
gi|595885005|ref|XP_007213082.1|     9.6e-179   74.25     hypothetical protein PRUPE_ppa021229mg [Prunus persica]
gi|1021486231|ref|XP_016186119.1|    1.1e-177   74.00     PREDICTED: LOW QUALITY PROTEIN: uncharacterized protein LOC107627811 [Arachis ip...
gi|590693137|ref|XP_007044250.1|     2.4e-177   74.44     DNA/RNA polymerases superfamily protein [Theobroma cacao]
gi|702455653|ref|XP_010026793.1|     4.1e-177   73.18     PREDICTED: uncharacterized protein LOC104417177 [Eucalyptus grandis]
gi|1012113262|ref|XP_015960510.1|    9.0e-177   73.00     PREDICTED: LOW QUALITY PROTEIN: uncharacterized protein LOC107484454 [Arachis du...

The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term       Definition
IPR000477  RT_dom
GO Assignments
This gene is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0015074 DNA integration
biological_process GO:0006508 proteolysis
cellular_component GO:0005575 cellular_component
molecular_function GO:0004190 aspartic-type endopeptidase activity
molecular_function GO:0003676 nucleic acid binding
molecular_function GO:0008270 zinc ion binding

The following mRNA feature(s) are a part of this gene:

Feature Name      Unique Name       Type
CmoCh20G011190.1  CmoCh20G011190.1  mRNA


Analysis Name: InterPro Annotations of Cucurbita moschata
Date Performed: 2017-05-19
IPR Term   IPR Description               Source   Source Term        Source Description   Alignment
IPR000477  Reverse transcriptase domain  PFAM     PF00078            RVT_1                coord: 323..482
                                                                                          score: 5.2
IPR000477  Reverse transcriptase domain  PROFILE  PS50878            RT_POL               coord: 304..483
                                                                                          score: 11
None       No IPR available              unknown  Coil               Coil                 coord: 286..306
                                                                                          scor
None       No IPR available              GENE3D   G3DSA:3.10.10.10                        coord: 272..413
                                                                                          score: 5.5
None       No IPR available              GENE3D   G3DSA:3.30.70.270                       coord: 414..492
                                                                                          score: 3.
None       No IPR available              PANTHER  PTHR24559          FAMILY NOT NAMED     coord: 316..645
                                                                                          score: 4.8E
None       No IPR available              PANTHER  PTHR24559:SF207    SUBFAMILY NOT NAMED  coord: 316..645
                                                                                          score: 4.8E
None       No IPR available              unknown  SSF56672           DNA/RNA polymerases  coord: 247..645
                                                                                          score: 9.78E
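
Each Alignment entry gives the span of the match on the protein as "coord: start..end". A minimal sketch for turning such an entry into a start, an end, and a length in residues is given here. It assumes the coordinates are 1-based and inclusive, which the page does not state; the helper name domain_span is hypothetical.

```python
import re

# Matches the "coord: start..end" fragment used in the Alignment column.
COORD_RE = re.compile(r"coord: (\d+)\.\.(\d+)")

def domain_span(text):
    """Return (start, end, length) for a coord entry, or None if absent.

    Assumption: coordinates are 1-based and inclusive, so the length
    is end - start + 1.
    """
    m = COORD_RE.search(text)
    if m is None:
        return None
    start, end = int(m.group(1)), int(m.group(2))
    return start, end, end - start + 1

# The PFAM PF00078 (RVT_1) match listed above.
print(domain_span("coord: 323..482"))
```

Under that assumption the PF00078 match covers 160 residues and the PS50878 match (304..483) covers 180.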

The following gene(s) are orthologous to this gene:

None

The following gene(s) are paralogous to this gene:

None