CmoCh00G000930 (gene) Cucurbita moschata (Rifu)

Name: CmoCh00G000930
Type: gene
Organism: Cucurbita moschata (Rifu)
Description: Retrotransposon protein, putative, Ty3-gypsy subclass
Location: Cmo_Chr00 : 17523483 .. 17527912 (+)
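As a rough illustration of how the location field can be read programmatically, the sketch below parses the coordinate string and derives the span length. It assumes the coordinates are 1-based and inclusive (the usual GFF convention), which this record does not state; the names `Location` and `parse_location` are hypothetical helpers, not part of any database API.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """A parsed feature location (hypothetical helper type)."""
    seqid: str
    start: int
    end: int
    strand: str

    @property
    def length(self) -> int:
        # Assumption: coordinates are 1-based and inclusive.
        return self.end - self.start + 1


# Matches strings such as "Cmo_Chr00 : 17523483 .. 17527912 (+)".
_LOC = re.compile(r"^\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*$")


def parse_location(text: str) -> Location:
    """Split a 'seqid : start .. end (strand)' string into its parts."""
    m = _LOC.match(text)
    if m is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    seqid, start, end, strand = m.groups()
    return Location(seqid, int(start), int(end), strand)


loc = parse_location("Cmo_Chr00 : 17523483 .. 17527912 (+)")
print(loc.seqid, loc.start, loc.end, loc.strand, loc.length)
```

Under the inclusive-coordinate assumption the span works out to 4430 bp on the plus strand.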
The following sequences are available for this feature:

Gene sequence (with intron)

ATGCATGCGTTGTCGGTCGGTACCCCGGCAGGGGTAGACCTAGTTACGAAAGATAGAGTAAGGGACGGACAAGTGGTAATAGCTGGACAAACCATCCACGTAGACTTAAAGGTAGTGGATATGATGGATTTTGACGTCATACTAGGAATGGACTGGTTAGCAGAAAACTTTGCTACCATAGACTGCCATAAGAAGGAGGTGATATTCACACCTCCTAACGGACTTACCTTTAAGTTTAAAGGAACCTCCACAGGTACCACCCCAAAAATAATATCGATGATGAAAGCGAGGCGCCTGATACAACAAGGGGGGTGGGCGTTTCTAGCCTATGCAGTAAATACAAAAGGAAAGGAAAAACCCATAGATACAATACCAGTAGTAAACGAATTTATGGATGTCTTTCGGGAGGACCTCCCGGGAATTCCTCCATCGCGAGAGGTAGACTTTGGAATAGATCTAGAACCAGGAACGGGACCCATCTCTAAGGCCCCGTACCGCATGGCACCAGCAGAACTAAAAGAGCTTAAAACACAACTGCAAGACCTACTAGATAAGGGCTTCATTCGACCTAGCGTGTCCCCTTGGGGTGCGCTAGTGTTGTTTGTTAAAAAGAAAGACGGCTCAATGCGTTTATGCATTGATTATAGGGAATTAAACAAGAGGACGGTAAAGAACAAATACCCATTACCCCGTATTGAGGACCTATTTGATCAACTACGTGGGGCGACAGTGTTTTCTAAAATAGATCTTCGATCAGGATACCATCAAATTAAGATTAAAAATGAAGACATACCGAAAACAGCCTTTCGAACCAGATATGGCCACTACGAGTTTGTGGTGATGTCTTTTGGTCTCACCAATGCCCCAATGGTATTTATGGAACTAATGAACCGAGTATTTAAGGAGTGCCTAGACTTGTTTGTGATCGTGTTCATAGACGATATCCTGATATACTCGAAAACGGACCTAGAACACCAAGAACACCTTCGTAAAGCCTTAACTATCCTAAGAGAGAACAAGCTGTATGCCAATTTCACCAAGTGTGAATTTTGGATATGACAGGTTTCGTTTCTGGGGCACATAGTGTCTAAAGACGGAATCTTCGTAGATCCCAATAAGATAGAGGCGGTCACGAAATGGAAACGCCCAACAACGGTCACCGAGATACGAAGTTTCTTGGGATTGGCGGGTTATTATCGAAGGTTTGTCCAGGACTTCGCTAGAATAGCCACGCCTCTCACCCAATTAACCAAAAAAGGTGTACCTTTTGTTTGGGACGATACTTGTGAGGTCAGCTTTCAAGAACTAAAACAACGACTAGTATCCGCCCCAGTGCTCACTGTTCCAGAAAGTTCTGTGGGATACGCGATTTACAGTGACGCATCCAAAAAGGGATTAGGTTGTGTACTGATGCAACATGGCAAGGTTGTTGCGTATGCGTCACACCAACTAAAAGATTATGAGAAAAACTAGGTTGTTGCGTATGCGTCACACCAACTAAAAGATTATGAGAAAAACTATCCTACACATGATTTAGAGCTGGCAGCAGTAGTGTTCGCGTTAAAAATTTGGCGACACTACCTGTATGGCGAAAAGACCCAGATCTACACGGACCACAAAAGCCTAAAATATTTGTTCACCCAAAAAGAATTGAACATGCGGCAACGTAGATGGTTAGAGCTGGTAAAAGATTACGACATAGACATCTAGTACCACCCTGGAAAAGCAAACGTAGTCGCTGATGCACTGAGTCGAAAAGTAGTACACTCGTCGGCCCTCATCACTAGGGAGCCACGAGTGCGAACTGATTTCGAACAGGCTGATATCGTGGTGGTAACCAAAGAAGTTGCCGCCCAATTAGCCCGACTGACAGTGCGCTCCACGCTCAGACAAAGAATCATAGACTCACAACGTGAGGATCCAAGCCTAAGTAAAATCCTAGACCAATTGGAAGTTGGCCTAGTGGACGGATTCACTAAATCAACAGATGACGGATTATT
ATGCCAAGGGCGTTTATGTGTCCCACCCCTGAGCGGAATAAAGAATCAAATCCTAACGGAGGCACACAACTCAACCTTCTCGATACATCCAGGTGGAACAAAGATGTATCAAGACTTGAAAAAACACTTCTGGTGGCGAAGTATGAAGAAGGATATCGCGGAGTACGTGAGTAAGTGTCTAGTGTGCCAACAAGTAAAAGCCCCTAGGCAAAAAACCGCCGGGCTACTACAACCTCTAAGTATACAAGAATGGAAATGGGAAAACATTGCCATGGATTTCATAGTAGGCTTACCCAAAACATTAAAGGGCTATACGGTAATCTGGGTGGTCGTGGATCGCCTGACAAAGTCGGCTCACTTCCTACTGGGTAAGGCAACATATACAGTAGACAATTGGGCACAACTGTATGTTAAAGAAATTGTGAGGTTGCATGGAGTACCAGTGTCAATTGTGTCGGATCGGGACCCACGTTTTACGTCGGCCTTTTGGCGCGGTCTCTAAAGAGCGATGGGTACCCGCCTCGACTTTAGCACCGCCTTCCACCCACAAACAGATGGCCAAACAGAACGATTAAAACCAAATTCTGGAAGACATGCTTCGGGCATGCGTTATGGATTTCACAGGGAGTTGGGACACCAAACTACACTTAATGGAATTCTCTTACAATAACAGCTTCCAAGCAACCATCGGAATGGCACCCTTTGAGGCGTTGTATGGAAAACGATGTAGATCCCCATTGTGTTGGGACGAAGTAGGAGAACGAGAATTGATAGGACCCGAGCTGGTCCATGTCACCAATGAAGCAATCCAGAAAATCCGAGTAAGAATGCGTATCACGCAGAGTAGGCAAAAGAGCTACGCCGACGTTAGGCGTAGGAATTTAGAGTTTGAAGAGGGGGACCCAGTGTTCCTAAAAGTAGCCCCCATGAAAGGTATTCTAAGATTCGGACGCAAGGGGAAGCTCAGCCCCCGATTCATTGGACCGTTCGAAATTTTAGAAAGAGTAGGTCCGGTAGCTTACAAACTAGCTTTACCACCTTCACTTTCAAGCGTACATGATGTATTTCATGTGTCCATGTTAAGGAAATATATTCCAGACCCGACGCACGTAATAGACTACAAACCTCTTGAAATTGAGGAAAATTTAAGTTACCAGGAGAAATCGATTAAAATTCAAGCCCGAGAGGTGAAAGCCTTACGTAATAGGAGTATAGGTTTCGTTAAGGTATTGTGGCGTAACCACCAAGTCGAGGAAGCTACGTGGGAATGAGAGGAGGAAATAAAGGAAAAATATACCGAGTTGATCCACGAGTTCGAGGCTTTCGAGGACGAAAGTCCCTTTTAGGGATAGGTAATGTAACGGAACTGAAAAAAAAAAATGATGGTCGTCGGCAGAGCCGAAAACTCCGGCGAACGCCGGAGTTTTCGCTCACACACGACCCACAAAGGGCGGCCGAGGGCTGGGATGCCTCAAAATCGCGTCCCCGACCTTCAGCCACTCTGAAAGAACTTGCAAAGAAACGAAAACGGTAAGGAAAAAGGAGAGAGAACAGAAAACCGACGGAACTCCGGCGAAGCCGAATCGTACGAATCGACGTCGAAAACCCCTATTTCACACCACAAAAGTAATCCCAGAACCCTTTACACACATCAGGAGTAGGTTTTACGTGAATCCCGTACGCTGAACACCGACGGCAAGGAAGACGGAACGGCGAACTTCCGAGCTTCTCCAGATCCGGGAGTGCGAAGCCCTTACCAACGAAATCGGTACCAAAAAATGGGGTTAGGGTGAAGGATAAGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAAATGAGAATGAGAATGACATGTAAGCATGAAAACGTGACACGGGGTTATGTTCATTAAATGATGACTGTAGGACTAGTGGGACCTCATGCATATTGTCTGATCATGCATTTGGGGGGATACCCCTATCCCATC
ACGAGGGTACGGACGCGTACATCCAAATGGCAGGACCACGATCGTTATGTCCTGCATAGTGCTGCCTGACGGTTTTAGTTTTAACGGGTCCCGGTGGGCCCCCAGAGCATGCCCATCGGCTAGGGACAGTGGCCCAGCGGTGTGCGAACTAGCTGGTGAGCCCATACTCGCACATATGGGCTACGTGTAGAGTATATGGTACATGTCCACGATTACCTGTTAGGATTACACCGTAGGGAAAACGAAACGAAACGATAGGTCCTATCATTTAATTGCATGTGTTTGTCGCATAACCCCGATAGGGGGTCACTTACTGAGTATTCTAGAAATACTCAAGCCTCGTGCTACTTTACTTTTGCAGATAAGGGCAACGCACCCATGTGACAACGACGACGGCATCGTACCCCCCAGGATCGTGGCACATGCATAG

mRNA sequence

ATGCATGCGTTGTCGGTCGGTACCCCGGCAGGGGTAGACCTAGTTACGAAAGATAGAGTAAGGGACGGACAAGTGGTAATAGCTGGACAAACCATCCACGTAGACTTAAAGGTAGTGGATATGATGGATTTTGACGTCATACTAGGAATGGACTGGTTAGCAGAAAACTTTGCTACCATAGACTGCCATAAGAAGGAGGTGATATTCACACCTCCTAACGGACTTACCTTTAAGTTTAAAGGAACCTCCACAGGTACCACCCCAAAAATAATATCGATGATGAAAGCGAGGCGCCTGATACAACAAGGGGGGTGGGCGTTTCTAGCCTATGCAGTAAATACAAAAGGAAAGGAAAAACCCATAGATACAATACCAGTAGTAAACGAATTTATGGATGTCTTTCGGGAGGACCTCCCGGGAATTCCTCCATCGCGAGAGGTAGACTTTGGAATAGATCTAGAACCAGGAACGGGACCCATCTCTAAGGCCCCGTACCGCATGGCACCAGCAGAACTAAAAGAGCTTAAAACACAACTGCAAGACCTACTAGATAAGGGCTTCATTCGACCTAGCGTGTCCCCTTGGGGTGCGCTAGTGTTGTTTGTTAAAAAGAAAGACGGCTCAATGCGTTTATGCATTGATTATAGGGAATTAAACAAGAGGACGGTAAAGAACAAATACCCATTACCCCGTATTGAGGACCTATTTGATCAACTACGTGGGGCGACAGTGTTTTCTAAAATAGATCTTCGATCAGGATACCATCAAATTAAGATTAAAAATGAAGACATACCGAAAACAGCCTTTCGAACCAGATATGGCCACTACGAGTTTGTGGTGATGTCTTTTGGTCTCACCAATGCCCCAATGGTTTCGTTTCTGGGGCACATAGTGTCTAAAGACGGAATCTTCGTAGATCCCAATAAGATAGAGGCGGTCACGAAATGGAAACGCCCAACAACGGTCACCGAGATACGAAGTTTCTTGGGATTGGCGGGTTATTATCGAAGGTTTGTCCAGGACTTCGCTAGAATAGCCACGCCTCTCACCCAATTAACCAAAAAAGGTGTACCTTTTGTTTGGGACGATACTTGTGAGGTCAGCTTTCAAGAACTAAAACAACGACTAGTATCCGCCCCAGTGCTCACTGTTCCAGAAAGTTCTGTGGGATACGCGATTTACAGTGACGCATCCAAAAAGGGATTAGGTTGTGTACTGATGCAACATGGCAAGGTTGTTGCGTGGAACAAAGATGTATCAAGACTTGAAAAAACACTTCTGGTGGCGAAGTATGAAGAAGGATATCGCGGAGTACGTGAGCTTACCCAAAACATTAAAGGGCTATACGGTAATCTGGGTGGTCGTGGATCGCCTGACAAAGTCGGCTCACTTCCTACTGGCTTCCAAGCAACCATCGGAATGGCACCCTTTGAGGCGTTGTATGGAAAACGATGTAGATCCCCATTGTGTTGGGACGAAGTAGGAGAACGAGAATTGATAGGACCCGAGCTGGTCCATGTCACCAATGAAGCAATCCAGAAAATCCGAGTAAGAATGCGTATCACGCAGAGTAGGCAAAAGAGCTACGCCGACGTTAGGCGTAGGAATTTAGAGTTTGAAGAGGGGGACCCAGTGTTCCTAAAAGTAGCCCCCATGAAAGAGCCGAAAACTCCGGCGAACGCCGGAGTTTTCGCTCACACACGACCCACAAAGGGCGGCCGAGGGCTGGGATGCCTCAAAATCGCGTCCCCGACCTTCAGCCACTCTGAAAGAACTTGCAAAGAAACGAAAACGATAAGGGCAACGCACCCATGTGACAACGACGACGGCATCGTACCCCCCAGGATCGTGGCACATGCATAG

Coding sequence (CDS)

ATGCATGCGTTGTCGGTCGGTACCCCGGCAGGGGTAGACCTAGTTACGAAAGATAGAGTAAGGGACGGACAAGTGGTAATAGCTGGACAAACCATCCACGTAGACTTAAAGGTAGTGGATATGATGGATTTTGACGTCATACTAGGAATGGACTGGTTAGCAGAAAACTTTGCTACCATAGACTGCCATAAGAAGGAGGTGATATTCACACCTCCTAACGGACTTACCTTTAAGTTTAAAGGAACCTCCACAGGTACCACCCCAAAAATAATATCGATGATGAAAGCGAGGCGCCTGATACAACAAGGGGGGTGGGCGTTTCTAGCCTATGCAGTAAATACAAAAGGAAAGGAAAAACCCATAGATACAATACCAGTAGTAAACGAATTTATGGATGTCTTTCGGGAGGACCTCCCGGGAATTCCTCCATCGCGAGAGGTAGACTTTGGAATAGATCTAGAACCAGGAACGGGACCCATCTCTAAGGCCCCGTACCGCATGGCACCAGCAGAACTAAAAGAGCTTAAAACACAACTGCAAGACCTACTAGATAAGGGCTTCATTCGACCTAGCGTGTCCCCTTGGGGTGCGCTAGTGTTGTTTGTTAAAAAGAAAGACGGCTCAATGCGTTTATGCATTGATTATAGGGAATTAAACAAGAGGACGGTAAAGAACAAATACCCATTACCCCGTATTGAGGACCTATTTGATCAACTACGTGGGGCGACAGTGTTTTCTAAAATAGATCTTCGATCAGGATACCATCAAATTAAGATTAAAAATGAAGACATACCGAAAACAGCCTTTCGAACCAGATATGGCCACTACGAGTTTGTGGTGATGTCTTTTGGTCTCACCAATGCCCCAATGGTTTCGTTTCTGGGGCACATAGTGTCTAAAGACGGAATCTTCGTAGATCCCAATAAGATAGAGGCGGTCACGAAATGGAAACGCCCAACAACGGTCACCGAGATACGAAGTTTCTTGGGATTGGCGGGTTATTATCGAAGGTTTGTCCAGGACTTCGCTAGAATAGCCACGCCTCTCACCCAATTAACCAAAAAAGGTGTACCTTTTGTTTGGGACGATACTTGTGAGGTCAGCTTTCAAGAACTAAAACAACGACTAGTATCCGCCCCAGTGCTCACTGTTCCAGAAAGTTCTGTGGGATACGCGATTTACAGTGACGCATCCAAAAAGGGATTAGGTTGTGTACTGATGCAACATGGCAAGGTTGTTGCGTGGAACAAAGATGTATCAAGACTTGAAAAAACACTTCTGGTGGCGAAGTATGAAGAAGGATATCGCGGAGTACGTGAGCTTACCCAAAACATTAAAGGGCTATACGGTAATCTGGGTGGTCGTGGATCGCCTGACAAAGTCGGCTCACTTCCTACTGGCTTCCAAGCAACCATCGGAATGGCACCCTTTGAGGCGTTGTATGGAAAACGATGTAGATCCCCATTGTGTTGGGACGAAGTAGGAGAACGAGAATTGATAGGACCCGAGCTGGTCCATGTCACCAATGAAGCAATCCAGAAAATCCGAGTAAGAATGCGTATCACGCAGAGTAGGCAAAAGAGCTACGCCGACGTTAGGCGTAGGAATTTAGAGTTTGAAGAGGGGGACCCAGTGTTCCTAAAAGTAGCCCCCATGAAAGAGCCGAAAACTCCGGCGAACGCCGGAGTTTTCGCTCACACACGACCCACAAAGGGCGGCCGAGGGCTGGGATGCCTCAAAATCGCGTCCCCGACCTTCAGCCACTCTGAAAGAACTTGCAAAGAAACGAAAACGATAAGGGCAACGCACCCATGTGACAACGACGACGGCATCGTACCCCCCAGGATCGTGGCACATGCATAG
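The BLAST query residues reported on this page can be related back to the coding sequence by conceptual translation. The following is a minimal sketch using the standard genetic code; the function name `translate` and the compact codon-table construction are illustrative choices, and only the first 48 nucleotides of the CDS are used as input.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# for each of the three positions.
_BASES = "TCAG"
_AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence in frame 1; unknown codons become 'X'."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First 48 nt of the CDS shown above.
prefix = "ATGCATGCGTTGTCGGTCGGTACCCCGGCAGGGGTAGACCTAGTTACG"
print(translate(prefix))
```

The translated prefix begins with M followed by HALSVGTPAGVDLVT, which agrees with the query residues starting at position 2 in the alignments on this page.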
BLAST of CmoCh00G000930 vs. Swiss-Prot
Match: YG31B_YEAST (Transposon Ty3-G Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=TY3B-G PE=1 SV=3)

HSP 1 Score: 143.7 bits (361), Expect = 6.9e-33
Identity = 74/172 (43.02%), Positives = 104/172 (60.47%), Query Frame = 1

Query: 123 TIPV--VNEFMDVFREDLPGIPPSRE---VDFGIDLEPGTGPISKAPYRMAPAELKELKT 182
           T+PV    ++ ++ R DLP  P       V   I+++PG       PY +     +E+  
Sbjct: 555 TLPVWLQQKYREIIRNDLPPRPADINNIPVKHDIEIKPGARLPRLQPYHVTEKNEQEINK 614

Query: 183 QLQDLLDKGFIRPSVSPWGALVLFVKKKDGSMRLCIDYRELNKRTVKNKYPLPRIEDLFD 242
            +Q LLD  FI PS SP  + V+ V KKDG+ RLC+DYR LNK T+ + +PLPRI++L  
Sbjct: 615 IVQKLLDNKFIVPSKSPCSSPVVLVPKKDGTFRLCVDYRTLNKATISDPFPLPRIDNLLS 674

Query: 243 QLRGATVFSKIDLRSGYHQIKIKNEDIPKTAFRTRYGHYEFVVMSFGLTNAP 290
           ++  A +F+ +DL SGYHQI ++ +D  KTAF T  G YE+ VM FGL NAP
Sbjct: 675 RIGNAQIFTTLDLHSGYHQIPMEPKDRYKTAFVTPSGKYEYTVMPFGLVNAP 726
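The percentages on each HSP statistics line are simple ratios of matched columns to alignment length. The sketch below, with a hypothetical helper name `hsp_percentages`, parses such a line and recomputes both values; the label of the second field is matched loosely so that minor spelling variants are tolerated.

```python
import re

# Captures "Identity = a/n (p%), <Pos...> = b/n (q%)" from an HSP summary line.
_HSP_STATS = re.compile(
    r"Identity\s*=\s*(\d+)/(\d+)\s*\(([\d.]+)%\),\s*"
    r"Pos\w*\s*=\s*(\d+)/(\d+)\s*\(([\d.]+)%\)"
)


def hsp_percentages(line: str) -> dict:
    """Recompute identity and positive percentages from the raw counts."""
    m = _HSP_STATS.search(line)
    if m is None:
        raise ValueError(f"not an HSP statistics line: {line!r}")
    ident, ident_len, ident_pct, pos, pos_len, pos_pct = m.groups()
    return {
        "identity_pct": round(100 * int(ident) / int(ident_len), 2),
        "positives_pct": round(100 * int(pos) / int(pos_len), 2),
        "reported_identity_pct": float(ident_pct),
        "reported_positives_pct": float(pos_pct),
    }


stats = hsp_percentages(
    "Identity = 74/172 (43.02%), Positives = 104/172 (60.47%), Query Frame = 1"
)
print(stats)
```

For the first Swiss-Prot hit, 74/172 and 104/172 recompute to 43.02% and 60.47%, matching the reported values.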

BLAST of CmoCh00G000930 vs. Swiss-Prot
Match: YI31B_YEAST (Transposon Ty3-I Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=TY3B-I PE=3 SV=2)

HSP 1 Score: 143.7 bits (361), Expect = 6.9e-33
Identity = 74/172 (43.02%), Positives = 104/172 (60.47%), Query Frame = 1

Query: 123 TIPV--VNEFMDVFREDLPGIPPSRE---VDFGIDLEPGTGPISKAPYRMAPAELKELKT 182
           T+PV    ++ ++ R DLP  P       V   I+++PG       PY +     +E+  
Sbjct: 581 TLPVWLQQKYREIIRNDLPPRPADINNIPVKHDIEIKPGARLPRLQPYHVTEKNEQEINK 640

Query: 183 QLQDLLDKGFIRPSVSPWGALVLFVKKKDGSMRLCIDYRELNKRTVKNKYPLPRIEDLFD 242
            +Q LLD  FI PS SP  + V+ V KKDG+ RLC+DYR LNK T+ + +PLPRI++L  
Sbjct: 641 IVQKLLDNKFIVPSKSPCSSPVVLVPKKDGTFRLCVDYRTLNKATISDPFPLPRIDNLLS 700

Query: 243 QLRGATVFSKIDLRSGYHQIKIKNEDIPKTAFRTRYGHYEFVVMSFGLTNAP 290
           ++  A +F+ +DL SGYHQI ++ +D  KTAF T  G YE+ VM FGL NAP
Sbjct: 701 RIGNAQIFTTLDLHSGYHQIPMEPKDRYKTAFVTPSGKYEYTVMPFGLVNAP 752

BLAST of CmoCh00G000930 vs. Swiss-Prot
Match: POL3_DROME (Retrovirus-related Pol polyprotein from transposon 17.6 OS=Drosophila melanogaster GN=pol PE=3 SV=1)

HSP 1 Score: 119.0 bits (297), Expect = 1.8e-25
Identity = 76/256 (29.69%), Positives = 135/256 (52.73%), Query Frame = 1

Query: 43  DFDVILGMDWLAENFATIDCHKKEVIFTPPNGLTFKFKGTSTGTTPKIISMMKARRLIQQ 102
           ++D++LG   LAE  ATI    +EV          +   T   +  + ++M+    L Q 
Sbjct: 89  NYDLLLGRKLLAEAKATISYRDQEVTLYNNKYKLIEGIATHEQSHFQNVNMIPDTMLRQP 148

Query: 103 GGWAFLA----YAVNTKGKEKPIDTIPVVNEFMDVFREDLPGIPPSREVDFGIDLEPGTG 162
              + +     Y +     E+      ++ ++ D+   +   +  + +    I+ +    
Sbjct: 149 NKISPILESDLYRLEHLNNEEKQRLCALLQKYHDIQYHEGDKLTFTNQTKHTINTKHNLP 208

Query: 163 PISKAPYRMAPAELKELKTQLQDLLDKGFIRPSVSPWGALVLFV-KKKDGS----MRLCI 222
             SK  Y  A  +  E+++Q+QD+L++G IR S SP+ + +  V KK+D S     R+ I
Sbjct: 209 LYSKYSYPQAYEQ--EVESQIQDMLNQGIIRTSNSPYNSPIWVVPKKQDASGKQKFRIVI 268

Query: 223 DYRELNKRTVKNKYPLPRIEDLFDQLRGATVFSKIDLRSGYHQIKIKNEDIPKTAFRTRY 282
           DYR+LN+ TV +++P+P ++++  +L     F+ IDL  G+HQI++  E + KTAF T++
Sbjct: 269 DYRKLNEITVGDRHPIPNMDEILGKLGRCNYFTTIDLAKGFHQIEMDPESVSKTAFSTKH 328

Query: 283 GHYEFVVMSFGLTNAP 290
           GHYE++ M FGL NAP
Sbjct: 329 GHYEYLRMPFGLKNAP 342

BLAST of CmoCh00G000930 vs. Swiss-Prot
Match: TF24_SCHPO (Transposon Tf2-4 polyprotein OS=Schizosaccharomyces pombe (strain 972 / ATCC 24843) GN=Tf2-4 PE=3 SV=1)

HSP 1 Score: 116.3 bits (290), Expect = 1.2e-24
Identity = 57/166 (34.34%), Positives = 99/166 (59.64%), Query Frame = 1

Query: 126 VVNEFMDVFRE-DLPGIP-PSREVDFGIDLEPGTGPISKAPYRMAPAELKELKTQLQDLL 185
           +  EF D+  E +   +P P + ++F ++L      +    Y + P +++ +  ++   L
Sbjct: 377 IYKEFKDITAETNTEKLPKPIKGLEFEVELTQENYRLPIRNYPLPPGKMQAMNDEINQGL 436

Query: 186 DKGFIRPSVSPWGALVLFVKKKDGSMRLCIDYRELNKRTVKNKYPLPRIEDLFDQLRGAT 245
             G IR S +     V+FV KK+G++R+ +DY+ LNK    N YPLP IE L  +++G+T
Sbjct: 437 KSGIIRESKAINACPVMFVPKKEGTLRMVVDYKPLNKYVKPNIYPLPLIEQLLAKIQGST 496

Query: 246 VFSKIDLRSGYHQIKIKNEDIPKTAFRTRYGHYEFVVMSFGLTNAP 290
           +F+K+DL+S YH I+++  D  K AFR   G +E++VM +G++ AP
Sbjct: 497 IFTKLDLKSAYHLIRVRKGDEHKLAFRCPRGVFEYLVMPYGISTAP 542

BLAST of CmoCh00G000930 vs. Swiss-Prot
Match: TF26_SCHPO (Transposon Tf2-6 polyprotein OS=Schizosaccharomyces pombe (strain 972 / ATCC 24843) GN=Tf2-6 PE=3 SV=1)

HSP 1 Score: 116.3 bits (290), Expect = 1.2e-24
Identity = 57/166 (34.34%), Positives = 99/166 (59.64%), Query Frame = 1

Query: 126 VVNEFMDVFRE-DLPGIP-PSREVDFGIDLEPGTGPISKAPYRMAPAELKELKTQLQDLL 185
           +  EF D+  E +   +P P + ++F ++L      +    Y + P +++ +  ++   L
Sbjct: 377 IYKEFKDITAETNTEKLPKPIKGLEFEVELTQENYRLPIRNYPLPPGKMQAMNDEINQGL 436

Query: 186 DKGFIRPSVSPWGALVLFVKKKDGSMRLCIDYRELNKRTVKNKYPLPRIEDLFDQLRGAT 245
             G IR S +     V+FV KK+G++R+ +DY+ LNK    N YPLP IE L  +++G+T
Sbjct: 437 KSGIIRESKAINACPVMFVPKKEGTLRMVVDYKPLNKYVKPNIYPLPLIEQLLAKIQGST 496

Query: 246 VFSKIDLRSGYHQIKIKNEDIPKTAFRTRYGHYEFVVMSFGLTNAP 290
           +F+K+DL+S YH I+++  D  K AFR   G +E++VM +G++ AP
Sbjct: 497 IFTKLDLKSAYHLIRVRKGDEHKLAFRCPRGVFEYLVMPYGISTAP 542

BLAST of CmoCh00G000930 vs. TrEMBL
Match: A0A061EVL2_THECC (DNA/RNA polymerases superfamily protein OS=Theobroma cacao GN=TCM_023542 PE=4 SV=1)

HSP 1 Score: 463.8 bits (1192), Expect = 3.4e-127
Identity = 228/387 (58.91%), Positives = 278/387 (71.83%), Query Frame = 1

Query: 53  LAENFATIDCHKKEVIFTPPNGLTFKFKGTSTGTTPKIISMMKARRLIQQGGWAFLAYAV 112
           L  + A +D  +KEV+    +G    F G        +IS +KA +L+Q+G   +LAY +
Sbjct: 336 LTAHRANVDYFRKEVVLRNSDGAEIVFVGEHRVLPSCVISAIKASKLVQKGYPTYLAYVI 395

Query: 113 NTKGKEKPIDTIPVVNEFMDVFREDLPGIPPSREVDFGIDLEPGTGPISKAPYRMAPAEL 172
           +T   E  ++ +P+V+EF+DVF +DLPG+PP RE++F IDL P T PIS  PYRMAPAEL
Sbjct: 396 DTSKGEPKLEDVPIVSEFLDVFPDDLPGLPPDRELEFPIDLLPDTAPISIPPYRMAPAEL 455

Query: 173 KELKTQLQDLLDKGFIRPSVSPWGALVLFVKKKDGSMRLCIDYRELNKRTVKNKYPLPRI 232
           KELK QLQDL+DKGFIR S+SPWGALVLFVKKKDG++RLCIDYR+LN+ T+KNKYPLPRI
Sbjct: 456 KELKVQLQDLVDKGFIRLSISPWGALVLFVKKKDGTLRLCIDYRQLNRVTIKNKYPLPRI 515

Query: 233 EDLFDQLRGATVFSKIDLRSGYHQIKIKNEDIPKTAFRTRYGHYEFVVMSFGLTNAPM-- 292
           +DLFDQLRGA VFSKIDLRSGY+Q++IK +D+PKTAFR RYGHYEF+VM FGLTNA    
Sbjct: 516 DDLFDQLRGAMVFSKIDLRSGYYQLRIKEQDVPKTAFRMRYGHYEFLVMPFGLTNALAVF 575

Query: 293 -------------------------VSFLGHIVSKDGIFVDPNKIEAVTKWKRPTTVTEI 352
                                    V FLGH+VS  GI+VDP KI+A+ +W++P  VTEI
Sbjct: 576 MDLMNRVFHPYLDKFVIVFIDDILEVVFLGHVVSGAGIYVDPKKIQAILQWEQPRMVTEI 635

Query: 353 RSFLGLAGYYRRFVQDFARIATPLTQLTKKGVPFVWDDTCEVSFQELKQRLVSAPVLTVP 412
            SFLGL  YYRRFVQ F+ IA PLT+LT KGV F WDD CE  FQELK RL S PVLT+P
Sbjct: 636 SSFLGLVDYYRRFVQGFSLIAAPLTRLTHKGVKFEWDDVCENRFQELKNRLTSTPVLTLP 695

BLAST of CmoCh00G000930 vs. TrEMBL
Match: A0A061G9E7_THECC (DNA/RNA polymerases superfamily protein OS=Theobroma cacao GN=TCM_028292 PE=4 SV=1)

HSP 1 Score: 459.9 bits (1182), Expect = 4.9e-126
Identity = 223/355 (62.82%), Positives = 269/355 (75.77%), Query Frame = 1

Query: 90  IISMMKARRLIQQGGWAFLAYAVNTKGKEKPIDTIPVVNEFMDVFREDLPGIPPSREVDF 149
           +I  +KA +L+Q+G  A+LAY ++T   E  ++ +P+V+EF DVF +DLP +PP RE++F
Sbjct: 345 LIFAIKASKLVQKGYPAYLAYVIDTSNGEPKLEDVPIVSEFPDVFPDDLPRLPPDRELEF 404

Query: 150 GIDLEPGTGPISKAPYRMAPAELKELKTQLQDLLDKGFIRPSVSPWGALVLFVKKKDGSM 209
            IDL  GT PIS  PYRMAPAELKELK QLQDL+DKGFIRPS+SPWGA VLFVKKKDG++
Sbjct: 405 PIDLLSGTAPISIPPYRMAPAELKELKVQLQDLVDKGFIRPSISPWGAPVLFVKKKDGTL 464

Query: 210 RLCIDYRELNKRTVKNKYPLPRIEDLFDQLRGATVFSKIDLRSGYHQIKIKNEDIPKTAF 269
           RLCIDYR+LN+ T+KNKYPLP I+DLFDQLRGA VFSKIDLRSGY+Q++IK +D+PKTAF
Sbjct: 465 RLCIDYRQLNRVTIKNKYPLPWIDDLFDQLRGAMVFSKIDLRSGYYQLRIKEQDVPKTAF 524

Query: 270 RTRYGHYEFVVMSFGLTNAPM-----------------------------VSFLGHIVSK 329
           RTRYGHYEF+VM FGLTNAP                              V FLGH+VS 
Sbjct: 525 RTRYGHYEFLVMLFGLTNAPAVFMDLMNRVFHPYLDKFVIVFIDDILLKEVVFLGHVVSG 584

Query: 330 DGIFVDPNKIEAVTKWKRPTTVTEIRSFLGLAGYYRRFVQDFARIATPLTQLTKKGVPFV 389
            GI+VDP KIEA+ +W++P TVTEIRSFLGL GYYRRFVQ F+ IA PLT+LT+KGV F 
Sbjct: 585 AGIYVDPKKIEAILQWEQPRTVTEIRSFLGLVGYYRRFVQRFSLIAAPLTRLTRKGVKFE 644

Query: 390 WDDTCEVSFQELKQRLVSAPVLTVPESSVGYAIYSDASKKGLGCVLMQHGKVVAW 416
           WDD CE  FQELK RL SAP+LT+  S   + +YSDA K GLGCVLMQ  KV+A+
Sbjct: 645 WDDVCENRFQELKNRLTSAPILTLSVSEKEFVVYSDAPKLGLGCVLMQDEKVIAY 699

BLAST of CmoCh00G000930 vs. TrEMBL
Match: Q6F2D6_SOLDE (Putative polyprotein, identical OS=Solanum demissum GN=SDM1_49t00014 PE=4 SV=2)

HSP 1 Score: 456.1 bits (1172), Expect = 7.0e-125
Identity = 239/458 (52.18%), Positives = 299/458 (65.28%), Query Frame = 1

Query: 4    LSVGTPAGVDLVTKDRVRDGQVVIAGQTIHVDLKVVDMMDFDVILGMDWLAENFATIDCH 63
            + V TP G  LV    +R   V I G    VDL ++DM+DFDVILGMDWL+   A +DC+
Sbjct: 695  IHVSTPVGESLVVDQILRSCLVTIQGCDTRVDLILLDMVDFDVILGMDWLSPYHAVLDCY 754

Query: 64   KKEVIFTPPNGLTFKFKGTSTGTTPKIISMMKARRLIQQGGWAFLAYAVNTKGKEKPIDT 123
             K V    P      ++G  + T   IIS M+ARRL+  G  A+LAY  +    +  +D+
Sbjct: 755  AKTVTLAMPGISPVLWQGAYSHTPTWIISFMRARRLVASGCLAYLAYVRDVSRDDSSVDS 814

Query: 124  IPVVNEFMDVFREDLPGIPPSREVDFGIDLEPGTGPISKAPYRMAPAELKELKTQLQDLL 183
            +PVV EF DVF  DLPG+PP R++DF IDLEP T PIS  PYRMAPAEL+EL  QL+DLL
Sbjct: 815  VPVVREFADVFPIDLPGLPPDRDIDFAIDLEPDTRPISIPPYRMAPAELRELSAQLEDLL 874

Query: 184  DKGFIRPSVSPWGALVLFVKKKDGSMRLCIDYRELNKRTVKNKYPLPRIEDLFDQLRGAT 243
             KGFIRPSVSPWGA VLFVKKKDG+MR+CIDYR+LNK TVKN+YP+PRI+DLFDQL+GA 
Sbjct: 875  GKGFIRPSVSPWGAPVLFVKKKDGTMRMCIDYRQLNKVTVKNRYPMPRIDDLFDQLQGAA 934

Query: 244  VFSKIDLRSGYHQIKIKNEDIPKTAFRTRYGHYEFVVMSFGLTNAP-------------- 303
            VFSKIDLRSGYHQ++I+  DIPKTAFRTRYGHYEF+VMSFGLTNAP              
Sbjct: 935  VFSKIDLRSGYHQLRIRAADIPKTAFRTRYGHYEFLVMSFGLTNAPAAFMDLMTRVFRPY 994

Query: 304  ----MVSFLGHIVS--KDGIFVDPNKIE-----------AVTK---WKRPTTVTEIRS-- 363
                ++ F+  I++     ++   +K E            V+K      P  +  IR   
Sbjct: 995  LDLFVIVFIDDILTLRDQRLYAKFSKCEFWLESVAFLGHVVSKEGIRVDPAKIEAIRDWV 1054

Query: 364  ----------FLGLAGYYRRFVQDFARIATPLTQLTKKGVPFVWDDTCEVSFQELKQRLV 416
                      F+GLAGYYRRFV+ F+ IA  LT+LT+  VPFVW + CE SF  LK+ L 
Sbjct: 1055 RPTSVTEIRSFVGLAGYYRRFVEGFSTIAALLTRLTRVDVPFVWSEECEASFLRLKELLT 1114

BLAST of CmoCh00G000930 vs. TrEMBL
Match: Q01MK3_ORYSA (OSIGBa0093M15.2 protein OS=Oryza sativa GN=OSIGBa0093M15.2 PE=4 SV=1)

HSP 1 Score: 441.8 bits (1135), Expect = 1.4e-120
Identity = 221/420 (52.62%), Positives = 285/420 (67.86%), Query Frame = 1

Query: 2   HALSVGTPAGVDLVTKDRVRDGQVVIAGQTIHVDLKVVDMMDFDVILGMDWLAENFATID 61
           H L V TP+   L +  R    ++ I G     +L +++  D DVILGMDWLA     ID
Sbjct: 362 HPLMVSTPSNQAL-SLQRSPSVRIEIKGVPFLANLILLESKDIDVILGMDWLARYKGVID 421

Query: 62  CHKKEVIFTPPNGLTFKFKGTSTGTTPKIISMMKARRLIQQGGWAFLAYAVNTKGKEKPI 121
           C  ++V  T  +G               I+  + +  L              ++  +  +
Sbjct: 422 CANRKVTLTSNDGRVV------------IVHALSSESL-------------RSRLNQITL 481

Query: 122 DTIPVVNEFMDVFREDLPGIPPSREVDFGIDLEPGTGPISKAPYRMAPAELKELKTQLQD 181
           + IPVV E+ DVF +DLPG+PP R+++F ID  PGT PI K PYRMA  EL E+K Q+ D
Sbjct: 482 EEIPVVREYPDVFLDDLPGMPPKRDIEFRIDFVPGTTPIHKRPYRMAANELAEVKRQVDD 541

Query: 182 LLDKGFIRPSVSPWGALVLFVKKKDGSMRLCIDYRELNKRTVKNKYPLPRIEDLFDQLRG 241
           LL KG+IRPS SPWGA V+FV+KKD + R+C+DYR LN  T+KNKYPLPRI+DLFDQL+G
Sbjct: 542 LLQKGYIRPSSSPWGAPVIFVEKKDHTQRMCVDYRALNDVTIKNKYPLPRIDDLFDQLKG 601

Query: 242 ATVFSKIDLRSGYHQIKIKNEDIPKTAFRTRYGHYEFVVMSFGLTNAP------MVSFLG 301
           ATVFS IDLRSGYHQ++IK EDIPKTAF TRYG +E  VMSFGLTNAP      M  FLG
Sbjct: 602 ATVFSMIDLRSGYHQLRIKEEDIPKTAFTTRYGLFECTVMSFGLTNAPAFFMNLMNKFLG 661

Query: 302 HIVSKDGIFVDPNKIEAVTKWKRPTTVTEIRSFLGLAGYYRRFVQDFARIATPLTQLTKK 361
           H++S  G+ VDP+ +E+VT WK+  TV+EIRSFLGLAGYYRRF+++F++IA P+T+L +K
Sbjct: 662 HVISAGGVAVDPSNVESVTNWKQSKTVSEIRSFLGLAGYYRRFIENFSKIAKPMTRLLQK 721

Query: 362 GVPFVWDDTCEVSFQELKQRLVSAPVLTVPESSVGYAIYSDASKKGLGCVLMQHGKVVAW 416
            V + W + CE SFQELK RL+SAP+L +P+   G+ +Y DASK GLGCVLMQ GKVVA+
Sbjct: 722 DVKYKWSEECEQSFQELKNRLISAPILILPDPKKGFQVYCDASKLGLGCVLMQDGKVVAY 755

BLAST of CmoCh00G000930 vs. TrEMBL
Match: A0A061FW58_THECC (Retrotransposon protein, putative OS=Theobroma cacao GN=TCM_013113 PE=4 SV=1)

HSP 1 Score: 439.5 bits (1129), Expect = 6.8e-120
Identity = 232/457 (50.77%), Positives = 294/457 (64.33%), Query Frame = 1

Query: 4   LSVGTPAGVDLVTKDRVRDGQVVIAGQTIHVDLKVVDMMDFDVILGMDWLAENFATIDCH 63
           L+V TP     V +       V +  +   V+L V+D +DFDVILGMDWLA   A++DC+
Sbjct: 387 LTVSTPLNEVFVAEWEYESCVVRVEDKNTLVNLVVLDTLDFDVILGMDWLASCHASVDCY 446

Query: 64  KKEVIFTPPNGLTFKFKGTSTGTTPKIISMMKARRLIQQGGWAFLAYAVNTKGKEKPIDT 123
            K V F  P   +F  +G  +     +IS+M  R+L++QG   +LA   +T+ K   I  
Sbjct: 447 HKLVKFDFPGEPSFNIQGDRSNFPTNLISIMSTRKLLRQGCLGYLAVVKDTQAKVGDISQ 506

Query: 124 IPVVNEFMDVFREDLPGIPPSREVDFGIDLEPGTGPISKAPYRMAPAELKELKTQLQDLL 183
           + VVNEF D +                             P RMAPAELKELK QL+DLL
Sbjct: 507 VSVVNEFKDTYIH--------------------------TPIRMAPAELKELKDQLEDLL 566

Query: 184 DKGFIRPSVSPWGALVLFVKKKDGSMRLCIDYRELNKRTVKNKYPLPRIEDLFDQLRGAT 243
           DKGFIRPSVSPWGALVLFVKKKDGS+RLCIDYR+LNK TVKNKY LPRI+DLFDQL+GA 
Sbjct: 567 DKGFIRPSVSPWGALVLFVKKKDGSLRLCIDYRQLNKVTVKNKYSLPRIDDLFDQLQGAQ 626

Query: 244 VFSKIDLRSGYHQIKIKNEDIPKTAFRTRYGHYEFVVMSFGLTNAPMVSFLG-------- 303
            FSKIDL+SGYHQ++I+NE IPKT FRTRYGHYEF+VMSFGLTNA + +F+         
Sbjct: 627 CFSKIDLQSGYHQLRIQNEAIPKTTFRTRYGHYEFLVMSFGLTNA-LAAFMDLMNWVFKP 686

Query: 304 -------------------HIVSKDGIFVDPNKIEAVTKWKRPTTVTEIRSFLGLAGYYR 363
                              H+VSKDG+ VDP K++ V KW R T+V+EIRSFLGLA YYR
Sbjct: 687 YLDKFVVVFIDDILIYSKRHVVSKDGVQVDPKKVKVVEKWPRQTSVSEIRSFLGLACYYR 746

Query: 364 RFVQDFARIATPLTQLTKKGVPFVWDDTCEVSFQELKQRLVSAPVLTVPESSVGYAIYSD 423
           RFV+DF++I  PLT+LT K   F W D CE SF++LK  L +APVL++P+ + GY ++ D
Sbjct: 747 RFVKDFSKIVFPLTKLTCKDTKFEWSDACENSFEKLKACLTTAPVLSLPQGTRGYTVFCD 806

Query: 424 ASKKGLGCVLMQHGKVVAW-NKDVSRLEKTLLVAKYE 433
           A + GLGCVLMQHGKV+ + ++ + R E+  L    E
Sbjct: 807 ALQIGLGCVLMQHGKVIEYASRQLKRHEQNYLAHDLE 816

BLAST of CmoCh00G000930 vs. TAIR10
Match: ATMG00860.1 (ATMG00860.1 DNA/RNA polymerases superfamily protein)

HSP 1 Score: 97.4 bits (241), Expect = 3.2e-20
Identity = 43/100 (43.00%), Positives = 65/100 (65.00%), Query Frame = 1

Query: 289 PMVSFLGH--IVSKDGIFVDPNKIEAVTKWKRPTTVTEIRSFLGLAGYYRRFVQDFARIA 348
           P +++LGH  I+S +G+  DP K+EA+  W  P   TE+R FLGL GYYRRFV+++ +I 
Sbjct: 28  PQIAYLGHRHIISGEGVSADPAKLEAMVGWPEPKNTTELRGFLGLTGYYRRFVKNYGKIV 87

Query: 349 TPLTQLTKKGVPFVWDDTCEVSFQELKQRLVSAPVLTVPE 387
            PLT+L KK     W +   ++F+ LK  + + PVL +P+
Sbjct: 88  RPLTELLKKN-SLKWTEMAALAFKALKGAVTTLPVLALPD 126

BLAST of CmoCh00G000930 vs. NCBI nr
Match: gi|969998696|ref|XP_015076192.1| (PREDICTED: uncharacterized protein LOC107020370 [Solanum pennellii])

HSP 1 Score: 533.1 bits (1372), Expect = 6.5e-148
Identity = 264/445 (59.33%), Positives = 326/445 (73.26%), Query Frame = 1

Query: 4    LSVGTPAGVDLVTKDRVRDGQVVIAGQTIHVDLKVVDMMDFDVILGMDWLAENFATIDCH 63
            + V TP G  +V     R   V + G   H DLKV+DM+DFDVILGMDWL+   A ++CH
Sbjct: 574  IRVSTPVGDSVVVDQVYRLCTVTLMGYDTHADLKVLDMIDFDVILGMDWLSSYHAILNCH 633

Query: 64   KKEVIFTPPNGLTFKFKGTSTGTTPKIISMMKARRLIQQGGWAFLAYAVNTKGKEKPIDT 123
             K +    P     +++GT +  +  +IS +KAR+L+Q+G  A+LA+  +T  +   +++
Sbjct: 634  AKTITLAMPGIPIVEWRGTLSHPSKGVISFLKARQLVQRGCLAYLAHIRDTSIETPMLES 693

Query: 124  IPVVNEFMDVFREDLPGIPPSREVDFGIDLEPGTGPISKAPYRMAPAELKELKTQLQDLL 183
            IPVV+EF +VF  DLPG+PP R++DF ID+EPGT PIS  PYRMAPAELKELK QLQDLL
Sbjct: 694  IPVVSEFSEVFPTDLPGLPPDRDIDFCIDVEPGTRPISIPPYRMAPAELKELKEQLQDLL 753

Query: 184  DKGFIRPSVSPWGALVLFVKKKDGSMRLCIDYRELNKRTVKNKYPLPRIEDLFDQLRGAT 243
             KGFIRPSVSPWGA VLFVKKKDGSMR+CIDYR+LNK T++NKYP+PRI+DLFDQL+GA+
Sbjct: 754  SKGFIRPSVSPWGAPVLFVKKKDGSMRMCIDYRQLNKVTIRNKYPIPRIDDLFDQLQGAS 813

Query: 244  VFSKIDLRSGYHQIKIKNEDIPKTAFRTRYGHYEFVVMSFGLTNAP-------------- 303
            VFSKIDLRSGYHQ+K++ EDIPKTAFRTRYGHYEF+VMSFGLTNAP              
Sbjct: 814  VFSKIDLRSGYHQLKVRAEDIPKTAFRTRYGHYEFLVMSFGLTNAPAAFMDLMNEVFRPY 873

Query: 304  -------------MVSFLGHIVSKDGIFVDPNKIEAVTKWKRPTTVTEIRSFLGLAGYYR 363
                          V+FLGH+VSK+GI VDP KIEAV  W RPT+VTEIRSFLGLAGY R
Sbjct: 874  LDSFVIVFIDDILSVAFLGHVVSKEGIMVDPKKIEAVRDWIRPTSVTEIRSFLGLAGYNR 933

Query: 364  RFVQDFARIATPLTQLTKKGVPFVWDDTCEVSFQELKQRLVSAPVLTVPESSVGYAIYSD 422
            RFV+ F+ IA+PLT+LT+K V F W D CEVSFQ+LK  L +AP+LT+P    G+ +Y D
Sbjct: 934  RFVEGFSSIASPLTRLTQKEVTFQWFDECEVSFQKLKTLLTTAPILTLPVEGEGFVVYGD 993

BLAST of CmoCh00G000930 vs. NCBI nr
Match: gi|590633610|ref|XP_007028151.1| (DNA/RNA polymerases superfamily protein [Theobroma cacao])

HSP 1 Score: 463.8 bits (1192), Expect = 4.8e-127
Identity = 228/387 (58.91%), Positives = 278/387 (71.83%), Query Frame = 1

Query: 53  LAENFATIDCHKKEVIFTPPNGLTFKFKGTSTGTTPKIISMMKARRLIQQGGWAFLAYAV 112
           L  + A +D  +KEV+    +G    F G        +IS +KA +L+Q+G   +LAY +
Sbjct: 336 LTAHRANVDYFRKEVVLRNSDGAEIVFVGEHRVLPSCVISAIKASKLVQKGYPTYLAYVI 395

Query: 113 NTKGKEKPIDTIPVVNEFMDVFREDLPGIPPSREVDFGIDLEPGTGPISKAPYRMAPAEL 172
           +T   E  ++ +P+V+EF+DVF +DLPG+PP RE++F IDL P T PIS  PYRMAPAEL
Sbjct: 396 DTSKGEPKLEDVPIVSEFLDVFPDDLPGLPPDRELEFPIDLLPDTAPISIPPYRMAPAEL 455

Query: 173 KELKTQLQDLLDKGFIRPSVSPWGALVLFVKKKDGSMRLCIDYRELNKRTVKNKYPLPRI 232
           KELK QLQDL+DKGFIR S+SPWGALVLFVKKKDG++RLCIDYR+LN+ T+KNKYPLPRI
Sbjct: 456 KELKVQLQDLVDKGFIRLSISPWGALVLFVKKKDGTLRLCIDYRQLNRVTIKNKYPLPRI 515

Query: 233 EDLFDQLRGATVFSKIDLRSGYHQIKIKNEDIPKTAFRTRYGHYEFVVMSFGLTNAPM-- 292
           +DLFDQLRGA VFSKIDLRSGY+Q++IK +D+PKTAFR RYGHYEF+VM FGLTNA    
Sbjct: 516 DDLFDQLRGAMVFSKIDLRSGYYQLRIKEQDVPKTAFRMRYGHYEFLVMPFGLTNALAVF 575

Query: 293 -------------------------VSFLGHIVSKDGIFVDPNKIEAVTKWKRPTTVTEI 352
                                    V FLGH+VS  GI+VDP KI+A+ +W++P  VTEI
Sbjct: 576 MDLMNRVFHPYLDKFVIVFIDDILEVVFLGHVVSGAGIYVDPKKIQAILQWEQPRMVTEI 635

Query: 353 RSFLGLAGYYRRFVQDFARIATPLTQLTKKGVPFVWDDTCEVSFQELKQRLVSAPVLTVP 412
            SFLGL  YYRRFVQ F+ IA PLT+LT KGV F WDD CE  FQELK RL S PVLT+P
Sbjct: 636 SSFLGLVDYYRRFVQGFSLIAAPLTRLTHKGVKFEWDDVCENRFQELKNRLTSTPVLTLP 695

BLAST of CmoCh00G000930 vs. NCBI nr
Match: gi|590617810|ref|XP_007023888.1| (DNA/RNA polymerases superfamily protein [Theobroma cacao])

HSP 1 Score: 459.9 bits (1182), Expect = 7.0e-126
Identity = 223/355 (62.82%), Positives = 269/355 (75.77%), Query Frame = 1

Query: 90  IISMMKARRLIQQGGWAFLAYAVNTKGKEKPIDTIPVVNEFMDVFREDLPGIPPSREVDF 149
           +I  +KA +L+Q+G  A+LAY ++T   E  ++ +P+V+EF DVF +DLP +PP RE++F
Sbjct: 345 LIFAIKASKLVQKGYPAYLAYVIDTSNGEPKLEDVPIVSEFPDVFPDDLPRLPPDRELEF 404

Query: 150 GIDLEPGTGPISKAPYRMAPAELKELKTQLQDLLDKGFIRPSVSPWGALVLFVKKKDGSM 209
            IDL  GT PIS  PYRMAPAELKELK QLQDL+DKGFIRPS+SPWGA VLFVKKKDG++
Sbjct: 405 PIDLLSGTAPISIPPYRMAPAELKELKVQLQDLVDKGFIRPSISPWGAPVLFVKKKDGTL 464

Query: 210 RLCIDYRELNKRTVKNKYPLPRIEDLFDQLRGATVFSKIDLRSGYHQIKIKNEDIPKTAF 269
           RLCIDYR+LN+ T+KNKYPLP I+DLFDQLRGA VFSKIDLRSGY+Q++IK +D+PKTAF
Sbjct: 465 RLCIDYRQLNRVTIKNKYPLPWIDDLFDQLRGAMVFSKIDLRSGYYQLRIKEQDVPKTAF 524

Query: 270 RTRYGHYEFVVMSFGLTNAPM-----------------------------VSFLGHIVSK 329
           RTRYGHYEF+VM FGLTNAP                              V FLGH+VS 
Sbjct: 525 RTRYGHYEFLVMLFGLTNAPAVFMDLMNRVFHPYLDKFVIVFIDDILLKEVVFLGHVVSG 584

Query: 330 DGIFVDPNKIEAVTKWKRPTTVTEIRSFLGLAGYYRRFVQDFARIATPLTQLTKKGVPFV 389
            GI+VDP KIEA+ +W++P TVTEIRSFLGL GYYRRFVQ F+ IA PLT+LT+KGV F 
Sbjct: 585 AGIYVDPKKIEAILQWEQPRTVTEIRSFLGLVGYYRRFVQRFSLIAAPLTRLTRKGVKFE 644

Query: 390 WDDTCEVSFQELKQRLVSAPVLTVPESSVGYAIYSDASKKGLGCVLMQHGKVVAW 416
           WDD CE  FQELK RL SAP+LT+  S   + +YSDA K GLGCVLMQ  KV+A+
Sbjct: 645 WDDVCENRFQELKNRLTSAPILTLSVSEKEFVVYSDAPKLGLGCVLMQDEKVIAY 699

BLAST of CmoCh00G000930 vs. NCBI nr
Match: gi|923837836|ref|XP_013699774.1| (PREDICTED: uncharacterized protein LOC106403497 [Brassica napus])

HSP 1 Score: 456.4 bits (1173), Expect = 7.7e-125
Identity = 236/427 (55.27%), Positives = 289/427 (67.68%), Query Frame = 1

Query: 20   VRDGQVVIAGQTIHVDLKVVDMMDFDVILGMDWLAENFATIDCHKKEVIFTPPNGLTFKF 79
            V+D  V+I  + + VDL VV + + +VILGMDWL ++ AT++CH+  V F    G    F
Sbjct: 636  VKDIPVLITDREMPVDLIVVPLENHEVILGMDWLGKHRATLNCHRGRVQFETECGRPVWF 695

Query: 80   KGTSTGTTPKIISMMKARRLIQQGGWAFLAYAVNTK---GKEKPIDTIPVVNEFMDVFRE 139
            +G  +    K++S ++A R++  G  A+LA  + TK   G   P D IP+VNEF DVFR 
Sbjct: 696  QGLCSTPGLKVVSALRAERMLLDGCEAYLA-TITTKEVVGGGDP-DGIPLVNEFEDVFRS 755

Query: 140  DLPGIPPSREVDFGIDLEPGTGPISKAPYRMAPAELKELKTQLQDLLDKGFIRPSVSPWG 199
             L G+PP R   F I+LEPGT P+SK+PYRMAPAE+ ELK QL++LLDKGFI PSVSPWG
Sbjct: 756  -LQGVPPDRADPFKIELEPGTAPLSKSPYRMAPAEMAELKKQLEELLDKGFICPSVSPWG 815

Query: 200  ALVLFVKKKDGSMRLCIDYRELNKRTVKNKYPLPRIEDLFDQLRGATVFSKIDLRSGYHQ 259
            A VLFVKKKDGS RLC+DYR LN+ TVKNKYPLPRI++L DQLRGA  FSKIDL SGYHQ
Sbjct: 816  APVLFVKKKDGSFRLCVDYRGLNRVTVKNKYPLPRIDELLDQLRGAKWFSKIDLASGYHQ 875

Query: 260  IKIKNEDIPKTAFRTRYGHYEFVVMSFGLTNAPM-------------------------- 319
            I I+  DI KTAFRTRYGHYEFVVM FGLTNAP                           
Sbjct: 876  IPIEPSDIKKTAFRTRYGHYEFVVMPFGLTNAPAAFMNMMNGVFREYLDEFVIIFIDDIL 935

Query: 320  --VSFLGHIVSKDGIFVDPNKIEAVTKWKRPTTVTEIRSFLGLAGYYRRFVQDFARIATP 379
              + FLGHIVS  G+ VDP KI+ +  W +P   TE+RSFLGLAGYYR+FV+ F+ +A P
Sbjct: 936  KSIGFLGHIVSDKGVSVDPEKIKCIRNWPQPRNATEVRSFLGLAGYYRKFVKGFSSVAQP 995

Query: 380  LTQLTKKGVPFVWDDTCEVSFQELKQRLVSAPVLTVPESSVGYAIYSDASKKGLGCVLMQ 416
            +TQLT K V F W D CE SF  LK  L S P+L +PE+   Y +Y+DAS  GLGCVL Q
Sbjct: 996  MTQLTGKDVKFAWSDQCEESFSALKDMLTSTPILVLPEADQPYVVYTDASITGLGCVLTQ 1055

BLAST of CmoCh00G000930 vs. NCBI nr
Match: gi|113205363|gb|AAT66771.2| (Putative polyprotein, identical [Solanum demissum])

HSP 1 Score: 456.1 bits (1172), Expect = 1.0e-124
Identity = 239/458 (52.18%), Positives = 299/458 (65.28%), Query Frame = 1

Query: 4    LSVGTPAGVDLVTKDRVRDGQVVIAGQTIHVDLKVVDMMDFDVILGMDWLAENFATIDCH 63
            + V TP G  LV    +R   V I G    VDL ++DM+DFDVILGMDWL+   A +DC+
Sbjct: 695  IHVSTPVGESLVVDQILRSCLVTIQGCDTRVDLILLDMVDFDVILGMDWLSPYHAVLDCY 754

Query: 64   KKEVIFTPPNGLTFKFKGTSTGTTPKIISMMKARRLIQQGGWAFLAYAVNTKGKEKPIDT 123
             K V    P      ++G  + T   IIS M+ARRL+  G  A+LAY  +    +  +D+
Sbjct: 755  AKTVTLAMPGISPVLWQGAYSHTPTWIISFMRARRLVASGCLAYLAYVRDVSRDDSSVDS 814

Query: 124  IPVVNEFMDVFREDLPGIPPSREVDFGIDLEPGTGPISKAPYRMAPAELKELKTQLQDLL 183
            +PVV EF DVF  DLPG+PP R++DF IDLEP T PIS  PYRMAPAEL+EL  QL+DLL
Sbjct: 815  VPVVREFADVFPIDLPGLPPDRDIDFAIDLEPDTRPISIPPYRMAPAELRELSAQLEDLL 874

Query: 184  DKGFIRPSVSPWGALVLFVKKKDGSMRLCIDYRELNKRTVKNKYPLPRIEDLFDQLRGAT 243
             KGFIRPSVSPWGA VLFVKKKDG+MR+CIDYR+LNK TVKN+YP+PRI+DLFDQL+GA 
Sbjct: 875  GKGFIRPSVSPWGAPVLFVKKKDGTMRMCIDYRQLNKVTVKNRYPMPRIDDLFDQLQGAA 934

Query: 244  VFSKIDLRSGYHQIKIKNEDIPKTAFRTRYGHYEFVVMSFGLTNAP-------------- 303
            VFSKIDLRSGYHQ++I+  DIPKTAFRTRYGHYEF+VMSFGLTNAP              
Sbjct: 935  VFSKIDLRSGYHQLRIRAADIPKTAFRTRYGHYEFLVMSFGLTNAPAAFMDLMTRVFRPY 994

Query: 304  ----MVSFLGHIVS--KDGIFVDPNKIE-----------AVTK---WKRPTTVTEIRS-- 363
                ++ F+  I++     ++   +K E            V+K      P  +  IR   
Sbjct: 995  LDLFVIVFIDDILTLRDQRLYAKFSKCEFWLESVAFLGHVVSKEGIRVDPAKIEAIRDWV 1054

Query: 364  ----------FLGLAGYYRRFVQDFARIATPLTQLTKKGVPFVWDDTCEVSFQELKQRLV 416
                      F+GLAGYYRRFV+ F+ IA  LT+LT+  VPFVW + CE SF  LK+ L 
Sbjct: 1055 RPTSVTEIRSFVGLAGYYRRFVEGFSTIAALLTRLTRVDVPFVWSEECEASFLRLKELLT 1114

The following BLAST results are available for this feature:
Database: Swiss-Prot
Match Name    E-value    Identity (%)    Description
YG31B_YEAST   6.9e-33    43.02           Transposon Ty3-G Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 20...
YI31B_YEAST   6.9e-33    43.02           Transposon Ty3-I Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 20...
POL3_DROME    1.8e-25    29.69           Retrovirus-related Pol polyprotein from transposon 17.6 OS=Drosophila melanogast...
TF24_SCHPO    1.2e-24    34.34           Transposon Tf2-4 polyprotein OS=Schizosaccharomyces pombe (strain 972 / ATCC 248...
TF26_SCHPO    1.2e-24    34.34           Transposon Tf2-6 polyprotein OS=Schizosaccharomyces pombe (strain 972 / ATCC 248...

Database: TrEMBL
Match Name          E-value     Identity (%)    Description
A0A061EVL2_THECC    3.4e-127    58.91           DNA/RNA polymerases superfamily protein OS=Theobroma cacao GN=TCM_023542 PE=4 SV...
A0A061G9E7_THECC    4.9e-126    62.82           DNA/RNA polymerases superfamily protein OS=Theobroma cacao GN=TCM_028292 PE=4 SV...
Q6F2D6_SOLDE        7.0e-125    52.18           Putative polyprotein, identical OS=Solanum demissum GN=SDM1_49t00014 PE=4 SV=2
Q01MK3_ORYSA        1.4e-120    52.62           OSIGBa0093M15.2 protein OS=Oryza sativa GN=OSIGBa0093M15.2 PE=4 SV=1
A0A061FW58_THECC    6.8e-120    50.77           Retrotransposon protein, putative OS=Theobroma cacao GN=TCM_013113 PE=4 SV=1

Database: TAIR10
Match Name     E-value    Identity (%)    Description
ATMG00860.1    3.2e-20    43.00           ATMG00860.1 DNA/RNA polymerases superfamily protein

Database: NCBI nr
Match Name                          E-value     Identity (%)    Description
gi|969998696|ref|XP_015076192.1|    6.5e-148    59.33           PREDICTED: uncharacterized protein LOC107020370 [Solanum pennellii]
gi|590633610|ref|XP_007028151.1|    4.8e-127    58.91           DNA/RNA polymerases superfamily protein [Theobroma cacao]
gi|590617810|ref|XP_007023888.1|    7.0e-126    62.82           DNA/RNA polymerases superfamily protein [Theobroma cacao]
gi|923837836|ref|XP_013699774.1|    7.7e-125    55.27           PREDICTED: uncharacterized protein LOC106403497 [Brassica napus]
gi|113205363|gb|AAT66771.2|         1.0e-124    52.18           Putative polyprotein, identical [Solanum demissum]
The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term         Definition
IPR000477    RT_dom
IPR013242    Retroviral aspartyl protease
GO Assignments
This gene is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0008150 biological_process
biological_process GO:0015074 DNA integration
biological_process GO:0006310 DNA recombination
biological_process GO:0006508 proteolysis
biological_process GO:0006278 RNA-dependent DNA biosynthetic process
cellular_component GO:0005575 cellular_component
cellular_component GO:0005739 mitochondrion
cellular_component GO:0009536 plastid
molecular_function GO:0005488 binding
molecular_function GO:0004190 aspartic-type endopeptidase activity
molecular_function GO:0003677 DNA binding
molecular_function GO:0003723 RNA binding
molecular_function GO:0003964 RNA-directed DNA polymerase activity

The following mRNA feature(s) are a part of this gene:

Feature Name        Unique Name         Type
CmoCh00G000930.1    CmoCh00G000930.1    mRNA


Analysis Name: InterPro Annotations of Cucurbita moschata
Date Performed: 2017-05-19
IPR Term     IPR Description                 Source     Source Term         Source Description     Alignment
IPR000477    Reverse transcriptase domain    PFAM       PF00078             RVT_1                  coord: 202..291; score: 7.5
IPR013242    Retroviral aspartyl protease    PFAM       PF08284             RVP_2                  coord: 4..72; score: 6.
None         No IPR available                GENE3D     G3DSA:3.10.10.10    -                      coord: 151..291; score: 5.5
None         No IPR available                PANTHER    PTHR24559           FAMILY NOT NAMED       coord: 195..553; score: 3.8E
None         No IPR available                PANTHER    PTHR24559:SF207     SUBFAMILY NOT NAMED    coord: 195..553; score: 3.8E
None         No IPR available                unknown    SSF56672            DNA/RNA polymerases    coord: 126..417; score: 4.89E

The following gene(s) are orthologous to this gene:
Gene              Orthologue        Organism                   Block
CmoCh00G000930    CmaCh00G001860    Cucurbita maxima (Rimu)    cmacmoB001
The following gene(s) are paralogous to this gene:

None

The following block(s) are covering this gene:
Gene              Organism                   Block
CmoCh00G000930    Cucurbita maxima (Rimu)    cmacmoB000