CmaCh02G004950 (gene) Cucurbita maxima (Rimu) v1.1

Overview
Name: CmaCh02G004950
Type: gene
Organism: Cucurbita maxima (Cucurbita maxima (Rimu) v1.1)
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Location: Cma_Chr02: 2644637 .. 2645781 (-)
RNA-Seq Expression: CmaCh02G004950
Synteny: CmaCh02G004950
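
The Location field packs the chromosome, start, end, and strand into one string. A minimal sketch of unpacking it is shown below; the function name `parse_location` is hypothetical, and the length calculation assumes the coordinates are 1-based and inclusive, which the page does not state.

```python
import re

# Pattern for strings such as "Cma_Chr02: 2644637 .. 2645781 (-)".
LOCATION_RE = re.compile(r"\s*(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*")


def parse_location(text):
    """Split a location string into sequence id, start, end, strand, and length.

    Assumption: start and end are 1-based inclusive coordinates.
    """
    match = LOCATION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    seqid, start, end, strand = match.groups()
    start, end = int(start), int(end)
    return {
        "seqid": seqid,
        "start": start,
        "end": end,
        "strand": strand,
        "length": end - start + 1,  # inclusive span
    }


loc = parse_location("Cma_Chr02: 2644637 .. 2645781 (-)")
print(loc["seqid"], loc["strand"], loc["length"])
```

Under the inclusive-coordinate assumption, the feature spans 1145 bases on the minus strand of Cma_Chr02.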
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

ATGAAGAGGCTGGTTGCGAAAGGAGTTTTAGAAGGTCCGAAATTTGTTGATGTGGGTCGTTGTGAGAACTACGTTATGGGCAAACAGAAACGAGTTAGTTTCACAAAGGCTGCAAGAGAACCGAAGAAAGTGCGGCTGGAAATGGTCCATACAAACGTCTGGGGACCATCTCCAGTTTCATCACTTGGTGAATCAAGGTACTACGTCACCTTCATCGATGACTTTAGCAGGAAGGTATGGGTTTACTTTCTGAAACACAAGTCAGATGTGTTTACCATCTTCAAGAAGTGGAAAGCTGAAGTTGAAAATCAGACTGACTTGAAGATTAAATGCCTGAGGTCTGACAATGGAGGAGAGTACAACAAATCAGAGTTTATAACATTTTGTGCAGTTGAGGGAATTAGATTAATAAGAACAATTTTCGGTAAGGCAAGACAAAATGGTATTGCAGAAAGAATGAACAGAACATTGAATGAGCGAGCAAGAAGCATGAGGATTCACTCTAGACTACCAAAGACATTCTGGGCTGATGCTGTGAACACAGCAGCATATTTGATTAATAGAGGGCCGTCAGTACCCTTGAAATTCAAATTGCCCGAAGAAGTATGGACAGGAAAAGAACTCAAGTACTCTCACTTGAGAACTTTTGGTTGTACTGTGTATGTTCATGTTGATCCAGAGAAGAGAGATAAGCTTGATGTTAAGGCTGTGAAATGCTACTTCATAGGCTATGGCTCTGACATGTTCGGATACAGATTTTGGGATGAGAAAAATATGAAAATCCTAAGACACTGCGATGTGACCTTTGATGAAAATGTCATGTACAAGGACAGAGAGAAGATAAACTTTGAGACTACAAAGAAGTGGAAGTTGAACTTGAGTGGCAGGAAAATTCACGCAGTGATGTTACAACTGAAGCTCAAGAAACTCATGATCCTATTGCTGAAGAACCAGACGTGGAGCAAGTTGCACCTAAGCAGGTGTTGAGAAGATCATCCAGAACTATCAGAGCACCAGATAGATATTCAGCCTCATTACATTATCTGTTGCTGACTGACGAAGGAGAACCAGAGTCCTTTGATGAGGCCCTACATGTGGAAGATTCAACCAAGTGGGAGCAAGCCATGGATGATGAGCTCACTTGA

mRNA sequence

ATGAAGAGGCTGGTTGCGAAAGGAGTTTTAGAAGGTCCGAAATTTGTTGATGTGGGTCGTTGTGAGAACTACGTTATGGGCAAACAGAAACGAGTTAGTTTCACAAAGGCTGCAAGAGAACCGAAGAAAGTGCGGCTGGAAATGGTCCATACAAACGTCTGGGGACCATCTCCAGTTTCATCACTTGGTGAATCAAGGTACTACGTCACCTTCATCGATGACTTTAGCAGGAAGGTATGGGTTTACTTTCTGAAACACAAGTCAGATGTGTTTACCATCTTCAAGAAGTGGAAAGCTGAAGTTGAAAATCAGACTGACTTGAAGATTAAATGCCTGAGGTCTGACAATGGAGGAGAGTACAACAAATCAGAGTTTATAACATTTTGTGCAGTTGAGGGAATTAGATTAATAAGAACAATTTTCGGTAAGGCAAGACAAAATGGTATTGCAGAAAGAATGAACAGAACATTGAATGAGCGAGCAAGAAGCATGAGGATTCACTCTAGACTACCAAAGACATTCTGGGCTGATGCTGTGAACACAGCAGCATATTTGATTAATAGAGGGCCGTCAGTACCCTTGAAATTCAAATTGCCCGAAGAAGTATGGACAGGAAAAGAACTCAAGTACTCTCACTTGAGAACTTTTGGTTGTACTGTGTATGTTCATGTTGATCCAGAGAAGAGAGATAAGCTTGATGTTAAGGCTGTGAAATGCTACTTCATAGGCTATGGCTCTGACATGTTCGGATACAGATTTTGGGATGAGAAAAATATGAAAATCCTAAGACACTGCGATGTGACCTTTGATGAAAATGTCATTGATGTTACAACTGAAGCTCAAGAAACTCATGATCCTATTGCTGAAGAACCAGACGTGGAGCAAGTTGCACCTAAGCAGGTGTTGAGAAGATCATCCAGAACTATCAGAGCACCAGATAGATATTCAGCCTCATTACATTATCTGTTGCTGACTGACGAAGGAGAACCAGAGTCCTTTGATGAGGCCCTACATGTGGAAGATTCAACCAAGTGGGAGCAAGCCATGGATGATGAGCTCACTTGA
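
The gene sequence and the mRNA sequence above differ by one contiguous stretch that is present only in the gene sequence. The sketch below shows one way to locate such a stretch by trimming the longest common prefix and suffix. It is an illustration, not the method used to build this annotation: the function name `locate_single_intron` is hypothetical, the two strings are short excerpts copied from the sequences above around the point where they diverge, and the approach assumes exactly one intron.

```python
# Excerpts copied from the gene and mRNA sequences on this page.
GENE_EXCERPT = (
    "ACCTTTGATGAAAATGTCAT"
    "GTACAAGGACAGAGAGAAGATAAACTTTGAGACTACAAAGAAGTGGAAGTTGAACTTGAG"
    "TGGCAGGAAAATTCACGCAG"
    "TGATGTTACAACTGAAGCTCAAG"
)
MRNA_EXCERPT = "ACCTTTGATGAAAATGTCATTGATGTTACAACTGAAGCTCAAG"


def locate_single_intron(gene, mrna):
    """Return (start, end) of the gene segment missing from mrna.

    Indices are 0-based with an exclusive end. Assumes the two sequences
    differ by exactly one contiguous insertion in the gene.
    """
    if len(gene) < len(mrna):
        raise ValueError("gene sequence is shorter than the mRNA")
    # Longest common prefix.
    prefix = 0
    while prefix < len(mrna) and gene[prefix] == mrna[prefix]:
        prefix += 1
    # Longest common suffix that does not overlap the prefix in the mRNA.
    suffix = 0
    while (suffix < len(mrna) - prefix
           and gene[len(gene) - 1 - suffix] == mrna[len(mrna) - 1 - suffix]):
        suffix += 1
    if prefix + suffix != len(mrna):
        raise ValueError("sequences differ by more than one insertion")
    return prefix, len(gene) - suffix


start, end = locate_single_intron(GENE_EXCERPT, MRNA_EXCERPT)
intron = GENE_EXCERPT[start:end]
print(intron[:2], intron[-2:], len(intron))
```

For these excerpts the recovered segment begins with GT and ends with AG. When the bases flanking a junction repeat, prefix/suffix trimming can shift the reported boundary by a few positions, so the result should be checked against the annotated exon coordinates.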

Coding sequence (CDS)

ATGAAGAGGCTGGTTGCGAAAGGAGTTTTAGAAGGTCCGAAATTTGTTGATGTGGGTCGTTGTGAGAACTACGTTATGGGCAAACAGAAACGAGTTAGTTTCACAAAGGCTGCAAGAGAACCGAAGAAAGTGCGGCTGGAAATGGTCCATACAAACGTCTGGGGACCATCTCCAGTTTCATCACTTGGTGAATCAAGGTACTACGTCACCTTCATCGATGACTTTAGCAGGAAGGTATGGGTTTACTTTCTGAAACACAAGTCAGATGTGTTTACCATCTTCAAGAAGTGGAAAGCTGAAGTTGAAAATCAGACTGACTTGAAGATTAAATGCCTGAGGTCTGACAATGGAGGAGAGTACAACAAATCAGAGTTTATAACATTTTGTGCAGTTGAGGGAATTAGATTAATAAGAACAATTTTCGGTAAGGCAAGACAAAATGGTATTGCAGAAAGAATGAACAGAACATTGAATGAGCGAGCAAGAAGCATGAGGATTCACTCTAGACTACCAAAGACATTCTGGGCTGATGCTGTGAACACAGCAGCATATTTGATTAATAGAGGGCCGTCAGTACCCTTGAAATTCAAATTGCCCGAAGAAGTATGGACAGGAAAAGAACTCAAGTACTCTCACTTGAGAACTTTTGGTTGTACTGTGTATGTTCATGTTGATCCAGAGAAGAGAGATAAGCTTGATGTTAAGGCTGTGAAATGCTACTTCATAGGCTATGGCTCTGACATGTTCGGATACAGATTTTGGGATGAGAAAAATATGAAAATCCTAAGACACTGCGATGTGACCTTTGATGAAAATGTCATTGATGTTACAACTGAAGCTCAAGAAACTCATGATCCTATTGCTGAAGAACCAGACGTGGAGCAAGTTGCACCTAAGCAGGTGTTGAGAAGATCATCCAGAACTATCAGAGCACCAGATAGATATTCAGCCTCATTACATTATCTGTTGCTGACTGACGAAGGAGAACCAGAGTCCTTTGATGAGGCCCTACATGTGGAAGATTCAACCAAGTGGGAGCAAGCCATGGATGATGAGCTCACTTGA

Protein sequence

MKRLVAKGVLEGPKFVDVGRCENYVMGKQKRVSFTKAAREPKKVRLEMVHTNVWGPSPVSSLGESRYYVTFIDDFSRKVWVYFLKHKSDVFTIFKKWKAEVENQTDLKIKCLRSDNGGEYNKSEFITFCAVEGIRLIRTIFGKARQNGIAERMNRTLNERARSMRIHSRLPKTFWADAVNTAAYLINRGPSVPLKFKLPEEVWTGKELKYSHLRTFGCTVYVHVDPEKRDKLDVKAVKCYFIGYGSDMFGYRFWDEKNMKILRHCDVTFDENVIDVTTEAQETHDPIAEEPDVEQVAPKQVLRRSSRTIRAPDRYSASLHYLLLTDEGEPESFDEALHVEDSTKWEQAMDDELT
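
The protein sequence corresponds to a codon-by-codon translation of the CDS. A minimal sketch of that translation follows, assuming the standard genetic code; the function name `translate` is hypothetical and the inputs are only the first and last few codons of the CDS shown above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    protein = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


# First eight codons and last six codons of the CDS on this page.
print(translate("ATGAAGAGGCTGGTTGCGAAAGGA"))
print(translate("GATGATGAGCTCACTTGA"))
```

The two fragments translate to MKRLVAKG and DDELT, which match the start and end of the protein sequence listed above.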
Homology
BLAST of CmaCh02G004950 vs. ExPASy Swiss-Prot
Match: P10978 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum OX=4097 PE=2 SV=1)

HSP 1 Score: 298.1 bits (762), Expect = 1.3e-79
Identity = 166/397 (41.81%), Positives = 228/397 (57.43%), Query Frame = 0

Query: 4   LVAKGVLEGPKFVDVGRCENYVMGKQKRVSFTKAAREPKKVRLEMVHTNVWGPSPVSSLG 63
           L  K ++   K   V  C+  + GKQ RVSF + + E K   L++V+++V GP  + S+G
Sbjct: 440 LAKKSLISYAKGTTVKPCDYCLFGKQHRVSF-QTSSERKLNILDLVYSDVCGPMEIESMG 499

Query: 64  ESRYYVTFIDDFSRKVWVYFLKHKSDVFTIFKKWKAEVENQTDLKIKCLRSDNGGEYNKS 123
            ++Y+VTFIDD SRK+WVY LK K  VF +F+K+ A VE +T  K+K LRSDNGGEY   
Sbjct: 500 GNKYFVTFIDDASRKLWVYILKTKDQVFQVFQKFHALVERETGRKLKRLRSDNGGEYTSR 559

Query: 124 EFITFCAVEGIRLIRTIFGKARQNGIAERMNRTLNERARSMRIHSRLPKTFWADAVNTAA 183
           EF  +C+  GIR  +T+ G  + NG+AERMNRT+ E+ RSM   ++LPK+FW +AV TA 
Sbjct: 560 EFEEYCSSHGIRHEKTVPGTPQHNGVAERMNRTIVEKVRSMLRMAKLPKSFWGEAVQTAC 619

Query: 184 YLINRGPSVPLKFKLPEEVWTGKELKYSHLRTFGCTVYVHVDPEKRDKLDVKAVKCYFIG 243
           YLINR PSVPL F++PE VWT KE+ YSHL+ FGC  + HV  E+R KLD K++ C FIG
Sbjct: 620 YLINRSPSVPLAFEIPERVWTNKEVSYSHLKVFGCRAFAHVPKEQRTKLDDKSIPCIFIG 679

Query: 244 YGSDMFGYRFWDEKNMKILRHCDVTFDENVIDVT-------------------------T 303
           YG + FGYR WD    K++R  DV F E+ +                            T
Sbjct: 680 YGDEEFGYRLWDPVKKKVIRSRDVVFRESEVRTAADMSEKVKNGIIPNFVTIPSTSNNPT 739

Query: 304 EAQETHDPIAEEPD---------------VEQV-------APKQVLRRSSRTIRAPDRYS 354
            A+ T D ++E+ +               VE+V          Q LRRS R  R   R  
Sbjct: 740 SAESTTDEVSEQGEQPGEVIEQGEQLDEGVEEVEHPTQGEEQHQPLRRSERP-RVESRRY 799
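
Each BLAST hit on this page opens with two statistics lines in a fixed layout. The sketch below parses that layout and recomputes the percentages from the residue counts. The function name `parse_hsp_header` is hypothetical, and the regular expressions are written for the text as it appears here (they accept either spelling of the positives label), not for BLAST output in general.

```python
import re

SCORE_RE = re.compile(r"Score:\s*([\d.]+) bits \((\d+)\), Expect = (\S+)")
FRACTION_RE = re.compile(r"(Identity|Posi?tives) = (\d+)/(\d+) \(([\d.]+)%\)")


def parse_hsp_header(score_line, identity_line):
    """Extract scores, E-value, and identity/positive counts from an HSP header."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError("score line not recognised")
    stats = {
        "bit_score": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "evalue": float(match.group(3)),
    }
    for label, matched, aligned, percent in FRACTION_RE.findall(identity_line):
        key = "identity" if label == "Identity" else "positives"
        stats[key] = (int(matched), int(aligned), float(percent))
    return stats


stats = parse_hsp_header(
    "HSP 1 Score: 298.1 bits (762), Expect = 1.3e-79",
    "Identity = 166/397 (41.81%), Positives = 228/397 (57.43%), Query Frame = 0",
)
for key in ("identity", "positives"):
    matched, aligned, reported = stats[key]
    # Recompute the percentage from the counts and compare with the report.
    print(key, round(100 * matched / aligned, 2), reported)
```

The recomputed values agree with the reported ones: 166 of 397 aligned positions is 41.81% and 228 of 397 is 57.43%.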

BLAST of CmaCh02G004950 vs. ExPASy Swiss-Prot
Match: P04146 (Copia protein OS=Drosophila melanogaster OX=7227 GN=GIP PE=1 SV=3)

HSP 1 Score: 180.6 bits (457), Expect = 3.0e-44
Identity = 103/255 (40.39%), Positives = 149/255 (58.43%), Query Frame = 0

Query: 21  CENYVMGKQKRVSFTKAAREPKKVR--LEMVHTNVWGPSPVSSLGESRYYVTFIDDFSRK 80
           CE  + GKQ R+ F K  ++   ++  L +VH++V GP    +L +  Y+V F+D F+  
Sbjct: 455 CEPCLNGKQARLPF-KQLKDKTHIKRPLFVVHSDVCGPITPVTLDDKNYFVIFVDQFTHY 514

Query: 81  VWVYFLKHKSDVFTIFKKWKAEVENQTDLKIKCLRSDNGGEYNKSEFITFCAVEGIRLIR 140
              Y +K+KSDVF++F+ + A+ E   +LK+  L  DNG EY  +E   FC  +GI    
Sbjct: 515 CVTYLIKYKSDVFSMFQDFVAKSEAHFNLKVVYLYIDNGREYLSNEMRQFCVKKGISYHL 574

Query: 141 TIFGKARQNGIAERMNRTLNERARSMRIHSRLPKTFWADAVNTAAYLINRGPSVPL--KF 200
           T+    + NG++ERM RT+ E+AR+M   ++L K+FW +AV TA YLINR PS  L    
Sbjct: 575 TVPHTPQLNGVSERMIRTITEKARTMVSGAKLDKSFWGEAVLTATYLINRIPSRALVDSS 634

Query: 201 KLPEEVWTGKELKYSHLRTFGCTVYVHVDPEKRDKLDVKAVKCYFIGYGSDMFGYRFWDE 260
           K P E+W  K+    HLR FG TVYVH+   K+ K D K+ K  F+GY  +  G++ WD 
Sbjct: 635 KTPYEMWHNKKPYLKHLRVFGATVYVHI-KNKQGKFDDKSFKSIFVGYEPN--GFKLWDA 694

Query: 261 KNMKILRHCDVTFDE 272
            N K +   DV  DE
Sbjct: 695 VNEKFIVARDVVVDE 705

BLAST of CmaCh02G004950 vs. ExPASy Swiss-Prot
Match: Q9ZT94 (Retrovirus-related Pol polyprotein from transposon RE2 OS=Arabidopsis thaliana OX=3702 GN=RE2 PE=4 SV=1)

HSP 1 Score: 142.9 bits (359), Expect = 7.0e-33
Identity = 85/258 (32.95%), Positives = 131/258 (50.78%), Query Frame = 0

Query: 21  CENYVMGKQKRVSFTKAAREPKKVRLEMVHTNVWGPSPVSSLGESRYYVTFIDDFSRKVW 80
           C +  + K  +V F+ +     K  LE ++++VW  SP+ S+   RYYV F+D F+R  W
Sbjct: 479 CSDCFINKSHKVPFSNSTITSSK-PLEYIYSDVWS-SPILSIDNYRYYVIFVDHFTRYTW 538

Query: 81  VYFLKHKSDVFTIFKKWKAEVENQTDLKIKCLRSDNGGEYNKSEFITFCAVEGIRLIRTI 140
           +Y LK KS V   F  +K+ VEN+   +I  L SDNGGE+       + +  GI    + 
Sbjct: 539 LYPLKQKSQVKDTFIIFKSLVENRFQTRIGTLYSDNGGEF--VVLRDYLSQHGISHFTSP 598

Query: 141 FGKARQNGIAERMNRTLNERARSMRIHSRLPKTFWADAVNTAAYLINRGPSVPLKFKLPE 200
                 NG++ER +R + E   ++  H+ +PKT+W  A + A YLINR P+  L+ + P 
Sbjct: 599 PHTPEHNGLSERKHRHIVEMGLTLLSHASVPKTYWPYAFSVAVYLINRLPTPLLQLQSPF 658

Query: 201 EVWTGKELKYSHLRTFGCTVYVHVDPEKRDKLDVKAVKCYFIGYGSDMFGYRFWDEKNMK 260
           +   G+   Y  L+ FGC  Y  + P  R KL+ K+ +C F+GY      Y        +
Sbjct: 659 QKLFGQPPNYEKLKVFGCACYPWLRPYNRHKLEDKSKQCAFMGYSLTQSAYLCLHIPTGR 718

Query: 261 ILRHCDVTFDENVIDVTT 279
           +     V FDE     +T
Sbjct: 719 LYTSRHVQFDERCFPFST 732

BLAST of CmaCh02G004950 vs. ExPASy Swiss-Prot
Match: Q94HW2 (Retrovirus-related Pol polyprotein from transposon RE1 OS=Arabidopsis thaliana OX=3702 GN=RE1 PE=2 SV=1)

HSP 1 Score: 141.7 bits (356), Expect = 1.6e-32
Identity = 83/252 (32.94%), Positives = 129/252 (51.19%), Query Frame = 0

Query: 21  CENYVMGKQKRVSFTKAAREPKKVRLEMVHTNVWGPSPVSSLGESRYYVTFIDDFSRKVW 80
           C + ++ K  +V F+++     +  LE ++++VW  SP+ S    RYYV F+D F+R  W
Sbjct: 500 CSDCLINKSNKVPFSQSTINSTR-PLEYIYSDVWS-SPILSHDNYRYYVIFVDHFTRYTW 559

Query: 81  VYFLKHKSDVFTIFKKWKAEVENQTDLKIKCLRSDNGGEYNKSEFITFCAVEGIRLIRTI 140
           +Y LK KS V   F  +K  +EN+   +I    SDNGGE+       + +  GI  + + 
Sbjct: 560 LYPLKQKSQVKETFITFKNLLENRFQTRIGTFYSDNGGEF--VALWEYFSQHGISHLTSP 619

Query: 141 FGKARQNGIAERMNRTLNERARSMRIHSRLPKTFWADAVNTAAYLINRGPSVPLKFKLPE 200
                 NG++ER +R + E   ++  H+ +PKT+W  A   A YLINR P+  L+ + P 
Sbjct: 620 PHTPEHNGLSERKHRHIVETGLTLLSHASIPKTYWPYAFAVAVYLINRLPTPLLQLESPF 679

Query: 201 EVWTGKELKYSHLRTFGCTVYVHVDPEKRDKLDVKAVKCYFIGYGSDMFGYRFWDEKNMK 260
           +   G    Y  LR FGC  Y  + P  + KLD K+ +C F+GY      Y     +  +
Sbjct: 680 QKLFGTSPNYDKLRVFGCACYPWLRPYNQHKLDDKSRQCVFLGYSLTQSAYLCLHLQTSR 739

Query: 261 ILRHCDVTFDEN 273
           +     V FDEN
Sbjct: 740 LYISRHVRFDEN 747

BLAST of CmaCh02G004950 vs. ExPASy Swiss-Prot
Match: P92512 (Uncharacterized mitochondrial protein AtMg00710 OS=Arabidopsis thaliana OX=3702 GN=AtMg00710 PE=4 SV=1)

HSP 1 Score: 85.5 bits (210), Expect = 1.3e-15
Identity = 39/76 (51.32%), Positives = 49/76 (64.47%), Query Frame = 0

Query: 153 MNRTLNERARSMRIHSRLPKTFWADAVNTAAYLINRGPSVPLKFKLPEEVWTGKELKYSH 212
           MNRT+ E+ RSM     LPKTF ADA NTA ++IN+ PS  + F +P+EVW      YS+
Sbjct: 1   MNRTIIEKVRSMLCECGLPKTFRADAANTAVHIINKYPSTAINFHVPDEVWFQSVPTYSY 60

Query: 213 LRTFGCTVYVHVDPEK 229
           LR FGC  Y+H D  K
Sbjct: 61  LRRFGCVAYIHCDEGK 76

BLAST of CmaCh02G004950 vs. TAIR 10
Match: ATMG00710.1 (Polynucleotidyl transferase, ribonuclease H-like superfamily protein )

HSP 1 Score: 85.5 bits (210), Expect = 9.4e-17
Identity = 39/76 (51.32%), Positives = 49/76 (64.47%), Query Frame = 0

Query: 153 MNRTLNERARSMRIHSRLPKTFWADAVNTAAYLINRGPSVPLKFKLPEEVWTGKELKYSH 212
           MNRT+ E+ RSM     LPKTF ADA NTA ++IN+ PS  + F +P+EVW      YS+
Sbjct: 1   MNRTIIEKVRSMLCECGLPKTFRADAANTAVHIINKYPSTAINFHVPDEVWFQSVPTYSY 60

Query: 213 LRTFGCTVYVHVDPEK 229
           LR FGC  Y+H D  K
Sbjct: 61  LRRFGCVAYIHCDEGK 76

The following BLAST results are available for this feature:

ExPASy Swiss-Prot
Match Name | E-value | Identity (%) | Description
P10978 | 1.3e-79 | 41.81 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum...
P04146 | 3.0e-44 | 40.39 | Copia protein OS=Drosophila melanogaster OX=7227 GN=GIP PE=1 SV=3
Q9ZT94 | 7.0e-33 | 32.95 | Retrovirus-related Pol polyprotein from transposon RE2 OS=Arabidopsis thaliana O...
Q94HW2 | 1.6e-32 | 32.94 | Retrovirus-related Pol polyprotein from transposon RE1 OS=Arabidopsis thaliana O...
P92512 | 1.3e-15 | 51.32 | Uncharacterized mitochondrial protein AtMg00710 OS=Arabidopsis thaliana OX=3702 ...

TAIR 10
Match Name | E-value | Identity (%) | Description
ATMG00710.1 | 9.4e-17 | 51.32 | Polynucleotidyl transferase, ribonuclease H-like superfamily protein

InterPro
Analysis Name: InterPro Annotations of Cucurbita maxima (Rimu) v1.1
Date Performed: 2021-10-25
IPR Term | IPR Description | Source | Source Term | Source Description | Alignment
IPR001584 | Integrase, catalytic core | PFAM | PF00665 | rve | coord: 49..140; e-value: 5.1E-10; score: 39.6
IPR001584 | Integrase, catalytic core | PROSITE | PS50994 | INTEGRASE | coord: 37..207; score: 20.138538
IPR036397 | Ribonuclease H superfamily | GENE3D | 3.30.420.10 | - | coord: 38..223; e-value: 7.3E-35; score: 122.0
None | No IPR available | PANTHER | PTHR11439:SF324 | RIBONUCLEASE H-LIKE DOMAIN, GAG-PRE-INTEGRASE DOMAIN, GAG-POLYPEPTIDE OF LTR COPIA-TYPE-RELATED | coord: 39..270
None | No IPR available | PANTHER | PTHR11439 | GAG-POL-RELATED RETROTRANSPOSON | coord: 39..270
IPR012337 | Ribonuclease H-like superfamily | SUPERFAMILY | 53098 | Ribonuclease H-like | coord: 46..201

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name | Unique Name | Type
CmaCh02G004950.1 | CmaCh02G004950.1 | mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category | Term Accession | Term Name
biological_process | GO:0015074 | DNA integration
molecular_function | GO:0003676 | nucleic acid binding