Clc03G10850 (gene) Watermelon (cordophanus) v2

Overview
Name: Clc03G10850
Type: gene
Organism: Citrullus lanatus subsp. cordophanus cv. cordophanus (Watermelon (cordophanus) v2)
Description: Retroelement pol polyprotein-like
Location: ClcChr03: 13903942 .. 13904559 (+)
RNA-Seq Expression: Clc03G10850
Synteny: Clc03G10850
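
The Location field can be converted into a feature length with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive (the usual GFF-style convention; this page does not state it explicitly), and `feature_length` is a hypothetical helper name, not part of any database tool.

```python
def feature_length(start: int, end: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates copied from the Location field of this record.
print(feature_length(13903942, 13904559))  # 618
```

Under that assumption the gene spans 618 bp on the plus strand of ClcChr03.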
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

ATGAACGCAGTGCACGCGATGAACTCGACAGCATTTGTAGACTGCCCGCAATGTGGTGGGGGGCACTTGTATGACATGTGTATCTACAACCTCCAGTTTATCTGCAGTATTCAAAACAACCGCTACGACAATTTGGGTTGAAGGAACCACCTTAACTTCGGATGGGGTGGGAATCAACAACAAACGCAGAAGGCTGAGCAGCAGCCGCAGAGAGGTAATCCTCCTGGTTTCAACCAATGGAATCAAGGCTGGTTTCATCAGTATTTGAGGGATCCGCAAGCGGACGAATTAGCCTCATTGTCTTCCCTGGAATCTCTTTTACGAAAAGAAAACAGTAAAATTGAAGCAACGCGCCAGTTGGACGAAGCGATGCTCCAGAACCAAGCCGCTGCGATTCGCAATATAGAAATTCACGTGAATCAAATTGCAGAGGAGCTCGAAAATGAAAATCAAGAAGCATTGCTAAGTACGACCGAAGTTCCAAGAGGGAACTTAGGAGAACAATGTCAAGGTGAGACATTGTTAAGTGGGAAAATCAATCCCACAGTGCATAATGAGTGGTCGCAAGGAAACTTGCAACCCCTAAATGTCACACCCCCTCCCAGATTACCCACCTAA

mRNA sequence

ATGAACGCAGTGCACGCGATGAACTCGACAGCATTTGTAGACTGCCCGCAATGTGGTGGGGGGCACTTGAACCACCTTAACTTCGGATGGGGTGGGAATCAACAACAAACGCAGAAGGCTGAGCAGCAGCCGCAGAGAGGTAATCCTCCTGGTTTCAACCAATGGAATCAAGGCTGGTTTCATCAGTATTTGAGGGATCCGCAAGCGGACGAATTAGCCTCATTGTCTTCCCTGGAATCTCTTTTACGAAAAGAAAACAGTAAAATTGAAGCAACGCGCCAGTTGGACGAAGCGATGCTCCAGAACCAAGCCGCTGCGATTCGCAATATAGAAATTCACGTGAATCAAATTGCAGAGGAGCTCGAAAATGAAAATCAAGAAGCATTGCTAAGTACGACCGAAGTTCCAAGAGGGAACTTAGGAGAACAATGTCAAGGTGAGACATTGTTAAGTGGGAAAATCAATCCCACAGTGCATAATGAGTGGTCGCAAGGAAACTTGCAACCCCTAAATGTCACACCCCCTCCCAGATTACCCACCTAA

Coding sequence (CDS)

ATGAACGCAGTGCACGCGATGAACTCGACAGCATTTGTAGACTGCCCGCAATGTGGTGGGGGGCACTTGAACCACCTTAACTTCGGATGGGGTGGGAATCAACAACAAACGCAGAAGGCTGAGCAGCAGCCGCAGAGAGGTAATCCTCCTGGTTTCAACCAATGGAATCAAGGCTGGTTTCATCAGTATTTGAGGGATCCGCAAGCGGACGAATTAGCCTCATTGTCTTCCCTGGAATCTCTTTTACGAAAAGAAAACAGTAAAATTGAAGCAACGCGCCAGTTGGACGAAGCGATGCTCCAGAACCAAGCCGCTGCGATTCGCAATATAGAAATTCACGTGAATCAAATTGCAGAGGAGCTCGAAAATGAAAATCAAGAAGCATTGCTAAGTACGACCGAAGTTCCAAGAGGGAACTTAGGAGAACAATGTCAAGGTGAGACATTGTTAAGTGGGAAAATCAATCCCACAGTGCATAATGAGTGGTCGCAAGGAAACTTGCAACCCCTAAATGTCACACCCCCTCCCAGATTACCCACCTAA

Protein sequence

MNAVHAMNSTAFVDCPQCGGGHLNHLNFGWGGNQQQTQKAEQQPQRGNPPGFNQWNQGWFHQYLRDPQADELASLSSLESLLRKENSKIEATRQLDEAMLQNQAAAIRNIEIHVNQIAEELENENQEALLSTTEVPRGNLGEQCQGETLLSGKINPTVHNEWSQGNLQPLNVTPPPRLPT
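
The protein sequence corresponds to the CDS read codon by codon. As a minimal sketch, assuming the standard nuclear genetic code (the page does not name the translation table used), the hypothetical `translate` helper below converts a CDS into a peptide and stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS in frame 0 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First six codons of the CDS listed above.
print(translate("ATGAACGCAGTGCACGCG"))  # MNAVHA
```

Passing the full CDS string to `translate` gives a peptide that can be compared against the protein sequence shown in this record.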
Homology
BLAST of Clc03G10850 vs. NCBI nr
Match: WP_217833202.1 (hypothetical protein, partial [Synechococcus sp. PCC 7002])

HSP 1 Score: 127.5 bits (319), Expect = 1.2e-25
Identity = 78/156 (50.00%), Positives = 94/156 (60.26%), Query Frame = 0

Query: 1   MNAVHAMNSTAFVDCPQCGGGHL----------------------------NHLNFGWGG 60
           M+A++ +N+TA V C QCG GHL                            NHLNFGWG 
Sbjct: 35  MDAMNTVNATASVSCVQCGEGHLYDMCPYNPQSVCYVQNNPYAKTYNPGWRNHLNFGWGR 94

Query: 61  NQQQTQKAEQQPQRGNPPGFNQWNQGWFHQYLRDPQADELASLSSLESLLRKENSKIEAT 120
           NQQQTQ AEQQPQRGNPPGFNQ NQG +HQ+ RDPQAD   S  SLESLLR    +  +T
Sbjct: 95  NQQQTQGAEQQPQRGNPPGFNQGNQGQYHQFQRDPQADASTSFYSLESLLR----ECLST 154

Query: 121 RQLDEAMLQNQAAAIRNIEIHVNQIAEELENENQEA 129
           R   +AM Q+Q AA  N+E+++ Q+   LENE Q A
Sbjct: 155 R---DAMFQSQIAANPNLEVYMIQMIGALENERQGA 183
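
The percentages on an HSP statistics line are the match counts divided by the alignment length. The sketch below recomputes them from the raw counts; `hsp_percentages` and the regular expression are illustrative assumptions, not part of BLAST or of this database.

```python
import re

# Matches the statistics line of an HSP; "Posi?tives" also tolerates the
# "Postives" misspelling that some report generators emit.
STATS_RE = re.compile(
    r"Identity = (\d+)/(\d+) \([\d.]+%\), Posi?tives = (\d+)/(\d+) \([\d.]+%\)"
)


def hsp_percentages(line: str) -> tuple[float, float]:
    """Return (identity %, positives %) recomputed from the raw counts."""
    m = STATS_RE.search(line)
    if m is None:
        raise ValueError("not an HSP statistics line")
    ident, ident_len, pos, pos_len = map(int, m.groups())
    return round(100 * ident / ident_len, 2), round(100 * pos / pos_len, 2)


line = "Identity = 78/156 (50.00%), Positives = 94/156 (60.26%), Query Frame = 0"
print(hsp_percentages(line))  # (50.0, 60.26)
```

Note that the alignment length (156 here) includes gap columns, so it can be longer than the aligned stretch of the query.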

BLAST of Clc03G10850 vs. NCBI nr
Match: XP_038889363.1 (uncharacterized protein LOC120079279 [Benincasa hispida])

HSP 1 Score: 88.2 bits (217), Expect = 7.9e-14
Identity = 62/151 (41.06%), Positives = 80/151 (52.98%), Query Frame = 0

Query: 5   HAMNSTAFVDCPQCGGGHLNHLNFGWGGNQQQTQKAEQQ----PQRGNPPGFNQWNQGWF 64
           +A    A + C QCGGGH NH NFGWGGN  Q   +  Q      RGN P F+Q NQ   
Sbjct: 246 NAPAQVAAISCVQCGGGHANHPNFGWGGNHNQGGPSNHQSNNFENRGNSPPFHQ-NQNQG 305

Query: 65  HQYLRDPQADEL-------ASLSSLESLLRKENSKIEATRQLDEAMLQNQAAAIRNIEIH 124
           HQ    PQ   L       A+ SSLESLL++   K       ++ ++Q+Q ++IRN+EI 
Sbjct: 306 HQ----PQPQNLPSSSNTSANSSSLESLLKQYIEK-------NDVVMQSQVSSIRNLEIQ 365

Query: 125 VNQIAEELENENQEALLSTTEVPRGNLGEQC 145
           V Q+A EL N     L S +E P  +  EQC
Sbjct: 366 VGQLATELRNRTPGTLPSNSEAPGSHGKEQC 384

BLAST of Clc03G10850 vs. NCBI nr
Match: XP_030510138.1 (uncharacterized protein LOC115724905 [Cannabis sativa])

HSP 1 Score: 76.6 bits (187), Expect = 2.4e-10
Identity = 49/131 (37.40%), Positives = 74/131 (56.49%), Query Frame = 0

Query: 24  NHLNFGWGGNQQQTQKAEQQPQRGNPPGFNQWNQGWFHQYLRDPQADELASLSSLESLLR 83
           +H NF WGG    +  A+ Q ++  PPGF+Q            P   + +  SSLESL+R
Sbjct: 300 HHPNFSWGGQGASSSGAQGQGKQSFPPGFSQ-----------QPHQPQGSQTSSLESLMR 359

Query: 84  KENSKIEATRQLDEAMLQNQAAAIRNIEIHVNQIAEELENENQEALLSTTEVPRGNLGEQ 143
              +K       ++A++Q+QAA++RN+E+ + Q+A +L+N  Q  L S TE PR +  E 
Sbjct: 360 DYMAK-------NDAVIQSQAASLRNLEVQLGQLANDLKNRPQGTLPSDTENPRRDCKEH 412

Query: 144 CQGETLLSGKI 155
           C+  TL SGKI
Sbjct: 420 CKAVTLRSGKI 412

BLAST of Clc03G10850 vs. NCBI nr
Match: XP_030503898.1 (uncharacterized protein LOC115719117 [Cannabis sativa])

HSP 1 Score: 76.3 bits (186), Expect = 3.1e-10
Identity = 52/149 (34.90%), Positives = 86/149 (57.72%), Query Frame = 0

Query: 24  NHLNFGWGGNQQQTQKAEQQPQRGNPPGFNQWNQGWFHQYLRDPQADELASLSSLESLLR 83
           +H NF WGG    +  A+ Q ++  PPGF+Q  +       + P   + +  SSLESL+R
Sbjct: 296 HHPNFSWGGQGASSSGAQGQGKQSFPPGFSQQPRP------QQPHQPQGSQTSSLESLMR 355

Query: 84  KENSKIEATRQLDEAMLQNQAAAIRNIEIHVNQIAEELENENQEALLSTTEVPRGNLGEQ 143
              +K       ++A++Q+QAA++RN+E+ + Q+A +L+N  Q  L S TE PR +  E 
Sbjct: 356 DYMAK-------NDAVIQSQAASLRNLEVQLGQLANDLKNRPQGTLPSDTENPRRDGKEH 415

Query: 144 CQGETLLSGKINPTVHNEWSQGNLQPLNV 173
           C+  TL SGKI  +  N  ++G+ +P ++
Sbjct: 416 CKAVTLRSGKIIES--NVAAKGSKEPSSI 429

BLAST of Clc03G10850 vs. NCBI nr
Match: XP_038885789.1 (uncharacterized protein LOC120076081 [Benincasa hispida])

HSP 1 Score: 74.3 bits (181), Expect = 1.2e-09
Identity = 56/142 (39.44%), Positives = 77/142 (54.23%), Query Frame = 0

Query: 21  GHLNHLNFGWGGNQQQTQKAEQQ----PQRGNPPGFNQWNQGWFHQYLRDPQ-----ADE 80
           G  NH NF WGGN  Q  +   Q      RGNPP F   NQ   +     P+     ++ 
Sbjct: 77  GWRNHQNFSWGGNHNQGGQRNHQNSHSENRGNPPVFLHQNQSQGNFKSIQPRDQPSSSNT 136

Query: 81  LASLSSLESLLRKENSKIEATRQLDEAMLQNQAAAIRNIEIHVNQIAEELENENQEALLS 140
            +S SSLE+LL++   K EA R       Q+QA++IRN+E+ + Q+A EL+N+    L S
Sbjct: 137 SSSTSSLETLLKQYIKKNEAIR-------QSQASSIRNLEVQIGQLAIELKNKAPGTLPS 196

Query: 141 TTEVPRGNLGEQCQGETLLSGK 154
           ++E P  +  EQCQ  TL SGK
Sbjct: 197 SSEAPGRSGKEQCQALTLRSGK 211

BLAST of Clc03G10850 vs. ExPASy TrEMBL
Match: A0A6J1DWK1 (uncharacterized protein LOC111025053 OS=Momordica charantia OX=3673 GN=LOC111025053 PE=4 SV=1)

HSP 1 Score: 67.0 bits (162), Expect = 9.1e-08
Identity = 50/141 (35.46%), Positives = 74/141 (52.48%), Query Frame = 0

Query: 24  NHLNFGWGGNQQQ-----TQKAEQQPQRGNPPGFNQWNQGWFHQYLRDPQADELASLSSL 83
           NH NFGW GNQ       +     Q +   PPGF    Q   H   +        S++SL
Sbjct: 199 NHPNFGWSGNQGGHNTGISNAPTFQQKVSYPPGFAYQGQMVEHNQSK-------GSITSL 258

Query: 84  ESLLRKENSKIEATRQLDEAMLQNQAAAIRNIEIHVNQIAEELENENQEALLSTTEVPRG 143
           E+++++  +  +AT       +Q+QAA++RN+E+ V Q+A +L++    AL S TEVP+ 
Sbjct: 259 ENIMKQYMANNDAT-------VQSQAASLRNLELQVGQLAMDLKSRPVGALPSDTEVPKR 318

Query: 144 NLGEQCQGETLLSGKINPTVH 160
           +  EQC   TL SGK  P  H
Sbjct: 319 DSKEQCNALTLRSGKALPPTH 325

BLAST of Clc03G10850 vs. ExPASy TrEMBL
Match: A0A5B6VWJ0 (Retroelement pol polyprotein-like OS=Gossypium australe OX=47621 GN=EPI10_024080 PE=4 SV=1)

HSP 1 Score: 60.5 bits (145), Expect = 8.5e-06
Identity = 57/155 (36.77%), Positives = 75/155 (48.39%), Query Frame = 0

Query: 24  NHLNFGWGGNQQQTQKAEQQPQRGNPPGFNQWNQGWFHQYLRDPQADELASLSSLESLLR 83
           NHL+F W      T     QP+    P F Q  Q          +  +  + +SLESLL+
Sbjct: 237 NHLDFSWSNQGAGTSTVYTQPRPTQLPNFPQQVQ----------KLVQAKASNSLESLLK 296

Query: 84  KENSKIEATRQLDEAMLQNQAAAIRNIEIHVNQIAEELENENQEALLSTTEVPRGNLG-E 143
              +K       ++A++Q+QAA ++N+E  V Q+A EL N  Q AL S TE PR NLG E
Sbjct: 297 TYMAK-------NDALIQSQAATLKNLENQVGQLATELRNRLQGALPSDTENPR-NLGKE 356

Query: 144 QCQGETLLSGKI--NPTVHNEWSQGNLQPLNVTPP 176
            C+  TL S KI    TV  E  Q N Q      P
Sbjct: 357 HCKALTLRSEKIIEPNTVEVEKEQANAQDAEEVQP 373

BLAST of Clc03G10850 vs. ExPASy TrEMBL
Match: A0A6J1BDW4 (uncharacterized protein LOC110426584 OS=Herrania umbratica OX=108875 GN=LOC110426584 PE=4 SV=1)

HSP 1 Score: 57.4 bits (137), Expect = 7.2e-05
Identity = 51/134 (38.06%), Positives = 66/134 (49.25%), Query Frame = 0

Query: 21  GHLNHLNFGWGGNQQQTQKAEQQPQRGNPPGFNQWNQGWFHQYLRDPQADELASLSSLES 80
           G  NH NF W  N          P+   PPG        FHQ  R PQ  E    S LE 
Sbjct: 96  GWRNHPNFSWSNN-----AGPSNPKPIMPPG--------FHQQAR-PQISE--KKSQLEE 155

Query: 81  LLRKENSKIEATRQLDEAMLQNQAAAIRNIEIHVNQIAEELENENQEALLSTTEV-PRGN 140
           LL +  SK        +A++Q+Q A++RN+E  V Q+A  + N  Q +L S T++ P+G 
Sbjct: 156 LLLQYISK-------TDAIIQSQGASLRNLETQVGQLANSINNRPQGSLPSDTQINPKGK 204

Query: 141 LGEQCQGETLLSGK 154
             EQCQ  TL SGK
Sbjct: 216 --EQCQAITLRSGK 204

BLAST of Clc03G10850 vs. ExPASy TrEMBL
Match: A0A6J1AB81 (uncharacterized protein LOC110416390 OS=Herrania umbratica OX=108875 GN=LOC110416390 PE=4 SV=1)

HSP 1 Score: 55.1 bits (131), Expect = 3.6e-04
Identity = 57/185 (30.81%), Positives = 79/185 (42.70%), Query Frame = 0

Query: 21  GHLNHLNFGWGGNQQQTQKAEQQPQRGNPPGFNQWNQGWFHQYLRDPQADELASLSSLES 80
           G  NH NF W  N          P+   PPGF Q  +         PQ  E    S LE 
Sbjct: 39  GWRNHPNFSWSNN-----AGPSNPKPIMPPGFQQQAR---------PQIPE--KKSQLEE 98

Query: 81  LLRKENSKIEATRQLDEAMLQNQAAAIRNIEIHVNQIAEELENENQEALLSTTEV-PRGN 140
           LL +  SK        +A++Q+Q A++RN+E  V Q+A  + N  Q +L S T++ P+G 
Sbjct: 99  LLLQYISK-------TDAIIQSQGASLRNLETQVGQLANSINNRPQGSLPSDTQINPKGK 158

Query: 141 LGEQCQGETLLSGKINPTVHNEWSQGNL-------------------------QPLNVTP 180
             EQCQ  TL SGK    V+ +  +  +                         +PLN+ P
Sbjct: 159 --EQCQAITLRSGKEIEGVNQKAVESEIEHVDKEECVRRRIESTKDVTRLKIKEPLNLHP 198

BLAST of Clc03G10850 vs. ExPASy TrEMBL
Match: A0A6J0ZX64 (LOW QUALITY PROTEIN: uncharacterized protein LOC110412945 OS=Herrania umbratica OX=108875 GN=LOC110412945 PE=4 SV=1)

HSP 1 Score: 55.1 bits (131), Expect = 3.6e-04
Identity = 49/134 (36.57%), Positives = 65/134 (48.51%), Query Frame = 0

Query: 21  GHLNHLNFGWGGNQQQTQKAEQQPQRGNPPGFNQWNQGWFHQYLRDPQADELASLSSLES 80
           G  NH NF W  N          P+   PPGF Q  +         PQ  E    S LE 
Sbjct: 353 GWRNHPNFSWSNN-----AGPSNPKPIMPPGFQQQAR---------PQIPE--KKSQLEE 412

Query: 81  LLRKENSKIEATRQLDEAMLQNQAAAIRNIEIHVNQIAEELENENQEALLSTTEV-PRGN 140
           LL +  SK        +A++Q+Q A++RN+E  V Q+A  + N  Q +L S T++ P+G 
Sbjct: 413 LLLQYISK-------TDAIIQSQGASLRNLETQVGQLANSINNRPQGSLPSDTQINPKGK 461

Query: 141 LGEQCQGETLLSGK 154
             EQCQ  TL SGK
Sbjct: 473 --EQCQAITLRSGK 461

The following BLAST results are available for this feature:

BLAST of Clc03G10850 vs. NCBI nr
Match Name      E-value  Identity (%)  Description
WP_217833202.1  1.2e-25  50.00         hypothetical protein, partial [Synechococcus sp. PCC 7002]
XP_038889363.1  7.9e-14  41.06         uncharacterized protein LOC120079279 [Benincasa hispida]
XP_030510138.1  2.4e-10  37.40         uncharacterized protein LOC115724905 [Cannabis sativa]
XP_030503898.1  3.1e-10  34.90         uncharacterized protein LOC115719117 [Cannabis sativa]
XP_038885789.1  1.2e-09  39.44         uncharacterized protein LOC120076081 [Benincasa hispida]

BLAST of Clc03G10850 vs. ExPASy TrEMBL
Match Name      E-value  Identity (%)  Description
A0A6J1DWK1      9.1e-08  35.46         uncharacterized protein LOC111025053 OS=Momordica charantia OX=3673 GN=LOC111025...
A0A5B6VWJ0      8.5e-06  36.77         Retroelement pol polyprotein-like OS=Gossypium australe OX=47621 GN=EPI10_024080...
A0A6J1BDW4      7.2e-05  38.06         uncharacterized protein LOC110426584 OS=Herrania umbratica OX=108875 GN=LOC11042...
A0A6J1AB81      3.6e-04  30.81         uncharacterized protein LOC110416390 OS=Herrania umbratica OX=108875 GN=LOC11041...
A0A6J0ZX64      3.6e-04  36.57         LOW QUALITY PROTEIN: uncharacterized protein LOC110412945 OS=Herrania umbratica ...

InterPro
Analysis Name: InterPro Annotations of Watermelon (cordophanus) v2
Date Performed: 2022-01-31
IPR Term  IPR Description   Source       Source Term  Source Description   Alignment
None      No IPR available  COILS        Coil         Coil                 coord: 107..127
None      No IPR available  COILS        Coil         Coil                 coord: 75..95
None      No IPR available  MOBIDB_LITE  mobidb-lite  disorder_prediction  coord: 156..171
None      No IPR available  MOBIDB_LITE  mobidb-lite  disorder_prediction  coord: 29..56
None      No IPR available  MOBIDB_LITE  mobidb-lite  disorder_prediction  coord: 156..180
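
The `coord` values locate each predicted region on the protein sequence. A minimal sketch for pulling out such a region follows, assuming the coordinates are 1-based and inclusive (an assumption; the page does not say so); `region` is a hypothetical helper name.

```python
def region(protein: str, start: int, end: int) -> str:
    """Return residues start..end, assuming 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError("coordinates fall outside the sequence")
    return protein[start - 1:end]


# Toy sequence to show the indexing convention.
print(region("ABCDEFGHIJ", 3, 5))  # CDE
```

Applied to the protein sequence of this record, `region(protein, 107, 127)` would return the 21 residues covered by the first COILS prediction.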

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name   Unique Name    Type
Clc03G10850.1  Clc03G10850.1  mRNA