ClCG01G010990 (gene) Watermelon (Charleston Gray)

Name: ClCG01G010990
Type: gene
Organism: Citrullus lanatus (Watermelon (Charleston Gray))
Description: Copia-like retroelement pol polyprotein
Location: CG_Chr01 : 17259108 .. 17262748 (-)
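
As a rough illustration of how the Location field can be read, the sketch below computes the span of the feature and shows a reverse-complement helper for minus-strand features. It assumes the coordinates are 1-based and inclusive, which the page does not state; the function names are hypothetical.

```python
# Hypothetical helpers; coordinates are copied from the Location field.
# Assumption: 1-based, inclusive coordinates (not stated on the page).

def span_length(start: int, end: int) -> int:
    """Number of bases covered by an inclusive, 1-based interval."""
    return end - start + 1


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string, e.g. to read a minus-strand feature 5'->3'."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


gene_length = span_length(17259108, 17262748)
print(gene_length)  # 3641 under the inclusive-coordinate assumption
```

If the coordinates were half-open instead, the span would be one base shorter, so the convention is worth confirming against the source annotation file before relying on this number.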
The following sequences are available for this feature:

Gene sequence (with intron)

ATGTTTCTGAGAGAGAAGTTCTTCACATTCAAGATGGATTCGGCTAGATCACTTACTGAAAATTTAGATGAGTTTAAGAAGATAGTAATTGAATTCAAGAGTCTTGGTGAGAAACTAGGGGATGACAATGAAGCCTATGTTCTATTGAATTCCCTTCCAGAAGCATATGAAGAGGTAAAGAATGCTTTAAAATACGACAGAGAGTCAATTACCACTGGTGCAATAATTTCAGCTCTTAAAATCAATGAATTAGAGCTTCATGCTACTAAGAAAGAGCAGCCTAGTGTGGAAAGGCTATTTGTTAAGAGCAAGGACAAACCAAATCAGTCTAAGGGTGGTAAACAACAGTCTAATGAGGATCATAAGCTTAAGACGAAGATAAGGTGCAACTATCGCAAGAAGAAGGGCCACCTCATGAGAGATTGCTACAGCTTAAAGAGGAAAAATCAAGAAAAAGAGAAAGATGCTAAGGGGAAACAACCTAAAGCTTCTATAGTGGAAGGCTCTTATTTCTACTCTGATGCACTAGCTTCTACAAGAGACAAGGCCAACCAAGTAAGTTCTCTTGGGAAACATGATTGGTGATAACTTGTAGAAATACAAGTTATGATAGTCTTTTTCTTAGATAAATAGAGATTAAATAGAGAAGCATGCGGTAAGTTATGTTAATTTTTATTAAGAAATCTTCCTTTACGTTATACTAATGTATTTTTATGAAGATAAATGATTTTACCACTTTTTGAGCCTAAAACACATGTTTTGATACAGGAATGCAAACGATGCGTTGATGACTTCTTAACGCAGGAACTAGACGATGCGTTAACAACAACTCAACGCATAACTACGCGTTAGCAACCTACGCCGCGTTGACATCGGCCAAACTATTCATGGACACATAGACGACCATTAATGCAAGGCGCTGAGAATCAAATCAGAATCTTGAGAGATTGACGGATGATTGACGAACATGATTTGTGGCTAAAGTGGTGGATCTTAATTGACCCGAGAATCACGCAATCAGGCGAGATGGACGCAGAGTATCAATCTCGCCGCCAATCAAGTAGATTTGCCATTAATTGCGGACGCATCATCTATAAAAGGGCAGCTCTGCGAGATGAGAAAGGGTTGTTGATTCTCGGAAGAACTCCATAAGTGACAGCAGAGTTCTCTCAGACTTCAGCCGGAGACGAATAGCTCGAGAGAGAAGAGTCTTCCCTCCGCCGGACCACTTCACAGCAGAACCTCACGCTTCCATCCGGGCGACTCAAGACATTGACGCCTTCCCTGCTTTCTTTCTTATTGTTTCTATCTTAATTTTAGATTTGTGTCAAGACATTGAACTTATCGAAAATTGTATCCCATTTCATCAATATAATTATCTTCCATGTCTACTCATTTCTCTAAACTTTCCGCTTTTTACTTTCAAACCATGAGTAACTAATCACTCAAGGGTCTAGGGTTGCGTTAGCATCTAACTAGGAACCATAGTGTAACTATTTCATCTTGTCAATTGCGATCTTAATCCGATGCTTATGTCCATCTGACAAATCGATATAAGTAGTAATAATCAAACTGCTCGAGAGGGTAAGTGATTGTGGAACTCAATTGACAAGGGCCAAGAAGTGTTACCTGTCTATGGGGATAACCTTGGTTCTTAACACTCTCCTATGTGTTATAGACATATAACGCATAGCGGCTGACAACCGGGAGAAGAGTCGCAAGTATGAATAGAATACTACTATGCGCTTCCGACGTTAGATGTTAATCAACCCGAACCCTTTCTTTATCCATTCATTCACATCTTTAGTTGAATTTGCTGTGTCGCATGCTTCCCATCTTTCACAATTTTAGTCTAGTGTACAAACAATCACTTTCCGTAGATTAAAAACAACTCTCTCAACTATTTCTTGGTTACCACTGCAACGCATGAAATCCACATAAATACTAAATCATCATAATCCCTGTGTTCGACCTCAGACTCACCGAGAAAACCTATTCCT
TTGCTTATACTTGGGTAAAGGATAGGAAAACTTGACAAGATCACCAACGCATAGTAATAGGACGAATAGCATGTTCAAAGCATGATATCTTGAACGCATAATCACATCATTTAGCACCTTGCGTTAGTACAATTTACACATACAATTACATCACACAACAACATGACACCCTTTAAGTTATGTTTCAACACTTAGAAAGAAGCTAATGGAGAATTAGTCTACATGGGCAACAATGAATCACGTAAAATCCAAGGAATTGGGTCAATCTAATTAAGATTGAAGGATGGGACAGTGAAGTTGCTTAGAATTGTCAAACATGTACCTATGCTAAAGAAGAATTTGATTTCCTTGGTTAAAAGGATTATTGAAAGACTTTGGCATACAACAAAACACAGTGAAGATATTTTGTGACAATCAAAGCACCATACATCTTTCAAAGAATCCTCAGTATCATACAAGAACAAAGCATATAGACATAAAGTACCATTTCGTGAGAGATAAAATAGAAGGAGGAGAGGTAGAAGTGCTGAAAGTTCATACCTCTGAAAATGTTGCCGACATGCTAACCAAACTAGTGTTGAAGCTAAAGCTGCTCAAGTGTCTCGAGCTGATCAACTTTGACCTGCCAGAGAAAGGGTAAAAATGGTTAGATCAAGAGGAAAAGAGTTTGCATGTGTCAGATTCAAGGTGGAGATTTATAAAAATGTGTGAATCCAACAGTAGGTTTTATGTTTCTTTCTGTTACAACCAAAAATTAACAACTTGTAACTGATTTGTATGAGTTATAAATACAAATTAGAAATAGAAATCAAGGCTACTCACATACATAGAAATTATAAAAGCTGTAAAGAGAGAAATAAAAGACTTATCTGCAAGTTTTCCTCCATGGATGTAGGCATTTTGGCCGAACCACGTATATCTTGTGTGTTCATCATCATCTTCTTCTTCCCTTTCTTTATTTTATACATGCAACAAATCTTTTCTTCAGTTTTGACCTCAAAATCGCCATTGATATTCAGAAGAAAAGAAAAGAATGAAGGGTTAAGGCTGAGAGCAAGAGAATCAGAAAGCAAAGAGGGAAAGAAAAGTGAAAAAATAAACTTTCAGATCAATTTTTTCTTTCTTTCAAAATCGGTAGTAGAGTGCTTCCACGTGGTTGACATTCAAAGCATTGGCGTGGTTGATTTTTAATCTTTTATTTTACCTACCTCAAGCCCACTTCCTTTTTCGAGCTTCTTTATACTATTTTGGGCCTAAAAATTTGCATTCGAGCCCAACATCTTCCTCTGTTAATGTCGCCCAAAAAAAGGACAGGCAAGCTCGAATTCTTCTCCCTCAAGCCCAAGTCGAAACTCTGGCAACATTTTATCTAAATAGAGGGGACGAGGAAGAGGTTGTGGGCGATGGAACAACAGGTCGATTTGCTAGGTCTGTGGCAAGGTTGGTCACACAGCCGCTATCTGCTATAATCGCTTCAACAAAGAATTCAATAATCCTAGTTAGAATCAATAAAAAGATGGCCCAAACTCTCAGAAAAATTACAATCTTAATCCACTCTTTCGAGGAGCACCCAACACTTATGTGGCAAACTCGGTCATGGCTACGCCTGAGACTATCATCAACGCCCAACTGCTAAGCTGA

mRNA sequence

ATGTTTCTGAGAGAGAAGTTCTTCACATTCAAGATGGATTCGGCTAGATCACTTACTGAAAATTTAGATGAGTTTAAGAAGATAGTAATTGAATTCAAGAGTCTTGGTGAGAAACTAGGGGATGACAATGAAGCCTATGTTCTATTGAATTCCCTTCCAGAAGCATATGAAGAGGTAAAGAATGCTTTAAAATACGACAGAGAGTCAATTACCACTGGTGCAATAATTTCAGCTCTTAAAATCAATGAATTAGAGCTTCATGCTACTAAGAAAGAGCAGCCTAGTGTGGAAAGGCTATTTGTTAAGAGCAAGGACAAACCAAATCAGTCTAAGGGTGGTAAACAACAGTCTAATGAGGATCATAAGCTTAAGACGAAGATAAGGTGCAACTATCGCAAGAAGAAGGGCCACCTCATGAGAGATTGCTACAGCTTAAAGAGGAAAAATCAAGAAAAAGAGAAAGATGCTAAGGGGAAACAACCTAAAGCTTCTATAGTGGAAGGCTCTTATTTCTACTCTGATGCACTAGCTTCTACAAGAGACAAGGCCAACCAAGCGAGATGGACGCAGAGTATCAATCTCGCCGCCAATCAAGTAGATTTGCCATTAATTGCGGACGCATCATCTATAAAAGGGCAGCTCTGCGAGATGAGAAAGGACTTCAGCCGGAGACGAATAGCTCGAGAGAGAAGAGTCTTCCCTCCGCCGGACCACTTCACAGCAGAACCTCACGCTTCCATCCGGGCGACTCAAGACATTGACGCCTTCCCTGCTTTCTTTCTTATTAATCCTCAGTATCATACAAGAACAAAGCATATAGACATAAAGTACCATTTCGTGAGAGATAAAATAGAAGGAGGAGAGGTAGAAGTGCTGAAAGTTCATACCTCTGAAAATGTTGCCGACATGCTAACCAAACTAGTGTTGAAGCTAAAGCTGCTCAAGTGTCTCGAGCTGATCAACTTTGACCTGCCAGAGAAAGGGCAAGCTCGAATTCTTCTCCCTCAAGCCCAAGTCGAAACTCTGGCAACATTTTATCTAAATAGAGGGGACGAGGAAGAGGTTGTGGGCGATGGAACAACAGGTCGATTTGCTAGGTCTGTGGCAAGGTTGGTCACACAGCCGCTATCTGCTATAATCGCTTCAACAAAGAATTCAATAATCCTAAAAAATTACAATCTTAATCCACTCTTTCGAGGAGCACCCAACACTTATGTGGCAAACTCGGTCATGGCTACGCCTGAGACTATCATCAACGCCCAACTGCTAAGCTGA

Coding sequence (CDS)

ATGTTTCTGAGAGAGAAGTTCTTCACATTCAAGATGGATTCGGCTAGATCACTTACTGAAAATTTAGATGAGTTTAAGAAGATAGTAATTGAATTCAAGAGTCTTGGTGAGAAACTAGGGGATGACAATGAAGCCTATGTTCTATTGAATTCCCTTCCAGAAGCATATGAAGAGGTAAAGAATGCTTTAAAATACGACAGAGAGTCAATTACCACTGGTGCAATAATTTCAGCTCTTAAAATCAATGAATTAGAGCTTCATGCTACTAAGAAAGAGCAGCCTAGTGTGGAAAGGCTATTTGTTAAGAGCAAGGACAAACCAAATCAGTCTAAGGGTGGTAAACAACAGTCTAATGAGGATCATAAGCTTAAGACGAAGATAAGGTGCAACTATCGCAAGAAGAAGGGCCACCTCATGAGAGATTGCTACAGCTTAAAGAGGAAAAATCAAGAAAAAGAGAAAGATGCTAAGGGGAAACAACCTAAAGCTTCTATAGTGGAAGGCTCTTATTTCTACTCTGATGCACTAGCTTCTACAAGAGACAAGGCCAACCAAGCGAGATGGACGCAGAGTATCAATCTCGCCGCCAATCAAGTAGATTTGCCATTAATTGCGGACGCATCATCTATAAAAGGGCAGCTCTGCGAGATGAGAAAGGACTTCAGCCGGAGACGAATAGCTCGAGAGAGAAGAGTCTTCCCTCCGCCGGACCACTTCACAGCAGAACCTCACGCTTCCATCCGGGCGACTCAAGACATTGACGCCTTCCCTGCTTTCTTTCTTATTAATCCTCAGTATCATACAAGAACAAAGCATATAGACATAAAGTACCATTTCGTGAGAGATAAAATAGAAGGAGGAGAGGTAGAAGTGCTGAAAGTTCATACCTCTGAAAATGTTGCCGACATGCTAACCAAACTAGTGTTGAAGCTAAAGCTGCTCAAGTGTCTCGAGCTGATCAACTTTGACCTGCCAGAGAAAGGGCAAGCTCGAATTCTTCTCCCTCAAGCCCAAGTCGAAACTCTGGCAACATTTTATCTAAATAGAGGGGACGAGGAAGAGGTTGTGGGCGATGGAACAACAGGTCGATTTGCTAGGTCTGTGGCAAGGTTGGTCACACAGCCGCTATCTGCTATAATCGCTTCAACAAAGAATTCAATAATCCTAAAAAATTACAATCTTAATCCACTCTTTCGAGGAGCACCCAACACTTATGTGGCAAACTCGGTCATGGCTACGCCTGAGACTATCATCAACGCCCAACTGCTAAGCTGA

Protein sequence

MFLREKFFTFKMDSARSLTENLDEFKKIVIEFKSLGEKLGDDNEAYVLLNSLPEAYEEVKNALKYDRESITTGAIISALKINELELHATKKEQPSVERLFVKSKDKPNQSKGGKQQSNEDHKLKTKIRCNYRKKKGHLMRDCYSLKRKNQEKEKDAKGKQPKASIVEGSYFYSDALASTRDKANQARWTQSINLAANQVDLPLIADASSIKGQLCEMRKDFSRRRIARERRVFPPPDHFTAEPHASIRATQDIDAFPAFFLINPQYHTRTKHIDIKYHFVRDKIEGGEVEVLKVHTSENVADMLTKLVLKLKLLKCLELINFDLPEKGQARILLPQAQVETLATFYLNRGDEEEVVGDGTTGRFARSVARLVTQPLSAIIASTKNSIILKNYNLNPLFRGAPNTYVANSVMATPETIINAQLLS
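
The relationship between the coding sequence and the protein sequence can be checked with a small translation sketch. It assumes the standard genetic code (NCBI translation table 1), which the page does not state explicitly; `translate` and `cds_prefix` are hypothetical names, and only the first 42 bases of the CDS are used here.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with
# bases T, C, A, G and the third position varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS in frame 1, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 42 bases of the coding sequence listed above.
cds_prefix = "ATGTTTCTGAGAGAGAAGTTCTTCACATTCAAGATGGATTCG"
print(translate(cds_prefix))  # MFLREKFFTFKMDS
```

The printed peptide matches the first 14 residues of the protein sequence, which is consistent with the protein being a direct frame-1 translation of the CDS.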
BLAST of ClCG01G010990 vs. Swiss-Prot
Match: POLX_TOBAC (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum PE=2 SV=1)

HSP 1 Score: 68.6 bits (166), Expect = 1.9e-10
Identity = 29/58 (50.00%), Positives = 41/58 (70.69%), Query Frame = 1

Query: 263  NPQYHTRTKHIDIKYHFVRDKIEGGEVEVLKVHTSENVADMLTKLVLKLKLLKCLELI 321
            N  YH RTKHID++YH++R+ ++   ++VLK+ T+EN ADMLTK+V + K   C EL+
Sbjct: 1266 NSMYHARTKHIDVRYHWIREMVDDESLKVLKISTNENPADMLTKVVPRNKFELCKELV 1323


HSP 2 Score: 63.2 bits (152), Expect = 8.1e-09
Identity = 39/178 (21.91%), Positives = 83/178 (46.63%), Query Frame = 1

Query: 1   MFLREKFFTFKMDSARSLTENLDEFKKIVIEFKSLGEKLGDDNEAYVLLNSLPEAYEEVK 60
           ++L+++ +   M    +   +L+ F  ++ +  +LG K+ ++++A +LLNSLP +Y+ + 
Sbjct: 101 LYLKKQLYALHMSEGTNFLSHLNVFNGLITQLANLGVKIEEEDKAILLLNSLPSSYDNLA 160

Query: 61  NALKYDRESITTGAIISALKINELELHATKKEQPSVERLFVKSKDKPNQSKG-------- 120
             + + + +I    + SAL +NE      KK +   + L  + + +  Q           
Sbjct: 161 TTILHGKTTIELKDVTSALLLNE---KMRKKPENQGQALITEGRGRSYQRSSNNYGRSGA 220

Query: 121 -GKQQSNEDHKLKTKIRCNYRKKKGHLMRDCYSLKRKNQEKEKDAKGKQPKASIVEGS 170
            GK ++    +++    CN   + GH  RDC +  RK + +    K     A++V+ +
Sbjct: 221 RGKSKNRSKSRVRNCYNCN---QPGHFKRDCPN-PRKGKGETSGQKNDDNTAAMVQNN 271
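
The percentages on each HSP statistics line are simply the listed fractions expressed over the alignment length. A minimal sketch that recomputes them from such a line follows; the function name is hypothetical and the regular expression is an assumption about the line layout shown on this page, not a general BLAST parser.

```python
import re

# Matches "<label> = <count>/<alignment length>" pairs on an HSP statistics line.
FRACTION = re.compile(r"(\w+) = (\d+)/(\d+)")


def hsp_percentages(stats_line: str) -> dict:
    """Recompute each percentage as 100 * count / alignment length."""
    return {
        label: round(100 * int(count) / int(length), 2)
        for label, count, length in FRACTION.findall(stats_line)
    }


# Statistics line of HSP 1 for the POLX_TOBAC match.
line = "Identity = 29/58 (50.00%), Positives = 41/58 (70.69%), Query Frame = 1"
print(hsp_percentages(line))  # {'Identity': 50.0, 'Positives': 70.69}
```

Because the label is captured rather than hard-coded, the same function works on any of the statistics lines in this record.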

BLAST of ClCG01G010990 vs. Swiss-Prot
Match: COPIA_DROME (Copia protein OS=Drosophila melanogaster GN=GIP PE=1 SV=3)

HSP 1 Score: 56.6 bits (135), Expect = 7.6e-07
Identity = 38/166 (22.89%), Positives = 80/166 (48.19%), Query Frame = 1

Query: 3   LREKFFTFKMDSARSLTENLDEFKKIVIEFKSLGEKLGDDNEAYVLLNSLPEAYEEVKNA 62
           LR++  + K+ S  SL  +   F +++ E  + G K+ + ++   LL +LP  Y+ +  A
Sbjct: 99  LRKRLLSLKLSSEMSLLSHFHIFDELISELLAAGAKIEEMDKISHLLITLPSCYDGIITA 158

Query: 63  LK-YDRESITTGAIISALKINELELHATKKE-QPSVERLFVKSKDKPNQSKGGKQQSNED 122
           ++    E++T   + + L   E+++     +    V    V + +   ++   K +  + 
Sbjct: 159 IETLSEENLTLAFVKNRLLDQEIKIKNDHNDTSKKVMNAIVHNNNNTYKNNLFKNRVTKP 218

Query: 123 HKL-----KTKIRCNYRKKKGHLMRDCYSLKR----KNQEKEKDAK 158
            K+     K K++C++  ++GH+ +DC+  KR    KN+E EK  +
Sbjct: 219 KKIFKGNSKYKVKCHHCGREGHIKKDCFHYKRILNNKNKENEKQVQ 264


HSP 2 Score: 47.0 bits (110), Expect = 6.0e-04
Identity = 19/44 (43.18%), Positives = 26/44 (59.09%), Query Frame = 1

Query: 263  NPQYHTRTKHIDIKYHFVRDKIEGGEVEVLKVHTSENVADMLTK 307
            NP  H R KHIDIKYHF R++++   + +  + T   +AD+ TK
Sbjct: 1341 NPSCHKRAKHIDIKYHFAREQVQNNVICLEYIPTENQLADIFTK 1384

BLAST of ClCG01G010990 vs. TrEMBL
Match: Q02900_ARATH (Orf 1 (Fragment) OS=Arabidopsis thaliana PE=4 SV=1)

HSP 1 Score: 102.4 bits (254), Expect = 1.3e-18
Identity = 64/226 (28.32%), Positives = 117/226 (51.77%), Query Frame = 1

Query: 1   MFLREKFFTFKMDSARSLTENLDEFKKIVIEFKSLGEKLGDDNEAYVLLNSLPEAYEEVK 60
           ++++ KF++FKM+ ++S+ EN++EF KIV E  SL   + ++  A + LN L   Y ++K
Sbjct: 120 IYVQLKFYSFKMNDSKSINENVNEFLKIVAELSSLEINVVEEVRAILFLNGLSSRYSQLK 179

Query: 61  NALKYDRESITTGAIISALKINELELHATKKEQPSVER-LFVKSKDKP-----NQSKGGK 120
           + LKY  ++++   +IS+ +  E EL   K+   +    L+   + +P     NQ+KGG+
Sbjct: 180 HTLKYGNKALSLQDVISSARSLERELDEQKETDKNTSTVLYTNERGRPLTRNQNQNKGGQ 239

Query: 121 QQSNEDHKLKTKIRCNYRKKKGHLMRDCYSLKRKNQEKEKDAKGKQPKASIVEGSYFYSD 180
            +         K+ C Y KK+GH+ +DC++ KRK + +         +A ++     +S+
Sbjct: 240 GRGRSKSNSNAKLTCWYCKKEGHVKKDCFARKRKLESENPG------EAGVITEKLVFSE 299

Query: 181 ALASTRDKANQARWTQSINLAANQVDLPLIADASSIKGQLCEMRKD 221
           AL S  D A +  W         ++D    +  S+ K   C  R+D
Sbjct: 300 AL-SVNDLAVRDIW---------ELDSGCPSHMSARKDWFCNFRED 329

BLAST of ClCG01G010990 vs. TrEMBL
Match: Q1KUM3_9ROSI (Putative uncharacterized protein OS=Tarenaya spinosa PE=4 SV=1)

HSP 1 Score: 101.7 bits (252), Expect = 2.3e-18
Identity = 63/168 (37.50%), Positives = 100/168 (59.52%), Query Frame = 1

Query: 1   MFLREKFFTFKMDSARSLTENLDEFKKIVIEFKSLGEKLGDDNEAYVLLNSLPEAYEEVK 60
           M+L ++   F+MDS+R++ ENLD F+K++ +  SL  K+ ++ +A  LLNSLP AYE+++
Sbjct: 140 MYLMQRVSGFRMDSSRTIEENLDIFQKLLSDLHSLNVKVEEEYQAVYLLNSLPPAYEQLR 199

Query: 61  NALKYDRESITTGAIISALKINELELHAT-KKEQPSVERLFVKSKDKPNQSKGGKQQSNE 120
             LKY R +I+   + +A ++ ELEL A     + + E L VK   KP +S GGK+    
Sbjct: 200 EVLKYSRATISVEEVKAAARMKELELLAQGTLTRGTGEGLVVKG--KPEKSGGGKK---- 259

Query: 121 DHKLKTKIRCNYRKKKGHLMRDCYSLKRKNQEKEKDAKGKQPKASIVE 168
             K K ++ C Y  KKGH  ++C S + K     ++ +GK   AS+ E
Sbjct: 260 --KAKDQVECWYCGKKGHYKKECRSRRAK-----EETEGKGVVASVQE 294

BLAST of ClCG01G010990 vs. TrEMBL
Match: Q9SHR5_ARATH (F28L22.3 protein OS=Arabidopsis thaliana GN=F28L22.3 PE=4 SV=1)

HSP 1 Score: 101.3 bits (251), Expect = 3.0e-18
Identity = 57/181 (31.49%), Positives = 104/181 (57.46%), Query Frame = 1

Query: 1   MFLREKFFTFKMDSARSLTENLDEFKKIVIEFKSLGEKLGDDNEAYVLLNSLPEAYEEVK 60
           ++ + K ++FKM S  ++ +N+DEF +IV E  SL  ++ ++ +A ++LNSLP ++ ++K
Sbjct: 124 IYTQLKLYSFKMVSTMTIDQNVDEFLRIVAELGSLEIQVDEEVQAILILNSLPASHIQLK 183

Query: 61  NALKYDRESITTGAIISALKINELEL-HATKKEQPSVERLFVKSKDKP---NQSKGGKQQ 120
           + LKY  +++T   + S+ K  E EL  A   ++     L+   + +P   N  KGG+ +
Sbjct: 184 HTLKYGNKTLTVQDVTSSAKSLERELAEAVDLDKGQAAVLYTTERGRPLVRNNQKGGQGK 243

Query: 121 SNEDHKLKTKIRCNYRKKKGHLMRDCYSLKRKNQEKEKDAKGKQPKASIVEGSYFYSDAL 178
                  KTK+ C Y KK+GH+ +DCYS K+K + +       Q +A ++     +S+AL
Sbjct: 244 GRSRSNSKTKVPCWYCKKEGHVKKDCYSRKKKMESE------GQGEAGVITEKLVFSEAL 298

BLAST of ClCG01G010990 vs. TrEMBL
Match: Q9SHR5_ARATH (F28L22.3 protein OS=Arabidopsis thaliana GN=F28L22.3 PE=4 SV=1)

HSP 1 Score: 64.3 bits (155), Expect = 4.1e-07
Identity = 30/64 (46.88%), Positives = 44/64 (68.75%), Query Frame = 1

Query: 263  NPQYHTRTKHIDIKYHFVRDKIEGGEVEVLKVHTSENVADMLTKLVLKLKL---LKCLEL 322
            N  YH RTKHID++++++RD +E G+V+VLK+HTS N  D LTK +   K    L  L+L
Sbjct: 1293 NSVYHERTKHIDVRFNYIRDVVESGDVDVLKIHTSRNPVDALTKCIPVNKFKSALGVLKL 1352

Query: 323  INFD 324
            + +D
Sbjct: 1353 MKWD 1356


HSP 2 Score: 100.5 bits (249), Expect = 5.1e-18
Identity = 57/203 (28.08%), Positives = 116/203 (57.14%), Query Frame = 1

Query: 1   MFLREKFFTFKMDSARSLTENLDEFKKIVIEFKSLGEKLGDDNEAYVLLNSLPEAYEEVK 60
           ++ + KF++F+M +++++ +N+D+F +IV E  SL  K+ ++ +A ++LNSLP  Y+++K
Sbjct: 128 IYAQLKFYSFRMMTSKTIDQNVDDFLRIVAELGSLDIKVAEEVQAILILNSLPVTYDQLK 187

Query: 61  NALKYDRESITTGAIISALKINELELHATKKEQPSVE-RLFVKSKDKP-----NQSKGGK 120
           + LKY  ++++   ++S+ K  E E+   K+    V   L+   + +P     N S+G  
Sbjct: 188 HTLKYGNKTLSVKDVVSSSKSLEREMAELKENTKVVNTTLYTAERGRPQTRNQNGSQGNN 247

Query: 121 QQSNED---------HKLKTKIRCNYRKKKGHLMRDCYSLKRKNQEKEKDAKGKQPKASI 180
           Q +N+             K+++ C + KK+GH+ +DC++ K+K + +E      Q +A +
Sbjct: 248 QGNNQGKNQGKGKSRSNSKSRVTCWFCKKEGHVKKDCFARKKKFENEE------QGEAGV 307

Query: 181 VEGSYFYSDALASTRDKANQARW 189
           +     YS+AL S  D+  + +W
Sbjct: 308 ITEKLVYSEAL-SMHDQEAKEKW 323

BLAST of ClCG01G010990 vs. TrEMBL
Match: Q9LVY5_ARATH (Copia-like retroelement pol polyprotein OS=Arabidopsis thaliana PE=4 SV=1)

HSP 1 Score: 31.6 bits (70), Expect = 2.9e+03
Identity = 17/35 (48.57%), Positives = 22/35 (62.86%), Query Frame = 1

Query: 284  IEGGEVEVLKVHTSENVADMLTKLVLKLKLLKCLE 319
            + G  VE  K+HTS N ADMLTK++L  K    L+
Sbjct: 1102 VGGNTVE--KIHTSRNPADMLTKVILVHKFEAALD 1134


HSP 2 Score: 99.8 bits (247), Expect = 8.7e-18
Identity = 64/226 (28.32%), Positives = 115/226 (50.88%), Query Frame = 1

Query: 1   MFLREKFFTFKMDSARSLTENLDEFKKIVIEFKSLGEKLGDDNEAYVLLNSLPEAYEEVK 60
           ++++ KF++FKM+ ++S+ EN++EF KIV E  SL   + ++  A + LN L   Y ++K
Sbjct: 32  IYVQLKFYSFKMNDSKSINENVNEFLKIVAELSSLEINVVEEVRAILFLNGLSSRYSQLK 91

Query: 61  NALKYDRESITTGAIISALKINELELHATKKEQPSVER-LFVKSKDKP-----NQSKGGK 120
           + LKY  ++++   +IS+ +  E EL   K+   +    L+   + +P     NQ+K G+
Sbjct: 92  HTLKYGNKALSLQDVISSARSLERELDEQKETDKNTSTVLYTNERGRPQTRNQNQNKEGQ 151

Query: 121 QQSNEDHKLKTKIRCNYRKKKGHLMRDCYSLKRKNQEKEKDAKGKQPKASIVEGSYFYSD 180
            +         K+ C Y KK+GH+ +DC++ KRK + +         +A ++     +S+
Sbjct: 152 GRGISKSNSNAKLTCWYCKKEGHVKKDCFARKRKLESENPG------EAGVITEKLVFSE 211

Query: 181 ALASTRDKANQARWTQSINLAANQVDLPLIADASSIKGQLCEMRKD 221
           AL S  D A +  W          +D    +  S+ K   C  RKD
Sbjct: 212 AL-SVNDLAVRDIWV---------LDSGCTSHMSARKDWFCNFRKD 241

BLAST of ClCG01G010990 vs. NCBI nr
Match: gi|747053728|ref|XP_011073037.1| (PREDICTED: retrovirus-related Pol polyprotein from transposon TNT 1-94 [Sesamum indicum])

HSP 1 Score: 110.2 bits (274), Expect = 9.3e-21
Identity = 63/183 (34.43%), Positives = 102/183 (55.74%), Query Frame = 1

Query: 1   MFLREKFFTFKMDSARSLTENLDEFKKIVIEFKSLGEKLGDDNEAYVLLNSLPEAYEEVK 60
           +FL EK F +K+D ++++ ENLD+F K++ + K  G+K  D+    VLLN++PE++ +VK
Sbjct: 97  LFLLEKIFRYKLDLSKNIDENLDDFTKLIQDIKLAGDKYIDEYSPIVLLNAIPESFSDVK 156

Query: 61  NALKYDRESITTGAIISALKINELELHATKKEQPSVERLFV-----------------KS 120
            A+KY R+SI    +++ LK  EL+L   K  Q   E   V                 +S
Sbjct: 157 AAIKYGRDSINLETVVNGLKSKELDLKVNKPSQSHYEINSVRGRTRFGNFNSRYNSRSRS 216

Query: 121 KDKPNQSKGGKQQSN-EDHKLKTKIRCNYRKKKGHLMRDCYSLKRKNQEKEKDAKGKQPK 166
           K K N+SK   +++N  D K++ + RC     KGH ++DC   +R+N+++  D K K   
Sbjct: 217 KTKTNRSKSRPRETNLRDDKIRDR-RCYNCGTKGHYIKDCRKPRRENRDRNYDDKEKVSN 276

BLAST of ClCG01G010990 vs. NCBI nr
Match: gi|729318722|ref|XP_010532694.1| (PREDICTED: endo-1,3;1,4-beta-D-glucanase-like [Tarenaya hassleriana])

HSP 1 Score: 102.8 bits (255), Expect = 1.5e-18
Identity = 55/158 (34.81%), Positives = 97/158 (61.39%), Query Frame = 1

Query: 1   MFLREKFFTFKMDSARSLTENLDEFKKIVIEFKSLGEKLGDDNEAYVLLNSLPEAYEEVK 60
           M+L+++   FKMD+ +S+ +N+D FKK+V +  +L  ++  +++  +LLNSLP+ Y+++K
Sbjct: 144 MYLKQRLIDFKMDATKSIEDNVDVFKKLVNDLSNLKIEVAKEDQVLILLNSLPDQYDQLK 203

Query: 61  NALKYD-RESITTGAIISALKINELELHATKKEQPSVERLFVKSKDKPNQSKGGK--QQS 120
           + L+Y+ RE+IT   I S     ELEL A K  + + E L V+ + +   S G K  ++S
Sbjct: 204 DTLRYNRRETITLDEITSVAYSKELEL-AAKGTRATAEGLVVRGRSEKRNSTGNKSRKKS 263

Query: 121 NEDHKLKTKIRCNYRKKKGHLMRDCYSLKRKNQEKEKD 156
               + K++  C +  K+GH  RDC S K+  +E+ K+
Sbjct: 264 RSKSRSKSETECWFCGKEGHFKRDCRSRKKHFEEQAKE 300

BLAST of ClCG01G010990 vs. NCBI nr
Match: gi|1345510|emb|CAA37918.1| (unnamed protein product [Arabidopsis thaliana])

HSP 1 Score: 102.4 bits (254), Expect = 1.9e-18
Identity = 64/226 (28.32%), Positives = 117/226 (51.77%), Query Frame = 1

Query: 1   MFLREKFFTFKMDSARSLTENLDEFKKIVIEFKSLGEKLGDDNEAYVLLNSLPEAYEEVK 60
           ++++ KF++FKM+ ++S+ EN++EF KIV E  SL   + ++  A + LN L   Y ++K
Sbjct: 120 IYVQLKFYSFKMNDSKSINENVNEFLKIVAELSSLEINVVEEVRAILFLNGLSSRYSQLK 179

Query: 61  NALKYDRESITTGAIISALKINELELHATKKEQPSVER-LFVKSKDKP-----NQSKGGK 120
           + LKY  ++++   +IS+ +  E EL   K+   +    L+   + +P     NQ+KGG+
Sbjct: 180 HTLKYGNKALSLQDVISSARSLERELDEQKETDKNTSTVLYTNERGRPLTRNQNQNKGGQ 239

Query: 121 QQSNEDHKLKTKIRCNYRKKKGHLMRDCYSLKRKNQEKEKDAKGKQPKASIVEGSYFYSD 180
            +         K+ C Y KK+GH+ +DC++ KRK + +         +A ++     +S+
Sbjct: 240 GRGRSKSNSNAKLTCWYCKKEGHVKKDCFARKRKLESENPG------EAGVITEKLVFSE 299

Query: 181 ALASTRDKANQARWTQSINLAANQVDLPLIADASSIKGQLCEMRKD 221
           AL S  D A +  W         ++D    +  S+ K   C  R+D
Sbjct: 300 AL-SVNDLAVRDIW---------ELDSGCPSHMSARKDWFCNFRED 329

BLAST of ClCG01G010990 vs. NCBI nr
Match: gi|90657665|gb|ABD96963.1| (hypothetical protein [Tarenaya spinosa])

HSP 1 Score: 101.7 bits (252), Expect = 3.3e-18
Identity = 63/168 (37.50%), Positives = 100/168 (59.52%), Query Frame = 1

Query: 1   MFLREKFFTFKMDSARSLTENLDEFKKIVIEFKSLGEKLGDDNEAYVLLNSLPEAYEEVK 60
           M+L ++   F+MDS+R++ ENLD F+K++ +  SL  K+ ++ +A  LLNSLP AYE+++
Sbjct: 140 MYLMQRVSGFRMDSSRTIEENLDIFQKLLSDLHSLNVKVEEEYQAVYLLNSLPPAYEQLR 199

Query: 61  NALKYDRESITTGAIISALKINELELHAT-KKEQPSVERLFVKSKDKPNQSKGGKQQSNE 120
             LKY R +I+   + +A ++ ELEL A     + + E L VK   KP +S GGK+    
Sbjct: 200 EVLKYSRATISVEEVKAAARMKELELLAQGTLTRGTGEGLVVKG--KPEKSGGGKK---- 259

Query: 121 DHKLKTKIRCNYRKKKGHLMRDCYSLKRKNQEKEKDAKGKQPKASIVE 168
             K K ++ C Y  KKGH  ++C S + K     ++ +GK   AS+ E
Sbjct: 260 --KAKDQVECWYCGKKGHYKKECRSRRAK-----EETEGKGVVASVQE 294

BLAST of ClCG01G010990 vs. NCBI nr
Match: gi|6623973|gb|AAF19226.1|AC007505_2 (Highly similar to Ta1-3 polyprotein [Arabidopsis thaliana])

HSP 1 Score: 101.3 bits (251), Expect = 4.3e-18
Identity = 57/181 (31.49%), Positives = 104/181 (57.46%), Query Frame = 1

Query: 1   MFLREKFFTFKMDSARSLTENLDEFKKIVIEFKSLGEKLGDDNEAYVLLNSLPEAYEEVK 60
           ++ + K ++FKM S  ++ +N+DEF +IV E  SL  ++ ++ +A ++LNSLP ++ ++K
Sbjct: 124 IYTQLKLYSFKMVSTMTIDQNVDEFLRIVAELGSLEIQVDEEVQAILILNSLPASHIQLK 183

Query: 61  NALKYDRESITTGAIISALKINELEL-HATKKEQPSVERLFVKSKDKP---NQSKGGKQQ 120
           + LKY  +++T   + S+ K  E EL  A   ++     L+   + +P   N  KGG+ +
Sbjct: 184 HTLKYGNKTLTVQDVTSSAKSLERELAEAVDLDKGQAAVLYTTERGRPLVRNNQKGGQGK 243

Query: 121 SNEDHKLKTKIRCNYRKKKGHLMRDCYSLKRKNQEKEKDAKGKQPKASIVEGSYFYSDAL 178
                  KTK+ C Y KK+GH+ +DCYS K+K + +       Q +A ++     +S+AL
Sbjct: 244 GRSRSNSKTKVPCWYCKKEGHVKKDCYSRKKKMESE------GQGEAGVITEKLVFSEAL 298

The following BLAST results are available for this feature:
Match Name    E-value    Identity    Description
POLX_TOBAC    1.9e-10    50.00       Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum...
COPIA_DROME   7.6e-07    22.89       Copia protein OS=Drosophila melanogaster GN=GIP PE=1 SV=3

Match Name      E-value    Identity    Description
Q02900_ARATH    1.3e-18    28.32       Orf 1 (Fragment) OS=Arabidopsis thaliana PE=4 SV=1
Q1KUM3_9ROSI    2.3e-18    37.50       Putative uncharacterized protein OS=Tarenaya spinosa PE=4 SV=1
Q9SHR5_ARATH    3.0e-18    31.49       F28L22.3 protein OS=Arabidopsis thaliana GN=F28L22.3 PE=4 SV=1
Q9SHR5_ARATH    4.1e-07    46.88       F28L22.3 protein OS=Arabidopsis thaliana GN=F28L22.3 PE=4 SV=1
Q9LVY5_ARATH    2.9e+03    48.57       Copia-like retroelement pol polyprotein OS=Arabidopsis thaliana PE=4 SV=1

Match Name                             E-value    Identity    Description
gi|747053728|ref|XP_011073037.1|       9.3e-21    34.43       PREDICTED: retrovirus-related Pol polyprotein from transposon TNT 1-94 [Sesamum ...
gi|729318722|ref|XP_010532694.1|       1.5e-18    34.81       PREDICTED: endo-1,3;1,4-beta-D-glucanase-like [Tarenaya hassleriana]
gi|1345510|emb|CAA37918.1|             1.9e-18    28.32       unnamed protein product [Arabidopsis thaliana]
gi|90657665|gb|ABD96963.1|             3.3e-18    37.50       hypothetical protein [Tarenaya spinosa]
gi|6623973|gb|AAF19226.1|AC007505_2    4.3e-18    31.49       Highly similar to Ta1-3 polyprotein [Arabidopsis thaliana]
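
One common way to read a summary like the one above is to keep only hits below an E-value cutoff. The sketch below does this for the five TrEMBL rows; the cutoff of 1e-5 and the names `trembl_hits` and `significant_hits` are assumptions made for illustration, not something this page prescribes.

```python
# (match name, E-value string) pairs copied from the TrEMBL rows of the summary.
trembl_hits = [
    ("Q02900_ARATH", "1.3e-18"),
    ("Q1KUM3_9ROSI", "2.3e-18"),
    ("Q9SHR5_ARATH", "3.0e-18"),
    ("Q9SHR5_ARATH", "4.1e-07"),
    ("Q9LVY5_ARATH", "2.9e+03"),
]


def significant_hits(hits, max_evalue=1e-5):
    """Keep hits whose E-value is at or below the cutoff (1e-5 is an assumed default)."""
    return [name for name, evalue in hits if float(evalue) <= max_evalue]


print(significant_hits(trembl_hits))
```

With this cutoff the Q9LVY5_ARATH row is dropped, because the E-value listed for it in the summary (2.9e+03) belongs to its weaker HSP; the detailed results for the same match also report a second HSP with an E-value of 8.7e-18.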
The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term    Definition
GO Assignments
This gene is annotated with the following GO terms.
Category              Term Accession    Term Name
biological_process    GO:0008150        biological_process
cellular_component    GO:0005575        cellular_component
molecular_function    GO:0003674        molecular_function

The following mRNA feature(s) are a part of this gene:

Feature Name       Unique Name        Type
ClCG01G010990.1    ClCG01G010990.1    mRNA


Analysis Name: InterPro Annotations of watermelon (Charleston Gray)
Date Performed: 2016-09-28
IPR Term    IPR Description     Source     Source Term        Source Description                 Alignment
None        No IPR available    PANTHER    PTHR11439          GAG-POL-RELATED RETROTRANSPOSON    coord: 1..160; score: 2.3
None        No IPR available    PANTHER    PTHR11439:SF192    SUBFAMILY NOT NAMED                coord: 1..160; score: 2.3
None        No IPR available    PFAM       PF14223            Retrotran_gag_2                    coord: 2..84; score: 1.0