Cla97C02G043590 (gene) Watermelon (97103) v2

Name: Cla97C02G043590
Type: gene
Organism: Citrullus lanatus (Watermelon (97103) v2)
Description: Retrotransposon gag protein
Location: Cla97Chr02 : 31729802 .. 31730932 (-)
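
The location field can be turned into a feature length with a small amount of parsing. The following is a minimal sketch, assuming the "chromosome : start .. end (strand)" layout used on this page and assuming the coordinates are 1-based and inclusive; `parse_location` is a hypothetical helper written for illustration, not part of any database API.

```python
import re

def parse_location(text):
    """Parse 'chrom : start .. end (strand)' into a dict (hypothetical helper)."""
    pattern = r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*"
    m = re.fullmatch(pattern, text)
    if m is None:
        raise ValueError(f"unrecognised location: {text!r}")
    start, end = int(m.group(2)), int(m.group(3))
    return {
        "chrom": m.group(1),
        "start": start,
        "end": end,
        "strand": m.group(4),
        # Assumption: coordinates are 1-based and inclusive, so add 1.
        "length": end - start + 1,
    }

loc = parse_location("Cla97Chr02 : 31729802 .. 31730932 (-)")
print(loc["strand"], loc["length"], loc["length"] % 3)
```

Under that assumption the feature spans 1131 bp on the minus strand. This is a multiple of three, consistent with the 376-residue protein listed for this gene plus one stop codon.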
The following sequences are available for this feature:

Gene sequence (with intron)

ATGGCGCGAAAACTCAGGCGTTCACCGTCGCCATTGCCATTCCGGCGACGTAATGCCGATTATACCACCGCCACCGATTACGATGCTTCTCCATCTCAATCTCTCTATGCATCGAACGAAGACGACTATGACCCCTCTGAATCTGTTAACTCCCACCCCACTGACCCCAAATCAAAATCCCTAGAAATTAAGCCCTCTGATTTAAGAACCGCCGCAGAATCCGCCTCCAAAAACAGCTTAGCGTATTTACAGACTCCAAACGCCGCCCAAACTGTATTTCCATACATCAACGTTGCACCGTTGCCGATTTTTCACGGCAGCGCCGATGAGTGTCCGGTGATACATTTAAGCAGATTCGCCAAAGTCTGCCGTGCGAACAACGCAACCTCCATCGACATGATGATGAGAATCTTCCCGGTGACGTTAGAGGGTGAGGCAGCGCTTTGGTACGACTTGAACATCGAGCCCTACCCTCCAATTTCTTGGGAAGAATTGAAGTCTTGTTTCTTGGATGCTTTCAATAAAATTGAATTGACTGACCAGTTGCGATCGGAGCTTATGACGATAAAACAACGGGAAGAGGAGAGTGTACGTTTGTATTTTCTGAGGTTGCAGTTGATTTTGAAGAAATGGCCACCGGGTAATTCACTTTCCGATGGCTTGTTGAAGACGATTTTTGTTGACGGATTGAGGGAAGAGTTCAAGGAATGGATGATTCTACAGAAACCGAGTTCATTGAACGAGGCATTGAGACTTGCATTTGGGTTTGAACAAGTAAGGACCGTCAGTACATCTGGCAAAAGGGGGTTTCTTCGGTGTGGGTTTTGTGAGGGGCCGCACGAGGAATTGGTTTGTGAGGTTAGGGAGAGAATGAGACAGTTGTGGAAGAGTAGGGAAAAGAAGAATACGGTTGACGTGGTGCAGAGTGACGGCCGTGAAGCGGCAATGGCAACGGCGGAGCTTATGCGATCGTCTTCGGCAATTAGTAGAAACGAATCGGAGGTTGAAAATGATGGCGGGGAGATGGTGGGTTTGAAGAAGAAGAGTCAGTGTCAATGTTGGAAGCATCAGTGTGGGATGAAGAAATTGGATCGAAACCTTAGCATGGTATCAAAAAATTCTAAAGGCTGA

mRNA sequence

ATGGCGCGAAAACTCAGGCGTTCACCGTCGCCATTGCCATTCCGGCGACGTAATGCCGATTATACCACCGCCACCGATTACGATGCTTCTCCATCTCAATCTCTCTATGCATCGAACGAAGACGACTATGACCCCTCTGAATCTGTTAACTCCCACCCCACTGACCCCAAATCAAAATCCCTAGAAATTAAGCCCTCTGATTTAAGAACCGCCGCAGAATCCGCCTCCAAAAACAGCTTAGCGTATTTACAGACTCCAAACGCCGCCCAAACTGTATTTCCATACATCAACGTTGCACCGTTGCCGATTTTTCACGGCAGCGCCGATGAGTGTCCGGTGATACATTTAAGCAGATTCGCCAAAGTCTGCCGTGCGAACAACGCAACCTCCATCGACATGATGATGAGAATCTTCCCGGTGACGTTAGAGGGTGAGGCAGCGCTTTGGTACGACTTGAACATCGAGCCCTACCCTCCAATTTCTTGGGAAGAATTGAAGTCTTGTTTCTTGGATGCTTTCAATAAAATTGAATTGACTGACCAGTTGCGATCGGAGCTTATGACGATAAAACAACGGGAAGAGGAGAGTGTACGTTTGTATTTTCTGAGGTTGCAGTTGATTTTGAAGAAATGGCCACCGGGTAATTCACTTTCCGATGGCTTGTTGAAGACGATTTTTGTTGACGGATTGAGGGAAGAGTTCAAGGAATGGATGATTCTACAGAAACCGAGTTCATTGAACGAGGCATTGAGACTTGCATTTGGGTTTGAACAAGTAAGGACCGTCAGTACATCTGGCAAAAGGGGGTTTCTTCGGTGTGGGTTTTGTGAGGGGCCGCACGAGGAATTGGTTTGTGAGGTTAGGGAGAGAATGAGACAGTTGTGGAAGAGTAGGGAAAAGAAGAATACGGTTGACGTGGTGCAGAGTGACGGCCGTGAAGCGGCAATGGCAACGGCGGAGCTTATGCGATCGTCTTCGGCAATTAGTAGAAACGAATCGGAGGTTGAAAATGATGGCGGGGAGATGGTGGGTTTGAAGAAGAAGAGTCAGTGTCAATGTTGGAAGCATCAGTGTGGGATGAAGAAATTGGATCGAAACCTTAGCATGGTATCAAAAAATTCTAAAGGCTGA

Coding sequence (CDS)

ATGGCGCGAAAACTCAGGCGTTCACCGTCGCCATTGCCATTCCGGCGACGTAATGCCGATTATACCACCGCCACCGATTACGATGCTTCTCCATCTCAATCTCTCTATGCATCGAACGAAGACGACTATGACCCCTCTGAATCTGTTAACTCCCACCCCACTGACCCCAAATCAAAATCCCTAGAAATTAAGCCCTCTGATTTAAGAACCGCCGCAGAATCCGCCTCCAAAAACAGCTTAGCGTATTTACAGACTCCAAACGCCGCCCAAACTGTATTTCCATACATCAACGTTGCACCGTTGCCGATTTTTCACGGCAGCGCCGATGAGTGTCCGGTGATACATTTAAGCAGATTCGCCAAAGTCTGCCGTGCGAACAACGCAACCTCCATCGACATGATGATGAGAATCTTCCCGGTGACGTTAGAGGGTGAGGCAGCGCTTTGGTACGACTTGAACATCGAGCCCTACCCTCCAATTTCTTGGGAAGAATTGAAGTCTTGTTTCTTGGATGCTTTCAATAAAATTGAATTGACTGACCAGTTGCGATCGGAGCTTATGACGATAAAACAACGGGAAGAGGAGAGTGTACGTTTGTATTTTCTGAGGTTGCAGTTGATTTTGAAGAAATGGCCACCGGGTAATTCACTTTCCGATGGCTTGTTGAAGACGATTTTTGTTGACGGATTGAGGGAAGAGTTCAAGGAATGGATGATTCTACAGAAACCGAGTTCATTGAACGAGGCATTGAGACTTGCATTTGGGTTTGAACAAGTAAGGACCGTCAGTACATCTGGCAAAAGGGGGTTTCTTCGGTGTGGGTTTTGTGAGGGGCCGCACGAGGAATTGGTTTGTGAGGTTAGGGAGAGAATGAGACAGTTGTGGAAGAGTAGGGAAAAGAAGAATACGGTTGACGTGGTGCAGAGTGACGGCCGTGAAGCGGCAATGGCAACGGCGGAGCTTATGCGATCGTCTTCGGCAATTAGTAGAAACGAATCGGAGGTTGAAAATGATGGCGGGGAGATGGTGGGTTTGAAGAAGAAGAGTCAGTGTCAATGTTGGAAGCATCAGTGTGGGATGAAGAAATTGGATCGAAACCTTAGCATGGTATCAAAAAATTCTAAAGGCTGA

Protein sequence

MARKLRRSPSPLPFRRRNADYTTATDYDASPSQSLYASNEDDYDPSESVNSHPTDPKSKSLEIKPSDLRTAAESASKNSLAYLQTPNAAQTVFPYINVAPLPIFHGSADECPVIHLSRFAKVCRANNATSIDMMMRIFPVTLEGEAALWYDLNIEPYPPISWEELKSCFLDAFNKIELTDQLRSELMTIKQREEESVRLYFLRLQLILKKWPPGNSLSDGLLKTIFVDGLREEFKEWMILQKPSSLNEALRLAFGFEQVRTVSTSGKRGFLRCGFCEGPHEELVCEVRERMRQLWKSREKKNTVDVVQSDGREAAMATAELMRSSSAISRNESEVENDGGEMVGLKKKSQCQCWKHQCGMKKLDRNLSMVSKNSKG
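
The relationship between the coding sequence and the protein sequence can be checked by translating codons with the standard genetic code. The sketch below is illustrative only: `translate` is a hypothetical helper, the standard code (NCBI translation table 1) is assumed to apply, and only the first seven and last three codons of the CDS are used as input.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate a sense-strand CDS, stopping at the first stop codon."""
    cds = cds.upper()
    residues = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks an unknown codon
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)

print(translate("ATGGCGCGAAAACTCAGGCGT"))  # first seven codons of the CDS
print(translate("AAAGGCTGA"))              # last three codons, ending in TGA
```

The first call yields MARKLRR and the second yields KG, matching the start and end of the protein sequence. The sequences on this page are already shown in the sense orientation, so no reverse complement is needed even though the gene lies on the minus strand.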
BLAST of Cla97C02G043590 vs. NCBI nr
Match: POE94094.1 (hypothetical protein CFP56_47498 [Quercus suber])

HSP 1 Score: 329.7 bits (844), Expect = 1.3e-86
Identity = 185/338 (54.73%), Positives = 234/338 (69.23%), Query Frame = 0

Query: 39  NEDDYDPSESVNSHPTDPKSKSLEIKPS---DLRTAAESASKNSLAYLQTPNAAQTVFPY 98
           N+D Y  SES  S P D  S  L    S   +L T     S ++   +  P +   +  Y
Sbjct: 152 NDDAYIGSESETSAPGDRFSSQLRDPDSQSINLSTTVFPNSTSNFPKISQPPSTH-LASY 211

Query: 99  INVAPLPIFHGSADECPVIHLSRFAKVCRANNATSIDMMMRIFPVTLEGEAALWYDLNIE 158
           +N+AP PIFHG+ +ECPV H+SRFAKVC ANN ++ DMMMRIFPVTLE EAALWYDLNIE
Sbjct: 212 MNIAPFPIFHGNPNECPVKHVSRFAKVCVANNVSTTDMMMRIFPVTLEDEAALWYDLNIE 271

Query: 159 PYPPISWEELKSCFLDAFNKIELTDQLRSELMTIKQREEESVRLYFLRLQLILKKWPPGN 218
           PYP ++WEE+KS FL A++KIE+ DQLRSELM I Q +EESVR YFLRLQ ILK+W P +
Sbjct: 272 PYPSLTWEEIKSSFLHAYHKIEVVDQLRSELMMINQGDEESVRSYFLRLQWILKQW-PDH 331

Query: 219 SLSDGLLKTIFVDGLREEFKEWMILQKPSSLNEALRLAFGFEQVRTVSTSGKRGFLRCGF 278
            +SDGLLK +F+DGLREEF++W+I QKP SL+EALRLAFGFEQV+++    K   L+CGF
Sbjct: 332 GISDGLLKGVFIDGLREEFRDWIIPQKPDSLHEALRLAFGFEQVKSIRAVRKE--LKCGF 391

Query: 279 CEGPHEELVCEVRERMRQLWKSREKKNTVDVVQSDGREAAMATAELMRSSSAISRNESEV 338
           C+G HEE  CEVRERMR+LW+  ++K    V+    R       EL+RS S I  + S  
Sbjct: 392 CDGMHEERDCEVRERMRKLWRESKEKEEAVVLAKSTRSDDELGKELVRSVS-IGASSSVG 451

Query: 339 ENDGGEMVGLK--KKSQCQCWKHQCGMKKLDRNLSMVS 372
           +N+ GE  G    KK+Q Q WK+Q  MKKL+RN S++S
Sbjct: 452 KNNEGEEGGFMDGKKNQFQYWKYQRWMKKLERNNSLIS 484
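
The percentages in each HSP header are simply the match counts divided by the alignment length. A minimal sketch of recomputing them from a header line, assuming the "Label = matches/length" layout shown above (`hsp_fractions` is a hypothetical helper, not a BLAST API):

```python
import re

def hsp_fractions(line):
    """Map each 'Label = a/b' pair to (matches, aligned_length, percent)."""
    stats = {}
    for label, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", line):
        matches, length = int(num), int(den)
        stats[label] = (matches, length, round(100 * matches / length, 2))
    return stats

header = "Identity = 185/338 (54.73%), Positives = 234/338 (69.23%), Query Frame = 0"
stats = hsp_fractions(header)
for label, (matches, length, percent) in stats.items():
    print(f"{label}: {matches}/{length} = {percent}%")
```

For the Quercus suber hit this reproduces 54.73% identity and 69.23% positives over an alignment of 338 columns, gaps included.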

BLAST of Cla97C02G043590 vs. NCBI nr
Match: EXB78111.1 (hypothetical protein L484_004813 [Morus notabilis])

HSP 1 Score: 324.7 bits (831), Expect = 4.1e-85
Identity = 176/318 (55.35%), Positives = 222/318 (69.81%), Query Frame = 0

Query: 74  SASKNSLAYLQTPNAAQTVF-PYINVAPLPIFHGSADECPVIHLSRFAKVCRANNATSID 133
           SAS + + +L     +QT +  Y+N+A  PIF G ++ECP  HLSRFAKVCRANN +SID
Sbjct: 92  SASHSPILHLPQQPVSQTGYNSYMNIAQFPIFRGGSEECPFAHLSRFAKVCRANNVSSID 151

Query: 134 MMMRIFPVTLEGEAALWYDLNIEPYPPISWEELKSCFLDAFNKIELTDQLRSELMTIKQR 193
           MMM+IFPVTLE EAALWYDLN+EPY  +SWEE+KS F  A+ KIELT+QLRS+LMTI Q 
Sbjct: 152 MMMKIFPVTLEDEAALWYDLNVEPYEELSWEEIKSSFYHAYGKIELTEQLRSQLMTINQG 211

Query: 194 EEESVRLYFLRLQLILKKWPPGNSLSDGLLKTIFVDGLREEFKEWMILQKPSSLNEALRL 253
           + ESVR YFLRLQ ILKKWP  + LSD LLK +FVDGLR +F+EWM  QKP SLN+ALRL
Sbjct: 212 DAESVRSYFLRLQWILKKWPE-HGLSDDLLKGVFVDGLRGDFQEWMAPQKPGSLNKALRL 271

Query: 254 AFGFEQVRTVSTSGKRGFLRCGFCEGPHEELVCEVRERMRQLW----------KSREKKN 313
           AF FEQV+++    +   ++CGFC G HEE  CEVRERMR+LW          K   ++N
Sbjct: 272 AFCFEQVKSIRNVRRNASVKCGFCGGLHEERGCEVRERMRELWLKSNKDDGLGKGMLERN 331

Query: 314 TVDV---VQSDGREAAMATAELMRSSSAISRNESEVENDG----GEMVGLKKKSQCQCWK 373
            ++    V+  GR  +MAT+   RS+  + +N+ +VE DG     E+   KK+SQCQC K
Sbjct: 332 LIEKSEGVKELGRSVSMATS---RSTCVVGKND-QVEEDGKEEEDELGSKKKRSQCQCGK 391

BLAST of Cla97C02G043590 vs. NCBI nr
Match: CAN62167.1 (hypothetical protein VITISV_007470 [Vitis vinifera])

HSP 1 Score: 318.5 bits (815), Expect = 3.0e-83
Identity = 192/370 (51.89%), Positives = 245/370 (66.22%), Query Frame = 0

Query: 20  DYTTATDYDASPSQSLYASNE------DDYDPSESVN--SHPTDPKS-KSLEIKPSDLRT 79
           DYT     + SPSQS Y  +E        Y  +ES +  + P D  S  +LE  P   ++
Sbjct: 161 DYT-----EQSPSQSPYEFDEXXXXXQSXYTDNESASGTNAPGDQFSLPALESIPKG-KS 220

Query: 80  AAESASKNSLAYLQTPNAAQTVFPYINVAPLPIFHGSADECPVIHLSRFAKVCRANNATS 139
              S+S NS +    P    +   YIN+APLPIF GS+DECPV HLSRF KVCRANN +S
Sbjct: 221 FRPSSSLNSSSNSLNPFXQSS---YINIAPLPIFRGSSDECPVTHLSRFTKVCRANNVSS 280

Query: 140 IDMMMRIFPVTLEGEAALWYDLNIEPYPPISWEELKSCFLDAFNKIELTDQLRSELMTIK 199
           ++M+MRIFPVTL+GEAALWYDLNIEPY  +SWEE+KS FL A++++ LTD+LRSELM I 
Sbjct: 281 VEMIMRIFPVTLDGEAALWYDLNIEPYSSLSWEEIKSSFLQAYHRJGLTDELRSELMMIN 340

Query: 200 QREEESVRLYFLRLQLILKKWPPGNSLSDGLLKTIFVDGLREEFKEWMILQKPSSLNEAL 259
           Q  EESVR YFLRLQ ILK+W P + L DGLL+ IF+DGLR++F++W+I QKPSSLNEAL
Sbjct: 341 QGTEESVRSYFLRLQWILKRW-PDHGLPDGLLEGIFIDGLRKDFQDWIIPQKPSSLNEAL 400

Query: 260 RLAFGFEQVRTVSTSGKRGFLRCGFCEGPHEELVCEVRERMRQLWKSREKKNTVD----- 319
           RLAF +E+V+++    ++    CGFC G H+E  CE+RERMR LW  + KK T D     
Sbjct: 401 RLAFAWEKVQSIRGGREK---ECGFCSGGHDEEGCEIRERMRXLW-VKSKKQTRDYSGRI 460

Query: 320 VVQSDGREAAMATAELMRSSSAISRNESEVENDGGEMVGLKKKSQCQCWKHQCGMKKLDR 375
           V   DG +       +   S  + +NE E E      +G KKKSQCQC KHQC  KKL+R
Sbjct: 461 VNDEDGEKEFERRVSVGGESRBVGKNEEEGEEG---XMGWKKKSQCQCGKHQCWKKKLER 513

BLAST of Cla97C02G043590 vs. NCBI nr
Match: EOX92844.1 (Uncharacterized protein TCM_001704 [Theobroma cacao])

HSP 1 Score: 317.8 bits (813), Expect = 5.0e-83
Identity = 185/370 (50.00%), Positives = 238/370 (64.32%), Query Frame = 0

Query: 20  DYTTATDYDASPSQS-----LYASNEDDYD-------PSESVNSHPTDPKSKSLEIKPSD 79
           +Y   T    SP +S         NE+DYD        SES+ + P  PK+    ++ + 
Sbjct: 41  NYVDNTSLSHSPDESNGDDLEQPRNENDYDDFDASDFQSESMTNAPNAPKTL---LRGNG 100

Query: 80  LRTAAESASKNSLAYLQTPNAAQTVFPYINVAPLPIFHGSADECPVIHLSRFAKVCRANN 139
           L  AA   S ++ A     N  +    YIN+APLPIF GS  +CPV HLSRFAKVCRANN
Sbjct: 101 LSAAASLNSVSNSAIWSRSNLIEAT-SYINIAPLPIFQGSPSDCPVTHLSRFAKVCRANN 160

Query: 140 ATSIDMMMRIFPVTLEGEAALWYDLNIEPYPPISWEELKSCFLDAFNKIELTDQLRSELM 199
            +S+DMMMRIFPVTLE EA LWYDLNIEPYP + WEE+KS FL A++K ++T+QLR ELM
Sbjct: 161 VSSVDMMMRIFPVTLENEAGLWYDLNIEPYPSLRWEEIKSSFLQAYHKTQVTEQLRHELM 220

Query: 200 TIKQREEESVRLYFLRLQLILKKWPPGNSLSDGLLKTIFVDGLREEFKEWMILQKPSSLN 259
            I Q  EE VR YFLRLQ  L++W P + + + LLK IFVDGLRE+F++W++ QKP SL 
Sbjct: 221 MINQGSEERVRSYFLRLQWSLQRW-PDHGIPENLLKEIFVDGLREDFQDWIVPQKPDSLV 280

Query: 260 EALRLAFGFEQVRTVSTSGKRGFLRCGFCEGPHEELVCEVRERMRQLWKSREKKNTVDVV 319
           EALRLA  FEQ++++  S K+  L+C FCEG HEE  C+VRERM++LW+  + K  +D  
Sbjct: 281 EALRLAIAFEQLKSIKISRKKD-LKCDFCEGSHEERNCQVRERMKELWRKTKDKEWMDSS 340

Query: 320 Q-SDGREAAMATAELMRSSSAISRNESEVENDGGEMVG--LKKKSQCQCWKHQCGMKKLD 375
           + +   EA   +AE     SA  R E E   +G  + G   KKKS CQC KHQC  K+LD
Sbjct: 341 EKNQSNEAVNESAE----GSAEDRIEEENVVEGEMLSGRKQKKKSPCQCCKHQCWKKQLD 400

BLAST of Cla97C02G043590 vs. NCBI nr
Match: OMO61075.1 (Retrotransposon gag protein [Corchorus olitorius])

HSP 1 Score: 310.5 bits (794), Expect = 8.0e-81
Identity = 169/347 (48.70%), Positives = 228/347 (65.71%), Query Frame = 0

Query: 40  EDDYDPSESVNSHPTDPKSKSLEIKPSDLRTAAESASKNSLAYLQ---TPNAAQTVFPYI 99
           EDDYD    V++     +S + + KPS+  +   SAS NS+   +     +  + +  YI
Sbjct: 73  EDDYD--NEVDASEYQSESMANDPKPSNNGSLGASASSNSITNSEIWTQSHVTEAISSYI 132

Query: 100 NVAPLPIFHGSADECPVIHLSRFAKVCRANNATSIDMMMRIFPVTLEGEAALWYDLNIEP 159
           N+AP PIF G  +ECPV HLSRFAKVCRANN +S+DMMMRIFP+TLE EA +WYDLNIEP
Sbjct: 133 NIAPFPIFRGGPNECPVTHLSRFAKVCRANNVSSVDMMMRIFPITLEDEAGIWYDLNIEP 192

Query: 160 YPPISWEELKSCFLDAFNKIELTDQLRSELMTIKQREEESVRLYFLRLQLILKKWPPGNS 219
           YP +SWEE+KS FL A+NK ++++QLR EL  I Q  EE VR YFLRLQ  L++W P + 
Sbjct: 193 YPSLSWEEIKSSFLQAYNKTQVSEQLRHELTMINQGSEECVRSYFLRLQWSLRRW-PDHG 252

Query: 220 LSDGLLKTIFVDGLREEFKEWMILQKPSSLNEALRLAFGFEQVRTVSTSGKRGFLRCGFC 279
           + + L+K IFVDGL+E+F++W+I QKP SL EALRLAF +EQV+ +    K+  L+C FC
Sbjct: 253 IPETLIKEIFVDGLKEDFQDWIIPQKPDSLAEALRLAFAYEQVKNIKLLRKKD-LKCDFC 312

Query: 280 EGPHEELVCEVRERMRQLWKSREKKNTVDVVQSDGREAAMATAELMRSSSAISRNESEVE 339
            G HEE  C VRE+M+ LW+  + K  +D                  SSS++ ++ES  E
Sbjct: 313 GGQHEERSCLVREKMKGLWQRAKDKQLID------------------SSSSLKKDESNEE 372

Query: 340 -------NDGGEMV--GLKKKSQCQCWKHQCGMKKLDRNLSMVSKNS 375
                  ++ GEM+  G KKKSQCQC KH+C  K+L R+ S++S+NS
Sbjct: 373 VKETIDGDEEGEMMLNGKKKKSQCQCSKHRCWKKQLQRSSSLISRNS 397

BLAST of Cla97C02G043590 vs. TrEMBL
Match: tr|A0A2P4KM35|A0A2P4KM35_QUESU (Uncharacterized protein OS=Quercus suber OX=58331 GN=CFP56_47498 PE=4 SV=1)

HSP 1 Score: 329.7 bits (844), Expect = 8.5e-87
Identity = 185/338 (54.73%), Positives = 234/338 (69.23%), Query Frame = 0

Query: 39  NEDDYDPSESVNSHPTDPKSKSLEIKPS---DLRTAAESASKNSLAYLQTPNAAQTVFPY 98
           N+D Y  SES  S P D  S  L    S   +L T     S ++   +  P +   +  Y
Sbjct: 152 NDDAYIGSESETSAPGDRFSSQLRDPDSQSINLSTTVFPNSTSNFPKISQPPSTH-LASY 211

Query: 99  INVAPLPIFHGSADECPVIHLSRFAKVCRANNATSIDMMMRIFPVTLEGEAALWYDLNIE 158
           +N+AP PIFHG+ +ECPV H+SRFAKVC ANN ++ DMMMRIFPVTLE EAALWYDLNIE
Sbjct: 212 MNIAPFPIFHGNPNECPVKHVSRFAKVCVANNVSTTDMMMRIFPVTLEDEAALWYDLNIE 271

Query: 159 PYPPISWEELKSCFLDAFNKIELTDQLRSELMTIKQREEESVRLYFLRLQLILKKWPPGN 218
           PYP ++WEE+KS FL A++KIE+ DQLRSELM I Q +EESVR YFLRLQ ILK+W P +
Sbjct: 272 PYPSLTWEEIKSSFLHAYHKIEVVDQLRSELMMINQGDEESVRSYFLRLQWILKQW-PDH 331

Query: 219 SLSDGLLKTIFVDGLREEFKEWMILQKPSSLNEALRLAFGFEQVRTVSTSGKRGFLRCGF 278
            +SDGLLK +F+DGLREEF++W+I QKP SL+EALRLAFGFEQV+++    K   L+CGF
Sbjct: 332 GISDGLLKGVFIDGLREEFRDWIIPQKPDSLHEALRLAFGFEQVKSIRAVRKE--LKCGF 391

Query: 279 CEGPHEELVCEVRERMRQLWKSREKKNTVDVVQSDGREAAMATAELMRSSSAISRNESEV 338
           C+G HEE  CEVRERMR+LW+  ++K    V+    R       EL+RS S I  + S  
Sbjct: 392 CDGMHEERDCEVRERMRKLWRESKEKEEAVVLAKSTRSDDELGKELVRSVS-IGASSSVG 451

Query: 339 ENDGGEMVGLK--KKSQCQCWKHQCGMKKLDRNLSMVS 372
           +N+ GE  G    KK+Q Q WK+Q  MKKL+RN S++S
Sbjct: 452 KNNEGEEGGFMDGKKNQFQYWKYQRWMKKLERNNSLIS 484

BLAST of Cla97C02G043590 vs. TrEMBL
Match: tr|W9R9S0|W9R9S0_9ROSA (Uncharacterized protein OS=Morus notabilis OX=981085 GN=L484_004813 PE=4 SV=1)

HSP 1 Score: 324.7 bits (831), Expect = 2.7e-85
Identity = 176/318 (55.35%), Positives = 222/318 (69.81%), Query Frame = 0

Query: 74  SASKNSLAYLQTPNAAQTVF-PYINVAPLPIFHGSADECPVIHLSRFAKVCRANNATSID 133
           SAS + + +L     +QT +  Y+N+A  PIF G ++ECP  HLSRFAKVCRANN +SID
Sbjct: 92  SASHSPILHLPQQPVSQTGYNSYMNIAQFPIFRGGSEECPFAHLSRFAKVCRANNVSSID 151

Query: 134 MMMRIFPVTLEGEAALWYDLNIEPYPPISWEELKSCFLDAFNKIELTDQLRSELMTIKQR 193
           MMM+IFPVTLE EAALWYDLN+EPY  +SWEE+KS F  A+ KIELT+QLRS+LMTI Q 
Sbjct: 152 MMMKIFPVTLEDEAALWYDLNVEPYEELSWEEIKSSFYHAYGKIELTEQLRSQLMTINQG 211

Query: 194 EEESVRLYFLRLQLILKKWPPGNSLSDGLLKTIFVDGLREEFKEWMILQKPSSLNEALRL 253
           + ESVR YFLRLQ ILKKWP  + LSD LLK +FVDGLR +F+EWM  QKP SLN+ALRL
Sbjct: 212 DAESVRSYFLRLQWILKKWPE-HGLSDDLLKGVFVDGLRGDFQEWMAPQKPGSLNKALRL 271

Query: 254 AFGFEQVRTVSTSGKRGFLRCGFCEGPHEELVCEVRERMRQLW----------KSREKKN 313
           AF FEQV+++    +   ++CGFC G HEE  CEVRERMR+LW          K   ++N
Sbjct: 272 AFCFEQVKSIRNVRRNASVKCGFCGGLHEERGCEVRERMRELWLKSNKDDGLGKGMLERN 331

Query: 314 TVDV---VQSDGREAAMATAELMRSSSAISRNESEVENDG----GEMVGLKKKSQCQCWK 373
            ++    V+  GR  +MAT+   RS+  + +N+ +VE DG     E+   KK+SQCQC K
Sbjct: 332 LIEKSEGVKELGRSVSMATS---RSTCVVGKND-QVEEDGKEEEDELGSKKKRSQCQCGK 391

BLAST of Cla97C02G043590 vs. TrEMBL
Match: tr|A0A061DJI4|A0A061DJI4_THECC (Uncharacterized protein OS=Theobroma cacao OX=3641 GN=TCM_001704 PE=4 SV=1)

HSP 1 Score: 317.8 bits (813), Expect = 3.3e-83
Identity = 185/370 (50.00%), Positives = 238/370 (64.32%), Query Frame = 0

Query: 20  DYTTATDYDASPSQS-----LYASNEDDYD-------PSESVNSHPTDPKSKSLEIKPSD 79
           +Y   T    SP +S         NE+DYD        SES+ + P  PK+    ++ + 
Sbjct: 41  NYVDNTSLSHSPDESNGDDLEQPRNENDYDDFDASDFQSESMTNAPNAPKTL---LRGNG 100

Query: 80  LRTAAESASKNSLAYLQTPNAAQTVFPYINVAPLPIFHGSADECPVIHLSRFAKVCRANN 139
           L  AA   S ++ A     N  +    YIN+APLPIF GS  +CPV HLSRFAKVCRANN
Sbjct: 101 LSAAASLNSVSNSAIWSRSNLIEAT-SYINIAPLPIFQGSPSDCPVTHLSRFAKVCRANN 160

Query: 140 ATSIDMMMRIFPVTLEGEAALWYDLNIEPYPPISWEELKSCFLDAFNKIELTDQLRSELM 199
            +S+DMMMRIFPVTLE EA LWYDLNIEPYP + WEE+KS FL A++K ++T+QLR ELM
Sbjct: 161 VSSVDMMMRIFPVTLENEAGLWYDLNIEPYPSLRWEEIKSSFLQAYHKTQVTEQLRHELM 220

Query: 200 TIKQREEESVRLYFLRLQLILKKWPPGNSLSDGLLKTIFVDGLREEFKEWMILQKPSSLN 259
            I Q  EE VR YFLRLQ  L++W P + + + LLK IFVDGLRE+F++W++ QKP SL 
Sbjct: 221 MINQGSEERVRSYFLRLQWSLQRW-PDHGIPENLLKEIFVDGLREDFQDWIVPQKPDSLV 280

Query: 260 EALRLAFGFEQVRTVSTSGKRGFLRCGFCEGPHEELVCEVRERMRQLWKSREKKNTVDVV 319
           EALRLA  FEQ++++  S K+  L+C FCEG HEE  C+VRERM++LW+  + K  +D  
Sbjct: 281 EALRLAIAFEQLKSIKISRKKD-LKCDFCEGSHEERNCQVRERMKELWRKTKDKEWMDSS 340

Query: 320 Q-SDGREAAMATAELMRSSSAISRNESEVENDGGEMVG--LKKKSQCQCWKHQCGMKKLD 375
           + +   EA   +AE     SA  R E E   +G  + G   KKKS CQC KHQC  K+LD
Sbjct: 341 EKNQSNEAVNESAE----GSAEDRIEEENVVEGEMLSGRKQKKKSPCQCCKHQCWKKQLD 400

BLAST of Cla97C02G043590 vs. TrEMBL
Match: tr|A5C7E6|A5C7E6_VITVI (Uncharacterized protein OS=Vitis vinifera OX=29760 GN=VITISV_007470 PE=4 SV=1)

HSP 1 Score: 317.0 bits (811), Expect = 5.7e-83
Identity = 192/370 (51.89%), Positives = 244/370 (65.95%), Query Frame = 0

Query: 20  DYTTATDYDASPSQSLYASNE------DDYDPSESVN--SHPTDPKS-KSLEIKPSDLRT 79
           DYT     + SPSQS Y  +E        Y  +ES +  + P D  S  +LE  P   ++
Sbjct: 161 DYT-----EQSPSQSPYEFDEXXXXXQSXYTDNESASGTNAPGDQFSLPALESIPKG-KS 220

Query: 80  AAESASKNSLAYLQTPNAAQTVFPYINVAPLPIFHGSADECPVIHLSRFAKVCRANNATS 139
              S+S NS +    P    +   YIN+APLPIF GS+DECPV HLSRF KVCRANN +S
Sbjct: 221 FRPSSSLNSSSNSLNPFXQSS---YINIAPLPIFRGSSDECPVTHLSRFTKVCRANNVSS 280

Query: 140 IDMMMRIFPVTLEGEAALWYDLNIEPYPPISWEELKSCFLDAFNKIELTDQLRSELMTIK 199
           ++M+MRIFPVTL+GEAALWYDLNIEPY  +SWEE+KS FL A+++  LTD+LRSELM I 
Sbjct: 281 VEMIMRIFPVTLDGEAALWYDLNIEPYSSLSWEEIKSSFLQAYHRXGLTDELRSELMMIN 340

Query: 200 QREEESVRLYFLRLQLILKKWPPGNSLSDGLLKTIFVDGLREEFKEWMILQKPSSLNEAL 259
           Q  EESVR YFLRLQ ILK+W P + L DGLL+ IF+DGLR++F++W+I QKPSSLNEAL
Sbjct: 341 QGTEESVRSYFLRLQWILKRW-PDHGLPDGLLEGIFIDGLRKDFQDWIIPQKPSSLNEAL 400

Query: 260 RLAFGFEQVRTVSTSGKRGFLRCGFCEGPHEELVCEVRERMRQLWKSREKKNTVD----- 319
           RLAF +E+V+++    ++    CGFC G H+E  CE+RERMR LW  + KK T D     
Sbjct: 401 RLAFAWEKVQSIRGGREK---ECGFCSGGHDEEGCEIRERMRXLW-VKSKKQTRDYSGRI 460

Query: 320 VVQSDGREAAMATAELMRSSSAISRNESEVENDGGEMVGLKKKSQCQCWKHQCGMKKLDR 375
           V   DG +       +   S  + +NE E E      +G KKKSQCQC KHQC  KKL+R
Sbjct: 461 VNDEDGEKEFERRVSVGGESRBVGKNEEEGEEG---XMGWKKKSQCQCGKHQCWKKKLER 513

BLAST of Cla97C02G043590 vs. TrEMBL
Match: tr|A0A1R3GSI9|A0A1R3GSI9_9ROSI (Retrotransposon gag protein OS=Corchorus olitorius OX=93759 GN=COLO4_33577 PE=4 SV=1)

HSP 1 Score: 310.5 bits (794), Expect = 5.3e-81
Identity = 169/347 (48.70%), Positives = 228/347 (65.71%), Query Frame = 0

Query: 40  EDDYDPSESVNSHPTDPKSKSLEIKPSDLRTAAESASKNSLAYLQ---TPNAAQTVFPYI 99
           EDDYD    V++     +S + + KPS+  +   SAS NS+   +     +  + +  YI
Sbjct: 73  EDDYD--NEVDASEYQSESMANDPKPSNNGSLGASASSNSITNSEIWTQSHVTEAISSYI 132

Query: 100 NVAPLPIFHGSADECPVIHLSRFAKVCRANNATSIDMMMRIFPVTLEGEAALWYDLNIEP 159
           N+AP PIF G  +ECPV HLSRFAKVCRANN +S+DMMMRIFP+TLE EA +WYDLNIEP
Sbjct: 133 NIAPFPIFRGGPNECPVTHLSRFAKVCRANNVSSVDMMMRIFPITLEDEAGIWYDLNIEP 192

Query: 160 YPPISWEELKSCFLDAFNKIELTDQLRSELMTIKQREEESVRLYFLRLQLILKKWPPGNS 219
           YP +SWEE+KS FL A+NK ++++QLR EL  I Q  EE VR YFLRLQ  L++W P + 
Sbjct: 193 YPSLSWEEIKSSFLQAYNKTQVSEQLRHELTMINQGSEECVRSYFLRLQWSLRRW-PDHG 252

Query: 220 LSDGLLKTIFVDGLREEFKEWMILQKPSSLNEALRLAFGFEQVRTVSTSGKRGFLRCGFC 279
           + + L+K IFVDGL+E+F++W+I QKP SL EALRLAF +EQV+ +    K+  L+C FC
Sbjct: 253 IPETLIKEIFVDGLKEDFQDWIIPQKPDSLAEALRLAFAYEQVKNIKLLRKKD-LKCDFC 312

Query: 280 EGPHEELVCEVRERMRQLWKSREKKNTVDVVQSDGREAAMATAELMRSSSAISRNESEVE 339
            G HEE  C VRE+M+ LW+  + K  +D                  SSS++ ++ES  E
Sbjct: 313 GGQHEERSCLVREKMKGLWQRAKDKQLID------------------SSSSLKKDESNEE 372

Query: 340 -------NDGGEMV--GLKKKSQCQCWKHQCGMKKLDRNLSMVSKNS 375
                  ++ GEM+  G KKKSQCQC KH+C  K+L R+ S++S+NS
Sbjct: 373 VKETIDGDEEGEMMLNGKKKKSQCQCSKHRCWKKQLQRSSSLISRNS 397

The following BLAST results are available for this feature:
Match Name  E-value  Identity  Description
POE94094.1  1.3e-86  54.73     hypothetical protein CFP56_47498 [Quercus suber]
EXB78111.1  4.1e-85  55.35     hypothetical protein L484_004813 [Morus notabilis]
CAN62167.1  3.0e-83  51.89     hypothetical protein VITISV_007470 [Vitis vinifera]
EOX92844.1  5.0e-83  50.00     Uncharacterized protein TCM_001704 [Theobroma cacao]
OMO61075.1  8.0e-81  48.70     Retrotransposon gag protein [Corchorus olitorius]

Match Name                       E-value  Identity  Description
tr|A0A2P4KM35|A0A2P4KM35_QUESU   8.5e-87  54.73     Uncharacterized protein OS=Quercus suber OX=58331 GN=CFP56_47498 PE=4 SV=1
tr|W9R9S0|W9R9S0_9ROSA           2.7e-85  55.35     Uncharacterized protein OS=Morus notabilis OX=981085 GN=L484_004813 PE=4 SV=1
tr|A0A061DJI4|A0A061DJI4_THECC   3.3e-83  50.00     Uncharacterized protein OS=Theobroma cacao OX=3641 GN=TCM_001704 PE=4 SV=1
tr|A5C7E6|A5C7E6_VITVI           5.7e-83  51.89     Uncharacterized protein OS=Vitis vinifera OX=29760 GN=VITISV_007470 PE=4 SV=1
tr|A0A1R3GSI9|A0A1R3GSI9_9ROSI   5.3e-81  48.70     Retrotransposon gag protein OS=Corchorus olitorius OX=93759 GN=COLO4_33577 PE=4 ...
The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term       Definition
IPR005162  Retrotrans_gag_dom
GO Assignments
This gene is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0008150 biological_process
cellular_component GO:0005575 cellular_component
molecular_function GO:0003674 molecular_function

The following mRNA feature(s) are a part of this gene:

Feature Name       Unique Name        Type
Cla97C02G043590.1  Cla97C02G043590.1  mRNA


Analysis Name: InterPro Annotations of watermelon 97103 v2
Date Performed: 2019-05-12
IPR Term   IPR Description             Source       Source Term  Source Description   Alignment
IPR005162  Retrotransposon gag domain  PFAM         PF03732      Retrotrans_gag       coord: 136..232; e-value: 2.9E-10; score: 40.2
None       No IPR available            MOBIDB_LITE  mobidb-lite  disorder_prediction  coord: 322..341
None       No IPR available            PANTHER      PTHR33223    FAMILY NOT NAMED     coord: 95..260
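
The alignment coordinates can be used to pull the annotated region out of the protein sequence. The sketch below assumes the coordinates are 1-based, inclusive residue positions on the protein listed for this gene; `slice_1based` is a hypothetical helper, and the protein string is copied from the Protein sequence section.

```python
# Protein sequence of Cla97C02G043590, split into 60-residue chunks.
PROTEIN = (
    "MARKLRRSPSPLPFRRRNADYTTATDYDASPSQSLYASNEDDYDPSESVNSHPTDPKSKS"
    "LEIKPSDLRTAAESASKNSLAYLQTPNAAQTVFPYINVAPLPIFHGSADECPVIHLSRFA"
    "KVCRANNATSIDMMMRIFPVTLEGEAALWYDLNIEPYPPISWEELKSCFLDAFNKIELTD"
    "QLRSELMTIKQREEESVRLYFLRLQLILKKWPPGNSLSDGLLKTIFVDGLREEFKEWMIL"
    "QKPSSLNEALRLAFGFEQVRTVSTSGKRGFLRCGFCEGPHEELVCEVRERMRQLWKSREK"
    "KNTVDVVQSDGREAAMATAELMRSSSAISRNESEVENDGGEMVGLKKKSQCQCWKHQCGM"
    "KKLDRNLSMVSKNSKG"
)

def slice_1based(seq, start, end):
    """Return residues start..end, treating both bounds as 1-based inclusive."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"coordinates {start}..{end} outside 1..{len(seq)}")
    return seq[start - 1:end]

# PFAM PF03732 (Retrotrans_gag) hit, coord: 136..232
domain = slice_1based(PROTEIN, 136, 232)
print(len(domain))
print(domain)
```

With these assumptions the PFAM hit covers 97 residues. The same helper applies to the MobiDB-lite (322..341) and PANTHER (95..260) ranges.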

The following gene(s) are orthologous to this gene:
Gene             Orthologue      Organism                      Block
Cla97C02G043590  Cla013544       Watermelon (97103) v1         wmwmbB316
Cla97C02G043590  ClCG02G017800   Watermelon (Charleston Gray)  wcgwmbB138
Cla97C02G043590  CmaCh02G000800  Cucurbita maxima (Rimu)       cmawmbB622
Cla97C02G043590  CmoCh02G000780  Cucurbita moschata (Rifu)     cmowmbB597
Cla97C02G043590  Lsi10G015890    Bottle gourd (USVL1VR-Ls)     lsiwmbB050
Cla97C02G043590  Bhi10G000829    Wax gourd                     wgowmbB361
The following gene(s) are paralogous to this gene:

None