CSPI04G19780 (gene) Wild cucumber (PI 183967)

Name: CSPI04G19780
Type: gene
Organism: Cucumis sativus (Wild cucumber (PI 183967))
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Location: Chr4 : 17555797 .. 17557358 (-)
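
The location above can be turned into a feature length with a short sketch. It assumes the coordinates are 1-based and inclusive (an assumption, not stated on this page); the function name `span_length` is hypothetical.

```python
import re


def span_length(location: str) -> int:
    """Return the number of bases covered by a 'start .. end' coordinate pair."""
    match = re.search(r"(\d+)\s*\.\.\s*(\d+)", location)
    if match is None:
        raise ValueError(f"no coordinate pair found in {location!r}")
    start, end = int(match.group(1)), int(match.group(2))
    # Assumption: 1-based, inclusive coordinates, so both end points count.
    return end - start + 1


print(span_length("Chr4 : 17555797 .. 17557358 (-)"))  # 1562
```

If the database instead used 0-based, half-open coordinates, the `+ 1` would have to be dropped.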
   



The following sequences are available for this feature:

Gene sequence (with intron)

ATGAAGGCTCTTTATGGACCTCAAGATCTTTGGAATATTGTTGACATCGGATATTCAGAGCCAGAAAGTGAGAATGGTCTTTCAGCACAACAACTCAATGAGTTGAGAGATGCTAGAAAAAAGGATAAGAAGGCATTATTTTTCATCTACCAATCTGTGGATGAAAATATTTTTGAAAGAATATCAGGAGTCTCTACTGCTAAAGCAGCATGGGATGCATTGCAAAATTTGTATGAAGGAGAAGAAAAGGTAAAATTGGTCCGATTACAAACACTTCGAGCTGAATTTGATACAATTCGAATGAAAGATTCTGAAACTATTGAAGAATTTTTCAACCGTGTTCTCTTAATTGTTAATCAATTGAGATCAAATGGAGAAACAATTGAAGATCAAAGGATTGTTGAGAAGATTCTTAGAAGCATGACTAGAAGATATGAGCATATTGTTGTAGCAATTGAAGAATCCAAAGATTTGTCAACTCTCTCTATAAATAGCTTAATGGGATCTCTTCAATCTCATGAGCTCAGATTGAAGATGTTTGATTCTAATCCTTCAGAAGAAGCTTTTCATATGCAGTCCTCCTATAGAGGTCGATCCAATGGAAGAAGAGGTGGACGTGGTGGTAGAGGCAATGGACGATCCAACGTTGTAACAAATACAGAGTCAGAAAGCAGAGACAATCAATTTTTTTCAAATAGAGGACGAGGAAGAAGTTCAAATAGAGGAAGAGGTAGAAGTGGTGGTCGTGGAGATTTTTCTCACATACAATGTTTCAATTGTAGACGTTATGGACATTTTCAAGCAGACTGTTGGTCTAAGAAGACTAATTCTAATCAAGCAGAAACCACACTAATGCATGAGCAATCAAATAATGATCAAGGTCTTCTCTTCCTCACTCTCAATGTTCAAGAATCAAGCACTGAAGAAATATGGTATCTTGATAGTGGTTGTAGTAACCACATGACAGGAAGAAAGGATATTTTTATATCTTTAGATGAATCTCATCAAAATGTAGTGAAGACTGGTGACAACAAGATGCTTGAAGTCAAAGGAAAAGGAGATATTCTTGTCAAGACAAAAATGGGAGCAAAAAAAATTACTGATGTGTATTATGTTTCAGGTCTCAAACACAATCTTTTAAGTGTTGGACAACTTCTCCTAAGAGGACATGATGTTATTTTTAAAGATAAAATATGCGAGATTAGAACCAAGAATGGAGATCTCATAACGAAGGTTCGTATGACTCACAACAAAATGTTTCCAATTAAAATATGTTATGAGAAGCTTGTTTGTTTTGAGACTTTAGTAAATGACACCTCATGGTTATGGCATTGTCGATTTGGGCACCTAAGTTTTGACACTTTGTCTCACATGTGTCAACAACATATGGTGAGAGGAATGTCAAATATTAAAAAGGAAGATCAACTCTGTGAAGCATGTGTTTTCAGAAAGCATCATCGAAATTCATTTCCGACTGGAGGTTCTTGGAGAGCATCAAAACCACTCGAGCTTGTTCATACAGACTTATGTGGACCTATGAGAACTACTACACATGGAGGTAA

mRNA sequence

ATGAAGGCTCTTTATGGACCTCAAGATCTTTGGAATATTGTTGACATCGGATATTCAGAGCCAGAAAGTGAGAATGGTCTTTCAGCACAACAACTCAATGAGTTGAGAGATGCTAGAAAAAAGGATAAGAAGGCATTATTTTTCATCTACCAATCTGTGGATGAAAATATTTTTGAAAGAATATCAGGAGTCTCTACTGCTAAAGCAGCATGGGATGCATTGCAAAATTTGTATGAAGGAGAAGAAAAGGTAAAATTGGTCCGATTACAAACACTTCGAGCTGAATTTGATACAATTCGAATGAAAGATTCTGAAACTATTGAAGAATTTTTCAACCGTGTTCTCTTAATTGTTAATCAATTGAGATCAAATGGAGAAACAATTGAAGATCAAAGGATTGTTGAGAAGATTCTTAGAAGCATGACTAGAAGATATGAGCATATTGTTGTAGCAATTGAAGAATCCAAAGATTTGTCAACTCTCTCTATAAATAGCTTAATGGGATCTCTTCAATCTCATGAGCTCAGATTGAAGATGTTTGATTCTAATCCTTCAGAAGAAGCTTTTCATATGCAGTCCTCCTATAGAGGTCGATCCAATGGAAGAAGAGGTGGACGTGGTGGTAGAGGCAATGGACGATCCAACGTTGTAACAAATACAGAGTCAGAAAGCAGAGACAATCAATTTTTTTCAAATAGAGGACGAGGAAGAAGTTCAAATAGAGGAAGAGGTAGAAGTGGTGGTCGTGGAGATTTTTCTCACATACAATGTTTCAATTGTAGACGTTATGGACATTTTCAAGCAGACTGTTGGTCTAAGAAGACTAATTCTAATCAAGCAGAAACCACACTAATGCATGAGCAATCAAATAATGATCAAGGTCTTCTCTTCCTCACTCTCAATGTTCAAGAATCAAGCACTGAAGAAATATGGTATCTTGATAGTGGTTGTAGTAACCACATGACAGGAAGAAAGGATATTTTTATATCTTTAGATGAATCTCATCAAAATGTAGTGAAGACTGGTGACAACAAGATGCTTGAAGTCAAAGGAAAAGGAGATATTCTTGTCAAGACAAAAATGGGAGCAAAAAAAATTACTGATGTGTATTATGTTTCAGGTCTCAAACACAATCTTTTAAGTGTTGGACAACTTCTCCTAAGAGGACATGATGTTATTTTTAAAGATAAAATATGCGAGATTAGAACCAAGAATGGAGATCTCATAACGAAGAAAGCATCATCGAAATTCATTTCCGACTGGAGGTTCTTGGAGAGCATCAAAACCACTCGAGCTTGTTCATACAGACTTATGTGGACCTATGAGAACTACTACACATGGAGGTAA

Coding sequence (CDS)

ATGAAGGCTCTTTATGGACCTCAAGATCTTTGGAATATTGTTGACATCGGATATTCAGAGCCAGAAAGTGAGAATGGTCTTTCAGCACAACAACTCAATGAGTTGAGAGATGCTAGAAAAAAGGATAAGAAGGCATTATTTTTCATCTACCAATCTGTGGATGAAAATATTTTTGAAAGAATATCAGGAGTCTCTACTGCTAAAGCAGCATGGGATGCATTGCAAAATTTGTATGAAGGAGAAGAAAAGGTAAAATTGGTCCGATTACAAACACTTCGAGCTGAATTTGATACAATTCGAATGAAAGATTCTGAAACTATTGAAGAATTTTTCAACCGTGTTCTCTTAATTGTTAATCAATTGAGATCAAATGGAGAAACAATTGAAGATCAAAGGATTGTTGAGAAGATTCTTAGAAGCATGACTAGAAGATATGAGCATATTGTTGTAGCAATTGAAGAATCCAAAGATTTGTCAACTCTCTCTATAAATAGCTTAATGGGATCTCTTCAATCTCATGAGCTCAGATTGAAGATGTTTGATTCTAATCCTTCAGAAGAAGCTTTTCATATGCAGTCCTCCTATAGAGGTCGATCCAATGGAAGAAGAGGTGGACGTGGTGGTAGAGGCAATGGACGATCCAACGTTGTAACAAATACAGAGTCAGAAAGCAGAGACAATCAATTTTTTTCAAATAGAGGACGAGGAAGAAGTTCAAATAGAGGAAGAGGTAGAAGTGGTGGTCGTGGAGATTTTTCTCACATACAATGTTTCAATTGTAGACGTTATGGACATTTTCAAGCAGACTGTTGGTCTAAGAAGACTAATTCTAATCAAGCAGAAACCACACTAATGCATGAGCAATCAAATAATGATCAAGGTCTTCTCTTCCTCACTCTCAATGTTCAAGAATCAAGCACTGAAGAAATATGGTATCTTGATAGTGGTTGTAGTAACCACATGACAGGAAGAAAGGATATTTTTATATCTTTAGATGAATCTCATCAAAATGTAGTGAAGACTGGTGACAACAAGATGCTTGAAGTCAAAGGAAAAGGAGATATTCTTGTCAAGACAAAAATGGGAGCAAAAAAAATTACTGATGTGTATTATGTTTCAGGTCTCAAACACAATCTTTTAAGTGTTGGACAACTTCTCCTAAGAGGACATGATGTTATTTTTAAAGATAAAATATGCGAGATTAGAACCAAGAATGGAGATCTCATAACGAAGAAAGCATCATCGAAATTCATTTCCGACTGGAGGTTCTTGGAGAGCATCAAAACCACTCGAGCTTGTTCATACAGACTTATGTGGACCTATGAGAACTACTACACATGGAGGTAA
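
As a quick consistency check, the start of the coding sequence can be translated and compared with the first residues of the Query lines in the BLAST reports. The sketch below assumes the standard genetic code (NCBI translation table 1); the names `CODON_TABLE` and `translate` are illustrative, not part of any database tooling.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 54 bases of the CDS listed above.
cds_prefix = "ATGAAGGCTCTTTATGGACCTCAAGATCTTTGGAATATTGTTGACATCGGATAT"
print(translate(cds_prefix))  # MKALYGPQDLWNIVDIGY
```

The result matches the first 18 residues of the query protein shown in the alignments.
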
BLAST of CSPI04G19780 vs. Swiss-Prot
Match: POLX_TOBAC (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum PE=2 SV=1)

HSP 1 Score: 70.9 bits (172), Expect = 4.1e-11
Identity = 100/427 (23.42%), Positives = 174/427 (40.75%), Query Frame = 1

Query: 1   MKALYGPQDLWNIVDIGYSEPESENGLSAQQLNELRDARKKDKKALFFIYQSVDENIFER 60
           M+ L   Q L  ++D+   +P++   + A+   +L      D++A   I   + +++   
Sbjct: 24  MRDLLIQQGLHKVLDVDSKKPDT---MKAEDWADL------DERAASAIRLHLSDDVVNN 83

Query: 61  ISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFDTIRMKDSETIEEFFNRVLLIVNQ 120
           I    TA+  W  L++LY  +    L     L+ +   + M +        N    ++ Q
Sbjct: 84  IIDEDTARGIWTRLESLYMSKT---LTNKLYLKKQLYALHMSEGTNFLSHLNVFNGLITQ 143

Query: 121 LRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLSINSLMGSLQSHELRLKMF 180
           L + G  IE++                         D + L +NSL  S  +  L   + 
Sbjct: 144 LANLGVKIEEE-------------------------DKAILLLNSLPSSYDN--LATTIL 203

Query: 181 DSNPSEEAFHMQSSYRGRSNGRRGGRGGRGNGRSNVVTNTESESRDNQFFSNRGRGRSSN 240
               + E   + S+       R+        G++ +   TE   R  Q  SN   GRS  
Sbjct: 204 HGKTTIELKDVTSALLLNEKMRKKPEN---QGQALI---TEGRGRSYQRSSNN-YGRSG- 263

Query: 241 RGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKK-----TNSNQAETTLMHEQSNNDQGL 300
             RG+S  R       C+NC + GHF+ DC + +     T+  + +        NND  +
Sbjct: 264 -ARGKSKNRSKSRVRNCYNCNQPGHFKRDCPNPRKGKGETSGQKNDDNTAAMVQNNDNVV 323

Query: 301 LFLTLNVQE-----SSTEEIWYLDSGCSNHMTGRKDIFISLDESHQNVVKTGDNKMLEVK 360
           LF  +N +E     S  E  W +D+  S+H T  +D+F          VK G+    ++ 
Sbjct: 324 LF--INEEEECMHLSGPESEWVVDTAASHHATPVRDLFCRYVAGDFGTVKMGNTSYSKIA 383

Query: 361 GKGDILVKTKMGAKKI-TDVYYVSGLKHNLLSVGQLLLRGHDVIFKDKICEIRTKNGDLI 417
           G GDI +KT +G   +  DV +V  L+ NL+S   L   G++  F ++  + R   G L+
Sbjct: 384 GIGDICIKTNVGCTLVLKDVRHVPDLRMNLISGIALDRDGYESYFANQ--KWRLTKGSLV 398
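
The percentages on each HSP statistics line are the ratio of matching columns to alignment length. A minimal sketch for parsing such a line and recomputing the percentages follows; the function name `check_hsp_line` and the regular expression are assumptions made for illustration, and the labels are read from the line as written.

```python
import re

# Matches fragments such as "Identity = 100/427 (23.42%)".
HSP_FRACTION = re.compile(r"(\w+) = (\d+)/(\d+) \(([\d.]+)%\)")


def check_hsp_line(line: str) -> dict:
    """Map each label to (matches, alignment length, recomputed percent)."""
    results = {}
    for label, hits, length, reported in HSP_FRACTION.findall(line):
        percent = round(100 * int(hits) / int(length), 2)
        # Flag a reported percentage that disagrees with the fraction.
        if abs(percent - float(reported)) > 0.011:
            raise ValueError(f"{label}: reported {reported}, computed {percent}")
        results[label] = (int(hits), int(length), percent)
    return results


line = "Identity = 100/427 (23.42%), Positives = 174/427 (40.75%), Query Frame = 1"
print(check_hsp_line(line))
```

The `Query Frame = 1` field is skipped because it carries no fraction.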

BLAST of CSPI04G19780 vs. TrEMBL
Match: A6YTD9_CUCME (Integrase OS=Cucumis melo subsp. melo PE=4 SV=1)

HSP 1 Score: 430.3 bits (1105), Expect = 3.0e-117
Identity = 239/418 (57.18%), Positives = 293/418 (70.10%), Query Frame = 1

Query: 1   MKALYGPQDLWNIVDIGYSEPESENGLSAQQLNELRDARKKDKKALFFIYQSVDENIFER 60
           MK LYG Q+LW+IV+ GY+E E+++ L+ QQL ELR+ R KDKKALFFIYQ+VDE I ER
Sbjct: 27  MKVLYGSQELWDIVERGYTEVENQSELTNQQLVELRENRNKDKKALFFIYQAVDEFISER 86

Query: 61  ISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFDTIRMKDSETIEEFFNRVLLIVNQ 120
           IS  ++AKAAWD L++ Y+GE+KVK++RLQ LR+EFD I+MK++ETIEEFFN +L+IVN 
Sbjct: 87  ISTATSAKAAWDILRSTYQGEDKVKMIRLQALRSEFDCIKMKETETIEEFFNHILVIVNS 146

Query: 121 LRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLSINSLMGSLQSHELRLKMF 180
           LRSNGE + DQR+VEKILRSM R++EHIVVAIEESKDLSTLSINSLMGSLQSHELRLK F
Sbjct: 147 LRSNGEEVGDQRVVEKILRSMPRKFEHIVVAIEESKDLSTLSINSLMGSLQSHELRLKQF 206

Query: 181 DSNPSEEAFHMQSSYRGRSNGRRGGRGGRGNGR---SNVVTNTESESRDNQFFSNRGRGR 240
           D NP EEAF MQ+S+RG S GRRGG G RG GR   +    N+E+    +     RG GR
Sbjct: 207 DVNP-EEAFQMQTSFRGGSRGRRGGHGRRGGGRNYDNRSGANSENSQESSSLSRGRGSGR 266

Query: 241 SSNRGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKKTNSNQAETTLMHEQSNNDQGLLF 300
               GR + GGRG+FS IQCFNCR+YGHFQADCW+ K         +  EQ  ND+G+LF
Sbjct: 267 RRGFGRNQGGGRGNFSQIQCFNCRKYGHFQADCWALKNGVGNTTMNMHKEQKKNDEGILF 326

Query: 301 LTLNVQESSTEEIWYLDSGCSNHMTGRKDIFISLDESHQNVVK----TGDNKMLEVKGKG 360
           L  +VQ+                                NVVK     GDN  L+VKG+G
Sbjct: 327 LACSVQD--------------------------------NVVKPTCEDGDNTRLQVKGQG 386

Query: 361 DILVKTKMGAKKITDVYYVSGLKHNLLSVGQLLLRGHDVIFKDKICEIRTKNGDLITK 412
           DILVKTK   K++T+V+YV GLKHNLLS+GQLL RG  V F+  IC I+ +   LI+K
Sbjct: 387 DILVKTKKRTKRVTNVFYVPGLKHNLLSIGQLLQRGLKVSFEGDICAIKDQADVLISK 411

BLAST of CSPI04G19780 vs. TrEMBL
Match: A0A068B703_GOSBA (Polyprotein OS=Gossypium barbadense PE=4 SV=1)

HSP 1 Score: 343.2 bits (879), Expect = 4.8e-91
Identity = 199/434 (45.85%), Positives = 277/434 (63.82%), Query Frame = 1

Query: 1   MKALYGPQDLWNIVDIGYSEP---ESENGLSAQQLNELRDARKKDKKALFFIYQSVDENI 60
           MKAL G QD W IV+ GY EP    +E  LS      LR+ARKKD+KAL  I+Q +DE+ 
Sbjct: 24  MKALLGSQDCWEIVEKGYIEPGDAATEAALSNDAKKALREARKKDQKALNSIFQGMDEST 83

Query: 61  FERISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFDTIRMKDSETIEEFFNRVLLI 120
           FE+IS V  AK AW+ LQ  ++G EK K VRLQ+LRAEF+ ++MK SE I+++ NRV  +
Sbjct: 84  FEKISDVKNAKNAWEILQKSFQGVEKAKKVRLQSLRAEFEMLKMKSSENIDDYANRVKSV 143

Query: 121 VNQLRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLSINSLMGSLQSHELRL 180
           VN+++ NGET+++ R++EKILRS+TR++E++VVAIEESKDLS +S+  L+GSLQ+HE ++
Sbjct: 144 VNEMKRNGETLDEVRVMEKILRSLTRKFEYVVVAIEESKDLSKMSLEELVGSLQAHEQKM 203

Query: 181 KM-FDSNPSEEAFHMQ------------SSYRGRSNGRRGG-----RGGRGN-GRSNVVT 240
           K+  DS    +A H +            S  RG   G RGG     RGGRG+ GR N   
Sbjct: 204 KLNEDSENLNQALHSKLSIDDGETSNNFSQGRGNRRGYRGGYRGGNRGGRGSRGRGNQSY 263

Query: 241 NTESESRDNQFFSNRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKKTNSN 300
               E++D Q  SNRGRG   +RGRGR   + + S +QC+NC +YGHF  +C S      
Sbjct: 264 GRYQENKDYQ-TSNRGRG---SRGRGRGRFQENKSQVQCYNCNKYGHFSYECRSTHKVDE 323

Query: 301 QAETTLMHEQSNNDQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGRKDIFISLDESHQNV 360
           +    +  E +   +  +FLT    E     +WYLD+G SNHM GRK++F  LDE+    
Sbjct: 324 RNHVAVAAEGNEKVESSVFLTYGENEDRKRSVWYLDNGASNHMCGRKELFTELDETVHGQ 383

Query: 361 VKTGDNKMLEVKGKGDILVKTKMGAKK-ITDVYYVSGLKHNLLSVGQLLLRGHDVIFKDK 412
           +  GDN   E+KGKG +++  + G KK I+DVYYV  LK NL+S+GQLL +G++V  KD+
Sbjct: 384 ITFGDNSHAEIKGKGKVVITQRNGEKKYISDVYYVPALKSNLISLGQLLEKGYEVHMKDR 443

BLAST of CSPI04G19780 vs. TrEMBL
Match: V9H129_ARATH (Lectin receptor kinase (Fragment) OS=Arabidopsis thaliana PE=4 SV=1)

HSP 1 Score: 321.2 bits (822), Expect = 1.9e-84
Identity = 184/426 (43.19%), Positives = 266/426 (62.44%), Query Frame = 1

Query: 1   MKALYGPQDLWNIVDIGYSEPESENGLSAQQLNELRDARKKDKKALFFIYQSVDENIFER 60
           MKA+ G  D+W IV+ G+ EPE+E  LS  Q + LRD+RK+DKKAL  IYQ +DE+ FE+
Sbjct: 41  MKAILGAHDVWEIVEKGFIEPENEGSLSQTQKDGLRDSRKRDKKALCLIYQGLDEDTFEK 100

Query: 61  ISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFDTIRMKDSETIEEFFNRVLLIVNQ 120
           +   ++AK AW+ L+  Y+G ++VK VRLQTLR EF+ ++MK+ E + ++F+RVL + N 
Sbjct: 101 VVEATSAKEAWEKLRTSYKGADQVKKVRLQTLRGEFEALQMKEGELVSDYFSRVLTVTNN 160

Query: 121 LRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLSINSLMGSLQSHELRLKMF 180
           L+ NGE ++D RI+EK+LRS+  ++EHIV  IEE+KDL  ++I  L+GSLQ++E + K  
Sbjct: 161 LKRNGEKLDDVRIMEKVLRSLDLKFEHIVTVIEETKDLEAMTIEQLLGSLQAYEEKKKKK 220

Query: 181 DSNPSEEAFHMQ-------SSYRGRSNGR-RG-GRGGRGNGRSNVVTNTESESRDNQFFS 240
           + +  E+  +MQ        SY+ R  G+ RG GRGG GNGR         E   NQ   
Sbjct: 221 E-DIVEQVLNMQITKEENGQSYQRRGGGQVRGRGRGGYGNGRGW----RPHEDNTNQ--- 280

Query: 241 NRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKKTNSNQAETTLMHEQSNN 300
            RG   S  RG+G    R D S ++C+NC ++GH+ ++C +      + +   + E+   
Sbjct: 281 -RGENSSRGRGKGHPKSRYDKSSVKCYNCGKFGHYASECKAPSNKKFKEKANYVEEKIQE 340

Query: 301 DQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGRKDIFISLDESHQNVVKTGDNKMLEVKG 360
           +  LL  +    E      WYLDSG SNHM GRK +F  LDES +  V  GD   +EVKG
Sbjct: 341 EDMLLMASYKKDEQEENHKWYLDSGASNHMCGRKSMFAELDESVRGNVALGDESKMEVKG 400

Query: 361 KGDILVKTKMGAKK-ITDVYYVSGLKHNLLSVGQLLLRGHDVIFKDKICEIRTKNGDLIT 417
           KG+IL++ K G  + I++VYY+  +K N+LS+GQLL +G+D+  KD    IR K  +LIT
Sbjct: 401 KGNILIRLKNGDHQFISNVYYIPSMKTNILSLGQLLEKGYDIRLKDNNLSIRDKESNLIT 457

BLAST of CSPI04G19780 vs. TrEMBL
Match: Q9M197_ARATH (Copia-type reverse transcriptase-like protein OS=Arabidopsis thaliana GN=T16L24.270 PE=4 SV=1)

HSP 1 Score: 321.2 bits (822), Expect = 1.9e-84
Identity = 184/426 (43.19%), Positives = 266/426 (62.44%), Query Frame = 1

Query: 1   MKALYGPQDLWNIVDIGYSEPESENGLSAQQLNELRDARKKDKKALFFIYQSVDENIFER 60
           MKA+ G  D+W IV+ G+ EPE+E  LS  Q + LRD+RK+DKKAL  IYQ +DE+ FE+
Sbjct: 25  MKAILGAHDVWEIVEKGFIEPENEGSLSQTQKDGLRDSRKRDKKALCLIYQGLDEDTFEK 84

Query: 61  ISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFDTIRMKDSETIEEFFNRVLLIVNQ 120
           +   ++AK AW+ L+  Y+G ++VK VRLQTLR EF+ ++MK+ E + ++F+RVL + N 
Sbjct: 85  VVEATSAKEAWEKLRTSYKGADQVKKVRLQTLRGEFEALQMKEGELVSDYFSRVLTVTNN 144

Query: 121 LRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLSINSLMGSLQSHELRLKMF 180
           L+ NGE ++D RI+EK+LRS+  ++EHIV  IEE+KDL  ++I  L+GSLQ++E + K  
Sbjct: 145 LKRNGEKLDDVRIMEKVLRSLDLKFEHIVTVIEETKDLEAMTIEQLLGSLQAYEEKKKKK 204

Query: 181 DSNPSEEAFHMQ-------SSYRGRSNGR-RG-GRGGRGNGRSNVVTNTESESRDNQFFS 240
           + +  E+  +MQ        SY+ R  G+ RG GRGG GNGR         E   NQ   
Sbjct: 205 E-DIVEQVLNMQITKEENGQSYQRRGGGQVRGRGRGGYGNGRGW----RPHEDNTNQ--- 264

Query: 241 NRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKKTNSNQAETTLMHEQSNN 300
            RG   S  RG+G    R D S ++C+NC ++GH+ ++C +      + +   + E+   
Sbjct: 265 -RGENSSRGRGKGHPKSRYDKSSVKCYNCGKFGHYASECKAPSNKKFKEKANYVEEKIQE 324

Query: 301 DQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGRKDIFISLDESHQNVVKTGDNKMLEVKG 360
           +  LL  +    E      WYLDSG SNHM GRK +F  LDES +  V  GD   +EVKG
Sbjct: 325 EDMLLMASYKKDEQEENHKWYLDSGASNHMCGRKSMFAELDESVRGNVALGDESKMEVKG 384

Query: 361 KGDILVKTKMGAKK-ITDVYYVSGLKHNLLSVGQLLLRGHDVIFKDKICEIRTKNGDLIT 417
           KG+IL++ K G  + I++VYY+  +K N+LS+GQLL +G+D+  KD    IR K  +LIT
Sbjct: 385 KGNILIRLKNGDHQFISNVYYIPSMKTNILSLGQLLEKGYDIRLKDNNLSIRDKESNLIT 441

BLAST of CSPI04G19780 vs. TrEMBL
Match: Q9M2D1_ARATH (Copia-type polyprotein OS=Arabidopsis thaliana GN=T20K12.230 PE=4 SV=1)

HSP 1 Score: 320.5 bits (820), Expect = 3.3e-84
Identity = 183/426 (42.96%), Positives = 267/426 (62.68%), Query Frame = 1

Query: 1   MKALYGPQDLWNIVDIGYSEPESENGLSAQQLNELRDARKKDKKALFFIYQSVDENIFER 60
           MKA+ G  D+W IV+ G+ EPE+E  LS  Q + LRD+RK+DKKAL  IYQ +DE+ FE+
Sbjct: 25  MKAILGAHDVWEIVEKGFIEPENEGSLSQTQKDGLRDSRKRDKKALCLIYQGLDEDTFEK 84

Query: 61  ISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFDTIRMKDSETIEEFFNRVLLIVNQ 120
           +   ++AK AW+ L+  Y+G ++VK VRLQTLR EF+ ++MK+ E + ++F+RVL + N 
Sbjct: 85  VVEATSAKEAWEKLRTSYKGADQVKKVRLQTLRGEFEALQMKEGELVSDYFSRVLTVTNN 144

Query: 121 LRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLSINSLMGSLQSHELRLKMF 180
           L+ NGE ++D RI+EK+LRS+  ++EHIV  IEE+KDL  ++I  L+GSLQ++E + K  
Sbjct: 145 LKRNGEKLDDVRIMEKVLRSLDLKFEHIVTVIEETKDLEAMTIEQLLGSLQAYEEKKKKK 204

Query: 181 DSNPSEEAFHMQ-------SSYRGRSNGR-RG-GRGGRGNGRSNVVTNTESESRDNQFFS 240
           + + +E+  +MQ        SY+ R  G+ RG GRGG GNGR         E   NQ   
Sbjct: 205 E-DIAEQVLNMQITKEENGQSYQRRGGGQVRGRGRGGYGNGRGW----RPHEDNTNQ--- 264

Query: 241 NRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKKTNSNQAETTLMHEQSNN 300
            RG   S  RG+G    R D S ++C+NC ++GH+ ++C +      + +   + E+   
Sbjct: 265 -RGENSSRGRGKGHPKSRYDKSSVKCYNCGKFGHYASECKAPSNKKFEEKAHYVEEKIQE 324

Query: 301 DQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGRKDIFISLDESHQNVVKTGDNKMLEVKG 360
           +  LL  +    E      WYLDSG SNHM GRK +F  LDES +  V  GD   +EVKG
Sbjct: 325 EDMLLMASYKKDEQKENHKWYLDSGASNHMCGRKSMFAELDESVRGNVALGDESKMEVKG 384

Query: 361 KGDILVKTKMGAKK-ITDVYYVSGLKHNLLSVGQLLLRGHDVIFKDKICEIRTKNGDLIT 417
           KG+IL++ K G  + I++VYY+  +K N+LS+GQLL +G+D+  KD    IR +  +LIT
Sbjct: 385 KGNILIRLKNGDHQFISNVYYIPSMKTNILSLGQLLEKGYDIRLKDNNLSIRDQESNLIT 441

BLAST of CSPI04G19780 vs. TAIR10
Match: AT1G48720.1 (AT1G48720.1 unknown protein)

HSP 1 Score: 76.6 bits (187), Expect = 4.2e-14
Identity = 35/68 (51.47%), Positives = 49/68 (72.06%), Query Frame = 1

Query: 1  MKALYGPQDLWNIVDIGYSEPESENGLSAQQLNELRDARKKDKKALFFIYQSVDENIFER 60
          MKA+ G  D+W IV+ G+ EPE+E  LS  Q + LRD+RK+DKKAL  IYQ +DE+ FE+
Sbjct: 25 MKAILGAHDVWEIVEKGFIEPENEGSLSQTQKDGLRDSRKRDKKALCLIYQGLDEDTFEK 84

Query: 61 ISGVSTAK 69
          +   ++AK
Sbjct: 85 VVEATSAK 92

BLAST of CSPI04G19780 vs. TAIR10
Match: AT3G20980.1 (AT3G20980.1 Gag-Pol-related retrotransposon family protein)

HSP 1 Score: 57.4 bits (137), Expect = 2.7e-08
Identity = 37/90 (41.11%), Positives = 48/90 (53.33%), Query Frame = 1

Query: 308 EEIWYLDSGCSNHMTGRKDIFISLDESHQNVVK--TGDNK---MLEVKGKGDILVKTKMG 367
           E IW + S  SNHMT     F +LD S +  VK  +GD     +  V+G GD+   T  G
Sbjct: 266 ENIWLISSTNSNHMTPHVKFFTTLDRSRKCKVKFISGDKSETTVAMVEGIGDVTFITNEG 325

Query: 368 AKKITDVYYVSGLKHNLLSVGQLLLRGHDV 393
            K I +V YV G++ N LSV QL   G +V
Sbjct: 326 NKTIKNVLYVPGIEGNALSVSQLKRNGFEV 355

BLAST of CSPI04G19780 vs. TAIR10
Match: AT3G21000.1 (AT3G21000.1 Gag-Pol-related retrotransposon family protein)

HSP 1 Score: 55.8 bits (133), Expect = 7.7e-08
Identity = 46/206 (22.33%), Positives = 97/206 (47.09%), Query Frame = 1

Query: 8   QDLWNIVDIGYSEPESENG-----LSAQQLNELRDARKKDKKALFFIYQSVDENIFERIS 67
           Q LW++V  G  +  S+N      +  ++L++ RD   KD KAL  +  S+ +++F +  
Sbjct: 31  QGLWDVVVNGVPQDPSKNPELAATIQPEELSKWRDFVVKDAKALQILQSSLTDSVFRKTL 90

Query: 68  GVSTAKAAWDALQNLYEGEEKVKLVRLQ-----TLRAEFDTIRMKDSETIEEFFNRVLLI 127
             S+AK  WD L+   +G E+  + RL+      L  + + ++M D E+   + ++ L I
Sbjct: 91  SASSAKDVWDLLR---KGNEQATIRRLEQVTIRRLEKQLEDLKMVDKESGSSYLDKALEI 150

Query: 128 VNQLRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLSINSLMGSLQSHELRL 187
           + +L        D  I + +  +++  ++ +   +EE  D+  ++  SL+          
Sbjct: 151 LERLGRAKLEKSDYEICKNVFTTLSGSFDGLDSMLEELIDVHKMTSKSLV-----EYFYY 210

Query: 188 KMFDSNPSEEAFHMQSSYRGRSNGRR 204
           ++ +S+  E  F +    R +S   +
Sbjct: 211 RVHESSTEEAIFGLLKDLRLKSKSEK 228

BLAST of CSPI04G19780 vs. NCBI nr
Match: gi|659099180|ref|XP_008450471.1| (PREDICTED: uncharacterized protein LOC103492064 [Cucumis melo])

HSP 1 Score: 491.9 bits (1265), Expect = 1.2e-135
Identity = 257/411 (62.53%), Positives = 320/411 (77.86%), Query Frame = 1

Query: 1   MKALYGPQDLWNIVDIGYSEPESENGLSAQQLNELRDARKKDKKALFFIYQSVDENIFER 60
           MK LY  Q+LW+IV+ GY+E E+++ L+ QQL ELR+  KKDKKALFFIYQ VDE IFER
Sbjct: 27  MKVLYDSQELWDIVERGYTEVENQSELTNQQLVELRENCKKDKKALFFIYQVVDEFIFER 86

Query: 61  ISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFDTIRMKDSETIEEFFNRVLLIVNQ 120
           IS  ++AKAAWD L++ Y+GE+KVK++RLQ LR+EFD I+MK++E IEEFFNR+L+IVN 
Sbjct: 87  ISTATSAKAAWDILRSTYQGEDKVKMIRLQALRSEFDCIKMKETEPIEEFFNRILVIVNS 146

Query: 121 LRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLSINSLMGSLQSHELRLKMF 180
           LRSNGE + DQR+VEKILRSM R++EHI+VAIEESKDLST SINSLMGSLQSHELRLK F
Sbjct: 147 LRSNGEEVGDQRVVEKILRSMPRKFEHIIVAIEESKDLSTFSINSLMGSLQSHELRLKQF 206

Query: 181 DSNPSEEAFHMQSSYRGRSNGRRGGRGGRGNGRSNVVTNTESESRDNQFFSNRGRGRSSN 240
           D +P EEAF MQ+S+RG S GRRGG G RG+GR N    + + S ++Q  S+  RGR   
Sbjct: 207 DVDP-EEAFQMQTSFRGGSRGRRGGHGRRGDGR-NYDNRSGANSENSQEISSLSRGRGF- 266

Query: 241 RGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKKTNSNQAETTLMHEQSNNDQGLLFLTL 300
            GR + GGRG+FS IQCF CR+YGHFQADCW+ K         +  EQ  ND+G+LFL  
Sbjct: 267 -GRNQGGGRGNFSQIQCFKCRKYGHFQADCWALKNGVGNTTMNMHKEQKKNDEGILFLAC 326

Query: 301 NVQESSTEEIWYLDSGCSNHMTGRKDIFISLDESHQNVVKTGDNKMLEVKGKGDILVKTK 360
           +VQ++  E  WYLDSGCSNHMTG ++IF++LDES Q+ VKTGDN  L+VKG+GDILVKTK
Sbjct: 327 SVQDNVVEPTWYLDSGCSNHMTGNRNIFVTLDESFQSEVKTGDNTRLQVKGQGDILVKTK 386

Query: 361 MGAKKITDVYYVSGLKHNLLSVGQLLLRGHDVIFKDKICEIRTKNGDLITK 412
            G K++T+V+YV GLKHNLLS+GQLL +G  V F+  IC I+ + G LI K
Sbjct: 387 KGTKRVTNVFYVPGLKHNLLSIGQLLQQGLKVSFEGDICAIKDQAGVLIAK 433

BLAST of CSPI04G19780 vs. NCBI nr
Match: gi|150036244|gb|ABR67407.1| (integrase [Cucumis melo subsp. melo])

HSP 1 Score: 430.3 bits (1105), Expect = 4.3e-117
Identity = 239/418 (57.18%), Positives = 293/418 (70.10%), Query Frame = 1

Query: 1   MKALYGPQDLWNIVDIGYSEPESENGLSAQQLNELRDARKKDKKALFFIYQSVDENIFER 60
           MK LYG Q+LW+IV+ GY+E E+++ L+ QQL ELR+ R KDKKALFFIYQ+VDE I ER
Sbjct: 27  MKVLYGSQELWDIVERGYTEVENQSELTNQQLVELRENRNKDKKALFFIYQAVDEFISER 86

Query: 61  ISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFDTIRMKDSETIEEFFNRVLLIVNQ 120
           IS  ++AKAAWD L++ Y+GE+KVK++RLQ LR+EFD I+MK++ETIEEFFN +L+IVN 
Sbjct: 87  ISTATSAKAAWDILRSTYQGEDKVKMIRLQALRSEFDCIKMKETETIEEFFNHILVIVNS 146

Query: 121 LRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLSINSLMGSLQSHELRLKMF 180
           LRSNGE + DQR+VEKILRSM R++EHIVVAIEESKDLSTLSINSLMGSLQSHELRLK F
Sbjct: 147 LRSNGEEVGDQRVVEKILRSMPRKFEHIVVAIEESKDLSTLSINSLMGSLQSHELRLKQF 206

Query: 181 DSNPSEEAFHMQSSYRGRSNGRRGGRGGRGNGR---SNVVTNTESESRDNQFFSNRGRGR 240
           D NP EEAF MQ+S+RG S GRRGG G RG GR   +    N+E+    +     RG GR
Sbjct: 207 DVNP-EEAFQMQTSFRGGSRGRRGGHGRRGGGRNYDNRSGANSENSQESSSLSRGRGSGR 266

Query: 241 SSNRGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKKTNSNQAETTLMHEQSNNDQGLLF 300
               GR + GGRG+FS IQCFNCR+YGHFQADCW+ K         +  EQ  ND+G+LF
Sbjct: 267 RRGFGRNQGGGRGNFSQIQCFNCRKYGHFQADCWALKNGVGNTTMNMHKEQKKNDEGILF 326

Query: 301 LTLNVQESSTEEIWYLDSGCSNHMTGRKDIFISLDESHQNVVK----TGDNKMLEVKGKG 360
           L  +VQ+                                NVVK     GDN  L+VKG+G
Sbjct: 327 LACSVQD--------------------------------NVVKPTCEDGDNTRLQVKGQG 386

Query: 361 DILVKTKMGAKKITDVYYVSGLKHNLLSVGQLLLRGHDVIFKDKICEIRTKNGDLITK 412
           DILVKTK   K++T+V+YV GLKHNLLS+GQLL RG  V F+  IC I+ +   LI+K
Sbjct: 387 DILVKTKKRTKRVTNVFYVPGLKHNLLSIGQLLQRGLKVSFEGDICAIKDQADVLISK 411

BLAST of CSPI04G19780 vs. NCBI nr
Match: gi|651219311|gb|AIC77183.1| (polyprotein [Gossypium barbadense])

HSP 1 Score: 343.2 bits (879), Expect = 6.9e-91
Identity = 199/434 (45.85%), Positives = 277/434 (63.82%), Query Frame = 1

Query: 1   MKALYGPQDLWNIVDIGYSEP---ESENGLSAQQLNELRDARKKDKKALFFIYQSVDENI 60
           MKAL G QD W IV+ GY EP    +E  LS      LR+ARKKD+KAL  I+Q +DE+ 
Sbjct: 24  MKALLGSQDCWEIVEKGYIEPGDAATEAALSNDAKKALREARKKDQKALNSIFQGMDEST 83

Query: 61  FERISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFDTIRMKDSETIEEFFNRVLLI 120
           FE+IS V  AK AW+ LQ  ++G EK K VRLQ+LRAEF+ ++MK SE I+++ NRV  +
Sbjct: 84  FEKISDVKNAKNAWEILQKSFQGVEKAKKVRLQSLRAEFEMLKMKSSENIDDYANRVKSV 143

Query: 121 VNQLRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLSINSLMGSLQSHELRL 180
           VN+++ NGET+++ R++EKILRS+TR++E++VVAIEESKDLS +S+  L+GSLQ+HE ++
Sbjct: 144 VNEMKRNGETLDEVRVMEKILRSLTRKFEYVVVAIEESKDLSKMSLEELVGSLQAHEQKM 203

Query: 181 KM-FDSNPSEEAFHMQ------------SSYRGRSNGRRGG-----RGGRGN-GRSNVVT 240
           K+  DS    +A H +            S  RG   G RGG     RGGRG+ GR N   
Sbjct: 204 KLNEDSENLNQALHSKLSIDDGETSNNFSQGRGNRRGYRGGYRGGNRGGRGSRGRGNQSY 263

Query: 241 NTESESRDNQFFSNRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKKTNSN 300
               E++D Q  SNRGRG   +RGRGR   + + S +QC+NC +YGHF  +C S      
Sbjct: 264 GRYQENKDYQ-TSNRGRG---SRGRGRGRFQENKSQVQCYNCNKYGHFSYECRSTHKVDE 323

Query: 301 QAETTLMHEQSNNDQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGRKDIFISLDESHQNV 360
           +    +  E +   +  +FLT    E     +WYLD+G SNHM GRK++F  LDE+    
Sbjct: 324 RNHVAVAAEGNEKVESSVFLTYGENEDRKRSVWYLDNGASNHMCGRKELFTELDETVHGQ 383

Query: 361 VKTGDNKMLEVKGKGDILVKTKMGAKK-ITDVYYVSGLKHNLLSVGQLLLRGHDVIFKDK 412
           +  GDN   E+KGKG +++  + G KK I+DVYYV  LK NL+S+GQLL +G++V  KD+
Sbjct: 384 ITFGDNSHAEIKGKGKVVITQRNGEKKYISDVYYVPALKSNLISLGQLLEKGYEVHMKDR 443

BLAST of CSPI04G19780 vs. NCBI nr
Match: gi|923871285|ref|XP_013709794.1| (PREDICTED: uncharacterized protein LOC106413593, partial [Brassica napus])

HSP 1 Score: 326.2 bits (835), Expect = 8.7e-86
Identity = 188/449 (41.87%), Positives = 284/449 (63.25%), Query Frame = 1

Query: 1   MKALYGPQDLWNIVDIGYSEPESENGLSAQQLNELRDARKKDKKALFFIYQSVDENIFER 60
           + A+ G  D+W+IV+ G++EPE++ GLS  Q + LRD+RK+DKKAL  IYQ +DE+ FE+
Sbjct: 37  LMAILGAHDVWDIVEKGFNEPENDGGLSQTQKDGLRDSRKRDKKALCLIYQGLDEDTFEK 96

Query: 61  ISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFDTIRMKDSETIEEFFNRVLLIVNQ 120
           ++G  T+K AW+ LQ  Y+G E+VK VRLQTLR EF+ ++MK+ E I ++F+RVL + N 
Sbjct: 97  VAGAKTSKEAWEKLQTSYKGAEQVKKVRLQTLRGEFEALQMKEGELISDYFSRVLTVTNN 156

Query: 121 LRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLSINSLMGSLQSHELRLKMF 180
           L+ NGE ++D RI+EK+LRS+  ++EHIV  IEE+KDL T++I  L+GSLQ++E + K  
Sbjct: 157 LKRNGEKLDDVRIMEKVLRSLDSKFEHIVTVIEETKDLETMTIEQLLGSLQAYEEKKKKK 216

Query: 181 DSNPSEEAFHMQSSYR---GRSNGRRGGRGGRGNGRSNVVTNTESESRDNQFFSNRGRGR 240
           + +  E+   M+ +++   GRSN RRGG   RG GR     N          F+ RG   
Sbjct: 217 E-DIVEQVLKMRINHKEESGRSNLRRGGGHFRGRGRG---VNGRGWRPYEDNFNQRGENS 276

Query: 241 SSNRGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKKTNSNQAETTLMH-EQSNNDQGLL 300
           S  RGRG    R D S I+C++C ++GH+ ++C  K  N+N+ E    + E+   ++ +L
Sbjct: 277 SRGRGRGNPKSRYDKSSIKCYSCGKFGHYASEC--KTPNNNRVEEKSNYVEEMIKEKDML 336

Query: 301 FLTLNVQESSTEEIWYLDSGCSNHMTGRKDIFISLDESHQNVVKTGDNKMLEVKGKGDIL 360
            +     E +    WYLDSG SNHM G K +F+ LDES +  V  GD   +EVKGKG+IL
Sbjct: 337 LMAYKKDEPNEVHKWYLDSGASNHMCGNKSMFVELDESVKTDVALGDESKMEVKGKGNIL 396

Query: 361 VKTKMGAKK-ITDVYYVSGLKHNLLSVGQLLLRGHDVIFKDKICEIRTKNGDLITKKASS 420
           ++ K G  + I +VYY+  +K N+LS+GQLL +G+D+  KD    +R    +LITK   S
Sbjct: 397 IRLKNGDHQFIFNVYYIPSMKTNILSLGQLLEKGYDIRLKDNSLSLRDNANNLITKVPMS 456

Query: 421 KFISDWRFLESIKTTRACSYRLMWTYENY 445
              S+  F+ +I+   A   ++ +  E++
Sbjct: 457 ---SNRMFVLNIQNDIARCLKMCYKEESW 476

BLAST of CSPI04G19780 vs. NCBI nr
Match: gi|685311121|ref|XP_009146135.1| (PREDICTED: uncharacterized protein LOC103869817, partial [Brassica rapa])

HSP 1 Score: 326.2 bits (835), Expect = 8.7e-86
Identity = 186/449 (41.43%), Positives = 284/449 (63.25%), Query Frame = 1

Query: 1   MKALYGPQDLWNIVDIGYSEPESENGLSAQQLNELRDARKKDKKALFFIYQSVDENIFER 60
           M A+ G  D+W IV+ G++EPE++ GLS  Q + LRD+RK+DKKAL  IYQ +DE+ FE+
Sbjct: 78  MMAILGAHDVWEIVEKGFNEPENDGGLSQTQKDGLRDSRKRDKKALCLIYQGLDEDTFEK 137

Query: 61  ISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFDTIRMKDSETIEEFFNRVLLIVNQ 120
           ++G  T+K AWD L+  Y+G E+VK VRLQTLR EF+ ++MK+ E I ++F+RVL + N 
Sbjct: 138 VAGAKTSKEAWDKLRTSYKGAEQVKKVRLQTLRGEFEALQMKEGELISDYFSRVLTVTNN 197

Query: 121 LRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLSINSLMGSLQSHELRLKMF 180
           L+ NGE +++ RI+EK+LRS+  ++EHIV  IEE+KDL T+++  L+GSLQ++E + K  
Sbjct: 198 LKRNGEKLDEVRIMEKVLRSLDSKFEHIVTVIEETKDLETMTMEQLLGSLQAYEEKKKKK 257

Query: 181 DSNPSEEAFHMQSSYR---GRSNGRRGGRGGRGNGRSNVVTNTESESRDNQFFSNRGRGR 240
           + +  E+   M+  ++   GRSN RRGG   RG GR     N        + F+ RG   
Sbjct: 258 E-DIVEQVLKMRIDHKEESGRSNPRRGGGHFRGRGRG---VNGRGWRPYEENFNQRGENS 317

Query: 241 SSNRGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKKTNSNQAETTLMH-EQSNNDQGLL 300
           S  RGRG    R D S I+C++C ++GH+ ++C  K  N+N+ E    + E+ + ++ +L
Sbjct: 318 SRGRGRGNPKSRYDKSSIKCYSCGKFGHYASEC--KTPNNNRVEEKSNYVEERHKEEDIL 377

Query: 301 FLTLNVQESSTEEIWYLDSGCSNHMTGRKDIFISLDESHQNVVKTGDNKMLEVKGKGDIL 360
            +     E +    WYLDSG S HM G K +F+ LDES +  V  GD   +EVKGKG+IL
Sbjct: 378 LMAYKKDEPNEVHKWYLDSGASTHMCGNKSMFVELDESVKTNVALGDESKMEVKGKGNIL 437

Query: 361 VKTKMGAKK-ITDVYYVSGLKHNLLSVGQLLLRGHDVIFKDKICEIRTKNGDLITKKASS 420
           ++ K G  + I++VYY+  +K N+LS+GQLL +G+D+  KD    +R    +LITK   S
Sbjct: 438 IRLKNGDHQFISNVYYIPSMKTNILSLGQLLEKGYDIRLKDNSLSLRDNANNLITKVPMS 497

Query: 421 KFISDWRFLESIKTTRACSYRLMWTYENY 445
              S+  F+ +I+   A   ++ +  E++
Sbjct: 498 ---SNRMFVLNIQNDIARCLKMCYKEESW 517

The following BLAST results are available for this feature:
Swiss-Prot:
Match Name    E-value    Identity (%)  Description
POLX_TOBAC    4.1e-11    23.42         Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum...

TrEMBL:
Match Name         E-value     Identity (%)  Description
A6YTD9_CUCME       3.0e-117    57.18         Integrase OS=Cucumis melo subsp. melo PE=4 SV=1
A0A068B703_GOSBA   4.8e-91     45.85         Polyprotein OS=Gossypium barbadense PE=4 SV=1
V9H129_ARATH       1.9e-84     43.19         Lectin receptor kinase (Fragment) OS=Arabidopsis thaliana PE=4 SV=1
Q9M197_ARATH       1.9e-84     43.19         Copia-type reverse transcriptase-like protein OS=Arabidopsis thaliana GN=T16L24....
Q9M2D1_ARATH       3.3e-84     42.96         Copia-type polyprotein OS=Arabidopsis thaliana GN=T20K12.230 PE=4 SV=1

TAIR10:
Match Name     E-value    Identity (%)  Description
AT1G48720.1    4.2e-14    51.47         unknown protein
AT3G20980.1    2.7e-08    41.11         Gag-Pol-related retrotransposon family protein
AT3G21000.1    7.7e-08    22.33         Gag-Pol-related retrotransposon family protein

NCBI nr:
Match Name                          E-value     Identity (%)  Description
gi|659099180|ref|XP_008450471.1|    1.2e-135    62.53         PREDICTED: uncharacterized protein LOC103492064 [Cucumis melo]
gi|150036244|gb|ABR67407.1|         4.3e-117    57.18         integrase [Cucumis melo subsp. melo]
gi|651219311|gb|AIC77183.1|         6.9e-91     45.85         polyprotein [Gossypium barbadense]
gi|923871285|ref|XP_013709794.1|    8.7e-86     41.87         PREDICTED: uncharacterized protein LOC106413593, partial [Brassica napus]
gi|685311121|ref|XP_009146135.1|    8.7e-86     41.43         PREDICTED: uncharacterized protein LOC103869817, partial [Brassica rapa]

The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term         Definition
IPR001878    Znf_CCHC

Vocabulary: Molecular Function
Term          Definition
GO:0003676    nucleic acid binding
GO:0008270    zinc ion binding

GO Assignments
This gene is annotated with the following GO terms.
Category              Term Accession    Term Name
biological_process    GO:0015074        DNA integration
cellular_component    GO:0005575        cellular_component
molecular_function    GO:0003676        nucleic acid binding
molecular_function    GO:0008270        zinc ion binding

The following mRNA feature(s) are a part of this gene:

Feature Name      Unique Name       Type
CSPI04G19780.1    CSPI04G19780.1    mRNA


Analysis Name: InterPro Annotations of cucumber (PI183967)
Date Performed: 2017-01-17
IPR Term     IPR Description           Source     Source Term         Source Description                     Alignment
IPR001878    Zinc finger, CCHC-type    GENE3D     G3DSA:4.10.60.10    -                                      coord: 251..274, score: 3.
IPR001878    Zinc finger, CCHC-type    PROFILE    PS50158             ZF_CCHC                                coord: 257..270, score: 9
IPR001878    Zinc finger, CCHC-type    unknown    SSF57756            Retrovirus zinc finger-like domains    coord: 239..279, score: 2.9
None         No IPR available          PANTHER    PTHR11439           GAG-POL-RELATED RETROTRANSPOSON        coord: 308..411, score: 1.1E-85
None         No IPR available          PANTHER    PTHR11439           GAG-POL-RELATED RETROTRANSPOSON        coord: 1..202, score: 1.1E-85
None         No IPR available          PANTHER    PTHR11439           GAG-POL-RELATED RETROTRANSPOSON        coord: 230..282, score: 1.1
None         No IPR available          PFAM       PF14223             Retrotran_gag_2                        coord: 39..178, score: 5.6

The following gene(s) are orthologous to this gene:

None

The following gene(s) are paralogous to this gene:

None