CSPI01G19040 (gene) Wild cucumber (PI 183967)

Name: CSPI01G19040
Type: gene
Organism: Cucumis sativus (Wild cucumber (PI 183967))
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Location: Chr1 : 14446725 .. 14450609 (+)
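
As a minimal sketch, the Location value can be parsed to recover the chromosome, strand, and span of the feature. The helper name `parse_location` is hypothetical, and the code assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re

def parse_location(text):
    """Parse a location string such as 'Chr1 : 14446725 .. 14450609 (+)'.

    Assumption: coordinates are 1-based and inclusive (not stated in the record).
    """
    m = re.fullmatch(
        r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*", text
    )
    if m is None:
        raise ValueError(f"unrecognized location: {text!r}")
    start, end = int(m.group(2)), int(m.group(3))
    return {
        "chrom": m.group(1),
        "start": start,
        "end": end,
        "strand": m.group(4),
        "length": end - start + 1,  # inclusive span
    }

loc = parse_location("Chr1 : 14446725 .. 14450609 (+)")
print(loc["chrom"], loc["strand"], loc["length"])
```

Under the inclusive-coordinate assumption the span works out to 3885 bp on the plus strand.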
The following sequences are available for this feature:

Gene sequence (with intron)

ATGGTAGTGGCAAGAGTGGAAATCGAGAAGTTCGATGGAAAGAGAGACTTTGCCTTATGGAAGGCAAAGATTAAGGCCTTGTTAGGACAACAAAAAGCTCACAAAGCCCTTCTAGATTCTTCAGAACTTCCAACAACCCTCACAGCAGTACAAAATGAAGAAATGAAATTAAATGCCTATGGAACACTAATACTAAACCTTAGTGACAATGTTATAAGGCAAGTACTAGAAGAAGAGACAACACACAAAATTTGGAAGAAATTAGAAAGTCTATATGCCACTAAAGATCTCTCAAACAAGATGTATCTAAGGGAAAAATTCTTTACATATAAAATGGATCCTTCCAAAAGTTTAACAGACAACTTAGATAAGTTCAAGAAAATAGTTTCAGACTTTAAAAGTCTTGAAGACAAACTCAGTGATGATAATGAGGCATTCGTTCTCCAAAATTCTCTACCCGAGGCATACAAAGAAGTGAAAAATTCACTAAAGTATGGTAAAGACTCAGCCAAAACAGATGTCATAATATCAACCCTAAGAACCAGAGAATTGAAAATACAGTTATCTCATAAGGAACATCAAAGTGGAAATGGTTTGTTTGTCAAAGGTAAATCCCAAAACAATCAGGGTAAAAATAGTAGCAAATCCTCCTCTTATGAAGATAAGAAAGTTAAACAAAAAGAGAAGTGTAAATAGAGTAAAAAGAAGGGACACCTTCAAGAGTGTTTTTTCCTTAAAAGAAATATCAAGAAAAAGAAAAAGACTTAAAAGGAAAACAACCAGAAGCCTTCATAGTAGAAATATCATTTACCTACACGGATGCCCTGACATCAACCTTAGACCAAGCCAATTATGTTAACCCCTTCCGAAAACGTGATTGGGTCCTAGACTCAGGATGCACCTACCATATGACACTTTTTAGAGCATGGTTCAATACCTATAAAGAAATCAGTAGAGAGTCTGTGTTTATGGGGAATAATAATGCTTGTAACATTGCTGGAATTGGATCGGTCACCATGAAACTAAAGGATGAGACTGTAAATCTCCTTAGAAATGTAAGACATGTTCCTCACCTTAAAAGAAACTTAATCTCCTTGGGAATGCTAGACTCTCTAGGATGCGAATACAAAGGAAAATGTGGAGTCTTCCAAGTTTTCATGGACTCTAAAGAAGTATTGGTTGGGGGAAAAGGTAAATGACCTGTTTATAATAAAAGGGGTTGAAATGATAGATGAAGCAAATATTGTATTAGCTCTAAACTTAACAGAGGATGACCTTTGGCACAAAAGACTGTCTCATATAAGTCAGAAGGGACTTGAGGCTTTATCTAAACAAGACATTCTACCTTTAGACATATGCAGTAAACTGTCTTTCTGTGAACACTGTGTGCTAGGCAAAGCAAGGAAACAAAGTTTCACCAAAGCACAACACACAACTAAAGGAATTCTAGACTATATCTATCCAGATTTATGGGGTCCAACCCCAACTCCAAGCCTAAGAGCTCAAGGTATTTCTTATCATTTACTGATGATTTTTCAAGAAAAAATTGGATTTATTTAAAAACAAAAGATCAAGTTTTTGAAAAATTTTAAAAATGAAAACTAATGATAGAAAAACAAACTGATAAAGAGATTAAATACCTTAGAGCTGACAATGGTTCGGAGTTTTGTGGAGAGGTGTTCAATCATTTTTGCAAGGAAAGTGGAATCACAAGACACAAAAATGTGAGATACACACCTCAACAAAATATGGTGGCAGAAAGACTTAACAGAACTATAATGGAAAAGGTAAGATGCCAACTATCAGATGTCATTCTGGAAGAAAAGTTTTGGGTTGAAGTTGCTGCCTATGCGGTGTACACATTGAATAGAAGCCCCCATACCTCCTTAGGACTCTTAACACCTGAGGAGAAATGGTCCATCTAAATGATCTTTAGGTGTTTGGATGTGTAGGGTATGTCCATCAAAACTAAGGGAAACTAAAGGCTAGAGATGTTAAGT
GTATGTTTGTTGGCTTCACAGAAGGGGTAAAGGGTTTCAAGTTGTGGCACCACTAACAAGAAGTTCATAATTAGTAGGGATGTTCATTTCAGAGAAACTGAGATGTTTATGCAAGGGAAAGGTAATACTAAAAGGAGCACTGAAGCCACAAAAACCTATACTCAGATTGAACTGGAGAATGCTAGAAACGGTGCTCAATTTACTAAGAAAACTAAAGTTATTGATCAAGAAATCGATGGAGAACAAACTAAAATAGTTAAAGAACAATCTAACTTGAGCCAATATTCCCTAGCAAGAGACAGACAAAGAACGGTAATTATTCCTCCAATCGGTATGCTGAAACTAACTATATAAGTTTTTTTCTAAACGTTACTATAGCTCCCAGTGATAATGAACCTAGTTCTTTTGAGGAAGTTAAGCTGTAATGATGCTAGACGGTGGATTGAAGCCATGAATGAAGAAATAAATTCTCTAAAGGTAAATGACACATGGACCCTTATCCCTTTACCTAAGGAATGCAAACCAATAGCATCTAAGTGGATCTATAAACTCAAGGAAGGAGTCACTGAACTCACTACCAAGGTACAAGGCAAAGTTGGTAGCTAAGGACTTTACTCAAAGAGAAGGTATAGACTATTCTGAAATTTTCTCACCAGTTGTTAAACAAACCTCTATTAGACCTCTCTTATCCCTAGTTACTTAATATAATCTAGAGTTGGACCAACTTGATGTAAAGGCAGCCTTCCTTCATGGCTATCTAGATGAAACAATTTACATGGTTCAACCTAAAGGTTTTGAAGTTCAAGGTAAGGAAGACCTCTACTGCTTACTAAAGAAGTCAATATATGGACTGGAGCAATCACCTAGGTGCTGGATCTTATGGTTTTATATCCAGACTGAGTTTTAATAGAAGCTCATACAATATGTGTGTCTACATAAACTCAACAACCTATAAGGAGATGGTTTTCTTTCTACTCTATGTCGATGATATGCTCCTTGCCAGAAGCTCTAAGGAAGATTTAACTCATGTCAAAACTCTTTTAAGTAAAGAATTTTACATGAAGGATTTAGGTGAATCAAGAAAGATCCTAGGAATTGACATCACAAGAGACAAAAACCAATCCATACGAAGCATAAGTCAATCAACCTACTATGAGAAGGTGATCAGAAGATTCAACCTCACCACTGCTAGACCAATCACACTCCCTATTGCACAATATTTTAAACTATCAGCCGCTAATTTTCCTAGTGAGACAGACATAGAGCACATGAATATATGTTAGAAATTAATAAGCTAGATACATTATGGATAACACATTAATAGATACATTTCTAGTTTCAAGATACATATGCATTTCTAGATACATGTAATTATTCAAGTACATGAATGGTATGTATTCATGTACTTCATTTATTTTGTGTCTTTTAGGAAGTGTCTCTTGAATACTATATATAGAGATCATTTCCTTCATTTGTACACAAAGATAAAAGAATAAAATCAGTAGAAGCAAATTGAGTTTTAGAAAGTAAGTGAGTTTCAAGAGAACAACATTCTTCTCTTGTGTGTTATTGAGTTTTCTAAGAGAACAAATTTCTTCTCTTAAAAGTTTGTGAGATTTAGAGTAAATTTAGAGTGGGCTCGTCAACGCCTCCAACAAAGTGGTATCAGAGCAAGGATTGAAAGATGTCATCTAACATGTTGCAGCCTCAACTTCCTCGTTTTGAGGGAAAAACTTATAGGCGGTGGAGCCAGCAAATGAAGGTTCTTTATGGATCTCAAGATCTTTGGGATATTGTTGACATCGGATATTCAGAGCCAGAAAGTGAGAATGGTCTTTCAGCACAACAACTCAATGAGTTGAAAGATGCTAGAAAAAAAGGATAA

mRNA sequence

ATGGTAGTGGCAAGAGTGGAAATCGAGAAGTTCGATGGAAAGAGAGACTTTGCCTTATGGAAGGCAAAGATTAAGGCCTTGTTAGGACAACAAAAAGCTCACAAAGCCCTTCTAGATTCTTCAGAACTTCCAACAACCCTCACAGCAGTACAAAATGAAGAAATGAAATTAAATGCCTATGGAACACTAATACTAAACCTTAGTGACAATGTTATAAGGCAAGTACTAGAAGAAGAGACAACACACAAAATTTGGAAGAAATTAGAAAGTCTATATGCCACTAAAGATCTCTCAAACAAGATGTATCTAAGGGAAAAATTCTTTACATATAAAATGGATCCTTCCAAAAGTTTAACAGACAACTTAGATAAGTTCAAGAAAATAGTTTCAGACTTTAAAAGTCTTGAAGACAAACTCAGTGATGATAATGAGGCATTCGTTCTCCAAAATTCTCTACCCGAGGCATACAAAGAAGTGAAAAATTCACTAAAGTATGGTAAAGACTCAGCCAAAACAGATGTCATAATATCAACCCTAAGAACCAGAGAATTGAAAATACAGTTATCTCATAAGGAACATCAAAGTGGAAATGGTTTGTTTGTCAAAGCCAATTATGTTAACCCCTTCCGAAAACGTGATTGGGTCCTAGACTCAGGATGCACCTACCATATGACACTTTTTAGAGCATGGTTCAATACCTATAAAGAAATCAGTAGAGAGTCTGTGTTTATGGGGAATAATAATGCTTGTAACATTGCTGGAATTGGATCGGTCACCATGAAACTAAAGGATGAGACTGTAAATCTCCTTAGAAATGTAAGACATGTTCCTCACCTTAAAAGAAACTTAATCTCCTTGGGAATGCTAGACTCTCTAGGATGCGAATACAAAGGAAAATGTGGAGTCTTCCAAGTTTTCATGGACTCTAAAGAAGTATTGGTTGGGGGAAAAGAAAAACAAACTGATAAAGAGATTAAATACCTTAGAGCTGACAATGGTTCGGAGTTTTGTGGAGAGGTGTTCAATCATTTTTGCAAGGAAAGTGGAATCACAAGACACAAAAATGTGAGATACACACCTCAACAAAATATGGTGGCAGAAAGACTTAACAGAACTATAATGGAAAAGGTAAGATGCCAACTATCAGATGTCATTCTGGAAGAAAAGTTTTGGGTTGAAGTTGCTGCCTATGCGGTGTACACATTGAATAGAAGCCCCCATACCTCCTTAGGACTCTTAACACCTGAGGAGAAATGGGTTTCAAGTTGTGGCACCACTAACAAGAAGTTCATAATTAGTAGGGATGTTCATTTCAGAGAAACTGAGATGTTTATGCAAGGGAAAGGTAATACTAAAAGGAGCACTGAAGCCACAAAAACCTATACTCAGATTGAACTGGAGAATGCTAGAAACGGTGCTCAATTTACTAAGAAAACTAAAGTTATTGATCAAGAAATCGATGGAGAACAAACTAAAATAGTTAAAGAACAATCTAACTTGAGCCAATATTCCCTAGCAAGAGACAGACAAAGAACGCTCCCAGTGATAATGAACCTAGTTCTTTTGAGGAAGTTAAGCTGTAATGATGCTAGACGGTGGATTGAAGCCATGAATGAAGAAATAAATTCTCTAAAGGTAAATGACACATGGACCCTTATCCCTTTACCTAAGGAATGCAAACCAATAGCATCTAAGTGGATCTATAAACTCAAGGAAGGAGTCACTGAACTCACTACCAAGGCAGCCTTCCTTCATGGCTATCTAGATGAAACAATTTACATGGTTCAACCTAAAGGTTTTGAAGTTCAAGGTAAGGAAGACCTCTACTGCTTACTAAAGAAGTCAATATATGGACTGGAGCAATCACCTAGAAGCTCTAAGGAAGATTTAACTCATGTCAAAACTCTTTTAAGTAAAGAATTTTACATGAAGGATTTAGGTGAATCAAGAAAGATCCTAGGAATTGACATCACAAGAGACAAAAACCAATCCATACGAAG
CATAAGTCAATCAACCTACTATGAGAAGCCTCAACTTCCTCGTTTTGAGGGAAAAACTTATAGGCGGTGGAGCCAGCAAATGAAGGTTCTTTATGGATCTCAAGATCTTTGGGATATTGTTGACATCGGATATTCAGAGCCAGAAAGTGAGAATGGTCTTTCAGCACAACAACTCAATGAGTTGAAAGATGCTAGAAAAAAAGGATAA

Coding sequence (CDS)

ATGGTAGTGGCAAGAGTGGAAATCGAGAAGTTCGATGGAAAGAGAGACTTTGCCTTATGGAAGGCAAAGATTAAGGCCTTGTTAGGACAACAAAAAGCTCACAAAGCCCTTCTAGATTCTTCAGAACTTCCAACAACCCTCACAGCAGTACAAAATGAAGAAATGAAATTAAATGCCTATGGAACACTAATACTAAACCTTAGTGACAATGTTATAAGGCAAGTACTAGAAGAAGAGACAACACACAAAATTTGGAAGAAATTAGAAAGTCTATATGCCACTAAAGATCTCTCAAACAAGATGTATCTAAGGGAAAAATTCTTTACATATAAAATGGATCCTTCCAAAAGTTTAACAGACAACTTAGATAAGTTCAAGAAAATAGTTTCAGACTTTAAAAGTCTTGAAGACAAACTCAGTGATGATAATGAGGCATTCGTTCTCCAAAATTCTCTACCCGAGGCATACAAAGAAGTGAAAAATTCACTAAAGTATGGTAAAGACTCAGCCAAAACAGATGTCATAATATCAACCCTAAGAACCAGAGAATTGAAAATACAGTTATCTCATAAGGAACATCAAAGTGGAAATGGTTTGTTTGTCAAAGCCAATTATGTTAACCCCTTCCGAAAACGTGATTGGGTCCTAGACTCAGGATGCACCTACCATATGACACTTTTTAGAGCATGGTTCAATACCTATAAAGAAATCAGTAGAGAGTCTGTGTTTATGGGGAATAATAATGCTTGTAACATTGCTGGAATTGGATCGGTCACCATGAAACTAAAGGATGAGACTGTAAATCTCCTTAGAAATGTAAGACATGTTCCTCACCTTAAAAGAAACTTAATCTCCTTGGGAATGCTAGACTCTCTAGGATGCGAATACAAAGGAAAATGTGGAGTCTTCCAAGTTTTCATGGACTCTAAAGAAGTATTGGTTGGGGGAAAAGAAAAACAAACTGATAAAGAGATTAAATACCTTAGAGCTGACAATGGTTCGGAGTTTTGTGGAGAGGTGTTCAATCATTTTTGCAAGGAAAGTGGAATCACAAGACACAAAAATGTGAGATACACACCTCAACAAAATATGGTGGCAGAAAGACTTAACAGAACTATAATGGAAAAGGTAAGATGCCAACTATCAGATGTCATTCTGGAAGAAAAGTTTTGGGTTGAAGTTGCTGCCTATGCGGTGTACACATTGAATAGAAGCCCCCATACCTCCTTAGGACTCTTAACACCTGAGGAGAAATGGGTTTCAAGTTGTGGCACCACTAACAAGAAGTTCATAATTAGTAGGGATGTTCATTTCAGAGAAACTGAGATGTTTATGCAAGGGAAAGGTAATACTAAAAGGAGCACTGAAGCCACAAAAACCTATACTCAGATTGAACTGGAGAATGCTAGAAACGGTGCTCAATTTACTAAGAAAACTAAAGTTATTGATCAAGAAATCGATGGAGAACAAACTAAAATAGTTAAAGAACAATCTAACTTGAGCCAATATTCCCTAGCAAGAGACAGACAAAGAACGCTCCCAGTGATAATGAACCTAGTTCTTTTGAGGAAGTTAAGCTGTAATGATGCTAGACGGTGGATTGAAGCCATGAATGAAGAAATAAATTCTCTAAAGGTAAATGACACATGGACCCTTATCCCTTTACCTAAGGAATGCAAACCAATAGCATCTAAGTGGATCTATAAACTCAAGGAAGGAGTCACTGAACTCACTACCAAGGCAGCCTTCCTTCATGGCTATCTAGATGAAACAATTTACATGGTTCAACCTAAAGGTTTTGAAGTTCAAGGTAAGGAAGACCTCTACTGCTTACTAAAGAAGTCAATATATGGACTGGAGCAATCACCTAGAAGCTCTAAGGAAGATTTAACTCATGTCAAAACTCTTTTAAGTAAAGAATTTTACATGAAGGATTTAGGTGAATCAAGAAAGATCCTAGGAATTGACATCACAAGAGACAAAAACCAATCCATACGAAG
CATAAGTCAATCAACCTACTATGAGAAGCCTCAACTTCCTCGTTTTGAGGGAAAAACTTATAGGCGGTGGAGCCAGCAAATGAAGGTTCTTTATGGATCTCAAGATCTTTGGGATATTGTTGACATCGGATATTCAGAGCCAGAAAGTGAGAATGGTCTTTCAGCACAACAACTCAATGAGTTGAAAGATGCTAGAAAAAAAGGATAA
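
The CDS above can be checked against the protein used as the BLAST query by translating it. The following is a minimal sketch that assumes the standard genetic code; the function `translate` and the short `prefix` variable are illustrative names, and only the first 13 codons of the CDS are used here.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(cds):
    """Translate a CDS in frame 1; '*' marks a stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks inside the sequence
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

# First 39 nucleotides of the CDS shown above.
prefix = "ATGGTAGTGGCAAGAGTGGAAATCGAGAAGTTCGATGGA"
print(translate(prefix))  # MVVARVEIEKFDG
```

The result agrees with the start of the query sequence in the alignments that begin at position 1 (MVVARVEIEKFDG...).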
BLAST of CSPI01G19040 vs. Swiss-Prot
Match: POLX_TOBAC (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum PE=2 SV=1)

HSP 1 Score: 108.6 bits (270), Expect = 2.9e-22
Identity = 58/198 (29.29%), Positives = 110/198 (55.56%), Query Frame = 1

Query: 5   RVEIEKFDGKRDFALWKAKIKALLGQQKAHKALLDSSELPTTLTAVQNEEMKLNAYGTLI 64
           + E+ KF+G   F+ W+ +++ LL QQ  HK L   S+ P T+ A    ++   A   + 
Sbjct: 5   KYEVAKFNGDNGFSTWQRRMRDLLIQQGLHKVLDVDSKKPDTMKAEDWADLDERAASAIR 64

Query: 65  LNLSDNVIRQVLEEETTHKIWKKLESLYATKDLSNKMYLREKFFTYKMDPSKSLTDNLDK 124
           L+LSD+V+  +++E+T   IW +LESLY +K L+NK+YL+++ +   M    +   +L+ 
Sbjct: 65  LHLSDDVVNNIIDEDTARGIWTRLESLYMSKTLTNKLYLKKQLYALHMSEGTNFLSHLNV 124

Query: 125 FKKIVSDFKSLEDKLSDDNEAFVLQNSLPEAYKEVKNSLKYGKDSAKTDVIISTLRTREL 184
           F  +++   +L  K+ ++++A +L NSLP +Y  +  ++ +GK + +   + S L   E 
Sbjct: 125 FNGLITQLANLGVKIEEEDKAILLLNSLPSSYDNLATTILHGKTTIELKDVTSALLLNE- 184

Query: 185 KIQLSHKEHQSGNGLFVK 203
             ++  K    G  L  +
Sbjct: 185 --KMRKKPENQGQALITE 199
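
The percentages on each HSP statistics line are simply the matched counts divided by the alignment length. As a rough sketch, the line can be parsed and the values recomputed; `hsp_percentages` is a hypothetical helper, and the pattern is written only for the line layout used on this page.

```python
import re

# Matches the statistics line layout used on this page; "Posi?tives"
# also tolerates the misspelled variant "Postives".
STATS = re.compile(
    r"Identity = (\d+)/(\d+) \([\d.]+%\), "
    r"Posi?tives = (\d+)/(\d+) \([\d.]+%\)"
)

def hsp_percentages(line):
    """Return (identity %, positives %) recomputed from the raw counts."""
    m = STATS.search(line)
    if m is None:
        raise ValueError(f"unrecognized HSP statistics line: {line!r}")
    ident, ident_len, pos, pos_len = (int(g) for g in m.groups())
    return (
        round(100 * ident / ident_len, 2),
        round(100 * pos / pos_len, 2),
    )

line = "Identity = 58/198 (29.29%), Positives = 110/198 (55.56%), Query Frame = 1"
print(hsp_percentages(line))  # (29.29, 55.56)
```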

BLAST of CSPI01G19040 vs. Swiss-Prot
Match: COPIA_DROME (Copia protein OS=Drosophila melanogaster GN=GIP PE=1 SV=3)

HSP 1 Score: 82.8 bits (203), Expect = 1.7e-14
Identity = 48/125 (38.40%), Positives = 63/125 (50.40%), Query Frame = 1

Query: 297 KGKCGVFQVFMDSKEVLVGGKEKQTDKEIKYLRADNGSEFCGEVFNHFCKESGITRHKNV 356
           K K  VF +F D     V   E   + ++ YL  DNG E+       FC + GI+ H  V
Sbjct: 520 KYKSDVFSMFQD----FVAKSEAHFNLKVVYLYIDNGREYLSNEMRQFCVKKGISYHLTV 579

Query: 357 RYTPQQNMVAERLNRTIMEKVRCQLSDVILEEKFWVEVAAYAVYTLNRSPHTSL--GLLT 416
            +TPQ N V+ER+ RTI EK R  +S   L++ FW E    A Y +NR P  +L     T
Sbjct: 580 PHTPQLNGVSERMIRTITEKARTMVSGAKLDKSFWGEAVLTATYLINRIPSRALVDSSKT 639

Query: 417 PEEKW 420
           P E W
Sbjct: 640 PYEMW 640

BLAST of CSPI01G19040 vs. TrEMBL
Match: A5AJF1_VITVI (Putative uncharacterized protein OS=Vitis vinifera GN=VITISV_025437 PE=4 SV=1)

HSP 1 Score: 280.0 bits (715), Expect = 8.2e-72
Identity = 183/578 (31.66%), Positives = 298/578 (51.56%), Query Frame = 1

Query: 36  ALLDSSELPTTLTAVQNEEMKLNAYGTLILNLSDNVIRQVLEEETTHKIWKKLESLYATK 95
           ALL    LP+T+   Q  E+   A+  +IL+LSD V+R+V + ++  K+W KLESLY TK
Sbjct: 423 ALLGEKNLPSTMQEKQKIELLEEAHSAIILSLSDTVLREVAKAKSATKLWLKLESLYMTK 482

Query: 96  DLSNKMYLREKFFTYKMDPSKSLTDNLDKFKKIVSDFKSLEDKLSDDNEAFVLQNSLPEA 155
            L+N+++ + K +T+KM    S+ ++LD F KI+ D ++++  + D+++A +L  SL  +
Sbjct: 483 SLANRLHKKIKLYTFKMTTGMSIEEHLDHFNKIILDLENIDITILDEDKAILLLTSLDAS 542

Query: 156 YKEVKNSLKYGKDSAKTDVIISTLRTRELKIQLSHKEHQSGNGLFVKANYVNPFRKRDWV 215
           Y  +K ++ YG+DS   D    T  ++E +          G       N       ++W+
Sbjct: 543 YTNMKEAIMYGRDSLTFDE--ETKHSQENREWGDAAVILDGYDSAEMLNVAEMDSSKEWI 602

Query: 216 LDSGCTYHMTLFRAWFNTYKEISRESVFMGNNNACNIAGIGSVTMKLKDETVNLLRNVRH 275
           LDSGC++HM   +AWF  +KE     V  G+N    I G  +V +K  D    +L++V++
Sbjct: 603 LDSGCSFHMCPIKAWFKDFKEADGGYVIQGSNEHYKILGTDTVRIKHYDGIERVLKDVKY 662

Query: 276 VPHLKRNLISLGMLDSLGCEYKGKCGVFQVFMDSKEVLVGGKEKQTDKEIKYLRADNGSE 335
           +P LKRNLISLGMLD+ G  ++ +    +V   S  ++ G  +      I  +     S 
Sbjct: 663 IPELKRNLISLGMLDNSGYTFESEPNSLRVARGSLTIMKGTIKNGLYTFIGQIVTGKVST 722

Query: 336 FCGE---VFNHFCKESGITRHKNVRYTPQQNMVAERLNRTIMEKVRCQLSDVILEEKFWV 395
              E     N + +  GI  H+ VRYTPQQN +AER+NRTI+E+VRC LS   L + FW 
Sbjct: 723 VPKEDVGTTNLWHQRLGIATHRTVRYTPQQNALAERMNRTILERVRCMLSSSGLSKVFWA 782

Query: 396 EVAAYAVYTLNRSPHTSLGLLTPEEKWVSSCGTTN--KKFIISRDVHFRETEMFMQ---- 455
           E     V+ +N+S  ++L L TP+EKW          K F  +  VH +  ++  +    
Sbjct: 783 ETVETTVHLINKSLSSALQLKTPQEKWTGKYADYQYLKVFGCTAYVHTKTNKLESRVVKC 842

Query: 456 -----GKGNTKRSTEATKTYTQIELENARNGAQFTKKTKVIDQEIDGEQTKIVKEQSN-- 515
                 KG   +  E +    Q++ +      Q  K T+   + +   Q KIV E+ +  
Sbjct: 843 IFLGYSKGVKAKDVEES---DQLQFDVEHETLQPKKSTETSSKTV---QEKIVYERQDEP 902

Query: 516 ---LSQYSLARDRQR-----------------TLPVIMNLVLL------RKLSCNDARRW 572
              L  YSLARDRQ+                  L V   +V +        ++ ++A +W
Sbjct: 903 TQGLESYSLARDRQKRQVKPPKRYGQAEMTTFALSVAEEIVDMEPKTNQEAINSDEANQW 962

BLAST of CSPI01G19040 vs. TrEMBL
Match: Q94LG0_ORYSJ (Putative retroelement pol polyprotein OS=Oryza sativa subsp. japonica GN=LOC_Os03g27910 PE=4 SV=1)

HSP 1 Score: 241.9 bits (616), Expect = 2.5e-60
Identity = 159/447 (35.57%), Positives = 219/447 (48.99%), Query Frame = 1

Query: 318 EKQTDKEIKYLRADNGSEFCGEVFNHFCKESGITRHKNVRYTPQQNMVAERLNRTIMEKV 377
           E+QT+KE+K LR DNG EFC + F+ +C++ GI RH  + YTPQQN VAER+NRTI+ K 
Sbjct: 541 ERQTEKEVKVLRTDNGGEFCSDAFDDYCRKEGIVRHHTIPYTPQQNGVAERMNRTIISKA 600

Query: 378 RCQLSDVILEEKFWVEVAAYAVYTLNRSPHTSLGLLTPEEKW-----------VSSCGT- 437
           RC LS+  + ++FW E A  A Y +NRSP   L   TP E W           V  C   
Sbjct: 601 RCMLSNARMNKRFWAEAANTACYLINRSPSIPLNKKTPIEVWSGMPADYSQLRVFGCTAY 660

Query: 438 ---------------------------------TNKKFIISRDVHFRETEMFMQGKGNTK 497
                                            TNK F+  R V F ++ MF     N  
Sbjct: 661 AHVDNGKLEPRAIKCLFLGYGSGVKRYKLWNPETNKTFM-RRSVVFNKSVMF-----NDS 720

Query: 498 RSTEATKTYTQIELENARNGAQFTKKTKVIDQEIDGEQTKIVKEQSNLSQ-YSLARDRQR 557
             T+     +  E +  ++     ++TK       G   ++++E   +   +S A   + 
Sbjct: 721 LPTDVIPGGSDEEQQYPQDEPIAHRRTK----RSCGAPVRLIEECDMVYYAFSCAEQVEN 780

Query: 558 TLPVIMNLVLLRKLSCNDARRWIEAMNEEINSLKVNDTWTLIPLPKECKPIASKWIYKLK 617
           TL           +   D  +WI AM EE+ SL+ N TW L+ LPK+ KP+  KWI+K K
Sbjct: 781 TLEPA---TYTEAVVSGDREKWISAMQEEMQSLEKNGTWELVHLPKQKKPVRCKWIFKRK 840

Query: 618 EGVT--------------------ELTTKAAFLHGYLDETIYMVQPKGFEVQGKEDLYCL 677
           EG++                    +L  K AFL+G L+E IYM QP+GF V GKED  C 
Sbjct: 841 EGLSSSEPPRFKASIVAMHDLELEQLDVKTAFLYGELEEEIYMDQPEGFIVPGKEDYVCK 900

BLAST of CSPI01G19040 vs. TrEMBL
Match: A0A061GYD4_THECC (Uncharacterized protein OS=Theobroma cacao GN=TCM_041944 PE=4 SV=1)

HSP 1 Score: 219.5 bits (558), Expect = 1.3e-53
Identity = 134/440 (30.45%), Positives = 232/440 (52.73%), Query Frame = 1

Query: 5   RVEIEKFDGKRDFALWKAKIKALLGQQKAHKALLDSSELPTTLTAVQNEEMKLNAYGTLI 64
           + EIEKF+G+ DF+LW  K+ ALL QQ   KAL +   L + L+  + + +   A+  ++
Sbjct: 7   KYEIEKFNGRNDFSLWCVKMCALLVQQGLLKALKEKEHLLSNLSNGEKDNLMEKAHSAIL 66

Query: 65  LNLSDNVIRQVLEEETTHKIWKKLESLYATKDLSNKMYLREKFFTYKMDPSKSLTDNLDK 124
           L LSD VIR+V +EE+   +W KL+S+Y TK L N++Y++++ +T KM    S+  ++D+
Sbjct: 67  LALSDEVIREVTDEESAIAVWLKLKSIYMTKSLMNRLYIKQRLYTLKMSEGTSVNTHIDE 126

Query: 125 FKKIVSDFKSLEDKLSDDNEAFVLQNSLPEAYKEVKNSLKYGKDSAKTDVIISTLRTREL 184
           F +++ D K+++ K+ D++ A +L  SLP +Y+   +++ YG+D+   + + ++L ++EL
Sbjct: 127 FNRVILDLKNIDVKIEDEDLALILLCSLPPSYENFMDTMLYGRDTFTFEDVRASLNSKEL 186

Query: 185 KIQLSH--------KEHQSGNGLFVKANYV----NPFRKRD-------------WVLDSG 244
           K ++ H        K+ +  N      N V    + F + D             W+LD  
Sbjct: 187 KKKVGHFRQDCTKFKDDEKINKFVNTVNVVGDDFDTFEETDNVLTITNDNLMDTWILDLA 246

Query: 245 CTYHMTLFRAWFNTYKEISRESVFMGNNNACNIAGIGSVTMKLKDETVNLLRNVRHVPHL 304
           C +H+ L R WF+TY+ +   +V + ++ + ++ GIG++ +K+ D  V  L        L
Sbjct: 247 CCFHICLRRDWFSTYQSVDMGTVQLEDDFSLSVVGIGTIRIKMFDGMVRSLEG-----KL 306

Query: 305 KRNLISLGMLDSLGCEYKGKCGVFQVFMDSKEVLVGGKEKQTDKEIKYLRADNGSEFCGE 364
             +L  L     +G        V        +VL    EKQT K I+  + D G EFC  
Sbjct: 307 SNDLYCL-----VGNTVIETVSVVSSNDPEDDVLT---EKQTKKRIESFQIDKGLEFCKG 366

Query: 365 VFNHFCKESGITRHKNVRYTPQQNMVAERLNRTIMEKVRCQLSDVILEEKFWVEVAAYAV 420
            F  F K   I R+  V  TP QN + E + +T++E+ +   S+V L + FW +    A 
Sbjct: 367 EFGLFYKNEKIVRYCTVVKTPHQNGIVEWMKKTLLERAKYMFSNVGLTKVFWTKAINKAC 426

BLAST of CSPI01G19040 vs. TrEMBL
Match: Q01N60_ORYSA (OSIGBa0127D24.3 protein OS=Oryza sativa GN=OSIGBa0127D24.3 PE=4 SV=1)

HSP 1 Score: 211.8 bits (538), Expect = 2.7e-51
Identity = 149/412 (36.17%), Positives = 211/412 (51.21%), Query Frame = 1

Query: 297 KGKCGVFQVFMDSKEVLVGGKEKQTDKEIKYLRADNGSEFCGEVFNHFCKESGITRHKNV 356
           K K   F VF + K ++    E+QT++++K LR DNG EFC ++F  +CK  GI RH  V
Sbjct: 352 KHKYQAFDVFKEWKTMV----ERQTERKVKILRPDNGMEFCSKIFKSYCKSEGIVRHYTV 411

Query: 357 RYTPQQNMVAERLNRTIMEKVRCQLSDVILEEKFWVEVAAYAVYTLNRSPHTSLGLLTPE 416
            +TPQQN VAER+NRTI+ K RC LS+  L ++FW E  + A Y +NRSP  ++   TP 
Sbjct: 412 PHTPQQNGVAERMNRTIISKARCMLSNAGLPKQFWAEAVSTACYLINRSPSYAIDKKTPI 471

Query: 417 EKWVSS---------------------------------------------CGTTNKKFI 476
           E W SS                                             C  T KK +
Sbjct: 472 EVWSSSPAKYSDLRVFSCTAYAHVDNGKLEPRAIKCIFLGYPSGVKGYKLWCPET-KKVV 531

Query: 477 ISRDVHFRETEMFMQGKGNTKRSTEATKTYTQIELENARNGAQFTKKTKV-IDQE---ID 536
           I+R+V F E+ M +  K +T    E+ +    +++E+  +     +K  V I+Q+   I+
Sbjct: 532 INRNVVFHESVM-LHDKPSTNVPVESQEK-ASVQVEHLISSGHAPEKEDVAINQDAPVIE 591

Query: 537 GEQTKIVKEQSNLSQYSLARDRQR--TLP----------VIMNLVLLRKLSCN------- 596
              + IV++     + S+A+DR +  T P          V   L +  ++  N       
Sbjct: 592 DSDSSIVQQSP---KRSIAKDRPKRNTKPPRKYIEEANIVTYALSVAEEIEGNTEPSTYS 651

Query: 597 ------DARRWIEAMNEEINSLKVNDTWTLIPLPKECKPIASKWIYKLKEGVT---ELTT 625
                 D  RWI AM++E+ SL+ N TW L+ LPKE KPI  KWI+K KEG++   E   
Sbjct: 652 DAIVSDDCNRWITAMHDEMESLEKNHTWELVKLPKEKKPIRCKWIFKRKEGMSPSDEARY 711

BLAST of CSPI01G19040 vs. TrEMBL
Match: Q2QP43_ORYSJ (Retrotransposon protein, putative, Ty1-copia subclass OS=Oryza sativa subsp. japonica GN=LOC_Os12g35740 PE=4 SV=1)

HSP 1 Score: 209.1 bits (531), Expect = 1.8e-50
Identity = 203/753 (26.96%), Positives = 323/753 (42.90%), Query Frame = 1

Query: 1   MVVARVEIEKFDGKRDFALWKAKIKALLGQQKAHKALLDSSELPTTLTAVQNEEMKLNAY 60
           MV  + ++   D K  F+LW+ K++A+L Q         + +L     +V N    ++ +
Sbjct: 1   MVSMKYDLPLLDYKTRFSLWQVKMRAVLAQ---------TLDLDEESGSVLNH---ISVF 60

Query: 61  GTLILNLSDNVIRQVLEEETTHKIWKKLESLYATKDLSNKMYLREKFFTYKMDPSKSLTD 120
             ++ +L    + Q  +E+    +   L S YA  +  + + L     T  +        
Sbjct: 61  KEIVADLVSMEV-QFDDEDLGLLLLCSLPSSYA--NFRDTILLSHDELT--LAEVYEALQ 120

Query: 121 NLDKFKKIV------SDFKSLEDKLSDDNEAFVLQNSLPEAYKEVKNSLKYGKDSAKTDV 180
           N +K K +V      S  ++L+ +   +   +   N   ++    K   KY K       
Sbjct: 121 NREKMKGMVQSDASSSKGEALQVRGRFERRTYNDSNDHDKSQSRGKKLCKYCKKKNHFIE 180

Query: 181 IISTLRTRELK-------IQLSHKEHQSGNGLFVKANYVNPFRKRDWVLDSGCTYHMTLF 240
               L+ +E +       +  S     SG+ L V A  V      +W+LD+ C++H+ + 
Sbjct: 181 ECWNLQNKEKRKSDGKASVVTSADNSDSGDCLVVFAGCVA--NHDEWILDTACSFHICIN 240

Query: 241 RAWFNTYKEISRESVF-MGNNNACNIAGIGSVTMKLKDETVNLLRNVRHVPHLKRNLISL 300
           R WF++YK +    V  MG++N   I GIGSV +K  D     L+ VRH+P + RNLISL
Sbjct: 241 RDWFSSYKSVQNGDVVRMGDDNPRVIMGIGSVQIKTHDGMTRTLKYVRHIPGMARNLISL 300

Query: 301 GMLDSLGCEYKGKC--GVFQVFMDSKE---------VLVGGKEKQTDKEIKYLRADNGS- 360
             LD+ G +Y G    G       SK+         + +G   +    E+      +G  
Sbjct: 301 STLDAEGYKYSGSTLHGSVTAAAVSKDEASRTNLWHMRLGHMSELGIAELMKRNLLDGCT 360

Query: 361 ----EFCG----EVFNHFCKESGITRHKNVRYTPQQNMVAERLNRTIMEKVRCQLSDVIL 420
               +FC     + F+ +C++ GI RH  + YTPQQN VA+R+NRTI+ + RC LS+  +
Sbjct: 361 QGNMKFCEHCVFDAFDDYCRKEGIVRHHTIPYTPQQNGVAKRMNRTIISEARCLLSNARM 420

Query: 421 EEKFWVEVAAYAVYTLNRSPHTSLGLLTPEEKW-----------VSSC------------ 480
            ++FW E A  A Y +NRSP   L   TP E W           V  C            
Sbjct: 421 NKRFWAEAANTACYLINRSPSIPLNKKTPIEVWSGMPADYSQLRVFGCTAYAHVDNGKLE 480

Query: 481 ----------------------GTTNKKFIISRDVHFRETEMFMQGKGNT--KRSTEATK 540
                                   TNK F+ SR V F E+ MF            ++  +
Sbjct: 481 PRAIKCLFLSYGSGVKGYKLWNPQTNKTFM-SRSVVFNESIMFNDSLPTDVIPGGSDEEQ 540

Query: 541 TYTQIELENARNGAQFTKKTKVIDQEID--------------------------GEQTKI 600
            Y  +++E+  +     ++T+++  +I+                          G   ++
Sbjct: 541 QYVSVQVEHVDD-----QETEIVGNDINDIVQHSPPVLQPQDEPIAHRRTKRSCGAPVRL 600

Query: 601 VKEQSNLSQY--SLARDRQRTLPVIMNLVLLRKLSCNDARRWIEAMNEEINSLKVNDTWT 625
           + E+ ++  Y  S A   + TL           +   D  + I AM EE+ SL+ N TW 
Sbjct: 601 I-EKCDMVYYAFSCAEQVENTLEPA---TYTEAIVSGDREKSISAMQEEMQSLEKNGTWE 660

BLAST of CSPI01G19040 vs. TAIR10
Match: AT1G48720.1 (AT1G48720.1 unknown protein)

HSP 1 Score: 56.2 bits (134), Expect = 9.7e-08
Identity = 24/57 (42.11%), Positives = 37/57 (64.91%), Query Frame = 1

Query: 678 QLPRFEGKTYRRWSQQMKVLYGSQDLWDIVDIGYSEPESENGLSAQQLNELKDARKK 735
           Q+P      Y  WS +MK + G+ D+W+IV+ G+ EPE+E  LS  Q + L+D+RK+
Sbjct: 9   QVPVLTKSNYDNWSLRMKAILGAHDVWEIVEKGFIEPENEGSLSQTQKDGLRDSRKR 65

BLAST of CSPI01G19040 vs. NCBI nr
Match: gi|147777710|emb|CAN69096.1| (hypothetical protein VITISV_025437 [Vitis vinifera])

HSP 1 Score: 280.0 bits (715), Expect = 1.2e-71
Identity = 183/578 (31.66%), Positives = 298/578 (51.56%), Query Frame = 1

Query: 36  ALLDSSELPTTLTAVQNEEMKLNAYGTLILNLSDNVIRQVLEEETTHKIWKKLESLYATK 95
           ALL    LP+T+   Q  E+   A+  +IL+LSD V+R+V + ++  K+W KLESLY TK
Sbjct: 423 ALLGEKNLPSTMQEKQKIELLEEAHSAIILSLSDTVLREVAKAKSATKLWLKLESLYMTK 482

Query: 96  DLSNKMYLREKFFTYKMDPSKSLTDNLDKFKKIVSDFKSLEDKLSDDNEAFVLQNSLPEA 155
            L+N+++ + K +T+KM    S+ ++LD F KI+ D ++++  + D+++A +L  SL  +
Sbjct: 483 SLANRLHKKIKLYTFKMTTGMSIEEHLDHFNKIILDLENIDITILDEDKAILLLTSLDAS 542

Query: 156 YKEVKNSLKYGKDSAKTDVIISTLRTRELKIQLSHKEHQSGNGLFVKANYVNPFRKRDWV 215
           Y  +K ++ YG+DS   D    T  ++E +          G       N       ++W+
Sbjct: 543 YTNMKEAIMYGRDSLTFDE--ETKHSQENREWGDAAVILDGYDSAEMLNVAEMDSSKEWI 602

Query: 216 LDSGCTYHMTLFRAWFNTYKEISRESVFMGNNNACNIAGIGSVTMKLKDETVNLLRNVRH 275
           LDSGC++HM   +AWF  +KE     V  G+N    I G  +V +K  D    +L++V++
Sbjct: 603 LDSGCSFHMCPIKAWFKDFKEADGGYVIQGSNEHYKILGTDTVRIKHYDGIERVLKDVKY 662

Query: 276 VPHLKRNLISLGMLDSLGCEYKGKCGVFQVFMDSKEVLVGGKEKQTDKEIKYLRADNGSE 335
           +P LKRNLISLGMLD+ G  ++ +    +V   S  ++ G  +      I  +     S 
Sbjct: 663 IPELKRNLISLGMLDNSGYTFESEPNSLRVARGSLTIMKGTIKNGLYTFIGQIVTGKVST 722

Query: 336 FCGE---VFNHFCKESGITRHKNVRYTPQQNMVAERLNRTIMEKVRCQLSDVILEEKFWV 395
              E     N + +  GI  H+ VRYTPQQN +AER+NRTI+E+VRC LS   L + FW 
Sbjct: 723 VPKEDVGTTNLWHQRLGIATHRTVRYTPQQNALAERMNRTILERVRCMLSSSGLSKVFWA 782

Query: 396 EVAAYAVYTLNRSPHTSLGLLTPEEKWVSSCGTTN--KKFIISRDVHFRETEMFMQ---- 455
           E     V+ +N+S  ++L L TP+EKW          K F  +  VH +  ++  +    
Sbjct: 783 ETVETTVHLINKSLSSALQLKTPQEKWTGKYADYQYLKVFGCTAYVHTKTNKLESRVVKC 842

Query: 456 -----GKGNTKRSTEATKTYTQIELENARNGAQFTKKTKVIDQEIDGEQTKIVKEQSN-- 515
                 KG   +  E +    Q++ +      Q  K T+   + +   Q KIV E+ +  
Sbjct: 843 IFLGYSKGVKAKDVEES---DQLQFDVEHETLQPKKSTETSSKTV---QEKIVYERQDEP 902

Query: 516 ---LSQYSLARDRQR-----------------TLPVIMNLVLL------RKLSCNDARRW 572
              L  YSLARDRQ+                  L V   +V +        ++ ++A +W
Sbjct: 903 TQGLESYSLARDRQKRQVKPPKRYGQAEMTTFALSVAEEIVDMEPKTNQEAINSDEANQW 962

BLAST of CSPI01G19040 vs. NCBI nr
Match: gi|14029020|gb|AAK52561.1|AC079853_14 (Putative retroelement pol polyprotein [Oryza sativa Japonica Group])

HSP 1 Score: 241.9 bits (616), Expect = 3.5e-60
Identity = 159/447 (35.57%), Positives = 219/447 (48.99%), Query Frame = 1

Query: 318 EKQTDKEIKYLRADNGSEFCGEVFNHFCKESGITRHKNVRYTPQQNMVAERLNRTIMEKV 377
           E+QT+KE+K LR DNG EFC + F+ +C++ GI RH  + YTPQQN VAER+NRTI+ K 
Sbjct: 541 ERQTEKEVKVLRTDNGGEFCSDAFDDYCRKEGIVRHHTIPYTPQQNGVAERMNRTIISKA 600

Query: 378 RCQLSDVILEEKFWVEVAAYAVYTLNRSPHTSLGLLTPEEKW-----------VSSCGT- 437
           RC LS+  + ++FW E A  A Y +NRSP   L   TP E W           V  C   
Sbjct: 601 RCMLSNARMNKRFWAEAANTACYLINRSPSIPLNKKTPIEVWSGMPADYSQLRVFGCTAY 660

Query: 438 ---------------------------------TNKKFIISRDVHFRETEMFMQGKGNTK 497
                                            TNK F+  R V F ++ MF     N  
Sbjct: 661 AHVDNGKLEPRAIKCLFLGYGSGVKRYKLWNPETNKTFM-RRSVVFNKSVMF-----NDS 720

Query: 498 RSTEATKTYTQIELENARNGAQFTKKTKVIDQEIDGEQTKIVKEQSNLSQ-YSLARDRQR 557
             T+     +  E +  ++     ++TK       G   ++++E   +   +S A   + 
Sbjct: 721 LPTDVIPGGSDEEQQYPQDEPIAHRRTK----RSCGAPVRLIEECDMVYYAFSCAEQVEN 780

Query: 558 TLPVIMNLVLLRKLSCNDARRWIEAMNEEINSLKVNDTWTLIPLPKECKPIASKWIYKLK 617
           TL           +   D  +WI AM EE+ SL+ N TW L+ LPK+ KP+  KWI+K K
Sbjct: 781 TLEPA---TYTEAVVSGDREKWISAMQEEMQSLEKNGTWELVHLPKQKKPVRCKWIFKRK 840

Query: 618 EGVT--------------------ELTTKAAFLHGYLDETIYMVQPKGFEVQGKEDLYCL 677
           EG++                    +L  K AFL+G L+E IYM QP+GF V GKED  C 
Sbjct: 841 EGLSSSEPPRFKASIVAMHDLELEQLDVKTAFLYGELEEEIYMDQPEGFIVPGKEDYVCK 900

BLAST of CSPI01G19040 vs. NCBI nr
Match: gi|590589905|ref|XP_007016583.1| (Uncharacterized protein TCM_041944 [Theobroma cacao])

HSP 1 Score: 219.5 bits (558), Expect = 1.9e-53
Identity = 134/440 (30.45%), Positives = 232/440 (52.73%), Query Frame = 1

Query: 5   RVEIEKFDGKRDFALWKAKIKALLGQQKAHKALLDSSELPTTLTAVQNEEMKLNAYGTLI 64
           + EIEKF+G+ DF+LW  K+ ALL QQ   KAL +   L + L+  + + +   A+  ++
Sbjct: 7   KYEIEKFNGRNDFSLWCVKMCALLVQQGLLKALKEKEHLLSNLSNGEKDNLMEKAHSAIL 66

Query: 65  LNLSDNVIRQVLEEETTHKIWKKLESLYATKDLSNKMYLREKFFTYKMDPSKSLTDNLDK 124
           L LSD VIR+V +EE+   +W KL+S+Y TK L N++Y++++ +T KM    S+  ++D+
Sbjct: 67  LALSDEVIREVTDEESAIAVWLKLKSIYMTKSLMNRLYIKQRLYTLKMSEGTSVNTHIDE 126

Query: 125 FKKIVSDFKSLEDKLSDDNEAFVLQNSLPEAYKEVKNSLKYGKDSAKTDVIISTLRTREL 184
           F +++ D K+++ K+ D++ A +L  SLP +Y+   +++ YG+D+   + + ++L ++EL
Sbjct: 127 FNRVILDLKNIDVKIEDEDLALILLCSLPPSYENFMDTMLYGRDTFTFEDVRASLNSKEL 186

Query: 185 KIQLSH--------KEHQSGNGLFVKANYV----NPFRKRD-------------WVLDSG 244
           K ++ H        K+ +  N      N V    + F + D             W+LD  
Sbjct: 187 KKKVGHFRQDCTKFKDDEKINKFVNTVNVVGDDFDTFEETDNVLTITNDNLMDTWILDLA 246

Query: 245 CTYHMTLFRAWFNTYKEISRESVFMGNNNACNIAGIGSVTMKLKDETVNLLRNVRHVPHL 304
           C +H+ L R WF+TY+ +   +V + ++ + ++ GIG++ +K+ D  V  L        L
Sbjct: 247 CCFHICLRRDWFSTYQSVDMGTVQLEDDFSLSVVGIGTIRIKMFDGMVRSLEG-----KL 306

Query: 305 KRNLISLGMLDSLGCEYKGKCGVFQVFMDSKEVLVGGKEKQTDKEIKYLRADNGSEFCGE 364
             +L  L     +G        V        +VL    EKQT K I+  + D G EFC  
Sbjct: 307 SNDLYCL-----VGNTVIETVSVVSSNDPEDDVLT---EKQTKKRIESFQIDKGLEFCKG 366

Query: 365 VFNHFCKESGITRHKNVRYTPQQNMVAERLNRTIMEKVRCQLSDVILEEKFWVEVAAYAV 420
            F  F K   I R+  V  TP QN + E + +T++E+ +   S+V L + FW +    A 
Sbjct: 367 EFGLFYKNEKIVRYCTVVKTPHQNGIVEWMKKTLLERAKYMFSNVGLTKVFWTKAINKAC 426

BLAST of CSPI01G19040 vs. NCBI nr
Match: gi|116317760|emb|CAH65740.1| (OSIGBa0127D24.3 [Oryza sativa Indica Group])

HSP 1 Score: 211.8 bits (538), Expect = 3.9e-51
Identity = 149/412 (36.17%), Positives = 211/412 (51.21%), Query Frame = 1

Query: 297 KGKCGVFQVFMDSKEVLVGGKEKQTDKEIKYLRADNGSEFCGEVFNHFCKESGITRHKNV 356
           K K   F VF + K ++    E+QT++++K LR DNG EFC ++F  +CK  GI RH  V
Sbjct: 352 KHKYQAFDVFKEWKTMV----ERQTERKVKILRPDNGMEFCSKIFKSYCKSEGIVRHYTV 411

Query: 357 RYTPQQNMVAERLNRTIMEKVRCQLSDVILEEKFWVEVAAYAVYTLNRSPHTSLGLLTPE 416
            +TPQQN VAER+NRTI+ K RC LS+  L ++FW E  + A Y +NRSP  ++   TP 
Sbjct: 412 PHTPQQNGVAERMNRTIISKARCMLSNAGLPKQFWAEAVSTACYLINRSPSYAIDKKTPI 471

Query: 417 EKWVSS---------------------------------------------CGTTNKKFI 476
           E W SS                                             C  T KK +
Sbjct: 472 EVWSSSPAKYSDLRVFSCTAYAHVDNGKLEPRAIKCIFLGYPSGVKGYKLWCPET-KKVV 531

Query: 477 ISRDVHFRETEMFMQGKGNTKRSTEATKTYTQIELENARNGAQFTKKTKV-IDQE---ID 536
           I+R+V F E+ M +  K +T    E+ +    +++E+  +     +K  V I+Q+   I+
Sbjct: 532 INRNVVFHESVM-LHDKPSTNVPVESQEK-ASVQVEHLISSGHAPEKEDVAINQDAPVIE 591

Query: 537 GEQTKIVKEQSNLSQYSLARDRQR--TLP----------VIMNLVLLRKLSCN------- 596
              + IV++     + S+A+DR +  T P          V   L +  ++  N       
Sbjct: 592 DSDSSIVQQSP---KRSIAKDRPKRNTKPPRKYIEEANIVTYALSVAEEIEGNTEPSTYS 651

Query: 597 ------DARRWIEAMNEEINSLKVNDTWTLIPLPKECKPIASKWIYKLKEGVT---ELTT 625
                 D  RWI AM++E+ SL+ N TW L+ LPKE KPI  KWI+K KEG++   E   
Sbjct: 652 DAIVSDDCNRWITAMHDEMESLEKNHTWELVKLPKEKKPIRCKWIFKRKEGMSPSDEARY 711

BLAST of CSPI01G19040 vs. NCBI nr
Match: gi|77556671|gb|ABA99467.1| (retrotransposon protein, putative, Ty1-copia subclass [Oryza sativa Japonica Group])

HSP 1 Score: 209.1 bits (531), Expect = 2.5e-50
Identity = 203/753 (26.96%), Positives = 323/753 (42.90%), Query Frame = 1

Query: 1   MVVARVEIEKFDGKRDFALWKAKIKALLGQQKAHKALLDSSELPTTLTAVQNEEMKLNAY 60
           MV  + ++   D K  F+LW+ K++A+L Q         + +L     +V N    ++ +
Sbjct: 1   MVSMKYDLPLLDYKTRFSLWQVKMRAVLAQ---------TLDLDEESGSVLNH---ISVF 60

Query: 61  GTLILNLSDNVIRQVLEEETTHKIWKKLESLYATKDLSNKMYLREKFFTYKMDPSKSLTD 120
             ++ +L    + Q  +E+    +   L S YA  +  + + L     T  +        
Sbjct: 61  KEIVADLVSMEV-QFDDEDLGLLLLCSLPSSYA--NFRDTILLSHDELT--LAEVYEALQ 120

Query: 121 NLDKFKKIV------SDFKSLEDKLSDDNEAFVLQNSLPEAYKEVKNSLKYGKDSAKTDV 180
           N +K K +V      S  ++L+ +   +   +   N   ++    K   KY K       
Sbjct: 121 NREKMKGMVQSDASSSKGEALQVRGRFERRTYNDSNDHDKSQSRGKKLCKYCKKKNHFIE 180

Query: 181 IISTLRTRELK-------IQLSHKEHQSGNGLFVKANYVNPFRKRDWVLDSGCTYHMTLF 240
               L+ +E +       +  S     SG+ L V A  V      +W+LD+ C++H+ + 
Sbjct: 181 ECWNLQNKEKRKSDGKASVVTSADNSDSGDCLVVFAGCVA--NHDEWILDTACSFHICIN 240

Query: 241 RAWFNTYKEISRESVF-MGNNNACNIAGIGSVTMKLKDETVNLLRNVRHVPHLKRNLISL 300
           R WF++YK +    V  MG++N   I GIGSV +K  D     L+ VRH+P + RNLISL
Sbjct: 241 RDWFSSYKSVQNGDVVRMGDDNPRVIMGIGSVQIKTHDGMTRTLKYVRHIPGMARNLISL 300

Query: 301 GMLDSLGCEYKGKC--GVFQVFMDSKE---------VLVGGKEKQTDKEIKYLRADNGS- 360
             LD+ G +Y G    G       SK+         + +G   +    E+      +G  
Sbjct: 301 STLDAEGYKYSGSTLHGSVTAAAVSKDEASRTNLWHMRLGHMSELGIAELMKRNLLDGCT 360

Query: 361 ----EFCG----EVFNHFCKESGITRHKNVRYTPQQNMVAERLNRTIMEKVRCQLSDVIL 420
               +FC     + F+ +C++ GI RH  + YTPQQN VA+R+NRTI+ + RC LS+  +
Sbjct: 361 QGNMKFCEHCVFDAFDDYCRKEGIVRHHTIPYTPQQNGVAKRMNRTIISEARCLLSNARM 420

Query: 421 EEKFWVEVAAYAVYTLNRSPHTSLGLLTPEEKW-----------VSSC------------ 480
            ++FW E A  A Y +NRSP   L   TP E W           V  C            
Sbjct: 421 NKRFWAEAANTACYLINRSPSIPLNKKTPIEVWSGMPADYSQLRVFGCTAYAHVDNGKLE 480

Query: 481 ----------------------GTTNKKFIISRDVHFRETEMFMQGKGNT--KRSTEATK 540
                                   TNK F+ SR V F E+ MF            ++  +
Sbjct: 481 PRAIKCLFLSYGSGVKGYKLWNPQTNKTFM-SRSVVFNESIMFNDSLPTDVIPGGSDEEQ 540

Query: 541 TYTQIELENARNGAQFTKKTKVIDQEID--------------------------GEQTKI 600
            Y  +++E+  +     ++T+++  +I+                          G   ++
Sbjct: 541 QYVSVQVEHVDD-----QETEIVGNDINDIVQHSPPVLQPQDEPIAHRRTKRSCGAPVRL 600

Query: 601 VKEQSNLSQY--SLARDRQRTLPVIMNLVLLRKLSCNDARRWIEAMNEEINSLKVNDTWT 625
           + E+ ++  Y  S A   + TL           +   D  + I AM EE+ SL+ N TW 
Sbjct: 601 I-EKCDMVYYAFSCAEQVENTLEPA---TYTEAIVSGDREKSISAMQEEMQSLEKNGTWE 660

The following BLAST results are available for this feature:

CSPI01G19040 vs. Swiss-Prot
Match Name    E-value   Identity (%)   Description
POLX_TOBAC    2.9e-22   29.29          Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum...
COPIA_DROME   1.7e-14   38.40          Copia protein OS=Drosophila melanogaster GN=GIP PE=1 SV=3

CSPI01G19040 vs. TrEMBL
Match Name         E-value   Identity (%)   Description
A5AJF1_VITVI       8.2e-72   31.66          Putative uncharacterized protein OS=Vitis vinifera GN=VITISV_025437 PE=4 SV=1
Q94LG0_ORYSJ       2.5e-60   35.57          Putative retroelement pol polyprotein OS=Oryza sativa subsp. japonica GN=LOC_Os0...
A0A061GYD4_THECC   1.3e-53   30.45          Uncharacterized protein OS=Theobroma cacao GN=TCM_041944 PE=4 SV=1
Q01N60_ORYSA       2.7e-51   36.17          OSIGBa0127D24.3 protein OS=Oryza sativa GN=OSIGBa0127D24.3 PE=4 SV=1
Q2QP43_ORYSJ       1.8e-50   26.96          Retrotransposon protein, putative, Ty1-copia subclass OS=Oryza sativa subsp. jap...

CSPI01G19040 vs. TAIR10
Match Name    E-value   Identity (%)   Description
AT1G48720.1   9.7e-08   42.11          unknown protein

CSPI01G19040 vs. NCBI nr
Match Name                              E-value   Identity (%)   Description
gi|147777710|emb|CAN69096.1|            1.2e-71   31.66          hypothetical protein VITISV_025437 [Vitis vinifera]
gi|14029020|gb|AAK52561.1|AC079853_14   3.5e-60   35.57          Putative retroelement pol polyprotein [Oryza sativa Japonica Group]
gi|590589905|ref|XP_007016583.1|        1.9e-53   30.45          Uncharacterized protein TCM_041944 [Theobroma cacao]
gi|116317760|emb|CAH65740.1|            3.9e-51   36.17          OSIGBa0127D24.3 [Oryza sativa Indica Group]
gi|77556671|gb|ABA99467.1|              2.5e-50   26.96          retrotransposon protein, putative, Ty1-copia subclass [Oryza sativa Japonica Gro...

The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term        Definition
IPR001584   Integrase_cat-core
IPR012337   RNaseH-like_sf
IPR013103   RVT_2
Vocabulary: Biological Process
Term         Definition
GO:0015074   DNA integration
Vocabulary: Molecular Function
Term         Definition
GO:0003676   nucleic acid binding
GO Assignments
This gene is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0015074 DNA integration
cellular_component GO:0005575 cellular_component
molecular_function GO:1901363 heterocyclic compound binding
molecular_function GO:0003676 nucleic acid binding

The following mRNA feature(s) are a part of this gene:

Feature Name     Unique Name      Type
CSPI01G19040.1   CSPI01G19040.1   mRNA


Analysis Name: InterPro Annotations of cucumber (PI183967)
Date Performed: 2017-01-17
IPR Term   IPR Description                                      Source   Source Term        Source Description               Alignment
IPR001584  Integrase, catalytic core                            PFAM     PF00665            rve                              coord: 320..376; score: 2.
IPR001584  Integrase, catalytic core                            PROFILE  PS50994            INTEGRASE                        coord: 326..423; score: 15
IPR012337  Ribonuclease H-like domain                           GENE3D   G3DSA:3.30.420.10                                   coord: 325..415; score: 1.7
IPR012337  Ribonuclease H-like domain                           unknown  SSF53098           Ribonuclease H-like              coord: 323..417; score: 4.05
IPR013103  Reverse transcriptase, RNA-dependent DNA polymerase  PFAM     PF07727            RVT_2                            coord: 575..630; score: 2.0
None       No IPR available                                     PANTHER  PTHR11439          GAG-POL-RELATED RETROTRANSPOSON  coord: 7..676; score: 1.0E
None       No IPR available                                     PANTHER  PTHR11439:SF192    SUBFAMILY NOT NAMED              coord: 7..676; score: 1.0E
None       No IPR available                                     PFAM     PF14223            Retrotran_gag_2                  coord: 55..184; score: 2.5
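
To read the domain layout from the InterPro alignments above, the coordinates can be sorted along the protein. This is only an illustrative sketch: it assumes the `coord` values are 1-based, inclusive residue positions on the translated protein, which the record does not state, and `span` is a hypothetical helper. The scores are omitted because they are not fully shown on the page.

```python
# (source term, source description, start, end) copied from the alignments above.
# Assumption: coordinates are 1-based, inclusive amino-acid positions.
domains = [
    ("PF00665", "rve", 320, 376),
    ("PS50994", "INTEGRASE", 326, 423),
    ("PF07727", "RVT_2", 575, 630),
    ("PF14223", "Retrotran_gag_2", 55, 184),
]

def span(start, end):
    """Length of an inclusive coordinate range."""
    return end - start + 1

# Order the hits from N-terminus to C-terminus.
ordered = sorted(domains, key=lambda d: d[2])

for acc, name, start, end in ordered:
    print(f"{acc:8s} {name:16s} {start:>4d}..{end:<4d} {span(start, end):>4d} aa")
```

Sorted this way, the gag-like hit comes first, followed by the two integrase core hits and then the reverse transcriptase hit.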

The following gene(s) are orthologous to this gene:

None

The following gene(s) are paralogous to this gene:

None