CSPI02G15880 (gene) Wild cucumber (PI 183967)

Name: CSPI02G15880
Type: gene
Organism: Cucumis sativus (Wild cucumber (PI 183967))
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Location: Chr2 : 15347244 .. 15349890 (+)
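
As a rough illustration of how the Location field can be read programmatically, the sketch below splits it into chromosome, start, end and strand, then derives the span length. The function name `parse_location` is hypothetical, and the length calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re

def parse_location(text):
    """Split a 'Chr : start .. end (strand)' string into its four parts."""
    pattern = r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*"
    m = re.fullmatch(pattern, text)
    if m is None:
        raise ValueError(f"unrecognized location: {text!r}")
    return m.group(1), int(m.group(2)), int(m.group(3)), m.group(4)

chrom, start, end, strand = parse_location("Chr2 : 15347244 .. 15349890 (+)")

# Assumption: coordinates are 1-based and inclusive, so both ends count.
span_length = end - start + 1
print(chrom, strand, span_length)
```

Under that assumption the feature spans 2647 bases on the forward strand of Chr2.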
The following sequences are available for this feature:

Gene sequence (with intron)

ATGACGACACCTGAGAGTTCGACGTCCGGCAGCTCCAACTACTCAGTTTCGATTACCTCTTCTGATCTTGACGCCCAACTAAATCCCTTCATGCTTCATCATTCCATCACTCCAACCACCAACCTTGTTTCTACACCATTGGCAGGATCGAACAACTACTCATCATGGAGTAGAGCAATGATGCTGGCCTTATCTGGAAAGAACAAAGTTGGGTTCATCACTGGCCTGATCAAGAAACCTTCAGAAGGTAATCTATTATCCGCTTGGAAATGCAACAATGATGTGATAGCTTCTTGGATTATCAACTCTATTTCAAAAGAGATAGCAGCAAGCCTCGTTTATAATGGAAACGTAAAGGAAATATGGGATGAATTGAAAGAAAGGTACAAACAGTCCAATGGACCTCACATATACCAGTTACGAAAGGACCTAGTAACCACTACACAAGGAAGTTTATCAGTTGAAATATACTATGCAAAAATCACCACTATATGGCAGGAACTTGTGGAATATCGTCCTATGGATGAATGCACTTGTGAAGGATCAAAGAAAATGATCGATTTCTTGAATGCAGAATTCGTAATGATCTTTCTCATGGGATTAAATGAATCCTATTCGCAAATTAGAGCCCAAATTTTGTTGATTGATCCTTTACCTCCCATAAATAGAGTCTTTTCTCTAATCATCCAAGAAGAAAGACAAAGATCCATTGGATCTTCACCCTCCATTGAGAGCATCACATTGATGGCTAACTCTGAAAGAAGATTTTCTTCTGATAAATCCAAGAAGAAAGATACAAGACCTATATGCTCCAACTGTGGCTATAAAGGACACACTGCTGACAAATGCTACAAGTTACATGGCTACCCACCCGGACATAGACTTGCCAACAGCAATAATTCTGTCCATCAAAGACAGGACAACACAATCCAAAATGGAAATGACAAAGTGACAGAAGTTTCTAAGAGGAATCAATCTGCCTTTTTTGCTAGTCTCAACAGTGATCAATATACACAACTTCTGGGCATGCTTCAAACTCATCTCAACACACTTCAAAATGGTGAGAATTTCAAAAATGAGACTACACACATAGCAGGTACTTGCCTCTCTAACTCACTCAATGATCCCTTAACATGGATCATTGACTCTGGTGCTTCCTCACATATTTGTCATGACAAGTTTATGTTTACAAATCTCTATAGCGCCTAGAATATGTTTGTTATTTTGCCCACTAAGACTCGTCTAAAGGTTGAGCATATAGGAGATGTTTTCATATCAAATGATCTAGTCCTGAAAGATGTACTTTATATTCCTGACTTCAAGTATAACCTGCTGTCAGTAAGTACTCTCCTCAAGGATGACAAATTTGCCATATCATTTGCTGATTCTAATTGTCTTATTCAGGACAAGTGGCTTTTGAAAACGATTGGGAAGGCTGAATTAACTAATGGGCTCTACCTACTCAGAATGAAAAATGAAAGAGTTAATTGCATTCAGCACACTGCACTAATGTGTAAAGTCTCGGCCTCTATGTGGCATAAACGGATGGGACATCCCTCTATCAGCAGAATAAATGAGCTAGCTAAGATGATAGAAATTTCTGACTTTCCAAATTGTAAAGAAGTCTGCCATATTTGTCCCTTAGCTAAACAAAGACGTCTCTCTTTTCCTATATTGAATAACATTGCTGAAAATATATTTGATCTTATACATTGTGACATATGGGGTCCTTTCAAAACCCCAACACATGCTGGTCATTCATATTTTGCCACCATTGTATATGATAAATCTAGATACACTTGGGTATATCTTTTGGAACATAAGAGTGATATCCTACAAGTTATTCCTAGATTTTTGAAGTTAATTGAAACCCAATTTTCAAAAGTCATCAAGGTCTTTCGATCTGACAATGCTCCTGAGTTGAATTTCAGGGATCTTTTTGCCAAAACTGGAACAACTCATCAATTCTCGTGTGCTTACACTCCTCAACAAAATTCAGT
AGTGGAAAGAAAACACCAACACCTTCTTAACGTAGCAAGAGCATTGATGTTCCAATCAAAGGTTCCTCTTATCTTTTGGGGAGAATGTGTTCTGAGTGCTGCATACTTGATCAACAGAACGCCTATGGTATTACTATCAAATAACACTCCCTTTGCTGCTCTATTCAAGAAGAAAGCAGATTACAACATCATTAAGACCTTCGGGTGTCTTGCCTATGCCTCTACCCCCTCAGTAAACAGATCTAAGTTTGATCCTAGAGCACAACCTTGTGTTTTTATGGGGTTCCCACCAGGCATCAAAGGATACAGATTATATGACATAGCCAAGAGAAAGTTCTTTATATCTAGGGATGTCCTATTCTTTGAAGAACTATTTCCCTTTCATTCTATCAAAGAAAAGGACATTCTGATCTCCCATGACTTCCTTGAGCAATTCGTCATACCATGCCCTCTATTTGATTGCCTAGAAAAGGAAGATAGTATTGATGCAAGACCTACGACAGAGGATAGCCCTGAAGACAGCCACGGTGTTGATGATCAAAATCCACATATCAGTAACTCAGAAGAAACCAGTAACACTGACCAAGAACCAATTCCCATCATGACCAGAAAATCCTCCCGACCACACCACCCACCATCTTACCTAA

mRNA sequence

ATGACGACACCTGAGAGTTCGACGTCCGGCAGCTCCAACTACTCAGTTTCGATTACCTCTTCTGATCTTGACGCCCAACTAAATCCCTTCATGCTTCATCATTCCATCACTCCAACCACCAACCTTGTTTCTACACCATTGGCAGGATCGAACAACTACTCATCATGGAGTAGAGCAATGATGCTGGCCTTATCTGGAAAGAACAAAGTTGGGTTCATCACTGGCCTGATCAAGAAACCTTCAGAAGGTAATCTATTATCCGCTTGGAAATGCAACAATGATGTGATAGCTTCTTGGATTATCAACTCTATTTCAAAAGAGATAGCAGCAAGCCTCGTTTATAATGGAAACGTAAAGGAAATATGGGATGAATTGAAAGAAAGGTACAAACAGTCCAATGGACCTCACATATACCAGTTACGAAAGGACCTAGTAACCACTACACAAGGAAGTTTATCAGTTGAAATATACTATGCAAAAATCACCACTATATGGCAGGAACTTGTGGAATATCGTCCTATGGATGAATGCACTTGTGAAGGATCAAAGAAAATGATCGATTTCTTGAATGCAGAATTCGTAATGATCTTTCTCATGGGATTAAATGAATCCTATTCGCAAATTAGAGCCCAAATTTTGTTGATTGATCCTTTACCTCCCATAAATAGAGTCTTTTCTCTAATCATCCAAGAAGAAAGACAAAGATCCATTGGATCTTCACCCTCCATTGAGAGCATCACATTGATGGCTAACTCTGAAAGAAGATTTTCTTCTGATAAATCCAAGAAGAAAGATACAAGACCTATATGCTCCAACTGTGGCTATAAAGGACACACTGCTGACAAATGCTACAAGTTACATGGCTACCCACCCGGACATAGACTTGCCAACAGCAATAATTCTGTCCATCAAAGACAGGACAACACAATCCAAAATGGAAATGACAAAGTGACAGAAGTTTCTAAGAGGAATCAATCTGCCTTTTTTGCTAGTCTCAACAGTGATCAATATACACAACTTCTGGGCATGCTTCAAACTCATCTCAACACACTTCAAAATGGTGAGAATTTCAAAAATGAGACTACACACATAGCAGTCCTGAAAGATGTACTTTATATTCCTGACTTCAAGTATAACCTGCTGTCAGTAAGTACTCTCCTCAAGGATGACAAATTTGCCATATCATTTGCTGATTCTAATTGTCTTATTCAGGACAAGTGGCTTTTGAAAACGATTGGGAAGGCTGAATTAACTAATGGGCTCTACCTACTCAGAATGAAAAATGAAAGAGTTAATTGCATTCAGCACACTGCACTAATGTGTAAAGTCTCGGCCTCTATGTGGCATAAACGGATGGGACATCCCTCTATCAGCAGAATAAATGAGCTAGCTAAGATGATAGAAATTTCTGACTTTCCAAATTGTAAAGAAGTCTGCCATATTTGTCCCTTAGCTAAACAAAGACGTCTCTCTTTTCCTATATTGAATAACATTGCTGAAAATATATTTGATCTTATACATTGTGACATATGGGGTCCTTTCAAAACCCCAACACATGCTGGTCATTCATATTTTGCCACCATTGTATATGATAAATCTAGATACACTTGGGTATATCTTTTGGAACATAAGAGTGATATCCTACAAGTTATTCCTAGATTTTTGAAGTTAATTGAAACCCAATTTTCAAAAGTCATCAAGGTCTTTCGATCTGACAATGCTCCTGAGTTGAATTTCAGGGATCTTTTTGCCAAAACTGGAACAACTCATCAATTCTCGTGTGCTTACACTCCTCAACAAAATTCAGTAGTGGAAAGAAAACACCAACACCTTCTTAACGTAGCAAGAGCATTGATGTTCCAATCAAAGGTTCCTCTTATCTTTTGGGGAGAATGTGTTCTGAGTGCTGCATACTTGATCAACAGAACGCCTATGGTATTACTATCAAATAACACTCCCTTTGCTGCTCTATTCAAGAAGAAAGCAGATTACAACATCAT
TAAGACCTTCGGGTGTCTTGCCTATGCCTCTACCCCCTCAGTAAACAGATCTAAGTTTGATCCTAGAGCACAACCTTGTGTTTTTATGGGGTTCCCACCAGGCATCAAAGGATACAGATTATATGACATAGCCAAGAGAAAGTTCTTTATATCTAGGGATGTCCTATTCTTTGAAGAACTATTTCCCTTTCATTCTATCAAAGAAAAGGACATTCTGATCTCCCATGACTTCCTTGAGCAATTCGTCATACCATGCCCTCTATTTGATTGCCTAGAAAAGGAAGATAGTATTGATGCAAGACCTACGACAGAGGATAGCCCTGAAGACAGCCACGGTGTTGATGATCAAAATCCACATATCAGTAACTCAGAAGAAACCAAAAATCCTCCCGACCACACCACCCACCATCTTACCTAA

Coding sequence (CDS)

ATGACGACACCTGAGAGTTCGACGTCCGGCAGCTCCAACTACTCAGTTTCGATTACCTCTTCTGATCTTGACGCCCAACTAAATCCCTTCATGCTTCATCATTCCATCACTCCAACCACCAACCTTGTTTCTACACCATTGGCAGGATCGAACAACTACTCATCATGGAGTAGAGCAATGATGCTGGCCTTATCTGGAAAGAACAAAGTTGGGTTCATCACTGGCCTGATCAAGAAACCTTCAGAAGGTAATCTATTATCCGCTTGGAAATGCAACAATGATGTGATAGCTTCTTGGATTATCAACTCTATTTCAAAAGAGATAGCAGCAAGCCTCGTTTATAATGGAAACGTAAAGGAAATATGGGATGAATTGAAAGAAAGGTACAAACAGTCCAATGGACCTCACATATACCAGTTACGAAAGGACCTAGTAACCACTACACAAGGAAGTTTATCAGTTGAAATATACTATGCAAAAATCACCACTATATGGCAGGAACTTGTGGAATATCGTCCTATGGATGAATGCACTTGTGAAGGATCAAAGAAAATGATCGATTTCTTGAATGCAGAATTCGTAATGATCTTTCTCATGGGATTAAATGAATCCTATTCGCAAATTAGAGCCCAAATTTTGTTGATTGATCCTTTACCTCCCATAAATAGAGTCTTTTCTCTAATCATCCAAGAAGAAAGACAAAGATCCATTGGATCTTCACCCTCCATTGAGAGCATCACATTGATGGCTAACTCTGAAAGAAGATTTTCTTCTGATAAATCCAAGAAGAAAGATACAAGACCTATATGCTCCAACTGTGGCTATAAAGGACACACTGCTGACAAATGCTACAAGTTACATGGCTACCCACCCGGACATAGACTTGCCAACAGCAATAATTCTGTCCATCAAAGACAGGACAACACAATCCAAAATGGAAATGACAAAGTGACAGAAGTTTCTAAGAGGAATCAATCTGCCTTTTTTGCTAGTCTCAACAGTGATCAATATACACAACTTCTGGGCATGCTTCAAACTCATCTCAACACACTTCAAAATGGTGAGAATTTCAAAAATGAGACTACACACATAGCAGTCCTGAAAGATGTACTTTATATTCCTGACTTCAAGTATAACCTGCTGTCAGTAAGTACTCTCCTCAAGGATGACAAATTTGCCATATCATTTGCTGATTCTAATTGTCTTATTCAGGACAAGTGGCTTTTGAAAACGATTGGGAAGGCTGAATTAACTAATGGGCTCTACCTACTCAGAATGAAAAATGAAAGAGTTAATTGCATTCAGCACACTGCACTAATGTGTAAAGTCTCGGCCTCTATGTGGCATAAACGGATGGGACATCCCTCTATCAGCAGAATAAATGAGCTAGCTAAGATGATAGAAATTTCTGACTTTCCAAATTGTAAAGAAGTCTGCCATATTTGTCCCTTAGCTAAACAAAGACGTCTCTCTTTTCCTATATTGAATAACATTGCTGAAAATATATTTGATCTTATACATTGTGACATATGGGGTCCTTTCAAAACCCCAACACATGCTGGTCATTCATATTTTGCCACCATTGTATATGATAAATCTAGATACACTTGGGTATATCTTTTGGAACATAAGAGTGATATCCTACAAGTTATTCCTAGATTTTTGAAGTTAATTGAAACCCAATTTTCAAAAGTCATCAAGGTCTTTCGATCTGACAATGCTCCTGAGTTGAATTTCAGGGATCTTTTTGCCAAAACTGGAACAACTCATCAATTCTCGTGTGCTTACACTCCTCAACAAAATTCAGTAGTGGAAAGAAAACACCAACACCTTCTTAACGTAGCAAGAGCATTGATGTTCCAATCAAAGGTTCCTCTTATCTTTTGGGGAGAATGTGTTCTGAGTGCTGCATACTTGATCAACAGAACGCCTATGGTATTACTATCAAATAACACTCCCTTTGCTGCTCTATTCAAGAAGAAAGCAGATTACAACATCAT
TAAGACCTTCGGGTGTCTTGCCTATGCCTCTACCCCCTCAGTAAACAGATCTAAGTTTGATCCTAGAGCACAACCTTGTGTTTTTATGGGGTTCCCACCAGGCATCAAAGGATACAGATTATATGACATAGCCAAGAGAAAGTTCTTTATATCTAGGGATGTCCTATTCTTTGAAGAACTATTTCCCTTTCATTCTATCAAAGAAAAGGACATTCTGATCTCCCATGACTTCCTTGAGCAATTCGTCATACCATGCCCTCTATTTGATTGCCTAGAAAAGGAAGATAGTATTGATGCAAGACCTACGACAGAGGATAGCCCTGAAGACAGCCACGGTGTTGATGATCAAAATCCACATATCAGTAACTCAGAAGAAACCAAAAATCCTCCCGACCACACCACCCACCATCTTACCTAA
BLAST of CSPI02G15880 vs. Swiss-Prot
Match: POLX_TOBAC (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum PE=2 SV=1)

HSP 1 Score: 215.3 bits (547), Expect = 2.4e-54
Identity = 145/446 (32.51%), Positives = 219/446 (49.10%), Query Frame = 1

Query: 366 VLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLY--L 425
           VLKDV ++PD + NL+S    L  D +   FA+       KW L T G   +  G+    
Sbjct: 349 VLKDVRHVPDLRMNLIS-GIALDRDGYESYFANQ------KWRL-TKGSLVIAKGVARGT 408

Query: 426 LRMKNERVNCIQHTALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCK-EVCHIC 485
           L   N  +   +  A   ++S  +WHKRMGH S   +  LAK   IS       + C  C
Sbjct: 409 LYRTNAEICQGELNAAQDEISVDLWHKRMGHMSEKGLQILAKKSLISYAKGTTVKPCDYC 468

Query: 486 PLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLE 545
              KQ R+SF   +    NI DL++ D+ GP +  +  G+ YF T + D SR  WVY+L+
Sbjct: 469 LFGKQHRVSFQTSSERKLNILDLVYSDVCGPMEIESMGGNKYFVTFIDDASRKLWVYILK 528

Query: 546 HKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELN---FRDLFAKTGTTHQFSCAYTPQ 605
            K  + QV  +F  L+E +  + +K  RSDN  E     F +  +  G  H+ +   TPQ
Sbjct: 529 TKDQVFQVFQKFHALVERETGRKLKRLRSDNGGEYTSREFEEYCSSHGIRHEKTVPGTPQ 588

Query: 606 QNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFK 665
            N V ER ++ ++   R+++  +K+P  FWGE V +A YLINR+P V L+   P      
Sbjct: 589 HNGVAERMNRTIVEKVRSMLRMAKLPKSFWGEAVQTACYLINRSPSVPLAFEIPERVWTN 648

Query: 666 KKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISR 725
           K+  Y+ +K FGC A+A  P   R+K D ++ PC+F+G+     GYRL+D  K+K   SR
Sbjct: 649 KEVSYSHLKVFGCRAFAHVPKEQRTKLDDKSIPCIFIGYGDEEFGYRLWDPVKKKVIRSR 708

Query: 726 DVLFFE-ELFPFHSIKEKDILISHDFLEQFV-IPCPLFDCLEKEDSIDARPTTEDSPEDS 785
           DV+F E E+     + EK   + +  +  FV IP    +    E + D    +E   +  
Sbjct: 709 DVVFRESEVRTAADMSEK---VKNGIIPNFVTIPSTSNNPTSAESTTD--EVSEQGEQPG 768

Query: 786 HGVDDQNPHISNSEETKNPPDHTTHH 804
             ++         EE ++P      H
Sbjct: 769 EVIEQGEQLDEGVEEVEHPTQGEEQH 781
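
The percentages on each HSP summary line are the match counts divided by the alignment length. The sketch below recomputes them from the summary line of the POLX_TOBAC hit as a consistency check. The helper name `hsp_fractions` is hypothetical, and the regular expression assumes the `Label = n/m` layout used in this report.

```python
import re

def hsp_fractions(line):
    """Return {label: (count, aligned_length, percent)} for each 'Label = n/m' field."""
    result = {}
    for label, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", line):
        count, length = int(num), int(den)
        result[label] = (count, length, round(100 * count / length, 2))
    return result

summary = "Identity = 145/446 (32.51%), Positives = 219/446 (49.10%), Query Frame = 1"
stats = hsp_fractions(summary)
for label, (count, length, percent) in stats.items():
    print(f"{label}: {count}/{length} = {percent:.2f}%")
```

Fields without a fraction, such as `Query Frame = 1`, are skipped because the pattern requires a slash.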

BLAST of CSPI02G15880 vs. Swiss-Prot
Match: COPIA_DROME (Copia protein OS=Drosophila melanogaster GN=GIP PE=1 SV=3)

HSP 1 Score: 154.1 bits (388), Expect = 6.6e-36
Identity = 110/378 (29.10%), Positives = 179/378 (47.35%), Query Frame = 1

Query: 363 HIAVLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLY 422
           H   L+DVL+  +   NL+SV  L ++   +I F  S   I    L+  +  + + N + 
Sbjct: 341 HEITLEDVLFCKEAAGNLMSVKRL-QEAGMSIEFDKSGVTISKNGLM-VVKNSGMLNNVP 400

Query: 423 LLRMKNERVNCIQHTALMCKVSASMWHKRMGHPSISRINELAKMIEISDFP-------NC 482
           ++  +   +N  +H     K +  +WH+R GH S  ++ E+ +    SD         +C
Sbjct: 401 VINFQAYSINA-KH-----KNNFRLWHERFGHISDGKLLEIKRKNMFSDQSLLNNLELSC 460

Query: 483 KEVCHICPLAKQRRLSFPILNN---IAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDK 542
            E+C  C   KQ RL F  L +   I   +F ++H D+ GP    T    +YF   V   
Sbjct: 461 -EICEPCLNGKQARLPFKQLKDKTHIKRPLF-VVHSDVCGPITPVTLDDKNYFVIFVDQF 520

Query: 543 SRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPEL---NFRDLFAKTGTT 602
           + Y   YL+++KSD+  +   F+   E  F+  +     DN  E      R    K G +
Sbjct: 521 THYCVTYLIKYKSDVFSMFQDFVAKSEAHFNLKVVYLYIDNGREYLSNEMRQFCVKKGIS 580

Query: 603 HQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLL- 662
           +  +  +TPQ N V ER  + +   AR ++  +K+   FWGE VL+A YLINR P   L 
Sbjct: 581 YHLTVPHTPQLNGVSERMIRTITEKARTMVSGAKLDKSFWGEAVLTATYLINRIPSRALV 640

Query: 663 -SNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRL 722
            S+ TP+     KK     ++ FG   Y    +  + KFD ++   +F+G+ P   G++L
Sbjct: 641 DSSKTPYEMWHNKKPYLKHLRVFGATVYVHIKN-KQGKFDDKSFKSIFVGYEP--NGFKL 700

Query: 723 YDIAKRKFFISRDVLFFE 726
           +D    KF ++RDV+  E
Sbjct: 701 WDAVNEKFIVARDVVVDE 705

BLAST of CSPI02G15880 vs. TrEMBL
Match: O04543_ARATH (F20P5.25 protein OS=Arabidopsis thaliana GN=F20P5.25 PE=4 SV=1)

HSP 1 Score: 488.0 bits (1255), Expect = 2.2e-134
Identity = 269/701 (38.37%), Positives = 395/701 (56.35%), Query Frame = 1

Query: 60  MMLALSGKNKVGFITGLIKKPSEGN-LLSAWKCNNDVIASWIINSISKEIAASLVYNGNV 119
           M  ++  KNK+GF+ G I KP + +     W+  N ++ SW++NS+SKEI  S++Y    
Sbjct: 1   MTTSIEAKNKLGFVDGSIPKPDDDDPYCKIWRRCNSMVKSWLLNSVSKEIYTSILYFPTA 60

Query: 120 KEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECT 179
             IW +L  R+ +S+ P +Y+LR+ + +  QG+L +  Y+ +  T+W+EL   + +    
Sbjct: 61  AAIWKDLYTRFHKSSLPRLYKLRQQIHSLRQGNLDLSSYHTRTQTLWEELTSLQAVPRTV 120

Query: 180 CEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPLPPINRVFSLIIQEERQRS-- 239
               + ++       V+ FLMGLN+ Y  +R+QIL+   LP ++ VF++I Q+E QRS  
Sbjct: 121 ----EDLLIERETNRVIDFLMGLNDCYDTVRSQILMKKTLPSLSEVFNMIDQDETQRSAR 180

Query: 240 IGSSPSIESITLMAN---SERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGH 299
           I ++P + S     +   S+   + D  +KK+ RP+CS C   GH  D CYK HGYP   
Sbjct: 181 ISTTPGMTSSVFPVSNQSSQSALNGDTYQKKE-RPVCSYCSRPGHVEDTCYKKHGYPTSF 240

Query: 300 RLANSNNSVHQRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGML--------- 359
           +  +    V          G+++V      N S     L + Q  QL+  L         
Sbjct: 241 K--SKQKFVKPSISANAAIGSEEVV----NNTSVSTGDLTTSQIQQLVSFLSSKLQPPST 300

Query: 360 --QTHLNTLQNGENFKNETT--------HIA---VLKDVLYIPDFKYNLLSVSTLLKDDK 419
             Q  ++++    +  + +T        H+    +L DVL+IP FK+NLLSVS+L K   
Sbjct: 301 PVQPEVHSISVSSDPSSSSTVCPISGSVHLGRHLILNDVLFIPQFKFNLLSVSSLTKSMG 360

Query: 420 FAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQHTALMCKV-SASMWHK 479
             I F +++C++QD      +G  +    LY++ + +        +  +  V S  +WHK
Sbjct: 361 CRIWFDETSCVLQDATRELMVGMGKQVANLYIVDLDSLSHPGTDSSITVASVTSHDLWHK 420

Query: 480 RMGHPSISRINELAKMIEISDFPNCKEV-CHICPLAKQRRLSFPILNNIAENIFDLIHCD 539
           R+GHPS+ ++  ++ ++      N  +  C +C ++KQ+ L F   NN +   FDLIH D
Sbjct: 421 RLGHPSVQKLQPMSSLLSFPKQKNNTDFHCRVCHISKQKHLPFVSHNNKSSRPFDLIHID 480

Query: 540 IWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVF 599
            WGPF   TH G+ YF TIV D SR TWVYLL +KSD+L VIP F+ ++E QF   IK  
Sbjct: 481 TWGPFSVQTHDGYRYFLTIVDDYSRATWVYLLRNKSDVLTVIPTFVTMVENQFETTIKGV 540

Query: 600 RSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFW 659
           RSDNAPELNF   +   G     SC  TPQQNSVVERKHQH+LNVAR+L FQS +P+ +W
Sbjct: 541 RSDNAPELNFTQFYHSKGIVPYHSCPETPQQNSVVERKHQHILNVARSLFFQSHIPISYW 600

Query: 660 GECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPR 719
           G+C+L+A YLINR P  +L +  PF  L K    Y+ IK FGCL YAST   +R KF PR
Sbjct: 601 GDCILTAVYLINRLPAPILEDKCPFEVLTKTVPTYDHIKVFGCLCYASTSPKDRHKFSPR 660

Query: 720 AQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFPF 731
           A+ C F+G+P G KGY+L D+      +SR V+F EELFPF
Sbjct: 661 AKACAFIGYPSGFKGYKLLDLETHSIIVSRHVVFHEELFPF 690

BLAST of CSPI02G15880 vs. TrEMBL
Match: A0A151QVL3_CAJCA (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cajanus cajan GN=KK1_044776 PE=4 SV=1)

HSP 1 Score: 469.5 bits (1207), Expect = 8.0e-129
Identity = 277/734 (37.74%), Positives = 392/734 (53.41%), Query Frame = 1

Query: 45  TPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKP-SEGNLLSAWKCNNDVIASWIINS 104
           +P   S NY SWSR+M+ ALS KNKV F+ G   +P S   + SAWK  N+++ SW++ S
Sbjct: 5   SPSLDSTNYHSWSRSMLTALSAKNKVEFVDGSAPQPPSSDRIYSAWKRCNNMVVSWLVPS 64

Query: 105 ISKEIAASLVYNGNVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITT 164
           +S  I  S+++  + +EIW +LK RY Q +   I  L+ +  +  QG LSV  Y+ ++  
Sbjct: 65  VSFSIRQSILWMDSAEEIWRDLKSRYSQGDLLRISALQLEASSIKQGDLSVTDYFTQLRI 124

Query: 165 IWQELVEYRP------MDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDP 224
           IW EL  +RP      + +C C+ S  +      +  M FL GLN+ Y+ +R+ +LL+DP
Sbjct: 125 IWDELENFRPDPICVCIVKCICKVSSILAQRKLEDQAMQFLRGLNDQYANVRSHVLLMDP 184

Query: 225 LPPINRVFSLIIQEERQRSIGSSPS-----IESITLMANSERRFSS-------------- 284
           LPPIN++FS + Q+ERQ ++  S +       +  +M NS   F                
Sbjct: 185 LPPINKIFSYVAQQERQFAVSDSLAEVKNGFANAAIM-NSTCNFCGRNGHTESTCYRKHG 244

Query: 285 --DKSKKKDTR--PICSNCGYKGHTADKCYKLHGYPPGHRLAN----SNNSVHQRQDNTI 344
             DK+ K  +     CS+CG  GHT D CYK HG+PPGHR +N    S NS  Q    T 
Sbjct: 245 FPDKNGKSSSNRGKACSHCGKNGHTVDTCYKKHGFPPGHRFSNNKSASANSSVQPHICTT 304

Query: 345 QNGNDKVTEVSKRNQSAFFASLNSDQYTQLLG-----------MLQTHLNTLQNGENFKN 404
            + +     +          SL      +L+            +  TH  T+    NF+ 
Sbjct: 305 HSFDTTPWIIDSGATDHVTCSLQFFTSYKLIEPVIVNLPTGHKVTATHSGTVYFSSNFQ- 364

Query: 405 ETTHIAVLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTN 464
                  L DVLYI  F +NL+SVS L+    + I+F ++ C IQD      IG  ++  
Sbjct: 365 -------LTDVLYISSFAFNLISVSKLVSTTSYQITFTNNVCFIQDIRTKMKIGSVDVRG 424

Query: 465 GLYLLRMKNERVNCIQHTALMCK---VSASMWHKRMGHPSISRINELAKMIEISDFPNCK 524
           GLY L   + +   I  T +  K   +   +WH R+GH S +R++ + ++       N  
Sbjct: 425 GLYQLIPHHFKPPFIHSTIIHPKCDVIPIDLWHFRLGHLSNTRLHNMQQLYPCLTI-NKD 484

Query: 525 EVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHA-GHSYFATIVYDKSRY 584
             C+IC  AKQR+LSF   ++ A   F L+H DIWGP+   +   GH +F TIV D + +
Sbjct: 485 FTCNICHYAKQRKLSFSSSHSTASRPFSLLHMDIWGPYSCISSIHGHKFFFTIVDDNTHF 544

Query: 585 TWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCA 644
           TW++L+ +KS+    I  F+ LIE QF+  I+  ++DN  E   ++ F   G  HQ +C 
Sbjct: 545 TWIFLMINKSETRMHISNFINLIENQFNTRIQTIQTDNGAEFLMQNFFNSKGIVHQTTCI 604

Query: 645 YTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFA 704
            TPQQN VVERKHQHLLNV  AL+F SK+P  FW   +L A YLINR    LL N TPF 
Sbjct: 605 ETPQQNGVVERKHQHLLNVTHALLFHSKLPYCFWSYALLHATYLINRITTPLLDNKTPFQ 664

Query: 705 ALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKF 730
            L+ +  D   ++ FGCL Y ST + NR K DPRA PCVF+GF P  KGY  YD+  R  
Sbjct: 665 KLYGQTCDITELRVFGCLCYVSTSTANRKKLDPRAHPCVFLGFSPTTKGYITYDLHTRAI 724

BLAST of CSPI02G15880 vs. TrEMBL
Match: A0A151RKM5_CAJCA (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cajanus cajan GN=KK1_035441 PE=4 SV=1)

HSP 1 Score: 462.6 bits (1189), Expect = 9.7e-127
Identity = 280/789 (35.49%), Positives = 426/789 (53.99%), Query Frame = 1

Query: 60  MMLALSGKNKVGFITGLIKKP-SEGNLLSAWKCNNDVIASWIINSISKEIAASLVYNGNV 119
           M+L L  KNK+ F+ G + +P + G   +AW   N ++ SWI+ S+   +  S+++    
Sbjct: 1   MLLVLGTKNKLAFVDGSLARPVTAGVDQTAWDRCNKLVISWIVQSLDTSLIPSVIWMPTA 60

Query: 120 KEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECT 179
            +IW++LK+RY Q +   I +L +++ +  QG++S+  Y+  +  +WQEL  + P+  CT
Sbjct: 61  SQIWNDLKKRYYQGDAFRISELLEEIYSLKQGNMSITHYFTTLQGLWQELDHFCPIPSCT 120

Query: 180 CEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPLPPINRVFSLIIQEERQRSIG 239
           C             F   FL GLNE YS +R+Q++L+DPLP + +VFS+I+Q+ER+    
Sbjct: 121 C-------------FTTRFLKGLNEQYSNVRSQVMLMDPLPSVQKVFSMILQQEREFHGT 180

Query: 240 SSPSIESITLMANSERR-FSSDKSKKKD---TRPICSNCGYKGHTADKCYKLHGYPPGH- 299
           +   + ++T  +N+ER  +   K+ K++      +CS+CG  GH  D CYK HG P  H 
Sbjct: 181 NDNQVLAVT--SNNERNNYKGSKTFKRNKDYNTKVCSHCGRIGHLVDSCYKKHGPPLQHK 240

Query: 300 --RLANS-----------NNSVHQRQDNTIQNGNDKVTE-----VSKRNQSAFFASL--- 359
             R+ N            + SVH ++  +  +GN    E     ++   QS    S    
Sbjct: 241 HGRIVNQYQSVSDEDTDDDQSVHSQRVVSHNSGNMFTPEQHQALLALLQQSGMILSFPYS 300

Query: 360 -NSD----------------QYTQLLGMLQTHLNTLQNGENFKNETTHIAVLK------D 419
            NSD                 + Q    ++  L  L NG       +   +L       D
Sbjct: 301 HNSDCWIIDTGATDHVCKNINFFQSYRRIKPILIQLPNGSQVSTCVSGTVLLSKTCYLTD 360

Query: 420 VLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNE 479
           VL+IP+F +NL+SVS L K     ++F+DS+C IQ    ++ IG AEL  GLY + + + 
Sbjct: 361 VLFIPNFHFNLISVSKLAKTLSCTLTFSDSDCQIQANHSMRMIGAAELRAGLYAM-VSSP 420

Query: 480 RVNCIQH-TALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQR 539
             N + H T+      + +WH R+GH S  +++ L               C IC LAK +
Sbjct: 421 ESNVVHHCTSHFFTYQSDLWHLRLGHLSHDKLSALKGSYPEIQCNKISLPCEICHLAKHK 480

Query: 540 RLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDIL 599
           RL FP     +EN+FDLIH DIWGP    +  GH YF TIV DKSR+TW++ +++K +  
Sbjct: 481 RLPFPDSLTKSENVFDLIHVDIWGPLSVASIFGHKYFLTIVDDKSRFTWIFFMKNKFETK 540

Query: 600 QVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKH 659
            ++  F+  ++TQF + IK  R+DN  E   +D +AK G  HQ SC  TPQQN VVERKH
Sbjct: 541 LLLQNFVSFVQTQFQQNIKTIRTDNGSEFLLKDWYAKLGIVHQTSCVNTPQQNGVVERKH 600

Query: 660 QHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIK 719
           QH+L++ARALMFQS V  +FW   +  A +LINR P   L  N+P+  L+ +K D++ +K
Sbjct: 601 QHILSMARALMFQSNVSKMFWNYAIGHAVHLINRLPTRFLQQNSPYYVLYSEKPDFSHLK 660

Query: 720 TFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELF 779
            FGCLA+AST S NR+K +PR++ C+F+G+  G KG+ +YD+  R+ FISRDV F+E +F
Sbjct: 661 VFGCLAFASTLSHNRTKLEPRSRKCMFLGYSSGTKGFIMYDLKTRETFISRDVQFYENIF 720

Query: 780 PFHSIKEKDILI-SHDFLEQFVIPCPLFDCLEKEDSIDARPTTEDSPEDSHGVDDQNPHI 797
           P     +KD  I S D     +   PL  C    D I +  T ++  E  H  +     +
Sbjct: 721 PL----QKDFSIQSTDGPVVPIAQMPLTSC----DPIPSH-THDNLDETEHEHNSSTLPM 764

BLAST of CSPI02G15880 vs. TrEMBL
Match: Q9XIL8_ARATH (Putative retroelement pol polyprotein OS=Arabidopsis thaliana GN=At2g15870 PE=4 SV=1)

HSP 1 Score: 439.5 bits (1129), Expect = 8.8e-120
Identity = 270/808 (33.42%), Positives = 419/808 (51.86%), Query Frame = 1

Query: 6   SSTSGSSNYSVSITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALS 65
           S T+G+S   VS  S+D     +PF LH +  P  N++S  L    NY  W+ AM+++  
Sbjct: 43  SKTTGASRVIVSPESTD--PTQSPFFLHSADHPCLNIISHRL-DETNYGDWNVAMLISC- 102

Query: 66  GKNKVGFITGLIKKPSEGNLLSAWKCNNDVIASWIINSISKEIAASLVYNGNVKEIWDEL 125
                                      N ++ SW++NS+S +I  S++   +V +IW +L
Sbjct: 103 ---------------------------NTMVKSWLLNSMSPQIYRSILRMNDVSDIWRDL 162

Query: 126 KERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDE-CTCEGSKK 185
             R+  +N P  Y L +++    Q +LS+  YY ++ T+W +L     +D+ CTC  + +
Sbjct: 163 NSRFNMTNLPRTYNLTQEIQDLRQSTLSLSEYYTRLKTLWDQLNSTEELDDPCTCGKALR 222

Query: 186 MIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPLPPINRVFSLIIQEERQRSIGSSPS-- 245
           +        ++ FL GLNESY+ IR Q++    LP +  V+ ++ Q+  Q+   +  +  
Sbjct: 223 LQQKAERAKIVKFLAGLNESYAIIRRQVIAKKILPSLAEVYHIVDQDNSQQGFSNVVAPP 282

Query: 246 ----IESITLMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANS 305
               +  +T+    +      ++     RP+CS     GH A++CYK HG+PPG      
Sbjct: 283 VAFQVSEVTVANIIDPTICYVQNCPNKGRPMCSFYNRVGHIAERCYKKHGFPPGF----- 342

Query: 306 NNSVHQRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTHLNTLQNGENFK 365
                           DKV + +++ +S                +    L T       K
Sbjct: 343 -------------TPKDKVGDKTQKPKSV---------------VANVALATT------K 402

Query: 366 NETTHIAVLK-DVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAEL 425
           ++ TH  +   +VL+IP+F+ NL+S+S+L  D    + F      IQD    + +G    
Sbjct: 403 SDDTHSGLESLNVLFIPEFRLNLISISSLTDDIGSRVIFDQHAFEIQDLIKGRMLGHGRR 462

Query: 426 TNGLYLLRMKNERVNCIQHTALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNC-K 485
              LY++ +++  V+      +   V  S WH R+ H S+ R++ +++ +  +   N   
Sbjct: 463 VANLYVMDVEDTNVS------VNAVVDISTWHNRLRHASLQRLDVISESLGTTKHKNKGS 522

Query: 486 EVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYT 545
           + CH+C LAK R+LSFP  NN+   IF+++H DIWGPF   T  G+ YF TIV D SR T
Sbjct: 523 DYCHVCHLAKHRKLSFPSQNNVCNEIFEMLHIDIWGPFSVETVDGYQYFLTIVDDHSRAT 582

Query: 546 WVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAY 605
           W+YLL+ KS++L +   F++ +E Q+   +K  RSDNAPEL F  L+ + G     SC  
Sbjct: 583 WIYLLKTKSEVLTIFHDFIQQVENQYKVKVKAVRSDNAPELRFTSLYQRKGIMAFHSCPE 642

Query: 606 TPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAA 665
           TP+QNSVVERKHQH+LNVARALMFQS+VPL  WGECVL+A +LINRTP  LLSN TP+  
Sbjct: 643 TPEQNSVVERKHQHILNVARALMFQSQVPLFLWGECVLTAVFLINRTPSQLLSNKTPYEI 702

Query: 666 LFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFF 725
           L      Y  ++TFGCL Y+ST    R KF PR++ C+F+G+  G KGY+L D+     F
Sbjct: 703 LSGTAPQYGQLRTFGCLCYSSTSPKQRHKFQPRSKACIFLGYSSGYKGYKLMDLESNAIF 759

Query: 726 ISRDVLFFEELFPFHSIKEKDILISHDFLEQFVIPCPLFDCLEKEDSIDARPTTEDSPED 785
           ISR+V+F EE+FP    K+     S D L+       +   ++++ S  +          
Sbjct: 763 ISRNVVFLEEVFPLAGTKK-----SADSLKLCTPLVLVPSGIQQQSSFSSL--------- 759

Query: 786 SHGVDDQNPHISNSEETKNPPDHTTHHL 805
           S  + D  P IS S+  + PP H + ++
Sbjct: 823 SSQISDLPPQIS-SQRDRKPPTHLSDYV 759

BLAST of CSPI02G15880 vs. TrEMBL
Match: A0A151R1J9_CAJCA (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cajanus cajan GN=KK1_042468 PE=4 SV=1)

HSP 1 Score: 439.1 bits (1128), Expect = 1.2e-119
Identity = 283/848 (33.37%), Positives = 429/848 (50.59%), Query Frame = 1

Query: 32  LHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSEGNLL-SAWK 91
           LH +  P   LVS PL  S NY SWSR+M+ ALS KNKV F+ G   KPS  + + +AW+
Sbjct: 17  LHPNENPAIALVS-PLLDSTNYHSWSRSMLTALSAKNKVEFVDGSAPKPSPSDPMHAAWR 76

Query: 92  CNNDVIASWIINSISKEIAASLVYNGNVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQG 151
             N+++ SW+++S+S  I  S+++   V EIW +LK RY                   QG
Sbjct: 77  RCNNMVVSWLVHSVSMPIRHSVLWMDKVDEIWKDLKSRYSNK----------------QG 136

Query: 152 SLSVEIYYAKITTIWQELVEYRPMDECTC--EGSKKMIDFLNA----EFVMIFLMGLNES 211
            L++  ++ K+  +W EL  +RP   CTC  + S K+   +      +  M FL GLN+ 
Sbjct: 137 DLTITEFFTKLRVVWDELENFRPDPVCTCTIKCSCKLFSTIAQRKLEDRAMQFLRGLNDQ 196

Query: 212 YSQIRAQILLIDPLPPINRVFSLIIQEERQ-----------RSIGSSPSI---------- 271
           YS +R+ +LL+DPLPPI+++FS + Q ERQ           R I ++ S           
Sbjct: 197 YSNVRSHVLLMDPLPPISKIFSYVAQLERQLLGPIIPYIKERLINATTSFSCTHCGCLGH 256

Query: 272 -ESITLMANSERRFSSDKSKK---KDTRPICSNCGYKGHTADKCYKLHGYPPGHRLA--- 331
            ESI    +     + +K  K        +C++CG  GHT D CY+ HG+PPGH+L+   
Sbjct: 257 TESIYYRKHGFPSHNDNKHNKVTFNRNGKLCTHCGKMGHTIDVCYRKHGFPPGHKLSTKH 316

Query: 332 ------------NSNNSVHQRQDNTIQN----GNDKVTEVSKRNQSAFFASLNSDQYTQL 391
                       ++++S H  Q ++I      G  K   V   +QS     ++SD    +
Sbjct: 317 TVINSTVTDNGGSTSSSSHVNQISSISPSSLAGICKHFSVCSISQSVASWIIDSDATDHV 376

Query: 392 LGMLQTHLNTLQNGENFKN-------ETTHIAVLK--------DVLYIPDFKYNLLSVST 451
              L    + +     F           TH  V+K        DVLYIP F +NL+S+S 
Sbjct: 377 SSSLLNFSSYVMINPVFVKLPTGQTVTATHSGVVKFSESLFLVDVLYIPSFTFNLISLSK 436

Query: 452 LLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQHTAL--MC-K 511
           L+   +  + F+ ++C+IQD    K IG  ++  GLY L ++    + +  T +   C K
Sbjct: 437 LVSSLQCELIFSHNSCIIQDSKNKKRIGTVDVNGGLYTLALQPIVNHFVYSTIVNPQCNK 496

Query: 512 VSASMWHKRMGHPSISRINELAKMIEISDFPNCKEV-----CHICPLAKQRRLSFPILNN 571
           +   +WH  +GH S  R      M  +  + +C  V     C+ C  AKQ +L FP+ ++
Sbjct: 497 LPIDLWHFHLGHLSHER------MFIMKQYYSCLSVDKTFICNTCHHAKQHKLPFPLSHS 556

Query: 572 IAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKL 631
            A   F+L+H DIWGP    +  GH YF T+V D +R+TW++L+  KS+    I  F+  
Sbjct: 557 HASQPFELLHMDIWGPCTLTSMHGHRYFLTVVDDHTRFTWIFLMTSKSETRTHIINFITQ 616

Query: 632 IETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARA 691
           IE QFSK +K+ R+DN  + N    F   G  HQ +C  TPQQN +VE KHQH+LNV RA
Sbjct: 617 IEKQFSKTVKIIRTDNGAKFNMHQYFLSKGIIHQTTCIETPQQNGIVEHKHQHILNVTRA 676

Query: 692 LMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYAS 751
           L+FQS +P  FW   +L    LIN  P   L N +PF  ++    D +I++ FGCL Y  
Sbjct: 677 LLFQSHLPSSFWFFALLHVVLLINCIPTPFLHNKSPFEIIYNHPFDISILRVFGCLCYTG 736

Query: 752 TPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFPF-----HS 800
           T + +R+K DPRA   +F+GF P  KGY ++++  R+  +SR+V+F+E+ FP+     H 
Sbjct: 737 TITAHRTKLDPRAHLSIFLGFKPYTKGYLVFNLHSRQLVVSRNVIFYEDHFPYVHHTHHE 796

BLAST of CSPI02G15880 vs. TAIR10
Match: AT1G21280.1 (AT1G21280.1 Retrotransposon gag protein (InterPro:IPR005162))

HSP 1 Score: 104.0 bits (258), Expect = 4.4e-22
Identity = 67/226 (29.65%), Positives = 110/226 (48.67%), Query Frame = 1

Query: 15  SVSITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFIT 74
           SVS TS        P  +HH   P+   +       +NY +W       L    K GFI 
Sbjct: 7   SVSPTSDPDSPYYLPPDIHH---PSDFSIQKLSKDEDNYVAWKIRFRSFLRVTKKFGFID 66

Query: 75  GLIKKPSEGN-LLSAWKCNNDVIASWIINSISKEIAASLVYNGNVKEIWDELKERYKQSN 134
           G + KP   + L   W+  N ++  W++NS++ ++  S++Y     ++W++L+  +    
Sbjct: 67  GTLPKPDPFSPLYQPWEQCNAMVMYWLMNSMTDKLLESVMYAETAHKMWEDLRRVFVPCV 126

Query: 135 GPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEG-----SKKMIDF 194
              IYQLR+ L T  QG  SVE Y+ K++ +W EL EY P+ EC C G     +K+  + 
Sbjct: 127 DLKIYQLRRRLATLRQGGDSVEEYFGKLSKVWMELSEYAPIPECKCGGCNCECTKRAEEA 186

Query: 195 LNAEFVMIFLMG--LNESYSQIRAQILLIDPLPPINRVFSLIIQEE 233
              E    FLMG  LN+ +  +  +I+   P P ++  F+++   E
Sbjct: 187 REKEQRYEFLMGLKLNQGFEAVTTKIMFQKPPPSLHEAFAMVKDAE 229

BLAST of CSPI02G15880 vs. NCBI nr
Match: gi|2194136|gb|AAB61111.1| (Strong similarity to Zea mays retrotransposon Hopscotch polyprotein (gb|U12626) [Arabidopsis thaliana])

HSP 1 Score: 488.0 bits (1255), Expect = 3.1e-134
Identity = 269/701 (38.37%), Positives = 395/701 (56.35%), Query Frame = 1

Query: 60  MMLALSGKNKVGFITGLIKKPSEGN-LLSAWKCNNDVIASWIINSISKEIAASLVYNGNV 119
           M  ++  KNK+GF+ G I KP + +     W+  N ++ SW++NS+SKEI  S++Y    
Sbjct: 1   MTTSIEAKNKLGFVDGSIPKPDDDDPYCKIWRRCNSMVKSWLLNSVSKEIYTSILYFPTA 60

Query: 120 KEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECT 179
             IW +L  R+ +S+ P +Y+LR+ + +  QG+L +  Y+ +  T+W+EL   + +    
Sbjct: 61  AAIWKDLYTRFHKSSLPRLYKLRQQIHSLRQGNLDLSSYHTRTQTLWEELTSLQAVPRTV 120

Query: 180 CEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPLPPINRVFSLIIQEERQRS-- 239
               + ++       V+ FLMGLN+ Y  +R+QIL+   LP ++ VF++I Q+E QRS  
Sbjct: 121 ----EDLLIERETNRVIDFLMGLNDCYDTVRSQILMKKTLPSLSEVFNMIDQDETQRSAR 180

Query: 240 IGSSPSIESITLMAN---SERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGH 299
           I ++P + S     +   S+   + D  +KK+ RP+CS C   GH  D CYK HGYP   
Sbjct: 181 ISTTPGMTSSVFPVSNQSSQSALNGDTYQKKE-RPVCSYCSRPGHVEDTCYKKHGYPTSF 240

Query: 300 RLANSNNSVHQRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGML--------- 359
           +  +    V          G+++V      N S     L + Q  QL+  L         
Sbjct: 241 K--SKQKFVKPSISANAAIGSEEVV----NNTSVSTGDLTTSQIQQLVSFLSSKLQPPST 300

Query: 360 --QTHLNTLQNGENFKNETT--------HIA---VLKDVLYIPDFKYNLLSVSTLLKDDK 419
             Q  ++++    +  + +T        H+    +L DVL+IP FK+NLLSVS+L K   
Sbjct: 301 PVQPEVHSISVSSDPSSSSTVCPISGSVHLGRHLILNDVLFIPQFKFNLLSVSSLTKSMG 360

Query: 420 FAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQHTALMCKV-SASMWHK 479
             I F +++C++QD      +G  +    LY++ + +        +  +  V S  +WHK
Sbjct: 361 CRIWFDETSCVLQDATRELMVGMGKQVANLYIVDLDSLSHPGTDSSITVASVTSHDLWHK 420

Query: 480 RMGHPSISRINELAKMIEISDFPNCKEV-CHICPLAKQRRLSFPILNNIAENIFDLIHCD 539
           R+GHPS+ ++  ++ ++      N  +  C +C ++KQ+ L F   NN +   FDLIH D
Sbjct: 421 RLGHPSVQKLQPMSSLLSFPKQKNNTDFHCRVCHISKQKHLPFVSHNNKSSRPFDLIHID 480

Query: 540 IWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVF 599
            WGPF   TH G+ YF TIV D SR TWVYLL +KSD+L VIP F+ ++E QF   IK  
Sbjct: 481 TWGPFSVQTHDGYRYFLTIVDDYSRATWVYLLRNKSDVLTVIPTFVTMVENQFETTIKGV 540

Query: 600 RSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFW 659
           RSDNAPELNF   +   G     SC  TPQQNSVVERKHQH+LNVAR+L FQS +P+ +W
Sbjct: 541 RSDNAPELNFTQFYHSKGIVPYHSCPETPQQNSVVERKHQHILNVARSLFFQSHIPISYW 600

Query: 660 GECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPR 719
           G+C+L+A YLINR P  +L +  PF  L K    Y+ IK FGCL YAST   +R KF PR
Sbjct: 601 GDCILTAVYLINRLPAPILEDKCPFEVLTKTVPTYDHIKVFGCLCYASTSPKDRHKFSPR 660

Query: 720 AQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFPF 731
           A+ C F+G+P G KGY+L D+      +SR V+F EELFPF
Sbjct: 661 AKACAFIGYPSGFKGYKLLDLETHSIIVSRHVVFHEELFPF 690

BLAST of CSPI02G15880 vs. NCBI nr
Match: gi|727473974|ref|XP_010412743.1| (PREDICTED: uncharacterized protein LOC104699096 [Camelina sativa])

HSP 1 Score: 483.4 bits (1243), Expect = 7.6e-133
Identity = 285/799 (35.67%), Positives = 437/799 (54.69%), Query Frame = 1

Query: 24  DAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSEG 83
           D   NP+ LH S      LVS  L   + + SW R++ +AL+ +NK+GFI G I KP + 
Sbjct: 23  DQYENPYFLHSSDHAGMVLVSDRLTTGSEFHSWRRSVRMALNVRNKLGFIDGTIPKPPDD 82

Query: 84  NLLSA-WKCNNDVIASWIINSISKEIAASLVYNGNVKEIWDELKERYKQSNGPHIYQLRK 143
           +  +A W   ND++ +W+++S+SK I  SL+Y      IW +L  R++Q + P +++L +
Sbjct: 83  HPDAAFWSRCNDMVITWLMSSVSKHIGQSLLYMATASAIWLDLMSRFRQDDAPRVFELEQ 142

Query: 144 DLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCE----GSKKMIDFLNAEF-VMIF 203
            L T  QGS+ V  YY  + T+W+EL  Y  +  CTC      +  M + L     V  F
Sbjct: 143 RLSTLQQGSMDVTTYYTALVTLWEELKNYIDVHVCTCGKCECNAALMWEKLQQRSRVTKF 202

Query: 204 LMGLNESYSQIRAQILLIDPLPPINRVFSLIIQEERQRSIGSSPSIESITL--------- 263
           LMGLNE+Y   R  IL++ P+P I  VF ++ ++ERQ++I  +  I+++           
Sbjct: 203 LMGLNEAYQPTRRHILMLKPMPSIGEVFHMVTEDERQKNIQPATKIDNVAFQASGPSDNV 262

Query: 264 ----MANSERRFSSDKSK-------KKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLAN 323
               MA S   + S  +        +   +P C++C   GHT  KC+KLHGYPP HR A 
Sbjct: 263 GYNPMAPSPNGYFSGPTDNTAYAAFRPHQKPYCTHCNRVGHTIQKCFKLHGYPPSHRNA- 322

Query: 324 SNNSVHQRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTHLNTLQNGENF 383
                   Q   +Q    +     +   +   +S  SDQ   L+  L T     +   ++
Sbjct: 323 --------QRPPMQQVLSQSAPPPQVATNLDLSSFTSDQIQSLVHQLSTQTKASEQPSSY 382

Query: 384 KNETT--HIA------------VLKDV-LYIPDFKYNLLSVSTLLKDDKFAISFADSNCL 443
            + T   H A            V  D+ L+   F    ++VS L    + AI+     C 
Sbjct: 383 PSATITEHGAMAATSSSGATSHVCSDLSLFSETFPVTGVTVS-LPNGTRVAITHT---CT 442

Query: 444 IQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQHTALMC---KVSASMWHKRMGHPSISR 503
              + L+  IGK  L + LY+L   +   +    +A      +V   +WH+R+GHPS  +
Sbjct: 443 DHTRDLM--IGKGVLIHCLYILEHHDISASANSVSASFSGSLQVDGRVWHQRLGHPSSVK 502

Query: 504 INELAKMIEIS-DFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPT 563
           +  ++  + +     +  + C ICPLAKQ++L F   + ++   FDLIH D WGPF   +
Sbjct: 503 LQYMSGTLPMFISNSSSHDHCDICPLAKQKKLPFVSHSKLSALPFDLIHIDTWGPFSVES 562

Query: 564 HAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELN 623
             G+ YF TIV D +R TW+Y++ +K+D+L + P F+K ++TQ++ VIK+ RSDNAPEL 
Sbjct: 563 IEGYRYFLTIVDDCTRITWIYMMRNKNDVLSIFPSFVKHVQTQYNSVIKIVRSDNAPELG 622

Query: 624 FRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAY 683
           F  L    G  H FSC YTPQQNSVVERKHQHLLNVA AL+FQS VPL +  +C+ +AA+
Sbjct: 623 FSQLVKDHGMLHHFSCPYTPQQNSVVERKHQHLLNVAHALLFQSNVPLCYRSDCIHTAAF 682

Query: 684 LINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGF 743
           LINRTP ++L   +P+   FK K DY+ +++FGCL YAST + +R+KF PRA+PCVF+G+
Sbjct: 683 LINRTPSLVLDKMSPYEKNFKPKPDYSFLRSFGCLCYASTLAKDRNKFIPRAEPCVFLGY 742

Query: 744 PPGIKGYRLYDIAKRKFFISRDVLFFEELFPFHSIKEKDILISHDFLEQFVIPCPLFDCL 778
           P G KGY++ ++      I+R+++F E +FPF ++     L + D   + V+P P+   +
Sbjct: 743 PSGYKGYKVLNLETNVVHITRNIVFHEHIFPFKNVTPS--LPTTDLFSKMVLPLPVSTEI 802

BLAST of CSPI02G15880 vs. NCBI nr
Match: gi|1012321879|gb|KYP34293.1| (Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cajanus cajan])

HSP 1 Score: 469.5 bits (1207), Expect = 1.1e-128
Identity = 277/734 (37.74%), Positives = 392/734 (53.41%), Query Frame = 1

Query: 45  TPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKP-SEGNLLSAWKCNNDVIASWIINS 104
           +P   S NY SWSR+M+ ALS KNKV F+ G   +P S   + SAWK  N+++ SW++ S
Sbjct: 5   SPSLDSTNYHSWSRSMLTALSAKNKVEFVDGSAPQPPSSDRIYSAWKRCNNMVVSWLVPS 64

Query: 105 ISKEIAASLVYNGNVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITT 164
           +S  I  S+++  + +EIW +LK RY Q +   I  L+ +  +  QG LSV  Y+ ++  
Sbjct: 65  VSFSIRQSILWMDSAEEIWRDLKSRYSQGDLLRISALQLEASSIKQGDLSVTDYFTQLRI 124

Query: 165 IWQELVEYRP------MDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDP 224
           IW EL  +RP      + +C C+ S  +      +  M FL GLN+ Y+ +R+ +LL+DP
Sbjct: 125 IWDELENFRPDPICVCIVKCICKVSSILAQRKLEDQAMQFLRGLNDQYANVRSHVLLMDP 184

Query: 225 LPPINRVFSLIIQEERQRSIGSSPS-----IESITLMANSERRFSS-------------- 284
           LPPIN++FS + Q+ERQ ++  S +       +  +M NS   F                
Sbjct: 185 LPPINKIFSYVAQQERQFAVSDSLAEVKNGFANAAIM-NSTCNFCGRNGHTESTCYRKHG 244

Query: 285 --DKSKKKDTR--PICSNCGYKGHTADKCYKLHGYPPGHRLAN----SNNSVHQRQDNTI 344
             DK+ K  +     CS+CG  GHT D CYK HG+PPGHR +N    S NS  Q    T 
Sbjct: 245 FPDKNGKSSSNRGKACSHCGKNGHTVDTCYKKHGFPPGHRFSNNKSASANSSVQPHICTT 304

Query: 345 QNGNDKVTEVSKRNQSAFFASLNSDQYTQLLG-----------MLQTHLNTLQNGENFKN 404
            + +     +          SL      +L+            +  TH  T+    NF+ 
Sbjct: 305 HSFDTTPWIIDSGATDHVTCSLQFFTSYKLIEPVIVNLPTGHKVTATHSGTVYFSSNFQ- 364

Query: 405 ETTHIAVLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTN 464
                  L DVLYI  F +NL+SVS L+    + I+F ++ C IQD      IG  ++  
Sbjct: 365 -------LTDVLYISSFAFNLISVSKLVSTTSYQITFTNNVCFIQDIRTKMKIGSVDVRG 424

Query: 465 GLYLLRMKNERVNCIQHTALMCK---VSASMWHKRMGHPSISRINELAKMIEISDFPNCK 524
           GLY L   + +   I  T +  K   +   +WH R+GH S +R++ + ++       N  
Sbjct: 425 GLYQLIPHHFKPPFIHSTIIHPKCDVIPIDLWHFRLGHLSNTRLHNMQQLYPCLTI-NKD 484

Query: 525 EVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHA-GHSYFATIVYDKSRY 584
             C+IC  AKQR+LSF   ++ A   F L+H DIWGP+   +   GH +F TIV D + +
Sbjct: 485 FTCNICHYAKQRKLSFSSSHSTASRPFSLLHMDIWGPYSCISSIHGHKFFFTIVDDNTHF 544

Query: 585 TWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCA 644
           TW++L+ +KS+    I  F+ LIE QF+  I+  ++DN  E   ++ F   G  HQ +C 
Sbjct: 545 TWIFLMINKSETRMHISNFINLIENQFNTRIQTIQTDNGAEFLMQNFFNSKGIVHQTTCI 604

Query: 645 YTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFA 704
            TPQQN VVERKHQHLLNV  AL+F SK+P  FW   +L A YLINR    LL N TPF 
Sbjct: 605 ETPQQNGVVERKHQHLLNVTHALLFHSKLPYCFWSYALLHATYLINRITTPLLDNKTPFQ 664

Query: 705 ALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKF 730
            L+ +  D   ++ FGCL Y ST + NR K DPRA PCVF+GF P  KGY  YD+  R  
Sbjct: 665 KLYGQTCDITELRVFGCLCYVSTSTANRKKLDPRAHPCVFLGFSPTTKGYITYDLHTRAI 724

BLAST of CSPI02G15880 vs. NCBI nr
Match: gi|1012331645|gb|KYP43110.1| (Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cajanus cajan])

HSP 1 Score: 462.6 bits (1189), Expect = 1.4e-126
Identity = 280/789 (35.49%), Positives = 426/789 (53.99%), Query Frame = 1

Query: 60  MMLALSGKNKVGFITGLIKKP-SEGNLLSAWKCNNDVIASWIINSISKEIAASLVYNGNV 119
           M+L L  KNK+ F+ G + +P + G   +AW   N ++ SWI+ S+   +  S+++    
Sbjct: 1   MLLVLGTKNKLAFVDGSLARPVTAGVDQTAWDRCNKLVISWIVQSLDTSLIPSVIWMPTA 60

Query: 120 KEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECT 179
            +IW++LK+RY Q +   I +L +++ +  QG++S+  Y+  +  +WQEL  + P+  CT
Sbjct: 61  SQIWNDLKKRYYQGDAFRISELLEEIYSLKQGNMSITHYFTTLQGLWQELDHFCPIPSCT 120

Query: 180 CEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPLPPINRVFSLIIQEERQRSIG 239
           C             F   FL GLNE YS +R+Q++L+DPLP + +VFS+I+Q+ER+    
Sbjct: 121 C-------------FTTRFLKGLNEQYSNVRSQVMLMDPLPSVQKVFSMILQQEREFHGT 180

Query: 240 SSPSIESITLMANSERR-FSSDKSKKKD---TRPICSNCGYKGHTADKCYKLHGYPPGH- 299
           +   + ++T  +N+ER  +   K+ K++      +CS+CG  GH  D CYK HG P  H 
Sbjct: 181 NDNQVLAVT--SNNERNNYKGSKTFKRNKDYNTKVCSHCGRIGHLVDSCYKKHGPPLQHK 240

Query: 300 --RLANS-----------NNSVHQRQDNTIQNGNDKVTE-----VSKRNQSAFFASL--- 359
             R+ N            + SVH ++  +  +GN    E     ++   QS    S    
Sbjct: 241 HGRIVNQYQSVSDEDTDDDQSVHSQRVVSHNSGNMFTPEQHQALLALLQQSGMILSFPYS 300

Query: 360 -NSD----------------QYTQLLGMLQTHLNTLQNGENFKNETTHIAVLK------D 419
            NSD                 + Q    ++  L  L NG       +   +L       D
Sbjct: 301 HNSDCWIIDTGATDHVCKNINFFQSYRRIKPILIQLPNGSQVSTCVSGTVLLSKTCYLTD 360

Query: 420 VLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNE 479
           VL+IP+F +NL+SVS L K     ++F+DS+C IQ    ++ IG AEL  GLY + + + 
Sbjct: 361 VLFIPNFHFNLISVSKLAKTLSCTLTFSDSDCQIQANHSMRMIGAAELRAGLYAM-VSSP 420

Query: 480 RVNCIQH-TALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQR 539
             N + H T+      + +WH R+GH S  +++ L               C IC LAK +
Sbjct: 421 ESNVVHHCTSHFFTYQSDLWHLRLGHLSHDKLSALKGSYPEIQCNKISLPCEICHLAKHK 480

Query: 540 RLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDIL 599
           RL FP     +EN+FDLIH DIWGP    +  GH YF TIV DKSR+TW++ +++K +  
Sbjct: 481 RLPFPDSLTKSENVFDLIHVDIWGPLSVASIFGHKYFLTIVDDKSRFTWIFFMKNKFETK 540

Query: 600 QVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKH 659
            ++  F+  ++TQF + IK  R+DN  E   +D +AK G  HQ SC  TPQQN VVERKH
Sbjct: 541 LLLQNFVSFVQTQFQQNIKTIRTDNGSEFLLKDWYAKLGIVHQTSCVNTPQQNGVVERKH 600

Query: 660 QHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIK 719
           QH+L++ARALMFQS V  +FW   +  A +LINR P   L  N+P+  L+ +K D++ +K
Sbjct: 601 QHILSMARALMFQSNVSKMFWNYAIGHAVHLINRLPTRFLQQNSPYYVLYSEKPDFSHLK 660

Query: 720 TFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELF 779
            FGCLA+AST S NR+K +PR++ C+F+G+  G KG+ +YD+  R+ FISRDV F+E +F
Sbjct: 661 VFGCLAFASTLSHNRTKLEPRSRKCMFLGYSSGTKGFIMYDLKTRETFISRDVQFYENIF 720

Query: 780 PFHSIKEKDILI-SHDFLEQFVIPCPLFDCLEKEDSIDARPTTEDSPEDSHGVDDQNPHI 797
           P     +KD  I S D     +   PL  C    D I +  T ++  E  H  +     +
Sbjct: 721 PL----QKDFSIQSTDGPVVPIAQMPLTSC----DPIPSH-THDNLDETEHEHNSSTLPM 764

BLAST of CSPI02G15880 vs. NCBI nr
Match: gi|5306246|gb|AAD41979.1| (putative retroelement pol polyprotein [Arabidopsis thaliana])

HSP 1 Score: 439.5 bits (1129), Expect = 1.3e-119
Identity = 270/808 (33.42%), Positives = 419/808 (51.86%), Query Frame = 1

Query: 6   SSTSGSSNYSVSITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALS 65
           S T+G+S   VS  S+D     +PF LH +  P  N++S  L    NY  W+ AM+++  
Sbjct: 43  SKTTGASRVIVSPESTD--PTQSPFFLHSADHPCLNIISHRL-DETNYGDWNVAMLISC- 102

Query: 66  GKNKVGFITGLIKKPSEGNLLSAWKCNNDVIASWIINSISKEIAASLVYNGNVKEIWDEL 125
                                      N ++ SW++NS+S +I  S++   +V +IW +L
Sbjct: 103 ---------------------------NTMVKSWLLNSMSPQIYRSILRMNDVSDIWRDL 162

Query: 126 KERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDE-CTCEGSKK 185
             R+  +N P  Y L +++    Q +LS+  YY ++ T+W +L     +D+ CTC  + +
Sbjct: 163 NSRFNMTNLPRTYNLTQEIQDLRQSTLSLSEYYTRLKTLWDQLNSTEELDDPCTCGKALR 222

Query: 186 MIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPLPPINRVFSLIIQEERQRSIGSSPS-- 245
           +        ++ FL GLNESY+ IR Q++    LP +  V+ ++ Q+  Q+   +  +  
Sbjct: 223 LQQKAERAKIVKFLAGLNESYAIIRRQVIAKKILPSLAEVYHIVDQDNSQQGFSNVVAPP 282

Query: 246 ----IESITLMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANS 305
               +  +T+    +      ++     RP+CS     GH A++CYK HG+PPG      
Sbjct: 283 VAFQVSEVTVANIIDPTICYVQNCPNKGRPMCSFYNRVGHIAERCYKKHGFPPGF----- 342

Query: 306 NNSVHQRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTHLNTLQNGENFK 365
                           DKV + +++ +S                +    L T       K
Sbjct: 343 -------------TPKDKVGDKTQKPKSV---------------VANVALATT------K 402

Query: 366 NETTHIAVLK-DVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAEL 425
           ++ TH  +   +VL+IP+F+ NL+S+S+L  D    + F      IQD    + +G    
Sbjct: 403 SDDTHSGLESLNVLFIPEFRLNLISISSLTDDIGSRVIFDQHAFEIQDLIKGRMLGHGRR 462

Query: 426 TNGLYLLRMKNERVNCIQHTALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNC-K 485
              LY++ +++  V+      +   V  S WH R+ H S+ R++ +++ +  +   N   
Sbjct: 463 VANLYVMDVEDTNVS------VNAVVDISTWHNRLRHASLQRLDVISESLGTTKHKNKGS 522

Query: 486 EVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYT 545
           + CH+C LAK R+LSFP  NN+   IF+++H DIWGPF   T  G+ YF TIV D SR T
Sbjct: 523 DYCHVCHLAKHRKLSFPSQNNVCNEIFEMLHIDIWGPFSVETVDGYQYFLTIVDDHSRAT 582

Query: 546 WVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAY 605
           W+YLL+ KS++L +   F++ +E Q+   +K  RSDNAPEL F  L+ + G     SC  
Sbjct: 583 WIYLLKTKSEVLTIFHDFIQQVENQYKVKVKAVRSDNAPELRFTSLYQRKGIMAFHSCPE 642

Query: 606 TPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAA 665
           TP+QNSVVERKHQH+LNVARALMFQS+VPL  WGECVL+A +LINRTP  LLSN TP+  
Sbjct: 643 TPEQNSVVERKHQHILNVARALMFQSQVPLFLWGECVLTAVFLINRTPSQLLSNKTPYEI 702

Query: 666 LFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFF 725
           L      Y  ++TFGCL Y+ST    R KF PR++ C+F+G+  G KGY+L D+     F
Sbjct: 703 LSGTAPQYGQLRTFGCLCYSSTSPKQRHKFQPRSKACIFLGYSSGYKGYKLMDLESNAIF 759

Query: 726 ISRDVLFFEELFPFHSIKEKDILISHDFLEQFVIPCPLFDCLEKEDSIDARPTTEDSPED 785
           ISR+V+F EE+FP    K+     S D L+       +   ++++ S  +          
Sbjct: 763 ISRNVVFLEEVFPLAGTKK-----SADSLKLCTPLVLVPSGIQQQSSFSSL--------- 759

Query: 786 SHGVDDQNPHISNSEETKNPPDHTTHHL 805
           S  + D  P IS S+  + PP H + ++
Sbjct: 823 SSQISDLPPQIS-SQRDRKPPTHLSDYV 759
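
The Identity and Positives percentages in the HSP header lines above are simply the match count divided by the alignment length. As a minimal sketch for recomputing them, assuming the header layout shown on this page (the function name `check_hsp_header` and the tolerance value are illustrative and not part of any BLAST tool):

```python
import re

# Matches the "Identity" / "Positives" fields of an HSP header line.
# The optional "i" also accepts the "Postives" spelling that some
# BLAST result renderers emit.
HSP_FIELD = re.compile(r"(Identity|Posi?tives) = (\d+)/(\d+) \((\d+\.\d+)%\)")


def check_hsp_header(line, tol=0.005):
    """Recompute the percentages in one HSP header line.

    Returns {field: (matches, length, recomputed_pct, consistent)}, where
    `consistent` is True when the recomputed value agrees with the reported
    one to within rounding at two decimal places.
    """
    result = {}
    for name, hits, length, reported in HSP_FIELD.findall(line):
        hits, length = int(hits), int(length)
        pct = 100.0 * hits / length
        key = "Identity" if name == "Identity" else "Positives"
        consistent = abs(pct - float(reported)) <= tol + 1e-9
        result[key] = (hits, length, round(pct, 2), consistent)
    return result


if __name__ == "__main__":
    header = "Identity = 285/799 (35.67%), Positives = 437/799 (54.69%), Query Frame = 1"
    for field, values in check_hsp_header(header).items():
        print(field, values)
```

Applied to the four HSP headers on this page, a check of this kind can flag a header whose counts and percentage were damaged during copying.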

The following BLAST results are available for this feature:
Match Name	E-value	Identity	Description
POLX_TOBAC	2.4e-54	32.51	Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum...
COPIA_DROME	6.6e-36	29.10	Copia protein OS=Drosophila melanogaster GN=GIP PE=1 SV=3

Match Name	E-value	Identity	Description
O04543_ARATH	2.2e-134	38.37	F20P5.25 protein OS=Arabidopsis thaliana GN=F20P5.25 PE=4 SV=1
A0A151QVL3_CAJCA	8.0e-129	37.74	Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cajanus cajan GN=...
A0A151RKM5_CAJCA	9.7e-127	35.49	Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cajanus cajan GN=...
Q9XIL8_ARATH	8.8e-120	33.42	Putative retroelement pol polyprotein OS=Arabidopsis thaliana GN=At2g15870 PE=4 ...
A0A151R1J9_CAJCA	1.2e-119	33.37	Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Cajanus cajan GN=...

Match Name	E-value	Identity	Description
AT1G21280.1	4.4e-22	29.65	Retrotransposon gag protein (InterPro:IPR005162)

Match Name	E-value	Identity	Description
gi|2194136|gb|AAB61111.1|	3.1e-134	38.37	Strong similarity to Zea mays retrotransposon Hopscotch polyprotein (gb|U12626) ...
gi|727473974|ref|XP_010412743.1|	7.6e-133	35.67	PREDICTED: uncharacterized protein LOC104699096 [Camelina sativa]
gi|1012321879|gb|KYP34293.1|	1.1e-128	37.74	Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cajanus cajan]
gi|1012331645|gb|KYP43110.1|	1.4e-126	35.49	Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cajanus cajan]
gi|5306246|gb|AAD41979.1|	1.3e-119	33.42	putative retroelement pol polyprotein [Arabidopsis thaliana]

The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term	Definition
IPR001584	Integrase_cat-core
IPR001878	Znf_CCHC
IPR005162	Retrotrans_gag_dom
IPR012337	RNaseH-like_sf
IPR025724	GAG-pre-integrase_dom
Vocabulary: Biological Process
Term	Definition
GO:0015074	DNA integration
Vocabulary: Molecular Function
Term	Definition
GO:0003676	nucleic acid binding
GO:0008270	zinc ion binding
GO Assignments
This gene is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0015074 DNA integration
cellular_component GO:0005575 cellular_component
molecular_function GO:0003676 nucleic acid binding
molecular_function GO:0008270 zinc ion binding

The following mRNA feature(s) are a part of this gene:

Feature Name	Unique Name	Type
CSPI02G15880.1	CSPI02G15880.1	mRNA


Analysis Name: InterPro Annotations of cucumber (PI183967)
Date Performed: 2017-01-17
IPR Term	IPR Description	Source	Source Term	Source Description	Alignment
IPR001584	Integrase, catalytic core	PFAM	PF00665	rve	coord: 504..612, score: 1.1
IPR001584	Integrase, catalytic core	PROFILE	PS50994	INTEGRASE	coord: 489..661, score: 15
IPR001878	Zinc finger, CCHC-type	GENE3D	G3DSA:4.10.60.10		coord: 261..285, score: 4.
IPR001878	Zinc finger, CCHC-type	PROFILE	PS50158	ZF_CCHC	coord: 270..283, score: 8
IPR005162	Retrotransposon gag domain	PFAM	PF03732	Retrotrans_gag	coord: 94..203, score: 2.2
IPR012337	Ribonuclease H-like domain	GENE3D	G3DSA:3.30.420.10		coord: 499..653, score: 6.4
IPR012337	Ribonuclease H-like domain	unknown	SSF53098	Ribonuclease H-like	coord: 498..655, score: 4.7
IPR025724	GAG-pre-integrase domain	PFAM	PF13976	gag_pre-integrs	coord: 420..487, score: 1.3
None	No IPR available	PANTHER	PTHR11439	GAG-POL-RELATED RETROTRANSPOSON	coord: 758..797, score: 9.8E-160; coord: 9..317, score: 9.8E-160; coord: 366..733, score: 9.8E
None	No IPR available	PANTHER	PTHR11439:SF185	SUBFAMILY NOT NAMED	coord: 9..317, score: 9.8E-160; coord: 366..733, score: 9.8E-160; coord: 758..797, score: 9.8E

The following gene(s) are orthologous to this gene:

None

The following gene(s) are paralogous to this gene:

None