CSPI02G15660 (gene) Wild cucumber (PI 183967)

Name: CSPI02G15660
Type: gene
Organism: Cucumis sativus (Wild cucumber (PI 183967))
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Location: Chr2 : 15198349 .. 15202126 (-)
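
The location string can be split into chromosome, start, end and strand, and the genomic span derived from it. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (as in GFF3); the function names `parse_location` and `span_length` are illustrative and not part of any database API.

```python
import re

# Pattern for strings such as "Chr2 : 15198349 .. 15202126 (-)".
LOCATION = re.compile(r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*")


def parse_location(text):
    """Return (chromosome, start, end, strand) from a location string."""
    match = LOCATION.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized location: {text!r}")
    chrom, start, end, strand = match.groups()
    return chrom, int(start), int(end), strand


def span_length(start, end):
    """Length of the span, assuming 1-based inclusive coordinates."""
    return end - start + 1


chrom, start, end, strand = parse_location("Chr2 : 15198349 .. 15202126 (-)")
print(chrom, strand, span_length(start, end))
```

Under that assumption the feature spans 3778 bp on the minus strand of Chr2.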
The following sequences are available for this feature:

Gene sequence (with intron)

ATGAATAGCTCAATAGTTCAACTTTTAGCTTCCGAAAAACTTAATGGCGATAATTATGCGGCTTGGAAATCAAATCTTAACACAATACTAGTGGTTGACGATTTAAGATTTGTCTTAACTGAGGAATGTCCTCAAAACCCTGCCTCTAATGCTAACCGAACTAGTCGGGATGCATATGATCGATGGATAAAAGCTAATGAAAAAGCCCGTGTCTACATTCTTGCCAGCATGTCTGATGTATTGGCAAAGAAACATGAATCCTTAGCCACGGCTAAAGAGATTATGGATTCATTAAGGGGAATGTTTGGGCAACCAGAATGGTCCTTAAGACACGAGGCAGTCAAATACATTTACACTAAGCGTATGAAGGAAGGGACCTCTGTTAGAGAACATGTTCTGGACATGATGATGCACTTCAACATCGCTGAAGTGAATGGTGGTCCCATCGAAGAGGTTAATCAAGTTAGTTTTATCTTAGAGTCTCTTCCGAAGAGCTTCATTCCATTCCAAACGAATGCGTCTTTGAACAAGATAGAATTTAACCTGACAACCCTTCTGAATGAACTCCAGCGATTCCAAAACCTAACTATGGGTAAAGGAAAACAAGTGGAAGCAAATGTTGCTACCACAAAAAGAAAATTTATAAGAGGATCGTCCTCTAAAACCAAAGCTGGACCCTCAAAACCTAATGCTCAAATAAAAAAGAAGGGAAAGGGAAAGACTCCCAAACAGAACAAGGGTAAGAAAGCTGCAGAAAAAGGTAAGTGTTACCATTGTGGCCAAAACGGGCACTGGTTAAGAAACTGCCCAAAATACCTTGCAGAAAAAAAAGGCAGAGAAGGAAACACAAGGTAAATATGATTTACTAGTTGTAGAAACATGTTTAGTGGAATATGAAAATTCTACCTGGATATTAGATTCAGGAGCCACTAACCATATTTGCTTCTCATTTCAGGAAAATAGTTCTTGGAAAAAGCTTTCAGAAGGCGAGATCACTCTCAAGGTTGGAACAGGAGAGATGGTCTCAGCTTCAGCAGTGGGAGATTTAAAGTTGTTTTTTAGAGATAGATATGTCATACTTAAGAATGTCTTATATGTACCTCAAATGAAAAGAAATTTAATATCTATCTCTTGTATTTTGGAACAAATGTATAGAATATCTTTTGAAATTAATGAAGCGTTCGTTTTCTATAAAGGTATTCTAGTTTGTTCTGCTATACTTGAAGACAACTTATATAAGTTAAGACCAACTAGAGCAAATTTTGTCTTAAATACTGAAATATTCAGAACAGCTGAAACTCAGAATAAAAGACAAAAAGTTTCTTCTAATGCCTATTTATGGCACTTAAGACTTGGTCACATAAATCTCAATAGGATTGGGAGATTGGTTAAAAGTGGGCTTCTAAGTCCGTTAGAAGATAACTCTTTACCTCCTTGTGAATCTTGTCTTGAAGGAAAAATGACCAAGAGATCTTTTACTGGAAAAGGTCTAAGAGCCAAAGGACCCTTAGAGCTCGTACATTCGGACCTTTGTGGACCAATGAATGTCAAAGCTCGAGGTGGATATGAATATTTCATTAGCTTCATTGATGATTATTCAAGGTATGGTCATATTTACCTAATACATCATAAGTCTAATAGTCTTGAAAAGTTCAAAGAATATAAGGCTGAAGTAGAAAACGAATTAGGTAAAACAATAAAAATACTTCGATCAGATCGAGGTGGAGAGTATATGGACTTACGATTCCGAGACTATTTAATAGAAAATGGAATCCAGTCACAACTCTCTGCACCTAGTACACCTCAACAGAACGGTGTATCAGAAAGAAGAAACCGGACCTTGTTAGACATGGTTCGCTCTATGATAAGTTTTTCTCAGATGTCAGATTCTTTTTGGGGATATGCTTTAGAAACAGCTGCTTATATTTTGAATAATGTTCCCTCTAAAAGTGTTTCAGAAACACCTTATGAGCTATGGAAAGGGCGTAAAGGAAGTTTAC
GTCATTTTAGAATTTGGGGTTGTCCAGCACACGTGTTGGTACAAAATCCAAAGAAATTGGAACATCGTTCAAAATTATGCTTTTTTATAGGTTATCCAAAAGAATCAAGAGGTGGTTTGTTTTATGATCCTCAAGAAAATAAAATATTTGTGTCAACAAATGCCACATTCTTAGAGGAAGACCACATCAGGGATCATCAACCTCGTAGTAAACTAGTATTAAAAGAAATTTCCAAAAGTGCTATAGATAAACCTAGTTCATCCACTAAGGTAGTTGATAAGACTAGGAAATCTGGTCAATCACATCCTTCTCAACAGTTGAGAGAGCCTCGACGTAGTGGGAGGGTTGTTCATCAGCCTGATCGCTATTTGGGTTTAATTGAAACTCAAGTCGTCATACCTGACGATGGCATAGAGGATCCATTAACCTATAAACAGGCAATGAAAGATGTAGATCGTGACCAATGGATCAAAGCCATGGACCTCGAAATGGAGTCTATGTACTTTAATTCTGTCTGGACTCTAGTAGATCAACCAAATGACGTAAAACCTATTGGTTGTAAATGGATCTACAAGAGAAAACGAGACCATGCCGGTAAAGTACAGACTTTCAAGGCTCGACTTGTGGCAAAGGGTTATACCCAGAGAGAGGGAGTAGACTATGAGGAAACTTTCTCTCCTGTTGCCATGTTAAAGTCAATTAGAATACTCTTATCCATCGCCACTTTTTATGATTATGAAATTTGGCAGATGGATGTCAAGACAGCTTTTTTGAATGGTAATCTTGAAGAGAGTATCTATATGTCTCAACCAGAGGGGTTTATAGAACAAGATCAAGAACAAAAGGTTTGTAAGCTTAAAAAATCCATTTATGGATTAAAACAAGCTTCTAGATCCTGGAATATAAGATTTGATACTGCGATCAAATCTTATGGCTTTGAACAAAATGTTGACGAGCCTTGTGTTTACAAAAAGGTCGTCAATTTCATTATAGCATTTTTAGTCTTATATGTAGATGATGTTCTACTTATTGGAAATGACGTAGGATATCTTACTGATATCAAGAAATGGCTAGCTATGCAATTTCAAAAAAGATCTGGGAGATGCACAATACGTTCTCGGAATCCAAATTGTTCGAAACCGTAAGAACAAAACACTAGCCATGTCTCAAGCATCTTACATAGACAAAATGTTGTCTAGATATAAAATGCAGAATTCCAAAAAGGGTCTGCTGCCGTACAGATATGGAATTCATTTGTCAAAGGAACAATGTCCTAAGACACCTCAAGAAGTTGAGGATATGAGAAATATTCCCTATGCTTCCGCTGTTGGAAGTTTAATGTATGCAATGTTATGTACTAGACCTGACATTTGCTACTCAGTAGGGATGGTCAGTAGGTATCAATCCAATCCTGGACGTGATCACTGGACAGCCGTTAAAAACATTCTAAAATATCTTCGAAGAACAAAAGACTACATGCTCATGTATGGTACAAAGGATCTGATCCTTACTGGATACACTGATTCAGATTTCCAAACTGATAAAGATGCTAGAAAGTCTACATCAGGATCAGTATTTACTCTAAATGGAGGAGCAGTAGTTTGGAGAAGCATAAAGCAAACTTGTATAGCTGATTCCACAATGGAAGCTGAATACGTAGCGGCTTGTGAAGCAGCAAAAGAAGCAGTATGGCTAAGAAAATTCTTGACAGATTTGGAAGTCGTTCCAAATATGCATCTACCAATCACTTTATACTGTGACAACAGTGGTGCAGTTGA

mRNA sequence

ATGAATAGCTCAATAGTTCAACTTTTAGCTTCCGAAAAACTTAATGGCGATAATTATGCGGCTTGGAAATCAAATCTTAACACAATACTAGTGGTTGACGATTTAAGATTTGTCTTAACTGAGGAATGTCCTCAAAACCCTGCCTCTAATGCTAACCGAACTAGTCGGGATGCATATGATCGATGGATAAAAGCTAATGAAAAAGCCCGTGTCTACATTCTTGCCAGCATGTCTGATGTATTGGCAAAGAAACATGAATCCTTAGCCACGGCTAAAGAGATTATGGATTCATTAAGGGGAATGTTTGGGCAACCAGAATGGTCCTTAAGACACGAGGCAGTCAAATACATTTACACTAAGCGTATGAAGGAAGGGACCTCTGTTAGAGAACATGTTCTGGACATGATGATGCACTTCAACATCGCTGAAGTGAATGGTGGTCCCATCGAAGAGGTTAATCAAGTTAGTTTTATCTTAGAGTCTCTTCCGAAGAGCTTCATTCCATTCCAAACGAATGCGTCTTTGAACAAGATAGAATTTAACCTGACAACCCTTCTGAATGAACTCCAGCGATTCCAAAACCTAACTATGGGTAAAGGAAAACAAGTGGAAGCAAATGTTGCTACCACAAAAAGAAAATTTATAAGAGGATCGTCCTCTAAAACCAAAGCTGGACCCTCAAAACCTAATGCTCAAATAAAAAAGAAGGGAAAGGGAAAGACTCCCAAACAGAACAAGGGTAAGAAAGCTGCAGAAAAAGGTAAGTGTTACCATTGTGGCCAAAACGGGCACTGGTTAAGAAACTGCCCAAAATACCTTGCAGAAAAAAAAGGCAGAGAAGGAAACACAAGGAAAATAGTTCTTGGAAAAAGCTTTCAGAAGGCGAGATCACTCTCAAGGTTGGAACAGGAGAGATGGTCTCAGCTTCAGCAAACAGCTGAAACTCAGAATAAAAGACAAAAAGTTTCTTCTAATGCCTATTTATGGCACTTAAGACTTGGTCACATAAATCTCAATAGGATTGGGAGATTGGTTAAAAGTGGGCTTCTAAGTCCGTTAGAAGATAACTCTTTACCTCCTTGTGAATCTTGTCTTGAAGGAAAAATGACCAAGAGATCTTTTACTGGAAAAGGTCTAAGAGCCAAAGGACCCTTAGAGCTCGTACATTCGGACCTTTGTGGACCAATGAATGTCAAAGCTCGAGGTGGATATGAATATTTCATTAGCTTCATTGATGATTATTCAAGGTATGGTCATATTTACCTAATACATCATAAGTCTAATAGTCTTGAAAAGTTCAAAGAATATAAGGCTGAAGTAGAAAACGAATTAGGTAAAACAATAAAAATACTTCGATCAGATCGAGGTGGAGAGTATATGGACTTACGATTCCGAGACTATTTAATAGAAAATGGAATCCAGTCACAACTCTCTGCACCTAGTACACCTCAACAGAACGGTGTATCAGAAAGAAGAAACCGGACCTTGTTAGACATGGTTCGCTCTATGATAAGTTTTTCTCAGATGTCAGATTCTTTTTGGGGATATGCTTTAGAAACAGCTGCTTATATTTTGAATAATGTTCCCTCTAAAAGTGTTTCAGAAACACCTTATGAGCTATGGAAAGGGCGTAAAGGAAGTTTACGTCATTTTAGAATTTGGGGTTGTCCAGCACACGTGTTGGTACAAAATCCAAAGAAATTGGAACATCGTTCAAAATTATGCTTTTTTATAGGTTATCCAAAAGAATCAAGAGGTGGTTTGTTTTATGATCCTCAAGAAAATAAAATATTTGTGTCAACAAATGCCACATTCTTAGAGGAAGACCACATCAGGGATCATCAACCTCGTAGTAAACTAGTATTAAAAGAAATTTCCAAAAGTGCTATAGATAAACCTAGTTCATCCACTAAGGTAGTTGATAAGACTAGGAAATCTGGTCAATCACATCCTTCTCAACAGTTGAGAGAGCCTCGACGTAGTGGGAGGGTTGTTCATCA
GCCTGATCGCTATTTGGGTTTAATTGAAACTCAAGTCGTCATACCTGACGATGGCATAGAGGATCCATTAACCTATAAACAGGCAATGAAAGATGTAGATCGTGACCAATGGATCAAAGCCATGGACCTCGAAATGGAGTCTATGTACTTTAATTCTGTCTGGACTCTAGTAGATCAACCAAATGACGTAAAACCTATTGGTTGTAAATGGATCTACAAGAGAAAACGAGACCATGCCGGTAAAGTACAGACTTTCAAGGCTCGACTTGTGGCAAAGGGTTATACCCAGAGAGAGGGAGTAGACTATGAGGAAACTTTCTCTCCTGTTGCCATGTTAAAGTCAATTAGAATACTCTTATCCATCGCCACTTTTTATGATTATGAAATTTGGCAGATGGATGTCAAGACAGCTTTTTTGAATGGTAATCTTGAAGAGAGTATCTATATGTCTCAACCAGAGGGGTTTATAGAACAAGATCAAGAACAAAAGGTTTGTAAGCTTAAAAAATCCATTTATGGATTAAAACAAGCTTCTAGATCCTGGAATATAAGATTTGATACTGCGATCAAATCTTATGGCTTTGAACAAAATGTTGACGAGCCTTGTGTTTACAAAAAGGTCGTCAATTTCATTATAGCATTTTTAGTCTTATATGTAGATGATGTTCTACTTATTGGAAATGACGTAGGATATCTTACTGATATCAAGAAATGGCTAGCTATGCAATTTCAAAAAAGATCTGGGAGATGCACAATACGTTCTCGGAATCCAAATTGTTCGAAACCATATGGAATTCATTTGTCAAAGGAACAATGTCCTAAGACACCTCAAGAAGTTGAGGATATGAGAAATATTCCCTATGCTTCCGCTGTTGGAAGTTTAATGTATGCAATGTTATGTACTAGACCTGACATTTGCTACTCAGTAGGGATGGTCAGTAGGTATCAATCCAATCCTGGACGTGATCACTGGACAGCCGTTAAAAACATTCTAAAATATCTTCGAAGAACAAAAGACTACATGCTCATGTATGGTACAAAGGATCTGATCCTTACTGGATACACTGATTCAGATTTCCAAACTGATAAAGATGCTAGAAAGTCTACATCAGGATCAGTATTTACTCTAAATGGAGGAGCAGTAGTTTGGAGAAGCATAAAGCAAACTTGTATAGCTGATTCCACAATGGAAGCTGAATACGTAGCGGCTTGTGAAGCAGCAAAAGAAGCATGGTGCAGTTGA

Coding sequence (CDS)

ATGAATAGCTCAATAGTTCAACTTTTAGCTTCCGAAAAACTTAATGGCGATAATTATGCGGCTTGGAAATCAAATCTTAACACAATACTAGTGGTTGACGATTTAAGATTTGTCTTAACTGAGGAATGTCCTCAAAACCCTGCCTCTAATGCTAACCGAACTAGTCGGGATGCATATGATCGATGGATAAAAGCTAATGAAAAAGCCCGTGTCTACATTCTTGCCAGCATGTCTGATGTATTGGCAAAGAAACATGAATCCTTAGCCACGGCTAAAGAGATTATGGATTCATTAAGGGGAATGTTTGGGCAACCAGAATGGTCCTTAAGACACGAGGCAGTCAAATACATTTACACTAAGCGTATGAAGGAAGGGACCTCTGTTAGAGAACATGTTCTGGACATGATGATGCACTTCAACATCGCTGAAGTGAATGGTGGTCCCATCGAAGAGGTTAATCAAGTTAGTTTTATCTTAGAGTCTCTTCCGAAGAGCTTCATTCCATTCCAAACGAATGCGTCTTTGAACAAGATAGAATTTAACCTGACAACCCTTCTGAATGAACTCCAGCGATTCCAAAACCTAACTATGGGTAAAGGAAAACAAGTGGAAGCAAATGTTGCTACCACAAAAAGAAAATTTATAAGAGGATCGTCCTCTAAAACCAAAGCTGGACCCTCAAAACCTAATGCTCAAATAAAAAAGAAGGGAAAGGGAAAGACTCCCAAACAGAACAAGGGTAAGAAAGCTGCAGAAAAAGGTAAGTGTTACCATTGTGGCCAAAACGGGCACTGGTTAAGAAACTGCCCAAAATACCTTGCAGAAAAAAAAGGCAGAGAAGGAAACACAAGGAAAATAGTTCTTGGAAAAAGCTTTCAGAAGGCGAGATCACTCTCAAGGTTGGAACAGGAGAGATGGTCTCAGCTTCAGCAAACAGCTGAAACTCAGAATAAAAGACAAAAAGTTTCTTCTAATGCCTATTTATGGCACTTAAGACTTGGTCACATAAATCTCAATAGGATTGGGAGATTGGTTAAAAGTGGGCTTCTAAGTCCGTTAGAAGATAACTCTTTACCTCCTTGTGAATCTTGTCTTGAAGGAAAAATGACCAAGAGATCTTTTACTGGAAAAGGTCTAAGAGCCAAAGGACCCTTAGAGCTCGTACATTCGGACCTTTGTGGACCAATGAATGTCAAAGCTCGAGGTGGATATGAATATTTCATTAGCTTCATTGATGATTATTCAAGGTATGGTCATATTTACCTAATACATCATAAGTCTAATAGTCTTGAAAAGTTCAAAGAATATAAGGCTGAAGTAGAAAACGAATTAGGTAAAACAATAAAAATACTTCGATCAGATCGAGGTGGAGAGTATATGGACTTACGATTCCGAGACTATTTAATAGAAAATGGAATCCAGTCACAACTCTCTGCACCTAGTACACCTCAACAGAACGGTGTATCAGAAAGAAGAAACCGGACCTTGTTAGACATGGTTCGCTCTATGATAAGTTTTTCTCAGATGTCAGATTCTTTTTGGGGATATGCTTTAGAAACAGCTGCTTATATTTTGAATAATGTTCCCTCTAAAAGTGTTTCAGAAACACCTTATGAGCTATGGAAAGGGCGTAAAGGAAGTTTACGTCATTTTAGAATTTGGGGTTGTCCAGCACACGTGTTGGTACAAAATCCAAAGAAATTGGAACATCGTTCAAAATTATGCTTTTTTATAGGTTATCCAAAAGAATCAAGAGGTGGTTTGTTTTATGATCCTCAAGAAAATAAAATATTTGTGTCAACAAATGCCACATTCTTAGAGGAAGACCACATCAGGGATCATCAACCTCGTAGTAAACTAGTATTAAAAGAAATTTCCAAAAGTGCTATAGATAAACCTAGTTCATCCACTAAGGTAGTTGATAAGACTAGGAAATCTGGTCAATCACATCCTTCTCAACAGTTGAGAGAGCCTCGACGTAGTGGGAGGGTTGTTCATCA
GCCTGATCGCTATTTGGGTTTAATTGAAACTCAAGTCGTCATACCTGACGATGGCATAGAGGATCCATTAACCTATAAACAGGCAATGAAAGATGTAGATCGTGACCAATGGATCAAAGCCATGGACCTCGAAATGGAGTCTATGTACTTTAATTCTGTCTGGACTCTAGTAGATCAACCAAATGACGTAAAACCTATTGGTTGTAAATGGATCTACAAGAGAAAACGAGACCATGCCGGTAAAGTACAGACTTTCAAGGCTCGACTTGTGGCAAAGGGTTATACCCAGAGAGAGGGAGTAGACTATGAGGAAACTTTCTCTCCTGTTGCCATGTTAAAGTCAATTAGAATACTCTTATCCATCGCCACTTTTTATGATTATGAAATTTGGCAGATGGATGTCAAGACAGCTTTTTTGAATGGTAATCTTGAAGAGAGTATCTATATGTCTCAACCAGAGGGGTTTATAGAACAAGATCAAGAACAAAAGGTTTGTAAGCTTAAAAAATCCATTTATGGATTAAAACAAGCTTCTAGATCCTGGAATATAAGATTTGATACTGCGATCAAATCTTATGGCTTTGAACAAAATGTTGACGAGCCTTGTGTTTACAAAAAGGTCGTCAATTTCATTATAGCATTTTTAGTCTTATATGTAGATGATGTTCTACTTATTGGAAATGACGTAGGATATCTTACTGATATCAAGAAATGGCTAGCTATGCAATTTCAAAAAAGATCTGGGAGATGCACAATACGTTCTCGGAATCCAAATTGTTCGAAACCATATGGAATTCATTTGTCAAAGGAACAATGTCCTAAGACACCTCAAGAAGTTGAGGATATGAGAAATATTCCCTATGCTTCCGCTGTTGGAAGTTTAATGTATGCAATGTTATGTACTAGACCTGACATTTGCTACTCAGTAGGGATGGTCAGTAGGTATCAATCCAATCCTGGACGTGATCACTGGACAGCCGTTAAAAACATTCTAAAATATCTTCGAAGAACAAAAGACTACATGCTCATGTATGGTACAAAGGATCTGATCCTTACTGGATACACTGATTCAGATTTCCAAACTGATAAAGATGCTAGAAAGTCTACATCAGGATCAGTATTTACTCTAAATGGAGGAGCAGTAGTTTGGAGAAGCATAAAGCAAACTTGTATAGCTGATTCCACAATGGAAGCTGAATACGTAGCGGCTTGTGAAGCAGCAAAAGAAGCATGGTGCAGTTGA
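
The BLAST reports on this page use the translated CDS as the query (Query Frame = 1). As a minimal sketch, assuming the standard nuclear genetic code, the first codons of the CDS can be translated as follows; `CODON_TABLE` and `translate` are illustrative names, and only a short prefix of the sequence is used.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a DNA coding sequence in frame 1; '*' marks a stop codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# First 42 nucleotides of the CDS above.
cds_prefix = "ATGAATAGCTCAATAGTTCAACTTTTAGCTTCCGAAAAACTT"
print(translate(cds_prefix))  # MNSSIVQLLASEKL
```

The resulting peptide, MNSSIVQLLASEKL, agrees with the start of the query in the alignments, where query position 2 begins NSSIVQLLASEKL.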
BLAST of CSPI02G15660 vs. Swiss-Prot
Match: POLX_TOBAC (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum PE=2 SV=1)

HSP 1 Score: 578.6 bits (1490), Expect = 1.5e-163
Identity = 327/817 (40.02%), Positives = 477/817 (58.38%), Query Frame = 1

Query: 328  LWHLRLGHINLNRIGRLVKSGLLSPLEDNSLPPCESCLEGKMTKRSFTGKGLRAKGPLEL 387
            LWH R+GH++   +  L K  L+S  +  ++ PC+ CL GK  + SF     R    L+L
Sbjct: 424  LWHKRMGHMSEKGLQILAKKSLISYAKGTTVKPCDYCLFGKQHRVSFQTSSERKLNILDL 483

Query: 388  VHSDLCGPMNVKARGGYEYFISFIDDYSRYGHIYLIHHKSNSLEKFKEYKAEVENELGKT 447
            V+SD+CGPM +++ GG +YF++FIDD SR   +Y++  K    + F+++ A VE E G+ 
Sbjct: 484  VYSDVCGPMEIESMGGNKYFVTFIDDASRKLWVYILKTKDQVFQVFQKFHALVERETGRK 543

Query: 448  IKILRSDRGGEYMDLRFRDYLIENGIQSQLSAPSTPQQNGVSERRNRTLLDMVRSMISFS 507
            +K LRSD GGEY    F +Y   +GI+ + + P TPQ NGV+ER NRT+++ VRSM+  +
Sbjct: 544  LKRLRSDNGGEYTSREFEEYCSSHGIRHEKTVPGTPQHNGVAERMNRTIVEKVRSMLRMA 603

Query: 508  QMSDSFWGYALETAAYILNNVPSKSVS-ETPYELWKGRKGSLRHFRIWGCPA--HVLVQN 567
            ++  SFWG A++TA Y++N  PS  ++ E P  +W  ++ S  H +++GC A  HV  + 
Sbjct: 604  KLPKSFWGEAVQTACYLINRSPSVPLAFEIPERVWTNKEVSYSHLKVFGCRAFAHVPKEQ 663

Query: 568  PKKLEHRSKLCFFIGYPKESRGGLFYDPQENKIFVSTNATFLEEDHIRDHQPRSKLVLKE 627
              KL+ +S  C FIGY  E  G   +DP + K+  S +  F E + +R     S+ V   
Sbjct: 664  RTKLDDKSIPCIFIGYGDEEFGYRLWDPVKKKVIRSRDVVFRESE-VRTAADMSEKVKNG 723

Query: 628  ISKSAI------DKPSSSTKVVDKTRKSGQS-------------------HPSQ---QLR 687
            I  + +      + P+S+    D+  + G+                    HP+Q   Q +
Sbjct: 724  IIPNFVTIPSTSNNPTSAESTTDEVSEQGEQPGEVIEQGEQLDEGVEEVEHPTQGEEQHQ 783

Query: 688  EPRRSGRVVHQPDRYLGLIETQVVIPDDGIEDPLTYKQAMKDVDRDQWIKAMDLEMESMY 747
              RRS R   +  RY       V+I DD   +P + K+ +   +++Q +KAM  EMES+ 
Sbjct: 784  PLRRSERPRVESRRYPST--EYVLISDD--REPESLKEVLSHPEKNQLMKAMQEEMESLQ 843

Query: 748  FNSVWTLVDQPNDVKPIGCKWIYKRKRDHAGKVQTFKARLVAKGYTQREGVDYEETFSPV 807
             N  + LV+ P   +P+ CKW++K K+D   K+  +KARLV KG+ Q++G+D++E FSPV
Sbjct: 844  KNGTYKLVELPKGKRPLKCKWVFKLKKDGDCKLVRYKARLVVKGFEQKKGIDFDEIFSPV 903

Query: 808  AMLKSIRILLSIATFYDYEIWQMDVKTAFLNGNLEESIYMSQPEGFIEQDQEQKVCKLKK 867
              + SIR +LS+A   D E+ Q+DVKTAFL+G+LEE IYM QPEGF    ++  VCKL K
Sbjct: 904  VKMTSIRTILSLAASLDLEVEQLDVKTAFLHGDLEEEIYMEQPEGFEVAGKKHMVCKLNK 963

Query: 868  SIYGLKQASRSWNIRFDTAIKSYGFEQNVDEPCVYKKVV---NFIIAFLVLYVDDVLLIG 927
            S+YGLKQA R W ++FD+ +KS  + +   +PCVY K     NFII  L+LYVDD+L++G
Sbjct: 964  SLYGLKQAPRQWYMKFDSFMKSQTYLKTYSDPCVYFKRFSENNFII--LLLYVDDMLIVG 1023

Query: 928  NDVGYLTDIKKWLAMQFQKRS--------GRCTIRSR--------------------NPN 987
             D G +  +K  L+  F  +         G   +R R                    N  
Sbjct: 1024 KDKGLIAKLKGDLSKSFDMKDLGPAQQILGMKIVRERTSRKLWLSQEKYIERVLERFNMK 1083

Query: 988  CSKPYG------IHLSKEQCPKTPQEVEDMRNIPYASAVGSLMYAMLCTRPDICYSVGMV 1047
             +KP        + LSK+ CP T +E  +M  +PY+SAVGSLMYAM+CTRPDI ++VG+V
Sbjct: 1084 NAKPVSTPLAGHLKLSKKMCPTTVEEKGNMAKVPYSSAVGSLMYAMVCTRPDIAHAVGVV 1143

Query: 1048 SRYQSNPGRDHWTAVKNILKYLRRTKDYMLMYGTKDLILTGYTDSDFQTDKDARKSTSGS 1077
            SR+  NPG++HW AVK IL+YLR T    L +G  D IL GYTD+D   D D RKS++G 
Sbjct: 1144 SRFLENPGKEHWEAVKWILRYLRGTTGDCLCFGGSDPILKGYTDADMAGDIDNRKSSTGY 1203
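
The percentages in an HSP summary line are the identity and positive counts divided by the alignment length. A minimal sketch for extracting and recomputing them, assuming the line layout used in the reports on this page; `parse_hsp_stats` and `percent` are hypothetical helper names.

```python
import re

# Matches e.g. "Identity = 327/817 (40.02%), Positives = 477/817 (58.38%)".
# "Posi?tives" also accepts the "Postives" spelling seen in some renderings.
HSP_STATS = re.compile(
    r"Identity\s*=\s*(\d+)/(\d+)\s*\([\d.]+%\),\s*"
    r"Posi?tives\s*=\s*(\d+)/(\d+)\s*\([\d.]+%\)"
)


def parse_hsp_stats(line):
    """Return identity count, positive count and alignment length."""
    match = HSP_STATS.search(line)
    if match is None:
        raise ValueError(f"not an HSP summary line: {line!r}")
    identities, length, positives, _ = (int(g) for g in match.groups())
    return {"identities": identities, "positives": positives, "length": length}


def percent(count, length):
    """Percentage rounded to two decimals, as printed in the report."""
    return round(100.0 * count / length, 2)


stats = parse_hsp_stats(
    "Identity = 327/817 (40.02%), Positives = 477/817 (58.38%), Query Frame = 1"
)
print(percent(stats["identities"], stats["length"]))  # 40.02
print(percent(stats["positives"], stats["length"]))   # 58.38
```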

BLAST of CSPI02G15660 vs. Swiss-Prot
Match: COPIA_DROME (Copia protein OS=Drosophila melanogaster GN=GIP PE=1 SV=3)

HSP 1 Score: 242.7 bits (618), Expect = 1.9e-62
Identity = 166/550 (30.18%), Positives = 279/550 (50.73%), Query Frame = 1

Query: 570  HRSKLCFFIGYPKESRGGLFYDPQENKIFVSTNATFLEEDHIRD-------HQPRSKLVL 629
            + SK C  I + K+S+       + NK F++ +     +DH+ +       ++ R     
Sbjct: 782  NESKECDNIQFLKDSK-------ESNKYFLNESKKRKRDDHLNESKGSGNPNESRESETA 841

Query: 630  KEISKSAIDKPSSSTKVVDKTRKSGQSHPSQQLREPRRSGRVVHQPDRYLG--LIETQVV 689
            + + +  ID P+ +  +    R+S +     Q+          ++ D  L   ++    +
Sbjct: 842  EHLKEIGIDNPTKNDGIEIINRRSERLKTKPQIS--------YNEEDNSLNKVVLNAHTI 901

Query: 690  IPDDGIEDPLTYKQAMKDVDRDQWIKAMDLEMESMYFNSVWTLVDQPNDVKPIGCKWIYK 749
              D     P ++ +     D+  W +A++ E+ +   N+ WT+  +P +   +  +W++ 
Sbjct: 902  FNDV----PNSFDEIQYRDDKSSWEEAINTELNAHKINNTWTITKRPENKNIVDSRWVFS 961

Query: 750  RKRDHAGKVQTFKARLVAKGYTQREGVDYEETFSPVAMLKSIRILLSIATFYDYEIWQMD 809
             K +  G    +KARLVA+G+TQ+  +DYEETF+PVA + S R +LS+   Y+ ++ QMD
Sbjct: 962  VKYNELGNPIRYKARLVARGFTQKYQIDYEETFAPVARISSFRFILSLVIQYNLKVHQMD 1021

Query: 810  VKTAFLNGNLEESIYMSQPEGFIEQDQEQKVCKLKKSIYGLKQASRSWNIRFDTAIKSYG 869
            VKTAFLNG L+E IYM  P+G         VCKL K+IYGLKQA+R W   F+ A+K   
Sbjct: 1022 VKTAFLNGTLKEEIYMRLPQGI--SCNSDNVCKLNKAIYGLKQAARCWFEVFEQALKECE 1081

Query: 870  FEQNVDEPCVY---KKVVNFIIAFLVLYVDDVLLIGNDVGYLTDIKKWLAMQFQ------ 929
            F  +  + C+Y   K  +N  I +++LYVDDV++   D+  + + K++L  +F+      
Sbjct: 1082 FVNSSVDRCIYILDKGNINENI-YVLLYVDDVVIATGDMTRMNNFKRYLMEKFRMTDLNE 1141

Query: 930  -KRSGRCTIRSRNPNCSKPYGIHLSK-------EQC----PKTPQEV-------EDMRNI 989
             K      I  +          ++ K       E C       P ++       ++  N 
Sbjct: 1142 IKHFIGIRIEMQEDKIYLSQSAYVKKILSKFNMENCNAVSTPLPSKINYELLNSDEDCNT 1201

Query: 990  PYASAVGSLMYAMLCTRPDICYSVGMVSRYQSNPGRDHWTAVKNILKYLRRTKDYMLMYG 1049
            P  S +G LMY MLCTRPD+  +V ++SRY S    + W  +K +L+YL+ T D  L++ 
Sbjct: 1202 PCRSLIGCLMYIMLCTRPDLTTAVNILSRYSSKNNSELWQNLKRVLRYLKGTIDMKLIF- 1261

Query: 1050 TKDLI----LTGYTDSDFQTDKDARKSTSGSVFTL-NGGAVVWRSIKQTCIADSTMEAEY 1078
             K+L     + GY DSD+   +  RKST+G +F + +   + W + +Q  +A S+ EAEY
Sbjct: 1262 KKNLAFENKIIGYVDSDWAGSEIDRKSTTGYLFKMFDFNLICWNTKRQNSVAASSTEAEY 1308

BLAST of CSPI02G15660 vs. Swiss-Prot
Match: YCH4_YEAST (Putative transposon Ty5-1 protein YCL074W OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=TY5A PE=5 SV=2)

HSP 1 Score: 118.6 bits (296), Expect = 4.2e-25
Identity = 89/307 (28.99%), Positives = 137/307 (44.63%), Query Frame = 1

Query: 799  MDVKTAFLNGNLEESIYMSQPEGFIEQDQEQKVCKLKKSIYGLKQASRSWNIRFDTAIKS 858
            MDV TAFLN  ++E IY+ QP GF+ +     V +L   +YGLKQA   WN   +  +K 
Sbjct: 1    MDVDTAFLNSTMDEPIYVKQPPGFVNERNPDYVWELYGGMYGLKQAPLLWNEHINNTLKK 60

Query: 859  YGFEQNVDEPCVYKKVVNFIIAFLVLYVDDVLLIGNDVGYLTDIKKWLAMQF-------- 918
             GF ++  E  +Y +  +    ++ +YVDD+L+          +K+ L   +        
Sbjct: 61   IGFCRHEGEHGLYFRSTSDGPIYIAVYVDDLLVAAPSPKIYDRVKQELTKLYSMKDLGKV 120

Query: 919  ---------QKRSGRCTIRSRNPNCSKPYGIHLSKEQCPKTP---------QEVEDMRNI 978
                     Q  +G  T+  ++          ++  +  +TP              +++I
Sbjct: 121  DKFLGLNIHQSSNGDITLSLQDYIAKAASESEINTFKLTQTPLCNSKPLFETTSPHLKDI 180

Query: 979  -PYASAVGSLMYAMLCTRPDICYSVGMVSRYQSNPGRDHWTAVKNILKYLRRTKDYMLMY 1038
             PY S VG L++     RPDI Y V ++SR+   P   H  + + +L+YL  T+   L Y
Sbjct: 181  TPYQSIVGQLLFCANTGRPDISYPVSLLSRFLREPRAIHLESARRVLRYLYTTRSMCLKY 240

Query: 1039 GT-KDLILTGYTDSDFQTDKDARKSTSGSVFTLNGGAVVWRSIK-QTCIADSTMEAEYVA 1077
             +   L LT Y D+      D   ST G V  L G  V W S K +  I   + EAEY+ 
Sbjct: 241  RSGSQLALTVYCDASHGAIHDLPHSTGGYVTLLAGAPVTWSSKKLKGVIPVPSTEAEYIT 300

BLAST of CSPI02G15660 vs. Swiss-Prot
Match: YL21B_YEAST (Transposon Ty2-LR1 Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=TY2B-LR1 PE=3 SV=1)

HSP 1 Score: 114.4 bits (285), Expect = 7.8e-24
Identity = 84/315 (26.67%), Positives = 146/315 (46.35%), Query Frame = 1

Query: 312 TAETQNKRQKVSSNAY-LWHLRLGHINLNRIGRLVKSGLLSPLEDNSLP-------PCES 371
           T    NK + V+   Y L H  LGH N   I + +K   ++ L+++ +         C  
Sbjct: 576 TINNVNKSKSVNKYPYPLIHRMLGHANFRSIQKSLKKNAVTYLKESDIEWSNASTYQCPD 635

Query: 372 CLEGKMTKRSFTGKGLRAK-----GPLELVHSDLCGPMNVKARGGYEYFISFIDDYSRYG 431
           CL GK TK     KG R K      P + +H+D+ GP++   +    YFISF D+ +R+ 
Sbjct: 636 CLIGKSTKHRHV-KGSRLKYQESYEPFQYLHTDIFGPVHHLPKSAPSYFISFTDEKTRFQ 695

Query: 432 HIYLIHHKSNS--LEKFKEYKAEVENELGKTIKILRSDRGGEYMDLRFRDYLIENGIQSQ 491
            +Y +H +     L  F    A ++N+    + +++ DRG EY +     +    GI + 
Sbjct: 696 WVYPLHDRREESILNVFTSILAFIKNQFNARVLVIQMDRGSEYTNKTLHKFFTNRGITAC 755

Query: 492 LSAPSTPQQNGVSERRNRTLLDMVRSMISFSQMSDSFWGYALETAAYILNNVPSKSVSET 551
            +  +  + +GV+ER NRTLL+  R+++  S + +  W  A+E +  I N++ S    ++
Sbjct: 756 YTTTADSRAHGVAERLNRTLLNDCRTLLHCSGLPNHLWFSAVEFSTIIRNSLVSPKNDKS 815

Query: 552 PYELWKGRKG-SLRHFRIWGCPAHVLVQNPKKLEHRSKLCFFIGYP-KESRGGLFYDPQE 610
             +   G  G  +     +G P  V   NP    H   +  +  +P + S G + Y P  
Sbjct: 816 ARQ-HAGLAGLDITTILPFGQPVIVNNHNPDSKIHPRGIPGYALHPSRNSYGYIIYLPSL 875

BLAST of CSPI02G15660 vs. Swiss-Prot
Match: YO22B_YEAST (Transposon Ty2-OR2 Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=TY2B-OR2 PE=1 SV=1)

HSP 1 Score: 114.0 bits (284), Expect = 1.0e-23
Identity = 84/318 (26.42%), Positives = 146/318 (45.91%), Query Frame = 1

Query: 312 TAETQNKRQKVSSNAY-LWHLRLGHINLNRIGRLVKSGLLSPLEDNSLP-------PCES 371
           T    NK + V+   Y L H  LGH N   I + +K   ++ L+++ +         C  
Sbjct: 576 TINNVNKSKSVNKYPYPLIHRMLGHANFRSIQKSLKKNAVTYLKESDIEWSNASTYQCPD 635

Query: 372 CLEGKMTKRSFTGKGLRAK-----GPLELVHSDLCGPMNVKARGGYEYFISFIDDYSRYG 431
           CL GK TK     KG R K      P + +H+D+ GP++   +    YFISF D+ +R+ 
Sbjct: 636 CLIGKSTKHRHV-KGSRLKYQESYEPFQYLHTDIFGPVHHLPKSAPSYFISFTDEKTRFQ 695

Query: 432 HIYLIHHKSNS--LEKFKEYKAEVENELGKTIKILRSDRGGEYMDLRFRDYLIENGIQSQ 491
            +Y +H +     L  F    A ++N+    + +++ DRG EY +     +    GI + 
Sbjct: 696 WVYPLHDRREESILNVFTSILAFIKNQFNARVLVIQMDRGSEYTNKTLHKFFTNRGITAC 755

Query: 492 LSAPSTPQQNGVSERRNRTLLDMVRSMISFSQMSDSFWGYALETAAYILNNVPSKSVSET 551
            +  +  + +GV+ER NRTLL+  R+++  S + +  W  A+E +  I N++ S    ++
Sbjct: 756 YTTTADSRAHGVAERLNRTLLNDCRTLLHCSGLPNHLWFSAVEFSTIIRNSLVSPKNDKS 815

Query: 552 PYELWKGRKG-SLRHFRIWGCPAHVLVQNPKKLEHRSKLCFFIGYP-KESRGGLFYDPQE 611
             +   G  G  +     +G P  V   NP    H   +  +  +P + S G + Y P  
Sbjct: 816 ARQ-HAGLAGLDITTILPFGQPVIVNNHNPDSKIHPRGIPGYALHPSRNSYGYIIYLPSL 875

Query: 612 NKIFVSTNATFLEEDHIR 613
            K   +TN   L+ +  +
Sbjct: 876 KKTVDTTNYVILQNNQTK 891

BLAST of CSPI02G15660 vs. TrEMBL
Match: E2GK51_BRYDI (Gag/pol protein (Fragment) OS=Bryonia dioica PE=4 SV=1)

HSP 1 Score: 1365.1 bits (3532), Expect = 0.0e+00
Identity = 672/805 (83.48%), Positives = 725/805 (90.06%), Query Frame = 1

Query: 307  SQLQQTAETQNKRQKVSSNAYLWHLRLGHINLNRIGRLVKSGLLSPLEDNSLPPCESCLE 366
            +++ +T ETQNK+QKVSSNAYLWHLRLGHINLNRI RLVKSG+L+ LEDNSLPPCESCLE
Sbjct: 423  TEMFRTLETQNKKQKVSSNAYLWHLRLGHINLNRIERLVKSGILNQLEDNSLPPCESCLE 482

Query: 367  GKMTKRSFTGKGLRAKGPLELVHSDLCGPMNVKARGGYEYFISFIDDYSRYGHIYLIHHK 426
            GKMTKRSFTGKGLRAK PLELVHSDLCGPMNVKARGGYEYFISFIDD+SRYGH+YL+HHK
Sbjct: 483  GKMTKRSFTGKGLRAKVPLELVHSDLCGPMNVKARGGYEYFISFIDDFSRYGHVYLLHHK 542

Query: 427  SNSLEKFKEYKAEVENELGKTIKILRSDRGGEYMDLRFRDYLIENGIQSQLSAPSTPQQN 486
            S S EKFKEYKAEVENE+GKTIK LRSDRGGEYMD +F+DYLIE GIQSQLSAPSTPQQN
Sbjct: 543  SESFEKFKEYKAEVENEIGKTIKTLRSDRGGEYMDSKFQDYLIEFGIQSQLSAPSTPQQN 602

Query: 487  GVSERRNRTLLDMVRSMISFSQMSDSFWGYALETAAYILNNVPSKSVSETPYELWKGRKG 546
            GVSERRNRTLLDMVRSM+S++Q+ DSFWGYALETA +ILNNVPSKSV ETPYELWKGRK 
Sbjct: 603  GVSERRNRTLLDMVRSMMSYAQLPDSFWGYALETAIHILNNVPSKSVLETPYELWKGRKS 662

Query: 547  SLRHFRIWGCPAHVLVQNPKKLEHRSKLCFFIGYPKESRGGLFYDPQENKIFVSTNATFL 606
            SLR+FRIWGCPAHVLVQNPKKLE RSKLC F+GYPKESRGGLFY PQENK+FVSTNATFL
Sbjct: 663  SLRYFRIWGCPAHVLVQNPKKLEPRSKLCLFVGYPKESRGGLFYHPQENKVFVSTNATFL 722

Query: 607  EEDHIRDHQPRSKLVLKEISKSAIDKPSSSTKVVDKTRKSGQSHPSQQLREPRRSGRVVH 666
            EEDH R+HQPRSK+VLKE+ K+A DKPSSSTKVVDK   S QSH SQ+LR PRRSGRVVH
Sbjct: 723  EEDHXRNHQPRSKIVLKEMFKNATDKPSSSTKVVDKANISDQSHTSQELRVPRRSGRVVH 782

Query: 667  QPDRYLGLIETQVVIPDDGIEDPLTYKQAMKDVDRDQWIKAMDLEMESMYFNSVWTLVDQ 726
            QP+RYLGL+ETQ++IPDDG+EDPLTYKQAM DVDRDQWIKAM+LEMESMYFNSVWTLVD 
Sbjct: 783  QPNRYLGLVETQIIIPDDGVEDPLTYKQAMNDVDRDQWIKAMNLEMESMYFNSVWTLVDL 842

Query: 727  PNDVKPIGCKWIYKRKRDHAGKVQTFKARLVAKGYTQREGVDYEETFSPVAMLKSIRILL 786
            P+DVKPIGCKWIYKRKRD AGKVQTFKARLVAKGYTQ+EGVDYEETFSPVAMLKSIRILL
Sbjct: 843  PSDVKPIGCKWIYKRKRDQAGKVQTFKARLVAKGYTQKEGVDYEETFSPVAMLKSIRILL 902

Query: 787  SIATFYDYEIWQMDVKTAFLNGNLEESIYMSQPEGFIEQDQEQKVCKLKKSIYGLKQASR 846
            SIATFY+YEIWQMDVKTAFLNGNLEESIYM QPEGFI QDQEQKVCKL+KSIYGLKQASR
Sbjct: 903  SIATFYNYEIWQMDVKTAFLNGNLEESIYMVQPEGFIAQDQEQKVCKLQKSIYGLKQASR 962

Query: 847  SWNIRFDTAIKSYGFEQNVDEPCVYKKVVNFIIAFLVLYVDDVLLIGNDVGYLTDIKKWL 906
            SWNIRFDTAIKSYGFEQNVDEPCVYKK+VN ++AFL+LYVDD+LLIGNDV YLTD+KKWL
Sbjct: 963  SWNIRFDTAIKSYGFEQNVDEPCVYKKIVNSVVAFLILYVDDILLIGNDVEYLTDVKKWL 1022

Query: 907  AMQFQKRS--------GRCTIRSRN---------------------PNCSK-----PYGI 966
              QFQ +         G   +R+R                       N  K      +GI
Sbjct: 1023 NTQFQMKDLGEAQYILGIQIVRNRKNKTLAMSQASYIDKVLSRYKMQNSKKGQLPFRHGI 1082

Query: 967  HLSKEQCPKTPQEVEDMRNIPYASAVGSLMYAMLCTRPDICYSVGMVSRYQSNPGRDHWT 1026
            HLSKEQCPKTPQEVEDMRNIPY+SAVGSLMYAMLCTRPDICYSVG+VSRYQSNPGRDHWT
Sbjct: 1083 HLSKEQCPKTPQEVEDMRNIPYSSAVGSLMYAMLCTRPDICYSVGIVSRYQSNPGRDHWT 1142

Query: 1027 AVKNILKYLRRTKDYMLMYGTKDLILTGYTDSDFQTDKDARKSTSGSVFTLNGGAVVWRS 1078
            AVKNILKYLRRT++YML+YG KDLILTGYTDSDFQ+DKDARKSTSGSVFTLNGGAVVWRS
Sbjct: 1143 AVKNILKYLRRTRNYMLVYGAKDLILTGYTDSDFQSDKDARKSTSGSVFTLNGGAVVWRS 1202

BLAST of CSPI02G15660 vs. TrEMBL
Match: A0A165U314_9ROSI (Gag/pol protein OS=Momordica dioica PE=4 SV=1)

HSP 1 Score: 1074.3 bits (2777), Expect = 1.2e-310
Identity = 549/808 (67.95%), Positives = 621/808 (76.86%), Query Frame = 1

Query: 314  ETQNKRQKVSSNAYLWHLRLGHINLNRIGRLVKSGLLSPLEDNSLPPCESCLEGKMTKRS 373
            ETQNK+Q   S  YLWHLRLGHINLNRI +L K GLL  LED SLPPCESCLEGKMTKR 
Sbjct: 437  ETQNKKQNNDSAMYLWHLRLGHINLNRIEKLHKDGLLEQLEDFSLPPCESCLEGKMTKRP 496

Query: 374  FTGKGLRAKGPLELVHSDLCGPMNVKARGGYEYFISFIDDYSRYGHIYLIHHKSNSLEKF 433
            FTGKGLRA   LEL+H+D+CGPM+VKARGGY+YF+SF DD SRYG++YL+ HKS S EKF
Sbjct: 497  FTGKGLRASDLLELIHTDVCGPMSVKARGGYQYFLSFTDDLSRYGYVYLLKHKSESFEKF 556

Query: 434  KEYKAEVENELGKTIKILRSDRGGEYMDLRFRDYLIENGIQSQLSAPSTPQQNGVSERRN 493
            KE++AEVENE+GK IK LRSDRGGEYM   F D+L E GI SQLSAP TPQ NGVSERRN
Sbjct: 557  KEFQAEVENEIGKKIKTLRSDRGGEYMSSEFGDHLREFGIVSQLSAPGTPQCNGVSERRN 616

Query: 494  RTLLDMVRSMISFSQMSDSFWGYALETAAYILNNVPSKSVSETPYELWKGRKGSLRHFRI 553
            RTLLDMVRSM+S++ + DSFWGYA E    ILN VPSKSV ETPYELW GRK SL   +I
Sbjct: 617  RTLLDMVRSMMSYADLPDSFWGYARERERAILNRVPSKSVEETPYELWYGRKSSLSFLKI 676

Query: 554  WGCPAHVLVQNPKKLEHRSKLCFFIGYPKESRGGLFYDPQENKIFVSTNATFLEEDHIRD 613
            WGCPAHV    PKKLE RS+ C F+GYPKE+RG  FY PQENK+FV+TN  FLE++ +  
Sbjct: 677  WGCPAHVKKLQPKKLEPRSEKCLFVGYPKETRGYYFYHPQENKVFVATNEAFLEKEFLSR 736

Query: 614  HQPRSKLVLKEISKSAI-----DKPSSSTK-VVDKTR-KSGQSH--PSQQLREPRRSGRV 673
            HQP SK+VLK + +  I     DKPSSSTK VVDK      QSH    Q+LR PRRSGR 
Sbjct: 737  HQPGSKIVLKAVVEPLIPLDGTDKPSSSTKVVVDKAEVNDDQSHTPDQQELRVPRRSGRS 796

Query: 674  VHQPDRYLGLIETQVVIPDDGIEDPLTYKQAMKDVDRDQWIKAMDLEMESMYFNSVWTLV 733
               P+RYLGL+ETQ++I D+G EDP  YKQAM   D DQW+KAM+ EMESMY N VWTLV
Sbjct: 797  RRAPNRYLGLVETQIMILDNGEEDPTNYKQAMVGPDSDQWLKAMNSEMESMYDNKVWTLV 856

Query: 734  DQPNDVKPIGCKWIYKRKRDHAGKVQTFKARLVAKGYTQREGVDYEETFSPVAMLKSIRI 793
            D P+DVKPIGCKWIYK+KRD    V  FKARLVAKG+T+   + YEETFSPVAMLKSIRI
Sbjct: 857  DLPSDVKPIGCKWIYKKKRDQDSNVTVFKARLVAKGFTRSLSLSYEETFSPVAMLKSIRI 916

Query: 794  LLSIATFYDYEIWQMDVKTAFLNGNLEESIYMSQPEGFIEQDQEQKVCKLKKSIYGLKQA 853
            +L+IA F+DYEIWQMDVKTAFLNGNLEESIYM QPEGF+ QDQEQK CKL+ SIYGLKQA
Sbjct: 917  ILAIAAFFDYEIWQMDVKTAFLNGNLEESIYMIQPEGFVAQDQEQKACKLQGSIYGLKQA 976

Query: 854  SRSWNIRFDTAIKSYGFEQNVDEPCVYKKVVNFIIAFLVLYVDDVLLIGNDVGYLTDIKK 913
            SRSWNIRFD  IK++GF QNVDE CVYKK+   ++AFL+LYVDD+LLIGNDV YL D+KK
Sbjct: 977  SRSWNIRFDEVIKAFGFIQNVDESCVYKKISGSVVAFLILYVDDILLIGNDVEYLEDVKK 1036

Query: 914  WLAMQF-----------------QKRSGRCTIRSRNPNCSKPYG---------------- 973
            WL   F                 + RS +    S++    K                   
Sbjct: 1037 WLNTSFSMKDLGEAQYILGIRIYRDRSNKTIGMSQSTYIDKVLSRFKMQDSKKGLLPFRH 1096

Query: 974  -IHLSKEQCPKTPQEVEDMRNIPYASAVGSLMYAMLCTRPDICYSVGMVSRYQSNPGRDH 1033
             IHLSKEQCPKTPQEVEDMRNIPY+SA+GSLMYAMLCTRPD+CY++ +VSRYQSNPGRDH
Sbjct: 1097 GIHLSKEQCPKTPQEVEDMRNIPYSSAIGSLMYAMLCTRPDVCYALSIVSRYQSNPGRDH 1156

Query: 1034 WTAVKNILKYLRRTKDYMLMY-GTKDLILTGYTDSDFQTDKDARKSTSGSVFTLNGGAVV 1078
            WTAVKNILKYLRRT++  L+Y G KDL + GYTDS FQTDKD  KS SG VFTLNGGAV 
Sbjct: 1157 WTAVKNILKYLRRTRNMFLVYGGDKDLAVKGYTDSSFQTDKDDSKSQSG-VFTLNGGAVS 1216

BLAST of CSPI02G15660 vs. TrEMBL
Match: O23864_9ORYZ (Polyprotein OS=Oryza australiensis PE=4 SV=1)

HSP 1 Score: 823.9 bits (2127), Expect = 2.2e-235
Identity = 426/786 (54.20%), Positives = 542/786 (68.96%), Query Frame = 1

Query: 327  YLWHLRLGHINLNRIGRLVKSGLLSPLEDNSLPPCESCLEGKMTKRSFTGKGLRAKGPLE 386
            ++WH RLGHIN  R+ +L K GLL   +  S   CESCL GKMTK  FTG   RA   L 
Sbjct: 438  FIWHCRLGHINKKRMEKLHKDGLLHSFDFESFETCESCLLGKMTKAPFTGHSERASDLLA 497

Query: 387  LVHSDLCGPMNVKARGGYEYFISFIDDYSRYGHIYLIHHKSNSLEKFKEYKAEVENELGK 446
            LVH+D+CGPM+  ARGGY+YFI+F DD+SRYG+IYL+ HKS S EKFKE++ EV+N LGK
Sbjct: 498  LVHTDVCGPMSSTARGGYQYFITFTDDFSRYGYIYLMRHKSESFEKFKEFQNEVQNHLGK 557

Query: 447  TIKILRSDRGGEYMDLRFRDYLIENGIQSQLSAPSTPQQNGVSERRNRTLLDMVRSMISF 506
            TIK LRSDRGGEY+   F ++L + GI  QL+ P TPQ NGVSERRNRTLLDMVRSM+S 
Sbjct: 558  TIKFLRSDRGGEYVSQEFGNHLKDCGIVPQLTPPGTPQWNGVSERRNRTLLDMVRSMMSQ 617

Query: 507  SQMSDSFWGYALETAAYILNNVPSKSVSETPYELWKGRKGSLRHFRIWGCPAHVLVQNPK 566
            S +  SFWGYALETAA  LN VPSKSV +TPYE+W G+  SL   +IWGC A+V      
Sbjct: 618  SDLPLSFWGYALETAALTLNRVPSKSVEKTPYEIWTGQPPSLSFLKIWGCEAYVKRLQSD 677

Query: 567  KLEHRSKLCFFIGYPKESRGGLFYDPQENKIFVSTNATFLEEDHIRDHQPRSKLVLKEIS 626
            KL  +S  CF +GYPKE++G  FY+ ++ K+FV+ +  FLE++ +       ++ L+E+ 
Sbjct: 678  KLTPKSDKCFVVGYPKETKGYYFYNREQAKVFVARHGVFLEKEFLSRRVSGIRVHLEEVQ 737

Query: 627  KSAIDKPSSSTKVVDKTRKSGQSHPSQQLREPRRSGRVVHQPDRYLGLIETQVVIPDDGI 686
            ++  +  S++T+   +      + P      PRRS R    PDRY G  +  +++ D+  
Sbjct: 738  ETP-ETVSATTE--PQQEDQSVAPPVVDTPAPRRSERSRRAPDRYTGAEQRDILLLDN-- 797

Query: 687  EDPLTYKQAMKDVDRDQWIKAMDLEMESMYFNSVWTLVDQPNDVKPIGCKWIYKRKRDHA 746
            ++P TY++AM   D ++W+ AM  E+ESMY N VW LVD P+ VK I CKW++K+K D  
Sbjct: 798  DEPKTYEEAMVGHDSNKWLGAMKSEIESMYDNQVWNLVDPPDGVKTIECKWLFKKKADMD 857

Query: 747  GKVQTFKARLVAKGYTQREGVDYEETFSPVAMLKSIRILLSIATFYDYEIWQMDVKTAFL 806
            G V  +KARLVAKG+ Q +GVDY+ETFSPVAMLKSIRI+L+IA ++DYEIWQMDVKTAFL
Sbjct: 858  GNVHIYKARLVAKGFKQIQGVDYDETFSPVAMLKSIRIILAIAAYFDYEIWQMDVKTAFL 917

Query: 807  NGNLEESIYMSQPEGFIEQDQEQKVCKLKKSIYGLKQASRSWNIRFDTAIKSYGFEQNVD 866
            NGNL E +YM QP+GF++ +   K+CKL+KSIYGLKQASRSWNIRFD  IK +GF +N +
Sbjct: 918  NGNLSEDVYMIQPQGFVDPESPGKICKLQKSIYGLKQASRSWNIRFDEVIKGFGFIKNEE 977

Query: 867  EPCVYKKVVNFIIAFLVLYVDDVLLIGNDVGYLTDIKKWLAMQF---------------- 926
            E CVYKKV    I FL+LYVDD+LLIGND+  L  +K  L   F                
Sbjct: 978  EACVYKKVSGSAIVFLILYVDDILLIGNDIPMLESVKSSLKNSFSMKDLGEAAYILGIRI 1037

Query: 927  -QKRSGRC-----------TIRSRNPNCSK------PYGIHLSKEQCPKTPQEVEDMRNI 986
             + RS R             ++  N + SK       +GI+LSK QCP+T  E   M  +
Sbjct: 1038 YRDRSKRLIGLSQSTYIDKVLKRFNMHDSKKGFLPMSHGINLSKNQCPQTHDERNKMGMV 1097

Query: 987  PYASAVGSLMYAMLCTRPDICYSVGMVSRYQSNPGRDHWTAVKNILKYLRRTKDYMLMY- 1046
            PYASA+GS+MYAMLCTRPD+ Y++   SRYQS+PG  HWTAVKNILKYLRRTKD  L+Y 
Sbjct: 1098 PYASAIGSIMYAMLCTRPDVSYALSATSRYQSDPGEGHWTAVKNILKYLRRTKDMFLVYG 1157

Query: 1047 GTKDLILTGYTDSDFQTDKDARKSTSGSVFTLNGGAVVWRSIKQTCIADSTMEAEYVAAC 1078
            G +DL+++GYTD+ FQTDKD  +S SG VF LNGGAV W+S KQ  +ADST EAEY+AA 
Sbjct: 1158 GEEDLVVSGYTDASFQTDKDDYRSQSGFVFCLNGGAVSWKSSKQDTVADSTTEAEYIAAS 1217

BLAST of CSPI02G15660 vs. TrEMBL
Match: Q7Y1M7_ORYSJ (Putative polyprotein OS=Oryza sativa subsp. japonica GN=OSJNBa0053G10.9 PE=4 SV=1)

HSP 1 Score: 813.1 bits (2099), Expect = 3.9e-232
Identity = 487/1126 (43.25%), Positives = 654/1126 (58.08%), Query Frame = 1

Query: 2    NSSIVQLLASEKLNGDNYAAWKSNLNTILVVDDLRFVLTEECPQNPASNANRTSRDAYDR 61
            N ++  +L  EKL G N+  W  NL  +L  +   FVLTE  P N  +NA    R  +++
Sbjct: 8    NFNLRSILEKEKLTGTNFMDWYRNLRIVLRQEHKEFVLTEPFPANLPNNAPAAQRREHEK 67

Query: 62   WIKANEKARVYILASMSDVLAKKHESLATAKEIMDSLRGMFGQPEWSLRHEAVKYIYTKR 121
                       +LA+MS  L +++E+L  A  I+  LR MF     + R    K ++  R
Sbjct: 68   RCNDYLDISCLMLATMSPELQRQYEAL-DAHTIITGLRNMFEDQARAERFNTSKSLFACR 127

Query: 122  MKEGTSVREHVLDMMMHFNIAEVNGGPIEEVNQVSFILESLPKSFIPFQTNASLNKIEFN 181
            + EG  V  HV+ M+ +    +  G P+        IL+SLP SF PF  N ++N +   
Sbjct: 128  LAEGNLVSPHVIKMIGYTESLDKLGFPLSRELATDLILQSLPPSFEPFIMNFNMNNLNRT 187

Query: 182  LTTLLNELQRFQNLTMGKGKQVEANVATTKRKFIRGSSSKTKAGPSKPNA-QIKKKGKGK 241
            L  L   L+  +         V   +   KRK     ++K      K N+ +I      K
Sbjct: 188  LAELHGMLKTAEESIKKNSNHV---MVMHKRK----PNNKKSGQKRKLNSDEITSTSNSK 247

Query: 242  TPKQNKGKKAAEKGKCYHCGQN-GHWLRN----CPKYLAEKKGREGNTRKIVLGKSFQKA 301
            T  Q  G  +A+  +C+ C +  G+  R+    C  Y  +           +        
Sbjct: 248  TKVQKTG--SAKDAECFFCKETEGYGFRSVDNGCSVYYND-----------IFYFHAPMM 307

Query: 302  RSLSRLEQERWSQLQQTAETQNKRQKVSSNAYLWHLRLGHINLNRIGRLVKSGLLSPLED 361
              L  +  +  S     A+ Q  R    +  ++WH  LGHIN  RI +L + GLL   + 
Sbjct: 308  NGLYIVNLDGCSVYNINAKRQ--RPNNLNPTFIWHCCLGHINEKRIEKLHRDGLLHSFDF 367

Query: 362  NSLPPCESCLEGKMTKRSFTGKGLRAKGPLELVHSDLCGPMNVKARGGYEYFISFIDDYS 421
             S   CESCL GKMTK  FTG+  RA   L LVH+D+CGPM+  ARGG+ YFI+F D++S
Sbjct: 368  ESFKTCESCLLGKMTKAPFTGQSERASELLGLVHTDVCGPMSSTARGGFGYFITFTDEFS 427

Query: 422  RYGHIYLIHHKSNSLEKFKEYKAEVENELGKTIKILRSDRGGEYMDLRFRDYLIENGIQS 481
            RYG++YL+ HKS S EKF+E++ EV+N LGKTIK LRSDRGGEY+ L + ++L E GI  
Sbjct: 428  RYGYVYLMRHKSESFEKFQEFQNEVQNHLGKTIKYLRSDRGGEYLSLEYGNHLKECGIVP 487

Query: 482  QLSAPSTPQQNGVSERRNRTLLDMVRSMISFSQMSDSFWGYALETAAYILNNVPSKSVSE 541
            QL+ P TPQ N VSERRNR LLDMVRSM+S + M  SFWGYALETAA+ LN VPSKSV +
Sbjct: 488  QLTPPGTPQWNAVSERRNRILLDMVRSMMSQTDMPLSFWGYALETAAFTLNRVPSKSVDK 547

Query: 542  TPYELWKGRKGSLRHFRIWGCPAHVLVQNPKKLEHRSKLCFFIGYPKESRGGLFYDPQEN 601
            TPYE+W G++ SL   +IW C                         +E++G  FY+ +E 
Sbjct: 548  TPYEIWTGKRPSLSFLKIWCC-------------------------EETKGYYFYNREEG 607

Query: 602  KIFVSTNATFLEEDHIRDHQPRSKLVLKEISKSAIDKPSSSTKVVDKTRKSGQSHPSQQL 661
            K+FV+ +  FLE++ I      S + LKEI ++     ++ST    +  +       Q +
Sbjct: 608  KVFVARHGVFLEKEFISRKDSGSMVRLKEIQETP---ENASTSTQPQVEQDVVQQVEQVV 667

Query: 662  REP-------RRSGRVVHQPDRYLGLIETQ--VVIPDDGIEDPLTYKQAMKDVDRDQWIK 721
             EP       RRS R+   P RY  L   Q  +++ D+  ++P TY++AM   D ++W+ 
Sbjct: 668  VEPVVEAPASRRSERIRRTPARYALLTSGQRDILLLDN--DEPTTYEEAMVGPDTEKWLG 727

Query: 722  AMDLEMESMYFNSVWTLVDQPNDVKPIGCKWIYKRKRDHAGKVQTFKARLVAKGYTQREG 781
            AM  E+ESM+ N VW LVD P+ VK I CKWI+K+  D  G V  + ARLVAKG+ Q +G
Sbjct: 728  AMKSEIESMHVNQVWNLVDPPDGVKAIECKWIFKKMTDVDGTVHIYNARLVAKGFRQIQG 787

Query: 782  VDYEETFSPVAMLKSIRILLSIATFYDYEIWQMDVKTAFLNGNLEESIYMSQPEGFIEQD 841
            VDY+ETFSPVAMLKSIRI+L+IA ++DYEIWQMDVKTAFLNGNL+E +YM+QP+GF++  
Sbjct: 788  VDYDETFSPVAMLKSIRIVLAIAAYFDYEIWQMDVKTAFLNGNLDEDVYMTQPKGFVDPQ 847

Query: 842  QEQKVCKLKKSIYGLKQASRSWNIRFDTAIKSYGFEQNVDEPCVYKKVVNFIIAFLVLYV 901
              +K+CKL+KSIY LKQASRSWNIRFD  +K+ GF +N +EPCVYKK+    + FL+LYV
Sbjct: 848  SAKKICKLQKSIYRLKQASRSWNIRFDEVVKALGFVKNEEEPCVYKKISGSALVFLILYV 907

Query: 902  DDVLLIGNDVGYLTDIKKWLAMQF-----------------QKRSGRCTIRSRNPNCSK- 961
            DD+LLIGND+  L  +K  L   F                 + RS R    S++    K 
Sbjct: 908  DDILLIGNDIPMLESVKTSLKYSFSMKDLGEAAYILGIRIYRDRSKRLIGLSQSTYIDKV 967

Query: 962  ----------------PYGIHLSKEQCPKTPQEVEDMRNIPYASAVGSLMYAMLCTRPDI 1021
                             +GI+L K QCP+T  E   M  IPYASA+GS+MYAMLCTR D+
Sbjct: 968  LKRFNMQDSKKGFLPMSHGINLGKNQCPQTTDERNKMSVIPYASAIGSIMYAMLCTRLDV 1027

Query: 1022 CYSVGMVSRYQSNPGRDHWTAVKNILKYLRRTKDYMLMYG-TKDLILTGYTDSDFQTDKD 1078
             Y++   SRYQS+ G  HW AVKNILKYLRRTKD  L+YG  ++L++ GYTD+ FQTDKD
Sbjct: 1028 SYALSATSRYQSDLGESHWIAVKNILKYLRRTKDMFLVYGRQEELVVNGYTDASFQTDKD 1080
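
The percentages in the HSP summary lines of these reports are the identity and positive counts divided by the alignment length. As a minimal sketch, not part of the database output, such a line could be parsed and the percentages rechecked as below. The function name `parse_hsp_summary` and the regular expression are illustrative assumptions, not part of any BLAST tool.

```python
import re

# Matches the "Identity = a/b (p%), Positives = c/d (q%)" part of an HSP
# summary line. "Posi?tives" also accepts the variant spelling "Postives".
HSP_LINE = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"Posi?tives = (\d+)/(\d+) \(([\d.]+)%\)"
)


def parse_hsp_summary(line):
    """Return the counts and recomputed percentages for one HSP summary line."""
    m = HSP_LINE.search(line)
    if m is None:
        raise ValueError("not an HSP summary line: %r" % line)
    ident, ident_len, ident_pct, pos, pos_len, pos_pct = m.groups()
    ident, ident_len = int(ident), int(ident_len)
    pos, pos_len = int(pos), int(pos_len)
    return {
        "identities": ident,
        "positives": pos,
        "aligned_length": ident_len,
        # Percentages recomputed from the counts, rounded to two decimals.
        "identity_pct": round(100.0 * ident / ident_len, 2),
        "positive_pct": round(100.0 * pos / pos_len, 2),
        # Percentages exactly as printed in the report.
        "reported_identity_pct": float(ident_pct),
        "reported_positive_pct": float(pos_pct),
    }


if __name__ == "__main__":
    summary = "Identity = 487/1126 (43.25%), Positives = 654/1126 (58.08%), Query Frame = 1"
    print(parse_hsp_summary(summary))
```

Comparing the recomputed and reported percentages is a cheap consistency check when these reports are scraped from a web page rather than read from the original BLAST output.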

BLAST of CSPI02G15660 vs. TrEMBL
Match: A0A015J5I9_9GLOM (Gag-pol fusion protein OS=Rhizophagus irregularis DAOM 197198w GN=RirG_273610 PE=4 SV=1)

HSP 1 Score: 783.5 bits (2022), Expect = 3.3e-223
Identity = 409/792 (51.64%), Positives = 527/792 (66.54%), Query Frame = 1

Query: 326  AYLWHLRLGHINLNRIGRLVKSGLLSPLEDNSLPPCESCLEGKMTKRSFTGKGLRAKGPL 385
            A LWH RLGHI+  RI +L K G+L   +  S   CESCL GKMTK  F G   R +G L
Sbjct: 412  ACLWHSRLGHISKKRIAQLQKDGVLESFDLKSDDVCESCLLGKMTKSPFKGSFERGEGLL 471

Query: 386  ELVHSDLCGPMNVKARGGYEYFISFIDDYSRYGHIYLIHHKSNSLEKFKEYKAEVENELG 445
            +++H+D+CGP     + G  ++++F DD+SRYG+IYLI HKS++ EKFKE+K EVEN+LG
Sbjct: 472  DIIHTDVCGPFRSTTKDGTRFYVTFTDDFSRYGYIYLIKHKSDTFEKFKEFKNEVENQLG 531

Query: 446  KTIKILRSDRGGEYMDLRFRDYLIENGIQSQLSAPSTPQQNGVSERRNRTLLDMVRSMIS 505
            + IK+LRSDRGGEY+ + F DYL E GI SQL+ P TPQ NGV+ERRNRTLLDMVRSM+S
Sbjct: 532  RKIKMLRSDRGGEYLSIEFLDYLKECGIVSQLTPPRTPQLNGVAERRNRTLLDMVRSMMS 591

Query: 506  FSQMSDSFWGYALETAAYILNNVPSKSVSETPYELWKGRKGSLRHFRIWGCPAHVLVQNP 565
             + +   FWGYALETAA+ILN VP+K V++TP+E+W G+  SL H ++WGC A V  +  
Sbjct: 592  RASLPIHFWGYALETAAHILNLVPTKKVAKTPHEMWTGKVPSLAHIKVWGCEAFVRRETQ 651

Query: 566  KKLEHRSKLCFFIGYPKESRGGLFYDPQENKIFVSTNATFLEEDHIRDHQPRSKLVLKEI 625
             KL  RS+ CFF+GYPK+S G LFY P E+ +FV+  A F E + I      S + L+EI
Sbjct: 652  DKLAERSERCFFLGYPKQSFGYLFYRPSEDVVFVARRAVFRERELIFKEDSGSTIDLEEI 711

Query: 626  SKSAIDKPSSSTKVVDKTRKSGQSHPSQQLREPRRSGRVVHQPDRYLGLI--ETQVVIPD 685
             +S+ D     T   ++  +     P+      RRSGRV   P+ Y   I  +    + D
Sbjct: 712  QESSDDATLGETS--NQHEEEVPVGPTDVSLPLRRSGRVSMPPEFYGFHITSDGDTFVSD 771

Query: 686  D---GIEDPLTYKQAMKDVDRDQWIKAMDLEMESMYFNSVWTLVDQPNDVKPIGCKWIYK 745
                 +++P  Y++A+   +  +W +AMD E++SMY N VW LVD     K +GCKWI+K
Sbjct: 772  RTLINLDEPANYQEAVAGPESAKWKEAMDSEIKSMYDNQVWNLVDNVPGRKTVGCKWIFK 831

Query: 746  RKRDHAGKVQTFKARLVAKGYTQREGVDYEETFSPVAMLKSIRILLSIATFYDYEIWQMD 805
            +K D  GKV TFKARLVAKG+TQ  GVDY+ETFSPVA +KSIRI+L+IA F+DYEIWQMD
Sbjct: 832  KKTDMDGKVHTFKARLVAKGFTQTPGVDYDETFSPVAKIKSIRIMLAIAAFHDYEIWQMD 891

Query: 806  VKTAFLNGNLEESIYMSQPEGFIEQDQEQKVCKLKKSIYGLKQASRSWNIRFDTAIKSYG 865
            VKTAFLNG L E +YM+QPEGF++     KVCKL++SIYGLKQASRSWN+ F   +K +G
Sbjct: 892  VKTAFLNGKLTEDVYMNQPEGFVDAKYPNKVCKLERSIYGLKQASRSWNLCFHEKVKEFG 951

Query: 866  FEQNVDEPCVYKKVVNFIIAFLVLYVDDVLLIGNDVGYLTDIKKWLAMQFQKRS------ 925
            F ++ DE CVY K    I+ FLVLYVDD+LL+GND+  L D+K WL   F  +       
Sbjct: 952  FSRSEDESCVYVKASGSIVTFLVLYVDDILLMGNDIPTLQDVKAWLGKCFAMKDLGEAAY 1011

Query: 926  --GRCTIRSRN---------------------PNCSKPY-----GIHLSKEQCPKTPQEV 985
              G   +R R                       N  K          LSK Q P T +E+
Sbjct: 1012 ILGIRILRDRKKRLIGLSQGTYLEKVLKRFSMENSKKGELPIQSNAKLSKTQSPSTDEEI 1071

Query: 986  EDMRNIPYASAVGSLMYAMLCTRPDICYSVGMVSRYQSNPGRDHWTAVKNILKYLRRTKD 1045
             +M  +PYASAVGS+MYAM CTRPD+ +++ MVSRYQ NPGR HW AVKNILKYLRRTK+
Sbjct: 1072 AEMSRVPYASAVGSIMYAMTCTRPDVAFALSMVSRYQGNPGRAHWIAVKNILKYLRRTKN 1131

Query: 1046 YMLMYGTKD-LILTGYTDSDFQTDKDARKSTSGSVFTLNGGAVVWRSIKQTCIADSTMEA 1078
             +L+ G  D L + GYTD+ FQTD+D+ +S SG VF LNGGAV W+S KQ  +ADST E+
Sbjct: 1132 MVLVLGGSDTLRVEGYTDASFQTDRDSGRSQSGWVFLLNGGAVTWKSSKQETVADSTCES 1191

BLAST of CSPI02G15660 vs. TAIR10
Match: AT4G23160.1 (AT4G23160.1 cysteine-rich RLK (RECEPTOR-like protein kinase) 8)

HSP 1 Score: 242.3 bits (617), Expect = 1.4e-63
Identity = 145/421 (34.44%), Positives = 225/421 (53.44%), Query Frame = 1

Query: 687  EDPLTYKQAMKDVDRDQWIKAMDLEMESMYFNSVWTLVDQPNDVKPIGCKWIYKRKRDHA 746
            ++P TY +A + +    W  AMD E+ +M     W +   P + KPIGCKW+YK K +  
Sbjct: 84   KEPSTYNEAKEFL---VWCGAMDDEIGAMETTHTWEICTLPPNKKPIGCKWVYKIKYNSD 143

Query: 747  GKVQTFKARLVAKGYTQREGVDYEETFSPVAMLKSIRILLSIATFYDYEIWQMDVKTAFL 806
            G ++ +KARLVAKGYTQ+EG+D+ ETFSPV  L S++++L+I+  Y++ + Q+D+  AFL
Sbjct: 144  GTIERYKARLVAKGYTQQEGIDFIETFSPVCKLTSVKLILAISAIYNFTLHQLDISNAFL 203

Query: 807  NGNLEESIYMSQPEGFIEQDQE----QKVCKLKKSIYGLKQASRSWNIRFDTAIKSYGFE 866
            NG+L+E IYM  P G+  +  +      VC LKKSIYGLKQASR W ++F   +  +GF 
Sbjct: 204  NGDLDEEIYMKLPPGYAARQGDSLPPNAVCYLKKSIYGLKQASRQWFLKFSVTLIGFGFV 263

Query: 867  QNVDEPCVYKKVVNFIIAFLVLYVDDVLLIGNDVGYLTDIKKWLAMQFQKRS-------- 926
            Q+  +   + K+   +   +++YVDD+++  N+   + ++K  L   F+ R         
Sbjct: 264  QSHSDHTYFLKITATLFLCVLVYVDDIIICSNNDAAVDELKSQLKSCFKLRDLGPLKYFL 323

Query: 927  GRCTIRSRN--PNCSKPYGIHLSKE---------QCPKTPQEVEDMRN-------IPYAS 986
            G    RS      C + Y + L  E           P  P       +         Y  
Sbjct: 324  GLEIARSAAGINICQRKYALDLLDETGLLGCKPSSVPMDPSVTFSAHSGGDFVDAKAYRR 383

Query: 987  AVGSLMYAMLCTRPDICYSVGMVSRYQSNPGRDHWTAVKNILKYLRRTKDYMLMYGTK-D 1046
             +G LMY  + TR DI ++V  +S++   P   H  AV  IL Y++ T    L Y ++ +
Sbjct: 384  LIGRLMYLQI-TRLDISFAVNKLSQFSEAPRLAHQQAVMKILHYIKGTVGQGLFYSSQAE 443

Query: 1047 LILTGYTDSDFQTDKDARKSTSGSVFTLNGGAVVWRSIKQTCIADSTMEAEYVAACEAAK 1077
            + L  ++D+ FQ+ KD R+ST+G    L    + W+S KQ  ++ S+ EAEY A   A  
Sbjct: 444  MQLQVFSDASFQSCKDTRRSTNGYCMFLGTSLISWKSKKQQVVSKSSAEAEYRALSFATD 500

BLAST of CSPI02G15660 vs. TAIR10
Match: ATMG00820.1 (ATMG00820.1 Reverse transcriptase (RNA-dependent DNA polymerase))

HSP 1 Score: 82.0 bits (201), Expect = 2.4e-15
Identity = 40/103 (38.83%), Positives = 62/103 (60.19%), Query Frame = 1

Query: 687 EDPLTYKQAMKDVDRDQWIKAMDLEMESMYFNSVWTLVDQPNDVKPIGCKWIYKRKRDHA 746
           ++P +   A+KD     W +AM  E++++  N  W LV  P +   +GCKW++K K    
Sbjct: 26  KEPKSVIFALKDPG---WCQAMQEELDALSRNKTWILVPPPVNQNILGCKWVFKTKLHSD 85

Query: 747 GKVQTFKARLVAKGYTQREGVDYEETFSPVAMLKSIRILLSIA 790
           G +   KARLVAKG+ Q EG+ + ET+SPV    +IR +L++A
Sbjct: 86  GTLDRLKARLVAKGFHQEEGIYFVETYSPVVRTATIRTILNVA 125

BLAST of CSPI02G15660 vs. TAIR10
Match: ATMG00300.1 (ATMG00300.1 Gag-Pol-related retrotransposon family protein)

HSP 1 Score: 58.5 bits (140), Expect = 2.9e-08
Identity = 34/92 (36.96%), Positives = 47/92 (51.09%), Query Frame = 1

Query: 309 LQQTAET--QNKRQKVSSNAYLWHLRLGHINLNRIGRLVKSGLLSPLEDNSLPPCESCLE 368
           LQ + ET   N  +       LWH RL H++   +  LVK G L   + +SL  CE C+ 
Sbjct: 50  LQGSVETGESNLAETAKDETRLWHSRLAHMSQRGMELLVKKGFLDSSKVSSLKFCEDCIY 109

Query: 369 GKMTKRSFTGKGLRAKGPLELVHSDLCGPMNV 399
           GK  + +F+      K PL+ VHSDL G  +V
Sbjct: 110 GKTHRVNFSTGQHTTKNPLDYVHSDLWGAPSV 141

BLAST of CSPI02G15660 vs. NCBI nr
Match: gi|299474487|gb|ADJ18449.1| (gag/pol protein [Bryonia dioica])

HSP 1 Score: 1365.1 bits (3532), Expect = 0.0e+00
Identity = 672/805 (83.48%), Positives = 725/805 (90.06%), Query Frame = 1

Query: 307  SQLQQTAETQNKRQKVSSNAYLWHLRLGHINLNRIGRLVKSGLLSPLEDNSLPPCESCLE 366
            +++ +T ETQNK+QKVSSNAYLWHLRLGHINLNRI RLVKSG+L+ LEDNSLPPCESCLE
Sbjct: 423  TEMFRTLETQNKKQKVSSNAYLWHLRLGHINLNRIERLVKSGILNQLEDNSLPPCESCLE 482

Query: 367  GKMTKRSFTGKGLRAKGPLELVHSDLCGPMNVKARGGYEYFISFIDDYSRYGHIYLIHHK 426
            GKMTKRSFTGKGLRAK PLELVHSDLCGPMNVKARGGYEYFISFIDD+SRYGH+YL+HHK
Sbjct: 483  GKMTKRSFTGKGLRAKVPLELVHSDLCGPMNVKARGGYEYFISFIDDFSRYGHVYLLHHK 542

Query: 427  SNSLEKFKEYKAEVENELGKTIKILRSDRGGEYMDLRFRDYLIENGIQSQLSAPSTPQQN 486
            S S EKFKEYKAEVENE+GKTIK LRSDRGGEYMD +F+DYLIE GIQSQLSAPSTPQQN
Sbjct: 543  SESFEKFKEYKAEVENEIGKTIKTLRSDRGGEYMDSKFQDYLIEFGIQSQLSAPSTPQQN 602

Query: 487  GVSERRNRTLLDMVRSMISFSQMSDSFWGYALETAAYILNNVPSKSVSETPYELWKGRKG 546
            GVSERRNRTLLDMVRSM+S++Q+ DSFWGYALETA +ILNNVPSKSV ETPYELWKGRK 
Sbjct: 603  GVSERRNRTLLDMVRSMMSYAQLPDSFWGYALETAIHILNNVPSKSVLETPYELWKGRKS 662

Query: 547  SLRHFRIWGCPAHVLVQNPKKLEHRSKLCFFIGYPKESRGGLFYDPQENKIFVSTNATFL 606
            SLR+FRIWGCPAHVLVQNPKKLE RSKLC F+GYPKESRGGLFY PQENK+FVSTNATFL
Sbjct: 663  SLRYFRIWGCPAHVLVQNPKKLEPRSKLCLFVGYPKESRGGLFYHPQENKVFVSTNATFL 722

Query: 607  EEDHIRDHQPRSKLVLKEISKSAIDKPSSSTKVVDKTRKSGQSHPSQQLREPRRSGRVVH 666
            EEDH R+HQPRSK+VLKE+ K+A DKPSSSTKVVDK   S QSH SQ+LR PRRSGRVVH
Sbjct: 723  EEDHXRNHQPRSKIVLKEMFKNATDKPSSSTKVVDKANISDQSHTSQELRVPRRSGRVVH 782

Query: 667  QPDRYLGLIETQVVIPDDGIEDPLTYKQAMKDVDRDQWIKAMDLEMESMYFNSVWTLVDQ 726
            QP+RYLGL+ETQ++IPDDG+EDPLTYKQAM DVDRDQWIKAM+LEMESMYFNSVWTLVD 
Sbjct: 783  QPNRYLGLVETQIIIPDDGVEDPLTYKQAMNDVDRDQWIKAMNLEMESMYFNSVWTLVDL 842

Query: 727  PNDVKPIGCKWIYKRKRDHAGKVQTFKARLVAKGYTQREGVDYEETFSPVAMLKSIRILL 786
            P+DVKPIGCKWIYKRKRD AGKVQTFKARLVAKGYTQ+EGVDYEETFSPVAMLKSIRILL
Sbjct: 843  PSDVKPIGCKWIYKRKRDQAGKVQTFKARLVAKGYTQKEGVDYEETFSPVAMLKSIRILL 902

Query: 787  SIATFYDYEIWQMDVKTAFLNGNLEESIYMSQPEGFIEQDQEQKVCKLKKSIYGLKQASR 846
            SIATFY+YEIWQMDVKTAFLNGNLEESIYM QPEGFI QDQEQKVCKL+KSIYGLKQASR
Sbjct: 903  SIATFYNYEIWQMDVKTAFLNGNLEESIYMVQPEGFIAQDQEQKVCKLQKSIYGLKQASR 962

Query: 847  SWNIRFDTAIKSYGFEQNVDEPCVYKKVVNFIIAFLVLYVDDVLLIGNDVGYLTDIKKWL 906
            SWNIRFDTAIKSYGFEQNVDEPCVYKK+VN ++AFL+LYVDD+LLIGNDV YLTD+KKWL
Sbjct: 963  SWNIRFDTAIKSYGFEQNVDEPCVYKKIVNSVVAFLILYVDDILLIGNDVEYLTDVKKWL 1022

Query: 907  AMQFQKRS--------GRCTIRSRN---------------------PNCSK-----PYGI 966
              QFQ +         G   +R+R                       N  K      +GI
Sbjct: 1023 NTQFQMKDLGEAQYILGIQIVRNRKNKTLAMSQASYIDKVLSRYKMQNSKKGQLPFRHGI 1082

Query: 967  HLSKEQCPKTPQEVEDMRNIPYASAVGSLMYAMLCTRPDICYSVGMVSRYQSNPGRDHWT 1026
            HLSKEQCPKTPQEVEDMRNIPY+SAVGSLMYAMLCTRPDICYSVG+VSRYQSNPGRDHWT
Sbjct: 1083 HLSKEQCPKTPQEVEDMRNIPYSSAVGSLMYAMLCTRPDICYSVGIVSRYQSNPGRDHWT 1142

Query: 1027 AVKNILKYLRRTKDYMLMYGTKDLILTGYTDSDFQTDKDARKSTSGSVFTLNGGAVVWRS 1078
            AVKNILKYLRRT++YML+YG KDLILTGYTDSDFQ+DKDARKSTSGSVFTLNGGAVVWRS
Sbjct: 1143 AVKNILKYLRRTRNYMLVYGAKDLILTGYTDSDFQSDKDARKSTSGSVFTLNGGAVVWRS 1202

BLAST of CSPI02G15660 vs. NCBI nr
Match: gi|1019597807|gb|AMY96445.1| (gag/pol protein [Momordica dioica])

HSP 1 Score: 1074.3 bits (2777), Expect = 1.7e-310
Identity = 549/808 (67.95%), Positives = 621/808 (76.86%), Query Frame = 1

Query: 314  ETQNKRQKVSSNAYLWHLRLGHINLNRIGRLVKSGLLSPLEDNSLPPCESCLEGKMTKRS 373
            ETQNK+Q   S  YLWHLRLGHINLNRI +L K GLL  LED SLPPCESCLEGKMTKR 
Sbjct: 437  ETQNKKQNNDSAMYLWHLRLGHINLNRIEKLHKDGLLEQLEDFSLPPCESCLEGKMTKRP 496

Query: 374  FTGKGLRAKGPLELVHSDLCGPMNVKARGGYEYFISFIDDYSRYGHIYLIHHKSNSLEKF 433
            FTGKGLRA   LEL+H+D+CGPM+VKARGGY+YF+SF DD SRYG++YL+ HKS S EKF
Sbjct: 497  FTGKGLRASDLLELIHTDVCGPMSVKARGGYQYFLSFTDDLSRYGYVYLLKHKSESFEKF 556

Query: 434  KEYKAEVENELGKTIKILRSDRGGEYMDLRFRDYLIENGIQSQLSAPSTPQQNGVSERRN 493
            KE++AEVENE+GK IK LRSDRGGEYM   F D+L E GI SQLSAP TPQ NGVSERRN
Sbjct: 557  KEFQAEVENEIGKKIKTLRSDRGGEYMSSEFGDHLREFGIVSQLSAPGTPQCNGVSERRN 616

Query: 494  RTLLDMVRSMISFSQMSDSFWGYALETAAYILNNVPSKSVSETPYELWKGRKGSLRHFRI 553
            RTLLDMVRSM+S++ + DSFWGYA E    ILN VPSKSV ETPYELW GRK SL   +I
Sbjct: 617  RTLLDMVRSMMSYADLPDSFWGYARERERAILNRVPSKSVEETPYELWYGRKSSLSFLKI 676

Query: 554  WGCPAHVLVQNPKKLEHRSKLCFFIGYPKESRGGLFYDPQENKIFVSTNATFLEEDHIRD 613
            WGCPAHV    PKKLE RS+ C F+GYPKE+RG  FY PQENK+FV+TN  FLE++ +  
Sbjct: 677  WGCPAHVKKLQPKKLEPRSEKCLFVGYPKETRGYYFYHPQENKVFVATNEAFLEKEFLSR 736

Query: 614  HQPRSKLVLKEISKSAI-----DKPSSSTK-VVDKTR-KSGQSH--PSQQLREPRRSGRV 673
            HQP SK+VLK + +  I     DKPSSSTK VVDK      QSH    Q+LR PRRSGR 
Sbjct: 737  HQPGSKIVLKAVVEPLIPLDGTDKPSSSTKVVVDKAEVNDDQSHTPDQQELRVPRRSGRS 796

Query: 674  VHQPDRYLGLIETQVVIPDDGIEDPLTYKQAMKDVDRDQWIKAMDLEMESMYFNSVWTLV 733
               P+RYLGL+ETQ++I D+G EDP  YKQAM   D DQW+KAM+ EMESMY N VWTLV
Sbjct: 797  RRAPNRYLGLVETQIMILDNGEEDPTNYKQAMVGPDSDQWLKAMNSEMESMYDNKVWTLV 856

Query: 734  DQPNDVKPIGCKWIYKRKRDHAGKVQTFKARLVAKGYTQREGVDYEETFSPVAMLKSIRI 793
            D P+DVKPIGCKWIYK+KRD    V  FKARLVAKG+T+   + YEETFSPVAMLKSIRI
Sbjct: 857  DLPSDVKPIGCKWIYKKKRDQDSNVTVFKARLVAKGFTRSLSLSYEETFSPVAMLKSIRI 916

Query: 794  LLSIATFYDYEIWQMDVKTAFLNGNLEESIYMSQPEGFIEQDQEQKVCKLKKSIYGLKQA 853
            +L+IA F+DYEIWQMDVKTAFLNGNLEESIYM QPEGF+ QDQEQK CKL+ SIYGLKQA
Sbjct: 917  ILAIAAFFDYEIWQMDVKTAFLNGNLEESIYMIQPEGFVAQDQEQKACKLQGSIYGLKQA 976

Query: 854  SRSWNIRFDTAIKSYGFEQNVDEPCVYKKVVNFIIAFLVLYVDDVLLIGNDVGYLTDIKK 913
            SRSWNIRFD  IK++GF QNVDE CVYKK+   ++AFL+LYVDD+LLIGNDV YL D+KK
Sbjct: 977  SRSWNIRFDEVIKAFGFIQNVDESCVYKKISGSVVAFLILYVDDILLIGNDVEYLEDVKK 1036

Query: 914  WLAMQF-----------------QKRSGRCTIRSRNPNCSKPYG---------------- 973
            WL   F                 + RS +    S++    K                   
Sbjct: 1037 WLNTSFSMKDLGEAQYILGIRIYRDRSNKTIGMSQSTYIDKVLSRFKMQDSKKGLLPFRH 1096

Query: 974  -IHLSKEQCPKTPQEVEDMRNIPYASAVGSLMYAMLCTRPDICYSVGMVSRYQSNPGRDH 1033
             IHLSKEQCPKTPQEVEDMRNIPY+SA+GSLMYAMLCTRPD+CY++ +VSRYQSNPGRDH
Sbjct: 1097 GIHLSKEQCPKTPQEVEDMRNIPYSSAIGSLMYAMLCTRPDVCYALSIVSRYQSNPGRDH 1156

Query: 1034 WTAVKNILKYLRRTKDYMLMY-GTKDLILTGYTDSDFQTDKDARKSTSGSVFTLNGGAVV 1078
            WTAVKNILKYLRRT++  L+Y G KDL + GYTDS FQTDKD  KS SG VFTLNGGAV 
Sbjct: 1157 WTAVKNILKYLRRTRNMFLVYGGDKDLAVKGYTDSSFQTDKDDSKSQSG-VFTLNGGAVS 1216

BLAST of CSPI02G15660 vs. NCBI nr
Match: gi|2443320|dbj|BAA22288.1| (polyprotein [Oryza australiensis])

HSP 1 Score: 823.9 bits (2127), Expect = 3.2e-235
Identity = 426/786 (54.20%), Positives = 542/786 (68.96%), Query Frame = 1

Query: 327  YLWHLRLGHINLNRIGRLVKSGLLSPLEDNSLPPCESCLEGKMTKRSFTGKGLRAKGPLE 386
            ++WH RLGHIN  R+ +L K GLL   +  S   CESCL GKMTK  FTG   RA   L 
Sbjct: 438  FIWHCRLGHINKKRMEKLHKDGLLHSFDFESFETCESCLLGKMTKAPFTGHSERASDLLA 497

Query: 387  LVHSDLCGPMNVKARGGYEYFISFIDDYSRYGHIYLIHHKSNSLEKFKEYKAEVENELGK 446
            LVH+D+CGPM+  ARGGY+YFI+F DD+SRYG+IYL+ HKS S EKFKE++ EV+N LGK
Sbjct: 498  LVHTDVCGPMSSTARGGYQYFITFTDDFSRYGYIYLMRHKSESFEKFKEFQNEVQNHLGK 557

Query: 447  TIKILRSDRGGEYMDLRFRDYLIENGIQSQLSAPSTPQQNGVSERRNRTLLDMVRSMISF 506
            TIK LRSDRGGEY+   F ++L + GI  QL+ P TPQ NGVSERRNRTLLDMVRSM+S 
Sbjct: 558  TIKFLRSDRGGEYVSQEFGNHLKDCGIVPQLTPPGTPQWNGVSERRNRTLLDMVRSMMSQ 617

Query: 507  SQMSDSFWGYALETAAYILNNVPSKSVSETPYELWKGRKGSLRHFRIWGCPAHVLVQNPK 566
            S +  SFWGYALETAA  LN VPSKSV +TPYE+W G+  SL   +IWGC A+V      
Sbjct: 618  SDLPLSFWGYALETAALTLNRVPSKSVEKTPYEIWTGQPPSLSFLKIWGCEAYVKRLQSD 677

Query: 567  KLEHRSKLCFFIGYPKESRGGLFYDPQENKIFVSTNATFLEEDHIRDHQPRSKLVLKEIS 626
            KL  +S  CF +GYPKE++G  FY+ ++ K+FV+ +  FLE++ +       ++ L+E+ 
Sbjct: 678  KLTPKSDKCFVVGYPKETKGYYFYNREQAKVFVARHGVFLEKEFLSRRVSGIRVHLEEVQ 737

Query: 627  KSAIDKPSSSTKVVDKTRKSGQSHPSQQLREPRRSGRVVHQPDRYLGLIETQVVIPDDGI 686
            ++  +  S++T+   +      + P      PRRS R    PDRY G  +  +++ D+  
Sbjct: 738  ETP-ETVSATTE--PQQEDQSVAPPVVDTPAPRRSERSRRAPDRYTGAEQRDILLLDN-- 797

Query: 687  EDPLTYKQAMKDVDRDQWIKAMDLEMESMYFNSVWTLVDQPNDVKPIGCKWIYKRKRDHA 746
            ++P TY++AM   D ++W+ AM  E+ESMY N VW LVD P+ VK I CKW++K+K D  
Sbjct: 798  DEPKTYEEAMVGHDSNKWLGAMKSEIESMYDNQVWNLVDPPDGVKTIECKWLFKKKADMD 857

Query: 747  GKVQTFKARLVAKGYTQREGVDYEETFSPVAMLKSIRILLSIATFYDYEIWQMDVKTAFL 806
            G V  +KARLVAKG+ Q +GVDY+ETFSPVAMLKSIRI+L+IA ++DYEIWQMDVKTAFL
Sbjct: 858  GNVHIYKARLVAKGFKQIQGVDYDETFSPVAMLKSIRIILAIAAYFDYEIWQMDVKTAFL 917

Query: 807  NGNLEESIYMSQPEGFIEQDQEQKVCKLKKSIYGLKQASRSWNIRFDTAIKSYGFEQNVD 866
            NGNL E +YM QP+GF++ +   K+CKL+KSIYGLKQASRSWNIRFD  IK +GF +N +
Sbjct: 918  NGNLSEDVYMIQPQGFVDPESPGKICKLQKSIYGLKQASRSWNIRFDEVIKGFGFIKNEE 977

Query: 867  EPCVYKKVVNFIIAFLVLYVDDVLLIGNDVGYLTDIKKWLAMQF---------------- 926
            E CVYKKV    I FL+LYVDD+LLIGND+  L  +K  L   F                
Sbjct: 978  EACVYKKVSGSAIVFLILYVDDILLIGNDIPMLESVKSSLKNSFSMKDLGEAAYILGIRI 1037

Query: 927  -QKRSGRC-----------TIRSRNPNCSK------PYGIHLSKEQCPKTPQEVEDMRNI 986
             + RS R             ++  N + SK       +GI+LSK QCP+T  E   M  +
Sbjct: 1038 YRDRSKRLIGLSQSTYIDKVLKRFNMHDSKKGFLPMSHGINLSKNQCPQTHDERNKMGMV 1097

Query: 987  PYASAVGSLMYAMLCTRPDICYSVGMVSRYQSNPGRDHWTAVKNILKYLRRTKDYMLMY- 1046
            PYASA+GS+MYAMLCTRPD+ Y++   SRYQS+PG  HWTAVKNILKYLRRTKD  L+Y 
Sbjct: 1098 PYASAIGSIMYAMLCTRPDVSYALSATSRYQSDPGEGHWTAVKNILKYLRRTKDMFLVYG 1157

Query: 1047 GTKDLILTGYTDSDFQTDKDARKSTSGSVFTLNGGAVVWRSIKQTCIADSTMEAEYVAAC 1078
            G +DL+++GYTD+ FQTDKD  +S SG VF LNGGAV W+S KQ  +ADST EAEY+AA 
Sbjct: 1158 GEEDLVVSGYTDASFQTDKDDYRSQSGFVFCLNGGAVSWKSSKQDTVADSTTEAEYIAAS 1217

BLAST of CSPI02G15660 vs. NCBI nr
Match: gi|31126682|gb|AAP44605.1| (putative polyprotein [Oryza sativa Japonica Group])

HSP 1 Score: 813.1 bits (2099), Expect = 5.6e-232
Identity = 487/1126 (43.25%), Positives = 654/1126 (58.08%), Query Frame = 1

Query: 2    NSSIVQLLASEKLNGDNYAAWKSNLNTILVVDDLRFVLTEECPQNPASNANRTSRDAYDR 61
            N ++  +L  EKL G N+  W  NL  +L  +   FVLTE  P N  +NA    R  +++
Sbjct: 8    NFNLRSILEKEKLTGTNFMDWYRNLRIVLRQEHKEFVLTEPFPANLPNNAPAAQRREHEK 67

Query: 62   WIKANEKARVYILASMSDVLAKKHESLATAKEIMDSLRGMFGQPEWSLRHEAVKYIYTKR 121
                       +LA+MS  L +++E+L  A  I+  LR MF     + R    K ++  R
Sbjct: 68   RCNDYLDISCLMLATMSPELQRQYEAL-DAHTIITGLRNMFEDQARAERFNTSKSLFACR 127

Query: 122  MKEGTSVREHVLDMMMHFNIAEVNGGPIEEVNQVSFILESLPKSFIPFQTNASLNKIEFN 181
            + EG  V  HV+ M+ +    +  G P+        IL+SLP SF PF  N ++N +   
Sbjct: 128  LAEGNLVSPHVIKMIGYTESLDKLGFPLSRELATDLILQSLPPSFEPFIMNFNMNNLNRT 187

Query: 182  LTTLLNELQRFQNLTMGKGKQVEANVATTKRKFIRGSSSKTKAGPSKPNA-QIKKKGKGK 241
            L  L   L+  +         V   +   KRK     ++K      K N+ +I      K
Sbjct: 188  LAELHGMLKTAEESIKKNSNHV---MVMHKRK----PNNKKSGQKRKLNSDEITSTSNSK 247

Query: 242  TPKQNKGKKAAEKGKCYHCGQN-GHWLRN----CPKYLAEKKGREGNTRKIVLGKSFQKA 301
            T  Q  G  +A+  +C+ C +  G+  R+    C  Y  +           +        
Sbjct: 248  TKVQKTG--SAKDAECFFCKETEGYGFRSVDNGCSVYYND-----------IFYFHAPMM 307

Query: 302  RSLSRLEQERWSQLQQTAETQNKRQKVSSNAYLWHLRLGHINLNRIGRLVKSGLLSPLED 361
              L  +  +  S     A+ Q  R    +  ++WH  LGHIN  RI +L + GLL   + 
Sbjct: 308  NGLYIVNLDGCSVYNINAKRQ--RPNNLNPTFIWHCCLGHINEKRIEKLHRDGLLHSFDF 367

Query: 362  NSLPPCESCLEGKMTKRSFTGKGLRAKGPLELVHSDLCGPMNVKARGGYEYFISFIDDYS 421
             S   CESCL GKMTK  FTG+  RA   L LVH+D+CGPM+  ARGG+ YFI+F D++S
Sbjct: 368  ESFKTCESCLLGKMTKAPFTGQSERASELLGLVHTDVCGPMSSTARGGFGYFITFTDEFS 427

Query: 422  RYGHIYLIHHKSNSLEKFKEYKAEVENELGKTIKILRSDRGGEYMDLRFRDYLIENGIQS 481
            RYG++YL+ HKS S EKF+E++ EV+N LGKTIK LRSDRGGEY+ L + ++L E GI  
Sbjct: 428  RYGYVYLMRHKSESFEKFQEFQNEVQNHLGKTIKYLRSDRGGEYLSLEYGNHLKECGIVP 487

Query: 482  QLSAPSTPQQNGVSERRNRTLLDMVRSMISFSQMSDSFWGYALETAAYILNNVPSKSVSE 541
            QL+ P TPQ N VSERRNR LLDMVRSM+S + M  SFWGYALETAA+ LN VPSKSV +
Sbjct: 488  QLTPPGTPQWNAVSERRNRILLDMVRSMMSQTDMPLSFWGYALETAAFTLNRVPSKSVDK 547

Query: 542  TPYELWKGRKGSLRHFRIWGCPAHVLVQNPKKLEHRSKLCFFIGYPKESRGGLFYDPQEN 601
            TPYE+W G++ SL   +IW C                         +E++G  FY+ +E 
Sbjct: 548  TPYEIWTGKRPSLSFLKIWCC-------------------------EETKGYYFYNREEG 607

Query: 602  KIFVSTNATFLEEDHIRDHQPRSKLVLKEISKSAIDKPSSSTKVVDKTRKSGQSHPSQQL 661
            K+FV+ +  FLE++ I      S + LKEI ++     ++ST    +  +       Q +
Sbjct: 608  KVFVARHGVFLEKEFISRKDSGSMVRLKEIQETP---ENASTSTQPQVEQDVVQQVEQVV 667

Query: 662  REP-------RRSGRVVHQPDRYLGLIETQ--VVIPDDGIEDPLTYKQAMKDVDRDQWIK 721
             EP       RRS R+   P RY  L   Q  +++ D+  ++P TY++AM   D ++W+ 
Sbjct: 668  VEPVVEAPASRRSERIRRTPARYALLTSGQRDILLLDN--DEPTTYEEAMVGPDTEKWLG 727

Query: 722  AMDLEMESMYFNSVWTLVDQPNDVKPIGCKWIYKRKRDHAGKVQTFKARLVAKGYTQREG 781
            AM  E+ESM+ N VW LVD P+ VK I CKWI+K+  D  G V  + ARLVAKG+ Q +G
Sbjct: 728  AMKSEIESMHVNQVWNLVDPPDGVKAIECKWIFKKMTDVDGTVHIYNARLVAKGFRQIQG 787

Query: 782  VDYEETFSPVAMLKSIRILLSIATFYDYEIWQMDVKTAFLNGNLEESIYMSQPEGFIEQD 841
            VDY+ETFSPVAMLKSIRI+L+IA ++DYEIWQMDVKTAFLNGNL+E +YM+QP+GF++  
Sbjct: 788  VDYDETFSPVAMLKSIRIVLAIAAYFDYEIWQMDVKTAFLNGNLDEDVYMTQPKGFVDPQ 847

Query: 842  QEQKVCKLKKSIYGLKQASRSWNIRFDTAIKSYGFEQNVDEPCVYKKVVNFIIAFLVLYV 901
              +K+CKL+KSIY LKQASRSWNIRFD  +K+ GF +N +EPCVYKK+    + FL+LYV
Sbjct: 848  SAKKICKLQKSIYRLKQASRSWNIRFDEVVKALGFVKNEEEPCVYKKISGSALVFLILYV 907

Query: 902  DDVLLIGNDVGYLTDIKKWLAMQF-----------------QKRSGRCTIRSRNPNCSK- 961
            DD+LLIGND+  L  +K  L   F                 + RS R    S++    K 
Sbjct: 908  DDILLIGNDIPMLESVKTSLKYSFSMKDLGEAAYILGIRIYRDRSKRLIGLSQSTYIDKV 967

Query: 962  ----------------PYGIHLSKEQCPKTPQEVEDMRNIPYASAVGSLMYAMLCTRPDI 1021
                             +GI+L K QCP+T  E   M  IPYASA+GS+MYAMLCTR D+
Sbjct: 968  LKRFNMQDSKKGFLPMSHGINLGKNQCPQTTDERNKMSVIPYASAIGSIMYAMLCTRLDV 1027

Query: 1022 CYSVGMVSRYQSNPGRDHWTAVKNILKYLRRTKDYMLMYG-TKDLILTGYTDSDFQTDKD 1078
             Y++   SRYQS+ G  HW AVKNILKYLRRTKD  L+YG  ++L++ GYTD+ FQTDKD
Sbjct: 1028 SYALSATSRYQSDLGESHWIAVKNILKYLRRTKDMFLVYGRQEELVVNGYTDASFQTDKD 1080

BLAST of CSPI02G15660 vs. NCBI nr
Match: gi|595434714|gb|EXX50149.1| (gag-pol fusion protein [Rhizophagus irregularis DAOM 197198w])

HSP 1 Score: 783.5 bits (2022), Expect = 4.8e-223
Identity = 409/792 (51.64%), Positives = 527/792 (66.54%), Query Frame = 1

Query: 326  AYLWHLRLGHINLNRIGRLVKSGLLSPLEDNSLPPCESCLEGKMTKRSFTGKGLRAKGPL 385
            A LWH RLGHI+  RI +L K G+L   +  S   CESCL GKMTK  F G   R +G L
Sbjct: 412  ACLWHSRLGHISKKRIAQLQKDGVLESFDLKSDDVCESCLLGKMTKSPFKGSFERGEGLL 471

Query: 386  ELVHSDLCGPMNVKARGGYEYFISFIDDYSRYGHIYLIHHKSNSLEKFKEYKAEVENELG 445
            +++H+D+CGP     + G  ++++F DD+SRYG+IYLI HKS++ EKFKE+K EVEN+LG
Sbjct: 472  DIIHTDVCGPFRSTTKDGTRFYVTFTDDFSRYGYIYLIKHKSDTFEKFKEFKNEVENQLG 531

Query: 446  KTIKILRSDRGGEYMDLRFRDYLIENGIQSQLSAPSTPQQNGVSERRNRTLLDMVRSMIS 505
            + IK+LRSDRGGEY+ + F DYL E GI SQL+ P TPQ NGV+ERRNRTLLDMVRSM+S
Sbjct: 532  RKIKMLRSDRGGEYLSIEFLDYLKECGIVSQLTPPRTPQLNGVAERRNRTLLDMVRSMMS 591

Query: 506  FSQMSDSFWGYALETAAYILNNVPSKSVSETPYELWKGRKGSLRHFRIWGCPAHVLVQNP 565
             + +   FWGYALETAA+ILN VP+K V++TP+E+W G+  SL H ++WGC A V  +  
Sbjct: 592  RASLPIHFWGYALETAAHILNLVPTKKVAKTPHEMWTGKVPSLAHIKVWGCEAFVRRETQ 651

Query: 566  KKLEHRSKLCFFIGYPKESRGGLFYDPQENKIFVSTNATFLEEDHIRDHQPRSKLVLKEI 625
             KL  RS+ CFF+GYPK+S G LFY P E+ +FV+  A F E + I      S + L+EI
Sbjct: 652  DKLAERSERCFFLGYPKQSFGYLFYRPSEDVVFVARRAVFRERELIFKEDSGSTIDLEEI 711

Query: 626  SKSAIDKPSSSTKVVDKTRKSGQSHPSQQLREPRRSGRVVHQPDRYLGLI--ETQVVIPD 685
             +S+ D     T   ++  +     P+      RRSGRV   P+ Y   I  +    + D
Sbjct: 712  QESSDDATLGETS--NQHEEEVPVGPTDVSLPLRRSGRVSMPPEFYGFHITSDGDTFVSD 771

Query: 686  D---GIEDPLTYKQAMKDVDRDQWIKAMDLEMESMYFNSVWTLVDQPNDVKPIGCKWIYK 745
                 +++P  Y++A+   +  +W +AMD E++SMY N VW LVD     K +GCKWI+K
Sbjct: 772  RTLINLDEPANYQEAVAGPESAKWKEAMDSEIKSMYDNQVWNLVDNVPGRKTVGCKWIFK 831

Query: 746  RKRDHAGKVQTFKARLVAKGYTQREGVDYEETFSPVAMLKSIRILLSIATFYDYEIWQMD 805
            +K D  GKV TFKARLVAKG+TQ  GVDY+ETFSPVA +KSIRI+L+IA F+DYEIWQMD
Sbjct: 832  KKTDMDGKVHTFKARLVAKGFTQTPGVDYDETFSPVAKIKSIRIMLAIAAFHDYEIWQMD 891

Query: 806  VKTAFLNGNLEESIYMSQPEGFIEQDQEQKVCKLKKSIYGLKQASRSWNIRFDTAIKSYG 865
            VKTAFLNG L E +YM+QPEGF++     KVCKL++SIYGLKQASRSWN+ F   +K +G
Sbjct: 892  VKTAFLNGKLTEDVYMNQPEGFVDAKYPNKVCKLERSIYGLKQASRSWNLCFHEKVKEFG 951

Query: 866  FEQNVDEPCVYKKVVNFIIAFLVLYVDDVLLIGNDVGYLTDIKKWLAMQFQKRS------ 925
            F ++ DE CVY K    I+ FLVLYVDD+LL+GND+  L D+K WL   F  +       
Sbjct: 952  FSRSEDESCVYVKASGSIVTFLVLYVDDILLMGNDIPTLQDVKAWLGKCFAMKDLGEAAY 1011

Query: 926  --GRCTIRSRN---------------------PNCSKPY-----GIHLSKEQCPKTPQEV 985
              G   +R R                       N  K          LSK Q P T +E+
Sbjct: 1012 ILGIRILRDRKKRLIGLSQGTYLEKVLKRFSMENSKKGELPIQSNAKLSKTQSPSTDEEI 1071

Query: 986  EDMRNIPYASAVGSLMYAMLCTRPDICYSVGMVSRYQSNPGRDHWTAVKNILKYLRRTKD 1045
             +M  +PYASAVGS+MYAM CTRPD+ +++ MVSRYQ NPGR HW AVKNILKYLRRTK+
Sbjct: 1072 AEMSRVPYASAVGSIMYAMTCTRPDVAFALSMVSRYQGNPGRAHWIAVKNILKYLRRTKN 1131

Query: 1046 YMLMYGTKD-LILTGYTDSDFQTDKDARKSTSGSVFTLNGGAVVWRSIKQTCIADSTMEA 1078
             +L+ G  D L + GYTD+ FQTD+D+ +S SG VF LNGGAV W+S KQ  +ADST E+
Sbjct: 1132 MVLVLGGSDTLRVEGYTDASFQTDRDSGRSQSGWVFLLNGGAVTWKSSKQETVADSTCES 1191

The following BLAST results are available for this feature:
Match Name   E-value   Identity (%)  Description
POLX_TOBAC   1.5e-163  40.02         Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum...
COPIA_DROME  1.9e-62   30.18         Copia protein OS=Drosophila melanogaster GN=GIP PE=1 SV=3
YCH4_YEAST   4.2e-25   28.99         Putative transposon Ty5-1 protein YCL074W OS=Saccharomyces cerevisiae (strain AT...
YL21B_YEAST  7.8e-24   26.67         Transposon Ty2-LR1 Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC ...
YO22B_YEAST  1.0e-23   26.42         Transposon Ty2-OR2 Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC ...

Match Name        E-value   Identity (%)  Description
E2GK51_BRYDI      0.0e+00   83.48         Gag/pol protein (Fragment) OS=Bryonia dioica PE=4 SV=1
A0A165U314_9ROSI  1.2e-310  67.95         Gag/pol protein OS=Momordica dioica PE=4 SV=1
O23864_9ORYZ      2.2e-235  54.20         Polyprotein OS=Oryza australiensis PE=4 SV=1
Q7Y1M7_ORYSJ      3.9e-232  43.25         Putative polyprotein OS=Oryza sativa subsp. japonica GN=OSJNBa0053G10.9 PE=4 SV=...
A0A015J5I9_9GLOM  3.3e-223  51.64         Gag-pol fusion protein OS=Rhizophagus irregularis DAOM 197198w GN=RirG_273610 PE...

Match Name   E-value   Identity (%)  Description
AT4G23160.1  1.4e-63   34.44         cysteine-rich RLK (RECEPTOR-like protein kinase) 8
ATMG00820.1  2.4e-15   38.83         ATMG00820.1 Reverse transcriptase (RNA-dependent DNA polymerase)
ATMG00300.1  2.9e-08   36.96         ATMG00300.1 Gag-Pol-related retrotransposon family protein

Match Name                    E-value   Identity (%)  Description
gi|299474487|gb|ADJ18449.1|   0.0e+00   83.48         gag/pol protein [Bryonia dioica]
gi|1019597807|gb|AMY96445.1|  1.7e-310  67.95         gag/pol protein [Momordica dioica]
gi|2443320|dbj|BAA22288.1|    3.2e-235  54.20         polyprotein [Oryza australiensis]
gi|31126682|gb|AAP44605.1|    5.6e-232  43.25         putative polyprotein [Oryza sativa Japonica Group]
gi|595434714|gb|EXX50149.1|   4.8e-223  51.64         gag-pol fusion protein [Rhizophagus irregularis DAOM 197198w]
The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term        Definition
IPR001584   Integrase_cat-core
IPR001878   Znf_CCHC
IPR012337   RNaseH-like_sf
IPR013103   RVT_2
IPR025724   GAG-pre-integrase_dom
Vocabulary: Biological Process
Term        Definition
GO:0015074  DNA integration
Vocabulary: Molecular Function
Term        Definition
GO:0003676  nucleic acid binding
GO:0008270  zinc ion binding
GO Assignments
This gene is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0015074 DNA integration
biological_process GO:0006310 DNA recombination
cellular_component GO:0005575 cellular_component
molecular_function GO:0003677 DNA binding
molecular_function GO:0016787 hydrolase activity
molecular_function GO:0008270 zinc ion binding
molecular_function GO:0003676 nucleic acid binding

The following mRNA feature(s) are a part of this gene:

Feature Name    Unique Name     Type
CSPI02G15660.1  CSPI02G15660.1  mRNA


Analysis Name: InterPro Annotations of cucumber (PI183967)
Date Performed: 2017-01-17
IPR Term   IPR Description                                      Source   Source Term        Source Description                   Alignment
IPR001584  Integrase, catalytic core                            PFAM     PF00665            rve                                  coord: 383..499, score: 5.7
IPR001584  Integrase, catalytic core                            PROFILE  PS50994            INTEGRASE                            coord: 380..545, score: 23
IPR001878  Zinc finger, CCHC-type                               GENE3D   G3DSA:4.10.60.10                                        coord: 248..271, score: 1.
IPR001878  Zinc finger, CCHC-type                               PFAM     PF00098            zf-CCHC                              coord: 254..271, score: 1.
IPR001878  Zinc finger, CCHC-type                               SMART    SM00343            c2hcfinal6                           coord: 255..271, score: 4.
IPR001878  Zinc finger, CCHC-type                               PROFILE  PS50158            ZF_CCHC                              coord: 255..271, score: 10
IPR001878  Zinc finger, CCHC-type                               unknown  SSF57756           Retrovirus zinc finger-like domains  coord: 236..274, score: 4.8
IPR012337  Ribonuclease H-like domain                           GENE3D   G3DSA:3.30.420.10                                       coord: 380..538, score: 2.8
IPR012337  Ribonuclease H-like domain                           unknown  SSF53098           Ribonuclease H-like                  coord: 379..539, score: 4.55
IPR013103  Reverse transcriptase, RNA-dependent DNA polymerase  PFAM     PF07727            RVT_2                                coord: 718..911, score: 3.6
IPR025724  GAG-pre-integrase domain                             PFAM     PF13976            gag_pre-integrs                      coord: 315..369, score: 1.2
None       No IPR available                                     unknown  Coil               Coil                                 coord: 175..195, scor
None       No IPR available                                     PANTHER  PTHR11439          GAG-POL-RELATED RETROTRANSPOSON      coord: 687..1077, score: 0.0; coord: 12..669, score:
None       No IPR available                                     PANTHER  PTHR11439:SF192    SUBFAMILY NOT NAMED                  coord: 687..1077, score: 0.0; coord: 12..669, score:
None       No IPR available                                     PFAM     PF14223            Retrotran_gag_2                      coord: 62..191, score: 7.7
None       No IPR available                                     unknown  SSF56672           DNA/RNA polymerases                  coord: 718..1075, score: 4.89
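
Each alignment entry in the InterPro table gives the span of a predicted domain as `coord: start..end`. As a minimal sketch, assuming these are 1-based inclusive residue positions on the protein (an assumption, since the page does not state the convention), the spans and their lengths could be extracted as below. The function name `domain_spans` is hypothetical.

```python
import re

# Matches "coord: 383..499" style entries anywhere in a block of text.
COORD = re.compile(r"coord: (\d+)\.\.(\d+)")


def domain_spans(text):
    """Return (start, end, length) for every 'coord: a..b' entry in text.

    Length is end - start + 1, which assumes 1-based inclusive coordinates.
    """
    spans = []
    for start, end in COORD.findall(text):
        start, end = int(start), int(end)
        spans.append((start, end, end - start + 1))
    return spans


if __name__ == "__main__":
    # Two entries copied from the table: the rve and RVT_2 Pfam hits.
    for span in domain_spans("coord: 383..499 coord: 718..911"):
        print(span)
```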

The following gene(s) are orthologous to this gene:

None

The following gene(s) are paralogous to this gene:

None