CSPI03G21050 (gene) Wild cucumber (PI 183967)

Name: CSPI03G21050
Type: gene
Organism: Cucumis sativus (Wild cucumber (PI 183967))
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Location: Chr3 : 17116157 .. 17120164 (+)
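
The location string above can be split into chromosome, coordinates, and strand, which also gives the genomic span of the feature. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (the record itself does not state the convention); `parse_location` is a hypothetical helper name, not part of any database API.

```python
import re

def parse_location(text):
    """Parse a location such as 'Chr3 : 17116157 .. 17120164 (+)'.

    Assumption: coordinates are 1-based and inclusive, so the span
    is end - start + 1.
    """
    pattern = r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*"
    m = re.fullmatch(pattern, text)
    if m is None:
        raise ValueError(f"unrecognised location: {text!r}")
    start, end = int(m.group(2)), int(m.group(3))
    return {
        "chrom": m.group(1),
        "start": start,
        "end": end,
        "strand": m.group(4),
        "length": end - start + 1,  # inclusive span in base pairs
    }

loc = parse_location("Chr3 : 17116157 .. 17120164 (+)")
print(loc["chrom"], loc["strand"], loc["length"])
```

Under that assumption the feature spans 4008 bp on the plus strand of Chr3.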
The following sequences are available for this feature:

Gene sequence (with intron)

ATGTCATCTAACATGTTGCAGCCTCAACTTCCTTGTTTTGAGGGAAAAAACTATAGGCGGTGGAGCCATCAAATGAAGGTTCTTTATGGATCTCAAGATCTTTGGGATATTGTTGACATCGGATATTCAGAGCCAGAAAGTGAGAATGGTCTTTCAGCACAACAACTCAATGAGTTGAGAGATGCTAGAAAAAAGGATAAGAAGGCATTATTTTTCATCTACCAATCTGTGGATGAAAATATTTTTGAAAGAATATCAGGAGTCTCTACTGCTAAAGCAGCATGGGATGCATTGCAAAATTTGTATGAAGGAGAAGAAAAGGTAAAATTGGTCCGATTACAAACACTTCGAGCTGAATTTGATACAATTCGAATGAAAGATTCTGAAACTATTGAAGAATTTTTCAACCGTGTTCTCTTAATTGTTAATCAATTGAGATCAAATGGAGAAACAATTGAAGATCAAAGGATTGTTGAGAAGATTCTTAGAAGCATGACTAGAAGATATGAGCATATTGTTGTAGCAATTGAAGAATCCAAAGATTTGTCAACTCTCTCTATAAATAGCTTAATGGGATCTCTTCAATCTCATGAGCTCAGATTGAAGATGTTTGATTCTAATCCTTCAGAAGAAGCTTTTCATATGCAGTCCTCCTATAGAGGTCGATCCAATGGAAGAAGAGGTGGACGTGGTGGTAGAGGCAATGGACGATCCAACGTTGTGACAAATACAGAGTCAGAAAGCAGAGACAATCAATTTTTTTCAAATAGAGGACGAGGAAGAAGTTCAAATAGAGGAAGAGGTAGAAGTGGTGGTCGTGGAGATTTTTCTCACATACAATGTTTCAATTGTAGACGTTATGGACATTTTCAAGCAGACTGTTGGTCTAAGAAGACTAATTCTAATCAAGCAGAAACCACACTAATGCATGAGCAATCAAATAATGATCAAGGTCTTCTCTTCCTCACTCTCAATGTTCAAGAATCAAGCACTGAAGAAATATGGTATCTTGATAGTGGTTGTAGTAACCACATGACAGGAAGAAAGGATATTTTTATATCTTTAGATGAATCTCATCAAAATGTAGTGAAGACTGGTGACAACAAGATGCTTGAAGTCAAAGGAAAAGGAGATATTCTTGTCAAGACAAAAATGGGAGCAAAAAAAATTACTGATGTGTATTATGTTTCAGGTCTCAAACACAATCTTTTAAGTGTTGGACAACTTCTCCTAAGAGGACATGATGTTATTTTTAAAGATAACATATGTGAGATTAGAACCAAGAATGGGGATCTCATAACGAAGGTTTGTATGACTCACAACAAAATGTTTCCAATTAAAATATGTTATGAGAAGCTTGTTTGTTTTGAGACTTTAGTAAATGACACCTCATGGTTATGGCATTGTCGATTTGGGCACCTAAGTTTTGACACTTTGTCTCACATGTGTCAACAACATATGGTGAGAGGAATGTCAAATATTAAAAAGGAAGATCAACTCTGTGAAGCATGTGTTTTCAGAAAGCATCATCGAAATTCATTTCCGACTGGAGGTTCTTGGAGAGCATCAAAACCACTCGAGCTTGTTCATACAGACTTATGTGGACCTATGAGAACTACTACACATGGAGGTAACCGTTATTTTCTCACATTTATTGATGACTACAGTCGAAAAACATGGATTTATCTACTAAAAGAAAAGAGTGCTACTTTCGAATGTTTCAAGACATTCAAAGCAATGGTGGAAAATGAAAGTAACTTGAAATTGAAATCATTGCGTTCGGATCGTGGAGGAGAATATATTGTTTTTGCAGATTTCTTGAAGGAAAATGGAATCAAGCATCAGAAGACTGTTCGAAGAACTCCTCAACAAAACGGAGTTGCAGAGAGGAAAAATAGAATAATAATGGAACTTGCAAGAAGTATGTTGAAGGCAAAGAAGCTTCCTGATCAATTTTGGGGAGACGCAGTAACTTGTGCTGTTTATCTCCTAAATAGAGC
TTCAACGAAAAGTGTGCAAGGTATTACTCCTCGAGAAGCATGGAGCGGATTGAAACCAACTGTTAGTTAAGAGTGTTTGGGTGCATTGCTTACTCTCACATTTCAGATGAGAAAAGAGGTAAGCTAGATGATAAATCAGAGAAATGCATTTTTGTTGGGTACAGTGAGAACTCTAAGGCCTACAGACTATACAATCCAATAAGTAAGAAAGTTGTTATTAGTCGAGATGTCAAGTTCGATGAAGCAAAATTGTGGCAATGGAATGCACCAAATGAAGACCAAAATCCATTACATGTTGATATGGATGGAAAAAAAGATGCTCGAGACTTGGAGCTTGAAGTAACTCAACCACTGACTTCACCTTCTTCATCACACTCCACAAGTGATGAAGAAACTACTCCAAGGAAGACCAGAAATATTCAAGAGATCTATAATACTTCAAGAAGGATACTAGATGAAGAACATGTTGATTTTGCTTTATTTGCAAATGTTGATCCTGTATACTTTGAAGAAGCAATTCAAGATGAAAATTGGAAAGATGCAATGAATCAAGAGATTGATGCAATAAGAAGAAACGAAAGATGAGAATTAGTAAAATTACCAAAAAATAAAAAGGCTCTTGGAGTCAAATGGATCTATAGAACAAAGCTAAAGCAAAACGGAGAAGTGCAAAAATACAAAGCCAGACTCGTTGTAAAAGGTTACAAACAAAAGTTTGGTGTGGATTATGAAGAAGTTTTTGCACCGGTAACTCGCTTGGAGACTGTTCGTTTGTTGTTAGCCCTTGCAGCAAAAAATAACTGGAAAGTTCATCAATGTAAAGTCAGCATTCCTAAATGGGTATTTAGAGGATGAAATATATGTTGAGCAACCCCCCGGTTATGCAAAGATTGGAGAAGAAAATAAGGTGTGTCGATTAAAGAAAGCCTTGTACGGGCTAAAGCAAGTACCAAGGGCTTGGTATAGTCGCATCTACAATTTTTTCTTAAAGGATGGTTTCAGAAGATGTCCATATGAACATGCTCTCCACACCAAAGAAGATGAAAATGGTAATTTCTTGATAATTTGTTTATATGTTGATGATTTAATATTTACGGGCAACTAAAATATGATGATTGAAGAATTCAAAGAGAGCATGAAAAAGGAATTTGAGATGACTGATATGGGTTTACTTCATTATTTTCTTGGTATTGAAGTTAAACAAGATGATAATGAGATTGCAATTTTCCAAAAAAAGTATGCAAAAGATTTGTTGAAAAGGTTCAAAATGGAGAATGCTTATCCTACCAATACTCCTATGGAATTGGGTTTAAAGTTAAGTAAGCATGATGTTAGTGAAGCTTTTGATGCCACCATTTATAGAAGTTTGGTTGGAAGTTTAATGTATTTAACTACAACTAGACCTGATATTATGTTCTCGGTCAGTTTATTGAGTAGATTTATGACATCACCAAAGAGAAGTCATTGGGAAGCTGGAAAGAGAGTTCTTAGATACATTCTTGGAACTGTTGATCATGGAATCCACTATAAAAGGAATGTGGATAATGTTCTTGTTGGCTACAGTGATAGTGATTGGGGAGGAAATATTGATGATTTCAAAAGTACTTCTGGGTATATATTTAATATTGGTTTTAAAGCAGTTTCATGGGCATCAAAGAAGCAAGATGTTGTAGCATTGTCCACAACAGAAGCTGAATACATTTCTTTGTCTGTTGCTAGTTGTCAAGCACTTTGGCTAAGAAATGTACTACATGAATTGAAGTGTCCTCAAGAGAAAGGGACCATCATGTTCTGTGACAATCAATCATCTATTTCACTTTCGAAGAATCCCGTTTTTCATGGAAGAAGCAAACACATAAACATCAAATATCATTTCATCAGAGAATTGATCAAAGATGGAGAAGTATATATCAGGTATTGCAAGACTCAAGATCAAGTTGCAGACGTATTCACAAAAGCATTAAAGACAGATTCATTCTTGAAAATGAAAGAGAAGCTCG
GAGTTTAG

mRNA sequence

ATGTCATCTAACATGTTGCAGCCTCAACTTCCTTGTTTTGAGGGAAAAAACTATAGGCGGTGGAGCCATCAAATGAAGGTTCTTTATGGATCTCAAGATCTTTGGGATATTGTTGACATCGGATATTCAGAGCCAGAAAGTGAGAATGGTCTTTCAGCACAACAACTCAATGAGTTGAGAGATGCTAGAAAAAAGGATAAGAAGGCATTATTTTTCATCTACCAATCTGTGGATGAAAATATTTTTGAAAGAATATCAGGAGTCTCTACTGCTAAAGCAGCATGGGATGCATTGCAAAATTTGTATGAAGGAGAAGAAAAGGTAAAATTGGTCCGATTACAAACACTTCGAGCTGAATTTGATACAATTCGAATGAAAGATTCTGAAACTATTGAAGAATTTTTCAACCGTGTTCTCTTAATTGTTAATCAATTGAGATCAAATGGAGAAACAATTGAAGATCAAAGGATTGTTGAGAAGATTCTTAGAAGCATGACTAGAAGATATGAGCATATTGTTGTAGCAATTGAAGAATCCAAAGATTTGTCAACTCTCTCTATAAATAGCTTAATGGGATCTCTTCAATCTCATGAGCTCAGATTGAAGATGTTTGATTCTAATCCTTCAGAAGAAGCTTTTCATATGCAGTCCTCCTATAGAGGTCGATCCAATGGAAGAAGAGGTGGACGTGGTGGTAGAGGCAATGGACGATCCAACGTTGTGACAAATACAGAGTCAGAAAGCAGAGACAATCAATTTTTTTCAAATAGAGGACGAGGAAGAAGTTCAAATAGAGGAAGAGGTAGAAGTGGTGGTCGTGGAGATTTTTCTCACATACAATGTTTCAATTGTAGACGTTATGGACATTTTCAAGCAGACTGTTGGTCTAAGAAGACTAATTCTAATCAAGCAGAAACCACACTAATGCATGAGCAATCAAATAATGATCAAGGTCTTCTCTTCCTCACTCTCAATGTTCAAGAATCAAGCACTGAAGAAATATGGTATCTTGATAGTGGTTGTAGTAACCACATGACAGGAAGAAAGGATATTTTTATATCTTTAGATGAATCTCATCAAAATGTAGTGAAGACTGGTGACAACAAGATGCTTGAAGTCAAAGGAAAAGGAGATATTCTTGTCAAGACAAAAATGGGAGCAAAAAAAATTACTGATGTGTATTATGTTTCAGGTCTCAAACACAATCTTTTAAGTGTTGGACAACTTCTCCTAAGAGGACATGATGTTATTTTTAAAGATAACATATGTGAGATTAGAACCAAGAATGGGGATCTCATAACGAAGGTTTGTATGACTCACAACAAAATGTTTCCAATTAAAATATGTTATGAGAAGCTTGTTTGTTTTGAGACTTTAGTAAATGACACCTCATGGTTATGGCATTGTCGATTTGGGCACCTAAGTTTTGACACTTTGTCTCACATGTGTCAACAACATATGGTGAGAGGAATGTCAAATATTAAAAAGGAAGATCAACTCTGTGAAGCATGTGTTTTCAGAAAGCATCATCGAAATTCATTTCCGACTGGAGGTTCTTGGAGAGCATCAAAACCACTCGAGCTTGTTCATACAGACTTATGTGGACCTATGAGAACTACTACACATGGAGGTAACCGTTATTTTCTCACATTTATTGATGACTACAGTCGAAAAACATGGATTTATCTACTAAAAGAAAAGAGTGCTACTTTCGAATGTTTCAAGACATTCAAAGCAATGGTGGAAAATGAAAGTAACTTGAAATTGAAATCATTGCGTTCGGATCGTGGAGGAGAATATATTGTTTTTGCAGATTTCTTGAAGGAAAATGGAATCAAGCATCAGAAGACTGTTCGAAGAACTCCTCAACAAAACGGAGTTGCAGAGAGGAAAAATAGAATAATAATGGAACTTGCAAGAAGTATGTTGAAGGCAAAGAAGCTTCCTGATCAATTTTGGGGAGACGCAGTAACTTGTGCTGTTTATCTCCTAAATAGAGC
TTCAACGAAAAGTGTGCAAGATGAGAAAAGAGGTAAGCTAGATGATAAATCAGAGAAATGCATTTTTGTTGGGTACAGTGAGAACTCTAAGGCCTACAGACTATACAATCCAATAAGTAAGAAAGTTGTTATTAGTCGAGATGTCAAGTTCGATGAAGCAAAATTGTGGCAATGGAATGCACCAAATGAAGACCAAAATCCATTACATGTTGATATGGATGGAAAAAAAGATGCTCGAGACTTGGAGCTTGAAGTAACTCAACCACTGACTTCACCTTCTTCATCACACTCCACAAGTGATGAAGAAACTACTCCAAGGAAGACCAGAAATATTCAAGAGATCTATAATACTTCAAGAAGGATACTAGATGAAGAACATGTTGATTTTGCTTTATTTGCAAATGTTGATCCTGCTCTTGGAGTCAAATGGATCTATAGAACAAAGCTAAAGCAAAACGGAGAAGTGCAAAAATACAAAGCCAGACTCGTTGTAAAAGGTTACAAACAAAAGTTTGGTGTGGATTATGAAGAAGTTTTTGCACCGCCCTTGCAGCAAAAAATAACTGGAAAGTTCATCAATGTAAAGTCAGCATTCCTAAATGGGTATTTAGAGGATGAAATATATGTTGAGCAACCCCCCGGTTATGCAAAGATTGGAGAAGAAAATAAGGATGGTTTCAGAAGATGTCCATATGAACATGCTCTCCACACCAAAGAAGATGAAAATGAATTCAAAGAGAGCATGAAAAAGGAATTTGAGATGACTGATATGGGTTTACTTCATTATTTTCTTGGTATTGAAGTTAAACAAGATGATAATGAGATTGCAATTTTCCAAAAAAAGTATGCAAAAGATTTGTTGAAAAGGTTCAAAATGGAGAATGCTTATCCTACCAATACTCCTATGGAATTGGGTTTAAAGTTAAGTAAGCATGATGTTAGTGAAGCTTTTGATGCCACCATTTATAGAAGTTTGGTTGGAAGTTTAATGTATTTAACTACAACTAGACCTGATATTATGTTCTCGGTCAGTTTATTGAGTAGATTTATGACATCACCAAAGAGAAGTCATTGGGAAGCTGGAAAGAGAGTTCTTAGATACATTCTTGGAACTGTTGATCATGGAATCCACTATAAAAGGAATGTGGATAATGTTCTTGTTGGCTACAGTGATAGTGATTGGGGAGGAAATATTGATGATTTCAAAAGTACTTCTGGGTATATATTTAATATTGGTTTTAAAGCAGTTTCATGGGCATCAAAGAAGCAAGATGTTGTAGCATTGTCCACAACAGAAGCTGAATACATTTCTTTGTCTGTTGCTAGTTGTCAAGCACTTTGGCTAAGAAATGTACTACATGAATTGAAGTGTCCTCAAGAGAAAGGGACCATCATGTTCTGTGACAATCAATCATCTATTTCACTTTCGAAGAATCCCGTTTTTCATGGAAGAAGCAAACACATAAACATCAAATATCATTTCATCAGAGAATTGATCAAAGATGGAGAAGTATATATCAGGTATTGCAAGACTCAAGATCAAGTTGCAGACGTATTCACAAAAGCATTAAAGACAGATTCATTCTTGAAAATGAAAGAGAAGCTCGGAGTTTAG

Coding sequence (CDS)

ATGTCATCTAACATGTTGCAGCCTCAACTTCCTTGTTTTGAGGGAAAAAACTATAGGCGGTGGAGCCATCAAATGAAGGTTCTTTATGGATCTCAAGATCTTTGGGATATTGTTGACATCGGATATTCAGAGCCAGAAAGTGAGAATGGTCTTTCAGCACAACAACTCAATGAGTTGAGAGATGCTAGAAAAAAGGATAAGAAGGCATTATTTTTCATCTACCAATCTGTGGATGAAAATATTTTTGAAAGAATATCAGGAGTCTCTACTGCTAAAGCAGCATGGGATGCATTGCAAAATTTGTATGAAGGAGAAGAAAAGGTAAAATTGGTCCGATTACAAACACTTCGAGCTGAATTTGATACAATTCGAATGAAAGATTCTGAAACTATTGAAGAATTTTTCAACCGTGTTCTCTTAATTGTTAATCAATTGAGATCAAATGGAGAAACAATTGAAGATCAAAGGATTGTTGAGAAGATTCTTAGAAGCATGACTAGAAGATATGAGCATATTGTTGTAGCAATTGAAGAATCCAAAGATTTGTCAACTCTCTCTATAAATAGCTTAATGGGATCTCTTCAATCTCATGAGCTCAGATTGAAGATGTTTGATTCTAATCCTTCAGAAGAAGCTTTTCATATGCAGTCCTCCTATAGAGGTCGATCCAATGGAAGAAGAGGTGGACGTGGTGGTAGAGGCAATGGACGATCCAACGTTGTGACAAATACAGAGTCAGAAAGCAGAGACAATCAATTTTTTTCAAATAGAGGACGAGGAAGAAGTTCAAATAGAGGAAGAGGTAGAAGTGGTGGTCGTGGAGATTTTTCTCACATACAATGTTTCAATTGTAGACGTTATGGACATTTTCAAGCAGACTGTTGGTCTAAGAAGACTAATTCTAATCAAGCAGAAACCACACTAATGCATGAGCAATCAAATAATGATCAAGGTCTTCTCTTCCTCACTCTCAATGTTCAAGAATCAAGCACTGAAGAAATATGGTATCTTGATAGTGGTTGTAGTAACCACATGACAGGAAGAAAGGATATTTTTATATCTTTAGATGAATCTCATCAAAATGTAGTGAAGACTGGTGACAACAAGATGCTTGAAGTCAAAGGAAAAGGAGATATTCTTGTCAAGACAAAAATGGGAGCAAAAAAAATTACTGATGTGTATTATGTTTCAGGTCTCAAACACAATCTTTTAAGTGTTGGACAACTTCTCCTAAGAGGACATGATGTTATTTTTAAAGATAACATATGTGAGATTAGAACCAAGAATGGGGATCTCATAACGAAGGTTTGTATGACTCACAACAAAATGTTTCCAATTAAAATATGTTATGAGAAGCTTGTTTGTTTTGAGACTTTAGTAAATGACACCTCATGGTTATGGCATTGTCGATTTGGGCACCTAAGTTTTGACACTTTGTCTCACATGTGTCAACAACATATGGTGAGAGGAATGTCAAATATTAAAAAGGAAGATCAACTCTGTGAAGCATGTGTTTTCAGAAAGCATCATCGAAATTCATTTCCGACTGGAGGTTCTTGGAGAGCATCAAAACCACTCGAGCTTGTTCATACAGACTTATGTGGACCTATGAGAACTACTACACATGGAGGTAACCGTTATTTTCTCACATTTATTGATGACTACAGTCGAAAAACATGGATTTATCTACTAAAAGAAAAGAGTGCTACTTTCGAATGTTTCAAGACATTCAAAGCAATGGTGGAAAATGAAAGTAACTTGAAATTGAAATCATTGCGTTCGGATCGTGGAGGAGAATATATTGTTTTTGCAGATTTCTTGAAGGAAAATGGAATCAAGCATCAGAAGACTGTTCGAAGAACTCCTCAACAAAACGGAGTTGCAGAGAGGAAAAATAGAATAATAATGGAACTTGCAAGAAGTATGTTGAAGGCAAAGAAGCTTCCTGATCAATTTTGGGGAGACGCAGTAACTTGTGCTGTTTATCTCCTAAATAGAGC
TTCAACGAAAAGTGTGCAAGATGAGAAAAGAGGTAAGCTAGATGATAAATCAGAGAAATGCATTTTTGTTGGGTACAGTGAGAACTCTAAGGCCTACAGACTATACAATCCAATAAGTAAGAAAGTTGTTATTAGTCGAGATGTCAAGTTCGATGAAGCAAAATTGTGGCAATGGAATGCACCAAATGAAGACCAAAATCCATTACATGTTGATATGGATGGAAAAAAAGATGCTCGAGACTTGGAGCTTGAAGTAACTCAACCACTGACTTCACCTTCTTCATCACACTCCACAAGTGATGAAGAAACTACTCCAAGGAAGACCAGAAATATTCAAGAGATCTATAATACTTCAAGAAGGATACTAGATGAAGAACATGTTGATTTTGCTTTATTTGCAAATGTTGATCCTGCTCTTGGAGTCAAATGGATCTATAGAACAAAGCTAAAGCAAAACGGAGAAGTGCAAAAATACAAAGCCAGACTCGTTGTAAAAGGTTACAAACAAAAGTTTGGTGTGGATTATGAAGAAGTTTTTGCACCGCCCTTGCAGCAAAAAATAACTGGAAAGTTCATCAATGTAAAGTCAGCATTCCTAAATGGGTATTTAGAGGATGAAATATATGTTGAGCAACCCCCCGGTTATGCAAAGATTGGAGAAGAAAATAAGGATGGTTTCAGAAGATGTCCATATGAACATGCTCTCCACACCAAAGAAGATGAAAATGAATTCAAAGAGAGCATGAAAAAGGAATTTGAGATGACTGATATGGGTTTACTTCATTATTTTCTTGGTATTGAAGTTAAACAAGATGATAATGAGATTGCAATTTTCCAAAAAAAGTATGCAAAAGATTTGTTGAAAAGGTTCAAAATGGAGAATGCTTATCCTACCAATACTCCTATGGAATTGGGTTTAAAGTTAAGTAAGCATGATGTTAGTGAAGCTTTTGATGCCACCATTTATAGAAGTTTGGTTGGAAGTTTAATGTATTTAACTACAACTAGACCTGATATTATGTTCTCGGTCAGTTTATTGAGTAGATTTATGACATCACCAAAGAGAAGTCATTGGGAAGCTGGAAAGAGAGTTCTTAGATACATTCTTGGAACTGTTGATCATGGAATCCACTATAAAAGGAATGTGGATAATGTTCTTGTTGGCTACAGTGATAGTGATTGGGGAGGAAATATTGATGATTTCAAAAGTACTTCTGGGTATATATTTAATATTGGTTTTAAAGCAGTTTCATGGGCATCAAAGAAGCAAGATGTTGTAGCATTGTCCACAACAGAAGCTGAATACATTTCTTTGTCTGTTGCTAGTTGTCAAGCACTTTGGCTAAGAAATGTACTACATGAATTGAAGTGTCCTCAAGAGAAAGGGACCATCATGTTCTGTGACAATCAATCATCTATTTCACTTTCGAAGAATCCCGTTTTTCATGGAAGAAGCAAACACATAAACATCAAATATCATTTCATCAGAGAATTGATCAAAGATGGAGAAGTATATATCAGGTATTGCAAGACTCAAGATCAAGTTGCAGACGTATTCACAAAAGCATTAAAGACAGATTCATTCTTGAAAATGAAAGAGAAGCTCGGAGTTTAG
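
The coding sequence above is the reading frame that the BLAST queries below refer to (Query Frame = 1). As a rough check, the first codons of the CDS can be translated and compared with the start of the query protein shown in the alignments (MSSNMLQPQLPC...). This is a minimal sketch, assuming the standard genetic code applies; `translate` is a hypothetical helper written for illustration, not a function of any particular library.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(cds):
    """Translate a nucleotide sequence in frame 1; '*' marks a stop codon.

    Trailing bases that do not form a full codon are ignored, and
    codons containing unexpected characters are reported as 'X'.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3)
    )

# First 36 bases and last 9 bases of the CDS listed in this record.
print(translate("ATGTCATCTAACATGTTGCAGCCTCAACTTCCTTGT"))
print(translate("GGAGTTTAG"))
```

The first call yields the opening residues of the query protein, and the second shows that the CDS ends in a stop codon after the final Gly-Val.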
BLAST of CSPI03G21050 vs. Swiss-Prot
Match: POLX_TOBAC (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum PE=2 SV=1)

HSP 1 Score: 297.0 bits (759), Expect = 9.5e-79
Identity = 205/573 (35.78%), Positives = 292/573 (50.96%), Query Frame = 1

Query: 252 QFFSNRGRGRS---SNRGRGRSGGRGDFSHIQ------CFNCRRYGHFQADCWSKK---- 311
           Q     GRGRS   S+   GRSG RG   +        C+NC + GHF+ DC + +    
Sbjct: 194 QALITEGRGRSYQRSSNNYGRSGARGKSKNRSKSRVRNCYNCNQPGHFKRDCPNPRKGKG 253

Query: 312 -TNSNQAETTLMHEQSNNDQGLLFLTLNVQE-----SSTEEIWYLDSGCSNHMTGRKDIF 371
            T+  + +        NND  +LF+  N +E     S  E  W +D+  S+H T  +D+F
Sbjct: 254 ETSGQKNDDNTAAMVQNNDNVVLFI--NEEEECMHLSGPESEWVVDTAASHHATPVRDLF 313

Query: 372 ISLDESHQNVVKTGDNKMLEVKGKGDILVKTKMGAKKIT-DVYYVSGLKHNLLSVGQLLL 431
                     VK G+    ++ G GDI +KT +G   +  DV +V  L+ NL+S   L  
Sbjct: 314 CRYVAGDFGTVKMGNTSYSKIAGIGDICIKTNVGCTLVLKDVRHVPDLRMNLISGIALDR 373

Query: 432 RGHDVIFKDNICEIRTKNGDLITKVCMTHNKMFPI--KICYEKLVCFETLVNDTSWLWHC 491
            G++  F +   + R   G L+    +    ++    +IC  +L   +  ++    LWH 
Sbjct: 374 DGYESYFANQ--KWRLTKGSLVIAKGVARGTLYRTNAEICQGELNAAQDEISVD--LWHK 433

Query: 492 RFGHLSFDTLSHMCQQHMVRGMSNIKKEDQLCEACVFRKHHRNSFPTGGSWRASKPLELV 551
           R GH+S   L  + ++ ++        +   C+ C+F K HR SF T  S R    L+LV
Sbjct: 434 RMGHMSEKGLQILAKKSLISYAKGTTVKP--CDYCLFGKQHRVSFQTS-SERKLNILDLV 493

Query: 552 HTDLCGPMRTTTHGGNRYFLTFIDDYSRKTWIYLLKEKSATFECFKTFKAMVENESNLKL 611
           ++D+CGPM   + GGN+YF+TFIDD SRK W+Y+LK K   F+ F+ F A+VE E+  KL
Sbjct: 494 YSDVCGPMEIESMGGNKYFVTFIDDASRKLWVYILKTKDQVFQVFQKFHALVERETGRKL 553

Query: 612 KSLRSDRGGEYIV--FADFLKENGIKHQKTVRRTPQQNGVAERKNRIIMELARSMLKAKK 671
           K LRSD GGEY    F ++   +GI+H+KTV  TPQ NGVAER NR I+E  RSML+  K
Sbjct: 554 KRLRSDNGGEYTSREFEEYCSSHGIRHEKTVPGTPQHNGVAERMNRTIVEKVRSMLRMAK 613

Query: 672 LPDQFWGDAVTCAVYLLNRAST------------------------------KSVQDEKR 731
           LP  FWG+AV  A YL+NR+ +                                V  E+R
Sbjct: 614 LPKSFWGEAVQTACYLINRSPSVPLAFEIPERVWTNKEVSYSHLKVFGCRAFAHVPKEQR 673

Query: 732 GKLDDKSEKCIFVGYSENSKAYRLYNPISKKVVISRDVKFDEAKLWQWNAPNEDQNPLHV 769
            KLDDKS  CIF+GY +    YRL++P+ KKV+ SRDV F E+++               
Sbjct: 674 TKLDDKSIPCIFIGYGDEEFGYRLWDPVKKKVIRSRDVVFRESEV-----------RTAA 733

BLAST of CSPI03G21050 vs. Swiss-Prot
Match: COPIA_DROME (Copia protein OS=Drosophila melanogaster GN=GIP PE=1 SV=3)

HSP 1 Score: 226.5 bits (576), Expect = 1.6e-57
Identity = 117/300 (39.00%), Positives = 191/300 (63.67%), Query Frame = 1

Query: 909  NEFKESMKKEFEMTDMGLLHYFLGIEVKQDDNEIAIFQKKYAKDLLKRFKMENAYPTNTP 968
            N FK  + ++F MTD+  + +F+GI ++  +++I + Q  Y K +L +F MEN    +TP
Sbjct: 1102 NNFKRYLMEKFRMTDLNEIKHFIGIRIEMQEDKIYLSQSAYVKKILSKFNMENCNAVSTP 1161

Query: 969  MELGLKLSKHDVSEAFDATIYRSLVGSLMYLTT-TRPDIMFSVSLLSRFMTSPKRSHWEA 1028
            +   +     +  E  + T  RSL+G LMY+   TRPD+  +V++LSR+ +      W+ 
Sbjct: 1162 LPSKINYELLNSDEDCN-TPCRSLIGCLMYIMLCTRPDLTTAVNILSRYSSKNNSELWQN 1221

Query: 1029 GKRVLRYILGTVDHGIHYKRNV--DNVLVGYSDSDWGGNIDDFKSTSGYIFNI-GFKAVS 1088
             KRVLRY+ GT+D  + +K+N+  +N ++GY DSDW G+  D KST+GY+F +  F  + 
Sbjct: 1222 LKRVLRYLKGTIDMKLIFKKNLAFENKIIGYVDSDWAGSEIDRKSTTGYLFKMFDFNLIC 1281

Query: 1089 WASKKQDVVALSTTEAEYISLSVASCQALWLRNVLHELKCPQEKGTIMFCDNQSSISLSK 1148
            W +K+Q+ VA S+TEAEY++L  A  +ALWL+ +L  +    E    ++ DNQ  IS++ 
Sbjct: 1282 WNTKRQNSVAASSTEAEYMALFEAVREALWLKFLLTSINIKLENPIKIYEDNQGCISIAN 1341

Query: 1149 NPVFHGRSKHINIKYHFIRELIKDGEVYIRYCKTQDQVADVFTKALKTDSFLKMKEKLGV 1205
            NP  H R+KHI+IKYHF RE +++  + + Y  T++Q+AD+FTK L    F+++++KLG+
Sbjct: 1342 NPSCHKRAKHIDIKYHFAREQVQNNVICLEYIPTENQLADIFTKPLPAARFVELRDKLGL 1400

BLAST of CSPI03G21050 vs. Swiss-Prot
Match: M810_ARATH (Uncharacterized mitochondrial protein AtMg00810 OS=Arabidopsis thaliana GN=AtMg00810 PE=4 SV=1)

HSP 1 Score: 142.5 bits (358), Expect = 3.0e-32
Identity = 72/200 (36.00%), Positives = 122/200 (61.00%), Query Frame = 1

Query: 915  MKKEFEMTDMGLLHYFLGIEVKQDDNEIAIFQKKYAKDLLKRFKMENAYPTNTPMELGLK 974
            +   F M D+G +HYFLGI++K   + + + Q KYA+ +L    M +  P +TP+ L L 
Sbjct: 27   LSSTFSMKDLGPVHYFLGIQIKTHPSGLFLSQTKYAEQILNNAGMLDCKPMSTPLPLKLN 86

Query: 975  LSKHDVSEAFDATIYRSLVGSLMYLTTTRPDIMFSVSLLSRFMTSPKRSHWEAGKRVLRY 1034
             S    ++  D + +RS+VG+L YLT TRPDI ++V+++ + M  P  + ++  KRVLRY
Sbjct: 87   -SSVSTAKYPDPSDFRSIVGALQYLTLTRPDISYAVNIVCQRMHEPTLADFDLLKRVLRY 146

Query: 1035 ILGTVDHGIHYKRNVDNVLVGYSDSDWGGNIDDFKSTSGYIFNIGFKAVSWASKKQDVVA 1094
            + GT+ HG++  +N    +  + DSDW G     +ST+G+   +G   +SW++K+Q  V+
Sbjct: 147  VKGTIFHGLYIHKNSKLNVQAFCDSDWAGCTSTRRSTTGFCTFLGCNIISWSAKRQPTVS 206

Query: 1095 LSTTEAEYISLSVASCQALW 1115
             S+TE EY +L++ + +  W
Sbjct: 207  RSSTETEYRALALTAAELTW 225

BLAST of CSPI03G21050 vs. Swiss-Prot
Match: YCH4_YEAST (Putative transposon Ty5-1 protein YCL074W OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=TY5A PE=5 SV=2)

HSP 1 Score: 104.0 bits (258), Expect = 1.2e-20
Identity = 67/198 (33.84%), Positives = 103/198 (52.02%), Query Frame = 1

Query: 912  KESMKKEFEMTDMGLLHYFLGIEVKQDDN-EIAIFQKKYAKDLLKRFKMENAYPTNTPME 971
            K+ + K + M D+G +  FLG+ + Q  N +I +  + Y        ++     T TP+ 
Sbjct: 105  KQELTKLYSMKDLGKVDKFLGLNIHQSSNGDITLSLQDYIAKAASESEINTFKLTQTPLC 164

Query: 972  LGLKLSKHDVSEAFDATIYRSLVGSLMYLTTT-RPDIMFSVSLLSRFMTSPKRSHWEAGK 1031
                L +       D T Y+S+VG L++   T RPDI + VSLLSRF+  P+  H E+ +
Sbjct: 165  NSKPLFETTSPHLKDITPYQSIVGQLLFCANTGRPDISYPVSLLSRFLREPRAIHLESAR 224

Query: 1032 RVLRYILGTVDHGIHYKRNVDNVLVGYSDSDWGGNIDDFKSTSGYIFNIGFKAVSWASKK 1091
            RVLRY+  T    + Y+      L  Y D+  G   D   ST GY+  +    V+W+SKK
Sbjct: 225  RVLRYLYTTRSMCLKYRSGSQLALTVYCDASHGAIHDLPHSTGGYVTLLAGAPVTWSSKK 284

Query: 1092 -QDVVALSTTEAEYISLS 1107
             + V+ + +TEAEYI+ S
Sbjct: 285  LKGVIPVPSTEAEYITAS 302

BLAST of CSPI03G21050 vs. Swiss-Prot
Match: YB21B_YEAST (Transposon Ty2-B Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=TY2B-B PE=3 SV=1)

HSP 1 Score: 71.6 bits (174), Expect = 6.5e-11
Identity = 81/318 (25.47%), Positives = 147/318 (46.23%), Query Frame = 1

Query: 914  SMKKEFEM-------TDMGLLHYFLGIEVKQDDNEIAIFQKKYAKDLLKRFKMENAYPTN 973
            ++KK+++        +D  + +  LG+E+K         + KY K  +++   E     N
Sbjct: 1460 TLKKQYDTKIINLGESDNEIQYDILGLEIKYQ-------RSKYMKLGMEKSLTEKLPKLN 1519

Query: 974  TPME-LGLKLSK-----HDVSE---AFDATIYRS-------LVGSLMYLTTT-RPDIMFS 1033
             P+   G KLS      H + +     D   Y+        L+G   Y+    R D+++ 
Sbjct: 1520 VPLNPKGKKLSAPGQPGHYIDQDELEIDEDEYKEKVHEMQKLIGLASYVGYKFRFDLLYY 1579

Query: 1034 VSLLSRFMTSPKRSHWEAGKRVLRYILGTVDHGIHYKRNV----DNVLVGYSDSDWGGNI 1093
            ++ L++ +  P R   +    +++++  T D  + + +N     DN LV  SD+ +G N 
Sbjct: 1580 INTLAQHILFPSRQVLDMTYELIQFMWDTRDKQLIWHKNKPTKPDNKLVAISDASYG-NQ 1639

Query: 1094 DDFKSTSGYIFNIGFKAVSWASKKQDVVALSTTEAEYISLSVASCQALWLRNVLHEL-KC 1153
              +KS  G IF +  K +   S K  +   STTEAE  ++S A      L +++ EL K 
Sbjct: 1640 PYYKSQIGNIFLLNGKVIGGKSTKASLTCTSTTEAEIHAVSEAIPLLNNLSHLVQELNKK 1699

Query: 1154 PQEKGTIMFCDNQSSISLSKNPVFHG-RSKHINIKYHFIRELIKDGEVYIRYCKTQDQVA 1202
            P  KG  +  D++S+IS+ K+      R++    K   +R+ +    +Y+ Y +T+  +A
Sbjct: 1700 PIIKG--LLTDSRSTISIIKSTNEEKFRNRFFGTKAMRLRDEVSGNNLYVYYIETKKNIA 1759

BLAST of CSPI03G21050 vs. TrEMBL
Match: A6YTD9_CUCME (Integrase OS=Cucumis melo subsp. melo PE=4 SV=1)

HSP 1 Score: 1424.8 bits (3687), Expect = 0.0e+00
Identity = 752/1323 (56.84%), Positives = 924/1323 (69.84%), Query Frame = 1

Query: 2    SSNMLQPQLPCFEGKNYRRWSHQMKVLYGSQDLWDIVDIGYSEPESENGLSAQQLNELRD 61
            + NMLQ QLP F GKN+ +WS QMKVLYGSQ+LWDIV+ GY+E E+++ L+ QQL ELR+
Sbjct: 4    NGNMLQHQLPRFSGKNFNQWSIQMKVLYGSQELWDIVERGYTEVENQSELTNQQLVELRE 63

Query: 62   ARKKDKKALFFIYQSVDENIFERISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFD 121
             R KDKKALFFIYQ+VDE I ERIS  ++AKAAWD L++ Y+GE+KVK++RLQ LR+EFD
Sbjct: 64   NRNKDKKALFFIYQAVDEFISERISTATSAKAAWDILRSTYQGEDKVKMIRLQALRSEFD 123

Query: 122  TIRMKDSETIEEFFNRVLLIVNQLRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKD 181
             I+MK++ETIEEFFN +L+IVN LRSNGE + DQR+VEKILRSM R++EHIVVAIEESKD
Sbjct: 124  CIKMKETETIEEFFNHILVIVNSLRSNGEEVGDQRVVEKILRSMPRKFEHIVVAIEESKD 183

Query: 182  LSTLSINSLMGSLQSHELRLKMFDSNPSEEAFHMQSSYRGRSNGRRGGRGGRGNGR---S 241
            LSTLSINSLMGSLQSHELRLK FD NP EEAF MQ+S+RG S GRRGG G RG GR   +
Sbjct: 184  LSTLSINSLMGSLQSHELRLKQFDVNP-EEAFQMQTSFRGGSRGRRGGHGRRGGGRNYDN 243

Query: 242  NVVTNTESESRDNQFFSNRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKK 301
                N+E+    +     RG GR    GR + GGRG+FS IQCFNCR+YGHFQADCW+ K
Sbjct: 244  RSGANSENSQESSSLSRGRGSGRRRGFGRNQGGGRGNFSQIQCFNCRKYGHFQADCWALK 303

Query: 302  TNSNQAETTLMHEQSNNDQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGRKDIFISLDES 361
                     +  EQ  ND+G+LFL  +VQ+                              
Sbjct: 304  NGVGNTTMNMHKEQKKNDEGILFLACSVQD------------------------------ 363

Query: 362  HQNVVK----TGDNKMLEVKGKGDILVKTKMGAKKITDVYYVSGLKHNLLSVGQLLLRGH 421
              NVVK     GDN  L+VKG+GDILVKTK   K++T+V+YV GLKHNLLS+GQLL RG 
Sbjct: 364  --NVVKPTCEDGDNTRLQVKGQGDILVKTKKRTKRVTNVFYVPGLKHNLLSIGQLLQRGL 423

Query: 422  DVIFKDNICEIRTKNGDLITKVCMTHNKMFPIKICYEKLVCFETLVNDTSWLWHCRFGHL 481
             V F+ +IC I+ +   LI+KV MT NKMFP+   Y ++ CF +++ D+SWLWH R+GHL
Sbjct: 424  KVSFEGDICAIKDQADVLISKVKMTANKMFPLNFTYGQISCFSSILKDSSWLWHFRYGHL 483

Query: 482  SFDTLSHMCQQHMVRGMSNIKKEDQLCEACVFRKHHRNSFPTGGSWRASKPLELVHTDLC 541
            +F +LS++C+ HMVR              C+  KHHR+SFPTG +WRASKPLEL+HTDLC
Sbjct: 484  NFKSLSYLCKNHMVR-------------VCILAKHHRDSFPTGKAWRASKPLELIHTDLC 543

Query: 542  GPMRTTTHGGNRYFLTFIDDYSRKTWIYLLKEKSATFECFKTFKAMVENESNLKLKSLRS 601
            GPMRTTT+GGNRYF+TFIDD+SRK WIY LKEKS    CFK+FKA  EN+S  K+K+LRS
Sbjct: 544  GPMRTTTNGGNRYFITFIDDFSRKLWIYFLKEKSEALVCFKSFKAFTENQSGYKIKTLRS 603

Query: 602  DRGGEYIVFADFLKENGIKHQKTVRRTPQQNGVAERKNRIIMELARSMLKAKKLPDQFWG 661
            DRGGEYIVF +F KE GI HQ T R T QQNGVAERKNR IME+ARSMLKAK LP++FWG
Sbjct: 604  DRGGEYIVFGNFFKEQGIHHQMTARMTTQQNGVAERKNRTIMEMARSMLKAKNLPNEFWG 663

Query: 662  DAVTCAVYLLNRASTKSVQ---------DEK---------------------RGKLDDKS 721
            DAV C VY+LNRA TKSV          DEK                     RGKLDDKS
Sbjct: 664  DAVACTVYILNRAPTKSVPGMTPYEAWCDEKPSVSHLKVFRSIAYSHIPNQLRGKLDDKS 723

Query: 722  EKCIFVGYSENSKAYRLYNPISKKVVISRDVKFDEAKLWQWNAP-NEDQNPLHVDMDGKK 781
            EKCI VGY+ENSKAYRLYNP+S+K++I+RDV F E + W WN   +E ++P HV+++  +
Sbjct: 724  EKCIMVGYNENSKAYRLYNPVSRKIIINRDVIFSEDESWNWNDDVDEAKSPFHVNINENE 783

Query: 782  DARDLELEVTQPLTSPSS--SHSTSDEETTPRKTRNIQEIYNTSRRILDEEHVDFALFAN 841
             A++LE    Q + S SS  S STS++E +PR+ R+IQEIYN + RI  +   +FALFA 
Sbjct: 784  VAQELEQAKIQAVESSSSSTSSSTSNDEISPRRMRSIQEIYNNTNRINVDHFANFALFAG 843

Query: 842  VDPAL---------------------------------------GVKWIYRTKLKQNGEV 901
            V P                                         GVKW+YRTKLK +G V
Sbjct: 844  VGPVTFDEAIQDEKWKIAMDQEIDAIRRNETWELMELPTNKQALGVKWVYRTKLKSDGNV 903

Query: 902  QKYKARLVVKGYKQKFGVDYEEVFAPPLQQKITGKFI-------------NVKSAFLNGY 961
            + YKARLVVKGYKQ++GVDYEE+FAP  + +     +             ++KSAFLNG+
Sbjct: 904  EIYKARLVVKGYKQEYGVDYEEIFAPVTRIETIRLILSLAAQNGWKVHQMDIKSAFLNGH 963

Query: 962  LEDEIYVEQPPGYAKIGEEN----------------------------KDGFRRCPYEHA 1021
            L+DEI+V QP GY + GEE                             K GFRRCPYEHA
Sbjct: 964  LKDEIFVAQPLGYVQRGEEEKVYKLKKALYGLKQAPRAWYSRIDSFFLKTGFRRCPYEHA 1023

Query: 1022 LHTKEDENEFKESMKKEFEMTDMGLLHYFLGIEVKQDDNEIAIFQKKYAKDLLKRFKMEN 1081
            L+ KED  ++ + +     M+DMGL+HYFLGIEV Q++ EI I Q+KYA DLLK+F+MEN
Sbjct: 1024 LYVKED--KYGKFLIVSLYMSDMGLIHYFLGIEVNQNEGEIVISQQKYAHDLLKKFRMEN 1083

Query: 1082 AYPTNTPMELGLKLSKHDVSEAFDATIYRSLVGSLMYLTTTRPDIMFSVSLLSRFMTSPK 1141
            A P NTPM+  LKL K D+ EA D ++YRSLVGSLMYLT TRPDI+F VS+LSRFMT+PK
Sbjct: 1084 ASPCNTPMDANLKLCKDDIGEAVDPSLYRSLVGSLMYLTATRPDILFVVSMLSRFMTNPK 1143

Query: 1142 RSHWEAGKRVLRYILGTVDHGIHYKRNVDNVLVGYSDSDWGGNIDDFKSTSGYIFNIGFK 1201
            RSHWEAGKRVLRYILGT++ GI+YK+  ++VL G+ DSDWGGN+DD +STSGY+F++G  
Sbjct: 1144 RSHWEAGKRVLRYILGTINFGIYYKKVSESVLFGFCDSDWGGNVDDHRSTSGYVFSMGSG 1203

Query: 1202 AVSWASKKQDVVALSTTEAEYISLSVASCQALWLRNVLHELKCPQEKGTIMFCDNQSSIS 1205
              SW SKKQ VV LSTTEAEYISL+ A CQALWLR +L ELKC Q+  T++FCDN S+I+
Sbjct: 1204 VFSWTSKKQSVVTLSTTEAEYISLAAAGCQALWLRWMLKELKCTQKCETVLFCDNGSAIA 1263

BLAST of CSPI03G21050 vs. TrEMBL
Match: Q9C536_ARATH (Copia-type polyprotein, putative OS=Arabidopsis thaliana GN=T18I24.5 PE=4 SV=1)

HSP 1 Score: 1111.7 bits (2874), Expect = 0.0e+00
Identity = 610/1321 (46.18%), Positives = 826/1321 (62.53%), Query Frame = 1

Query: 1    MSSNMLQPQLPCFEGKNYRRWSHQMKVLYGSQDLWDIVDIGYSEPESENGLSAQQLNELR 60
            M+SN +  Q+P     NY  WS +MK + G+ D+W+IV+ G+ EPE+E  LS  Q + LR
Sbjct: 1    MASNNVPFQVPVLTKSNYDNWSLRMKAILGAHDVWEIVEKGFIEPENEGSLSQTQKDGLR 60

Query: 61   DARKKDKKALFFIYQSVDENIFERISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEF 120
            D+RK+DKKAL  IYQ +DE+ FE++   ++AK AW+ L+  Y+G ++VK VRLQTLR EF
Sbjct: 61   DSRKRDKKALCLIYQGLDEDTFEKVVEATSAKEAWEKLRTSYKGADQVKKVRLQTLRGEF 120

Query: 121  DTIRMKDSETIEEFFNRVLLIVNQLRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESK 180
            + ++MK+ E + ++F+RVL + N L+ NGE ++D RI+EK+LRS+  ++EHIV  IEE+K
Sbjct: 121  EALQMKEGELVSDYFSRVLTVTNNLKRNGEKLDDVRIMEKVLRSLDLKFEHIVTVIEETK 180

Query: 181  DLSTLSINSLMGSLQSHELRLKMFDSNPSEEAFHMQ-------SSYRGRSNGR-RG-GRG 240
            DL  ++I  L+GSLQ++E + K  + +  E+  +MQ        SY+ R  G+ RG GRG
Sbjct: 181  DLEAMTIEQLLGSLQAYEEKKKKKE-DIVEQVLNMQITKEENGQSYQRRGGGQVRGRGRG 240

Query: 241  GRGNGRSNVVTNTESESRDNQFFSNRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQ 300
            G GNGR         E   NQ    RG   S  RG+G    R D S ++C+NC ++GH+ 
Sbjct: 241  GYGNGRGW----RPHEDNTNQ----RGENSSRGRGKGHPKSRYDKSSVKCYNCGKFGHYA 300

Query: 301  ADCWSKKTNSNQAETTLMHEQSNNDQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGRKDI 360
            ++C +      + +   + E+   +  LL  +    E      WYLDSG SNHM GRK +
Sbjct: 301  SECKAPSNKKFEEKANYVEEKIQEEDMLLMASYKKDEQEENHKWYLDSGASNHMCGRKSM 360

Query: 361  FISLDESHQNVVKTGDNKMLEVKGKGDILVKTKMGAKK-ITDVYYVSGLKHNLLSVGQLL 420
            F  LDES +  V  GD   +EVKGKG+IL++ K G  + I++VYY+  +K N+LS+GQLL
Sbjct: 361  FAELDESVRGNVALGDESKMEVKGKGNILIRLKNGDHQFISNVYYIPSMKTNILSLGQLL 420

Query: 421  LRGHDVIFKDNICEIRTKNGDLITKVCMTHNKMFPIKICYEKLVCFETLVNDTSWLWHCR 480
             +G+D+  KDN   IR +  +LITKV M+ N+MF + I  +   C +    + SWLWH R
Sbjct: 421  EKGYDIRLKDNNLSIRDQESNLITKVPMSKNRMFVLNIRNDIAQCLKMCYKEESWLWHLR 480

Query: 481  FGHLSFDTLSHMCQQHMVRGMSNIKKEDQLCEACVFRKHHRNSFPTGGSWRASKPLELVH 540
            FGHL+F  L  + ++ MVRG+  I   +Q+CE C+  K  + SFP   S RA KPLEL+H
Sbjct: 481  FGHLNFGGLELLSRKEMVRGLPCINHPNQVCEGCLLGKQFKMSFPKESSSRAQKPLELIH 540

Query: 541  TDLCGPMRTTTHGGNRYFLTFIDDYSRKTWIYLLKEKSATFECFKTFKAMVENESNLKLK 600
            TD+CGP++  + G + YFL FIDD+SRKTW+Y LKEKS  FE FK FKA VE ES L +K
Sbjct: 541  TDVCGPIKPKSLGKSNYFLLFIDDFSRKTWVYFLKEKSEVFEIFKKFKAHVEKESGLVIK 600

Query: 601  SLRSDRGGEYIV--FADFLKENGIKHQKTVRRTPQQNGVAERKNRIIMELARSMLKAKKL 660
            ++RSDRGGE+    F  + ++NGI+ Q TV R+PQQNGVAERKNR I+E+ARSMLK+K+L
Sbjct: 601  TMRSDRGGEFTSKEFLKYCEDNGIRRQLTVPRSPQQNGVAERKNRTILEMARSMLKSKRL 660

Query: 661  PDQFWGDAVTCAVYLLNRASTKSVQ------------------------------DEKRG 720
            P + W +AV CAVYLLNR+ TKSV                               DEKR 
Sbjct: 661  PKELWAEAVACAVYLLNRSPTKSVSGKTPQEAWSGRKPGVSHLRVFGSIAHAHVPDEKRS 720

Query: 721  KLDDKSEKCIFVGYSENSKAYRLYNPISKKVVISRDVKFDEAKLWQWNAPNEDQNPL-HV 780
            KLDDKSEK IF+GY  NSK Y+LYNP +KK +ISR++ FDE   W WN+  ED N   H 
Sbjct: 721  KLDDKSEKYIFIGYDNNSKGYKLYNPDTKKTIISRNIVFDEEGEWDWNSNEEDYNFFPHF 780

Query: 781  DMDGKKDARD--LELEVTQPLTSPSSSHSTSDEETTPRKTRNIQEIYNTSRRILDEE--- 840
            + D  +  R+     E T P TSP+SS    +E+  P   +   E   T R  +DEE   
Sbjct: 781  EEDKPEPTREEPPSEEPTTPPTSPTSSQ--IEEKCEPMDFQEAIE-KKTWRNAMDEEIKS 840

Query: 841  -----HVDFALFANVDPALGVKWIYRTKLKQNGEVQKYKARLVVKGYKQKFGVDYEEVFA 900
                   +     N   A+GVKW+Y+ K    GEV++YKARLV KGY Q+ G+DY+EVFA
Sbjct: 841  IQKNDTWELTSLPNGHKAIGVKWVYKAKKNSKGEVERYKARLVAKGYSQRAGIDYDEVFA 900

Query: 901  P-------------PLQQKITGKFINVKSAFLNGYLEDEIYVEQPPGYAKIGEENK---- 960
            P               Q K     ++VKSAFLNG LE+E+Y+EQP GY   GEE+K    
Sbjct: 901  PVARLETVRLIISLAAQNKWKIHQMDVKSAFLNGDLEEEVYIEQPQGYIVKGEEDKVLRL 960

Query: 961  ------------------------DGFRRCPYEHALHTKEDE------------------ 1020
                                      F +CPYEHAL+ K  +                  
Sbjct: 961  KKALYGLKQAPRAWNTRIDKYFKEKDFIKCPYEHALYIKIQKEDILIACLYVDDLIFTGN 1020

Query: 1021 -----NEFKESMKKEFEMTDMGLLHYFLGIEVKQDDNEIAIFQKKYAKDLLKRFKMENAY 1080
                  EFK+ M KEFEMTD+GL+ Y+LGIEVKQ+DN I I Q+ YAK++LK+FKM+++ 
Sbjct: 1021 NPSMFEEFKKEMTKEFEMTDIGLMSYYLGIEVKQEDNGIFITQEGYAKEVLKKFKMDDSN 1080

Query: 1081 PTNTPMELGLKLSKHDVSEAFDATIYRSLVGSLMYLTTTRPDIMFSVSLLSRFMTSPKRS 1140
            P  TPME G+KLSK +  E  D T ++SLVGSL YLT TRPDI+++V ++SR+M  P  +
Sbjct: 1081 PVCTPMECGIKLSKKEEGEGVDPTTFKSLVGSLRYLTCTRPDILYAVGVVSRYMEHPTTT 1140

Query: 1141 HWEAGKRVLRYILGTVDHGIHYKRNVDNVLVGYSDSDWGGNIDDFKSTSGYIFNIGFKAV 1200
            H++A KR+LRYI GTV+ G+HY    D  LVGYSDSDWGG++DD KSTSG++F IG  A 
Sbjct: 1141 HFKAAKRILRYIKGTVNFGLHYSTTSDYKLVGYSDSDWGGDVDDRKSTSGFVFYIGDTAF 1200

Query: 1201 SWASKKQDVVALSTTEAEYISLSVASCQALWLRNVLHELKCPQEKGTIMFCDNQSSISLS 1205
            +W SKKQ +V LST EAEY++ +   C A+WLRN+L EL  PQE+ T +F DN+S+I+L+
Sbjct: 1201 TWMSKKQPIVTLSTCEAEYVAATSCVCHAIWLRNLLKELSLPQEEPTKIFVDNKSAIALA 1260

BLAST of CSPI03G21050 vs. TrEMBL
Match: Q9SXB2_ARATH (T28P6.8 protein OS=Arabidopsis thaliana GN=T28P6.8 PE=4 SV=1)

HSP 1 Score: 1064.3 bits (2751), Expect = 1.1e-307
Identity = 602/1355 (44.43%), Positives = 821/1355 (60.59%), Query Frame = 1

Query: 1    MSSNMLQPQLPCFEGKNYRRWSHQMKVLYGSQDLWDIVDIGYSEPESENGLSAQQLNELR 60
            M+SN +  Q+P     NY  WS +MK + G+ D+W+IV+ G+ EPE+E  LS  Q + LR
Sbjct: 1    MASNNVPFQVPVLTKSNYDNWSLRMKAILGAHDVWEIVEKGFIEPENEGSLSQTQKDGLR 60

Query: 61   DARKKDKKALFFIYQSVDENIFERISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEF 120
            D+RK+DKKAL  IYQ +DE+ FE++   ++AK AW+ L+  Y+G ++VK VRLQTLR EF
Sbjct: 61   DSRKRDKKALCLIYQGLDEDTFEKVVEATSAKEAWEKLRTSYKGADQVKKVRLQTLRGEF 120

Query: 121  DTIRMKDSETIEEFFNRVLLIVNQLRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESK 180
            + ++MK+ E + ++F+RVL + N L+ NGE ++D RI+EK+LRS+  ++EHIV  IEE+K
Sbjct: 121  EALQMKEGELVSDYFSRVLTVTNNLKRNGEKLDDVRIMEKVLRSLDLKFEHIVTVIEETK 180

Query: 181  DLSTLSINSLMGSLQSHELRLKMFDSNPSEEAFHMQ-------SSYRGRSNGR-RG-GRG 240
            DL  ++I  L+GSLQ++E + K  + +  E+  +MQ        SY+ R  G+ RG GRG
Sbjct: 181  DLEAMTIEQLLGSLQAYEEKKKKKE-DIVEQVLNMQITKEENGQSYQRRGGGQVRGRGRG 240

Query: 241  GRGNGRSNVVTNTESESRDNQFFSNRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQ 300
            G GNGR         E   NQ    RG   S  RG+G    R D S ++C+NC ++GH+ 
Sbjct: 241  GYGNGRGW----RPHEDNTNQ----RGENSSRGRGKGHPKSRYDKSSVKCYNCGKFGHYA 300

Query: 301  ADCWSKKTNSNQAETTLMHEQSNNDQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGRKDI 360
            ++C +      + +   + E+   +  LL  +    E      WYLDSG SNHM GRK +
Sbjct: 301  SECKAPSNKKFEEKANYVEEKIQEEDMLLMASYKKDEQKENHKWYLDSGASNHMCGRKSM 360

Query: 361  FISLDESHQNVVKTGDNKMLEVKGKGDILVKTKMGAKK-ITDVYYVSGLKHNLLSVGQLL 420
            F  LDES +  V  GD   +EVKGKG+IL++ K G  + I++VYY+  +K N+LS+GQLL
Sbjct: 361  FAELDESVRGNVALGDESKMEVKGKGNILIRLKNGDHQFISNVYYIPSMKTNILSLGQLL 420

Query: 421  LRGHDVIFKDNICEIRTKNGDLITKVCMTHNKMFPIKICYEKLVCFETLVNDTSWLWHCR 480
             +G+D+  KDN   IR +  +LITKV M+ N+MF + I  +   C +    + SWLWH R
Sbjct: 421  EKGYDIRLKDNNLSIRDQESNLITKVPMSKNRMFVLNIRNDIAQCLKMCYKEESWLWHLR 480

Query: 481  FGHLSFDTLSHMCQQHMVRGMSNIKKEDQLCEACVFRKHHRNSFPTGGSWRASKPLELVH 540
            FGHL+F  L  + ++ MVRG+  I   +Q+CE C+  K  + SFP   S RA KPLEL+H
Sbjct: 481  FGHLNFGGLELLSRKEMVRGLPCINHPNQVCEGCLLGKQFKMSFPKESSSRAQKPLELIH 540

Query: 541  TDLCGPMRTTTHGGNRYFLTFIDDYSRKTWIYLLKEKSATFECFKTFKAMVENESNLKLK 600
            TD+CGP++  + G + YFL FIDD+SRKTW+Y LKEKS  FE FK FKA VE ES L +K
Sbjct: 541  TDVCGPIKPKSLGKSNYFLLFIDDFSRKTWVYFLKEKSEVFEIFKKFKAHVEKESGLVIK 600

Query: 601  SLRSDRGGEYIV--FADFLKENGIKHQKTVRRTPQQNGVAERKNRIIMELARSMLKAKKL 660
            ++RSDRGGE+    F  + ++NGI+ Q TV R+PQQNGV ERKNR I+E+ARSMLK+K+L
Sbjct: 601  TMRSDRGGEFTSKEFLKYCEDNGIRRQLTVPRSPQQNGVVERKNRTILEMARSMLKSKRL 660

Query: 661  PDQFWGDAVTCAVYLLNRASTKSVQ------------------------------DEKRG 720
            P + W +AV CAVYLLNR+ TKSV                               DEKR 
Sbjct: 661  PKELWAEAVACAVYLLNRSPTKSVSGKTPQEAWSGRKPGVSHLRVFGSIAHAHVPDEKRS 720

Query: 721  KLDDKSEKCIFVGYSENSKAYRLYNP------ISKKVVISRDVKFDEAKLWQWNAPNEDQ 780
            KLDDKSEK IF+GY  NSK Y+LYNP      IS+ +V   + ++D    W  N  + + 
Sbjct: 721  KLDDKSEKYIFIGYDNNSKGYKLYNPDTKKTIISRNIVFDEEGEWD----WNSNEEDYNF 780

Query: 781  NPLHVDMDGKKDARDL--ELEVTQPLTS-------PSSSHSTSD--------EETTPRKT 840
             P H + D  +  R+     E T P TS        SSS  T          E T  ++ 
Sbjct: 781  FP-HFEEDEPEPTREEPPSEEPTTPPTSPTSSQIEESSSERTPRFRSIQELYEVTENQEN 840

Query: 841  RNIQEIY--------------NTSRRILDEEHV--------DFALFANVDPALGVKWIYR 900
              +  ++               T R  +DEE          +     N   A+GVKW+Y+
Sbjct: 841  LTLFCLFAECEPMDFQKAIEKKTWRNAMDEEIKSIQKNDTWELTSLPNGHKAIGVKWVYK 900

Query: 901  TKLKQNGEVQKYKARLVVKGYKQKFGVDYEEVFAP-------------PLQQKITGKFIN 960
             K    GEV++YKARLV KGY Q+ G+DY+EVFAP               Q K     ++
Sbjct: 901  AKKNSKGEVERYKARLVAKGYSQRVGIDYDEVFAPVARLETVRLIISLAAQNKWKIHQMD 960

Query: 961  VKSAFLNGYLEDEIYVEQPPGYAKIGEENK----------------------------DG 1020
            VKSAFLNG LE+E+Y+EQP GY   GEE+K                              
Sbjct: 961  VKSAFLNGDLEEEVYIEQPQGYIVKGEEDKVLRLKKVLYGLKQAPRAWNTRIDKYFKEKD 1020

Query: 1021 FRRCPYEHALHTKEDE-----------------------NEFKESMKKEFEMTDMGLLHY 1080
            F +CPYEHAL+ K  +                        EFK+ M KEFEMTD+GL+ Y
Sbjct: 1021 FIKCPYEHALYIKIQKEDILIACLYVDDLIFTGNNPSIFEEFKKEMTKEFEMTDIGLMSY 1080

Query: 1081 FLGIEVKQDDNEIAIFQKKYAKDLLKRFKMENAYPTNTPMELGLKLSKHDVSEAFDATIY 1140
            +LGIEVKQ+DN I I Q+ YAK++LK+FKM+++ P  TPME G+KLSK +  E  D T +
Sbjct: 1081 YLGIEVKQEDNGIFITQEGYAKEVLKKFKMDDSNPVCTPMECGIKLSKKEEGEGVDPTTF 1140

Query: 1141 RSLVGSLMYLTTTRPDIMFSVSLLSRFMTSPKRSHWEAGKRVLRYILGTVDHGIHYKRNV 1200
            +SLVGSL YLT TRPDI+++V ++SR+M  P  +H++A KR+LRYI GTV+ G+HY    
Sbjct: 1141 KSLVGSLRYLTCTRPDILYAVGVVSRYMEHPTTTHFKAAKRILRYIKGTVNFGLHYSTTS 1200

Query: 1201 DNVLVGYSDSDWGGNIDDFKSTSGYIFNIGFKAVSWASKKQDVVALSTTEAEYISLSVAS 1205
            D  LVGYSDSDWGG++DD KSTSG++F IG  A +W SKKQ +V LST EAEY++ +   
Sbjct: 1201 DYKLVGYSDSDWGGDVDDRKSTSGFVFYIGDTAFTWMSKKQPIVTLSTCEAEYVAATSCV 1260

BLAST of CSPI03G21050 vs. TrEMBL
Match: Q9M2D1_ARATH (Copia-type polyprotein OS=Arabidopsis thaliana GN=T20K12.230 PE=4 SV=1)

HSP 1 Score: 1063.1 bits (2748), Expect = 2.4e-307
Identity = 601/1355 (44.35%), Positives = 822/1355 (60.66%), Query Frame = 1

Query: 1    MSSNMLQPQLPCFEGKNYRRWSHQMKVLYGSQDLWDIVDIGYSEPESENGLSAQQLNELR 60
            M+SN +  Q+P     NY  WS +MK + G+ D+W+IV+ G+ EPE+E  LS  Q + LR
Sbjct: 1    MASNNVPFQVPVLTKSNYDNWSLRMKAILGAHDVWEIVEKGFIEPENEGSLSQTQKDGLR 60

Query: 61   DARKKDKKALFFIYQSVDENIFERISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEF 120
            D+RK+DKKAL  IYQ +DE+ FE++   ++AK AW+ L+  Y+G ++VK VRLQTLR EF
Sbjct: 61   DSRKRDKKALCLIYQGLDEDTFEKVVEATSAKEAWEKLRTSYKGADQVKKVRLQTLRGEF 120

Query: 121  DTIRMKDSETIEEFFNRVLLIVNQLRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESK 180
            + ++MK+ E + ++F+RVL + N L+ NGE ++D RI+EK+LRS+  ++EHIV  IEE+K
Sbjct: 121  EALQMKEGELVSDYFSRVLTVTNNLKRNGEKLDDVRIMEKVLRSLDLKFEHIVTVIEETK 180

Query: 181  DLSTLSINSLMGSLQSHELRLKMFDSNPSEEAFHMQ-------SSYRGRSNGR-RG-GRG 240
            DL  ++I  L+GSLQ++E + K  + + +E+  +MQ        SY+ R  G+ RG GRG
Sbjct: 181  DLEAMTIEQLLGSLQAYEEKKKKKE-DIAEQVLNMQITKEENGQSYQRRGGGQVRGRGRG 240

Query: 241  GRGNGRSNVVTNTESESRDNQFFSNRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQ 300
            G GNGR         E   NQ    RG   S  RG+G    R D S ++C+NC ++GH+ 
Sbjct: 241  GYGNGRGW----RPHEDNTNQ----RGENSSRGRGKGHPKSRYDKSSVKCYNCGKFGHYA 300

Query: 301  ADCWSKKTNSNQAETTLMHEQSNNDQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGRKDI 360
            ++C +      + +   + E+   +  LL  +    E      WYLDSG SNHM GRK +
Sbjct: 301  SECKAPSNKKFEEKAHYVEEKIQEEDMLLMASYKKDEQKENHKWYLDSGASNHMCGRKSM 360

Query: 361  FISLDESHQNVVKTGDNKMLEVKGKGDILVKTKMGAKK-ITDVYYVSGLKHNLLSVGQLL 420
            F  LDES +  V  GD   +EVKGKG+IL++ K G  + I++VYY+  +K N+LS+GQLL
Sbjct: 361  FAELDESVRGNVALGDESKMEVKGKGNILIRLKNGDHQFISNVYYIPSMKTNILSLGQLL 420

Query: 421  LRGHDVIFKDNICEIRTKNGDLITKVCMTHNKMFPIKICYEKLVCFETLVNDTSWLWHCR 480
             +G+D+  KDN   IR +  +LITKV M+ N+MF + I  +   C +    + SWLWH R
Sbjct: 421  EKGYDIRLKDNNLSIRDQESNLITKVPMSKNRMFVLNIRNDIAQCLKMCYKEESWLWHLR 480

Query: 481  FGHLSFDTLSHMCQQHMVRGMSNIKKEDQLCEACVFRKHHRNSFPTGGSWRASKPLELVH 540
            FGHL+F  L  + ++ MVRG+  I   +Q+CE C+  K  + SFP   S RA KPLEL+H
Sbjct: 481  FGHLNFGGLELLSRKEMVRGLPCINHPNQVCEGCLLGKQFKMSFPKESSSRAQKPLELIH 540

Query: 541  TDLCGPMRTTTHGGNRYFLTFIDDYSRKTWIYLLKEKSATFECFKTFKAMVENESNLKLK 600
            TD+CGP++  + G + YFL FIDD+SRKTW+Y LKEKS  FE FK FKA VE ES L +K
Sbjct: 541  TDVCGPIKPKSLGKSNYFLLFIDDFSRKTWVYFLKEKSEVFEIFKKFKAHVEKESGLVIK 600

Query: 601  SLRSDRGGEYIV--FADFLKENGIKHQKTVRRTPQQNGVAERKNRIIMELARSMLKAKKL 660
            ++RSDRGGE+    F  + ++NGI+ Q TV R+PQQNGV ERKNR I+E+ARSMLK+K+L
Sbjct: 601  TMRSDRGGEFTSKEFLKYCEDNGIRRQLTVPRSPQQNGVVERKNRTILEMARSMLKSKRL 660

Query: 661  PDQFWGDAVTCAVYLLNRASTKSVQ------------------------------DEKRG 720
            P + W +AV CAVYLLNR+ TKSV                               DEKR 
Sbjct: 661  PKELWAEAVACAVYLLNRSPTKSVSGKTPQEAWSGRKPGVSHLRVFGSIAHAHVPDEKRS 720

Query: 721  KLDDKSEKCIFVGYSENSKAYRLYNP------ISKKVVISRDVKFDEAKLWQWNAPNEDQ 780
            KLDDKSEK IF+GY  NSK Y+LYNP      IS+ +V   + ++D    W  N  + + 
Sbjct: 721  KLDDKSEKYIFIGYDNNSKGYKLYNPDTKKTIISRNIVFDEEGEWD----WNSNEEDYNF 780

Query: 781  NPLHVDMDGKKDARDL--ELEVTQPLTS-------PSSSHSTSD--------EETTPRKT 840
             P H + D  +  R+     E T P TS        SSS  T          E T  ++ 
Sbjct: 781  FP-HFEEDEPEPTREEPPSEEPTTPPTSPTSSQIEESSSERTPRFRSIQELYEVTENQEN 840

Query: 841  RNIQEIY--------------NTSRRILDEEHV--------DFALFANVDPALGVKWIYR 900
              +  ++               T R  +DEE          +     N   A+GVKW+Y+
Sbjct: 841  LTLFCLFAECEPMDFQKAIEKKTWRNAMDEEIKSIQKNDTWELTSLPNGHKAIGVKWVYK 900

Query: 901  TKLKQNGEVQKYKARLVVKGYKQKFGVDYEEVFAP-------------PLQQKITGKFIN 960
             K    GEV++YKARLV KGY Q+ G+DY+EVFAP               Q K     ++
Sbjct: 901  AKKNSKGEVERYKARLVAKGYSQRVGIDYDEVFAPVARLETVRLIISLAAQNKWKIHQMD 960

Query: 961  VKSAFLNGYLEDEIYVEQPPGYAKIGEENK----------------------------DG 1020
            VKSAFLNG LE+E+Y+EQP GY   GEE+K                              
Sbjct: 961  VKSAFLNGDLEEEVYIEQPQGYIVKGEEDKVLRLKKVLYGLKQAPRAWNTRIDKYFKEKD 1020

Query: 1021 FRRCPYEHALHTKEDE-----------------------NEFKESMKKEFEMTDMGLLHY 1080
            F +CPYEHAL+ K  +                        EFK+ M KEFEMTD+GL+ Y
Sbjct: 1021 FIKCPYEHALYIKIQKEDILIACLYVDDLIFTGNNPSIFEEFKKEMTKEFEMTDIGLMSY 1080

Query: 1081 FLGIEVKQDDNEIAIFQKKYAKDLLKRFKMENAYPTNTPMELGLKLSKHDVSEAFDATIY 1140
            +LGIEVKQ+DN I I Q+ YAK++LK+FK++++ P  TPME G+KLSK +  E  D T +
Sbjct: 1081 YLGIEVKQEDNGIFITQEGYAKEVLKKFKIDDSNPVCTPMECGIKLSKKEEGEGVDPTTF 1140

Query: 1141 RSLVGSLMYLTTTRPDIMFSVSLLSRFMTSPKRSHWEAGKRVLRYILGTVDHGIHYKRNV 1200
            +SLVGSL YLT TRPDI+++V ++SR+M  P  +H++A KR+LRYI GTV+ G+HY    
Sbjct: 1141 KSLVGSLRYLTCTRPDILYAVGVVSRYMEHPTTTHFKAAKRILRYIKGTVNFGLHYSTTS 1200

Query: 1201 DNVLVGYSDSDWGGNIDDFKSTSGYIFNIGFKAVSWASKKQDVVALSTTEAEYISLSVAS 1205
            D  LVGYSDSDWGG++DD KSTSG++F IG  A +W SKKQ +V LST EAEY++ +   
Sbjct: 1201 DYKLVGYSDSDWGGDVDDRKSTSGFVFYIGDTAFTWMSKKQPIVTLSTCEAEYVAATSCV 1260

BLAST of CSPI03G21050 vs. TrEMBL
Match: A0A059QBK0_PHAVU (Polyprotein OS=Phaseolus vulgaris PE=4 SV=1)

HSP 1 Score: 1040.0 bits (2688), Expect = 2.2e-300
Identity = 574/1350 (42.52%), Positives = 808/1350 (59.85%), Query Frame = 1

Query: 18   YRRWSHQMKVLYGSQDLWDIVDIGYSEPESENGLSAQQLNELRDARKKDKKALFFIYQSV 77
            Y  WS QMK L GSQD W++V+ G+ EP +  G +A Q   L++ R KDK AL+ +Y++V
Sbjct: 19   YDNWSIQMKALLGSQDSWEVVEEGFEEPTNTTGYTAAQTKALKEMRSKDKAALYMLYRAV 78

Query: 78   DENIFERISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFDTIRMKDSETIEEFFNR 137
            DE IFE+I+G ST+K AWD L+ +++G ++VK VRLQTLR E + ++M +SE++ ++  R
Sbjct: 79   DEAIFEKIAGASTSKEAWDILEKVFKGADRVKQVRLQTLRGELENMKMMESESVSDYITR 138

Query: 138  VLLIVNQLRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLSINSLMGSLQSH 197
            V  +VNQL  NGET+ D R+VEKILR++T  +E IV AIEESKDL+TL+++ L GSL++H
Sbjct: 139  VQAVVNQLNRNGETLTDARVVEKILRTLTDNFESIVCAIEESKDLATLTVDELAGSLEAH 198

Query: 198  ELRLKMFDSNPSEEA-------------FHMQSSYRGRSNGRRG-GRGGRGNGRSNVVTN 257
            E R K       E+A             +H  S YRGR  G RG GRGG+G+        
Sbjct: 199  EQRKKKKKEETLEQALQTKASIKDEKVLYHQNSQYRGRGRGSRGNGRGGKGSNHEGYYKE 258

Query: 258  TESESRDNQFFSNRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKK----- 317
             E  S+ N     RGRGR    GRGR GGR ++S+I+C+ C +YGH+  DC S K     
Sbjct: 259  KEQSSQPNW----RGRGR----GRGR-GGRSNYSNIECYKCHKYGHYAKDCNSDKCYNCG 318

Query: 318  ----------TNSNQAETTLMHEQSNNDQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGR 377
                       +    ETT +  +   ++G+L +  +    + + +WYLDSG SNHM G 
Sbjct: 319  KVGHFAKDCRADIKIEETTNLALEVETNEGVLLMAQDEVNINNDTLWYLDSGASNHMCGH 378

Query: 378  KDIFISLDESHQNVVKTGDNKMLEVKGKGDILVKTKMGA-KKITDVYYVSGLKHNLLSVG 437
            + +F  + +     V  GD   +EVKG+G +    K G    + DVYYV  LK N+LS+G
Sbjct: 379  EYLFKDMQKIEDGHVSFGDASKVEVKGRGTVCYLQKDGLIGSLQDVYYVPDLKTNILSMG 438

Query: 438  QLLLRGHDVIFKDNICEIRTKNGDLITKVCMTHNKMFPIKICYEKLVCFETLVNDTSWLW 497
            QL  +G+ +  KD    ++ K G L+ ++ M  N+M+ + +   +  C +  + D + LW
Sbjct: 439  QLTEKGYSIFLKDRFLHLKNKQGCLVARIEMARNRMYKLNLRSIREKCLQVNIEDKASLW 498

Query: 498  HCRFGHLSFDTLSHMCQQHMVRGMSNIKKEDQLCEACVFRKHHRNSFPTGGSWRASKPLE 557
            H RFGHL    L  + +++MV G+ N+  E + CE CV  KH R SFP    + A +PLE
Sbjct: 499  HLRFGHLHHGGLKELAKKNMVHGLPNMDYEGKFCEECVLSKHVRTSFPKKAQYWAKQPLE 558

Query: 558  LVHTDLCGPMRTTTHGGNRYFLTFIDDYSRKTWIYLLKEKSATFECFKTFKAMVENESNL 617
            L+HTD+CGP+   +  G RYF+TFIDD+SRKTW+Y LKEKS  FE FK FK MVE  ++ 
Sbjct: 559  LIHTDICGPITPESFSGKRYFITFIDDFSRKTWVYFLKEKSEAFEVFKKFKVMVERTTDK 618

Query: 618  KLKSLRSDRGGEYI--VFADFLKENGIKHQKTVRRTPQQNGVAERKNRIIMELARSMLKA 677
            ++K++RSDRGGEY    F ++ +E GI+   T   TPQQNGVAERKNR I+++ RSMLK+
Sbjct: 619  QIKAVRSDRGGEYTSTTFMEYCEEQGIRRFLTAPYTPQQNGVAERKNRTILDMVRSMLKS 678

Query: 678  KKLPDQFWGDAVTCAVYLLNRA------------------------------STKSVQDE 737
            KK+P +FW +AV CA+Y+ NR                               +   V D+
Sbjct: 679  KKMPKEFWAEAVQCAIYVQNRCPHVKLDDQTPQEAWSGQKPTVSHLKVFGSVAYAHVPDQ 738

Query: 738  KRGKLDDKSEKCIFVGYSENSKAYRLYNPISKKVVISRDVKFDEAKLWQWNAPNEDQNPL 797
            +R KL+DKS++ +F+GY E +K Y+L +PISKKV +SRDV+ +EA  W WN         
Sbjct: 739  RRTKLEDKSKRYVFIGYDEKTKGYKLLDPISKKVTVSRDVQINEASEWDWN--------- 798

Query: 798  HVDMDGKKDARDLELEVTQPLTSPSSSHS-TSDEETTPR--KTRNIQEIYNTSRRI---- 857
                    ++ ++ +EV +  +SP+S +S T+D+E  PR  K R++ ++Y+++  +    
Sbjct: 799  --------NSSEVMIEVGE--SSPTSINSETTDDEDEPRQPKIRSLHDLYDSTNEVHLVC 858

Query: 858  --LDEEHVDF------------------ALFAN----------VDPALGVKWIYRTKLKQ 917
               D E++ F                  A+  N              +GVKWI++ K+  
Sbjct: 859  LLADAENISFEEAVRDKKWQTAMDEEIKAIDRNNTWELTELPEGSQPIGVKWIFKKKMNA 918

Query: 918  NGEVQKYKARLVVKGYKQKFGVDYEEVFAPPLQQKITGKFI-------------NVKSAF 977
             GE+++YKARLV KGYKQK G+DY+EVFAP ++ +     I             +VKSAF
Sbjct: 919  QGEIERYKARLVAKGYKQKEGIDYDEVFAPVVRMETIRLLISQAAQFKWPIFQMDVKSAF 978

Query: 978  LNGYLEDEIYVEQPPGYAKIGEENK----------------------------DGFRRCP 1037
            LNG LE+E+Y+EQPPGY KIGEE K                            +GF++CP
Sbjct: 979  LNGVLEEEVYIEQPPGYMKIGEEKKVLKLKKALYGLKQAPRAWNTRIDTYFKENGFKQCP 1038

Query: 1038 YEHALHTKEDE-----------------------NEFKESMKKEFEMTDMGLLHYFLGIE 1097
            YEHAL+ K +                         EFK +M++EFEMTD+GL+ +FLG+E
Sbjct: 1039 YEHALYAKNNGGNMIFVALYVDDLIFMGNNNDMIEEFKGTMRREFEMTDLGLMKFFLGLE 1098

Query: 1098 VKQDDNEIAIFQKKYAKDLLKRFKMENAYPTNTPMELGLKLSKHDVSEAFDATIYRSLVG 1157
            V+Q +  I + Q+KYAK++LK++KMEN  P + PME G KLSK D  E  DA+ YRSLVG
Sbjct: 1099 VRQKETGIFVSQEKYAKEILKKYKMENCNPVSIPMEPGAKLSKFDGGERVDASRYRSLVG 1158

Query: 1158 SLMYLTTTRPDIMFSVSLLSRFMTSPKRSHWEAGKRVLRYILGTVDHGIHYKRNVDNVLV 1205
            SL YLT TRPD+  SV ++SRFM  P  SHW+A KRVLRYI GTV  G+ Y +  D  LV
Sbjct: 1159 SLRYLTCTRPDLSLSVGIISRFMEEPVYSHWKALKRVLRYIQGTVSLGLFYSKAEDYKLV 1218

BLAST of CSPI03G21050 vs. TAIR10
Match: AT4G23160.1 (AT4G23160.1 cysteine-rich RLK (RECEPTOR-like protein kinase) 8)

HSP 1 Score: 199.1 bits (505), Expect = 1.5e-50
Identity = 104/256 (40.62%), Positives = 153/256 (59.77%), Query Frame = 1

Query: 909  NEFKESMKKEFEMTDMGLLHYFLGIEVKQDDNEIAIFQKKYAKDLLKRFKMENAYPTNTP 968
            +E K  +K  F++ D+G L YFLG+E+ +    I I Q+KYA DLL    +    P++ P
Sbjct: 298  DELKSQLKSCFKLRDLGPLKYFLGLEIARSAAGINICQRKYALDLLDETGLLGCKPSSVP 357

Query: 969  MELGLKLSKHDVSEAFDATIYRSLVGSLMYLTTTRPDIMFSVSLLSRFMTSPKRSHWEAG 1028
            M+  +  S H   +  DA  YR L+G LMYL  TR DI F+V+ LS+F  +P+ +H +A 
Sbjct: 358  MDPSVTFSAHSGGDFVDAKAYRRLIGRLMYLQITRLDISFAVNKLSQFSEAPRLAHQQAV 417

Query: 1029 KRVLRYILGTVDHGIHYKRNVDNVLVGYSDSDWGGNIDDFKSTSGYIFNIGFKAVSWASK 1088
             ++L YI GTV  G+ Y    +  L  +SD+ +    D  +ST+GY   +G   +SW SK
Sbjct: 418  MKILHYIKGTVGQGLFYSSQAEMQLQVFSDASFQSCKDTRRSTNGYCMFLGTSLISWKSK 477

Query: 1089 KQDVVALSTTEAEYISLSVASCQALWLRNVLHELKCPQEKGTIMFCDNQSSISLSKNPVF 1148
            KQ VV+ S+ EAEY +LS A+ + +WL     EL+ P  K T++FCDN ++I ++ N VF
Sbjct: 478  KQQVVSKSSAEAEYRALSFATDEMMWLAQFFRELQLPLSKPTLLFCDNTAAIHIATNAVF 537

Query: 1149 HGRSKHINIKYHFIRE 1165
            H R+KHI    H +RE
Sbjct: 538  HERTKHIESDCHSVRE 553

BLAST of CSPI03G21050 vs. TAIR10
Match: ATMG00810.1 (ATMG00810.1 DNA/RNA polymerases superfamily protein)

HSP 1 Score: 142.5 bits (358), Expect = 1.7e-33
Identity = 72/200 (36.00%), Positives = 122/200 (61.00%), Query Frame = 1

Query: 915  MKKEFEMTDMGLLHYFLGIEVKQDDNEIAIFQKKYAKDLLKRFKMENAYPTNTPMELGLK 974
            +   F M D+G +HYFLGI++K   + + + Q KYA+ +L    M +  P +TP+ L L 
Sbjct: 27   LSSTFSMKDLGPVHYFLGIQIKTHPSGLFLSQTKYAEQILNNAGMLDCKPMSTPLPLKLN 86

Query: 975  LSKHDVSEAFDATIYRSLVGSLMYLTTTRPDIMFSVSLLSRFMTSPKRSHWEAGKRVLRY 1034
             S    ++  D + +RS+VG+L YLT TRPDI ++V+++ + M  P  + ++  KRVLRY
Sbjct: 87   -SSVSTAKYPDPSDFRSIVGALQYLTLTRPDISYAVNIVCQRMHEPTLADFDLLKRVLRY 146

Query: 1035 ILGTVDHGIHYKRNVDNVLVGYSDSDWGGNIDDFKSTSGYIFNIGFKAVSWASKKQDVVA 1094
            + GT+ HG++  +N    +  + DSDW G     +ST+G+   +G   +SW++K+Q  V+
Sbjct: 147  VKGTIFHGLYIHKNSKLNVQAFCDSDWAGCTSTRRSTTGFCTFLGCNIISWSAKRQPTVS 206

Query: 1095 LSTTEAEYISLSVASCQALW 1115
             S+TE EY +L++ + +  W
Sbjct: 207  RSSTETEYRALALTAAELTW 225

BLAST of CSPI03G21050 vs. TAIR10
Match: AT1G48720.1 (AT1G48720.1 unknown protein)

HSP 1 Score: 92.8 bits (229), Expect = 1.5e-18
Identity = 43/92 (46.74%), Positives = 63/92 (68.48%), Query Frame = 1

Query: 1  MSSNMLQPQLPCFEGKNYRRWSHQMKVLYGSQDLWDIVDIGYSEPESENGLSAQQLNELR 60
          M+SN +  Q+P     NY  WS +MK + G+ D+W+IV+ G+ EPE+E  LS  Q + LR
Sbjct: 1  MASNNVPFQVPVLTKSNYDNWSLRMKAILGAHDVWEIVEKGFIEPENEGSLSQTQKDGLR 60

Query: 61 DARKKDKKALFFIYQSVDENIFERISGVSTAK 93
          D+RK+DKKAL  IYQ +DE+ FE++   ++AK
Sbjct: 61 DSRKRDKKALCLIYQGLDEDTFEKVVEATSAK 92

BLAST of CSPI03G21050 vs. TAIR10
Match: ATMG00240.1 (ATMG00240.1 Gag-Pol-related retrotransposon family protein)

HSP 1 Score: 65.1 bits (157), Expect = 3.4e-10
Identity = 31/78 (39.74%), Positives = 48/78 (61.54%), Query Frame = 1

Query: 997  MYLTTTRPDIMFSVSLLSRFMTSPKRSHWEAGKRVLRYILGTVDHGIHYKRNVDNVLVGY 1056
            MYLT TRPD+ F+V+ LS+F ++ + +  +A  +VL Y+ GTV  G+ Y    D  L  +
Sbjct: 1    MYLTITRPDLTFAVNRLSQFSSASRTAQMQAVYKVLHYVKGTVGQGLFYSATSDLQLKAF 60

Query: 1057 SDSDWGGNIDDFKSTSGY 1075
            +DSDW    D  +S +G+
Sbjct: 61   ADSDWASCPDTRRSVTGF 78

BLAST of CSPI03G21050 vs. TAIR10
Match: AT3G21000.1 (AT3G21000.1 Gag-Pol-related retrotransposon family protein)

HSP 1 Score: 61.2 bits (147), Expect = 4.9e-09
Identity = 50/221 (22.62%), Positives = 102/221 (46.15%), Query Frame = 1

Query: 17  NYRRWSHQMKVLYGSQDLWDIVDIGYSEPESENG-----LSAQQLNELRDARKKDKKALF 76
           +Y  W+   K     Q LWD+V  G  +  S+N      +  ++L++ RD   KD KAL 
Sbjct: 16  DYEIWAPITKSTLIEQGLWDVVVNGVPQDPSKNPELAATIQPEELSKWRDFVVKDAKALQ 75

Query: 77  FIYQSVDENIFERISGVSTAKAAWDALQNLYEGEEKVKLVRLQ-----TLRAEFDTIRMK 136
            +  S+ +++F +    S+AK  WD L+   +G E+  + RL+      L  + + ++M 
Sbjct: 76  ILQSSLTDSVFRKTLSASSAKDVWDLLR---KGNEQATIRRLEQVTIRRLEKQLEDLKMV 135

Query: 137 DSETIEEFFNRVLLIVNQLRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLS 196
           D E+   + ++ L I+ +L        D  I + +  +++  ++ +   +EE  D+  ++
Sbjct: 136 DKESGSSYLDKALEILERLGRAKLEKSDYEICKNVFTTLSGSFDGLDSMLEELIDVHKMT 195

Query: 197 INSLMGSLQSHELRLKMFDSNPSEEAFHMQSSYRGRSNGRR 228
             SL+          ++ +S+  E  F +    R +S   +
Sbjct: 196 SKSLV-----EYFYYRVHESSTEEAIFGLLKDLRLKSKSEK 228

BLAST of CSPI03G21050 vs. NCBI nr
Match: gi|150036244|gb|ABR67407.1| (integrase [Cucumis melo subsp. melo])

HSP 1 Score: 1424.8 bits (3687), Expect = 0.0e+00
Identity = 752/1323 (56.84%), Positives = 924/1323 (69.84%), Query Frame = 1

Query: 2    SSNMLQPQLPCFEGKNYRRWSHQMKVLYGSQDLWDIVDIGYSEPESENGLSAQQLNELRD 61
            + NMLQ QLP F GKN+ +WS QMKVLYGSQ+LWDIV+ GY+E E+++ L+ QQL ELR+
Sbjct: 4    NGNMLQHQLPRFSGKNFNQWSIQMKVLYGSQELWDIVERGYTEVENQSELTNQQLVELRE 63

Query: 62   ARKKDKKALFFIYQSVDENIFERISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFD 121
             R KDKKALFFIYQ+VDE I ERIS  ++AKAAWD L++ Y+GE+KVK++RLQ LR+EFD
Sbjct: 64   NRNKDKKALFFIYQAVDEFISERISTATSAKAAWDILRSTYQGEDKVKMIRLQALRSEFD 123

Query: 122  TIRMKDSETIEEFFNRVLLIVNQLRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKD 181
             I+MK++ETIEEFFN +L+IVN LRSNGE + DQR+VEKILRSM R++EHIVVAIEESKD
Sbjct: 124  CIKMKETETIEEFFNHILVIVNSLRSNGEEVGDQRVVEKILRSMPRKFEHIVVAIEESKD 183

Query: 182  LSTLSINSLMGSLQSHELRLKMFDSNPSEEAFHMQSSYRGRSNGRRGGRGGRGNGR---S 241
            LSTLSINSLMGSLQSHELRLK FD NP EEAF MQ+S+RG S GRRGG G RG GR   +
Sbjct: 184  LSTLSINSLMGSLQSHELRLKQFDVNP-EEAFQMQTSFRGGSRGRRGGHGRRGGGRNYDN 243

Query: 242  NVVTNTESESRDNQFFSNRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKK 301
                N+E+    +     RG GR    GR + GGRG+FS IQCFNCR+YGHFQADCW+ K
Sbjct: 244  RSGANSENSQESSSLSRGRGSGRRRGFGRNQGGGRGNFSQIQCFNCRKYGHFQADCWALK 303

Query: 302  TNSNQAETTLMHEQSNNDQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGRKDIFISLDES 361
                     +  EQ  ND+G+LFL  +VQ+                              
Sbjct: 304  NGVGNTTMNMHKEQKKNDEGILFLACSVQD------------------------------ 363

Query: 362  HQNVVK----TGDNKMLEVKGKGDILVKTKMGAKKITDVYYVSGLKHNLLSVGQLLLRGH 421
              NVVK     GDN  L+VKG+GDILVKTK   K++T+V+YV GLKHNLLS+GQLL RG 
Sbjct: 364  --NVVKPTCEDGDNTRLQVKGQGDILVKTKKRTKRVTNVFYVPGLKHNLLSIGQLLQRGL 423

Query: 422  DVIFKDNICEIRTKNGDLITKVCMTHNKMFPIKICYEKLVCFETLVNDTSWLWHCRFGHL 481
             V F+ +IC I+ +   LI+KV MT NKMFP+   Y ++ CF +++ D+SWLWH R+GHL
Sbjct: 424  KVSFEGDICAIKDQADVLISKVKMTANKMFPLNFTYGQISCFSSILKDSSWLWHFRYGHL 483

Query: 482  SFDTLSHMCQQHMVRGMSNIKKEDQLCEACVFRKHHRNSFPTGGSWRASKPLELVHTDLC 541
            +F +LS++C+ HMVR              C+  KHHR+SFPTG +WRASKPLEL+HTDLC
Sbjct: 484  NFKSLSYLCKNHMVR-------------VCILAKHHRDSFPTGKAWRASKPLELIHTDLC 543

Query: 542  GPMRTTTHGGNRYFLTFIDDYSRKTWIYLLKEKSATFECFKTFKAMVENESNLKLKSLRS 601
            GPMRTTT+GGNRYF+TFIDD+SRK WIY LKEKS    CFK+FKA  EN+S  K+K+LRS
Sbjct: 544  GPMRTTTNGGNRYFITFIDDFSRKLWIYFLKEKSEALVCFKSFKAFTENQSGYKIKTLRS 603

Query: 602  DRGGEYIVFADFLKENGIKHQKTVRRTPQQNGVAERKNRIIMELARSMLKAKKLPDQFWG 661
            DRGGEYIVF +F KE GI HQ T R T QQNGVAERKNR IME+ARSMLKAK LP++FWG
Sbjct: 604  DRGGEYIVFGNFFKEQGIHHQMTARMTTQQNGVAERKNRTIMEMARSMLKAKNLPNEFWG 663

Query: 662  DAVTCAVYLLNRASTKSVQ---------DEK---------------------RGKLDDKS 721
            DAV C VY+LNRA TKSV          DEK                     RGKLDDKS
Sbjct: 664  DAVACTVYILNRAPTKSVPGMTPYEAWCDEKPSVSHLKVFRSIAYSHIPNQLRGKLDDKS 723

Query: 722  EKCIFVGYSENSKAYRLYNPISKKVVISRDVKFDEAKLWQWNAP-NEDQNPLHVDMDGKK 781
            EKCI VGY+ENSKAYRLYNP+S+K++I+RDV F E + W WN   +E ++P HV+++  +
Sbjct: 724  EKCIMVGYNENSKAYRLYNPVSRKIIINRDVIFSEDESWNWNDDVDEAKSPFHVNINENE 783

Query: 782  DARDLELEVTQPLTSPSS--SHSTSDEETTPRKTRNIQEIYNTSRRILDEEHVDFALFAN 841
             A++LE    Q + S SS  S STS++E +PR+ R+IQEIYN + RI  +   +FALFA 
Sbjct: 784  VAQELEQAKIQAVESSSSSTSSSTSNDEISPRRMRSIQEIYNNTNRINVDHFANFALFAG 843

Query: 842  VDPAL---------------------------------------GVKWIYRTKLKQNGEV 901
            V P                                         GVKW+YRTKLK +G V
Sbjct: 844  VGPVTFDEAIQDEKWKIAMDQEIDAIRRNETWELMELPTNKQALGVKWVYRTKLKSDGNV 903

Query: 902  QKYKARLVVKGYKQKFGVDYEEVFAPPLQQKITGKFI-------------NVKSAFLNGY 961
            + YKARLVVKGYKQ++GVDYEE+FAP  + +     +             ++KSAFLNG+
Sbjct: 904  EIYKARLVVKGYKQEYGVDYEEIFAPVTRIETIRLILSLAAQNGWKVHQMDIKSAFLNGH 963

Query: 962  LEDEIYVEQPPGYAKIGEEN----------------------------KDGFRRCPYEHA 1021
            L+DEI+V QP GY + GEE                             K GFRRCPYEHA
Sbjct: 964  LKDEIFVAQPLGYVQRGEEEKVYKLKKALYGLKQAPRAWYSRIDSFFLKTGFRRCPYEHA 1023

Query: 1022 LHTKEDENEFKESMKKEFEMTDMGLLHYFLGIEVKQDDNEIAIFQKKYAKDLLKRFKMEN 1081
            L+ KED  ++ + +     M+DMGL+HYFLGIEV Q++ EI I Q+KYA DLLK+F+MEN
Sbjct: 1024 LYVKED--KYGKFLIVSLYMSDMGLIHYFLGIEVNQNEGEIVISQQKYAHDLLKKFRMEN 1083

Query: 1082 AYPTNTPMELGLKLSKHDVSEAFDATIYRSLVGSLMYLTTTRPDIMFSVSLLSRFMTSPK 1141
            A P NTPM+  LKL K D+ EA D ++YRSLVGSLMYLT TRPDI+F VS+LSRFMT+PK
Sbjct: 1084 ASPCNTPMDANLKLCKDDIGEAVDPSLYRSLVGSLMYLTATRPDILFVVSMLSRFMTNPK 1143

Query: 1142 RSHWEAGKRVLRYILGTVDHGIHYKRNVDNVLVGYSDSDWGGNIDDFKSTSGYIFNIGFK 1201
            RSHWEAGKRVLRYILGT++ GI+YK+  ++VL G+ DSDWGGN+DD +STSGY+F++G  
Sbjct: 1144 RSHWEAGKRVLRYILGTINFGIYYKKVSESVLFGFCDSDWGGNVDDHRSTSGYVFSMGSG 1203

Query: 1202 AVSWASKKQDVVALSTTEAEYISLSVASCQALWLRNVLHELKCPQEKGTIMFCDNQSSIS 1205
              SW SKKQ VV LSTTEAEYISL+ A CQALWLR +L ELKC Q+  T++FCDN S+I+
Sbjct: 1204 VFSWTSKKQSVVTLSTTEAEYISLAAAGCQALWLRWMLKELKCTQKCETVLFCDNGSAIA 1263

BLAST of CSPI03G21050 vs. NCBI nr
Match: gi|12321254|gb|AAG50698.1|AC079604_5 (copia-type polyprotein, putative [Arabidopsis thaliana])

HSP 1 Score: 1111.7 bits (2874), Expect = 0.0e+00
Identity = 610/1321 (46.18%), Positives = 826/1321 (62.53%), Query Frame = 1

Query: 1    MSSNMLQPQLPCFEGKNYRRWSHQMKVLYGSQDLWDIVDIGYSEPESENGLSAQQLNELR 60
            M+SN +  Q+P     NY  WS +MK + G+ D+W+IV+ G+ EPE+E  LS  Q + LR
Sbjct: 1    MASNNVPFQVPVLTKSNYDNWSLRMKAILGAHDVWEIVEKGFIEPENEGSLSQTQKDGLR 60

Query: 61   DARKKDKKALFFIYQSVDENIFERISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEF 120
            D+RK+DKKAL  IYQ +DE+ FE++   ++AK AW+ L+  Y+G ++VK VRLQTLR EF
Sbjct: 61   DSRKRDKKALCLIYQGLDEDTFEKVVEATSAKEAWEKLRTSYKGADQVKKVRLQTLRGEF 120

Query: 121  DTIRMKDSETIEEFFNRVLLIVNQLRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESK 180
            + ++MK+ E + ++F+RVL + N L+ NGE ++D RI+EK+LRS+  ++EHIV  IEE+K
Sbjct: 121  EALQMKEGELVSDYFSRVLTVTNNLKRNGEKLDDVRIMEKVLRSLDLKFEHIVTVIEETK 180

Query: 181  DLSTLSINSLMGSLQSHELRLKMFDSNPSEEAFHMQ-------SSYRGRSNGR-RG-GRG 240
            DL  ++I  L+GSLQ++E + K  + +  E+  +MQ        SY+ R  G+ RG GRG
Sbjct: 181  DLEAMTIEQLLGSLQAYEEKKKKKE-DIVEQVLNMQITKEENGQSYQRRGGGQVRGRGRG 240

Query: 241  GRGNGRSNVVTNTESESRDNQFFSNRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQ 300
            G GNGR         E   NQ    RG   S  RG+G    R D S ++C+NC ++GH+ 
Sbjct: 241  GYGNGRGW----RPHEDNTNQ----RGENSSRGRGKGHPKSRYDKSSVKCYNCGKFGHYA 300

Query: 301  ADCWSKKTNSNQAETTLMHEQSNNDQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGRKDI 360
            ++C +      + +   + E+   +  LL  +    E      WYLDSG SNHM GRK +
Sbjct: 301  SECKAPSNKKFEEKANYVEEKIQEEDMLLMASYKKDEQEENHKWYLDSGASNHMCGRKSM 360

Query: 361  FISLDESHQNVVKTGDNKMLEVKGKGDILVKTKMGAKK-ITDVYYVSGLKHNLLSVGQLL 420
            F  LDES +  V  GD   +EVKGKG+IL++ K G  + I++VYY+  +K N+LS+GQLL
Sbjct: 361  FAELDESVRGNVALGDESKMEVKGKGNILIRLKNGDHQFISNVYYIPSMKTNILSLGQLL 420

Query: 421  LRGHDVIFKDNICEIRTKNGDLITKVCMTHNKMFPIKICYEKLVCFETLVNDTSWLWHCR 480
             +G+D+  KDN   IR +  +LITKV M+ N+MF + I  +   C +    + SWLWH R
Sbjct: 421  EKGYDIRLKDNNLSIRDQESNLITKVPMSKNRMFVLNIRNDIAQCLKMCYKEESWLWHLR 480

Query: 481  FGHLSFDTLSHMCQQHMVRGMSNIKKEDQLCEACVFRKHHRNSFPTGGSWRASKPLELVH 540
            FGHL+F  L  + ++ MVRG+  I   +Q+CE C+  K  + SFP   S RA KPLEL+H
Sbjct: 481  FGHLNFGGLELLSRKEMVRGLPCINHPNQVCEGCLLGKQFKMSFPKESSSRAQKPLELIH 540

Query: 541  TDLCGPMRTTTHGGNRYFLTFIDDYSRKTWIYLLKEKSATFECFKTFKAMVENESNLKLK 600
            TD+CGP++  + G + YFL FIDD+SRKTW+Y LKEKS  FE FK FKA VE ES L +K
Sbjct: 541  TDVCGPIKPKSLGKSNYFLLFIDDFSRKTWVYFLKEKSEVFEIFKKFKAHVEKESGLVIK 600

Query: 601  SLRSDRGGEYIV--FADFLKENGIKHQKTVRRTPQQNGVAERKNRIIMELARSMLKAKKL 660
            ++RSDRGGE+    F  + ++NGI+ Q TV R+PQQNGVAERKNR I+E+ARSMLK+K+L
Sbjct: 601  TMRSDRGGEFTSKEFLKYCEDNGIRRQLTVPRSPQQNGVAERKNRTILEMARSMLKSKRL 660

Query: 661  PDQFWGDAVTCAVYLLNRASTKSVQ------------------------------DEKRG 720
            P + W +AV CAVYLLNR+ TKSV                               DEKR 
Sbjct: 661  PKELWAEAVACAVYLLNRSPTKSVSGKTPQEAWSGRKPGVSHLRVFGSIAHAHVPDEKRS 720

Query: 721  KLDDKSEKCIFVGYSENSKAYRLYNPISKKVVISRDVKFDEAKLWQWNAPNEDQNPL-HV 780
            KLDDKSEK IF+GY  NSK Y+LYNP +KK +ISR++ FDE   W WN+  ED N   H 
Sbjct: 721  KLDDKSEKYIFIGYDNNSKGYKLYNPDTKKTIISRNIVFDEEGEWDWNSNEEDYNFFPHF 780

Query: 781  DMDGKKDARD--LELEVTQPLTSPSSSHSTSDEETTPRKTRNIQEIYNTSRRILDEE--- 840
            + D  +  R+     E T P TSP+SS    +E+  P   +   E   T R  +DEE   
Sbjct: 781  EEDKPEPTREEPPSEEPTTPPTSPTSSQ--IEEKCEPMDFQEAIE-KKTWRNAMDEEIKS 840

Query: 841  -----HVDFALFANVDPALGVKWIYRTKLKQNGEVQKYKARLVVKGYKQKFGVDYEEVFA 900
                   +     N   A+GVKW+Y+ K    GEV++YKARLV KGY Q+ G+DY+EVFA
Sbjct: 841  IQKNDTWELTSLPNGHKAIGVKWVYKAKKNSKGEVERYKARLVAKGYSQRAGIDYDEVFA 900

Query: 901  P-------------PLQQKITGKFINVKSAFLNGYLEDEIYVEQPPGYAKIGEENK---- 960
            P               Q K     ++VKSAFLNG LE+E+Y+EQP GY   GEE+K    
Sbjct: 901  PVARLETVRLIISLAAQNKWKIHQMDVKSAFLNGDLEEEVYIEQPQGYIVKGEEDKVLRL 960

Query: 961  ------------------------DGFRRCPYEHALHTKEDE------------------ 1020
                                      F +CPYEHAL+ K  +                  
Sbjct: 961  KKALYGLKQAPRAWNTRIDKYFKEKDFIKCPYEHALYIKIQKEDILIACLYVDDLIFTGN 1020

Query: 1021 -----NEFKESMKKEFEMTDMGLLHYFLGIEVKQDDNEIAIFQKKYAKDLLKRFKMENAY 1080
                  EFK+ M KEFEMTD+GL+ Y+LGIEVKQ+DN I I Q+ YAK++LK+FKM+++ 
Sbjct: 1021 NPSMFEEFKKEMTKEFEMTDIGLMSYYLGIEVKQEDNGIFITQEGYAKEVLKKFKMDDSN 1080

Query: 1081 PTNTPMELGLKLSKHDVSEAFDATIYRSLVGSLMYLTTTRPDIMFSVSLLSRFMTSPKRS 1140
            P  TPME G+KLSK +  E  D T ++SLVGSL YLT TRPDI+++V ++SR+M  P  +
Sbjct: 1081 PVCTPMECGIKLSKKEEGEGVDPTTFKSLVGSLRYLTCTRPDILYAVGVVSRYMEHPTTT 1140

Query: 1141 HWEAGKRVLRYILGTVDHGIHYKRNVDNVLVGYSDSDWGGNIDDFKSTSGYIFNIGFKAV 1200
            H++A KR+LRYI GTV+ G+HY    D  LVGYSDSDWGG++DD KSTSG++F IG  A 
Sbjct: 1141 HFKAAKRILRYIKGTVNFGLHYSTTSDYKLVGYSDSDWGGDVDDRKSTSGFVFYIGDTAF 1200

Query: 1201 SWASKKQDVVALSTTEAEYISLSVASCQALWLRNVLHELKCPQEKGTIMFCDNQSSISLS 1205
            +W SKKQ +V LST EAEY++ +   C A+WLRN+L EL  PQE+ T +F DN+S+I+L+
Sbjct: 1201 TWMSKKQPIVTLSTCEAEYVAATSCVCHAIWLRNLLKELSLPQEEPTKIFVDNKSAIALA 1260

BLAST of CSPI03G21050 vs. NCBI nr
Match: gi|5734736|gb|AAD50001.1|AC007259_14 (Hypothetical protein [Arabidopsis thaliana])

HSP 1 Score: 1064.3 bits (2751), Expect = 1.6e-307
Identity = 602/1355 (44.43%), Positives = 821/1355 (60.59%), Query Frame = 1

Query: 1    MSSNMLQPQLPCFEGKNYRRWSHQMKVLYGSQDLWDIVDIGYSEPESENGLSAQQLNELR 60
            M+SN +  Q+P     NY  WS +MK + G+ D+W+IV+ G+ EPE+E  LS  Q + LR
Sbjct: 1    MASNNVPFQVPVLTKSNYDNWSLRMKAILGAHDVWEIVEKGFIEPENEGSLSQTQKDGLR 60

Query: 61   DARKKDKKALFFIYQSVDENIFERISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEF 120
            D+RK+DKKAL  IYQ +DE+ FE++   ++AK AW+ L+  Y+G ++VK VRLQTLR EF
Sbjct: 61   DSRKRDKKALCLIYQGLDEDTFEKVVEATSAKEAWEKLRTSYKGADQVKKVRLQTLRGEF 120

Query: 121  DTIRMKDSETIEEFFNRVLLIVNQLRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESK 180
            + ++MK+ E + ++F+RVL + N L+ NGE ++D RI+EK+LRS+  ++EHIV  IEE+K
Sbjct: 121  EALQMKEGELVSDYFSRVLTVTNNLKRNGEKLDDVRIMEKVLRSLDLKFEHIVTVIEETK 180

Query: 181  DLSTLSINSLMGSLQSHELRLKMFDSNPSEEAFHMQ-------SSYRGRSNGR-RG-GRG 240
            DL  ++I  L+GSLQ++E + K  + +  E+  +MQ        SY+ R  G+ RG GRG
Sbjct: 181  DLEAMTIEQLLGSLQAYEEKKKKKE-DIVEQVLNMQITKEENGQSYQRRGGGQVRGRGRG 240

Query: 241  GRGNGRSNVVTNTESESRDNQFFSNRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQ 300
            G GNGR         E   NQ    RG   S  RG+G    R D S ++C+NC ++GH+ 
Sbjct: 241  GYGNGRGW----RPHEDNTNQ----RGENSSRGRGKGHPKSRYDKSSVKCYNCGKFGHYA 300

Query: 301  ADCWSKKTNSNQAETTLMHEQSNNDQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGRKDI 360
            ++C +      + +   + E+   +  LL  +    E      WYLDSG SNHM GRK +
Sbjct: 301  SECKAPSNKKFEEKANYVEEKIQEEDMLLMASYKKDEQKENHKWYLDSGASNHMCGRKSM 360

Query: 361  FISLDESHQNVVKTGDNKMLEVKGKGDILVKTKMGAKK-ITDVYYVSGLKHNLLSVGQLL 420
            F  LDES +  V  GD   +EVKGKG+IL++ K G  + I++VYY+  +K N+LS+GQLL
Sbjct: 361  FAELDESVRGNVALGDESKMEVKGKGNILIRLKNGDHQFISNVYYIPSMKTNILSLGQLL 420

Query: 421  LRGHDVIFKDNICEIRTKNGDLITKVCMTHNKMFPIKICYEKLVCFETLVNDTSWLWHCR 480
             +G+D+  KDN   IR +  +LITKV M+ N+MF + I  +   C +    + SWLWH R
Sbjct: 421  EKGYDIRLKDNNLSIRDQESNLITKVPMSKNRMFVLNIRNDIAQCLKMCYKEESWLWHLR 480

Query: 481  FGHLSFDTLSHMCQQHMVRGMSNIKKEDQLCEACVFRKHHRNSFPTGGSWRASKPLELVH 540
            FGHL+F  L  + ++ MVRG+  I   +Q+CE C+  K  + SFP   S RA KPLEL+H
Sbjct: 481  FGHLNFGGLELLSRKEMVRGLPCINHPNQVCEGCLLGKQFKMSFPKESSSRAQKPLELIH 540

Query: 541  TDLCGPMRTTTHGGNRYFLTFIDDYSRKTWIYLLKEKSATFECFKTFKAMVENESNLKLK 600
            TD+CGP++  + G + YFL FIDD+SRKTW+Y LKEKS  FE FK FKA VE ES L +K
Sbjct: 541  TDVCGPIKPKSLGKSNYFLLFIDDFSRKTWVYFLKEKSEVFEIFKKFKAHVEKESGLVIK 600

Query: 601  SLRSDRGGEYIV--FADFLKENGIKHQKTVRRTPQQNGVAERKNRIIMELARSMLKAKKL 660
            ++RSDRGGE+    F  + ++NGI+ Q TV R+PQQNGV ERKNR I+E+ARSMLK+K+L
Sbjct: 601  TMRSDRGGEFTSKEFLKYCEDNGIRRQLTVPRSPQQNGVVERKNRTILEMARSMLKSKRL 660

Query: 661  PDQFWGDAVTCAVYLLNRASTKSVQ------------------------------DEKRG 720
            P + W +AV CAVYLLNR+ TKSV                               DEKR 
Sbjct: 661  PKELWAEAVACAVYLLNRSPTKSVSGKTPQEAWSGRKPGVSHLRVFGSIAHAHVPDEKRS 720

Query: 721  KLDDKSEKCIFVGYSENSKAYRLYNP------ISKKVVISRDVKFDEAKLWQWNAPNEDQ 780
            KLDDKSEK IF+GY  NSK Y+LYNP      IS+ +V   + ++D    W  N  + + 
Sbjct: 721  KLDDKSEKYIFIGYDNNSKGYKLYNPDTKKTIISRNIVFDEEGEWD----WNSNEEDYNF 780

Query: 781  NPLHVDMDGKKDARDL--ELEVTQPLTS-------PSSSHSTSD--------EETTPRKT 840
             P H + D  +  R+     E T P TS        SSS  T          E T  ++ 
Sbjct: 781  FP-HFEEDEPEPTREEPPSEEPTTPPTSPTSSQIEESSSERTPRFRSIQELYEVTENQEN 840

Query: 841  RNIQEIY--------------NTSRRILDEEHV--------DFALFANVDPALGVKWIYR 900
              +  ++               T R  +DEE          +     N   A+GVKW+Y+
Sbjct: 841  LTLFCLFAECEPMDFQKAIEKKTWRNAMDEEIKSIQKNDTWELTSLPNGHKAIGVKWVYK 900

Query: 901  TKLKQNGEVQKYKARLVVKGYKQKFGVDYEEVFAP-------------PLQQKITGKFIN 960
             K    GEV++YKARLV KGY Q+ G+DY+EVFAP               Q K     ++
Sbjct: 901  AKKNSKGEVERYKARLVAKGYSQRVGIDYDEVFAPVARLETVRLIISLAAQNKWKIHQMD 960

Query: 961  VKSAFLNGYLEDEIYVEQPPGYAKIGEENK----------------------------DG 1020
            VKSAFLNG LE+E+Y+EQP GY   GEE+K                              
Sbjct: 961  VKSAFLNGDLEEEVYIEQPQGYIVKGEEDKVLRLKKVLYGLKQAPRAWNTRIDKYFKEKD 1020

Query: 1021 FRRCPYEHALHTKEDE-----------------------NEFKESMKKEFEMTDMGLLHY 1080
            F +CPYEHAL+ K  +                        EFK+ M KEFEMTD+GL+ Y
Sbjct: 1021 FIKCPYEHALYIKIQKEDILIACLYVDDLIFTGNNPSIFEEFKKEMTKEFEMTDIGLMSY 1080

Query: 1081 FLGIEVKQDDNEIAIFQKKYAKDLLKRFKMENAYPTNTPMELGLKLSKHDVSEAFDATIY 1140
            +LGIEVKQ+DN I I Q+ YAK++LK+FKM+++ P  TPME G+KLSK +  E  D T +
Sbjct: 1081 YLGIEVKQEDNGIFITQEGYAKEVLKKFKMDDSNPVCTPMECGIKLSKKEEGEGVDPTTF 1140

Query: 1141 RSLVGSLMYLTTTRPDIMFSVSLLSRFMTSPKRSHWEAGKRVLRYILGTVDHGIHYKRNV 1200
            +SLVGSL YLT TRPDI+++V ++SR+M  P  +H++A KR+LRYI GTV+ G+HY    
Sbjct: 1141 KSLVGSLRYLTCTRPDILYAVGVVSRYMEHPTTTHFKAAKRILRYIKGTVNFGLHYSTTS 1200

Query: 1201 DNVLVGYSDSDWGGNIDDFKSTSGYIFNIGFKAVSWASKKQDVVALSTTEAEYISLSVAS 1205
            D  LVGYSDSDWGG++DD KSTSG++F IG  A +W SKKQ +V LST EAEY++ +   
Sbjct: 1201 DYKLVGYSDSDWGGDVDDRKSTSGFVFYIGDTAFTWMSKKQPIVTLSTCEAEYVAATSCV 1260

BLAST of CSPI03G21050 vs. NCBI nr
Match: gi|6850900|emb|CAB71063.1| (copia-type polyprotein [Arabidopsis thaliana])

HSP 1 Score: 1063.1 bits (2748), Expect = 3.5e-307
Identity = 601/1355 (44.35%), Positives = 822/1355 (60.66%), Query Frame = 1

Query: 1    MSSNMLQPQLPCFEGKNYRRWSHQMKVLYGSQDLWDIVDIGYSEPESENGLSAQQLNELR 60
            M+SN +  Q+P     NY  WS +MK + G+ D+W+IV+ G+ EPE+E  LS  Q + LR
Sbjct: 1    MASNNVPFQVPVLTKSNYDNWSLRMKAILGAHDVWEIVEKGFIEPENEGSLSQTQKDGLR 60

Query: 61   DARKKDKKALFFIYQSVDENIFERISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEF 120
            D+RK+DKKAL  IYQ +DE+ FE++   ++AK AW+ L+  Y+G ++VK VRLQTLR EF
Sbjct: 61   DSRKRDKKALCLIYQGLDEDTFEKVVEATSAKEAWEKLRTSYKGADQVKKVRLQTLRGEF 120

Query: 121  DTIRMKDSETIEEFFNRVLLIVNQLRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESK 180
            + ++MK+ E + ++F+RVL + N L+ NGE ++D RI+EK+LRS+  ++EHIV  IEE+K
Sbjct: 121  EALQMKEGELVSDYFSRVLTVTNNLKRNGEKLDDVRIMEKVLRSLDLKFEHIVTVIEETK 180

Query: 181  DLSTLSINSLMGSLQSHELRLKMFDSNPSEEAFHMQ-------SSYRGRSNGR-RG-GRG 240
            DL  ++I  L+GSLQ++E + K  + + +E+  +MQ        SY+ R  G+ RG GRG
Sbjct: 181  DLEAMTIEQLLGSLQAYEEKKKKKE-DIAEQVLNMQITKEENGQSYQRRGGGQVRGRGRG 240

Query: 241  GRGNGRSNVVTNTESESRDNQFFSNRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQ 300
            G GNGR         E   NQ    RG   S  RG+G    R D S ++C+NC ++GH+ 
Sbjct: 241  GYGNGRGW----RPHEDNTNQ----RGENSSRGRGKGHPKSRYDKSSVKCYNCGKFGHYA 300

Query: 301  ADCWSKKTNSNQAETTLMHEQSNNDQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGRKDI 360
            ++C +      + +   + E+   +  LL  +    E      WYLDSG SNHM GRK +
Sbjct: 301  SECKAPSNKKFEEKAHYVEEKIQEEDMLLMASYKKDEQKENHKWYLDSGASNHMCGRKSM 360

Query: 361  FISLDESHQNVVKTGDNKMLEVKGKGDILVKTKMGAKK-ITDVYYVSGLKHNLLSVGQLL 420
            F  LDES +  V  GD   +EVKGKG+IL++ K G  + I++VYY+  +K N+LS+GQLL
Sbjct: 361  FAELDESVRGNVALGDESKMEVKGKGNILIRLKNGDHQFISNVYYIPSMKTNILSLGQLL 420

Query: 421  LRGHDVIFKDNICEIRTKNGDLITKVCMTHNKMFPIKICYEKLVCFETLVNDTSWLWHCR 480
             +G+D+  KDN   IR +  +LITKV M+ N+MF + I  +   C +    + SWLWH R
Sbjct: 421  EKGYDIRLKDNNLSIRDQESNLITKVPMSKNRMFVLNIRNDIAQCLKMCYKEESWLWHLR 480

Query: 481  FGHLSFDTLSHMCQQHMVRGMSNIKKEDQLCEACVFRKHHRNSFPTGGSWRASKPLELVH 540
            FGHL+F  L  + ++ MVRG+  I   +Q+CE C+  K  + SFP   S RA KPLEL+H
Sbjct: 481  FGHLNFGGLELLSRKEMVRGLPCINHPNQVCEGCLLGKQFKMSFPKESSSRAQKPLELIH 540

Query: 541  TDLCGPMRTTTHGGNRYFLTFIDDYSRKTWIYLLKEKSATFECFKTFKAMVENESNLKLK 600
            TD+CGP++  + G + YFL FIDD+SRKTW+Y LKEKS  FE FK FKA VE ES L +K
Sbjct: 541  TDVCGPIKPKSLGKSNYFLLFIDDFSRKTWVYFLKEKSEVFEIFKKFKAHVEKESGLVIK 600

Query: 601  SLRSDRGGEYIV--FADFLKENGIKHQKTVRRTPQQNGVAERKNRIIMELARSMLKAKKL 660
            ++RSDRGGE+    F  + ++NGI+ Q TV R+PQQNGV ERKNR I+E+ARSMLK+K+L
Sbjct: 601  TMRSDRGGEFTSKEFLKYCEDNGIRRQLTVPRSPQQNGVVERKNRTILEMARSMLKSKRL 660

Query: 661  PDQFWGDAVTCAVYLLNRASTKSVQ------------------------------DEKRG 720
            P + W +AV CAVYLLNR+ TKSV                               DEKR 
Sbjct: 661  PKELWAEAVACAVYLLNRSPTKSVSGKTPQEAWSGRKPGVSHLRVFGSIAHAHVPDEKRS 720

Query: 721  KLDDKSEKCIFVGYSENSKAYRLYNP------ISKKVVISRDVKFDEAKLWQWNAPNEDQ 780
            KLDDKSEK IF+GY  NSK Y+LYNP      IS+ +V   + ++D    W  N  + + 
Sbjct: 721  KLDDKSEKYIFIGYDNNSKGYKLYNPDTKKTIISRNIVFDEEGEWD----WNSNEEDYNF 780

Query: 781  NPLHVDMDGKKDARDL--ELEVTQPLTS-------PSSSHSTSD--------EETTPRKT 840
             P H + D  +  R+     E T P TS        SSS  T          E T  ++ 
Sbjct: 781  FP-HFEEDEPEPTREEPPSEEPTTPPTSPTSSQIEESSSERTPRFRSIQELYEVTENQEN 840

Query: 841  RNIQEIY--------------NTSRRILDEEHV--------DFALFANVDPALGVKWIYR 900
              +  ++               T R  +DEE          +     N   A+GVKW+Y+
Sbjct: 841  LTLFCLFAECEPMDFQKAIEKKTWRNAMDEEIKSIQKNDTWELTSLPNGHKAIGVKWVYK 900

Query: 901  TKLKQNGEVQKYKARLVVKGYKQKFGVDYEEVFAP-------------PLQQKITGKFIN 960
             K    GEV++YKARLV KGY Q+ G+DY+EVFAP               Q K     ++
Sbjct: 901  AKKNSKGEVERYKARLVAKGYSQRVGIDYDEVFAPVARLETVRLIISLAAQNKWKIHQMD 960

Query: 961  VKSAFLNGYLEDEIYVEQPPGYAKIGEENK----------------------------DG 1020
            VKSAFLNG LE+E+Y+EQP GY   GEE+K                              
Sbjct: 961  VKSAFLNGDLEEEVYIEQPQGYIVKGEEDKVLRLKKVLYGLKQAPRAWNTRIDKYFKEKD 1020

Query: 1021 FRRCPYEHALHTKEDE-----------------------NEFKESMKKEFEMTDMGLLHY 1080
            F +CPYEHAL+ K  +                        EFK+ M KEFEMTD+GL+ Y
Sbjct: 1021 FIKCPYEHALYIKIQKEDILIACLYVDDLIFTGNNPSIFEEFKKEMTKEFEMTDIGLMSY 1080

Query: 1081 FLGIEVKQDDNEIAIFQKKYAKDLLKRFKMENAYPTNTPMELGLKLSKHDVSEAFDATIY 1140
            +LGIEVKQ+DN I I Q+ YAK++LK+FK++++ P  TPME G+KLSK +  E  D T +
Sbjct: 1081 YLGIEVKQEDNGIFITQEGYAKEVLKKFKIDDSNPVCTPMECGIKLSKKEEGEGVDPTTF 1140

Query: 1141 RSLVGSLMYLTTTRPDIMFSVSLLSRFMTSPKRSHWEAGKRVLRYILGTVDHGIHYKRNV 1200
            +SLVGSL YLT TRPDI+++V ++SR+M  P  +H++A KR+LRYI GTV+ G+HY    
Sbjct: 1141 KSLVGSLRYLTCTRPDILYAVGVVSRYMEHPTTTHFKAAKRILRYIKGTVNFGLHYSTTS 1200

Query: 1201 DNVLVGYSDSDWGGNIDDFKSTSGYIFNIGFKAVSWASKKQDVVALSTTEAEYISLSVAS 1205
            D  LVGYSDSDWGG++DD KSTSG++F IG  A +W SKKQ +V LST EAEY++ +   
Sbjct: 1201 DYKLVGYSDSDWGGDVDDRKSTSGFVFYIGDTAFTWMSKKQPIVTLSTCEAEYVAATSCV 1260
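
The percentages in an HSP summary line are the match counts divided by the aligned length (for the hit above, 601/1355 and 822/1355). A minimal sketch of how such a line could be parsed and the percentages rechecked is given here; the helper name `hsp_fractions` and the regular expression are illustrative assumptions, not part of any BLAST tool, and the sample line is copied from the summary of this hit.

```python
import re

# Hypothetical helper, written for the HSP summary layout used on this page.
# The pattern also accepts the "Postives" spelling that some exports contain.
HSP_RE = re.compile(r"(Identity|Postives|Positives) = (\d+)/(\d+) \(([\d.]+)%\)")

def hsp_fractions(line):
    """Map each label to (matches, aligned_length, recomputed_pct, reported_pct)."""
    out = {}
    for label, num, den, pct in HSP_RE.findall(line):
        key = "Positives" if label.startswith("Pos") else label
        matches, length = int(num), int(den)
        recomputed = round(100 * matches / length, 2)
        out[key] = (matches, length, recomputed, float(pct))
    return out

# Summary line of the Arabidopsis thaliana copia-type polyprotein hit.
line = "Identity = 601/1355 (44.35%), Positives = 822/1355 (60.66%), Query Frame = 1"
stats = hsp_fractions(line)
for label, (matches, length, recomputed, reported) in stats.items():
    print(label, matches, length, recomputed, reported)
```

Comparing the recomputed and reported values is a quick way to catch a line that was damaged during export.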

BLAST of CSPI03G21050 vs. NCBI nr
Match: gi|545693870|gb|AGW47867.1| (polyprotein [Phaseolus vulgaris])

HSP 1 Score: 1040.0 bits (2688), Expect = 3.2e-300
Identity = 574/1350 (42.52%), Positives = 808/1350 (59.85%), Query Frame = 1

Query: 18   YRRWSHQMKVLYGSQDLWDIVDIGYSEPESENGLSAQQLNELRDARKKDKKALFFIYQSV 77
            Y  WS QMK L GSQD W++V+ G+ EP +  G +A Q   L++ R KDK AL+ +Y++V
Sbjct: 19   YDNWSIQMKALLGSQDSWEVVEEGFEEPTNTTGYTAAQTKALKEMRSKDKAALYMLYRAV 78

Query: 78   DENIFERISGVSTAKAAWDALQNLYEGEEKVKLVRLQTLRAEFDTIRMKDSETIEEFFNR 137
            DE IFE+I+G ST+K AWD L+ +++G ++VK VRLQTLR E + ++M +SE++ ++  R
Sbjct: 79   DEAIFEKIAGASTSKEAWDILEKVFKGADRVKQVRLQTLRGELENMKMMESESVSDYITR 138

Query: 138  VLLIVNQLRSNGETIEDQRIVEKILRSMTRRYEHIVVAIEESKDLSTLSINSLMGSLQSH 197
            V  +VNQL  NGET+ D R+VEKILR++T  +E IV AIEESKDL+TL+++ L GSL++H
Sbjct: 139  VQAVVNQLNRNGETLTDARVVEKILRTLTDNFESIVCAIEESKDLATLTVDELAGSLEAH 198

Query: 198  ELRLKMFDSNPSEEA-------------FHMQSSYRGRSNGRRG-GRGGRGNGRSNVVTN 257
            E R K       E+A             +H  S YRGR  G RG GRGG+G+        
Sbjct: 199  EQRKKKKKEETLEQALQTKASIKDEKVLYHQNSQYRGRGRGSRGNGRGGKGSNHEGYYKE 258

Query: 258  TESESRDNQFFSNRGRGRSSNRGRGRSGGRGDFSHIQCFNCRRYGHFQADCWSKK----- 317
             E  S+ N     RGRGR    GRGR GGR ++S+I+C+ C +YGH+  DC S K     
Sbjct: 259  KEQSSQPNW----RGRGR----GRGR-GGRSNYSNIECYKCHKYGHYAKDCNSDKCYNCG 318

Query: 318  ----------TNSNQAETTLMHEQSNNDQGLLFLTLNVQESSTEEIWYLDSGCSNHMTGR 377
                       +    ETT +  +   ++G+L +  +    + + +WYLDSG SNHM G 
Sbjct: 319  KVGHFAKDCRADIKIEETTNLALEVETNEGVLLMAQDEVNINNDTLWYLDSGASNHMCGH 378

Query: 378  KDIFISLDESHQNVVKTGDNKMLEVKGKGDILVKTKMGA-KKITDVYYVSGLKHNLLSVG 437
            + +F  + +     V  GD   +EVKG+G +    K G    + DVYYV  LK N+LS+G
Sbjct: 379  EYLFKDMQKIEDGHVSFGDASKVEVKGRGTVCYLQKDGLIGSLQDVYYVPDLKTNILSMG 438

Query: 438  QLLLRGHDVIFKDNICEIRTKNGDLITKVCMTHNKMFPIKICYEKLVCFETLVNDTSWLW 497
            QL  +G+ +  KD    ++ K G L+ ++ M  N+M+ + +   +  C +  + D + LW
Sbjct: 439  QLTEKGYSIFLKDRFLHLKNKQGCLVARIEMARNRMYKLNLRSIREKCLQVNIEDKASLW 498

Query: 498  HCRFGHLSFDTLSHMCQQHMVRGMSNIKKEDQLCEACVFRKHHRNSFPTGGSWRASKPLE 557
            H RFGHL    L  + +++MV G+ N+  E + CE CV  KH R SFP    + A +PLE
Sbjct: 499  HLRFGHLHHGGLKELAKKNMVHGLPNMDYEGKFCEECVLSKHVRTSFPKKAQYWAKQPLE 558

Query: 558  LVHTDLCGPMRTTTHGGNRYFLTFIDDYSRKTWIYLLKEKSATFECFKTFKAMVENESNL 617
            L+HTD+CGP+   +  G RYF+TFIDD+SRKTW+Y LKEKS  FE FK FK MVE  ++ 
Sbjct: 559  LIHTDICGPITPESFSGKRYFITFIDDFSRKTWVYFLKEKSEAFEVFKKFKVMVERTTDK 618

Query: 618  KLKSLRSDRGGEYI--VFADFLKENGIKHQKTVRRTPQQNGVAERKNRIIMELARSMLKA 677
            ++K++RSDRGGEY    F ++ +E GI+   T   TPQQNGVAERKNR I+++ RSMLK+
Sbjct: 619  QIKAVRSDRGGEYTSTTFMEYCEEQGIRRFLTAPYTPQQNGVAERKNRTILDMVRSMLKS 678

Query: 678  KKLPDQFWGDAVTCAVYLLNRA------------------------------STKSVQDE 737
            KK+P +FW +AV CA+Y+ NR                               +   V D+
Sbjct: 679  KKMPKEFWAEAVQCAIYVQNRCPHVKLDDQTPQEAWSGQKPTVSHLKVFGSVAYAHVPDQ 738

Query: 738  KRGKLDDKSEKCIFVGYSENSKAYRLYNPISKKVVISRDVKFDEAKLWQWNAPNEDQNPL 797
            +R KL+DKS++ +F+GY E +K Y+L +PISKKV +SRDV+ +EA  W WN         
Sbjct: 739  RRTKLEDKSKRYVFIGYDEKTKGYKLLDPISKKVTVSRDVQINEASEWDWN--------- 798

Query: 798  HVDMDGKKDARDLELEVTQPLTSPSSSHS-TSDEETTPR--KTRNIQEIYNTSRRI---- 857
                    ++ ++ +EV +  +SP+S +S T+D+E  PR  K R++ ++Y+++  +    
Sbjct: 799  --------NSSEVMIEVGE--SSPTSINSETTDDEDEPRQPKIRSLHDLYDSTNEVHLVC 858

Query: 858  --LDEEHVDF------------------ALFAN----------VDPALGVKWIYRTKLKQ 917
               D E++ F                  A+  N              +GVKWI++ K+  
Sbjct: 859  LLADAENISFEEAVRDKKWQTAMDEEIKAIDRNNTWELTELPEGSQPIGVKWIFKKKMNA 918

Query: 918  NGEVQKYKARLVVKGYKQKFGVDYEEVFAPPLQQKITGKFI-------------NVKSAF 977
             GE+++YKARLV KGYKQK G+DY+EVFAP ++ +     I             +VKSAF
Sbjct: 919  QGEIERYKARLVAKGYKQKEGIDYDEVFAPVVRMETIRLLISQAAQFKWPIFQMDVKSAF 978

Query: 978  LNGYLEDEIYVEQPPGYAKIGEENK----------------------------DGFRRCP 1037
            LNG LE+E+Y+EQPPGY KIGEE K                            +GF++CP
Sbjct: 979  LNGVLEEEVYIEQPPGYMKIGEEKKVLKLKKALYGLKQAPRAWNTRIDTYFKENGFKQCP 1038

Query: 1038 YEHALHTKEDE-----------------------NEFKESMKKEFEMTDMGLLHYFLGIE 1097
            YEHAL+ K +                         EFK +M++EFEMTD+GL+ +FLG+E
Sbjct: 1039 YEHALYAKNNGGNMIFVALYVDDLIFMGNNNDMIEEFKGTMRREFEMTDLGLMKFFLGLE 1098

Query: 1098 VKQDDNEIAIFQKKYAKDLLKRFKMENAYPTNTPMELGLKLSKHDVSEAFDATIYRSLVG 1157
            V+Q +  I + Q+KYAK++LK++KMEN  P + PME G KLSK D  E  DA+ YRSLVG
Sbjct: 1099 VRQKETGIFVSQEKYAKEILKKYKMENCNPVSIPMEPGAKLSKFDGGERVDASRYRSLVG 1158

Query: 1158 SLMYLTTTRPDIMFSVSLLSRFMTSPKRSHWEAGKRVLRYILGTVDHGIHYKRNVDNVLV 1205
            SL YLT TRPD+  SV ++SRFM  P  SHW+A KRVLRYI GTV  G+ Y +  D  LV
Sbjct: 1159 SLRYLTCTRPDLSLSVGIISRFMEEPVYSHWKALKRVLRYIQGTVSLGLFYSKAEDYKLV 1218

The following BLAST results are available for this feature:
Match Name   E-value   Identity (%)  Description
POLX_TOBAC   9.5e-79   35.78         Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum...
COPIA_DROME  1.6e-57   39.00         Copia protein OS=Drosophila melanogaster GN=GIP PE=1 SV=3
M810_ARATH   3.0e-32   36.00         Uncharacterized mitochondrial protein AtMg00810 OS=Arabidopsis thaliana GN=AtMg0...
YCH4_YEAST   1.2e-20   33.84         Putative transposon Ty5-1 protein YCL074W OS=Saccharomyces cerevisiae (strain AT...
YB21B_YEAST  6.5e-11   25.47         Transposon Ty2-B Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 20...

Match Name        E-value   Identity (%)  Description
A6YTD9_CUCME      0.0e+00   56.84         Integrase OS=Cucumis melo subsp. melo PE=4 SV=1
Q9C536_ARATH      0.0e+00   46.18         Copia-type polyprotein, putative OS=Arabidopsis thaliana GN=T18I24.5 PE=4 SV=1
Q9SXB2_ARATH      1.1e-307  44.43         T28P6.8 protein OS=Arabidopsis thaliana GN=T28P6.8 PE=4 SV=1
Q9M2D1_ARATH      2.4e-307  44.35         Copia-type polyprotein OS=Arabidopsis thaliana GN=T20K12.230 PE=4 SV=1
A0A059QBK0_PHAVU  2.2e-300  42.52         Polyprotein OS=Phaseolus vulgaris PE=4 SV=1

Match Name   E-value   Identity (%)  Description
AT4G23160.1  1.5e-50   40.63         cysteine-rich RLK (RECEPTOR-like protein kinase) 8
ATMG00810.1  1.7e-33   36.00         ATMG00810.1 DNA/RNA polymerases superfamily protein
AT1G48720.1  1.5e-18   46.74         unknown protein
ATMG00240.1  3.4e-10   39.74         ATMG00240.1 Gag-Pol-related retrotransposon family protein
AT3G21000.1  4.9e-09   22.62         Gag-Pol-related retrotransposon family protein

Match Name                            E-value   Identity (%)  Description
gi|150036244|gb|ABR67407.1|           0.0e+00   56.84         integrase [Cucumis melo subsp. melo]
gi|12321254|gb|AAG50698.1|AC079604_5  0.0e+00   46.18         copia-type polyprotein, putative [Arabidopsis thaliana]
gi|5734736|gb|AAD50001.1|AC007259_14  1.6e-307  44.43         Hypothetical protein [Arabidopsis thaliana]
gi|6850900|emb|CAB71063.1|            3.5e-307  44.35         copia-type polyprotein [Arabidopsis thaliana]
gi|545693870|gb|AGW47867.1|           3.2e-300  42.52         polyprotein [Phaseolus vulgaris]
The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term        Definition
IPR001584   Integrase_cat-core
IPR001878   Znf_CCHC
IPR012337   RNaseH-like_sf
IPR013103   RVT_2
IPR025724   GAG-pre-integrase_dom
Vocabulary: Biological Process
Term        Definition
GO:0015074  DNA integration
Vocabulary: Molecular Function
Term        Definition
GO:0003676  nucleic acid binding
GO:0008270  zinc ion binding
GO Assignments
This gene is annotated with the following GO terms.
Category            Term Accession  Term Name
biological_process  GO:0015074      DNA integration
cellular_component  GO:0005575      cellular_component
molecular_function  GO:0003676      nucleic acid binding
molecular_function  GO:0008270      zinc ion binding

The following mRNA feature(s) are a part of this gene:

Feature Name    Unique Name     Type
CSPI03G21050.1  CSPI03G21050.1  mRNA


Analysis Name: InterPro Annotations of cucumber (PI183967)
Date Performed: 2017-01-17
IPR Term   IPR Description                                       Source   Source Term        Source Description                   Alignment
IPR001584  Integrase, catalytic core                             PFAM     PF00665            rve                                  coord: 523..636, score: 8.0
IPR001584  Integrase, catalytic core                             PROFILE  PS50994            INTEGRASE                            coord: 521..685, score: 22
IPR001878  Zinc finger, CCHC-type                                GENE3D   G3DSA:4.10.60.10                                        coord: 275..298, score: 1.
IPR001878  Zinc finger, CCHC-type                                PROFILE  PS50158            ZF_CCHC                              coord: 281..294, score: 9
IPR001878  Zinc finger, CCHC-type                                unknown  SSF57756           Retrovirus zinc finger-like domains  coord: 263..303, score: 8.1
IPR012337  Ribonuclease H-like domain                            GENE3D   G3DSA:3.30.420.10                                       coord: 520..671, score: 6.1
IPR012337  Ribonuclease H-like domain                            unknown  SSF53098           Ribonuclease H-like                  coord: 520..671, score: 3.92
IPR013103  Reverse transcriptase, RNA-dependent DNA polymerase   PFAM     PF07727            RVT_2                                coord: 805..890, score: 3.0E-21; coord: 906..970, score: 1.4
IPR025724  GAG-pre-integrase domain                              PFAM     PF13976            gag_pre-integrs                      coord: 452..509, score: 1.9
None       No IPR available                                      PANTHER  PTHR11439          GAG-POL-RELATED RETROTRANSPOSON      coord: 332..1129, score: 0.0; coord: 13..226, score: 0.0; coord: 254..306, score:
None       No IPR available                                      PFAM     PF14223            Retrotran_gag_2                      coord: 63..202, score: 2.8
None       No IPR available                                      unknown  SSF56672           DNA/RNA polymerases                  coord: 979..1161, score: 7.11E-9; coord: 910..949, score: 7.1
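
Each `coord: start..end` entry in the Alignment column gives the span of a domain hit on the protein. The sketch below shows one way to pull those spans out of a row and compute their lengths. It assumes the coordinates are 1-based and inclusive, which this page does not state explicitly; the helper name `span_lengths` is hypothetical, and the sample row abbreviates the PF07727 (RVT_2) entry from the table above.

```python
import re

# Matches entries such as "coord: 805..890" as they appear in the Alignment column.
COORD_RE = re.compile(r"coord: (\d+)\.\.(\d+)")

def span_lengths(text):
    """Return (start, end, length) for every coord entry found in text.

    Assumption: coordinates are 1-based and inclusive, so length = end - start + 1.
    """
    spans = []
    for start, end in COORD_RE.findall(text):
        s, e = int(start), int(end)
        spans.append((s, e, e - s + 1))
    return spans

# Abbreviated PF07727 (RVT_2) row, which carries two separate hits.
row = "PF07727 RVT_2 coord: 805..890 score: 3.0E-21 coord: 906..970 score: 1.4"
spans = span_lengths(row)
for start, end, length in spans:
    print(start, end, length)
```

Because the pattern only looks at the `coord:` entries, it works whether or not the score that follows is complete.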

The following gene(s) are orthologous to this gene:

None

The following gene(s) are paralogous to this gene:

None