CSPI04G20550 (gene) Wild cucumber (PI 183967)

Name: CSPI04G20550
Type: gene
Organism: Cucumis sativus (Wild cucumber (PI 183967))
Description: Retrovirus-related Pol polyprotein from transposon 297 family
Location: Chr4 : 18809754 .. 18815325 (+)
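The location above gives a chromosome, a start, an end, and a strand. The sketch below shows one way to parse such a coordinate string and derive the span of the feature. The function name `parse_location` is hypothetical, and the length assumes 1-based, inclusive coordinates, which this page does not state.

```python
import re

def parse_location(text):
    """Parse a coordinate string such as 'Chr4 : 18809754 .. 18815325 (+)'.

    Expects the coordinate string without its field label.
    """
    m = re.search(r"(\w+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)", text)
    if m is None:
        raise ValueError(f"unrecognized location: {text!r}")
    start, end = int(m.group(2)), int(m.group(3))
    return {
        "chrom": m.group(1),
        "start": start,
        "end": end,
        "strand": m.group(4),
        # Assumption: coordinates are 1-based and inclusive.
        "length": end - start + 1,
    }

loc = parse_location("Chr4 : 18809754 .. 18815325 (+)")
print(loc["chrom"], loc["strand"], loc["length"])
```

Under the inclusive-coordinate assumption the feature spans 5572 bp. If the database uses half-open coordinates instead, the length would be one less.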
   



The following sequences are available for this feature:

Gene sequence (with intron)

ATGGTTGGTTAATTTAGCGGATTAGTGAATTAGAAAAAAAGAAAAAGAGTAATTGGGGTGTTAGATTGTTTCGTTTACTTGTTAATTTTATATTAGATAAATTAGAAAATTAGTACTTTAAAATTAAAATAGACAAAAATGAAAATAAGTTATTCTTTTTTTCTTCTGAATCTGGCGGATAGTTCCTTCGATAGACTATTCAGACTAAATCGTCAGCGGCAAGTGATCAACCATCTCTGGCAACCAACGATCACTTCCGGCGACTTCCGGCGACTTCCGGACTCGGGGCAATAGCTGCTGGATTCAGGTATAACAATTTTGAATGGGAAGGGTTTGAAAATTTAGTACAAATGCTATCTAGGGATTTTTTTTTATTCATTTTATATCTGAGATTTGTTTTTTGTTCGATTACTCCTCCTTGCTTCCCTCCAAATGAACAAATTCATCGTGTCTCTAAAACTAGAACGATTGTTTGATTCTCAAAATTTGCCCAAAATCACGATCAATTTCCTGATACAATTTTAATGAAGAAACCACAACTCGGTAAAATCGAAAATGTTGAAAGGAAACGGAGGTACTTGGTTGGGGAATGAGAAGGTGATTGGAACCATAATGGTCATGAAAAGATTGTCATCCAATATAAGCCGGGAGAGTTCTTGAAGGACAGTGAAGATTTTAAAGATTGGATCAAACGTTTGGAATGCTACAGCAAAGGAATGCAATTCGTCGAAGTACGTGATGGTGAATGAAACCGATACTGGCGAGCATTTGAGATTGGATAAGGGAATCTCTAAGGTTTATTTATTCCGTTCCTCGTTTATTTCTCTCCGTTTCTCTTTTCCTCTCCACATCGTTTTCTATCTTTCTCGGCTTTTTTTTCTCTCTTTCTCTTCTCCGGCGCGTCATCACCATTATCGGCTTTCGTCTTCTCTCTCCCTCTTCGCCCGCTGTCTTCTCTCTTCTCCGGCTAAAGTTACAAAGTATGTGTTTCTCTATTCCTATATTTCTCTCTTCTCCGGCTCTGTCTCTCTATTTATCTCTTCCCCGATCTCCATCTTCTCCGATTCTCTTTCTCTTAACTTTTCTTCTCCGATTCTCTGCCTTCTCTCTTCTTCGGCTAAAGGTATGTGTTTCTGATTTTAACTTTACAAAAAAATCTATTCCCCTCCATCTTTTTATAAACAACTTCATTTTAATTTTCTGTGTTTGTTCCTCTCTCTGACCATAACCCATGCTCTAATTTTTTACATGTTTTTTTTTTTTTGGAGTTTTGGTTGCGTCTGACTTTTAGGTAGAGTACTGTACTTTGGTGACGGAAATTTGATGAAATTTAGTTCTCCTTTTACTTACTTTGAGCCCTTAGTTGTCTGTTATCTGCCATCTCCTTTCTGGGTTTTGATGCTGTAATACATTTTACTCAGTGTTAAAACTTGTGATGGATTCCTATTGGCTGCCCCTTTAAGCTGTGATTTTGGGGGTAATGAAATTTAACTCTTCTTATATGGGGAAAATCGTGAGATGTCCTTGTTATAGTGTGTTTCAAGAGAACTATAGTGCTATTTGCTGAAATTGGCAGATAAATTTTACATAATGTACCATGGAGTAGACATCTCTGCCTTGGCAATTTTTTGATGGCAAGAATATTTCTTCATAGAGAATAGCTGTTACTATTGATTGGTATTTCCGTGTATGAGAGGTGTTTGAGGAGTGCAACCTTGATATATCGTGGTGTACGTTTCAGTACAAGATTGAGTATTGGAGGAGTGTGCATTTAAGAATCTAATATCATTGTGCAATGGAAGATTTATGAAAGGAAACTTTGGACTTCAATTAGGAAAATAAACCTTCAAATTTGTGGCTTCTAACCCATGCTCTAACTATATATCAGTGTTTTAGAGCAATTTTAACCCCAAGAGGAAAATGGATTTGCTTGTGTTCAACAATGGAAATAAGTTTTCAGTTAACCTTAACCCCAAAAAGAAAATATCGAACAAATCCAT
GGTAGTGAGGTGAGAAATTATCCTACTTATCATTTACAAGATTTTAGAGAAAGGAGACATCAGCCCAGAGGTAGAGAAGAAGTGTTTAACTATTGACATTCTTCACTGTGAAATGTCATTGAACGTTTTTTTTGGTGTGCTGAAGGCTCGATTTCCTATTCTTAAGCAAATGCCTCCATACCCGATCAAAACACAAAAATATATATATACTGATAGCATGTTGCACAATTCACAATTATATTAGATTGAATGATTGTCAAGATGATCTATTCAACGACTTTAGCAATGAATCAATGAACGTTGGAGATGCACAAAATTTGTCGGTCAATCTCCAAAGTGATATGATTAAATTAGATGTGAGTCGACATCATCTAAGAGAAATGATCCGAGCGAGAGATGATATTGCCAATCAAATTTGGGCAACATTTGAACGATAGATACAATTTTATGTTATTGTTTTTCTTTGTGCACTGTGACTTGTTTGCATTAAATGATATAAACTTTTGATCATTTTCTTTTTTATGTTTAATGTAGGACATATTTTTTGTTTATTTTTTAATTAAATTAGATGTGAGTCGACATATTCTTAATGTATGACATTATTAAGAAAACAAGAAATAAGAAACAGGAAACGAGAAACCATTATTAAACACGTTTTCTGTTTTCTGTTTTTTTTTTTTCAAGAAACAAGAAACAAGAAACAGAAACAATTATTGTTAGACCACCACTGTTATTTACTTATAATAGAAGAAAGAAAAATATGTCACGTGAGGAACCAACAGTTAATTAGAAATTAGAAGAGGAAGTGGAGAGAGGAAAGAAGGGGTGACAGTTACTTGGAAGAGCAAGTTAAATCGATGAGAAGGGGAATGAAGAAGAGGGGCAGTTTTTATGAAAGGAGAGGTCTGCAATACTCCTTGAAAGATAGGGGAAGCAGAAGGCTGAAGTTTTGTGTTCTGGATTGCTTTTTAGTATTTTGTTTTGTGTGTTGTTTTGGTATTTTGGATTAGGAGGTTGACTTTGCTTGTTATTTGTGAATTATCTGTTACGTGACTCATTTCCTGAGGTTCACGATTTGCTTATTATTGTAAGTTGTGTTTGAATTATCAATATAATAGAAACACTACATCAATGTTCTATCAATTGTTATCAGAGCCCTAAAACCTGGGCATTCAGAAAAGATGGCGAAAAAAGAAAAAAAAGGTCTGGAACTTCTTGAGCAGAAATTTATGGAAATGCAAACAAAGTTCAAAAAAATACCGGTGATGGAGGAAAATTTAACTTTGATTTCGAAAAACATTGAGAATTTGAATGCGATGATAGCAAAGCAGCGACAACAACAATAGATGGTTGTGAAATATCTTGAAGGTATTATTCGAGAGAAGGCTTTGACATCGAAGGAAGACGAAGGATCTGTGAGTAGAGAGAAGAAAAGCGAAAATATATTAGTTGAACTCAACGATGAATTGAAGTTGCACCTGAAGAGCAATGAAAAATTGACTGAGCAGAGTAAGTTTAAGAAATTTGAAATACCGGTATTCAATGGCACGGATCCGGATTCTTGGTTATTCCGAGCGGATCGATTCTTCAAAATACACAGTTTTACTGATTCAGAGAAGTTGTCTGTGGCAATTATCAGTTTTGATGGACCGGCTATAGATTGGTATCGTTTTCAGGATGAATGTGAGGCTTTTAAGGATTGGGATGGTCTAAAAGAGAAGATGTTAACTTGTTTTCCAACAATAAGAGATGGATCTCTTGTGGGCCAATTTTTGACTATTAAACAGAAATTCATAGTGGAAGAATATCGCAATTTATTTGATTAAGTTTTAGCTCCTGTAGCCTTTCTACAGACAATTGTATTAGAAGAAACTTTTATGAATGGGCTTTACCCATGGCTGAAAACGGAAGTGTCTCTTTTGGAACCTAAGGGCTTGGCTGCAATGATGAAATTGGCATTAAAAATAGAACACAGAGATGGTTCGCAAAGAGGCTGGGTT
CGTTAGCGTATATAAGAATAGGATTCAGTATAATTTGCCTAAAATCAAGGAAATTTCCAACAGCAATACTACAAATATACACCAGGGGGAAATACACCTATTAGAACGATAACATTGAAAGGGGTTATGCCGGGGGAGAACTGAAGGGAAGGGCCAACAAAAAGATTGTATGATGCTGAATTTCAATCTTGGAGAGAGAAGGGATTATGCTTTAGATGTGAAGAGAAATATTATGTTGGGCATCGTTGTAAAGTGAAAGAACAAAAGGAATTGAGAATGTTGGTTGTGCACGACAATGGGGAAGAATTAGAAATCATTGAAGATGACCCATATGATGAAGAATTGGAGGTTAAACCAATGGAGGTAGGAGCTGATGAAAATCTCAACATTGAGTTATCTATCAACTCAGTGGTGGGATTAAACAACTCTGGAACCATGAAAGTGAAGGGAAAAGTGAAGAATGAGGATGTTGTAGTGTTAATTGACTGTGGGGCTACCCACAATTTCATATCTGAGAAATTGGTGACTGCATTAAATCTTCCTTTGAAAACAACTACAACTTATGGAGTGATTCTTGGGCCAGGGACGACTATTAAAGGAAAAGGCGTGTGTGGCTAAGTGGAAATTTTGATTGGTGATTGGAAGATTATTGACAGTTTCTTGTCCTTAGAATTGAGAGGGGCGGATGTGATTCTTGGAATGCAGTGGTTGAAATCTTTAGGTGTCACTGAAGTGGACTGGAGGAAGTTAATTATGACTTTTCAGCATTATTGGAAGAGGATTACAATCAAAGGTGATCCTAGCTTAATGAAGATAAGGGTTAGTTTGAAACATATGATGAAGACTTGGGATAATGAAGATCAGAGTTATTTAATAGGGTGTCGAGCATTACACAGTAGTTTAACTATAGAAGAGATGTATGATGAGGAGACAAAACCCACAGCCGAAAATTTACTGCCCTCTCTGTTAACTAAGTTCGATGATGTGTTCAGTTGGCCCGAGACGCTACCACCGCAACGTGGGATTGAACATCATATCCATCTCAAACAAGGGACGAATCCAGTAAACGTGTCGCCATATAGATATGCACACAAACAGAAATAGGAAATGGAACGTTTGGTGGATGAGATGCTGGCTTCGGGAATTATTTGGCCAAGTACCAGTCCATATTCTAGTCCCGTGTTGTTGGTAAGAAGGAAAGATGGGAGTTGGCGTTTTTATGTCGATTATAGAGCGCTGAACAATGTAACAATCCCAGATAAATTTCCTATTCCAGTTATTGAAGAGCTCTTTAATGAACTTAATGGAGCGAATATGTTTTCAAAGATTGATATTAAAGCGAGATATCATCAAATTCGGATGAATCAAGAAGATGTGGAGAAGACAACTTTTAGAACTCATGAAGGGCATTATGAGTTCTTAGTTATGCCATTTGGGCTCACCAATGCCCCTTCAACTTTCCAAGCATTAATGAATTCAATCTTCAGGCCGTATATGAGGAGGTTTGTGTTGGTTTTCTTTGATGACATATTAGTTGCAGCAGAGGATTGGAAGAACACTTCCAGCATTTAG

mRNA sequence

ATGACTATTCAGACTAAATCGTCAGCGGCAAGTGATCAACCATCTCTGGCAACCAACGATCACTTCCGGCGACTTCCGGCGACTTCCGGACTCGGGGCAATAGCTGCTGGATTCAGAAACCACAACTCGGTAAAATCGAAAATGTTGAAAGGAAACGGAGGTACTTGGTTGGGGAATGAGAAGATTGGATCAAACGTTTGGAATGCTACAGCAAAGGAATGCAATTCGTCGAAGTACGTGATGGTGAATGAAACCGATACTGGCGAGCATTTGAGATTGGATAAGGGAATCTCTAAGGTTTATTTATTCCGTTCCTCGTTTATTTCTCTCCGTTTCTCTTTTCCTCTCCACATCGTTTTCTATCTTTCTCGGCTTTTTTTTCTCTCTTTCTCTTCTCCGGCGCGTCATCACCATTATCGGCTTTCGTCTTCTCTCTCCCTCTTCGCCCGCTGTCTTCTCTCTTCTCCGGCTAAAGTTACAAAGTATGTGTTTCTCTATTCCTATATTTCTCTCTTCTCCGGCTCTGTCTCTCTATTTATCTCTTCCCCGATCTCCATCTTCTCCGATTCTCTTTCTCTTAACTTTTCTTCTCCGATTCTCTGCCTTCTCTCTTCTTCGGCTAAAGGTAGAGTACTGTACTTTGGTGACGGAAATTTGATGAAATTTAGTTCTCCTTTTACTTACTTTGAGCCCTTAGTTGTCTGTTATCTGCCATCTCCTTTCTGGGTTTTGATGCTAGCCCTAAAACCTGGGCATTCAGAAAAGATGGCGAAAAAAGAAAAAAAAGGTCTGGAACTTCTTGAGCAGAAATTTATGGAAATGCAAACAAAGTTCAAAAAAATACCGATGGTTGTGAAATATCTTGAAGGTATTATTCGAGAGAAGGCTTTGACATCGAAGGAAGACGAAGGATCTGTGAGTAGAGAGAAGAAAAGCGAAAATATATTAGTTGAACTCAACGATGAATTGAAGTTGCACCTGAAGAGCAATGAAAAATTGACTGAGCAGAGTAAGTTTAAGAAATTTGAAATACCGGTATTCAATGGCACGGATCCGGATTCTTGGTTATTCCGAGCGGATCGATTCTTCAAAATACACAGTTTTACTGATTCAGAGAAGTTGTCTGTGGCAATTATCAGTTTTGATGGACCGGCTATAGATTGGTATCGTTTTCAGGATGAATGTGAGGCTTTTAAGGATTGGGATGGTCTAAAAGAGAAGATGTTAACTTGTTTTCCAACAATAAGAGATGGATCTCTTACAATTGTATTAGAAGAAACTTTTATGAATGGGCTTTACCCATGGCTGAAAACGGAAGTGTCTCTTTTGGAACCTAAGGGCTTGGCTGCAATGATGAAATTGGCATTAAAAATAGAACACAGAGATGGTTCGCAAAGAGGCTGGAACGATAACATTGAAAGGGGTTATGCCGGGGGAGAACTGAAGGGAAGGGCCAACAAAAAGATTGAATTGAGAATGTTGGTTGTGCACGACAATGGGGAAGAATTAGAAATCATTGAAGATGACCCATATGATGAAGAATTGGAGGTTAAACCAATGGAGGTAGGAGCTGATGAAAATCTCAACATTGAGTTATCTATCAACTCAGTGGTGGGATTAAACAACTCTGGAACCATGAAAGTGAAGGGAAAAGTGAAGAATGAGGATGTTGTAGTGTTAATTGACTGTGGGGCTACCCACAATTTCATATCTGAGAAATTGGTGACTGCATTAAATCTTCCTTTGAAAACAACTACAACTTATGGAGTGATTCTTGGGCCAGGGACGACTATTAAAGGAAAAGGCTGGTTGAAATCTTTAGGTGTCACTGAAGTGGACTGGAGGAAGTTAATTATGACTTTTCAGCATTATTGGAAGAGGATTACAATCAAAGGTGATCCTAGCTTAATGAAGATAAGGGTTAGTTTGAAACATATGATGAAGACTTGGGATAATGAAGATCAGAGTTATTTAATAGGGTGTCGAGCATTACACAGTAG
TTTAACTATAGAAGAGATGTATGATGAGGAGACAAAACCCACAGCCGAAAATTTACTGCCCTCTCTGTTAACTAAGTTCGATGATGTGTTCAGTTGGCCCGAGACGCTACCACCGCAACGTGGGATTGAACATCATATCCATCTCAAACAAGGGACGAATCCAGAAATGGAACGTTTGGTGGATGAGATGCTGGCTTCGGGAATTATTTGGCCAAGTACCAGTCCATATTCTAGTCCCGTGTTGTTGGTAAGAAGGAAAGATGGGAGTTGGCGTTTTTATGTCGATTATAGAGCGCTGAACAATGTAACAATCCCAGATAAATTTCCTATTCCAGTTATTGAAGAGCTCTTTAATGAACTTAATGGAGCGAATATGTTTTCAAAGATTGATATTAAAGCGAGATATCATCAAATTCGGATGAATCAAGAAGATGTGGAGAAGACAACTTTTAGAACTCATGAAGGGCATTATGAGTTCTTAGTTATGCCATTTGGGCTCACCAATGCCCCTTCAACTTTCCAAGCATTAATGAATTCAATCTTCAGGCCGTATATGAGGAGGTTTGTGTTGGTTTTCTTTGATGACATATTAGTTGCAGCAGAGGATTGGAAGAACACTTCCAGCATTTAG

Coding sequence (CDS)

ATGACTATTCAGACTAAATCGTCAGCGGCAAGTGATCAACCATCTCTGGCAACCAACGATCACTTCCGGCGACTTCCGGCGACTTCCGGACTCGGGGCAATAGCTGCTGGATTCAGAAACCACAACTCGGTAAAATCGAAAATGTTGAAAGGAAACGGAGGTACTTGGTTGGGGAATGAGAAGATTGGATCAAACGTTTGGAATGCTACAGCAAAGGAATGCAATTCGTCGAAGTACGTGATGGTGAATGAAACCGATACTGGCGAGCATTTGAGATTGGATAAGGGAATCTCTAAGGTTTATTTATTCCGTTCCTCGTTTATTTCTCTCCGTTTCTCTTTTCCTCTCCACATCGTTTTCTATCTTTCTCGGCTTTTTTTTCTCTCTTTCTCTTCTCCGGCGCGTCATCACCATTATCGGCTTTCGTCTTCTCTCTCCCTCTTCGCCCGCTGTCTTCTCTCTTCTCCGGCTAAAGTTACAAAGTATGTGTTTCTCTATTCCTATATTTCTCTCTTCTCCGGCTCTGTCTCTCTATTTATCTCTTCCCCGATCTCCATCTTCTCCGATTCTCTTTCTCTTAACTTTTCTTCTCCGATTCTCTGCCTTCTCTCTTCTTCGGCTAAAGGTAGAGTACTGTACTTTGGTGACGGAAATTTGATGAAATTTAGTTCTCCTTTTACTTACTTTGAGCCCTTAGTTGTCTGTTATCTGCCATCTCCTTTCTGGGTTTTGATGCTAGCCCTAAAACCTGGGCATTCAGAAAAGATGGCGAAAAAAGAAAAAAAAGGTCTGGAACTTCTTGAGCAGAAATTTATGGAAATGCAAACAAAGTTCAAAAAAATACCGATGGTTGTGAAATATCTTGAAGGTATTATTCGAGAGAAGGCTTTGACATCGAAGGAAGACGAAGGATCTGTGAGTAGAGAGAAGAAAAGCGAAAATATATTAGTTGAACTCAACGATGAATTGAAGTTGCACCTGAAGAGCAATGAAAAATTGACTGAGCAGAGTAAGTTTAAGAAATTTGAAATACCGGTATTCAATGGCACGGATCCGGATTCTTGGTTATTCCGAGCGGATCGATTCTTCAAAATACACAGTTTTACTGATTCAGAGAAGTTGTCTGTGGCAATTATCAGTTTTGATGGACCGGCTATAGATTGGTATCGTTTTCAGGATGAATGTGAGGCTTTTAAGGATTGGGATGGTCTAAAAGAGAAGATGTTAACTTGTTTTCCAACAATAAGAGATGGATCTCTTACAATTGTATTAGAAGAAACTTTTATGAATGGGCTTTACCCATGGCTGAAAACGGAAGTGTCTCTTTTGGAACCTAAGGGCTTGGCTGCAATGATGAAATTGGCATTAAAAATAGAACACAGAGATGGTTCGCAAAGAGGCTGGAACGATAACATTGAAAGGGGTTATGCCGGGGGAGAACTGAAGGGAAGGGCCAACAAAAAGATTGAATTGAGAATGTTGGTTGTGCACGACAATGGGGAAGAATTAGAAATCATTGAAGATGACCCATATGATGAAGAATTGGAGGTTAAACCAATGGAGGTAGGAGCTGATGAAAATCTCAACATTGAGTTATCTATCAACTCAGTGGTGGGATTAAACAACTCTGGAACCATGAAAGTGAAGGGAAAAGTGAAGAATGAGGATGTTGTAGTGTTAATTGACTGTGGGGCTACCCACAATTTCATATCTGAGAAATTGGTGACTGCATTAAATCTTCCTTTGAAAACAACTACAACTTATGGAGTGATTCTTGGGCCAGGGACGACTATTAAAGGAAAAGGCTGGTTGAAATCTTTAGGTGTCACTGAAGTGGACTGGAGGAAGTTAATTATGACTTTTCAGCATTATTGGAAGAGGATTACAATCAAAGGTGATCCTAGCTTAATGAAGATAAGGGTTAGTTTGAAACATATGATGAAGACTTGGGATAATGAAGATCAGAGTTATTTAATAGGGTGTCGAGCATTACACAGTAG
TTTAACTATAGAAGAGATGTATGATGAGGAGACAAAACCCACAGCCGAAAATTTACTGCCCTCTCTGTTAACTAAGTTCGATGATGTGTTCAGTTGGCCCGAGACGCTACCACCGCAACGTGGGATTGAACATCATATCCATCTCAAACAAGGGACGAATCCAGAAATGGAACGTTTGGTGGATGAGATGCTGGCTTCGGGAATTATTTGGCCAAGTACCAGTCCATATTCTAGTCCCGTGTTGTTGGTAAGAAGGAAAGATGGGAGTTGGCGTTTTTATGTCGATTATAGAGCGCTGAACAATGTAACAATCCCAGATAAATTTCCTATTCCAGTTATTGAAGAGCTCTTTAATGAACTTAATGGAGCGAATATGTTTTCAAAGATTGATATTAAAGCGAGATATCATCAAATTCGGATGAATCAAGAAGATGTGGAGAAGACAACTTTTAGAACTCATGAAGGGCATTATGAGTTCTTAGTTATGCCATTTGGGCTCACCAATGCCCCTTCAACTTTCCAAGCATTAATGAATTCAATCTTCAGGCCGTATATGAGGAGGTTTGTGTTGGTTTTCTTTGATGACATATTAGTTGCAGCAGAGGATTGGAAGAACACTTCCAGCATTTAG
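The protein coordinates used in the BLAST alignments refer to the translation of the coding sequence above. As a minimal sketch, assuming the standard nuclear genetic code and reading frame 1, the snippet below translates the first seven codons of the CDS. The names `CODON_TABLE` and `translate` are illustrative and not part of any database tooling.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate a nucleotide string in frame 1, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # 'X' marks an unrecognized codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First 21 nucleotides of the CDS listed above.
prefix = "ATGACTATTCAGACTAAATCG"
print(translate(prefix))  # MTIQTKS
```

Passing the full CDS string to `translate` would yield the complete predicted protein under the same assumptions.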
BLAST of CSPI04G20550 vs. Swiss-Prot
Match: POL2_DROME (Retrovirus-related Pol polyprotein from transposon 297 OS=Drosophila melanogaster GN=pol PE=3 SV=1)

HSP 1 Score: 150.2 bits (378), Expect = 1.0e-34
Identity = 71/156 (45.51%), Positives = 100/156 (64.10%), Query Frame = 1

Query: 715 LKQGTNPEMERLVDEMLASGIIWPSTSPYSSPVLLVRRKDGS-----WRFYVDYRALNNV 774
           L Q    E+E  V EML  G+I  S SPY+SP  +V +K  +     +R  +DYR LN +
Sbjct: 214 LAQTHEIEVENQVQEMLNQGLIRESNSPYNSPTWVVPKKPDASGANKYRVVIDYRKLNEI 273

Query: 775 TIPDKFPIPVIEELFNELNGANMFSKIDIKARYHQIRMNQEDVEKTTFRTHEGHYEFLVM 834
           TIPD++PIP ++E+  +L     F+ ID+   +HQI M++E + KT F T  GHYE+L M
Sbjct: 274 TIPDRYPIPNMDEILGKLGKCQYFTTIDLAKGFHQIEMDEESISKTAFSTKSGHYEYLRM 333

Query: 835 PFGLTNAPSTFQALMNSIFRPYMRRFVLVFFDDILV 866
           PFGL NAP+TFQ  MN+I RP + +  LV+ DDI++
Sbjct: 334 PFGLRNAPATFQRCMNNILRPLLNKHCLVYLDDIII 369
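Each HSP reports identity and positive-match counts as ratios over the alignment length. The sketch below recomputes the percentages from one such statistics line, which can serve as a sanity check when parsing these reports. The function name `hsp_percentages` is hypothetical, and the pattern tolerates the alternative spelling "Postives" that some report renderers emit.

```python
import re

# Matches e.g. "Identity = 71/156 (45.51%), Positives = 100/156 (64.10%)".
HSP_STATS = re.compile(
    r"Identity\s*=\s*(\d+)/(\d+).*?Posi?tives\s*=\s*(\d+)/(\d+)"
)

def hsp_percentages(line):
    """Return (identity %, positives %) recomputed from the raw counts."""
    m = HSP_STATS.search(line)
    if m is None:
        raise ValueError(f"not an HSP statistics line: {line!r}")
    ident, ident_len, pos, pos_len = map(int, m.groups())
    return (round(100 * ident / ident_len, 2), round(100 * pos / pos_len, 2))

line = "Identity = 71/156 (45.51%), Positives = 100/156 (64.10%), Query Frame = 1"
print(hsp_percentages(line))  # (45.51, 64.1)
```

The recomputed values agree with the percentages printed for the POL2_DROME hit: 71 identical and 100 positive positions over an alignment of 156 columns.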

BLAST of CSPI04G20550 vs. Swiss-Prot
Match: YI31B_YEAST (Transposon Ty3-I Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=TY3B-I PE=3 SV=2)

HSP 1 Score: 149.8 bits (377), Expect = 1.4e-34
Identity = 75/167 (44.91%), Positives = 104/167 (62.28%), Query Frame = 1

Query: 714 HLKQGTNPEMERLVDEMLASGIIWPSTSPYSSPVLLVRRKDGSWRFYVDYRALNNVTIPD 773
           H+ +    E+ ++V ++L +  I PS SP SSPV+LV +KDG++R  VDYR LN  TI D
Sbjct: 629 HVTEKNEQEINKIVQKLLDNKFIVPSKSPCSSPVVLVPKKDGTFRLCVDYRTLNKATISD 688

Query: 774 KFPIPVIEELFNELNGANMFSKIDIKARYHQIRMNQEDVEKTTFRTHEGHYEFLVMPFGL 833
            FP+P I+ L + +  A +F+ +D+ + YHQI M  +D  KT F T  G YE+ VMPFGL
Sbjct: 689 PFPLPRIDNLLSRIGNAQIFTTLDLHSGYHQIPMEPKDRYKTAFVTPSGKYEYTVMPFGL 748

Query: 834 TNAPSTFQALMNSIFRPYMRRFVLVFFDDILVAAED----WKNTSSI 877
            NAPSTF   M   FR    RFV V+ DDIL+ +E     WK+  ++
Sbjct: 749 VNAPSTFARYMADTFRDL--RFVNVYLDDILIFSESPEEHWKHLDTV 793

BLAST of CSPI04G20550 vs. Swiss-Prot
Match: YG31B_YEAST (Transposon Ty3-G Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=TY3B-G PE=1 SV=3)

HSP 1 Score: 149.8 bits (377), Expect = 1.4e-34
Identity = 75/167 (44.91%), Positives = 104/167 (62.28%), Query Frame = 1

Query: 714 HLKQGTNPEMERLVDEMLASGIIWPSTSPYSSPVLLVRRKDGSWRFYVDYRALNNVTIPD 773
           H+ +    E+ ++V ++L +  I PS SP SSPV+LV +KDG++R  VDYR LN  TI D
Sbjct: 603 HVTEKNEQEINKIVQKLLDNKFIVPSKSPCSSPVVLVPKKDGTFRLCVDYRTLNKATISD 662

Query: 774 KFPIPVIEELFNELNGANMFSKIDIKARYHQIRMNQEDVEKTTFRTHEGHYEFLVMPFGL 833
            FP+P I+ L + +  A +F+ +D+ + YHQI M  +D  KT F T  G YE+ VMPFGL
Sbjct: 663 PFPLPRIDNLLSRIGNAQIFTTLDLHSGYHQIPMEPKDRYKTAFVTPSGKYEYTVMPFGL 722

Query: 834 TNAPSTFQALMNSIFRPYMRRFVLVFFDDILVAAED----WKNTSSI 877
            NAPSTF   M   FR    RFV V+ DDIL+ +E     WK+  ++
Sbjct: 723 VNAPSTFARYMADTFRDL--RFVNVYLDDILIFSESPEEHWKHLDTV 767

BLAST of CSPI04G20550 vs. Swiss-Prot
Match: POL5_DROME (Retrovirus-related Pol polyprotein from transposon opus OS=Drosophila melanogaster GN=pol PE=3 SV=1)

HSP 1 Score: 147.5 bits (371), Expect = 6.8e-34
Identity = 71/160 (44.38%), Positives = 101/160 (63.12%), Query Frame = 1

Query: 722 EMERLVDEMLASGIIWPSTSPYSSPVLLVRRK-----DGSWRFYVDYRALNNVTIPDKFP 781
           E+ER +DE+L  GII PS SPY+SP+ +V +K     +  +R  VD++ LN VTIPD +P
Sbjct: 138 EVERQIDELLQDGIIRPSNSPYNSPIWIVPKKPKPNGEKQYRMVVDFKRLNTVTIPDTYP 197

Query: 782 IPVIEELFNELNGANMFSKIDIKARYHQIRMNQEDVEKTTFRTHEGHYEFLVMPFGLTNA 841
           IP I      L  A  F+ +D+ + +HQI M + D+ KT F T  G YEFL +PFGL NA
Sbjct: 198 IPDINATLASLGNAKYFTTLDLTSGFHQIHMKESDIPKTAFSTLNGKYEFLRLPFGLKNA 257

Query: 842 PSTFQALMNSIFRPYMRRFVLVFFDDILVAAED----WKN 873
           P+ FQ +++ I R ++ +   V+ DDI+V +ED    WKN
Sbjct: 258 PAIFQRMIDDILREHIGKVCYVYIDDIIVFSEDYDTHWKN 297

BLAST of CSPI04G20550 vs. Swiss-Prot
Match: POL3_DROME (Retrovirus-related Pol polyprotein from transposon 17.6 OS=Drosophila melanogaster GN=pol PE=3 SV=1)

HSP 1 Score: 147.1 bits (370), Expect = 8.8e-34
Identity = 70/154 (45.45%), Positives = 97/154 (62.99%), Query Frame = 1

Query: 717 QGTNPEMERLVDEMLASGIIWPSTSPYSSPVLLVRRKDGS-----WRFYVDYRALNNVTI 776
           Q    E+E  + +ML  GII  S SPY+SP+ +V +K  +     +R  +DYR LN +T+
Sbjct: 217 QAYEQEVESQIQDMLNQGIIRTSNSPYNSPIWVVPKKQDASGKQKFRIVIDYRKLNEITV 276

Query: 777 PDKFPIPVIEELFNELNGANMFSKIDIKARYHQIRMNQEDVEKTTFRTHEGHYEFLVMPF 836
            D+ PIP ++E+  +L   N F+ ID+   +HQI M+ E V KT F T  GHYE+L MPF
Sbjct: 277 GDRHPIPNMDEILGKLGRCNYFTTIDLAKGFHQIEMDPESVSKTAFSTKHGHYEYLRMPF 336

Query: 837 GLTNAPSTFQALMNSIFRPYMRRFVLVFFDDILV 866
           GL NAP+TFQ  MN I RP + +  LV+ DDI+V
Sbjct: 337 GLKNAPATFQRCMNDILRPLLNKHCLVYLDDIIV 370

BLAST of CSPI04G20550 vs. TrEMBL
Match: A0A087HSH3_ARAAL (Uncharacterized protein OS=Arabis alpina GN=AALP_AA1G340100 PE=4 SV=1)

HSP 1 Score: 375.9 bits (964), Expect = 1.3e-100
Identity = 235/631 (37.24%), Positives = 340/631 (53.88%), Query Frame = 1

Query: 340 KKFEIPVFNGTDPDSWLFRADRFFKIHSFTDSEKLSVAIISFDGPAIDWYRFQDECEAFK 399
           ++ EIP+F+G + +SW+ R +++F++ +F+D EKL    + F   A  WY+++ +   F+
Sbjct: 120 RRIEIPLFSGENAESWVQRIEQYFELGNFSDLEKLQAVRVCFLEDAWSWYQWERDRNPFR 179

Query: 400 DWDGLKEKMLTCFP------------TIR-DGSL------------------TIVLEETF 459
            W  L+ ++L  F             T+R DGS+                     LE  F
Sbjct: 180 SWAQLRYRLLDEFSASPNSSAGERLLTLRQDGSVKDYCHEFIALATNAPEIQETTLELAF 239

Query: 460 MNGLYPWLKTEVSLLEPKGLAAMMKLALKIE---------------------HRDGSQRG 519
           M GL P + T     EP+ L  MM +  +++                       + +QR 
Sbjct: 240 MMGLTPAICTRTKTFEPQTLKQMMVVEQRVDCWIEVDDSPPPRVTPPYRKLTADEIAQRK 299

Query: 520 WNDNIERGYAGGELKGRANKKIELRMLVVHDNGEELEIIEDDPYDEELEVKPM-EVGADE 579
             +   R    G +     KK E  +LVV  +G  LE+ +DDP   E EV    E+ A  
Sbjct: 300 AANQCYRCDEVGHMHHMCPKK-EFGVLVVQADGSYLELADDDPVGTEEEVPTQTELAA-- 359

Query: 580 NLNIELSINSVVGLNNSGTMKVKGKVKNEDVVVLIDCGATHNFISEKLVTALNLPLKTTT 639
                LS+NS+VG+++  TMK+KG +++  VVV+ID GA+HNF+S KL+++L LP+  + 
Sbjct: 360 -----LSLNSIVGISSPRTMKLKGLIQSTPVVVMIDSGASHNFVSTKLISSLGLPIDRSN 419

Query: 640 TYGVILGPGTTIKGKG---------------------------------WLKSLGVTEVD 699
            YGV+ G    ++G G                                 WL+SLG   V+
Sbjct: 420 RYGVMTGTRMIVEGIGTCIELELQIQDVVVKAGFLPLELGSADVILGMQWLESLGDMLVN 479

Query: 700 WRKLIMTFQHYWKRITIKGDPSLMKIRVSLKHMMKTWDNEDQSYLIGCRALHSSLTIEEM 759
           W+   M F    ++  ++GD  L    +S K + K   ++ Q  L+    L + L I + 
Sbjct: 480 WKLQRMKFILNSEKAGLQGDVGLCCSEISFKALWKAVTDKGQGVLVEYNGLQAELGIHK- 539

Query: 760 YDEETKPTAENLLPSLLTKFDDVFSWPETLPPQRGIEHHIHLKQGTNP------------ 819
            + E  P     L  +L  F  VF+ P+ LPP RG EH I L+    P            
Sbjct: 540 -EREQIP---EQLQDVLAGFAGVFAEPQGLPPSRGKEHGITLEPNARPVSVCPFRYPQAQ 599

Query: 820 --EMERLVDEMLASGIIWPSTSPYSSPVLLVRRKDGSWRFYVDYRALNNVTIPDKFPIPV 871
             E+E+ V  MLA+GII  S SP+SSPVLLV++KDGSWRF VDYRALN VTIPD FPIP+
Sbjct: 600 REEIEKQVASMLAAGIIQASGSPFSSPVLLVKKKDGSWRFCVDYRALNKVTIPDSFPIPM 659

BLAST of CSPI04G20550 vs. TrEMBL
Match: J3SDF5_BETVU (Ty3/gypsy retrotransposon protein OS=Beta vulgaris subsp. vulgaris PE=4 SV=1)

HSP 1 Score: 349.7 bits (896), Expect = 1.0e-92
Identity = 190/431 (44.08%), Positives = 258/431 (59.86%), Query Frame = 1

Query: 483 GRANKKIELRMLVVHDNGEELEIIEDDPYDEELEVKPMEVGADENLNIELSINSVVGLNN 542
           G   ++ EL +L + DN       E+D  +  L          E +  E+S+NSV+GL+N
Sbjct: 427 GHQCRRKELSVLFMEDN-------EEDELEGALSGSEAPPSPTEEIPPEVSLNSVIGLSN 486

Query: 543 SGTMKVKGKVKNEDVVVLIDCGATHNFISEKLVTALNLPLKTTTTYGVILGPGTTIKGKG 602
             TMK+ G + N +VVV+ID GATHNF+S K +  L +P+  +  +GV LG G  ++G G
Sbjct: 487 PKTMKLSGLIDNHEVVVMIDPGATHNFLSLKAIDKLGIPVTESEEFGVSLGDGQAVRGTG 546

Query: 603 ----------------------------------WLKSLGVTEVDWRKLIMTFQHYWKRI 662
                                             WL++LG    +W+   M+FQ      
Sbjct: 547 ICRAVALYLDGGLVVVEDFLPLGLGNSDVILGVQWLETLGTVVSNWKTQKMSFQLGGVPY 606

Query: 663 TIKGDPSLMKIRVSLKHMMKTWDNEDQSYLIGCRALHSSLTIEEMYDEETKPTAENLLPS 722
           T+ GDP+L + +VSLK M++T   E     + C  + +      + D + +      L  
Sbjct: 607 TLTGDPTLARSKVSLKAMLRTLRKEGGGLWLECNQVEAG-GAGSIRDSKVEQEIPPFLQE 666

Query: 723 LLTKFDDVFSWPETLPPQRGIEHHIHLKQGTNPEM--------------ERLVDEMLASG 782
           L+ +F+ VF  P  LPP+RG EH I LK+G+NP                ERL+ EMLA+G
Sbjct: 667 LMRRFEGVFETPVGLPPRRGHEHAIVLKEGSNPVGVRPYRYPQFQKDEIERLIKEMLAAG 726

Query: 783 IIWPSTSPYSSPVLLVRRKDGSWRFYVDYRALNNVTIPDKFPIPVIEELFNELNGANMFS 842
           II PSTSP+SSPV+LV++KDGSWRF VDYRALN  T+PDK+PIPVI+EL +EL+GA +FS
Sbjct: 727 IIQPSTSPFSSPVILVKKKDGSWRFCVDYRALNKETVPDKYPIPVIDELLDELHGATVFS 786

Query: 843 KIDIKARYHQIRMNQEDVEKTTFRTHEGHYEFLVMPFGLTNAPSTFQALMNSIFRPYMRR 866
           K+D++A YHQI +  ED  KT FRTHEGHYEFLVMPFGLTNAP+TFQ+LMN +FRP++RR
Sbjct: 787 KLDLRAGYHQILVRPEDTHKTAFRTHEGHYEFLVMPFGLTNAPATFQSLMNEVFRPFLRR 846

BLAST of CSPI04G20550 vs. TrEMBL
Match: A5B2I6_VITVI (Putative uncharacterized protein OS=Vitis vinifera GN=VITISV_043911 PE=4 SV=1)

HSP 1 Score: 340.5 bits (872), Expect = 6.1e-90
Identity = 189/431 (43.85%), Positives = 261/431 (60.56%), Query Frame = 1

Query: 488  KIELRMLVVHDNGEELEIIEDDPYDEELEVKPMEVGADENLNIELSINSVVGLNNSGTMK 547
            K ELR+L+VH++ EE    +D+ +D+    +P  +   +   +ELS+NSVVGL   GTMK
Sbjct: 1007 KKELRVLLVHEDEEE----DDNQFDDRATEEPALIELKDA--VELSLNSVVGLTTPGTMK 1066

Query: 548  VKGKVKNEDVVVLIDCGATHNFISEKLVTALNLPLKTTTTYGVILGPGTTIKGKG----- 607
            +KG + +++V++L+D GATHNF+S +LV  L LPL TTT+YGV++G G ++KGKG     
Sbjct: 1067 IKGTIGSKEVIILVDSGATHNFLSLELVQQLTLPLTTTTSYGVMMGTGISVKGKGICRGV 1126

Query: 608  ----------------------------WLKSLGVTEVDWRKLIMTFQHYWKRITIKGDP 667
                                        WL +LG  +V+W+ L M  +     + +KGDP
Sbjct: 1127 CISMQGLTVVEDFLPLELGNTDVILGMPWLGTLGDVKVNWKMLTMKIKMGKAVMVLKGDP 1186

Query: 668  SLMKIRVSLKHMMKTWDNEDQSYLIGCRALHSSLTIEEMYDEETKPTAENLLPSLLTKFD 727
            SL +   S                       ++  + E   E  K   E     +L +  
Sbjct: 1187 SLSRTETS-----------------------TTSDLSEGVQEVPKTVKE-----VLAQHQ 1246

Query: 728  DVFSWPETLPPQRGIEHHIHLKQGTNP--------------EMERLVDEMLASGIIWPST 787
             +F     LPP R I+H I L  G +P              E++RLV EML +GI+ PS 
Sbjct: 1247 QIFEPITGLPPSRDIDHAIQLILGASPVNVRPYRYPHILKNEIKRLVQEMLEAGIVRPSL 1306

Query: 788  SPYSSPVLLVRRKDGSWRFYVDYRALNNVTIPDKFPIPVIEELFNELNGANMFSKIDIKA 847
            SP+SSPVLLV++KDG WRF +DYRALN VT+PD+FPIPVI+EL ++L+GA +FSK+D+K+
Sbjct: 1307 SPFSSPVLLVKKKDGGWRFCIDYRALNKVTVPDRFPIPVIDELLDKLHGATIFSKLDLKS 1366

Query: 848  RYHQIRMNQEDVEKTTFRTHEGHYEFLVMPFGLTNAPSTFQALMNSIFRPYMRRFVLVFF 872
             YHQIR+ Q+D+ KT FRTHEGHYEFLVMPFGLTNAP+TFQ+LMN IF P++ +FVLVFF
Sbjct: 1367 GYHQIRVRQQDIPKTAFRTHEGHYEFLVMPFGLTNAPATFQSLMNRIFWPHLWKFVLVFF 1403

BLAST of CSPI04G20550 vs. TrEMBL
Match: A0A087G289_ARAAL (Uncharacterized protein OS=Arabis alpina GN=AALP_AAs48021U000700 PE=4 SV=1)

HSP 1 Score: 336.3 bits (861), Expect = 1.1e-88
Identity = 187/423 (44.21%), Positives = 257/423 (60.76%), Query Frame = 1

Query: 490 ELRMLVVHDNGEELEIIEDDPYDEELEVKPMEVGADENLNIELSINSVVGLNNSGTMKVK 549
           E  +L+V D+G E+E  ED+  +E++E        D     ELS+NS+VG+++  T+K++
Sbjct: 273 EYAVLIVQDDGSEIEW-EDEGGEEKIEAI-----LDTAEVAELSLNSMVGISSPRTVKLR 332

Query: 550 GKVKNEDVVVLIDCGATHNFISEKLVTALNLPLKTTTTYGVILGPGTTIKGKG------- 609
           G +++E V+V+ID GA+HNF+SEK+V  L L    T  YGV+ G G T++G+G       
Sbjct: 333 GSIRDEPVIVMIDSGASHNFVSEKMVVKLGLTATETKGYGVVTGTGLTVQGRGVCKDVEL 392

Query: 610 --------------------------WLKSLGVTEVDWRKLIMTFQHYWKRITIKGDPSL 669
                                     WL SLG    +W+   + F    K + ++GDPS+
Sbjct: 393 HLQGLVVVAPFLPLELGSADVILGIQWLGSLGDMRCNWKLQKIAFMVEGKEVELQGDPSI 452

Query: 670 MKIRVSLKHMMKTWDNEDQSYLIGCRALHSSLTIEEMYDEETKPTAENLLPSLLTKFDDV 729
               V+LK + K  D E Q  ++     +  L  +    E+  P A   L ++L +F  V
Sbjct: 453 CCSPVTLKGLWKALDQEGQGVIVE----YGGLQAQNPRSEKPVPEA---LSTVLAEFTGV 512

Query: 730 FSWPETLPPQRGIEHHIHLKQGTNP--------------EMERLVDEMLASGIIWPSTSP 789
           F  P  LPP RG EH I LKQ  +P              E+ER V  MLA+GI   S SP
Sbjct: 513 FEEPRGLPPSRGKEHEITLKQEASPVCVRPFRYPQAQREELERQVATMLAAGITKESNSP 572

Query: 790 YSSPVLLVRRKDGSWRFYVDYRALNNVTIPDKFPIPVIEELFNELNGANMFSKIDIKARY 849
           +SSPVLLV++KDGSWRF VDYRALN VT+ D +PIP+I++L +EL+G+ +FSK+D++A Y
Sbjct: 573 FSSPVLLVKKKDGSWRFCVDYRALNKVTVGDSYPIPMIDQLLDELHGSVIFSKLDLRAGY 632

Query: 850 HQIRMNQEDVEKTTFRTHEGHYEFLVMPFGLTNAPSTFQALMNSIFRPYMRRFVLVFFDD 866
           HQIR+  EDV KT FRTH+GHYEFLVMPFGLTNAP TFQ+LMN +FR ++RRFVLVFFDD
Sbjct: 633 HQIRVKAEDVPKTAFRTHDGHYEFLVMPFGLTNAPGTFQSLMNEVFRKFLRRFVLVFFDD 682

BLAST of CSPI04G20550 vs. TrEMBL
Match: A0A087G291_ARAAL (Uncharacterized protein OS=Arabis alpina GN=AALP_AAs48021U000700 PE=4 SV=1)

HSP 1 Score: 336.3 bits (861), Expect = 1.1e-88
Identity = 187/423 (44.21%), Positives = 257/423 (60.76%), Query Frame = 1

Query: 490 ELRMLVVHDNGEELEIIEDDPYDEELEVKPMEVGADENLNIELSINSVVGLNNSGTMKVK 549
           E  +L+V D+G E+E  ED+  +E++E        D     ELS+NS+VG+++  T+K++
Sbjct: 355 EYAVLIVQDDGSEIEW-EDEGGEEKIEAI-----LDTAEVAELSLNSMVGISSPRTVKLR 414

Query: 550 GKVKNEDVVVLIDCGATHNFISEKLVTALNLPLKTTTTYGVILGPGTTIKGKG------- 609
           G +++E V+V+ID GA+HNF+SEK+V  L L    T  YGV+ G G T++G+G       
Sbjct: 415 GSIRDEPVIVMIDSGASHNFVSEKMVVKLGLTATETKGYGVVTGTGLTVQGRGVCKDVEL 474

Query: 610 --------------------------WLKSLGVTEVDWRKLIMTFQHYWKRITIKGDPSL 669
                                     WL SLG    +W+   + F    K + ++GDPS+
Sbjct: 475 HLQGLVVVAPFLPLELGSADVILGIQWLGSLGDMRCNWKLQKIAFMVEGKEVELQGDPSI 534

Query: 670 MKIRVSLKHMMKTWDNEDQSYLIGCRALHSSLTIEEMYDEETKPTAENLLPSLLTKFDDV 729
               V+LK + K  D E Q  ++     +  L  +    E+  P A   L ++L +F  V
Sbjct: 535 CCSPVTLKGLWKALDQEGQGVIVE----YGGLQAQNPRSEKPVPEA---LSTVLAEFTGV 594

Query: 730 FSWPETLPPQRGIEHHIHLKQGTNP--------------EMERLVDEMLASGIIWPSTSP 789
           F  P  LPP RG EH I LKQ  +P              E+ER V  MLA+GI   S SP
Sbjct: 595 FEEPRGLPPSRGKEHEITLKQEASPVCVRPFRYPQAQREELERQVATMLAAGITKESNSP 654

Query: 790 YSSPVLLVRRKDGSWRFYVDYRALNNVTIPDKFPIPVIEELFNELNGANMFSKIDIKARY 849
           +SSPVLLV++KDGSWRF VDYRALN VT+ D +PIP+I++L +EL+G+ +FSK+D++A Y
Sbjct: 655 FSSPVLLVKKKDGSWRFCVDYRALNKVTVGDSYPIPMIDQLLDELHGSVIFSKLDLRAGY 714

Query: 850 HQIRMNQEDVEKTTFRTHEGHYEFLVMPFGLTNAPSTFQALMNSIFRPYMRRFVLVFFDD 866
           HQIR+  EDV KT FRTH+GHYEFLVMPFGLTNAP TFQ+LMN +FR ++RRFVLVFFDD
Sbjct: 715 HQIRVKAEDVPKTAFRTHDGHYEFLVMPFGLTNAPGTFQSLMNEVFRKFLRRFVLVFFDD 764

BLAST of CSPI04G20550 vs. TAIR10
Match: ATMG00850.1 (ATMG00850.1 DNA/RNA polymerases superfamily protein)

HSP 1 Score: 50.4 bits (119), Expect = 6.3e-06
Identity = 24/45 (53.33%), Positives = 32/45 (71.11%), Query Frame = 1

Query: 713 IHLKQGTNPEMERLVDEMLASGIIWPSTSPYSSPVLLVRRKDGSW 758
           IH+ + T   ++  + EML + II PS SPYSSPVLLV++KDG W
Sbjct: 37  IHILRRTR--LKNWLGEMLEARIIQPSISPYSSPVLLVQKKDGGW 79

BLAST of CSPI04G20550 vs. NCBI nr
Match: gi|674252310|gb|KFK45075.1| (hypothetical protein AALP_AA1G340100 [Arabis alpina])

HSP 1 Score: 375.9 bits (964), Expect = 1.9e-100
Identity = 235/631 (37.24%), Positives = 340/631 (53.88%), Query Frame = 1

Query: 340 KKFEIPVFNGTDPDSWLFRADRFFKIHSFTDSEKLSVAIISFDGPAIDWYRFQDECEAFK 399
           ++ EIP+F+G + +SW+ R +++F++ +F+D EKL    + F   A  WY+++ +   F+
Sbjct: 120 RRIEIPLFSGENAESWVQRIEQYFELGNFSDLEKLQAVRVCFLEDAWSWYQWERDRNPFR 179

Query: 400 DWDGLKEKMLTCFP------------TIR-DGSL------------------TIVLEETF 459
            W  L+ ++L  F             T+R DGS+                     LE  F
Sbjct: 180 SWAQLRYRLLDEFSASPNSSAGERLLTLRQDGSVKDYCHEFIALATNAPEIQETTLELAF 239

Query: 460 MNGLYPWLKTEVSLLEPKGLAAMMKLALKIE---------------------HRDGSQRG 519
           M GL P + T     EP+ L  MM +  +++                       + +QR 
Sbjct: 240 MMGLTPAICTRTKTFEPQTLKQMMVVEQRVDCWIEVDDSPPPRVTPPYRKLTADEIAQRK 299

Query: 520 WNDNIERGYAGGELKGRANKKIELRMLVVHDNGEELEIIEDDPYDEELEVKPM-EVGADE 579
             +   R    G +     KK E  +LVV  +G  LE+ +DDP   E EV    E+ A  
Sbjct: 300 AANQCYRCDEVGHMHHMCPKK-EFGVLVVQADGSYLELADDDPVGTEEEVPTQTELAA-- 359

Query: 580 NLNIELSINSVVGLNNSGTMKVKGKVKNEDVVVLIDCGATHNFISEKLVTALNLPLKTTT 639
                LS+NS+VG+++  TMK+KG +++  VVV+ID GA+HNF+S KL+++L LP+  + 
Sbjct: 360 -----LSLNSIVGISSPRTMKLKGLIQSTPVVVMIDSGASHNFVSTKLISSLGLPIDRSN 419

Query: 640 TYGVILGPGTTIKGKG---------------------------------WLKSLGVTEVD 699
            YGV+ G    ++G G                                 WL+SLG   V+
Sbjct: 420 RYGVMTGTRMIVEGIGTCIELELQIQDVVVKAGFLPLELGSADVILGMQWLESLGDMLVN 479

Query: 700 WRKLIMTFQHYWKRITIKGDPSLMKIRVSLKHMMKTWDNEDQSYLIGCRALHSSLTIEEM 759
           W+   M F    ++  ++GD  L    +S K + K   ++ Q  L+    L + L I + 
Sbjct: 480 WKLQRMKFILNSEKAGLQGDVGLCCSEISFKALWKAVTDKGQGVLVEYNGLQAELGIHK- 539

Query: 760 YDEETKPTAENLLPSLLTKFDDVFSWPETLPPQRGIEHHIHLKQGTNP------------ 819
            + E  P     L  +L  F  VF+ P+ LPP RG EH I L+    P            
Sbjct: 540 -EREQIP---EQLQDVLAGFAGVFAEPQGLPPSRGKEHGITLEPNARPVSVCPFRYPQAQ 599

Query: 820 --EMERLVDEMLASGIIWPSTSPYSSPVLLVRRKDGSWRFYVDYRALNNVTIPDKFPIPV 871
             E+E+ V  MLA+GII  S SP+SSPVLLV++KDGSWRF VDYRALN VTIPD FPIP+
Sbjct: 600 REEIEKQVASMLAAGIIQASGSPFSSPVLLVKKKDGSWRFCVDYRALNKVTIPDSFPIPM 659

BLAST of CSPI04G20550 vs. NCBI nr
Match: gi|731338584|ref|XP_010680400.1| (PREDICTED: transposon Tf2-1 polyprotein isoform X1 [Beta vulgaris subsp. vulgaris])

HSP 1 Score: 359.8 bits (922), Expect = 1.4e-95
Identity = 202/438 (46.12%), Positives = 270/438 (61.64%), Query Frame = 1

Query: 482 KGRANKKIELRMLVVHDNGEELEIIEDDPYDEELEVKPMEVGAD---ENLNIELSINSVV 541
           +G   +K E+ +LV+          E+DP  EE E +  +  AD   E   +ELS+NSVV
Sbjct: 374 QGHRCQKKEVSVLVMEG--------EEDPPPEEEEEEVNDASADVSAEVTTVELSLNSVV 433

Query: 542 GLNNSGTMKVKGKVKNEDVVVLIDCGATHNFISEKLVTALNLPLKTTTTYGVILGPGTTI 601
           GL +  TMK+ G +  ++VVV++D GATHNFIS + V  L +PL     +GV LG GT +
Sbjct: 434 GLTSPRTMKLTGVINGQEVVVMVDPGATHNFISLRAVEKLAIPLIGEANFGVSLGTGTMV 493

Query: 602 KGKGWLKS-------------------------LGV--------TEVDWRKLIMTFQHYW 661
           KGKG  +                          LGV           +W+  +M F+   
Sbjct: 494 KGKGECQGVMLEIQGLVIRENFLPLDLGNSDIILGVQWLEKLGSVTTNWKSQLMKFKIGR 553

Query: 662 KRITIKGDPSLMKIRVSLKHMMKTWDNEDQSYLIGCRALHSSLTIEEMYDEETKPTAENL 721
           + +T++GDPSL + R+SLK M++    E Q  L+    +         +D E +      
Sbjct: 554 EEVTLQGDPSLDRTRISLKAMLRALRIEGQGVLVEMNHIEREKEPPGKWDIEVE--VPRP 613

Query: 722 LPSLLTKFDDVFSWPETLPPQRGIEHHIHLKQGTNP--------------EMERLVDEML 781
           L  LL ++  VF+ P  LPP RG EH I LK+G+NP              E+ERLV +ML
Sbjct: 614 LQPLLNQYSQVFNMPSGLPPSRGREHSITLKEGSNPVSVRPYRYPHVQKGEIERLVKDML 673

Query: 782 ASGIIWPSTSPYSSPVLLVRRKDGSWRFYVDYRALNNVTIPDKFPIPVIEELFNELNGAN 841
           A+GII PSTSP+SSPVLLV++KDGSWRF VDYRALN  T+PDK+PIPVI+EL +EL G+ 
Sbjct: 674 AAGIIQPSTSPFSSPVLLVKKKDGSWRFCVDYRALNKETVPDKYPIPVIDELLDELYGSV 733

Query: 842 MFSKIDIKARYHQIRMNQEDVEKTTFRTHEGHYEFLVMPFGLTNAPSTFQALMNSIFRPY 870
           +FSK+D+K+ YHQIR+ +ED+ KT FRTHEGHYEFLVMPFGLTNAP+TFQ+LMN +FRP+
Sbjct: 734 VFSKLDLKSGYHQIRVRKEDIHKTAFRTHEGHYEFLVMPFGLTNAPATFQSLMNEVFRPF 793

BLAST of CSPI04G20550 vs. NCBI nr
Match: gi|778697580|ref|XP_011654353.1| (PREDICTED: uncharacterized protein LOC105435354 [Cucumis sativus])

HSP 1 Score: 359.4 bits (921), Expect = 1.8e-95
Identity = 203/470 (43.19%), Positives = 282/470 (60.00%), Query Frame = 1

Query: 452 MKLALKIEHRDGSQRGWNDNIERGYAGGELKGRANKKIELRMLVVHDNGEELEIIEDDPY 511
           +K  L +E +    +G        Y+ G  + +   K EL + ++++     E +ED+  
Sbjct: 296 VKRLLDVEFKARLDKGLCFKCNERYSPGH-RCKMKDKRELMLFIMNEE----ESLEDEDR 355

Query: 512 DEELEVKPMEVGA---DENLNIELSINSVVGLNNSGTMKVKGKVKNEDVVVLIDCGATHN 571
            EE   + +E+     +E   IEL   ++ GL + GTMK+KG++K ++V++LID GATHN
Sbjct: 356 TEETNEEVLELNQLTLEEGTEIELK--AIHGLTSKGTMKIKGEIKGKEVLILIDSGATHN 415

Query: 572 FISEKLVTALNLPLKTTTTYGVILGPGTTIKGKG-------------------------- 631
           FI  K+V  + L L+  T +GV +G GT  +G+G                          
Sbjct: 416 FIHNKIVEEVGLELENHTPFGVTIGDGTRCQGRGVCNRLELKLKEITIVADFLAIELGSV 475

Query: 632 -------WLKSLGVTEVDWRKLIMTFQHYWKRITIKGDPSLMKIRVSLKHMMKTWDNEDQ 691
                  WL + G  ++ W  L MTF+   K+  +KGDPSL++   SLK + KTW+ +DQ
Sbjct: 476 DVILGMQWLNTTGTMKIHWPSLTMTFRMGKKQFILKGDPSLIRAECSLKTIEKTWEEDDQ 535

Query: 692 SYLIGCRALHSSLT--IEEMY----DEETKPTAENLLPSLLTKFDDVFSWPETLPPQRGI 751
            +L+  +   +     ++E+     DEE  P    ++  LL ++ D+F  P+ LPP+R  
Sbjct: 536 GFLLEMQNYEAEEDGELDEVQRVKGDEEESP----MIQVLLQQYTDLFEEPKGLPPKREC 595

Query: 752 EHHIHLKQGTNP--------------EMERLVDEMLASGIIWPSTSPYSSPVLLVRRKDG 811
           +H I L  G  P              E+E+L+ EML  GII PS SPYSSPVLLVR+KDG
Sbjct: 596 DHRILLVTGQKPINVRPYKYGHTQKEEIEKLISEMLQVGIIRPSHSPYSSPVLLVRKKDG 655

Query: 812 SWRFYVDYRALNNVTIPDKFPIPVIEELFNELNGANMFSKIDIKARYHQIRMNQEDVEKT 866
            WRF VDYR LN VTI DKFPIPVIEEL +EL+GA +FSK+D+K+ YHQIRM +EDVEKT
Sbjct: 656 GWRFCVDYRKLNQVTISDKFPIPVIEELLDELHGATVFSKLDLKSGYHQIRMKEEDVEKT 715

BLAST of CSPI04G20550 vs. NCBI nr
Match: gi|729344250|ref|XP_010541181.1| (PREDICTED: uncharacterized protein LOC104814705 [Tarenaya hassleriana])

HSP 1 Score: 358.2 bits (918), Expect = 4.0e-95
Identity = 197/433 (45.50%), Positives = 270/433 (62.36%), Query Frame = 1

Query: 483  GRANKKIELRMLV---VHDNGEELEIIEDDPYDEELEVKPMEVGADENLNIELSINSVVG 542
            G   K+ EL++++   + + GEELE  +D+               DE    ELS+NSVVG
Sbjct: 593  GHRCKQKELQVILAEEITETGEELEEEQDNEAGNR---------EDEGEFAELSLNSVVG 652

Query: 543  LNNSGTMKVKGKVKNEDVVVLIDCGATHNFISEKLVTALNLPLKTTTTYGVILGPGTTIK 602
            L +  T+K++G ++ ++VVVLID GATHNFIS KL+  L L  +  T +GV LG G  +K
Sbjct: 653  LTSPKTLKIRGSIEGQEVVVLIDSGATHNFISLKLMKKLKLRPEGNTQFGVSLGTGMKVK 712

Query: 603  GKG---------------------------------WLKSLGVTEVDWRKLIMTFQHYWK 662
            GKG                                 WL+ LG  ++D++ L + F     
Sbjct: 713  GKGICKAVHLQLQQIEVVEDFLPLELGSADLILGVQWLQKLGKVQMDFQDLELKFNQGTS 772

Query: 663  RITIKGDPSLMKIRVSLKHMMKTWDNEDQSYLIGCRALHSSLTIEEMYDEETKPTAENLL 722
             +T+ GDP+L    V+L+ ++K+  + DQSYL+    L   + ++    E+        L
Sbjct: 773  WVTVTGDPTLHSSLVTLRSLIKSVCDGDQSYLVKLETLEEQVGVDSNLPEK--------L 832

Query: 723  PSLLTKFDDVFSWPETLPPQRGIEHHIHLKQGTNP--------------EMERLVDEMLA 782
             ++L +F  VF  P  LPP+RG EH I+LK+GT P              E+E+LV +ML 
Sbjct: 833  QAVLEEFGPVFEIPTELPPERGREHPINLKEGTGPVSVRPYRYPHAHKEEIEKLVKDMLK 892

Query: 783  SGIIWPSTSPYSSPVLLVRRKDGSWRFYVDYRALNNVTIPDKFPIPVIEELFNELNGANM 842
            +GI+ PS SP+SSPVLLV++KDGSWRF +DYRALN VT+ DKFPIP+I++L +EL+GA +
Sbjct: 893  AGIVRPSQSPFSSPVLLVKKKDGSWRFCIDYRALNKVTVLDKFPIPMIDQLLDELHGARV 952

Query: 843  FSKIDIKARYHQIRMNQEDVEKTTFRTHEGHYEFLVMPFGLTNAPSTFQALMNSIFRPYM 866
            FSK+D+++ YHQIRM  ED+ KT FRTH+GHYEFLVMPFGLTNAP+TFQALMN IFRPY+
Sbjct: 953  FSKLDLRSGYHQIRMKTEDIPKTAFRTHDGHYEFLVMPFGLTNAPATFQALMNEIFRPYL 1008

BLAST of CSPI04G20550 vs. NCBI nr
Match: gi|922423376|ref|XP_013617706.1| (PREDICTED: uncharacterized protein LOC106324252 [Brassica oleracea var. oleracea])

HSP 1 Score: 350.5 bits (898), Expect = 8.4e-93
Identity = 194/427 (45.43%), Positives = 264/427 (61.83%), Query Frame = 1

Query: 488 KIELRMLVVHDNGEELEIIEDDPYDEELEVKPMEVGADENLNIELSINSVVGLNNSGTMK 547
           + E+ +++V ++G E+E+ E+   D E +   M+         ELS+NSVVGL++  TMK
Sbjct: 404 RAEMLVVMVMEDGTEIEMAEEQWGDGEDDPTKMQAEV-----AELSLNSVVGLSSPKTMK 463

Query: 548 VKGKVKNEDVVVLIDCGATHNFISEKLVTALNLPLKTTTTYGVILGPGTTIKGKG----- 607
           V+G +  E VV+LID GA+HNFISEK+VT LNL  +   +YGV++  G T++G+G     
Sbjct: 464 VRGTIHGEAVVILIDNGASHNFISEKIVTKLNLQKRAVASYGVMVAGGATLEGQGVIIGL 523

Query: 608 ----------------------------WLKSLGVTEVDWRKLIMTFQHYWKRITIKGDP 667
                                       WL +LG   V+W+   M +    + I ++GDP
Sbjct: 524 ELRLPGYVVVTDFLPLELGIADVILGVQWLDTLGDVNVNWKLQCMRYHDGEEEIILQGDP 583

Query: 668 SLMKIRVSLKHMMKTWDNEDQSYLIGCRALHSS--LTIEEMYDEETKPTAENLLPSLLTK 727
           SL    VSLK M KT   E +  L+    L +S  + +   + EE K         +L +
Sbjct: 584 SLHSASVSLKSMWKTLQKEGEGVLLEFGGLRASEDVVLPVAWPEELK--------EVLEQ 643

Query: 728 FDDVFSWPETLPPQRGIEHHIHLKQGTNP--------------EMERLVDEMLASGIIWP 787
           +  VFS P  LPP RG EH I L+ G  P              E+E+ +  MLA+GII  
Sbjct: 644 YTQVFSEPRGLPPSRGREHTIILENGAKPVSIRPFRYPHAQKEEIEQQIASMLAAGIIQE 703

Query: 788 STSPYSSPVLLVRRKDGSWRFYVDYRALNNVTIPDKFPIPVIEELFNELNGANMFSKIDI 847
           ++SP+SSPVLLVR+KDGSWRF VDYRALN  T+ DK+PIP+I++L +EL+GA +FSKID+
Sbjct: 704 TSSPFSSPVLLVRKKDGSWRFCVDYRALNKYTVADKYPIPMIDQLLDELHGATIFSKIDL 763

Query: 848 KARYHQIRMNQEDVEKTTFRTHEGHYEFLVMPFGLTNAPSTFQALMNSIFRPYMRRFVLV 866
           ++ YHQIR+  EDV KT FRTH+GHYEFLVMPFGL+NAP+TFQALMN IFRPY+R+FVLV
Sbjct: 764 RSGYHQIRVRAEDVPKTAFRTHDGHYEFLVMPFGLSNAPATFQALMNDIFRPYLRKFVLV 817
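
The percentages on each HSP statistics line follow directly from the counts next to them (for example, 203 identical residues over an aligned length of 470 gives 43.19%). The sketch below is one way to re-derive them as a consistency check. It assumes the statistics line is laid out exactly as in this report; `parse_hsp_stats` and `percent` are hypothetical helper names, not part of any BLAST tool.

```python
import re

# Layout of the statistics line as it appears in this report. This is an
# assumption about one rendering of BLAST output, not a general BLAST parser.
# "Pos\w*" tolerates spelling variants of the "Positives" label.
HSP_STATS = re.compile(
    r"Identity\s*=\s*(\d+)/(\d+)\s*\(([\d.]+)%\),\s*"
    r"Pos\w*\s*=\s*(\d+)/(\d+)\s*\(([\d.]+)%\)"
)


def percent(count, total):
    """Return count/total as a percentage rounded to two decimals."""
    return round(100.0 * count / total, 2)


def parse_hsp_stats(line):
    """Extract counts and reported percentages from one statistics line."""
    match = HSP_STATS.search(line)
    if match is None:
        raise ValueError(f"not an HSP statistics line: {line!r}")
    ident, ident_len, ident_pct, pos, pos_len, pos_pct = match.groups()
    return {
        "identical": int(ident),
        "positive": int(pos),
        "aligned_length": int(ident_len),
        "reported_identity_pct": float(ident_pct),
        "reported_positive_pct": float(pos_pct),
        # Recomputed from the raw counts for comparison with the report.
        "computed_identity_pct": percent(int(ident), int(ident_len)),
        "computed_positive_pct": percent(int(pos), int(pos_len)),
    }


if __name__ == "__main__":
    line = "Identity = 203/470 (43.19%), Positives = 282/470 (60.00%), Query Frame = 1"
    print(parse_hsp_stats(line))
```

Comparing the `computed_*` and `reported_*` fields is a cheap way to catch copy or extraction errors in a scraped report.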

The following BLAST results are available for this feature:
Match Name   E-value  Identity (%)  Description
POL2_DROME   1.0e-34  45.51         Retrovirus-related Pol polyprotein from transposon 297 OS=Drosophila melanogaste...
YI31B_YEAST  1.4e-34  44.91         Transposon Ty3-I Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 20...
YG31B_YEAST  1.4e-34  44.91         Transposon Ty3-G Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 20...
POL5_DROME   6.8e-34  44.38         Retrovirus-related Pol polyprotein from transposon opus OS=Drosophila melanogast...
POL3_DROME   8.8e-34  45.45         Retrovirus-related Pol polyprotein from transposon 17.6 OS=Drosophila melanogast...

Match Name        E-value   Identity (%)  Description
A0A087HSH3_ARAAL  1.3e-100  37.24         Uncharacterized protein OS=Arabis alpina GN=AALP_AA1G340100 PE=4 SV=1
J3SDF5_BETVU      1.0e-92   44.08         Ty3/gypsy retrotransposon protein OS=Beta vulgaris subsp. vulgaris PE=4 SV=1
A5B2I6_VITVI      6.1e-90   43.85         Putative uncharacterized protein OS=Vitis vinifera GN=VITISV_043911 PE=4 SV=1
A0A087G289_ARAAL  1.1e-88   44.21         Uncharacterized protein OS=Arabis alpina GN=AALP_AAs48021U000700 PE=4 SV=1
A0A087G291_ARAAL  1.1e-88   44.21         Uncharacterized protein OS=Arabis alpina GN=AALP_AAs48021U000700 PE=4 SV=1

Match Name   E-value  Identity (%)  Description
ATMG00850.1  6.3e-06  53.33         ATMG00850.1 DNA/RNA polymerases superfamily protein

Match Name                        E-value   Identity (%)  Description
gi|674252310|gb|KFK45075.1|       1.9e-100  37.24         hypothetical protein AALP_AA1G340100 [Arabis alpina]
gi|731338584|ref|XP_010680400.1|  1.4e-95   46.12         PREDICTED: transposon Tf2-1 polyprotein isoform X1 [Beta vulgaris subsp. vulgari...
gi|778697580|ref|XP_011654353.1|  1.8e-95   43.19         PREDICTED: uncharacterized protein LOC105435354 [Cucumis sativus]
gi|729344250|ref|XP_010541181.1|  4.0e-95   45.50         PREDICTED: uncharacterized protein LOC104814705 [Tarenaya hassleriana]
gi|922423376|ref|XP_013617706.1|  8.4e-93   45.43         PREDICTED: uncharacterized protein LOC106324252 [Brassica oleracea var. oleracea...

The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term       Definition
IPR000477  RT_dom
GO Assignments
This gene is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0008150 biological_process
cellular_component GO:0005575 cellular_component
molecular_function GO:0003674 molecular_function

The following mRNA feature(s) are a part of this gene:

Feature Name    Unique Name     Type
CSPI04G20550.1  CSPI04G20550.1  mRNA


Analysis Name: InterPro Annotations of cucumber (PI183967)
Date Performed: 2017-01-17
IPR Term   IPR Description               Source   Source Term       Source Description   Alignment
IPR000477  Reverse transcriptase domain  PFAM     PF00078           RVT_1                coord: 752..869; score: 2.3
IPR000477  Reverse transcriptase domain  PROFILE  PS50878           RT_POL               coord: 731..876; score: 1
None       No IPR available              unknown  Coil              Coil                 coord: 316..336; scor
None       No IPR available              GENE3D   G3DSA:3.10.10.10                       coord: 713..844; score: 2.0
None       No IPR available              PANTHER  PTHR24559         FAMILY NOT NAMED     coord: 743..870; score: 2.0
None       No IPR available              PANTHER  PTHR24559:SF202   SUBFAMILY NOT NAMED  coord: 743..870; score: 2.0
None       No IPR available              PFAM     PF13975           gag-asp_proteas      coord: 548..618; score: 1.
None       No IPR available              unknown  SSF56672          DNA/RNA polymerases  coord: 689..872; score: 3.22
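
The `coord: start..end` entries in the InterPro annotations can be compared directly to see how the predicted domains relate to each other. The following is a minimal sketch, assuming the coordinates are 1-based, inclusive residue positions on the protein (the report does not state this explicitly); `parse_coord`, `span_length`, and `overlap` are hypothetical helper names.

```python
import re

# Matches entries such as "coord: 752..869" as written in this report.
COORD = re.compile(r"coord:\s*(\d+)\.\.(\d+)")


def parse_coord(text):
    """Return (start, end) from a 'coord: start..end' entry."""
    match = COORD.search(text)
    if match is None:
        raise ValueError(f"no coordinate found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def span_length(span):
    """Length of a span, assuming both endpoints are inclusive."""
    start, end = span
    return end - start + 1


def overlap(a, b):
    """Number of positions shared by two inclusive spans (0 if disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


if __name__ == "__main__":
    pfam_rt = parse_coord("coord: 752..869")      # PF00078 (RVT_1)
    profile_rt = parse_coord("coord: 731..876")   # PS50878 (RT_POL)
    protease = parse_coord("coord: 548..618")     # PF13975 (gag-asp_proteas)
    print(span_length(pfam_rt), overlap(pfam_rt, profile_rt))
    print(overlap(protease, pfam_rt))
```

Under the inclusive-coordinate assumption, the PF00078 span lies entirely inside the PS50878 span, while the PF13975 span does not overlap either reverse transcriptase hit.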

The following gene(s) are orthologous to this gene:
Gene          Orthologue       Organism                    Block
CSPI04G20550  CsaV3_4G030990   Cucumber (Chinese Long) v3  cpicucB212
CSPI04G20550  Cla97C04G069640  Watermelon (97103) v2       cpiwmbB314
The following gene(s) are paralogous to this gene:

None