Cucsa.308940 (gene) Cucumber (Gy14) v1

Name: Cucsa.308940
Type: gene
Organism: Cucumis sativus (Cucumber (Gy14) v1)
Description: Cysteine proteinases superfamily protein isoform 1
Location: scaffold02978 : 900608 .. 904154 (-)
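The Location field can be read mechanically. The following Python sketch splits it into scaffold, coordinates, and strand and computes the span length. It assumes the coordinates are 1-based and inclusive, which this page does not state; the function name `parse_location` is hypothetical.

```python
import re

def parse_location(text):
    """Parse 'scaffold : start .. end (strand)' into its parts."""
    m = re.match(r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)", text)
    if m is None:
        raise ValueError(f"unrecognized location: {text!r}")
    scaffold, start, end, strand = m.groups()
    start, end = int(start), int(end)
    return {
        "scaffold": scaffold,
        "start": start,
        "end": end,
        "strand": strand,
        # Assumption: inclusive coordinates, so the span is end - start + 1.
        "length": end - start + 1,
    }

loc = parse_location("scaffold02978 : 900608 .. 904154 (-)")
print(loc["scaffold"], loc["strand"], loc["length"])
```

Under the inclusive-coordinate assumption the span is 3547 bases, and the `(-)` marker indicates that the feature lies on the reverse strand of the scaffold.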
The following sequences are available for this feature:

Gene sequence (with intron)

ATCGAAGAAAAAGCTGCTCTGATTTCTGTGGGAGAGAGAGATAGAGAGATAGAGAGAGAGAGAGAGAAAAGAGAAAAGAGAAAAGAGAAAGAAGAGAGGCTGAGCGATTAACTGGCCGCCGGAGCTCTAATTATGGCCGCATAGGCACAGCAAGGGGCAGGTCCCTGGCAGCGCAGGATTGCCTCGACTGCTGTAAGTGACTGATGCTCGGAGTACTTTGTGCTCGTCCTAAGCCTTGGATTCTCGTTTCCCTATCCAATTTCATTCACGGCTCAGCCGTTTACCATCACCACCATCATCAAAGCCGGTTACTCGTTCAGAGTCCCATCCAATTCGATCGGCGACAGCGCCACCATTCCAGCGCCTGCAAGCTTGCGGGCGGTGGTGCTGCTTCGATATGGCACGCTATAATGCCTTCTGGTGCGGGAAGCAGTAGCAATCTCTGCCGTCCGGCGATTCACTGCCATGAGCGCAAAGGAGAGGGATCTTGGAACGTCGCCTGGGACGCTCGCCCGGCTCGTTGGCTCCACCGTCCCGATTCGGCTTGGCTGCTGTTTGGTGTCTGTGCCTGTATTGCGCCGCTTGATTGGGTGGATGCAAGTCATGAGGCTGTATCGTTAGATCAGAAGAAGGAAGTGTGTGAATCGAGCGGCCCTGAATTTAATCAAAACGATGAGAGCTCTGCTGATTACAGGGTGACAGGTATTGTGGAAAATTTGCGTGGCAATTTATCTTGATTTGATGAAAATTCCATTAGAGAAGCCATCGAAAATATCAATTCAAGTTCAGTTTAAATAAGGGTTTACCATTTATCTCTGTTCTATGTTATTTTTATTAATTGCTTTGATATGAGATAGATTGAAATGAACTGCTACAGGTGTGCTAGCGGATGGTCGGTGCTTGTTTAGGGCAATCGCTCATGGAGCTTGTTTGAGAAGTGGGGAAGAAGCTCCTGATGATGATCGTCAAAGAGAACTCGCTGATGAATTAAGGGCTAAGGTAAGATTAGAATTCATGCATAATCATCCATTTGTTGGTTACTATGTTGAATAACATGGGTAATATGCAAAACTACAGGTTGTGGATGAGCTCTTAAAGAGGCGGAAGGAAACAGAGTGGTAAGTTCAGTTCATCTGATACACATTACACGGATTTCATATATTTGGTTGAATCAGAAAATGGGTGGTTTGTGATTGTGATGATTGAAAATGCCTTTAGGTATATTGAAGGAGATTTTGATGCGTATGTGAAGAGAATTCAGCAACCTTTCGTGTGGGGTGGAGAACCTGAGTTACTTATGGCATCTCATGTTCTGAAGTAAGTTTTTATTTATATTGCTAACCACAAATTTGTCTCACTGTTTTATATAAAAAATTGATCCAAAACAAAGAAAGAGGAAGTAAGAAACGAATGGTAACTCAAAACAAAAAAGGAGAAGTAAGAAACAAAATTGAAACACTTTGTTCTGACAAACAGGGAACTGGTATTGGTGGAAATTTAAGGGCACGAATGGTAGGGAGCTAAGAGGGGATGATGGGTTGAATGCATGTTGGCCATGTGTTTAAGATTTAATATCGAGTTTTTTTAACATCCAAAATGTAGTGGAAATGTAGTAGGGTCCAGCCGATTGTCTCCTGATATTAGTTAAGGGTACACTTAAGGCATTTGGAGTCCTAAAAGATTAAATTAGTGTTACTTTGATGCATTTTAGTTTGTGTTCTTATTGGTGTAAACTTCATTTTTCTTCAAGCACTTTAAGTGGTGAAAAAGCCAGAGAGCTTTTGTGTTATTAACTTTTGAAGTTTATTTTAAAGAGTTTTTATAAAAGAATAGACTGTATGTATTATTTGGGGTTTAACTTTTTGTATATCACTTGATAACTTAGAATAAGGAAATCACTATTAGAATATATGTTCTTTTTAACCTTAGTAACACCAGCGTAAACTGTGAGTCAAATATTCAAACACATTGGTATAGTTCTTGAGAATAGCAGTTTTAAG
TGTTCAATCTAACTATATTTTTTCAAGTGTGCTTGACATTACCAAACTTTTAAAGTGCTTTTAACCATTATAATAAAAAACTATCGTTATACTTGGCTACCTGGCTAAAAGTATTTTCAGTTTAATAATCAGTACTTTTAGTAAAAAAACAATTAAATTGGATTTGACATTTACTTTTAGGGTAGTGGAGAATGTTTTTAAAGCTATTTATCCTTTATGCTTTTAAATTTTTTTTTACAAAAAGAGTAGTTTCAAAATGTGTATTTAGATATTTTCTGAAGTATTTATAAAAAGTTACTTTCACTCAAAAGCAATACTCTCAAAAGTCAAAGTCATATGTTACTTTTAGCCATATAACTAAATTTACTCTAATCTGTTTTATAAATTTCCAAAAATTCTTTGAACTGTTAAGAACAAAACATAATAAAAAGTTTAATAGGGATTTTAATCAGGTAGGTTGTTTCAAGAACCACAACAAACACTCTAACACATGTCTGTTCCATCACAAGGGTTAAACTCTCCAAGTATAGACATTTGTAGTTTTACATTCATAGGGAGCTTTTTGTTGAGACAGCTTTGTATTGCATGACATACTGCATGGTTTTTTTGTCTTGATGTGTGACACTTCTACCGAGTTGAAAACAACTGTAATTGAAAACTGAATTTGGATTTTAAGTCGATCATTTAGGCCTTGATTTTGAAAATTCTGAGTTTCCTACACTGGTTTTTATGCATGTCTTTCTTATCAAAAGCAAGTTTTTGTTTAAGTTTCATCTTTTTGTCTAGTTTTTAAAACCTGTAATTGATTTTGAAAAGATGGTTTGAAAGAAATTAGATAACAGAACAAGAGACCGAAATGGTTGTCTAAGAAACCTTAGTTACTAAACTTTTTGCATTTTCTTGAACACTTTGTTAACTAAACTTCTTTTTTTTTTTCTTGAACACTGCACTTTGTACCCAAACCTCATAGGTGAAAGCTGTATTTACAAGTTTAATATTCAAAAGCAAAATTGGTTGTCAAAGGGGATTTAGTTACTAAACTTTTTGCATTTTCTTTGAACACTTTGCACTTGGCAGGACTCCAATATCAGTATTCATGAGAGAGAGGAGCTCAGATGGTCTGATAAACATAGCCAAGTATGGTCAAGAGTATCAGAAAGGTGAAGAAAGTCCTATCAACGTGCTGTTCCATGGGTATGGTCATTATGATATTCTGGAGACTTCGTCAGACAAAGTTTCACTGAAACTAAGCATGTAGAGAAATTCAAAATGGCTTATAATCTGTGGGGTTAGACAGATTAAAGTCATTCACATTAGGATTAGGCAACTTTCTCCAAAAGTTGAATAAAACTGCATTAATTTTAATAATTGGCACCAACAACCCCATTTCAAGCGATTGGCTTGTAAAAGTTTTGTAGGTAATTGATTTTCTGTTATGCCTTTTACATTTTCAAATTTGTGTGTGAACTTCAACTTGATAATGCAATTCTAGGTGTTTTGGTGCATTGCTTTCCTTAATTCCATTGATGAATTTTCTTTTGATGATA

mRNA sequence

ATCGAAGAAAAAGCTGCTCTGATTTCTGTGGgagagagagatagagagatagagagagagagagagaaaagagaaaagagaaaagagaaagaagagagGCTGAGCGATTAACTGGCCGCCGGAGCTCTAATTATGGCCGCATAGGCACAGCAAGGGGCAGGTCCCTGGCAGCGCAGGATTGCCTCGACTGCTGTAAGTGACTGATGCTCGGAGTACTTTGTGCTCGTCCTAAGCCTTGGATTCTCGTTTCCCTATCCAATTTCATTCACGGCTCAGCCGTTTACCATCACCACCATCATCAAAGCCGGTTACTCGTTCAGAGTCCCATCCAATTCGATCGGCGACAGCGCCACCATTCCAGCGCCTGCAAGCTTGCGGGCGGTGGTGCTGCTTCGATATGGCACGCTATAATGCCTTCTGGTGCGGGAAGCAGTAGCAATCTCTGCCGTCCGGCGATTCACTGCCATGAGCGCAAAGGAGAGGGATCTTGGAACGTCGCCTGGGACGCTCGCCCGGCTCGTTGGCTCCACCGTCCCGATTCGGCTTGGCTGCTGTTTGGTGTCTGTGCCTGTATTGCGCCGCTTGATTGGGTGGATGCAAGTCATGAGGCTGTATCGTTAGATCAGAAGAAGGAAGTGTGTGAATCGAGCGGCCCTGAATTTAATCAAAACGATGAGAGCTCTGCTGATTACAGGGTGACAGGTGTGCTAGCGGATGGTCGGTGCTTGTTTAGGGCAATCGCTCATGGAGCTTGTTTGAGAAGTGGGGAAGAAGCTCCTGATGATGATCGTCAAAGAGAACTCGCTGATGAATTAAGGGCTAAGGTTGTGGATGAGCTCTTAAAGAGGCGGAAGGAAACAGAGTGGTAAGTTCAGTTCATCTGATACACATTACACGGATTTCATATATTTGGTTGAATCAGAAAATGGGTGGTTTGTGATTGTGATGATTGAAAATGCCTTTAGGTATATTGAAGGAGATTTTGATGCGTATGTGAAGAGAATTCAGCAACCTTTCGTGTGGGGTGGAGAACCTGAGTTACTTATGGCATCTCATGTTCTGAAGACTCCAATATCAGTATTCATGAGAGAGAGGAGCTCAGATGGTCTGATAAACATAGCCAAGTATGGTCAAGAGTATCAGAAAGGTGAAGAAAGTCCTATCAACGTGCTGTTCCATGGGTATGGTCATTATGATATTCTGGAGACTTCGTCAGACAAAGTTTCACTGAAACTAAGCATGTAGAGAAATTCAAAATGGCTTATAATCTGTGGGGTTAGACAGATTAAAGTCATTCACATTAGGATTAGGCAACTTTCTCCAAAAGTTGAATAAAACTGCATTAATTTTAATAATTGGCACCAACAACCCCATTTCAAGCGATTGGCTTGTAAAAGTTTTGTAGGTAATTGATTTTCTGTTATGCCTTTTACATTTTCAAATTTGTGTGTGAACTTCAACTTGATAATGCAATTCTAGGTGTTTTGGTGCATTGCTTTCCTTAATTCCATTGATGAATTTTCTTTTGATGATA

Coding sequence (CDS)

ATGCTCGGAGTACTTTGTGCTCGTCCTAAGCCTTGGATTCTCGTTTCCCTATCCAATTTCATTCACGGCTCAGCCGTTTACCATCACCACCATCATCAAAGCCGGTTACTCGTTCAGAGTCCCATCCAATTCGATCGGCGACAGCGCCACCATTCCAGCGCCTGCAAGCTTGCGGGCGGTGGTGCTGCTTCGATATGGCACGCTATAATGCCTTCTGGTGCGGGAAGCAGTAGCAATCTCTGCCGTCCGGCGATTCACTGCCATGAGCGCAAAGGAGAGGGATCTTGGAACGTCGCCTGGGACGCTCGCCCGGCTCGTTGGCTCCACCGTCCCGATTCGGCTTGGCTGCTGTTTGGTGTCTGTGCCTGTATTGCGCCGCTTGATTGGGTGGATGCAAGTCATGAGGCTGTATCGTTAGATCAGAAGAAGGAAGTGTGTGAATCGAGCGGCCCTGAATTTAATCAAAACGATGAGAGCTCTGCTGATTACAGGGTGACAGGTGTGCTAGCGGATGGTCGGTGCTTGTTTAGGGCAATCGCTCATGGAGCTTGTTTGAGAAGTGGGGAAGAAGCTCCTGATGATGATCGTCAAAGAGAACTCGCTGATGAATTAAGGGCTAAGGTTGTGGATGAGCTCTTAAAGAGGCGGAAGGAAACAGAGTGGTAA

Protein sequence

MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQFDRRQRHHSSACKLAGGGAASIWHAIMPSGAGSSSNLCRPAIHCHERKGEGSWNVAWDARPARWLHRPDSAWLLFGVCACIAPLDWVDASHEAVSLDQKKEVCESSGPEFNQNDESSADYRVTGVLADGRCLFRAIAHGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETEW*
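The relationship between the coding sequence and the protein sequence above can be checked with a short translation routine. This is a minimal sketch using the standard genetic code, which is an assumption since the page does not name the translation table; `translate` is a hypothetical helper, not part of any database tool.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(cds):
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

# First 33 and last 15 nucleotides of the CDS listed above.
print(translate("ATGCTCGGAGTACTTTGTGCTCGTCCTAAGCCT"))  # start of the protein
print(translate("GAAACAGAGTGGTAA"))                    # end of the protein
```

The two fragments translate to `MLGVLCARPKP` and `ETEW*`, matching the start and end of the protein sequence shown above.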
BLAST of Cucsa.308940 vs. Swiss-Prot
Match: OTU_ARATH (OTU domain-containing protein At3g57810 OS=Arabidopsis thaliana GN=At3g57810 PE=2 SV=1)

HSP 1 Score: 202.6 bits (514), Expect = 6.3e-51
Identity = 93/165 (56.36%), Positives = 123/165 (74.55%), Query Frame = 1

Query: 148 SSGPEFNQNDESSADYRVTGVLADGRCLFRAIAHGACLRSGEEAPDDDRQRELADELRAK 207
           SS  +F+       DY + G+  DGRCLFR++AHG CLRSG+ AP +  QRELADELR +
Sbjct: 153 SSDGKFHNGKRVYTDYSIIGIPGDGRCLFRSVAHGFCLRSGKLAPGEKMQRELADELRTR 212

Query: 208 VVDELLKRRKETEWYIEGDFDAYVKRIQQPFVWGGEPELLMASHVLKTPISVFMRERSSD 267
           V DE ++RR+ETEW++EGDFD YV++I+ P VWGGEPEL MASHVL+ PI+V+M++  + 
Sbjct: 213 VADEFIQRRQETEWFVEGDFDTYVRQIRDPHVWGGEPELFMASHVLQMPITVYMKDDKAG 272

Query: 268 GLINIAKYGQEYQKGEESPINVLFHGYGHYDILETSSDKVSLKLS 313
           GLI+IA+YGQEY  G++ PI VL+HG+GHYD L     K S+  S
Sbjct: 273 GLISIAEYGQEY--GKDDPIRVLYHGFGHYDALLLHESKASIPKS 315
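The percentages in an HSP summary line are the match counts divided by the alignment length. The Python sketch below recomputes them from a summary line as a consistency check; the regular expression and the name `check_hsp_line` are illustrative assumptions, not part of BLAST itself.

```python
import re

# Matches fields such as "Identity = 93/165 (56.36%)".
HSP_FIELD = re.compile(r"(\w+) = (\d+)/(\d+) \(([\d.]+)%\)")

def check_hsp_line(line):
    """Return {field: (recomputed_percent, reported_percent)} for an HSP line."""
    results = {}
    for name, matches, length, reported in HSP_FIELD.findall(line):
        computed = round(100.0 * int(matches) / int(length), 2)
        results[name] = (computed, float(reported))
    return results

line = "Identity = 93/165 (56.36%), Positives = 123/165 (74.55%), Query Frame = 1"
for field, (computed, reported) in check_hsp_line(line).items():
    print(field, computed, reported)
```

For the Swiss-Prot hit above, 93 of 165 aligned positions are identical (56.36%) and 123 of 165 are scored as positive (74.55%), in agreement with the reported values.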

BLAST of Cucsa.308940 vs. TrEMBL
Match: A0A0A0KRE9_CUCSA (Uncharacterized protein OS=Cucumis sativus GN=Csa_5G615810 PE=4 SV=1)

HSP 1 Score: 654.8 bits (1688), Expect = 5.2e-185
Identity = 313/313 (100.00%), Positives = 313/313 (100.00%), Query Frame = 1

Query: 1   MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQFDRRQRHHSSACKLAGG 60
           MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQFDRRQRHHSSACKLAGG
Sbjct: 1   MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQFDRRQRHHSSACKLAGG 60

Query: 61  GAASIWHAIMPSGAGSSSNLCRPAIHCHERKGEGSWNVAWDARPARWLHRPDSAWLLFGV 120
           GAASIWHAIMPSGAGSSSNLCRPAIHCHERKGEGSWNVAWDARPARWLHRPDSAWLLFGV
Sbjct: 61  GAASIWHAIMPSGAGSSSNLCRPAIHCHERKGEGSWNVAWDARPARWLHRPDSAWLLFGV 120

Query: 121 CACIAPLDWVDASHEAVSLDQKKEVCESSGPEFNQNDESSADYRVTGVLADGRCLFRAIA 180
           CACIAPLDWVDASHEAVSLDQKKEVCESSGPEFNQNDESSADYRVTGVLADGRCLFRAIA
Sbjct: 121 CACIAPLDWVDASHEAVSLDQKKEVCESSGPEFNQNDESSADYRVTGVLADGRCLFRAIA 180

Query: 181 HGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETEWYIEGDFDAYVKRIQQPFVW 240
           HGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETEWYIEGDFDAYVKRIQQPFVW
Sbjct: 181 HGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETEWYIEGDFDAYVKRIQQPFVW 240

Query: 241 GGEPELLMASHVLKTPISVFMRERSSDGLINIAKYGQEYQKGEESPINVLFHGYGHYDIL 300
           GGEPELLMASHVLKTPISVFMRERSSDGLINIAKYGQEYQKGEESPINVLFHGYGHYDIL
Sbjct: 241 GGEPELLMASHVLKTPISVFMRERSSDGLINIAKYGQEYQKGEESPINVLFHGYGHYDIL 300

Query: 301 ETSSDKVSLKLSM 314
           ETSSDKVSLKLSM
Sbjct: 301 ETSSDKVSLKLSM 313

BLAST of Cucsa.308940 vs. TrEMBL
Match: D7UBV6_VITVI (Putative uncharacterized protein OS=Vitis vinifera GN=VIT_13s0074g00470 PE=4 SV=1)

HSP 1 Score: 446.4 bits (1147), Expect = 2.8e-122
Identity = 225/318 (70.75%), Positives = 256/318 (80.50%), Query Frame = 1

Query: 1   MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQF-----DRRQRHHSSAC 60
           MLGVLCAR KPWIL +LS F+HGSA +HH H     L+ +PIQF     D R+RHHS AC
Sbjct: 1   MLGVLCARHKPWILATLS-FVHGSATHHHLHLNHHHLLGTPIQFNGGGDDHRRRHHSRAC 60

Query: 61  KL--AGGGAASIWHAIMPSGAGSSSNLCRPAIHCHERKGEGSWNVAWDARPARWLHRPDS 120
           +   +GGGAASIWHAI+PSG    S+L RPA+  H++KGEGSWNVAWDARPARWLHRPDS
Sbjct: 61  RQGSSGGGAASIWHAILPSGGDRRSSL-RPAL-LHDQKGEGSWNVAWDARPARWLHRPDS 120

Query: 121 AWLLFGVCACIAPLDWVDASHEAVSLDQKKEVCESSGPEFNQNDESSADYRVTGVLADGR 180
           AWLLFGVCAC+APLD  D  +E V++D K E C       ++N+ SSADYRVTGV ADGR
Sbjct: 121 AWLLFGVCACLAPLDSFDVDNEVVAVDDKIEGCNQVNEISDENNNSSADYRVTGVPADGR 180

Query: 181 CLFRAIAHGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETEWYIEGDFDAYVKR 240
           CLFRAIAH ACLRSGEEAPD++RQ ELAD+LRA+VVDELLKRR+ETEW+IEG+FDAYVKR
Sbjct: 181 CLFRAIAHSACLRSGEEAPDENRQTELADDLRAQVVDELLKRREETEWFIEGNFDAYVKR 240

Query: 241 IQQPFVWGGEPELLMASHVLKTPISVFMRERSSDGLINIAKYGQEYQKGEESPINVLFHG 300
           IQQP+VWGGEPEL+MASHVLK PISVFM  RSS  L NIA YG+EY+   ESPINVLFHG
Sbjct: 241 IQQPYVWGGEPELIMASHVLKMPISVFMIGRSSGDLKNIANYGKEYRIDNESPINVLFHG 300

Query: 301 YGHYDILETSSDKVSLKL 312
           YGHYDILET SD    KL
Sbjct: 301 YGHYDILETFSDHSYQKL 315

BLAST of Cucsa.308940 vs. TrEMBL
Match: M5X6I5_PRUPE (Uncharacterized protein OS=Prunus persica GN=PRUPE_ppa008484mg PE=4 SV=1)

HSP 1 Score: 422.9 bits (1086), Expect = 3.3e-115
Identity = 224/339 (66.08%), Positives = 257/339 (75.81%), Query Frame = 1

Query: 1   MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQ------------FDRRQ 60
           MLG LCAR K WI+ SLS+F HGSA  H    QSRLL    +             F+ R+
Sbjct: 1   MLGFLCARRKTWIVSSLSSFAHGSAAAH----QSRLLQAHTLPLIHQQIASFSCGFETRR 60

Query: 61  RHHSSACKLA---GGGAASIWHAIMPSGAGSSS-NLCRPAIHCHERKGEGSWNVAWDARP 120
            HHSSAC+L    G GAASIWHA++PS     S +L RPAIH +E KGEGSWN AWDARP
Sbjct: 61  HHHSSACQLGSACGTGAASIWHALLPSSCNRRSRDLRRPAIH-YELKGEGSWNAAWDARP 120

Query: 121 ARWLHRPDSAWLLFGVCACIAPLDWVDAS----------HEAVSLDQKKEVCESSGPEFN 180
           ARWLHRPDSAWLLFGVC C+AP+DW D S            A S D K   C S+ P+ N
Sbjct: 121 ARWLHRPDSAWLLFGVCNCLAPIDWADDSTPDGNDGVSNENAESFDSK---C-SAAPDQN 180

Query: 181 QNDESSADYRVTGVLADGRCLFRAIAHGACLRSGEEAPDDDRQRELADELRAKVVDELLK 240
            N +SSADYRVTGV ADGRCLFRAIAH ACLR+GEEAPD++RQR+LADELRA+VVDELLK
Sbjct: 181 -NIDSSADYRVTGVPADGRCLFRAIAHVACLRNGEEAPDENRQRDLADELRAQVVDELLK 240

Query: 241 RRKETEWYIEGDFDAYVKRIQQPFVWGGEPELLMASHVLKTPISVFMRERSSDGLINIAK 300
           RR+ETEW+IEGDFDAYVKR+QQP+VWGGEPELLMASHVLKTPISVFM +RSS GL+NIA 
Sbjct: 241 RREETEWFIEGDFDAYVKRLQQPYVWGGEPELLMASHVLKTPISVFMIDRSSAGLVNIAN 300

Query: 301 YGQEYQKGEESPINVLFHGYGHYDILETSSDKVSLKLSM 314
           YG+EY+K EE PINVLFHGYGHYDIL++ S++   KL+M
Sbjct: 301 YGEEYRKEEEKPINVLFHGYGHYDILDSFSEQSLKKLNM 329

BLAST of Cucsa.308940 vs. TrEMBL
Match: A0A061FW28_THECC (Cysteine proteinases superfamily protein isoform 1 OS=Theobroma cacao GN=TCM_043657 PE=4 SV=1)

HSP 1 Score: 404.4 bits (1038), Expect = 1.2e-109
Identity = 211/319 (66.14%), Positives = 241/319 (75.55%), Query Frame = 1

Query: 1   MLGVLCARP-KPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQF------DRRQRHHSS 60
           MLGVLCARP KPWIL SLS   HG    HHH  +   LV+ P  F      DRR RHHS+
Sbjct: 1   MLGVLCARPPKPWILNSLSLIAHGGLAAHHHDSR---LVEWPTHFADLSADDRRCRHHST 60

Query: 61  ACKLAG--GGAASIWHAIMPSGAGSSSNLCRPAIHCHERKGEGSWNVAWDARPARWLHRP 120
           AC+L G  GGAASIWHAI+P G G             ERKGEGSWNVAWDARPARWLHRP
Sbjct: 61  ACRLGGSDGGAASIWHAILPCGGGGGGRRRGEVWKNVERKGEGSWNVAWDARPARWLHRP 120

Query: 121 DSAWLLFGVCACIAPL-DWVDASHEAVSLDQKKE---VCESSGPEFNQNDESSA----DY 180
           DSAWLLFGVCAC+AP+ ++VD + +A    +  E   V   S  E + +  SS     + 
Sbjct: 121 DSAWLLFGVCACLAPMIEFVDVNPDADDKIEGAELNLVSRLSADEKSSSSSSSVAAADNC 180

Query: 181 RVTGVLADGRCLFRAIAHGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETEWYI 240
           +VTGVLADGRCLFRAIAHGACLRSGE+APD++ QRELADELRA+VV+ELLKRR+ETEW+I
Sbjct: 181 KVTGVLADGRCLFRAIAHGACLRSGEDAPDENHQRELADELRAQVVNELLKRREETEWFI 240

Query: 241 EGDFDAYVKRIQQPFVWGGEPELLMASHVLKTPISVFMRERSSDGLINIAKYGQEYQKGE 300
           EGDFDAYVK IQQP+VWGGEPE+LMASHVLKTPISV+M  RSS  L  IAKYG+EYQK +
Sbjct: 241 EGDFDAYVKEIQQPYVWGGEPEILMASHVLKTPISVYMIPRSSSNLTKIAKYGEEYQKDK 300

Query: 301 ESPINVLFHGYGHYDILET 303
           E+PINVLFHGYGHYDILE+
Sbjct: 301 ENPINVLFHGYGHYDILES 316

BLAST of Cucsa.308940 vs. TrEMBL
Match: A0A061FNX4_THECC (Cysteine proteinases superfamily protein isoform 2 OS=Theobroma cacao GN=TCM_043657 PE=4 SV=1)

HSP 1 Score: 399.1 bits (1024), Expect = 5.1e-108
Identity = 211/322 (65.53%), Positives = 241/322 (74.84%), Query Frame = 1

Query: 1   MLGVLCARP-KPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQF------DRRQRHHSS 60
           MLGVLCARP KPWIL SLS   HG    HHH  +   LV+ P  F      DRR RHHS+
Sbjct: 1   MLGVLCARPPKPWILNSLSLIAHGGLAAHHHDSR---LVEWPTHFADLSADDRRCRHHST 60

Query: 61  ACKLAG--GGAASIWHAIMPSGAGSSSNLCRPAIHCHERKGEGSWNVAWDARPARWLHRP 120
           AC+L G  GGAASIWHAI+P G G             ERKGEGSWNVAWDARPARWLHRP
Sbjct: 61  ACRLGGSDGGAASIWHAILPCGGGGGGRRRGEVWKNVERKGEGSWNVAWDARPARWLHRP 120

Query: 121 DSAWLLFGVCACIAPL-DWVDASHEAVSLDQKKE---VCESSGPEFNQNDESSA----DY 180
           DSAWLLFGVCAC+AP+ ++VD + +A    +  E   V   S  E + +  SS     + 
Sbjct: 121 DSAWLLFGVCACLAPMIEFVDVNPDADDKIEGAELNLVSRLSADEKSSSSSSSVAAADNC 180

Query: 181 RVTGVLADGRCLFRAIAHGACLRSGEEAPDDDRQRELADELRAK---VVDELLKRRKETE 240
           +VTGVLADGRCLFRAIAHGACLRSGE+APD++ QRELADELRA+   VV+ELLKRR+ETE
Sbjct: 181 KVTGVLADGRCLFRAIAHGACLRSGEDAPDENHQRELADELRAQVSLVVNELLKRREETE 240

Query: 241 WYIEGDFDAYVKRIQQPFVWGGEPELLMASHVLKTPISVFMRERSSDGLINIAKYGQEYQ 300
           W+IEGDFDAYVK IQQP+VWGGEPE+LMASHVLKTPISV+M  RSS  L  IAKYG+EYQ
Sbjct: 241 WFIEGDFDAYVKEIQQPYVWGGEPEILMASHVLKTPISVYMIPRSSSNLTKIAKYGEEYQ 300

Query: 301 KGEESPINVLFHGYGHYDILET 303
           K +E+PINVLFHGYGHYDILE+
Sbjct: 301 KDKENPINVLFHGYGHYDILES 319

BLAST of Cucsa.308940 vs. TAIR10
Match: AT3G57810.2 (AT3G57810.2 Cysteine proteinases superfamily protein)

HSP 1 Score: 202.6 bits (514), Expect = 3.6e-52
Identity = 93/165 (56.36%), Positives = 123/165 (74.55%), Query Frame = 1

Query: 148 SSGPEFNQNDESSADYRVTGVLADGRCLFRAIAHGACLRSGEEAPDDDRQRELADELRAK 207
           SS  +F+       DY + G+  DGRCLFR++AHG CLRSG+ AP +  QRELADELR +
Sbjct: 153 SSDGKFHNGKRVYTDYSIIGIPGDGRCLFRSVAHGFCLRSGKLAPGEKMQRELADELRTR 212

Query: 208 VVDELLKRRKETEWYIEGDFDAYVKRIQQPFVWGGEPELLMASHVLKTPISVFMRERSSD 267
           V DE ++RR+ETEW++EGDFD YV++I+ P VWGGEPEL MASHVL+ PI+V+M++  + 
Sbjct: 213 VADEFIQRRQETEWFVEGDFDTYVRQIRDPHVWGGEPELFMASHVLQMPITVYMKDDKAG 272

Query: 268 GLINIAKYGQEYQKGEESPINVLFHGYGHYDILETSSDKVSLKLS 313
           GLI+IA+YGQEY  G++ PI VL+HG+GHYD L     K S+  S
Sbjct: 273 GLISIAEYGQEY--GKDDPIRVLYHGFGHYDALLLHESKASIPKS 315

BLAST of Cucsa.308940 vs. TAIR10
Match: AT2G38025.1 (AT2G38025.1 Cysteine proteinases superfamily protein)

HSP 1 Score: 79.7 bits (195), Expect = 3.5e-15
Identity = 53/159 (33.33%), Positives = 77/159 (48.43%), Query Frame = 1

Query: 163 YRVTGVLADGRCLFRAIAHGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETEWY 222
           Y V  V  DGRCLFRA+  G     G    +  R+R+ ADELR  V + +    KE E Y
Sbjct: 76  YAVDRVKGDGRCLFRALVKGMAFNKGITL-NPQRERDDADELRMAVKEVICNDPKEREKY 135

Query: 223 --------IEGDFDAYVKRIQQPFVWGGEPELLMASHVLKTPISVFMRERS-------SD 282
                   ++     + +RI +   WGGE ELL+ S + K PI V++ E           
Sbjct: 136 KEALVAITVDESLKRFCQRIGRHDFWGGESELLVLSKLCKQPIIVYIPEHEHGRGGGYGP 195

Query: 283 GLINIAKYGQEYQKG------EESPINVLFHGYGHYDIL 301
           G I I +YG E++ G       ++ + +L+ G  HYD+L
Sbjct: 196 GFIPIQEYGSEFRGGWGKGKTNKNVVRLLYSGRNHYDLL 233

BLAST of Cucsa.308940 vs. NCBI nr
Match: gi|449449405|ref|XP_004142455.1| (PREDICTED: OTU domain-containing protein At3g57810-like [Cucumis sativus])

HSP 1 Score: 654.8 bits (1688), Expect = 7.5e-185
Identity = 313/313 (100.00%), Positives = 313/313 (100.00%), Query Frame = 1

Query: 1   MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQFDRRQRHHSSACKLAGG 60
           MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQFDRRQRHHSSACKLAGG
Sbjct: 1   MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQFDRRQRHHSSACKLAGG 60

Query: 61  GAASIWHAIMPSGAGSSSNLCRPAIHCHERKGEGSWNVAWDARPARWLHRPDSAWLLFGV 120
           GAASIWHAIMPSGAGSSSNLCRPAIHCHERKGEGSWNVAWDARPARWLHRPDSAWLLFGV
Sbjct: 61  GAASIWHAIMPSGAGSSSNLCRPAIHCHERKGEGSWNVAWDARPARWLHRPDSAWLLFGV 120

Query: 121 CACIAPLDWVDASHEAVSLDQKKEVCESSGPEFNQNDESSADYRVTGVLADGRCLFRAIA 180
           CACIAPLDWVDASHEAVSLDQKKEVCESSGPEFNQNDESSADYRVTGVLADGRCLFRAIA
Sbjct: 121 CACIAPLDWVDASHEAVSLDQKKEVCESSGPEFNQNDESSADYRVTGVLADGRCLFRAIA 180

Query: 181 HGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETEWYIEGDFDAYVKRIQQPFVW 240
           HGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETEWYIEGDFDAYVKRIQQPFVW
Sbjct: 181 HGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETEWYIEGDFDAYVKRIQQPFVW 240

Query: 241 GGEPELLMASHVLKTPISVFMRERSSDGLINIAKYGQEYQKGEESPINVLFHGYGHYDIL 300
           GGEPELLMASHVLKTPISVFMRERSSDGLINIAKYGQEYQKGEESPINVLFHGYGHYDIL
Sbjct: 241 GGEPELLMASHVLKTPISVFMRERSSDGLINIAKYGQEYQKGEESPINVLFHGYGHYDIL 300

Query: 301 ETSSDKVSLKLSM 314
           ETSSDKVSLKLSM
Sbjct: 301 ETSSDKVSLKLSM 313

BLAST of Cucsa.308940 vs. NCBI nr
Match: gi|659091884|ref|XP_008446786.1| (PREDICTED: OTU domain-containing protein At3g57810-like [Cucumis melo])

HSP 1 Score: 651.4 bits (1679), Expect = 8.3e-184
Identity = 311/313 (99.36%), Positives = 312/313 (99.68%), Query Frame = 1

Query: 1   MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQFDRRQRHHSSACKLAGG 60
           MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQFDRRQRHHSSACKLAGG
Sbjct: 1   MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQFDRRQRHHSSACKLAGG 60

Query: 61  GAASIWHAIMPSGAGSSSNLCRPAIHCHERKGEGSWNVAWDARPARWLHRPDSAWLLFGV 120
           GAASIWHAI+PSGAGSSSNLCRPAIHCHERKGEGSWNVAWDARPARWLHRPDSAWLLFGV
Sbjct: 61  GAASIWHAILPSGAGSSSNLCRPAIHCHERKGEGSWNVAWDARPARWLHRPDSAWLLFGV 120

Query: 121 CACIAPLDWVDASHEAVSLDQKKEVCESSGPEFNQNDESSADYRVTGVLADGRCLFRAIA 180
           CACIAPLDWVDASHEAVSLDQKKEVCESSGPEFNQNDESSADYRVTGVLADGRCLFRAIA
Sbjct: 121 CACIAPLDWVDASHEAVSLDQKKEVCESSGPEFNQNDESSADYRVTGVLADGRCLFRAIA 180

Query: 181 HGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETEWYIEGDFDAYVKRIQQPFVW 240
           HGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETEWYIEGDFDAYVKRIQQPFVW
Sbjct: 181 HGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETEWYIEGDFDAYVKRIQQPFVW 240

Query: 241 GGEPELLMASHVLKTPISVFMRERSSDGLINIAKYGQEYQKGEESPINVLFHGYGHYDIL 300
           GGEPELLMASHVLKTPISVFMRERSSDGLINIAKYGQEYQ GEESPINVLFHGYGHYDIL
Sbjct: 241 GGEPELLMASHVLKTPISVFMRERSSDGLINIAKYGQEYQMGEESPINVLFHGYGHYDIL 300

Query: 301 ETSSDKVSLKLSM 314
           ETSSDKVSLKLSM
Sbjct: 301 ETSSDKVSLKLSM 313

BLAST of Cucsa.308940 vs. NCBI nr
Match: gi|731413376|ref|XP_010658710.1| (PREDICTED: uncharacterized protein LOC100245448 [Vitis vinifera])

HSP 1 Score: 446.4 bits (1147), Expect = 4.0e-122
Identity = 225/318 (70.75%), Positives = 256/318 (80.50%), Query Frame = 1

Query: 1   MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQF-----DRRQRHHSSAC 60
           MLGVLCAR KPWIL +LS F+HGSA +HH H     L+ +PIQF     D R+RHHS AC
Sbjct: 1   MLGVLCARHKPWILATLS-FVHGSATHHHLHLNHHHLLGTPIQFNGGGDDHRRRHHSRAC 60

Query: 61  KL--AGGGAASIWHAIMPSGAGSSSNLCRPAIHCHERKGEGSWNVAWDARPARWLHRPDS 120
           +   +GGGAASIWHAI+PSG    S+L RPA+  H++KGEGSWNVAWDARPARWLHRPDS
Sbjct: 61  RQGSSGGGAASIWHAILPSGGDRRSSL-RPAL-LHDQKGEGSWNVAWDARPARWLHRPDS 120

Query: 121 AWLLFGVCACIAPLDWVDASHEAVSLDQKKEVCESSGPEFNQNDESSADYRVTGVLADGR 180
           AWLLFGVCAC+APLD  D  +E V++D K E C       ++N+ SSADYRVTGV ADGR
Sbjct: 121 AWLLFGVCACLAPLDSFDVDNEVVAVDDKIEGCNQVNEISDENNNSSADYRVTGVPADGR 180

Query: 181 CLFRAIAHGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETEWYIEGDFDAYVKR 240
           CLFRAIAH ACLRSGEEAPD++RQ ELAD+LRA+VVDELLKRR+ETEW+IEG+FDAYVKR
Sbjct: 181 CLFRAIAHSACLRSGEEAPDENRQTELADDLRAQVVDELLKRREETEWFIEGNFDAYVKR 240

Query: 241 IQQPFVWGGEPELLMASHVLKTPISVFMRERSSDGLINIAKYGQEYQKGEESPINVLFHG 300
           IQQP+VWGGEPEL+MASHVLK PISVFM  RSS  L NIA YG+EY+   ESPINVLFHG
Sbjct: 241 IQQPYVWGGEPELIMASHVLKMPISVFMIGRSSGDLKNIANYGKEYRIDNESPINVLFHG 300

Query: 301 YGHYDILETSSDKVSLKL 312
           YGHYDILET SD    KL
Sbjct: 301 YGHYDILETFSDHSYQKL 315

BLAST of Cucsa.308940 vs. NCBI nr
Match: gi|1009119780|ref|XP_015876567.1| (PREDICTED: uncharacterized protein LOC107413195 [Ziziphus jujuba])

HSP 1 Score: 424.9 bits (1091), Expect = 1.3e-115
Identity = 219/332 (65.96%), Positives = 258/332 (77.71%), Query Frame = 1

Query: 1   MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQ-----SRLLVQSPIQFD---------R 60
           MLGVLCARPKPWIL SLS+F+HGSA +HHHHH      S  L+ +PIQ            
Sbjct: 1   MLGVLCARPKPWILTSLSSFVHGSAGHHHHHHHHHHSSSSRLLHAPIQSADFAGGYARLT 60

Query: 61  RQRHHSSACKLAGGG-----AASIWHAIMPSGAGSSSNLCRPAIHCHERKGEGSWNVAWD 120
           R RHHSSAC+L G G     AASIWHAI+P+ AG   ++ RP +   E KGEGSWN AWD
Sbjct: 61  RPRHHSSACQLGGAGGAAATAASIWHAILPA-AGRRCDVRRPGVQ-FELKGEGSWNAAWD 120

Query: 121 ARPARWLHRPDSAWLLFGVCACIAPL-DWVDASHEAVSLDQKKEVCESSGPEFNQNDESS 180
           ARPARWLHRPDSAWLLFGVCAC+AP+ D +DA+ E V  D K  V        ++  ESS
Sbjct: 121 ARPARWLHRPDSAWLLFGVCACLAPVVDLLDANSEPVVSDDKANVVVP-----DEVVESS 180

Query: 181 ADYRVTGVLADGRCLFRAIAHGACLRSGEEAPDDDRQRELADELRAKVVDELLKRRKETE 240
           + YRVTGV ADGRCLFRAIAH ACLR+G+EAPD++RQR+LADELRA+VVDELLKRR+E+E
Sbjct: 181 SGYRVTGVAADGRCLFRAIAHVACLRNGKEAPDENRQRQLADELRAQVVDELLKRREESE 240

Query: 241 WYIEGDFDAYVKRIQQPFVWGGEPELLMASHVLKTPISVFMRERSSDGLINIAKYGQEYQ 300
           WYIEGDFDAY+KRIQ+P+VWGGEPEL+MASHVLKTPISVFMR+R+  GL+NIAKYG+EY 
Sbjct: 241 WYIEGDFDAYIKRIQEPYVWGGEPELIMASHVLKTPISVFMRDRTG-GLVNIAKYGEEYG 300

Query: 301 KGEESPINVLFHGYGHYDILETSSDKVSLKLS 313
           K EE PIN+LF+GYGHYDILET  ++   KL+
Sbjct: 301 KVEEDPINLLFYGYGHYDILETFPNQSCQKLN 324

BLAST of Cucsa.308940 vs. NCBI nr
Match: gi|596049241|ref|XP_007220473.1| (hypothetical protein PRUPE_ppa008484mg [Prunus persica])

HSP 1 Score: 422.9 bits (1086), Expect = 4.8e-115
Identity = 224/339 (66.08%), Positives = 257/339 (75.81%), Query Frame = 1

Query: 1   MLGVLCARPKPWILVSLSNFIHGSAVYHHHHHQSRLLVQSPIQ------------FDRRQ 60
           MLG LCAR K WI+ SLS+F HGSA  H    QSRLL    +             F+ R+
Sbjct: 1   MLGFLCARRKTWIVSSLSSFAHGSAAAH----QSRLLQAHTLPLIHQQIASFSCGFETRR 60

Query: 61  RHHSSACKLA---GGGAASIWHAIMPSGAGSSS-NLCRPAIHCHERKGEGSWNVAWDARP 120
            HHSSAC+L    G GAASIWHA++PS     S +L RPAIH +E KGEGSWN AWDARP
Sbjct: 61  HHHSSACQLGSACGTGAASIWHALLPSSCNRRSRDLRRPAIH-YELKGEGSWNAAWDARP 120

Query: 121 ARWLHRPDSAWLLFGVCACIAPLDWVDAS----------HEAVSLDQKKEVCESSGPEFN 180
           ARWLHRPDSAWLLFGVC C+AP+DW D S            A S D K   C S+ P+ N
Sbjct: 121 ARWLHRPDSAWLLFGVCNCLAPIDWADDSTPDGNDGVSNENAESFDSK---C-SAAPDQN 180

Query: 181 QNDESSADYRVTGVLADGRCLFRAIAHGACLRSGEEAPDDDRQRELADELRAKVVDELLK 240
            N +SSADYRVTGV ADGRCLFRAIAH ACLR+GEEAPD++RQR+LADELRA+VVDELLK
Sbjct: 181 -NIDSSADYRVTGVPADGRCLFRAIAHVACLRNGEEAPDENRQRDLADELRAQVVDELLK 240

Query: 241 RRKETEWYIEGDFDAYVKRIQQPFVWGGEPELLMASHVLKTPISVFMRERSSDGLINIAK 300
           RR+ETEW+IEGDFDAYVKR+QQP+VWGGEPELLMASHVLKTPISVFM +RSS GL+NIA 
Sbjct: 241 RREETEWFIEGDFDAYVKRLQQPYVWGGEPELLMASHVLKTPISVFMIDRSSAGLVNIAN 300

Query: 301 YGQEYQKGEESPINVLFHGYGHYDILETSSDKVSLKLSM 314
           YG+EY+K EE PINVLFHGYGHYDIL++ S++   KL+M
Sbjct: 301 YGEEYRKEEEKPINVLFHGYGHYDILDSFSEQSLKKLNM 329

The following BLAST results are available for this feature:

Swiss-Prot
Match Name                         E-value   Identity  Description
OTU_ARATH                          6.3e-51   56.36     OTU domain-containing protein At3g57810 OS=Arabidopsis thaliana GN=At3g57810 PE=...

TrEMBL
Match Name                         E-value   Identity  Description
A0A0A0KRE9_CUCSA                   5.2e-185  100.00    Uncharacterized protein OS=Cucumis sativus GN=Csa_5G615810 PE=4 SV=1
D7UBV6_VITVI                       2.8e-122  70.75     Putative uncharacterized protein OS=Vitis vinifera GN=VIT_13s0074g00470 PE=4 SV=...
M5X6I5_PRUPE                       3.3e-115  66.08     Uncharacterized protein OS=Prunus persica GN=PRUPE_ppa008484mg PE=4 SV=1
A0A061FW28_THECC                   1.2e-109  66.14     Cysteine proteinases superfamily protein isoform 1 OS=Theobroma cacao GN=TCM_043...
A0A061FNX4_THECC                   5.1e-108  65.53     Cysteine proteinases superfamily protein isoform 2 OS=Theobroma cacao GN=TCM_043...

TAIR10
Match Name                         E-value   Identity  Description
AT3G57810.2                        3.6e-52   56.36     Cysteine proteinases superfamily protein
AT2G38025.1                        3.5e-15   33.33     Cysteine proteinases superfamily protein

NCBI nr
Match Name                         E-value   Identity  Description
gi|449449405|ref|XP_004142455.1|   7.5e-185  100.00    PREDICTED: OTU domain-containing protein At3g57810-like [Cucumis sativus]
gi|659091884|ref|XP_008446786.1|   8.3e-184  99.36     PREDICTED: OTU domain-containing protein At3g57810-like [Cucumis melo]
gi|731413376|ref|XP_010658710.1|   4.0e-122  70.75     PREDICTED: uncharacterized protein LOC100245448 [Vitis vinifera]
gi|1009119780|ref|XP_015876567.1|  1.3e-115  65.96     PREDICTED: uncharacterized protein LOC107413195 [Ziziphus jujuba]
gi|596049241|ref|XP_007220473.1|   4.8e-115  66.08     hypothetical protein PRUPE_ppa008484mg [Prunus persica]

The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term       Definition
IPR003323  OTU_dom
GO Assignments
This gene is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0008150 biological_process
cellular_component GO:0005575 cellular_component
molecular_function GO:0003674 molecular_function

The following mRNA feature(s) are a part of this gene:

Feature Name    Unique Name     Type
Cucsa.308940.2  Cucsa.308940.2  mRNA
Cucsa.308940.1  Cucsa.308940.1  mRNA


Analysis Name: InterPro Annotations of cucumber (Gy14)
Date Performed: 2017-01-17
IPR Term   IPR Description   Source   Source Term     Source Description             Alignment
IPR003323  OTU domain        PROFILE  PS50802         OTU                            coord: 163..221; score: (none listed)
None       No IPR available  PANTHER  PTHR12419       OTU DOMAIN CONTAINING PROTEIN  coord: 152..221; score: 6.2
None       No IPR available  PANTHER  PTHR12419:SF14  SUBFAMILY NOT NAMED            coord: 152..221; score: 6.2

The following gene(s) are paralogous to this gene:

None