Cp4.1LG10g12010 (gene) Cucurbita pepo (Zucchini)

Name: Cp4.1LG10g12010
Type: gene
Organism: Cucurbita pepo (Zucchini)
Description: Nepenthesin II
Location: Cp4.1LG10 : 8313299 .. 8315865 (+)
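
As a small illustration of how the location value can be read, the sketch below parses a string of the form `scaffold : start .. end (strand)` and derives the span length. It assumes the coordinates are 1-based and inclusive, which is a common genome-browser convention but is not stated on this page; the function name `parse_location` is hypothetical.

```python
import re

def parse_location(text):
    """Split a 'seqid : start .. end (strand)' string into its parts.

    Assumes 1-based, inclusive coordinates (an assumption, not confirmed
    by the record itself).
    """
    pattern = r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*"
    m = re.fullmatch(pattern, text)
    if m is None:
        raise ValueError(f"unrecognised location: {text!r}")
    start, end = int(m.group(2)), int(m.group(3))
    return {
        "seqid": m.group(1),
        "start": start,
        "end": end,
        "strand": m.group(4),
        "length": end - start + 1,  # inclusive span
    }

loc = parse_location("Cp4.1LG10 : 8313299 .. 8315865 (+)")
print(loc["seqid"], loc["strand"], loc["length"])  # Cp4.1LG10 + 2567
```

Under that convention the gene spans 2567 bases on the forward strand of Cp4.1LG10.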
The following sequences are available for this feature:

Gene sequence (with intron)

TCCCAACTTAGCTCCCATCTTACTCTAAATTCCTTCTTTTTTCTATTTCATCTCCCTAATCTCTTCCCAATTCTCTTCATTGCTCAAGGATTCAAGAACTCATGGCGGACTCGTTACGATATCTGATCGTCTCGGTTGTTCTATCGATTACCATGTTATTCATTCATACCTCGGCCTCAAGTTCGTCTCACTCAAGGCGAGCTCTACGGCAACCTAAGTTGCCTAGTGATGGTTTCCGAGTGAGTCTTAACCATGTGGATCATGTCAAGAATTTGACGAGATTCGAGCAGTTGCAACGAGGAGTGGCACGTGGGAAGACTAGATTGCATAGACTAAACGCCATGATGTTGGCTGCCAACGTCGGTGTTGGTGGTCGAGTGCAGGCGCCGGTGGTGGCGGGTAATGGTGAGTTTCTTATGAAGTTGGCTATCGGATCTCCGCCGAGAAGCTTCTCGGCGATCATGGATACGGGGAGTGACCTGATTTGGACACAGTGTAAGCCTTGTCAACAATGTTTTGATCAAGCTACGCCTATTTTTGATCCGAAAGAATCTTCTTCTTTCTCTAAGATTTCTTGCTCGAGCGAGCTCTGTGATGCTCTCCCGACATCGACATGTAGTAGCGATGAGTGCGAGTATTTCTACACGTATGGTGATTATTCCTCAACCCATGGCGTTTTGGCTGCTGAGACCTTCACTTTTGGAGATTCAAGCCAAGACCAGGTTATTAACCACCATCAAACCAAAAACATTATAAAGTTTGATAAGTTATGGTTTACTTAACATAGTTGTTTTTAATAAAAATTCCCTAAAATTTGAGGATTTTTCGTTCGAGTATAAACGATAGGGGTTATAAGGATCGAGCTGGGTCAGGTTGATAAATTTTTTTGGATCAACCAAAATTTTGGGTTGGTTAGGTTGGTAACCTATATAACTCGAAATTGTTTTACATCCTAATTGAACCCTATATTTATAGGTAAGGTCAGGTTGGGTTGTCCAGTAGTTGTTGAGGGCGATGGTGAGGCTAAGGAGCTGCATCTGTGAGAGGTGTCTTGTTTACATTTGACAATGATGGTGAGGATAAGGAGTGTGGCCAACGACGAGGAGCTGTGTCTGAGACCTAACTTTTGTCTAATTTTTACGATATATATACTTATTTATTATTATTATTATTTTAATTTGGACTGATTTGGTTTGGATGGATAGAAACTTTAGCCAATAAACCCGAAATTTGATCCAAACCAGCTCGGTTCAAGAAAATTAACCTAACCCAACCCTTATGATTTAGGTTGGGTTGGTTGAGTTTTTTGGGTTTGATTTACACTGCTAATATCTCCCCTTTGTTTCAATAGCATGTTAAATCACCGCTAAATTAAAAAAGTTTAAAATGGTATCTAAACTCTATTCGATCTTATATACATGTTCTTAATTCAGGTATCGATTCCTGGACTTGGATTTGGATGTGGAGACGACAACGAAGGGGACGGGTTCAGCCAAGGCGAGGGGCTAGTGGGTCTCGGCCGAGGACCCTTATCGCTAGTTTCTCAACTAAAAGAACAGAAGTTTTCGTATTGTTTAACCGCCATTGATGACACGAAACCAAGCTCACTTTTGTTGGGATCTCTAGCAAATGTGAAACCTAAAGCATCCGAAGGTGAAATCAAAACCACCCCATTGATAAGAAACCCATCTCAGCCATCTTTTTACTATCTTTCTCTACAAGGAATCTCGGTTGGTGGCACTCAATTACCAATACCAAAGGCCACTTTTGAGCTCCATGATGATGGGAGTGGTGGCGTTATCATAGATTCAGGCACAACAATCACATACATTGAGAAAAATGCTTTCACTTTACTCAAAAAAGAGTTCGTTTCTCAAATGAAACTTCCCGTTGACGACTCGGGTACCAGTGGTCTCGACCTTTGCTTTAACTTGCCTCCCGAGACAACTCAGGTACGTTTACCTAACAGTTTAAGTCGATAAGAAGGAGTCCCACGTTGA
CTAATTTAGAAATGATCATGAGTTTATAAGTAAGGAATACATCTCCATTGGTACGAGTCGTGATTCCTAACATAGAATAATGTATTGATGTTTTTGTTTATGTTGTTAACAAAGGTGGAGGTTCCGAAGTTGACGTTCCATTTCAAAGGTGCCGATTTGGAGCTTCCGGGGGAGAATTACATGATCGGTGACTCGAGGGCAGAGTTGATATGTTTGGCCATTGGGAGTTCGAGCGGAATGTCCATCTTTGGGAATCTTCAACAACAAAACATCATGGTTGTTCATGATCTTCAAGAAGAAACTGTGTCGTTTTTGCCTACTCAATGTAGTGACATATGAAAAGGTTGAAGGGAATTTGTTCAATCAAAATGGAGTGAAATGATAGATTTAAGATTGTTTATATTATTAATTCAAGCTATTCAATTTGTAAGTTTATTAAGGGATTTTAGAACTTGTATTGAACAAGTTATATGCATGTTAATGCTATTGAANATTTGGATTTATCCATTTATAAAATATCCCTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTATTTAGTTA

mRNA sequence

TCCCAACTTAGCTCCCATCTTACTCTAAATTCCTTCTTTTTTCTATTTCATCTCCCTAATCTCTTCCCAATTCTCTTCATTGCTCAAGGATTCAAGAACTCATGGCGGACTCGTTACGATATCTGATCGTCTCGGTTGTTCTATCGATTACCATGTTATTCATTCATACCTCGGCCTCAAGTTCGTCTCACTCAAGGCGAGCTCTACGGCAACCTAAGTTGCCTAGTGATGGTTTCCGAGTGAGTCTTAACCATGTGGATCATGTCAAGAATTTGACGAGATTCGAGCAGTTGCAACGAGGAGTGGCACGTGGGAAGACTAGATTGCATAGACTAAACGCCATGATGTTGGCTGCCAACGTCGGTGTTGGTGGTCGAGTGCAGGCGCCGGTGGTGGCGGGTAATGGTGAGTTTCTTATGAAGTTGGCTATCGGATCTCCGCCGAGAAGCTTCTCGGCGATCATGGATACGGGGAGTGACCTGATTTGGACACAGTGTAAGCCTTGTCAACAATGTTTTGATCAAGCTACGCCTATTTTTGATCCGAAAGAATCTTCTTCTTTCTCTAAGATTTCTTGCTCGAGCGAGCTCTGTGATGCTCTCCCGACATCGACATGTAGTAGCGATGAGTGCGAGTATTTCTACACGTATGGTGATTATTCCTCAACCCATGGCGTTTTGGCTGCTGAGACCTTCACTTTTGGAGATTCAAGCCAAGACCAGGTATCGATTCCTGGACTTGGATTTGGATGTGGAGACGACAACGAAGGGGACGGGTTCAGCCAAGGCGAGGGGCTAGTGGGTCTCGGCCGAGGACCCTTATCGCTAGTTTCTCAACTAAAAGAACAGAAGTTTTCGTATTGTTTAACCGCCATTGATGACACGAAACCAAGCTCACTTTTGTTGGGATCTCTAGCAAATGTGAAACCTAAAGCATCCGAAGGTGAAATCAAAACCACCCCATTGATAAGAAACCCATCTCAGCCATCTTTTTACTATCTTTCTCTACAAGGAATCTCGGTGGAGGTTCCGAAGTTGACGTTCCATTTCAAAGGTGCCGATTTGGAGCTTCCGGGGGAGAATTACATGATCGGTGACTCGAGGGCAGAGTTGATATGTTTGGCCATTGGGAGTTCGAGCGGAATGTCCATCTTTGGGAATCTTCAACAACAAAACATCATGGTTGTTCATGATCTTCAAGAAGAAACTGTGTCGTTTTTGCCTACTCAATGTAGTGACATATGAAAAGGTTGAAGGGAATTTGTTCAATCAAAATGGAGTGAAATGATAGATTTAAGATTGTTTATATTATTAATTCAAGCTATTCAATTTGTAAGTTTATTAAGGGATTTTAGAACTTGTATTGAACAAGTTATATGCATGTTAATGCTATTGAANATTTGGATTTATCCATTTATAAAATATCCCTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTATTTAGTTA

Coding sequence (CDS)

ATGGCGGACTCGTTACGATATCTGATCGTCTCGGTTGTTCTATCGATTACCATGTTATTCATTCATACCTCGGCCTCAAGTTCGTCTCACTCAAGGCGAGCTCTACGGCAACCTAAGTTGCCTAGTGATGGTTTCCGAGTGAGTCTTAACCATGTGGATCATGTCAAGAATTTGACGAGATTCGAGCAGTTGCAACGAGGAGTGGCACGTGGGAAGACTAGATTGCATAGACTAAACGCCATGATGTTGGCTGCCAACGTCGGTGTTGGTGGTCGAGTGCAGGCGCCGGTGGTGGCGGGTAATGGTGAGTTTCTTATGAAGTTGGCTATCGGATCTCCGCCGAGAAGCTTCTCGGCGATCATGGATACGGGGAGTGACCTGATTTGGACACAGTGTAAGCCTTGTCAACAATGTTTTGATCAAGCTACGCCTATTTTTGATCCGAAAGAATCTTCTTCTTTCTCTAAGATTTCTTGCTCGAGCGAGCTCTGTGATGCTCTCCCGACATCGACATGTAGTAGCGATGAGTGCGAGTATTTCTACACGTATGGTGATTATTCCTCAACCCATGGCGTTTTGGCTGCTGAGACCTTCACTTTTGGAGATTCAAGCCAAGACCAGGTATCGATTCCTGGACTTGGATTTGGATGTGGAGACGACAACGAAGGGGACGGGTTCAGCCAAGGCGAGGGGCTAGTGGGTCTCGGCCGAGGACCCTTATCGCTAGTTTCTCAACTAAAAGAACAGAAGTTTTCGTATTGTTTAACCGCCATTGATGACACGAAACCAAGCTCACTTTTGTTGGGATCTCTAGCAAATGTGAAACCTAAAGCATCCGAAGGTGAAATCAAAACCACCCCATTGATAAGAAACCCATCTCAGCCATCTTTTTACTATCTTTCTCTACAAGGAATCTCGGTGGAGGTTCCGAAGTTGACGTTCCATTTCAAAGGTGCCGATTTGGAGCTTCCGGGGGAGAATTACATGATCGGTGACTCGAGGGCAGAGTTGATATGTTTGGCCATTGGGAGTTCGAGCGGAATGTCCATCTTTGGGAATCTTCAACAACAAAACATCATGGTTGTTCATGATCTTCAAGAAGAAACTGTGTCGTTTTTGCCTACTCAATGTAGTGACATATGA

Protein sequence

MADSLRYLIVSVVLSITMLFIHTSASSSSHSRRALRQPKLPSDGFRVSLNHVDHVKNLTRFEQLQRGVARGKTRLHRLNAMMLAANVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDLIWTQCKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDECEYFYTYGDYSSTHGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLVSQLKEQKFSYCLTAIDDTKPSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQGISVEVPKLTFHFKGADLELPGENYMIGDSRAELICLAIGSSSGMSIFGNLQQQNIMVVHDLQEETVSFLPTQCSDI
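
The relationship between the coding sequence and the protein sequence above can be checked with a short translation routine. The sketch below builds the standard genetic code table and translates two excerpts copied from the CDS (its first eight codons and its last six); the helper name `translate` is hypothetical, and using the standard nuclear code for this gene is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate a DNA coding sequence in frame 1; unknown codons become 'X'."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3)
    )

first_codons = "ATGGCGGACTCGTTACGATATCTG"  # start of the CDS above
last_codons = "CAATGTAGTGACATATGA"         # end of the CDS above

print(translate(first_codons))  # MADSLRYL
print(translate(last_codons))   # QCSDI*
```

The two outputs agree with the start (`MADSLRYL...`) and end (`...QCSDI`) of the listed protein; the trailing `*` marks the TGA stop codon, which is not part of the protein sequence.
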
BLAST of Cp4.1LG10g12010 vs. Swiss-Prot
Match: NEP1_NEPGR (Aspartic proteinase nepenthesin-1 OS=Nepenthes gracilis GN=nep1 PE=1 SV=1)

HSP 1 Score: 281.6 bits (719), Expect = 1.3e-74
Identity = 159/312 (50.96%), Positives = 210/312 (67.31%), Query Frame = 1

Query: 1   MADSLRYLIVSVVLSITMLFIHTSASSSSHSRRALRQPKLPSDGFRVSLNHVDHVKNLTR 60
           MA SL   +++  LSI  +F+  + S+S  +     + K+   GF++ L HVD  KNLT+
Sbjct: 1   MASSLYSFLLA--LSIVYIFVAPTHSTSRTALNHRHEAKVT--GFQIMLEHVDSGKNLTK 60

Query: 61  FEQLQRGVARGKTRLHRLNAMMLAANVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAI 120
           F+ L+R + RG  RL RL AM+     G  G V+  V AG+GE+LM L+IG+P + FSAI
Sbjct: 61  FQLLERAIERGSRRLQRLEAMLN----GPSG-VETSVYAGDGEYLMNLSIGTPAQPFSAI 120

Query: 121 MDTGSDLIWTQCKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDECEYF 180
           MDTGSDLIWTQC+PC QCF+Q+TPIF+P+ SSSFS + CSS+LC AL + TCS++ C+Y 
Sbjct: 121 MDTGSDLIWTQCQPCTQCFNQSTPIFNPQGSSSFSTLPCSSQLCQALSSPTCSNNFCQYT 180

Query: 181 YTYGDYSSTHGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPL 240
           Y YGD S T G +  ET TFG      VSIP + FGCG++N+G G   G GLVG+GRGPL
Sbjct: 181 YGYGDGSETQGSMGTETLTFG-----SVSIPNITFGCGENNQGFGQGNGAGLVGMGRGPL 240

Query: 241 SLVSQLKEQKFSYCLTAIDDTKPSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFYYL 300
           SL SQL   KFSYC+T I  + PS+LLLGSLAN     + G   TT LI++   P+FYY+
Sbjct: 241 SLPSQLDVTKFSYCMTPIGSSTPSNLLLGSLAN---SVTAGSPNTT-LIQSSQIPTFYYI 294

Query: 301 SLQGISVEVPKL 313
           +L G+SV   +L
Sbjct: 301 TLNGLSVGSTRL 294
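
The percentages in an HSP summary line follow directly from the match counts. The sketch below recomputes them for the first Swiss-Prot hit; the function name `hsp_percentages` is hypothetical and the regular expression is only an assumption about the layout of these summary lines.

```python
import re

def hsp_percentages(line):
    """Recompute percentages from 'Label = matched/total' pairs in an HSP line."""
    stats = {}
    for label, matched, total in re.findall(r"(\w+) = (\d+)/(\d+)", line):
        stats[label] = round(100 * int(matched) / int(total), 2)
    return stats

line = "Identity = 159/312 (50.96%), Positives = 210/312 (67.31%), Query Frame = 1"
print(hsp_percentages(line))  # {'Identity': 50.96, 'Positives': 67.31}
```

Identity counts exact residue matches over the aligned columns, while positives also count substitutions that score above zero in the scoring matrix, which is why the second figure is always at least as large as the first.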

BLAST of Cp4.1LG10g12010 vs. Swiss-Prot
Match: NEP2_NEPGR (Aspartic proteinase nepenthesin-2 OS=Nepenthes gracilis GN=nep2 PE=1 SV=1)

HSP 1 Score: 280.4 bits (716), Expect = 2.9e-74
Identity = 153/333 (45.95%), Positives = 212/333 (63.66%), Query Frame = 1

Query: 9   IVSVVLSITMLFIHTSASSSSHSRRALRQ-PKLPSDGFRVSLNHVDHVKNLTRFEQLQRG 68
           + SVVL + ++    + +SS+     L    K P  G RV L  VD  KNLT++E ++R 
Sbjct: 5   LYSVVLGLAIVSAIVAPTSSTSRGTLLHHGQKRPQPGLRVDLEQVDSGKNLTKYELIKRA 64

Query: 69  VARGKTRLHRLNAMMLAANVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDL 128
           + RG+ R+  +NAM+ +++      ++ PV AG+GE+LM +AIG+P  SFSAIMDTGSDL
Sbjct: 65  IKRGERRMRSINAMLQSSS-----GIETPVYAGDGEYLMNVAIGTPDSSFSAIMDTGSDL 124

Query: 129 IWTQCKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDECEYFYTYGDYS 188
           IWTQC+PC QCF Q TPIF+P++SSSFS + C S+ C  LP+ TC+++EC+Y Y YGD S
Sbjct: 125 IWTQCEPCTQCFSQPTPIFNPQDSSSFSTLPCESQYCQDLPSETCNNNECQYTYGYGDGS 184

Query: 189 STHGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLVSQLK 248
           +T G +A ETFTF  S     S+P + FGCG+DN+G G   G GL+G+G GPLSL SQL 
Sbjct: 185 TTQGYMATETFTFETS-----SVPNIAFGCGEDNQGFGQGNGAGLIGMGWGPLSLPSQLG 244

Query: 249 EQKFSYCLTAIDDTKPSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQGISV 308
             +FSYC+T+   + PS+L LGS A+  P+ S     +T LI +   P++YY++LQGI+V
Sbjct: 245 VGQFSYCMTSYGSSSPSTLALGSAASGVPEGS----PSTTLIHSSLNPTYYYITLQGITV 304

Query: 309 EVPKLTFHFKGADLELPGENYMIGDSRAELICL 341
               L        L+  G   MI DS   L  L
Sbjct: 305 GGDNLGIPSSTFQLQDDGTGGMIIDSGTTLTYL 323
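
The bit score and the Expect value of an HSP are linked by the usual BLAST relation E ≈ N · 2^(−S), where S is the bit score and N is the effective search space (roughly query length times database length). The sketch below applies that relation to the figures reported for this hit. The search-space value it prints is only an estimate implied by the rounded numbers on this page, not a value given by the source, and the function names are hypothetical.

```python
def expect_from_bits(bit_score, search_space):
    """Expected number of chance HSPs with at least this bit score."""
    return search_space * 2.0 ** (-bit_score)

def implied_search_space(bit_score, e_value):
    """Invert the relation to estimate the effective search space."""
    return e_value * 2.0 ** bit_score

# Values reported above for the NEP2_NEPGR hit.
bits, e_value = 280.4, 2.9e-74

space = implied_search_space(bits, e_value)
print(f"implied search space ~ {space:.1e}")

# Round trip: feeding the estimate back in reproduces the reported E-value.
print(f"{expect_from_bits(bits, space):.1e}")
```

Because the E-value scales with the search space, the same alignment reported against a larger database (for example NCBI nr rather than Swiss-Prot) receives a larger E-value even when the bit score is unchanged.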

BLAST of Cp4.1LG10g12010 vs. Swiss-Prot
Match: AP37_ORYSJ (Aspartyl protease 37 OS=Oryza sativa subsp. japonica GN=AP37 PE=3 SV=2)

HSP 1 Score: 188.0 bits (476), Expect = 2.0e-46
Identity = 108/276 (39.13%), Positives = 155/276 (56.16%), Query Frame = 1

Query: 41  PSDGFRVSLNHVD----HVKNLTRFEQLQRGVARGKTRLHRLN-AMMLAANVGVGGRVQA 100
           P   FR+ L  VD       NLT  E L+R + R + RL  +  A   AA+       + 
Sbjct: 21  PPRSFRLELASVDASAADAANLTEHELLRRAIQRSRYRLAGIGMARGEAASARKAVVAET 80

Query: 101 PVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDLIWTQCKPCQQCFDQATPIFDPKESSSFS 160
           P++   GE+L+KL IG+PP  F+A +DT SDLIWTQC+PC  C+ Q  P+F+P+ SS+++
Sbjct: 81  PIMPAGGEYLVKLGIGTPPYKFTAAIDTASDLIWTQCQPCTGCYHQVDPMFNPRVSSTYA 140

Query: 161 KISCSSELCDALPTSTCSSDE---CEYFYTYGDYSSTHGVLAAETFTFGDSSQDQVSIPG 220
            + CSS+ CD L    C  D+   C+Y YTY   ++T G LA +    G+      +  G
Sbjct: 141 ALPCSSDTCDELDVHRCGHDDDESCQYTYTYSGNATTEGTLAVDKLVIGED-----AFRG 200

Query: 221 LGFGCGDDNEGDG-FSQGEGLVGLGRGPLSLVSQLKEQKFSYCLTAIDDTKPSSLLLGSL 280
           + FGC   + G     Q  G+VGLGRGPLSLVSQL  ++F+YCL       P  L+LG+ 
Sbjct: 201 VAFGCSTSSTGGAPPPQASGVVGLGRGPLSLVSQLSVRRFAYCLPPPASRIPGKLVLGAD 260

Query: 281 ANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQGISV 308
           A+    A+       P+ R+P  PS+YYL+L G+ +
Sbjct: 261 ADAARNATNR--IAVPMRRDPRYPSYYYLNLDGLLI 289

BLAST of Cp4.1LG10g12010 vs. Swiss-Prot
Match: CDR1_ARATH (Aspartic proteinase CDR1 OS=Arabidopsis thaliana GN=CDR1 PE=1 SV=1)

HSP 1 Score: 185.7 bits (470), Expect = 9.7e-46
Identity = 128/342 (37.43%), Positives = 184/342 (53.80%), Query Frame = 1

Query: 8   LIVSVVLSITMLFIHTSASSSSHSRRALRQPKLPSDGFRVSLNHVDHVKNLTRFEQLQRG 67
           L  SV+LS+ +L       SS     A  +PKL   GF   L H D  K+   +  ++  
Sbjct: 4   LFSSVLLSLCLL-------SSLFLSNANAKPKL---GFTADLIHRDSPKS-PFYNPMETS 63

Query: 68  VARGKTRLHR-LNAMMLAANVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDTGSD 127
             R +  +HR +N +          + Q  + + +GE+LM ++IG+PP    AI DTGSD
Sbjct: 64  SQRLRNAIHRSVNRVFHFTEKDNTPQPQIDLTSNSGEYLMNVSIGTPPFPIMAIADTGSD 123

Query: 128 LIWTQCKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPT-STCSSDE--CEYFYTY 187
           L+WTQC PC  C+ Q  P+FDPK SS++  +SCSS  C AL   ++CS+++  C Y  +Y
Sbjct: 124 LLWTQCAPCDDCYTQVDPLFDPKTSSTYKDVSCSSSQCTALENQASCSTNDNTCSYSLSY 183

Query: 188 GDYSSTHGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLV 247
           GD S T G +A +T T G S    + +  +  GCG +N G    +G G+VGLG GP+SL+
Sbjct: 184 GDNSYTKGNIAVDTLTLGSSDTRPMQLKNIIIGCGHNNAGTFNKKGSGIVGLGGGPVSLI 243

Query: 248 SQLKEQ---KFSYCLTAIDDTK--PSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFY 307
            QL +    KFSYCL  +   K   S +  G+ A V    S   + +TPLI   SQ +FY
Sbjct: 244 KQLGDSIDGKFSYCLVPLTSKKDQTSKINFGTNAIV----SGSGVVSTPLIAKASQETFY 303

Query: 308 YLSLQGISVEVPKLTFHFKGADLELPGENYMIGDSRAELICL 341
           YL+L+ ISV   ++   + G+D E   E  +I DS   L  L
Sbjct: 304 YLTLKSISVGSKQI--QYSGSDSE-SSEGNIIIDSGTTLTLL 327

BLAST of Cp4.1LG10g12010 vs. Swiss-Prot
Match: ASPG1_ARATH (Protein ASPARTIC PROTEASE IN GUARD CELL 1 OS=Arabidopsis thaliana GN=ASPG1 PE=1 SV=1)

HSP 1 Score: 171.4 bits (433), Expect = 1.9e-41
Identity = 96/241 (39.83%), Positives = 137/241 (56.85%), Query Frame = 1

Query: 96  PVVAG----NGEFLMKLAIGSPPRSFSAIMDTGSDLIWTQCKPCQQCFDQATPIFDPKES 155
           PVV+G    +GE+  ++ +G+P +    ++DTGSD+ W QC+PC  C+ Q+ P+F+P  S
Sbjct: 150 PVVSGASQGSGEYFSRIGVGTPAKEMYLVLDTGSDVNWIQCEPCADCYQQSDPVFNPTSS 209

Query: 156 SSFSKISCSSELCDALPTSTCSSDECEYFYTYGDYSSTHGVLAAETFTFGDSSQDQVSIP 215
           S++  ++CS+  C  L TS C S++C Y  +YGD S T G LA +T TFG+S +    I 
Sbjct: 210 STYKSLTCSAPQCSLLETSACRSNKCLYQVSYGDGSFTVGELATDTVTFGNSGK----IN 269

Query: 216 GLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLVSQLKEQKFSYCLTAIDDTKPSSLLLGSL 275
            +  GCG DNEG  F+   GL+GLG G LS+ +Q+K   FSYCL   D  K SSL   S+
Sbjct: 270 NVALGCGHDNEG-LFTGAAGLLGLGGGVLSITNQMKATSFSYCLVDRDSGKSSSLDFNSV 329

Query: 276 ANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQGISVEVPKLTFHFKGADLELPGENYMIG 333
                    G   T PL+RN    +FYY+ L G SV   K+       D++  G   +I 
Sbjct: 330 ------QLGGGDATAPLLRNKKIDTFYYVGLSGFSVGGEKVVLPDAIFDVDASGSGGVIL 379

BLAST of Cp4.1LG10g12010 vs. TrEMBL
Match: A0A0A0KYT9_CUCSA (Uncharacterized protein OS=Cucumis sativus GN=Csa_4G554680 PE=3 SV=1)

HSP 1 Score: 492.7 bits (1267), Expect = 4.1e-136
Identity = 253/310 (81.61%), Positives = 277/310 (89.35%), Query Frame = 1

Query: 12  VVLSITMLFIHTSASSSSHSRRALRQP-KLPSDGFRVSLNHVDHVKNLTRFEQLQRGVAR 71
           +++ IT LFI+T A SSS SRRAL++P KLPS GFRV L HVDHVKNLTRFE+L+RGVAR
Sbjct: 17  LIILITTLFINTLAFSSSLSRRALQKPNKLPSHGFRVRLKHVDHVKNLTRFERLRRGVAR 76

Query: 72  GKTRLHRLNAMMLAA-NVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDLIW 131
           GK RLHRLNAM+LAA N  VG +V+APVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDLIW
Sbjct: 77  GKNRLHRLNAMVLAAANATVGDQVKAPVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDLIW 136

Query: 132 TQCKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDECEYFYTYGDYSST 191
           TQCKPCQQCFDQ+TPIFDPK+SSSF KISCSSELC ALPTSTCSSD CEY YTYGD SST
Sbjct: 137 TQCKPCQQCFDQSTPIFDPKQSSSFYKISCSSELCGALPTSTCSSDGCEYLYTYGDSSST 196

Query: 192 HGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLVSQLKEQ 251
            GVLA ETFTFGDS++DQ+SIPGLGFGCG+DN GDGFSQG GLVGLGRGPLSLVSQLKEQ
Sbjct: 197 QGVLAFETFTFGDSTEDQISIPGLGFGCGNDNNGDGFSQGAGLVGLGRGPLSLVSQLKEQ 256

Query: 252 KFSYCLTAIDDTKPSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQGISV-- 311
           KF+YCLTAIDD+KPSSLLLGSLAN+ PK S+ E+KTTPLI+NPSQPSFYYLSLQGISV  
Sbjct: 257 KFAYCLTAIDDSKPSSLLLGSLANITPKTSKDEMKTTPLIKNPSQPSFYYLSLQGISVGG 316

Query: 312 ---EVPKLTF 315
               +PK TF
Sbjct: 317 TQLSIPKSTF 326

BLAST of Cp4.1LG10g12010 vs. TrEMBL
Match: A0A0D2T6N5_GOSRA (Uncharacterized protein OS=Gossypium raimondii GN=B456_011G117400 PE=3 SV=1)

HSP 1 Score: 388.3 bits (996), Expect = 1.1e-104
Identity = 205/330 (62.12%), Positives = 247/330 (74.85%), Query Frame = 1

Query: 4   SLRYLIVSVVLSITMLFIHTSASSSSHSRRALRQPKLPSDGFRVSLNHVDHVKNLTRFEQ 63
           +L Y +  V  S  ++     +   S SRR L       +GFRV+L HVD  KNLT++E+
Sbjct: 2   ALLYSLCCVSFSALVIVALYVSPVVSTSRRVLGDYGKLENGFRVTLKHVDSSKNLTKWER 61

Query: 64  LQRGVARGKTRLHRLNAMMLAANVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDT 123
           +QRG+ RG  RL RLNAM+LAA+ G    VQAP+VAGNGEFLM L+IG+PP S+SAI+DT
Sbjct: 62  IQRGIKRGNHRLQRLNAMVLAAS-GDSAEVQAPIVAGNGEFLMDLSIGTPPNSYSAILDT 121

Query: 124 GSDLIWTQCKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDECEYFYTY 183
           GSDLIWTQCKPC QCFDQ+TPIFDP++SS+F+K+SCSS+LC+ALP STCS   CEY YTY
Sbjct: 122 GSDLIWTQCKPCTQCFDQSTPIFDPQKSSTFTKLSCSSDLCEALPQSTCSDGSCEYLYTY 181

Query: 184 GDYSSTHGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLV 243
           GDYSST GV+A E F F     D VS+P +GFGCG+DNEGDGFSQG GLVGLGRGPLSLV
Sbjct: 182 GDYSSTQGVMATEIFKF-----DSVSVPNIGFGCGEDNEGDGFSQGAGLVGLGRGPLSLV 241

Query: 244 SQLKEQKFSYCLTAIDDTKPSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQ 303
           SQLKE KFSYCLTA+D+T+ S LL+GS+A+     S GE++TTPLIRNPSQPSFYYLSLQ
Sbjct: 242 SQLKEPKFSYCLTAMDETQKSLLLMGSIASA--NESLGEMRTTPLIRNPSQPSFYYLSLQ 301

Query: 304 GISVEVPKLTFHFKGADLELPGENYMIGDS 334
           GI+V   +L        LE  G   +I DS
Sbjct: 302 GITVGSTRLPIKESTFALEDNGSGGVIIDS 323

BLAST of Cp4.1LG10g12010 vs. TrEMBL
Match: B9SA95_RICCO (Aspartic proteinase nepenthesin-1, putative OS=Ricinus communis GN=RCOM_1697100 PE=3 SV=1)

HSP 1 Score: 385.2 bits (988), Expect = 9.3e-104
Identity = 202/322 (62.73%), Positives = 246/322 (76.40%), Query Frame = 1

Query: 12  VVLSITMLFIHTSASSSSHSRRALRQPKLPSDGFRVSLNHVDHVKNLTRFEQLQRGVARG 71
           V+LS+ +L +    + S+ SRRAL  P    +GFR++L HVD  KNLT+F+++Q G+ R 
Sbjct: 11  VLLSLLILSLSVYPAFST-SRRALSYPAQLKNGFRITLKHVDSDKNLTKFQRIQHGIKRA 70

Query: 72  KTRLHRLNAMMLAANVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDLIWTQ 131
             RL RLNAM+LAA+      + +PV++GNGEFLM LAIG+PP ++SAIMDTGSDLIWTQ
Sbjct: 71  NHRLERLNAMVLAASSNA--EINSPVLSGNGEFLMNLAIGTPPETYSAIMDTGSDLIWTQ 130

Query: 132 CKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDECEYFYTYGDYSSTHG 191
           CKPC QCFDQ +PIFDPK+SSSFSK+SCSS+LC ALP S+C SD CEY YTYGDYSST G
Sbjct: 131 CKPCTQCFDQPSPIFDPKKSSSFSKLSCSSQLCKALPQSSC-SDSCEYLYTYGDYSSTQG 190

Query: 192 VLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLVSQLKEQKF 251
            +A ETFTFG     +VSIP +GFGCG+DNEGDGF+QG GLVGLGRGPLSLVSQLKE KF
Sbjct: 191 TMATETFTFG-----KVSIPNVGFGCGEDNEGDGFTQGSGLVGLGRGPLSLVSQLKEAKF 250

Query: 252 SYCLTAIDDTKPSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQGISVEVPK 311
           SYCLT+IDDTK S+LL+GSLA+V    +   I+TTPLI+NP QPSFYYLSL+GISV   +
Sbjct: 251 SYCLTSIDDTKTSTLLMGSLASV--NGTSAAIRTTPLIQNPLQPSFYYLSLEGISVGGTR 310

Query: 312 LTFHFKGADLELPGENYMIGDS 334
           L        L+  G   +I DS
Sbjct: 311 LPIKESTFQLQDDGTGGLIIDS 321

BLAST of Cp4.1LG10g12010 vs. TrEMBL
Match: B9N2J4_POPTR (Aspartyl protease family protein OS=Populus trichocarpa GN=POPTR_0019s01600g PE=3 SV=1)

HSP 1 Score: 384.8 bits (987), Expect = 1.2e-103
Identity = 205/324 (63.27%), Positives = 246/324 (75.93%), Query Frame = 1

Query: 10  VSVVLSITMLFIHTSASSSSHSRRALRQPKLPSDGFRVSLNHVDHVKNLTRFEQLQRGVA 69
           +S+V+++  +F    + + S SRR L  PK+  +GFR  L HVD  KNLT+FE++Q GV 
Sbjct: 7   LSLVVALA-IFAFVFSHAFSTSRRVLEHPKV-QNGFRAKLKHVDSGKNLTKFERIQHGVK 66

Query: 70  RGKTRLHRLNAMMLAANVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDLIW 129
           RG+ RL R  AM L A+      + APV+ GNGEFLMKLAIG+PP ++SAIMDTGSDLIW
Sbjct: 67  RGRHRLQRFKAMALVASSN--SEIDAPVLPGNGEFLMKLAIGTPPETYSAIMDTGSDLIW 126

Query: 130 TQCKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDECEYFYTYGDYSST 189
           TQCKPC QCFDQ TPIFDPK+SSSFSK+SCSS+LC+ALP STC SD CEY Y YGDYSST
Sbjct: 127 TQCKPCTQCFDQPTPIFDPKKSSSFSKLSCSSKLCEALPQSTC-SDGCEYLYGYGDYSST 186

Query: 190 HGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLVSQLKEQ 249
            G+LA+ET TFG     +VS+P + FGCG+DNEG GFSQG GLVGLGRGPLSLVSQLKE 
Sbjct: 187 QGMLASETLTFG-----KVSVPEVAFGCGEDNEGSGFSQGSGLVGLGRGPLSLVSQLKEP 246

Query: 250 KFSYCLTAIDDTKPSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQGISVEV 309
           KFSYCLT++DDTK S+LL+GSLA+V  KAS+ EIKTTPLI+N +QPSFYYLSL+GISV  
Sbjct: 247 KFSYCLTSVDDTKASTLLMGSLASV--KASDSEIKTTPLIQNSAQPSFYYLSLEGISVGD 306

Query: 310 PKLTFHFKGADLELPGENYMIGDS 334
             L        L+  G   +I DS
Sbjct: 307 TSLPIKKSTFSLQEDGSGGLIIDS 318

BLAST of Cp4.1LG10g12010 vs. TrEMBL
Match: A0A061DK44_THECC (Eukaryotic aspartyl protease family protein OS=Theobroma cacao GN=TCM_001628 PE=3 SV=1)

HSP 1 Score: 382.5 bits (981), Expect = 6.0e-103
Identity = 207/330 (62.73%), Positives = 250/330 (75.76%), Query Frame = 1

Query: 4   SLRYLIVSVVLSITMLFIHTSASSSSHSRRALRQPKLPSDGFRVSLNHVDHVKNLTRFEQ 63
           SL  L+    L++ ++ ++ S + S+ SR AL   +L  +GFRV+L HVD  KNLT++E+
Sbjct: 3   SLYSLLCVAFLTLEIVALYVSPAVST-SRGALEHRRL-QNGFRVTLRHVDSGKNLTKWER 62

Query: 64  LQRGVARGKTRLHRLNAMMLAANVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDT 123
           +QRGV RG  RL RLNAM+LAA       +QAP+ AGNGEFLM LAIG+PP S+SAI+DT
Sbjct: 63  IQRGVKRGNHRLQRLNAMVLAATDA--SELQAPITAGNGEFLMDLAIGTPPESYSAILDT 122

Query: 124 GSDLIWTQCKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDECEYFYTY 183
           GSDLIWTQCKPC QCFDQ TPIFDPK+SSSFSK+SCSS LC ALP S C SD CEY YTY
Sbjct: 123 GSDLIWTQCKPCSQCFDQPTPIFDPKKSSSFSKLSCSSHLCSALPQSAC-SDGCEYLYTY 182

Query: 184 GDYSSTHGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLV 243
           GDYSST GV+A ETFTFG     +VS+P +GFGCG DN+GDGF+QG GLVGLGRGP+SLV
Sbjct: 183 GDYSSTQGVMAVETFTFG-----KVSVPNIGFGCGGDNQGDGFTQGAGLVGLGRGPVSLV 242

Query: 244 SQLKEQKFSYCLTAIDDTKPSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQ 303
           SQLK+ KFSYCLT+IDDTK S+LL+GS+A+V    + G IKTTPLI NP+QPSFYYLSL+
Sbjct: 243 SQLKQGKFSYCLTSIDDTKKSTLLMGSIASV--NRTLGAIKTTPLIHNPTQPSFYYLSLK 302

Query: 304 GISVEVPKLTFHFKGADLELPGENYMIGDS 334
           GI+V   +L        LE  G   +I DS
Sbjct: 303 GITVGDTRLPIKKSTFALEDDGTGGVIIDS 320

BLAST of Cp4.1LG10g12010 vs. TAIR10
Match: AT2G03200.1 (AT2G03200.1 Eukaryotic aspartyl protease family protein)

HSP 1 Score: 340.1 bits (871), Expect = 1.7e-93
Identity = 188/330 (56.97%), Positives = 236/330 (71.52%), Query Frame = 1

Query: 19  LFIHTSASSSSHSRRALRQ---PK-LPSDGFRVSLNHVDHVKNLTRFEQLQRGVARGKTR 78
           L + +   S S SRR+L     PK LP  GFR+SL HVD  KNLT+ +++QRG+ RG  R
Sbjct: 15  LILFSCLISVSSSRRSLIDRTLPKNLPRSGFRLSLRHVDSGKNLTKIQKIQRGINRGFHR 74

Query: 79  LHRLNA---MMLAANVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDLIWTQ 138
           L+RL A   + +A+       ++AP   G+GEFLM+L+IG+P   +SAI+DTGSDLIWTQ
Sbjct: 75  LNRLGAVAVLAVASKPDDTNNIKAPTHGGSGEFLMELSIGNPAVKYSAIVDTGSDLIWTQ 134

Query: 139 CKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDE--CEYFYTYGDYSST 198
           CKPC +CFDQ TPIFDP++SSS+SK+ CSS LC+ALP S C+ D+  CEY YTYGDYSST
Sbjct: 135 CKPCTECFDQPTPIFDPEKSSSYSKVGCSSGLCNALPRSNCNEDKDACEYLYTYGDYSST 194

Query: 199 HGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLVSQLKEQ 258
            G+LA ETFTF    +D+ SI G+GFGCG +NEGDGFSQG GLVGLGRGPLSL+SQLKE 
Sbjct: 195 RGLLATETFTF----EDENSISGIGFGCGVENEGDGFSQGSGLVGLGRGPLSLISQLKET 254

Query: 259 KFSYCLTAIDDTK-PSSLLLGSLA----NVKPKASEGEI-KTTPLIRNPSQPSFYYLSLQ 318
           KFSYCLT+I+D++  SSL +GSLA    N    + +GE+ KT  L+RNP QPSFYYL LQ
Sbjct: 255 KFSYCLTSIEDSEASSSLFIGSLASGIVNKTGASLDGEVTKTMSLLRNPDQPSFYYLELQ 314

Query: 319 GISVEVPKLTFHFKGADLELPGENYMIGDS 334
           GI+V   +L+      +L   G   MI DS
Sbjct: 315 GITVGAKRLSVEKSTFELAEDGTGGMIIDS 340

BLAST of Cp4.1LG10g12010 vs. TAIR10
Match: AT1G64830.1 (AT1G64830.1 Eukaryotic aspartyl protease family protein)

HSP 1 Score: 186.8 bits (473), Expect = 2.5e-47
Identity = 109/282 (38.65%), Positives = 160/282 (56.74%), Query Frame = 1

Query: 41  PSDGFRVSLNHVDHVKN-LTRFEQLQRGVARGKTRLHRLNAMMLAANVGVGGRVQAPVVA 100
           P DGF + L H D  K+      +      R   R    + +  + +       Q+ + +
Sbjct: 22  PKDGFTIDLIHRDSPKSPFYNSAETSSQRMRNAIRRSARSTLQFSNDDASPNSPQSFITS 81

Query: 101 GNGEFLMKLAIGSPPRSFSAIMDTGSDLIWTQCKPCQQCFDQATPIFDPKESSSFSKISC 160
             GE+LM ++IG+PP    AI DTGSDLIWTQC PC+ C+ Q +P+FDPKESS++ K+SC
Sbjct: 82  NRGEYLMNISIGTPPVPILAIADTGSDLIWTQCNPCEDCYQQTSPLFDPKESSTYRKVSC 141

Query: 161 SSELCDALPTSTCSSDE--CEYFYTYGDYSSTHGVLAAETFTFGDSSQDQVSIPGLGFGC 220
           SS  C AL  ++CS+DE  C Y  TYGD S T G +A +T T G S +  VS+  +  GC
Sbjct: 142 SSSQCRALEDASCSTDENTCSYTITYGDNSYTKGDVAVDTVTMGSSGRRPVSLRNMIIGC 201

Query: 221 GDDNEGDGFSQGEGLVGLGRGPLSLVSQLKEQ---KFSYCLTAI--DDTKPSSLLLGSLA 280
           G +N G     G G++GLG G  SLVSQL++    KFSYCL     +    S +  G+  
Sbjct: 202 GHENTGTFDPAGSGIIGLGGGSTSLVSQLRKSINGKFSYCLVPFTSETGLTSKINFGTNG 261

Query: 281 NVKPKASEGEIKTTPLIRNPSQPSFYYLSLQGISVEVPKLTF 315
            V   + +G + T+ + ++P+  ++Y+L+L+ ISV   K+ F
Sbjct: 262 IV---SGDGVVSTSMVKKDPA--TYYFLNLEAISVGSKKIQF 298

BLAST of Cp4.1LG10g12010 vs. TAIR10
Match: AT5G33340.1 (AT5G33340.1 Eukaryotic aspartyl protease family protein)

HSP 1 Score: 185.7 bits (470), Expect = 5.5e-47
Identity = 128/342 (37.43%), Positives = 184/342 (53.80%), Query Frame = 1

Query: 8   LIVSVVLSITMLFIHTSASSSSHSRRALRQPKLPSDGFRVSLNHVDHVKNLTRFEQLQRG 67
           L  SV+LS+ +L       SS     A  +PKL   GF   L H D  K+   +  ++  
Sbjct: 4   LFSSVLLSLCLL-------SSLFLSNANAKPKL---GFTADLIHRDSPKS-PFYNPMETS 63

Query: 68  VARGKTRLHR-LNAMMLAANVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDTGSD 127
             R +  +HR +N +          + Q  + + +GE+LM ++IG+PP    AI DTGSD
Sbjct: 64  SQRLRNAIHRSVNRVFHFTEKDNTPQPQIDLTSNSGEYLMNVSIGTPPFPIMAIADTGSD 123

Query: 128 LIWTQCKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPT-STCSSDE--CEYFYTY 187
           L+WTQC PC  C+ Q  P+FDPK SS++  +SCSS  C AL   ++CS+++  C Y  +Y
Sbjct: 124 LLWTQCAPCDDCYTQVDPLFDPKTSSTYKDVSCSSSQCTALENQASCSTNDNTCSYSLSY 183

Query: 188 GDYSSTHGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLV 247
           GD S T G +A +T T G S    + +  +  GCG +N G    +G G+VGLG GP+SL+
Sbjct: 184 GDNSYTKGNIAVDTLTLGSSDTRPMQLKNIIIGCGHNNAGTFNKKGSGIVGLGGGPVSLI 243

Query: 248 SQLKEQ---KFSYCLTAIDDTK--PSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFY 307
            QL +    KFSYCL  +   K   S +  G+ A V    S   + +TPLI   SQ +FY
Sbjct: 244 KQLGDSIDGKFSYCLVPLTSKKDQTSKINFGTNAIV----SGSGVVSTPLIAKASQETFY 303

Query: 308 YLSLQGISVEVPKLTFHFKGADLELPGENYMIGDSRAELICL 341
           YL+L+ ISV   ++   + G+D E   E  +I DS   L  L
Sbjct: 304 YLTLKSISVGSKQI--QYSGSDSE-SSEGNIIIDSGTTLTLL 327

BLAST of Cp4.1LG10g12010 vs. TAIR10
Match: AT1G31450.1 (AT1G31450.1 Eukaryotic aspartyl protease family protein)

HSP 1 Score: 177.2 bits (448), Expect = 1.9e-44
Identity = 117/311 (37.62%), Positives = 166/311 (53.38%), Query Frame = 1

Query: 13  VLSITMLFIHTSASSSSHSRRALRQPKLPSDGFRVSLNHVDHVKNLTRFEQLQRGVARGK 72
           +L+I+  F    AS+SS +R  L    +  D     L +  H    T  ++L     R  
Sbjct: 11  LLAISFFF----ASNSSANRENLTVELIHRDSPHSPLYNPHH----TVSDRLNAAFLRSI 70

Query: 73  TRLHRLNAMMLAANVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDLIWTQC 132
           +R  R               +Q+ +++  GE+ M ++IG+PP    AI DTGSDL W QC
Sbjct: 71  SRSRRFTTKT---------DLQSGLISNGGEYFMSISIGTPPSKVFAIADTGSDLTWVQC 130

Query: 133 KPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDE----CEYFYTYGDYSS 192
           KPCQQC+ Q +P+FD K+SS++   SC S+ C AL       DE    C+Y Y+YGD S 
Sbjct: 131 KPCQQCYKQNSPLFDKKKSSTYKTESCDSKTCQALSEHEEGCDESKDICKYRYSYGDNSF 190

Query: 193 THGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLVSQLKE 252
           T G +A ET +   SS   VS PG  FGCG +N G     G G++GLG GPLSLVSQL  
Sbjct: 191 TKGDVATETISIDSSSGSSVSFPGTVFGCGYNNGGTFEETGSGIIGLGGGPLSLVSQLGS 250

Query: 253 ---QKFSYCL--TAIDDTKPSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQ 312
              +KFSYCL  TA      S + LG+ +     + +    TTPLI+   + ++Y+L+L+
Sbjct: 251 SIGKKFSYCLSHTAATTNGTSVINLGTNSIPSNPSKDSATLTTPLIQKDPE-TYYFLTLE 303

Query: 313 GISVEVPKLTF 315
            ++V   KL +
Sbjct: 311 AVTVGKTKLPY 303

BLAST of Cp4.1LG10g12010 vs. TAIR10
Match: AT3G18490.1 (AT3G18490.1 Eukaryotic aspartyl protease family protein)

HSP 1 Score: 171.4 bits (433), Expect = 1.1e-42
Identity = 96/241 (39.83%), Positives = 137/241 (56.85%), Query Frame = 1

Query: 96  PVVAG----NGEFLMKLAIGSPPRSFSAIMDTGSDLIWTQCKPCQQCFDQATPIFDPKES 155
           PVV+G    +GE+  ++ +G+P +    ++DTGSD+ W QC+PC  C+ Q+ P+F+P  S
Sbjct: 150 PVVSGASQGSGEYFSRIGVGTPAKEMYLVLDTGSDVNWIQCEPCADCYQQSDPVFNPTSS 209

Query: 156 SSFSKISCSSELCDALPTSTCSSDECEYFYTYGDYSSTHGVLAAETFTFGDSSQDQVSIP 215
           S++  ++CS+  C  L TS C S++C Y  +YGD S T G LA +T TFG+S +    I 
Sbjct: 210 STYKSLTCSAPQCSLLETSACRSNKCLYQVSYGDGSFTVGELATDTVTFGNSGK----IN 269

Query: 216 GLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLVSQLKEQKFSYCLTAIDDTKPSSLLLGSL 275
            +  GCG DNEG  F+   GL+GLG G LS+ +Q+K   FSYCL   D  K SSL   S+
Sbjct: 270 NVALGCGHDNEG-LFTGAAGLLGLGGGVLSITNQMKATSFSYCLVDRDSGKSSSLDFNSV 329

Query: 276 ANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQGISVEVPKLTFHFKGADLELPGENYMIG 333
                    G   T PL+RN    +FYY+ L G SV   K+       D++  G   +I 
Sbjct: 330 ------QLGGGDATAPLLRNKKIDTFYYVGLSGFSVGGEKVVLPDAIFDVDASGSGGVIL 379

BLAST of Cp4.1LG10g12010 vs. NCBI nr
Match: gi|778695110|ref|XP_011653928.1| (PREDICTED: aspartic proteinase nepenthesin-1 [Cucumis sativus])

HSP 1 Score: 492.7 bits (1267), Expect = 5.9e-136
Identity = 253/310 (81.61%), Positives = 277/310 (89.35%), Query Frame = 1

Query: 12  VVLSITMLFIHTSASSSSHSRRALRQP-KLPSDGFRVSLNHVDHVKNLTRFEQLQRGVAR 71
           +++ IT LFI+T A SSS SRRAL++P KLPS GFRV L HVDHVKNLTRFE+L+RGVAR
Sbjct: 17  LIILITTLFINTLAFSSSLSRRALQKPNKLPSHGFRVRLKHVDHVKNLTRFERLRRGVAR 76

Query: 72  GKTRLHRLNAMMLAA-NVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDLIW 131
           GK RLHRLNAM+LAA N  VG +V+APVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDLIW
Sbjct: 77  GKNRLHRLNAMVLAAANATVGDQVKAPVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDLIW 136

Query: 132 TQCKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDECEYFYTYGDYSST 191
           TQCKPCQQCFDQ+TPIFDPK+SSSF KISCSSELC ALPTSTCSSD CEY YTYGD SST
Sbjct: 137 TQCKPCQQCFDQSTPIFDPKQSSSFYKISCSSELCGALPTSTCSSDGCEYLYTYGDSSST 196

Query: 192 HGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLVSQLKEQ 251
            GVLA ETFTFGDS++DQ+SIPGLGFGCG+DN GDGFSQG GLVGLGRGPLSLVSQLKEQ
Sbjct: 197 QGVLAFETFTFGDSTEDQISIPGLGFGCGNDNNGDGFSQGAGLVGLGRGPLSLVSQLKEQ 256

Query: 252 KFSYCLTAIDDTKPSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQGISV-- 311
           KF+YCLTAIDD+KPSSLLLGSLAN+ PK S+ E+KTTPLI+NPSQPSFYYLSLQGISV  
Sbjct: 257 KFAYCLTAIDDSKPSSLLLGSLANITPKTSKDEMKTTPLIKNPSQPSFYYLSLQGISVGG 316

Query: 312 ---EVPKLTF 315
               +PK TF
Sbjct: 317 TQLSIPKSTF 326

BLAST of Cp4.1LG10g12010 vs. NCBI nr
Match: gi|659083174|ref|XP_008442220.1| (PREDICTED: aspartic proteinase nepenthesin-1 [Cucumis melo])

HSP 1 Score: 489.6 bits (1259), Expect = 5.0e-135
Identity = 255/319 (79.94%), Positives = 280/319 (87.77%), Query Frame = 1

Query: 4   SLRYL-IVSVVLSITMLFIHTSASSSSHSRRALRQP-KLPSDGFRVSLNHVDHVKNLTRF 63
           S  YL ++ +++ IT LFI+T A SSS S RAL++P KLPS GFRV L HVDHVKNLTRF
Sbjct: 8   SFGYLQLLLLIVFITTLFINTLAFSSSLSTRALQKPNKLPSHGFRVRLKHVDHVKNLTRF 67

Query: 64  EQLQRGVARGKTRLHRLNAMMLAA-NVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAI 123
           E+L+RGVARGK RLHRLNAM+LAA N  VG +V+APVVAGNGEFLMKLAIGSPPRSFSAI
Sbjct: 68  ERLRRGVARGKNRLHRLNAMVLAAANASVGDQVKAPVVAGNGEFLMKLAIGSPPRSFSAI 127

Query: 124 MDTGSDLIWTQCKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDECEYF 183
           MDTGSDLIWTQCKPCQQCFDQATPIFDPK+SSSFSKISC SELC ALPTSTCSSD CEY 
Sbjct: 128 MDTGSDLIWTQCKPCQQCFDQATPIFDPKQSSSFSKISCRSELCGALPTSTCSSDGCEYL 187

Query: 184 YTYGDYSSTHGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPL 243
           YTYGD SST GVLA ETFTFGDS++DQ+SIPGLGFGCG+DN GDGFSQG GLVGLGRGPL
Sbjct: 188 YTYGDSSSTQGVLAFETFTFGDSTEDQISIPGLGFGCGNDNNGDGFSQGAGLVGLGRGPL 247

Query: 244 SLVSQLKEQKFSYCLTAIDDTKPSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFYYL 303
           SLVSQLKEQKF+YCLTAIDD+KPSSLLLGSLAN+ PK S+ E+K TPLI+NPSQPSFYYL
Sbjct: 248 SLVSQLKEQKFAYCLTAIDDSKPSSLLLGSLANITPKTSKDEMKATPLIKNPSQPSFYYL 307

Query: 304 SLQGISV-----EVPKLTF 315
           SLQGISV      +PK TF
Sbjct: 308 SLQGISVGGTQLSIPKSTF 326

BLAST of Cp4.1LG10g12010 vs. NCBI nr
Match: gi|823248901|ref|XP_012457105.1| (PREDICTED: aspartic proteinase nepenthesin-1-like [Gossypium raimondii])

HSP 1 Score: 388.3 bits (996), Expect = 1.6e-104
Identity = 205/330 (62.12%), Positives = 247/330 (74.85%), Query Frame = 1

Query: 4   SLRYLIVSVVLSITMLFIHTSASSSSHSRRALRQPKLPSDGFRVSLNHVDHVKNLTRFEQ 63
           +L Y +  V  S  ++     +   S SRR L       +GFRV+L HVD  KNLT++E+
Sbjct: 2   ALLYSLCCVSFSALVIVALYVSPVVSTSRRVLGDYGKLENGFRVTLKHVDSSKNLTKWER 61

Query: 64  LQRGVARGKTRLHRLNAMMLAANVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDT 123
           +QRG+ RG  RL RLNAM+LAA+ G    VQAP+VAGNGEFLM L+IG+PP S+SAI+DT
Sbjct: 62  IQRGIKRGNHRLQRLNAMVLAAS-GDSAEVQAPIVAGNGEFLMDLSIGTPPNSYSAILDT 121

Query: 124 GSDLIWTQCKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDECEYFYTY 183
           GSDLIWTQCKPC QCFDQ+TPIFDP++SS+F+K+SCSS+LC+ALP STCS   CEY YTY
Sbjct: 122 GSDLIWTQCKPCTQCFDQSTPIFDPQKSSTFTKLSCSSDLCEALPQSTCSDGSCEYLYTY 181

Query: 184 GDYSSTHGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLV 243
           GDYSST GV+A E F F     D VS+P +GFGCG+DNEGDGFSQG GLVGLGRGPLSLV
Sbjct: 182 GDYSSTQGVMATEIFKF-----DSVSVPNIGFGCGEDNEGDGFSQGAGLVGLGRGPLSLV 241

Query: 244 SQLKEQKFSYCLTAIDDTKPSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQ 303
           SQLKE KFSYCLTA+D+T+ S LL+GS+A+     S GE++TTPLIRNPSQPSFYYLSLQ
Sbjct: 242 SQLKEPKFSYCLTAMDETQKSLLLMGSIASA--NESLGEMRTTPLIRNPSQPSFYYLSLQ 301

Query: 304 GISVEVPKLTFHFKGADLELPGENYMIGDS 334
           GI+V   +L        LE  G   +I DS
Sbjct: 302 GITVGSTRLPIKESTFALEDNGSGGVIIDS 323

BLAST of Cp4.1LG10g12010 vs. NCBI nr
Match: gi|255563827|ref|XP_002522914.1| (PREDICTED: aspartic proteinase nepenthesin-1 [Ricinus communis])

HSP 1 Score: 385.2 bits (988), Expect = 1.3e-103
Identity = 202/322 (62.73%), Positives = 246/322 (76.40%), Query Frame = 1

Query: 12  VVLSITMLFIHTSASSSSHSRRALRQPKLPSDGFRVSLNHVDHVKNLTRFEQLQRGVARG 71
           V+LS+ +L +    + S+ SRRAL  P    +GFR++L HVD  KNLT+F+++Q G+ R 
Sbjct: 11  VLLSLLILSLSVYPAFST-SRRALSYPAQLKNGFRITLKHVDSDKNLTKFQRIQHGIKRA 70

Query: 72  KTRLHRLNAMMLAANVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDLIWTQ 131
             RL RLNAM+LAA+      + +PV++GNGEFLM LAIG+PP ++SAIMDTGSDLIWTQ
Sbjct: 71  NHRLERLNAMVLAASSNA--EINSPVLSGNGEFLMNLAIGTPPETYSAIMDTGSDLIWTQ 130

Query: 132 CKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDECEYFYTYGDYSSTHG 191
           CKPC QCFDQ +PIFDPK+SSSFSK+SCSS+LC ALP S+C SD CEY YTYGDYSST G
Sbjct: 131 CKPCTQCFDQPSPIFDPKKSSSFSKLSCSSQLCKALPQSSC-SDSCEYLYTYGDYSSTQG 190

Query: 192 VLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLVSQLKEQKF 251
            +A ETFTFG     +VSIP +GFGCG+DNEGDGF+QG GLVGLGRGPLSLVSQLKE KF
Sbjct: 191 TMATETFTFG-----KVSIPNVGFGCGEDNEGDGFTQGSGLVGLGRGPLSLVSQLKEAKF 250

Query: 252 SYCLTAIDDTKPSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQGISVEVPK 311
           SYCLT+IDDTK S+LL+GSLA+V    +   I+TTPLI+NP QPSFYYLSL+GISV   +
Sbjct: 251 SYCLTSIDDTKTSTLLMGSLASV--NGTSAAIRTTPLIQNPLQPSFYYLSLEGISVGGTR 310

Query: 312 LTFHFKGADLELPGENYMIGDS 334
           L        L+  G   +I DS
Sbjct: 311 LPIKESTFQLQDDGTGGLIIDS 321

BLAST of Cp4.1LG10g12010 vs. NCBI nr
Match: gi|566222317|ref|XP_006370905.1| (aspartyl protease family protein [Populus trichocarpa])

HSP 1 Score: 384.8 bits (987), Expect = 1.7e-103
Identity = 205/324 (63.27%), Positives = 246/324 (75.93%), Query Frame = 1

Query: 10  VSVVLSITMLFIHTSASSSSHSRRALRQPKLPSDGFRVSLNHVDHVKNLTRFEQLQRGVA 69
           +S+V+++  +F    + + S SRR L  PK+  +GFR  L HVD  KNLT+FE++Q GV 
Sbjct: 7   LSLVVALA-IFAFVFSHAFSTSRRVLEHPKV-QNGFRAKLKHVDSGKNLTKFERIQHGVK 66

Query: 70  RGKTRLHRLNAMMLAANVGVGGRVQAPVVAGNGEFLMKLAIGSPPRSFSAIMDTGSDLIW 129
           RG+ RL R  AM L A+      + APV+ GNGEFLMKLAIG+PP ++SAIMDTGSDLIW
Sbjct: 67  RGRHRLQRFKAMALVASSN--SEIDAPVLPGNGEFLMKLAIGTPPETYSAIMDTGSDLIW 126

Query: 130 TQCKPCQQCFDQATPIFDPKESSSFSKISCSSELCDALPTSTCSSDECEYFYTYGDYSST 189
           TQCKPC QCFDQ TPIFDPK+SSSFSK+SCSS+LC+ALP STC SD CEY Y YGDYSST
Sbjct: 127 TQCKPCTQCFDQPTPIFDPKKSSSFSKLSCSSKLCEALPQSTC-SDGCEYLYGYGDYSST 186

Query: 190 HGVLAAETFTFGDSSQDQVSIPGLGFGCGDDNEGDGFSQGEGLVGLGRGPLSLVSQLKEQ 249
            G+LA+ET TFG     +VS+P + FGCG+DNEG GFSQG GLVGLGRGPLSLVSQLKE 
Sbjct: 187 QGMLASETLTFG-----KVSVPEVAFGCGEDNEGSGFSQGSGLVGLGRGPLSLVSQLKEP 246

Query: 250 KFSYCLTAIDDTKPSSLLLGSLANVKPKASEGEIKTTPLIRNPSQPSFYYLSLQGISVEV 309
           KFSYCLT++DDTK S+LL+GSLA+V  KAS+ EIKTTPLI+N +QPSFYYLSL+GISV  
Sbjct: 247 KFSYCLTSVDDTKASTLLMGSLASV--KASDSEIKTTPLIQNSAQPSFYYLSLEGISVGD 306

Query: 310 PKLTFHFKGADLELPGENYMIGDS 334
             L        L+  G   +I DS
Sbjct: 307 TSLPIKKSTFSLQEDGSGGLIIDS 318

The following BLAST results are available for this feature:
Match Name   E-value   Identity  Description
NEP1_NEPGR   1.3e-74   50.96     Aspartic proteinase nepenthesin-1 OS=Nepenthes gracilis GN=nep1 PE=1 SV=1
NEP2_NEPGR   2.9e-74   45.95     Aspartic proteinase nepenthesin-2 OS=Nepenthes gracilis GN=nep2 PE=1 SV=1
AP37_ORYSJ   2.0e-46   39.13     Aspartyl protease 37 OS=Oryza sativa subsp. japonica GN=AP37 PE=3 SV=2
CDR1_ARATH   9.7e-46   37.43     Aspartic proteinase CDR1 OS=Arabidopsis thaliana GN=CDR1 PE=1 SV=1
ASPG1_ARATH  1.9e-41   39.83     Protein ASPARTIC PROTEASE IN GUARD CELL 1 OS=Arabidopsis thaliana GN=ASPG1 PE=1 ...
Match Name        E-value    Identity  Description
A0A0A0KYT9_CUCSA  4.1e-136   81.61     Uncharacterized protein OS=Cucumis sativus GN=Csa_4G554680 PE=3 SV=1
A0A0D2T6N5_GOSRA  1.1e-104   62.12     Uncharacterized protein OS=Gossypium raimondii GN=B456_011G117400 PE=3 SV=1
B9SA95_RICCO      9.3e-104   62.73     Aspartic proteinase nepenthesin-1, putative OS=Ricinus communis GN=RCOM_1697100 ...
B9N2J4_POPTR      1.2e-103   63.27     Aspartyl protease family protein OS=Populus trichocarpa GN=POPTR_0019s01600g PE=...
A0A061DK44_THECC  6.0e-103   62.73     Eukaryotic aspartyl protease family protein OS=Theobroma cacao GN=TCM_001628 PE=...
Match Name   E-value   Identity  Description
AT2G03200.1  1.7e-93   56.97     Eukaryotic aspartyl protease family protein
AT1G64830.1  2.5e-47   38.65     Eukaryotic aspartyl protease family protein
AT5G33340.1  5.5e-47   37.43     Eukaryotic aspartyl protease family protein
AT1G31450.1  1.9e-44   37.62     Eukaryotic aspartyl protease family protein
AT3G18490.1  1.1e-42   39.83     Eukaryotic aspartyl protease family protein
Match Name                        E-value    Identity  Description
gi|778695110|ref|XP_011653928.1|  5.9e-136   81.61     PREDICTED: aspartic proteinase nepenthesin-1 [Cucumis sativus]
gi|659083174|ref|XP_008442220.1|  5.0e-135   79.94     PREDICTED: aspartic proteinase nepenthesin-1 [Cucumis melo]
gi|823248901|ref|XP_012457105.1|  1.6e-104   62.12     PREDICTED: aspartic proteinase nepenthesin-1-like [Gossypium raimondii]
gi|255563827|ref|XP_002522914.1|  1.3e-103   62.73     PREDICTED: aspartic proteinase nepenthesin-1 [Ricinus communis]
gi|566222317|ref|XP_006370905.1|  1.7e-103   63.27     aspartyl protease family protein [Populus trichocarpa]
The following terms have been associated with this gene:
Vocabulary: Biological Process
Term        Definition
GO:0006508  proteolysis
Vocabulary: Molecular Function
Term        Definition
GO:0004190  aspartic-type endopeptidase activity
Vocabulary: INTERPRO
Term        Definition
IPR021109   Peptidase_aspartic_dom_sf
IPR001461   Aspartic_peptidase_A1
GO Assignments
This gene is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0044260 cellular macromolecule metabolic process
biological_process GO:0006508 proteolysis
biological_process GO:0044699 single-organism process
biological_process GO:0010413 glucuronoxylan metabolic process
biological_process GO:0050665 hydrogen peroxide biosynthetic process
biological_process GO:0016926 protein desumoylation
biological_process GO:0000041 transition metal ion transport
biological_process GO:0010228 vegetative to reproductive phase transition of meristem
biological_process GO:0045492 xylan biosynthetic process
cellular_component GO:0005575 cellular_component
molecular_function GO:0004190 aspartic-type endopeptidase activity
molecular_function GO:0016740 transferase activity

The following mRNA feature(s) are a part of this gene:

Feature Name       Unique Name        Type
Cp4.1LG10g12010.1  Cp4.1LG10g12010.1  mRNA


Analysis Name: InterPro Annotations of Cucurbita pepo
Date Performed: 2017-12-02
IPR Term   IPR Description               Source   Source Term       Source Description   Alignment
IPR001461  Aspartic peptidase A1 family  PANTHER  PTHR13683         ASPARTYL PROTEASES   coord: 1..380, score: 4.4E
IPR021109  Aspartic peptidase domain     GENE3D   G3DSA:2.40.70.10  -                    coord: 308..379, score: 3.6E-17; coord: 256..307, score: 2.1E-5; coord: 101..255, score: 1.7
IPR021109  Aspartic peptidase domain     unknown  SSF50630          Acid proteases       coord: 97..378, score: 2.04
None       No IPR available              PANTHER  PTHR13683:SF324   SUBFAMILY NOT NAMED  coord: 1..380, score: 4.4E