Cp4.1LG15g04750 (gene) Cucurbita pepo (Zucchini)

Name: Cp4.1LG15g04750
Type: gene
Organism: Cucurbita pepo (Zucchini)
Description: Cysteine protease
Location: Cp4.1LG15 : 5754029 .. 5763131 (-)
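The location field packs the scaffold name, start and end coordinates, and strand into one string. A minimal sketch of how it could be split apart is shown here; the function name `parse_location` and the assumption that coordinates are 1-based and inclusive (so that length is end - start + 1) are illustrative and are not stated by the record itself.

```python
import re

# Pattern for strings such as "Cp4.1LG15 : 5754029 .. 5763131 (-)".
LOCATION_RE = re.compile(r"^(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)$")


def parse_location(text):
    """Split a location string into scaffold, start, end, strand and length.

    Assumes 1-based inclusive coordinates (an assumption, not confirmed
    by the record).
    """
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    scaffold, start, end, strand = match.groups()
    start, end = int(start), int(end)
    return {
        "scaffold": scaffold,
        "start": start,
        "end": end,
        "strand": strand,
        "length": end - start + 1,
    }


loc = parse_location("Cp4.1LG15 : 5754029 .. 5763131 (-)")
print(loc["scaffold"], loc["strand"], loc["length"])
```

Under that coordinate assumption the feature spans 9103 bases on the minus strand of Cp4.1LG15.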
The following sequences are available for this feature:

Gene sequence (with intron)

AACGCAAAGCACCACATGGGTAACCCTTACCGGACCCACGACGAGGTCGCGGCTCTGTACGAGTCGTGGTTGGTCCATCATGGAAAAGCGTACAACGCTATCGGCGAGAAAGAGCGGCGGTTTGAGATTTTCAAGGATAATCTCAGGTTCATCGATGAACATAACCGGGAATCGCGGAGTTATAAGGTTGGATTGACTCGTTTCGCTGATCTCACCAATGATGAGTATCGGGCCATGTTTTTGGGCGGCCGATTCTCCCCAATGCCTCGCCTATCCGCCGCTAAGAGCAGTAGGTACGCGACGGCTGTCGGCGATGATCTTCCGGATGATGTCGATTGGAGAAAGAAGGGCGCTGTTACGGCTGTTAAAGATCAAGGACAGTGTGGTGAGTTCTGTTTTCCTCTTATTTTGCAGCATTTTTGTTTAACTGATCCGATGGATTCTGTATTCTCATTTAGTCTTTGATCTTTTGACTCTTGCAAACCACCGGTGATTATCGATTTGATTTGTGTTTGTTTGTGTGTTTGTTTGAGTTTATAAGTAGGGAATACATCTCCGTTGTAATGAGGTCTTTTGGTAAAAACAAAAAACAAAGACACGAGAGCTTATGATCAAAGTAAAGAATATCGTACCATGGTGGAGAGTACTAATTTCTAACATGATATTAGAGTCACGCTCTTAATTTAGTCATGCCAATAGAATCCTCAAATATCGAACAAAGAAGTTGTGAACTTCAAAGGTGTAGTCAAAAGTGATTCAATTGTCGAACGAAGGGTGTATTTTATTCGAGGCCTCTAAGAAGGAGTGGAGCCTCGATTAAGGGGAGGCTGTTTGAGGGCTCCAAAGAAGTTGTGAACCTTGAAAGTGTAGTCAAAAGTGGTGACTCAAGTGTCGAACGAAGGGTGTATTTTGTTCGAGGGCTCCATAGGTCTCAGAAGAGGCTCTATGATGTAATTTGTTCGAGTGGAGGCTTGTTAGGAGAGGTCTTATGGACTATGATAGAACATATAACATGGGTTGCAATGGTGGTCTTATGGACTATGCTTTCCACTTCATCATTGACAATGGTGGAATCGACACCGAGAAAGATTACCCTTACAAGGCCCGTGATGACACTTGTGATCCCGACCGCGTTCGTGTCGATCGACCTACTAAATTCGATTCATTTGTTGTTTCTGACATCTTCATTTGACTTGGTTGCTAATGTTTTGTCAACAGAAACATATTAAGGTTGTTGAAATAGATGGGCATGAAGATGTTCCTGAGATTAATGAGAGGTTGTTGAAGACGGCTATGGCAGGTCAACCAGTTAGTGTTGCCATTGAAGCTGGTGGCAGAGCCTTCCAACTTTACCAATCGGTAATCTTTTCAACTTTTGAACATTTCCTGGATGTTGTTTCACAAAGCTAAGGTTTTGAATGCATGTTCGAGAATATCCTGTTGTGTGTGTGTGTTTGGAAGACCGCCCATGTCTATGGATATGTGCATTTAAAGATCCTACATTGGTGGGAGAGGGGAATGAAGCATTTTTGGTAAAGGTGTGGAAACATCTCTCTTGCATACGCGTTTTAAAACCGTGAGGCTGACAGTGGTATGTAATGGGTCAAAGCGGACAATATCGGCTAGCAGTATGTTTGGGCTTTTACAAATGGTACCAGAGTTAGAAGAGGCTAAGCTCCGAAGGGGGTGGACATGAGGGGGTGTGATAGAAAGAAAAGTGAGCCACAAAGAGGGTGGGGGTCCCACATTGATTGGAGAGAGGACAAGTGTCAGCGAAGACGTTGGGTCCTTAGGAGGTTGATTGTGAGATCCCACGTCGGTTGTAGGGGGGAATGATCATTCTCTATAAGGGTATGAAAACCTCTCCCTATCTGACGCTTTTTAAAACAGTGAGGCTGATGACGATACGTAACGGGTCAAAGTGGATAATATTTGCTAGCGGTGAATTTGAGCTGTTACAAATGATATCAGAGCCAGAAGAGGCTAAACCTTGAAAAGGG
TGGACACGAGGCGGTGCACCAGAAAGGTGACTGGCCTCAAAGGGGGTGGATTGGGGGTTACTTCGATTGGAGAGAGAACTTGTGCCAGCAAGGACGCTGAGCACCAAAGGGGGTGGATTGTGAGATCCCACATCTTTATAAGGGTGTGGGAACATCTCCCTAGTAGACGCTTTTTAAAACTATGAGATTGACGGTGATACATAATGGGCCAAAACGGACAATATCTGTGTACTTACGCTGTTACAATGCATCATCATTCGTGGTGTTTTTTTTTTCTCCTCTCGTTGTGTGTTGAAGACATTCTATGTTTATGCATCATTCCGGGTGTCTTCACTGGGCGTTGTGGAACCGATCTGGATCACGGTGTTCTTGCTGTTGGGTACGGTAGAGAAGACGATTTAGACTACTGGATTGTAAAGAATTCGTGGGGAGAAAAATGGGAAGAGGACGGTTATAGCAAGATGGAGAGAAATGTGGCAAACACTACCTCAGGTAAGTGTGGTATAGCCTTACAGGCATCATACCCCATCAAGACCGGAAAGAACCCTCCAAAAGATATGAGAAGTGTATGCTCGATGAGAATCCGCTCGATCATTAAACTGTGATTGTAGATTTAAACACGAAACTTAAAATATAATCTCGCGAGATGAAATTCACTTGGAGCAAGTAGATTCACTCACTTCCGTATAGGATGAGTTAATGCATAGTTTTACTTGAACATCATCCATATATTATAGATCCTTTAAATTATTACTTGAACATGATTCTCTTCTGTGTCATTCACATATTGTTTAAGATTCCGCTATTTTTTTATACCATGATTAATATTAAATTATAAATAAAATAATTAAAATTATTATTAAAAACAATTAATTATTTACTAGATTTTTTTATCATAAATCTCAAAAAAATATTTTTAAATTTTTTTTGAAGGAGAGAACACGAGGAGGGCAAGATGCCAGCGCATGACATCAGCAATGATGCCAACGCATGATGTCATGTTGGCATCATTCCTTATGCCAATAGTACTAACCGTCTTTCGAGTCCATTTTTCAAGCGATATTCTCTCGACTTACTTTAGGATCTCTTTAAGGAATTTTGAGAGATTTAGAATTTTTTAAGGACCCAAACTGATTTCGAGATGTTATCGAACAAGTAACTAACCATTTGGATCTTTTTAGCAATAAATACTTGTAGGAATTGACTATGACTCGAGCATATGGACTGTTAGCATGATTTATGAATGAATTGATCACGAAAATGGATAGCTAATCCTAGAATTATAGATTATTGAATGACTTGGCTCGTTTCATGCAATAGGATTTCTGAAAAATACCAAATTTTTCTACAATGGATTTGAATGTTCTGTTGGGACTTTGACATGATATGATTTACCGTACTCTTCTTGATATGCATATACTTTTGTCAAATATGTATGATTTTATGATTTAATGCTATTTATTTCTAGAAGCGTGTTATGAGGTTGAAATACCTATTTATGTCCTAATTTATGTATAGACTAGATGTTACGTGACCTAGAGGATATGATGCAACACCTATTAGAGAGTCTTACTCTACAATTTGTCTTTTCTTGGTCCTCGAGCCACGTTTAGAGCTCATTGTGAGAATATGGGTCGACTAGGGCTATGTAGGAATTGTTAGGGCGCAACCCCTGATCGTGAACTCACCTCAGCTGGTAAGGGTCTATCTTGCATACAATAATGCGATGTCACCTACTTAGGGCCCGTGACACATCAGAAGGTACTAGTGAGGGAGTTTCAATTGTGTTTAGTGGGGAGCCTGTCTTACGCTATTTAAAAGCTTGGAATGATGAGCGAATAGGGAGCTTCCAGCAAGCACTCAATAGTAGATACCTATGGAGATGCAGGGAGACTCGAAGGTTACTAACAGCAAGTGGCGACTCCTCCATACCTATTAAACTTCCCTAGATCCCAAGACAGATGAGAGCATCCCGACTTATACGCAGGAAGTATGATAAC
TAGATGGACCATGGAGAGTAAAGTTACTAATAACGACAAAAAAATGGGCCCTTGTATTCTGTCGTTTCTATACTGTTACTAATAACTACAAGCGTATGGAAAAATTTAAAAGGCGAAAAATAAGAATCGCATAATTAATTTAATAAGGAATTATTTAAAAAAAAAATATAAGATAATGTAGGAAGCAAAGAAATAAAATTATTAATGAGCTGGAATCTCTATTTTTATTTCATATATTTTTTATTTATTATATTTTTGAAATATTTATATTAATCTCTAAATTTTTAAAAAATCTATCATTTTAGTTATTTATTTATTCTATTTTAATTTTTCAAACAAATTAATGACAATTTTAAACACGTTTAGAAATCCTAAAGGAAGAGAGAAAGGAAAGGAGAGGAAACGACACTCTTATAAGGAATAAAATGATAAATAATTATTACTCTTTTTTTCTTAACAAATTATTTATCGAAACGTGTACCGTGCATTATTATAATATAGATTCGACTTATTATTTTATTTTTTTTAATTTAATGGTGGAAAATTGTCGATAATTTTAAATTTAATGTTGTTCATCTCATTGGACGACTCGAGGAACTAACCGACTTCGATAATTTTGCACACATACTCATCATTAGTTAACGTGGTCCAAAATGAAAATGTGATCAGCTTGCATAAATATAGTCTGCTTGTAGACTTAATCGATTTTATGTTATAGAGTGCAAAATATTTACGTTGAACCATGATGTCATACCAAATTTAGGAGCAATTTTGACGGTGATAATATGTTTGGACCAATATGTTTTGAACCATATGCTGAAACAATTTCAGACACGATCACTATCGAGAAGGACATGCTGAATTTTTAAAATAAACCATAATAATAGGAATAGGGTGTTACCTAGGATAACACTAATTGATTTTTAACGAGAAACACGACTAATAAAAAATATTAAAATTTGTTGGTAAGATTTTGGACGCGGCCAAACCTTAAAAAGCATATAAGGTAATTTACTCTAACATAATAAAGCAAAATAATAGTGATACAAATTTATAAAGTCTATTAGGATAATAATTTGTGATGCTCAATTATGGGTAACGTTCATCGTTGGTCTAAAATCTTTTATGAAAAAGTTCTATTTATATATAATTTATAATTTGACTTATTTTGTGATGTCCCACGTCAGTTGGAGAGGAGAACTAATCACCCTTTATAAGAATGTGAAAACCTTCTCATAGCAGACGCGTTTTAAAGCCTTGAGGGAAGCCCGAGAGGGAAGCGGACAATATCTGATAGCGGTGGATCTAAACCATTATATATATATAAATTTAAATTTAAATTTTAATGGTACGAAAATTTAAATTTTTAATTTTATGATTAATTAAAAAAGTATATATATATATATAGTCAGTAATCTATTAGATATAAAATAAAATAAAAAAATTAGACATTTATTTTAAAAAAATAACCAGAGCTAGATGTAAATAGAGACGTGTTACTTATGGCCGCATGATAGATGTCCTAGTCTTGCTTATTTGAGTCGGTTGACAAAGACGTCAACCAATTGAGTGGGGGAAGATGTCACGGTCATGCATATCCAAGGCATGTTGTCAATAACCACATGACAAATATCCTGATCTTGCTTGTCCAGAACGTGTTGTCCATGGTTGCATGTCTGTTCCAAACCTCTTTTACATGCTTTCATCCTAGATAGTTCTTGCCATTTCATATTAGGTCAGTATGAGAATGTATGACCGTTCACCCAAATCGTGTCTACACCTGCCTGCAAGGGCTAAAATGCCCCCAAAATGTACTATAGGGGATATGATTAGTTGTCAATTTCTCCCTTACCTATATGCAGTAAGCTGTTGTTGCTGCTTTTTGCTTACATTCTCGTATAACACCAATTTCAATCTCTCAAATCTTGCTTTCAAAACTCTCTTTCGAAATCTCTACCCGTTTTCTAAAATTTTGTTAGTAGAAAATATATGACGAAGCA
TGGAGAGAGAATGGAAAACATTTTTTCATTAGTAAATGGGGTCGGAGACGAGGTTATCTCCGCCTCATTTCATTTCAATAACAGTGGAGCTTCACAGGTTCAAAGGTGCAGATATTTTTCTGAAATTCGCCTATATAATGCACCAATTCCTTCACTTCCTCTTCACACCCAACTCCACCAAACATGGCGGTTTCCACCACTTTCTCCTTCCCCGCCGCCGCTCTTCTAGCCATTTTCCTCTGTTTTCCCTCATTTTCTTCCGCTTCCGATTCCGATTCACCCACCTTCTCCATCATCGACGAGAATGCAAAACACCACATGGGTATCCCTCATACTCTCGATTCTGATTCCGGCGGCTTCCCTCACCGGACCCACGACGAAATTGCGGCTCTGTTCGAGTCGTGGTTGGTCCATCATGGTAAAGCTTACAATTCTCTCGGCGAGAAGGAGCGGCGGTTGGAGATTTTCAAGGATAATCTCAATTTCATCGATGAACATAACCGGGTGCCAAGGAGTTATAAGGTTGGATTGACCTGTTTCGCTGATCTCACCAATGATGAGTACAGGGCTCTGTTTTTGGGTGCCCGATTCTCCCCATTGCCTCACCTATCCGCCGCGGATAGTGGTAGGTACGCGACGGCTCGCGGCGATGATCTGCCGGATCATGTCGATTGGAGGAAGAAGGGCGCCGTTACTGCTGTTAGAGATCAAGGAGATTGCCGTGAGAATAAATTAATAAAAATATTCTTGAACTTAATAATTTTATCTATAAAAAAAATAACCTTAAATTTTTTAAAATTTCAATAATCACTTATTATTATTTATTATTAATTTTAGATGGAAACTTGACTACTTGGTTCTAAAAATAAATTTTAATATAGTTTCAAAAATATTCTTAAATTTTAAAAAGTTTTTTTTAATACATTTTAAAAGTAATATTAATACTATTATCCTTTACAAATAAAAATTCAAAAAAGGTTTTTTTACCGTCAATATATAGATAGAAATAGTTTATCAACACCTCATAGATTTTTTTTTTCTTTTTTTTTCTTTTTAATTTAAGGGTATTAATGAAATTCTGAATAACTCGATACTATTTTTTGAAATAAAGTTGGAGGGTATTTTTAAACTTTTTCTATTAATTTGGGGACAAAATGAAATTAAACATGAATTTCTCCGTTTGAATCAAAATCGAGAGACCGGAGTGGTTAATTAGTCCAAAACACATGGTTTTAGTGGTGGGTTTTGATGGATGATTAACGTTAACTTAAAGTGGATTATGTTTTGAATGAAGGGAGCTGCTGGGCTTTCTCAGCGGTGGCTGCAGTGGAAGGAATAAACCAAATCGTCACCGGTGAATTAATCTCTCTGTCGGAGCAGGAACTTGTGGACTGCGGTCGATCTAAGTTTCTCAGCGGTTGCAACGGTGGTTATATGGACAAAGCTTTCCGATTCATCATTGACAATGGTGGCATTGACACCGAGGAAGATTACCCTTACAAGGCCCTTGATAATTACACTTGCGATCCCGACCGGGTTCGTGTCGGTCGACCTACTTCATTCAATTCATTCAATCGTATCTGATGTCTTAATCTCACTCGGTTTCTAATGTGTTATCAACAGATAAATATAAATGTTGTCTCCATAAATGGGTATGTAAATGTTCCTCCGGGGAATGAGAGGTTGTTGAAAATGGCTGTGGCTGGTCAACCAGTTAGTGTTGCCATTGACGCTCATTCCGAAGCCTTACAACTTTACAAATCGGTAATCTCTTCAACTTCCAACATTTCCTTAATGTTGTTTCACAAAGCTAAGGTTTTGACATGTTTAAATGATTTGTGTTTACGAGAGAGAGGTTTCTACCTCCTTATAAAAAATGCTTCGTTCTCTTCCCTACCCTCTTTCGGGTCGAGGGTCTTTGCTGTTACACCGCTTCATGTCCACCTCCCTCGGAGCTCAACCTCCTCGCTGGTAGGGAGAGATTTCCACACTCTTATAAA
GAATGTTTCGTTCTCCTCCCAACCTATGTGGGATCTCATAATCCACTCCCCTTCCGTGTCAAGTGTCTCGTTCGCACTTGTTCTCTTCTCCAATCGATGTGGGACCCCTCAATGCGTCTCTAAGGCCCAGTGTCCTTGTTGGCACACCAACCGCGAGATCTCACAATCTATCCCTTCAAGGCCCAGTGTCTCGCTGGCACTTGTTCCCTCCTCCAATCGATGTGGGACCCCCCAATCTACTCTTTTGAGGCTCAGTGTTCTTGTTGGCACACCGCCTTGTGTCCACCCCCTTCAGAACTCTGCCTCTTTGCTGATACATCGTCCTGTGTCTGGCTCTAATATCATTTGTAATAGCCTAAACTCACCCAACACAGATATTGTCCTCTCTGGGATTTCCCTCGAGATTTTTAAAACGCGTCTTCTAGGGATATGTTTCCACACTGTGAATGTTTGATTAATCCACCACACCTAAATAGTGTGAAAGTTATCTTATTTATAATCAAACAAATTACAAGGATATGGAAATAGTAATAATGGAAATAACAATAAATGAAAGATACCTATGGTAATAAGGCAAATATTAGATATTTGCCTTATAATAAAATATCAAATATAATATATACGGGAAAGTATAATTTATTACTAAATATCCTTAATGCACACCCTTATAAAGAATGTTTCGTTCTCCTCCCCAACCGATATGAGATCTCACAATCGGTTGGAGAGGAGAACAAAACATTCTTTATAAGAGGGTGGGAATCTCCTCTAGCCCACAACTTTTTAAAAATTTTGAGAGGAAGTCCAACAAGAAAACCCTAAAAAGACAATATCAGCTTAGGAAATTGTGTGTTTTATAGACCGCCGTTATTTATGCATCATTCATTCAGGGTGTCTTCACCGGAGACTGCGGAACCGAGCTGGACCACGGTGTTGTTGCTGTTGGGTATGGCACAGAAGATGGTTTAGACTACTGGATCATTAAGAATTCGTGGAGTGAAAATTGGGGAGAGAAGGGTTATATGAAGTTGGAGAGAAATGTGGCAAGCACTACCTTAGGCAAGTGTGGTATAACCCTAATGGCTTCATACCCCATCAAGCTTTGA

mRNA sequence

AACGCAAAGCACCACATGGGTAACCCTTACCGGACCCACGACGAGGTCGCGGCTCTGTACGAGTCGTGGTTGGTCCATCATGGAAAAGCGTACAACGCTATCGGCGAGAAAGAGCGGCGGTTTGAGATTTTCAAGGATAATCTCAGGTTCATCGATGAACATAACCGGGAATCGCGGAGTTATAAGGTTGGATTGACTCGTTTCGCTGATCTCACCAATGATGAGTATCGGGCCATGTTTTTGGGCGGCCGATTCTCCCCAATGCCTCGCCTATCCGCCGCTAAGAGCAGTAGGTACGCGACGGCTGTCGGCGATGATCTTCCGGATGATGTCGATTGGAGAAAGAAGGGCGCTGTTACGGCTGTTAAAGATCAAGGACAGTGTGATTACCCTTACAAGGCCCGTGATGACACTTGTGATCCCGACCGCAAACATATTAAGGTTGTTGAAATAGATGGGCATGAAGATGTTCCTGAGATTAATGAGAGGTTGTTGAAGACGGCTATGGCAGGTCAACCAGTTAGTGTTGCCATTGAAGCTGCCATTTTCCTCTGTTTTCCCTCATTTTCTTCCGCTTCCGATTCCGATTCACCCACCTTCTCCATCATCGACGAGAATGCAAAACACCACATGGGTATCCCTCATACTCTCGATTCTGATTCCGGCGGCTTCCCTCACCGGACCCACGACGAAATTGCGGCTCTGTTCGAGTCGTGGTTGGTCCATCATGGTAAAGCTTACAATTCTCTCGGCGAGAAGGAGCGGCGGTTGGAGATTTTCAAGGATAATCTCAATTTCATCGATGAACATAACCGGGTGCCAAGGAGTTATAAGGTTGGATTGACCTGTTTCGCTGATCTCACCAATGATGAGTACAGGGCTCTGTTTTTGGGTGCCCGATTCTCCCCATTGCCTCACCTATCCGCCGCGGATAGTGGGAGCTGCTGGGCTTTCTCAGCGGTGGCTGCAGTGGAAGGAATAAACCAAATCGTCACCGGTGAATTAATCTCTCTGTCGGAGCAGGAACTTGTGGACTGCGGTCGATCTAAGTTTCTCAGCGGTTGCAACGGTGGTTATATGGACAAAGCTTTCCGATTCATCATTGACAATGGTGGCATTGACACCGAGGAAGATTACCCTTACAAGGCCCTTGATAATTACACTTGCGATCCCGACCGGATAAATATAAATGTTGTCTCCATAAATGGGTATGTAAATGTTCCTCCGGGGAATGAGAGGTTGTTGAAAATGGCTGTGGCTGGTCAACCAGTTAGTGTTGCCATTGACGCTCATTCCGAAGCCTTACAACTTTACAAATCGGGTGTCTTCACCGGAGACTGCGGAACCGAGCTGGACCACGGTGTTGTTGCTGTTGGGTATGGCACAGAAGATGGTTTAGACTACTGGATCATTAAGAATTCGTGGAGTGAAAATTGGGGAGAGAAGGGTTATATGAAGTTGGAGAGAAATGTGGCAAGCACTACCTTAGGCAAGTGTGGTATAACCCTAATGGCTTCATACCCCATCAAGCTTTGA

Coding sequence (CDS)

AACGCAAAGCACCACATGGGTAACCCTTACCGGACCCACGACGAGGTCGCGGCTCTGTACGAGTCGTGGTTGGTCCATCATGGAAAAGCGTACAACGCTATCGGCGAGAAAGAGCGGCGGTTTGAGATTTTCAAGGATAATCTCAGGTTCATCGATGAACATAACCGGGAATCGCGGAGTTATAAGGTTGGATTGACTCGTTTCGCTGATCTCACCAATGATGAGTATCGGGCCATGTTTTTGGGCGGCCGATTCTCCCCAATGCCTCGCCTATCCGCCGCTAAGAGCAGTAGGTACGCGACGGCTGTCGGCGATGATCTTCCGGATGATGTCGATTGGAGAAAGAAGGGCGCTGTTACGGCTGTTAAAGATCAAGGACAGTGTGATTACCCTTACAAGGCCCGTGATGACACTTGTGATCCCGACCGCAAACATATTAAGGTTGTTGAAATAGATGGGCATGAAGATGTTCCTGAGATTAATGAGAGGTTGTTGAAGACGGCTATGGCAGGTCAACCAGTTAGTGTTGCCATTGAAGCTGCCATTTTCCTCTGTTTTCCCTCATTTTCTTCCGCTTCCGATTCCGATTCACCCACCTTCTCCATCATCGACGAGAATGCAAAACACCACATGGGTATCCCTCATACTCTCGATTCTGATTCCGGCGGCTTCCCTCACCGGACCCACGACGAAATTGCGGCTCTGTTCGAGTCGTGGTTGGTCCATCATGGTAAAGCTTACAATTCTCTCGGCGAGAAGGAGCGGCGGTTGGAGATTTTCAAGGATAATCTCAATTTCATCGATGAACATAACCGGGTGCCAAGGAGTTATAAGGTTGGATTGACCTGTTTCGCTGATCTCACCAATGATGAGTACAGGGCTCTGTTTTTGGGTGCCCGATTCTCCCCATTGCCTCACCTATCCGCCGCGGATAGTGGGAGCTGCTGGGCTTTCTCAGCGGTGGCTGCAGTGGAAGGAATAAACCAAATCGTCACCGGTGAATTAATCTCTCTGTCGGAGCAGGAACTTGTGGACTGCGGTCGATCTAAGTTTCTCAGCGGTTGCAACGGTGGTTATATGGACAAAGCTTTCCGATTCATCATTGACAATGGTGGCATTGACACCGAGGAAGATTACCCTTACAAGGCCCTTGATAATTACACTTGCGATCCCGACCGGATAAATATAAATGTTGTCTCCATAAATGGGTATGTAAATGTTCCTCCGGGGAATGAGAGGTTGTTGAAAATGGCTGTGGCTGGTCAACCAGTTAGTGTTGCCATTGACGCTCATTCCGAAGCCTTACAACTTTACAAATCGGGTGTCTTCACCGGAGACTGCGGAACCGAGCTGGACCACGGTGTTGTTGCTGTTGGGTATGGCACAGAAGATGGTTTAGACTACTGGATCATTAAGAATTCGTGGAGTGAAAATTGGGGAGAGAAGGGTTATATGAAGTTGGAGAGAAATGTGGCAAGCACTACCTTAGGCAAGTGTGGTATAACCCTAATGGCTTCATACCCCATCAAGCTTTGA

Protein sequence

NAKHHMGNPYRTHDEVAALYESWLVHHGKAYNAIGEKERRFEIFKDNLRFIDEHNRESRSYKVGLTRFADLTNDEYRAMFLGGRFSPMPRLSAAKSSRYATAVGDDLPDDVDWRKKGAVTAVKDQGQCDYPYKARDDTCDPDRKHIKVVEIDGHEDVPEINERLLKTAMAGQPVSVAIEAAIFLCFPSFSSASDSDSPTFSIIDENAKHHMGIPHTLDSDSGGFPHRTHDEIAALFESWLVHHGKAYNSLGEKERRLEIFKDNLNFIDEHNRVPRSYKVGLTCFADLTNDEYRALFLGARFSPLPHLSAADSGSCWAFSAVAAVEGINQIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDTEEDYPYKALDNYTCDPDRININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEALQLYKSGVFTGDCGTELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVASTTLGKCGITLMASYPIKL
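The protein sequence above corresponds to a frame-1 translation of the coding sequence. The sketch below illustrates that relationship with the standard genetic code; the helper name `translate` is hypothetical, and only the first and last few codons of the CDS are used as input rather than the full sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a DNA coding sequence in frame 1, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 33 bases of the CDS listed above.
print(translate("AACGCAAAGCACCACATGGGTAACCCTTACCGG"))
# Last 15 bases of the CDS, ending in the TGA stop codon.
print(translate("CCCATCAAGCTTTGA"))
```

The two fragments translate to the first eleven and last four residues of the listed protein.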
BLAST of Cp4.1LG15g04750 vs. Swiss-Prot
Match: RD21A_ARATH (Cysteine proteinase RD21A OS=Arabidopsis thaliana GN=RD21A PE=1 SV=1)

HSP 1 Score: 352.4 bits (903), Expect = 8.1e-96
Identity = 192/363 (52.89%), Positives = 238/363 (65.56%), Query Frame = 1

Query: 180 AAIFLCFPSFSSASDSDSPTFSIIDENAKHHMGIPHTLDSDSGGFPHRTHDEIAALFESW 239
           A +FL   + SSA D      SII  + KH +       S +GG   R+  E+ +++E+W
Sbjct: 9   AILFLAMVAVSSAVD-----MSIISYDEKHGV-------STTGG---RSEAEVMSIYEAW 68

Query: 240 LVHHGKAY--NSLGEKERRLEIFKDNLNFIDEHNRVPRSYKVGLTCFADLTNDEYRALFL 299
           LV HGKA   NSL EK+RR EIFKDNL F+DEHN    SY++GLT FADLTNDEYR+ +L
Sbjct: 69  LVKHGKAQSQNSLVEKDRRFEIFKDNLRFVDEHNEKNLSYRLGLTRFADLTNDEYRSKYL 128

Query: 300 GARFSP-----------------LPHL-------------SAADSGSCWAFSAVAAVEGI 359
           GA+                    LP                    GSCWAFS + AVEGI
Sbjct: 129 GAKMEKKGERRTSLRYEARVGDELPESIDWRKKGAVAEVKDQGGCGSCWAFSTIGAVEGI 188

Query: 360 NQIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDTEEDYPYKALDNY 419
           NQIVTG+LI+LSEQELVDC  S +  GCNGG MD AF FII NGGIDT++DYPYK +D  
Sbjct: 189 NQIVTGDLITLSEQELVDCDTS-YNEGCNGGLMDYAFEFIIKNGGIDTDKDYPYKGVDG- 248

Query: 420 TCDPDRININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEALQLYKSGVFTGDC 479
           TCD  R N  VV+I+ Y +VP  +E  LK AVA QP+S+AI+A   A QLY SG+F G C
Sbjct: 249 TCDQIRKNAKVVTIDSYEDVPTYSEESLKKAVAHQPISIAIEAGGRAFQLYDSGIFDGSC 308

Query: 480 GTELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVASTTLGKCGITLMASY 511
           GT+LDHGVVAVGYGTE+G DYWI++NSW ++WGE GY+++ RN+AS++ GKCGI +  SY
Sbjct: 309 GTQLDHGVVAVGYGTENGKDYWIVRNSWGKSWGESGYLRMARNIASSS-GKCGIAIEPSY 353
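Each HSP header reports identity and positives as a count over the alignment length, followed by the same ratio as a percentage. A small sketch of how those percentages can be recomputed from the counts is given below; the function name `recompute_hsp_percentages` is an assumption made for illustration, and the pattern accepts any spelling of the second label that starts with "Pos".

```python
import re

# Matches the statistics portion of an HSP header line.
HSP_STATS = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"Pos\w* = (\d+)/(\d+) \(([\d.]+)%\)"
)


def recompute_hsp_percentages(line):
    """Return (identity %, positives %) as two-decimal strings from the raw counts."""
    match = HSP_STATS.search(line)
    if match is None:
        raise ValueError(f"no HSP statistics found in: {line!r}")
    ident, ident_len, _, pos, pos_len, _ = match.groups()
    identity_pct = f"{100 * int(ident) / int(ident_len):.2f}"
    positives_pct = f"{100 * int(pos) / int(pos_len):.2f}"
    return identity_pct, positives_pct


header = "Identity = 192/363 (52.89%), Positives = 238/363 (65.56%), Query Frame = 1"
print(recompute_hsp_percentages(header))
```

For the RD21A_ARATH hit the recomputed values agree with the reported 52.89% and 65.56%.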

BLAST of Cp4.1LG15g04750 vs. Swiss-Prot
Match: RD21B_ARATH (Probable cysteine protease RD21B OS=Arabidopsis thaliana GN=RD21B PE=1 SV=1)

HSP 1 Score: 342.0 bits (876), Expect = 1.1e-92
Identity = 188/343 (54.81%), Positives = 224/343 (65.31%), Query Frame = 1

Query: 201 SIIDENAKHHMGIPHTLDSDSGGFPHRTHDEIAALFESWLVHHGKA---YNSLG-EKERR 260
           SII  +  HH+    T  SDS         E+  ++E+W+V HGK     N LG EK++R
Sbjct: 25  SIISYDENHHI-TTETSRSDS---------EVERIYEAWMVEHGKKKMNQNGLGAEKDQR 84

Query: 261 LEIFKDNLNFIDEHNRVPRSYKVGLTCFADLTNDEYRALFLGA-----------RFSPLP 320
            EIFKDNL FIDEHN    SYK+GLT FADLTN+EYR+++LGA           R+    
Sbjct: 85  FEIFKDNLRFIDEHNTKNLSYKLGLTRFADLTNEEYRSMYLGAKPTKRVLKTSDRYQARV 144

Query: 321 HLSAADS------------------GSCWAFSAVAAVEGINQIVTGELISLSEQELVDCG 380
             +  DS                  GSCWAFS + AVEGIN+IVTG+LISLSEQELVDC 
Sbjct: 145 GDALPDSVDWRKEGAVADVKDQGSCGSCWAFSTIGAVEGINKIVTGDLISLSEQELVDCD 204

Query: 381 RSKFLSGCNGGYMDKAFRFIIDNGGIDTEEDYPYKALDNYTCDPDRININVVSINGYVNV 440
            S +  GCNGG MD AF FII NGGIDTE DYPYKA D   CD +R N  VV+I+ Y +V
Sbjct: 205 TS-YNQGCNGGLMDYAFEFIIKNGGIDTEADYPYKAADG-RCDQNRKNAKVVTIDSYEDV 264

Query: 441 PPGNERLLKMAVAGQPVSVAIDAHSEALQLYKSGVFTGDCGTELDHGVVAVGYGTEDGLD 500
           P  +E  LK A+A QP+SVAI+A   A QLY SGVF G CGTELDHGVVAVGYGTE+G D
Sbjct: 265 PENSEASLKKALAHQPISVAIEAGGRAFQLYSSGVFDGLCGTELDHGVVAVGYGTENGKD 324

Query: 501 YWIIKNSWSENWGEKGYMKLERNVASTTLGKCGITLMASYPIK 511
           YWI++NSW   WGE GY+K+ RN+ + T GKCGI + ASYPIK
Sbjct: 325 YWIVRNSWGNRWGESGYIKMARNIEAPT-GKCGIAMEASYPIK 354

BLAST of Cp4.1LG15g04750 vs. Swiss-Prot
Match: RD21C_ARATH (Probable cysteine protease RD21C OS=Arabidopsis thaliana GN=RD21C PE=1 SV=1)

HSP 1 Score: 338.6 bits (867), Expect = 1.2e-91
Identity = 176/315 (55.87%), Positives = 213/315 (67.62%), Query Frame = 1

Query: 227 RTHDEIAALFESWLVHHGKAYNSLGEKERRLEIFKDNLNFIDEHNRVP-RSYKVGLTCFA 286
           R   E   ++E WLV + K YN LGEKERR EIFKDNL F++EH+ +P R+Y+VGLT FA
Sbjct: 34  RNEAEARRMYERWLVENRKNYNGLGEKERRFEIFKDNLKFVEEHSSIPNRTYEVGLTRFA 93

Query: 287 DLTNDEYRALFLGARF-----------------SPLPHL----------SAADSGSCWAF 346
           DLTNDE+RA++L ++                    LP               D GSC + 
Sbjct: 94  DLTNDEFRAIYLRSKMERTRVPVKGEKYLYKVGDSLPDAIDWRAKGAVNPVKDQGSCGSC 153

Query: 347 SAVA---AVEGINQIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDT 406
            A +   AVEGINQI TGELISLSEQELVDC  S +  GC GG MD AF+FII+NGGIDT
Sbjct: 154 WAFSAIGAVEGINQIKTGELISLSEQELVDCDTS-YNDGCGGGLMDYAFKFIIENGGIDT 213

Query: 407 EEDYPYKALDNYTCDPDRININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEAL 466
           EEDYPY A D   C+ D+ N  VV+I+GY +VP  +E+ LK A+A QP+SVAI+A   A 
Sbjct: 214 EEDYPYIATDVNVCNSDKKNTRVVTIDGYEDVPQNDEKSLKKALANQPISVAIEAGGRAF 273

Query: 467 QLYKSGVFTGDCGTELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVASTT 511
           QLY SGVFTG CGT LDHGVVAVGYG+E G DYWI++NSW  NWGE GY KLERN+  ++
Sbjct: 274 QLYTSGVFTGTCGTSLDHGVVAVGYGSEGGQDYWIVRNSWGSNWGESGYFKLERNIKESS 333

BLAST of Cp4.1LG15g04750 vs. Swiss-Prot
Match: ORYA_ORYSJ (Oryzain alpha chain OS=Oryza sativa subsp. japonica GN=Os04g0650000 PE=1 SV=2)

HSP 1 Score: 337.0 bits (863), Expect = 3.5e-91
Identity = 170/321 (52.96%), Positives = 213/321 (66.36%), Query Frame = 1

Query: 224 FPHRTHDEIAALFESWLVHHGKAYNSLGEKERRLEIFKDNLNFIDEHNRVP----RSYKV 283
           +  R+ +E   L+  W   HGK+YN++GE+ERR   F+DNL +IDEHN        S+++
Sbjct: 28  YGERSEEEARRLYAEWKAEHGKSYNAVGEEERRYAAFRDNLRYIDEHNAAADAGVHSFRL 87

Query: 284 GLTCFADLTNDEYRALFLGARFSPLPHLSAADS--------------------------- 343
           GL  FADLTN+EYR  +LG R  P      +D                            
Sbjct: 88  GLNRFADLTNEEYRDTYLGLRNKPRRERKVSDRYLAADNEALPESVDWRTKGAVAEIKDQ 147

Query: 344 ---GSCWAFSAVAAVEGINQIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIID 403
              GSCWAFSA+AAVEGINQIVTG+LISLSEQELVDC  S +  GCNGG MD AF FII+
Sbjct: 148 GGCGSCWAFSAIAAVEGINQIVTGDLISLSEQELVDCDTS-YNEGCNGGLMDYAFDFIIN 207

Query: 404 NGGIDTEEDYPYKALDNYTCDPDRININVVSINGYVNVPPGNERLLKMAVAGQPVSVAID 463
           NGGIDTE+DYPYK  D   CD +R N  VV+I+ Y +V P +E  L+ AVA QPVSVAI+
Sbjct: 208 NGGIDTEDDYPYKGKDE-RCDVNRKNAKVVTIDSYEDVTPNSETSLQKAVANQPVSVAIE 267

Query: 464 AHSEALQLYKSGVFTGDCGTELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLER 511
           A   A QLY SG+FTG CGT LDHGV AVGYGTE+G DYWI++NSW ++WGE GY+++ER
Sbjct: 268 AGGRAFQLYSSGIFTGKCGTALDHGVAAVGYGTENGKDYWIVRNSWGKSWGESGYVRMER 327

BLAST of Cp4.1LG15g04750 vs. Swiss-Prot
Match: RDL2_ARATH (Probable cysteine protease RDL2 OS=Arabidopsis thaliana GN=RDL2 PE=2 SV=1)

HSP 1 Score: 322.0 bits (824), Expect = 1.2e-86
Identity = 170/316 (53.80%), Positives = 211/316 (66.77%), Query Frame = 1

Query: 227 RTHDEIAALFESWLVHHGKAYNSLGEKERRLEIFKDNLNFIDEHNRVP-RSYKVGLTCFA 286
           R   E+  ++E WLV + K YN LGEKERR +IFKDNL F+DEHN VP R+++VGLT FA
Sbjct: 35  RNETEVRLMYEQWLVENRKNYNGLGEKERRFKIFKDNLKFVDEHNSVPDRTFEVGLTRFA 94

Query: 287 DLTNDEYRALFLGARF-----------------SPLPH----------LSAADSGSCWAF 346
           DLTN+E+RA++L  +                    LP           +S  D G+C + 
Sbjct: 95  DLTNEEFRAIYLRKKMERTKDSVKTERYLYKEGDVLPDEVDWRANGAVVSVKDQGNCGSC 154

Query: 347 SAVA---AVEGINQIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDT 406
            A +   AVEGINQI TGELISLSEQELVDC R    +GC+GG M+ AF FI+ NGGI+T
Sbjct: 155 WAFSAVGAVEGINQITTGELISLSEQELVDCDRGFVNAGCDGGIMNYAFEFIMKNGGIET 214

Query: 407 EEDYPYKALDNYTCDPDR-ININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEA 466
           ++DYPY A D   C+ D+  N  VV+I+GY +VP  +E+ LK AVA QPVSVAI+A S+A
Sbjct: 215 DQDYPYNANDLGLCNADKNNNTRVVTIDGYEDVPRDDEKSLKKAVAHQPVSVAIEASSQA 274

Query: 467 LQLYKSGVFTGDCGTELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVAST 511
            QLYKSGV TG CG  LDHGVV VGYG+  G DYWII+NSW  NWG+ GY+KL+RN+   
Sbjct: 275 FQLYKSGVMTGTCGISLDHGVVVVGYGSTSGEDYWIIRNSWGLNWGDSGYVKLQRNI-DD 334

BLAST of Cp4.1LG15g04750 vs. TrEMBL
Match: A0A0A0K4N3_CUCSA (Cysteine protease OS=Cucumis sativus GN=Csa_7G073400 PE=3 SV=1)

HSP 1 Score: 430.3 bits (1105), Expect = 3.4e-117
Identity = 230/374 (61.50%), Positives = 262/374 (70.05%), Query Frame = 1

Query: 169 MAGQPVSVAIEAAIFLCFPSFSSASDSDSPTFSIIDENAKHHMGIPHTLDSDSGGFPHRT 228
           MA  P+  A   A+F C   FSSAS S   TFSIIDENAKHH+GIP    SD+     R 
Sbjct: 1   MAISPIFFAF-LALFFCLSPFSSASHSS--TFSIIDENAKHHLGIPEIPHSDAH---QRP 60

Query: 229 HDEIAALFESWLVHHGKAYNSLGEKERRLEIFKDNLNFIDEHNRVPRSYKVGLTCFADLT 288
            +E+AAL+ESWLVHHGKAYN++GEKERR EIFKDNL FIDEHNR  R+YKVGLT FADLT
Sbjct: 61  DEEVAALYESWLVHHGKAYNAIGEKERRFEIFKDNLRFIDEHNRESRTYKVGLTRFADLT 120

Query: 289 NDEYRALFLG-------------------ARFSPLPH-------------LSAADSGSCW 348
           N+EYRA FLG                   A    LP                    GSCW
Sbjct: 121 NEEYRARFLGGRFSRKPRLSAAKSGRYAAALGDDLPDDVDWRKKGAVATVKDQGQCGSCW 180

Query: 349 AFSAVAAVEGINQIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDTE 408
           AFS+VAAVEGINQIVTGELI LSEQELVDC +S F  GCNGG MD AF+FII NGGIDTE
Sbjct: 181 AFSSVAAVEGINQIVTGELIPLSEQELVDCDKS-FNMGCNGGLMDYAFQFIIGNGGIDTE 240

Query: 409 EDYPYKALDNYTCDPDRININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEALQ 468
           EDYPYK  D   CDP+R N  VV+I+GY +VP  +E  LK AVA QPVSVAI+A   A Q
Sbjct: 241 EDYPYKGRD-AACDPNRKNAKVVTIDGYEDVPENDESSLKKAVANQPVSVAIEAGGRAFQ 300

Query: 469 LYKSGVFTGDCGTELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVASTTL 511
           LY+SGVFTG CGT+LDHGVVAVGYGT++G DYWI++NSW ++WGE GY++LERNVA+ T 
Sbjct: 301 LYQSGVFTGRCGTDLDHGVVAVGYGTDNGTDYWIVRNSWGKDWGESGYIRLERNVANITT 360

BLAST of Cp4.1LG15g04750 vs. TrEMBL
Match: B9RYC1_RICCO (Cysteine protease, putative OS=Ricinus communis GN=RCOM_0812150 PE=3 SV=1)

HSP 1 Score: 396.4 bits (1017), Expect = 5.4e-107
Identity = 210/366 (57.38%), Positives = 244/366 (66.67%), Query Frame = 1

Query: 181 AIFLCFP----SFSSASDSDSPTFSIIDENAKHHMGIPHTLDSDSGGFPHRTHDEIAALF 240
           A F+ F     SFSSA D      SI+D N KH              +P RT  ++  ++
Sbjct: 8   AFFILFSGLLSSFSSALD-----MSIVDYNIKH-----------GTKYPLRTDSQVRRMY 67

Query: 241 ESWLVHHGKAYNSLGEKERRLEIFKDNLNFIDEHNRVPRSYKVGLTCFADLTNDEYRALF 300
           E WLV HGKAYN+LGEKE+R EIFKDNL FIDEHN V RSYKVGL  FADLTN+EY+A+F
Sbjct: 68  EMWLVEHGKAYNALGEKEKRFEIFKDNLRFIDEHNSVDRSYKVGLNRFADLTNEEYKAMF 127

Query: 301 LGARF--------------------------------SPLPHLSAADSGSCWAFSAVAAV 360
           LG +                                 + +P       GSCWAFS V AV
Sbjct: 128 LGTKMERKNRFLGTRSQRYLFKDGDDLPENVDWREKGAVVPVKDQGQCGSCWAFSTVGAV 187

Query: 361 EGINQIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDTEEDYPYKAL 420
           EGINQIVTGELISLSEQELVDC +S +  GCNGG MD AF FII+NGGIDTEEDYPYKA 
Sbjct: 188 EGINQIVTGELISLSEQELVDCDKS-YNQGCNGGLMDYAFEFIINNGGIDTEEDYPYKAS 247

Query: 421 DNYTCDPDRININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEALQLYKSGVFT 480
           DN  CDP+R N  VV+I+GY +VP  +E  LK AVA QPVSVAI+A   A QLYKSGVFT
Sbjct: 248 DNI-CDPNRKNAKVVTIDGYEDVPENDENSLKKAVAHQPVSVAIEAGGRAFQLYKSGVFT 307

Query: 481 GDCGTELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVASTTLGKCGITLM 511
           G CGTELDHGVVAVGYGTE+G++YWI++NSW   WGE GY+++ERNVA+T  GKCGI + 
Sbjct: 308 GRCGTELDHGVVAVGYGTENGVNYWIVRNSWGSAWGESGYIRMERNVANTKTGKCGIAIQ 355

BLAST of Cp4.1LG15g04750 vs. TrEMBL
Match: A0A061E033_THECC (Granulin repeat cysteine protease family protein OS=Theobroma cacao GN=TCM_007113 PE=3 SV=1)

HSP 1 Score: 386.0 bits (990), Expect = 7.3e-104
Identity = 206/358 (57.54%), Positives = 235/358 (65.64%), Query Frame = 1

Query: 189 FSSASDSDSPTFSIIDENAKHHMGIPHTLDSDSGGFPHRTHDEIAALFESWLVHHGKAYN 248
           F S + S +   SIID + KH            G    RT  +I  ++E+WLV HGKAYN
Sbjct: 124 FFSLTLSSALDMSIIDYDLKH-----------GGQQQKRTETQIRRMYETWLVKHGKAYN 183

Query: 249 SLGEKERRLEIFKDNLNFIDEHNRVPRSYKVGLTCFADLTNDEYRALFLGARFS------ 308
            LGEKE+R EIFKDNL FI+EHN V  +YKVGL  FADLTN+EY+A++LGAR        
Sbjct: 184 GLGEKEKRFEIFKDNLKFIEEHNSVNGTYKVGLNRFADLTNEEYKAMYLGARLDGKTVSH 243

Query: 309 -----------------PLPHL-------------SAADSGSCWAFSAVAAVEGINQIVT 368
                             LP                    GSCWAFS VAAVEGINQIVT
Sbjct: 244 RLAGKEKSQRYVFRVGDKLPESVDWREKGAVVAVKDQGQCGSCWAFSTVAAVEGINQIVT 303

Query: 369 GELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDTEEDYPYKALDNYTCDPD 428
           G+LISLSEQELVDC R  +  GCNGG MD AF FI  NGGIDTEEDYPY+A DN TCDP+
Sbjct: 304 GDLISLSEQELVDCDRL-YNQGCNGGLMDNAFDFITKNGGIDTEEDYPYRASDN-TCDPN 363

Query: 429 RININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEALQLYKSGVFTGDCGTELD 488
           R N  VVSI+GY +VP  +E  LK AVA QPVSVAI+A     QLY SGVFTG CGT LD
Sbjct: 364 RKNARVVSIDGYEDVPENDENSLKKAVAHQPVSVAIEAGGRPFQLYHSGVFTGHCGTNLD 423

Query: 489 HGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVASTTLGKCGITLMASYPIK 511
           HGVVAVGYGTEDG+DYW +KNSW  +WGE GY+++ERNVA T+ GKCGI  MASYPIK
Sbjct: 424 HGVVAVGYGTEDGVDYWTVKNSWGPDWGENGYIRMERNVAGTSTGKCGIATMASYPIK 468

BLAST of Cp4.1LG15g04750 vs. TrEMBL
Match: A0A0B0MZ41_GOSAR (Cysteinease RD21a-like protein OS=Gossypium arboreum GN=F383_30955 PE=3 SV=1)

HSP 1 Score: 379.4 bits (973), Expect = 6.9e-102
Identity = 206/365 (56.44%), Positives = 239/365 (65.48%), Query Frame = 1

Query: 177 AIEAAIFLCFPSFSSASDSDSPTFSIIDENAKHHMGIPHTLDSDSGGFPHRTHDEIAALF 236
           ++ A + L   + SSASD      SII  +  H          D      RT DE+ A++
Sbjct: 6   SVMAMLLLMMFTLSSASD-----MSIISYDEAH---------PDKSTSSWRTDDEVMAMY 65

Query: 237 ESWLVHHGKAYNSLGEKERRLEIFKDNLNFIDEHNR-VPRSYKVGLTCFADLTNDEYRAL 296
           E WLV HGK YN LGEKERRL+IFKDNL FIDEHN     S+KVGL  FADLTN+EYRA+
Sbjct: 66  EEWLVKHGKTYNGLGEKERRLQIFKDNLRFIDEHNADESHSFKVGLNRFADLTNEEYRAI 125

Query: 297 FLGA------------RFSPLPHLSAADS------------------GSCWAFSAVAAVE 356
           +LG             R++PL      DS                  GSCWAFS +AAVE
Sbjct: 126 YLGIKKPNRKVSKASDRYAPLLGQKLPDSVDWREKGAVAEVKDQGSCGSCWAFSTIAAVE 185

Query: 357 GINQIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDTEEDYPYKALD 416
           GINQIVTGEL+SLSEQELVDC  S +  GCNGG MD AF FII NGGIDTEEDYPY A D
Sbjct: 186 GINQIVTGELLSLSEQELVDCDTS-YNEGCNGGLMDYAFEFIIKNGGIDTEEDYPYTARD 245

Query: 417 NYTCDPDRININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEALQLYKSGVFTG 476
             TCDP R N  VVSIN Y +VP  +E+ LK AVA QPVSVAI+A   + QLY+SG+F G
Sbjct: 246 G-TCDPYRKNAKVVSINDYEDVPVNDEKALKKAVANQPVSVAIEAGGRSFQLYQSGIFDG 305

Query: 477 DCGTELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVASTTLGKCGITLMA 511
            CGT+LDHGV AVGYGTE G DYWI+KNSW  +WGE GY+++ RNVA+T  GKCGI + A
Sbjct: 306 KCGTQLDHGVTAVGYGTEKGKDYWIVKNSWGSSWGEAGYIRMARNVANTVTGKCGIAMEA 354

BLAST of Cp4.1LG15g04750 vs. TrEMBL
Match: W9RY43_9ROSA (Cysteine proteinase RD21a OS=Morus notabilis GN=L484_027445 PE=3 SV=1)

HSP 1 Score: 375.6 bits (963), Expect = 9.9e-101
Identity = 194/361 (53.74%), Positives = 236/361 (65.37%), Query Frame = 1

Query: 183 FLCFPSFSSASDSDSPTFSIIDENAKHHMGIPHTLDSDSGGFPHRTHDEIAALFESWLVH 242
           FL   SF       SP  SIID +AKH + +P            R+  E+ A++ESWLV 
Sbjct: 14  FLVLVSFLVGFSWASPDMSIIDYDAKHGIVVP----------TERSETEMRAMYESWLVI 73

Query: 243 HGKAYNSLGEKERRLEIFKDNLNFIDEHNRVPRSYKVGLTCFADLTNDEYRALFLGARFS 302
           H KAYN+L EKERR EIFKDNL F+D HN   R++K+GL  FADLTNDEYRA FLG +  
Sbjct: 74  HAKAYNALREKERRFEIFKDNLRFVDAHNAANRTFKLGLNRFADLTNDEYRAGFLGVKID 133

Query: 303 PL---------------------------------PHLSAADSGSCWAFSAVAAVEGINQ 362
            +                                 P       GSCWAFS + AVEGIN+
Sbjct: 134 RIGTLFGSRKSDRYAFRAGDELPESIDWRDKGAVAPVKDQGQCGSCWAFSTIGAVEGINK 193

Query: 363 IVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDTEEDYPYKALDNYTC 422
           IVTGELISLSEQELVDC  S +  GCNGG MD  F FII+NGG+DTEEDYPYKA D   C
Sbjct: 194 IVTGELISLSEQELVDCDTS-YNQGCNGGLMDYGFEFIINNGGVDTEEDYPYKARDGQ-C 253

Query: 423 DPDRININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEALQLYKSGVFTGDCGT 482
           DP+R N +VV+I+GY +VP  +E+ LK AVA QPVSVAI+A     QLY+SGVFTG CGT
Sbjct: 254 DPNRKNAHVVTIDGYEDVPENDEKALKKAVANQPVSVAIEAGGREFQLYQSGVFTGRCGT 313

Query: 483 ELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVASTTLGKCGITLMASYPI 511
           +LDHGVV +GYGTE+G+DYW ++NSW  +WGE GY+++ERN+A  + GKCGI + +SYPI
Sbjct: 314 QLDHGVVVIGYGTENGVDYWKVRNSWGPSWGENGYIRMERNLAGASHGKCGIAMESSYPI 362

BLAST of Cp4.1LG15g04750 vs. TAIR10
Match: AT1G47128.1 (AT1G47128.1 Granulin repeat cysteine protease family protein)

HSP 1 Score: 352.4 bits (903), Expect = 4.5e-97
Identity = 192/363 (52.89%), Positives = 238/363 (65.56%), Query Frame = 1

Query: 180 AAIFLCFPSFSSASDSDSPTFSIIDENAKHHMGIPHTLDSDSGGFPHRTHDEIAALFESW 239
           A +FL   + SSA D      SII  + KH +       S +GG   R+  E+ +++E+W
Sbjct: 9   AILFLAMVAVSSAVD-----MSIISYDEKHGV-------STTGG---RSEAEVMSIYEAW 68

Query: 240 LVHHGKAY--NSLGEKERRLEIFKDNLNFIDEHNRVPRSYKVGLTCFADLTNDEYRALFL 299
           LV HGKA   NSL EK+RR EIFKDNL F+DEHN    SY++GLT FADLTNDEYR+ +L
Sbjct: 69  LVKHGKAQSQNSLVEKDRRFEIFKDNLRFVDEHNEKNLSYRLGLTRFADLTNDEYRSKYL 128

Query: 300 GARFSP-----------------LPHL-------------SAADSGSCWAFSAVAAVEGI 359
           GA+                    LP                    GSCWAFS + AVEGI
Sbjct: 129 GAKMEKKGERRTSLRYEARVGDELPESIDWRKKGAVAEVKDQGGCGSCWAFSTIGAVEGI 188

Query: 360 NQIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDTEEDYPYKALDNY 419
           NQIVTG+LI+LSEQELVDC  S +  GCNGG MD AF FII NGGIDT++DYPYK +D  
Sbjct: 189 NQIVTGDLITLSEQELVDCDTS-YNEGCNGGLMDYAFEFIIKNGGIDTDKDYPYKGVDG- 248

Query: 420 TCDPDRININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEALQLYKSGVFTGDC 479
           TCD  R N  VV+I+ Y +VP  +E  LK AVA QP+S+AI+A   A QLY SG+F G C
Sbjct: 249 TCDQIRKNAKVVTIDSYEDVPTYSEESLKKAVAHQPISIAIEAGGRAFQLYDSGIFDGSC 308

Query: 480 GTELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVASTTLGKCGITLMASY 511
           GT+LDHGVVAVGYGTE+G DYWI++NSW ++WGE GY+++ RN+AS++ GKCGI +  SY
Sbjct: 309 GTQLDHGVVAVGYGTENGKDYWIVRNSWGKSWGESGYLRMARNIASSS-GKCGIAIEPSY 353

BLAST of Cp4.1LG15g04750 vs. TAIR10
Match: AT5G43060.1 (AT5G43060.1 Granulin repeat cysteine protease family protein)

HSP 1 Score: 342.0 bits (876), Expect = 6.1e-94
Identity = 188/343 (54.81%), Positives = 224/343 (65.31%), Query Frame = 1

Query: 201 SIIDENAKHHMGIPHTLDSDSGGFPHRTHDEIAALFESWLVHHGKA---YNSLG-EKERR 260
           SII  +  HH+    T  SDS         E+  ++E+W+V HGK     N LG EK++R
Sbjct: 25  SIISYDENHHI-TTETSRSDS---------EVERIYEAWMVEHGKKKMNQNGLGAEKDQR 84

Query: 261 LEIFKDNLNFIDEHNRVPRSYKVGLTCFADLTNDEYRALFLGA-----------RFSPLP 320
            EIFKDNL FIDEHN    SYK+GLT FADLTN+EYR+++LGA           R+    
Sbjct: 85  FEIFKDNLRFIDEHNTKNLSYKLGLTRFADLTNEEYRSMYLGAKPTKRVLKTSDRYQARV 144

Query: 321 HLSAADS------------------GSCWAFSAVAAVEGINQIVTGELISLSEQELVDCG 380
             +  DS                  GSCWAFS + AVEGIN+IVTG+LISLSEQELVDC 
Sbjct: 145 GDALPDSVDWRKEGAVADVKDQGSCGSCWAFSTIGAVEGINKIVTGDLISLSEQELVDCD 204

Query: 381 RSKFLSGCNGGYMDKAFRFIIDNGGIDTEEDYPYKALDNYTCDPDRININVVSINGYVNV 440
            S +  GCNGG MD AF FII NGGIDTE DYPYKA D   CD +R N  VV+I+ Y +V
Sbjct: 205 TS-YNQGCNGGLMDYAFEFIIKNGGIDTEADYPYKAADG-RCDQNRKNAKVVTIDSYEDV 264

Query: 441 PPGNERLLKMAVAGQPVSVAIDAHSEALQLYKSGVFTGDCGTELDHGVVAVGYGTEDGLD 500
           P  +E  LK A+A QP+SVAI+A   A QLY SGVF G CGTELDHGVVAVGYGTE+G D
Sbjct: 265 PENSEASLKKALAHQPISVAIEAGGRAFQLYSSGVFDGLCGTELDHGVVAVGYGTENGKD 324

Query: 501 YWIIKNSWSENWGEKGYMKLERNVASTTLGKCGITLMASYPIK 511
           YWI++NSW   WGE GY+K+ RN+ + T GKCGI + ASYPIK
Sbjct: 325 YWIVRNSWGNRWGESGYIKMARNIEAPT-GKCGIAMEASYPIK 354

BLAST of Cp4.1LG15g04750 vs. TAIR10
Match: AT3G19390.1 (AT3G19390.1 Granulin repeat cysteine protease family protein)

HSP 1 Score: 338.6 bits (867), Expect = 6.8e-93
Identity = 176/315 (55.87%), Positives = 213/315 (67.62%), Query Frame = 1

Query: 227 RTHDEIAALFESWLVHHGKAYNSLGEKERRLEIFKDNLNFIDEHNRVP-RSYKVGLTCFA 286
           R   E   ++E WLV + K YN LGEKERR EIFKDNL F++EH+ +P R+Y+VGLT FA
Sbjct: 34  RNEAEARRMYERWLVENRKNYNGLGEKERRFEIFKDNLKFVEEHSSIPNRTYEVGLTRFA 93

Query: 287 DLTNDEYRALFLGARF-----------------SPLPHL----------SAADSGSCWAF 346
           DLTNDE+RA++L ++                    LP               D GSC + 
Sbjct: 94  DLTNDEFRAIYLRSKMERTRVPVKGEKYLYKVGDSLPDAIDWRAKGAVNPVKDQGSCGSC 153

Query: 347 SAVA---AVEGINQIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDT 406
            A +   AVEGINQI TGELISLSEQELVDC  S +  GC GG MD AF+FII+NGGIDT
Sbjct: 154 WAFSAIGAVEGINQIKTGELISLSEQELVDCDTS-YNDGCGGGLMDYAFKFIIENGGIDT 213

Query: 407 EEDYPYKALDNYTCDPDRININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEAL 466
           EEDYPY A D   C+ D+ N  VV+I+GY +VP  +E+ LK A+A QP+SVAI+A   A 
Sbjct: 214 EEDYPYIATDVNVCNSDKKNTRVVTIDGYEDVPQNDEKSLKKALANQPISVAIEAGGRAF 273

Query: 467 QLYKSGVFTGDCGTELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVASTT 511
           QLY SGVFTG CGT LDHGVVAVGYG+E G DYWI++NSW  NWGE GY KLERN+  ++
Sbjct: 274 QLYTSGVFTGTCGTSLDHGVVAVGYGSEGGQDYWIVRNSWGSNWGESGYFKLERNIKESS 333

BLAST of Cp4.1LG15g04750 vs. TAIR10
Match: AT3G19400.1 (AT3G19400.1 Cysteine proteinases superfamily protein)

HSP 1 Score: 322.0 bits (824), Expect = 6.6e-88
Identity = 170/316 (53.80%), Positives = 211/316 (66.77%), Query Frame = 1

Query: 227 RTHDEIAALFESWLVHHGKAYNSLGEKERRLEIFKDNLNFIDEHNRVP-RSYKVGLTCFA 286
           R   E+  ++E WLV + K YN LGEKERR +IFKDNL F+DEHN VP R+++VGLT FA
Sbjct: 35  RNETEVRLMYEQWLVENRKNYNGLGEKERRFKIFKDNLKFVDEHNSVPDRTFEVGLTRFA 94

Query: 287 DLTNDEYRALFLGARF-----------------SPLPH----------LSAADSGSCWAF 346
           DLTN+E+RA++L  +                    LP           +S  D G+C + 
Sbjct: 95  DLTNEEFRAIYLRKKMERTKDSVKTERYLYKEGDVLPDEVDWRANGAVVSVKDQGNCGSC 154

Query: 347 SAVA---AVEGINQIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDT 406
            A +   AVEGINQI TGELISLSEQELVDC R    +GC+GG M+ AF FI+ NGGI+T
Sbjct: 155 WAFSAVGAVEGINQITTGELISLSEQELVDCDRGFVNAGCDGGIMNYAFEFIMKNGGIET 214

Query: 407 EEDYPYKALDNYTCDPDR-ININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEA 466
           ++DYPY A D   C+ D+  N  VV+I+GY +VP  +E+ LK AVA QPVSVAI+A S+A
Sbjct: 215 DQDYPYNANDLGLCNADKNNNTRVVTIDGYEDVPRDDEKSLKKAVAHQPVSVAIEASSQA 274

Query: 467 LQLYKSGVFTGDCGTELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVAST 511
            QLYKSGV TG CG  LDHGVV VGYG+  G DYWII+NSW  NWG+ GY+KL+RN+   
Sbjct: 275 FQLYKSGVMTGTCGISLDHGVVVVGYGSTSGEDYWIIRNSWGLNWGDSGYVKLQRNI-DD 334

BLAST of Cp4.1LG15g04750 vs. TAIR10
Match: AT4G36880.1 (AT4G36880.1 cysteine proteinase1)

HSP 1 Score: 310.1 bits (793), Expect = 2.6e-84
Identity = 160/325 (49.23%), Positives = 209/325 (64.31%), Query Frame = 1

Query: 227 RTHDEIAALFESWLVHHGKAYNS----LGEKERRLEIFKDNLNFIDEHNRVPRS--YKVG 286
           RT +E+ +++  W   HGK  N+    + ++++R  IFKDNL FID HN   ++  YK+G
Sbjct: 40  RTDEEVRSIYLQWSAEHGKTNNNNNGIINDQDKRFNIFKDNLRFIDLHNENNKNATYKLG 99

Query: 287 LTCFADLTNDEYRALFL----------------GARFSPL-------------------P 346
           LT F DLTNDEYR L+L                  ++S                     P
Sbjct: 100 LTKFTDLTNDEYRKLYLGARTEPARRIAKAKNVNQKYSAAVNGKEVPETVDWRQKGAVNP 159

Query: 347 HLSAADSGSCWAFSAVAAVEGINQIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFR 406
                  GSCWAFS  AAVEGIN+IVTGELISLSEQELVDC +S +  GCNGG MD AF+
Sbjct: 160 IKDQGTCGSCWAFSTTAAVEGINKIVTGELISLSEQELVDCDKS-YNQGCNGGLMDYAFQ 219

Query: 407 FIIDNGGIDTEEDYPYKALDNYTCDPDRININVVSINGYVNVPPGNERLLKMAVAGQPVS 466
           FI+ NGG++TE+DYPY+      C+    N  VVSI+GY +VP  +E  LK A++ QPVS
Sbjct: 220 FIMKNGGLNTEKDYPYRGFGG-KCNSFLKNSRVVSIDGYEDVPTKDETALKKAISYQPVS 279

Query: 467 VAIDAHSEALQLYKSGVFTGDCGTELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYM 511
           VAI+A     Q Y+SG+FTG CGT LDH VVAVGYG+E+G+DYWI++NSW   WGE+GY+
Sbjct: 280 VAIEAGGRIFQHYQSGIFTGSCGTNLDHAVVAVGYGSENGVDYWIVRNSWGPRWGEEGYI 339

BLAST of Cp4.1LG15g04750 vs. NCBI nr
Match: gi|659109966|ref|XP_008454976.1| (PREDICTED: low-temperature-induced cysteine proteinase-like [Cucumis melo])

HSP 1 Score: 438.0 bits (1125), Expect = 2.3e-119
Identity = 231/362 (63.81%), Positives = 261/362 (72.10%), Query Frame = 1

Query: 181 AIFLCFPSFSSASDSDSPTFSIIDENAKHHMGIPHTLDSDSGGFPHRTHDEIAALFESWL 240
           A+FLC   FSSAS S   TFSIIDENAKHH+GIP    SD+     RT +E+AAL+ESWL
Sbjct: 13  ALFLCLSPFSSASHSS--TFSIIDENAKHHLGIPEIPHSDAH---QRTDEEVAALYESWL 72

Query: 241 VHHGKAYNSLGEKERRLEIFKDNLNFIDEHNRVPRSYKVGLTCFADLTNDEYRALFLG-- 300
           VHHGKAYN+LGEKERR EIFKDNL FIDEHNR  R+YKVGLT FADLTN+EYRA FLG  
Sbjct: 73  VHHGKAYNALGEKERRFEIFKDNLMFIDEHNRESRTYKVGLTRFADLTNEEYRARFLGGR 132

Query: 301 -----------------ARFSPLPH-------------LSAADSGSCWAFSAVAAVEGIN 360
                            A    LP                    GSCWAFS VAAVEGIN
Sbjct: 133 FSRKPSLSAAKSGRYAAALGDDLPDDVDWRKKGAVANVKDQGQCGSCWAFSTVAAVEGIN 192

Query: 361 QIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDTEEDYPYKALDNYT 420
           QIVTGELISLSEQELVDC +S F  GCNGG MD AF+FIIDNGGIDT+EDYPYK  D   
Sbjct: 193 QIVTGELISLSEQELVDCDKS-FNMGCNGGLMDYAFQFIIDNGGIDTDEDYPYKGRDG-A 252

Query: 421 CDPDRININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEALQLYKSGVFTGDCG 480
           CDP+R N  VV+I+GY +VP  +E  LK AVA QPVSVAI+A   A QLY+SGVFTG CG
Sbjct: 253 CDPNRKNAKVVTIDGYEDVPENDESSLKKAVANQPVSVAIEAGGRAFQLYQSGVFTGRCG 312

Query: 481 TELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVASTTLGKCGITLMASYP 511
           T LDHGVVAVGYGT++G DYWI++NSW ++WGE GY++LERNVA++T GKCGI +  SYP
Sbjct: 313 TNLDHGVVAVGYGTDNGTDYWIVRNSWGKDWGENGYIRLERNVANSTTGKCGIAVEPSYP 367

BLAST of Cp4.1LG15g04750 vs. NCBI nr
Match: gi|449438381|ref|XP_004136967.1| (PREDICTED: low-temperature-induced cysteine proteinase [Cucumis sativus])

HSP 1 Score: 430.3 bits (1105), Expect = 4.9e-117
Identity = 230/374 (61.50%), Positives = 262/374 (70.05%), Query Frame = 1

Query: 169 MAGQPVSVAIEAAIFLCFPSFSSASDSDSPTFSIIDENAKHHMGIPHTLDSDSGGFPHRT 228
           MA  P+  A   A+F C   FSSAS S   TFSIIDENAKHH+GIP    SD+     R 
Sbjct: 1   MAISPIFFAF-LALFFCLSPFSSASHSS--TFSIIDENAKHHLGIPEIPHSDAH---QRP 60

Query: 229 HDEIAALFESWLVHHGKAYNSLGEKERRLEIFKDNLNFIDEHNRVPRSYKVGLTCFADLT 288
            +E+AAL+ESWLVHHGKAYN++GEKERR EIFKDNL FIDEHNR  R+YKVGLT FADLT
Sbjct: 61  DEEVAALYESWLVHHGKAYNAIGEKERRFEIFKDNLRFIDEHNRESRTYKVGLTRFADLT 120

Query: 289 NDEYRALFLG-------------------ARFSPLPH-------------LSAADSGSCW 348
           N+EYRA FLG                   A    LP                    GSCW
Sbjct: 121 NEEYRARFLGGRFSRKPRLSAAKSGRYAAALGDDLPDDVDWRKKGAVATVKDQGQCGSCW 180

Query: 349 AFSAVAAVEGINQIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDTE 408
           AFS+VAAVEGINQIVTGELI LSEQELVDC +S F  GCNGG MD AF+FII NGGIDTE
Sbjct: 181 AFSSVAAVEGINQIVTGELIPLSEQELVDCDKS-FNMGCNGGLMDYAFQFIIGNGGIDTE 240

Query: 409 EDYPYKALDNYTCDPDRININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEALQ 468
           EDYPYK  D   CDP+R N  VV+I+GY +VP  +E  LK AVA QPVSVAI+A   A Q
Sbjct: 241 EDYPYKGRD-AACDPNRKNAKVVTIDGYEDVPENDESSLKKAVANQPVSVAIEAGGRAFQ 300

Query: 469 LYKSGVFTGDCGTELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVASTTL 511
           LY+SGVFTG CGT+LDHGVVAVGYGT++G DYWI++NSW ++WGE GY++LERNVA+ T 
Sbjct: 301 LYQSGVFTGRCGTDLDHGVVAVGYGTDNGTDYWIVRNSWGKDWGESGYIRLERNVANITT 360

BLAST of Cp4.1LG15g04750 vs. NCBI nr
Match: gi|255555337|ref|XP_002518705.1| (PREDICTED: cysteine proteinase RD21a [Ricinus communis])

HSP 1 Score: 396.4 bits (1017), Expect = 7.8e-107
Identity = 210/366 (57.38%), Positives = 244/366 (66.67%), Query Frame = 1

Query: 181 AIFLCFP----SFSSASDSDSPTFSIIDENAKHHMGIPHTLDSDSGGFPHRTHDEIAALF 240
           A F+ F     SFSSA D      SI+D N KH              +P RT  ++  ++
Sbjct: 8   AFFILFSGLLSSFSSALD-----MSIVDYNIKH-----------GTKYPLRTDSQVRRMY 67

Query: 241 ESWLVHHGKAYNSLGEKERRLEIFKDNLNFIDEHNRVPRSYKVGLTCFADLTNDEYRALF 300
           E WLV HGKAYN+LGEKE+R EIFKDNL FIDEHN V RSYKVGL  FADLTN+EY+A+F
Sbjct: 68  EMWLVEHGKAYNALGEKEKRFEIFKDNLRFIDEHNSVDRSYKVGLNRFADLTNEEYKAMF 127

Query: 301 LGARF--------------------------------SPLPHLSAADSGSCWAFSAVAAV 360
           LG +                                 + +P       GSCWAFS V AV
Sbjct: 128 LGTKMERKNRFLGTRSQRYLFKDGDDLPENVDWREKGAVVPVKDQGQCGSCWAFSTVGAV 187

Query: 361 EGINQIVTGELISLSEQELVDCGRSKFLSGCNGGYMDKAFRFIIDNGGIDTEEDYPYKAL 420
           EGINQIVTGELISLSEQELVDC +S +  GCNGG MD AF FII+NGGIDTEEDYPYKA 
Sbjct: 188 EGINQIVTGELISLSEQELVDCDKS-YNQGCNGGLMDYAFEFIINNGGIDTEEDYPYKAS 247

Query: 421 DNYTCDPDRININVVSINGYVNVPPGNERLLKMAVAGQPVSVAIDAHSEALQLYKSGVFT 480
           DN  CDP+R N  VV+I+GY +VP  +E  LK AVA QPVSVAI+A   A QLYKSGVFT
Sbjct: 248 DNI-CDPNRKNAKVVTIDGYEDVPENDENSLKKAVAHQPVSVAIEAGGRAFQLYKSGVFT 307

Query: 481 GDCGTELDHGVVAVGYGTEDGLDYWIIKNSWSENWGEKGYMKLERNVASTTLGKCGITLM 511
           G CGTELDHGVVAVGYGTE+G++YWI++NSW   WGE GY+++ERNVA+T  GKCGI + 
Sbjct: 308 GRCGTELDHGVVAVGYGTENGVNYWIVRNSWGSAWGESGYIRMERNVANTKTGKCGIAIQ 355

BLAST of Cp4.1LG15g04750 vs. NCBI nr
Match: gi|694413514|ref|XP_009335016.1| (PREDICTED: cysteine proteinase RD21a-like isoform X1 [Pyrus x bretschneideri])

HSP 1 Score: 387.9 bits (995), Expect = 2.8e-104
Identity = 205/345 (59.42%), Positives = 235/345 (68.12%), Query Frame = 1

Query: 201 SIIDENAKHHMGIPHTLDSDSGGFPHRTHDEIAALFESWLVHHGKAYNSLGEKERRLEIF 260
           SI+D N KH  G+P + D         T  E+ AL+ESWLV H K YN+LGEKERR EIF
Sbjct: 33  SIVDYNEKH--GMPASSDP--------TETEVRALYESWLVKHAKNYNALGEKERRFEIF 92

Query: 261 KDNLNFIDEHNRVPRSYKVGLTCFADLTNDEYRALFLGARFS------------------ 320
           KDNL FIDEHN+  R+YKVGL  FADLTN+EYR+++LGA+                    
Sbjct: 93  KDNLRFIDEHNKQSRTYKVGLNRFADLTNEEYRSVYLGAKVDRRRRSGLSGSRKSDRYAF 152

Query: 321 ----PLPHL-------------SAADSGSCWAFSAVAAVEGINQIVTGELISLSEQELVD 380
                LP L                  GSCWAFS V AVEGIN+IVTGELISLSEQELVD
Sbjct: 153 RVGDKLPELVDWRAEGAVPAVKDQGQCGSCWAFSTVGAVEGINKIVTGELISLSEQELVD 212

Query: 381 CGRSKFLSGCNGGYMDKAFRFIIDNGGIDTEEDYPYKALDNYTCDPDRININVVSINGYV 440
           C RS +  GCNGG MD AF+FII NGGIDTE DYPY A D  +CDP R N  VVSI+GY 
Sbjct: 213 CDRS-YNQGCNGGLMDYAFQFIITNGGIDTEADYPYHARDG-SCDPSRKNARVVSIDGYE 272

Query: 441 NVPPGNERLLKMAVAGQPVSVAIDAHSEALQLYKSGVFTGDCGTELDHGVVAVGYGTEDG 500
           +VP  +E+ LK AVA QPVSVAI+A     QLY+SGVFTG CGT+LDHGVVAVGYGTE+G
Sbjct: 273 DVPENDEKSLKKAVAHQPVSVAIEAGGREFQLYQSGVFTGRCGTDLDHGVVAVGYGTENG 332

Query: 501 LDYWIIKNSWSENWGEKGYMKLERNVASTTLGKCGITLMASYPIK 511
           +DYWI++NSW  NWGE GY+KLERNVAS   GKCGI + ASYP K
Sbjct: 333 VDYWIVRNSWGPNWGEAGYIKLERNVASINTGKCGIAIEASYPTK 365

BLAST of Cp4.1LG15g04750 vs. NCBI nr
Match: gi|694413516|ref|XP_009335017.1| (PREDICTED: cysteine proteinase RD21a-like isoform X2 [Pyrus x bretschneideri])

HSP 1 Score: 387.9 bits (995), Expect = 2.8e-104
Identity = 205/345 (59.42%), Positives = 235/345 (68.12%), Query Frame = 1

Query: 201 SIIDENAKHHMGIPHTLDSDSGGFPHRTHDEIAALFESWLVHHGKAYNSLGEKERRLEIF 260
           SI+D N KH  G+P + D         T  E+ AL+ESWLV H K YN+LGEKERR EIF
Sbjct: 33  SIVDYNEKH--GMPASSDP--------TEPEVRALYESWLVKHAKNYNALGEKERRFEIF 92

Query: 261 KDNLNFIDEHNRVPRSYKVGLTCFADLTNDEYRALFLGARFS------------------ 320
           KDNL FIDEHN+  R+YKVGL  FADLTN+EYR+++LGA+                    
Sbjct: 93  KDNLRFIDEHNKQSRTYKVGLNRFADLTNEEYRSVYLGAKVDRRRRSGLSGSRKSDRYAF 152

Query: 321 ----PLPHL-------------SAADSGSCWAFSAVAAVEGINQIVTGELISLSEQELVD 380
                LP L                  GSCWAFS V AVEGIN+IVTGELISLSEQELVD
Sbjct: 153 RVGDKLPELVDWRAEGAVPAVKDQGQCGSCWAFSTVGAVEGINKIVTGELISLSEQELVD 212

Query: 381 CGRSKFLSGCNGGYMDKAFRFIIDNGGIDTEEDYPYKALDNYTCDPDRININVVSINGYV 440
           C RS +  GCNGG MD AF+FII NGGIDTE DYPY A D  +CDP R N  VVSI+GY 
Sbjct: 213 CDRS-YNQGCNGGLMDYAFQFIITNGGIDTEADYPYHARDG-SCDPSRKNARVVSIDGYE 272

Query: 441 NVPPGNERLLKMAVAGQPVSVAIDAHSEALQLYKSGVFTGDCGTELDHGVVAVGYGTEDG 500
           +VP  +E+ LK AVA QPVSVAI+A     QLY+SGVFTG CGT+LDHGVVAVGYGTE+G
Sbjct: 273 DVPENDEKSLKKAVAHQPVSVAIEAGGREFQLYQSGVFTGRCGTDLDHGVVAVGYGTENG 332

Query: 501 LDYWIIKNSWSENWGEKGYMKLERNVASTTLGKCGITLMASYPIK 511
           +DYWI++NSW  NWGE GY+KLERNVAS   GKCGI + ASYP K
Sbjct: 333 VDYWIVRNSWGPNWGEAGYIKLERNVASINTGKCGIAIEASYPTK 365

The following BLAST results are available for this feature:
Match Name     E-value   Identity  Description
RD21A_ARATH    8.1e-96   52.89     Cysteine proteinase RD21A OS=Arabidopsis thaliana GN=RD21A PE=1 SV=1
RD21B_ARATH    1.1e-92   54.81     Probable cysteine protease RD21B OS=Arabidopsis thaliana GN=RD21B PE=1 SV=1
RD21C_ARATH    1.2e-91   55.87     Probable cysteine protease RD21C OS=Arabidopsis thaliana GN=RD21C PE=1 SV=1
ORYA_ORYSJ     3.5e-91   52.96     Oryzain alpha chain OS=Oryza sativa subsp. japonica GN=Os04g0650000 PE=1 SV=2
RDL2_ARATH     1.2e-86   53.80     Probable cysteine protease RDL2 OS=Arabidopsis thaliana GN=RDL2 PE=2 SV=1

Match Name         E-value    Identity  Description
A0A0A0K4N3_CUCSA   3.4e-117   61.50     Cysteine protease OS=Cucumis sativus GN=Csa_7G073400 PE=3 SV=1
B9RYC1_RICCO       5.4e-107   57.38     Cysteine protease, putative OS=Ricinus communis GN=RCOM_0812150 PE=3 SV=1
A0A061E033_THECC   7.3e-104   57.54     Granulin repeat cysteine protease family protein OS=Theobroma cacao GN=TCM_00711...
A0A0B0MZ41_GOSAR   6.9e-102   56.44     Cysteinease RD21a-like protein OS=Gossypium arboreum GN=F383_30955 PE=3 SV=1
W9RY43_9ROSA       9.9e-101   53.74     Cysteine proteinase RD21a OS=Morus notabilis GN=L484_027445 PE=3 SV=1

Match Name     E-value   Identity  Description
AT1G47128.1    4.5e-97   52.89     Granulin repeat cysteine protease family protein
AT5G43060.1    6.1e-94   54.81     Granulin repeat cysteine protease family protein
AT3G19390.1    6.8e-93   55.87     Granulin repeat cysteine protease family protein
AT3G19400.1    6.6e-88   53.80     Cysteine proteinases superfamily protein
AT4G36880.1    2.6e-84   49.23     cysteine proteinase1

Match Name                         E-value    Identity  Description
gi|659109966|ref|XP_008454976.1|   2.3e-119   63.81     PREDICTED: low-temperature-induced cysteine proteinase-like [Cucumis melo]
gi|449438381|ref|XP_004136967.1|   4.9e-117   61.50     PREDICTED: low-temperature-induced cysteine proteinase [Cucumis sativus]
gi|255555337|ref|XP_002518705.1|   7.8e-107   57.38     PREDICTED: cysteine proteinase RD21a [Ricinus communis]
gi|694413514|ref|XP_009335016.1|   2.8e-104   59.42     PREDICTED: cysteine proteinase RD21a-like isoform X1 [Pyrus x bretschneideri]
gi|694413516|ref|XP_009335017.1|   2.8e-104   59.42     PREDICTED: cysteine proteinase RD21a-like isoform X2 [Pyrus x bretschneideri]

The following terms have been associated with this gene:
Vocabulary: Molecular Function
Term        Definition
GO:0008234  cysteine-type peptidase activity
Vocabulary: Biological Process
Term        Definition
GO:0006508  proteolysis
Vocabulary: INTERPRO
Term        Definition
IPR025661   Pept_asp_AS
IPR025660   Pept_his_AS
IPR013201   Prot_inhib_I29
IPR013128   Peptidase_C1A
IPR000668   Peptidase_C1A_C
GO Assignments
This gene is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0006508 proteolysis
biological_process GO:0055114 oxidation-reduction process
biological_process GO:0051603 proteolysis involved in cellular protein catabolic process
cellular_component GO:0005575 cellular_component
cellular_component GO:0005615 extracellular space
cellular_component GO:0005764 lysosome
molecular_function GO:0008234 cysteine-type peptidase activity
molecular_function GO:0032440 2-alkenal reductase [NAD(P)] activity
molecular_function GO:0004197 cysteine-type endopeptidase activity

The following mRNA feature(s) are a part of this gene:

Feature Name       Unique Name        Type
Cp4.1LG15g04750.1  Cp4.1LG15g04750.1  mRNA


Analysis Name: InterPro Annotations of Cucurbita pepo
Date Performed: 2017-12-02
IPR Term   IPR Description   Source   Source Term   Source Description   Alignment
IPR000668  Peptidase C1A, papain C-terminal  PRINTS  PR00705  PAPAIN
    coord: 309..324  score: 1.7E-7
    coord: 468..474  score: 1.7E-7
    coord: 453..463  score: 1.
IPR000668  Peptidase C1A, papain C-terminal  PFAM  PF00112  Peptidase_C1
    coord: 312..509  score: 4.5E-68
    coord: 129..181  score: 1.
IPR000668  Peptidase C1A, papain C-terminal  SMART  SM00645  pept_c1
    coord: 107..509  score: 1.3
IPR013128  Peptidase C1A  PANTHER  PTHR12411  CYSTEINE PROTEASE FAMILY C1-RELATED
    coord: 11..128  score: 2.7E-184
    coord: 313..510  score: 2.7E
IPR013201  Cathepsin propeptide inhibitor domain (I29)  PFAM  PF08246  Inhibitor_I29
    coord: 20..76  score: 3.5E-17
    coord: 236..292  score: 1.2
IPR013201  Cathepsin propeptide inhibitor domain (I29)  SMART  SM00848  Inhibitor_I29_2
    coord: 20..76  score: 1.8E-25
    coord: 236..292  score: 9.9
IPR025660  Cysteine peptidase, histidine active site  PROSITE  PS00639  THIOL_PROTEASE_HIS
    coord: 451..461  scor
IPR025661  Cysteine peptidase, asparagine active site  PROSITE  PS00640  THIOL_PROTEASE_ASN
    coord: 468..487  scor
None  No IPR available  GENE3D  G3DSA:3.90.70.10
    coord: 14..128  score: 2.0E-30
    coord: 217..509  score: 1.3E-92
    coord: 129..180  score: 6.
None  No IPR available  PANTHER  PTHR12411:SF344  CYSTEINE PROTEASE COMPONENT OF PROTEASE-INHIBITOR COMPLEX-RELATED
    coord: 313..510  score: 2.7E-184
    coord: 11..128  score: 2.7E
None  No IPR available  unknown  SSF54001  Cysteine proteinases
    coord: 14..181  score: 2.06E-38
    coord: 228..509  score: 8.81

The following gene(s) are paralogous to this gene:
Gene             Paralogue        Organism                   Block
Cp4.1LG15g04750  Cp4.1LG19g11240  Cucurbita pepo (Zucchini)  cpecpeB262
Cp4.1LG15g04750  Cp4.1LG04g07890  Cucurbita pepo (Zucchini)  cpecpeB269
Cp4.1LG15g04750  Cp4.1LG06g01630  Cucurbita pepo (Zucchini)  cpecpeB282