Cla97C01G004720 (gene) Watermelon (97103) v2

Name: Cla97C01G004720
Type: gene
Organism: Citrullus lanatus (Watermelon (97103) v2)
Description: Prolyl 4-hydroxylase subunit alpha-1
Location: Cla97Chr01 : 4524020 .. 4526580 (+)
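As a rough illustration, the length of the gene span can be derived from the Location field. The sketch below assumes the coordinates are 1-based and inclusive (as in GFF-style annotation), which this page does not state. `parse_location` is a hypothetical helper written for this example, not part of any database API.

```python
import re

def parse_location(text):
    """Parse 'chrom : start .. end (strand)' into a dict that includes the span length.

    Assumes 1-based, inclusive coordinates (an assumption, not stated on the page).
    """
    m = re.fullmatch(r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*", text)
    if m is None:
        raise ValueError(f"unrecognized location: {text!r}")
    chrom, start, end, strand = m.group(1), int(m.group(2)), int(m.group(3)), m.group(4)
    return {"chrom": chrom, "start": start, "end": end, "strand": strand,
            "length": end - start + 1}

loc = parse_location("Cla97Chr01 : 4524020 .. 4526580 (+)")
print(loc["length"])  # 2561 under the inclusive-coordinate assumption
```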
The following sequences are available for this feature:

Gene sequence (with intron)

ATGGATTTCTACAATAGCACCTTTCTGAAATCTATATCCACATTCTCTAATTTTTACTTCAACTTCATCTTGTTTCGATCTTCGTACCAGCCATCAATGGATTCTCGTCTTCATTTTTTGCTTCTTTTATCGATTGCATTTTCATTCTCAACCTGCCTTGCACAAAGGTGATGACCGAACCCCATTTTACATCGTATTTCGAAATCTTAGGTTTTACGTACTTATCTGTGGTTTACTGCCTCGATTGGATTCTCATCAATTTCGTCTACGGATATTTATTTGGAATTGTGTTCTTAAAACTGATTTTCGTTTCACTTTGGAATGATTTGTTGATGCATGATTGAGTTTCATAACATGGGTTTTTATTCTTTCCCTTGAATCTGTTATGTTCATTCTGCCAGCAATTTGATTAGTGGCCGGAAGGGTTTAAGGGACCAATTGGTTGAAAGACCTTTGAGCTACTCAAATCATTCAGGAAGAATCGACCCATCAAGAGTTGTCCAAGTCTCTTGGCGACCAAGGTCAGGTACTTGCAGAATCTCTCATATGATTTGGAATTATCACATCAACTTCAAAGCATATTGATTAACTTAGTAGGAAGTCTGATTTAACTTCTCTTGCTGTAGGGTTTTCTTGTATAAAGGTTTTCTCTCAGATGAGGAGTGTGATCATCTGATTTCTTTGGTATTTCTCTGCCCTGTTTTTGTTAGTATCATAAACATGCTTCCGCTTTTTCTGACGTGCATATACCTTTTGGGCTTAGGCTTCAAATTCAGAAGACAATCCTTCCGGGAACACTGTCTCAACCAAAGTGCTAAAGAGTTCAGGAGTCATTTTAAACACAACAGTATGTTCCTGTGCATTTTGAAGTTTATGCAATGGTTTGAATAGCACACTTAATGGTCAAATTTGTATTTCATAGCACGAGGTTTGTTTGTCTTGTTTTCTTGAAGCTTATGTTAAAGTAGTATATTTTTCAATCATATATTGTTTTCTGTTAGTTTCTACACGTGAGCAATTATGCAGTCCATCAGTTATAGAACTTTTTCTGTCATCATTGTTTTCTTTATTTCCAGTATTGTTATTCAAAATATAATTATAAATGTACATGGACGAACAGCTTCATCTATGCTTGGGAATTGCTTGAGTTAAATTAGTCGTTCAATGTATTTTCCATTGAAATATGTTCAGTCTAAAGTTCTCTGTTGATAGTTTTCTCAGTGGTACTTCCAAATACCGTGTTGCATATTAGGATTTGGTACACATACTGTTTTTGCTTTTAATTTCATATGCTGTCAATCTGTCTCAATACCCATAGAAGTATGCTTTTTTGCTGTATTTTCTTGGGGCTTGGTAGTCGTTTTAAAGGAATCTTAAATAACTCTGAACTCTAAATGTCCTGTGAAGTCAAACGTTCATTTCTGCATTTGTAGATAAGAACACTAAAAATGTGGAGGATGCTAGCATAGAAAAACCTATGATAACTTATGCAAAATTCCATAATTATTATTAAAATAATTATTTTTGATGATTCAATTTTGAGTTGTCAGCATCGTCTTTCAAAATTTATAGCAACCCATGGCTCAAGGGAGTAATTGCAGACCAAAGAATTTAGTATTGTGCTTCCTTTTTGTAAGACCATCTATTCCTGTTCATAGTATTTATGTTTTGTGGAGCTTGTCAATTACTTCTTAAGTAACTAGGTCTTGTGAGATAAAGCTCAAAGTCACTGTAGTTGGTCTATTTCTAGTTTCTATGAGATTTATTCTTTCTTGGTTGGGCTTGTTTCTTAGGATGATATTATTGCAAGAATTGAAAATCGAATTGCACTGTGGACTTTTCTCCCAAAAGGTATTTCTCATCAATGCTGTATTGCAGCTTTGCCTTTTTCATTTTCCTGATAATGCTCCTTTTATTTGGTTTTCTTTTCTTCTAAAAGTATTAGTTTTATTGTCTTCAGATCATAGCATGCCTTTCCAGATCATGCAATACAGGGGTGA
AGAAGCAGAGCACAAGTACTTTTATGGCAACAGATCTGCAATGTCGTCCAGTGAGCCTTTGATGGCCACAGTAGTTTTGTATCTCTCAGATTCTGCTCGCGGTGGCGTGATGCTCTTTCCAGAGTCAAAGGTGAGGGGAAGTACTCAAAGATCTGTGGCCATAACAATGTACTCATGACTGCCTTTTTCTCAATGCCTCAGGTAAAGAGCAAATTTTGGTCAAACCGGAGAAAGAAAAACAACTTTCTGAGACCAGTGAAAGGCAATGCAGTACTTTTTTTCTCTGTGCATCTTAATGCTTCTCCAGACAAGAGTAACTACCACACTCGATCCCCAATACTCAATGGGGAATTGTGGGTTGCTACAAAATTCTTCTACTTAAGACCAACCACTGGGAATAAAGATAAACACACAATTGAATCTGATGTAGACGGTTGCATTGATGAAGATAAAAGCTGCCCCCAATGGGCTGCCATTGGCGAATGTGAACGAAACGCTGTATTCATGATCGGTTCTCCAGATTACTATGGTACATGTAGAAAAAGCTGCAATGCATGTTGA

mRNA sequence

ATGGATTTCTACAATAGCACCTTTCTGAAATCTATATCCACATTCTCTAATTTTTACTTCAACTTCATCTTGTTTCGATCTTCGTACCAGCCATCAATGGATTCTCGTCTTCATTTTTTGCTTCTTTTATCGATTGCATTTTCATTCTCAACCTGCCTTGCACAAAGCAATTTGATTAGTGGCCGGAAGGGTTTAAGGGACCAATTGGTTGAAAGACCTTTGAGCTACTCAAATCATTCAGGAAGAATCGACCCATCAAGAGTTGTCCAAGTCTCTTGGCGACCAAGGGTTTTCTTGTATAAAGGTTTTCTCTCAGATGAGGAGTGTGATCATCTGATTTCTTTGGCTTCAAATTCAGAAGACAATCCTTCCGGGAACACTGTCTCAACCAAAGTGCTAAAGAGTTCAGGAGTCATTTTAAACACAACAGATGATATTATTGCAAGAATTGAAAATCGAATTGCACTGTGGACTTTTCTCCCAAAAGATCATAGCATGCCTTTCCAGATCATGCAATACAGGGGTGAAGAAGCAGAGCACAAGTACTTTTATGGCAACAGATCTGCAATGTCGTCCAGTGAGCCTTTGATGGCCACAGTAGTTTTGTATCTCTCAGATTCTGCTCGCGGTGGCGTGATGCTCTTTCCAGAGTCAAAGGTAAAGAGCAAATTTTGGTCAAACCGGAGAAAGAAAAACAACTTTCTGAGACCAGTGAAAGGCAATGCAGTACTTTTTTTCTCTGTGCATCTTAATGCTTCTCCAGACAAGAGTAACTACCACACTCGATCCCCAATACTCAATGGGGAATTGTGGGTTGCTACAAAATTCTTCTACTTAAGACCAACCACTGGGAATAAAGATAAACACACAATTGAATCTGATGTAGACGGTTGCATTGATGAAGATAAAAGCTGCCCCCAATGGGCTGCCATTGGCGAATGTGAACGAAACGCTGTATTCATGATCGGTTCTCCAGATTACTATGGTACATGTAGAAAAAGCTGCAATGCATGTTGA

Coding sequence (CDS)

ATGGATTTCTACAATAGCACCTTTCTGAAATCTATATCCACATTCTCTAATTTTTACTTCAACTTCATCTTGTTTCGATCTTCGTACCAGCCATCAATGGATTCTCGTCTTCATTTTTTGCTTCTTTTATCGATTGCATTTTCATTCTCAACCTGCCTTGCACAAAGCAATTTGATTAGTGGCCGGAAGGGTTTAAGGGACCAATTGGTTGAAAGACCTTTGAGCTACTCAAATCATTCAGGAAGAATCGACCCATCAAGAGTTGTCCAAGTCTCTTGGCGACCAAGGGTTTTCTTGTATAAAGGTTTTCTCTCAGATGAGGAGTGTGATCATCTGATTTCTTTGGCTTCAAATTCAGAAGACAATCCTTCCGGGAACACTGTCTCAACCAAAGTGCTAAAGAGTTCAGGAGTCATTTTAAACACAACAGATGATATTATTGCAAGAATTGAAAATCGAATTGCACTGTGGACTTTTCTCCCAAAAGATCATAGCATGCCTTTCCAGATCATGCAATACAGGGGTGAAGAAGCAGAGCACAAGTACTTTTATGGCAACAGATCTGCAATGTCGTCCAGTGAGCCTTTGATGGCCACAGTAGTTTTGTATCTCTCAGATTCTGCTCGCGGTGGCGTGATGCTCTTTCCAGAGTCAAAGGTAAAGAGCAAATTTTGGTCAAACCGGAGAAAGAAAAACAACTTTCTGAGACCAGTGAAAGGCAATGCAGTACTTTTTTTCTCTGTGCATCTTAATGCTTCTCCAGACAAGAGTAACTACCACACTCGATCCCCAATACTCAATGGGGAATTGTGGGTTGCTACAAAATTCTTCTACTTAAGACCAACCACTGGGAATAAAGATAAACACACAATTGAATCTGATGTAGACGGTTGCATTGATGAAGATAAAAGCTGCCCCCAATGGGCTGCCATTGGCGAATGTGAACGAAACGCTGTATTCATGATCGGTTCTCCAGATTACTATGGTACATGTAGAAAAAGCTGCAATGCATGTTGA

Protein sequence

MDFYNSTFLKSISTFSNFYFNFILFRSSYQPSMDSRLHFLLLLSIAFSFSTCLAQSNLISGRKGLRDQLVERPLSYSNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLASNSEDNPSGNTVSTKVLKSSGVILNTTDDIIARIENRIALWTFLPKDHSMPFQIMQYRGEEAEHKYFYGNRSAMSSSEPLMATVVLYLSDSARGGVMLFPESKVKSKFWSNRRKKNNFLRPVKGNAVLFFSVHLNASPDKSNYHTRSPILNGELWVATKFFYLRPTTGNKDKHTIESDVDGCIDEDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC
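The protein sequence above should correspond to a frame-0 translation of the CDS. The following is a minimal sketch of that translation, assuming the standard nuclear genetic code (NCBI translation table 1), which this page does not state explicitly. `translate` is an illustrative helper, and only the first and last few codons of the CDS are used here.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
# Using this table is an assumption; the page does not name the code it used.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate a CDS in frame 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(translate("ATGGATTTCTACAATAGC"))  # first six codons of the CDS -> MDFYNS
print(translate("AATGCATGTTGA"))        # last four codons, ending in TGA -> NAC
```

Passing the full CDS string to `translate` would allow a direct comparison with the listed protein sequence.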
BLAST of Cla97C01G004720 vs. NCBI nr
Match: XP_008436994.1 (PREDICTED: probable prolyl 4-hydroxylase 12 [Cucumis melo])

HSP 1 Score: 554.3 bits (1427), Expect = 2.9e-154
Identity = 274/313 (87.54%), Positives = 287/313 (91.69%), Query Frame = 0

Query: 33  MDSRLHFLLLLSIAFSFSTCLAQSNLISGRKGLRDQLVERPLSYSNHSGRIDPSRVVQVS 92
           MDSRL+FLLL + AFSFSTCLAQSNLISGRKGLRDQLV+RPLSYSN S RIDPSRVVQVS
Sbjct: 1   MDSRLNFLLLFATAFSFSTCLAQSNLISGRKGLRDQLVDRPLSYSNQSVRIDPSRVVQVS 60

Query: 93  WRPRVFLYKGFLSDEECDHLISLASNSEDNP------SGNTVSTKVLKSSGVILNTTDDI 152
           WRPRVFLYKGFLSDEECDHLISLASNSEDNP      SGNTVST++L  SGVILNTTDDI
Sbjct: 61  WRPRVFLYKGFLSDEECDHLISLASNSEDNPSRNSAGSGNTVSTELLNGSGVILNTTDDI 120

Query: 153 IARIENRIALWTFLPKDHSMPFQIMQYRGEEAEHKYFYGNRSAM-SSSEPLMATVVLYLS 212
           IARIENRIA+WT LPKDH MPFQIMQYRGEEA+HKYFYGNRSAM SSSEPLMATVVLYLS
Sbjct: 121 IARIENRIAVWTLLPKDHGMPFQIMQYRGEEAKHKYFYGNRSAMSSSSEPLMATVVLYLS 180

Query: 213 DSARGGVMLFPESKVKSKFWSNRRKKNNFLRPVKGNAVLFFSVHLNASPDKSNYHTRSPI 272
           DSA GG MLFPESKVKSKFWS RRKK NFLRPVKGNA+LFFSVHLNASPDKS+YH R PI
Sbjct: 181 DSASGGEMLFPESKVKSKFWSGRRKKKNFLRPVKGNAILFFSVHLNASPDKSSYHIRYPI 240

Query: 273 LNGELWVATKFFYLRPTTGNKDKHTIESDVDGCIDEDKSCPQWAAIGECERNAVFMIGSP 332
            NGELWVATKF YLRP TGN  KHTI+S++DGCIDEDKSCPQWAAIGECERNAVFM+GSP
Sbjct: 241 RNGELWVATKFLYLRPPTGN--KHTIDSNIDGCIDEDKSCPQWAAIGECERNAVFMVGSP 300

Query: 333 DYYGTCRKSCNAC 339
           DYYGTCRKSCNAC
Sbjct: 301 DYYGTCRKSCNAC 311
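The Identity and Positives percentages in an HSP summary appear to be plain ratios of matching alignment columns to the alignment length. The sketch below reproduces the figures for this hit and counts matches from a midline string. It assumes the usual BLAST midline convention (a letter for an identical residue, `+` for a conservative substitution, a space otherwise). Both function names are hypothetical.

```python
def percent(matches, length):
    """Percentage rounded to two decimals, as printed in the HSP summary."""
    return round(100.0 * matches / length, 2)

def midline_counts(midline):
    """Count identities (letters) and positives (letters or '+') in a BLAST midline."""
    identities = sum(ch.isalpha() for ch in midline)
    positives = identities + midline.count("+")
    return identities, positives, len(midline)

print(percent(274, 313))                # 87.54
print(percent(287, 313))                # 91.69
print(midline_counts("DYYGTCRKSCNAC"))  # (13, 13, 13)
```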

BLAST of Cla97C01G004720 vs. NCBI nr
Match: XP_004152378.1 (PREDICTED: probable prolyl 4-hydroxylase 12 [Cucumis sativus])

HSP 1 Score: 553.1 bits (1424), Expect = 6.4e-154
Identity = 272/313 (86.90%), Positives = 289/313 (92.33%), Query Frame = 0

Query: 33  MDSRLHFLLLLSIAFSFSTCLAQSNLISGRKGLRDQLVERPLSYSNHSGRIDPSRVVQVS 92
           MDSRL+FLLLL+ AFSFSTCLAQSNLISGRKGLRD+LV+RPLSYSN+SGRIDPSRVVQVS
Sbjct: 1   MDSRLNFLLLLATAFSFSTCLAQSNLISGRKGLRDRLVDRPLSYSNYSGRIDPSRVVQVS 60

Query: 93  WRPRVFLYKGFLSDEECDHLISLASNSEDNPSGN------TVSTKVLKSSGVILNTTDDI 152
           WRPRVFLYKGFLSDEECDHLISLASNSEDNPS N      TVST++L SSGVILNTTDDI
Sbjct: 61  WRPRVFLYKGFLSDEECDHLISLASNSEDNPSRNSAGSGITVSTELLNSSGVILNTTDDI 120

Query: 153 IARIENRIALWTFLPKDHSMPFQIMQYRGEEAEHKYFYGNRSAM-SSSEPLMATVVLYLS 212
           +ARIENR+A+WT LPKDHSMPFQIMQYRGEEA+HKYFYGNRSAM  SSEPLMATVVLYLS
Sbjct: 121 VARIENRLAIWTLLPKDHSMPFQIMQYRGEEAKHKYFYGNRSAMLPSSEPLMATVVLYLS 180

Query: 213 DSARGGVMLFPESKVKSKFWSNRRKKNNFLRPVKGNAVLFFSVHLNASPDKSNYHTRSPI 272
           DSA GG +LFPESKVKSKFWS RRKKNNFLRPVKGNA+LFFSVHLNASPDKS+YH RSPI
Sbjct: 181 DSASGGEILFPESKVKSKFWSGRRKKNNFLRPVKGNAILFFSVHLNASPDKSSYHIRSPI 240

Query: 273 LNGELWVATKFFYLRPTTGNKDKHTIESDVDGCIDEDKSCPQWAAIGECERNAVFMIGSP 332
            +GELWVATKF YL P  GN  KHTI+SDVDGC DEDKSCPQWAAIGECERNAVFM+GSP
Sbjct: 241 RDGELWVATKFLYLGPPAGN--KHTIQSDVDGCFDEDKSCPQWAAIGECERNAVFMVGSP 300

Query: 333 DYYGTCRKSCNAC 339
           DYYGTCRKSCNAC
Sbjct: 301 DYYGTCRKSCNAC 311

BLAST of Cla97C01G004720 vs. NCBI nr
Match: XP_022159842.1 (probable prolyl 4-hydroxylase 12 [Momordica charantia])

HSP 1 Score: 523.5 bits (1347), Expect = 5.4e-145
Identity = 264/313 (84.35%), Positives = 279/313 (89.14%), Query Frame = 0

Query: 33  MDSRLHFLLLLSIAFSFSTCLAQSNLISGRKGLRDQLVER-PLSYSNHSGRIDPSRVVQV 92
           MDSRL  LLLL+ A SF +CLAQSNLISGRKGLRDQL+E  PLSYSNHSGRIDPSRVVQV
Sbjct: 1   MDSRLPVLLLLATAISFLSCLAQSNLISGRKGLRDQLIESVPLSYSNHSGRIDPSRVVQV 60

Query: 93  SWRPRVFLYKGFLSDEECDHLISLASNSEDNP------SGNTVSTKVLKSSGVILNTTDD 152
           SWRPRVFLYKGFLSDEECDHLISLA++SED P      SGNTV TK+LKSSG ILNTTDD
Sbjct: 61  SWRPRVFLYKGFLSDEECDHLISLATSSEDKPSGNSTDSGNTVPTKILKSSGAILNTTDD 120

Query: 153 IIARIENRIALWTFLPKDHSMPFQIMQYRGEEAEHKYFYGNRSAMSSSEPLMATVVLYLS 212
           IIARIENRIA+WTFLPKD+SMP QI+QY GEEAEHKY +GNRSAM SSEPLMATVVLYLS
Sbjct: 121 IIARIENRIAVWTFLPKDYSMPLQILQYGGEEAEHKYVFGNRSAMLSSEPLMATVVLYLS 180

Query: 213 DSARGGVMLFPESKVKSKFWSNRRKKNNFLRPVKGNAVLFFSVHLNASPDKSNYHTRSPI 272
           DSA GG M FPESKVKS+FWS+RRKKNN LRPVKGNAVL FSVHLNASPDKS+ HTRSPI
Sbjct: 181 DSASGGEMRFPESKVKSRFWSDRRKKNNILRPVKGNAVLIFSVHLNASPDKSSSHTRSPI 240

Query: 273 LNGELWVATKFFYLRPTTGNKDKHTIESDVDGCIDEDKSCPQWAAIGECERNAVFMIGSP 332
           L+GELW+ATKFFYLRP TGN  KHT E D D C DEDKSCPQWAAIGECERNAVFMIGSP
Sbjct: 241 LDGELWIATKFFYLRPITGN--KHTDEPDGD-CNDEDKSCPQWAAIGECERNAVFMIGSP 300

Query: 333 DYYGTCRKSCNAC 339
           DYYGTCRKSCNAC
Sbjct: 301 DYYGTCRKSCNAC 310

BLAST of Cla97C01G004720 vs. NCBI nr
Match: KGN50302.1 (hypothetical protein Csa_5G166460 [Cucumis sativus])

HSP 1 Score: 518.1 bits (1333), Expect = 2.3e-143
Identity = 252/291 (86.60%), Positives = 268/291 (92.10%), Query Frame = 0

Query: 55  QSNLISGRKGLRDQLVERPLSYSNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLIS 114
           +SNLISGRKGLRD+LV+RPLSYSN+SGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLIS
Sbjct: 6   KSNLISGRKGLRDRLVDRPLSYSNYSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLIS 65

Query: 115 LASNSEDNPSGN------TVSTKVLKSSGVILNTTDDIIARIENRIALWTFLPKDHSMPF 174
           LASNSEDNPS N      TVST++L SSGVILNTTDDI+ARIENR+A+WT LPKDHSMPF
Sbjct: 66  LASNSEDNPSRNSAGSGITVSTELLNSSGVILNTTDDIVARIENRLAIWTLLPKDHSMPF 125

Query: 175 QIMQYRGEEAEHKYFYGNRSAM-SSSEPLMATVVLYLSDSARGGVMLFPESKVKSKFWSN 234
           QIMQYRGEEA+HKYFYGNRSAM  SSEPLMATVVLYLSDSA GG +LFPESKVKSKFWS 
Sbjct: 126 QIMQYRGEEAKHKYFYGNRSAMLPSSEPLMATVVLYLSDSASGGEILFPESKVKSKFWSG 185

Query: 235 RRKKNNFLRPVKGNAVLFFSVHLNASPDKSNYHTRSPILNGELWVATKFFYLRPTTGNKD 294
           RRKKNNFLRPVKGNA+LFFSVHLNASPDKS+YH RSPI +GELWVATKF YL P  GN  
Sbjct: 186 RRKKNNFLRPVKGNAILFFSVHLNASPDKSSYHIRSPIRDGELWVATKFLYLGPPAGN-- 245

Query: 295 KHTIESDVDGCIDEDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 339
           KHTI+SDVDGC DEDKSCPQWAAIGECERNAVFM+GSPDYYGTCRKSCNAC
Sbjct: 246 KHTIQSDVDGCFDEDKSCPQWAAIGECERNAVFMVGSPDYYGTCRKSCNAC 294

BLAST of Cla97C01G004720 vs. NCBI nr
Match: XP_023549812.1 (probable prolyl 4-hydroxylase 12 [Cucurbita pepo subsp. pepo])

HSP 1 Score: 503.8 bits (1296), Expect = 4.5e-139
Identity = 254/316 (80.38%), Positives = 279/316 (88.29%), Query Frame = 0

Query: 33  MDSRLHFLLLLSIAFSFSTCLAQSNLISGRKGLRDQLVER-PLSYSNHSGRIDPSRVVQV 92
           MDSRL+FLLL + AFSFS+CLAQSN +SGRKGLRDQ+V    LSYSNH  RIDPSRVVQ+
Sbjct: 1   MDSRLNFLLLFAAAFSFSSCLAQSNSVSGRKGLRDQMVNSGHLSYSNHFERIDPSRVVQI 60

Query: 93  SWRPRVFLYKGFLSDEECDHLISLASNSEDNP------SGNTVSTKVLKSSGVILNTTDD 152
           SW+PRVFLYKGFLSDEECDHLI+LASNSED P      S NTVSTK L +SG +LNTTDD
Sbjct: 61  SWQPRVFLYKGFLSDEECDHLIALASNSEDKPSRSNAGSRNTVSTKFLGNSGAVLNTTDD 120

Query: 153 IIARIENRIALWTFLPKDHSMPFQIMQYRGEEAE-HKYFYGNRSAMSSSEPLMATVVLYL 212
           IIARIENRIA+WTFLPKDHSMPFQIMQY GEEA  HKYF+GNRSAM SSEPLMATVVLYL
Sbjct: 121 IIARIENRIAVWTFLPKDHSMPFQIMQYGGEEAAGHKYFFGNRSAMPSSEPLMATVVLYL 180

Query: 213 SDSARGGVMLFPESKVKSKFWSNRRKKNNFLRPVKGNAVLFFSVHLNASPDKSNYHTRSP 272
           SDSA GG +LFP SKVK +FWS+RRKKNNFLRPVKGNAVLFFSVHLNASPDKS YH+R+P
Sbjct: 181 SDSASGGEILFPVSKVKRRFWSDRRKKNNFLRPVKGNAVLFFSVHLNASPDKSCYHSRTP 240

Query: 273 ILNGELWVATKFFYLRP-TTGNKDKHTIESDV-DGCIDEDKSCPQWAAIGECERNAVFMI 332
           IL+G+LWVATKFFY+RP  TGN  +H +ES V D CIDED+SCP+WAAIGEC+RNAVFMI
Sbjct: 241 ILDGKLWVATKFFYIRPAATGN--EHAVESGVDDDCIDEDESCPKWAAIGECKRNAVFMI 300

Query: 333 GSPDYYGTCRKSCNAC 339
           GSPDYYGTCRKSCNAC
Sbjct: 301 GSPDYYGTCRKSCNAC 314

BLAST of Cla97C01G004720 vs. TrEMBL
Match: tr|A0A1S3AT39|A0A1S3AT39_CUCME (probable prolyl 4-hydroxylase 12 OS=Cucumis melo OX=3656 GN=LOC103482556 PE=4 SV=1)

HSP 1 Score: 554.3 bits (1427), Expect = 1.9e-154
Identity = 274/313 (87.54%), Positives = 287/313 (91.69%), Query Frame = 0

Query: 33  MDSRLHFLLLLSIAFSFSTCLAQSNLISGRKGLRDQLVERPLSYSNHSGRIDPSRVVQVS 92
           MDSRL+FLLL + AFSFSTCLAQSNLISGRKGLRDQLV+RPLSYSN S RIDPSRVVQVS
Sbjct: 1   MDSRLNFLLLFATAFSFSTCLAQSNLISGRKGLRDQLVDRPLSYSNQSVRIDPSRVVQVS 60

Query: 93  WRPRVFLYKGFLSDEECDHLISLASNSEDNP------SGNTVSTKVLKSSGVILNTTDDI 152
           WRPRVFLYKGFLSDEECDHLISLASNSEDNP      SGNTVST++L  SGVILNTTDDI
Sbjct: 61  WRPRVFLYKGFLSDEECDHLISLASNSEDNPSRNSAGSGNTVSTELLNGSGVILNTTDDI 120

Query: 153 IARIENRIALWTFLPKDHSMPFQIMQYRGEEAEHKYFYGNRSAM-SSSEPLMATVVLYLS 212
           IARIENRIA+WT LPKDH MPFQIMQYRGEEA+HKYFYGNRSAM SSSEPLMATVVLYLS
Sbjct: 121 IARIENRIAVWTLLPKDHGMPFQIMQYRGEEAKHKYFYGNRSAMSSSSEPLMATVVLYLS 180

Query: 213 DSARGGVMLFPESKVKSKFWSNRRKKNNFLRPVKGNAVLFFSVHLNASPDKSNYHTRSPI 272
           DSA GG MLFPESKVKSKFWS RRKK NFLRPVKGNA+LFFSVHLNASPDKS+YH R PI
Sbjct: 181 DSASGGEMLFPESKVKSKFWSGRRKKKNFLRPVKGNAILFFSVHLNASPDKSSYHIRYPI 240

Query: 273 LNGELWVATKFFYLRPTTGNKDKHTIESDVDGCIDEDKSCPQWAAIGECERNAVFMIGSP 332
            NGELWVATKF YLRP TGN  KHTI+S++DGCIDEDKSCPQWAAIGECERNAVFM+GSP
Sbjct: 241 RNGELWVATKFLYLRPPTGN--KHTIDSNIDGCIDEDKSCPQWAAIGECERNAVFMVGSP 300

Query: 333 DYYGTCRKSCNAC 339
           DYYGTCRKSCNAC
Sbjct: 301 DYYGTCRKSCNAC 311

BLAST of Cla97C01G004720 vs. TrEMBL
Match: tr|A0A0A0KPE4|A0A0A0KPE4_CUCSA (Uncharacterized protein OS=Cucumis sativus OX=3659 GN=Csa_5G166460 PE=4 SV=1)

HSP 1 Score: 518.1 bits (1333), Expect = 1.5e-143
Identity = 252/291 (86.60%), Positives = 268/291 (92.10%), Query Frame = 0

Query: 55  QSNLISGRKGLRDQLVERPLSYSNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLIS 114
           +SNLISGRKGLRD+LV+RPLSYSN+SGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLIS
Sbjct: 6   KSNLISGRKGLRDRLVDRPLSYSNYSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLIS 65

Query: 115 LASNSEDNPSGN------TVSTKVLKSSGVILNTTDDIIARIENRIALWTFLPKDHSMPF 174
           LASNSEDNPS N      TVST++L SSGVILNTTDDI+ARIENR+A+WT LPKDHSMPF
Sbjct: 66  LASNSEDNPSRNSAGSGITVSTELLNSSGVILNTTDDIVARIENRLAIWTLLPKDHSMPF 125

Query: 175 QIMQYRGEEAEHKYFYGNRSAM-SSSEPLMATVVLYLSDSARGGVMLFPESKVKSKFWSN 234
           QIMQYRGEEA+HKYFYGNRSAM  SSEPLMATVVLYLSDSA GG +LFPESKVKSKFWS 
Sbjct: 126 QIMQYRGEEAKHKYFYGNRSAMLPSSEPLMATVVLYLSDSASGGEILFPESKVKSKFWSG 185

Query: 235 RRKKNNFLRPVKGNAVLFFSVHLNASPDKSNYHTRSPILNGELWVATKFFYLRPTTGNKD 294
           RRKKNNFLRPVKGNA+LFFSVHLNASPDKS+YH RSPI +GELWVATKF YL P  GN  
Sbjct: 186 RRKKNNFLRPVKGNAILFFSVHLNASPDKSSYHIRSPIRDGELWVATKFLYLGPPAGN-- 245

Query: 295 KHTIESDVDGCIDEDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 339
           KHTI+SDVDGC DEDKSCPQWAAIGECERNAVFM+GSPDYYGTCRKSCNAC
Sbjct: 246 KHTIQSDVDGCFDEDKSCPQWAAIGECERNAVFMVGSPDYYGTCRKSCNAC 294

BLAST of Cla97C01G004720 vs. TrEMBL
Match: tr|A0A2P4NAB4|A0A2P4NAB4_QUESU (Putative prolyl 4-hydroxylase 12 OS=Quercus suber OX=58331 GN=CFP56_60861 PE=4 SV=1)

HSP 1 Score: 346.7 bits (888), Expect = 6.0e-92
Identity = 176/317 (55.52%), Positives = 230/317 (72.56%), Query Frame = 0

Query: 33  MDSRLHFLLLLSIAFSFSTCLAQSNLISGRKGLRDQLVER----PLSYSNHSGRIDPSRV 92
           M S L  LLLL+   SF + LA+S     RK LRD+   +     L +S HS +IDPSRV
Sbjct: 1   MASLLSILLLLAFTSSFQSLLAES-----RKELRDKEASQETFIQLGHSVHSNKIDPSRV 60

Query: 93  VQVSWRPRVFLYKGFLSDEECDHLISLASNSEDN------PSGNTVSTKVLKSSGVILNT 152
           VQ+SW+PRVFLYKGFLSDEECDHLI+LA   ++N       SG+  + ++L+SS + L+ 
Sbjct: 61  VQLSWQPRVFLYKGFLSDEECDHLITLAHGMKENGLGNDDNSGHVGTDRLLRSSEIPLDI 120

Query: 153 TDDIIARIENRIALWTFLPKDHSMPFQIMQYRGEEAEHKYFY-GNRSAMSSSEPLMATVV 212
            DD+++RIE RI+ WTFLPK++S P QIM Y  EE + KY Y GN+S +  ++PLMA VV
Sbjct: 121 EDDVVSRIEERISAWTFLPKENSRPLQIMHYGLEEVDKKYNYLGNKSTLELTKPLMAIVV 180

Query: 213 LYLSDSARGGVMLFPESKVKSKFWSNRRKKNNFLRPVKGNAVLFFSVHLNASPDKSNYHT 272
           LYLS+  +GG + FP+S+VKSK WS   K +N LRP+KGNA+LFF+VH NASPD S+ H 
Sbjct: 181 LYLSNITQGGEIHFPDSEVKSKIWSGCTKSSNILRPIKGNAILFFTVHPNASPDNSSSHA 240

Query: 273 RSPILNGELWVATKFFYLRPTTGNKDKHTIESDVDGCIDEDKSCPQWAAIGECERNAVFM 332
           R PI+ GE+W ATKFF++   +G  +K  +++D   C DE++SCP+WAAIGEC+RN V+M
Sbjct: 241 RCPIVEGEMWHATKFFHVGSISG--EKLPLKTDGTDCTDEEESCPKWAAIGECQRNPVYM 300

Query: 333 IGSPDYYGTCRKSCNAC 339
           IGSPDYYGTCRKSCNAC
Sbjct: 301 IGSPDYYGTCRKSCNAC 310

BLAST of Cla97C01G004720 vs. TrEMBL
Match: tr|A0A1R3HDG6|A0A1R3HDG6_COCAP (Metridin-like ShK toxin OS=Corchorus capsularis OX=210143 GN=CCACVL1_19977 PE=4 SV=1)

HSP 1 Score: 334.7 bits (857), Expect = 2.4e-88
Identity = 163/288 (56.60%), Positives = 205/288 (71.18%), Query Frame = 0

Query: 62  RKGLRDQLVER----PLSYSNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLAS 121
           RK LRD++V+      L  S  S  IDPSRV Q+ WRPRVFLY GFLSDEECDHLISL  
Sbjct: 10  RKELRDKVVQEERVIQLRPSAESNTIDPSRVTQLLWRPRVFLYSGFLSDEECDHLISLGH 69

Query: 122 NSEDNPSG------NTVSTKVLKSSGVILNTTDDIIARIENRIALWTFLPKDHSMPFQIM 181
            +++   G      N  + + LKSS   LNT D ++A IE RI+ WTFLPKD+  P ++ 
Sbjct: 70  GAKEGMLGMNDVQANVGTNRQLKSSATSLNTEDKVLAMIEERISAWTFLPKDNGKPLEVR 129

Query: 182 QYRGEEAEHKY-FYGNRSAMSSSEPLMATVVLYLSDSARGGVMLFPESKVKSKFWSNRRK 241
            Y  EE +    ++GN+SA++ SEPLMATVVLYLS+  RGG +LFP S+ KSK WS   K
Sbjct: 130 HYGHEETKQNLDYFGNKSALALSEPLMATVVLYLSNVTRGGEILFPHSEPKSKMWSECTK 189

Query: 242 KNNFLRPVKGNAVLFFSVHLNASPDKSNYHTRSPILNGELWVATKFFYLRPTTGNKDKHT 301
            ++ L+P KGNA+LFF+ +LNASPD S+ H R P+L GE+W ATKFFY  P   N++K  
Sbjct: 190 TSSILKPAKGNAILFFTTNLNASPDGSSSHARCPVLEGEMWCATKFFY--PRAVNREKVP 249

Query: 302 IESDVDGCIDEDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 339
            +SD +GC+DED +CPQWAA+GEC+RN VFM+GSPDYYGTCRKSCNAC
Sbjct: 250 FDSDGNGCVDEDTNCPQWAAVGECQRNPVFMVGSPDYYGTCRKSCNAC 295

BLAST of Cla97C01G004720 vs. TrEMBL
Match: tr|A0A2I4HT83|A0A2I4HT83_9ROSI (probable prolyl 4-hydroxylase 12 isoform X2 OS=Juglans regia OX=51240 GN=LOC109021218 PE=4 SV=1)

HSP 1 Score: 332.4 bits (851), Expect = 1.2e-87
Identity = 168/309 (54.37%), Positives = 222/309 (71.84%), Query Frame = 0

Query: 40  LLLLSIAFSFSTCLAQSN--LISGRKGLRDQLVERPLSYSNHSGRIDPSRVVQVSWRPRV 99
           LLLL  A SF  C  +S+    SG++  ++ +++  L +S  S RIDPSRVVQ+SW+PRV
Sbjct: 8   LLLLVFASSFLICFTESSRKKFSGKQSNQETVIK--LGHSVDSNRIDPSRVVQLSWQPRV 67

Query: 100 FLYKGFLSDEECDHLISLASNSEDNPSGN------TVSTKVLKSSGVILNTTDDIIARIE 159
           FLYKGFLS EECDHLISL    +    GN       V+ ++L SS + LN  DD+++RIE
Sbjct: 68  FLYKGFLSVEECDHLISLVHGRKKEDLGNNGNSEHVVTNRLLMSSKMHLNIEDDVVSRIE 127

Query: 160 NRIALWTFLPKDHSMPFQIMQYRGEEAEHKY-FYGNRSAMSSSEPLMATVVLYLSDSARG 219
           +RI+ WTFLPK++S P Q+M Y  E+ +  Y F+GNR  +  +EPLMA +VLYLS+  +G
Sbjct: 128 DRISAWTFLPKENSRPLQVMHYGLEKVDRNYNFFGNRDLLGLTEPLMAIIVLYLSNVTQG 187

Query: 220 GVMLFPESKVKSKFWSNRRKKNNFLRPVKGNAVLFFSVHLNASPDKSNYHTRSPILNGEL 279
           G +LFPESK+K+  WS+    ++  RP+KGNA+LFF++H NASPDKS+ H R P+L GE+
Sbjct: 188 GEILFPESKLKNTIWSD-CTGSSIPRPIKGNAILFFTLHPNASPDKSSSHARCPVLEGEM 247

Query: 280 WVATKFFYLRPTTGNK-DKHTIESDVDGCIDEDKSCPQWAAIGECERNAVFMIGSPDYYG 339
           W ATKFF++R  +G K    +  SD  GCIDE ++CP+WAAIGEC+RN VFMIGSPDYYG
Sbjct: 248 WHATKFFHIRSISGEKVSPESDGSDDTGCIDEAENCPRWAAIGECQRNPVFMIGSPDYYG 307

BLAST of Cla97C01G004720 vs. Swiss-Prot
Match: sp|Q8GXT7|P4H12_ARATH (Probable prolyl 4-hydroxylase 12 OS=Arabidopsis thaliana OX=3702 GN=P4H12 PE=2 SV=1)

HSP 1 Score: 236.5 bits (602), Expect = 4.3e-61
Identity = 132/310 (42.58%), Positives = 190/310 (61.29%), Query Frame = 0

Query: 35  SRLHFLLLLSIAFSFSTCLAQSNLISGRKGLRDQLV-----ERPLSYSNHSGRIDPSRVV 94
           SR+  +L+++++ S     +  +    RK LRD+ +     +   SY   S  +DP+RV+
Sbjct: 5   SRIFLILMITMSSSSPPFCSGGS----RKELRDKEITSKSDDTQASYVLGSKFVDPTRVL 64

Query: 95  QVSWRPRVFLYKGFLSDEECDHLISLASNSEDNPSGNTVSTKVLKSSGVILNTTDDIIAR 154
           Q+SW PRVFLY+GFLS+EECDHLISL             +T+V           D ++A 
Sbjct: 65  QLSWLPRVFLYRGFLSEEECDHLISLRKE----------TTEVYSVDADGKTQLDPVVAG 124

Query: 155 IENRIALWTFLPKDHSMPFQIMQYRGEEAEHKY-FYGNRSAMSSSEPLMATVVLYLSDSA 214
           IE +++ WTFLP ++    ++  Y  E++  K  ++G   +    E L+ATVVLYLS++ 
Sbjct: 125 IEEKVSAWTFLPGENGGSIKVRSYTSEKSGKKLDYFGEEPSSVLHESLLATVVLYLSNTT 184

Query: 215 RGGVMLFPESKVKSKFWSNRRKKNNFLRPVKGNAVLFFSVHLNASPDKSNYHTRSPILNG 274
           +GG +LFP S++K K  ++  +  N LRPVKGNA+LFF+  LNAS D  + H R P++ G
Sbjct: 185 QGGELLFPNSEMKPK--NSCLEGGNILRPVKGNAILFFTRLLNASLDGKSTHLRCPVVKG 244

Query: 275 ELWVATKFFYLRPTTGNKDKHTIESDVDGCIDEDKSCPQWAAIGECERNAVFMIGSPDYY 334
           EL VATK  Y       K +  IE   + C DED++C +WA +GEC++N V+MIGSPDYY
Sbjct: 245 ELLVATKLIYA------KKQARIEESGE-CSDEDENCGRWAKLGECKKNPVYMIGSPDYY 291

Query: 335 GTCRKSCNAC 339
           GTCRKSCNAC
Sbjct: 305 GTCRKSCNAC 291

BLAST of Cla97C01G004720 vs. Swiss-Prot
Match: sp|Q8L970|P4H7_ARATH (Probable prolyl 4-hydroxylase 7 OS=Arabidopsis thaliana OX=3702 GN=P4H7 PE=2 SV=1)

HSP 1 Score: 218.0 bits (554), Expect = 1.6e-55
Identity = 123/323 (38.08%), Positives = 193/323 (59.75%), Query Frame = 0

Query: 33  MDSRLHFLLLLSIAFSFSTCL---AQSNLISGRKGLRDQLVERPLSYSNHSGRIDPSRVV 92
           MDSR+   L  S+ F F+  L   A +  ++     RD  V + +  S  S   DP+RV 
Sbjct: 1   MDSRI--FLAFSLCFLFTLPLISSAPNRFLTRSSNTRDGSVIK-MKTSASSFGFDPTRVT 60

Query: 93  QVSWRPRVFLYKGFLSDEECDHLISLA------SNSEDNPSGNTVSTKVLKSSGVILN-T 152
           Q+SW PRVFLY+GFLSDEECDH I LA      S   DN SG +V ++V  SSG+ L+  
Sbjct: 61  QLSWTPRVFLYEGFLSDEECDHFIKLAKGKLEKSMVADNDSGESVESEVRTSSGMFLSKR 120

Query: 153 TDDIIARIENRIALWTFLPKDHSMPFQIMQY-RGEEAE-HKYFYGNRSAMSSSEPLMATV 212
            DDI++ +E ++A WTFLP+++    QI+ Y  G++ E H  ++ +++ +      +ATV
Sbjct: 121 QDDIVSNVEAKLAAWTFLPEENGESMQILHYENGQKYEPHFDYFHDQANLELGGHRIATV 180

Query: 213 VLYLSDSARGGVMLFP-----ESKVKSKFWSNRRKKNNFLRPVKGNAVLFFSVHLNASPD 272
           ++YLS+  +GG  +FP      +++K   W+   K+   ++P KG+A+LFF++H NA+ D
Sbjct: 181 LMYLSNVEKGGETVFPMWKGKATQLKDDSWTECAKQGYAVKPRKGDALLFFNLHPNATTD 240

Query: 273 KSNYHTRSPILNGELWVATKFFYLRPTTGNKDKHTIESDVDGCIDEDKSCPQWAAIGECE 332
            ++ H   P++ GE W AT++ +++      +K +      GC+DE+ SC +WA  GEC+
Sbjct: 241 SNSLHGSCPVVEGEKWSATRWIHVKSFERAFNKQS------GCMDENVSCEKWAKAGECQ 300

Query: 333 RNAVFMIGSPDYYGTCRKSCNAC 339
           +N  +M+GS   +G CRKSC AC
Sbjct: 301 KNPTYMVGSDKDHGYCRKSCKAC 314

BLAST of Cla97C01G004720 vs. Swiss-Prot
Match: sp|F4J0A8|P4H6_ARATH (Probable prolyl 4-hydroxylase 6 OS=Arabidopsis thaliana OX=3702 GN=P4H6 PE=2 SV=1)

HSP 1 Score: 212.2 bits (539), Expect = 8.8e-54
Identity = 112/278 (40.29%), Positives = 165/278 (59.35%), Query Frame = 0

Query: 77  SNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLASNS-------EDNPSGNTVS 136
           S+ S  +DP+R+ Q+SW PR FLYKGFLSDEECDHLI LA           D  SG +  
Sbjct: 21  SSFSFSVDPTRITQLSWTPRAFLYKGFLSDEECDHLIKLAKGKLEKSMVVADVDSGESED 80

Query: 137 TKVLKSSGVIL-NTTDDIIARIENRIALWTFLPKDHSMPFQIMQYRG---EEAEHKYFYG 196
           ++V  SSG+ L    DDI+A +E ++A WTFLP+++    QI+ Y      +    YFY 
Sbjct: 81  SEVRTSSGMFLTKRQDDIVANVEAKLAAWTFLPEENGEALQILHYENGQKYDPHFDYFY- 140

Query: 197 NRSAMSSSEPLMATVVLYLSDSARGGVMLFPESK-----VKSKFWSNRRKKNNFLRPVKG 256
           ++ A+      +ATV++YLS+  +GG  +FP  K     +K   WS   K+   ++P KG
Sbjct: 141 DKKALELGGHRIATVLMYLSNVTKGGETVFPNWKGKTPQLKDDSWSKCAKQGYAVKPRKG 200

Query: 257 NAVLFFSVHLNASPDKSNYHTRSPILNGELWVATKFFYLRPTTGNKDKHTIESDVDGCID 316
           +A+LFF++HLN + D ++ H   P++ GE W AT++ ++R + G K           C+D
Sbjct: 201 DALLFFNLHLNGTTDPNSLHGSCPVIEGEKWSATRWIHVR-SFGKKKL--------VCVD 260

Query: 317 EDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 339
           + +SC +WA  GECE+N ++M+GS    G CRKSC AC
Sbjct: 261 DHESCQEWADAGECEKNPMYMVGSETSLGFCRKSCKAC 288

BLAST of Cla97C01G004720 vs. Swiss-Prot
Match: sp|F4JAU3|P4H2_ARATH (Prolyl 4-hydroxylase 2 OS=Arabidopsis thaliana OX=3702 GN=P4H2 PE=1 SV=1)

HSP 1 Score: 187.6 bits (475), Expect = 2.3e-46
Identity = 104/279 (37.28%), Positives = 171/279 (61.29%), Query Frame = 0

Query: 77  SNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLASNS------EDNPSGNTVST 136
           S+ S  I+PS+V QVS +PR F+Y+GFL+D ECDHLISLA  +       DN +G +  +
Sbjct: 27  SSPSSIINPSKVKQVSSKPRAFVYEGFLTDLECDHLISLAKENLQRSAVADNDNGESQVS 86

Query: 137 KVLKSSGVILNT-TDDIIARIENRIALWTFLPKDHSMPFQIMQY-RGEEAE-HKYFYGNR 196
            V  SSG  ++   D I++ IE++++ WTFLPK++    Q+++Y  G++ + H  ++ ++
Sbjct: 87  DVRTSSGTFISKGKDPIVSGIEDKLSTWTFLPKENGEDLQVLRYEHGQKYDAHFDYFHDK 146

Query: 197 SAMSSSEPLMATVVLYLSDSARGGVMLFPESKVKSK--------FWSNRRKKNNFLRPVK 256
             ++     +ATV+LYLS+  +GG  +FP+++  S+          S+  KK   ++P K
Sbjct: 147 VNIARGGHRIATVLLYLSNVTKGGETVFPDAQEFSRRSLSENKDDLSDCAKKGIAVKPKK 206

Query: 257 GNAVLFFSVHLNASPDKSNYHTRSPILNGELWVATKFFYLRPTTGNKDKHTIESDVDGCI 316
           GNA+LFF++  +A PD  + H   P++ GE W ATK+ ++     + DK  I +    C 
Sbjct: 207 GNALLFFNLQQDAIPDPFSLHGGCPVIEGEKWSATKWIHV----DSFDK--ILTHDGNCT 266

Query: 317 DEDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 339
           D ++SC +WA +GEC +N  +M+G+P+  G CR+SC AC
Sbjct: 267 DVNESCERWAVLGECGKNPEYMVGTPEIPGNCRRSCKAC 299

BLAST of Cla97C01G004720 vs. Swiss-Prot
Match: sp|Q8LAN3|P4H4_ARATH (Probable prolyl 4-hydroxylase 4 OS=Arabidopsis thaliana OX=3702 GN=P4H4 PE=2 SV=1)

HSP 1 Score: 184.9 bits (468), Expect = 1.5e-45
Identity = 99/280 (35.36%), Positives = 167/280 (59.64%), Query Frame = 0

Query: 77  SNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLASNS------EDNPSGNTVST 136
           S+ S  ++PS+V QVS +PR F+Y+GFL++ ECDH++SLA  S       DN SG +  +
Sbjct: 26  SSSSVFVNPSKVKQVSSKPRAFVYEGFLTELECDHMVSLAKASLKRSAVADNDSGESKFS 85

Query: 137 KVLKSSGVILNT-TDDIIARIENRIALWTFLPKDHSMPFQIMQY---RGEEAEHKYFYGN 196
           +V  SSG  ++   D I++ IE++I+ WTFLPK++    Q+++Y   +  +A   YF+  
Sbjct: 86  EVRTSSGTFISKGKDPIVSGIEDKISTWTFLPKENGEDIQVLRYEHGQKYDAHFDYFHDK 145

Query: 197 RSAMSSSEPLMATVVLYLSDSARGGVMLFPESKVKSK--------FWSNRRKKNNFLRPV 256
            + +      MAT+++YLS+  +GG  +FP++++ S+          S+  K+   ++P 
Sbjct: 146 VNIVRGGH-RMATILMYLSNVTKGGETVFPDAEIPSRRVLSENKEDLSDCAKRGIAVKPR 205

Query: 257 KGNAVLFFSVHLNASPDKSNYHTRSPILNGELWVATKFFYLRPTTGNKDKHTIESDVDGC 316
           KG+A+LFF++H +A PD  + H   P++ GE W ATK+ ++           I +    C
Sbjct: 206 KGDALLFFNLHPDAIPDPLSLHGGCPVIEGEKWSATKWIHV------DSFDRIVTPSGNC 265

Query: 317 IDEDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 339
            D ++SC +WA +GEC +N  +M+G+ +  G CR+SC AC
Sbjct: 266 TDMNESCERWAVLGECTKNPEYMVGTTELPGYCRRSCKAC 298

BLAST of Cla97C01G004720 vs. TAIR10
Match: AT4G25600.1 (Oxoglutarate/iron-dependent oxygenase)

HSP 1 Score: 236.5 bits (602), Expect = 2.4e-62
Identity = 132/310 (42.58%), Positives = 190/310 (61.29%), Query Frame = 0

Query: 35  SRLHFLLLLSIAFSFSTCLAQSNLISGRKGLRDQLV-----ERPLSYSNHSGRIDPSRVV 94
           SR+  +L+++++ S     +  +    RK LRD+ +     +   SY   S  +DP+RV+
Sbjct: 5   SRIFLILMITMSSSSPPFCSGGS----RKELRDKEITSKSDDTQASYVLGSKFVDPTRVL 64

Query: 95  QVSWRPRVFLYKGFLSDEECDHLISLASNSEDNPSGNTVSTKVLKSSGVILNTTDDIIAR 154
           Q+SW PRVFLY+GFLS+EECDHLISL             +T+V           D ++A 
Sbjct: 65  QLSWLPRVFLYRGFLSEEECDHLISLRKE----------TTEVYSVDADGKTQLDPVVAG 124

Query: 155 IENRIALWTFLPKDHSMPFQIMQYRGEEAEHKY-FYGNRSAMSSSEPLMATVVLYLSDSA 214
           IE +++ WTFLP ++    ++  Y  E++  K  ++G   +    E L+ATVVLYLS++ 
Sbjct: 125 IEEKVSAWTFLPGENGGSIKVRSYTSEKSGKKLDYFGEEPSSVLHESLLATVVLYLSNTT 184

Query: 215 RGGVMLFPESKVKSKFWSNRRKKNNFLRPVKGNAVLFFSVHLNASPDKSNYHTRSPILNG 274
           +GG +LFP S++K K  ++  +  N LRPVKGNA+LFF+  LNAS D  + H R P++ G
Sbjct: 185 QGGELLFPNSEMKPK--NSCLEGGNILRPVKGNAILFFTRLLNASLDGKSTHLRCPVVKG 244

Query: 275 ELWVATKFFYLRPTTGNKDKHTIESDVDGCIDEDKSCPQWAAIGECERNAVFMIGSPDYY 334
           EL VATK  Y       K +  IE   + C DED++C +WA +GEC++N V+MIGSPDYY
Sbjct: 245 ELLVATKLIYA------KKQARIEESGE-CSDEDENCGRWAKLGECKKNPVYMIGSPDYY 291

Query: 335 GTCRKSCNAC 339
           GTCRKSCNAC
Sbjct: 305 GTCRKSCNAC 291

BLAST of Cla97C01G004720 vs. TAIR10
Match: AT3G28490.1 (Oxoglutarate/iron-dependent oxygenase)

HSP 1 Score: 212.2 bits (539), Expect = 4.9e-55
Identity = 112/278 (40.29%), Positives = 165/278 (59.35%), Query Frame = 0

Query: 77  SNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLASNS-------EDNPSGNTVS 136
           S+ S  +DP+R+ Q+SW PR FLYKGFLSDEECDHLI LA           D  SG +  
Sbjct: 21  SSFSFSVDPTRITQLSWTPRAFLYKGFLSDEECDHLIKLAKGKLEKSMVVADVDSGESED 80

Query: 137 TKVLKSSGVIL-NTTDDIIARIENRIALWTFLPKDHSMPFQIMQYRG---EEAEHKYFYG 196
           ++V  SSG+ L    DDI+A +E ++A WTFLP+++    QI+ Y      +    YFY 
Sbjct: 81  SEVRTSSGMFLTKRQDDIVANVEAKLAAWTFLPEENGEALQILHYENGQKYDPHFDYFY- 140

Query: 197 NRSAMSSSEPLMATVVLYLSDSARGGVMLFPESK-----VKSKFWSNRRKKNNFLRPVKG 256
           ++ A+      +ATV++YLS+  +GG  +FP  K     +K   WS   K+   ++P KG
Sbjct: 141 DKKALELGGHRIATVLMYLSNVTKGGETVFPNWKGKTPQLKDDSWSKCAKQGYAVKPRKG 200

Query: 257 NAVLFFSVHLNASPDKSNYHTRSPILNGELWVATKFFYLRPTTGNKDKHTIESDVDGCID 316
           +A+LFF++HLN + D ++ H   P++ GE W AT++ ++R + G K           C+D
Sbjct: 201 DALLFFNLHLNGTTDPNSLHGSCPVIEGEKWSATRWIHVR-SFGKKKL--------VCVD 260

Query: 317 EDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 339
           + +SC +WA  GECE+N ++M+GS    G CRKSC AC
Sbjct: 261 DHESCQEWADAGECEKNPMYMVGSETSLGFCRKSCKAC 288

BLAST of Cla97C01G004720 vs. TAIR10
Match: AT3G28480.2 (Oxoglutarate/iron-dependent oxygenase)

HSP 1 Score: 209.9 bits (533), Expect = 2.4e-54
Identity = 123/331 (37.16%), Positives = 192/331 (58.01%), Query Frame = 0

Query: 33  MDSRLHFLLLLSIAFSFSTCL---AQSNLISGRKGLRDQLVERPLSYSNHSGRIDPSRVV 92
           MDSR+   L  S+ F F+  L   A +  ++     RD  V + +  S  S   DP+RV 
Sbjct: 1   MDSRI--FLAFSLCFLFTLPLISSAPNRFLTRSSNTRDGSVIK-MKTSASSFGFDPTRVT 60

Query: 93  QVSWRPRVFLYKGFLSDEECDHLISLA------SNSEDNPSGNTVSTK-----VLKSSGV 152
           Q+SW PRVFLY+GFLSDEECDH I LA      S   DN SG +V ++     V +SS  
Sbjct: 61  QLSWTPRVFLYEGFLSDEECDHFIKLAKGKLEKSMVADNDSGESVESEDSVSVVRQSSSF 120

Query: 153 ILN----TTDDIIARIENRIALWTFLPKDHSMPFQIMQY-RGEEAE-HKYFYGNRSAMSS 212
           I N      DDI++ +E ++A WTFLP+++    QI+ Y  G++ E H  ++ +++ +  
Sbjct: 121 IANMDSLEIDDIVSNVEAKLAAWTFLPEENGESMQILHYENGQKYEPHFDYFHDQANLEL 180

Query: 213 SEPLMATVVLYLSDSARGGVMLFP-----ESKVKSKFWSNRRKKNNFLRPVKGNAVLFFS 272
               +ATV++YLS+  +GG  +FP      +++K   W+   K+   ++P KG+A+LFF+
Sbjct: 181 GGHRIATVLMYLSNVEKGGETVFPMWKGKATQLKDDSWTECAKQGYAVKPRKGDALLFFN 240

Query: 273 VHLNASPDKSNYHTRSPILNGELWVATKFFYLRPTTGNKDKHTIESDVDGCIDEDKSCPQ 332
           +H NA+ D ++ H   P++ GE W AT++ +++      +K +      GC+DE+ SC +
Sbjct: 241 LHPNATTDSNSLHGSCPVVEGEKWSATRWIHVKSFERAFNKQS------GCMDENVSCEK 300

Query: 333 WAAIGECERNAVFMIGSPDYYGTCRKSCNAC 339
           WA  GEC++N  +M+GS   +G CRKSC AC
Sbjct: 301 WAKAGECQKNPTYMVGSDKDHGYCRKSCKAC 322

BLAST of Cla97C01G004720 vs. TAIR10
Match: AT3G06300.1 (P4H isoform 2)

HSP 1 Score: 187.6 bits (475), Expect = 1.3e-47
Identity = 104/279 (37.28%), Positives = 171/279 (61.29%), Query Frame = 0

Query: 77  SNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLASNS------EDNPSGNTVST 136
           S+ S  I+PS+V QVS +PR F+Y+GFL+D ECDHLISLA  +       DN +G +  +
Sbjct: 27  SSPSSIINPSKVKQVSSKPRAFVYEGFLTDLECDHLISLAKENLQRSAVADNDNGESQVS 86

Query: 137 KVLKSSGVILNT-TDDIIARIENRIALWTFLPKDHSMPFQIMQY-RGEEAE-HKYFYGNR 196
            V  SSG  ++   D I++ IE++++ WTFLPK++    Q+++Y  G++ + H  ++ ++
Sbjct: 87  DVRTSSGTFISKGKDPIVSGIEDKLSTWTFLPKENGEDLQVLRYEHGQKYDAHFDYFHDK 146

Query: 197 SAMSSSEPLMATVVLYLSDSARGGVMLFPESKVKSK--------FWSNRRKKNNFLRPVK 256
             ++     +ATV+LYLS+  +GG  +FP+++  S+          S+  KK   ++P K
Sbjct: 147 VNIARGGHRIATVLLYLSNVTKGGETVFPDAQEFSRRSLSENKDDLSDCAKKGIAVKPKK 206

Query: 257 GNAVLFFSVHLNASPDKSNYHTRSPILNGELWVATKFFYLRPTTGNKDKHTIESDVDGCI 316
           GNA+LFF++  +A PD  + H   P++ GE W ATK+ ++     + DK  I +    C 
Sbjct: 207 GNALLFFNLQQDAIPDPFSLHGGCPVIEGEKWSATKWIHV----DSFDK--ILTHDGNCT 266

Query: 317 DEDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 339
           D ++SC +WA +GEC +N  +M+G+P+  G CR+SC AC
Sbjct: 267 DVNESCERWAVLGECGKNPEYMVGTPEIPGNCRRSCKAC 299

BLAST of Cla97C01G004720 vs. TAIR10
Match: AT5G18900.1 (2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein)

HSP 1 Score: 184.9 bits (468), Expect = 8.3e-47
Identity = 99/280 (35.36%), Positives = 167/280 (59.64%), Query Frame = 0

Query: 77  SNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLASNS------EDNPSGNTVST 136
           S+ S  ++PS+V QVS +PR F+Y+GFL++ ECDH++SLA  S       DN SG +  +
Sbjct: 26  SSSSVFVNPSKVKQVSSKPRAFVYEGFLTELECDHMVSLAKASLKRSAVADNDSGESKFS 85

Query: 137 KVLKSSGVILNT-TDDIIARIENRIALWTFLPKDHSMPFQIMQY---RGEEAEHKYFYGN 196
           +V  SSG  ++   D I++ IE++I+ WTFLPK++    Q+++Y   +  +A   YF+  
Sbjct: 86  EVRTSSGTFISKGKDPIVSGIEDKISTWTFLPKENGEDIQVLRYEHGQKYDAHFDYFHDK 145

Query: 197 RSAMSSSEPLMATVVLYLSDSARGGVMLFPESKVKSK--------FWSNRRKKNNFLRPV 256
            + +      MAT+++YLS+  +GG  +FP++++ S+          S+  K+   ++P 
Sbjct: 146 VNIVRGGH-RMATILMYLSNVTKGGETVFPDAEIPSRRVLSENKEDLSDCAKRGIAVKPR 205

Query: 257 KGNAVLFFSVHLNASPDKSNYHTRSPILNGELWVATKFFYLRPTTGNKDKHTIESDVDGC 316
           KG+A+LFF++H +A PD  + H   P++ GE W ATK+ ++           I +    C
Sbjct: 206 KGDALLFFNLHPDAIPDPLSLHGGCPVIEGEKWSATKWIHV------DSFDRIVTPSGNC 265

Query: 317 IDEDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 339
            D ++SC +WA +GEC +N  +M+G+ +  G CR+SC AC
Sbjct: 266 TDMNESCERWAVLGECTKNPEYMVGTTELPGYCRRSCKAC 298
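
The Identity and Positives figures in each HSP header are match counts over the alignment length, so the printed percentages can be rechecked directly. The following is a minimal Python sketch, assuming the header layout shown in this record; the name `parse_hsp_stats` is hypothetical and not part of any BLAST tool.

```python
import re

# Matches the statistics line of an HSP header as printed in this record.
# "Pos\w+" tolerates spelling variants of the "Positives" label.
HSP_STATS = re.compile(
    r"Identity = (\d+)/(\d+) \([\d.]+%\), Pos\w+ = (\d+)/(\d+) \([\d.]+%\)"
)


def parse_hsp_stats(line):
    """Return identity and positives percentages recomputed from the counts."""
    m = HSP_STATS.search(line)
    if m is None:
        raise ValueError("not an HSP statistics line: %r" % line)
    ident, ident_len, pos, pos_len = (int(g) for g in m.groups())
    return {
        "identity_pct": round(100.0 * ident / ident_len, 2),
        "positives_pct": round(100.0 * pos / pos_len, 2),
    }


if __name__ == "__main__":
    header = "Identity = 104/279 (37.28%), Positives = 171/279 (61.29%), Query Frame = 0"
    print(parse_hsp_stats(header))
```

For the AT3G06300.1 HSP, the counts 104/279 and 171/279 recompute to the percentages printed in its header.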

The following BLAST results are available for this feature:
Match Name      E-value   Identity  Description
XP_008436994.1  2.9e-154  87.54     PREDICTED: probable prolyl 4-hydroxylase 12 [Cucumis melo]
XP_004152378.1  6.4e-154  86.90     PREDICTED: probable prolyl 4-hydroxylase 12 [Cucumis sativus]
XP_022159842.1  5.4e-145  84.35     probable prolyl 4-hydroxylase 12 [Momordica charantia]
KGN50302.1      2.3e-143  86.60     hypothetical protein Csa_5G166460 [Cucumis sativus]
XP_023549812.1  4.5e-139  80.38     probable prolyl 4-hydroxylase 12 [Cucurbita pepo subsp. pepo]
Match Name                      E-value   Identity  Description
tr|A0A1S3AT39|A0A1S3AT39_CUCME  1.9e-154  87.54     probable prolyl 4-hydroxylase 12 OS=Cucumis melo OX=3656 GN=LOC103482556 PE=4 SV...
tr|A0A0A0KPE4|A0A0A0KPE4_CUCSA  1.5e-143  86.60     Uncharacterized protein OS=Cucumis sativus OX=3659 GN=Csa_5G166460 PE=4 SV=1
tr|A0A2P4NAB4|A0A2P4NAB4_QUESU  6.0e-92   55.52     Putative prolyl 4-hydroxylase 12 OS=Quercus suber OX=58331 GN=CFP56_60861 PE=4 S...
tr|A0A1R3HDG6|A0A1R3HDG6_COCAP  2.4e-88   56.60     Metridin-like ShK toxin OS=Corchorus capsularis OX=210143 GN=CCACVL1_19977 PE=4 ...
tr|A0A2I4HT83|A0A2I4HT83_9ROSI  1.2e-87   54.37     probable prolyl 4-hydroxylase 12 isoform X2 OS=Juglans regia OX=51240 GN=LOC1090...
Match Name             E-value   Identity  Description
sp|Q8GXT7|P4H12_ARATH  4.3e-61   42.58     Probable prolyl 4-hydroxylase 12 OS=Arabidopsis thaliana OX=3702 GN=P4H12 PE=2 S...
sp|Q8L970|P4H7_ARATH   1.6e-55   38.08     Probable prolyl 4-hydroxylase 7 OS=Arabidopsis thaliana OX=3702 GN=P4H7 PE=2 SV=...
sp|F4J0A8|P4H6_ARATH   8.8e-54   40.29     Probable prolyl 4-hydroxylase 6 OS=Arabidopsis thaliana OX=3702 GN=P4H6 PE=2 SV=...
sp|F4JAU3|P4H2_ARATH   2.3e-46   37.28     Prolyl 4-hydroxylase 2 OS=Arabidopsis thaliana OX=3702 GN=P4H2 PE=1 SV=1
sp|Q8LAN3|P4H4_ARATH   1.5e-45   35.36     Probable prolyl 4-hydroxylase 4 OS=Arabidopsis thaliana OX=3702 GN=P4H4 PE=2 SV=...
Match Name   E-value   Identity  Description
AT4G25600.1  2.4e-62   42.58     Oxoglutarate/iron-dependent oxygenase
AT3G28490.1  4.9e-55   40.29     Oxoglutarate/iron-dependent oxygenase
AT3G28480.2  2.4e-54   37.16     Oxoglutarate/iron-dependent oxygenase
AT3G06300.1  1.3e-47   37.28     P4H isoform 2
AT5G18900.1  8.3e-47   35.36     2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein
The following terms have been associated with this gene:
Vocabulary: Molecular Function
Term        Definition
GO:0016705  oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen
GO:0005506  iron ion binding
GO:0031418  L-ascorbic acid binding
Vocabulary: Biological Process
Term        Definition
GO:0055114  oxidation-reduction process
Vocabulary: INTERPRO
Term       Definition
IPR003582  ShKT_dom
IPR006620  Pro_4_hyd_alph
GO Assignments
This gene is annotated with the following GO terms.
Category            Term Accession  Term Name
biological_process  GO:0019511      peptidyl-proline hydroxylation
biological_process  GO:0055114      oxidation-reduction process
biological_process  GO:0018401      peptidyl-proline hydroxylation to 4-hydroxy-L-proline
cellular_component  GO:0005575      cellular_component
cellular_component  GO:0005783      endoplasmic reticulum
molecular_function  GO:0005506      iron ion binding
molecular_function  GO:0031418      L-ascorbic acid binding
molecular_function  GO:0016491      oxidoreductase activity
molecular_function  GO:0016705      oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen
molecular_function  GO:0004656      procollagen-proline 4-dioxygenase activity

The following mRNA feature(s) are a part of this gene:

Feature Name       Unique Name        Type
Cla97C01G004720.1  Cla97C01G004720.1  mRNA


Analysis Name: InterPro Annotations of watermelon 97103 v2
Date Performed: 2019-05-12
IPR Term   IPR Description                      Source   Source Term         Source Description                  Alignment
IPR006620  Prolyl 4-hydroxylase, alpha subunit  SMART    SM00702             p4hc                                coord: 95..278; e-value: 6.0E-13; score: 59.0
IPR003582  ShKT domain                          SMART    SM00254             ShkT_1                              coord: 297..338; e-value: 1.7E-4; score: 30.9
IPR003582  ShKT domain                          PROSITE  PS51670             SHKT                                coord: 298..338; score: 8.53
None       No IPR available                     GENE3D   G3DSA:2.60.120.620                                      coord: 87..278; e-value: 1.3E-38; score: 134.9
None       No IPR available                     PANTHER  PTHR10869:SF102     PROLYL 4-HYDROXYLASE 12-RELATED     coord: 40..338
None       No IPR available                     PANTHER  PTHR10869           PROLYL 4-HYDROXYLASE ALPHA SUBUNIT  coord: 40..338
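
The coord values in the InterPro annotations give start and end residue positions of each match on the protein. The following is a minimal Python sketch of turning them into match lengths, assuming the positions are 1-based and inclusive (the usual convention for InterProScan output, but not stated on this page); `span_length` is a hypothetical helper.

```python
def span_length(coord):
    """Length of a 'start..end' span, treating both ends as inclusive."""
    start_text, end_text = coord.split("..")
    start, end = int(start_text), int(end_text)
    if end < start:
        raise ValueError("end before start in span: %r" % coord)
    return end - start + 1


# Coordinates copied from the InterPro annotation rows of this record.
domains = {
    "SM00702 (p4hc)": "95..278",
    "SM00254 (ShkT_1)": "297..338",
    "PTHR10869": "40..338",
}

for name, coord in domains.items():
    print(name, span_length(coord))
```

Under that assumption, the SMART p4hc match and the ShKT match do not overlap, since the first ends at residue 278 and the second begins at 297.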

The following gene(s) are paralogous to this gene:

None

The following block(s) are covering this gene:
Gene             Organism                      Block
Cla97C01G004720  Cucurbita maxima (Rimu)       cmawmbB288
Cla97C01G004720  Cucurbita moschata (Rifu)     cmowmbB270
Cla97C01G004720  Watermelon (Charleston Gray)  wcgwmbB089
Cla97C01G004720  Watermelon (97103) v1         wmwmbB245