Cp4.1LG01g23040 (gene) Cucurbita pepo (MU‐CU‐16) v4.1

Overview
Name: Cp4.1LG01g23040
Type: gene
Organism: Cucurbita pepo (Cucurbita pepo (MU‐CU‐16) v4.1)
Description: Procollagen-proline 4-dioxygenase
Location: Cp4.1LG01: 19850463 .. 19853618 (-)
RNA-Seq Expression: Cp4.1LG01g23040
Synteny: Cp4.1LG01g23040
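
The location field packs the linkage group, start, end, and strand into one string, so the gene span can be derived from it. The following is a minimal Python sketch, assuming the coordinates are 1-based and inclusive (GFF-style), which this page does not state explicitly. The `Location` type and `parse_location` helper are hypothetical names used for illustration, not part of any database tool.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """A parsed feature location (illustrative type, not a database API)."""
    seqid: str
    start: int
    end: int
    strand: str

    @property
    def length(self) -> int:
        # Assumes 1-based, inclusive coordinates.
        return self.end - self.start + 1


# Matches strings such as "Cp4.1LG01: 19850463 .. 19853618 (-)".
_LOC_RE = re.compile(r"^\s*(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*$")


def parse_location(text: str) -> Location:
    """Split a 'seqid: start .. end (strand)' string into its parts."""
    match = _LOC_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    seqid, start, end, strand = match.groups()
    return Location(seqid, int(start), int(end), strand)


loc = parse_location("Cp4.1LG01: 19850463 .. 19853618 (-)")
print(loc.seqid, loc.length, loc.strand)  # Cp4.1LG01 3156 -
```

Under that assumption the gene spans 3156 bp on the minus strand of Cp4.1LG01.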
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

GGCGGATCTGCACTTCGGTTGAAGGGGAGTTCATCTCCGCTGATTTTCGATCCAACTCGAGTCACTCAGCTCTCCTGGCAACCTAGGTCATCTCTCTTTCTCACAATTCGCTTCTCATATACTCTCAAACCTCTCTCTAATCAAATAGTGAGGCTCACGCCAATATGTAACATTGGTTAACTATGTGTTTGACTGTTACAAATGATATCGGAGCCAAACACTAAACGGTGGCTCTCAAAGGGAGTGTATTATGAGATACCACATCGGTTAGAGAGCGGAATGAAGCATTCCTTAAAAGGTGTGGAAACCTCTCACTAGTTGACGCGTTTTAAAACCGTGAGACTAACGATAATAGATAACTAGCCAAAGCGAACAACATCTGTTAGGTGTGAGTTTGAGCGGTTAGAAATGGTATAAAAGCTAGATACCGAGTCAGTGTGTCAGCTAGACGTTGGCCCCTAAGGAATCGGATTGTGAGATCCCATGTCGGTTAGCTAGATACTGAGTCAGTGTGTCAGCTAGACGTTGGCCCCCAAGGAATCGGATTGTGAGATCCCACGTCGGTTGTGTTGGTGACAACAGATAAATTTGGTAGAATTCTTTGAGAGAAAAGAACTTTGAACAACTACTAAGAAATGCTTGGTTTGATTTTCTATGCCTTCCACCAATGCCAACAAACATATATATATATATATATATATATATATATATATATATATATATATTAACACATGTATCTATTTATACATGTATCTACTAGCGACCCTTCAACACATCATATATCTACTAGTGACCCTTCAACCGGTTGGAGAGGGGAACAAAACATTTCTTAAAGAGAGTGTGAAAACCTCTCCCTAGCAAACGCGTTTCAAAACGGTAAAGCTGACGGCAATACGTAACGAGCAAAAGCTAACAATATCTATTAGCTATGAGTTTGGACGGTTGCAAATAGTATTAGAGCCAAACACCGGTGTGTTGGAGAGGGGAACGAAACATTTCTTAAATCCGTGGAAACCTTTTCCTAGCAAACACATATTTTAAAACGGTGAAGTTGACAACTATACATAACCAGCCAAAGTTAACAATAACTATACATAACCAGCCAAAGTTAACGATATCTATTATAGCTGTAGGCTTATACAGTTACATACCCAAACCACATTTTAGTGGCTTTAGTTTTTTTTACTGGGGATGTGTTTGTGCAAATTACAGGGCATTATTATATAAGGGATTTCTATCTGATAAGGAATGCGATCACCTAATCAATCTGGTGAGTGAATAAGGGAATAATTTAGATTTGAATTCAACCGCGTGGTAACTGTAAAATGTTTGTTTGACTGTATCATTTTTTGGGAATAGTCAAGGGGAAGGTTAGAGAGGTCGATGGTAGCAGATAATAAGTCCGGTAAGAGAGTAAGTAGTAAAGTCCGGACGAGCTCCGGCACGTTCGTGCTGAAGGGGCAGGATGAAATAATTGCTGCCATTGAAGCCAGAATTGCGGCATGGACATTCCTTCCACTAGGTATATATGTATATGTTATATGTTATATGTATATTCTGGTTATGAAAGTTAGCTTTGTTTGTTGAAACTGTTGGTAAAATTCAGAAAATGGAGAGCCAATTCAAATTCTGCACTATGAGAATGGTGAGAAGTATGAACCGCATTTTGATTTTTTTGTGGACGAGGTGAATAAGGAGTTGGGTGGCCACCGAGTAGCCACAGTTTTGATGTATTTATCCAATGTTGAGAAGGGTGGAGAGACCGTGTTTCCACATTCAGAGGTATATATATATATATATATATATGCCACTGCCTCTGCAGCGTCTCTCTTCTTTATCTCTCTGAAAATTAATAGGCATGCGTTGTTCCTCTGTGTTTATGCAGTTTAAGGAGTCTCAAGAAAAGGATGATAGCTGGTCTGATTGTGCTCGAATGGGTTATGCAGGTACTATTTGTAATTTTCTTTTTTAATCCCATCAACTCTTAGTTAAACTTCCTA
GATGTGTTTTAAAAATTTGAGAGAAACTCGAAAGTGATAGTTCAAAAAGGACAATATCGATTAGGGGTATATTTGGGCTGTTTTAACAACTTGAACGAAAATTGGGAAGAGATAATTCAAAGAGAACAAGATCTACTAGCTGTATACAATTAAAATATTTGAATTTGATTAGTTTGAGTTAGTTCCTGTCGTTTATTCAATTTCTTTATTAAAAAAATTAGTTTATATTGAATTAAAATTTATAATTGGTTGACATTTTGCATTAAAAATACTAAAAAAACAGTTTTTTTTTTTGTTAGAATTTTCTTTTAAAAAAATTATAGAATTAAATAAAATTTAAATAAATAGAAATATCTAATTCAACCTAATCCATATAATTTTATTTATTTAAATTCAATCCAACTCAAATCAAACCATATCAATTTTTTTTAACATCATTTTTTACCAATGTTTTCTAAATTGAAGCTGAATTTATTTTTTTTTAATTTCAACTCATTTTTGTTTATTGGATTTAGCCAAGAATTCAATGTGTTCTAAAGAAAAGACACAATTTTTCATTATGAAGTTGTATTCTATTTCTGGTGTCCATGTGTTCTAAAGCTTGGGGGTGATGGTTGTTGCAGTTAAACCGCAGAAGGGTGATGCATTGCTGTTCTTCAGCCTCAATGTGGATGGAACCACAAATCCGAGAAGCATGCACGGTAGCTGCCCTGTAATTGAGGGTGAGAAATGGAGTGCAACCAAATGGATTCACGTCAGATCCTTTGATAACCCAATTCTCCCAAACAAGGGCTGCATGGACTTCAACGAAAATTGCCCTTTGTGGGCCGAAAACGGTGAGTGCAAAAACAACCCCAGGTACATGCTGGGCTCTGAAACTGCTTCAGGATACTGTAGGAAGAGTTGCCAAGCCTGCTAAACTACAAACTACAACAACCAAGTCTCCATTCTGTCATGGTTATCATGTATATAACATTCGACACTAACTAGATATAGCTTTGCTTCTGTTCCGTTATACAAATAAGGTGGGTGGTCAAATGGATATTATATCTGTTTCCATAAATTGTATCTTTTGCTTCACTCCAACGATATAAATTTCAATTCATCTGAGATGCTTATTTAACCATTGAAGTTGGTTTTTATGAGACCCGCATAC

mRNA sequence

GGCGGATCTGCACTTCGGTTGAAGGGGAGTTCATCTCCGCTGATTTTCGATCCAACTCGAGGATTTCTATCTGATAAGGAATGCGATCACCTAATCAATCTGTCAAGGGGAAGGTTAGAGAGGTCGATGGTAGCAGATAATAAGTCCGGTAAGAGAGTAAGTAGTAAAGTCCGGACGAGCTCCGGCACGTTCGTGCTGAAGGGGCAGGATGAAATAATTGCTGCCATTGAAGCCAGAATTGCGGCATGGACATTCCTTCCACTAGAAAATGGAGAGCCAATTCAAATTCTGCACTATGAGAATGGTGAGAAGTATGAACCGCATTTTGATTTTTTTGTGGACGAGGTGAATAAGGAGTTGGGTGGCCACCGAGTAGCCACAGTTTTGATGTATTTATCCAATGTTGAGAAGGGTGGAGAGACCGTGTTTCCACATTCAGAGTTTAAGGAGTCTCAAGAAAAGGATGATAGCTGGTCTGATTGTGCTCGAATGGTTAAACCGCAGAAGGGTGATGCATTGCTGTTCTTCAGCCTCAATGTGGATGGAACCACAAATCCGAGAAGCATGCACGGTAGCTGCCCTGTAATTGAGGGTGAGAAATGGAGTGCAACCAAATGGATTCACGTCAGATCCTTTGATAACCCAATTCTCCCAAACAAGGGCTGCATGGACTTCAACGAAAATTGCCCTTTGTGGGCCGAAAACGGTGAGTGCAAAAACAACCCCAGGTACATGCTGGGCTCTGAAACTGCTTCAGGATACTGTAGGAAGAGTTGCCAAGCCTGCTAAACTACAAACTACAACAACCAAGTCTCCATTCTGTCATGGTTATCATGTATATAACATTCGACACTAACTAGATATAGCTTTGCTTCTGTTCCGTTATACAAATAAGGTGGGTGGTCAAATGGATATTATATCTGTTTCCATAAATTGTATCTTTTGCTTCACTCCAACGATATAAATTTCAATTCATCTGAGATGCTTATTTAACCATTGAAGTTGGTTTTTATGAGACCCGCATAC

Coding sequence (CDS)

GGCGGATCTGCACTTCGGTTGAAGGGGAGTTCATCTCCGCTGATTTTCGATCCAACTCGAGGATTTCTATCTGATAAGGAATGCGATCACCTAATCAATCTGTCAAGGGGAAGGTTAGAGAGGTCGATGGTAGCAGATAATAAGTCCGGTAAGAGAGTAAGTAGTAAAGTCCGGACGAGCTCCGGCACGTTCGTGCTGAAGGGGCAGGATGAAATAATTGCTGCCATTGAAGCCAGAATTGCGGCATGGACATTCCTTCCACTAGAAAATGGAGAGCCAATTCAAATTCTGCACTATGAGAATGGTGAGAAGTATGAACCGCATTTTGATTTTTTTGTGGACGAGGTGAATAAGGAGTTGGGTGGCCACCGAGTAGCCACAGTTTTGATGTATTTATCCAATGTTGAGAAGGGTGGAGAGACCGTGTTTCCACATTCAGAGTTTAAGGAGTCTCAAGAAAAGGATGATAGCTGGTCTGATTGTGCTCGAATGGTTAAACCGCAGAAGGGTGATGCATTGCTGTTCTTCAGCCTCAATGTGGATGGAACCACAAATCCGAGAAGCATGCACGGTAGCTGCCCTGTAATTGAGGGTGAGAAATGGAGTGCAACCAAATGGATTCACGTCAGATCCTTTGATAACCCAATTCTCCCAAACAAGGGCTGCATGGACTTCAACGAAAATTGCCCTTTGTGGGCCGAAAACGGTGAGTGCAAAAACAACCCCAGGTACATGCTGGGCTCTGAAACTGCTTCAGGATACTGTAGGAAGAGTTGCCAAGCCTGCTAA

Protein sequence

GGSALRLKGSSSPLIFDPTRGFLSDKECDHLINLSRGRLERSMVADNKSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYEPHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARMVKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCMDFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC
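
The protein sequence above corresponds to a frame-0 translation of the CDS with the standard genetic code. The sketch below illustrates this on the first 21 and the last 12 nucleotides of the CDS only. The `translate` helper and the compact codon table are written for this example and are not the database's own pipeline. Note that the listed CDS begins with GGC rather than ATG, so the first residue is G.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(cds: str) -> str:
    """Translate a DNA string in frame 0; '*' marks a stop codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# First 21 nt and last 12 nt of the CDS listed above.
print(translate("GGCGGATCTGCACTTCGGTTG"))  # GGSALRL
print(translate("CAAGCCTGCTAA"))           # QAC*
```

The two fragments reproduce the start (`GGSALRL`) and the end (`QAC`, followed by the stop codon) of the protein sequence.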
Homology
BLAST of Cp4.1LG01g23040 vs. ExPASy Swiss-Prot
Match: Q8L970 (Probable prolyl 4-hydroxylase 7 OS=Arabidopsis thaliana OX=3702 GN=P4H7 PE=2 SV=1)

HSP 1 Score: 366.7 bits (940), Expect = 2.2e-100
Identity = 171/278 (61.51%), Positives = 215/278 (77.34%), Query Frame = 0

Query: 2   GSALRLKGSSSPLIFDPTR--------------GFLSDKECDHLINLSRGRLERSMVADN 61
           GS +++K S+S   FDPTR              GFLSD+ECDH I L++G+LE+SMVADN
Sbjct: 37  GSVIKMKTSASSFGFDPTRVTQLSWTPRVFLYEGFLSDEECDHFIKLAKGKLEKSMVADN 96

Query: 62  KSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYEP 121
            SG+ V S+VRTSSG F+ K QD+I++ +EA++AAWTFLP ENGE +QILHYENG+KYEP
Sbjct: 97  DSGESVESEVRTSSGMFLSKRQDDIVSNVEAKLAAWTFLPEENGESMQILHYENGQKYEP 156

Query: 122 HFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM--- 181
           HFD+F D+ N ELGGHR+ATVLMYLSNVEKGGETVFP  + K +Q KDDSW++CA+    
Sbjct: 157 HFDYFHDQANLELGGHRIATVLMYLSNVEKGGETVFPMWKGKATQLKDDSWTECAKQGYA 216

Query: 182 VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCMD 241
           VKP+KGDALLFF+L+ + TT+  S+HGSCPV+EGEKWSAT+WIHV+SF+       GCMD
Sbjct: 217 VKPRKGDALLFFNLHPNATTDSNSLHGSCPVVEGEKWSATRWIHVKSFERAFNKQSGCMD 276

Query: 242 FNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 263
            N +C  WA+ GEC+ NP YM+GS+   GYCRKSC+AC
Sbjct: 277 ENVSCEKWAKAGECQKNPTYMVGSDKDHGYCRKSCKAC 314
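
The percentages on the HSP summary line are the match counts divided by the alignment length, which includes gap columns. A short Python sketch of that arithmetic, using the counts from the HSP above; `hsp_percentages` is a hypothetical helper, and the regular expression simply picks out the two `matches/length` pairs from the line.

```python
import re


def hsp_percentages(summary_line: str) -> tuple[float, float]:
    """Return (identity %, positives %) from a BLAST HSP summary line."""
    pairs = re.findall(r"(\d+)/(\d+)", summary_line)
    if len(pairs) < 2:
        raise ValueError("expected two 'matches/length' pairs")
    (ident, ident_len), (pos, pos_len) = pairs[:2]
    return (
        round(100.0 * int(ident) / int(ident_len), 2),
        round(100.0 * int(pos) / int(pos_len), 2),
    )


line = "Identity = 171/278 (61.51%), Positives = 215/278 (77.34%), Query Frame = 0"
identity, positives = hsp_percentages(line)
print(f"{identity:.2f} {positives:.2f}")  # 61.51 77.34
```

The same calculation applies to every HSP on this page; only the counts differ.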

BLAST of Cp4.1LG01g23040 vs. ExPASy Swiss-Prot
Match: F4J0A8 (Probable prolyl 4-hydroxylase 6 OS=Arabidopsis thaliana OX=3702 GN=P4H6 PE=2 SV=1)

HSP 1 Score: 345.1 bits (884), Expect = 6.8e-94
Identity = 163/247 (65.99%), Positives = 203/247 (82.19%), Query Frame = 0

Query: 20  RGFLSDKECDHLINLSRGRLERSM-VADNKSGKRVSSKVRTSSGTFVLKGQDEIIAAIEA 79
           +GFLSD+ECDHLI L++G+LE+SM VAD  SG+   S+VRTSSG F+ K QD+I+A +EA
Sbjct: 45  KGFLSDEECDHLIKLAKGKLEKSMVVADVDSGESEDSEVRTSSGMFLTKRQDDIVANVEA 104

Query: 80  RIAAWTFLPLENGEPIQILHYENGEKYEPHFDFFVDEVNKELGGHRVATVLMYLSNVEKG 139
           ++AAWTFLP ENGE +QILHYENG+KY+PHFD+F D+   ELGGHR+ATVLMYLSNV KG
Sbjct: 105 KLAAWTFLPEENGEALQILHYENGQKYDPHFDYFYDKKALELGGHRIATVLMYLSNVTKG 164

Query: 140 GETVFPHSEFKESQEKDDSWSDCARM---VKPQKGDALLFFSLNVDGTTNPRSMHGSCPV 199
           GETVFP+ + K  Q KDDSWS CA+    VKP+KGDALLFF+L+++GTT+P S+HGSCPV
Sbjct: 165 GETVFPNWKGKTPQLKDDSWSKCAKQGYAVKPRKGDALLFFNLHLNGTTDPNSLHGSCPV 224

Query: 200 IEGEKWSATKWIHVRSFDNPILPNKGCMDFNENCPLWAENGECKNNPRYMLGSETASGYC 259
           IEGEKWSAT+WIHVRSF    L    C+D +E+C  WA+ GEC+ NP YM+GSET+ G+C
Sbjct: 225 IEGEKWSATRWIHVRSFGKKKLV---CVDDHESCQEWADAGECEKNPMYMVGSETSLGFC 284

Query: 260 RKSCQAC 263
           RKSC+AC
Sbjct: 285 RKSCKAC 288

BLAST of Cp4.1LG01g23040 vs. ExPASy Swiss-Prot
Match: Q8LAN3 (Probable prolyl 4-hydroxylase 4 OS=Arabidopsis thaliana OX=3702 GN=P4H4 PE=2 SV=1)

HSP 1 Score: 314.7 bits (805), Expect = 9.9e-85
Identity = 149/259 (57.53%), Positives = 189/259 (72.97%), Query Frame = 0

Query: 10  SSSPLIFDPTRGFLSDKECDHLINLSRGRLERSMVADNKSGKRVSSKVRTSSGTFVLKGQ 69
           SS P  F    GFL++ ECDH+++L++  L+RS VADN SG+   S+VRTSSGTF+ KG+
Sbjct: 41  SSKPRAF-VYEGFLTELECDHMVSLAKASLKRSAVADNDSGESKFSEVRTSSGTFISKGK 100

Query: 70  DEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYEPHFDFFVDEVNKELGGHRVATVL 129
           D I++ IE +I+ WTFLP ENGE IQ+L YE+G+KY+ HFD+F D+VN   GGHR+AT+L
Sbjct: 101 DPIVSGIEDKISTWTFLPKENGEDIQVLRYEHGQKYDAHFDYFHDKVNIVRGGHRMATIL 160

Query: 130 MYLSNVEKGGETVFPHSEFKESQ---EKDDSWSDCAR---MVKPQKGDALLFFSLNVDGT 189
           MYLSNV KGGETVFP +E    +   E  +  SDCA+    VKP+KGDALLFF+L+ D  
Sbjct: 161 MYLSNVTKGGETVFPDAEIPSRRVLSENKEDLSDCAKRGIAVKPRKGDALLFFNLHPDAI 220

Query: 190 TNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCMDFNENCPLWAENGECKNNPR 249
            +P S+HG CPVIEGEKWSATKWIHV SFD  + P+  C D NE+C  WA  GEC  NP 
Sbjct: 221 PDPLSLHGGCPVIEGEKWSATKWIHVDSFDRIVTPSGNCTDMNESCERWAVLGECTKNPE 280

Query: 250 YMLGSETASGYCRKSCQAC 263
           YM+G+    GYCR+SC+AC
Sbjct: 281 YMVGTTELPGYCRRSCKAC 298

BLAST of Cp4.1LG01g23040 vs. ExPASy Swiss-Prot
Match: F4JAU3 (Prolyl 4-hydroxylase 2 OS=Arabidopsis thaliana OX=3702 GN=P4H2 PE=1 SV=1)

HSP 1 Score: 303.5 bits (776), Expect = 2.3e-81
Identity = 148/259 (57.14%), Positives = 186/259 (71.81%), Query Frame = 0

Query: 10  SSSPLIFDPTRGFLSDKECDHLINLSRGRLERSMVADNKSGKRVSSKVRTSSGTFVLKGQ 69
           SS P  F    GFL+D ECDHLI+L++  L+RS VADN +G+   S VRTSSGTF+ KG+
Sbjct: 42  SSKPRAF-VYEGFLTDLECDHLISLAKENLQRSAVADNDNGESQVSDVRTSSGTFISKGK 101

Query: 70  DEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYEPHFDFFVDEVNKELGGHRVATVL 129
           D I++ IE +++ WTFLP ENGE +Q+L YE+G+KY+ HFD+F D+VN   GGHR+ATVL
Sbjct: 102 DPIVSGIEDKLSTWTFLPKENGEDLQVLRYEHGQKYDAHFDYFHDKVNIARGGHRIATVL 161

Query: 130 MYLSNVEKGGETVFPHS-EF--KESQEKDDSWSDCAR---MVKPQKGDALLFFSLNVDGT 189
           +YLSNV KGGETVFP + EF  +   E  D  SDCA+    VKP+KG+ALLFF+L  D  
Sbjct: 162 LYLSNVTKGGETVFPDAQEFSRRSLSENKDDLSDCAKKGIAVKPKKGNALLFFNLQQDAI 221

Query: 190 TNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCMDFNENCPLWAENGECKNNPR 249
            +P S+HG CPVIEGEKWSATKWIHV SFD  +  +  C D NE+C  WA  GEC  NP 
Sbjct: 222 PDPFSLHGGCPVIEGEKWSATKWIHVDSFDKILTHDGNCTDVNESCERWAVLGECGKNPE 281

Query: 250 YMLGSETASGYCRKSCQAC 263
           YM+G+    G CR+SC+AC
Sbjct: 282 YMVGTPEIPGNCRRSCKAC 299

BLAST of Cp4.1LG01g23040 vs. ExPASy Swiss-Prot
Match: Q9LN20 (Probable prolyl 4-hydroxylase 3 OS=Arabidopsis thaliana OX=3702 GN=P4H3 PE=2 SV=1)

HSP 1 Score: 221.5 bits (563), Expect = 1.1e-56
Identity = 104/195 (53.33%), Positives = 146/195 (74.87%), Query Frame = 0

Query: 22  FLSDKECDHLINLSRGRLERSMVADNKSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIA 81
           FLS +EC++LI+L++  + +S V D+++GK   S+VRTSSGTF+ +G+D+II  IE RIA
Sbjct: 91  FLSKEECEYLISLAKPHMVKSTVVDSETGKSKDSRVRTSSGTFLRRGRDKIIKTIEKRIA 150

Query: 82  AWTFLPLENGEPIQILHYENGEKYEPHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGET 141
            +TF+P ++GE +Q+LHYE G+KYEPH+D+FVDE N + GG R+AT+LMYLS+VE+GGET
Sbjct: 151 DYTFIPADHGEGLQVLHYEAGQKYEPHYDYFVDEFNTKNGGQRMATMLMYLSDVEEGGET 210

Query: 142 VFPHSEFK-ESQEKDDSWSDCAR---MVKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIE 201
           VFP +     S    +  S+C +    VKP+ GDALLF+S+  D T +P S+HG CPVI 
Sbjct: 211 VFPAANMNFSSVPWYNELSECGKKGLSVKPRMGDALLFWSMRPDATLDPTSLHGGCPVIR 270

Query: 202 GEKWSATKWIHVRSF 213
           G KWS+TKW+HV  +
Sbjct: 271 GNKWSSTKWMHVGEY 285

BLAST of Cp4.1LG01g23040 vs. NCBI nr
Match: XP_023523636.1 (probable prolyl 4-hydroxylase 7 [Cucurbita pepo subsp. pepo])

HSP 1 Score: 534 bits (1376), Expect = 8.09e-191
Identity = 262/279 (93.91%), Positives = 262/279 (93.91%), Query Frame = 0

Query: 1   GGSALRLKGSSSPLIFDPTR--------------GFLSDKECDHLINLSRGRLERSMVAD 60
           GGSALRLKGSSSPLIFDPTR              GFLSDKECDHLINLSRGRLERSMVAD
Sbjct: 33  GGSALRLKGSSSPLIFDPTRVTQLSWQPRALLYKGFLSDKECDHLINLSRGRLERSMVAD 92

Query: 61  NKSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYE 120
           NKSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYE
Sbjct: 93  NKSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYE 152

Query: 121 PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM-- 180
           PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM  
Sbjct: 153 PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARMGY 212

Query: 181 -VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM 240
            VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM
Sbjct: 213 AVKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM 272

Query: 241 DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 262
           DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC
Sbjct: 273 DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 311

BLAST of Cp4.1LG01g23040 vs. NCBI nr
Match: XP_022961014.1 (probable prolyl 4-hydroxylase 6 [Cucurbita moschata])

HSP 1 Score: 529 bits (1362), Expect = 1.10e-188
Identity = 257/279 (92.11%), Positives = 262/279 (93.91%), Query Frame = 0

Query: 1   GGSALRLKGSSSPLIFDPTR--------------GFLSDKECDHLINLSRGRLERSMVAD 60
           GGSALRL+GSSSPLIFDPTR              GFLSDKECDHLINLS+GRLERSMVAD
Sbjct: 33  GGSALRLEGSSSPLIFDPTRVTQLSWQPRALLYKGFLSDKECDHLINLSKGRLERSMVAD 92

Query: 61  NKSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYE 120
           NKSGKR+SS+VRTSSGTFVLKGQDEIIAAIEARIAAWTFLP+ENGEPIQILHYENGEKYE
Sbjct: 93  NKSGKRISSEVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPIENGEPIQILHYENGEKYE 152

Query: 121 PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM-- 180
           PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM  
Sbjct: 153 PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARMGY 212

Query: 181 -VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM 240
            VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM
Sbjct: 213 AVKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM 272

Query: 241 DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 262
           DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC
Sbjct: 273 DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 311

BLAST of Cp4.1LG01g23040 vs. NCBI nr
Match: XP_022990688.1 (probable prolyl 4-hydroxylase 7 [Cucurbita maxima])

HSP 1 Score: 527 bits (1358), Expect = 4.48e-188
Identity = 255/279 (91.40%), Positives = 262/279 (93.91%), Query Frame = 0

Query: 1   GGSALRLKGSSSPLIFDPTR--------------GFLSDKECDHLINLSRGRLERSMVAD 60
           GGSALRLKG+SSPLIFDPTR              GFLSDKECDHLINLS+GRLERSMVAD
Sbjct: 33  GGSALRLKGNSSPLIFDPTRVTQLSWQPRALLYKGFLSDKECDHLINLSKGRLERSMVAD 92

Query: 61  NKSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYE 120
           NKSGKR+SSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLP+ENGEPIQILHYENGEKYE
Sbjct: 93  NKSGKRISSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPIENGEPIQILHYENGEKYE 152

Query: 121 PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM-- 180
           PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM  
Sbjct: 153 PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARMGY 212

Query: 181 -VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM 240
            VKP+KGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM
Sbjct: 213 AVKPEKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM 272

Query: 241 DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 262
           DFN+NCPLWAENGECKNNPRYMLGSET+SGYCRKSCQAC
Sbjct: 273 DFNQNCPLWAENGECKNNPRYMLGSETSSGYCRKSCQAC 311

BLAST of Cp4.1LG01g23040 vs. NCBI nr
Match: KAG7033150.1 (putative prolyl 4-hydroxylase 7, partial [Cucurbita argyrosperma subsp. argyrosperma])

HSP 1 Score: 526 bits (1356), Expect = 1.40e-187
Identity = 256/279 (91.76%), Positives = 261/279 (93.55%), Query Frame = 0

Query: 1   GGSALRLKGSSSPLIFDPTR--------------GFLSDKECDHLINLSRGRLERSMVAD 60
           GGSALRL+GSSSPLIFDPTR              GFLSDKECDHLINLS+GRLERSMVAD
Sbjct: 45  GGSALRLEGSSSPLIFDPTRVTQLSWQPRALLYKGFLSDKECDHLINLSKGRLERSMVAD 104

Query: 61  NKSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYE 120
           NKSGKR+SS+VRTSSGTFVLKGQDEIIAAIEARIAAWTFLP+ENGEPIQILHYENGEKYE
Sbjct: 105 NKSGKRISSEVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPIENGEPIQILHYENGEKYE 164

Query: 121 PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM-- 180
           PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSE KESQEKDDSWSDCARM  
Sbjct: 165 PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSELKESQEKDDSWSDCARMGY 224

Query: 181 -VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM 240
            VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM
Sbjct: 225 AVKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM 284

Query: 241 DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 262
           DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC
Sbjct: 285 DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 323

BLAST of Cp4.1LG01g23040 vs. NCBI nr
Match: KAG6602475.1 (putative prolyl 4-hydroxylase 7, partial [Cucurbita argyrosperma subsp. sororia])

HSP 1 Score: 525 bits (1353), Expect = 2.59e-187
Identity = 255/279 (91.40%), Positives = 261/279 (93.55%), Query Frame = 0

Query: 1   GGSALRLKGSSSPLIFDPTR--------------GFLSDKECDHLINLSRGRLERSMVAD 60
           GGSALRLKGSSSPLIFDPTR              GFLSDKECDHL+NLS+GRLERSMVAD
Sbjct: 33  GGSALRLKGSSSPLIFDPTRVTQLSWQPRALLYKGFLSDKECDHLMNLSKGRLERSMVAD 92

Query: 61  NKSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYE 120
           N+SGKR+SS+VRTSSGTFVLKGQDEIIAAIEARIAAWTFLP+ENGEPIQILHYENGEKYE
Sbjct: 93  NESGKRISSEVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPIENGEPIQILHYENGEKYE 152

Query: 121 PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM-- 180
           PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSE KESQEKDDSWSDCARM  
Sbjct: 153 PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSELKESQEKDDSWSDCARMGY 212

Query: 181 -VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM 240
            VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM
Sbjct: 213 AVKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM 272

Query: 241 DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 262
           DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC
Sbjct: 273 DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 311

BLAST of Cp4.1LG01g23040 vs. ExPASy TrEMBL
Match: A0A6J1HCS1 (Procollagen-proline 4-dioxygenase OS=Cucurbita moschata OX=3662 GN=LOC111461644 PE=3 SV=1)

HSP 1 Score: 529 bits (1362), Expect = 5.33e-189
Identity = 257/279 (92.11%), Positives = 262/279 (93.91%), Query Frame = 0

Query: 1   GGSALRLKGSSSPLIFDPTR--------------GFLSDKECDHLINLSRGRLERSMVAD 60
           GGSALRL+GSSSPLIFDPTR              GFLSDKECDHLINLS+GRLERSMVAD
Sbjct: 33  GGSALRLEGSSSPLIFDPTRVTQLSWQPRALLYKGFLSDKECDHLINLSKGRLERSMVAD 92

Query: 61  NKSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYE 120
           NKSGKR+SS+VRTSSGTFVLKGQDEIIAAIEARIAAWTFLP+ENGEPIQILHYENGEKYE
Sbjct: 93  NKSGKRISSEVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPIENGEPIQILHYENGEKYE 152

Query: 121 PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM-- 180
           PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM  
Sbjct: 153 PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARMGY 212

Query: 181 -VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM 240
            VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM
Sbjct: 213 AVKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM 272

Query: 241 DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 262
           DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC
Sbjct: 273 DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 311

BLAST of Cp4.1LG01g23040 vs. ExPASy TrEMBL
Match: A0A6J1JU08 (Procollagen-proline 4-dioxygenase OS=Cucurbita maxima OX=3661 GN=LOC111487502 PE=3 SV=1)

HSP 1 Score: 527 bits (1358), Expect = 2.17e-188
Identity = 255/279 (91.40%), Positives = 262/279 (93.91%), Query Frame = 0

Query: 1   GGSALRLKGSSSPLIFDPTR--------------GFLSDKECDHLINLSRGRLERSMVAD 60
           GGSALRLKG+SSPLIFDPTR              GFLSDKECDHLINLS+GRLERSMVAD
Sbjct: 33  GGSALRLKGNSSPLIFDPTRVTQLSWQPRALLYKGFLSDKECDHLINLSKGRLERSMVAD 92

Query: 61  NKSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYE 120
           NKSGKR+SSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLP+ENGEPIQILHYENGEKYE
Sbjct: 93  NKSGKRISSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPIENGEPIQILHYENGEKYE 152

Query: 121 PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM-- 180
           PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM  
Sbjct: 153 PHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARMGY 212

Query: 181 -VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM 240
            VKP+KGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM
Sbjct: 213 AVKPEKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCM 272

Query: 241 DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 262
           DFN+NCPLWAENGECKNNPRYMLGSET+SGYCRKSCQAC
Sbjct: 273 DFNQNCPLWAENGECKNNPRYMLGSETSSGYCRKSCQAC 311

BLAST of Cp4.1LG01g23040 vs. ExPASy TrEMBL
Match: A0A6J1BXN9 (Procollagen-proline 4-dioxygenase OS=Momordica charantia OX=3673 GN=LOC111006412 PE=3 SV=1)

HSP 1 Score: 424 bits (1090), Expect = 1.54e-147
Identity = 206/280 (73.57%), Positives = 234/280 (83.57%), Query Frame = 0

Query: 2   GSALRLKGSSSPLIFDPTR--------------GFLSDKECDHLINLSRGRLERSMVADN 61
           GS LRLKG  SPLIFDPTR              GFLSDKECDHLI+L++ +LE+SMVADN
Sbjct: 34  GSVLRLKGEPSPLIFDPTRVTQLSWQPRAFLYKGFLSDKECDHLIDLAKDKLEKSMVADN 93

Query: 62  KSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYEP 121
            SGK VSS+VRTSSG F+ K QDEI+AA+EARIAAWTFLP ENGE IQILHYENG+KYEP
Sbjct: 94  NSGKSVSSEVRTSSGMFLHKAQDEIVAAVEARIAAWTFLPAENGESIQILHYENGQKYEP 153

Query: 122 HFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM--- 181
           HFD+F D+VN+ELGGHRVATVLMYLSNVEKGGET+FP+SEFKESQEKDDSWSDCAR    
Sbjct: 154 HFDYFHDKVNQELGGHRVATVLMYLSNVEKGGETIFPNSEFKESQEKDDSWSDCARKGYA 213

Query: 182 VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNK--GC 241
           VK +KGDALLFFSL++D TT+ +S+HGSCPVIEGEKWSATKWIHVRSF+ P  P++   C
Sbjct: 214 VKAKKGDALLFFSLHLDATTDVKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRPSRRLDC 273

Query: 242 MDFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 262
           +D NENC  WA+ GECK NP YM+GSE+A GYCRKSCQAC
Sbjct: 274 VDENENCASWAKRGECKKNPTYMVGSESALGYCRKSCQAC 313

BLAST of Cp4.1LG01g23040 vs. ExPASy TrEMBL
Match: A0A1S3C8G4 (Procollagen-proline 4-dioxygenase OS=Cucumis melo OX=3656 GN=LOC103498028 PE=3 SV=1)

HSP 1 Score: 417 bits (1072), Expect = 9.39e-145
Identity = 202/279 (72.40%), Positives = 232/279 (83.15%), Query Frame = 0

Query: 2   GSALRLKGSSSPLIFDPTR--------------GFLSDKECDHLINLSRGRLERSMVADN 61
           GS LRLK  SSPLIFDPTR              GFLSD+ECDHLI+L++ +LE+SMVADN
Sbjct: 38  GSVLRLKTDSSPLIFDPTRVTQLSWQPRAFLYKGFLSDEECDHLIDLAKDKLEKSMVADN 97

Query: 62  KSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYEP 121
           +SGK VSS+VRTSSG F+ K QD+I+A +EARIAAWT LP ENGE IQILHYENG+KYEP
Sbjct: 98  ESGKSVSSEVRTSSGMFLRKAQDKIVAGVEARIAAWTLLPAENGESIQILHYENGQKYEP 157

Query: 122 HFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM--- 181
           HFDFF D+VN+ELGGHR+ATVLMYLSNVEKGGET+FP+SEFKESQEKDDSWSDC+R    
Sbjct: 158 HFDFFHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSEFKESQEKDDSWSDCSRKGYA 217

Query: 182 VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDN-PILPNKGCM 241
           VK QKGDALLFFSL++D TT+ RS+HGSCPVIEGEKWSATKWIHVRSF+  P +  + C+
Sbjct: 218 VKAQKGDALLFFSLHLDATTDERSLHGSCPVIEGEKWSATKWIHVRSFEKLPRVSRQDCV 277

Query: 242 DFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 262
           D NENCP WA+ GECK NP YM+GSE A GYCRKSC+AC
Sbjct: 278 DENENCPAWAKRGECKKNPTYMVGSEGALGYCRKSCKAC 316

BLAST of Cp4.1LG01g23040 vs. ExPASy TrEMBL
Match: A0A0A0KS38 (Procollagen-proline 4-dioxygenase OS=Cucumis sativus OX=3659 GN=Csa_5G633280 PE=3 SV=1)

HSP 1 Score: 405 bits (1041), Expect = 4.41e-140
Identity = 198/280 (70.71%), Positives = 226/280 (80.71%), Query Frame = 0

Query: 2   GSALRLKGSSSPLIFDPTR--------------GFLSDKECDHLINLSRGRLERSMVADN 61
           GS LRLK  SSPLIFDPTR              GFLSD ECDHLI+L++ +LE+SMVADN
Sbjct: 34  GSVLRLKTDSSPLIFDPTRVTQLSWQPRAFLYKGFLSDAECDHLIDLAKDKLEKSMVADN 93

Query: 62  KSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYEP 121
            SGK VSS+VRTSSG F+ K QDE++A +EARIAAWT LP ENGE IQILHYENG+KYEP
Sbjct: 94  DSGKSVSSEVRTSSGMFLRKAQDEVVAGVEARIAAWTLLPAENGESIQILHYENGQKYEP 153

Query: 122 HFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM--- 181
           HFDFF D+VN+ELGGHR+ATVLMYLSNVEKGGET+FP+SEFKESQ KD+SWSDC+R    
Sbjct: 154 HFDFFHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSEFKESQAKDESWSDCSRKGYA 213

Query: 182 VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPI--LPNKGC 241
           VK QKGDALLFFSLN+D TT+ RS+HGSCPVI GEKWSATKWIHVRSF+     +  +GC
Sbjct: 214 VKAQKGDALLFFSLNLDATTDERSLHGSCPVIAGEKWSATKWIHVRSFEKITSRVSRQGC 273

Query: 242 MDFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 262
           +D NENC  WA+ GECK NP YM+GS  A GYCRKSC+AC
Sbjct: 274 VDENENCLAWAKKGECKKNPTYMVGSGGALGYCRKSCKAC 313

BLAST of Cp4.1LG01g23040 vs. TAIR 10
Match: AT3G28480.1 (Oxoglutarate/iron-dependent oxygenase )

HSP 1 Score: 366.7 bits (940), Expect = 1.6e-101
Identity = 171/278 (61.51%), Positives = 215/278 (77.34%), Query Frame = 0

Query: 2   GSALRLKGSSSPLIFDPTR--------------GFLSDKECDHLINLSRGRLERSMVADN 61
           GS +++K S+S   FDPTR              GFLSD+ECDH I L++G+LE+SMVADN
Sbjct: 37  GSVIKMKTSASSFGFDPTRVTQLSWTPRVFLYEGFLSDEECDHFIKLAKGKLEKSMVADN 96

Query: 62  KSGKRVSSKVRTSSGTFVLKGQDEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYEP 121
            SG+ V S+VRTSSG F+ K QD+I++ +EA++AAWTFLP ENGE +QILHYENG+KYEP
Sbjct: 97  DSGESVESEVRTSSGMFLSKRQDDIVSNVEAKLAAWTFLPEENGESMQILHYENGQKYEP 156

Query: 122 HFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWSDCARM--- 181
           HFD+F D+ N ELGGHR+ATVLMYLSNVEKGGETVFP  + K +Q KDDSW++CA+    
Sbjct: 157 HFDYFHDQANLELGGHRIATVLMYLSNVEKGGETVFPMWKGKATQLKDDSWTECAKQGYA 216

Query: 182 VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCMD 241
           VKP+KGDALLFF+L+ + TT+  S+HGSCPV+EGEKWSAT+WIHV+SF+       GCMD
Sbjct: 217 VKPRKGDALLFFNLHPNATTDSNSLHGSCPVVEGEKWSATRWIHVKSFERAFNKQSGCMD 276

Query: 242 FNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 263
            N +C  WA+ GEC+ NP YM+GS+   GYCRKSC+AC
Sbjct: 277 ENVSCEKWAKAGECQKNPTYMVGSDKDHGYCRKSCKAC 314

BLAST of Cp4.1LG01g23040 vs. TAIR 10
Match: AT3G28490.1 (Oxoglutarate/iron-dependent oxygenase )

HSP 1 Score: 345.1 bits (884), Expect = 4.9e-95
Identity = 163/247 (65.99%), Positives = 203/247 (82.19%), Query Frame = 0

Query: 20  RGFLSDKECDHLINLSRGRLERSM-VADNKSGKRVSSKVRTSSGTFVLKGQDEIIAAIEA 79
           +GFLSD+ECDHLI L++G+LE+SM VAD  SG+   S+VRTSSG F+ K QD+I+A +EA
Sbjct: 45  KGFLSDEECDHLIKLAKGKLEKSMVVADVDSGESEDSEVRTSSGMFLTKRQDDIVANVEA 104

Query: 80  RIAAWTFLPLENGEPIQILHYENGEKYEPHFDFFVDEVNKELGGHRVATVLMYLSNVEKG 139
           ++AAWTFLP ENGE +QILHYENG+KY+PHFD+F D+   ELGGHR+ATVLMYLSNV KG
Sbjct: 105 KLAAWTFLPEENGEALQILHYENGQKYDPHFDYFYDKKALELGGHRIATVLMYLSNVTKG 164

Query: 140 GETVFPHSEFKESQEKDDSWSDCARM---VKPQKGDALLFFSLNVDGTTNPRSMHGSCPV 199
           GETVFP+ + K  Q KDDSWS CA+    VKP+KGDALLFF+L+++GTT+P S+HGSCPV
Sbjct: 165 GETVFPNWKGKTPQLKDDSWSKCAKQGYAVKPRKGDALLFFNLHLNGTTDPNSLHGSCPV 224

Query: 200 IEGEKWSATKWIHVRSFDNPILPNKGCMDFNENCPLWAENGECKNNPRYMLGSETASGYC 259
           IEGEKWSAT+WIHVRSF    L    C+D +E+C  WA+ GEC+ NP YM+GSET+ G+C
Sbjct: 225 IEGEKWSATRWIHVRSFGKKKLV---CVDDHESCQEWADAGECEKNPMYMVGSETSLGFC 284

Query: 260 RKSCQAC 263
           RKSC+AC
Sbjct: 285 RKSCKAC 288

BLAST of Cp4.1LG01g23040 vs. TAIR 10
Match: AT3G28480.2 (Oxoglutarate/iron-dependent oxygenase )

HSP 1 Score: 343.6 bits (880), Expect = 1.4e-94
Identity = 165/286 (57.69%), Positives = 210/286 (73.43%), Query Frame = 0

Query: 2   GSALRLKGSSSPLIFDPTR--------------GFLSDKECDHLINLSRGRLERSMVADN 61
           GS +++K S+S   FDPTR              GFLSD+ECDH I L++G+LE+SMVADN
Sbjct: 37  GSVIKMKTSASSFGFDPTRVTQLSWTPRVFLYEGFLSDEECDHFIKLAKGKLEKSMVADN 96

Query: 62  KSGKRVSSK----VRTSSGTFVLKGQ----DEIIAAIEARIAAWTFLPLENGEPIQILHY 121
            SG+ V S+    V   S +F+        D+I++ +EA++AAWTFLP ENGE +QILHY
Sbjct: 97  DSGESVESEDSVSVVRQSSSFIANMDSLEIDDIVSNVEAKLAAWTFLPEENGESMQILHY 156

Query: 122 ENGEKYEPHFDFFVDEVNKELGGHRVATVLMYLSNVEKGGETVFPHSEFKESQEKDDSWS 181
           ENG+KYEPHFD+F D+ N ELGGHR+ATVLMYLSNVEKGGETVFP  + K +Q KDDSW+
Sbjct: 157 ENGQKYEPHFDYFHDQANLELGGHRIATVLMYLSNVEKGGETVFPMWKGKATQLKDDSWT 216

Query: 182 DCARM---VKPQKGDALLFFSLNVDGTTNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPI 241
           +CA+    VKP+KGDALLFF+L+ + TT+  S+HGSCPV+EGEKWSAT+WIHV+SF+   
Sbjct: 217 ECAKQGYAVKPRKGDALLFFNLHPNATTDSNSLHGSCPVVEGEKWSATRWIHVKSFERAF 276

Query: 242 LPNKGCMDFNENCPLWAENGECKNNPRYMLGSETASGYCRKSCQAC 263
               GCMD N +C  WA+ GEC+ NP YM+GS+   GYCRKSC+AC
Sbjct: 277 NKQSGCMDENVSCEKWAKAGECQKNPTYMVGSDKDHGYCRKSCKAC 322

BLAST of Cp4.1LG01g23040 vs. TAIR 10
Match: AT5G18900.1 (2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein )

HSP 1 Score: 314.7 bits (805), Expect = 7.0e-86
Identity = 149/259 (57.53%), Positives = 189/259 (72.97%), Query Frame = 0

Query: 10  SSSPLIFDPTRGFLSDKECDHLINLSRGRLERSMVADNKSGKRVSSKVRTSSGTFVLKGQ 69
           SS P  F    GFL++ ECDH+++L++  L+RS VADN SG+   S+VRTSSGTF+ KG+
Sbjct: 41  SSKPRAF-VYEGFLTELECDHMVSLAKASLKRSAVADNDSGESKFSEVRTSSGTFISKGK 100

Query: 70  DEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYEPHFDFFVDEVNKELGGHRVATVL 129
           D I++ IE +I+ WTFLP ENGE IQ+L YE+G+KY+ HFD+F D+VN   GGHR+AT+L
Sbjct: 101 DPIVSGIEDKISTWTFLPKENGEDIQVLRYEHGQKYDAHFDYFHDKVNIVRGGHRMATIL 160

Query: 130 MYLSNVEKGGETVFPHSEFKESQ---EKDDSWSDCAR---MVKPQKGDALLFFSLNVDGT 189
           MYLSNV KGGETVFP +E    +   E  +  SDCA+    VKP+KGDALLFF+L+ D  
Sbjct: 161 MYLSNVTKGGETVFPDAEIPSRRVLSENKEDLSDCAKRGIAVKPRKGDALLFFNLHPDAI 220

Query: 190 TNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCMDFNENCPLWAENGECKNNPR 249
            +P S+HG CPVIEGEKWSATKWIHV SFD  + P+  C D NE+C  WA  GEC  NP 
Sbjct: 221 PDPLSLHGGCPVIEGEKWSATKWIHVDSFDRIVTPSGNCTDMNESCERWAVLGECTKNPE 280

Query: 250 YMLGSETASGYCRKSCQAC 263
           YM+G+    GYCR+SC+AC
Sbjct: 281 YMVGTTELPGYCRRSCKAC 298

BLAST of Cp4.1LG01g23040 vs. TAIR 10
Match: AT3G06300.1 (P4H isoform 2 )

HSP 1 Score: 303.5 bits (776), Expect = 1.6e-82
Identity = 148/259 (57.14%), Positives = 186/259 (71.81%), Query Frame = 0

Query: 10  SSSPLIFDPTRGFLSDKECDHLINLSRGRLERSMVADNKSGKRVSSKVRTSSGTFVLKGQ 69
           SS P  F    GFL+D ECDHLI+L++  L+RS VADN +G+   S VRTSSGTF+ KG+
Sbjct: 42  SSKPRAF-VYEGFLTDLECDHLISLAKENLQRSAVADNDNGESQVSDVRTSSGTFISKGK 101

Query: 70  DEIIAAIEARIAAWTFLPLENGEPIQILHYENGEKYEPHFDFFVDEVNKELGGHRVATVL 129
           D I++ IE +++ WTFLP ENGE +Q+L YE+G+KY+ HFD+F D+VN   GGHR+ATVL
Sbjct: 102 DPIVSGIEDKLSTWTFLPKENGEDLQVLRYEHGQKYDAHFDYFHDKVNIARGGHRIATVL 161

Query: 130 MYLSNVEKGGETVFPHS-EF--KESQEKDDSWSDCAR---MVKPQKGDALLFFSLNVDGT 189
           +YLSNV KGGETVFP + EF  +   E  D  SDCA+    VKP+KG+ALLFF+L  D  
Sbjct: 162 LYLSNVTKGGETVFPDAQEFSRRSLSENKDDLSDCAKKGIAVKPKKGNALLFFNLQQDAI 221

Query: 190 TNPRSMHGSCPVIEGEKWSATKWIHVRSFDNPILPNKGCMDFNENCPLWAENGECKNNPR 249
            +P S+HG CPVIEGEKWSATKWIHV SFD  +  +  C D NE+C  WA  GEC  NP 
Sbjct: 222 PDPFSLHGGCPVIEGEKWSATKWIHVDSFDKILTHDGNCTDVNESCERWAVLGECGKNPE 281

Query: 250 YMLGSETASGYCRKSCQAC 263
           YM+G+    G CR+SC+AC
Sbjct: 282 YMVGTPEIPGNCRRSCKAC 299

The following BLAST results are available for this feature:

ExPASy Swiss-Prot
Match Name | E-value | Identity | Description
Q8L970 | 2.2e-100 | 61.51 | Probable prolyl 4-hydroxylase 7 OS=Arabidopsis thaliana OX=3702 GN=P4H7 PE=2 SV=...
F4J0A8 | 6.8e-94 | 65.99 | Probable prolyl 4-hydroxylase 6 OS=Arabidopsis thaliana OX=3702 GN=P4H6 PE=2 SV=...
Q8LAN3 | 9.9e-85 | 57.53 | Probable prolyl 4-hydroxylase 4 OS=Arabidopsis thaliana OX=3702 GN=P4H4 PE=2 SV=...
F4JAU3 | 2.3e-81 | 57.14 | Prolyl 4-hydroxylase 2 OS=Arabidopsis thaliana OX=3702 GN=P4H2 PE=1 SV=1
Q9LN20 | 1.1e-56 | 53.33 | Probable prolyl 4-hydroxylase 3 OS=Arabidopsis thaliana OX=3702 GN=P4H3 PE=2 SV=...

NCBI nr
Match Name | E-value | Identity | Description
XP_023523636.1 | 8.09e-191 | 93.91 | probable prolyl 4-hydroxylase 7 [Cucurbita pepo subsp. pepo]
XP_022961014.1 | 1.10e-188 | 92.11 | probable prolyl 4-hydroxylase 6 [Cucurbita moschata]
XP_022990688.1 | 4.48e-188 | 91.40 | probable prolyl 4-hydroxylase 7 [Cucurbita maxima]
KAG7033150.1 | 1.40e-187 | 91.76 | putative prolyl 4-hydroxylase 7, partial [Cucurbita argyrosperma subsp. argyrosp...
KAG6602475.1 | 2.59e-187 | 91.40 | putative prolyl 4-hydroxylase 7, partial [Cucurbita argyrosperma subsp. sororia]

ExPASy TrEMBL
Match Name | E-value | Identity | Description
A0A6J1HCS1 | 5.33e-189 | 92.11 | Procollagen-proline 4-dioxygenase OS=Cucurbita moschata OX=3662 GN=LOC111461644 ...
A0A6J1JU08 | 2.17e-188 | 91.40 | Procollagen-proline 4-dioxygenase OS=Cucurbita maxima OX=3661 GN=LOC111487502 PE...
A0A6J1BXN9 | 1.54e-147 | 73.57 | Procollagen-proline 4-dioxygenase OS=Momordica charantia OX=3673 GN=LOC111006412...
A0A1S3C8G4 | 9.39e-145 | 72.40 | Procollagen-proline 4-dioxygenase OS=Cucumis melo OX=3656 GN=LOC103498028 PE=3 S...
A0A0A0KS38 | 4.41e-140 | 70.71 | Procollagen-proline 4-dioxygenase OS=Cucumis sativus OX=3659 GN=Csa_5G633280 PE=...

TAIR 10
Match Name | E-value | Identity | Description
AT3G28480.1 | 1.6e-101 | 61.51 | Oxoglutarate/iron-dependent oxygenase
AT3G28490.1 | 4.9e-95 | 65.99 | Oxoglutarate/iron-dependent oxygenase
AT3G28480.2 | 1.4e-94 | 57.69 | Oxoglutarate/iron-dependent oxygenase
AT5G18900.1 | 7.0e-86 | 57.53 | 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein
AT3G06300.1 | 1.6e-82 | 57.14 | P4H isoform 2

InterPro
Analysis Name: InterPro Annotations of Cucurbita pepo (Zucchini) v1
Date Performed: 2021-10-25
IPR Term | IPR Description | Source | Source Term | Source Description | Alignment
IPR006620 | Prolyl 4-hydroxylase, alpha subunit | SMART | SM00702 | p4hc | coord: 13..208; e-value: 6.2E-53; score: 191.8
IPR003582 | ShKT domain | SMART | SM00254 | ShkT_1 | coord: 221..262; e-value: 0.0014; score: 27.9
IPR003582 | ShKT domain | PFAM | PF01549 | ShK | coord: 221..262; e-value: 2.8E-4; score: 21.3
IPR003582 | ShKT domain | PROSITE | PS51670 | SHKT | coord: 222..262; score: 8.161264
None | No IPR available | GENE3D | 2.60.120.620 | q2cbj1_9rhob like domain | coord: 9..209; e-value: 2.2E-68; score: 232.1
None | No IPR available | PANTHER | PTHR10869:SF140 | OS03G0803500 PROTEIN | coord: 20..262
IPR044862 | Prolyl 4-hydroxylase alpha subunit, Fe(2+) 2OG dioxygenase domain | PFAM | PF13640 | 2OG-FeII_Oxy_3 | coord: 95..208; e-value: 5.0E-20; score: 72.2
IPR045054 | Prolyl 4-hydroxylase | PANTHER | PTHR10869 | PROLYL 4-HYDROXYLASE ALPHA SUBUNIT | coord: 20..262
IPR005123 | Oxoglutarate/iron-dependent dioxygenase | PROSITE | PS51471 | FE2OG_OXY | coord: 90..209; score: 12.418397
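
The `coord` values give the start and end of each predicted domain on the protein, so domain lengths and overlaps can be checked directly. A minimal Python sketch, assuming the coordinates are 1-based, inclusive residue positions (an assumption, since the page does not say). The helper names are hypothetical, and the two inputs are the SMART p4hc and ShkT_1 coordinates listed above.

```python
import re


def parse_coord(text: str) -> tuple[int, int]:
    """Extract (start, end) from a string such as 'coord: 13..208'."""
    match = re.search(r"(\d+)\.\.(\d+)", text)
    if match is None:
        raise ValueError(f"no 'start..end' range in {text!r}")
    return int(match.group(1)), int(match.group(2))


def span_length(start: int, end: int) -> int:
    """Number of residues covered, assuming inclusive coordinates."""
    return end - start + 1


def overlaps(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """True if two inclusive ranges share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


p4hc = parse_coord("coord: 13..208")   # SMART SM00702
shkt = parse_coord("coord: 221..262")  # SMART SM00254
print(span_length(*p4hc), span_length(*shkt), overlaps(p4hc, shkt))
# 196 42 False
```

On these coordinates the prolyl 4-hydroxylase domain and the C-terminal ShKT domain are separate, non-overlapping regions.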

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name | Unique Name | Type
Cp4.1LG01g23040.1 | Cp4.1LG01g23040.1 | mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category | Term Accession | Term Name
biological_process | GO:0018401 | peptidyl-proline hydroxylation to 4-hydroxy-L-proline
cellular_component | GO:0005789 | endoplasmic reticulum membrane
molecular_function | GO:0005506 | iron ion binding
molecular_function | GO:0031418 | L-ascorbic acid binding
molecular_function | GO:0004656 | procollagen-proline 4-dioxygenase activity
molecular_function | GO:0016705 | oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen