CmaCh14G016540 (gene) Cucurbita maxima (Rimu) v1.1

Overview
Name: CmaCh14G016540
Type: gene
Organism: Cucurbita maxima (Cucurbita maxima (Rimu) v1.1)
Description: Procollagen-proline 4-dioxygenase
Location: Cma_Chr14: 12395828 .. 12403613 (-)
RNA-Seq Expression: CmaCh14G016540
Synteny: CmaCh14G016540
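
The Location field in the overview can be turned into a genomic span length with a few lines of Python. This is a minimal sketch, not part of the database: the function names `parse_location` and `span_length` are hypothetical, and the length calculation assumes 1-based, inclusive coordinates, which the page itself does not state.

```python
import re

def parse_location(text):
    """Parse 'SeqID: start .. end (strand)' into (seqid, start, end, strand)."""
    m = re.fullmatch(r"\s*(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*", text)
    if m is None:
        raise ValueError(f"unrecognised location: {text!r}")
    return m.group(1), int(m.group(2)), int(m.group(3)), m.group(4)

def span_length(start, end):
    # Assumption: coordinates are 1-based and inclusive (not stated on the page).
    return end - start + 1

seqid, start, end, strand = parse_location("Cma_Chr14: 12395828 .. 12403613 (-)")
print(seqid, strand, span_length(start, end))
```

Under the inclusive-coordinate assumption the gene spans 7786 bp on the minus strand of Cma_Chr14.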
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

TGCTTCGTTCTCTTTCTCTACTGATCCGATTCAGTTCATGGAGAAATTTTGCAGTTTCAATCCGCTGTTTTTCTTCTCATTATCGATTTCGTTGCTTCTCGGGCGAGCTTCAAGCTCCTATGCAGGTTCCGCTAGCTCAATTGTCAATCCTGCTAAAGTAAAGCAGATTTCATGGAATCCTCGGTACTTCAATTTCTTGCCGTTCTGCTTCTTCTTTTCCTTGTTGTAGTTTCTAAGTTTTGATAATCTATTTCTTCGTTTGATGTTCGCGGTTCGATGTTTCCGTTTCAATTTGAATGATTATATAGGGCGTTTGTGTATGAAGGTTTTCTCACGGACTTAGAATGCGATCATCTCATCTCGATTGTGAGTAGAATTAGAATTTATTCGTTTTTCATTTGTGATACCGCAGTAGTACGCGATTTTCGTTTTGGTTAATTTTTGTTCTGTTTAAATTAGGCTAAAGCTGAATTGAAGAGATCTGCTGTTGCCGATAATTTGTCAGGAGAGAGCAAGGTCAGCGAGATCCGAACTAGTTCTGGGGCGTTTATTCAAAAATCCAAGGTTTGTGCATAGTTTGCTTCTTATAGTTCTTAAGCTTCCAATAGCGAAGCAAGTGATTTGACAAGTTTTTGCTTCAATTTTTAGCTTGAATCACCATCAAGCCTAAATCTTAAACTCTTAGCTTGAAATTTATTGGGTTGATAGACTTTAGATTCCCTCATGGGCTCCAAGATTAGTTTGGTTCGTTCGACCGTCTATTGAATATTGGTTTGATTTATGAAGTCTAGCTACCTTAATATGAAATGAGCGCGTACATGGCCTTGAAGTAATTTAAACCTGAGATCATCTTTTCTGAGATATTGTTATAATCACCATTTGAGGAAAGAAATAAAAAGTATAAGTGTTACTCAGTCGGTCTATTCATTGATCGATATTGATTTGGAAGGGTCAGCTGTTACTACTTCATGCCAGTAGTTTGGAAGATTAGTATTTTGGATGTCAAATTTTACGGAATGTGTTTAGCAGACTTAGCTGATTGAAATTTCAAATTTGTCTGCTTAGTTGACTATTGTAGCTGGAATACATGAAATGCCATCCTTGATCCTTCTGAGTATACAGATAGGGAGCCATTTAGTATTTTGAACGTCAAATTTTAAAGAATGGGTTCAGCAGCCTTAGCTGCTTGAAATTTCAAAATTTTCTGCTTAGTTGACTATTGTTGTTGGAAAACATTAAATGCCATCCTTGATCCTTCTGAGTGTACAGATAGGGAGCCAATTTTACTTATAAACCTCTAACCTGATGAGGCACTCTCTATATCCCTCTAAGTAAACTCATTGTTTCTCATGTAGGATCCTATTGTTTCTGGTATAGAAGACAAAATTTCAGCATGGACATTTCTGCCAAAAGGTATTATAACACTAGCAAAAGGTGCAACTTTCTTGCTAATCCCATAGTCTTTAGCAACTTATCCTCCATTACAAACTGTTTTCCGTCATCAAAAATTGAAGATTGAATAGATTGACAATTGAGTGTGAAATTATGATCTTTCTTAATGCAACTTTGGAGTTGAATAGATTGACATGATATAGGTTTACAGTTGAATTCTAAATTCTGGGTTCTCTAGTTTTATCCGTTCTTATGACTGAAATCATAAAGAGAAGTCTTGGGCTATTTATATTCCAATTTCCATATAAAATAGGGAACTTAGGAAGACATAATCCCCTCGTGAAATACATACATAATCAAGCGATAGACTGAGGAAGAATTGTGCAGCTGGGAAGAAACGGAGTTTCAATTTAGCCCTTCTCATTTATTTATTTTTTATTGAGAAAAGAAAAACTATGAATGGAAACTCTATTCATATAAACACTTGACTCATGCATTGATAGCATGTCTGTGTATTGGTATAGCTGGTAAACCGAAACTTGAATATGCTTGATTTCAGAAAATGGAGAAGACATTCAAGTGTTGAGATATGAATATGGGCAGAAGTA
TGATGCACACTTCGATTACTTTGCCGACGAGGTTAATATTGCCCGTGGTGGACATCGAATGGCAACCGTTCTCATGTATCTTTCCGACGTAAAAAGAGGCGGTGAAACTGTGTTTCCTTCTGCAGAGGTCTGCCCTTGTATTTCCCTTCAATCCAATTAATTGCAACTTCCTTTTTTGTGGCTGTACCTCGACCATAATCAAGTGTCTTGGTTCTTGTTTGTACCATATTCTTTGAAAACTAATTTTCTCTCGAGTTACCTTACTGCGGCTCTTAGAATTGGAAAAGTTTCCTGTTTTGGTATTTACGTGTGCCATTATCAATTGATGGGTTTTTTTTATTCATTAAAATCTAAATGTTGAATATCTTTGTTCATCGAGTTCAGGAATCTCAAAGACGTCAGGCTTCTGAAACAAACGAAGATCTCTCAGACTGTGCAAAGAAAGGGATAGCAGGTGACTGTCTTCAGTTCTCTTACTCCTGTTCTGTGATTTGTGACGTTGGTTCAAACTCATCGAGATTTCTTATCGTTAAAAGATTTAATTTGTGATCATATTGTCTAAGAAGTTGTGATTTGATTTGATTAGTTAGCTGTTACTCTTTATTAGCTAATTTTTATCTTTTTGATTGCTTTGTAAAGCTATTTATATAGCTGTTTTTGCTCATCAAACAACTTTTTTTTTTTTTGAACACTTTTTAATCTACGTGCTTGAGATATTCAGTGAAACCGCGGAAAGGCGACGCTCTTCTGTTCTTCAGTCTTCATCCAAATGCTGTTCCAGACACAAGTAGTCTGCATGGAGGGTGCCCTGTGATTGAAGGCGAGAAATGGTCAGCAACTAAGTGGATTCGTGTCAATCCTTTCGACCAGATTGTGGGAGACTACATGAATTGCAGTGATGAGAATGCAAGTTGTGAGAGATGGGCTGAGCTCGGCGAGTGCAATGATAACCCAGAGTATATGGTGGGATCTCCTGAGTTTCCTGGCTACTGCAGAAAAAGTTGCAAGGTGTGTTCATAAACTTCCTCCATTCTCTATGCACATTCCTTATCTTTGTGTGCCAGAAACTGAGAAGGTAGTTTTTTTAGAGCATTTTCATTGCAATGTTTTGTGACACTTAGCTGCGCGTATTGATTACTCACTTATAATAGTCATCTCAGTAGGGAGATTTTGATAATTATTGATATAAATTTGAAATAAATTTTATAGTTTTTCTCTTGAACATGAGGTGTATAGACCTGTCAACCACAACATTGTAATAACTTTCTTCATTCATTAACTACTCGACGTATCATTCATAACTTGAACTATTCATGAAACGAAAAATATGTCTCGCTTTAAGCTTATAATTTTTATAGTTTTTCTCTTGAACATGAGGTGTATAGACCTGTCAGCCACATTGTAATAACTTCCTTCATTAACTCCTCGACGTATCATTCATAACTTGAACAACTATTCATGAAACGAAAAATATGTCTCGCTTTAAGCTTTGGTATTGGTTAGGTTGTTATTATTTTTTTAGTTTTGATTTGGTTAAATTTTTGGAAAATAGAATATTTAAATCAATTTACAAATTATCATTTTATTTTTTTTTTTTGTCGTGTGACCAACCTGTAAATAGAGTTACAACTCAATCCTATCGTCTTGCTTATATGACAAAAAATATTTGATGTTTATAATCAATGATAGCGAAATATAAAAACCAACTAAGTCCATTAATTAAAAATTTCAGATTTATCAAAAAAAAAAAAAAAATTGACAAAGAAACTTGAATTAAAAAATAAAAAACATTTTTCAAACTCAAAATTCAAAAAAAGTATTCCACAGATCAGACGGTTAGAAATGATTGCTGTCTTGACCCAAATTGAGAATTTGTTGAAGAAAAAGATAACAAAAGAGAGAGACGGTGATGATGGTCTTTTGGATTGGTATTTAAGGATATTTAGGTTTTGTCGTTCCGTTTTGGTGATTTAACAACAAAACCTTTCAAAATATCTATA
ACATTGATTATATGTAAAACAGTTTTCTATTAAAATACTATTTTACATTCTTAAAAGCAAAGTAGTATATATATATATATTAAAATTTAAAATCACAGTCTGAGGACTTCTTATGGCCGAACAACGAAGGCTCAAACCTCTGACTCTGTTTTAAGAAATTTTAGAAAATGGGGTAACTTAGATAGCACATAATTTCGAGTAGGGGGATGCAACCTATATATATAACGGGATTCTTTCATTCGAGATTCCACGCCCCATTTAGATATATCAAAGAAAATAAATTAGTGCTATATAAAGTCAAGAATTTTGTTATTTTTTATAAAATTGAAAAGTGAATACTAATATATATATATATATATATATATATATAGTTGTTACTTTTGAAAGAGGCCCTATATGTTTGTAGCTAGTACGGATAAGGTTTTTTGGAAATTCAATATGAAAAATACATTGGTAACATAAATAAAAAGCAACCAATGAAAAATTTCAAAAGAAAAAAAAGAAGACCAATGAAATTTGGAGGACCATATAAAACAAAGTATCACTGCAAAATGCAAATCAACGAAAAGAAAACGACAAGCAAAGGTGGAGGCTTTGATCCTGTAAGAACGACCCTCAAATCCAGCCTGATAAAAAGCTGCTTTGGACTCTGCCTCCACTTCGCCTCCTCCTTTGCAGGCGCCCGGCCATCGCCGAACCCCACTTCAAGTTTATCTATCTATTCTTCACCAATCCCTTCAAACCCCTCATCATCCTTAAACTTCATTCCTAAGATCTGTAATCGAACACCAACTTTTGTGGATCGCAGCCAAGTTCTAAGAATCTTCTTCATATTTTCGTAGTCAACAGAAGCTTCTGTGAGTAAAAATGGAGGTGGGATTGATGCAGAGACAGCGGGTTCAGTATGTAAAAGGCCTACTCGGTGAGGTAAAACTTGAAATCTGAAGTGGGTTTTGGTTTGTTTTGTGATGATTGTGAATAATCTTATGTTTGTTCTGTGTGATTTTGTGGGTTTTGGTTTGTGTGGATCTATCTATGTGGTTTTGGTGGGAGAAGGGCACTCTGGATAGTCAGTATTTGCAGCTTTTGCAACTGCAAGATGAGAGTAATCCAACTTTCGTTTCTGAAGTGGCGACTCTTTTCTTTGAAGATACCGAGGAGCTTCTCAATAAACTGAGAGTCGCTCTGTAAGGACTCAGCCTCTTTGGCTGCTTTCTTGGAGTTTTATTCTATCTCTCTCTCTCTCTCTCTCTCTCTCTCTGTCTCATCAATACCAAGTTAAGCAATCATCCAGTCGTGCAGATCCTTTTTTATGACTTGTTTTTCTTGGTTGAATCGTTTTGTGTTTCTCTGTGTTTCTGTCAGATTACAGCCATCTGTGGACTTCAAAAAGATTGATGATCATGTACACCAGCTGAAGGGCAGCAGCTCCAGGTACAAAATTTCCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTAAAAATGGATAGACGATGATAGTCAGAACTTCGGGAGCTTTGGGTTCCAAGCCTTTAAAGATGTGATTAAAACTTGTTTGTGTGAGTTTGCACGTCCCTGTATTACAGAGTTCAAACTTACCACACTGTGTTTGGAAAAACTTTTCGAACTGCCTAAACTTAGCCTGGGTTAAAAGATATACTATCGACCAAAAGGTTAGATATTCAAATCTCTCCACTCCTAATGTTTTTCCTTTCGTATTTCCCCTCAAGGTTGTCCTCTTTGAGCTTTCCCTTTTGTACTTCTCCTCAAGGTTGTCCTCTTTGGGCTTTCCTTTTCGGACTTCCCCTCAAGGTTTTTAAAACGCGTCTGTTAGAGAAAGGTTTCCACACCCTTATAAATAAGGGTGTTTCGTTATCCTCCCAATCGATATGGGATCTCACAATCCACCCTCCTTCAGGGCACAACGTCCTCACTGGCACTCGTTCCTTTCTCCAATCGATGTGGGACCCCTACCAAATCCATCCTCTTCGGGGCTCAGCAT
ACTTATTGGCACACCACCTCATGTCTACCCCCCTTCAGGAAATAGCCTTCTCGCTGGTACATCGCCCAGTGTCTGGCTCTGATACCATTTGTAACGGCTCAAGCCCACTGCTAGCAGATATTGTTCTCTTTGGGCTTTCCCTTTCGGCCTTCTCCTCAAGGTTTATAAAACGCGTTTTTTAGGAAAAGGTTTTCACACCCTTATAAAGGGTGTTTCGTTCTCCTCCTCAACCGATATGAGATCTCACAATATTGAATTGAAATAATTTTTTCGGGTTGAGATGCAAGACTGATGATGGGAACTGATCATTGAATTTTGAGCATTCAAGGGAGGACGAGCGAGGTTTATAACGATGTATCAATGGTCAGGATATTCATGATGCTTCTCAAATCTAAAAGTCATGAAGAAAATGCTGACCTTGTTTTGTTTGTTGTTTTGACTCCTCTGTGTCCAATGCAGCATAGGTGCACTTAGAGTGAAGAATGCCTGCATTGACTTCCGGAGCGCCTGCGAGCAACAAAGTCCAGAATGGTGAGCTATTTTCACTATTTGATTCATCTTTCAAAGTTCTCTTCCCAGTTAGCAGTCAGGTCTGAGGAATAACTAAGTGACAAGAATAATCAAGTGATTAGTAAGTAAACTGTTGTTTCCAGGTGTTCAAGATGTCTCCAACAAGTAGAGCAAGCATTCTATGGTGTAAAAGATAAGCTCAGTTATCTATATGCTGTAAGTAGAGAAATCTGCTACTTAAGGGCTTTTCTTCTTATCATCTCACACTTACCTTGGTCGGTGTCTGTGACAAAAATGGATAAATTGACTACAAACCTTGTTTACTGCAGCTGGAGCAACGGATTTTGAATGCTGGTGGATCCATCCCAGTGGACTTGGGTTCCTAAACTGACAAAATCCATGAAAACCAAGAGTTTTCCATTCGTTCCTGTCGAGTCTCCTTTCTCTGTTCTCTAGGCTCTTAAAATGGCTGTTACTTGTGCTTGTACAAATTCAGAACCTTTGCCCTTTTCACTCAGAACATCAATGTTTTCTACAGCTGATTTCGACTTTTGATTTTTCAAGTCTGTTTGAGCTGAACTGTTTGTATTGCATGGAGTCCTGAAAGATTTTCTGGGTGTTTTCATAGAGGCAAATACTCTCTTTCATCTTTGTTCCCCTAGTATTCTGCTAATCAAAACATACCTGGTCATTTGTGGGTTTCATTTGAATTTTCTTGGATTTCCACTCTTATTTGAAACAAACAAAATCAACAAATGGACAAAGTAGGCCGCCAATGGACTAAGAACAGCATCAAGCTTAGATGATGACATAAATTGATAGGGGGATAATCTAGATCTTATTGTTTGTTGCTTTCTATGAACCACGAAATGGAAACCTTTTTTTATGTTTATACAAATGCCCCAAAAGTTTTCATCTCGGTGAATATCAGCAAACACATGATCCATGGCAATAATTACAGGTGTGGTTTTGCAGTTTATTCCACCTTCTTTTCTGGTCATTAAAGCTTTTGACCACTGCAGATTGTATCTTATCACTTAGCCGGGTTGCAAATAGGTGAAAGTAGGCTGTTTGGTTCAAATTCTGCAGTACTCGGATTAATTTCCCGCCCTTCTCATAGATGACAGTTTGGCCATCTTGAGATTCAACACCAGCTGCGAAGATTTTTACAGGCCTGAATTGAAAAACTGGCTCCACACGGCCTTCTTCAGCCAGGACTACTGCACCAAGAAGTTCCCCCAAGAAAGTATCCTACAGCAGAACAAGCTTTTATTAC

mRNA sequence

TGCTTCGTTCTCTTTCTCTACTGATCCGATTCAGTTCATGGAGAAATTTTGCAGTTTCAATCCGCTGTTTTTCTTCTCATTATCGATTTCGTTGCTTCTCGGGCGAGCTTCAAGCTCCTATGCAGGTTCCGCTAGCTCAATTGTCAATCCTGCTAAAGTAAAGCAGATTTCATGGAATCCTCGGGCGTTTGTGTATGAAGGTTTTCTCACGGACTTAGAATGCGATCATCTCATCTCGATTGCTAAAGCTGAATTGAAGAGATCTGCTGTTGCCGATAATTTGTCAGGAGAGAGCAAGGTCAGCGAGATCCGAACTAGTTCTGGGGCGTTTATTCAAAAATCCAAGGATCCTATTGTTTCTGGTATAGAAGACAAAATTTCAGCATGGACATTTCTGCCAAAAGAAAATGGAGAAGACATTCAAGTGTTGAGATATGAATATGGGCAGAAGTATGATGCACACTTCGATTACTTTGCCGACGAGGTTAATATTGCCCGTGGTGGACATCGAATGGCAACCGTTCTCATGTATCTTTCCGACGTAAAAAGAGGCGGTGAAACTGTGTTTCCTTCTGCAGAGGAATCTCAAAGACGTCAGGCTTCTGAAACAAACGAAGATCTCTCAGACTGTGCAAAGAAAGGGATAGCAGTGAAACCGCGGAAAGGCGACGCTCTTCTGTTCTTCAGTCTTCATCCAAATGCTGTTCCAGACACAAGTAGTCTGCATGGAGGGTGCCCTGTGATTGAAGGCGAGAAATGGTCAGCAACTAAGTGGATTCGTGTCAATCCTTTCGACCAGATTGTGGGAGACTACATGAATTGCAGTGATGAGAATGCAAGTTGTGAGAGATGGGCTGAGCTCGGCGAGTGCAATGATAACCCAGAGTATATGGTGGGATCTCCTGAGTTTCCTGGCTACTGCAGAAAAAGTTGCAAGTCAACAGAAGCTTCTGTGAGTAAAAATGGAGGTGGGATTGATGCAGAGACAGCGGGTTCAGGCACTCTGGATAGTCAGTATTTGCAGCTTTTGCAACTGCAAGATGAGAGTAATCCAACTTTCGTTTCTGAAGTGGCGACTCTTTTCTTTGAAGATACCGAGGAGCTTCTCAATAAACTGAGAGTCGCTCTATTACAGCCATCTGTGGACTTCAAAAAGATTGATGATCATGTACACCAGCTGAAGGGCAGCAGCTCCAGCATAGGTGCACTTAGAGTGAAGAATGCCTGCATTGACTTCCGGAGCGCCTGCGAGCAACAAAGTCCAGAATGGTGTTCAAGATGTCTCCAACAAGTAGAGCAAGCATTCTATGGTGTAAAAGATAAGCTCAGTTATCTATATGCTCTGGAGCAACGGATTTTGAATGCTGGTGGATCCATCCCAGTGGACTTGGGTTCCTAAACTGACAAAATCCATGAAAACCAAGAGTTTTCCATTCGTTCCTGTCGAGTCTCCTTTCTCTGTTCTCTAGGCTCTTAAAATGGCTGTTACTTGTGCTTGTACAAATTCAGAACCTTTGCCCTTTTCACTCAGAACATCAATGTTTTCTACAGCTGATTTCGACTTTTGATTTTTCAAGTCTGTTTGAGCTGAACTGTTTGTATTGCATGGAGTCCTGAAAGATTTTCTGGGTGTTTTCATAGAGGCAAATACTCTCTTTCATCTTTGTTCCCCTAGTATTCTGCTAATCAAAACATACCTGGTCATTTGTGGGTTTCATTTGAATTTTCTTGGATTTCCACTCTTATTTGAAACAAACAAAATCAACAAATGGACAAAGTAGGCCGCCAATGGACTAAGAACAGCATCAAGCTTAGATGATGACATAAATTGATAGGGGGATAATCTAGATCTTATTGTTTGTTGCTTTCTATGAACCACGAAATGGAAACCTTTTTTTATGTTTATACAAATGCCCCAAAAGTTTTCATCTCGGTGAATATCAGCAAACACATGATCCATGGCAATAATTACAGGTGTGGTTTTGCAGTTTATTCCACCT
TCTTTTCTGGTCATTAAAGCTTTTGACCACTGCAGATTGTATCTTATCACTTAGCCGGGTTGCAAATAGGTGAAAGTAGGCTGTTTGGTTCAAATTCTGCAGTACTCGGATTAATTTCCCGCCCTTCTCATAGATGACAGTTTGGCCATCTTGAGATTCAACACCAGCTGCGAAGATTTTTACAGGCCTGAATTGAAAAACTGGCTCCACACGGCCTTCTTCAGCCAGGACTACTGCACCAAGAAGTTCCCCCAAGAAAGTATCCTACAGCAGAACAAGCTTTTATTAC

Coding sequence (CDS)

ATGGAGAAATTTTGCAGTTTCAATCCGCTGTTTTTCTTCTCATTATCGATTTCGTTGCTTCTCGGGCGAGCTTCAAGCTCCTATGCAGGTTCCGCTAGCTCAATTGTCAATCCTGCTAAAGTAAAGCAGATTTCATGGAATCCTCGGGCGTTTGTGTATGAAGGTTTTCTCACGGACTTAGAATGCGATCATCTCATCTCGATTGCTAAAGCTGAATTGAAGAGATCTGCTGTTGCCGATAATTTGTCAGGAGAGAGCAAGGTCAGCGAGATCCGAACTAGTTCTGGGGCGTTTATTCAAAAATCCAAGGATCCTATTGTTTCTGGTATAGAAGACAAAATTTCAGCATGGACATTTCTGCCAAAAGAAAATGGAGAAGACATTCAAGTGTTGAGATATGAATATGGGCAGAAGTATGATGCACACTTCGATTACTTTGCCGACGAGGTTAATATTGCCCGTGGTGGACATCGAATGGCAACCGTTCTCATGTATCTTTCCGACGTAAAAAGAGGCGGTGAAACTGTGTTTCCTTCTGCAGAGGAATCTCAAAGACGTCAGGCTTCTGAAACAAACGAAGATCTCTCAGACTGTGCAAAGAAAGGGATAGCAGTGAAACCGCGGAAAGGCGACGCTCTTCTGTTCTTCAGTCTTCATCCAAATGCTGTTCCAGACACAAGTAGTCTGCATGGAGGGTGCCCTGTGATTGAAGGCGAGAAATGGTCAGCAACTAAGTGGATTCGTGTCAATCCTTTCGACCAGATTGTGGGAGACTACATGAATTGCAGTGATGAGAATGCAAGTTGTGAGAGATGGGCTGAGCTCGGCGAGTGCAATGATAACCCAGAGTATATGGTGGGATCTCCTGAGTTTCCTGGCTACTGCAGAAAAAGTTGCAAGTCAACAGAAGCTTCTGTGAGTAAAAATGGAGGTGGGATTGATGCAGAGACAGCGGGTTCAGGCACTCTGGATAGTCAGTATTTGCAGCTTTTGCAACTGCAAGATGAGAGTAATCCAACTTTCGTTTCTGAAGTGGCGACTCTTTTCTTTGAAGATACCGAGGAGCTTCTCAATAAACTGAGAGTCGCTCTATTACAGCCATCTGTGGACTTCAAAAAGATTGATGATCATGTACACCAGCTGAAGGGCAGCAGCTCCAGCATAGGTGCACTTAGAGTGAAGAATGCCTGCATTGACTTCCGGAGCGCCTGCGAGCAACAAAGTCCAGAATGGTGTTCAAGATGTCTCCAACAAGTAGAGCAAGCATTCTATGGTGTAAAAGATAAGCTCAGTTATCTATATGCTCTGGAGCAACGGATTTTGAATGCTGGTGGATCCATCCCAGTGGACTTGGGTTCCTAA

Protein sequence

MEKFCSFNPLFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDLECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFLPKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSAEESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEKWSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCKSTEASVSKNGGGIDAETAGSGTLDSQYLQLLQLQDESNPTFVSEVATLFFEDTEELLNKLRVALLQPSVDFKKIDDHVHQLKGSSSSIGALRVKNACIDFRSACEQQSPEWCSRCLQQVEQAFYGVKDKLSYLYALEQRILNAGGSIPVDLGS
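
The relationship between the coding sequence and the protein sequence listed above can be checked by translating the CDS codon by codon. The sketch below assumes the standard genetic code; the helper `translate` is a hypothetical name, and only the first and last few codons of the CDS are used here to keep the example short.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order at each position.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":  # stop codon: end of the polypeptide
            break
        protein.append(aa)
    return "".join(protein)

# First nine codons and last three codons of the CDS shown above.
print(translate("ATGGAGAAATTTTGCAGTTTCAATCCG"))
print(translate("GGTTCCTAA"))
```

The two fragments translate to MEKFCSFNP and GS, which agree with the start and end of the listed protein sequence. Passing the full CDS string to `translate` would be the way to check the whole record.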
Homology
BLAST of CmaCh14G016540 vs. ExPASy Swiss-Prot
Match: Q8LAN3 (Probable prolyl 4-hydroxylase 4 OS=Arabidopsis thaliana OX=3702 GN=P4H4 PE=2 SV=1)

HSP 1 Score: 448.7 bits (1153), Expect = 7.6e-125
Identity = 218/292 (74.66%), Positives = 245/292 (83.90%), Query Frame = 0

Query: 10  LFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDLECDHLISIA 69
           L  F    S+LL ++S+S   S+S  VNP+KVKQ+S  PRAFVYEGFLT+LECDH++S+A
Sbjct: 7   LISFFAIFSVLL-QSSTSLISSSSVFVNPSKVKQVSSKPRAFVYEGFLTELECDHMVSLA 66

Query: 70  KAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFLPKENGEDIQ 129
           KA LKRSAVADN SGESK SE+RTSSG FI K KDPIVSGIEDKIS WTFLPKENGEDIQ
Sbjct: 67  KASLKRSAVADNDSGESKFSEVRTSSGTFISKGKDPIVSGIEDKISTWTFLPKENGEDIQ 126

Query: 130 VLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSAEESQRRQAS 189
           VLRYE+GQKYDAHFDYF D+VNI RGGHRMAT+LMYLS+V +GGETVFP AE   RR  S
Sbjct: 127 VLRYEHGQKYDAHFDYFHDKVNIVRGGHRMATILMYLSNVTKGGETVFPDAEIPSRRVLS 186

Query: 190 ETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEKWSATKWIRV 249
           E  EDLSDCAK+GIAVKPRKGDALLFF+LHP+A+PD  SLHGGCPVIEGEKWSATKWI V
Sbjct: 187 ENKEDLSDCAKRGIAVKPRKGDALLFFNLHPDAIPDPLSLHGGCPVIEGEKWSATKWIHV 246

Query: 250 NPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCKS 302
           + FD+IV    NC+D N SCERWA LGEC  NPEYMVG+ E PGYCR+SCK+
Sbjct: 247 DSFDRIVTPSGNCTDMNESCERWAVLGECTKNPEYMVGTTELPGYCRRSCKA 297
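
The percentages on each HSP summary line are the reported match counts divided by the reported alignment length. The sketch below recomputes them from the summary line of the hit above; `hsp_stats` is a hypothetical helper, and the regular expression simply picks up any `label = n/m` pair, so it does not depend on how the label is spelled.

```python
import re

def hsp_stats(line):
    """Return {label: percentage} for every 'label = n/m' pair on an HSP line."""
    stats = {}
    for key, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", line):
        stats[key] = round(100 * int(num) / int(den), 2)
    return stats

line = "Identity = 218/292 (74.66%), Positives = 245/292 (83.90%), Query Frame = 0"
print(hsp_stats(line))
```

For this hit the recomputed values, 74.66 and 83.9, match the percentages printed in the report.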

BLAST of CmaCh14G016540 vs. ExPASy Swiss-Prot
Match: F4JAU3 (Prolyl 4-hydroxylase 2 OS=Arabidopsis thaliana OX=3702 GN=P4H2 PE=1 SV=1)

HSP 1 Score: 438.7 bits (1127), Expect = 7.9e-122
Identity = 208/287 (72.47%), Positives = 245/287 (85.37%), Query Frame = 0

Query: 15  LSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDLECDHLISIAKAELK 74
           ++I L+L ++S+    S SSI+NP+KVKQ+S  PRAFVYEGFLTDLECDHLIS+AK  L+
Sbjct: 12  VAILLVLLQSSTCLISSPSSIINPSKVKQVSSKPRAFVYEGFLTDLECDHLISLAKENLQ 71

Query: 75  RSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFLPKENGEDIQVLRYE 134
           RSAVADN +GES+VS++RTSSG FI K KDPIVSGIEDK+S WTFLPKENGED+QVLRYE
Sbjct: 72  RSAVADNDNGESQVSDVRTSSGTFISKGKDPIVSGIEDKLSTWTFLPKENGEDLQVLRYE 131

Query: 135 YGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSAEESQRRQASETNED 194
           +GQKYDAHFDYF D+VNIARGGHR+ATVL+YLS+V +GGETVFP A+E  RR  SE  +D
Sbjct: 132 HGQKYDAHFDYFHDKVNIARGGHRIATVLLYLSNVTKGGETVFPDAQEFSRRSLSENKDD 191

Query: 195 LSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEKWSATKWIRVNPFDQ 254
           LSDCAKKGIAVKP+KG+ALLFF+L  +A+PD  SLHGGCPVIEGEKWSATKWI V+ FD+
Sbjct: 192 LSDCAKKGIAVKPKKGNALLFFNLQQDAIPDPFSLHGGCPVIEGEKWSATKWIHVDSFDK 251

Query: 255 IVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCKS 302
           I+    NC+D N SCERWA LGEC  NPEYMVG+PE PG CR+SCK+
Sbjct: 252 ILTHDGNCTDVNESCERWAVLGECGKNPEYMVGTPEIPGNCRRSCKA 298

BLAST of CmaCh14G016540 vs. ExPASy Swiss-Prot
Match: Q8L970 (Probable prolyl 4-hydroxylase 7 OS=Arabidopsis thaliana OX=3702 GN=P4H7 PE=2 SV=1)

HSP 1 Score: 355.1 bits (910), Expect = 1.1e-96
Identity = 175/312 (56.09%), Positives = 227/312 (72.76%), Query Frame = 0

Query: 4   FCSFNPLFFFSLSI-----SLLLGRASSSYAG-------SASSI-VNPAKVKQISWNPRA 63
           F +F+  F F+L +     +  L R+S++  G       SASS   +P +V Q+SW PR 
Sbjct: 6   FLAFSLCFLFTLPLISSAPNRFLTRSSNTRDGSVIKMKTSASSFGFDPTRVTQLSWTPRV 65

Query: 64  FVYEGFLTDLECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGI 123
           F+YEGFL+D ECDH I +AK +L++S VADN SGES  SE+RTSSG F+ K +D IVS +
Sbjct: 66  FLYEGFLSDEECDHFIKLAKGKLEKSMVADNDSGESVESEVRTSSGMFLSKRQDDIVSNV 125

Query: 124 EDKISAWTFLPKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVK 183
           E K++AWTFLP+ENGE +Q+L YE GQKY+ HFDYF D+ N+  GGHR+ATVLMYLS+V+
Sbjct: 126 EAKLAAWTFLPEENGESMQILHYENGQKYEPHFDYFHDQANLELGGHRIATVLMYLSNVE 185

Query: 184 RGGETVFPSAEESQRRQASETNED-LSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSL 243
           +GGETVFP      + +A++  +D  ++CAK+G AVKPRKGDALLFF+LHPNA  D++SL
Sbjct: 186 KGGETVFP----MWKGKATQLKDDSWTECAKQGYAVKPRKGDALLFFNLHPNATTDSNSL 245

Query: 244 HGGCPVIEGEKWSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSP 302
           HG CPV+EGEKWSAT+WI V  F++       C DEN SCE+WA+ GEC  NP YMVGS 
Sbjct: 246 HGSCPVVEGEKWSATRWIHVKSFERAFNKQSGCMDENVSCEKWAKAGECQKNPTYMVGSD 305

BLAST of CmaCh14G016540 vs. ExPASy Swiss-Prot
Match: F4J0A8 (Probable prolyl 4-hydroxylase 6 OS=Arabidopsis thaliana OX=3702 GN=P4H6 PE=2 SV=1)

HSP 1 Score: 327.0 bits (837), Expect = 3.3e-88
Identity = 164/292 (56.16%), Positives = 211/292 (72.26%), Query Frame = 0

Query: 11  FFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDLECDHLISIAK 70
           +F + S+SLLL     S   S S  V+P ++ Q+SW PRAF+Y+GFL+D ECDHLI +AK
Sbjct: 5   YFLAFSLSLLL---IFSQISSFSFSVDPTRITQLSWTPRAFLYKGFLSDEECDHLIKLAK 64

Query: 71  AELKRS-AVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFLPKENGEDIQ 130
            +L++S  VAD  SGES+ SE+RTSSG F+ K +D IV+ +E K++AWTFLP+ENGE +Q
Sbjct: 65  GKLEKSMVVADVDSGESEDSEVRTSSGMFLTKRQDDIVANVEAKLAAWTFLPEENGEALQ 124

Query: 131 VLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSAEESQRRQAS 190
           +L YE GQKYD HFDYF D+  +  GGHR+ATVLMYLS+V +GGETVFP+    + +   
Sbjct: 125 ILHYENGQKYDPHFDYFYDKKALELGGHRIATVLMYLSNVTKGGETVFPN---WKGKTPQ 184

Query: 191 ETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEKWSATKWIRV 250
             ++  S CAK+G AVKPRKGDALLFF+LH N   D +SLHG CPVIEGEKWSAT+WI V
Sbjct: 185 LKDDSWSKCAKQGYAVKPRKGDALLFFNLHLNGTTDPNSLHGSCPVIEGEKWSATRWIHV 244

Query: 251 NPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCKS 302
             F +     + C D++ SC+ WA+ GEC  NP YMVGS    G+CRKSCK+
Sbjct: 245 RSFGK---KKLVCVDDHESCQEWADAGECEKNPMYMVGSETSLGFCRKSCKA 287

BLAST of CmaCh14G016540 vs. ExPASy Swiss-Prot
Match: Q9LN20 (Probable prolyl 4-hydroxylase 3 OS=Arabidopsis thaliana OX=3702 GN=P4H3 PE=2 SV=1)

HSP 1 Score: 248.8 bits (634), Expect = 1.2e-64
Identity = 114/209 (54.55%), Positives = 155/209 (74.16%), Query Frame = 0

Query: 44  ISWNPRAFVYEGFLTDLECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSK 103
           +SW PRAFVY  FL+  EC++LIS+AK  + +S V D+ +G+SK S +RTSSG F+++ +
Sbjct: 79  LSWEPRAFVYHNFLSKEECEYLISLAKPHMVKSTVVDSETGKSKDSRVRTSSGTFLRRGR 138

Query: 104 DPIVSGIEDKISAWTFLPKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVL 163
           D I+  IE +I+ +TF+P ++GE +QVL YE GQKY+ H+DYF DE N   GG RMAT+L
Sbjct: 139 DKIIKTIEKRIADYTFIPADHGEGLQVLHYEAGQKYEPHYDYFVDEFNTKNGGQRMATML 198

Query: 164 MYLSDVKRGGETVFPSAEESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAV 223
           MYLSDV+ GGETVFP+A  +    +     +LS+C KKG++VKPR GDALLF+S+ P+A 
Sbjct: 199 MYLSDVEEGGETVFPAA--NMNFSSVPWYNELSECGKKGLSVKPRMGDALLFWSMRPDAT 258

Query: 224 PDTSSLHGGCPVIEGEKWSATKWIRVNPF 253
            D +SLHGGCPVI G KWS+TKW+ V  +
Sbjct: 259 LDPTSLHGGCPVIRGNKWSSTKWMHVGEY 285

BLAST of CmaCh14G016540 vs. ExPASy TrEMBL
Match: A0A6J1J084 (Procollagen-proline 4-dioxygenase OS=Cucurbita maxima OX=3661 GN=LOC111480082 PE=3 SV=1)

HSP 1 Score: 615.1 bits (1585), Expect = 2.3e-172
Identity = 300/300 (100.00%), Positives = 300/300 (100.00%), Query Frame = 0

Query: 1   MEKFCSFNPLFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL 60
           MEKFCSFNPLFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL
Sbjct: 1   MEKFCSFNPLFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL 60

Query: 61  ECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFL 120
           ECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFL
Sbjct: 61  ECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFL 120

Query: 121 PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSA 180
           PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSA
Sbjct: 121 PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSA 180

Query: 181 EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK 240
           EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK
Sbjct: 181 EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK 240

Query: 241 WSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCK 300
           WSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCK
Sbjct: 241 WSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCK 300

BLAST of CmaCh14G016540 vs. ExPASy TrEMBL
Match: A0A7J6HHC7 (Uncharacterized protein OS=Cannabis sativa OX=3483 GN=G4B88_007776 PE=4 SV=1)

HSP 1 Score: 599.0 bits (1543), Expect = 1.7e-167
Identity = 304/462 (65.80%), Positives = 353/462 (76.41%), Query Frame = 0

Query: 9   PLFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDLECDHLISI 68
           PLFF           + SSYAGSASSI+NPAKVKQISW PRAF+YEGFLTDLECDHLIS+
Sbjct: 321 PLFFLFTIFFFFFHESFSSYAGSASSIINPAKVKQISWKPRAFIYEGFLTDLECDHLISL 380

Query: 69  AKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFLPKENGEDI 128
           AK+ELKRSAVAD+ SGES++SE+RTSSG FI K+KDPIV+GIEDKIS WTFLPKENGEDI
Sbjct: 381 AKSELKRSAVADSESGESQLSEVRTSSGMFISKAKDPIVAGIEDKISTWTFLPKENGEDI 440

Query: 129 QVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSAEESQRRQA 188
           QVLRYE GQKY+ H+DYFAD+VNI RGGHR+ATVLMYL+DV +GGETVFP A E+ R + 
Sbjct: 441 QVLRYEEGQKYEPHYDYFADKVNIIRGGHRIATVLMYLTDVVKGGETVFPHAVENPRHKP 500

Query: 189 SETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEKWSATKWIR 248
           S T ED S+CAKKG+AVK R+GDALLFFSL P A+PDT SLH GCPVIEGEKWSATKWI 
Sbjct: 501 STTLEDFSECAKKGVAVKARRGDALLFFSLLPTAIPDTLSLHAGCPVIEGEKWSATKWIH 560

Query: 249 VNPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSC--------- 308
           V+ FD+ V     C+D N SCERWA LGEC  N EYMVGSPE PGYCR+SC         
Sbjct: 561 VDSFDKDVSAGGTCTDMNESCERWAALGECTKNKEYMVGSPELPGYCRRSCKIRTSHSIH 620

Query: 309 ----KSTEASVSKNGGGIDAETAG--------SGTLDSQYLQLLQLQDESNPTFVSEVAT 368
               K  E + +   G +  +              LDSQ+LQLLQLQDESNP FV EV +
Sbjct: 621 PILFKRREEAKNMEVGQMQRQWVDYTKSLFMEKRYLDSQFLQLLQLQDESNPDFVVEVVS 680

Query: 369 LFFEDTEELLNKLRVALLQPSVDFKKIDDHVHQLKGSSSSIGALRVKNACIDFRSACEQQ 428
           LFF+DTE+LLN L  AL Q  VDFK++D HVHQLKGSSSSIGA RVKN C+ FR+ CE+Q
Sbjct: 681 LFFDDTEKLLNDLTAALEQQCVDFKRVDAHVHQLKGSSSSIGAERVKNGCVAFRNLCEEQ 740

Query: 429 SPEWCSRCLQQVEQAFYGVKDKLSYLYALEQRILNAGGSIPV 450
           + + C RCLQQV+Q +Y VK+KL  L+ LEQ+I+ AGGSIP+
Sbjct: 741 NTDACLRCLQQVKQEYYLVKNKLENLFRLEQQIVAAGGSIPM 782

BLAST of CmaCh14G016540 vs. ExPASy TrEMBL
Match: A0A6J1GXF3 (Procollagen-proline 4-dioxygenase OS=Cucurbita moschata OX=3662 GN=LOC111457709 PE=3 SV=1)

HSP 1 Score: 595.9 bits (1535), Expect = 1.4e-166
Identity = 293/300 (97.67%), Positives = 295/300 (98.33%), Query Frame = 0

Query: 1   MEKFCSFNPLFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL 60
           MEKFCSFN LFFFSLSISLLL RASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL
Sbjct: 1   MEKFCSFNLLFFFSLSISLLLRRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL 60

Query: 61  ECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFL 120
           E DHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFI+KSKDPIVSGIEDKI+AWTFL
Sbjct: 61  ESDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIKKSKDPIVSGIEDKIAAWTFL 120

Query: 121 PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSA 180
           PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSA
Sbjct: 121 PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSA 180

Query: 181 EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK 240
           EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK
Sbjct: 181 EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK 240

Query: 241 WSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCK 300
           WSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGEC  NPEYMVGSPEFPGYCRKSCK
Sbjct: 241 WSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECMGNPEYMVGSPEFPGYCRKSCK 300

BLAST of CmaCh14G016540 vs. ExPASy TrEMBL
Match: A0A7J7GMG3 (Uncharacterized protein OS=Camellia sinensis OX=4442 GN=HYC85_019329 PE=4 SV=1)

HSP 1 Score: 585.5 bits (1508), Expect = 1.9e-163
Identity = 298/480 (62.08%), Positives = 355/480 (73.96%), Query Frame = 0

Query: 10  LFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDLECDHLISIA 69
           L  F  S+ LL+  +S  Y  S+SSI+NP+K KQ+SW PRAFVYEGFLTD EC+HLISIA
Sbjct: 12  LKMFRFSLILLISISSIIYESSSSSIINPSKAKQVSWKPRAFVYEGFLTDEECNHLISIA 71

Query: 70  KAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFLPKENGEDIQ 129
           K ELKRS+VADN+SG+SK+S++RTSSG FI K+KDPIVSGIE+KI+ WTFLPKENGE IQ
Sbjct: 72  KTELKRSSVADNVSGKSKLSQVRTSSGMFISKAKDPIVSGIEEKIAMWTFLPKENGEAIQ 131

Query: 130 VLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSAEESQRRQAS 189
           VLRYE+GQKYD H+DYF D+VN+ARGGHR+ATVLMYLSDV +GGETVFPSAEE+    +S
Sbjct: 132 VLRYEHGQKYDPHYDYFLDKVNVARGGHRIATVLMYLSDVAKGGETVFPSAEEAP-HHSS 191

Query: 190 ETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEKWSATKWIRV 249
            +++DLS+CAKKGIAVKPRKGDALLFFSLHP A+PD  SLHGGCPVIEGEKWSATKWI V
Sbjct: 192 TSDDDLSECAKKGIAVKPRKGDALLFFSLHPTAIPDPLSLHGGCPVIEGEKWSATKWIHV 251

Query: 250 NPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCKSTEASVSKN 309
           + FD++V    NC+D N +CERWA LGEC  NPEYMVG+PE PGYCR+SC+ +  S +  
Sbjct: 252 DSFDKVVRHGGNCTDGNENCERWAALGECTKNPEYMVGTPELPGYCRRSCRISYNSATDP 311

Query: 310 GGGIDAETAGS----------------------------------------GTLDSQYLQ 369
             G D     +                                        G LDSQ+ Q
Sbjct: 312 QWGSDPHQREACSFVSVKREEEEENQESKQSMEVGQLQRQYVEYTTSLFREGFLDSQFTQ 371

Query: 370 LLQLQDESNPTFVSEVATLFFEDTEELLNKLRVALLQPSVDFKKIDDHVHQLKGSSSSIG 429
           L QLQDESNP FV EV +LFFED+E LLN L  AL Q  VDFKK+D +VHQLKGSSSSIG
Sbjct: 372 LQQLQDESNPDFVVEVVSLFFEDSERLLNDLNKALDQKGVDFKKVDGNVHQLKGSSSSIG 431

Query: 430 ALRVKNACIDFRSACEQQSPEWCSRCLQQVEQAFYGVKDKLSYLYALEQRILNAGGSIPV 450
           A RVKNAC+ FR+ CE+ + E C  CLQQV+Q +  VK+KL  L+ LEQ+IL AGGS+P+
Sbjct: 432 ANRVKNACVAFRNYCEEHNTEACFGCLQQVKQEYVLVKNKLETLFGLEQQILAAGGSVPM 490

BLAST of CmaCh14G016540 vs. ExPASy TrEMBL
Match: A0A1Q3B5T1 (Procollagen-proline 4-dioxygenase OS=Cephalotus follicularis OX=3775 GN=CFOL_v3_06885 PE=3 SV=1)

HSP 1 Score: 583.6 bits (1503), Expect = 7.3e-163
Identity = 299/474 (63.08%), Positives = 353/474 (74.47%), Query Frame = 0

Query: 11  FFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDLECDHLISIAK 70
           F + L IS+++ +  SS+  S SS+++P+KVKQIS  PRA+VYEGFLT LECDHLIS+AK
Sbjct: 33  FLYFLWISIIIQQCWSSFVSSPSSVIDPSKVKQISTKPRAYVYEGFLTGLECDHLISLAK 92

Query: 71  AELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFLPKENGEDIQV 130
           +ELKRSAVADNLSG+SK+SE+RTSSG FI K KDPIV GIEDKIS WTFLPKENGEDIQV
Sbjct: 93  SELKRSAVADNLSGKSKLSEVRTSSGMFIPKGKDPIVVGIEDKISTWTFLPKENGEDIQV 152

Query: 131 LRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSAEESQRRQASE 190
           LRYE GQKY+ HFDYF D+VNIARGGHR+ATVL+YL+DV +GGETVFPSAE S RR+ S 
Sbjct: 153 LRYEPGQKYEPHFDYFVDKVNIARGGHRVATVLLYLTDVAKGGETVFPSAEVSTRRKVSA 212

Query: 191 TNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEKWSATKWIRVN 250
           TN DLS+C +KG+AVKPR+GDALLFFSLHPNA+PD SSLH GCPVIEGEKWSATKWI V+
Sbjct: 213 TNSDLSECGRKGVAVKPRRGDALLFFSLHPNALPDQSSLHAGCPVIEGEKWSATKWIHVD 272

Query: 251 PFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCKSTE------- 310
            FD+ +    NC+D N SCE+WA LGEC  N EYMVGSPE PGYCR+SCK  E       
Sbjct: 273 SFDKNLTAGGNCTDSNESCEKWAALGECTKNREYMVGSPELPGYCRRSCKMREFKPQQHL 332

Query: 311 ------ASVSK------NGGGIDAETAG-------------------SGTLDSQYLQLLQ 370
                  SVSK        G I  +  G                    G LD Q+LQL Q
Sbjct: 333 LKWTLCVSVSKRIEKERERGRIQRQHKGMEVGQMQRRLLDYTKTLFMEGFLDGQFLQLQQ 392

Query: 371 LQDESNPTFVSEVATLFFEDTEELLNKLRVALLQPSVDFKKIDDHVHQLKGSSSSIGALR 430
           LQDESNP FV EV +LFF+D+E LLN L  AL QPSVDF ++D HVHQLKGSSSSI A R
Sbjct: 393 LQDESNPGFVVEVVSLFFDDSERLLNDLTWALDQPSVDFGRVDSHVHQLKGSSSSIAAQR 452

Query: 431 VKNACIDFRSACEQQSPEWCSRCLQQVEQAFYGVKDKLSYLYALEQRILNAGGS 447
           +KNA + FR+ CE+Q+ E C RCLQQ++Q +Y  ++ L  L+ LEQ+I+ AGGS
Sbjct: 453 IKNASLAFRNFCEEQNVEACHRCLQQIKQEYYLARNNLETLFRLEQQIVAAGGS 506

BLAST of CmaCh14G016540 vs. NCBI nr
Match: XP_022980799.1 (probable prolyl 4-hydroxylase 4 [Cucurbita maxima])

HSP 1 Score: 615.1 bits (1585), Expect = 4.7e-172
Identity = 300/300 (100.00%), Positives = 300/300 (100.00%), Query Frame = 0

Query: 1   MEKFCSFNPLFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL 60
           MEKFCSFNPLFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL
Sbjct: 1   MEKFCSFNPLFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL 60

Query: 61  ECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFL 120
           ECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFL
Sbjct: 61  ECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFL 120

Query: 121 PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSA 180
           PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSA
Sbjct: 121 PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSA 180

Query: 181 EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK 240
           EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK
Sbjct: 181 EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK 240

Query: 241 WSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCK 300
           WSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCK
Sbjct: 241 WSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCK 300

BLAST of CmaCh14G016540 vs. NCBI nr
Match: KAF9678089.1 (hypothetical protein SADUNF_Sadunf08G0175600 [Salix dunnii])

HSP 1 Score: 601.3 bits (1549), Expect = 7.0e-168
Identity = 301/440 (68.41%), Positives = 355/440 (80.68%), Query Frame = 0

Query: 11  FFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDLECDHLISIAK 70
           F F LSISL+L + S SY  ++SSI+NPAKVKQ+S  PRAFVY+GFLTDLECDHLIS+AK
Sbjct: 30  FLFLLSISLILHK-SISYPATSSSIINPAKVKQVSSKPRAFVYKGFLTDLECDHLISLAK 89

Query: 71  AELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFLPKENGEDIQV 130
           +ELKRSAVADN SG+SK+SE+RTSSG FI K+KDPIVSGIEDKI+ WTFLPKENGEDIQV
Sbjct: 90  SELKRSAVADNESGKSKLSEVRTSSGMFIAKAKDPIVSGIEDKIATWTFLPKENGEDIQV 149

Query: 131 LRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSAEESQRRQASE 190
           LRYE+GQKYD H+DYF+D+VNIARGGHR+ATVLMYL+DV++GGETVFPSAEE+ RR+AS 
Sbjct: 150 LRYEHGQKYDPHYDYFSDKVNIARGGHRLATVLMYLTDVEKGGETVFPSAEEAPRRKASV 209

Query: 191 TNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEKWSATKWIRVN 250
           ++EDLS+CA+KGIAVKP +GDALLFFSL+P AVPDTSSLH GCPVIEGEKWSATKWI V+
Sbjct: 210 SHEDLSECARKGIAVKPHRGDALLFFSLYPTAVPDTSSLHAGCPVIEGEKWSATKWIHVD 269

Query: 251 PFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCKSTEASVSKNG 310
            FD+ +    NC+D+N  C RWA LGEC  NPEYMVGSP  PGYCR+SCK          
Sbjct: 270 SFDKNLEAGGNCTDQNDGCGRWAALGECTKNPEYMVGSPALPGYCRRSCK---------- 329

Query: 311 GGIDAETAGSGTLDSQYLQLLQLQDESNPTFVSEVATLFFEDTEELLNKLRVALLQPSVD 370
                     G LD+Q+ QL  LQD+SNP FV+EV +LFFED+E LL  L   L Q ++D
Sbjct: 330 ----------GFLDAQFHQLQLLQDDSNPDFVAEVVSLFFEDSERLLADLTFVLEQQNID 389

Query: 371 FKKIDDHVHQLKGSSSSIGALRVKNACIDFRSACEQQSPEWCSRCLQQVEQAFYGVKDKL 430
           FKK+D HVHQ KGSSSSIGA RVKN CI FR+ CE+Q+ E C RCLQQV+Q +Y VK KL
Sbjct: 390 FKKVDAHVHQFKGSSSSIGAERVKNDCIAFRNFCEEQNIEGCLRCLQQVKQDYYLVKSKL 448

Query: 431 SYLYALEQRILNAGGSIPVD 451
             L  LEQ+I+ AGGSIP++
Sbjct: 450 EALIRLEQQIVAAGGSIPME 448

BLAST of CmaCh14G016540 vs. NCBI nr
Match: KAF4393790.1 (hypothetical protein G4B88_007776 [Cannabis sativa])

HSP 1 Score: 599.0 bits (1543), Expect = 3.5e-167
Identity = 304/462 (65.80%), Positives = 353/462 (76.41%), Query Frame = 0

Query: 9   PLFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDLECDHLISI 68
           PLFF           + SSYAGSASSI+NPAKVKQISW PRAF+YEGFLTDLECDHLIS+
Sbjct: 321 PLFFLFTIFFFFFHESFSSYAGSASSIINPAKVKQISWKPRAFIYEGFLTDLECDHLISL 380

Query: 69  AKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFLPKENGEDI 128
           AK+ELKRSAVAD+ SGES++SE+RTSSG FI K+KDPIV+GIEDKIS WTFLPKENGEDI
Sbjct: 381 AKSELKRSAVADSESGESQLSEVRTSSGMFISKAKDPIVAGIEDKISTWTFLPKENGEDI 440

Query: 129 QVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSAEESQRRQA 188
           QVLRYE GQKY+ H+DYFAD+VNI RGGHR+ATVLMYL+DV +GGETVFP A E+ R + 
Sbjct: 441 QVLRYEEGQKYEPHYDYFADKVNIIRGGHRIATVLMYLTDVVKGGETVFPHAVENPRHKP 500

Query: 189 SETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEKWSATKWIR 248
           S T ED S+CAKKG+AVK R+GDALLFFSL P A+PDT SLH GCPVIEGEKWSATKWI 
Sbjct: 501 STTLEDFSECAKKGVAVKARRGDALLFFSLLPTAIPDTLSLHAGCPVIEGEKWSATKWIH 560

Query: 249 VNPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSC--------- 308
           V+ FD+ V     C+D N SCERWA LGEC  N EYMVGSPE PGYCR+SC         
Sbjct: 561 VDSFDKDVSAGGTCTDMNESCERWAALGECTKNKEYMVGSPELPGYCRRSCKIRTSHSIH 620

Query: 309 ----KSTEASVSKNGGGIDAETAG--------SGTLDSQYLQLLQLQDESNPTFVSEVAT 368
               K  E + +   G +  +              LDSQ+LQLLQLQDESNP FV EV +
Sbjct: 621 PILFKRREEAKNMEVGQMQRQWVDYTKSLFMEKRYLDSQFLQLLQLQDESNPDFVVEVVS 680

Query: 369 LFFEDTEELLNKLRVALLQPSVDFKKIDDHVHQLKGSSSSIGALRVKNACIDFRSACEQQ 428
           LFF+DTE+LLN L  AL Q  VDFK++D HVHQLKGSSSSIGA RVKN C+ FR+ CE+Q
Sbjct: 681 LFFDDTEKLLNDLTAALEQQCVDFKRVDAHVHQLKGSSSSIGAERVKNGCVAFRNLCEEQ 740

Query: 429 SPEWCSRCLQQVEQAFYGVKDKLSYLYALEQRILNAGGSIPV 450
           + + C RCLQQV+Q +Y VK+KL  L+ LEQ+I+ AGGSIP+
Sbjct: 741 NTDACLRCLQQVKQEYYLVKNKLENLFRLEQQIVAAGGSIPM 782
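
The percentages in each HSP header follow directly from the residue counts (identical or positive positions divided by alignment length). The sketch below is a minimal illustration, not part of any BLAST toolkit: `parse_hsp` and its regular expressions are hypothetical helpers written for the two-line header layout used in this report, and they assume the lines look exactly like the ones shown here.

```python
import re

# Hypothetical patterns for the two HSP summary lines used in this report.
SCORE_RE = re.compile(
    r"Score:\s*(?P<bits>[\d.]+)\s*bits\s*\((?P<raw>\d+)\),\s*"
    r"Expect\s*=\s*(?P<evalue>[\d.eE+-]+)"
)
# "Posi?tives" accepts both "Positives" and the "Postives" spelling
# that some report generators emit.
IDENT_RE = re.compile(
    r"Identity\s*=\s*(?P<ident>\d+)/(?P<ident_len>\d+).*?"
    r"Posi?tives\s*=\s*(?P<pos>\d+)/(?P<pos_len>\d+)"
)


def parse_hsp(score_line, identity_line):
    """Return the HSP statistics, recomputing percentages from the counts."""
    s = SCORE_RE.search(score_line)
    i = IDENT_RE.search(identity_line)
    if s is None or i is None:
        raise ValueError("line does not match the assumed HSP header layout")
    ident, ident_len = int(i["ident"]), int(i["ident_len"])
    pos, pos_len = int(i["pos"]), int(i["pos_len"])
    return {
        "bit_score": float(s["bits"]),
        "raw_score": int(s["raw"]),
        "evalue": float(s["evalue"]),
        "align_length": ident_len,
        # Percent identity = identical residues / alignment length.
        "pct_identity": round(100.0 * ident / ident_len, 2),
        # Percent positives = identical + conservative substitutions / length.
        "pct_positives": round(100.0 * pos / pos_len, 2),
    }


hsp = parse_hsp(
    "HSP 1 Score: 599.0 bits (1543), Expect = 3.5e-167",
    "Identity = 304/462 (65.80%), Postives = 353/462 (76.41%), Query Frame = 0",
)
print(hsp["pct_identity"], hsp["pct_positives"])
```

Recomputing the percentages from the counts, rather than trusting the printed values, gives a cheap consistency check when many HSP headers are collected from reports like this one.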

BLAST of CmaCh14G016540 vs. NCBI nr
Match: XP_023526540.1 (probable prolyl 4-hydroxylase 4 [Cucurbita pepo subsp. pepo])

HSP 1 Score: 597.0 bits (1538), Expect = 1.3e-166
Identity = 292/300 (97.33%), Positives = 296/300 (98.67%), Query Frame = 0

Query: 1   MEKFCSFNPLFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL 60
           MEKFCSFN LFFFSLSISLLL  ASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL
Sbjct: 1   MEKFCSFNLLFFFSLSISLLLRLASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL 60

Query: 61  ECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFL 120
           ECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFI+KSKDPIVSGIEDKI+AWTFL
Sbjct: 61  ECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIKKSKDPIVSGIEDKIAAWTFL 120

Query: 121 PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSA 180
           PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVK+GGETVFPSA
Sbjct: 121 PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKKGGETVFPSA 180

Query: 181 EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK 240
           EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK
Sbjct: 181 EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK 240

Query: 241 WSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCK 300
           WSATKWIRVNPFDQIVGDY+NCSDENASCERWAELGEC DNPEYMVGSPEFPGYCRKSCK
Sbjct: 241 WSATKWIRVNPFDQIVGDYLNCSDENASCERWAELGECIDNPEYMVGSPEFPGYCRKSCK 300

BLAST of CmaCh14G016540 vs. NCBI nr
Match: XP_022955844.1 (probable prolyl 4-hydroxylase 4 [Cucurbita moschata])

HSP 1 Score: 595.9 bits (1535), Expect = 2.9e-166
Identity = 293/300 (97.67%), Positives = 295/300 (98.33%), Query Frame = 0

Query: 1   MEKFCSFNPLFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL 60
           MEKFCSFN LFFFSLSISLLL RASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL
Sbjct: 1   MEKFCSFNLLFFFSLSISLLLRRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDL 60

Query: 61  ECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFL 120
           E DHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFI+KSKDPIVSGIEDKI+AWTFL
Sbjct: 61  ESDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIKKSKDPIVSGIEDKIAAWTFL 120

Query: 121 PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSA 180
           PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSA
Sbjct: 121 PKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSA 180

Query: 181 EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK 240
           EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK
Sbjct: 181 EESQRRQASETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEK 240

Query: 241 WSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCK 300
           WSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGEC  NPEYMVGSPEFPGYCRKSCK
Sbjct: 241 WSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECMGNPEYMVGSPEFPGYCRKSCK 300

BLAST of CmaCh14G016540 vs. TAIR 10
Match: AT5G18900.1 (2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein )

HSP 1 Score: 448.7 bits (1153), Expect = 5.4e-126
Identity = 218/292 (74.66%), Positives = 245/292 (83.90%), Query Frame = 0

Query: 10  LFFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDLECDHLISIA 69
           L  F    S+LL ++S+S   S+S  VNP+KVKQ+S  PRAFVYEGFLT+LECDH++S+A
Sbjct: 7   LISFFAIFSVLL-QSSTSLISSSSVFVNPSKVKQVSSKPRAFVYEGFLTELECDHMVSLA 66

Query: 70  KAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFLPKENGEDIQ 129
           KA LKRSAVADN SGESK SE+RTSSG FI K KDPIVSGIEDKIS WTFLPKENGEDIQ
Sbjct: 67  KASLKRSAVADNDSGESKFSEVRTSSGTFISKGKDPIVSGIEDKISTWTFLPKENGEDIQ 126

Query: 130 VLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSAEESQRRQAS 189
           VLRYE+GQKYDAHFDYF D+VNI RGGHRMAT+LMYLS+V +GGETVFP AE   RR  S
Sbjct: 127 VLRYEHGQKYDAHFDYFHDKVNIVRGGHRMATILMYLSNVTKGGETVFPDAEIPSRRVLS 186

Query: 190 ETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEKWSATKWIRV 249
           E  EDLSDCAK+GIAVKPRKGDALLFF+LHP+A+PD  SLHGGCPVIEGEKWSATKWI V
Sbjct: 187 ENKEDLSDCAKRGIAVKPRKGDALLFFNLHPDAIPDPLSLHGGCPVIEGEKWSATKWIHV 246

Query: 250 NPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCKS 302
           + FD+IV    NC+D N SCERWA LGEC  NPEYMVG+ E PGYCR+SCK+
Sbjct: 247 DSFDRIVTPSGNCTDMNESCERWAVLGECTKNPEYMVGTTELPGYCRRSCKA 297

BLAST of CmaCh14G016540 vs. TAIR 10
Match: AT3G06300.1 (P4H isoform 2 )

HSP 1 Score: 438.7 bits (1127), Expect = 5.6e-123
Identity = 208/287 (72.47%), Positives = 245/287 (85.37%), Query Frame = 0

Query: 15  LSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDLECDHLISIAKAELK 74
           ++I L+L ++S+    S SSI+NP+KVKQ+S  PRAFVYEGFLTDLECDHLIS+AK  L+
Sbjct: 12  VAILLVLLQSSTCLISSPSSIINPSKVKQVSSKPRAFVYEGFLTDLECDHLISLAKENLQ 71

Query: 75  RSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFLPKENGEDIQVLRYE 134
           RSAVADN +GES+VS++RTSSG FI K KDPIVSGIEDK+S WTFLPKENGED+QVLRYE
Sbjct: 72  RSAVADNDNGESQVSDVRTSSGTFISKGKDPIVSGIEDKLSTWTFLPKENGEDLQVLRYE 131

Query: 135 YGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSAEESQRRQASETNED 194
           +GQKYDAHFDYF D+VNIARGGHR+ATVL+YLS+V +GGETVFP A+E  RR  SE  +D
Sbjct: 132 HGQKYDAHFDYFHDKVNIARGGHRIATVLLYLSNVTKGGETVFPDAQEFSRRSLSENKDD 191

Query: 195 LSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEKWSATKWIRVNPFDQ 254
           LSDCAKKGIAVKP+KG+ALLFF+L  +A+PD  SLHGGCPVIEGEKWSATKWI V+ FD+
Sbjct: 192 LSDCAKKGIAVKPKKGNALLFFNLQQDAIPDPFSLHGGCPVIEGEKWSATKWIHVDSFDK 251

Query: 255 IVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCKS 302
           I+    NC+D N SCERWA LGEC  NPEYMVG+PE PG CR+SCK+
Sbjct: 252 ILTHDGNCTDVNESCERWAVLGECGKNPEYMVGTPEIPGNCRRSCKA 298

BLAST of CmaCh14G016540 vs. TAIR 10
Match: AT3G28480.1 (Oxoglutarate/iron-dependent oxygenase )

HSP 1 Score: 355.1 bits (910), Expect = 8.1e-98
Identity = 175/312 (56.09%), Positives = 227/312 (72.76%), Query Frame = 0

Query: 4   FCSFNPLFFFSLSI-----SLLLGRASSSYAG-------SASSI-VNPAKVKQISWNPRA 63
           F +F+  F F+L +     +  L R+S++  G       SASS   +P +V Q+SW PR 
Sbjct: 6   FLAFSLCFLFTLPLISSAPNRFLTRSSNTRDGSVIKMKTSASSFGFDPTRVTQLSWTPRV 65

Query: 64  FVYEGFLTDLECDHLISIAKAELKRSAVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGI 123
           F+YEGFL+D ECDH I +AK +L++S VADN SGES  SE+RTSSG F+ K +D IVS +
Sbjct: 66  FLYEGFLSDEECDHFIKLAKGKLEKSMVADNDSGESVESEVRTSSGMFLSKRQDDIVSNV 125

Query: 124 EDKISAWTFLPKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVK 183
           E K++AWTFLP+ENGE +Q+L YE GQKY+ HFDYF D+ N+  GGHR+ATVLMYLS+V+
Sbjct: 126 EAKLAAWTFLPEENGESMQILHYENGQKYEPHFDYFHDQANLELGGHRIATVLMYLSNVE 185

Query: 184 RGGETVFPSAEESQRRQASETNED-LSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSL 243
           +GGETVFP      + +A++  +D  ++CAK+G AVKPRKGDALLFF+LHPNA  D++SL
Sbjct: 186 KGGETVFP----MWKGKATQLKDDSWTECAKQGYAVKPRKGDALLFFNLHPNATTDSNSL 245

Query: 244 HGGCPVIEGEKWSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSP 302
           HG CPV+EGEKWSAT+WI V  F++       C DEN SCE+WA+ GEC  NP YMVGS 
Sbjct: 246 HGSCPVVEGEKWSATRWIHVKSFERAFNKQSGCMDENVSCEKWAKAGECQKNPTYMVGSD 305

BLAST of CmaCh14G016540 vs. TAIR 10
Match: AT3G28480.2 (Oxoglutarate/iron-dependent oxygenase )

HSP 1 Score: 333.2 bits (853), Expect = 3.3e-91
Identity = 171/320 (53.44%), Positives = 221/320 (69.06%), Query Frame = 0

Query: 4   FCSFNPLFFFSLSI-----SLLLGRASSSYAG-------SASSI-VNPAKVKQISWNPRA 63
           F +F+  F F+L +     +  L R+S++  G       SASS   +P +V Q+SW PR 
Sbjct: 6   FLAFSLCFLFTLPLISSAPNRFLTRSSNTRDGSVIKMKTSASSFGFDPTRVTQLSWTPRV 65

Query: 64  FVYEGFLTDLECDHLISIAKAELKRSAVADNLSGES-----KVSEIRTSSGAFIQKSK-- 123
           F+YEGFL+D ECDH I +AK +L++S VADN SGES      VS +R SS          
Sbjct: 66  FLYEGFLSDEECDHFIKLAKGKLEKSMVADNDSGESVESEDSVSVVRQSSSFIANMDSLE 125

Query: 124 -DPIVSGIEDKISAWTFLPKENGEDIQVLRYEYGQKYDAHFDYFADEVNIARGGHRMATV 183
            D IVS +E K++AWTFLP+ENGE +Q+L YE GQKY+ HFDYF D+ N+  GGHR+ATV
Sbjct: 126 IDDIVSNVEAKLAAWTFLPEENGESMQILHYENGQKYEPHFDYFHDQANLELGGHRIATV 185

Query: 184 LMYLSDVKRGGETVFPSAEESQRRQASETNED-LSDCAKKGIAVKPRKGDALLFFSLHPN 243
           LMYLS+V++GGETVFP      + +A++  +D  ++CAK+G AVKPRKGDALLFF+LHPN
Sbjct: 186 LMYLSNVEKGGETVFP----MWKGKATQLKDDSWTECAKQGYAVKPRKGDALLFFNLHPN 245

Query: 244 AVPDTSSLHGGCPVIEGEKWSATKWIRVNPFDQIVGDYMNCSDENASCERWAELGECNDN 302
           A  D++SLHG CPV+EGEKWSAT+WI V  F++       C DEN SCE+WA+ GEC  N
Sbjct: 246 ATTDSNSLHGSCPVVEGEKWSATRWIHVKSFERAFNKQSGCMDENVSCEKWAKAGECQKN 305

BLAST of CmaCh14G016540 vs. TAIR 10
Match: AT3G28490.1 (Oxoglutarate/iron-dependent oxygenase )

HSP 1 Score: 327.0 bits (837), Expect = 2.4e-89
Identity = 164/292 (56.16%), Positives = 211/292 (72.26%), Query Frame = 0

Query: 11  FFFSLSISLLLGRASSSYAGSASSIVNPAKVKQISWNPRAFVYEGFLTDLECDHLISIAK 70
           +F + S+SLLL     S   S S  V+P ++ Q+SW PRAF+Y+GFL+D ECDHLI +AK
Sbjct: 5   YFLAFSLSLLL---IFSQISSFSFSVDPTRITQLSWTPRAFLYKGFLSDEECDHLIKLAK 64

Query: 71  AELKRS-AVADNLSGESKVSEIRTSSGAFIQKSKDPIVSGIEDKISAWTFLPKENGEDIQ 130
            +L++S  VAD  SGES+ SE+RTSSG F+ K +D IV+ +E K++AWTFLP+ENGE +Q
Sbjct: 65  GKLEKSMVVADVDSGESEDSEVRTSSGMFLTKRQDDIVANVEAKLAAWTFLPEENGEALQ 124

Query: 131 VLRYEYGQKYDAHFDYFADEVNIARGGHRMATVLMYLSDVKRGGETVFPSAEESQRRQAS 190
           +L YE GQKYD HFDYF D+  +  GGHR+ATVLMYLS+V +GGETVFP+    + +   
Sbjct: 125 ILHYENGQKYDPHFDYFYDKKALELGGHRIATVLMYLSNVTKGGETVFPN---WKGKTPQ 184

Query: 191 ETNEDLSDCAKKGIAVKPRKGDALLFFSLHPNAVPDTSSLHGGCPVIEGEKWSATKWIRV 250
             ++  S CAK+G AVKPRKGDALLFF+LH N   D +SLHG CPVIEGEKWSAT+WI V
Sbjct: 185 LKDDSWSKCAKQGYAVKPRKGDALLFFNLHLNGTTDPNSLHGSCPVIEGEKWSATRWIHV 244

Query: 251 NPFDQIVGDYMNCSDENASCERWAELGECNDNPEYMVGSPEFPGYCRKSCKS 302
             F +     + C D++ SC+ WA+ GEC  NP YMVGS    G+CRKSCK+
Sbjct: 245 RSFGK---KKLVCVDDHESCQEWADAGECEKNPMYMVGSETSLGFCRKSCKA 287

The following BLAST results are available for this feature:
Match Name	E-value	Identity	Description
Q8LAN3	7.6e-125	74.66	Probable prolyl 4-hydroxylase 4 OS=Arabidopsis thaliana OX=3702 GN=P4H4 PE=2 SV=...
F4JAU3	7.9e-122	72.47	Prolyl 4-hydroxylase 2 OS=Arabidopsis thaliana OX=3702 GN=P4H2 PE=1 SV=1
Q8L970	1.1e-96	56.09	Probable prolyl 4-hydroxylase 7 OS=Arabidopsis thaliana OX=3702 GN=P4H7 PE=2 SV=...
F4J0A8	3.3e-88	56.16	Probable prolyl 4-hydroxylase 6 OS=Arabidopsis thaliana OX=3702 GN=P4H6 PE=2 SV=...
Q9LN20	1.2e-64	54.55	Probable prolyl 4-hydroxylase 3 OS=Arabidopsis thaliana OX=3702 GN=P4H3 PE=2 SV=...

Match Name	E-value	Identity	Description
A0A6J1J084	2.3e-172	100.00	Procollagen-proline 4-dioxygenase OS=Cucurbita maxima OX=3661 GN=LOC111480082 PE...
A0A7J6HHC7	1.7e-167	65.80	Uncharacterized protein OS=Cannabis sativa OX=3483 GN=G4B88_007776 PE=4 SV=1
A0A6J1GXF3	1.4e-166	97.67	Procollagen-proline 4-dioxygenase OS=Cucurbita moschata OX=3662 GN=LOC111457709 ...
A0A7J7GMG3	1.9e-163	62.08	Uncharacterized protein OS=Camellia sinensis OX=4442 GN=HYC85_019329 PE=4 SV=1
A0A1Q3B5T1	7.3e-163	63.08	Procollagen-proline 4-dioxygenase OS=Cephalotus follicularis OX=3775 GN=CFOL_v3_...

Match Name	E-value	Identity	Description
XP_022980799.1	4.7e-172	100.00	probable prolyl 4-hydroxylase 4 [Cucurbita maxima]
KAF9678089.1	7.0e-168	68.41	hypothetical protein SADUNF_Sadunf08G0175600 [Salix dunnii]
KAF4393790.1	3.5e-167	65.80	hypothetical protein G4B88_007776 [Cannabis sativa]
XP_023526540.1	1.3e-166	97.33	probable prolyl 4-hydroxylase 4 [Cucurbita pepo subsp. pepo]
XP_022955844.1	2.9e-166	97.67	probable prolyl 4-hydroxylase 4 [Cucurbita moschata]

Match Name	E-value	Identity	Description
AT5G18900.1	5.4e-126	74.66	2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein
AT3G06300.1	5.6e-123	72.47	P4H isoform 2
AT3G28480.1	8.1e-98	56.09	Oxoglutarate/iron-dependent oxygenase
AT3G28480.2	3.3e-91	53.44	Oxoglutarate/iron-dependent oxygenase
AT3G28490.1	2.4e-89	56.16	Oxoglutarate/iron-dependent oxygenase

InterPro
Analysis Name: InterPro Annotations of Cucurbita maxima (Rimu) v1.1
Date Performed: 2021-10-25
IPR Term	IPR Description	Source	Source Term	Source Description	Alignment
IPR006620	Prolyl 4-hydroxylase, alpha subunit	SMART	SM00702	p4hc	coord: 48..248; e-value: 2.4E-60; score: 216.5
IPR044862	Prolyl 4-hydroxylase alpha subunit, Fe(2+) 2OG dioxygenase domain	PFAM	PF13640	2OG-FeII_Oxy_3	coord: 128..248; e-value: 3.0E-20; score: 72.9
IPR036641	HPT domain superfamily	GENE3D	1.20.120.160	HPT domain	coord: 304..452; e-value: 5.4E-48; score: 164.8
IPR036641	HPT domain superfamily	SUPERFAMILY	47226	Histidine-containing phosphotransfer domain, HPT domain	coord: 322..437
None	No IPR available	GENE3D	2.60.120.620	q2cbj1_9rhob like domain	coord: 40..249; e-value: 7.6E-76; score: 256.4
None	No IPR available	PANTHER	PTHR10869:SF175	PROLYL 4-HYDROXYLASE SUBUNIT ALPHA-LIKE PROTEIN	coord: 21..301
IPR008207	Signal transduction histidine kinase, phosphotransfer (Hpt) domain	PFAM	PF01627	Hpt	coord: 345..420; e-value: 2.5E-10; score: 40.5
IPR008207	Signal transduction histidine kinase, phosphotransfer (Hpt) domain	PROSITE	PS50894	HPT	coord: 338..447; score: 14.14445
IPR008207	Signal transduction histidine kinase, phosphotransfer (Hpt) domain	CDD	cd00088	HPT	coord: 342..421; e-value: 9.67347E-15; score: 67.7918
IPR045054	Prolyl 4-hydroxylase	PANTHER	PTHR10869	PROLYL 4-HYDROXYLASE ALPHA SUBUNIT	coord: 21..301
IPR003582	ShKT domain	PROSITE	PS51670	SHKT	coord: 262..302; score: 7.705144
IPR005123	Oxoglutarate/iron-dependent dioxygenase	PROSITE	PS51471	FE2OG_OXY	coord: 124..249; score: 12.2256
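
The `coord:` values above are inclusive residue ranges on the predicted protein, so they can be compared directly to see which predicted domains overlap. The following is a minimal sketch under that assumption: `Hit`, `parse_coord`, and `overlap` are hypothetical names introduced here for illustration, and the four example ranges are copied from the InterPro annotations listed above.

```python
import re
from typing import NamedTuple


class Hit(NamedTuple):
    """One annotated domain; start and end are inclusive residue positions."""
    source: str
    name: str
    start: int
    end: int


COORD_RE = re.compile(r"coord:\s*(\d+)\.\.(\d+)")


def parse_coord(text):
    """Extract (start, end) from a string such as 'coord: 48..248'."""
    m = COORD_RE.search(text)
    if m is None:
        raise ValueError(f"no coord field in {text!r}")
    return int(m.group(1)), int(m.group(2))


def overlap(a, b):
    """Number of residues shared by two inclusive ranges (0 if disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


# Ranges copied from the InterPro annotations of this gene.
hits = [
    Hit("SMART", "p4hc", *parse_coord("coord: 48..248")),
    Hit("PFAM", "2OG-FeII_Oxy_3", *parse_coord("coord: 128..248")),
    Hit("PROSITE", "SHKT", *parse_coord("coord: 262..302")),
    Hit("PFAM", "Hpt", *parse_coord("coord: 345..420")),
]

# Report every pair of annotations and how many residues they share.
for i, a in enumerate(hits):
    for b in hits[i + 1:]:
        print(f"{a.name} vs {b.name}: {overlap(a, b)} shared residues")
```

With these ranges, the Pfam 2OG-FeII_Oxy_3 hit falls entirely inside the SMART p4hc hit, while the ShKT and Hpt hits lie further toward the C-terminus and share no residues with it.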

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name	Unique Name	Type
CmaCh14G016540.1	CmaCh14G016540.1	mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category	Term Accession	Term Name
biological_process	GO:0018401	peptidyl-proline hydroxylation to 4-hydroxy-L-proline
biological_process	GO:0000160	phosphorelay signal transduction system
biological_process	GO:0016310	phosphorylation
cellular_component	GO:0005789	endoplasmic reticulum membrane
cellular_component	GO:0016021	integral component of membrane
molecular_function	GO:0005506	iron ion binding
molecular_function	GO:0016301	kinase activity
molecular_function	GO:0031418	L-ascorbic acid binding
molecular_function	GO:0000166	nucleotide binding
molecular_function	GO:0004656	procollagen-proline 4-dioxygenase activity
molecular_function	GO:0016705	oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen