CSPI05G28680 (gene) Cucumber (PI 183967) v1

Overview
Name: CSPI05G28680
Type: gene
Organism: Cucumis sativus var. hardwickii cv. PI 183967 (Cucumber (PI 183967) v1)
Description: Procollagen-proline 4-dioxygenase
Location: Chr5: 27018887 .. 27022765 (-)
RNA-Seq Expression: CSPI05G28680
Synteny: CSPI05G28680
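The location field above can be read programmatically. The following is a minimal sketch, not part of the database record: the helper names `parse_location` and `span_length` are hypothetical, and it assumes the coordinates are 1-based and inclusive (the GFF convention), which the page does not state.

```python
import re

def parse_location(text):
    """Split a location such as 'Chr5: 27018887 .. 27022765 (-)' into its parts."""
    m = re.fullmatch(r"\s*(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*", text)
    if m is None:
        raise ValueError(f"unrecognised location: {text!r}")
    seqid, start, end, strand = m.groups()
    return seqid, int(start), int(end), strand

def span_length(start, end):
    """Length of the span, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1

seqid, start, end, strand = parse_location("Chr5: 27018887 .. 27022765 (-)")
print(seqid, strand, span_length(start, end))
```

Under that assumption the gene spans 3879 bp on the minus strand of Chr5.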
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

CAAATTTATTATAACCGCCCACTGATTCTCATTTTTCATTAATTGTCTTCCTAGTTGAATTTTCATAATTAATTCAATTCAAATCCTTTTATTTTTCCTTTTTCCTTCCTCGTAAACGAAGAACCCTTCAATCGTTGAATTCTTTTTCTTGTTCATTTCTCCGATTTGACATCGGAGAAACAATCATGGATTCTCGACCATTCCTCGCATTTTCTCTCTGCTTTCTCTCCGTCTTCACCGCCTTCGCTCGCTTGCCGGAAACGCGTACCCACAAGCAATCGTACGATCATTTTCTTCTTTCTCTTCTTCTTCTTTTTTTTTTTTTTTTGTAATTTCACGAATTCACTGATGCACGGTTTTGGATTCGACTGTTTTTGGAATTTTAGAAGTGGATCTGTGCTTCGATTGAAGACGGATTCATCTCCGCTCATTTTCGATCCAACACGAGTCACTCAGCTCTCCTGGCAACCCAGGTAATCTGTCTTTTTCACTTAGCGATCGAGTAAGATGTTCCTTTGCTCTCTCTGGCTTCATTGTGTGGAAAGATTGTCCTGCTTGAGATAGTCAAACTACATTTACATTAATTCTGTTTCTAATGTGTGCTTATATTCTTCTAAAAACTGCAAATCTACAGGGCATTTTTGTATAAGGGATTTTTATCTGATGCGGAATGTGATCACCTAATTGATCTGGTAATTGATTATGGAACGGTCTGTTTGTGTTGATTTAGATTTCAATGTTGTGGTATTTGTATATGTTTGTTTAATTCTGTTATTTTCTTGAATAGGCTAAGGATAAATTAGAGAAGTCAATGGTAGCAGATAATGATTCTGGTAAAAGTGTAAGTAGTGAAGTCCGAACGAGTTCTGGCATGTTCCTTCGGAAGGCCCAGGTGCGTCAGTTTATTTGAATCATCCCATCCATGATTTTAATGTTTGGCTTTTTGAACTTATTTTTATGTTGTGTCATGGTATGTTCTTGTAGAATTAGGTGAGAGGGATAATGTAATTGTAGATGTCATTGTGTATGAGGACACGGACATGCTATCCATTATCGTAGCGTTCGTAAACATTCTGTGGTAAAGAATTGAAAGGAAAACTGTGTGAACTTATTTAAATTATTTGACTACTTGAATTTTCTAAAATGTGTTTGTGCAGTAACGTATGTAAGATTACCTGTTTTTGATAGAAAGCGATGTCTGAAAATATATGAACAAACGTGTTTGCCTGATGGAGTCTAGTCAATGCCTCTATGTAGTTTGACCTTATGCATGTGTTTGAGGGGCTGTTTTGGAGGAGAATGTTAGGAACCAGAGGGTCTTTGTTTCACATCTGTTTGAATGGGATGACCAATGTGGTACACAAGTGTGTTGGTTCTCTAGTCTCACTTCAATGGCTGGTTTTTGGGGTGTGGTTATTCAAGGTGCGGAGTTATCAACATCCCTGTTGTCAACAATATCCCTTATCCCTGCCCTCTCTCCTCTTATCATCATTCATACTGTTGCCATTGGATGTTCAATTAAATGTCATCGTTGGGAACATCCTTGACTTCCTCTTCCTTTGAAATGAGTGTTTAATCTTGATGTTTAGATCCTCTCCCCCTCATTTTTCTTCTTTAAATATGATCCTGTCGGGGGATAGATCATTGACAGAACCTACAGATCTCCATTTAAAGTGTTGAAGGTTGAATATCCATTGTTATTCTTCTGTGTTTGAGGACTTGGTTTTTTGCATTTTGCTTGTCTTTAGAGAACACGATTTTCTTTGTGAATAAAACATAAGTAGAAGATATATTGAAGTTTCAAGTCTAAACAAATGGTGAAATGACCAGAAGCAATAGTAATTGTTATTTAAATATACAAGTTTTGCTATGGGTTTAATGCCTATAAACTGCAAGTTAGCTAATTACAGTCCATTACTAAATTACTGATTTTACCACAAGTCTAAACTAAAGAAACATCAATGACAACAATTAAAGATTAAAATGGTTCTGTTAGTGAT
AATATTGACGATGGTATTGAGAATGGTATCTAGATTATTTATTTAAGAGAGTGGCTATTGATAGACTTGCTTTGCATATCAATTTTTTAATATTTCAATCATGGCTTATGCCTCAAAGCACCTTGAAGCTTAACTATAATGTATATGATTTATTTTGTGCTTTGAAGGATGAAGTTGTTGCTGGCGTTGAAGCCAGGATAGCTGCATGGACACTCCTTCCAGCAGGTAGATTTGTTGTATGCTCATTCTCATGGATGCACCTTTCTTTTCTTTTTTAAATGTAGGATTATGAAGCTATTCCAATCACTTACGTTTCTTTCTTGGAACTATTGTTAAAATCAGAAAATGGCGAATCCATTCAAATTCTTCACTATGAGAATGGTCAAAAGTATGAACCACATTTTGATTTTTTTCACGACAAGGTGAATCAGGAGTTAGGTGGCCACCGCATAGCCACAGTCTTGATGTATTTATCCAATGTTGAAAAGGGTGGAGAAACCATCTTTCCTAATTCAGAGGTATGGTATGGCAGTGGTTCTGCTACTTCAGTACCTTTGTTTTTTTAAAAAAAACAGACGTATATTTTAATTTTGTTGCCAACTGTGATGACTTTGTCTCTGCAGTTTAAAGAATCTCAAGCAAAAGATGAGAGCTGGTCTGATTGTTCTCGAAAGGGTTATGCAGGTAGGTTTGTTATTGACTCGTACAGACCATAAGTCATCAATGTATTAAGTTACGGTTTTCAAATTAGAACAGGCCACAGATGGCCTATCTATCATTGATAGATCTTCCCTTGAAACTTATCTTTCTTGAACTTGACCTTCCTTTCTGGTCTTCTAGAAGAGTACTCATCTTTTGGCTAATATCCTTTATGGAGAGCCAACTTGGTTTGAACCTTTATTGTCTGATTTTCCCCTTTCCAAGAGTTGCTCACCTGTATCATCGATTCGCCATTGTAGTCTGGCTTAGAAAAAGTTTGGTTGTCCATTTAGTCTGAAATGTTAGCTGAGACTATTGCTTGTGATTTTACATTTTCCAACTTACATTGTTTAAAAAGGTACACATCTAGACGCCATTACAAAGGCACTGCATACTCAGTAGAACTGAATTATATTGCTTGGATGTTTACGCCTGTCTCTAAACATATTAAATTGAAATACATTTCTTGACTCTTTCATTATTAGGCTCTCTTCTATTTTTGTTGTCCTGATCGTTTCATGTAGCTTCTAGGATTTGATATTGTACCGTCATACACCCTTATGGTTCTACAACTTGGGTGTTCATTGCAGTTAAAGCGCAGAAGGGCGATGCATTGTTGTTCTTCAGCCTAAATCTCGACGCAACAACAGATGAAAGAAGTTTGCACGGTAGTTGCCCTGTAATTGCAGGCGAGAAATGGTCTGCAACCAAATGGATTCATGTGAGATCCTTTGAGAAGATAACTTCTCGTGTTAGTAGACAGGGTTGCGTGGACGAGAACGAAAATTGCCTGGCATGGGCAAAAAAGGGAGAGTGCAAAAAGAACCCTACTTACATGGTGGGTTCTGGAGGTGCTTTAGGATACTGTAGGAAGAGCTGCAAAGCATGCTAAAACCCTAGGAGGAGGAAGAAGAAGAAGTAATCCCCACATCTCTCTTTCTTTTTTTCTGTTTTGCTGAGCTTGTGTGTCGATTTTGTAATGGCTATGTATATAACATTGGGCAGCAACTTGGTATACTATATAATATTACAAGTGGATATTAATTACAGCTTTCATTAAACCTTGTTTTAGCAATTAACCACAAAAGAGTTATCATTTGATAATTGAATATGCAATGAGAAGTTTTCTCATGTATGATCCTTATTGGCTGCTTGACTTTTATATTCAACTTTACAAACC

mRNA sequence

CAAATTTATTATAACCGCCCACTGATTCTCATTTTTCATTAATTGTCTTCCTAGTTGAATTTTCATAATTAATTCAATTCAAATCCTTTTATTTTTCCTTTTTCCTTCCTCGTAAACGAAGAACCCTTCAATCGTTGAATTCTTTTTCTTGTTCATTTCTCCGATTTGACATCGGAGAAACAATCATGGATTCTCGACCATTCCTCGCATTTTCTCTCTGCTTTCTCTCCGTCTTCACCGCCTTCGCTCGCTTGCCGGAAACGCGTACCCACAAGCAATCAAGTGGATCTGTGCTTCGATTGAAGACGGATTCATCTCCGCTCATTTTCGATCCAACACGAGTCACTCAGCTCTCCTGGCAACCCAGGGCATTTTTGTATAAGGGATTTTTATCTGATGCGGAATGTGATCACCTAATTGATCTGGCTAAGGATAAATTAGAGAAGTCAATGGTAGCAGATAATGATTCTGGTAAAAGTGTAAGTAGTGAAGTCCGAACGAGTTCTGGCATGTTCCTTCGGAAGGCCCAGGATGAAGTTGTTGCTGGCGTTGAAGCCAGGATAGCTGCATGGACACTCCTTCCAGCAGAAAATGGCGAATCCATTCAAATTCTTCACTATGAGAATGGTCAAAAGTATGAACCACATTTTGATTTTTTTCACGACAAGGTGAATCAGGAGTTAGGTGGCCACCGCATAGCCACAGTCTTGATGTATTTATCCAATGTTGAAAAGGGTGGAGAAACCATCTTTCCTAATTCAGAGTTTAAAGAATCTCAAGCAAAAGATGAGAGCTGGTCTGATTGTTCTCGAAAGGGTTATGCAGTTAAAGCGCAGAAGGGCGATGCATTGTTGTTCTTCAGCCTAAATCTCGACGCAACAACAGATGAAAGAAGTTTGCACGGTAGTTGCCCTGTAATTGCAGGCGAGAAATGGTCTGCAACCAAATGGATTCATGTGAGATCCTTTGAGAAGATAACTTCTCGTGTTAGTAGACAGGGTTGCGTGGACGAGAACGAAAATTGCCTGGCATGGGCAAAAAAGGGAGAGTGCAAAAAGAACCCTACTTACATGGTGGGTTCTGGAGGTGCTTTAGGATACTGTAGGAAGAGCTGCAAAGCATGCTAAAACCCTAGGAGGAGGAAGAAGAAGAAGTAATCCCCACATCTCTCTTTCTTTTTTTCTGTTTTGCTGAGCTTGTGTGTCGATTTTGTAATGGCTATGTATATAACATTGGGCAGCAACTTGGTATACTATATAATATTACAAGTGGATATTAATTACAGCTTTCATTAAACCTTGTTTTAGCAATTAACCACAAAAGAGTTATCATTTGATAATTGAATATGCAATGAGAAGTTTTCTCATGTATGATCCTTATTGGCTGCTTGACTTTTATATTCAACTTTACAAACC

Coding sequence (CDS)

ATGGATTCTCGACCATTCCTCGCATTTTCTCTCTGCTTTCTCTCCGTCTTCACCGCCTTCGCTCGCTTGCCGGAAACGCGTACCCACAAGCAATCAAGTGGATCTGTGCTTCGATTGAAGACGGATTCATCTCCGCTCATTTTCGATCCAACACGAGTCACTCAGCTCTCCTGGCAACCCAGGGCATTTTTGTATAAGGGATTTTTATCTGATGCGGAATGTGATCACCTAATTGATCTGGCTAAGGATAAATTAGAGAAGTCAATGGTAGCAGATAATGATTCTGGTAAAAGTGTAAGTAGTGAAGTCCGAACGAGTTCTGGCATGTTCCTTCGGAAGGCCCAGGATGAAGTTGTTGCTGGCGTTGAAGCCAGGATAGCTGCATGGACACTCCTTCCAGCAGAAAATGGCGAATCCATTCAAATTCTTCACTATGAGAATGGTCAAAAGTATGAACCACATTTTGATTTTTTTCACGACAAGGTGAATCAGGAGTTAGGTGGCCACCGCATAGCCACAGTCTTGATGTATTTATCCAATGTTGAAAAGGGTGGAGAAACCATCTTTCCTAATTCAGAGTTTAAAGAATCTCAAGCAAAAGATGAGAGCTGGTCTGATTGTTCTCGAAAGGGTTATGCAGTTAAAGCGCAGAAGGGCGATGCATTGTTGTTCTTCAGCCTAAATCTCGACGCAACAACAGATGAAAGAAGTTTGCACGGTAGTTGCCCTGTAATTGCAGGCGAGAAATGGTCTGCAACCAAATGGATTCATGTGAGATCCTTTGAGAAGATAACTTCTCGTGTTAGTAGACAGGGTTGCGTGGACGAGAACGAAAATTGCCTGGCATGGGCAAAAAAGGGAGAGTGCAAAAAGAACCCTACTTACATGGTGGGTTCTGGAGGTGCTTTAGGATACTGTAGGAAGAGCTGCAAAGCATGCTAA

Protein sequence

MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQPRAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHGSCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSGGALGYCRKSCKAC*
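The protein sequence above corresponds to a codon-by-codon translation of the CDS, with the trailing `*` marking the stop codon. The sketch below illustrates that relationship on the first and last few codons of the CDS. It assumes the standard nuclear genetic code, and the function name `translate` is hypothetical rather than anything provided by the database.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
# ASSUMPTION: the standard (nuclear) code applies to this gene.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(cds):
    """Translate a coding sequence in frame 0; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

# First six and last four codons of the CDS shown above.
print(translate("ATGGATTCTCGACCATTC"))  # MDSRPF
print(translate("AAAGCATGCTAA"))        # KAC*
```

Passing the full CDS string to `translate` should give the complete protein in the same way; only the two short fragments were checked here.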
Homology
BLAST of CSPI05G28680 vs. ExPASy Swiss-Prot
Match: Q8L970 (Probable prolyl 4-hydroxylase 7 OS=Arabidopsis thaliana OX=3702 GN=P4H7 PE=2 SV=1)

HSP 1 Score: 456.8 bits (1174), Expect = 1.9e-127
Identity = 219/316 (69.30%), Positives = 257/316 (81.33%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPE---TRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLS 60
           MDSR FLAFSLCFL      +  P    TR+     GSV+++KT +S   FDPTRVTQLS
Sbjct: 1   MDSRIFLAFSLCFLFTLPLISSAPNRFLTRSSNTRDGSVIKMKTSASSFGFDPTRVTQLS 60

Query: 61  WQPRAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDE 120
           W PR FLY+GFLSD ECDH I LAK KLEKSMVADNDSG+SV SEVRTSSGMFL K QD+
Sbjct: 61  WTPRVFLYEGFLSDEECDHFIKLAKGKLEKSMVADNDSGESVESEVRTSSGMFLSKRQDD 120

Query: 121 VVAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMY 180
           +V+ VEA++AAWT LP ENGES+QILHYENGQKYEPHFD+FHD+ N ELGGHRIATVLMY
Sbjct: 121 IVSNVEAKLAAWTFLPEENGESMQILHYENGQKYEPHFDYFHDQANLELGGHRIATVLMY 180

Query: 181 LSNVEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERS 240
           LSNVEKGGET+FP  + K +Q KD+SW++C+++GYAVK +KGDALLFF+L+ +ATTD  S
Sbjct: 181 LSNVEKGGETVFPMWKGKATQLKDDSWTECAKQGYAVKPRKGDALLFFNLHPNATTDSNS 240

Query: 241 LHGSCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMV 300
           LHGSCPV+ GEKWSAT+WIHV+SFE+  ++ S  GC+DEN +C  WAK GEC+KNPTYMV
Sbjct: 241 LHGSCPVVEGEKWSATRWIHVKSFERAFNKQS--GCMDENVSCEKWAKAGECQKNPTYMV 300

Query: 301 GSGGALGYCRKSCKAC 314
           GS    GYCRKSCKAC
Sbjct: 301 GSDKDHGYCRKSCKAC 314
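The percentages in each HSP summary are the match count divided by the alignment length, which includes gap columns. A small sketch of that arithmetic, using the counts reported for the hit above (the helper name `percent` is hypothetical):

```python
def percent(matches, alignment_length):
    """Return matches / alignment_length as a percentage with two decimals."""
    if alignment_length <= 0:
        raise ValueError("alignment length must be positive")
    return f"{100 * matches / alignment_length:.2f}"

# Counts taken from the HSP summary of the Q8L970 hit.
print(percent(219, 316))  # identical residues: 69.30
print(percent(257, 316))  # identical plus conservative substitutions: 81.33
```

The alignment length (316) exceeds the query length (313) here because three gap columns were introduced in the query.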

BLAST of CSPI05G28680 vs. ExPASy Swiss-Prot
Match: F4J0A8 (Probable prolyl 4-hydroxylase 6 OS=Arabidopsis thaliana OX=3702 GN=P4H6 PE=2 SV=1)

HSP 1 Score: 408.7 bits (1049), Expect = 6.0e-113
Identity = 205/314 (65.29%), Positives = 238/314 (75.80%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP 60
           MDS+ FLAFSL  L +F+                     +  S     DPTR+TQLSW P
Sbjct: 1   MDSQYFLAFSLSLLLIFS---------------------QISSFSFSVDPTRITQLSWTP 60

Query: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSM-VADNDSGKSVSSEVRTSSGMFLRKAQDEVV 120
           RAFLYKGFLSD ECDHLI LAK KLEKSM VAD DSG+S  SEVRTSSGMFL K QD++V
Sbjct: 61  RAFLYKGFLSDEECDHLIKLAKGKLEKSMVVADVDSGESEDSEVRTSSGMFLTKRQDDIV 120

Query: 121 AGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLS 180
           A VEA++AAWT LP ENGE++QILHYENGQKY+PHFD+F+DK   ELGGHRIATVLMYLS
Sbjct: 121 ANVEAKLAAWTFLPEENGEALQILHYENGQKYDPHFDYFYDKKALELGGHRIATVLMYLS 180

Query: 181 NVEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLH 240
           NV KGGET+FPN + K  Q KD+SWS C+++GYAVK +KGDALLFF+L+L+ TTD  SLH
Sbjct: 181 NVTKGGETVFPNWKGKTPQLKDDSWSKCAKQGYAVKPRKGDALLFFNLHLNGTTDPNSLH 240

Query: 241 GSCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGS 300
           GSCPVI GEKWSAT+WIHVRSF K      +  CVD++E+C  WA  GEC+KNP YMVGS
Sbjct: 241 GSCPVIEGEKWSATRWIHVRSFGK-----KKLVCVDDHESCQEWADAGECEKNPMYMVGS 288

Query: 301 GGALGYCRKSCKAC 314
             +LG+CRKSCKAC
Sbjct: 301 ETSLGFCRKSCKAC 288

BLAST of CSPI05G28680 vs. ExPASy Swiss-Prot
Match: Q8LAN3 (Probable prolyl 4-hydroxylase 4 OS=Arabidopsis thaliana OX=3702 GN=P4H4 PE=2 SV=1)

HSP 1 Score: 343.2 bits (879), Expect = 3.1e-93
Identity = 166/286 (58.04%), Positives = 212/286 (74.13%), Query Frame = 0

Query: 31  QSSGSVLRLKTDSSPLIFDPTRVTQLSWQPRAFLYKGFLSDAECDHLIDLAKDKLEKSMV 90
           QSS S++     SS +  +P++V Q+S +PRAF+Y+GFL++ ECDH++ LAK  L++S V
Sbjct: 19  QSSTSLI----SSSSVFVNPSKVKQVSSKPRAFVYEGFLTELECDHMVSLAKASLKRSAV 78

Query: 91  ADNDSGKSVSSEVRTSSGMFLRKAQDEVVAGVEARIAAWTLLPAENGESIQILHYENGQK 150
           ADNDSG+S  SEVRTSSG F+ K +D +V+G+E +I+ WT LP ENGE IQ+L YE+GQK
Sbjct: 79  ADNDSGESKFSEVRTSSGTFISKGKDPIVSGIEDKISTWTFLPKENGEDIQVLRYEHGQK 138

Query: 151 YEPHFDFFHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSEFKESQA---KDESWSDC 210
           Y+ HFD+FHDKVN   GGHR+AT+LMYLSNV KGGET+FP++E    +      E  SDC
Sbjct: 139 YDAHFDYFHDKVNIVRGGHRMATILMYLSNVTKGGETVFPDAEIPSRRVLSENKEDLSDC 198

Query: 211 SRKGYAVKAQKGDALLFFSLNLDATTDERSLHGSCPVIAGEKWSATKWIHVRSFEKITSR 270
           +++G AVK +KGDALLFF+L+ DA  D  SLHG CPVI GEKWSATKWIHV SF++I + 
Sbjct: 199 AKRGIAVKPRKGDALLFFNLHPDAIPDPLSLHGGCPVIEGEKWSATKWIHVDSFDRIVT- 258

Query: 271 VSRQGCVDENENCLAWAKKGECKKNPTYMVGSGGALGYCRKSCKAC 314
                C D NE+C  WA  GEC KNP YMVG+    GYCR+SCKAC
Sbjct: 259 -PSGNCTDMNESCERWAVLGECTKNPEYMVGTTELPGYCRRSCKAC 298

BLAST of CSPI05G28680 vs. ExPASy Swiss-Prot
Match: F4JAU3 (Prolyl 4-hydroxylase 2 OS=Arabidopsis thaliana OX=3702 GN=P4H2 PE=1 SV=1)

HSP 1 Score: 337.8 bits (865), Expect = 1.3e-91
Identity = 167/276 (60.51%), Positives = 210/276 (76.09%), Query Frame = 0

Query: 43  SSP-LIFDPTRVTQLSWQPRAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSS 102
           SSP  I +P++V Q+S +PRAF+Y+GFL+D ECDHLI LAK+ L++S VADND+G+S  S
Sbjct: 27  SSPSSIINPSKVKQVSSKPRAFVYEGFLTDLECDHLISLAKENLQRSAVADNDNGESQVS 86

Query: 103 EVRTSSGMFLRKAQDEVVAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDK 162
           +VRTSSG F+ K +D +V+G+E +++ WT LP ENGE +Q+L YE+GQKY+ HFD+FHDK
Sbjct: 87  DVRTSSGTFISKGKDPIVSGIEDKLSTWTFLPKENGEDLQVLRYEHGQKYDAHFDYFHDK 146

Query: 163 VNQELGGHRIATVLMYLSNVEKGGETIFPNS-EFKE---SQAKDESWSDCSRKGYAVKAQ 222
           VN   GGHRIATVL+YLSNV KGGET+FP++ EF     S+ KD+  SDC++KG AVK +
Sbjct: 147 VNIARGGHRIATVLLYLSNVTKGGETVFPDAQEFSRRSLSENKDD-LSDCAKKGIAVKPK 206

Query: 223 KGDALLFFSLNLDATTDERSLHGSCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDEN 282
           KG+ALLFF+L  DA  D  SLHG CPVI GEKWSATKWIHV SF+KI +      C D N
Sbjct: 207 KGNALLFFNLQQDAIPDPFSLHGGCPVIEGEKWSATKWIHVDSFDKILTHDG--NCTDVN 266

Query: 283 ENCLAWAKKGECKKNPTYMVGSGGALGYCRKSCKAC 314
           E+C  WA  GEC KNP YMVG+    G CR+SCKAC
Sbjct: 267 ESCERWAVLGECGKNPEYMVGTPEIPGNCRRSCKAC 299

BLAST of CSPI05G28680 vs. ExPASy Swiss-Prot
Match: Q9LN20 (Probable prolyl 4-hydroxylase 3 OS=Arabidopsis thaliana OX=3702 GN=P4H3 PE=2 SV=1)

HSP 1 Score: 244.2 bits (622), Expect = 2.0e-63
Identity = 114/208 (54.81%), Positives = 154/208 (74.04%), Query Frame = 0

Query: 56  LSWQPRAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQ 115
           LSW+PRAF+Y  FLS  EC++LI LAK  + KS V D+++GKS  S VRTSSG FLR+ +
Sbjct: 79  LSWEPRAFVYHNFLSKEECEYLISLAKPHMVKSTVVDSETGKSKDSRVRTSSGTFLRRGR 138

Query: 116 DEVVAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVL 175
           D+++  +E RIA +T +PA++GE +Q+LHYE GQKYEPH+D+F D+ N + GG R+AT+L
Sbjct: 139 DKIIKTIEKRIADYTFIPADHGEGLQVLHYEAGQKYEPHYDYFVDEFNTKNGGQRMATML 198

Query: 176 MYLSNVEKGGETIFPNSEFKESQAK-DESWSDCSRKGYAVKAQKGDALLFFSLNLDATTD 235
           MYLS+VE+GGET+FP +    S        S+C +KG +VK + GDALLF+S+  DAT D
Sbjct: 199 MYLSDVEEGGETVFPAANMNFSSVPWYNELSECGKKGLSVKPRMGDALLFWSMRPDATLD 258

Query: 236 ERSLHGSCPVIAGEKWSATKWIHVRSFE 263
             SLHG CPVI G KWS+TKW+HV  ++
Sbjct: 259 PTSLHGGCPVIRGNKWSSTKWMHVGEYK 286

BLAST of CSPI05G28680 vs. ExPASy TrEMBL
Match: A0A0A0KS38 (Procollagen-proline 4-dioxygenase OS=Cucumis sativus OX=3659 GN=Csa_5G633280 PE=3 SV=1)

HSP 1 Score: 641.0 bits (1652), Expect = 2.7e-180
Identity = 313/313 (100.00%), Positives = 313/313 (100.00%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP 60
           MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP
Sbjct: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP 60

Query: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA 120
           RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA
Sbjct: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA 120

Query: 121 GVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN 180
           GVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN
Sbjct: 121 GVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN 180

Query: 181 VEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHG 240
           VEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHG
Sbjct: 181 VEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHG 240

Query: 241 SCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSG 300
           SCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSG
Sbjct: 241 SCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSG 300

Query: 301 GALGYCRKSCKAC 314
           GALGYCRKSCKAC
Sbjct: 301 GALGYCRKSCKAC 313

BLAST of CSPI05G28680 vs. ExPASy TrEMBL
Match: A0A1S3C8G4 (Procollagen-proline 4-dioxygenase OS=Cucumis melo OX=3656 GN=LOC103498028 PE=3 SV=1)

HSP 1 Score: 599.0 bits (1543), Expect = 1.2e-167
Identity = 295/317 (93.06%), Positives = 305/317 (96.21%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPETR----THKQSSGSVLRLKTDSSPLIFDPTRVTQL 60
           MDSRPFLAFSLCFLSVFTAFARLPETR    ++KQS+GSVLRLKTDSSPLIFDPTRVTQL
Sbjct: 1   MDSRPFLAFSLCFLSVFTAFARLPETRMLKHSYKQSTGSVLRLKTDSSPLIFDPTRVTQL 60

Query: 61  SWQPRAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQD 120
           SWQPRAFLYKGFLSD ECDHLIDLAKDKLEKSMVADN+SGKSVSSEVRTSSGMFLRKAQD
Sbjct: 61  SWQPRAFLYKGFLSDEECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQD 120

Query: 121 EVVAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLM 180
           ++VAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLM
Sbjct: 121 KIVAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLM 180

Query: 181 YLSNVEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDER 240
           YLSNVEKGGETIFPNSEFKESQ KD+SWSDCSRKGYAVKAQKGDALLFFSL+LDATTDER
Sbjct: 181 YLSNVEKGGETIFPNSEFKESQEKDDSWSDCSRKGYAVKAQKGDALLFFSLHLDATTDER 240

Query: 241 SLHGSCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYM 300
           SLHGSCPVI GEKWSATKWIHVRSFEK+  RVSRQ CVDENENC AWAK+GECKKNPTYM
Sbjct: 241 SLHGSCPVIEGEKWSATKWIHVRSFEKL-PRVSRQDCVDENENCPAWAKRGECKKNPTYM 300

Query: 301 VGSGGALGYCRKSCKAC 314
           VGS GALGYCRKSCKAC
Sbjct: 301 VGSEGALGYCRKSCKAC 316

BLAST of CSPI05G28680 vs. ExPASy TrEMBL
Match: A0A5D3CTS4 (Procollagen-proline 4-dioxygenase OS=Cucumis melo var. makuwa OX=1194695 GN=E5676_scaffold892G00740 PE=3 SV=1)

HSP 1 Score: 572.0 bits (1473), Expect = 1.5e-159
Identity = 295/376 (78.46%), Positives = 305/376 (81.12%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPETR----THKQSSGSVLRLKTDSSPLIFDPTRVTQL 60
           MDSRPFLAFSLCFLSVFTAFARLPETR    ++KQS+GSVLRLKTDSSPLIFDPTRVTQL
Sbjct: 1   MDSRPFLAFSLCFLSVFTAFARLPETRMLKHSYKQSTGSVLRLKTDSSPLIFDPTRVTQL 60

Query: 61  SWQPRAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQD 120
           SWQPRAFLYKGFLSD ECDHLIDLAKDKLEKSMVADN+SGKSVSSEVRTSSGMFLRKAQD
Sbjct: 61  SWQPRAFLYKGFLSDEECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQD 120

Query: 121 EVVAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLM 180
           ++VAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLM
Sbjct: 121 KIVAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLM 180

Query: 181 YLSNVEKGGETIFPNSEFKESQAKDESWSDCSRKGY------------------------ 240
           YLSNVEKGGETIFPNSEFKESQ KD+SWSDCSRKGY                        
Sbjct: 181 YLSNVEKGGETIFPNSEFKESQEKDDSWSDCSRKGYAGSTHRLANILYGEPTWFEPLLSD 240

Query: 241 -----------------------------------AVKAQKGDALLFFSLNLDATTDERS 300
                                              AVKAQKGDALLFFSL+LDATTDERS
Sbjct: 241 FPRVARLYYRFAIMLLGFGIVPSYTPYGSTTWLFIAVKAQKGDALLFFSLHLDATTDERS 300

Query: 301 LHGSCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMV 314
           LHGSCPVI GEKWSATKWIHVRSFEK+  RVSRQ CVDENENC AWAK+GECKKNPTYMV
Sbjct: 301 LHGSCPVIEGEKWSATKWIHVRSFEKL-PRVSRQDCVDENENCPAWAKRGECKKNPTYMV 360

BLAST of CSPI05G28680 vs. ExPASy TrEMBL
Match: A0A6J1BXN9 (Procollagen-proline 4-dioxygenase OS=Momordica charantia OX=3673 GN=LOC111006412 PE=3 SV=1)

HSP 1 Score: 560.5 bits (1443), Expect = 4.6e-156
Identity = 271/313 (86.58%), Positives = 287/313 (91.69%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP 60
           MDS  FL+FSLCFL VFTA ARLP+ R HK+ SGSVLRLK + SPLIFDPTRVTQLSWQP
Sbjct: 1   MDSPRFLSFSLCFLFVFTALARLPDMRAHKKISGSVLRLKGEPSPLIFDPTRVTQLSWQP 60

Query: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA 120
           RAFLYKGFLSD ECDHLIDLAKDKLEKSMVADN+SGKSVSSEVRTSSGMFL KAQDE+VA
Sbjct: 61  RAFLYKGFLSDKECDHLIDLAKDKLEKSMVADNNSGKSVSSEVRTSSGMFLHKAQDEIVA 120

Query: 121 GVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN 180
            VEARIAAWT LPAENGESIQILHYENGQKYEPHFD+FHDKVNQELGGHR+ATVLMYLSN
Sbjct: 121 AVEARIAAWTFLPAENGESIQILHYENGQKYEPHFDYFHDKVNQELGGHRVATVLMYLSN 180

Query: 181 VEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHG 240
           VEKGGETIFPNSEFKESQ KD+SWSDC+RKGYAVKA+KGDALLFFSL+LDATTD +SLHG
Sbjct: 181 VEKGGETIFPNSEFKESQEKDDSWSDCARKGYAVKAKKGDALLFFSLHLDATTDVKSLHG 240

Query: 241 SCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSG 300
           SCPVI GEKWSATKWIHVRSFEK T    R  CVDENENC +WAK+GECKKNPTYMVGS 
Sbjct: 241 SCPVIEGEKWSATKWIHVRSFEKPTRPSRRLDCVDENENCASWAKRGECKKNPTYMVGSE 300

Query: 301 GALGYCRKSCKAC 314
            ALGYCRKSC+AC
Sbjct: 301 SALGYCRKSCQAC 313

BLAST of CSPI05G28680 vs. ExPASy TrEMBL
Match: A0A6J1FJ93 (Procollagen-proline 4-dioxygenase OS=Cucurbita moschata OX=3662 GN=LOC111444767 PE=3 SV=1)

HSP 1 Score: 551.2 bits (1419), Expect = 2.8e-153
Identity = 272/313 (86.90%), Positives = 289/313 (92.33%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP 60
           MDSR FLAFSL FLSV T FARLPE  THK+ SGSVL LK DS  LIFDPTRVTQLSWQP
Sbjct: 1   MDSRRFLAFSLFFLSVSTGFARLPE--THKKLSGSVLELKRDSPRLIFDPTRVTQLSWQP 60

Query: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA 120
           RAFLYKGFL+D ECDHLIDLAKDKLEKSMVADN+SGKSVSSEVRTSSGMFLRKAQDE+VA
Sbjct: 61  RAFLYKGFLTDQECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120

Query: 121 GVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN 180
           G+EARI+AWT LP ENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN
Sbjct: 121 GIEARISAWTFLPVENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN 180

Query: 181 VEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHG 240
           VEKGGETIFPNS F ESQ KD+SWSDC+RKGYAVKAQKGDALLFFSL+LDATTD+RSLHG
Sbjct: 181 VEKGGETIFPNSAF-ESQEKDDSWSDCARKGYAVKAQKGDALLFFSLHLDATTDKRSLHG 240

Query: 241 SCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSG 300
           SCPVI GEKWSATKWIHVRSF+K T R+S Q CVDEN+NC +WAK+GEC+KNPTYMVGS 
Sbjct: 241 SCPVIEGEKWSATKWIHVRSFDKAT-RISSQDCVDENKNCPSWAKRGECQKNPTYMVGSE 300

Query: 301 GALGYCRKSCKAC 314
           GA+GYCRKSCKAC
Sbjct: 301 GAVGYCRKSCKAC 309

BLAST of CSPI05G28680 vs. NCBI nr
Match: XP_011655982.1 (probable prolyl 4-hydroxylase 7 isoform X1 [Cucumis sativus])

HSP 1 Score: 641.0 bits (1652), Expect = 5.5e-180
Identity = 313/313 (100.00%), Positives = 313/313 (100.00%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP 60
           MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP
Sbjct: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP 60

Query: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA 120
           RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA
Sbjct: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA 120

Query: 121 GVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN 180
           GVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN
Sbjct: 121 GVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN 180

Query: 181 VEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHG 240
           VEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHG
Sbjct: 181 VEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHG 240

Query: 241 SCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSG 300
           SCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSG
Sbjct: 241 SCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSG 300

Query: 301 GALGYCRKSCKAC 314
           GALGYCRKSCKAC
Sbjct: 301 GALGYCRKSCKAC 313

BLAST of CSPI05G28680 vs. NCBI nr
Match: XP_031742194.1 (probable prolyl 4-hydroxylase 7 isoform X2 [Cucumis sativus])

HSP 1 Score: 634.8 bits (1636), Expect = 3.9e-178
Identity = 312/313 (99.68%), Positives = 312/313 (99.68%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP 60
           MDSRPFLAFSLCFLSVFTAFARLPETRTHKQ SGSVLRLKTDSSPLIFDPTRVTQLSWQP
Sbjct: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQ-SGSVLRLKTDSSPLIFDPTRVTQLSWQP 60

Query: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA 120
           RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA
Sbjct: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA 120

Query: 121 GVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN 180
           GVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN
Sbjct: 121 GVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN 180

Query: 181 VEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHG 240
           VEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHG
Sbjct: 181 VEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHG 240

Query: 241 SCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSG 300
           SCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSG
Sbjct: 241 SCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSG 300

Query: 301 GALGYCRKSCKAC 314
           GALGYCRKSCKAC
Sbjct: 301 GALGYCRKSCKAC 312

BLAST of CSPI05G28680 vs. NCBI nr
Match: KAE8648909.1 (hypothetical protein Csa_008411 [Cucumis sativus])

HSP 1 Score: 617.5 bits (1591), Expect = 6.5e-173
Identity = 313/352 (88.92%), Positives = 313/352 (88.92%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP 60
           MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP
Sbjct: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP 60

Query: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA 120
           RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA
Sbjct: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA 120

Query: 121 GVEARIAAWTLLPA-----------------------ENGESIQILHYENGQKYEPHFDF 180
           GVEARIAAWTLLPA                       ENGESIQILHYENGQKYEPHFDF
Sbjct: 121 GVEARIAAWTLLPAGRFVDYEAIPITYVSFLELLLKSENGESIQILHYENGQKYEPHFDF 180

Query: 181 FHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSE----------------FKESQAKD 240
           FHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSE                FKESQAKD
Sbjct: 181 FHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSETYILILLPTVMTLSLQFKESQAKD 240

Query: 241 ESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHGSCPVIAGEKWSATKWIHVRSF 300
           ESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHGSCPVIAGEKWSATKWIHVRSF
Sbjct: 241 ESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHGSCPVIAGEKWSATKWIHVRSF 300

Query: 301 EKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSGGALGYCRKSCKAC 314
           EKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSGGALGYCRKSCKAC
Sbjct: 301 EKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSGGALGYCRKSCKAC 352

BLAST of CSPI05G28680 vs. NCBI nr
Match: XP_008458700.1 (PREDICTED: probable prolyl 4-hydroxylase 7 [Cucumis melo])

HSP 1 Score: 599.0 bits (1543), Expect = 2.4e-167
Identity = 295/317 (93.06%), Positives = 305/317 (96.21%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPETR----THKQSSGSVLRLKTDSSPLIFDPTRVTQL 60
           MDSRPFLAFSLCFLSVFTAFARLPETR    ++KQS+GSVLRLKTDSSPLIFDPTRVTQL
Sbjct: 1   MDSRPFLAFSLCFLSVFTAFARLPETRMLKHSYKQSTGSVLRLKTDSSPLIFDPTRVTQL 60

Query: 61  SWQPRAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQD 120
           SWQPRAFLYKGFLSD ECDHLIDLAKDKLEKSMVADN+SGKSVSSEVRTSSGMFLRKAQD
Sbjct: 61  SWQPRAFLYKGFLSDEECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQD 120

Query: 121 EVVAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLM 180
           ++VAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLM
Sbjct: 121 KIVAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLM 180

Query: 181 YLSNVEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDER 240
           YLSNVEKGGETIFPNSEFKESQ KD+SWSDCSRKGYAVKAQKGDALLFFSL+LDATTDER
Sbjct: 181 YLSNVEKGGETIFPNSEFKESQEKDDSWSDCSRKGYAVKAQKGDALLFFSLHLDATTDER 240

Query: 241 SLHGSCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYM 300
           SLHGSCPVI GEKWSATKWIHVRSFEK+  RVSRQ CVDENENC AWAK+GECKKNPTYM
Sbjct: 241 SLHGSCPVIEGEKWSATKWIHVRSFEKL-PRVSRQDCVDENENCPAWAKRGECKKNPTYM 300

Query: 301 VGSGGALGYCRKSCKAC 314
           VGS GALGYCRKSCKAC
Sbjct: 301 VGSEGALGYCRKSCKAC 316

BLAST of CSPI05G28680 vs. NCBI nr
Match: XP_038889686.1 (probable prolyl 4-hydroxylase 7 isoform X1 [Benincasa hispida])

HSP 1 Score: 577.0 bits (1486), Expect = 9.8e-161
Identity = 280/313 (89.46%), Positives = 294/313 (93.93%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP 60
           MDSR FLAF LCFLSVFT FARLPE R+ K+SSGSV+RLKTDSSPL+FDPTRVTQLSW+P
Sbjct: 1   MDSRRFLAFCLCFLSVFTGFARLPELRSQKKSSGSVIRLKTDSSPLVFDPTRVTQLSWEP 60

Query: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA 120
           RAFLYKGFLSD ECDHLIDLAKDKLEKSMVADN+SGKSVSSEVRTSSGMFLRKAQDE+VA
Sbjct: 61  RAFLYKGFLSDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120

Query: 121 GVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN 180
            +EARI+AWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN
Sbjct: 121 AIEARISAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSN 180

Query: 181 VEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHG 240
           VEKGGETIFPNSEFKESQ KDESWSDC+RKGYAVKA+KGDALLFFSL  DATTD +SLHG
Sbjct: 181 VEKGGETIFPNSEFKESQEKDESWSDCARKGYAVKARKGDALLFFSLRPDATTDVKSLHG 240

Query: 241 SCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSG 300
           SCPVI GEKWSATKWIHVRSFEK T RVSRQ CVDENENC  WAK+GECKKNPTYMVGS 
Sbjct: 241 SCPVIEGEKWSATKWIHVRSFEKAT-RVSRQDCVDENENCQIWAKRGECKKNPTYMVGSE 300

Query: 301 GALGYCRKSCKAC 314
            ALGYCRKSC+AC
Sbjct: 301 DALGYCRKSCRAC 312

BLAST of CSPI05G28680 vs. TAIR 10
Match: AT3G28480.1 (Oxoglutarate/iron-dependent oxygenase)

HSP 1 Score: 456.8 bits (1174), Expect = 1.4e-128
Identity = 219/316 (69.30%), Positives = 257/316 (81.33%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPE---TRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLS 60
           MDSR FLAFSLCFL      +  P    TR+     GSV+++KT +S   FDPTRVTQLS
Sbjct: 1   MDSRIFLAFSLCFLFTLPLISSAPNRFLTRSSNTRDGSVIKMKTSASSFGFDPTRVTQLS 60

Query: 61  WQPRAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDE 120
           W PR FLY+GFLSD ECDH I LAK KLEKSMVADNDSG+SV SEVRTSSGMFL K QD+
Sbjct: 61  WTPRVFLYEGFLSDEECDHFIKLAKGKLEKSMVADNDSGESVESEVRTSSGMFLSKRQDD 120

Query: 121 VVAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMY 180
           +V+ VEA++AAWT LP ENGES+QILHYENGQKYEPHFD+FHD+ N ELGGHRIATVLMY
Sbjct: 121 IVSNVEAKLAAWTFLPEENGESMQILHYENGQKYEPHFDYFHDQANLELGGHRIATVLMY 180

Query: 181 LSNVEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERS 240
           LSNVEKGGET+FP  + K +Q KD+SW++C+++GYAVK +KGDALLFF+L+ +ATTD  S
Sbjct: 181 LSNVEKGGETVFPMWKGKATQLKDDSWTECAKQGYAVKPRKGDALLFFNLHPNATTDSNS 240

Query: 241 LHGSCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMV 300
           LHGSCPV+ GEKWSAT+WIHV+SFE+  ++ S  GC+DEN +C  WAK GEC+KNPTYMV
Sbjct: 241 LHGSCPVVEGEKWSATRWIHVKSFERAFNKQS--GCMDENVSCEKWAKAGECQKNPTYMV 300

Query: 301 GSGGALGYCRKSCKAC 314
           GS    GYCRKSCKAC
Sbjct: 301 GSDKDHGYCRKSCKAC 314

BLAST of CSPI05G28680 vs. TAIR 10
Match: AT3G28480.2 (Oxoglutarate/iron-dependent oxygenase)

HSP 1 Score: 430.3 bits (1105), Expect = 1.4e-120
Identity = 212/324 (65.43%), Positives = 250/324 (77.16%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPE---TRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLS 60
           MDSR FLAFSLCFL      +  P    TR+     GSV+++KT +S   FDPTRVTQLS
Sbjct: 1   MDSRIFLAFSLCFLFTLPLISSAPNRFLTRSSNTRDGSVIKMKTSASSFGFDPTRVTQLS 60

Query: 61  WQPRAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSE-----VRTSSGMFLR 120
           W PR FLY+GFLSD ECDH I LAK KLEKSMVADNDSG+SV SE     VR SS     
Sbjct: 61  WTPRVFLYEGFLSDEECDHFIKLAKGKLEKSMVADNDSGESVESEDSVSVVRQSSSFIAN 120

Query: 121 KAQ---DEVVAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGH 180
                 D++V+ VEA++AAWT LP ENGES+QILHYENGQKYEPHFD+FHD+ N ELGGH
Sbjct: 121 MDSLEIDDIVSNVEAKLAAWTFLPEENGESMQILHYENGQKYEPHFDYFHDQANLELGGH 180

Query: 181 RIATVLMYLSNVEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNL 240
           RIATVLMYLSNVEKGGET+FP  + K +Q KD+SW++C+++GYAVK +KGDALLFF+L+ 
Sbjct: 181 RIATVLMYLSNVEKGGETVFPMWKGKATQLKDDSWTECAKQGYAVKPRKGDALLFFNLHP 240

Query: 241 DATTDERSLHGSCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGEC 300
           +ATTD  SLHGSCPV+ GEKWSAT+WIHV+SFE+  ++ S  GC+DEN +C  WAK GEC
Sbjct: 241 NATTDSNSLHGSCPVVEGEKWSATRWIHVKSFERAFNKQS--GCMDENVSCEKWAKAGEC 300

Query: 301 KKNPTYMVGSGGALGYCRKSCKAC 314
           +KNPTYMVGS    GYCRKSCKAC
Sbjct: 301 QKNPTYMVGSDKDHGYCRKSCKAC 322

BLAST of CSPI05G28680 vs. TAIR 10
Match: AT3G28490.1 (Oxoglutarate/iron-dependent oxygenase)

HSP 1 Score: 408.7 bits (1049), Expect = 4.3e-114
Identity = 205/314 (65.29%), Positives = 238/314 (75.80%), Query Frame = 0

Query: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP 60
           MDS+ FLAFSL  L +F+                     +  S     DPTR+TQLSW P
Sbjct: 1   MDSQYFLAFSLSLLLIFS---------------------QISSFSFSVDPTRITQLSWTP 60

Query: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSM-VADNDSGKSVSSEVRTSSGMFLRKAQDEVV 120
           RAFLYKGFLSD ECDHLI LAK KLEKSM VAD DSG+S  SEVRTSSGMFL K QD++V
Sbjct: 61  RAFLYKGFLSDEECDHLIKLAKGKLEKSMVVADVDSGESEDSEVRTSSGMFLTKRQDDIV 120

Query: 121 AGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLS 180
           A VEA++AAWT LP ENGE++QILHYENGQKY+PHFD+F+DK   ELGGHRIATVLMYLS
Sbjct: 121 ANVEAKLAAWTFLPEENGEALQILHYENGQKYDPHFDYFYDKKALELGGHRIATVLMYLS 180

Query: 181 NVEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLH 240
           NV KGGET+FPN + K  Q KD+SWS C+++GYAVK +KGDALLFF+L+L+ TTD  SLH
Sbjct: 181 NVTKGGETVFPNWKGKTPQLKDDSWSKCAKQGYAVKPRKGDALLFFNLHLNGTTDPNSLH 240

Query: 241 GSCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGS 300
           GSCPVI GEKWSAT+WIHVRSF K      +  CVD++E+C  WA  GEC+KNP YMVGS
Sbjct: 241 GSCPVIEGEKWSATRWIHVRSFGK-----KKLVCVDDHESCQEWADAGECEKNPMYMVGS 288

Query: 301 GGALGYCRKSCKAC 314
             +LG+CRKSCKAC
Sbjct: 301 ETSLGFCRKSCKAC 288

BLAST of CSPI05G28680 vs. TAIR 10
Match: AT5G18900.1 (2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein)

HSP 1 Score: 343.2 bits (879), Expect = 2.2e-94
Identity = 166/286 (58.04%), Positives = 212/286 (74.13%), Query Frame = 0

Query: 31  QSSGSVLRLKTDSSPLIFDPTRVTQLSWQPRAFLYKGFLSDAECDHLIDLAKDKLEKSMV 90
           QSS S++     SS +  +P++V Q+S +PRAF+Y+GFL++ ECDH++ LAK  L++S V
Sbjct: 19  QSSTSLI----SSSSVFVNPSKVKQVSSKPRAFVYEGFLTELECDHMVSLAKASLKRSAV 78

Query: 91  ADNDSGKSVSSEVRTSSGMFLRKAQDEVVAGVEARIAAWTLLPAENGESIQILHYENGQK 150
           ADNDSG+S  SEVRTSSG F+ K +D +V+G+E +I+ WT LP ENGE IQ+L YE+GQK
Sbjct: 79  ADNDSGESKFSEVRTSSGTFISKGKDPIVSGIEDKISTWTFLPKENGEDIQVLRYEHGQK 138

Query: 151 YEPHFDFFHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSEFKESQA---KDESWSDC 210
           Y+ HFD+FHDKVN   GGHR+AT+LMYLSNV KGGET+FP++E    +      E  SDC
Sbjct: 139 YDAHFDYFHDKVNIVRGGHRMATILMYLSNVTKGGETVFPDAEIPSRRVLSENKEDLSDC 198

Query: 211 SRKGYAVKAQKGDALLFFSLNLDATTDERSLHGSCPVIAGEKWSATKWIHVRSFEKITSR 270
           +++G AVK +KGDALLFF+L+ DA  D  SLHG CPVI GEKWSATKWIHV SF++I + 
Sbjct: 199 AKRGIAVKPRKGDALLFFNLHPDAIPDPLSLHGGCPVIEGEKWSATKWIHVDSFDRIVT- 258

Query: 271 VSRQGCVDENENCLAWAKKGECKKNPTYMVGSGGALGYCRKSCKAC 314
                C D NE+C  WA  GEC KNP YMVG+    GYCR+SCKAC
Sbjct: 259 -PSGNCTDMNESCERWAVLGECTKNPEYMVGTTELPGYCRRSCKAC 298

BLAST of CSPI05G28680 vs. TAIR 10
Match: AT3G06300.1 (P4H isoform 2)

HSP 1 Score: 337.8 bits (865), Expect = 9.3e-93
Identity = 167/276 (60.51%), Positives = 210/276 (76.09%), Query Frame = 0

Query: 43  SSP-LIFDPTRVTQLSWQPRAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSS 102
           SSP  I +P++V Q+S +PRAF+Y+GFL+D ECDHLI LAK+ L++S VADND+G+S  S
Sbjct: 27  SSPSSIINPSKVKQVSSKPRAFVYEGFLTDLECDHLISLAKENLQRSAVADNDNGESQVS 86

Query: 103 EVRTSSGMFLRKAQDEVVAGVEARIAAWTLLPAENGESIQILHYENGQKYEPHFDFFHDK 162
           +VRTSSG F+ K +D +V+G+E +++ WT LP ENGE +Q+L YE+GQKY+ HFD+FHDK
Sbjct: 87  DVRTSSGTFISKGKDPIVSGIEDKLSTWTFLPKENGEDLQVLRYEHGQKYDAHFDYFHDK 146

Query: 163 VNQELGGHRIATVLMYLSNVEKGGETIFPNS-EFKE---SQAKDESWSDCSRKGYAVKAQ 222
           VN   GGHRIATVL+YLSNV KGGET+FP++ EF     S+ KD+  SDC++KG AVK +
Sbjct: 147 VNIARGGHRIATVLLYLSNVTKGGETVFPDAQEFSRRSLSENKDD-LSDCAKKGIAVKPK 206

Query: 223 KGDALLFFSLNLDATTDERSLHGSCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDEN 282
           KG+ALLFF+L  DA  D  SLHG CPVI GEKWSATKWIHV SF+KI +      C D N
Sbjct: 207 KGNALLFFNLQQDAIPDPFSLHGGCPVIEGEKWSATKWIHVDSFDKILTHDG--NCTDVN 266

Query: 283 ENCLAWAKKGECKKNPTYMVGSGGALGYCRKSCKAC 314
           E+C  WA  GEC KNP YMVG+    G CR+SCKAC
Sbjct: 267 ESCERWAVLGECGKNPEYMVGTPEIPGNCRRSCKAC 299
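The percentages on each HSP summary line are the match counts divided by the alignment length. A minimal Python sketch for reading those lines and rechecking the arithmetic follows. The function names `parse_hsp_stats` and `percent` are hypothetical helpers introduced here, not part of any BLAST tool, and the pattern assumes the summary-line layout used on this page (it accepts both "Positives" and the "Postives" spelling some exports use).

```python
import re

# Pattern for the HSP summary line as laid out on this page (an assumption
# about the format, not an official BLAST grammar). "Posi?tives" accepts
# both "Positives" and the misspelled "Postives".
HSP_STATS = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"Posi?tives = (\d+)/(\d+) \(([\d.]+)%\)"
)


def parse_hsp_stats(line):
    """Return (count, alignment_length, reported_percent) for identity and positives."""
    m = HSP_STATS.search(line)
    if m is None:
        raise ValueError("not an HSP summary line: %r" % line)
    ident, ident_len, ident_pct, pos, pos_len, pos_pct = m.groups()
    return {
        "identity": (int(ident), int(ident_len), float(ident_pct)),
        "positives": (int(pos), int(pos_len), float(pos_pct)),
    }


def percent(count, length):
    """Recompute a percentage from a count and an alignment length, to two decimals."""
    return round(100.0 * count / length, 2)


if __name__ == "__main__":
    line = "Identity = 212/324 (65.43%), Positives = 250/324 (77.16%), Query Frame = 0"
    stats = parse_hsp_stats(line)
    for name, (count, length, reported) in stats.items():
        # Compare the reported percentage with the recomputed one.
        print(name, count, length, reported, percent(count, length))
```

Comparing the reported and recomputed values is a quick way to catch lines that were damaged during copy or export.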

The following BLAST results are available for this feature:
Match Name      E-value   Identity (%)  Description
Q8L970          1.9e-127  69.30         Probable prolyl 4-hydroxylase 7 OS=Arabidopsis thaliana OX=3702 GN=P4H7 PE=2 SV=...
F4J0A8          6.0e-113  65.29         Probable prolyl 4-hydroxylase 6 OS=Arabidopsis thaliana OX=3702 GN=P4H6 PE=2 SV=...
Q8LAN3          3.1e-93   58.04         Probable prolyl 4-hydroxylase 4 OS=Arabidopsis thaliana OX=3702 GN=P4H4 PE=2 SV=...
F4JAU3          1.3e-91   60.51         Prolyl 4-hydroxylase 2 OS=Arabidopsis thaliana OX=3702 GN=P4H2 PE=1 SV=1
Q9LN20          2.0e-63   54.81         Probable prolyl 4-hydroxylase 3 OS=Arabidopsis thaliana OX=3702 GN=P4H3 PE=2 SV=...

Match Name      E-value   Identity (%)  Description
A0A0A0KS38      2.7e-180  100.00        Procollagen-proline 4-dioxygenase OS=Cucumis sativus OX=3659 GN=Csa_5G633280 PE=...
A0A1S3C8G4      1.2e-167  93.06         Procollagen-proline 4-dioxygenase OS=Cucumis melo OX=3656 GN=LOC103498028 PE=3 S...
A0A5D3CTS4      1.5e-159  78.46         Procollagen-proline 4-dioxygenase OS=Cucumis melo var. makuwa OX=1194695 GN=E567...
A0A6J1BXN9      4.6e-156  86.58         Procollagen-proline 4-dioxygenase OS=Momordica charantia OX=3673 GN=LOC111006412...
A0A6J1FJ93      2.8e-153  86.90         Procollagen-proline 4-dioxygenase OS=Cucurbita moschata OX=3662 GN=LOC111444767 ...

Match Name      E-value   Identity (%)  Description
XP_011655982.1  5.5e-180  100.00        probable prolyl 4-hydroxylase 7 isoform X1 [Cucumis sativus]
XP_031742194.1  3.9e-178  99.68         probable prolyl 4-hydroxylase 7 isoform X2 [Cucumis sativus]
KAE8648909.1    6.5e-173  88.92         hypothetical protein Csa_008411 [Cucumis sativus]
XP_008458700.1  2.4e-167  93.06         PREDICTED: probable prolyl 4-hydroxylase 7 [Cucumis melo]
XP_038889686.1  9.8e-161  89.46         probable prolyl 4-hydroxylase 7 isoform X1 [Benincasa hispida]

Match Name      E-value   Identity (%)  Description
AT3G28480.1     1.4e-128  69.30         Oxoglutarate/iron-dependent oxygenase
AT3G28480.2     1.4e-120  65.43         Oxoglutarate/iron-dependent oxygenase
AT3G28490.1     4.3e-114  65.29         Oxoglutarate/iron-dependent oxygenase
AT5G18900.1     2.2e-94   58.04         2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein
AT3G06300.1     9.3e-93   60.51         P4H isoform 2
InterPro
Analysis Name: InterPro Annotations of Cucumber (PI 183967) v1
Date Performed: 2021-10-25
IPR Term   IPR Description                                                    Source   Source Term      Source Description                  Alignment
IPR003582  ShKT domain                                                        SMART    SM00254          ShkT_1                              coord: 272..313; e-value: 4.4E-4; score: 29.6
IPR003582  ShKT domain                                                        PROSITE  PS51670          SHKT                                coord: 273..313; score: 8.75773
IPR006620  Prolyl 4-hydroxylase, alpha subunit                                SMART    SM00702          p4hc                                coord: 60..257; e-value: 2.0E-54; score: 196.8
IPR044862  Prolyl 4-hydroxylase alpha subunit, Fe(2+) 2OG dioxygenase domain  PFAM     PF13640          2OG-FeII_Oxy_3                      coord: 141..257; e-value: 8.7E-20; score: 71.4
None       No IPR available                                                   GENE3D   2.60.120.620     q2cbj1_9rhob like domain            coord: 52..258; e-value: 3.6E-76; score: 257.5
None       No IPR available                                                   PANTHER  PTHR10869:SF140  OS03G0803500 PROTEIN                coord: 41..313
IPR045054  Prolyl 4-hydroxylase                                               PANTHER  PTHR10869        PROLYL 4-HYDROXYLASE ALPHA SUBUNIT  coord: 41..313
IPR005123  Oxoglutarate/iron-dependent dioxygenase                            PROSITE  PS51471          FE2OG_OXY                           coord: 136..258; score: 12.217567
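
The `coord:` values in the InterPro annotations give the span of each predicted domain on the protein. A short Python sketch for turning such a span into a residue count follows. It assumes the coordinates are 1-based and inclusive, which this page does not state explicitly, and `domain_length` is a hypothetical helper name.

```python
import re

# Matches spans written as "coord: START..END", as in the InterPro rows above.
COORD = re.compile(r"coord:\s*(\d+)\.\.(\d+)")


def domain_length(text):
    """Return the number of residues covered by the first 'coord: a..b' span in text.

    Assumes 1-based, inclusive coordinates (an assumption, not stated on the page).
    """
    m = COORD.search(text)
    if m is None:
        raise ValueError("no 'coord: a..b' span found in %r" % text)
    start, end = int(m.group(1)), int(m.group(2))
    if end < start:
        raise ValueError("end precedes start in %r" % text)
    return end - start + 1


if __name__ == "__main__":
    for span in ("coord: 272..313", "coord: 60..257", "coord: 141..257"):
        print(span, domain_length(span))
```

Under that assumption, the SMART ShKT hit at 272..313 covers 42 residues and the SMART p4hc hit at 60..257 covers 198 residues.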

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name    Unique Name     Type
CSPI05G28680.1  CSPI05G28680.1  mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category            Term Accession  Term Name
biological_process  GO:0018401      peptidyl-proline hydroxylation to 4-hydroxy-L-proline
cellular_component  GO:0005789      endoplasmic reticulum membrane
molecular_function  GO:0005506      iron ion binding
molecular_function  GO:0031418      L-ascorbic acid binding
molecular_function  GO:0004656      procollagen-proline 4-dioxygenase activity
molecular_function  GO:0016705      oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen