HG10022103 (gene) Bottle gourd (Hangzhou Gourd) v1

Overview
Name: HG10022103
Type: gene
Organism: Lagenaria siceraria cv. Hangzhou Gourd (Bottle gourd (Hangzhou Gourd) v1)
Description: Procollagen-proline 4-dioxygenase
Location: Chr05: 20874457 .. 20878029 (+)
RNA-Seq Expression: HG10022103
Synteny: HG10022103
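The Location field can be read programmatically to recover the chromosome, strand, and span of the gene. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (the usual GFF convention, not stated on this page); the helper name `parse_location` is hypothetical.

```python
import re

def parse_location(text):
    """Parse a location string such as 'Chr05: 20874457 .. 20878029 (+)'.

    Returns (chromosome, start, end, strand). The layout is inferred from
    this record only; other records may be formatted differently.
    """
    match = re.fullmatch(
        r"(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)", text.strip()
    )
    if match is None:
        raise ValueError(f"unrecognised location: {text!r}")
    chrom, start, end, strand = match.groups()
    return chrom, int(start), int(end), strand

chrom, start, end, strand = parse_location("Chr05: 20874457 .. 20878029 (+)")
# Assumption: 1-based inclusive coordinates, so the span is end - start + 1.
span = end - start + 1
print(chrom, strand, span)
```

Under that assumption the gene spans 3573 bp on the forward strand of Chr05.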
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

ATGGATTCCCGACGATTCCTCGCATTTTCTCTCTGCTTTCTGTCCGTCTTTACTGGCTTCGCTCGCTTGCCGGAAACGCGTACGCACAAAAAATTGTACGATTTCTTTCGTCTTCTTCTTGCCTTTCATTTTTTGTAATTTCACGAATCCCACTGACGCATGGTTTTGGATTCAATTGTTTTTGGAATTTCAGAAGTGGATCTGTGCTTCAATTGAAGACGGATTCATCTCCGCTCATTTTCGATCCAACACGAGTCACTCAGCTCTCCTGGCAACCCAGGTAATCTGTCTTTCTCTCTTAGGCGATCGAATAAGATGTTCCTTTACTTTCTGGCTTGATTGTGTGGAAAGATAATCCTGCTTGAGATAGTCAAACCACATTTTCCATCAATTCAGTTTTCAATGTATCCTCAATTGGCGATGTGTTTGTTTATATTCTTCTGAAAACTGCAAATCTACAGGGCATTTTTGTATAAGGGATTTTTAACTGATAAGGAATGTGATCATCTAATCGATCTGGTAAGTGATTATGGAACGGTTTGCTTGTTTTAATTTAAATTTCAATATTGTGGTAATGTTATATGTTTGTTTGATTCTATTATTTTCGGGAATAGGCCAAGGATAAATTAGAGAAGTCAATGGTAGCAGATAATGAGTCTGGTAAAAGTGTAAGTAGTGAAGTCCGAACGAGTTCTGGCATGTTCCTTCGGAAGGCCCAGGTGCGTTAGTTTATTTGAATCATCCCATCCATCCTTATAATGTTTTACTTTTTGAACTTAATTTTATGTTACTGTATGTTAAAGATCGTGGTGTCATAGTATGTTCTTGTAGAATAAGGTAAGAACTAAGAAGGATAATGTAATTGTAGATGTCAATGTATGGAGTCCTGAACGTGTTGGTAACATTTTAAGGTAAAAACTTCTAACGAAAATTGTGTGAACTTATATAAATTATTTGACTACATGAATTTTCCAAATGTGCAGTAATGTATGTAAAAGTACTTGTTTTAGACAGAAAGAGATGGCTGAATATAAATGAAAAAAACTTGTTTGCCTTATGGAGTCTAGTCATTGTCCCAATGTAGTTTGACCTTATGCATGTGTTTGGGGCATTGTTTTGGAGGAGAATTATCAACATCCCTGTTCATTAACCTGTTGTCAACAATATCCCTTGATAATATTTCATGAGGAAGGATCCATCTGCCTTCTCTCCTCTTATCATCATCCATACTGTTGGCATTGGATGTTCAACTAAATAACATCTCTCTAGCGAGCTAACTCAAGACTAAACTCTCAAAAGAAACAGTTCTACACGGATCTCACAAACTAACAACTCCAGAGAAGGTGATCCAAGTCTTCCTCCACCTTTCAAAAAGGAATACAACAAAAAAGTTCCGTAATTGTGGACTTCCTTCTCAGAAGCCTCCATAGTGTTCACACGGCCGAGTAAAACTTGTCATATTCAAGTTTCAAGTCTAAACACTAATTATTATTTTCTTTAAACATAATATAATATGGAAGTTTTGTTATGTGTTTAATGCCTATAAACTGCAAGTTGGCTAATTACAATCCATGACTAATTTTACCACCAAGTCTAAACTAAATAAACATGGATGACAAGTTAAAGATTAAATTGATTTTATTAGACGACCTTACTTGAACAGGACCTATTCTATAGGTTGTACTACTGTGTCCTTAAGTTCTATAGATTAAATTGATTCTATTAGTGACAATATTGATCACGGTATTGAGAATGATCTTTAAAATATTTATTTAAGAAAGCGACTGTTGATAGATTTGTTTGACATATCAATTTTTTTAATATTTCAATCATGGCGTATGCCTCATATTGCTTTGAAGCTTAATTACAATGTATATGATTTATTTTGTGCTTGAAGGATGAAATTGTTGCTGGCATTGAGGCCAGGATATCTGCGTGGACATTCCTTCCAGCAGGTACATTTGTTGTATGCTCATTGTTGTTATGGATACACTATTTTT
GTTTTTTTTGTTTTTTTGTTTTTTTTTTAATTTAAATCAGATTATAAAGCTTTAACGATCACTTACTTTTCTTTCTTGGAACTGTTGTTAAAATCAGAAAACGGAGAATCTATTCAAATTCTTCACTATGAGAATGGTCAAAAGTATGAACCACATTTTGATTTTTTTCACGACAAGGTGAATCAGGAGTTGGGTGGCCACCGAATAGCCACAGTCTTGATGTATTTATCCAATGTTGAAAAGGGTGGAGAAACCATCTTTCCAAATTCAGAGGTATGGCAGTGGTTCTGCCACTTCAGTGCCTTTGGTTTTCTGAAAAACGGACGTATATTTTATTTTTGTTGCCAACTGTGATGATTTTGTCTTCTGCAGTTGAAAGAATCTCAAGAAAAGGATGACAGCTGGTCTGATTGTGCTCATAAGGGTTATGCAGGTAGGTATTTGTTATTGACTCATTTTTTTTAGTATAGATTCTGAGTCATCAATGTATTATGTTACATTTTCAATTTAGAAAAGGCCTCGGATGGCCTATCTAGCATTGCTAGACCTTCCGTGAAAGTATCTTTCTTGAACTTGATCTTCCTTTCAGGTCTTTGAGAAAGGGTACTCATCTCTGGTTAATATACTTTATGGAGAGCCAACTTGATTTGACCCTTTGTTATGAGATTTTCGCCTGCATCATTGATTTGTCGTCATTATAGTCTATTCAATATATAGCCTGGTTTAGAAAAAGTTTGGTTGTCCAATTAGTTTGAAATGTTAGCTGAGACTGTTGCTTGTGATTTTGCATGTTCTAACTTGTTTTAAAAGGCACATCTAGACGCCATTACAAAGGCAATGCACACTCAGTAAAACCTATTGTAGTTCCTTGTTCTTTAAAAGTGAAATATATTGCTTGCCTGCGTGGATGTTTTTGGGCCTATTTGAATTGACTAGAAAAAAAAATGTTTTCGTCATTTTTATTTAAATTCTTTTGATGAAAACTATTTGAAATACACTTTGAAAGTGTTTCAAAAGCTATTTTGAGTAGTTTCAAACACTTCAATTTTTTCAAAATGACTTATTTTCAAAATTAAACACTTGAAAAGTTAAAACTTAAACCAAACACACCTTATAACTCTGAACATATTAACTTTGTAGGCTCTACTGTAAAGTGAAATATATTTCTTGACTTTTGAGGCTCTATTCTATTTTTGTTGTCATTGTTTCTTATAGCTTCTAGGATCGATATCATACGTTTCCTTATGGTTCTACAACTTGGCTGTTCATTGCAGTTAAGGCACAGAAGGGTGATGCATTGCTGTTCTTCAGCCTCCATCTTGATGCAACGACAGATGACAAAAGCTTGCACGGTAGTTGCCCTGTGATTGAGGGCGAGAAATGGTCTGCAACCAAATGGATTCATGTGAGATCCTTCGAGAAGCCAACTCGTGTAAGTAGTCAGGATTGCATGGACGAGAACGAAAATTGCCCGTTATGGGCAAAAAGAGGAGAGTGCAAAAAGAACCCTACTTACATGGTGGGTTCTGAAGGTGCTTTAGGATACTGTAGGAAGAGTTGCAGAGCATGTTGA

mRNA sequence

ATGGATTCCCGACGATTCCTCGCATTTTCTCTCTGCTTTCTGTCCGTCTTTACTGGCTTCGCTCGCTTGCCGGAAACGCGTACGCACAAAAAATTAAGTGGATCTGTGCTTCAATTGAAGACGGATTCATCTCCGCTCATTTTCGATCCAACACGAGTCACTCAGCTCTCCTGGCAACCCAGGGCATTTTTGTATAAGGGATTTTTAACTGATAAGGAATGTGATCATCTAATCGATCTGGCCAAGGATAAATTAGAGAAGTCAATGGTAGCAGATAATGAGTCTGGTAAAAGTGTAAGTAGTGAAGTCCGAACGAGTTCTGGCATGTTCCTTCGGAAGGCCCAGGATGAAATTGTTGCTGGCATTGAGGCCAGGATATCTGCGTGGACATTCCTTCCAGCAGATTATAAAGCTTTAACGATCACTTACTTTTCTTTCTTGGAACTGTTGTTAAAATCAGAAAACGGAGAATCTATTCAAATTCTTCACTATGAGAATGGTCAAAAGTATGAACCACATTTTGATTTTTTTCACGACAAGGTGAATCAGGAGTTGGGTGGCCACCGAATAGCCACAGTCTTGATGTATTTATCCAATGTTGAAAAGGGTGGAGAAACCATCTTTCCAAATTCAGAGTTGAAAGAATCTCAAGAAAAGGATGACAGCTGGTCTGATTGTGCTCATAAGGGTTATGCAGTTAAGGCACAGAAGGGTGATGCATTGCTGTTCTTCAGCCTCCATCTTGATGCAACGACAGATGACAAAAGCTTGCACGGTAGTTGCCCTGTGATTGAGGGCGAGAAATGGTCTGCAACCAAATGGATTCATGTGAGATCCTTCGAGAAGCCAACTCGTGTAAGTAGTCAGGATTGCATGGACGAGAACGAAAATTGCCCGTTATGGGCAAAAAGAGGAGAGTGCAAAAAGAACCCTACTTACATGGTGGGTTCTGAAGGTGCTTTAGGATACTGTAGGAAGAGTTGCAGAGCATGTTGA

Coding sequence (CDS)

ATGGATTCCCGACGATTCCTCGCATTTTCTCTCTGCTTTCTGTCCGTCTTTACTGGCTTCGCTCGCTTGCCGGAAACGCGTACGCACAAAAAATTAAGTGGATCTGTGCTTCAATTGAAGACGGATTCATCTCCGCTCATTTTCGATCCAACACGAGTCACTCAGCTCTCCTGGCAACCCAGGGCATTTTTGTATAAGGGATTTTTAACTGATAAGGAATGTGATCATCTAATCGATCTGGCCAAGGATAAATTAGAGAAGTCAATGGTAGCAGATAATGAGTCTGGTAAAAGTGTAAGTAGTGAAGTCCGAACGAGTTCTGGCATGTTCCTTCGGAAGGCCCAGGATGAAATTGTTGCTGGCATTGAGGCCAGGATATCTGCGTGGACATTCCTTCCAGCAGATTATAAAGCTTTAACGATCACTTACTTTTCTTTCTTGGAACTGTTGTTAAAATCAGAAAACGGAGAATCTATTCAAATTCTTCACTATGAGAATGGTCAAAAGTATGAACCACATTTTGATTTTTTTCACGACAAGGTGAATCAGGAGTTGGGTGGCCACCGAATAGCCACAGTCTTGATGTATTTATCCAATGTTGAAAAGGGTGGAGAAACCATCTTTCCAAATTCAGAGTTGAAAGAATCTCAAGAAAAGGATGACAGCTGGTCTGATTGTGCTCATAAGGGTTATGCAGTTAAGGCACAGAAGGGTGATGCATTGCTGTTCTTCAGCCTCCATCTTGATGCAACGACAGATGACAAAAGCTTGCACGGTAGTTGCCCTGTGATTGAGGGCGAGAAATGGTCTGCAACCAAATGGATTCATGTGAGATCCTTCGAGAAGCCAACTCGTGTAAGTAGTCAGGATTGCATGGACGAGAACGAAAATTGCCCGTTATGGGCAAAAAGAGGAGAGTGCAAAAAGAACCCTACTTACATGGTGGGTTCTGAAGGTGCTTTAGGATACTGTAGGAAGAGTTGCAGAGCATGTTGA

Protein sequence

MDSRRFLAFSLCFLSVFTGFARLPETRTHKKLSGSVLQLKTDSSPLIFDPTRVTQLSWQPRAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVAGIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHKGYAVKAQKGDALLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRVSSQDCMDENENCPLWAKRGECKKNPTYMVGSEGALGYCRKSCRAC
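The protein sequence above corresponds to a codon-by-codon translation of the CDS. The sketch below illustrates this on the first and last codons of the CDS, assuming the standard nuclear genetic code (NCBI translation table 1); the function name `translate` is hypothetical and is not part of any database tooling.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":  # stop codon: end of the polypeptide
            break
        protein.append(aa)
    return "".join(protein)

# First ten codons and last four codons of the CDS shown above.
print(translate("ATGGATTCCCGACGATTCCTCGCATTTTCT"))
print(translate("AGAGCATGTTGA"))
```

The two fragments translate to the start (MDSRRFLAFS) and end (RAC) of the listed protein; the final codon TGA is the stop codon and contributes no residue.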
Homology
BLAST of HG10022103 vs. NCBI nr
Match: KAE8648909.1 (hypothetical protein Csa_008411 [Cucumis sativus])

HSP 1 Score: 594.7 bits (1532), Expect = 4.8e-166
Identity = 297/352 (84.38%), Positives = 314/352 (89.20%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPETRTHKKLSGSVLQLKTDSSPLIFDPTRVTQLSWQP 60
           MDSR FLAFSLCFLSVFT FARLPETRTHK+ SGSVL+LKTDSSPLIFDPTRVTQLSWQP
Sbjct: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP 60

Query: 61  RAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120
           RAFLYKGFL+D ECDHLIDLAKDKLEKSMVADN+SGKSVSSEVRTSSGMFLRKAQDE+VA
Sbjct: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA 120

Query: 121 GIEARISAWTFLPA----DYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDF 180
           G+EARI+AWT LPA    DY+A+ ITY SFLELLLKSENGESIQILHYENGQKYEPHFDF
Sbjct: 121 GVEARIAAWTLLPAGRFVDYEAIPITYVSFLELLLKSENGESIQILHYENGQKYEPHFDF 180

Query: 181 FHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSE----------------LKESQEKD 240
           FHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSE                 KESQ KD
Sbjct: 181 FHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSETYILILLPTVMTLSLQFKESQAKD 240

Query: 241 DSWSDCAHKGYAVKAQKGDALLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSF 300
           +SWSDC+ KGYAVKAQKGDALLFFSL+LDATTD++SLHGSCPVI GEKWSATKWIHVRSF
Sbjct: 241 ESWSDCSRKGYAVKAQKGDALLFFSLNLDATTDERSLHGSCPVIAGEKWSATKWIHVRSF 300

Query: 301 EKPT-RVSSQDCMDENENCPLWAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
           EK T RVS Q C+DENENC  WAK+GECKKNPTYMVGS GALGYCRKSC+AC
Sbjct: 301 EKITSRVSRQGCVDENENCLAWAKKGECKKNPTYMVGSGGALGYCRKSCKAC 352
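The percentages on each HSP statistics line are the match counts divided by the alignment length, which includes gap columns (352 columns here, although the query is 332 residues long). The sketch below recomputes them from the line for the hit above; the function name `parse_hsp_stats` and the regular expression are illustrative assumptions based on the layout of this page only.

```python
import re

# Matches the statistics line of an HSP as printed on this page.
# "Posi?tives" tolerates both spellings of the label.
HSP_STATS = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"Posi?tives = (\d+)/(\d+) \(([\d.]+)%\)"
)

def parse_hsp_stats(line):
    """Return recomputed and reported identity/positives percentages."""
    match = HSP_STATS.search(line)
    if match is None:
        raise ValueError("not an HSP statistics line")
    ident, ident_len, ident_pct, pos, pos_len, pos_pct = match.groups()
    return {
        "identity": 100 * int(ident) / int(ident_len),
        "positives": 100 * int(pos) / int(pos_len),
        "reported_identity": float(ident_pct),
        "reported_positives": float(pos_pct),
    }

stats = parse_hsp_stats(
    "Identity = 297/352 (84.38%), Positives = 314/352 (89.20%), Query Frame = 0"
)
print(f"{stats['identity']:.2f}% identity, {stats['positives']:.2f}% positives")
```

The recomputed values agree with the reported ones to within rounding, which is a quick consistency check when parsing many hits.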

BLAST of HG10022103 vs. NCBI nr
Match: XP_008458700.1 (PREDICTED: probable prolyl 4-hydroxylase 7 [Cucumis melo])

HSP 1 Score: 582.8 bits (1501), Expect = 1.9e-162
Identity = 287/335 (85.67%), Positives = 302/335 (90.15%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPETR----THKKLSGSVLQLKTDSSPLIFDPTRVTQL 60
           MDSR FLAFSLCFLSVFT FARLPETR    ++K+ +GSVL+LKTDSSPLIFDPTRVTQL
Sbjct: 1   MDSRPFLAFSLCFLSVFTAFARLPETRMLKHSYKQSTGSVLRLKTDSSPLIFDPTRVTQL 60

Query: 61  SWQPRAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQD 120
           SWQPRAFLYKGFL+D+ECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQD
Sbjct: 61  SWQPRAFLYKGFLSDEECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQD 120

Query: 121 EIVAGIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDF 180
           +IVAG+EARI+AWT LPA                   ENGESIQILHYENGQKYEPHFDF
Sbjct: 121 KIVAGVEARIAAWTLLPA-------------------ENGESIQILHYENGQKYEPHFDF 180

Query: 181 FHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHKGYAVKAQ 240
           FHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSE KESQEKDDSWSDC+ KGYAVKAQ
Sbjct: 181 FHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSEFKESQEKDDSWSDCSRKGYAVKAQ 240

Query: 241 KGDALLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRVSSQDCMDENE 300
           KGDALLFFSLHLDATTD++SLHGSCPVIEGEKWSATKWIHVRSFEK  RVS QDC+DENE
Sbjct: 241 KGDALLFFSLHLDATTDERSLHGSCPVIEGEKWSATKWIHVRSFEKLPRVSRQDCVDENE 300

Query: 301 NCPLWAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
           NCP WAKRGECKKNPTYMVGSEGALGYCRKSC+AC
Sbjct: 301 NCPAWAKRGECKKNPTYMVGSEGALGYCRKSCKAC 316

BLAST of HG10022103 vs. NCBI nr
Match: XP_038889686.1 (probable prolyl 4-hydroxylase 7 isoform X1 [Benincasa hispida])

HSP 1 Score: 582.4 bits (1500), Expect = 2.5e-162
Identity = 287/331 (86.71%), Positives = 297/331 (89.73%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPETRTHKKLSGSVLQLKTDSSPLIFDPTRVTQLSWQP 60
           MDSRRFLAF LCFLSVFTGFARLPE R+ KK SGSV++LKTDSSPL+FDPTRVTQLSW+P
Sbjct: 1   MDSRRFLAFCLCFLSVFTGFARLPELRSQKKSSGSVIRLKTDSSPLVFDPTRVTQLSWEP 60

Query: 61  RAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120
           RAFLYKGFL+DKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA
Sbjct: 61  RAFLYKGFLSDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120

Query: 121 GIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDFFHDK 180
            IEARISAWT LPA                   ENGESIQILHYENGQKYEPHFDFFHDK
Sbjct: 121 AIEARISAWTLLPA-------------------ENGESIQILHYENGQKYEPHFDFFHDK 180

Query: 181 VNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHKGYAVKAQKGDA 240
           VNQELGGHRIATVLMYLSNVEKGGETIFPNSE KESQEKD+SWSDCA KGYAVKA+KGDA
Sbjct: 181 VNQELGGHRIATVLMYLSNVEKGGETIFPNSEFKESQEKDESWSDCARKGYAVKARKGDA 240

Query: 241 LLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRVSSQDCMDENENCPL 300
           LLFFSL  DATTD KSLHGSCPVIEGEKWSATKWIHVRSFEK TRVS QDC+DENENC +
Sbjct: 241 LLFFSLRPDATTDVKSLHGSCPVIEGEKWSATKWIHVRSFEKATRVSRQDCVDENENCQI 300

Query: 301 WAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
           WAKRGECKKNPTYMVGSE ALGYCRKSCRAC
Sbjct: 301 WAKRGECKKNPTYMVGSEDALGYCRKSCRAC 312

BLAST of HG10022103 vs. NCBI nr
Match: XP_038889687.1 (probable prolyl 4-hydroxylase 7 isoform X2 [Benincasa hispida])

HSP 1 Score: 578.6 bits (1490), Expect = 3.5e-161
Identity = 287/331 (86.71%), Positives = 297/331 (89.73%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPETRTHKKLSGSVLQLKTDSSPLIFDPTRVTQLSWQP 60
           MDSRRFLAF LCFLSVFTGFARLPE R+ KK SGSV++LKTDSSPL+FDPTRVTQLSW+P
Sbjct: 1   MDSRRFLAFCLCFLSVFTGFARLPELRSQKK-SGSVIRLKTDSSPLVFDPTRVTQLSWEP 60

Query: 61  RAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120
           RAFLYKGFL+DKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA
Sbjct: 61  RAFLYKGFLSDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120

Query: 121 GIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDFFHDK 180
            IEARISAWT LPA                   ENGESIQILHYENGQKYEPHFDFFHDK
Sbjct: 121 AIEARISAWTLLPA-------------------ENGESIQILHYENGQKYEPHFDFFHDK 180

Query: 181 VNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHKGYAVKAQKGDA 240
           VNQELGGHRIATVLMYLSNVEKGGETIFPNSE KESQEKD+SWSDCA KGYAVKA+KGDA
Sbjct: 181 VNQELGGHRIATVLMYLSNVEKGGETIFPNSEFKESQEKDESWSDCARKGYAVKARKGDA 240

Query: 241 LLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRVSSQDCMDENENCPL 300
           LLFFSL  DATTD KSLHGSCPVIEGEKWSATKWIHVRSFEK TRVS QDC+DENENC +
Sbjct: 241 LLFFSLRPDATTDVKSLHGSCPVIEGEKWSATKWIHVRSFEKATRVSRQDCVDENENCQI 300

Query: 301 WAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
           WAKRGECKKNPTYMVGSE ALGYCRKSCRAC
Sbjct: 301 WAKRGECKKNPTYMVGSEDALGYCRKSCRAC 311

BLAST of HG10022103 vs. NCBI nr
Match: XP_022938573.1 (probable prolyl 4-hydroxylase 7 [Cucurbita moschata])

HSP 1 Score: 574.3 bits (1479), Expect = 6.7e-160
Identity = 287/331 (86.71%), Positives = 297/331 (89.73%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPETRTHKKLSGSVLQLKTDSSPLIFDPTRVTQLSWQP 60
           MDSRRFLAFSL FLSV TGFARLPE  THKKLSGSVL+LK DS  LIFDPTRVTQLSWQP
Sbjct: 1   MDSRRFLAFSLFFLSVSTGFARLPE--THKKLSGSVLELKRDSPRLIFDPTRVTQLSWQP 60

Query: 61  RAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120
           RAFLYKGFLTD+ECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA
Sbjct: 61  RAFLYKGFLTDQECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120

Query: 121 GIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDFFHDK 180
           GIEARISAWTFLP                    ENGESIQILHYENGQKYEPHFDFFHDK
Sbjct: 121 GIEARISAWTFLPV-------------------ENGESIQILHYENGQKYEPHFDFFHDK 180

Query: 181 VNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHKGYAVKAQKGDA 240
           VNQELGGHRIATVLMYLSNVEKGGETIFPNS   ESQEKDDSWSDCA KGYAVKAQKGDA
Sbjct: 181 VNQELGGHRIATVLMYLSNVEKGGETIFPNSAF-ESQEKDDSWSDCARKGYAVKAQKGDA 240

Query: 241 LLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRVSSQDCMDENENCPL 300
           LLFFSLHLDATTD +SLHGSCPVIEGEKWSATKWIHVRSF+K TR+SSQDC+DEN+NCP 
Sbjct: 241 LLFFSLHLDATTDKRSLHGSCPVIEGEKWSATKWIHVRSFDKATRISSQDCVDENKNCPS 300

Query: 301 WAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
           WAKRGEC+KNPTYMVGSEGA+GYCRKSC+AC
Sbjct: 301 WAKRGECQKNPTYMVGSEGAVGYCRKSCKAC 309

BLAST of HG10022103 vs. ExPASy Swiss-Prot
Match: Q8L970 (Probable prolyl 4-hydroxylase 7 OS=Arabidopsis thaliana OX=3702 GN=P4H7 PE=2 SV=1)

HSP 1 Score: 447.2 bits (1149), Expect = 1.6e-124
Identity = 220/334 (65.87%), Positives = 257/334 (76.95%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPE---TRTHKKLSGSVLQLKTDSSPLIFDPTRVTQLS 60
           MDSR FLAFSLCFL      +  P    TR+     GSV+++KT +S   FDPTRVTQLS
Sbjct: 1   MDSRIFLAFSLCFLFTLPLISSAPNRFLTRSSNTRDGSVIKMKTSASSFGFDPTRVTQLS 60

Query: 61  WQPRAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDE 120
           W PR FLY+GFL+D+ECDH I LAK KLEKSMVADN+SG+SV SEVRTSSGMFL K QD+
Sbjct: 61  WTPRVFLYEGFLSDEECDHFIKLAKGKLEKSMVADNDSGESVESEVRTSSGMFLSKRQDD 120

Query: 121 IVAGIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDFF 180
           IV+ +EA+++AWTFLP                    ENGES+QILHYENGQKYEPHFD+F
Sbjct: 121 IVSNVEAKLAAWTFLP-------------------EENGESMQILHYENGQKYEPHFDYF 180

Query: 181 HDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHKGYAVKAQK 240
           HD+ N ELGGHRIATVLMYLSNVEKGGET+FP  + K +Q KDDSW++CA +GYAVK +K
Sbjct: 181 HDQANLELGGHRIATVLMYLSNVEKGGETVFPMWKGKATQLKDDSWTECAKQGYAVKPRK 240

Query: 241 GDALLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRVSSQDCMDENEN 300
           GDALLFF+LH +ATTD  SLHGSCPV+EGEKWSAT+WIHV+SFE+     S  CMDEN +
Sbjct: 241 GDALLFFNLHPNATTDSNSLHGSCPVVEGEKWSATRWIHVKSFERAFNKQS-GCMDENVS 300

Query: 301 CPLWAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
           C  WAK GEC+KNPTYMVGS+   GYCRKSC+AC
Sbjct: 301 CEKWAKAGECQKNPTYMVGSDKDHGYCRKSCKAC 314

BLAST of HG10022103 vs. ExPASy Swiss-Prot
Match: F4J0A8 (Probable prolyl 4-hydroxylase 6 OS=Arabidopsis thaliana OX=3702 GN=P4H6 PE=2 SV=1)

HSP 1 Score: 404.4 bits (1038), Expect = 1.2e-111
Identity = 207/332 (62.35%), Positives = 241/332 (72.59%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPETRTHKKLSGSVLQLKTDSSPLIFDPTRVTQLSWQP 60
           MDS+ FLAFSL  L +F+                     +  S     DPTR+TQLSW P
Sbjct: 1   MDSQYFLAFSLSLLLIFS---------------------QISSFSFSVDPTRITQLSWTP 60

Query: 61  RAFLYKGFLTDKECDHLIDLAKDKLEKSM-VADNESGKSVSSEVRTSSGMFLRKAQDEIV 120
           RAFLYKGFL+D+ECDHLI LAK KLEKSM VAD +SG+S  SEVRTSSGMFL K QD+IV
Sbjct: 61  RAFLYKGFLSDEECDHLIKLAKGKLEKSMVVADVDSGESEDSEVRTSSGMFLTKRQDDIV 120

Query: 121 AGIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDFFHD 180
           A +EA+++AWTFLP                    ENGE++QILHYENGQKY+PHFD+F+D
Sbjct: 121 ANVEAKLAAWTFLP-------------------EENGEALQILHYENGQKYDPHFDYFYD 180

Query: 181 KVNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHKGYAVKAQKGD 240
           K   ELGGHRIATVLMYLSNV KGGET+FPN + K  Q KDDSWS CA +GYAVK +KGD
Sbjct: 181 KKALELGGHRIATVLMYLSNVTKGGETVFPNWKGKTPQLKDDSWSKCAKQGYAVKPRKGD 240

Query: 241 ALLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRVSSQDCMDENENCP 300
           ALLFF+LHL+ TTD  SLHGSCPVIEGEKWSAT+WIHVRSF K   V    C+D++E+C 
Sbjct: 241 ALLFFNLHLNGTTDPNSLHGSCPVIEGEKWSATRWIHVRSFGKKKLV----CVDDHESCQ 288

Query: 301 LWAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
            WA  GEC+KNP YMVGSE +LG+CRKSC+AC
Sbjct: 301 EWADAGECEKNPMYMVGSETSLGFCRKSCKAC 288

BLAST of HG10022103 vs. ExPASy Swiss-Prot
Match: Q8LAN3 (Probable prolyl 4-hydroxylase 4 OS=Arabidopsis thaliana OX=3702 GN=P4H4 PE=2 SV=1)

HSP 1 Score: 340.1 bits (871), Expect = 2.8e-92
Identity = 168/292 (57.53%), Positives = 209/292 (71.58%), Query Frame = 0

Query: 43  SSPLIFDPTRVTQLSWQPRAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSE 102
           SS +  +P++V Q+S +PRAF+Y+GFLT+ ECDH++ LAK  L++S VADN+SG+S  SE
Sbjct: 27  SSSVFVNPSKVKQVSSKPRAFVYEGFLTELECDHMVSLAKASLKRSAVADNDSGESKFSE 86

Query: 103 VRTSSGMFLRKAQDEIVAGIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQIL 162
           VRTSSG F+ K +D IV+GIE +IS WTFLP                    ENGE IQ+L
Sbjct: 87  VRTSSGTFISKGKDPIVSGIEDKISTWTFLP-------------------KENGEDIQVL 146

Query: 163 HYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQ---EK 222
            YE+GQKY+ HFD+FHDKVN   GGHR+AT+LMYLSNV KGGET+FP++E+   +   E 
Sbjct: 147 RYEHGQKYDAHFDYFHDKVNIVRGGHRMATILMYLSNVTKGGETVFPDAEIPSRRVLSEN 206

Query: 223 DDSWSDCAHKGYAVKAQKGDALLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRS 282
            +  SDCA +G AVK +KGDALLFF+LH DA  D  SLHG CPVIEGEKWSATKWIHV S
Sbjct: 207 KEDLSDCAKRGIAVKPRKGDALLFFNLHPDAIPDPLSLHGGCPVIEGEKWSATKWIHVDS 266

Query: 283 FEKPTRVSSQDCMDENENCPLWAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
           F++     S +C D NE+C  WA  GEC KNP YMVG+    GYCR+SC+AC
Sbjct: 267 FDRIV-TPSGNCTDMNESCERWAVLGECTKNPEYMVGTTELPGYCRRSCKAC 298

BLAST of HG10022103 vs. ExPASy Swiss-Prot
Match: F4JAU3 (Prolyl 4-hydroxylase 2 OS=Arabidopsis thaliana OX=3702 GN=P4H2 PE=1 SV=1)

HSP 1 Score: 332.4 bits (851), Expect = 5.8e-90
Identity = 169/303 (55.78%), Positives = 210/303 (69.31%), Query Frame = 0

Query: 32  LSGSVLQLKTDSSPLIFDPTRVTQLSWQPRAFLYKGFLTDKECDHLIDLAKDKLEKSMVA 91
           L  S   + + SS  I +P++V Q+S +PRAF+Y+GFLTD ECDHLI LAK+ L++S VA
Sbjct: 19  LQSSTCLISSPSS--IINPSKVKQVSSKPRAFVYEGFLTDLECDHLISLAKENLQRSAVA 78

Query: 92  DNESGKSVSSEVRTSSGMFLRKAQDEIVAGIEARISAWTFLPADYKALTITYFSFLELLL 151
           DN++G+S  S+VRTSSG F+ K +D IV+GIE ++S WTFLP                  
Sbjct: 79  DNDNGESQVSDVRTSSGTFISKGKDPIVSGIEDKLSTWTFLP------------------ 138

Query: 152 KSENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNS 211
             ENGE +Q+L YE+GQKY+ HFD+FHDKVN   GGHRIATVL+YLSNV KGGET+FP++
Sbjct: 139 -KENGEDLQVLRYEHGQKYDAHFDYFHDKVNIARGGHRIATVLLYLSNVTKGGETVFPDA 198

Query: 212 E---LKESQEKDDSWSDCAHKGYAVKAQKGDALLFFSLHLDATTDDKSLHGSCPVIEGEK 271
           +    +   E  D  SDCA KG AVK +KG+ALLFF+L  DA  D  SLHG CPVIEGEK
Sbjct: 199 QEFSRRSLSENKDDLSDCAKKGIAVKPKKGNALLFFNLQQDAIPDPFSLHGGCPVIEGEK 258

Query: 272 WSATKWIHVRSFEKPTRVSSQDCMDENENCPLWAKRGECKKNPTYMVGSEGALGYCRKSC 331
           WSATKWIHV SF+K       +C D NE+C  WA  GEC KNP YMVG+    G CR+SC
Sbjct: 259 WSATKWIHVDSFDK-ILTHDGNCTDVNESCERWAVLGECGKNPEYMVGTPEIPGNCRRSC 299

BLAST of HG10022103 vs. ExPASy Swiss-Prot
Match: Q9LN20 (Probable prolyl 4-hydroxylase 3 OS=Arabidopsis thaliana OX=3702 GN=P4H3 PE=2 SV=1)

HSP 1 Score: 236.5 bits (602), Expect = 4.3e-61
Identity = 117/227 (51.54%), Positives = 157/227 (69.16%), Query Frame = 0

Query: 56  LSWQPRAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQ 115
           LSW+PRAF+Y  FL+ +EC++LI LAK  + KS V D+E+GKS  S VRTSSG FLR+ +
Sbjct: 79  LSWEPRAFVYHNFLSKEECEYLISLAKPHMVKSTVVDSETGKSKDSRVRTSSGTFLRRGR 138

Query: 116 DEIVAGIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFD 175
           D+I+  IE RI+ +TF+PAD+                   GE +Q+LHYE GQKYEPH+D
Sbjct: 139 DKIIKTIEKRIADYTFIPADH-------------------GEGLQVLHYEAGQKYEPHYD 198

Query: 176 FFHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSELK-ESQEKDDSWSDCAHKGYAVK 235
           +F D+ N + GG R+AT+LMYLS+VE+GGET+FP + +   S    +  S+C  KG +VK
Sbjct: 199 YFVDEFNTKNGGQRMATMLMYLSDVEEGGETVFPAANMNFSSVPWYNELSECGKKGLSVK 258

Query: 236 AQKGDALLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFE 282
            + GDALLF+S+  DAT D  SLHG CPVI G KWS+TKW+HV  ++
Sbjct: 259 PRMGDALLFWSMRPDATLDPTSLHGGCPVIRGNKWSSTKWMHVGEYK 286

BLAST of HG10022103 vs. ExPASy TrEMBL
Match: A0A1S3C8G4 (Procollagen-proline 4-dioxygenase OS=Cucumis melo OX=3656 GN=LOC103498028 PE=3 SV=1)

HSP 1 Score: 582.8 bits (1501), Expect = 9.1e-163
Identity = 287/335 (85.67%), Positives = 302/335 (90.15%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPETR----THKKLSGSVLQLKTDSSPLIFDPTRVTQL 60
           MDSR FLAFSLCFLSVFT FARLPETR    ++K+ +GSVL+LKTDSSPLIFDPTRVTQL
Sbjct: 1   MDSRPFLAFSLCFLSVFTAFARLPETRMLKHSYKQSTGSVLRLKTDSSPLIFDPTRVTQL 60

Query: 61  SWQPRAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQD 120
           SWQPRAFLYKGFL+D+ECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQD
Sbjct: 61  SWQPRAFLYKGFLSDEECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQD 120

Query: 121 EIVAGIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDF 180
           +IVAG+EARI+AWT LPA                   ENGESIQILHYENGQKYEPHFDF
Sbjct: 121 KIVAGVEARIAAWTLLPA-------------------ENGESIQILHYENGQKYEPHFDF 180

Query: 181 FHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHKGYAVKAQ 240
           FHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSE KESQEKDDSWSDC+ KGYAVKAQ
Sbjct: 181 FHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSEFKESQEKDDSWSDCSRKGYAVKAQ 240

Query: 241 KGDALLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRVSSQDCMDENE 300
           KGDALLFFSLHLDATTD++SLHGSCPVIEGEKWSATKWIHVRSFEK  RVS QDC+DENE
Sbjct: 241 KGDALLFFSLHLDATTDERSLHGSCPVIEGEKWSATKWIHVRSFEKLPRVSRQDCVDENE 300

Query: 301 NCPLWAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
           NCP WAKRGECKKNPTYMVGSEGALGYCRKSC+AC
Sbjct: 301 NCPAWAKRGECKKNPTYMVGSEGALGYCRKSCKAC 316

BLAST of HG10022103 vs. ExPASy TrEMBL
Match: A0A6J1FJ93 (Procollagen-proline 4-dioxygenase OS=Cucurbita moschata OX=3662 GN=LOC111444767 PE=3 SV=1)

HSP 1 Score: 574.3 bits (1479), Expect = 3.2e-160
Identity = 287/331 (86.71%), Positives = 297/331 (89.73%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPETRTHKKLSGSVLQLKTDSSPLIFDPTRVTQLSWQP 60
           MDSRRFLAFSL FLSV TGFARLPE  THKKLSGSVL+LK DS  LIFDPTRVTQLSWQP
Sbjct: 1   MDSRRFLAFSLFFLSVSTGFARLPE--THKKLSGSVLELKRDSPRLIFDPTRVTQLSWQP 60

Query: 61  RAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120
           RAFLYKGFLTD+ECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA
Sbjct: 61  RAFLYKGFLTDQECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120

Query: 121 GIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDFFHDK 180
           GIEARISAWTFLP                    ENGESIQILHYENGQKYEPHFDFFHDK
Sbjct: 121 GIEARISAWTFLPV-------------------ENGESIQILHYENGQKYEPHFDFFHDK 180

Query: 181 VNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHKGYAVKAQKGDA 240
           VNQELGGHRIATVLMYLSNVEKGGETIFPNS   ESQEKDDSWSDCA KGYAVKAQKGDA
Sbjct: 181 VNQELGGHRIATVLMYLSNVEKGGETIFPNSAF-ESQEKDDSWSDCARKGYAVKAQKGDA 240

Query: 241 LLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRVSSQDCMDENENCPL 300
           LLFFSLHLDATTD +SLHGSCPVIEGEKWSATKWIHVRSF+K TR+SSQDC+DEN+NCP 
Sbjct: 241 LLFFSLHLDATTDKRSLHGSCPVIEGEKWSATKWIHVRSFDKATRISSQDCVDENKNCPS 300

Query: 301 WAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
           WAKRGEC+KNPTYMVGSEGA+GYCRKSC+AC
Sbjct: 301 WAKRGECQKNPTYMVGSEGAVGYCRKSCKAC 309

BLAST of HG10022103 vs. ExPASy TrEMBL
Match: A0A6J1JWX0 (Procollagen-proline 4-dioxygenase OS=Cucurbita maxima OX=3661 GN=LOC111489579 PE=3 SV=1)

HSP 1 Score: 571.6 bits (1472), Expect = 2.1e-159
Identity = 286/331 (86.40%), Positives = 295/331 (89.12%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPETRTHKKLSGSVLQLKTDSSPLIFDPTRVTQLSWQP 60
           MDSRRFL FSL FLSV TGFARLPE  THKKLSGSVL+LK DS  LIFDPTRVTQLSWQP
Sbjct: 1   MDSRRFLGFSLFFLSVSTGFARLPE--THKKLSGSVLELKRDSPRLIFDPTRVTQLSWQP 60

Query: 61  RAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120
           RAFLYKGFLTD+ECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA
Sbjct: 61  RAFLYKGFLTDQECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120

Query: 121 GIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDFFHDK 180
           GIEARISAWTFLP                    ENGESIQILHYENGQKYEPHFDFFHDK
Sbjct: 121 GIEARISAWTFLPV-------------------ENGESIQILHYENGQKYEPHFDFFHDK 180

Query: 181 VNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHKGYAVKAQKGDA 240
           VNQELGGHRIATVLMYLSNVEKGGETIFPNS   ESQEKDDSWSDCA KGYAVKAQKGDA
Sbjct: 181 VNQELGGHRIATVLMYLSNVEKGGETIFPNSAF-ESQEKDDSWSDCARKGYAVKAQKGDA 240

Query: 241 LLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRVSSQDCMDENENCPL 300
           LLFFSLHLDATTD +SLHGSCPVIEGEKWSATKWIHVRSF+K TR SSQDC+DEN+NCP 
Sbjct: 241 LLFFSLHLDATTDKRSLHGSCPVIEGEKWSATKWIHVRSFDKATRTSSQDCVDENKNCPS 300

Query: 301 WAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
           WAKRGEC+KNPTYMVGSEGA+GYCRKSC+AC
Sbjct: 301 WAKRGECQKNPTYMVGSEGAVGYCRKSCKAC 309

BLAST of HG10022103 vs. ExPASy TrEMBL
Match: A0A0A0KS38 (Procollagen-proline 4-dioxygenase OS=Cucumis sativus OX=3659 GN=Csa_5G633280 PE=3 SV=1)

HSP 1 Score: 571.2 bits (1471), Expect = 2.7e-159
Identity = 282/332 (84.94%), Positives = 297/332 (89.46%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPETRTHKKLSGSVLQLKTDSSPLIFDPTRVTQLSWQP 60
           MDSR FLAFSLCFLSVFT FARLPETRTHK+ SGSVL+LKTDSSPLIFDPTRVTQLSWQP
Sbjct: 1   MDSRPFLAFSLCFLSVFTAFARLPETRTHKQSSGSVLRLKTDSSPLIFDPTRVTQLSWQP 60

Query: 61  RAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120
           RAFLYKGFL+D ECDHLIDLAKDKLEKSMVADN+SGKSVSSEVRTSSGMFLRKAQDE+VA
Sbjct: 61  RAFLYKGFLSDAECDHLIDLAKDKLEKSMVADNDSGKSVSSEVRTSSGMFLRKAQDEVVA 120

Query: 121 GIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDFFHDK 180
           G+EARI+AWT LPA                   ENGESIQILHYENGQKYEPHFDFFHDK
Sbjct: 121 GVEARIAAWTLLPA-------------------ENGESIQILHYENGQKYEPHFDFFHDK 180

Query: 181 VNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHKGYAVKAQKGDA 240
           VNQELGGHRIATVLMYLSNVEKGGETIFPNSE KESQ KD+SWSDC+ KGYAVKAQKGDA
Sbjct: 181 VNQELGGHRIATVLMYLSNVEKGGETIFPNSEFKESQAKDESWSDCSRKGYAVKAQKGDA 240

Query: 241 LLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPT-RVSSQDCMDENENCP 300
           LLFFSL+LDATTD++SLHGSCPVI GEKWSATKWIHVRSFEK T RVS Q C+DENENC 
Sbjct: 241 LLFFSLNLDATTDERSLHGSCPVIAGEKWSATKWIHVRSFEKITSRVSRQGCVDENENCL 300

Query: 301 LWAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
            WAK+GECKKNPTYMVGS GALGYCRKSC+AC
Sbjct: 301 AWAKKGECKKNPTYMVGSGGALGYCRKSCKAC 313

BLAST of HG10022103 vs. ExPASy TrEMBL
Match: A0A6J1BXN9 (Procollagen-proline 4-dioxygenase OS=Momordica charantia OX=3673 GN=LOC111006412 PE=3 SV=1)

HSP 1 Score: 567.4 bits (1461), Expect = 4.0e-158
Identity = 279/332 (84.04%), Positives = 293/332 (88.25%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPETRTHKKLSGSVLQLKTDSSPLIFDPTRVTQLSWQP 60
           MDS RFL+FSLCFL VFT  ARLP+ R HKK+SGSVL+LK + SPLIFDPTRVTQLSWQP
Sbjct: 1   MDSPRFLSFSLCFLFVFTALARLPDMRAHKKISGSVLRLKGEPSPLIFDPTRVTQLSWQP 60

Query: 61  RAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDEIVA 120
           RAFLYKGFL+DKECDHLIDLAKDKLEKSMVADN SGKSVSSEVRTSSGMFL KAQDEIVA
Sbjct: 61  RAFLYKGFLSDKECDHLIDLAKDKLEKSMVADNNSGKSVSSEVRTSSGMFLHKAQDEIVA 120

Query: 121 GIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDFFHDK 180
            +EARI+AWTFLPA                   ENGESIQILHYENGQKYEPHFD+FHDK
Sbjct: 121 AVEARIAAWTFLPA-------------------ENGESIQILHYENGQKYEPHFDYFHDK 180

Query: 181 VNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHKGYAVKAQKGDA 240
           VNQELGGHR+ATVLMYLSNVEKGGETIFPNSE KESQEKDDSWSDCA KGYAVKA+KGDA
Sbjct: 181 VNQELGGHRVATVLMYLSNVEKGGETIFPNSEFKESQEKDDSWSDCARKGYAVKAKKGDA 240

Query: 241 LLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRVSSQ-DCMDENENCP 300
           LLFFSLHLDATTD KSLHGSCPVIEGEKWSATKWIHVRSFEKPTR S + DC+DENENC 
Sbjct: 241 LLFFSLHLDATTDVKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRPSRRLDCVDENENCA 300

Query: 301 LWAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
            WAKRGECKKNPTYMVGSE ALGYCRKSC+AC
Sbjct: 301 SWAKRGECKKNPTYMVGSESALGYCRKSCQAC 313

BLAST of HG10022103 vs. TAIR 10
Match: AT3G28480.1 (Oxoglutarate/iron-dependent oxygenase)

HSP 1 Score: 447.2 bits (1149), Expect = 1.1e-125
Identity = 220/334 (65.87%), Positives = 257/334 (76.95%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPE---TRTHKKLSGSVLQLKTDSSPLIFDPTRVTQLS 60
           MDSR FLAFSLCFL      +  P    TR+     GSV+++KT +S   FDPTRVTQLS
Sbjct: 1   MDSRIFLAFSLCFLFTLPLISSAPNRFLTRSSNTRDGSVIKMKTSASSFGFDPTRVTQLS 60

Query: 61  WQPRAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSEVRTSSGMFLRKAQDE 120
           W PR FLY+GFL+D+ECDH I LAK KLEKSMVADN+SG+SV SEVRTSSGMFL K QD+
Sbjct: 61  WTPRVFLYEGFLSDEECDHFIKLAKGKLEKSMVADNDSGESVESEVRTSSGMFLSKRQDD 120

Query: 121 IVAGIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDFF 180
           IV+ +EA+++AWTFLP                    ENGES+QILHYENGQKYEPHFD+F
Sbjct: 121 IVSNVEAKLAAWTFLP-------------------EENGESMQILHYENGQKYEPHFDYF 180

Query: 181 HDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHKGYAVKAQK 240
           HD+ N ELGGHRIATVLMYLSNVEKGGET+FP  + K +Q KDDSW++CA +GYAVK +K
Sbjct: 181 HDQANLELGGHRIATVLMYLSNVEKGGETVFPMWKGKATQLKDDSWTECAKQGYAVKPRK 240

Query: 241 GDALLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRVSSQDCMDENEN 300
           GDALLFF+LH +ATTD  SLHGSCPV+EGEKWSAT+WIHV+SFE+     S  CMDEN +
Sbjct: 241 GDALLFFNLHPNATTDSNSLHGSCPVVEGEKWSATRWIHVKSFERAFNKQS-GCMDENVS 300

Query: 301 CPLWAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
           C  WAK GEC+KNPTYMVGS+   GYCRKSC+AC
Sbjct: 301 CEKWAKAGECQKNPTYMVGSDKDHGYCRKSCKAC 314

BLAST of HG10022103 vs. TAIR 10
Match: AT3G28480.2 (Oxoglutarate/iron-dependent oxygenase)

HSP 1 Score: 420.6 bits (1080), Expect = 1.1e-117
Identity = 213/342 (62.28%), Positives = 250/342 (73.10%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPE---TRTHKKLSGSVLQLKTDSSPLIFDPTRVTQLS 60
           MDSR FLAFSLCFL      +  P    TR+     GSV+++KT +S   FDPTRVTQLS
Sbjct: 1   MDSRIFLAFSLCFLFTLPLISSAPNRFLTRSSNTRDGSVIKMKTSASSFGFDPTRVTQLS 60

Query: 61  WQPRAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSE-----VRTSSGMFLR 120
           W PR FLY+GFL+D+ECDH I LAK KLEKSMVADN+SG+SV SE     VR SS     
Sbjct: 61  WTPRVFLYEGFLSDEECDHFIKLAKGKLEKSMVADNDSGESVESEDSVSVVRQSSSFIAN 120

Query: 121 KAQ---DEIVAGIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQK 180
                 D+IV+ +EA+++AWTFLP                    ENGES+QILHYENGQK
Sbjct: 121 MDSLEIDDIVSNVEAKLAAWTFLP-------------------EENGESMQILHYENGQK 180

Query: 181 YEPHFDFFHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHK 240
           YEPHFD+FHD+ N ELGGHRIATVLMYLSNVEKGGET+FP  + K +Q KDDSW++CA +
Sbjct: 181 YEPHFDYFHDQANLELGGHRIATVLMYLSNVEKGGETVFPMWKGKATQLKDDSWTECAKQ 240

Query: 241 GYAVKAQKGDALLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRVSSQ 300
           GYAVK +KGDALLFF+LH +ATTD  SLHGSCPV+EGEKWSAT+WIHV+SFE+     S 
Sbjct: 241 GYAVKPRKGDALLFFNLHPNATTDSNSLHGSCPVVEGEKWSATRWIHVKSFERAFNKQS- 300

Query: 301 DCMDENENCPLWAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
            CMDEN +C  WAK GEC+KNPTYMVGS+   GYCRKSC+AC
Sbjct: 301 GCMDENVSCEKWAKAGECQKNPTYMVGSDKDHGYCRKSCKAC 322

BLAST of HG10022103 vs. TAIR 10
Match: AT3G28490.1 (Oxoglutarate/iron-dependent oxygenase)

HSP 1 Score: 404.4 bits (1038), Expect = 8.5e-113
Identity = 207/332 (62.35%), Positives = 241/332 (72.59%), Query Frame = 0

Query: 1   MDSRRFLAFSLCFLSVFTGFARLPETRTHKKLSGSVLQLKTDSSPLIFDPTRVTQLSWQP 60
           MDS+ FLAFSL  L +F+                     +  S     DPTR+TQLSW P
Sbjct: 1   MDSQYFLAFSLSLLLIFS---------------------QISSFSFSVDPTRITQLSWTP 60

Query: 61  RAFLYKGFLTDKECDHLIDLAKDKLEKSM-VADNESGKSVSSEVRTSSGMFLRKAQDEIV 120
           RAFLYKGFL+D+ECDHLI LAK KLEKSM VAD +SG+S  SEVRTSSGMFL K QD+IV
Sbjct: 61  RAFLYKGFLSDEECDHLIKLAKGKLEKSMVVADVDSGESEDSEVRTSSGMFLTKRQDDIV 120

Query: 121 AGIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQILHYENGQKYEPHFDFFHD 180
           A +EA+++AWTFLP                    ENGE++QILHYENGQKY+PHFD+F+D
Sbjct: 121 ANVEAKLAAWTFLP-------------------EENGEALQILHYENGQKYDPHFDYFYD 180

Query: 181 KVNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQEKDDSWSDCAHKGYAVKAQKGD 240
           K   ELGGHRIATVLMYLSNV KGGET+FPN + K  Q KDDSWS CA +GYAVK +KGD
Sbjct: 181 KKALELGGHRIATVLMYLSNVTKGGETVFPNWKGKTPQLKDDSWSKCAKQGYAVKPRKGD 240

Query: 241 ALLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRSFEKPTRVSSQDCMDENENCP 300
           ALLFF+LHL+ TTD  SLHGSCPVIEGEKWSAT+WIHVRSF K   V    C+D++E+C 
Sbjct: 241 ALLFFNLHLNGTTDPNSLHGSCPVIEGEKWSATRWIHVRSFGKKKLV----CVDDHESCQ 288

Query: 301 LWAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
            WA  GEC+KNP YMVGSE +LG+CRKSC+AC
Sbjct: 301 EWADAGECEKNPMYMVGSETSLGFCRKSCKAC 288

BLAST of HG10022103 vs. TAIR 10
Match: AT5G18900.1 (2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein )

HSP 1 Score: 340.1 bits (871), Expect = 2.0e-93
Identity = 168/292 (57.53%), Positives = 209/292 (71.58%), Query Frame = 0

Query: 43  SSPLIFDPTRVTQLSWQPRAFLYKGFLTDKECDHLIDLAKDKLEKSMVADNESGKSVSSE 102
           SS +  +P++V Q+S +PRAF+Y+GFLT+ ECDH++ LAK  L++S VADN+SG+S  SE
Sbjct: 27  SSSVFVNPSKVKQVSSKPRAFVYEGFLTELECDHMVSLAKASLKRSAVADNDSGESKFSE 86

Query: 103 VRTSSGMFLRKAQDEIVAGIEARISAWTFLPADYKALTITYFSFLELLLKSENGESIQIL 162
           VRTSSG F+ K +D IV+GIE +IS WTFLP                    ENGE IQ+L
Sbjct: 87  VRTSSGTFISKGKDPIVSGIEDKISTWTFLP-------------------KENGEDIQVL 146

Query: 163 HYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNSELKESQ---EK 222
            YE+GQKY+ HFD+FHDKVN   GGHR+AT+LMYLSNV KGGET+FP++E+   +   E 
Sbjct: 147 RYEHGQKYDAHFDYFHDKVNIVRGGHRMATILMYLSNVTKGGETVFPDAEIPSRRVLSEN 206

Query: 223 DDSWSDCAHKGYAVKAQKGDALLFFSLHLDATTDDKSLHGSCPVIEGEKWSATKWIHVRS 282
            +  SDCA +G AVK +KGDALLFF+LH DA  D  SLHG CPVIEGEKWSATKWIHV S
Sbjct: 207 KEDLSDCAKRGIAVKPRKGDALLFFNLHPDAIPDPLSLHGGCPVIEGEKWSATKWIHVDS 266

Query: 283 FEKPTRVSSQDCMDENENCPLWAKRGECKKNPTYMVGSEGALGYCRKSCRAC 332
           F++     S +C D NE+C  WA  GEC KNP YMVG+    GYCR+SC+AC
Sbjct: 267 FDRIV-TPSGNCTDMNESCERWAVLGECTKNPEYMVGTTELPGYCRRSCKAC 298

BLAST of HG10022103 vs. TAIR 10
Match: AT3G06300.1 (P4H isoform 2 )

HSP 1 Score: 332.4 bits (851), Expect = 4.1e-91
Identity = 169/303 (55.78%), Positives = 210/303 (69.31%), Query Frame = 0

Query: 32  LSGSVLQLKTDSSPLIFDPTRVTQLSWQPRAFLYKGFLTDKECDHLIDLAKDKLEKSMVA 91
           L  S   + + SS  I +P++V Q+S +PRAF+Y+GFLTD ECDHLI LAK+ L++S VA
Sbjct: 19  LQSSTCLISSPSS--IINPSKVKQVSSKPRAFVYEGFLTDLECDHLISLAKENLQRSAVA 78

Query: 92  DNESGKSVSSEVRTSSGMFLRKAQDEIVAGIEARISAWTFLPADYKALTITYFSFLELLL 151
           DN++G+S  S+VRTSSG F+ K +D IV+GIE ++S WTFLP                  
Sbjct: 79  DNDNGESQVSDVRTSSGTFISKGKDPIVSGIEDKLSTWTFLP------------------ 138

Query: 152 KSENGESIQILHYENGQKYEPHFDFFHDKVNQELGGHRIATVLMYLSNVEKGGETIFPNS 211
             ENGE +Q+L YE+GQKY+ HFD+FHDKVN   GGHRIATVL+YLSNV KGGET+FP++
Sbjct: 139 -KENGEDLQVLRYEHGQKYDAHFDYFHDKVNIARGGHRIATVLLYLSNVTKGGETVFPDA 198

Query: 212 E---LKESQEKDDSWSDCAHKGYAVKAQKGDALLFFSLHLDATTDDKSLHGSCPVIEGEK 271
           +    +   E  D  SDCA KG AVK +KG+ALLFF+L  DA  D  SLHG CPVIEGEK
Sbjct: 199 QEFSRRSLSENKDDLSDCAKKGIAVKPKKGNALLFFNLQQDAIPDPFSLHGGCPVIEGEK 258

Query: 272 WSATKWIHVRSFEKPTRVSSQDCMDENENCPLWAKRGECKKNPTYMVGSEGALGYCRKSC 331
           WSATKWIHV SF+K       +C D NE+C  WA  GEC KNP YMVG+    G CR+SC
Sbjct: 259 WSATKWIHVDSFDK-ILTHDGNCTDVNESCERWAVLGECGKNPEYMVGTPEIPGNCRRSC 299

The following BLAST results are available for this feature:
Match Name | E-value | Identity | Description
KAE8648909.1 | 4.8e-166 | 84.38 | hypothetical protein Csa_008411 [Cucumis sativus]
XP_008458700.1 | 1.9e-162 | 85.67 | PREDICTED: probable prolyl 4-hydroxylase 7 [Cucumis melo]
XP_038889686.1 | 2.5e-162 | 86.71 | probable prolyl 4-hydroxylase 7 isoform X1 [Benincasa hispida]
XP_038889687.1 | 3.5e-161 | 86.71 | probable prolyl 4-hydroxylase 7 isoform X2 [Benincasa hispida]
XP_022938573.1 | 6.7e-160 | 86.71 | probable prolyl 4-hydroxylase 7 [Cucurbita moschata]

Match Name | E-value | Identity | Description
Q8L970 | 1.6e-124 | 65.87 | Probable prolyl 4-hydroxylase 7 OS=Arabidopsis thaliana OX=3702 GN=P4H7 PE=2 SV=...
F4J0A8 | 1.2e-111 | 62.35 | Probable prolyl 4-hydroxylase 6 OS=Arabidopsis thaliana OX=3702 GN=P4H6 PE=2 SV=...
Q8LAN3 | 2.8e-92 | 57.53 | Probable prolyl 4-hydroxylase 4 OS=Arabidopsis thaliana OX=3702 GN=P4H4 PE=2 SV=...
F4JAU3 | 5.8e-90 | 55.78 | Prolyl 4-hydroxylase 2 OS=Arabidopsis thaliana OX=3702 GN=P4H2 PE=1 SV=1
Q9LN20 | 4.3e-61 | 51.54 | Probable prolyl 4-hydroxylase 3 OS=Arabidopsis thaliana OX=3702 GN=P4H3 PE=2 SV=...

Match Name | E-value | Identity | Description
A0A1S3C8G4 | 9.1e-163 | 85.67 | Procollagen-proline 4-dioxygenase OS=Cucumis melo OX=3656 GN=LOC103498028 PE=3 S...
A0A6J1FJ93 | 3.2e-160 | 86.71 | Procollagen-proline 4-dioxygenase OS=Cucurbita moschata OX=3662 GN=LOC111444767 ...
A0A6J1JWX0 | 2.1e-159 | 86.40 | Procollagen-proline 4-dioxygenase OS=Cucurbita maxima OX=3661 GN=LOC111489579 PE...
A0A0A0KS38 | 2.7e-159 | 84.94 | Procollagen-proline 4-dioxygenase OS=Cucumis sativus OX=3659 GN=Csa_5G633280 PE=...
A0A6J1BXN9 | 4.0e-158 | 84.04 | Procollagen-proline 4-dioxygenase OS=Momordica charantia OX=3673 GN=LOC111006412...

Match Name | E-value | Identity | Description
AT3G28480.1 | 1.1e-125 | 65.87 | Oxoglutarate/iron-dependent oxygenase
AT3G28480.2 | 1.1e-117 | 62.28 | Oxoglutarate/iron-dependent oxygenase
AT3G28490.1 | 8.5e-113 | 62.35 | Oxoglutarate/iron-dependent oxygenase
AT5G18900.1 | 2.0e-93 | 57.53 | 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein
AT3G06300.1 | 4.1e-91 | 55.78 | P4H isoform 2
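
The hit summaries above are ordered by E-value, lowest first. As a minimal sketch of how such a ranking can be reproduced, the snippet below sorts the five TAIR 10 hits by E-value; the `hits` list is transcribed from this page and the variable names are hypothetical.

```python
# (match name, E-value as printed, percent identity), copied from the TAIR 10 hit summary
hits = [
    ("AT5G18900.1", "2.0e-93", 57.53),
    ("AT3G28480.2", "1.1e-117", 62.28),
    ("AT3G06300.1", "4.1e-91", 55.78),
    ("AT3G28480.1", "1.1e-125", 65.87),
    ("AT3G28490.1", "8.5e-113", 62.35),
]

# E-values this small are still representable as Python floats,
# so a plain numeric sort is sufficient here.
ranked = sorted(hits, key=lambda hit: float(hit[1]))

for name, evalue, identity in ranked:
    print(f"{name}\t{evalue}\t{identity:.2f}")
```

Note that the E-value ranking and the identity ranking need not agree: AT3G28490.1 has a slightly higher identity than AT3G28480.2 but a larger E-value, because its alignment is shorter.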
InterPro
Analysis Name: InterPro Annotations of Bottle gourd (Hangzhou Gourd) v1
Date Performed: 2022-08-01
IPR Term | IPR Description | Source | Source Term | Source Description | Alignment
IPR003582 | ShKT domain | SMART | SM00254 | ShkT_1 | coord: 290..331; e-value: 1.4E-4; score: 31.3
IPR003582 | ShKT domain | PROSITE | PS51670 | SHKT | coord: 291..331; score: 9.377586
IPR006620 | Prolyl 4-hydroxylase, alpha subunit | SMART | SM00702 | p4hc | coord: 60..276; e-value: 2.7E-54; score: 196.4
IPR044862 | Prolyl 4-hydroxylase alpha subunit, Fe(2+) 2OG dioxygenase domain | PFAM | PF13640 | 2OG-FeII_Oxy_3 | coord: 160..276; e-value: 5.6E-20; score: 72.0
None | No IPR available | GENE3D | 2.60.120.620 | q2cbj1_9rhob like domain | coord: 52..277; e-value: 1.4E-74; score: 252.3
None | No IPR available | PANTHER | PTHR10869:SF140 | OS03G0803500 PROTEIN | coord: 41..135
None | No IPR available | PANTHER | PTHR10869:SF140 | OS03G0803500 PROTEIN | coord: 150..331
IPR045054 | Prolyl 4-hydroxylase | PANTHER | PTHR10869 | PROLYL 4-HYDROXYLASE ALPHA SUBUNIT | coord: 41..135
IPR045054 | Prolyl 4-hydroxylase | PANTHER | PTHR10869 | PROLYL 4-HYDROXYLASE ALPHA SUBUNIT | coord: 150..331
IPR005123 | Oxoglutarate/iron-dependent dioxygenase | PROSITE | PS51471 | FE2OG_OXY | coord: 155..277; score: 12.478646
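
The `coord:` values above are 1-based, inclusive residue ranges on the predicted protein. The snippet below is a minimal sketch of how such ranges can be parsed and compared; the helper names `parse_coord` and `contains` are hypothetical, and the three ranges are copied from the SMART and Pfam entries in this annotation.

```python
def parse_coord(text):
    """Turn a 'coord: start..end' string into an (int, int) tuple."""
    start, end = text.replace("coord:", "").strip().split("..")
    return int(start), int(end)

def contains(outer, inner):
    """True if the inner range lies entirely within the outer range."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

p4hc = parse_coord("coord: 60..276")    # SMART SM00702 (p4hc)
pfam = parse_coord("coord: 160..276")   # Pfam PF13640 (2OG-FeII_Oxy_3)
shkt = parse_coord("coord: 290..331")   # SMART SM00254 (ShkT_1)

print(contains(p4hc, pfam))   # True: the Pfam dioxygenase domain sits inside the p4hc match
print(contains(p4hc, shkt))   # False: the ShKT domain is C-terminal to it
print(shkt[1] - shkt[0] + 1)  # 42 residues (inclusive range)
```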

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name | Unique Name | Type
HG10022103.1 | HG10022103.1 | mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category | Term Accession | Term Name
biological_process | GO:0018401 | peptidyl-proline hydroxylation to 4-hydroxy-L-proline
cellular_component | GO:0005789 | endoplasmic reticulum membrane
molecular_function | GO:0005506 | iron ion binding
molecular_function | GO:0031418 | L-ascorbic acid binding
molecular_function | GO:0004656 | procollagen-proline 4-dioxygenase activity
molecular_function | GO:0016705 | oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen