MS017764 (gene) Bitter gourd (TR) v1

Overview
Name: MS017764
Type: gene
Organism: Momordica charantia cv. TR (Bitter gourd (TR) v1)
Description: Procollagen-proline 4-dioxygenase
Location: scaffold373: 2326000 .. 2334399 (-)
RNA-Seq Expression: MS017764
Synteny: MS017764
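
The location field in the overview packs the scaffold name, coordinates, and strand into one string. A minimal Python sketch of how it could be split into its parts is shown here; `parse_location` and `span_length` are hypothetical helper names, and the length calculation assumes the coordinates are 1-based and inclusive, which this page does not state.

```python
import re

# Pattern for strings such as "scaffold373: 2326000 .. 2334399 (-)".
LOCATION_RE = re.compile(r"(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)")


def parse_location(text):
    """Split a location string into (seqid, start, end, strand)."""
    match = LOCATION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    seqid, start, end, strand = match.groups()
    return seqid, int(start), int(end), strand


def span_length(start, end):
    """Length of the span, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


seqid, start, end, strand = parse_location("scaffold373: 2326000 .. 2334399 (-)")
print(seqid, start, end, strand)   # scaffold373 2326000 2334399 -
print(span_length(start, end))     # 8400 under the inclusive-coordinate assumption
```

The `(-)` suffix indicates that the gene is annotated on the reverse strand of the scaffold.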
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

ATGTCTATCATATGTGGCATTCCTATTCTTGAGTGTGTATGCTGTCTGGGATGTGCTCGTTGGGCCTGTAAACGCTGTTTTCACTCAGCCGTTCATGACAGTGAAACTTGGGGCTTTGCCACCGCTGATGAGTTCGGGCCTATTCCCCGAATTTGTCGGTATATCCTAGCTGTGTATGAAGATGATATTCAACAACCCCTTTGGGAACCGGCTGGTGGTTATGGAATCAATCCAGATTGGTTGCTCATTAAAAAAACATATAAGGATACTCGAGGACGTGCGCCTCCGTATATTTTATACCTTGATCACAATCATGGGGACATTGTTCTTGCCATCAGGGGACTTAATATGGCAAAGGAGAGTGATTATGCAGTTTTATTGGACAACAAGCTGGGGAAGAAGAAATTTGACGGTGGATATGTTCACAATGGGCTTCTGAAGGCAGCTGGGTGGGTTTTGGACACTGAGAACGAAATTTTAAAGGATTTGGTGAGCAAATATCCGGATTATACATTGACGTTTGCAGGGCATTCCCTTGGCTCCGGAGTAGCAGCCATGTTAACTCTGGTAGTAGTTCAGAATCGCGATAAATTGGAAAATATTGATCGGAAGAGGATAAGGTGCTATGCGATTGCTCCTGCCAGGTGCATGTCCCTAAATTTGGCTGTTAGATATGCAGATGTGATCAACTCTGTTGTTCTTCAGGTAAATTGTTTATTTCTTTCATAGTTTTTTCATAAACTAATCAGTAGCCATTGTGAAAGTTATTGAAATAATGTGGATCTATGCCTTCTGAGGCCTTGATAACTCCCTAGGATTTGTTTTTGCATATCTTATATAACAATTCCTCATTCTTTACACCAACATTCTAGTGATCCTAAAGTTTTTTTTAGCAATGTTATTCATGATTAATTTATTTTATGTACTCAGTAAAAGAAAAAGAACTTTCTTTTTTTGTTGTATTTGAACACAGCCTTGATTGTTTAAAGTGTGCTAGGAGAATTTATATTTGCACTTCAAAATGTAACGGTGATTATTATACTGAACCAGGATGACTTTTTACCCAGGACAGCCACACCCTTGGAAGACATTTTTAAATCACTTTTCTGGTATGGTTCAGGACAAATTACTCTTTATGGCGTTGAATCTTTTTACGTTGTTTGTTTTCCGTTTCTCTTGCTTAGTCCTGACACCATTTGCAAAGTAGATATTGCTGCAAATATGGAAACTAGAGCTCTATTCTTGCATAAAAGTAATAACAGAAAATTTGCGTCTGTTAGAAAAGAATGAAATGGTAGCTTGGATTTATTTAGTGTTAATTTAAGTTTCTCAATAATCATATGAAATTCTTCTTTCCAAACCATGAATATTAAACTGGTTTTTCTTTCCTTGTAGTTTGCCATGCCTTCTATGCATGAGGTGCCTGCGGGATACATGCGTATCAGAGGAGAAGATGATTAAAGATCCGAGGAGACTTTATGCACCGGGACGACTTTATCACATCGTCGAGCGAAAGCCCTGCAGGTAAAAGACCTGCATTTTCCATATTAAGTGAGTTAAATGTGGTGAATTATCGAGCATCGGTTTTAAAAGTTGAGAGAATTCATGAATTTATGGCTTTCTTGTACTTGTTTTGTGGGAAATATCTCTCAATCTTATCAGTTATTGAAATGTTCTGGCTTTCTAGACCGTTGGAACGTGCCCTTCTCTAACAATTGAATGGATTAACCTGTTCTTGTTCAAAGGTGCGGAAGGCTTCCGCCAGTTGTGAAGACGGCTGTTCCGGTGGATGGGCGGTTCGAACATATTGTTCTTTCTTGTAATGCAACTTCTGACCATGCAATCATTTGGATAGAGAAAGAAGCCAAAATGGCCCTGGAAGTAAGTTTCATTCTACCTGGCCTTTTATGCTGACTGTTTGAATTTTAGTGAAATTTTGGAACACAAGCATTCTTTTGCATTTCTTCACCTTACATTTATCATTTTGGTGCATTTTTTCC
TCTCTATAATTGCCCGTAAATTTGGAGTTGCCATTGGTGGATCAATTTCTTAGGGGTGTTCACGGGGCCAGGCAGGGATGGGGTCTGTCCCCATCCCCGCTCCCCATTTGATTCCCTATTCTCGTGAAATTTTTCACGAAATTTTGCCGGAAATTTTTCTCTGTTAATTTTTTTAATTGTTTAAAATTTTTTCTCAATTAAATAAATTTTTAAAGAAGTTGTAATTAATTACTTAATTTTAAGGATAAACCCCATCTGCATGTTTAATCCTCCTCTCTCTATAATCTATATATATTGGGTGGATAAAGAGGAATCAAAGTTTTTCAAGTATATGTATGTATGTTGAATATATATGGGGGGGGGGGGGGACGTCGAAAAAACTTCTCCCATCACCGCCCCGCTTAGGGCAGGCAGGGCCTTGCAGGGAAAACGGACACACACACACGAGAACTACTAAAAGCATTTGAAATTGTGTTGCTAATCGGGACATTTCTTTCGAAAATTTCAATGGACTCTATTTATCCTGATCTCGGACTACATTTGAACGTTGATATAGAGCCAATTCTATATCAGGTTCATTGATCGGGGCTTTACCTGAATTCTTCTAGAAACATCAAGCCAATTATTAAAATCATGTTCTTCCTTCAGTTCGGAAAAATTGATGACAAACTATCTGCTATCTGATCCATAATATTTGTTTTGGATGTAGAGTACAAATATCAATGGTTTCTTTCAGCACTTGGCAAGATGCATTACCTATTCAAAACAAGGACTTAGCAGCATCTTATGATTACATGAACAATGTCATCATTCTTCCAATATAAAATGTCAAATTTTAGTTTCTTTGTCCTGTTCGTTGCATGTATGTGAATACGAGGCATTCGTGTTGTTGTCGCTATCTGCATATGAGTACGATGAAGTCATTGCTGATAAATGAAAATGGCATCGATCTGTTTCTGTTAACTTCAGTTAATGCACGAGGATGATAAGGTCATGGAGATACCACCCCAACAAAAAATGGAGAGGCAAAACACTATAGCCAGGGAGCACAGTGAAGAGTACAAGGCTGCATTGCAGCGGGCTGTGACGTTAGCTGTGCCACACGCATACGTACTTTCCCCGTACGGGACCTTCAGCCAGACAGATGAAGGGGAAGAAGAGTCACCAGGCTCAAGTGGAAGGTCGTCTTCAGGTTCGTCGAGGAGGAAGAAAGAAACTTGGGATGAACTGATCGAGCGCCTCTACGACAAGGATGATTCGAGACACGCAGCATTGAAGAAATCACACAGTAGTATTTGACATTGCACTAGATTCATTCATGTTGACATTCATTACTTAATTGAGTTTGGTCGGATGAATCTACTGTATGATGTCGGATTATCCTTGCCCGGGTGTCGATTTTCTTGGATTCACCTGTTCAGAGATGATTGAAAATATGCAGTATTCATCCAAATTTTTAGGAGTCTGATCAGGAAGTGGATATTGGTAGTATGGAGAGGTAGATTTTTTTTTTTTGAATTAATTTTTTTTAGGGATATCACATTCTGTACAATTGTTTAAAAGTCGACTATATATTTATATTATATGTCAATACTTCTATATAATAGTGTGTGATATAGCTTGATATTCTTATCTGGTAATTAAGGACGAATGCTTCCCATTTGTTTTGTTTTGTTCTCATTTTGAGAAAATAAGTGTATTTATTAACTTTCTCTATTTCCTTTTTCTTATATTTTGCTAACATTGTTTAAGATCTGTATGCAGTAAACAAAAATGAGTTATTTAGGTCCGTGGATAACTATTTTATTTTTACAAATTAAGTTTAGAACCACTGCTTCCACCACGTGTTTCTTCTTAAAAACAGTTTTAAAATCCAAGTCAAATATTGAAAACTAAAAAAAAAAAGTTTTCAAAAAGTAGTTTTAGTTTTTTAAAATTTGGAATTAAAATGTTGGTAAGAAATATGAAAAACATATAAAAAAATCATGAGAAATCAAGCA
TACTCATTAATTCATATAAAAGTATGGTTAAATTTGTTTTAAAATGAATTGATGCATTTGGTTGTATTATCTCAATTTTGTTTTAAATTTCATGGATAAGGTTGGAAATTAACCATCGAGTATAATTTCTCCTTCCACATAAAATATCTACGTCCAGTCTTCCAAGATGTGGGTATTTCTCAAATTTCGGCTTAGAAAAACTCTAAACCCGAAAACACGAATATATAGGCATCGACACAGCGCCGTGGCACTACGGACGAAAAATCGTCCGCGCGCACACAACAGTGCTGCGGTGCTGCCCCTTAGCGCCATGGCGCTGCACTTTCAACCCCCTTGGTTTGCAGCGTCCGCTTCCCCTGTTTTCCATCCCCGCTTCGATCCCGACTTGATCCAGATTCATCCCTAGCTCGTTCCGACGTCGTTTTGACTCCACCAATCCCAACAAACGTATTAGAATACATATTTGTTATAAATGTAATATTGAACTTATAGTATGTATGTATCCATATATATAATAGGATAGTTTTAAGAGGGTGGATCTCACCTTATTAAAGTTTATAATTCTTGGATATCACTAAATTATTTATCATAGATTATTATTGAGGATGAAATTTATTTGAATGTCATAAATCTTAAGAATGGAATATAATGTCTAATCAAGAAAGTTAGGAATGATTTTTCTTTGTCATCCATAAGTATGGTAAAAGAATATTGTATGTCCTAGAAAATATTTGTATAACGTAGGGATGTTTTGAGAGTATTTGCGAAAATATTTTAGTTGTTTTTTTTATTACTTTTTCTACAATTTTCATTATATTCTCACGTCTCATTTCTACCACGTTTAACACAAAGTCTGGTTTAATTTAGGTATATAATATATTTGACTAGTATCGTGTGGGTACACATTATTTTTTGTTTAGCAGGGTTCGTTGAAGATATTCCATTGAAAATGGTGTGAATTCAATCATTCTATCAGGTCAATTCATGCATCTACAAATATCAATATATAACAAAGAAAAAACACTGAGATCTTAATTATTGGTAGAAATCAAAGTTGTTATTGATTAGCAGATAAATAAAAAACTCTAGTGACCATTATCAACCATGGGGTGGTGGTGCAGTTGGCTAGCGTGTAGGTCTCATAGCTATCTGAGTTATCCTTAGGTCGAGAGTTCGAACCTCTCTCACCCCATGATTTTTTATCACGAACACATCGAATATTGATCAATCTATAGATCTTCACATATAAGAAATATTATTAAATATTGATAAAAGACTTTATAATAAATACTAAATCGTCGCAATCTTTAAACTAGAAGAACTAAATTGTTTAGTTATATACTAAAGTTTAGGTATGCGCCCTGGATAAGTTGCTTAGGTTGGCCAAGTGTATTTCTCATAGTATAAGAAACAAACAGTAGGGTAATATGTTCAATGGTGTAATTTCAGTCCTTGTTAGTTCGAATATCTAATCGAGAGAGTGCAAAATGCAAATGGGATCCAGATTCATGGAACTACCGAAGAAATCATGAATTTCTACAGCAGCACTCATGTTTCAAAGACAATTTCTAATACATTTTTCTCAACTTCCCATTTCTCTAATTCGGATCTTCGTCTTCGTCGCCGTCTCACCCATCCATGGATTCTCGTCTTCCCGTCTTACTTCTTTTAGCGACTGCAATTTCGTTCTTAAGCTGCCTTGCACAAAGGTGATAACCGAACGCCATTTTACATCTCAGCTGTGTCTTTTTCATTTCGAAGTCGTGGGTTCTGTAGATCCGCAATTGGAATTGTATTCGTAAAACTGATTTTCGTTTCACTTCCTAATACTGAAATTGTTGAATGATTTGTCGATGCATGATTGAGTTTGATATCGTGGTTTTTTTTCCCTCCTTTCTGCCAGCAATTTGATTAGTGGGCGCAAGGGTTTAAGGGACCAATTGATCGAAAGTGTACCTTTGAGCTACTCTAATCATTCTGGAAGAATCGACCCATCAAGA
GTTGTCCAAGTCTCTTGGCGACCAAGGTCAGGTACTTGCAGAATCTCTCATACGATACTGTTGACGTGAAACTGTGAAATCATTTAGAATTACTCCTAGGTTGTGGCTTATGAATTAGTTCTCTTTTCTCCTCTTTTTGTGACAGAATCAACTTCAAAGCATTTTGAAGTTCTTTTATACTGTACATATTTGTTGATTAATTTAATAGGAAGTCTGATTTAACTTTTTTTGCTATAGGGTTTTCTTGTATAAAGGATTTCTCTCAGATGAGGAGTGTGATCACCTTATTTCTTTGGTATGTTTCTGCCATGTTCTTCATAGTATCTTGAACATGGCTTCAGCTGTTTCTGACGTGTATATAACTTTTGGGCGTAGGCTACAAGTTCAGAAGATAAACCTTCTGGGAACAGTACTGACTCTGGGAACACTGTCCCAACCAAAATTCTAAAGAGTTCAGGAGCCATTTTAAACACAACAGTACGTTCCTGTGCTTTTCGAAGTTTATGGGTTGGTTTGATACCACACCTAGTGGTTCATGGTGGAGTTCTAACCATAGAAAATTCTTTTGTAATACCTTTTGATGCTATTCTTTTAATGGTTATTGACAATTGAAAGGTATTTTTTTTGTAGTTCCCCCTAGGCTGTTCTTTTCTCCCCTTTTAAATATAGTTCTCTCTTCTCTGTTTCTTAAAAAAAAAAAAAGTGATGCTGCAGAATCTGTCACATCGAGATTTTCTTATCATGTTTTCTTAAAATTGACAGAAGCAGCATACTATTAATCTTATATCGCTTCGCATTAGTTTCTACATGCTAGTAATTATGCAATCCATTAGTTTAGACCTTTTTCTGACATTGTTGTTTTATTTAGTTCTAATGGTTTTGGTCAGTATCCAACTAATACATGGATGAACAGTTTGAGTTTATCTCATTGTTCAATGTATTTTCTATTGGACTATGTTTGTTTAAAATCATTCTGTTATTAGTTTTTATGGTGGTAGTGGTACTACCAAAGTATCAAATATCTGTGTTGTGTATATTAGGATTTGGACTTGGTTACACCTACTATTGTTGATCTTAATCTATTACCATTGATGTGGGGTTTTGTTTTCAAAGATTCCATCTCTCATTCCCTAAATCCTTGATGGGTCACTTTTAGCATTGATGATAGACGATATGCTCTCAAATTGCTCTCATTTTTTTTATTTTCAAACATATAAAACATTCTTTGTTGCTGAGTTTTCTGGGGAACCTGCAGTCGCTTGAAGGGAATCTCAAAAAATTCGAGCTCTAAATGTCTTGTGAAGAGTGAAGTCAGAAATTTCTCTTTCTGGTTTTATAGATATAAAGCACTAAAAAGTGTGGGGGATGGTGGTAAAACCTAAGATAATATATACTAGGATCTCATAATTGTTACTGAAATTCTTTGTTCTTGGTGACTCCACTATGACTTGTCAGTCAGCATCAATTGTCTTCCAAAATTTGTAACAAGCCTGACTTAAGGGAGTATTTGTTGGGATCATAGAACTTATAATTGTGCTTTTTTTGTGGTACCAACTATTCCTGTTGGTGAAGCTGCTACTGTTAGTCTATTTCTAGGTTCTATGAGATTTACGTTTCGTTTATTTTGTTGGGGCTTGTGTCTTAGGATGATATCATTGCAAGGATCGAGAATCGAATTGCTGTGTGGACTTTTCTTCCAAAAGGTATTTTCCATCAGTACTGTGTACTGCTTTTCCTTCCACATTTTCTGGATAATGCCCCTTGTATTTAATCTTCTTCTAAACTTCTCAGTTCTATTGTCTTCAGATTATAGCATGCCTTTGCAGATTTTGCAATATGGGGGTGAAGAAGCAGAGCATAAGTACGTTTTTGGTAACAGATCTGCAATGTTGTCCAGTGAGCCTTTGATGGCCACAGTAGTTCTGTATCTCTCAGATTCTGCTAGCGGTGGCGAGATGCGCTTTCCTGAATCAAAGGTGAGAGAAAATACTCAAACATCC
GTGGCCAACTGGCCATAACACCGTACTAATGACTGCCATTTCAATGCCTCAGGTAAAGAGCAGATTTTGGTCAGACCGGAGAAAGAAAAACAACATTCTGAGACCAGTGAAAGGCAATGCAGTTCTTATTTTCTCTGTGCATCTTAATGCTTCTCCAGACAAGAGTAGCTCCCATACCCGATCTCCGATACTCGATGGGGAATTGTGGATTGCAACAAAATTCTTCTACTTAAGACCAATCACTGGGAATAAACACACAGACGAACCTGATGGAGACTGTAATGATGAAGATAAAAGCTGCCCCCAATGGGCTGCCATTGGCGAATGCGAACGAAACGCTGTTTTCATGATTGGTTCTCCAGATTACTATGGAACATGTAGAAAAAGCTGCAACGCATGT

mRNA sequence

ATGTCTATCATATGTGGCATTCCTATTCTTGAGTGTGTATGCTGTCTGGGATGTGCTCGTTGGGCCTGTAAACGCTGTTTTCACTCAGCCGTTCATGACAGTGAAACTTGGGGCTTTGCCACCGCTGATGAGTTCGGGCCTATTCCCCGAATTTGTCGGTATATCCTAGCTGTGTATGAAGATGATATTCAACAACCCCTTTGGGAACCGGCTGGTGGTTATGGAATCAATCCAGATTGGTTGCTCATTAAAAAAACATATAAGGATACTCGAGGACGTGCGCCTCCGTATATTTTATACCTTGATCACAATCATGGGGACATTGTTCTTGCCATCAGGGGACTTAATATGGCAAAGGAGAGTGATTATGCAGTTTTATTGGACAACAAGCTGGGGAAGAAGAAATTTGACGGTGGATATGTTCACAATGGGCTTCTGAAGGCAGCTGGGTGGGTTTTGGACACTGAGAACGAAATTTTAAAGGATTTGGTGAGCAAATATCCGGATTATACATTGACGTTTGCAGGGCATTCCCTTGGCTCCGGAGTAGCAGCCATGTTAACTCTGGTAGTAGTTCAGAATCGCGATAAATTGGAAAATATTGATCGGAAGAGGATAAGGTGCTATGCGATTGCTCCTGCCAGGTGCATGTCCCTAAATTTGGCTGTTAGATATGCAGATGTGATCAACTCTGTTGTTCTTCAGGTAAATTGTTTATTTCTTTCATACAATTTGATTAGTGGGCGCAAGGGTTTAAGGGACCAATTGATCGAAAGTGTACCTTTGAGCTACTCTAATCATTCTGGAAGAATCGACCCATCAAGAGTTGTCCAAGTCTCTTGGCGACCAAGGGTTTTCTTGTATAAAGGATTTCTCTCAGATGAGGAGTGTGATCACCTTATTTCTTTGGCTACAAGTTCAGAAGATAAACCTTCTGGGAACAGTACTGACTCTGGGAACACTGTCCCAACCAAAATTCTAAAGAGTTCAGGAGCCATTTTAAACACAACAGATGATATCATTGCAAGGATCGAGAATCGAATTGCTGTGTGGACTTTTCTTCCAAAAGATTATAGCATGCCTTTGCAGATTTTGCAATATGGGGGTGAAGAAGCAGAGCATAAGTACGTTTTTGGTAACAGATCTGCAATGTTGTCCAGTGAGCCTTTGATGGCCACAGTAGTTCTGTATCTCTCAGATTCTGCTAGCGGTGGCGAGATGCGCTTTCCTGAATCAAAGGTAAAGAGCAGATTTTGGTCAGACCGGAGAAAGAAAAACAACATTCTGAGACCAGTGAAAGGCAATGCAGTTCTTATTTTCTCTGTGCATCTTAATGCTTCTCCAGACAAGAGTAGCTCCCATACCCGATCTCCGATACTCGATGGGGAATTGTGGATTGCAACAAAATTCTTCTACTTAAGACCAATCACTGGGAATAAACACACAGACGAACCTGATGGAGACTGTAATGATGAAGATAAAAGCTGCCCCCAATGGGCTGCCATTGGCGAATGCGAACGAAACGCTGTTTTCATGATTGGTTCTCCAGATTACTATGGAACATGTAGAAAAAGCTGCAACGCATGT

Coding sequence (CDS)

ATGTCTATCATATGTGGCATTCCTATTCTTGAGTGTGTATGCTGTCTGGGATGTGCTCGTTGGGCCTGTAAACGCTGTTTTCACTCAGCCGTTCATGACAGTGAAACTTGGGGCTTTGCCACCGCTGATGAGTTCGGGCCTATTCCCCGAATTTGTCGGTATATCCTAGCTGTGTATGAAGATGATATTCAACAACCCCTTTGGGAACCGGCTGGTGGTTATGGAATCAATCCAGATTGGTTGCTCATTAAAAAAACATATAAGGATACTCGAGGACGTGCGCCTCCGTATATTTTATACCTTGATCACAATCATGGGGACATTGTTCTTGCCATCAGGGGACTTAATATGGCAAAGGAGAGTGATTATGCAGTTTTATTGGACAACAAGCTGGGGAAGAAGAAATTTGACGGTGGATATGTTCACAATGGGCTTCTGAAGGCAGCTGGGTGGGTTTTGGACACTGAGAACGAAATTTTAAAGGATTTGGTGAGCAAATATCCGGATTATACATTGACGTTTGCAGGGCATTCCCTTGGCTCCGGAGTAGCAGCCATGTTAACTCTGGTAGTAGTTCAGAATCGCGATAAATTGGAAAATATTGATCGGAAGAGGATAAGGTGCTATGCGATTGCTCCTGCCAGGTGCATGTCCCTAAATTTGGCTGTTAGATATGCAGATGTGATCAACTCTGTTGTTCTTCAGGTAAATTGTTTATTTCTTTCATACAATTTGATTAGTGGGCGCAAGGGTTTAAGGGACCAATTGATCGAAAGTGTACCTTTGAGCTACTCTAATCATTCTGGAAGAATCGACCCATCAAGAGTTGTCCAAGTCTCTTGGCGACCAAGGGTTTTCTTGTATAAAGGATTTCTCTCAGATGAGGAGTGTGATCACCTTATTTCTTTGGCTACAAGTTCAGAAGATAAACCTTCTGGGAACAGTACTGACTCTGGGAACACTGTCCCAACCAAAATTCTAAAGAGTTCAGGAGCCATTTTAAACACAACAGATGATATCATTGCAAGGATCGAGAATCGAATTGCTGTGTGGACTTTTCTTCCAAAAGATTATAGCATGCCTTTGCAGATTTTGCAATATGGGGGTGAAGAAGCAGAGCATAAGTACGTTTTTGGTAACAGATCTGCAATGTTGTCCAGTGAGCCTTTGATGGCCACAGTAGTTCTGTATCTCTCAGATTCTGCTAGCGGTGGCGAGATGCGCTTTCCTGAATCAAAGGTAAAGAGCAGATTTTGGTCAGACCGGAGAAAGAAAAACAACATTCTGAGACCAGTGAAAGGCAATGCAGTTCTTATTTTCTCTGTGCATCTTAATGCTTCTCCAGACAAGAGTAGCTCCCATACCCGATCTCCGATACTCGATGGGGAATTGTGGATTGCAACAAAATTCTTCTACTTAAGACCAATCACTGGGAATAAACACACAGACGAACCTGATGGAGACTGTAATGATGAAGATAAAAGCTGCCCCCAATGGGCTGCCATTGGCGAATGCGAACGAAACGCTGTTTTCATGATTGGTTCTCCAGATTACTATGGAACATGTAGAAAAAGCTGCAACGCATGT

Protein sequence

MSIICGIPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVYEDDIQQPLWEPAGGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAKESDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEILKDLVSKYPDYTLTFAGHSLGSGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQVNCLFLSYNLISGRKGLRDQLIESVPLSYSNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLATSSEDKPSGNSTDSGNTVPTKILKSSGAILNTTDDIIARIENRIAVWTFLPKDYSMPLQILQYGGEEAEHKYVFGNRSAMLSSEPLMATVVLYLSDSASGGEMRFPESKVKSRFWSDRRKKNNILRPVKGNAVLIFSVHLNASPDKSSSHTRSPILDGELWIATKFFYLRPITGNKHTDEPDGDCNDEDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC
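
The protein sequence above corresponds to a codon-by-codon translation of the coding sequence. The Python sketch below illustrates that relationship on the first and last few codons of the CDS. It assumes the standard genetic code (NCBI translation table 1); `translate` is a hypothetical helper written for this illustration, not a tool used by the database.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons enumerated
# in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # X for ambiguous codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGTCTATCATATGTGGC"))  # MSIICG, the first six residues
print(translate("AACGCATGT"))           # NAC, the last three residues
```

Passing the full coding sequence to the same function would allow a residue-by-residue comparison with the listed protein.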
Homology
BLAST of MS017764 vs. NCBI nr
Match: CAB4301873.1 (unnamed protein product [Prunus armeniaca])

HSP 1 Score: 671.4 bits (1731), Expect = 6.4e-189
Identity = 356/709 (50.21%), Positives = 429/709 (60.51%), Query Frame = 0

Query: 1   MSIICGIPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVYE 60
           MSI+CG P++ECV CL C RWA KRC H+A HDSETWG ATA+EF P+PR+CRYILAVYE
Sbjct: 14  MSILCGCPLIECVYCLACTRWAWKRCLHTAGHDSETWGIATAEEFEPVPRLCRYILAVYE 73

Query: 61  DDIQQPLWEPAGGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAKE 120
           DD++QPLWEP GGYGI PDWL++KKTY+DT+G+APPYILYLDH+H DIVLA RGLN+A+E
Sbjct: 74  DDLRQPLWEPPGGYGIKPDWLILKKTYEDTQGQAPPYILYLDHDHADIVLAFRGLNLARE 133

Query: 121 SDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEILKDLVSKYPDYTLTFAGHSLG 180
           SDYAVL+DNKLGKKKFDGGYVHNGLLKAA WVLD E E LKDLV KYP+YTLTF GHSLG
Sbjct: 134 SDYAVLMDNKLGKKKFDGGYVHNGLLKAAEWVLDAECENLKDLVEKYPNYTLTFTGHSLG 193

Query: 181 SGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQ----- 240
           SGVAA+LT+VVVQ+RD+L NIDRKR+R YAIAPARC+SLNLAVRYADVINSVVLQ     
Sbjct: 194 SGVAALLTMVVVQSRDRLGNIDRKRVRGYAIAPARCVSLNLAVRYADVINSVVLQATTPL 253

Query: 241 ---------------VNC------------------------------------------ 300
                          + C                                          
Sbjct: 254 EDIFKSLFCLPCLLCIRCMRDTCIPEEKMLKDPRRLYAPGRLYHIVERKPFRLGRFPPVV 313

Query: 301 ------------LFLSYNLIS----------GRKGL-----RDQLIE------------- 360
                       + LS N  S           ++ L     +DQ++E             
Sbjct: 314 KTAVPVDGRFEHIVLSCNATSDHAIIWIEREAQRALKLMLEKDQIMEIPPKQKMERQETL 373

Query: 361 ------------------SVPLSYSN---------------------------------- 420
                             +VP +YS                                   
Sbjct: 374 AKEHTEEYRAALQRAVTLAVPHAYSPSMYGTFDEKDEEEHSYGSSGESSFSSAKKSKTFV 433

Query: 421 ------------------------HSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLIS 480
                                   HS RIDPSR VQ+SWRPRVFLY+GFLSDEECDHL+S
Sbjct: 434 ARSRKELRSEEANKETFIHFGHSVHSNRIDPSRAVQLSWRPRVFLYQGFLSDEECDHLVS 493

Query: 481 LATSSEDKPSGNSTDSGNTVPTKILKSSGAILNTTDDIIARIENRIAVWTFLPKDYSMPL 530
           LA   E+       D GNT   ++ KS    LN  D+I++RIE RI+ WTFLPK+ S  L
Sbjct: 494 LAHGGEENSLTEYDDLGNTNTIRLRKSLQIPLNMEDEIVSRIEERISAWTFLPKENSRAL 553
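
The percentages in an HSP summary are the match counts divided by the alignment length, which includes gap columns. For the hit above, that is 356 identical and 429 positive-scoring positions over 709 aligned columns. A small Python sketch of the arithmetic follows; `percent` is a hypothetical helper name.

```python
def percent(matches, aligned_length):
    """Matches as a percentage of the alignment length, to two decimals."""
    return round(100 * matches / aligned_length, 2)


print(percent(356, 709))  # 50.21 (identity)
print(percent(429, 709))  # 60.51 (positive-scoring positions)
```

Because the denominator counts gap columns, the long gapped stretch in this alignment lowers both percentages relative to the well-aligned regions at either end.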

BLAST of MS017764 vs. NCBI nr
Match: CAB4271435.1 (unnamed protein product [Prunus armeniaca])

HSP 1 Score: 669.1 bits (1725), Expect = 3.2e-188
Identity = 355/709 (50.07%), Positives = 429/709 (60.51%), Query Frame = 0

Query: 1   MSIICGIPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVYE 60
           MSI+CG P++ECV CL C RWA KRC H+A HDSETWG ATA+EF P+PR+CRYILAVYE
Sbjct: 1   MSILCGCPLIECVYCLACTRWAWKRCLHTAGHDSETWGIATAEEFEPVPRLCRYILAVYE 60

Query: 61  DDIQQPLWEPAGGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAKE 120
           DD++QPLWEP GGYGI PDWL++KKTY+DT+G+APPYILYLDH+H DIVLA RGLN+A+E
Sbjct: 61  DDLRQPLWEPPGGYGIKPDWLILKKTYEDTQGQAPPYILYLDHDHADIVLAFRGLNLARE 120

Query: 121 SDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEILKDLVSKYPDYTLTFAGHSLG 180
           SDYAVL+DNKLGKKKFDGGYVHNGLLKAA WVLD E E LKDLV KYP+YTLTF GHSLG
Sbjct: 121 SDYAVLMDNKLGKKKFDGGYVHNGLLKAAEWVLDAECENLKDLVEKYPNYTLTFTGHSLG 180

Query: 181 SGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQ----- 240
           SGVAA+LT+VVVQ+RD+L NIDRKR+R YAIAPARC+SLNLAVRYADVINSVVLQ     
Sbjct: 181 SGVAALLTMVVVQSRDRLGNIDRKRVRGYAIAPARCVSLNLAVRYADVINSVVLQATTPL 240

Query: 241 ---------------VNC------------------------------------------ 300
                          + C                                          
Sbjct: 241 EDIFKSLFCLPCLLCIRCMRDTCIPEEKMLKDPRRLYAPGRLYHIVERKPFRLGRFPPVV 300

Query: 301 ------------LFLSYNLIS----------GRKGL-----RDQLIE------------- 360
                       + LS N  S           ++ L     +DQ++E             
Sbjct: 301 KTAVPVDGRFEHIVLSCNATSDHAIIWIEREAQRALKLMLEKDQIMEIPPKQKMERQETL 360

Query: 361 ------------------SVPLSYSN---------------------------------- 420
                             +VP +YS                                   
Sbjct: 361 AKEHTEEYRAALQRAVTLAVPHAYSPSMYGTFDEKDEEEHSYGSSGESSFSSAKKSKTFV 420

Query: 421 ------------------------HSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLIS 480
                                   HS RIDPSR VQ+SWRPRVFLY+GFLSDEECDHL+S
Sbjct: 421 ARSRKELRSEEANKETFIHFGHSVHSNRIDPSRAVQLSWRPRVFLYQGFLSDEECDHLVS 480

Query: 481 LATSSEDKPSGNSTDSGNTVPTKILKSSGAILNTTDDIIARIENRIAVWTFLPKDYSMPL 530
           LA   E+       D GNT   ++  S    LN  D+I++RIE RI+ WTFLPK+ S  L
Sbjct: 481 LAHGGEENSLTEYDDLGNTNTIRLRISLQIPLNMEDEIVSRIEERISAWTFLPKENSRAL 540

BLAST of MS017764 vs. NCBI nr
Match: RXH95088.1 (hypothetical protein DVH24_024772 [Malus domestica])

HSP 1 Score: 630.6 bits (1625), Expect = 1.3e-176
Identity = 340/739 (46.01%), Positives = 420/739 (56.83%), Query Frame = 0

Query: 1    MSIICGIPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVYE 60
            MSI+C  P+LECV CL C RWA KRC H+A HDSETWG +TA+EF P+PR+CRYILAVYE
Sbjct: 994  MSILCACPVLECVYCLACTRWAWKRCLHTAGHDSETWGLSTAEEFEPVPRLCRYILAVYE 1053

Query: 61   DDIQQPLWEPAGGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAKE 120
            DD++ PLWEP GGYGINPDWL++KKTY+DT G APPYILYLDHNH DIVLA RGLN+A+E
Sbjct: 1054 DDLRCPLWEPPGGYGINPDWLILKKTYEDTGGLAPPYILYLDHNHADIVLAFRGLNLARE 1113

Query: 121  SDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEILKDLVSKYPDYTLTFAGHSLG 180
            SDYAVL+DNKLG++KFDGGYVHNGLLK+A WV+D E EILKDLV  YP+YTLTFAGHSLG
Sbjct: 1114 SDYAVLMDNKLGQRKFDGGYVHNGLLKSAQWVMDAECEILKDLVQNYPNYTLTFAGHSLG 1173

Query: 181  SGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQ----- 240
            SGVAA+LT+VVV+NRD+L +IDRKR+R YAIAPARCMSLNLAVRYADVINSVVLQ     
Sbjct: 1174 SGVAALLTMVVVKNRDRLGDIDRKRVRGYAIAPARCMSLNLAVRYADVINSVVLQDDFLP 1233

Query: 241  --------------VNCL------------------------------------------ 300
                          + CL                                          
Sbjct: 1234 RTATPLEDIFNLPCILCLRCMRDTCIPEEKMLKDPRRLYAPGRLYHIVERKPFRCGRFPP 1293

Query: 301  ------------------------------------------------------------ 360
                                                                        
Sbjct: 1294 VVKTAVPVDGRFEHIVLSCNATSDHAIIWIEREAQRALDLMLQKDHIMEIPSKQRMERQE 1353

Query: 361  ------------------------------------------------------------ 420
                                                                        
Sbjct: 1354 TLAKEHTEEYKAALQRAVTLAVPHAYSPSPYGTFDEKDEEDHSYGSSGESSFGSTKKSKS 1413

Query: 421  --------------------------FLSYNLISGRKGLR-DQLIES--VPLSYSNHSGR 480
                                      F S +    RK LR +Q I+   +   +S HS R
Sbjct: 1414 FTARRGGSASMASLASIFLLLSVTSSFFSSSAEISRKELRTNQTIQETVIHFGHSVHSNR 1473

Query: 481  IDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLATSSEDKPSGNSTDSGNTVPTKILKSS 528
            IDPSRVVQ+SW+PR        SDEECDHL+SLA   EDK      + GNT   +++KS 
Sbjct: 1474 IDPSRVVQLSWQPR--------SDEECDHLVSLALGGEDKSVTEYDELGNTNTMRLIKSL 1533

BLAST of MS017764 vs. NCBI nr
Match: KAF4353598.1 (hypothetical protein F8388_017773 [Cannabis sativa])

HSP 1 Score: 614.4 bits (1583), Expect = 9.3e-172
Identity = 346/748 (46.26%), Positives = 417/748 (55.75%), Query Frame = 0

Query: 1   MSIICGIPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVYE 60
           MSIICGIP+LECV CL CARWA KRC H+A HDSE WG ATA+EF P+PR+C YILAVYE
Sbjct: 1   MSIICGIPLLECVYCLACARWAWKRCLHTAGHDSENWGIATAEEFEPVPRMCCYILAVYE 60

Query: 61  DDIQQPLWEPAGGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAKE 120
           DD++ PLWEP  GYGINPDWL  KK+Y+DT G+APPYILYLDH+H DIVLA RGLN+AKE
Sbjct: 61  DDLRHPLWEPPEGYGINPDWLEHKKSYEDTDGQAPPYILYLDHDHEDIVLAFRGLNLAKE 120

Query: 121 SDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEILKDLVSKYPDYTLTFAGHSLG 180
           SDYAVLLDNKLG++KFDGGYVHNGLLKAA  VL  E++ LK LV KYP+YTLTFAGHSLG
Sbjct: 121 SDYAVLLDNKLGQRKFDGGYVHNGLLKAAEHVLLMESDTLKKLVMKYPNYTLTFAGHSLG 180

Query: 181 SGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQ----- 240
           SGVA +L ++ VQNR +L NIDR+RIRCYAIAPARCMSLNLAVRYADVINSVVLQ     
Sbjct: 181 SGVATLLAVLAVQNRAELGNIDRRRIRCYAIAPARCMSLNLAVRYADVINSVVLQATTPL 240

Query: 241 ------------VNCL-------------------------------------------- 300
                       + CL                                            
Sbjct: 241 EDIFKSLFCLPCILCLRCMRDTCIPEEKMIKDPRRLYAPGRLYHIVERKPFRMGRFPPEV 300

Query: 301 ------------------------------------------------------------ 360
                                                                       
Sbjct: 301 RTAVPVDGRFEHIVLSCNATSDHAIVWIEREARRALELMLKKDPIMEIPTKQRMERQETL 360

Query: 361 ------------------------------------------------------------ 420
                                                                       
Sbjct: 361 AKEKKEEYKAALQRAVTLSVPHAFTPSQYGTFDEEAESSPGSAGESSFGSPRFLHFSSHS 420

Query: 421 ---------FLSYNL------------------ISGRKGLRDQLIES---VPLSYSNHSG 480
                    F S NL                  +S RK LRD+  +    +    S HS 
Sbjct: 421 SVINNHNNGFSSLNLASSNFSPLFFFPNCRFYWLSSRKELRDEEFKQEMVIQFPSSVHSN 480

Query: 481 RIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLATSSEDKPSGNSTDSGNTVPTKILKS 530
           RIDPSRVVQ+SWRPRVFLY+GFLSDEECDHLIS  +  ED        SGNT+  K++KS
Sbjct: 481 RIDPSRVVQLSWRPRVFLYEGFLSDEECDHLISSTSREEDA-------SGNTIKKKLMKS 540

BLAST of MS017764 vs. NCBI nr
Match: KAF4351179.1 (hypothetical protein F8388_024210 [Cannabis sativa])

HSP 1 Score: 610.5 bits (1573), Expect = 1.3e-170
Identity = 344/764 (45.03%), Positives = 417/764 (54.58%), Query Frame = 0

Query: 1   MSIICGIPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVYE 60
           MSIICGIP+LECV CL CARWA KRC H+A HDSE WG ATA+EF P+PR+C YILAVYE
Sbjct: 1   MSIICGIPLLECVYCLACARWAWKRCLHTAGHDSENWGIATAEEFEPVPRMCCYILAVYE 60

Query: 61  DDIQQPLWEPAGGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAKE 120
           DD++ PLWEP  GYGINPDWL  KK+Y+DT G+APPYILYLDH+H DIVLA RGLN+AKE
Sbjct: 61  DDLRHPLWEPPEGYGINPDWLEHKKSYEDTDGQAPPYILYLDHDHEDIVLAFRGLNLAKE 120

Query: 121 SDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEILKDLVSKYPDYTLTFAGHSLG 180
           SDYAVLLDNKLG++KFDGGYVHNGLLKAA  VL  E++ LK LV KYP+YTLTFAGHSLG
Sbjct: 121 SDYAVLLDNKLGQRKFDGGYVHNGLLKAAEHVLLMESDTLKKLVMKYPNYTLTFAGHSLG 180

Query: 181 SGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQ----- 240
           SGVA +L ++ VQNR +L NIDR+RIRCYAIAPARCMSLNLAVRYADVINSVVLQ     
Sbjct: 181 SGVATLLAVLAVQNRAELGNIDRRRIRCYAIAPARCMSLNLAVRYADVINSVVLQAINFK 240

Query: 241 ------------------------------------------------------------ 300
                                                                       
Sbjct: 241 CDDFLPRTATPLEDIFKSLFCLPCILCLRCMRDTCIPEEKMIKDPRRLYAPGRLYHIVER 300

Query: 301 --------------------------VNC------------------------------- 360
                                     ++C                               
Sbjct: 301 KPFRMGRFPPEVRTAVPVDGRFEHIVLSCNATSDHAIVWIEREARRALELMLKKDPIMEI 360

Query: 361 ------------------------------------------------------------ 420
                                                                       
Sbjct: 361 PTKQRMERQETLAKEKKEEYKAALQRAVTLSVPHAFTPSQYGTFDEEAESSPGSAGESSF 420

Query: 421 ------------------------------------------LFLSYNLISGRKGLRDQL 480
                                                     L+LS    S RK LRD+ 
Sbjct: 421 GSPRFLHFSSHSSVINNHNNGFSSLNLASSNFSPLFFFPNLNLYLSLYFFSSRKELRDEE 480

Query: 481 IES---VPLSYSNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLATSSEDKPSG 530
            +    +    S HS RIDPSRVVQ+SWRPRVFLY+GFLSDEECDHLISL +  +D    
Sbjct: 481 FKQEMVIQFPSSVHSNRIDPSRVVQLSWRPRVFLYEGFLSDEECDHLISLTSREDDA--- 540

BLAST of MS017764 vs. ExPASy Swiss-Prot
Match: Q8GXT7 (Probable prolyl 4-hydroxylase 12 OS=Arabidopsis thaliana OX=3702 GN=P4H12 PE=2 SV=1)

HSP 1 Score: 242.3 bits (617), Expect = 1.3e-62
Identity = 130/286 (45.45%), Positives = 180/286 (62.94%), Query Frame = 0

Query: 249 RKGLRDQLIES----VPLSYSNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLA 308
           RK LRD+ I S       SY   S  +DP+RV+Q+SW PRVFLY+GFLS+EECDHLISL 
Sbjct: 28  RKELRDKEITSKSDDTQASYVLGSKFVDPTRVLQLSWLPRVFLYRGFLSEEECDHLISLR 87

Query: 309 TSSEDKPSGNSTDSGNTVPTKILKSSGAILNTTDDIIARIENRIAVWTFLPKDYSMPLQI 368
             + +  S ++   G T                D ++A IE +++ WTFLP +    +++
Sbjct: 88  KETTEVYSVDA--DGKT--------------QLDPVVAGIEEKVSAWTFLPGENGGSIKV 147

Query: 369 LQYGGEEAEHKY-VFGNRSAMLSSEPLMATVVLYLSDSASGGEMRFPESKVKSRFWSDRR 428
             Y  E++  K   FG   + +  E L+ATVVLYLS++  GGE+ FP S++K +  +   
Sbjct: 148 RSYTSEKSGKKLDYFGEEPSSVLHESLLATVVLYLSNTTQGGELLFPNSEMKPK--NSCL 207

Query: 429 KKNNILRPVKGNAVLIFSVHLNASPDKSSSHTRSPILDGELWIATKFFYLRPITGNKHTD 488
           +  NILRPVKGNA+L F+  LNAS D  S+H R P++ GEL +ATK  Y +     +   
Sbjct: 208 EGGNILRPVKGNAILFFTRLLNASLDGKSTHLRCPVVKGELLVATKLIYAK----KQARI 267

Query: 489 EPDGDCNDEDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 530
           E  G+C+DED++C +WA +GEC++N V+MIGSPDYYGTCRKSCNAC
Sbjct: 268 EESGECSDEDENCGRWAKLGECKKNPVYMIGSPDYYGTCRKSCNAC 291

BLAST of MS017764 vs. ExPASy Swiss-Prot
Match: F4J0A8 (Probable prolyl 4-hydroxylase 6 OS=Arabidopsis thaliana OX=3702 GN=P4H6 PE=2 SV=1)

HSP 1 Score: 215.3 bits (547), Expect = 1.6e-54
Identity = 114/274 (41.61%), Positives = 162/274 (59.12%), Query Frame = 0

Query: 265 SNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLATSSEDKPS-GNSTDSGNTVP 324
           S+ S  +DP+R+ Q+SW PR FLYKGFLSDEECDHLI LA    +K       DSG +  
Sbjct: 21  SSFSFSVDPTRITQLSWTPRAFLYKGFLSDEECDHLIKLAKGKLEKSMVVADVDSGESED 80

Query: 325 TKILKSSGAIL-NTTDDIIARIENRIAVWTFLPKDYSMPLQILQY--GGEEAEHKYVFGN 384
           +++  SSG  L    DDI+A +E ++A WTFLP++    LQIL Y  G +   H   F +
Sbjct: 81  SEVRTSSGMFLTKRQDDIVANVEAKLAAWTFLPEENGEALQILHYENGQKYDPHFDYFYD 140

Query: 385 RSAMLSSEPLMATVVLYLSDSASGGEMRFPESK-----VKSRFWSDRRKKNNILRPVKGN 444
           + A+      +ATV++YLS+   GGE  FP  K     +K   WS   K+   ++P KG+
Sbjct: 141 KKALELGGHRIATVLMYLSNVTKGGETVFPNWKGKTPQLKDDSWSKCAKQGYAVKPRKGD 200

Query: 445 AVLIFSVHLNASPDKSSSHTRSPILDGELWIATKFFYLRPITGNKHTDEPDGDCNDEDKS 504
           A+L F++HLN + D +S H   P+++GE W AT++ ++R     K        C D+ +S
Sbjct: 201 ALLFFNLHLNGTTDPNSLHGSCPVIEGEKWSATRWIHVRSFGKKKLV------CVDDHES 260

Query: 505 CPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 530
           C +WA  GECE+N ++M+GS    G CRKSC AC
Sbjct: 261 CQEWADAGECEKNPMYMVGSETSLGFCRKSCKAC 288

BLAST of MS017764 vs. ExPASy Swiss-Prot
Match: Q8L970 (Probable prolyl 4-hydroxylase 7 OS=Arabidopsis thaliana OX=3702 GN=P4H7 PE=2 SV=1)

HSP 1 Score: 213.4 bits (542), Expect = 6.3e-54
Identity = 108/278 (38.85%), Positives = 166/278 (59.71%), Query Frame = 0

Query: 260 VPLSYSNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLATSSEDKPSGNSTDSG 319
           + +  S  S   DP+RV Q+SW PRVFLY+GFLSDEECDH I LA    +K      DSG
Sbjct: 40  IKMKTSASSFGFDPTRVTQLSWTPRVFLYEGFLSDEECDHFIKLAKGKLEKSMVADNDSG 99

Query: 320 NTVPTKILKSSGAILN-TTDDIIARIENRIAVWTFLPKDYSMPLQILQY--GGEEAEHKY 379
            +V +++  SSG  L+   DDI++ +E ++A WTFLP++    +QIL Y  G +   H  
Sbjct: 100 ESVESEVRTSSGMFLSKRQDDIVSNVEAKLAAWTFLPEENGESMQILHYENGQKYEPHFD 159

Query: 380 VFGNRSAMLSSEPLMATVVLYLSDSASGGEMRFP-----ESKVKSRFWSDRRKKNNILRP 439
            F +++ +      +ATV++YLS+   GGE  FP      +++K   W++  K+   ++P
Sbjct: 160 YFHDQANLELGGHRIATVLMYLSNVEKGGETVFPMWKGKATQLKDDSWTECAKQGYAVKP 219

Query: 440 VKGNAVLIFSVHLNASPDKSSSHTRSPILDGELWIATKFFYLRPITGNKHTDEPDGDCND 499
            KG+A+L F++H NA+ D +S H   P+++GE W AT++ +++     +        C D
Sbjct: 220 RKGDALLFFNLHPNATTDSNSLHGSCPVVEGEKWSATRWIHVKSF---ERAFNKQSGCMD 279

Query: 500 EDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 530
           E+ SC +WA  GEC++N  +M+GS   +G CRKSC AC
Sbjct: 280 ENVSCEKWAKAGECQKNPTYMVGSDKDHGYCRKSCKAC 314

BLAST of MS017764 vs. ExPASy Swiss-Prot
Match: F4JAU3 (Prolyl 4-hydroxylase 2 OS=Arabidopsis thaliana OX=3702 GN=P4H2 PE=1 SV=1)

HSP 1 Score: 201.4 bits (511), Expect = 2.5e-50
Identity = 109/285 (38.25%), Positives = 169/285 (59.30%), Query Frame = 0

Query: 256 LIESVPLSYSNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLATSSEDKPSGNS 315
           L++S     S+ S  I+PS+V QVS +PR F+Y+GFL+D ECDHLISLA  +  + +   
Sbjct: 18  LLQSSTCLISSPSSIINPSKVKQVSSKPRAFVYEGFLTDLECDHLISLAKENLQRSAVAD 77

Query: 316 TDSGNTVPTKILKSSGAILNT-TDDIIARIENRIAVWTFLPKDYSMPLQILQY--GGEEA 375
            D+G +  + +  SSG  ++   D I++ IE++++ WTFLPK+    LQ+L+Y  G +  
Sbjct: 78  NDNGESQVSDVRTSSGTFISKGKDPIVSGIEDKLSTWTFLPKENGEDLQVLRYEHGQKYD 137

Query: 376 EHKYVFGNRSAMLSSEPLMATVVLYLSDSASGGEMRFPESKVKSR--------FWSDRRK 435
            H   F ++  +      +ATV+LYLS+   GGE  FP+++  SR          SD  K
Sbjct: 138 AHFDYFHDKVNIARGGHRIATVLLYLSNVTKGGETVFPDAQEFSRRSLSENKDDLSDCAK 197

Query: 436 KNNILRPVKGNAVLIFSVHLNASPDKSSSHTRSPILDGELWIATKFFYLRPITGNKHTDE 495
           K   ++P KGNA+L F++  +A PD  S H   P+++GE W ATK+ +   +        
Sbjct: 198 KGIAVKPKKGNALLFFNLQQDAIPDPFSLHGGCPVIEGEKWSATKWIH---VDSFDKILT 257

Query: 496 PDGDCNDEDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 530
            DG+C D ++SC +WA +GEC +N  +M+G+P+  G CR+SC AC
Sbjct: 258 HDGNCTDVNESCERWAVLGECGKNPEYMVGTPEIPGNCRRSCKAC 299

BLAST of MS017764 vs. ExPASy Swiss-Prot
Match: Q8LAN3 (Probable prolyl 4-hydroxylase 4 OS=Arabidopsis thaliana OX=3702 GN=P4H4 PE=2 SV=1)

HSP 1 Score: 201.1 bits (510), Expect = 3.2e-50
Identity = 104/285 (36.49%), Positives = 173/285 (60.70%), Query Frame = 0

Query: 256 LIESVPLSYSNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLATSSEDKPSGNS 315
           L++S     S+ S  ++PS+V QVS +PR F+Y+GFL++ ECDH++SLA +S  + +   
Sbjct: 17  LLQSSTSLISSSSVFVNPSKVKQVSSKPRAFVYEGFLTELECDHMVSLAKASLKRSAVAD 76

Query: 316 TDSGNTVPTKILKSSGAILNT-TDDIIARIENRIAVWTFLPKDYSMPLQILQY--GGEEA 375
            DSG +  +++  SSG  ++   D I++ IE++I+ WTFLPK+    +Q+L+Y  G +  
Sbjct: 77  NDSGESKFSEVRTSSGTFISKGKDPIVSGIEDKISTWTFLPKENGEDIQVLRYEHGQKYD 136

Query: 376 EHKYVFGNRSAMLSSEPLMATVVLYLSDSASGGEMRFPESKVKSR--------FWSDRRK 435
            H   F ++  ++     MAT+++YLS+   GGE  FP++++ SR          SD  K
Sbjct: 137 AHFDYFHDKVNIVRGGHRMATILMYLSNVTKGGETVFPDAEIPSRRVLSENKEDLSDCAK 196

Query: 436 KNNILRPVKGNAVLIFSVHLNASPDKSSSHTRSPILDGELWIATKFFYLRPITGNKHTDE 495
           +   ++P KG+A+L F++H +A PD  S H   P+++GE W ATK+ +   +        
Sbjct: 197 RGIAVKPRKGDALLFFNLHPDAIPDPLSLHGGCPVIEGEKWSATKWIH---VDSFDRIVT 256

Query: 496 PDGDCNDEDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 530
           P G+C D ++SC +WA +GEC +N  +M+G+ +  G CR+SC AC
Sbjct: 257 PSGNCTDMNESCERWAVLGECTKNPEYMVGTTELPGYCRRSCKAC 298

BLAST of MS017764 vs. ExPASy TrEMBL
Match: A0A6J5WND9 (Procollagen-proline 4-dioxygenase OS=Prunus armeniaca OX=36596 GN=ORAREDHAP_LOCUS17440 PE=3 SV=1)

HSP 1 Score: 671.4 bits (1731), Expect = 3.1e-189
Identity = 356/709 (50.21%), Positives = 429/709 (60.51%), Query Frame = 0

Query: 1   MSIICGIPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVYE 60
           MSI+CG P++ECV CL C RWA KRC H+A HDSETWG ATA+EF P+PR+CRYILAVYE
Sbjct: 14  MSILCGCPLIECVYCLACTRWAWKRCLHTAGHDSETWGIATAEEFEPVPRLCRYILAVYE 73

Query: 61  DDIQQPLWEPAGGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAKE 120
           DD++QPLWEP GGYGI PDWL++KKTY+DT+G+APPYILYLDH+H DIVLA RGLN+A+E
Sbjct: 74  DDLRQPLWEPPGGYGIKPDWLILKKTYEDTQGQAPPYILYLDHDHADIVLAFRGLNLARE 133

Query: 121 SDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEILKDLVSKYPDYTLTFAGHSLG 180
           SDYAVL+DNKLGKKKFDGGYVHNGLLKAA WVLD E E LKDLV KYP+YTLTF GHSLG
Sbjct: 134 SDYAVLMDNKLGKKKFDGGYVHNGLLKAAEWVLDAECENLKDLVEKYPNYTLTFTGHSLG 193

Query: 181 SGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQ----- 240
           SGVAA+LT+VVVQ+RD+L NIDRKR+R YAIAPARC+SLNLAVRYADVINSVVLQ     
Sbjct: 194 SGVAALLTMVVVQSRDRLGNIDRKRVRGYAIAPARCVSLNLAVRYADVINSVVLQATTPL 253

Query: 241 ---------------VNC------------------------------------------ 300
                          + C                                          
Sbjct: 254 EDIFKSLFCLPCLLCIRCMRDTCIPEEKMLKDPRRLYAPGRLYHIVERKPFRLGRFPPVV 313

Query: 301 ------------LFLSYNLIS----------GRKGL-----RDQLIE------------- 360
                       + LS N  S           ++ L     +DQ++E             
Sbjct: 314 KTAVPVDGRFEHIVLSCNATSDHAIIWIEREAQRALKLMLEKDQIMEIPPKQKMERQETL 373

Query: 361 ------------------SVPLSYSN---------------------------------- 420
                             +VP +YS                                   
Sbjct: 374 AKEHTEEYRAALQRAVTLAVPHAYSPSMYGTFDEKDEEEHSYGSSGESSFSSAKKSKTFV 433

Query: 421 ------------------------HSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLIS 480
                                   HS RIDPSR VQ+SWRPRVFLY+GFLSDEECDHL+S
Sbjct: 434 ARSRKELRSEEANKETFIHFGHSVHSNRIDPSRAVQLSWRPRVFLYQGFLSDEECDHLVS 493

Query: 481 LATSSEDKPSGNSTDSGNTVPTKILKSSGAILNTTDDIIARIENRIAVWTFLPKDYSMPL 530
           LA   E+       D GNT   ++ KS    LN  D+I++RIE RI+ WTFLPK+ S  L
Sbjct: 494 LAHGGEENSLTEYDDLGNTNTIRLRKSLQIPLNMEDEIVSRIEERISAWTFLPKENSRAL 553

BLAST of MS017764 vs. ExPASy TrEMBL
Match: A0A6J5U8N9 (Procollagen-proline 4-dioxygenase OS=Prunus armeniaca OX=36596 GN=CURHAP_LOCUS17819 PE=3 SV=1)

HSP 1 Score: 669.1 bits (1725), Expect = 1.5e-188
Identity = 355/709 (50.07%), Positives = 429/709 (60.51%), Query Frame = 0

Query: 1   MSIICGIPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVYE 60
           MSI+CG P++ECV CL C RWA KRC H+A HDSETWG ATA+EF P+PR+CRYILAVYE
Sbjct: 1   MSILCGCPLIECVYCLACTRWAWKRCLHTAGHDSETWGIATAEEFEPVPRLCRYILAVYE 60

Query: 61  DDIQQPLWEPAGGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAKE 120
           DD++QPLWEP GGYGI PDWL++KKTY+DT+G+APPYILYLDH+H DIVLA RGLN+A+E
Sbjct: 61  DDLRQPLWEPPGGYGIKPDWLILKKTYEDTQGQAPPYILYLDHDHADIVLAFRGLNLARE 120

Query: 121 SDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEILKDLVSKYPDYTLTFAGHSLG 180
           SDYAVL+DNKLGKKKFDGGYVHNGLLKAA WVLD E E LKDLV KYP+YTLTF GHSLG
Sbjct: 121 SDYAVLMDNKLGKKKFDGGYVHNGLLKAAEWVLDAECENLKDLVEKYPNYTLTFTGHSLG 180

Query: 181 SGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQ----- 240
           SGVAA+LT+VVVQ+RD+L NIDRKR+R YAIAPARC+SLNLAVRYADVINSVVLQ     
Sbjct: 181 SGVAALLTMVVVQSRDRLGNIDRKRVRGYAIAPARCVSLNLAVRYADVINSVVLQATTPL 240

Query: 241 ---------------VNC------------------------------------------ 300
                          + C                                          
Sbjct: 241 EDIFKSLFCLPCLLCIRCMRDTCIPEEKMLKDPRRLYAPGRLYHIVERKPFRLGRFPPVV 300

Query: 301 ------------LFLSYNLIS----------GRKGL-----RDQLIE------------- 360
                       + LS N  S           ++ L     +DQ++E             
Sbjct: 301 KTAVPVDGRFEHIVLSCNATSDHAIIWIEREAQRALKLMLEKDQIMEIPPKQKMERQETL 360

Query: 361 ------------------SVPLSYSN---------------------------------- 420
                             +VP +YS                                   
Sbjct: 361 AKEHTEEYRAALQRAVTLAVPHAYSPSMYGTFDEKDEEEHSYGSSGESSFSSAKKSKTFV 420

Query: 421 ------------------------HSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLIS 480
                                   HS RIDPSR VQ+SWRPRVFLY+GFLSDEECDHL+S
Sbjct: 421 ARSRKELRSEEANKETFIHFGHSVHSNRIDPSRAVQLSWRPRVFLYQGFLSDEECDHLVS 480

Query: 481 LATSSEDKPSGNSTDSGNTVPTKILKSSGAILNTTDDIIARIENRIAVWTFLPKDYSMPL 530
           LA   E+       D GNT   ++  S    LN  D+I++RIE RI+ WTFLPK+ S  L
Sbjct: 481 LAHGGEENSLTEYDDLGNTNTIRLRISLQIPLNMEDEIVSRIEERISAWTFLPKENSRAL 540

BLAST of MS017764 vs. ExPASy TrEMBL
Match: A0A498JHB5 (Procollagen-proline 4-dioxygenase OS=Malus domestica OX=3750 GN=DVH24_024772 PE=3 SV=1)

HSP 1 Score: 630.6 bits (1625), Expect = 6.1e-177
Identity = 340/739 (46.01%), Positives = 420/739 (56.83%), Query Frame = 0

Query: 1    MSIICGIPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVYE 60
            MSI+C  P+LECV CL C RWA KRC H+A HDSETWG +TA+EF P+PR+CRYILAVYE
Sbjct: 994  MSILCACPVLECVYCLACTRWAWKRCLHTAGHDSETWGLSTAEEFEPVPRLCRYILAVYE 1053

Query: 61   DDIQQPLWEPAGGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAKE 120
            DD++ PLWEP GGYGINPDWL++KKTY+DT G APPYILYLDHNH DIVLA RGLN+A+E
Sbjct: 1054 DDLRCPLWEPPGGYGINPDWLILKKTYEDTGGLAPPYILYLDHNHADIVLAFRGLNLARE 1113

Query: 121  SDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEILKDLVSKYPDYTLTFAGHSLG 180
            SDYAVL+DNKLG++KFDGGYVHNGLLK+A WV+D E EILKDLV  YP+YTLTFAGHSLG
Sbjct: 1114 SDYAVLMDNKLGQRKFDGGYVHNGLLKSAQWVMDAECEILKDLVQNYPNYTLTFAGHSLG 1173

Query: 181  SGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQ----- 240
            SGVAA+LT+VVV+NRD+L +IDRKR+R YAIAPARCMSLNLAVRYADVINSVVLQ     
Sbjct: 1174 SGVAALLTMVVVKNRDRLGDIDRKRVRGYAIAPARCMSLNLAVRYADVINSVVLQDDFLP 1233

Query: 241  --------------VNCL------------------------------------------ 300
                          + CL                                          
Sbjct: 1234 RTATPLEDIFNLPCILCLRCMRDTCIPEEKMLKDPRRLYAPGRLYHIVERKPFRCGRFPP 1293

Query: 301  ------------------------------------------------------------ 360
                                                                        
Sbjct: 1294 VVKTAVPVDGRFEHIVLSCNATSDHAIIWIEREAQRALDLMLQKDHIMEIPSKQRMERQE 1353

Query: 361  ------------------------------------------------------------ 420
                                                                        
Sbjct: 1354 TLAKEHTEEYKAALQRAVTLAVPHAYSPSPYGTFDEKDEEDHSYGSSGESSFGSTKKSKS 1413

Query: 421  --------------------------FLSYNLISGRKGLR-DQLIES--VPLSYSNHSGR 480
                                      F S +    RK LR +Q I+   +   +S HS R
Sbjct: 1414 FTARRGGSASMASLASIFLLLSVTSSFFSSSAEISRKELRTNQTIQETVIHFGHSVHSNR 1473

Query: 481  IDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLATSSEDKPSGNSTDSGNTVPTKILKSS 528
            IDPSRVVQ+SW+PR        SDEECDHL+SLA   EDK      + GNT   +++KS 
Sbjct: 1474 IDPSRVVQLSWQPR--------SDEECDHLVSLALGGEDKSVTEYDELGNTNTMRLIKSL 1533

BLAST of MS017764 vs. ExPASy TrEMBL
Match: A0A7J6E5B7 (Procollagen-proline 4-dioxygenase OS=Cannabis sativa OX=3483 GN=F8388_017773 PE=3 SV=1)

HSP 1 Score: 614.4 bits (1583), Expect = 4.5e-172
Identity = 346/748 (46.26%), Positives = 417/748 (55.75%), Query Frame = 0

Query: 1   MSIICGIPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVYE 60
           MSIICGIP+LECV CL CARWA KRC H+A HDSE WG ATA+EF P+PR+C YILAVYE
Sbjct: 1   MSIICGIPLLECVYCLACARWAWKRCLHTAGHDSENWGIATAEEFEPVPRMCCYILAVYE 60

Query: 61  DDIQQPLWEPAGGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAKE 120
           DD++ PLWEP  GYGINPDWL  KK+Y+DT G+APPYILYLDH+H DIVLA RGLN+AKE
Sbjct: 61  DDLRHPLWEPPEGYGINPDWLEHKKSYEDTDGQAPPYILYLDHDHEDIVLAFRGLNLAKE 120

Query: 121 SDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEILKDLVSKYPDYTLTFAGHSLG 180
           SDYAVLLDNKLG++KFDGGYVHNGLLKAA  VL  E++ LK LV KYP+YTLTFAGHSLG
Sbjct: 121 SDYAVLLDNKLGQRKFDGGYVHNGLLKAAEHVLLMESDTLKKLVMKYPNYTLTFAGHSLG 180

Query: 181 SGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQ----- 240
           SGVA +L ++ VQNR +L NIDR+RIRCYAIAPARCMSLNLAVRYADVINSVVLQ     
Sbjct: 181 SGVATLLAVLAVQNRAELGNIDRRRIRCYAIAPARCMSLNLAVRYADVINSVVLQATTPL 240

Query: 241 ------------VNCL-------------------------------------------- 300
                       + CL                                            
Sbjct: 241 EDIFKSLFCLPCILCLRCMRDTCIPEEKMIKDPRRLYAPGRLYHIVERKPFRMGRFPPEV 300

Query: 301 ------------------------------------------------------------ 360
                                                                       
Sbjct: 301 RTAVPVDGRFEHIVLSCNATSDHAIVWIEREARRALELMLKKDPIMEIPTKQRMERQETL 360

Query: 361 ------------------------------------------------------------ 420
                                                                       
Sbjct: 361 AKEKKEEYKAALQRAVTLSVPHAFTPSQYGTFDEEAESSPGSAGESSFGSPRFLHFSSHS 420

Query: 421 ---------FLSYNL------------------ISGRKGLRDQLIES---VPLSYSNHSG 480
                    F S NL                  +S RK LRD+  +    +    S HS 
Sbjct: 421 SVINNHNNGFSSLNLASSNFSPLFFFPNCRFYWLSSRKELRDEEFKQEMVIQFPSSVHSN 480

Query: 481 RIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLATSSEDKPSGNSTDSGNTVPTKILKS 530
           RIDPSRVVQ+SWRPRVFLY+GFLSDEECDHLIS  +  ED        SGNT+  K++KS
Sbjct: 481 RIDPSRVVQLSWRPRVFLYEGFLSDEECDHLISSTSREEDA-------SGNTIKKKLMKS 540

BLAST of MS017764 vs. ExPASy TrEMBL
Match: A0A7J6E0F0 (Procollagen-proline 4-dioxygenase OS=Cannabis sativa OX=3483 GN=F8388_024210 PE=3 SV=1)

HSP 1 Score: 610.5 bits (1573), Expect = 6.5e-171
Identity = 344/764 (45.03%), Positives = 417/764 (54.58%), Query Frame = 0

Query: 1   MSIICGIPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVYE 60
           MSIICGIP+LECV CL CARWA KRC H+A HDSE WG ATA+EF P+PR+C YILAVYE
Sbjct: 1   MSIICGIPLLECVYCLACARWAWKRCLHTAGHDSENWGIATAEEFEPVPRMCCYILAVYE 60

Query: 61  DDIQQPLWEPAGGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAKE 120
           DD++ PLWEP  GYGINPDWL  KK+Y+DT G+APPYILYLDH+H DIVLA RGLN+AKE
Sbjct: 61  DDLRHPLWEPPEGYGINPDWLEHKKSYEDTDGQAPPYILYLDHDHEDIVLAFRGLNLAKE 120

Query: 121 SDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEILKDLVSKYPDYTLTFAGHSLG 180
           SDYAVLLDNKLG++KFDGGYVHNGLLKAA  VL  E++ LK LV KYP+YTLTFAGHSLG
Sbjct: 121 SDYAVLLDNKLGQRKFDGGYVHNGLLKAAEHVLLMESDTLKKLVMKYPNYTLTFAGHSLG 180

Query: 181 SGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQ----- 240
           SGVA +L ++ VQNR +L NIDR+RIRCYAIAPARCMSLNLAVRYADVINSVVLQ     
Sbjct: 181 SGVATLLAVLAVQNRAELGNIDRRRIRCYAIAPARCMSLNLAVRYADVINSVVLQAINFK 240

Query: 241 ------------------------------------------------------------ 300
                                                                       
Sbjct: 241 CDDFLPRTATPLEDIFKSLFCLPCILCLRCMRDTCIPEEKMIKDPRRLYAPGRLYHIVER 300

Query: 301 --------------------------VNC------------------------------- 360
                                     ++C                               
Sbjct: 301 KPFRMGRFPPEVRTAVPVDGRFEHIVLSCNATSDHAIVWIEREARRALELMLKKDPIMEI 360

Query: 361 ------------------------------------------------------------ 420
                                                                       
Sbjct: 361 PTKQRMERQETLAKEKKEEYKAALQRAVTLSVPHAFTPSQYGTFDEEAESSPGSAGESSF 420

Query: 421 ------------------------------------------LFLSYNLISGRKGLRDQL 480
                                                     L+LS    S RK LRD+ 
Sbjct: 421 GSPRFLHFSSHSSVINNHNNGFSSLNLASSNFSPLFFFPNLNLYLSLYFFSSRKELRDEE 480

Query: 481 IES---VPLSYSNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLATSSEDKPSG 530
            +    +    S HS RIDPSRVVQ+SWRPRVFLY+GFLSDEECDHLISL +  +D    
Sbjct: 481 FKQEMVIQFPSSVHSNRIDPSRVVQLSWRPRVFLYEGFLSDEECDHLISLTSREDDA--- 540

BLAST of MS017764 vs. TAIR 10
Match: AT3G49050.1 (alpha/beta-Hydrolases superfamily protein )

HSP 1 Score: 392.9 bits (1008), Expect = 4.1e-109
Identity = 182/236 (77.12%), Positives = 210/236 (88.98%), Query Frame = 0

Query: 1   MSIICG-IPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVY 60
           MSI+CG  P+LECV CLGCARW  KRC ++A HDSE WG AT DEF P+PR CRYILAVY
Sbjct: 1   MSILCGCCPLLECVYCLGCARWGYKRCLYTAGHDSEDWGLATTDEFEPVPRFCRYILAVY 60

Query: 61  EDDIQQPLWEPAGGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAK 120
           EDDI+ PLWEP  GYGINPDWLL+KKTY+DT+GRAP YILYLDH H DIV+AIRGLN+AK
Sbjct: 61  EDDIRNPLWEPPEGYGINPDWLLLKKTYEDTQGRAPAYILYLDHVHQDIVVAIRGLNLAK 120

Query: 121 ESDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEILKDLVSKYPDYTLTFAGHSL 180
           ESDYA+LLDNKLG++KFDGGYVHNGL+K+AG+VLD E ++LK+LV KYP YTLTFAGHSL
Sbjct: 121 ESDYAMLLDNKLGERKFDGGYVHNGLVKSAGYVLDEECKVLKELVKKYPSYTLTFAGHSL 180

Query: 181 GSGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQ 236
           GSGVA ML L+VV++ ++L NIDRKR+RC+AIAPARCMSLNLAVRYADVINSV+LQ
Sbjct: 181 GSGVATMLALLVVRHPERLGNIDRKRVRCFAIAPARCMSLNLAVRYADVINSVILQ 236

BLAST of MS017764 vs. TAIR 10
Match: AT4G00500.1 (alpha/beta-Hydrolases superfamily protein )

HSP 1 Score: 344.4 bits (882), Expect = 1.7e-94
Identity = 155/235 (65.96%), Positives = 192/235 (81.70%), Query Frame = 0

Query: 1   MSIICGIPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVYE 60
           MSI+C +P+LECV CLGC  W  K+C +SA H+SE WG AT+DEF PIPRICR ILAVYE
Sbjct: 1   MSILCCVPVLECVYCLGCTHWLWKKCLYSAGHESENWGLATSDEFEPIPRICRLILAVYE 60

Query: 61  DDIQQPLWEPAGGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAKE 120
           +++  P+W P  GYGI+P+ +++KK Y  T GR  PY++YLDH +GD+VLAIRGLN+AKE
Sbjct: 61  ENLHDPMWAPPDGYGIDPNHVILKKDYDQTEGRVTPYMIYLDHENGDVVLAIRGLNLAKE 120

Query: 121 SDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEILKDLVSKYPDYTLTFAGHSLG 180
            DYAVLLDNKLG+ KFDGGYVHNGLLKAA WV + E+ +L++L+   P Y+LTF GHSLG
Sbjct: 121 CDYAVLLDNKLGQTKFDGGYVHNGLLKAAMWVFEEEHVVLRELLEANPSYSLTFVGHSLG 180

Query: 181 SGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQ 236
           +GV ++L L V+QNR +L NI+RKRIRC+AIAP RCMSL+LAV YADVINSVVLQ
Sbjct: 181 AGVVSLLVLFVIQNRVRLGNIERKRIRCFAIAPPRCMSLHLAVTYADVINSVVLQ 235

BLAST of MS017764 vs. TAIR 10
Match: AT4G00500.2 (alpha/beta-Hydrolases superfamily protein )

HSP 1 Score: 344.4 bits (882), Expect = 1.7e-94
Identity = 155/235 (65.96%), Positives = 192/235 (81.70%), Query Frame = 0

Query: 1   MSIICGIPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVYE 60
           MSI+C +P+LECV CLGC  W  K+C +SA H+SE WG AT+DEF PIPRICR ILAVYE
Sbjct: 1   MSILCCVPVLECVYCLGCTHWLWKKCLYSAGHESENWGLATSDEFEPIPRICRLILAVYE 60

Query: 61  DDIQQPLWEPAGGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAKE 120
           +++  P+W P  GYGI+P+ +++KK Y  T GR  PY++YLDH +GD+VLAIRGLN+AKE
Sbjct: 61  ENLHDPMWAPPDGYGIDPNHVILKKDYDQTEGRVTPYMIYLDHENGDVVLAIRGLNLAKE 120

Query: 121 SDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEILKDLVSKYPDYTLTFAGHSLG 180
            DYAVLLDNKLG+ KFDGGYVHNGLLKAA WV + E+ +L++L+   P Y+LTF GHSLG
Sbjct: 121 CDYAVLLDNKLGQTKFDGGYVHNGLLKAAMWVFEEEHVVLRELLEANPSYSLTFVGHSLG 180

Query: 181 SGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQ 236
           +GV ++L L V+QNR +L NI+RKRIRC+AIAP RCMSL+LAV YADVINSVVLQ
Sbjct: 181 AGVVSLLVLFVIQNRVRLGNIERKRIRCFAIAPPRCMSLHLAVTYADVINSVVLQ 235

BLAST of MS017764 vs. TAIR 10
Match: AT5G37710.1 (alpha/beta-Hydrolases superfamily protein )

HSP 1 Score: 283.1 bits (723), Expect = 4.6e-76
Identity = 137/237 (57.81%), Positives = 179/237 (75.53%), Query Frame = 0

Query: 1   MSIICGIPILECVCCLGCARWACKRCFHSAVHDSETWGFATADEFGPIPRICRYILAVYE 60
           MS+ CG   LECV C+G +RWA KRC H    DS TW  AT +EF PIPRI R ILAVYE
Sbjct: 1   MSVACG---LECVFCVGFSRWAWKRCTHVGSDDSATWTSATPEEFEPIPRISRVILAVYE 60

Query: 61  DDIQQPLWEPA-GGYGINPDWLLIKKTYKDTRGRAPPYILYLDHNHGDIVLAIRGLNMAK 120
            D++ P   P+ G + +NP+W++ + T++ T+GR+PPYI+Y+DH+H +IVLAIRGLN+AK
Sbjct: 61  PDLRNPKISPSLGTFDLNPEWVIKRVTHEKTQGRSPPYIIYIDHDHREIVLAIRGLNLAK 120

Query: 121 ESDYAVLLDNKLGKKKFDGGYVHNGLLKAAGWVLDTENEIL-KDLVSKYPDYTLTFAGHS 180
           ESDY +LLDNKLG+K   GGYVH GLLK+A WVL+ E+E L +       +Y L FAGHS
Sbjct: 121 ESDYKILLDNKLGQKMLGGGYVHRGLLKSAAWVLNQESETLWRVWEENGREYDLVFAGHS 180

Query: 181 LGSGVAAMLTLVVVQNRDKLENIDRKRIRCYAIAPARCMSLNLAVRYADVINSVVLQ 236
           LGSGVAA++ ++VV     + +I R ++RC+A+APARCMSLNLAV+YADVI+SV+LQ
Sbjct: 181 LGSGVAALMAVLVVNTPAMIGDIPRNKVRCFALAPARCMSLNLAVKYADVISSVILQ 234

BLAST of MS017764 vs. TAIR 10
Match: AT4G25600.1 (Oxoglutarate/iron-dependent oxygenase )

HSP 1 Score: 242.3 bits (617), Expect = 9.0e-64
Identity = 130/286 (45.45%), Positives = 180/286 (62.94%), Query Frame = 0

Query: 249 RKGLRDQLIES----VPLSYSNHSGRIDPSRVVQVSWRPRVFLYKGFLSDEECDHLISLA 308
           RK LRD+ I S       SY   S  +DP+RV+Q+SW PRVFLY+GFLS+EECDHLISL 
Sbjct: 28  RKELRDKEITSKSDDTQASYVLGSKFVDPTRVLQLSWLPRVFLYRGFLSEEECDHLISLR 87

Query: 309 TSSEDKPSGNSTDSGNTVPTKILKSSGAILNTTDDIIARIENRIAVWTFLPKDYSMPLQI 368
             + +  S ++   G T                D ++A IE +++ WTFLP +    +++
Sbjct: 88  KETTEVYSVDA--DGKT--------------QLDPVVAGIEEKVSAWTFLPGENGGSIKV 147

Query: 369 LQYGGEEAEHKY-VFGNRSAMLSSEPLMATVVLYLSDSASGGEMRFPESKVKSRFWSDRR 428
             Y  E++  K   FG   + +  E L+ATVVLYLS++  GGE+ FP S++K +  +   
Sbjct: 148 RSYTSEKSGKKLDYFGEEPSSVLHESLLATVVLYLSNTTQGGELLFPNSEMKPK--NSCL 207

Query: 429 KKNNILRPVKGNAVLIFSVHLNASPDKSSSHTRSPILDGELWIATKFFYLRPITGNKHTD 488
           +  NILRPVKGNA+L F+  LNAS D  S+H R P++ GEL +ATK  Y +     +   
Sbjct: 208 EGGNILRPVKGNAILFFTRLLNASLDGKSTHLRCPVVKGELLVATKLIYAK----KQARI 267

Query: 489 EPDGDCNDEDKSCPQWAAIGECERNAVFMIGSPDYYGTCRKSCNAC 530
           E  G+C+DED++C +WA +GEC++N V+MIGSPDYYGTCRKSCNAC
Sbjct: 268 EESGECSDEDENCGRWAKLGECKKNPVYMIGSPDYYGTCRKSCNAC 291
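
Each HSP summary line above reports identity and positives as a fraction followed by a percentage. The following is a minimal sketch of how those percentages can be recomputed from the fractions. The names `parse_hsp_stats` and `percent` and the regular expression are hypothetical helpers written for illustration; they are not part of the database's own tooling.

```python
import re

# Hypothetical pattern for one HSP statistics line. "Posi?tives" accepts
# either spelling of the label, so the sketch works on raw page exports too.
HSP_RE = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"Posi?tives = (\d+)/(\d+) \(([\d.]+)%\)"
)


def parse_hsp_stats(line):
    """Return (matches, alignment_length, reported_percent) for both fields."""
    m = HSP_RE.search(line)
    if m is None:
        raise ValueError("not an HSP statistics line: %r" % line)
    ident, ident_len, ident_pct, pos, pos_len, pos_pct = m.groups()
    return {
        "identity": (int(ident), int(ident_len), float(ident_pct)),
        "positives": (int(pos), int(pos_len), float(pos_pct)),
    }


def percent(matches, length):
    """Percentage of aligned columns that match, rounded to two decimals."""
    return round(100.0 * matches / length, 2)


if __name__ == "__main__":
    line = "Identity = 182/236 (77.12%), Positives = 210/236 (88.98%), Query Frame = 0"
    stats = parse_hsp_stats(line)
    for name, (matches, length, reported) in stats.items():
        print(name, percent(matches, length), reported)
```

For the AT3G49050.1 hit, 182 identical columns out of 236 aligned columns gives 77.12%, which agrees with the reported value.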

The following BLAST results are available for this feature:
Match Name | E-value | Identity (%) | Description
CAB4301873.1 | 6.4e-189 | 50.21 | unnamed protein product [Prunus armeniaca]
CAB4271435.1 | 3.2e-188 | 50.07 | unnamed protein product [Prunus armeniaca]
RXH95088.1 | 1.3e-176 | 46.01 | hypothetical protein DVH24_024772 [Malus domestica]
KAF4353598.1 | 9.3e-172 | 46.26 | hypothetical protein F8388_017773 [Cannabis sativa]
KAF4351179.1 | 1.3e-170 | 45.03 | hypothetical protein F8388_024210 [Cannabis sativa]

Match Name | E-value | Identity (%) | Description
Q8GXT7 | 1.3e-62 | 45.45 | Probable prolyl 4-hydroxylase 12 OS=Arabidopsis thaliana OX=3702 GN=P4H12 PE=2 S...
F4J0A8 | 1.6e-54 | 41.61 | Probable prolyl 4-hydroxylase 6 OS=Arabidopsis thaliana OX=3702 GN=P4H6 PE=2 SV=...
Q8L970 | 6.3e-54 | 38.85 | Probable prolyl 4-hydroxylase 7 OS=Arabidopsis thaliana OX=3702 GN=P4H7 PE=2 SV=...
F4JAU3 | 2.5e-50 | 38.25 | Prolyl 4-hydroxylase 2 OS=Arabidopsis thaliana OX=3702 GN=P4H2 PE=1 SV=1
Q8LAN3 | 3.2e-50 | 36.49 | Probable prolyl 4-hydroxylase 4 OS=Arabidopsis thaliana OX=3702 GN=P4H4 PE=2 SV=...

Match Name | E-value | Identity (%) | Description
A0A6J5WND9 | 3.1e-189 | 50.21 | Procollagen-proline 4-dioxygenase OS=Prunus armeniaca OX=36596 GN=ORAREDHAP_LOCU...
A0A6J5U8N9 | 1.5e-188 | 50.07 | Procollagen-proline 4-dioxygenase OS=Prunus armeniaca OX=36596 GN=CURHAP_LOCUS17...
A0A498JHB5 | 6.1e-177 | 46.01 | Procollagen-proline 4-dioxygenase OS=Malus domestica OX=3750 GN=DVH24_024772 PE=...
A0A7J6E5B7 | 4.5e-172 | 46.26 | Procollagen-proline 4-dioxygenase OS=Cannabis sativa OX=3483 GN=F8388_017773 PE=...
A0A7J6E0F0 | 6.5e-171 | 45.03 | Procollagen-proline 4-dioxygenase OS=Cannabis sativa OX=3483 GN=F8388_024210 PE=...

Match Name | E-value | Identity (%) | Description
AT3G49050.1 | 4.1e-109 | 77.12 | alpha/beta-Hydrolases superfamily protein
AT4G00500.1 | 1.7e-94 | 65.96 | alpha/beta-Hydrolases superfamily protein
AT4G00500.2 | 1.7e-94 | 65.96 | alpha/beta-Hydrolases superfamily protein
AT5G37710.1 | 4.6e-76 | 57.81 | alpha/beta-Hydrolases superfamily protein
AT4G25600.1 | 9.0e-64 | 45.45 | Oxoglutarate/iron-dependent oxygenase
InterPro
Analysis Name: InterPro Annotations of Bitter gourd (TR) v1
Date Performed: 2021-10-25
IPR Term | IPR Description | Source | Source Term | Source Description | Alignment
IPR003582 | ShKT domain | SMART | SM00254 | ShkT_1 | coord: 488..529; e-value: 0.0018; score: 27.6
IPR003582 | ShKT domain | PROSITE | PS51670 | SHKT | coord: 489..529; score: 8.436107
IPR006620 | Prolyl 4-hydroxylase, alpha subunit | SMART | SM00702 | p4hc | coord: 283..472; e-value: 2.3E-17; score: 73.7
IPR002921 | Fungal lipase-like domain | PFAM | PF01764 | Lipase_3 | coord: 109..236; e-value: 9.7E-21; score: 74.1
IPR029058 | Alpha/Beta hydrolase fold | GENE3D | 3.40.50.1820 | alpha/beta hydrolase | coord: 40..252; e-value: 6.3E-25; score: 89.8
IPR029058 | Alpha/Beta hydrolase fold | SUPERFAMILY | 53474 | alpha/beta-Hydrolases | coord: 43..231
None | No IPR available | GENE3D | 2.60.120.620 | q2cbj1_9rhob like domain | coord: 275..472; e-value: 3.0E-38; score: 133.6
None | No IPR available | PANTHER | PTHR46398 | ALPHA/BETA-HYDROLASES SUPERFAMILY PROTEIN | coord: 1..238
None | No IPR available | PANTHER | PTHR46398:SF4 | ALPHA/BETA-HYDROLASES SUPERFAMILY PROTEIN | coord: 1..238
None | No IPR available | CDD | cd00519 | Lipase_3 | coord: 44..235; e-value: 8.78165E-24; score: 97.9322
IPR005592 | Mono-/di-acylglycerol lipase, N-terminal | PFAM | PF03893 | Lipase3_N | coord: 10..73; e-value: 3.4E-17; score: 62.2
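
Each InterPro alignment entry above locates a domain on the predicted protein as `coord: start..end`. The following is a minimal sketch of turning such strings into spans and checking whether two annotated regions overlap. It assumes the coordinates are 1-based and inclusive, which the page does not state, and the function names are hypothetical.

```python
import re

# Matches the "coord: start..end" notation used in the Alignment entries.
COORD_RE = re.compile(r"coord:\s*(\d+)\.\.(\d+)")


def parse_coord(text):
    """Extract (start, end) residue positions from a 'coord: a..b' string."""
    m = COORD_RE.search(text)
    if m is None:
        raise ValueError("no coordinate range found in %r" % text)
    return int(m.group(1)), int(m.group(2))


def span_length(span):
    """Number of residues covered, assuming inclusive coordinates."""
    start, end = span
    return end - start + 1


def overlaps(a, b):
    """True if two inclusive spans share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    lipase3 = parse_coord("coord: 109..236")  # PFAM PF01764, Lipase_3
    p4hc = parse_coord("coord: 283..472")     # SMART SM00702, p4hc
    print(span_length(lipase3), span_length(p4hc), overlaps(lipase3, p4hc))
```

With the values listed above, the Lipase_3 hit covers 128 residues, the p4hc hit covers 190, and the two regions do not overlap.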

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name | Unique Name | Type
MS017764.1 | MS017764.1 | mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category | Term Accession | Term Name
biological_process | GO:0016042 | lipid catabolic process
biological_process | GO:0006629 | lipid metabolic process
molecular_function | GO:0005506 | iron ion binding
molecular_function | GO:0031418 | L-ascorbic acid binding
molecular_function | GO:0016705 | oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen