HG10003551 (gene) Bottle gourd (Hangzhou Gourd) v1

Overview
Name: HG10003551
Type: gene
Organism: Lagenaria siceraria cv. Hangzhou Gourd (Bottle gourd (Hangzhou Gourd) v1)
Description: SUN domain-containing protein
Location: Chr08: 3121285 .. 3128537 (+)
RNA-Seq Expression: HG10003551
Synteny: HG10003551
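The Location field packs chromosome, start, end, and strand into a single string. The sketch below shows one way to split it and compute the length of the genomic span. It assumes the coordinates are 1-based and inclusive, which the page does not state; the function names `parse_location` and `span_length` are illustrative and not part of any database API.

```python
import re

def parse_location(text):
    """Split a string like 'Chr08: 3121285 .. 3128537 (+)' into its parts."""
    match = re.fullmatch(
        r"\s*(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*", text
    )
    if match is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    chrom = match.group(1)
    start = int(match.group(2))
    end = int(match.group(3))
    strand = match.group(4)
    return chrom, start, end, strand

def span_length(start, end):
    """Length of the span, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1

chrom, start, end, strand = parse_location("Chr08: 3121285 .. 3128537 (+)")
length = span_length(start, end)
print(chrom, start, end, strand, length)
```

If the coordinates were 0-based and half-open instead, the length would be `end - start`, so the convention is worth confirming before comparing against the length of the gene sequence.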
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

ATGCGGAGGCCTGTTGGAGCTCTTCTGCGTGATAGAAGAGCTGTTCAAGTGCCTACTAGTGGAAGAAATCATTTGTATAAAGTTTCTCTTTCTTTGGTTTTTATTCTGTGGGGACTTATCTTCCTCTTTAGCTTATGGTTCAGCCGTGGGGATGGCTGCCAAGGTATGCTTGCTCTGCACAAATTATTTTGATGTCCTTAGTGAAGATTTGATGAATTTCGAAGTGTATTGAAACTGATTACTCCATTCACTTGTGATTTCTGATTCTGACTGAGCTTTTTCAATTGGTTTTTATCAGTTGGCTACTTCCGTGTTTCCTTCATATAATGGGATTTTGAGTAATTGTGCCGTTGTATTATCCCTTCTTTTATTTTCCAATCCGTTGGTTTTATACGATTCAAAATGGTTTTGTTCCTCTTGTATTGCAAGCTAAATTTCTGGTGTCGGTTTTAGTCTCGCAATATTGGGTTGTGAATTTTCTTAATGCTAACAAATTGAAGATTTTTATTTATTTAGTATTCCAATTTTGAATACGATGCTCTATATTCTTACATTCAAATAAAGAAACTATACAGATTGTGTTTTCATGCTTGAATGTTTCAATATATTGCCACAAGGGTCAAATTAGTGGGAAGTATAATCATATAATTGATTGTTTTTGTAGTTATTTATTAAAGAACTAATAAAAGCATAATTGGCAATCCTTCTCTAATGAAGTTCGAGCATAATTTGATACACAGTATGTTTGTGTGGATATAATAGAGTAAGCCCAGTATAATGACCTTACAGTTCAAATGGATATAATAGAGTAACGCCCTGCAGAGTACTTTGCTTACTGAAATTATATTTATGTAGACGCTTCTTAGTGGAAACTTTAATGTCCCTGAAGTTGGCTTCCTTTCCTAATTTGTTTTATTTTGTTACCACTCGTGTAAGTTAGTTTTGAACTGTTGAACTGGTTTGGCTTTGTTATTAATCACCAAACTGGCTAAATTAAATATTAAAAAAAATTCTACATGAATAGAATAACTTATTTGTCTTACACCATTTTTTTTCTTCTACACCAAGCAGCAATCCATTATTTATTTACCAGACTTGATATTGATAGTAGTACCGATTGATTACCTATGCCCTTCCTTTCTTTGGGTTTCAGAAGGATCAGTTTTACATCCTGCTGATGTATCTACTTCAAATGAATCTAAATTGGAAAATAACAAGGACTCTGACATTTTATATGAACCTCCAAAGGGAGAAACTGATAGTACCATTCAATTAAACGATTCATGTTCAATTGATGCTACAAGCCCTGGTTCTGACAATGATATACTTTCAAGTGAAGAAAGTAGCAGTCATATACGACCTGCTACGAGGTTGCCTGAGGCTGAGAGCTCTAGCACTGGAGTAAAATCTGAAAACAAACCTGTCAAGGGAGATACCTCGTCAGAGACTGTTCTACTCGGCCTTGAAGAATTCAAAAGCAGAGCCTTTATATCCCGGAGTAAGTCTGAAACTGGCCAGGCTGGGAATACCATCCATAGAGTAGAACCTGGTGGTGCAGAGTACAATTACGCTTCAGCTTCAAAGGGAGCAAAGGTTTTGGCTTTCAACAAGGAAGCAAAGGGAGCTTCTAACATTTTAGGCAGGGACAAAGATAAGTACCTCAGAAATCCATGTTCTGCTGAAGAGAAATTTGTTGTCATAGAACTTTCAGAAGAAACCTTAGTAGTAACAATTGAAATTGCTAATTTTGAACACCATTCTTCCAACTTAAAAGAATTTGAGGTACATGGGAGTTTGGTCTATCCAACAGATGTTTGGTTCAAGCTCGGTAATTTTACTGCTCCAAATGCAAAGCATGCACATAGATTTGTTCTCAAGGACCCAAAATGGGTGAGATATTTAAAGTTGAATTTTCTTACCCATTATGGTTCAGAATTCTATTGCACACTCAGCACTGTGGAAGTTTACGGAATGGATGCTGTTGAGATGATGCTAGAGGA
TTTAATATCTGCTCAACATAAATCTTCTATATCAGATGAAGCTACTACTGAAAAGAGAGTAATTCCCTCCCAGTCTGGACCCAATGATGAAGGACAACATGGTAGAGAGTTGCAATCTCTTACTACTGAGGAAAGTGATGATGATGTTGATTTAGAACTTACAAAGAGTAACATACCTGATCCGGTTGAAGAATCGCACCATCAACAACCTGGCAGAATGCCTGGTGACACTGTTCTCAAAATTTTGACACAGAAAGTTCGTTCACTAGACCTAAGTTTATCTGTTTTGGAGCAGTATTTGGAGGACTTAACTTCCAAATATGGAAATATATTCAAAGAATTTGACAAAGATATAGAAAATAGTGATCTACTCATTGAGAAGACCCGAGAGGATATAAGAAATATTCTTAAAATCCAGGACAGCACAGTATGTCATATTCTTGTCCTTTCAATTTATCTTCTGTTAATAACAATGCCTATTAACATTGTTTTGTTGTCAATGTAGGATAAAGATCTTCGTGATCTCATTTCTTGGAAGTCCATTGTTTCCTTGCAGTTGGATAATCTGCAAAGGCATAATTCTATTCTCAGGTTTTGACTTTTCTTCCCCTCTCCACCCCCAACCACCGACACACACACACACACAAGGAAAAGGAGAAGGGAGAGAAATGGAGCGAGAATTTTTTTTTTTTTGGGTTTAAATTCTATTTTTGGTTCTCTATTCTGTTTATAACTCTTAAATATTCTAGTTTGTTCACTTGAGTTTGCACCTTGACATTAATATTTTAGTAAATTATTTAATAGAAACCTGTCGTAGATATATTCTGTTTATAACTCTTAAATATTCTATTGATCACTCCTTGCAAACAGTTCTGAAAAATACTATTTAGCACTGTGAACAAGTATTGTCTTTCAAATACAACAAAGATATTTCCAAATAATGACAATTTTTACTTTTTCCTTCTGTATTTTCTTTGTTTATTTGTTTCCTTCCTACTTTCTTCAAACTGTTTTTACATGTGTCGTATCCAATAAAATTCCTCCATAAAATTATTCTTGTTTTTTAATACTAGAAAGTATTCTCATACAGCCTTCTGTCCTTTTGATTCCAACGTTCTCATTTCTTTATCATATCTAAGATTTCCTTTGGAGAAGAAAAGAATAAAAGATTTAGTCATTGAAAGCTAATCATAACTGTACTGCCCAAAGTCACCTTAGCAATCTTTGTTACTTCAAAAACTGTGCAAAGTTCTCTTTGTAAAATTCTTCTAGTTGTGAAAGAGGCTGAAAGCTTCTGTTTTGTTTTGCAGATCTGAGATTGAAAGGGTCCAGAAGAATCAGACTTCTCTGGAAAACAAAGGAATAGTTGTTTTTCTTGTGTGTCTCATTTTTTCGTCATTTGCTATTTTTAGATTATTTTTGCACATTGTTCTTAGAGTATATGAGAGAACAAATAATTCCAGGAAATTTTGTTGTATAAGCCCTTCCTGGTATCTATTGCTTTTGAGCTGTTGTATTATTCTTTTCATACAGTCACTATAATCAAGGATTGCCCCTTTATTTCATCCATTTTCCTTCTCCAATGTAAACGCTCCATCTGTCCCCCCCCCACTCCAAAGTGGAGAATTAAAATATGAATATGGAAGATGAATGAGAAAGTAAGATTAAAAATAGATGTTGTTTTGTTTCAATAAGAAATTCCAAATGACCAAATTGATTAAGTACTTATTATGAATCAATTTTTTTAGTTTTTATCTATGGCCCCTCTTATTCAGATAAAGTAATGGCTTAACTAATTTCATAAAATGCATTATATAATATTTATATGAAATGACACATTGTCTTATTTAATATTTGATCTGCTTACTGAACTTGAGATTACTTTTAAATGCAATAGGAATAGAAAGAAAGTCACAGGATAATAAGGTTGGTATCAATCTTTAGAAGGAAAATGCAATGATTGTAATATGAATTGTAATAGAAAGAATAAGAAGCATGCC
AACAAGATCTCAAAACAGCATTCATATGTATGAAATTGAGAATCTCCCTCATCTTCAGCAAGGAATGACTCACAAAGAAGCAATGAATCTTTGGTATTATGAGTTGTTATAACATCATACTTATTATTAGTACACATTCACAAGCATTGTATTGATGATGATGATGGAAGATCAAAGGCATGGAGAAGGAGAAATGGAGCATTATTCTAAGAAGCCATTTTTCTCATCTTTTTCTCCATTTTTGTTTGTGTTGATTTTACTTCCTTCACTTGTGCTAGTTTTTCTAGTTTGTAAGATTGATTTGGAGATTCCTTGGAGAAATGGCTTGGATAAAGACTTCTCAAGTTTGCAAAATTCACAACTTAATTCTTTTTCAAATAACATCTCTTCTCCTAAGTTGCTTGAACCAGCTGCTTTGGACCTCAAGGGACAGTATTTTCCCCCTTCCATTGTAAGTATAATAATTACTTTGATTCACTTTTTTGTTTGCCTTCTCATATTATTTTTTTTTTAAACAGTATGTGAGATGGAAGAATTGAACCTCTAATCTCGAGATTAATAATACGACTTTTTTGTTTTATAATACAACCTTATACTTTTTTAAATACAATAGGAGTAGGGAGACTCGAATCACAGACCTCTTAATTGCTAACACATCTGTATGCTTATTGAGTTATACTCACTTTGGTAGAACCTTCTTATAGTTTAATGTTGCAGTATAAGACTATTTAATCACTCTTCCAGTTATAGGCCAGATCTCCCTTTTTTTTCCCTTTAGCCTTTAAACCTCATTTGGATAGTATGCATTCGTTATAATAATGATTTAGCCCTTTTGTTTACAACCATCTAGGACATGTTTAATATTTAAATTTTAAGATTGTGATGTTTGACATTGGTGTTTAAGTGATTTTGGATACCTTTTTGGTCATTCACTGGAAATGTGGTTCCTATGTGGAGTTTCAACTATGTTCGGGAGTAATTTTGAAAAGATCAATATCACTTTTGTCATTCTCAAAATCACTCTAAAATATGCTTTTAATCATTTAAAATCAATTTTGATTATATGAAATTACATTTAAAAGTGTATATGATTAAGTCTATTTTTTAGTGATTTTCAATGTGATAAAAGTGATTATAACAATTTCAAAATCACTCTCAAACATGAACATATTCGTTTATTAAGTGATTATTGGATGAGGATGATGATGATCAGGAAGGATCAGTAAAAGGAATAGTTGAAAACAAAGAAGGCAATGGAAAAGATGCAAGTTCAGGGGTTAGCAGAATTAAGAGATACAGTAAGTTGAAGAAAATAGAGGAGAAATTGGGAAGAGCAAGAGCAGCCATAAGAGAAGCTGCTCAACTTCATAATCTTACATCTATACATCATGATCCTGACTATGTTCCTACAGGCCCAATATACAGGAACCCAAATGCTTTCCACAGGTATATATATATACAAGGTTAAATTACAATTGTTCCAACCTAGTTTCAATTTCCTTCCTATATTTTTAAAAGTTTTGTTACTGTGTTTTTGTTTTTGTTACGCTTTGATCTTGCTCGCACTATTGAAAAGTGAGAGAAATATAAGGTGGTATATATTTTCTGTGGATGTGACGAGTGAACTCCATGAATGTGACCGTTTTAGTCTCTATGTTTTAATCAATATTTCAATTTGGTCCTTATACTTTTATATTTAATAATTATTCTCTTTTAATTAAAGTTTTTAAAATATCCCTATCATTGTTAACTTTAAGTTATTTGTAGATGTGACAGATGAGCTTCATGATGTGACATTTTATTTTAATATCGTAATGACAACTCTTTTTTTTTCTTTTTTTTTTCTTTTATTGAATTCAACAACATATGAGATTACAAGATTCGAGCCGTTGGCCTCTTGGTCATTGGCATATGTTTTAACCTCAGTTAATTTGATAAATGTCTGACAAGTATTGTAAGTCAGAAAAAATGATTTTGAATTAAATTTTGGATTATTGATG
TATATGCAGTGCAACCATGACCAACTAATGTTAAATTCCATTAGTGTCTGACAAATCTAGTAAATTTGTCAAAAAAAAAAAAATACAGAATATTTGAAAGCAAAACCACAACTTTTTAGTTCAAATCATTAAAATGAAGAGATTTTATGTGCAGGAGCTATCTAGAAATGGAAAGGCTTTTGAAGATATATGTATACAAAGAAGGAGAACCTCCAATGTTTCATCAAGGTCCATGTAAGAGCATATATTCCACAGAAGGAAGGTTCATTCATGAAATGGAAAAGGGAAATTTGTATACAACCAATGATCCACATCAGGCCCTTCTCTATTTCCTCCCATTCAGTGTTGTCAATTTGGTTCAGTATCTTTATGTACCAAACTCTCATGAAGTTAATGCCATTGGAGTTGCAGTCTCAGATTACATCAATGTCATCTCTAATAAGCATTCTTTCTGGAATCGCAGTCTTGGTGCTGATCATTTTATGCTTTCCTGCCATGATTGGGTAAGCTTTCGAAATCGAAGGAGCAATGGTTAAAACCCCACCGGGTTGGATCTCAGCTTTCTTTCTTTCTTTCTTTCTTTGTTGCAGGGGCCACGTACCACTTCGTACGTTCCACTTTTATTCAATAACTCCATCAGGGTATTGTGTAACGCAAATGTTTCAGAAGGTTTCCGTCCCTCTAAAGATGCGTCGTTTCCTGAAATCCATCTTAGAACGGGAGAAATCGATGGGCTTCTTGGAGGTCTCTCGCCTTCTCGTCGAACTGTTCTTGCATTCTTTGCAGGTCGTCTACATGGCCATATAAGGTACCTACTCCTGCAGAACTGGAAGGAAAAAGATGAGGATGTGCTTGTTTACGACGAACTTCCAAGCGGAATATCGTACAATTCGATGTTGAAGAAGAGTAGGTTTTGTTTATGCCCTAGTGGGTATGAGGTAGCTAGTCCAAGGGTTGTGGAAGCCATTTATGCTGAATGTGTTCCTGTGTTGATATCTGAAAGCTATGTTCCTCCTTTCAGTGATGTTTTGAATTGGAAGTCATTTGCTGTGCAAATACAAGTAAAGGATATACCAAACATAAAAGAGATACTGAGAGGGATATCTAAAACTCAGTACTTGAGAATGCAGAGGAGAGTGAAGCAAGTACAGAAACATTTTGTGCTCAATGGAACTCCCAAGAGATTTGATGCTTTCCATATGATACTTCATTCTATCTGGCTCAGAAGGTTGAATATACACATTCAGGATTAA

mRNA sequence

ATGCGGAGGCCTGTTGGAGCTCTTCTGCGTGATAGAAGAGCTGTTCAAGTGCCTACTAGTGGAAGAAATCATTTGTATAAAGTTTCTCTTTCTTTGGTTTTTATTCTGTGGGGACTTATCTTCCTCTTTAGCTTATGGTTCAGCCGTGGGGATGGCTGCCAAGAAGGATCAGTTTTACATCCTGCTGATGTATCTACTTCAAATGAATCTAAATTGGAAAATAACAAGGACTCTGACATTTTATATGAACCTCCAAAGGGAGAAACTGATAGTACCATTCAATTAAACGATTCATGTTCAATTGATGCTACAAGCCCTGGTTCTGACAATGATATACTTTCAAGTGAAGAAAGTAGCAGTCATATACGACCTGCTACGAGGTTGCCTGAGGCTGAGAGCTCTAGCACTGGAGTAAAATCTGAAAACAAACCTGTCAAGGGAGATACCTCGTCAGAGACTGTTCTACTCGGCCTTGAAGAATTCAAAAGCAGAGCCTTTATATCCCGGAGTAAGTCTGAAACTGGCCAGGCTGGGAATACCATCCATAGAGTAGAACCTGGTGGTGCAGAGTACAATTACGCTTCAGCTTCAAAGGGAGCAAAGGTTTTGGCTTTCAACAAGGAAGCAAAGGGAGCTTCTAACATTTTAGGCAGGGACAAAGATAAGTACCTCAGAAATCCATGTTCTGCTGAAGAGAAATTTGTTGTCATAGAACTTTCAGAAGAAACCTTAGTAGTAACAATTGAAATTGCTAATTTTGAACACCATTCTTCCAACTTAAAAGAATTTGAGGTACATGGGAGTTTGGTCTATCCAACAGATGTTTGGTTCAAGCTCGGTAATTTTACTGCTCCAAATGCAAAGCATGCACATAGATTTGTTCTCAAGGACCCAAAATGGGTGAGATATTTAAAGTTGAATTTTCTTACCCATTATGGTTCAGAATTCTATTGCACACTCAGCACTGTGGAAGTTTACGGAATGGATGCTGTTGAGATGATGCTAGAGGATTTAATATCTGCTCAACATAAATCTTCTATATCAGATGAAGCTACTACTGAAAAGAGAGTAATTCCCTCCCAGTCTGGACCCAATGATGAAGGACAACATGGTAGAGAGTTGCAATCTCTTACTACTGAGGAAAGTGATGATGATGTTGATTTAGAACTTACAAAGAGTAACATACCTGATCCGGTTGAAGAATCGCACCATCAACAACCTGGCAGAATGCCTGGTGACACTGTTCTCAAAATTTTGACACAGAAAGTTCGTTCACTAGACCTAAGTTTATCTGTTTTGGAGCAGTATTTGGAGGACTTAACTTCCAAATATGGAAATATATTCAAAGAATTTGACAAAGATATAGAAAATAGTGATCTACTCATTGAGAAGACCCGAGAGGATATAAGAAATATTCTTAAAATCCAGGACAGCACAGATAAAGATCTTCGTGATCTCATTTCTTGGAAGTCCATTGTTTCCTTGCAGTTGGATAATCTGCAAAGGCATAATTCTATTCTCAGATCTGAGATTGAAAGGGTCCAGAAGAATCAGACTTCTCTGGAAAACAAAGGAATAGTTGTTTTTCTTTTGCTTGAACCAGCTGCTTTGGACCTCAAGGGACAGTATTTTCCCCCTTCCATTGAAGGATCAGTAAAAGGAATAGTTGAAAACAAAGAAGGCAATGGAAAAGATGCAAGTTCAGGGGTTAGCAGAATTAAGAGATACAGTAAGTTGAAGAAAATAGAGGAGAAATTGGGAAGAGCAAGAGCAGCCATAAGAGAAGCTGCTCAACTTCATAATCTTACATCTATACATCATGATCCTGACTATGTTCCTACAGGCCCAATATACAGGAACCCAAATGCTTTCCACAGGAGCTATCTAGAAATGGAAAGGCTTTTGAAGATATATGTATACAAAGAAGGAGAACCTCCAATGTTTCATCAAGGTCCATGTAAGAGCATATATTCCACAGAAGGAAGGTTCATTCATGAAAT
GGAAAAGGGAAATTTGTATACAACCAATGATCCACATCAGGCCCTTCTCTATTTCCTCCCATTCAGTGTTGTCAATTTGGTTCAGTATCTTTATGTACCAAACTCTCATGAAGTTAATGCCATTGGAGTTGCAGTCTCAGATTACATCAATGTCATCTCTAATAAGCATTCTTTCTGGAATCGCAGTCTTGGTGCTGATCATTTTATGCTTTCCTGCCATGATTGGGGGCCACGTACCACTTCGTACGTTCCACTTTTATTCAATAACTCCATCAGGGTATTGTGTAACGCAAATGTTTCAGAAGGTTTCCGTCCCTCTAAAGATGCGTCGTTTCCTGAAATCCATCTTAGAACGGGAGAAATCGATGGGCTTCTTGGAGGTCTCTCGCCTTCTCGTCGAACTGTTCTTGCATTCTTTGCAGGTCGTCTACATGGCCATATAAGGTACCTACTCCTGCAGAACTGGAAGGAAAAAGATGAGGATGTGCTTGTTTACGACGAACTTCCAAGCGGAATATCGTACAATTCGATGTTGAAGAAGAGTAGGTTTTGTTTATGCCCTAGTGGGTATGAGGTAGCTAGTCCAAGGGTTGTGGAAGCCATTTATGCTGAATGTGTTCCTGTGTTGATATCTGAAAGCTATGTTCCTCCTTTCAGTGATGTTTTGAATTGGAAGTCATTTGCTGTGCAAATACAAGTAAAGGATATACCAAACATAAAAGAGATACTGAGAGGGATATCTAAAACTCAGTACTTGAGAATGCAGAGGAGAGTGAAGCAAGTACAGAAACATTTTGTGCTCAATGGAACTCCCAAGAGATTTGATGCTTTCCATATGATACTTCATTCTATCTGGCTCAGAAGGTTGAATATACACATTCAGGATTAA

Coding sequence (CDS)

ATGCGGAGGCCTGTTGGAGCTCTTCTGCGTGATAGAAGAGCTGTTCAAGTGCCTACTAGTGGAAGAAATCATTTGTATAAAGTTTCTCTTTCTTTGGTTTTTATTCTGTGGGGACTTATCTTCCTCTTTAGCTTATGGTTCAGCCGTGGGGATGGCTGCCAAGAAGGATCAGTTTTACATCCTGCTGATGTATCTACTTCAAATGAATCTAAATTGGAAAATAACAAGGACTCTGACATTTTATATGAACCTCCAAAGGGAGAAACTGATAGTACCATTCAATTAAACGATTCATGTTCAATTGATGCTACAAGCCCTGGTTCTGACAATGATATACTTTCAAGTGAAGAAAGTAGCAGTCATATACGACCTGCTACGAGGTTGCCTGAGGCTGAGAGCTCTAGCACTGGAGTAAAATCTGAAAACAAACCTGTCAAGGGAGATACCTCGTCAGAGACTGTTCTACTCGGCCTTGAAGAATTCAAAAGCAGAGCCTTTATATCCCGGAGTAAGTCTGAAACTGGCCAGGCTGGGAATACCATCCATAGAGTAGAACCTGGTGGTGCAGAGTACAATTACGCTTCAGCTTCAAAGGGAGCAAAGGTTTTGGCTTTCAACAAGGAAGCAAAGGGAGCTTCTAACATTTTAGGCAGGGACAAAGATAAGTACCTCAGAAATCCATGTTCTGCTGAAGAGAAATTTGTTGTCATAGAACTTTCAGAAGAAACCTTAGTAGTAACAATTGAAATTGCTAATTTTGAACACCATTCTTCCAACTTAAAAGAATTTGAGGTACATGGGAGTTTGGTCTATCCAACAGATGTTTGGTTCAAGCTCGGTAATTTTACTGCTCCAAATGCAAAGCATGCACATAGATTTGTTCTCAAGGACCCAAAATGGGTGAGATATTTAAAGTTGAATTTTCTTACCCATTATGGTTCAGAATTCTATTGCACACTCAGCACTGTGGAAGTTTACGGAATGGATGCTGTTGAGATGATGCTAGAGGATTTAATATCTGCTCAACATAAATCTTCTATATCAGATGAAGCTACTACTGAAAAGAGAGTAATTCCCTCCCAGTCTGGACCCAATGATGAAGGACAACATGGTAGAGAGTTGCAATCTCTTACTACTGAGGAAAGTGATGATGATGTTGATTTAGAACTTACAAAGAGTAACATACCTGATCCGGTTGAAGAATCGCACCATCAACAACCTGGCAGAATGCCTGGTGACACTGTTCTCAAAATTTTGACACAGAAAGTTCGTTCACTAGACCTAAGTTTATCTGTTTTGGAGCAGTATTTGGAGGACTTAACTTCCAAATATGGAAATATATTCAAAGAATTTGACAAAGATATAGAAAATAGTGATCTACTCATTGAGAAGACCCGAGAGGATATAAGAAATATTCTTAAAATCCAGGACAGCACAGATAAAGATCTTCGTGATCTCATTTCTTGGAAGTCCATTGTTTCCTTGCAGTTGGATAATCTGCAAAGGCATAATTCTATTCTCAGATCTGAGATTGAAAGGGTCCAGAAGAATCAGACTTCTCTGGAAAACAAAGGAATAGTTGTTTTTCTTTTGCTTGAACCAGCTGCTTTGGACCTCAAGGGACAGTATTTTCCCCCTTCCATTGAAGGATCAGTAAAAGGAATAGTTGAAAACAAAGAAGGCAATGGAAAAGATGCAAGTTCAGGGGTTAGCAGAATTAAGAGATACAGTAAGTTGAAGAAAATAGAGGAGAAATTGGGAAGAGCAAGAGCAGCCATAAGAGAAGCTGCTCAACTTCATAATCTTACATCTATACATCATGATCCTGACTATGTTCCTACAGGCCCAATATACAGGAACCCAAATGCTTTCCACAGGAGCTATCTAGAAATGGAAAGGCTTTTGAAGATATATGTATACAAAGAAGGAGAACCTCCAATGTTTCATCAAGGTCCATGTAAGAGCATATATTCCACAGAAGGAAGGTTCATTCATGAAAT
GGAAAAGGGAAATTTGTATACAACCAATGATCCACATCAGGCCCTTCTCTATTTCCTCCCATTCAGTGTTGTCAATTTGGTTCAGTATCTTTATGTACCAAACTCTCATGAAGTTAATGCCATTGGAGTTGCAGTCTCAGATTACATCAATGTCATCTCTAATAAGCATTCTTTCTGGAATCGCAGTCTTGGTGCTGATCATTTTATGCTTTCCTGCCATGATTGGGGGCCACGTACCACTTCGTACGTTCCACTTTTATTCAATAACTCCATCAGGGTATTGTGTAACGCAAATGTTTCAGAAGGTTTCCGTCCCTCTAAAGATGCGTCGTTTCCTGAAATCCATCTTAGAACGGGAGAAATCGATGGGCTTCTTGGAGGTCTCTCGCCTTCTCGTCGAACTGTTCTTGCATTCTTTGCAGGTCGTCTACATGGCCATATAAGGTACCTACTCCTGCAGAACTGGAAGGAAAAAGATGAGGATGTGCTTGTTTACGACGAACTTCCAAGCGGAATATCGTACAATTCGATGTTGAAGAAGAGTAGGTTTTGTTTATGCCCTAGTGGGTATGAGGTAGCTAGTCCAAGGGTTGTGGAAGCCATTTATGCTGAATGTGTTCCTGTGTTGATATCTGAAAGCTATGTTCCTCCTTTCAGTGATGTTTTGAATTGGAAGTCATTTGCTGTGCAAATACAAGTAAAGGATATACCAAACATAAAAGAGATACTGAGAGGGATATCTAAAACTCAGTACTTGAGAATGCAGAGGAGAGTGAAGCAAGTACAGAAACATTTTGTGCTCAATGGAACTCCCAAGAGATTTGATGCTTTCCATATGATACTTCATTCTATCTGGCTCAGAAGGTTGAATATACACATTCAGGATTAA

Protein sequence

MRRPVGALLRDRRAVQVPTSGRNHLYKVSLSLVFILWGLIFLFSLWFSRGDGCQEGSVLHPADVSTSNESKLENNKDSDILYEPPKGETDSTIQLNDSCSIDATSPGSDNDILSSEESSSHIRPATRLPEAESSSTGVKSENKPVKGDTSSETVLLGLEEFKSRAFISRSKSETGQAGNTIHRVEPGGAEYNYASASKGAKVLAFNKEAKGASNILGRDKDKYLRNPCSAEEKFVVIELSEETLVVTIEIANFEHHSSNLKEFEVHGSLVYPTDVWFKLGNFTAPNAKHAHRFVLKDPKWVRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEMMLEDLISAQHKSSISDEATTEKRVIPSQSGPNDEGQHGRELQSLTTEESDDDVDLELTKSNIPDPVEESHHQQPGRMPGDTVLKILTQKVRSLDLSLSVLEQYLEDLTSKYGNIFKEFDKDIENSDLLIEKTREDIRNILKIQDSTDKDLRDLISWKSIVSLQLDNLQRHNSILRSEIERVQKNQTSLENKGIVVFLLLEPAALDLKGQYFPPSIEGSVKGIVENKEGNGKDASSGVSRIKRYSKLKKIEEKLGRARAAIREAAQLHNLTSIHHDPDYVPTGPIYRNPNAFHRSYLEMERLLKIYVYKEGEPPMFHQGPCKSIYSTEGRFIHEMEKGNLYTTNDPHQALLYFLPFSVVNLVQYLYVPNSHEVNAIGVAVSDYINVISNKHSFWNRSLGADHFMLSCHDWGPRTTSYVPLLFNNSIRVLCNANVSEGFRPSKDASFPEIHLRTGEIDGLLGGLSPSRRTVLAFFAGRLHGHIRYLLLQNWKEKDEDVLVYDELPSGISYNSMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVPPFSDVLNWKSFAVQIQVKDIPNIKEILRGISKTQYLRMQRRVKQVQKHFVLNGTPKRFDAFHMILHSIWLRRLNIHIQD
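The protein sequence above is the translation of the coding sequence (CDS). As a minimal sketch, the snippet below translates the first 30 and the last 21 nucleotides of the CDS with the standard genetic code and yields the corresponding ends of the protein. It assumes the standard code (NCBI translation table 1) applies to this nuclear gene; the `translate` helper is illustrative and not taken from the database.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(cds):
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for pos in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        if amino_acid == "*":  # stop codon ends the translation
            break
        protein.append(amino_acid)
    return "".join(protein)

cds_start = "ATGCGGAGGCCTGTTGGAGCTCTTCTGCGT"  # first 30 nt of the CDS above
cds_end = "AATATACACATTCAGGATTAA"             # last 21 nt, including the stop codon

print(translate(cds_start))  # N-terminal residues
print(translate(cds_end))    # C-terminal residues
```

Passing the full CDS string to `translate` in the same way should reproduce the whole protein sequence, provided the copied sequence contains no line breaks or other whitespace.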
Homology
BLAST of HG10003551 vs. NCBI nr
Match: KAA0039335.1 (putative glycosyltransferase [Cucumis melo var. makuwa] >TYK00518.1 putative glycosyltransferase [Cucumis melo var. makuwa])

HSP 1 Score: 1677.5 bits (4343), Expect = 0.0e+00
Identity = 846/962 (87.94%), Positives = 890/962 (92.52%), Query Frame = 0

Query: 1   MRRPVGALLRDRRAVQVPTSGRNHLYKVSLSLVFILWGLIFLFSLWFSRGDGCQEGSVLH 60
           MR+PVGALL DRRAV+VP SGRNHLYKVS+SLVFILWGLIFLFSLW SRGDGCQEGS+L 
Sbjct: 1   MRKPVGALLHDRRAVRVPISGRNHLYKVSISLVFILWGLIFLFSLWISRGDGCQEGSILL 60

Query: 61  PADVSTSNESKLENNKDSDILYEPPKGETDSTIQLNDSCSIDATSPGSDNDILSSEESSS 120
           P  VST+NESKLENNKDSD+L EPP GE+  TI LN+SCSI+A+SPGSDN+ILSSEESSS
Sbjct: 61  PDGVSTTNESKLENNKDSDVLCEPPNGESHCTIHLNNSCSINASSPGSDNEILSSEESSS 120

Query: 121 HIRPATRLPEAESSSTGVKSENKPVKGDTSSETVLLGLEEFKSRAFISRSKSETGQAGNT 180
           HI+  TRLPE ESSST VK E+KP KGD SS+TVLLGLEEFKSRAF+SR KSETGQAGNT
Sbjct: 121 HIQATTRLPEDESSSTRVKPESKPPKGDISSDTVLLGLEEFKSRAFVSRGKSETGQAGNT 180

Query: 181 IHRVEPGGAEYNYASASKGAKVLAFNKEAKGASNILGRDKDKYLRNPCSAEEKFVVIELS 240
           IHR+EPGGAEYNYASASKGAKVLAFNKEAKGASNILG+DKDKYLRNPCSAEEKFVVIELS
Sbjct: 181 IHRLEPGGAEYNYASASKGAKVLAFNKEAKGASNILGKDKDKYLRNPCSAEEKFVVIELS 240

Query: 241 EETLVVTIEIANFEHHSSNLKEFEVHGSLVYPTDVWFKLGNFTAPNAKHAHRFVLKDPKW 300
           EETLVVTIEIANFEHHSSNLKEFEVHGSLVYPTDVWFKLGNFTAPNAKHAHRFVLKDPKW
Sbjct: 241 EETLVVTIEIANFEHHSSNLKEFEVHGSLVYPTDVWFKLGNFTAPNAKHAHRFVLKDPKW 300

Query: 301 VRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEMMLEDLISAQHKSSISDEATTEKRVIPS 360
           VRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEMMLEDLISAQHK SISDEAT +KRVIPS
Sbjct: 301 VRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEMMLEDLISAQHKPSISDEATPDKRVIPS 360

Query: 361 QSGPNDEGQHGRELQSLTTEESDDDVDLELTKSNIPDPVEESHHQQPGRMPGDTVLKILT 420
           Q GP DE  HGRELQSL  EE  D VDLEL+KSN PDPVEESHHQQPGRMPGDTVLKILT
Sbjct: 361 QPGPIDEVSHGRELQSLANEEGGDGVDLELSKSNTPDPVEESHHQQPGRMPGDTVLKILT 420

Query: 421 QKVRSLDLSLSVLEQYLEDLTSKYGNIFKEFDKDIENSDLLIEKTREDIRNILKIQDSTD 480
           QKVRSLDLSLSVLE+YLEDLTSKYGNIFKEFDKDI N++LLIEKT+EDIRNILKIQD+TD
Sbjct: 421 QKVRSLDLSLSVLERYLEDLTSKYGNIFKEFDKDIGNNNLLIEKTQEDIRNILKIQDNTD 480

Query: 481 KDLRDLISWKSIVSLQLDNLQRHNSILRSEIERVQKNQTSLENKGIVVFLLLEPAALDLK 540
           KDLRDLISWKS+VSLQLD LQRHNSILRSEIERVQKNQTSLENKGIV             
Sbjct: 481 KDLRDLISWKSMVSLQLDGLQRHNSILRSEIERVQKNQTSLENKGIV------------- 540

Query: 541 GQYFPPSIEGSVKGIVENKEGNGKDASSGVSRIKRYSKLKKIEEKLGRARAAIREAAQLH 600
                   E   K + ++KE NGK A  G+S+ K YSKLKK+EEKLGRARAAIR+A+QLH
Sbjct: 541 --------EEPQKTVAKDKEANGKSAIPGISKTKGYSKLKKLEEKLGRARAAIRKASQLH 600

Query: 601 NLTSIHHDPDYVPTGPIYRNPNAFHRSYLEMERLLKIYVYKEGEPPMFHQGPCKSIYSTE 660
           NLTSIHHDPDYVPTGPIYRNPNAFHRSYLEMERLLKIYVYKEGEPPMFH GPCKSIYSTE
Sbjct: 601 NLTSIHHDPDYVPTGPIYRNPNAFHRSYLEMERLLKIYVYKEGEPPMFHGGPCKSIYSTE 660

Query: 661 GRFIHEMEKGNLYTTNDPHQALLYFLPFSVVNLVQYLYVPNSHEVNAIGVAVSDYINVIS 720
           GRFIHEMEKGNLYTTNDP QALLYFLPFSVVNLVQYLYVPNSHEVNAIG A++DYINVIS
Sbjct: 661 GRFIHEMEKGNLYTTNDPDQALLYFLPFSVVNLVQYLYVPNSHEVNAIGRAITDYINVIS 720

Query: 721 NKHSFWNRSLGADHFMLSCHDWGPRTTSYVPLLFNNSIRVLCNANVSEGFRPSKDASFPE 780
            KH FW+RSLGADHFMLSCHDWGPRTTSYVPLLFNNSIRVLCNANVSEGF PSKDASFPE
Sbjct: 721 KKHPFWDRSLGADHFMLSCHDWGPRTTSYVPLLFNNSIRVLCNANVSEGFLPSKDASFPE 780

Query: 781 IHLRTGEIDGLLGGLSPSRRTVLAFFAGRLHGHIRYLLLQNWKEKDEDVLVYDELPSGIS 840
           IHLRTGEIDGL+GGLSPSRR+VLAFFAGRLHGHIRYLLLQ WKEKDEDVLVY+ELPSGIS
Sbjct: 781 IHLRTGEIDGLIGGLSPSRRSVLAFFAGRLHGHIRYLLLQEWKEKDEDVLVYEELPSGIS 840

Query: 841 YNSMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVPPFSDVLNWKSFAVQIQV 900
           YNSMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVPPFSDVLNWKSF+VQIQV
Sbjct: 841 YNSMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVPPFSDVLNWKSFSVQIQV 900

Query: 901 KDIPNIKEILRGISKTQYLRMQRRVKQVQKHFVLNGTPKRFDAFHMILHSIWLRRLNIHI 960
           KDIPNIK+IL+GIS+TQYLRMQRRVKQVQ+HFVLNGTPKRFDAFHMILHSIWLRRLNIHI
Sbjct: 901 KDIPNIKKILKGISQTQYLRMQRRVKQVQRHFVLNGTPKRFDAFHMILHSIWLRRLNIHI 941

Query: 961 QD 963
           QD
Sbjct: 961 QD 941
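The identity and positives percentages in each HSP summary line are the match counts divided by the alignment length. The sketch below parses the summary line of the hit above and recomputes both values. The `parse_hsp_stats` helper is hypothetical, and the pattern also accepts a misspelled "Postives" label in case the line is copied from a source that contains that typo.

```python
import re

def parse_hsp_stats(line):
    """Extract (matches, alignment length, percent) for identity and positives."""
    pattern = r"(Identity|Posi?tives)\s*=\s*(\d+)/(\d+)"
    stats = {}
    for label, matched, total in re.findall(pattern, line):
        key = "Identity" if label == "Identity" else "Positives"
        matched, total = int(matched), int(total)
        stats[key] = (matched, total, round(100.0 * matched / total, 2))
    return stats

line = "Identity = 846/962 (87.94%), Positives = 890/962 (92.52%), Query Frame = 0"
stats = parse_hsp_stats(line)
for key, (matched, total, percent) in stats.items():
    print(f"{key}: {matched}/{total} = {percent}%")
```

Note that the denominator is the alignment length including gap columns, so it can exceed the query length when the alignment contains gaps, as in the later hits with denominators of 1037, 1009, and 1039.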

BLAST of HG10003551 vs. NCBI nr
Match: KAE8648979.1 (hypothetical protein Csa_009042 [Cucumis sativus])

HSP 1 Score: 1662.5 bits (4304), Expect = 0.0e+00
Identity = 835/962 (86.80%), Positives = 887/962 (92.20%), Query Frame = 0

Query: 1   MRRPVGALLRDRRAVQVPTSGRNHLYKVSLSLVFILWGLIFLFSLWFSRGDGCQEGSVLH 60
           MR+PVGALL DRRAVQVP SGRNHLYKVS+SLVFILWGL+FLFSLWFS G GCQE S+L 
Sbjct: 1   MRKPVGALLHDRRAVQVPISGRNHLYKVSISLVFILWGLVFLFSLWFSHGVGCQEESILL 60

Query: 61  PADVSTSNESKLENNKDSDILYEPPKGETDSTIQLNDSCSIDATSPGSDNDILSSEESSS 120
           P  VST+NESKLENNKDSD+L EPP GE+  TI LN+SCSI+A++PGSDN++LSSEESSS
Sbjct: 61  PDGVSTTNESKLENNKDSDVLREPPNGESHCTIHLNNSCSINASTPGSDNEVLSSEESSS 120

Query: 121 HIRPATRLPEAESSSTGVKSENKPVKGDTSSETVLLGLEEFKSRAFISRSKSETGQAGNT 180
           HI+  TRLPE  SSST VK E+KP KGD SS+TVLLGLEEFKSRAF+S+ KSETGQAGNT
Sbjct: 121 HIQATTRLPEDGSSSTRVKPESKPPKGDISSDTVLLGLEEFKSRAFVSQGKSETGQAGNT 180

Query: 181 IHRVEPGGAEYNYASASKGAKVLAFNKEAKGASNILGRDKDKYLRNPCSAEEKFVVIELS 240
           IHR+EPGGAEYNYASASKGAKVLAFNKEAKGASNILG+DKDKYLRNPCSAEEKFVVIELS
Sbjct: 181 IHRLEPGGAEYNYASASKGAKVLAFNKEAKGASNILGKDKDKYLRNPCSAEEKFVVIELS 240

Query: 241 EETLVVTIEIANFEHHSSNLKEFEVHGSLVYPTDVWFKLGNFTAPNAKHAHRFVLKDPKW 300
           EETLVVTIEIANFEHHSSNLKEFEVHGSLVYPTDVWFKLGNFTAPNAKHAHRFVLKDPKW
Sbjct: 241 EETLVVTIEIANFEHHSSNLKEFEVHGSLVYPTDVWFKLGNFTAPNAKHAHRFVLKDPKW 300

Query: 301 VRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEMMLEDLISAQHKSSISDEATTEKRVIPS 360
           VRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEMMLEDLISAQHK SISDEAT +KRVIPS
Sbjct: 301 VRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEMMLEDLISAQHKPSISDEATHDKRVIPS 360

Query: 361 QSGPNDEGQHGRELQSLTTEESDDDVDLELTKSNIPDPVEESHHQQPGRMPGDTVLKILT 420
           Q GP DE  H RELQS+  EE DD VD+EL+KSN P+PVEESHHQQPGRMPGDTVLKILT
Sbjct: 361 QPGPIDEVSHRRELQSVANEEGDDGVDIELSKSNTPEPVEESHHQQPGRMPGDTVLKILT 420

Query: 421 QKVRSLDLSLSVLEQYLEDLTSKYGNIFKEFDKDIENSDLLIEKTREDIRNILKIQDSTD 480
           QKVRSLDLSLSVLE+YLEDLTSKYGNIFKEFDKDI N++LLIEKT+ DIRNILKIQD+TD
Sbjct: 421 QKVRSLDLSLSVLERYLEDLTSKYGNIFKEFDKDIGNNNLLIEKTQADIRNILKIQDTTD 480

Query: 481 KDLRDLISWKSIVSLQLDNLQRHNSILRSEIERVQKNQTSLENKGIVVFLLLEPAALDLK 540
           KDLRDLISWKS+VSLQLD LQRHNSILRSEIERVQKNQ SLENKG               
Sbjct: 481 KDLRDLISWKSMVSLQLDGLQRHNSILRSEIERVQKNQISLENKG--------------- 540

Query: 541 GQYFPPSIEGSVKGIVENKEGNGKDASSGVSRIKRYSKLKKIEEKLGRARAAIREAAQLH 600
                  IE S K + ++KE NGK A+ G+S+ +RYSKLKK+EEKLGRARAAIREA+Q+H
Sbjct: 541 -------IEESQKTVAKDKEANGKSATPGISKTERYSKLKKLEEKLGRARAAIREASQIH 600

Query: 601 NLTSIHHDPDYVPTGPIYRNPNAFHRSYLEMERLLKIYVYKEGEPPMFHQGPCKSIYSTE 660
           NLTSIHHDPDYVPTGPIYRNPNAFHRSY+EME+LLKIYVYKEGEPPMFH GPCKSIYSTE
Sbjct: 601 NLTSIHHDPDYVPTGPIYRNPNAFHRSYIEMEKLLKIYVYKEGEPPMFHGGPCKSIYSTE 660

Query: 661 GRFIHEMEKGNLYTTNDPHQALLYFLPFSVVNLVQYLYVPNSHEVNAIGVAVSDYINVIS 720
           GRFIHEMEKGNLYTTNDP QALLYFLPFSVVNLVQYLYVPNSHEVNAIG A++DYINVIS
Sbjct: 661 GRFIHEMEKGNLYTTNDPDQALLYFLPFSVVNLVQYLYVPNSHEVNAIGTAITDYINVIS 720

Query: 721 NKHSFWNRSLGADHFMLSCHDWGPRTTSYVPLLFNNSIRVLCNANVSEGFRPSKDASFPE 780
           NKH FW+RSLGADHFMLSCHDWGPRTTS+VPLLFNNSIRVLCNANVSEGFRPSKDASFPE
Sbjct: 721 NKHPFWDRSLGADHFMLSCHDWGPRTTSFVPLLFNNSIRVLCNANVSEGFRPSKDASFPE 780

Query: 781 IHLRTGEIDGLLGGLSPSRRTVLAFFAGRLHGHIRYLLLQNWKEKDEDVLVYDELPSGIS 840
           IHLRTGEIDGLLGGLSPSRR+VLAFFAGRLHGHIRYLLLQ WKEKDEDVLVYDELPSGIS
Sbjct: 781 IHLRTGEIDGLLGGLSPSRRSVLAFFAGRLHGHIRYLLLQEWKEKDEDVLVYDELPSGIS 840

Query: 841 YNSMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVPPFSDVLNWKSFAVQIQV 900
           Y+SMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVPPFSDVLNW SFAVQIQV
Sbjct: 841 YDSMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVPPFSDVLNWNSFAVQIQV 900

Query: 901 KDIPNIKEILRGISKTQYLRMQRRVKQVQKHFVLNGTPKRFDAFHMILHSIWLRRLNIHI 960
           KDIPNIK+IL GIS+TQYLRMQRRVKQVQ+HFVLNGTPKRFDAFHMILHSIWLRRLNIHI
Sbjct: 901 KDIPNIKKILNGISQTQYLRMQRRVKQVQRHFVLNGTPKRFDAFHMILHSIWLRRLNIHI 940

Query: 961 QD 963
           QD
Sbjct: 961 QD 940

BLAST of HG10003551 vs. NCBI nr
Match: KAG6592335.1 (putative glycosyltransferase, partial [Cucurbita argyrosperma subsp. sororia])

HSP 1 Score: 1639.4 bits (4244), Expect = 0.0e+00
Identity = 845/1037 (81.49%), Positives = 891/1037 (85.92%), Query Frame = 0

Query: 1    MRRPVGALLRDRRAVQVPTSGRNHLYKVSLSLVFILWGLIFLFSLWFSRGDGCQEGSVLH 60
            MRR VGALLRDRRAV+V  SGRNHL KVSLSLVF+LWGLIFLFSLWF RGDGCQEGSVL 
Sbjct: 1    MRRRVGALLRDRRAVEVSISGRNHLNKVSLSLVFVLWGLIFLFSLWFIRGDGCQEGSVLL 60

Query: 61   PADVSTSNESKLENNKDS----------------------------DILYEPPKGETDST 120
            P   S SNES LE+NKDS                            D+LYEP KGETD T
Sbjct: 61   PDGASNSNESTLESNKDSDVLYEPSKGETDCTSHLNDSCSIDATSHDVLYEPSKGETDCT 120

Query: 121  IQLNDSCSIDATSPGSDNDILSSEESSSHIRPATRLPEAESSSTGVKSENKPVKGDTSSE 180
             +LNDSCSIDATS  SDN++LSSEESSSH+  AT LPEAESSSTGVKSE+KP+K D SS+
Sbjct: 121  SRLNDSCSIDATSQASDNEMLSSEESSSHVLAATGLPEAESSSTGVKSESKPLKVDISSD 180

Query: 181  TVLLGLEEFKSRAFISRSKSETGQAGNTIHRVEPGGAEYNYASASKGAKVLAFNKEAKGA 240
            TVLLGLEEFKSR F SR+K ETGQAGNTIHRVEPGGAEYNYASASKGAKVLAFNKEAKGA
Sbjct: 181  TVLLGLEEFKSRVFTSRTKDETGQAGNTIHRVEPGGAEYNYASASKGAKVLAFNKEAKGA 240

Query: 241  SNILGRDKDKYLRNPCSAEEKFVVIELSEETLVVTIEIANFEHHSSNLKEFEVHGSLVYP 300
            SNILG+DKDKYLRNPCSAEEKFVVIELSEETLVVTIEIANFEHHSSNLKEFE+HGSLVYP
Sbjct: 241  SNILGKDKDKYLRNPCSAEEKFVVIELSEETLVVTIEIANFEHHSSNLKEFELHGSLVYP 300

Query: 301  TDVWFKLGNFTAPNAKHAHRFVLKDPKWVRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVE 360
            TDVWFKLGNFTAPNAKHAHRFVLKDPKWVRYLKLN LTHYGSEFYCTLSTVEVYGMDAVE
Sbjct: 301  TDVWFKLGNFTAPNAKHAHRFVLKDPKWVRYLKLNLLTHYGSEFYCTLSTVEVYGMDAVE 360

Query: 361  MMLEDLISAQHKSSISDEATTEKRVIPSQSGPNDEG-QHGRELQSLTTEES-DDDVDLEL 420
            MMLEDLISAQHK SISDEAT +KRV PSQ GPND G QH RE QSL  EES DDDV LEL
Sbjct: 361  MMLEDLISAQHKPSISDEATIDKRVTPSQPGPNDVGQQHRRESQSLANEESDDDDVVLEL 420

Query: 421  TKSNIPDPVEESHHQQPGRMPGDTVLKILTQKVRSLDLSLSVLEQYLEDLTSKYGNIFKE 480
            +KSNIPDPVEESHHQQPGRMPGDTVLKILTQKVRSLD SLSVLE+YLED TSKYGNIFKE
Sbjct: 421  SKSNIPDPVEESHHQQPGRMPGDTVLKILTQKVRSLDRSLSVLERYLEDSTSKYGNIFKE 480

Query: 481  FDKDIENSDLLIEKTREDIRNILKIQDSTDKDLRDLISWKSIVSLQLDNLQRHNSILRSE 540
            FDKDI N+ LLIEKTREDIRNILK+QDSTDKDL DLISWKS VSLQLD LQRHN+ILRSE
Sbjct: 481  FDKDIGNNGLLIEKTREDIRNILKVQDSTDKDLHDLISWKSTVSLQLDGLQRHNAILRSE 540

Query: 541  IERVQKNQTSLENKGIVVFL---------------------------------------- 600
            IERVQKNQT LENKGIVVF+                                        
Sbjct: 541  IERVQKNQTFLENKGIVVFVVCIIFSWFAILRLFLHIVVRVLCKLDLEIPWTTGLDKVFS 600

Query: 601  -----LLEPAALDLKGQYFPPSIEGSVKGIVENKEGNGKDASSGVSRIKRYSKLKKIEEK 660
                 LL+PAALDLKG  F   IEGS   + ENKE  GKDA+ G+SR++RYSKL+KIEEK
Sbjct: 601  SFAPHLLDPAALDLKGHSFSSPIEGSQTTVPENKEHKGKDATPGISRVERYSKLEKIEEK 660

Query: 661  LGRARAAIREAAQLHNLTSIHHDPDYVPTGPIYRNPNAFHRSYLEMERLLKIYVYKEGEP 720
            LGRARAAIREA ++ NLTS+H DPDYVP GPIYRNPNAFHRSYLEMERLLKIY+YKEGEP
Sbjct: 661  LGRARAAIREAGRVRNLTSVHDDPDYVPRGPIYRNPNAFHRSYLEMERLLKIYIYKEGEP 720

Query: 721  PMFHQGPCKSIYSTEGRFIHEMEKGNLYTTNDPHQALLYFLPFSVVNLVQYLYVPNSHEV 780
            PMFH+GPCKSIYSTEGRFIHEMEKGN YTTNDP QALLYFLPFSVVNLVQYLY PNSH+V
Sbjct: 721  PMFHEGPCKSIYSTEGRFIHEMEKGNSYTTNDPDQALLYFLPFSVVNLVQYLYEPNSHDV 780

Query: 781  NAIGVAVSDYINVISNKHSFWNRSLGADHFMLSCHDWGPRTTSYVPLLFNNSIRVLCNAN 840
            NAIGVAV DYI+VISNKHSFWNRSLGADHFMLSCHDWGPRTTSYVP LFNNSIRVLCNAN
Sbjct: 781  NAIGVAVQDYIDVISNKHSFWNRSLGADHFMLSCHDWGPRTTSYVPYLFNNSIRVLCNAN 840

Query: 841  VSEGFRPSKDASFPEIHLRTGEIDGLLGGLSPSRRTVLAFFAGRLHGHIRYLLLQNWKEK 900
            VSEGF PSKDASFPEIHLRTGEIDGLLGGLSPSRR +LAFFAGRLHGHIRYLLLQ WKEK
Sbjct: 841  VSEGFHPSKDASFPEIHLRTGEIDGLLGGLSPSRRPILAFFAGRLHGHIRYLLLQKWKEK 900

Query: 901  DEDVLVYDELPSGISYNSMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVPPF 960
            D+DV+VYDELPSG+SY SMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVPPF
Sbjct: 901  DDDVVVYDELPSGVSYESMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVPPF 960

Query: 961  SDVLNWKSFAVQIQVKDIPNIKEILRGISKTQYLRMQRRVKQVQKHFVLNGTPKRFDAFH 963
            SDVLNW SF VQI+VKDI NIKEILRGIS++QYLRMQRRVKQVQ+HFV+NGTPKR+DAFH
Sbjct: 961  SDVLNWNSFGVQIEVKDIGNIKEILRGISQSQYLRMQRRVKQVQRHFVINGTPKRYDAFH 1020

BLAST of HG10003551 vs. NCBI nr
Match: KAF3440963.1 (hypothetical protein FNV43_RR19249 [Rhamnella rubrinervis])

HSP 1 Score: 1210.3 bits (3130), Expect = 0.0e+00
Identity = 634/1009 (62.83%), Positives = 766/1009 (75.92%), Query Frame = 0

Query: 1    MRRPVGALLRDRRAVQVPTSGRNHLYKVSLSLVFILWGLIFLFSLWFSRGDGCQEGSVLH 60
            M+R   ALL+ RRA++   +GRN    VSLSL F+LWGL+FLFSLW S GDG  +G V  
Sbjct: 1    MQRSRRALLQ-RRALEKVITGRNSKCMVSLSLFFVLWGLVFLFSLWISLGDGFTDGDVGL 60

Query: 61   PADVSTSNESKLENNKDSDILYEPPKGETDSTIQLNDSCSIDATSPGS----------DN 120
               +ST NE+KL++ K SD     P  ETD+ +  +D  S +  +P S          +N
Sbjct: 61   AVGISTWNETKLDHGKHSDSGDVHPLKETDA-VHSSDRLSTNGVTPSSISSELLDVEGEN 120

Query: 121  DILSSEESSSHIRPATRLPEAESSSTGVKSENKPVKGDTSSETVLLGLEEFKSRAFISRS 180
            D  S+E S +++    + PE ESSS+  K EN   K D  S  V +GL+EFKSR + ++S
Sbjct: 121  DYASAEGSKNYVSDVVKQPEVESSSSFTKLENDSPKNDRLSHAVPVGLDEFKSRTYSTKS 180

Query: 181  KSETGQAGNTIHRVEPGGAEYNYASASKGAKVLAFNKEAKGASNILGRDKDKYLRNPCSA 240
            KS  G AG   HRVEPGGAEYNYAS SKGAKVLAFNKE+KGASNILGRD+DKYLRNPCS 
Sbjct: 181  KSGIGPAGVIKHRVEPGGAEYNYASVSKGAKVLAFNKESKGASNILGRDEDKYLRNPCSV 240

Query: 241  EEKFVVIELSEETLVVTIEIANFEHHSSNLKEFEVHGSLVYPTDVWFKLGNFTAPNAKHA 300
            E KFV+IELSEETLV TIEIANFEH+SSNLK+FE+ GSLVYPTD W KLGNFTAPN K A
Sbjct: 241  EGKFVIIELSEETLVDTIEIANFEHYSSNLKDFELLGSLVYPTDQWVKLGNFTAPNVKLA 300

Query: 301  HRFVLKDPKWVRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEMMLEDLISAQHKSSISDE 360
             RFVL++PKWVRYLKLN L+HYGSEFYCTLS VEV+G+DAVE MLEDLIS Q    +S  
Sbjct: 301  QRFVLQEPKWVRYLKLNLLSHYGSEFYCTLSVVEVFGVDAVERMLEDLISVQDNVFVSAG 360

Query: 361  ATTEKRVIPSQ----SGPNDEGQHGRELQSLTTEESDDDVDLELTKSNIPDPVEESHHQQ 420
             T +++ + SQ     G +      +E+ S  T   + +V+ E+ KS++PDPVEE+ HQQ
Sbjct: 361  PTGDQKPMSSQPVSPEGDDSSQNMNKEMDSHAT-TGNSNVNHEILKSDVPDPVEEARHQQ 420

Query: 421  PGRMPGDTVLKILTQKVRSLDLSLSVLEQYLEDLTSKYGNIFKEFDKDIENSDLLIEKTR 480
             GRMPGDTV+KIL QKVR+LD++LSVLE+YLE+LTS+YGNIFKE DKDI + D+L+EK R
Sbjct: 421  AGRMPGDTVIKILMQKVRALDINLSVLERYLEELTSRYGNIFKEIDKDIGDKDILLEKIR 480

Query: 481  EDIRNILKIQDSTDKDLRDLISWKSIVSLQLDNLQRHNSILRSEIERVQKNQTSLENKGI 540
             D+RN+L  Q S  K++ DL+SWKS+VS QLD+L R N+ILR E+E+V++ Q S+E K +
Sbjct: 481  ADVRNLLDSQGSIAKEVDDLVSWKSLVSFQLDSLVRDNAILRLEVEKVREKQNSIEKKNV 540

Query: 541  VVFLLLE---PA--ALDLKGQYF--------------------PPSIEGS-------VKG 600
            V+FL      P+  A +  G Y+                     P I  S       +  
Sbjct: 541  VIFLAKSFSWPSWRADNFLGTYYRSPVVFSSERRSDQLLVASEAPLISSSKPNETVLLPQ 600

Query: 601  IVENKEGNGKDAS-SGVSRIKRYSKLKKIEEKLGRARAAIREAAQLHNLTSIHHDPDYVP 660
            ++E+K+   ++   S    IKRYSKL+K+E  L RAR +I+EAAQ+ NLTSIH D DYVP
Sbjct: 601  VLEDKQEQIRNIEISETKVIKRYSKLEKLEASLARARFSIKEAAQVRNLTSIHEDSDYVP 660

Query: 661  TGPIYRNPNAFHRSYLEMERLLKIYVYKEGEPPMFHQGPCKSIYSTEGRFIHEMEKGNLY 720
             GPIYRN NAFH SYLEME+L KIYVY+EG+PP+FH GPCKSIYSTEGRFIHEMEKGN +
Sbjct: 661  QGPIYRNANAFHWSYLEMEKLFKIYVYREGDPPIFHNGPCKSIYSTEGRFIHEMEKGNKF 720

Query: 721  TTNDPHQALLYFLPFSVVNLVQYLYVPNSHEVNAIGVAVSDYINVISNKHSFWNRSLGAD 780
             T DP +AL+YFLPFSVV +V+YLY P+SH+  AI +A++DYINVIS+KH FWNRSLGAD
Sbjct: 721  RTLDPDEALVYFLPFSVVMMVRYLYAPDSHDTKAIKLAITDYINVISDKHPFWNRSLGAD 780

Query: 781  HFMLSCHDWGPRTTSYVPLLFNNSIRVLCNANVSEGFRPSKDASFPEIHLRTGEIDGLLG 840
            HFMLSCHDWGP T+SYVP LF+ SIRVLCNAN SEGF PSKD SFPEIHLRTGEI GL+G
Sbjct: 781  HFMLSCHDWGPVTSSYVPRLFSKSIRVLCNANTSEGFNPSKDVSFPEIHLRTGEIKGLVG 840

Query: 841  GLSPSRRTVLAFFAGRLHGHIRYLLLQNWKEKDEDVLVYDELPSGISYNSMLKKSRFCLC 900
            G SPSRR++LAFFAGRLHGHIRYLLL+ WKEKD+DV VYD+LPSG+SY SMLKKS+FCLC
Sbjct: 841  GFSPSRRSILAFFAGRLHGHIRYLLLEQWKEKDQDVQVYDQLPSGVSYESMLKKSKFCLC 900

Query: 901  PSGYEVASPRVVEAIYAECVPVLISESYVPPFSDVLNWKSFAVQIQVKDIPNIKEILRGI 960
            PSGYEVASPRVVEAIYAECVPVLIS+ YVPPFSDVLNW+SF+VQ+QVKDIPNIK+IL GI
Sbjct: 901  PSGYEVASPRVVEAIYAECVPVLISDGYVPPFSDVLNWRSFSVQVQVKDIPNIKKILMGI 960

Query: 961  SKTQYLRMQRRVKQVQKHFVLNGTPKRFDAFHMILHSIWLRRLNIHIQD 963
            S++QYLRM RRVKQVQ+HFV NG PKRFD FHMI+HSIWLRRLN+ I++
Sbjct: 961  SQSQYLRMHRRVKQVQRHFVANGPPKRFDVFHMIVHSIWLRRLNVRIEN 1006

BLAST of HG10003551 vs. NCBI nr
Match: TQD89737.1 (hypothetical protein C1H46_024731 [Malus baccata])

HSP 1 Score: 1160.2 bits (3000), Expect = 0.0e+00
Identity = 630/1039 (60.64%), Positives = 749/1039 (72.09%), Query Frame = 0

Query: 1    MRRPVGALLRDRRAVQVPTSGRNHLYKVSLSLVFILWGLIFLFSLWFSRGDGCQEGSVLH 60
            M+R   ALL +RRA+ +  SGR+ LYKVSLSLVF+LWGL+FLFSLWFSRG G ++GS + 
Sbjct: 1    MQRSRKALL-NRRALGI--SGRSRLYKVSLSLVFVLWGLVFLFSLWFSRGHGYKDGSTVS 60

Query: 61   PADVSTSNESKLENNKDSDILYEPPKGETD--------STIQLN------DSCSIDATSP 120
            P  +ST +E+KL+ ++  DI  E   G +          T  LN      +     A++ 
Sbjct: 61   PVGISTWDEAKLDRDEQYDIQKETDLGYSSGGECTNGVETGGLNGEFFAMEGSKQHASTE 120

Query: 121  GSDNDIL--------SSEESSSHIRPATRLPEAESSSTGVKSENKPVKGDTSSETVLLGL 180
            GS    L        S+E S  H       PE  ++ +GVK EN   K       V LGL
Sbjct: 121  GSRQQDLAEGSLHHASTEGSIFHDSAVDEQPEVVTAGSGVKLENDAPKNGRLPRAVPLGL 180

Query: 181  EEFKSRAFISRSKSETGQAGNTIHRVEPGGAEYNYASASKGAKVLAFNKEAKGASNILGR 240
            +EFKS+   S+SKS  GQAG   HRVEPGGAEYNYASA+KGAKVLAFNKEAKGASNILG+
Sbjct: 181  DEFKSKTCSSKSKSGNGQAGGIKHRVEPGGAEYNYASAAKGAKVLAFNKEAKGASNILGK 240

Query: 241  DKDKYLRNPCSAEEKFVVIELSEETLVVTIEIANFEHHSSNLKEFEVHGSLVYPTDVWFK 300
            DKDKYLRNPCSAE KFV IELSEETLV TIEIAN EH+SSNLK+FEV GSL YPT+ W  
Sbjct: 241  DKDKYLRNPCSAEGKFVDIELSEETLVDTIEIANLEHYSSNLKDFEVLGSLTYPTNEWVF 300

Query: 301  LGNFTAPNAKHAHRFVLKDPKWVRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEMMLEDL 360
            LGN TA N K   RFVL+ PKWVRY+KL  L+HYGSEFYCTLS +E+YG+DAVE MLEDL
Sbjct: 301  LGNVTAANNKLVQRFVLQQPKWVRYIKLKLLSHYGSEFYCTLSIIELYGVDAVERMLEDL 360

Query: 361  ISAQHKSSISDEATTEKRVIPS--QSGPNDEGQHGRELQSLTTEESD-DDVDLELTKSNI 420
            IS +  S +S+ AT +++ +PS   S   DE  H    +S     +   +V+ ++  S +
Sbjct: 361  ISVEGSSFVSEGATVDQKPVPSHPDSPEVDEFFHDIVKESEPQYAAGVSNVNNDMLNSEV 420

Query: 421  PDPVEESHHQQPGRMPGDTVLKILTQKVRSLDLSLSVLEQYLEDLTSKYGNIFKEFDKDI 480
            PD V+E HHQQ  RMPGDTVLKIL QKVRSLD SLSVLE+YLE+ TSKYG+IF EFDKD+
Sbjct: 421  PDAVKEVHHQQVNRMPGDTVLKILMQKVRSLDFSLSVLERYLEESTSKYGSIFGEFDKDL 480

Query: 481  ENSDLLIEKTREDIRNILKIQDSTDKDLRDLISWKSIVSLQLDNLQRHNSILRSEIERVQ 540
               D  ++K REDIRN+++ Q+   KD+ +LISW+S+V++QL+NL R N+ILRSE+E+V+
Sbjct: 481  GEKDTDLQKIREDIRNLVQSQEVIAKDVHNLISWQSLVTMQLNNLVRDNAILRSEVEKVR 540

Query: 541  KNQTSLENKGIVVFL--------------LLEPAALDLKGQYF--------PPSI----- 600
            + Q S++NK +V  L              +LE          F        PP +     
Sbjct: 541  EKQISVDNKVLVCSLGVTSSPSWSWKFGNVLETEDYSSSSSAFSATATPPRPPQVLEAAK 600

Query: 601  --------------------EGSVKGIV-----ENKEGNGKDASSGVSRIKRYSKLKKIE 660
                                 G  +G+V     E+ E NG  A    + IKRYS+L+K+E
Sbjct: 601  QGHNITSSSKANETVVPRQNIGEKQGMVWINGTESDEINGTSAIITSTPIKRYSRLEKLE 660

Query: 661  EKLGRARAAIREAAQLHNLTSIHHDPDYVPTGPIYRNPNAFHRSYLEMERLLKIYVYKEG 720
              L   RA+IREAA++ NLTS H DPDYVP GPIYRN NAFHRSYL+ME+  KIYVY+EG
Sbjct: 661  ANLAGVRASIREAARVRNLTSTHEDPDYVPRGPIYRNANAFHRSYLKMEKHFKIYVYEEG 720

Query: 721  EPPMFHQGPCKSIYSTEGRFIHEMEKGNLYTTNDPHQALLYFLPFSVVNLVQYLYVPNSH 780
            EPP+FH GPCKSIYSTEGRFIHEME  N+Y T DP QAL+YFLPFSVV LVQYLYV +SH
Sbjct: 721  EPPIFHNGPCKSIYSTEGRFIHEMEMENIYRTRDPDQALVYFLPFSVVMLVQYLYVADSH 780

Query: 781  EVNAIGVAVSDYINVISNKHSFWNRSLGADHFMLSCHDWGPRTTSYVPLLFNNSIRVLCN 840
            +   IG AV DY+NVIS+KH FWNRSLGADHFMLSCHDWGP T++YVP L+ NSIRVLCN
Sbjct: 781  DTQPIGRAVVDYVNVISDKHPFWNRSLGADHFMLSCHDWGPSTSAYVPHLYQNSIRVLCN 840

Query: 841  ANVSEGFRPSKDASFPEIHLRTGEIDGLLGGLSPSRRTVLAFFAGRLHGHIRYLLLQNWK 900
            AN SEGF PSKD SFPEIHLRTGE  GLLGGLSPSRR++LAFFAGRLHGHIRYLLL  WK
Sbjct: 841  ANTSEGFNPSKDVSFPEIHLRTGETKGLLGGLSPSRRSILAFFAGRLHGHIRYLLLNEWK 900

Query: 901  EKDEDVLVYDELPSGISYNSMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVP 960
            EKD+DV VYD+LP+G+SY SMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVL+S+SYVP
Sbjct: 901  EKDQDVQVYDQLPNGVSYESMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLVSDSYVP 960

Query: 961  PFSDVLNWKSFAVQIQVKDIPNIKEILRGISKTQYLRMQRRVKQVQKHFVLNGTPKRFDA 963
            PFSDVL WKSF+VQ+QVKDIPNIK IL GIS++QYLRMQRRVKQVQ+HFV+NG  KRFD 
Sbjct: 961  PFSDVLEWKSFSVQVQVKDIPNIKRILMGISQSQYLRMQRRVKQVQRHFVVNGPSKRFDV 1020

BLAST of HG10003551 vs. ExPASy Swiss-Prot
Match: Q9FFN2 (Probable glycosyltransferase At5g03795 OS=Arabidopsis thaliana OX=3702 GN=At5g03795 PE=3 SV=2)

HSP 1 Score: 496.9 bits (1278), Expect = 5.2e-139
Identity = 231/386 (59.84%), Positives = 296/386 (76.68%), Query Frame = 0

Query: 577 SKLKKIEEKLGRARAAIREAAQLHNLTSIHHDPDYVPTGPIYRNPNAFHRSYLEMERLLK 636
           S L+KIE KL +ARA+I+ A    ++     DPDYVP GP+Y N   FHRSYLEME+  K
Sbjct: 136 SNLEKIEFKLQKARASIKAA----SMDDPVDDPDYVPLGPMYWNAKVFHRSYLEMEKQFK 195

Query: 637 IYVYKEGEPPMFHQGPCKSIYSTEGRFIHEMEKGNLYTTNDPHQALLYFLPFSVVNLVQY 696
           IYVYKEGEPP+FH GPCKSIYS EG FI+E+E    + TN+P +A +++LPFSVV +V+Y
Sbjct: 196 IYVYKEGEPPLFHDGPCKSIYSMEGSFIYEIETDTRFRTNNPDKAHVFYLPFSVVKMVRY 255

Query: 697 LYVPNSHEVNAIGVAVSDYINVISNKHSFWNRSLGADHFMLSCHDWGPRTTSYVPLLFNN 756
           +Y  NS + + I   V DYIN++ +K+ +WNRS+GADHF+LSCHDWGP  +   P L +N
Sbjct: 256 VYERNSRDFSPIRNTVKDYINLVGDKYPYWNRSIGADHFILSCHDWGPEASFSHPHLGHN 315

Query: 757 SIRVLCNANVSEGFRPSKDASFPEIHLRTGEIDGLLGGLSPSRRTVLAFFAGRLHGHIRY 816
           SIR LCNAN SE F+P KD S PEI+LRTG + GL+GG SPS R +LAFFAG +HG +R 
Sbjct: 316 SIRALCNANTSERFKPRKDVSIPEINLRTGSLTGLVGGPSPSSRPILAFFAGGVHGPVRP 375

Query: 817 LLLQNWKEKDEDVLVYDELPSGISYNSMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVL 876
           +LLQ+W+ KD D+ V+  LP G SY+ M++ S+FC+CPSGYEVASPR+VEA+Y+ CVPVL
Sbjct: 376 VLLQHWENKDNDIRVHKYLPRGTSYSDMMRNSKFCICPSGYEVASPRIVEALYSGCVPVL 435

Query: 877 ISESYVPPFSDVLNWKSFAVQIQVKDIPNIKEILRGISKTQYLRMQRRVKQVQKHFVLNG 936
           I+  YVPPFSDVLNW+SF+V + V+DIPN+K IL  IS  QYLRM RRV +V++HF +N 
Sbjct: 436 INSGYVPPFSDVLNWRSFSVIVSVEDIPNLKTILTSISPRQYLRMYRRVLKVRRHFEVNS 495

Query: 937 TPKRFDAFHMILHSIWLRRLNIHIQD 963
             KRFD FHMILHSIW+RRLN+ I++
Sbjct: 496 PAKRFDVFHMILHSIWVRRLNVKIRE 517

BLAST of HG10003551 vs. ExPASy Swiss-Prot
Match: Q9SSE8 (Probable glycosyltransferase At3g07620 OS=Arabidopsis thaliana OX=3702 GN=At3g07620 PE=3 SV=1)

HSP 1 Score: 470.3 bits (1209), Expect = 5.2e-131
Identity = 229/407 (56.27%), Positives = 295/407 (72.48%), Query Frame = 0

Query: 557 ENKEGNGKDASSGVSRIKRYSKLKKIEEKLGRARAAIREAAQLHNLT--SIHHDPDYVPT 616
           E ++ NG +  SG      + +  K+E +L  AR  IREA   ++ T  S   D DYVP 
Sbjct: 68  EKRKRNGSNPGSGY-----WKRDGKVEAELATARVLIREAQLNYSSTTSSPLGDEDYVPH 127

Query: 617 GPIYRNPNAFHRSYLEMERLLKIYVYKEGEPPMFHQGPCKSIYSTEGRFIHEMEKGNL-Y 676
           G IYRNP AFHRSYL ME++ KIYVY+EG+PP+FH G CK IYS EG F++ ME   L Y
Sbjct: 128 GDIYRNPYAFHRSYLLMEKMFKIYVYEEGDPPIFHYGLCKDIYSMEGLFLNFMENDVLKY 187

Query: 677 TTNDPHQALLYFLPFSVVNLVQYLYVPNSHEVNAIGVAVSDYINVISNKHSFWNRSLGAD 736
            T DP +A +YFLPFSVV ++ +L+ P   +   +   ++DY+ +IS K+ +WN S G D
Sbjct: 188 RTRDPDKAHVYFLPFSVVMILHHLFDPVVRDKAVLERVIADYVQIISKKYPYWNTSDGFD 247

Query: 737 HFMLSCHDWGPRTTSYVPLLFNNSIRVLCNANVSEGFRPSKDASFPEIHLRTGEIDGLLG 796
           HFMLSCHDWG R T YV  LF NSIRVLCNAN+SE F P KDA FPEI+L TG+I+ L G
Sbjct: 248 HFMLSCHDWGHRATWYVKKLFFNSIRVLCNANISEYFNPEKDAPFPEINLLTGDINNLTG 307

Query: 797 GLSPSRRTVLAFFAGRLHGHIRYLLLQNWKEKDEDVLVYDELPSGISYNSMLKKSRFCLC 856
           GL P  RT LAFFAG+ HG IR +LL +WKEKD+D+LVY+ LP G+ Y  M++KSRFC+C
Sbjct: 308 GLDPISRTTLAFFAGKSHGKIRPVLLNHWKEKDKDILVYENLPDGLDYTEMMRKSRFCIC 367

Query: 857 PSGYEVASPRVVEAIYAECVPVLISESYVPPFSDVLNWKSFAVQIQVKDIPNIKEILRGI 916
           PSG+EVASPRV EAIY+ CVPVLISE+YV PFSDVLNW+ F+V + VK+IP +K IL  I
Sbjct: 368 PSGHEVASPRVPEAIYSGCVPVLISENYVLPFSDVLNWEKFSVSVSVKEIPELKRILMDI 427

Query: 917 SKTQYLRMQRRVKQVQKHFVLNGTPKRFDAFHMILHSIWLRRLNIHI 961
            + +Y+R+   VK+V++H ++N  PKR+D F+MI+HSIWLRRLN+ +
Sbjct: 428 PEERYMRLYEGVKKVKRHILVNDPPKRYDVFNMIIHSIWLRRLNVKL 469

BLAST of HG10003551 vs. ExPASy Swiss-Prot
Match: F4I8I0 (SUN domain-containing protein 4 OS=Arabidopsis thaliana OX=3702 GN=SUN4 PE=1 SV=1)

HSP 1 Score: 433.3 bits (1113), Expect = 7.0e-120
Identity = 267/571 (46.76%), Positives = 347/571 (60.77%), Query Frame = 0

Query: 1   MRRPVGALLRDRRAVQVPTSGRNHLYKVSLSLVFILWGLIFLFSLWFSRGDGCQEGSVLH 60
           M+R   ALL  RR  +  ++GRN  YKVSLSLVF++WGL+FL +LW S  DG +  S++ 
Sbjct: 1   MQRSRRALLVRRRVSETTSNGRNRFYKVSLSLVFLIWGLVFLSTLWISHVDGDKGRSLV- 60

Query: 61  PADVSTSNESKLENNKDSDILYEPPKGETDSTIQLNDSCSIDATS----PG--SDNDILS 120
                           DS    EP     D T +  D+ S+++TS    PG  SD DI +
Sbjct: 61  ----------------DSVEKGEPDDERADETAESVDATSLESTSVHSNPGLSSDVDIAA 120

Query: 121 SEESSSHIRPATRLPEAESSSTGV-------KSENKPVKG-------------------- 180
           + ES       T L + E  +T V         +N P+K                     
Sbjct: 121 AGESKG---SETILKQLEVDNTIVIVGNVTESKDNVPMKQSEINNNTVPGNDTETTGSKL 180

Query: 181 DTSSETVLLGLEEFKSRAFISRSKSETGQAGNTIHRVEPGGAEYNYASASKGAKVLAFNK 240
           D  S  V LGL+EFKSRA  SR KS +GQ    IHR+EPGG EYNYA+ASKGAKVL+ NK
Sbjct: 181 DQLSRAVPLGLDEFKSRASNSRDKSLSGQVTGVIHRMEPGGKEYNYAAASKGAKVLSSNK 240

Query: 241 EAKGASNILGRDKDKYLRNPCSAEEKFVVIELSEETLVVTIEIANFEHHSSNLKEFEVHG 300
           EAKGAS+I+ RDKDKYLRNPCS E KFVVIELSEETLV TI+IANFEH+SSNLK+FE+ G
Sbjct: 241 EAKGASSIICRDKDKYLRNPCSTEGKFVVIELSEETLVNTIKIANFEHYSSNLKDFEILG 300

Query: 301 SLVYPTDVWFKLGNFTAPNAKHAHRFVLKDPKWVRYLKLNFLTHYGSEFYCTLSTVEVYG 360
           +LVYPTD W  LGNFTA N KH   F   DPKWVRYLKLN L+HYGSEFYCTLS +EVYG
Sbjct: 301 TLVYPTDTWVHLGNFTALNMKHEQNFTFADPKWVRYLKLNLLSHYGSEFYCTLSLLEVYG 360

Query: 361 MDAVEMMLEDLISAQHKSSI----SDEATTEKRVIPSQ---SGPNDEGQHGRELQSLTTE 420
           +DAVE MLEDLIS Q K+ +     D    EK+ + ++       D+ +   + Q  + E
Sbjct: 361 VDAVERMLEDLISIQDKNILKLQEGDTEQKEKKTMQAKESFESDEDKSKQKEKEQEASPE 420

Query: 421 ESDDDVDLELTKSNIPDPVEESHHQQPGRMPGDTVLKILTQKVRSLDLSLSVLEQYLEDL 480
            +    ++ L K  +PDPVEE  HQ   RMPGDTVLKIL QK+RSLD+SLSVLE YLE+ 
Sbjct: 421 NAVVKDEVSLEKRKLPDPVEEIKHQPGSRMPGDTVLKILMQKIRSLDVSLSVLESYLEER 480

Query: 481 TSKYGNIFKEFDKDIENSDLLIEKTREDIRNILKIQDSTDKDLRDLISWKSIVSLQLDNL 532
           + KYG IFKE D +    +  +E  R ++  + + +++T K+  ++  W+  V  +L+  
Sbjct: 481 SLKYGMIFKEMDLEASKREKEVETMRLEVEGMKEREENTKKEAMEMRKWRMRVETELEKA 540

BLAST of HG10003551 vs. ExPASy Swiss-Prot
Match: Q3E7Q9 (Probable glycosyltransferase At5g25310 OS=Arabidopsis thaliana OX=3702 GN=At5g25310 PE=3 SV=2)

HSP 1 Score: 432.6 bits (1111), Expect = 1.2e-119
Identity = 209/394 (53.05%), Positives = 282/394 (71.57%), Query Frame = 0

Query: 571 SRIKRYSKLKKIEEKLGRARAAIREAAQLHNLTSIHHDPDYVPTGPIYRNPNAFHRSYLE 630
           S+ ++ ++   +E+ L +ARA+I EA+   N T    D   +P   IYRNP+A +RSYLE
Sbjct: 90  SKPEKLNRRNLVEQGLAKARASILEASSNVNTTLFKSD---LPNSEIYRNPSALYRSYLE 149

Query: 631 MERLLKIYVYKEGEPPMFHQGPCKSIYSTEGRFIHEMEKGNL-YTTNDPHQALLYFLPFS 690
           ME+  K+YVY+EGEPP+ H GPCKS+Y+ EGRFI EMEK    + T DP+QA +YFLPFS
Sbjct: 150 MEKRFKVYVYEEGEPPLVHDGPCKSVYAVEGRFITEMEKRRTKFRTYDPNQAYVYFLPFS 209

Query: 691 VVNLVQYLYVPNSHEVNAIGVAVSDYINVISNKHSFWNRSLGADHFMLSCHDWGPRTTSY 750
           V  LV+YLY  NS +   +   VSDYI ++S  H FWNR+ GADHFML+CHDWGP T+  
Sbjct: 210 VTWLVRYLYEGNS-DAKPLKTFVSDYIRLVSTNHPFWNRTNGADHFMLTCHDWGPLTSQA 269

Query: 751 VPLLFNNSIRVLCNANVSEGFRPSKDASFPEIHLRTGEID---GLLGGLSPSRRTVLAFF 810
              LFN SIRV+CNAN SEGF P+KD + PEI L  GE+D    L   LS S R  L FF
Sbjct: 270 NRDLFNTSIRVMCNANSSEGFNPTKDVTLPEIKLYGGEVDHKLRLSKTLSASPRPYLGFF 329

Query: 811 AGRLHGHIRYLLLQNWKEKDEDVLVYDELPSGISYNSMLKKSRFCLCPSGYEVASPRVVE 870
           AG +HG +R +LL++WK++D D+ VY+ LP  ++Y   ++ S+FC CPSGYEVASPRV+E
Sbjct: 330 AGGVHGPVRPILLKHWKQRDLDMPVYEYLPKHLNYYDFMRSSKFCFCPSGYEVASPRVIE 389

Query: 871 AIYAECVPVLISESYVPPFSDVLNWKSFAVQIQVKDIPNIKEILRGISKTQYLRMQRRVK 930
           AIY+EC+PV++S ++V PF+DVL W++F+V + V +IP +KEIL  IS  +Y  ++  ++
Sbjct: 390 AIYSECIPVILSVNFVLPFTDVLRWETFSVLVDVSEIPRLKEILMSISNEKYEWLKSNLR 449

Query: 931 QVQKHFVLNGTPKRFDAFHMILHSIWLRRLNIHI 961
            V++HF LN  P+RFDAFH+ LHSIWLRRLN+ +
Sbjct: 450 YVRRHFELNDPPQRFDAFHLTLHSIWLRRLNLKL 479

BLAST of HG10003551 vs. ExPASy Swiss-Prot
Match: Q3EAR7 (Probable glycosyltransferase At3g42180 OS=Arabidopsis thaliana OX=3702 GN=At3g42180 PE=2 SV=2)

HSP 1 Score: 410.2 bits (1053), Expect = 6.4e-113
Identity = 200/397 (50.38%), Positives = 268/397 (67.51%), Query Frame = 0

Query: 573 IKRYSKLKKIEEKLGRARAAIREAAQLHNLTSIHHDPDYVPTGPIYRNPNAFHRSYLEME 632
           +KR S L+K EE+L +ARAAIR A +  N TS      Y+PTG IYRN  AFH+S++EM 
Sbjct: 72  VKRRSNLEKREEELRKARAAIRRAVRFKNCTSNEEVITYIPTGQIYRNSFAFHQSHIEMM 131

Query: 633 RLLKIYVYKEGEPPMFHQGPCKSIYSTEGRFIHEME-----KGNLYTTNDPHQALLYFLP 692
           +  K++ YKEGE P+ H GP   IY  EG+FI E+          +  + P +A  +FLP
Sbjct: 132 KTFKVWSYKEGEQPLVHDGPVNDIYGIEGQFIDELSYVMGGPSGRFRASRPEEAHAFFLP 191

Query: 693 FSVVNLVQYLYVPNSHEVN----AIGVAVSDYINVISNKHSFWNRSLGADHFMLSCHDWG 752
           FSV N+V Y+Y P +   +     +    +DY++V+++KH FWN+S GADHFM+SCHDW 
Sbjct: 192 FSVANIVHYVYQPITSPADFNRARLHRIFNDYVDVVAHKHPFWNQSNGADHFMVSCHDWA 251

Query: 753 PRTTSYVPLLFNNSIRVLCNANVSEGFRPSKDASFPEIHLRTGEIDGLLGGLSPSRRTVL 812
           P      P  F N +R LCNAN SEGFR + D S PEI++   ++     G +P  RT+L
Sbjct: 252 PDVPDSKPEFFKNFMRGLCNANTSEGFRRNIDFSIPEINIPKRKLKPPFMGQNPENRTIL 311

Query: 813 AFFAGRLHGHIRYLLLQNWKEKDEDVLVYDELPSGISYNSMLKKSRFCLCPSGYEVASPR 872
           AFFAGR HG+IR +L  +WK KD+DV VYD L  G +Y+ ++  S+FCLCPSGYEVASPR
Sbjct: 312 AFFAGRAHGYIREVLFSHWKGKDKDVQVYDHLTKGQNYHELIGHSKFCLCPSGYEVASPR 371

Query: 873 VVEAIYAECVPVLISESYVPPFSDVLNWKSFAVQIQVKDIPNIKEILRGISKTQYLRMQR 932
            VEAIY+ CVPV+IS++Y  PF+DVL+W  F+V+I V  IP+IK+IL+ I   +YLRM R
Sbjct: 372 EVEAIYSGCVPVVISDNYSLPFNDVLDWSKFSVEIPVDKIPDIKKILQEIPHDKYLRMYR 431

Query: 933 RVKQVQKHFVLNGTPKRFDAFHMILHSIWLRRLNIHI 961
            V +V++HFV+N   + FD  HMILHS+WLRRLNI +
Sbjct: 432 NVMKVRRHFVVNRPAQPFDVIHMILHSVWLRRLNIRL 468

BLAST of HG10003551 vs. ExPASy TrEMBL
Match: A0A5A7TD36 (Putative glycosyltransferase OS=Cucumis melo var. makuwa OX=1194695 GN=E5676_scaffold169G001280 PE=3 SV=1)

HSP 1 Score: 1677.5 bits (4343), Expect = 0.0e+00
Identity = 846/962 (87.94%), Positives = 890/962 (92.52%), Query Frame = 0

Query: 1   MRRPVGALLRDRRAVQVPTSGRNHLYKVSLSLVFILWGLIFLFSLWFSRGDGCQEGSVLH 60
           MR+PVGALL DRRAV+VP SGRNHLYKVS+SLVFILWGLIFLFSLW SRGDGCQEGS+L 
Sbjct: 1   MRKPVGALLHDRRAVRVPISGRNHLYKVSISLVFILWGLIFLFSLWISRGDGCQEGSILL 60

Query: 61  PADVSTSNESKLENNKDSDILYEPPKGETDSTIQLNDSCSIDATSPGSDNDILSSEESSS 120
           P  VST+NESKLENNKDSD+L EPP GE+  TI LN+SCSI+A+SPGSDN+ILSSEESSS
Sbjct: 61  PDGVSTTNESKLENNKDSDVLCEPPNGESHCTIHLNNSCSINASSPGSDNEILSSEESSS 120

Query: 121 HIRPATRLPEAESSSTGVKSENKPVKGDTSSETVLLGLEEFKSRAFISRSKSETGQAGNT 180
           HI+  TRLPE ESSST VK E+KP KGD SS+TVLLGLEEFKSRAF+SR KSETGQAGNT
Sbjct: 121 HIQATTRLPEDESSSTRVKPESKPPKGDISSDTVLLGLEEFKSRAFVSRGKSETGQAGNT 180

Query: 181 IHRVEPGGAEYNYASASKGAKVLAFNKEAKGASNILGRDKDKYLRNPCSAEEKFVVIELS 240
           IHR+EPGGAEYNYASASKGAKVLAFNKEAKGASNILG+DKDKYLRNPCSAEEKFVVIELS
Sbjct: 181 IHRLEPGGAEYNYASASKGAKVLAFNKEAKGASNILGKDKDKYLRNPCSAEEKFVVIELS 240

Query: 241 EETLVVTIEIANFEHHSSNLKEFEVHGSLVYPTDVWFKLGNFTAPNAKHAHRFVLKDPKW 300
           EETLVVTIEIANFEHHSSNLKEFEVHGSLVYPTDVWFKLGNFTAPNAKHAHRFVLKDPKW
Sbjct: 241 EETLVVTIEIANFEHHSSNLKEFEVHGSLVYPTDVWFKLGNFTAPNAKHAHRFVLKDPKW 300

Query: 301 VRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEMMLEDLISAQHKSSISDEATTEKRVIPS 360
           VRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEMMLEDLISAQHK SISDEAT +KRVIPS
Sbjct: 301 VRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEMMLEDLISAQHKPSISDEATPDKRVIPS 360

Query: 361 QSGPNDEGQHGRELQSLTTEESDDDVDLELTKSNIPDPVEESHHQQPGRMPGDTVLKILT 420
           Q GP DE  HGRELQSL  EE  D VDLEL+KSN PDPVEESHHQQPGRMPGDTVLKILT
Sbjct: 361 QPGPIDEVSHGRELQSLANEEGGDGVDLELSKSNTPDPVEESHHQQPGRMPGDTVLKILT 420

Query: 421 QKVRSLDLSLSVLEQYLEDLTSKYGNIFKEFDKDIENSDLLIEKTREDIRNILKIQDSTD 480
           QKVRSLDLSLSVLE+YLEDLTSKYGNIFKEFDKDI N++LLIEKT+EDIRNILKIQD+TD
Sbjct: 421 QKVRSLDLSLSVLERYLEDLTSKYGNIFKEFDKDIGNNNLLIEKTQEDIRNILKIQDNTD 480

Query: 481 KDLRDLISWKSIVSLQLDNLQRHNSILRSEIERVQKNQTSLENKGIVVFLLLEPAALDLK 540
           KDLRDLISWKS+VSLQLD LQRHNSILRSEIERVQKNQTSLENKGIV             
Sbjct: 481 KDLRDLISWKSMVSLQLDGLQRHNSILRSEIERVQKNQTSLENKGIV------------- 540

Query: 541 GQYFPPSIEGSVKGIVENKEGNGKDASSGVSRIKRYSKLKKIEEKLGRARAAIREAAQLH 600
                   E   K + ++KE NGK A  G+S+ K YSKLKK+EEKLGRARAAIR+A+QLH
Sbjct: 541 --------EEPQKTVAKDKEANGKSAIPGISKTKGYSKLKKLEEKLGRARAAIRKASQLH 600

Query: 601 NLTSIHHDPDYVPTGPIYRNPNAFHRSYLEMERLLKIYVYKEGEPPMFHQGPCKSIYSTE 660
           NLTSIHHDPDYVPTGPIYRNPNAFHRSYLEMERLLKIYVYKEGEPPMFH GPCKSIYSTE
Sbjct: 601 NLTSIHHDPDYVPTGPIYRNPNAFHRSYLEMERLLKIYVYKEGEPPMFHGGPCKSIYSTE 660

Query: 661 GRFIHEMEKGNLYTTNDPHQALLYFLPFSVVNLVQYLYVPNSHEVNAIGVAVSDYINVIS 720
           GRFIHEMEKGNLYTTNDP QALLYFLPFSVVNLVQYLYVPNSHEVNAIG A++DYINVIS
Sbjct: 661 GRFIHEMEKGNLYTTNDPDQALLYFLPFSVVNLVQYLYVPNSHEVNAIGRAITDYINVIS 720

Query: 721 NKHSFWNRSLGADHFMLSCHDWGPRTTSYVPLLFNNSIRVLCNANVSEGFRPSKDASFPE 780
            KH FW+RSLGADHFMLSCHDWGPRTTSYVPLLFNNSIRVLCNANVSEGF PSKDASFPE
Sbjct: 721 KKHPFWDRSLGADHFMLSCHDWGPRTTSYVPLLFNNSIRVLCNANVSEGFLPSKDASFPE 780

Query: 781 IHLRTGEIDGLLGGLSPSRRTVLAFFAGRLHGHIRYLLLQNWKEKDEDVLVYDELPSGIS 840
           IHLRTGEIDGL+GGLSPSRR+VLAFFAGRLHGHIRYLLLQ WKEKDEDVLVY+ELPSGIS
Sbjct: 781 IHLRTGEIDGLIGGLSPSRRSVLAFFAGRLHGHIRYLLLQEWKEKDEDVLVYEELPSGIS 840

Query: 841 YNSMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVPPFSDVLNWKSFAVQIQV 900
           YNSMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVPPFSDVLNWKSF+VQIQV
Sbjct: 841 YNSMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVPPFSDVLNWKSFSVQIQV 900

Query: 901 KDIPNIKEILRGISKTQYLRMQRRVKQVQKHFVLNGTPKRFDAFHMILHSIWLRRLNIHI 960
           KDIPNIK+IL+GIS+TQYLRMQRRVKQVQ+HFVLNGTPKRFDAFHMILHSIWLRRLNIHI
Sbjct: 901 KDIPNIKKILKGISQTQYLRMQRRVKQVQRHFVLNGTPKRFDAFHMILHSIWLRRLNIHI 941

Query: 961 QD 963
           QD
Sbjct: 961 QD 941

BLAST of HG10003551 vs. ExPASy TrEMBL
Match: A0A540LTE3 (SUN domain-containing protein OS=Malus baccata OX=106549 GN=C1H46_024731 PE=3 SV=1)

HSP 1 Score: 1160.2 bits (3000), Expect = 0.0e+00
Identity = 630/1039 (60.64%), Positives = 749/1039 (72.09%), Query Frame = 0

Query: 1    MRRPVGALLRDRRAVQVPTSGRNHLYKVSLSLVFILWGLIFLFSLWFSRGDGCQEGSVLH 60
            M+R   ALL +RRA+ +  SGR+ LYKVSLSLVF+LWGL+FLFSLWFSRG G ++GS + 
Sbjct: 1    MQRSRKALL-NRRALGI--SGRSRLYKVSLSLVFVLWGLVFLFSLWFSRGHGYKDGSTVS 60

Query: 61   PADVSTSNESKLENNKDSDILYEPPKGETD--------STIQLN------DSCSIDATSP 120
            P  +ST +E+KL+ ++  DI  E   G +          T  LN      +     A++ 
Sbjct: 61   PVGISTWDEAKLDRDEQYDIQKETDLGYSSGGECTNGVETGGLNGEFFAMEGSKQHASTE 120

Query: 121  GSDNDIL--------SSEESSSHIRPATRLPEAESSSTGVKSENKPVKGDTSSETVLLGL 180
            GS    L        S+E S  H       PE  ++ +GVK EN   K       V LGL
Sbjct: 121  GSRQQDLAEGSLHHASTEGSIFHDSAVDEQPEVVTAGSGVKLENDAPKNGRLPRAVPLGL 180

Query: 181  EEFKSRAFISRSKSETGQAGNTIHRVEPGGAEYNYASASKGAKVLAFNKEAKGASNILGR 240
            +EFKS+   S+SKS  GQAG   HRVEPGGAEYNYASA+KGAKVLAFNKEAKGASNILG+
Sbjct: 181  DEFKSKTCSSKSKSGNGQAGGIKHRVEPGGAEYNYASAAKGAKVLAFNKEAKGASNILGK 240

Query: 241  DKDKYLRNPCSAEEKFVVIELSEETLVVTIEIANFEHHSSNLKEFEVHGSLVYPTDVWFK 300
            DKDKYLRNPCSAE KFV IELSEETLV TIEIAN EH+SSNLK+FEV GSL YPT+ W  
Sbjct: 241  DKDKYLRNPCSAEGKFVDIELSEETLVDTIEIANLEHYSSNLKDFEVLGSLTYPTNEWVF 300

Query: 301  LGNFTAPNAKHAHRFVLKDPKWVRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEMMLEDL 360
            LGN TA N K   RFVL+ PKWVRY+KL  L+HYGSEFYCTLS +E+YG+DAVE MLEDL
Sbjct: 301  LGNVTAANNKLVQRFVLQQPKWVRYIKLKLLSHYGSEFYCTLSIIELYGVDAVERMLEDL 360

Query: 361  ISAQHKSSISDEATTEKRVIPS--QSGPNDEGQHGRELQSLTTEESD-DDVDLELTKSNI 420
            IS +  S +S+ AT +++ +PS   S   DE  H    +S     +   +V+ ++  S +
Sbjct: 361  ISVEGSSFVSEGATVDQKPVPSHPDSPEVDEFFHDIVKESEPQYAAGVSNVNNDMLNSEV 420

Query: 421  PDPVEESHHQQPGRMPGDTVLKILTQKVRSLDLSLSVLEQYLEDLTSKYGNIFKEFDKDI 480
            PD V+E HHQQ  RMPGDTVLKIL QKVRSLD SLSVLE+YLE+ TSKYG+IF EFDKD+
Sbjct: 421  PDAVKEVHHQQVNRMPGDTVLKILMQKVRSLDFSLSVLERYLEESTSKYGSIFGEFDKDL 480

Query: 481  ENSDLLIEKTREDIRNILKIQDSTDKDLRDLISWKSIVSLQLDNLQRHNSILRSEIERVQ 540
               D  ++K REDIRN+++ Q+   KD+ +LISW+S+V++QL+NL R N+ILRSE+E+V+
Sbjct: 481  GEKDTDLQKIREDIRNLVQSQEVIAKDVHNLISWQSLVTMQLNNLVRDNAILRSEVEKVR 540

Query: 541  KNQTSLENKGIVVFL--------------LLEPAALDLKGQYF--------PPSI----- 600
            + Q S++NK +V  L              +LE          F        PP +     
Sbjct: 541  EKQISVDNKVLVCSLGVTSSPSWSWKFGNVLETEDYSSSSSAFSATATPPRPPQVLEAAK 600

Query: 601  --------------------EGSVKGIV-----ENKEGNGKDASSGVSRIKRYSKLKKIE 660
                                 G  +G+V     E+ E NG  A    + IKRYS+L+K+E
Sbjct: 601  QGHNITSSSKANETVVPRQNIGEKQGMVWINGTESDEINGTSAIITSTPIKRYSRLEKLE 660

Query: 661  EKLGRARAAIREAAQLHNLTSIHHDPDYVPTGPIYRNPNAFHRSYLEMERLLKIYVYKEG 720
              L   RA+IREAA++ NLTS H DPDYVP GPIYRN NAFHRSYL+ME+  KIYVY+EG
Sbjct: 661  ANLAGVRASIREAARVRNLTSTHEDPDYVPRGPIYRNANAFHRSYLKMEKHFKIYVYEEG 720

Query: 721  EPPMFHQGPCKSIYSTEGRFIHEMEKGNLYTTNDPHQALLYFLPFSVVNLVQYLYVPNSH 780
            EPP+FH GPCKSIYSTEGRFIHEME  N+Y T DP QAL+YFLPFSVV LVQYLYV +SH
Sbjct: 721  EPPIFHNGPCKSIYSTEGRFIHEMEMENIYRTRDPDQALVYFLPFSVVMLVQYLYVADSH 780

Query: 781  EVNAIGVAVSDYINVISNKHSFWNRSLGADHFMLSCHDWGPRTTSYVPLLFNNSIRVLCN 840
            +   IG AV DY+NVIS+KH FWNRSLGADHFMLSCHDWGP T++YVP L+ NSIRVLCN
Sbjct: 781  DTQPIGRAVVDYVNVISDKHPFWNRSLGADHFMLSCHDWGPSTSAYVPHLYQNSIRVLCN 840

Query: 841  ANVSEGFRPSKDASFPEIHLRTGEIDGLLGGLSPSRRTVLAFFAGRLHGHIRYLLLQNWK 900
            AN SEGF PSKD SFPEIHLRTGE  GLLGGLSPSRR++LAFFAGRLHGHIRYLLL  WK
Sbjct: 841  ANTSEGFNPSKDVSFPEIHLRTGETKGLLGGLSPSRRSILAFFAGRLHGHIRYLLLNEWK 900

Query: 901  EKDEDVLVYDELPSGISYNSMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLISESYVP 960
            EKD+DV VYD+LP+G+SY SMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVL+S+SYVP
Sbjct: 901  EKDQDVQVYDQLPNGVSYESMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVLVSDSYVP 960

Query: 961  PFSDVLNWKSFAVQIQVKDIPNIKEILRGISKTQYLRMQRRVKQVQKHFVLNGTPKRFDA 963
            PFSDVL WKSF+VQ+QVKDIPNIK IL GIS++QYLRMQRRVKQVQ+HFV+NG  KRFD 
Sbjct: 961  PFSDVLEWKSFSVQVQVKDIPNIKRILMGISQSQYLRMQRRVKQVQRHFVVNGPSKRFDV 1020

BLAST of HG10003551 vs. ExPASy TrEMBL
Match: A0A498IR71 (SUN domain-containing protein OS=Malus domestica OX=3750 GN=DVH24_009530 PE=3 SV=1)

HSP 1 Score: 1141.7 bits (2952), Expect = 0.0e+00
Identity = 629/1110 (56.67%), Positives = 748/1110 (67.39%), Query Frame = 0

Query: 9    LRDRRAVQVPTSGRNHLYKVSLSLVFILWGLIFLFSLWFSRGDGCQEGSVLHPADVSTSN 68
            L +RRA+ +  SGRN LYKVSLSLVF+LWGL+FLFSLWFSRG G ++GS + P  +ST +
Sbjct: 50   LLNRRALGI--SGRNRLYKVSLSLVFVLWGLVFLFSLWFSRGHGYKDGSTVSPVGISTWD 109

Query: 69   ESKLENNKDSDILYEPPKGETD--------STIQLN------DSCSIDATSPGSDNDIL- 128
            E+KL+ ++  DI  E   G +          T  LN      +     A++ GS    L 
Sbjct: 110  EAKLDRDEHYDIQKESDLGYSSGGECTNGVETGGLNGEFFAMEGSKQHASAEGSRQQDLA 169

Query: 129  -------SSEESSSHIRPATRLPEAESSSTGVKSENKPVKGDTSSETVLLGLEEFKSRAF 188
                   S+E S  H       PE  ++ +GVK EN   K       V LGL+EFKS+ F
Sbjct: 170  EGSLHHASTEGSIFHDSAVDEQPEVVTAGSGVKLENDAPKNGRLPRAVPLGLDEFKSKTF 229

Query: 189  ISRSKSETGQAGNTIHRVEPGGAEYNYASASKGAKVLAFNKEAKGASNILGRDKDKYLRN 248
             S+SKS  GQAG   HRVEPGGAEYNYASA+KGAKVLAFNKEAKGASNILG+DKDKYLRN
Sbjct: 230  SSKSKSGNGQAGGIKHRVEPGGAEYNYASAAKGAKVLAFNKEAKGASNILGKDKDKYLRN 289

Query: 249  PCSAEEKFVVIELSEETLVVTIEIANFEHHSSNLKEFEVHGSLVYPTDVWFKLGNFTAPN 308
            PCSAE KFV IELSEETLV TIEIAN EH+SSNLK+FEV GSL YPT+ W  LGN TA N
Sbjct: 290  PCSAEGKFVDIELSEETLVDTIEIANLEHYSSNLKDFEVLGSLTYPTNEWVFLGNVTAAN 349

Query: 309  AKHAHRFVLKDPKWVRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEMMLEDLISAQHKSS 368
             K   RFVL+ PKWVRY+KL  L+HYGSEFYCTLS +E+YG+DAVE MLEDLIS +  S 
Sbjct: 350  NKLVQRFVLQQPKWVRYIKLKLLSHYGSEFYCTLSIIELYGVDAVERMLEDLISVESSSF 409

Query: 369  ISDEATTEKRVIPSQ--SGPNDEGQHGRELQSLTTEESD-DDVDLELTKSNIPDPVEESH 428
            +S+ AT +++ +PS   S   DE  H    +S     +   +V+ ++  S +PDPV+E  
Sbjct: 410  VSEGATVDQKPVPSHPYSPEVDEFFHDIVKESEPQYAAGVSNVNNDMMNSEVPDPVKEVR 469

Query: 429  HQQPGRMPGDTVLKILTQKVRSLDLSLSVLEQYLEDLTSKYGNIFKEFDKDIENSDLLIE 488
            HQQ  RMPGDTVLKIL QKVRSLD SLSVLE+YLE+ TSKYG+IF EFDKD+      ++
Sbjct: 470  HQQVNRMPGDTVLKILMQKVRSLDFSLSVLERYLEESTSKYGSIFGEFDKDLGEKGTDLQ 529

Query: 489  KTREDIRNILKIQDSTDKDLRDLISWKSIVSLQLDNLQRHNSILRSEIERVQKNQTSLEN 548
            K REDIRN+++ Q+   KD+ +LISW+S+V++QL+NL R N+ILRSE+E+V++ Q S++N
Sbjct: 530  KIREDIRNLVQSQEVIAKDVHNLISWQSLVTMQLNNLVRDNAILRSEVEKVREKQISVDN 589

Query: 549  KGIVVFLLL-----------------------------------------------EPAA 608
            KGI++FL+                                                +P +
Sbjct: 590  KGILIFLICIIFSLLALVRLFTEMAVSVYMVLSVDRATEKPRKFCWMKMKPPLSNKKPNS 649

Query: 609  LDLKGQYF---------------------------------------------------- 668
            L     Y                                                     
Sbjct: 650  LLSSSSYSVLLLAFVVPFFVISVLVCSLGVTSSLSWSWGFGNVLETEDYSSSSSAFSATA 709

Query: 669  -----PPSIEGSVKG---------------------------IVENKEGNGKDASSGVSR 728
                 P  +E + +G                           I E+ E NG  A    + 
Sbjct: 710  TPPRPPQVLEAAKQGHNNTSSSKSNETVVPRQNIGEKQGMVWITESDEINGTSAIITSTS 769

Query: 729  IKRYSKLKKIEEKLGRARAAIREAAQLHNLTSIHHDPDYVPTGPIYRNPNAFHRSYLEME 788
            IKRYS+L+K+E  L   RA+IREAA++ NLTS H DPDYVP GPIYRN NAFHRSYL+ME
Sbjct: 770  IKRYSRLEKLEANLAGVRASIREAARVRNLTSTHEDPDYVPRGPIYRNANAFHRSYLKME 829

Query: 789  RLLKIYVYKEGEPPMFHQGPCKSIYSTEGRFIHEMEKGNLYTTNDPHQALLYFLPFSVVN 848
            +  KIYVY+EGEPP+FH GPCKSIYSTEGRFIHEME  N+Y T DP QAL+YFLPFSVV 
Sbjct: 830  KHFKIYVYEEGEPPIFHNGPCKSIYSTEGRFIHEMEMENIYKTRDPDQALVYFLPFSVVM 889

Query: 849  LVQYLYVPNSHEVNAIGVAVSDYINVISNKHSFWNRSLGADHFMLSCHDWGPRTTSYVPL 908
            LVQYLYV +SH+   IG AV DY+NVIS+KH FWNRSLGADHFMLSCHDWGP T++YVP 
Sbjct: 890  LVQYLYVADSHDTQPIGRAVVDYVNVISDKHPFWNRSLGADHFMLSCHDWGPSTSAYVPH 949

Query: 909  LFNNSIRVLCNANVSEGFRPSKDASFPEIHLRTGEIDGLLGGLSPSRRTVLAFFAGRLHG 963
            L+ NSIRVLCNAN SEGF PSKD SFPEIHLRTGE  GLLGGLSPSRR++LAFFAGRLHG
Sbjct: 950  LYQNSIRVLCNANTSEGFNPSKDVSFPEIHLRTGETKGLLGGLSPSRRSILAFFAGRLHG 1009

BLAST of HG10003551 vs. ExPASy TrEMBL
Match: A0A314YQN6 (Putative glycosyltransferase OS=Prunus yedoensis var. nudiflora OX=2094558 GN=Pyn_37431 PE=3 SV=1)

HSP 1 Score: 1089.7 bits (2817), Expect = 0.0e+00
Identity = 571/885 (64.52%), Positives = 675/885 (76.27%), Query Frame = 0

Query: 132 ESSSTGVKSENKPVKGDTSSETVLLGLEEFKSRAFISRSKSETGQAGNTIHRVEPGGAEY 191
           ESS +GVK EN   K       V LGL+EFKS+ F S++KS  GQAG+  HRVEPGGAEY
Sbjct: 42  ESSGSGVKLENDAPKNGRLPRAVPLGLDEFKSKTFNSKTKSGNGQAGSIKHRVEPGGAEY 101

Query: 192 NYASASKGAKVLAFNKEAKGASNILGRDKDKYLRNPCSAEEKFVVIELSEETLVVTIEIA 251
           NYASA+KGAKVLAFNKEAKGASNILGRDKDKYLRNPCSAE KFV IELSEETLV TI+IA
Sbjct: 102 NYASAAKGAKVLAFNKEAKGASNILGRDKDKYLRNPCSAEGKFVDIELSEETLVDTIQIA 161

Query: 252 NFEHHSSNLKEFEVHGSLVYPTDVWFKLGNFTAPNAKHAHRFVLKDPKWVRYLKLNFLTH 311
           N EH+SSNLK FE+ GSLVYPTD W  LGNFTA N K A R+ L++PKWVRY+KLN L+H
Sbjct: 162 NHEHYSSNLKAFELLGSLVYPTDEWVLLGNFTAANNKLAQRYDLQEPKWVRYIKLNLLSH 221

Query: 312 YGSEFYCTLSTVEVYGMDAVEMMLEDLISAQHKSSISDEATTEKRVIPSQSGPN----DE 371
           +GSEFYCTLS +E+YG+DAVE MLEDLIS +    +S+ AT +++  P+ S P+    DE
Sbjct: 222 HGSEFYCTLSVIEIYGVDAVERMLEDLISVESSPFVSEGATVDQK--PTSSNPDSPEVDE 281

Query: 372 GQHGRELQSLTTEES--DDDVDLELTKSNIPDPVEESHHQQPGRMPGDTVLKILTQKVRS 431
             H   ++ L  E++    D+  E+ KS +PD ++E  H Q  RMPGDTVLKIL QKVRS
Sbjct: 282 FFH-NIVKELEPEDAVGKSDLSNEIMKSEVPDAIKEVRHLQVNRMPGDTVLKILMQKVRS 341

Query: 432 LDLSLSVLEQYLEDLTSKYGNIFKEFDKDIENSDLLIEKTREDIRNILKIQDSTDKDLRD 491
           LD SLSVLE+YLE+  SKYG+IF+EFDKD+   DL ++K REDIRN+L+ Q+   KD+ +
Sbjct: 342 LDFSLSVLERYLEESNSKYGSIFREFDKDLGEKDLDVQKIREDIRNLLESQEIIAKDVHN 401

Query: 492 LISWKSIVSLQLDNLQRHNSILRSEIERVQKNQTSLENKGIVV----------------- 551
           LISW+S+VS+QL NL R N+ILRSE+E+V++ Q S++NK +V                  
Sbjct: 402 LISWQSLVSMQLGNLVRDNAILRSEVEKVREKQQSVDNKVLVCTSGSSSWTWRFGNNILE 461

Query: 552 ---------------FLLLEPA--------ALDLKGQYFP---PSIEGSVKGIVENKEGN 611
                            +LE A         L    +  P   P   G  + +V N +G 
Sbjct: 462 TDYSSSLAVSTRSRPSQVLEAAEAHHINSSLLSKTNETTPHILPHQIGEKQEMVWNDQGL 521

Query: 612 GKD----ASSGVSRIKRYSKLKKIEEKLGRARAAIREAAQLHNLTSIHHDPDYVPTGPIY 671
           G D     S+    IKR+S+L+K+E  L   RA+IREAA++ NLTS H DPDYVP GPIY
Sbjct: 522 GADEVNVTSASAVTIKRHSRLEKLEANLAGVRASIREAARVRNLTSTHEDPDYVPKGPIY 581

Query: 672 RNPNAFHRSYLEMERLLKIYVYKEGEPPMFHQGPCKSIYSTEGRFIHEME-KGNLYTTND 731
           RN NAFHRSYLEMERL KIYVY+EG+PP+FH GPCKSIYSTEGRFIHEME   N+Y T D
Sbjct: 582 RNANAFHRSYLEMERLFKIYVYEEGDPPIFHNGPCKSIYSTEGRFIHEMEMDNNIYKTRD 641

Query: 732 PHQALLYFLPFSVVNLVQYLYVPNSHEVNAIGVAVSDYINVISNKHSFWNRSLGADHFML 791
           P +AL+YFLPFSVV LVQYLY  +SH  ++IG AV DY+NVIS+KH FWNRSLGADHFML
Sbjct: 642 PDEALVYFLPFSVVMLVQYLYAADSHNTDSIGRAVIDYVNVISDKHPFWNRSLGADHFML 701

Query: 792 SCHDWGPRTTSYVPLLFNNSIRVLCNANVSEGFRPSKDASFPEIHLRTGEIDGLLGGLSP 851
           SCHDWGPRT+SYVP L++ SIRVLCNAN SEGF PSKDASFPEIHLRTGE  GL+GGLSP
Sbjct: 702 SCHDWGPRTSSYVPHLYHKSIRVLCNANTSEGFNPSKDASFPEIHLRTGETKGLVGGLSP 761

Query: 852 SRRTVLAFFAGRLHGHIRYLLLQNWKEKDEDVLVYDELPSGISYNSMLKKSRFCLCPSGY 911
           SRR++LAFFAGRLHGHIRYLLL  WKEKD+DV VYD+LP G+SY SMLKKSRFCLCPSGY
Sbjct: 762 SRRSILAFFAGRLHGHIRYLLLNEWKEKDQDVQVYDQLPHGVSYESMLKKSRFCLCPSGY 821

Query: 912 EVASPRVVEAIYAECVPVLISESYVPPFSDVLNWKSFAVQIQVKDIPNIKEILRGISKTQ 963
           EVASPRVVEAIYAEC+PVLIS+SYVPPFSDVL+WKSF+VQ+QVKDIPNIK IL GIS++Q
Sbjct: 822 EVASPRVVEAIYAECIPVLISDSYVPPFSDVLDWKSFSVQVQVKDIPNIKTILMGISQSQ 881

BLAST of HG10003551 vs. ExPASy TrEMBL
Match: A0A5N5FL53 (SUN domain-containing protein OS=Pyrus ussuriensis x Pyrus communis OX=2448454 GN=D8674_004698 PE=4 SV=1)

HSP 1 Score: 1053.1 bits (2722), Expect = 6.9e-304
Identity = 591/1069 (55.29%), Positives = 705/1069 (65.95%), Query Frame = 0

Query: 1    MRRPVGALLRDRRAVQVPTSGRNHLYKVSLSLVFILWGLIFLFSLWFSRGDGCQEGSVLH 60
            M+R   ALL +RRA+ +  SGR+ LYKVSLSLVF+LWGL+FLFSLWFSRG G ++GS + 
Sbjct: 1    MQRSRRALL-NRRALGI--SGRSRLYKVSLSLVFVLWGLVFLFSLWFSRGHGYKDGSTVS 60

Query: 61   PADVSTSNESKLENNKDSDILYEPPKGETDSTIQLNDSCSIDATSPGSDNDIL------- 120
            P  +ST +E+KL+ ++  DI     + ETD        C+    + G + +         
Sbjct: 61   PVGISTWDEAKLDRDEHYDI-----QKETDLGYSSGGECTNGVETGGLNGEFFAIEGSKQ 120

Query: 121  --------------------SSEESSSHIRPATRLPEAESSSTGVKSENKPVKGDTSSET 180
                                S+E S  H       PE  ++ +GVK EN   K       
Sbjct: 121  HPSGEGSRQQDLAEGSLHRASAEGSIFHASAVDEQPEVVTTGSGVKLENDAPKNGRLPRA 180

Query: 181  VLLGLEEFKSRAFISRSKSETGQAGNTIHRVEPGGAEYNYASASKGAKVLAFNKEAKGAS 240
            V LGL+EFKS+ F S+SKS  GQAG   HRVEPGGAEYNYASA+KGAKVLAFNKEAKGAS
Sbjct: 181  VPLGLDEFKSKTFSSKSKSGNGQAGGIKHRVEPGGAEYNYASAAKGAKVLAFNKEAKGAS 240

Query: 241  NILGRDKDKYLRNPCSAEEKFVVIELSEETLVVTIEIANFEHHSSNLKEFEVHGSLVYPT 300
            NILG+DKDKYLRNPCSAEEKFV IELSEETLV TIEIAN EH+SSNLK+F V GSL YPT
Sbjct: 241  NILGKDKDKYLRNPCSAEEKFVDIELSEETLVDTIEIANLEHYSSNLKDFVVLGSLTYPT 300

Query: 301  DVWFKLGNFTAPNAKHAHRFVLKDPKWVRYLKLNFLTHYGSEFYCTLSTVEVYGMDAVEM 360
            + W  LGN TA N K   RFVL+ PKWVRY+KL  L+HYGSEFYCTLST+E+YG+DAVE 
Sbjct: 301  NEWVFLGNVTAANNKLVQRFVLQQPKWVRYIKLKLLSHYGSEFYCTLSTIELYGVDAVER 360

Query: 361  MLEDLISAQHKSSISDEATTEKRVIPS--QSGPNDEGQHGRELQSLTT-EESDDDVDLEL 420
            MLEDLIS +  S +S+ AT +++ +PS   S   DE  H    +S         +V+ ++
Sbjct: 361  MLEDLISVESSSFVSEGATVDQKPVPSHPDSLEVDEFYHDIVKESEPQYAAGGSNVNNDM 420

Query: 421  TKSNIPDPVEESHHQQPGRMPGDTVLKILTQKVRSLDLSLSVLEQYLEDLTSKYGNIFKE 480
              S + DPV+E  HQQ  RMPGDTVLKIL QKVRSLD SLSVLE+YLE+ TSKYG+IF E
Sbjct: 421  MNSEVLDPVKEVRHQQVNRMPGDTVLKILMQKVRSLDFSLSVLERYLEESTSKYGSIFGE 480

Query: 481  FDKDIENSDLLIEKTREDIRNILKIQDSTDKDLRDLISWKSIVSLQLDNLQRHNSILRSE 540
            FDKD+   D  ++K REDIRN+++ Q+    D+ +L SW+S+V++QL+NL R N+ILRSE
Sbjct: 481  FDKDLGEKDTDLQKIREDIRNLIQSQEDIGNDVHNLRSWQSLVTMQLNNLVRDNAILRSE 540

Query: 541  IERVQKNQTSLENKGIVVFLL--------------------------------------- 600
            +ERV++ Q S++NKG+++FL+                                       
Sbjct: 541  VERVREKQISVDNKGVLIFLICIIFSLLALVRLFTEMAMVCTNNLGVTSSLSWSWRFGNV 600

Query: 601  LEPAALDLKGQYF--------PPSI-------------------------EGSVKGIV-- 660
            LE          F        PP +                          G  +G+V  
Sbjct: 601  LETEDYSSSSSAFSATATPPRPPQVLEAAKQGHNVTSSSKPNETVVPRQNIGEKQGMVWI 660

Query: 661  ---ENKEGNGKDASSGVSRIKRYSKLKKIEEKLGRARAAIREAAQLHNLTSIHHDPDYVP 720
               E+ E NG+ A    + IKRYS+L+K+E  L   RA+IREAA++ NLTS H DPDYVP
Sbjct: 661  NGTESDEINGRSAIITGTSIKRYSRLEKLEANLAGVRASIREAARIRNLTSSHEDPDYVP 720

Query: 721  TGPIYRNPNAFHRSYLEMERLLKIYVYKEGEPPMFHQGPCKSIYSTEGRFIHEMEKGNLY 780
             GPIYRN NAFHR                                               
Sbjct: 721  RGPIYRNANAFHR----------------------------------------------- 780

Query: 781  TTNDPHQALLYFLPFSVVNLVQYLYVPNSHEVNAIGVAVSDYINVISNKHSFWNRSLGAD 840
             T DP QAL+YFLPFSVV LVQYLYV +SH+   IG AV DY+NVIS+KH FWNRSLGAD
Sbjct: 781  -TRDPDQALVYFLPFSVVMLVQYLYVADSHDTQPIGRAVVDYVNVISDKHPFWNRSLGAD 840

Query: 841  HFMLSCHDWGPRTTSYVPLLFNNSIRVLCNANVSEGFRPSKDASFPEIHLRTGEIDGLLG 900
            HFMLSCHDWGP T++YVP L+ NSIRVLCNAN SEGF PSKD SFPEIHLRTGE  GLLG
Sbjct: 841  HFMLSCHDWGPSTSAYVPHLYQNSIRVLCNANTSEGFNPSKDVSFPEIHLRTGETKGLLG 900

Query: 901  GLSPSRRTVLAFFAGRLHGHIRYLLLQNWKEKDEDVLVYDELPSGISYNSMLKKSRFCLC 960
            GLSPSRR +LAFFAGRLHGHIRYLLL  WKEKD+DV VYD+LP+G+SY SMLKKSRFCLC
Sbjct: 901  GLSPSRRLILAFFAGRLHGHIRYLLLNEWKEKDQDVQVYDQLPNGVSYESMLKKSRFCLC 960

Query: 961  PSGYEVASPRVVEAIYAECVPVLISESYVPPFSDVLNWKSFAVQIQVKDIPNIKEILRGI 963
            PSGYEVASPRVVEAIYAECVPVLIS+SYVPPFSDVL WKSF+VQ+QVKDIPNIK IL GI
Sbjct: 961  PSGYEVASPRVVEAIYAECVPVLISDSYVPPFSDVLEWKSFSVQVQVKDIPNIKRILMGI 1013

BLAST of HG10003551 vs. TAIR 10
Match: AT5G03795.1 (Exostosin family protein)

HSP 1 Score: 496.9 bits (1278), Expect = 3.7e-140
Identity = 231/386 (59.84%), Positives = 296/386 (76.68%), Query Frame = 0

Query: 577 SKLKKIEEKLGRARAAIREAAQLHNLTSIHHDPDYVPTGPIYRNPNAFHRSYLEMERLLK 636
           S L+KIE KL +ARA+I+ A    ++     DPDYVP GP+Y N   FHRSYLEME+  K
Sbjct: 136 SNLEKIEFKLQKARASIKAA----SMDDPVDDPDYVPLGPMYWNAKVFHRSYLEMEKQFK 195

Query: 637 IYVYKEGEPPMFHQGPCKSIYSTEGRFIHEMEKGNLYTTNDPHQALLYFLPFSVVNLVQY 696
           IYVYKEGEPP+FH GPCKSIYS EG FI+E+E    + TN+P +A +++LPFSVV +V+Y
Sbjct: 196 IYVYKEGEPPLFHDGPCKSIYSMEGSFIYEIETDTRFRTNNPDKAHVFYLPFSVVKMVRY 255

Query: 697 LYVPNSHEVNAIGVAVSDYINVISNKHSFWNRSLGADHFMLSCHDWGPRTTSYVPLLFNN 756
           +Y  NS + + I   V DYIN++ +K+ +WNRS+GADHF+LSCHDWGP  +   P L +N
Sbjct: 256 VYERNSRDFSPIRNTVKDYINLVGDKYPYWNRSIGADHFILSCHDWGPEASFSHPHLGHN 315

Query: 757 SIRVLCNANVSEGFRPSKDASFPEIHLRTGEIDGLLGGLSPSRRTVLAFFAGRLHGHIRY 816
           SIR LCNAN SE F+P KD S PEI+LRTG + GL+GG SPS R +LAFFAG +HG +R 
Sbjct: 316 SIRALCNANTSERFKPRKDVSIPEINLRTGSLTGLVGGPSPSSRPILAFFAGGVHGPVRP 375

Query: 817 LLLQNWKEKDEDVLVYDELPSGISYNSMLKKSRFCLCPSGYEVASPRVVEAIYAECVPVL 876
           +LLQ+W+ KD D+ V+  LP G SY+ M++ S+FC+CPSGYEVASPR+VEA+Y+ CVPVL
Sbjct: 376 VLLQHWENKDNDIRVHKYLPRGTSYSDMMRNSKFCICPSGYEVASPRIVEALYSGCVPVL 435

Query: 877 ISESYVPPFSDVLNWKSFAVQIQVKDIPNIKEILRGISKTQYLRMQRRVKQVQKHFVLNG 936
           I+  YVPPFSDVLNW+SF+V + V+DIPN+K IL  IS  QYLRM RRV +V++HF +N 
Sbjct: 436 INSGYVPPFSDVLNWRSFSVIVSVEDIPNLKTILTSISPRQYLRMYRRVLKVRRHFEVNS 495

Query: 937 TPKRFDAFHMILHSIWLRRLNIHIQD 963
             KRFD FHMILHSIW+RRLN+ I++
Sbjct: 496 PAKRFDVFHMILHSIWVRRLNVKIRE 517

BLAST of HG10003551 vs. TAIR 10
Match: AT3G07620.1 (Exostosin family protein)

HSP 1 Score: 470.3 bits (1209), Expect = 3.7e-132
Identity = 229/407 (56.27%), Positives = 295/407 (72.48%), Query Frame = 0

Query: 557 ENKEGNGKDASSGVSRIKRYSKLKKIEEKLGRARAAIREAAQLHNLT--SIHHDPDYVPT 616
           E ++ NG +  SG      + +  K+E +L  AR  IREA   ++ T  S   D DYVP 
Sbjct: 68  EKRKRNGSNPGSGY-----WKRDGKVEAELATARVLIREAQLNYSSTTSSPLGDEDYVPH 127

Query: 617 GPIYRNPNAFHRSYLEMERLLKIYVYKEGEPPMFHQGPCKSIYSTEGRFIHEMEKGNL-Y 676
           G IYRNP AFHRSYL ME++ KIYVY+EG+PP+FH G CK IYS EG F++ ME   L Y
Sbjct: 128 GDIYRNPYAFHRSYLLMEKMFKIYVYEEGDPPIFHYGLCKDIYSMEGLFLNFMENDVLKY 187

Query: 677 TTNDPHQALLYFLPFSVVNLVQYLYVPNSHEVNAIGVAVSDYINVISNKHSFWNRSLGAD 736
            T DP +A +YFLPFSVV ++ +L+ P   +   +   ++DY+ +IS K+ +WN S G D
Sbjct: 188 RTRDPDKAHVYFLPFSVVMILHHLFDPVVRDKAVLERVIADYVQIISKKYPYWNTSDGFD 247

Query: 737 HFMLSCHDWGPRTTSYVPLLFNNSIRVLCNANVSEGFRPSKDASFPEIHLRTGEIDGLLG 796
           HFMLSCHDWG R T YV  LF NSIRVLCNAN+SE F P KDA FPEI+L TG+I+ L G
Sbjct: 248 HFMLSCHDWGHRATWYVKKLFFNSIRVLCNANISEYFNPEKDAPFPEINLLTGDINNLTG 307

Query: 797 GLSPSRRTVLAFFAGRLHGHIRYLLLQNWKEKDEDVLVYDELPSGISYNSMLKKSRFCLC 856
           GL P  RT LAFFAG+ HG IR +LL +WKEKD+D+LVY+ LP G+ Y  M++KSRFC+C
Sbjct: 308 GLDPISRTTLAFFAGKSHGKIRPVLLNHWKEKDKDILVYENLPDGLDYTEMMRKSRFCIC 367

Query: 857 PSGYEVASPRVVEAIYAECVPVLISESYVPPFSDVLNWKSFAVQIQVKDIPNIKEILRGI 916
           PSG+EVASPRV EAIY+ CVPVLISE+YV PFSDVLNW+ F+V + VK+IP +K IL  I
Sbjct: 368 PSGHEVASPRVPEAIYSGCVPVLISENYVLPFSDVLNWEKFSVSVSVKEIPELKRILMDI 427

Query: 917 SKTQYLRMQRRVKQVQKHFVLNGTPKRFDAFHMILHSIWLRRLNIHI 961
            + +Y+R+   VK+V++H ++N  PKR+D F+MI+HSIWLRRLN+ +
Sbjct: 428 PEERYMRLYEGVKKVKRHILVNDPPKRYDVFNMIIHSIWLRRLNVKL 469

BLAST of HG10003551 vs. TAIR 10
Match: AT1G71360.1 (Galactose-binding protein)

HSP 1 Score: 433.3 bits (1113), Expect = 5.0e-121
Identity = 267/571 (46.76%), Positives = 347/571 (60.77%), Query Frame = 0

Query: 1   MRRPVGALLRDRRAVQVPTSGRNHLYKVSLSLVFILWGLIFLFSLWFSRGDGCQEGSVLH 60
           M+R   ALL  RR  +  ++GRN  YKVSLSLVF++WGL+FL +LW S  DG +  S++ 
Sbjct: 1   MQRSRRALLVRRRVSETTSNGRNRFYKVSLSLVFLIWGLVFLSTLWISHVDGDKGRSLV- 60

Query: 61  PADVSTSNESKLENNKDSDILYEPPKGETDSTIQLNDSCSIDATS----PG--SDNDILS 120
                           DS    EP     D T +  D+ S+++TS    PG  SD DI +
Sbjct: 61  ----------------DSVEKGEPDDERADETAESVDATSLESTSVHSNPGLSSDVDIAA 120

Query: 121 SEESSSHIRPATRLPEAESSSTGV-------KSENKPVKG-------------------- 180
           + ES       T L + E  +T V         +N P+K                     
Sbjct: 121 AGESKG---SETILKQLEVDNTIVIVGNVTESKDNVPMKQSEINNNTVPGNDTETTGSKL 180

Query: 181 DTSSETVLLGLEEFKSRAFISRSKSETGQAGNTIHRVEPGGAEYNYASASKGAKVLAFNK 240
           D  S  V LGL+EFKSRA  SR KS +GQ    IHR+EPGG EYNYA+ASKGAKVL+ NK
Sbjct: 181 DQLSRAVPLGLDEFKSRASNSRDKSLSGQVTGVIHRMEPGGKEYNYAAASKGAKVLSSNK 240

Query: 241 EAKGASNILGRDKDKYLRNPCSAEEKFVVIELSEETLVVTIEIANFEHHSSNLKEFEVHG 300
           EAKGAS+I+ RDKDKYLRNPCS E KFVVIELSEETLV TI+IANFEH+SSNLK+FE+ G
Sbjct: 241 EAKGASSIICRDKDKYLRNPCSTEGKFVVIELSEETLVNTIKIANFEHYSSNLKDFEILG 300

Query: 301 SLVYPTDVWFKLGNFTAPNAKHAHRFVLKDPKWVRYLKLNFLTHYGSEFYCTLSTVEVYG 360
           +LVYPTD W  LGNFTA N KH   F   DPKWVRYLKLN L+HYGSEFYCTLS +EVYG
Sbjct: 301 TLVYPTDTWVHLGNFTALNMKHEQNFTFADPKWVRYLKLNLLSHYGSEFYCTLSLLEVYG 360

Query: 361 MDAVEMMLEDLISAQHKSSI----SDEATTEKRVIPSQ---SGPNDEGQHGRELQSLTTE 420
           +DAVE MLEDLIS Q K+ +     D    EK+ + ++       D+ +   + Q  + E
Sbjct: 361 VDAVERMLEDLISIQDKNILKLQEGDTEQKEKKTMQAKESFESDEDKSKQKEKEQEASPE 420

Query: 421 ESDDDVDLELTKSNIPDPVEESHHQQPGRMPGDTVLKILTQKVRSLDLSLSVLEQYLEDL 480
            +    ++ L K  +PDPVEE  HQ   RMPGDTVLKIL QK+RSLD+SLSVLE YLE+ 
Sbjct: 421 NAVVKDEVSLEKRKLPDPVEEIKHQPGSRMPGDTVLKILMQKIRSLDVSLSVLESYLEER 480

Query: 481 TSKYGNIFKEFDKDIENSDLLIEKTREDIRNILKIQDSTDKDLRDLISWKSIVSLQLDNL 532
           + KYG IFKE D +    +  +E  R ++  + + +++T K+  ++  W+  V  +L+  
Sbjct: 481 SLKYGMIFKEMDLEASKREKEVETMRLEVEGMKEREENTKKEAMEMRKWRMRVETELEKA 540

BLAST of HG10003551 vs. TAIR 10
Match: AT5G25310.1 (Exostosin family protein)

HSP 1 Score: 432.6 bits (1111), Expect = 8.5e-121
Identity = 209/394 (53.05%), Positives = 282/394 (71.57%), Query Frame = 0

Query: 571 SRIKRYSKLKKIEEKLGRARAAIREAAQLHNLTSIHHDPDYVPTGPIYRNPNAFHRSYLE 630
           S+ ++ ++   +E+ L +ARA+I EA+   N T    D   +P   IYRNP+A +RSYLE
Sbjct: 90  SKPEKLNRRNLVEQGLAKARASILEASSNVNTTLFKSD---LPNSEIYRNPSALYRSYLE 149

Query: 631 MERLLKIYVYKEGEPPMFHQGPCKSIYSTEGRFIHEMEKGNL-YTTNDPHQALLYFLPFS 690
           ME+  K+YVY+EGEPP+ H GPCKS+Y+ EGRFI EMEK    + T DP+QA +YFLPFS
Sbjct: 150 MEKRFKVYVYEEGEPPLVHDGPCKSVYAVEGRFITEMEKRRTKFRTYDPNQAYVYFLPFS 209

Query: 691 VVNLVQYLYVPNSHEVNAIGVAVSDYINVISNKHSFWNRSLGADHFMLSCHDWGPRTTSY 750
           V  LV+YLY  NS +   +   VSDYI ++S  H FWNR+ GADHFML+CHDWGP T+  
Sbjct: 210 VTWLVRYLYEGNS-DAKPLKTFVSDYIRLVSTNHPFWNRTNGADHFMLTCHDWGPLTSQA 269

Query: 751 VPLLFNNSIRVLCNANVSEGFRPSKDASFPEIHLRTGEID---GLLGGLSPSRRTVLAFF 810
              LFN SIRV+CNAN SEGF P+KD + PEI L  GE+D    L   LS S R  L FF
Sbjct: 270 NRDLFNTSIRVMCNANSSEGFNPTKDVTLPEIKLYGGEVDHKLRLSKTLSASPRPYLGFF 329

Query: 811 AGRLHGHIRYLLLQNWKEKDEDVLVYDELPSGISYNSMLKKSRFCLCPSGYEVASPRVVE 870
           AG +HG +R +LL++WK++D D+ VY+ LP  ++Y   ++ S+FC CPSGYEVASPRV+E
Sbjct: 330 AGGVHGPVRPILLKHWKQRDLDMPVYEYLPKHLNYYDFMRSSKFCFCPSGYEVASPRVIE 389

Query: 871 AIYAECVPVLISESYVPPFSDVLNWKSFAVQIQVKDIPNIKEILRGISKTQYLRMQRRVK 930
           AIY+EC+PV++S ++V PF+DVL W++F+V + V +IP +KEIL  IS  +Y  ++  ++
Sbjct: 390 AIYSECIPVILSVNFVLPFTDVLRWETFSVLVDVSEIPRLKEILMSISNEKYEWLKSNLR 449

Query: 931 QVQKHFVLNGTPKRFDAFHMILHSIWLRRLNIHI 961
            V++HF LN  P+RFDAFH+ LHSIWLRRLN+ +
Sbjct: 450 YVRRHFELNDPPQRFDAFHLTLHSIWLRRLNLKL 479

BLAST of HG10003551 vs. TAIR 10
Match: AT3G42180.1 (Exostosin family protein)

HSP 1 Score: 410.2 bits (1053), Expect = 4.5e-114
Identity = 200/397 (50.38%), Positives = 268/397 (67.51%), Query Frame = 0

Query: 573 IKRYSKLKKIEEKLGRARAAIREAAQLHNLTSIHHDPDYVPTGPIYRNPNAFHRSYLEME 632
           +KR S L+K EE+L +ARAAIR A +  N TS      Y+PTG IYRN  AFH+S++EM 
Sbjct: 72  VKRRSNLEKREEELRKARAAIRRAVRFKNCTSNEEVITYIPTGQIYRNSFAFHQSHIEMM 131

Query: 633 RLLKIYVYKEGEPPMFHQGPCKSIYSTEGRFIHEME-----KGNLYTTNDPHQALLYFLP 692
           +  K++ YKEGE P+ H GP   IY  EG+FI E+          +  + P +A  +FLP
Sbjct: 132 KTFKVWSYKEGEQPLVHDGPVNDIYGIEGQFIDELSYVMGGPSGRFRASRPEEAHAFFLP 191

Query: 693 FSVVNLVQYLYVPNSHEVN----AIGVAVSDYINVISNKHSFWNRSLGADHFMLSCHDWG 752
           FSV N+V Y+Y P +   +     +    +DY++V+++KH FWN+S GADHFM+SCHDW 
Sbjct: 192 FSVANIVHYVYQPITSPADFNRARLHRIFNDYVDVVAHKHPFWNQSNGADHFMVSCHDWA 251

Query: 753 PRTTSYVPLLFNNSIRVLCNANVSEGFRPSKDASFPEIHLRTGEIDGLLGGLSPSRRTVL 812
           P      P  F N +R LCNAN SEGFR + D S PEI++   ++     G +P  RT+L
Sbjct: 252 PDVPDSKPEFFKNFMRGLCNANTSEGFRRNIDFSIPEINIPKRKLKPPFMGQNPENRTIL 311

Query: 813 AFFAGRLHGHIRYLLLQNWKEKDEDVLVYDELPSGISYNSMLKKSRFCLCPSGYEVASPR 872
           AFFAGR HG+IR +L  +WK KD+DV VYD L  G +Y+ ++  S+FCLCPSGYEVASPR
Sbjct: 312 AFFAGRAHGYIREVLFSHWKGKDKDVQVYDHLTKGQNYHELIGHSKFCLCPSGYEVASPR 371

Query: 873 VVEAIYAECVPVLISESYVPPFSDVLNWKSFAVQIQVKDIPNIKEILRGISKTQYLRMQR 932
            VEAIY+ CVPV+IS++Y  PF+DVL+W  F+V+I V  IP+IK+IL+ I   +YLRM R
Sbjct: 372 EVEAIYSGCVPVVISDNYSLPFNDVLDWSKFSVEIPVDKIPDIKKILQEIPHDKYLRMYR 431

Query: 933 RVKQVQKHFVLNGTPKRFDAFHMILHSIWLRRLNIHI 961
            V +V++HFV+N   + FD  HMILHS+WLRRLNI +
Sbjct: 432 NVMKVRRHFVVNRPAQPFDVIHMILHSVWLRRLNIRL 468
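
The percentages in each HSP summary follow directly from the match counts (for example 231/386 for AT5G03795.1). The following is a minimal Python sketch, not part of the database's own tooling, that reads one summary and recomputes them; the names `HSP_SUMMARY` and `parse_hsp_summary` are hypothetical, and the pattern assumes the two-line layout shown in the alignments above.

```python
import re

# Matches the two summary lines printed above each alignment block.
# "Posi?tives" also accepts the misspelling "Postives" seen in some exports.
HSP_SUMMARY = re.compile(
    r"Score:\s*(?P<bits>[\d.]+) bits \((?P<raw>\d+)\), Expect = (?P<evalue>\S+)\s+"
    r"Identity = (?P<ident>\d+)/(?P<length>\d+) \([\d.]+%\), "
    r"Posi?tives = (?P<pos>\d+)/\d+ \([\d.]+%\)"
)


def parse_hsp_summary(text):
    """Return the HSP statistics as numbers, recomputing the percentages."""
    m = HSP_SUMMARY.search(text)
    if m is None:
        raise ValueError("no HSP summary found")
    ident, pos, length = int(m["ident"]), int(m["pos"]), int(m["length"])
    return {
        "bits": float(m["bits"]),
        "raw_score": int(m["raw"]),
        "evalue": float(m["evalue"]),
        # Percentages are counts divided by the alignment length.
        "identity_pct": round(100 * ident / length, 2),
        "positives_pct": round(100 * pos / length, 2),
    }


# Summary copied from the AT5G03795.1 hit.
sample = (
    "HSP 1 Score: 496.9 bits (1278), Expect = 3.7e-140\n"
    "Identity = 231/386 (59.84%), Positives = 296/386 (76.68%), Query Frame = 0\n"
)
stats = parse_hsp_summary(sample)
print(stats)
```

The recomputed values can be compared with the percentages printed in the summary as a quick consistency check on a copied or re-exported report.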

The following BLAST results are available for this feature:
Match Name	E-value	Identity	Description
KAA0039335.1	0.0e+00	87.94	putative glycosyltransferase [Cucumis melo var. makuwa] >TYK00518.1 putative gly...
KAE8648979.1	0.0e+00	86.80	hypothetical protein Csa_009042 [Cucumis sativus]
KAG6592335.1	0.0e+00	81.49	putative glycosyltransferase, partial [Cucurbita argyrosperma subsp. sororia]
KAF3440963.1	0.0e+00	62.83	hypothetical protein FNV43_RR19249 [Rhamnella rubrinervis]
TQD89737.1	0.0e+00	60.64	hypothetical protein C1H46_024731 [Malus baccata]
Match Name	E-value	Identity	Description
Q9FFN2	5.2e-139	59.84	Probable glycosyltransferase At5g03795 OS=Arabidopsis thaliana OX=3702 GN=At5g03...
Q9SSE8	5.2e-131	56.27	Probable glycosyltransferase At3g07620 OS=Arabidopsis thaliana OX=3702 GN=At3g07...
F4I8I0	7.0e-120	46.76	SUN domain-containing protein 4 OS=Arabidopsis thaliana OX=3702 GN=SUN4 PE=1 SV=...
Q3E7Q9	1.2e-119	53.05	Probable glycosyltransferase At5g25310 OS=Arabidopsis thaliana OX=3702 GN=At5g25...
Q3EAR7	6.4e-113	50.38	Probable glycosyltransferase At3g42180 OS=Arabidopsis thaliana OX=3702 GN=At3g42...
Match Name	E-value	Identity	Description
A0A5A7TD36	0.0e+00	87.94	Putative glycosyltransferase OS=Cucumis melo var. makuwa OX=1194695 GN=E5676_sca...
A0A540LTE3	0.0e+00	60.64	SUN domain-containing protein OS=Malus baccata OX=106549 GN=C1H46_024731 PE=3 SV...
A0A498IR71	0.0e+00	56.67	SUN domain-containing protein OS=Malus domestica OX=3750 GN=DVH24_009530 PE=3 SV...
A0A314YQN6	0.0e+00	64.52	Putative glycosyltransferase OS=Prunus yedoensis var. nudiflora OX=2094558 GN=Py...
A0A5N5FL53	6.9e-304	55.29	SUN domain-containing protein OS=Pyrus ussuriensis x Pyrus communis OX=2448454 G...
Match Name	E-value	Identity	Description
AT5G03795.1	3.7e-140	59.84	Exostosin family protein
AT3G07620.1	3.7e-132	56.27	Exostosin family protein
AT1G71360.1	5.0e-121	46.76	Galactose-binding protein
AT5G25310.1	8.5e-121	53.05	Exostosin family protein
AT3G42180.1	4.5e-114	50.38	Exostosin family protein
InterPro
Analysis Name: InterPro Annotations of Bottle gourd (Hangzhou Gourd) v1
Date Performed: 2022-08-01
IPR Term	IPR Description	Source	Source Term	Source Description	Alignment
None	No IPR available	COILS	Coil	Coil	coord: 579..599
None	No IPR available	COILS	Coil	Coil	coord: 497..524
None	No IPR available	GENE3D	2.60.120.260		coord: 210..338; e-value: 3.8E-12; score: 48.3
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 129..149
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 70..84
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 89..122
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 346..372
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 61..151
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 346..386
None	No IPR available	PANTHER	PTHR11062:SF313	GLYCOSYLTRANSFERASE-RELATED	coord: 540..959
IPR040911	Exostosin, GT47 domain	PFAM	PF03016	Exostosin	coord: 633..912; e-value: 3.4E-55; score: 187.4
IPR012919	SUN domain	PFAM	PF07738	Sad1_UNC	coord: 206..328; e-value: 1.1E-30; score: 106.3
IPR012919	SUN domain	PROSITE	PS51469	SUN	coord: 163..330; score: 35.378563
IPR004263	Exostosin-like	PANTHER	PTHR11062	EXOSTOSIN HEPARAN SULFATE GLYCOSYLTRANSFERASE-RELATED	coord: 540..959
IPR008979	Galactose-binding-like domain superfamily	SUPERFAMILY	49785	Galactose-binding domain-like	coord: 212..326
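
The alignment coordinates place the SUN-domain hits in the N-terminal part of the protein and the exostosin hits in the C-terminal part. The following is a small Python sketch, offered only as an illustration, that parses a few of the `coord:` entries and tests whether two hits overlap; `HITS`, `parse_coord`, and `overlaps` are hypothetical names, and only four rows of the table are copied in.

```python
import re

# (source, source term, alignment cell) copied from four rows of the InterPro table.
HITS = [
    ("PFAM", "PF07738", "coord: 206..328"),
    ("PFAM", "PF03016", "coord: 633..912"),
    ("PROSITE", "PS51469", "coord: 163..330"),
    ("PANTHER", "PTHR11062", "coord: 540..959"),
]


def parse_coord(cell):
    """Extract (start, end) from a cell such as 'coord: 206..328'."""
    m = re.search(r"coord:\s*(\d+)\.\.(\d+)", cell)
    if m is None:
        raise ValueError(f"no coordinates in {cell!r}")
    return int(m.group(1)), int(m.group(2))


def overlaps(a, b):
    """True if two inclusive (start, end) intervals share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


# Map each source term to its interval on the protein.
spans = {term: parse_coord(cell) for _, term, cell in HITS}

for term, (start, end) in sorted(spans.items(), key=lambda kv: kv[1]):
    print(f"{term}: {start}..{end} ({end - start + 1} residues)")

print("Pfam SUN and Exostosin overlap:", overlaps(spans["PF07738"], spans["PF03016"]))
```

Coordinates are treated as 1-based and inclusive, which is an assumption about the table's convention.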

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name	Unique Name	Type
HG10003551.1	HG10003551.1	mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category	Term Accession	Term Name
biological_process	GO:0006486	protein glycosylation
cellular_component	GO:0000139	Golgi membrane
cellular_component	GO:0016021	integral component of membrane
molecular_function	GO:0016757	glycosyltransferase activity
molecular_function	GO:0043621	protein self-association