HG10004824 (gene) Bottle gourd (Hangzhou Gourd) v1

Overview
Name: HG10004824
Type: gene
Organism: Lagenaria siceraria cv. Hangzhou Gourd (Bottle gourd (Hangzhou Gourd) v1)
Description: pentatricopeptide repeat-containing protein At1g19720
Location: Chr08: 20716060 .. 20722615 (+)
RNA-Seq Expression: HG10004824
Synteny: HG10004824
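
The Location field gives a chromosome, a start and end coordinate, and a strand. A minimal sketch of how that string could be parsed and the span length computed is shown below. The function name `parse_location` is hypothetical, and the length assumes the coordinates are 1-based and inclusive, which this page does not state.

```python
import re

def parse_location(text):
    """Parse a string such as 'Chr08: 20716060 .. 20722615 (+)'.

    Returns (chromosome, start, end, strand, length).
    Assumption: coordinates are 1-based and inclusive, so
    length = end - start + 1.
    """
    pattern = r"\s*([^\s:]+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*"
    match = re.fullmatch(pattern, text)
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    chrom = match.group(1)
    start = int(match.group(2))
    end = int(match.group(3))
    strand = match.group(4)
    return chrom, start, end, strand, end - start + 1

if __name__ == "__main__":
    print(parse_location("Chr08: 20716060 .. 20722615 (+)"))
```

Under that assumption the gene spans 6556 bases on the plus strand of Chr08.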
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

ATGGAGAAACTGGCAATTCCTTGCCAAACAAACCCTCCAATTTCTGTCCCTGCTTCAATTATCAAAGCCAAACCCCTTAAATTCTCCTCAAAACCAATTAAAACTTCTATATTTTTCACCCAGAAATTTACTACAAAGTTCAATGACGACAATTTGAGTTACCTTTGCAGCAATGGGCTCCTCCGGGAAGCCATAACATCCATCGATTCAATGTCTAAACGTGGGTCTAAGCTAAGCACCAACACGTATATCAATTTGCTTCAGACTTGCATAGATGCGGATTCTATTGAACTGGGTCGTGAGCTTCATGATCGTATGAGTTCAGTCGATCAGGTCAACCCATTTGTTGAGACAAAGCTAGTAAGCATGTATGCGAAATGTGGGTTTCTAAAAGATGCACGTAAGGTGTTTGATGGAATGCAGGAGAGAAATTTGTACACTTGGTCGGCAATGATAGGCGCATATTCAAGAGAGCAGAGGTGGAAAGAAGTAGTTGAACTTTTCTTTTTGATGATGGGAGATGGGGTACTGCCTGATGCTTTTCTTTTCCCAAAAATACTGCAGGCTTGTGGGAATTGCGAGGATCTTGAAACTGTAAAGTTAATACATTCTGTGGTTATTCGATGTGGGTTGAGTTGTTATATTCGAGTGAGCAATTCCATTTTGACGGCATTTGTCAAATGTGGGAAATTGTGTCTAGCTAGAAAGTTCTTTGGGAACATGGACGAAAGAGATGGGGTTTCTTGGAATGCTATAATAGCTGGTTATTGCCAGAAGGGAAATGGTGATGAAGCTAGGAGATTGCTTGATGCGATGAGCGATCAAGGATTCAAACCGGGTTTGGTTACTTATAACATAATGATTGCTAGTTATAGCCAGTTGGGGAATTTTAATCTTGTCATAGAATTGAAGAAGAAGATGGAGAGTATGGGGATAGCTCCTGATGTTTATACTTGGACCTCAATGATTTCGGGTTTTGCTCAAAGCAGCAGGATTAGTCAGGCATTGGATTTCTTCAAGGAGATGATTCTAGCTTGGGTTGAACCAAATGCTATAACTATTGCAAGTGCGACCTCAGCCTGTGCTTCCTTAAAATCACTGCAAAAAGGGCTGGAAATACATTGTTTTGCAATTAAAATGGGAATTGCGCATGAAATATTGGTTGGGAATTCACTTATTGATATGTATTCTAAATGTGGAAAATTGGAAGCTGCTCGCCATGTCTTTGACACGATTTTAGAAAAAGACATTTATACATGGAACTCGATGATTGGAGGATACTGTCAGGCTGGATATTGTGGAAAAGCATATGAACTTTTTATGAGATTAAGGGAATCAAATGTAATGCCTAATGTTGTTACATGGAATGTGATGATATCAGGATGTATACAGAATGGAGATGAGGATCAAGCTGTGAACCTCTTTCAAATTATGGAAAAAGATGGGGAGGTTAAGCGGAATACAGCATCCTGGAATTCTCTGATTGCTGGGTACCACCAGCTTGGTGAAAAGAACAAAGCTCTAGCAATATTTCGACAAATGCAGTCTCTTAATTTTAATCCTAATTCAGTGACTATCTTGAGCATTTTACCAGCTTGTGCAAATGTAATGGCAGAGAAAAAAGTAAAGGAAATCCATGGTTGTGTGTTGCGCAGAAACCTGGAATCTGAGCTTCCTATTGCAAACTCACTTATAGACACTTATGCCAAGTCAGGGAACATTCAATATTCAAGAACCATCTTTGATGGCATGTCATCCAAAGATATTATCACATGGAATTCAATTATTGCAGGATATATTTTACATGGTTGTTCAGATGCTGCATTTCATTTATTTGGTCAGATGAAAAAGTTAGGAATTAGGCCAAACCGAGGTACTCTGGCTAGTCTTATTCATGCCTATGGCATTGCCGGAATGGTAGACAAGGGAAGACATGTTTTTTCTAGCATCACTGAAGAACATCAAATTCTACCAACTTTAGATCATTATTTGGCTATGGT
AGATCTGTATGGACGTTCTGGGAGGCTTACAGATGCAATAGAGTTCATCGAAGATATGCCTATAGAACCCGATGGCTCTATCTGGACCAGCTTACTTACTGCCTGTAGGTTTCACGGGAACTTACACTTGGCGGTACAAGCAGCCAAGCGCCTACACGAATTGGAGCCTGATAATCATGTGATTTACCGTTTATTAATACAGGCATATGCTTTATATGGGAACTTTGAACAAGCTCTAAAAGTGAGAAAGCTTGGAAAAGAGAGTGCGATGAAGAAATGTACAGCACAGTGCTGGGTTGAAGTCAGGAACAAAGTCCATTTATTTGTCACTAGCGACCAGTCTAAACTTGACGTTCTAAATACTTGGATAAAAAGCATTGAAGGGAAAGTAAAGAAATTTAATAATCACCATCAGCTTTCTATTGACGAAGAACAGAAGGAAGAAAAAATTGGTGGGTTCCACTGTGAAAAATTTGCATTTGCTTTTGCCCTTATTGGCTCATCTCATACACCCAAAAGTATAAAGATTGTGAAAAACTTGAGAATATGCGGAGACTGTCATCAGATGGCCAAATATATCTCAGCGGCTCATGACTGTGAAATATATTTAAGTGACTCAAAATGCTTGCACCATTTCAAAAATGGTCACTGTTCTTGTGGGGATTATTGGTAGCCTTTAACACGATCTGCAAGTTGTTTTTAGAATCAGATTCCCGTTAATGGAAGAGGGTAGTTGCTTGAAGAGATGAAATTAAATCTCAAGGAATGAGCATTTCTGAAGATTGGTTAGACAGATTCCTGTAGAATTTGGCTGAGAAAGAATTGTTCATAACACGTTTTGACAGGTGCACGATCTATATGTAGGTAGCCGAATAAATGTGAATGGAACCTACTAAGTTTTAAACATTAATTTTATTTATTTGTTAATATAATTCATACAGATTCTTAATAGTGGTTAGTGGAGCATGTAGACATTGCATGTTTGTTTGGGGTGCATAGCTTACTGTTTCTTGCAAGTGTGTAATGACATGACACATCTAGGTTGAAATGATTCTTGATATTGTTTACCTTGGTATTTTGATGAAGCTATGGAGAAAGCTCCTTGGTTGAACAAGTCTTTCACGATTTTCTTTATTCCCATGTTGCGAGGGGCAGTCACTTTCTAGATTCATGTCCAATTGATATGGTGGTTTTACCTCCTTAGCTTAGCTGCTGCTCACTTGCAGTCCATCTCGATTTTTGTCTGCCTTTGGTTATAAGAAATCTTCGGACAGATGGCTGTTTTACTAATAATTCAGTTGTCCTTCTATGAGTGAACATACGATTATGGATTGTCAAACTTCTTCATACTTACCGGTTATCATTTTCTCCAAAGGCCATGATGGTTACCCACATATCAAGAAAGCTTGTATATCTAACAGGATTAAAGGATGTTGGGTAATATGCGTGCCTTCTCGGAGACTGGGGAAGCCCCTGGGGAGGGCTTGATATGTAGAAAAACCAAGATAGAGATCTGTTGGGAATTCATGCCTGCGCAATTCACTTTTCAGTACAGAATTTTAAGTTTCTCTGGATTAAACTCTCACCGTCATTGTTTGTTAAATTAGGTGGAATATGCATTACAATTAGCTGGTGAGACTCTAGTTAGGATGCCATACATAGGAAAATAAGAGAAGCTCACCATATTTTAATTGAAAACGAAGTGACTCATGGAGGCGATAGAATAATAATGATACCAAACAGATTAGAATCCATTGGCAAACGCTACTATTTATTTGTTTTTTTAAAATATTAGATTTCATTTTCTATTCTATCCAGGGATCAGATTCTTGTCTACTAAATCTTTCCCCTCCTTAGTGAAATTTGGAATCTATCTTCCCCCCATCATTTGAAAAAGCTTTTGAAACCTTGATTCTGTACTCCACGCCTTTTTTTGCTCAAATAAATTCCTCAGTCTGCTTTGCACACTACAATTGTGCACGGGCATTGAAGTTCAAGAA
AATTGTTGCTTTTTTCACTTTTCTAGTCTTTTGTTTCTGTAGCAAGAATGGTAGGTGCTATTCATTTTCTTTGTATTTTTTGATTGCATTTACTTCTTTTTTTCTCCACAACCAGTTAGAAGTTTGTTTTTTCCCACACAGAGTTTTGATGATTCTCGAAAAATCAAGCCAGTCTTGGCTGGTCCTTTTGGAGGACCAGGTGGAAATAATTGGGACGATGGAGTTTATTCGACTATCAGGCAGTTGGTAATTTGCCATGGAGCTGGTATCGACTCCATTAAGATTCAATATGATGTGAAGGGAAGTTCAATTTGGTCAGATAGACATGGAGGAAATGGTGGCACTAAAACAGACACGGTAGCTCCATACTTTGTTGGATCATCAATTGTTTGCTCAGTAATTTTGAAACATTTCGAGTCCTGTACTTCTTTGAAATGAAATTTAGATTTCCAAATGATACACAAATGACGTTTTAAGTTGGTTTTAGTTTGATCATCGATGAAGACAAGAGTCTCTGTGTACTAAGTTTTCATTTTTGGATTTACAGGTGAAGCTTGATTTTCCGGATGAGTACTTGACTATGATCCGTGGACACTATGGTAGCTTTGTGTCATTTGACAAAGTTTTTGTTCGATCCCTGACTTTTATGAGCAACAAAAAGAAGTACGGACCTTACGGGGTCGAACAAGGAACAGTTTTCTCTTTCCCAACGACTGAGGGCAAGATTGTAGGGTTCCATGGCAGGAGTGGATTGTACCTGGATGCCATTGGAGTTTACCTAAAGCCTATGGCAATACAAACACCATCTAAAGCAATGATTCAGTCACAGAATTATGTTGCGAGTAAGACTGAAAGTGAAGGCTATTCGATTATACAAGGAAGTGTTGGCCAAAATTATGATATTGTTCTTGCTTTGAGGCAGAAGGATGAATTCAAAAAGCCTCTTCCAACTACTGTCTCAAAACAAGTATCTAGTTCCTCAAGCTCAGAATCAAGTGATGATGAATCCACAGTCAAGGTGGGATGAAACATTTTTCTTTTATGCCTTTATTTGTCTTGTTTCAAGTAAACTTTTTCCCTTGTAAGGATTGAACTCAATTAACGGTTGCTGATGGTCATCTCAGAGGCCTGTTAAGAAGGGACTGTCTAAAGTCGAAAATGTGGTACCATGTGGACCATGGGGCGGCTCGGGCGGAACTCCATTTGATGATGGATATTACACTGGTATTAGACAAATTAATGTGTCACGCAATGTTGGAATTGTATATATAAGAGTTCTGTATGCTTGCGATGAGGAATCTATATGGGGAGTACGAGCAGGTGGAACGGGAGGATTCAAACATGACAAGGTGAAAGAATTTCCTGCATATTTTGGATTGAAAGGCATGGAATTACACCTCTCATTTATTACTCTCTTTCATCTCCTTCTCTTAGTCGAAACAACGGTTTTGCAGGTTATCTTGGACTATCCATATGAAATCTTGACTCATGTAACTGGACATTATGGGCCTGTCATGTACATGGGACCTAATGTTATCAAGTCACTCACATTCCATACTACAAAAACGAAGTACGGACCATTCGGAGAGGCACAAGGAACCCCCTTTAGTACCAACGTCAAGGAGGGGAAGATTGTTGGATTTCATGGGAGGAAAGGTTTGTTCCTAGATGCCCTTGGTGTGCACTTAGTTGAAGGAAAGGTGACCCCGGTGTCTCGTCCTCCCTCCAGTGATATTGTTCCTGCTGCACCACCACTTCTCGAAAATGAGAACGCCCCTTGGACTATGAAGCTGGCACCTTCAAAAGGAGGAGCACTTGAAGAGGTACTCTCTTGACTCTGCTAACCATTTTAGATGTTTCTTGCAACTGAACAAACAGAGGAAGCTAATATAAGTAGTCATGAAAGAACTTATGGATTAACTATTTGTTTCATAGATTGCTCGTGGTGTAGTAAAAGAACCGGCACCCTGTGGACCTGGACCATGGGGCGGAGATGGTGG
GAAACCATGGGATGATGGAGTATTTTCTGGCATTAAACAGATATACTTGACACGGTCTCTTGAAGCTTTTTGTTCAATTCAAATTGAATATGATCGAAACAAACAATCAGTTTGGTCAGTTAAGCATGGAGGAAACAGTGGAACAACCATACATCGGGTACCTATAAATTTAACACCTTCGACAGTTTCGACAGTTTCTTATACCTGCCAATAGATATTACATCCACAATCGAACAAACATTTATGCTGGCTGTGTTTACAGGTAAAATTGAATTATCCACATGAAGTGTTAACCTGTATATCAGGATATTACGGTTACATCGGTAAAGATGAGAGACAACAAGCTATAAAGTCACTTACTTTTCACACAAGCAGGGGGAAGTTCGGTCCATTTGGGGAGGAGGTAGGGTCGTTTTTCACATCCACGACGACGGAAGGCAAAGTGGTTGGCTTCCATGGGAGGAGCAGCTTGTATTTGGACGCCATTGGAGTTCACATGCAACACTGGCTAGGAAGCCAAAGGGCATCCAAGTCGTCTTTGTTCAAACTGTTCTGA

mRNA sequence

ATGGAGAAACTGGCAATTCCTTGCCAAACAAACCCTCCAATTTCTGTCCCTGCTTCAATTATCAAAGCCAAACCCCTTAAATTCTCCTCAAAACCAATTAAAACTTCTATATTTTTCACCCAGAAATTTACTACAAAGTTCAATGACGACAATTTGAGTTACCTTTGCAGCAATGGGCTCCTCCGGGAAGCCATAACATCCATCGATTCAATGTCTAAACGTGGGTCTAAGCTAAGCACCAACACGTATATCAATTTGCTTCAGACTTGCATAGATGCGGATTCTATTGAACTGGGTCGTGAGCTTCATGATCGTATGAGTTCAGTCGATCAGGTCAACCCATTTGTTGAGACAAAGCTAGCTTGTGGGAATTGCGAGGATCTTGAAACTGTAAAGTTAATACATTCTGTGGTTATTCGATGTGGGTTGAGTTGTTATATTCGAGTGAGCAATTCCATTTTGACGGCATTTGTCAAATGTGGGAAATTGTGTCTAGCTAGAAAGTTCTTTGGGAACATGGACGAAAGAGATGGGGTTTCTTGGAATGCTATAATAGCTGGTTATTGCCAGAAGGGAAATGGTGATGAAGCTAGGAGATTGCTTGATGCGATGAGCGATCAAGGATTCAAACCGGGTTTGGTTACTTATAACATAATGATTGCTAGTTATAGCCAGTTGGGGAATTTTAATCTTGTCATAGAATTGAAGAAGAAGATGGAGAGTATGGGGATAGCTCCTGATGTTTATACTTGGACCTCAATGATTTCGGGTTTTGCTCAAAGCAGCAGGATTAGTCAGGCATTGGATTTCTTCAAGGAGATGATTCTAGCTTGGGTTGAACCAAATGCTATAACTATTGCAAGTGCGACCTCAGCCTGTGCTTCCTTAAAATCACTGCAAAAAGGGCTGGAAATACATTGTTTTGCAATTAAAATGGGAATTGCGCATGAAATATTGGTTGGGAATTCACTTATTGATATGTATTCTAAATGTGGAAAATTGGAAGCTGCTCGCCATGTCTTTGACACGATTTTAGAAAAAGACATTTATACATGGAACTCGATGATTGGAGGATACTGTCAGGCTGGATATTGTGGAAAAGCATATGAACTTTTTATGAGATTAAGGGAATCAAATGTAATGCCTAATGTTGTTACATGGAATGTGATGATATCAGGATGTATACAGAATGGAGATGAGGATCAAGCTGTGAACCTCTTTCAAATTATGGAAAAAGATGGGGAGGTTAAGCGGAATACAGCATCCTGGAATTCTCTGATTGCTGGGTACCACCAGCTTGGTGAAAAGAACAAAGCTCTAGCAATATTTCGACAAATGCAGTCTCTTAATTTTAATCCTAATTCAGTGACTATCTTGAGCATTTTACCAGCTTGTGCAAATGTAATGGCAGAGAAAAAAGTAAAGGAAATCCATGGTTGTGTGTTGCGCAGAAACCTGGAATCTGAGCTTCCTATTGCAAACTCACTTATAGACACTTATGCCAAGTCAGGGAACATTCAATATTCAAGAACCATCTTTGATGGCATGTCATCCAAAGATATTATCACATGGAATTCAATTATTGCAGGATATATTTTACATGGTTGTTCAGATGCTGCATTTCATTTATTTGGTCAGATGAAAAAGTTAGGAATTAGGCCAAACCGAGGTACTCTGGCTAGTCTTATTCATGCCTATGGCATTGCCGGAATGGTAGACAAGGGAAGACATGTTTTTTCTAGCATCACTGAAGAACATCAAATTCTACCAACTTTAGATCATTATTTGGCTATGGTAGATCTGTATGGACGTTCTGGGAGGCTTACAGATGCAATAGAGTTCATCGAAGATATGCCTATAGAACCCGATGGCTCTATCTGGACCAGCTTACTTACTGCCTGTAGGTTTCACGGGAACTTACACTTGGCGGTACAAGCAGCCAAGCGCCTACACGAATTGGAGCCTGATAATCATGTGATTTACCGTTTATTAATACAGGC
ATATGCTTTATATGGGAACTTTGAACAAGCTCTAAAAGTGAGAAAGCTTGGAAAAGAGAGTGCGATGAAGAAATGTACAGCACAGTGCTGGGTTGAAGTCAGGAACAAAGTCCATTTATTTGTCACTAGCGACCAGTCTAAACTTGACGTTCTAAATACTTGGATAAAAAGCATTGAAGGGAAAGTAAAGAAATTTAATAATCACCATCAGCTTTCTATTGACGAAGAACAGAAGGAAGAAAAAATTGTTAGAAGTTTGTTTTTTCCCACACAGAGTTTTGATGATTCTCGAAAAATCAAGCCAGTCTTGGCTGGTCCTTTTGGAGGACCAGGTGGAAATAATTGGGACGATGGAGTTTATTCGACTATCAGGCAGTTGGTAATTTGCCATGGAGCTGGTATCGACTCCATTAAGATTCAATATGATGTGAAGGGAAGTTCAATTTGGTCAGATAGACATGGAGGAAATGGTGGCACTAAAACAGACACGGTAGCTCCATACTTTGTTGGATCATCAATTGTTTGCTCAGTGAAGCTTGATTTTCCGGATGAGTACTTGACTATGATCCGTGGACACTATGGTAGCTTTGTGTCATTTGACAAAGTTTTTGTTCGATCCCTGACTTTTATGAGCAACAAAAAGAAGTACGGACCTTACGGGGTCGAACAAGGAACAGTTTTCTCTTTCCCAACGACTGAGGGCAAGATTGTAGGGTTCCATGGCAGGAGTGGATTGTACCTGGATGCCATTGGAGTTTACCTAAAGCCTATGGCAATACAAACACCATCTAAAGCAATGATTCAGTCACAGAATTATGTTGCGAGTAAGACTGAAAGTGAAGGCTATTCGATTATACAAGGAAGTGTTGGCCAAAATTATGATATTGTTCTTGCTTTGAGGCAGAAGGATGAATTCAAAAAGCCTCTTCCAACTACTGTCTCAAAACAAGTATCTAGTTCCTCAAGCTCAGAATCAAGTGATGATGAATCCACAGTCAAGAGGCCTGTTAAGAAGGGACTGTCTAAAGTCGAAAATGTGGTACCATGTGGACCATGGGGCGGCTCGGGCGGAACTCCATTTGATGATGGATATTACACTGGTATTAGACAAATTAATGTGTCACGCAATGTTGGAATTGTATATATAAGAGTTCTGTATGCTTGCGATGAGGAATCTATATGGGGAGTACGAGCAGGTGGAACGGGAGGATTCAAACATGACAAGGTTATCTTGGACTATCCATATGAAATCTTGACTCATGTAACTGGACATTATGGGCCTGTCATGTACATGGGACCTAATGTTATCAAGTCACTCACATTCCATACTACAAAAACGAAGTACGGACCATTCGGAGAGGCACAAGGAACCCCCTTTAGTACCAACGTCAAGGAGGGGAAGATTGTTGGATTTCATGGGAGGAAAGGTTTGTTCCTAGATGCCCTTGGTGTGCACTTAGTTGAAGGAAAGGTGACCCCGGTGTCTCGTCCTCCCTCCAGTGATATTGTTCCTGCTGCACCACCACTTCTCGAAAATGAGAACGCCCCTTGGACTATGAAGCTGGCACCTTCAAAAGGAGGAGCACTTGAAGAGATTGCTCGTGGTGTAGTAAAAGAACCGGCACCCTGTGGACCTGGACCATGGGGCGGAGATGGTGGGAAACCATGGGATGATGGAGTATTTTCTGGCATTAAACAGATATACTTGACACGGTCTCTTGAAGCTTTTTGTTCAATTCAAATTGAATATGATCGAAACAAACAATCAGTTTGGTCAGTTAAGCATGGAGGAAACAGTGGAACAACCATACATCGGGTAAAATTGAATTATCCACATGAAGTGTTAACCTGTATATCAGGATATTACGGTTACATCGGTAAAGATGAGAGACAACAAGCTATAAAGTCACTTACTTTTCACACAAGCAGGGGGAAGTTCGGTCCATTTGGGGAGGAGGTAGGGTCGTTTTTCACATCCACGACGACGGAAGGCAAAGTGG
TTGGCTTCCATGGGAGGAGCAGCTTGTATTTGGACGCCATTGGAGTTCACATGCAACACTGGCTAGGAAGCCAAAGGGCATCCAAGTCGTCTTTGTTCAAACTGTTCTGA

Coding sequence (CDS)

ATGGAGAAACTGGCAATTCCTTGCCAAACAAACCCTCCAATTTCTGTCCCTGCTTCAATTATCAAAGCCAAACCCCTTAAATTCTCCTCAAAACCAATTAAAACTTCTATATTTTTCACCCAGAAATTTACTACAAAGTTCAATGACGACAATTTGAGTTACCTTTGCAGCAATGGGCTCCTCCGGGAAGCCATAACATCCATCGATTCAATGTCTAAACGTGGGTCTAAGCTAAGCACCAACACGTATATCAATTTGCTTCAGACTTGCATAGATGCGGATTCTATTGAACTGGGTCGTGAGCTTCATGATCGTATGAGTTCAGTCGATCAGGTCAACCCATTTGTTGAGACAAAGCTAGCTTGTGGGAATTGCGAGGATCTTGAAACTGTAAAGTTAATACATTCTGTGGTTATTCGATGTGGGTTGAGTTGTTATATTCGAGTGAGCAATTCCATTTTGACGGCATTTGTCAAATGTGGGAAATTGTGTCTAGCTAGAAAGTTCTTTGGGAACATGGACGAAAGAGATGGGGTTTCTTGGAATGCTATAATAGCTGGTTATTGCCAGAAGGGAAATGGTGATGAAGCTAGGAGATTGCTTGATGCGATGAGCGATCAAGGATTCAAACCGGGTTTGGTTACTTATAACATAATGATTGCTAGTTATAGCCAGTTGGGGAATTTTAATCTTGTCATAGAATTGAAGAAGAAGATGGAGAGTATGGGGATAGCTCCTGATGTTTATACTTGGACCTCAATGATTTCGGGTTTTGCTCAAAGCAGCAGGATTAGTCAGGCATTGGATTTCTTCAAGGAGATGATTCTAGCTTGGGTTGAACCAAATGCTATAACTATTGCAAGTGCGACCTCAGCCTGTGCTTCCTTAAAATCACTGCAAAAAGGGCTGGAAATACATTGTTTTGCAATTAAAATGGGAATTGCGCATGAAATATTGGTTGGGAATTCACTTATTGATATGTATTCTAAATGTGGAAAATTGGAAGCTGCTCGCCATGTCTTTGACACGATTTTAGAAAAAGACATTTATACATGGAACTCGATGATTGGAGGATACTGTCAGGCTGGATATTGTGGAAAAGCATATGAACTTTTTATGAGATTAAGGGAATCAAATGTAATGCCTAATGTTGTTACATGGAATGTGATGATATCAGGATGTATACAGAATGGAGATGAGGATCAAGCTGTGAACCTCTTTCAAATTATGGAAAAAGATGGGGAGGTTAAGCGGAATACAGCATCCTGGAATTCTCTGATTGCTGGGTACCACCAGCTTGGTGAAAAGAACAAAGCTCTAGCAATATTTCGACAAATGCAGTCTCTTAATTTTAATCCTAATTCAGTGACTATCTTGAGCATTTTACCAGCTTGTGCAAATGTAATGGCAGAGAAAAAAGTAAAGGAAATCCATGGTTGTGTGTTGCGCAGAAACCTGGAATCTGAGCTTCCTATTGCAAACTCACTTATAGACACTTATGCCAAGTCAGGGAACATTCAATATTCAAGAACCATCTTTGATGGCATGTCATCCAAAGATATTATCACATGGAATTCAATTATTGCAGGATATATTTTACATGGTTGTTCAGATGCTGCATTTCATTTATTTGGTCAGATGAAAAAGTTAGGAATTAGGCCAAACCGAGGTACTCTGGCTAGTCTTATTCATGCCTATGGCATTGCCGGAATGGTAGACAAGGGAAGACATGTTTTTTCTAGCATCACTGAAGAACATCAAATTCTACCAACTTTAGATCATTATTTGGCTATGGTAGATCTGTATGGACGTTCTGGGAGGCTTACAGATGCAATAGAGTTCATCGAAGATATGCCTATAGAACCCGATGGCTCTATCTGGACCAGCTTACTTACTGCCTGTAGGTTTCACGGGAACTTACACTTGGCGGTACAAGCAGCCAAGCGCCTACACGAATTGGAGCCTGATAATCATGTGATTTACCGTTTATTAATACAGGC
ATATGCTTTATATGGGAACTTTGAACAAGCTCTAAAAGTGAGAAAGCTTGGAAAAGAGAGTGCGATGAAGAAATGTACAGCACAGTGCTGGGTTGAAGTCAGGAACAAAGTCCATTTATTTGTCACTAGCGACCAGTCTAAACTTGACGTTCTAAATACTTGGATAAAAAGCATTGAAGGGAAAGTAAAGAAATTTAATAATCACCATCAGCTTTCTATTGACGAAGAACAGAAGGAAGAAAAAATTGTTAGAAGTTTGTTTTTTCCCACACAGAGTTTTGATGATTCTCGAAAAATCAAGCCAGTCTTGGCTGGTCCTTTTGGAGGACCAGGTGGAAATAATTGGGACGATGGAGTTTATTCGACTATCAGGCAGTTGGTAATTTGCCATGGAGCTGGTATCGACTCCATTAAGATTCAATATGATGTGAAGGGAAGTTCAATTTGGTCAGATAGACATGGAGGAAATGGTGGCACTAAAACAGACACGGTAGCTCCATACTTTGTTGGATCATCAATTGTTTGCTCAGTGAAGCTTGATTTTCCGGATGAGTACTTGACTATGATCCGTGGACACTATGGTAGCTTTGTGTCATTTGACAAAGTTTTTGTTCGATCCCTGACTTTTATGAGCAACAAAAAGAAGTACGGACCTTACGGGGTCGAACAAGGAACAGTTTTCTCTTTCCCAACGACTGAGGGCAAGATTGTAGGGTTCCATGGCAGGAGTGGATTGTACCTGGATGCCATTGGAGTTTACCTAAAGCCTATGGCAATACAAACACCATCTAAAGCAATGATTCAGTCACAGAATTATGTTGCGAGTAAGACTGAAAGTGAAGGCTATTCGATTATACAAGGAAGTGTTGGCCAAAATTATGATATTGTTCTTGCTTTGAGGCAGAAGGATGAATTCAAAAAGCCTCTTCCAACTACTGTCTCAAAACAAGTATCTAGTTCCTCAAGCTCAGAATCAAGTGATGATGAATCCACAGTCAAGAGGCCTGTTAAGAAGGGACTGTCTAAAGTCGAAAATGTGGTACCATGTGGACCATGGGGCGGCTCGGGCGGAACTCCATTTGATGATGGATATTACACTGGTATTAGACAAATTAATGTGTCACGCAATGTTGGAATTGTATATATAAGAGTTCTGTATGCTTGCGATGAGGAATCTATATGGGGAGTACGAGCAGGTGGAACGGGAGGATTCAAACATGACAAGGTTATCTTGGACTATCCATATGAAATCTTGACTCATGTAACTGGACATTATGGGCCTGTCATGTACATGGGACCTAATGTTATCAAGTCACTCACATTCCATACTACAAAAACGAAGTACGGACCATTCGGAGAGGCACAAGGAACCCCCTTTAGTACCAACGTCAAGGAGGGGAAGATTGTTGGATTTCATGGGAGGAAAGGTTTGTTCCTAGATGCCCTTGGTGTGCACTTAGTTGAAGGAAAGGTGACCCCGGTGTCTCGTCCTCCCTCCAGTGATATTGTTCCTGCTGCACCACCACTTCTCGAAAATGAGAACGCCCCTTGGACTATGAAGCTGGCACCTTCAAAAGGAGGAGCACTTGAAGAGATTGCTCGTGGTGTAGTAAAAGAACCGGCACCCTGTGGACCTGGACCATGGGGCGGAGATGGTGGGAAACCATGGGATGATGGAGTATTTTCTGGCATTAAACAGATATACTTGACACGGTCTCTTGAAGCTTTTTGTTCAATTCAAATTGAATATGATCGAAACAAACAATCAGTTTGGTCAGTTAAGCATGGAGGAAACAGTGGAACAACCATACATCGGGTAAAATTGAATTATCCACATGAAGTGTTAACCTGTATATCAGGATATTACGGTTACATCGGTAAAGATGAGAGACAACAAGCTATAAAGTCACTTACTTTTCACACAAGCAGGGGGAAGTTCGGTCCATTTGGGGAGGAGGTAGGGTCGTTTTTCACATCCACGACGACGGAAGGCAAAGTGG
TTGGCTTCCATGGGAGGAGCAGCTTGTATTTGGACGCCATTGGAGTTCACATGCAACACTGGCTAGGAAGCCAAAGGGCATCCAAGTCGTCTTTGTTCAAACTGTTCTGA

Protein sequence

MEKLAIPCQTNPPISVPASIIKAKPLKFSSKPIKTSIFFTQKFTTKFNDDNLSYLCSNGLLREAITSIDSMSKRGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQVNPFVETKLACGNCEDLETVKLIHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLARKFFGNMDERDGVSWNAIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQLGNFNLVIELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAWVEPNAITIASATSACASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARHVFDTILEKDIYTWNSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQAVNLFQIMEKDGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTILSILPACANVMAEKKVKEIHGCVLRRNLESELPIANSLIDTYAKSGNIQYSRTIFDGMSSKDIITWNSIIAGYILHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAGMVDKGRHVFSSITEEHQILPTLDHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSLLTACRFHGNLHLAVQAAKRLHELEPDNHVIYRLLIQAYALYGNFEQALKVRKLGKESAMKKCTAQCWVEVRNKVHLFVTSDQSKLDVLNTWIKSIEGKVKKFNNHHQLSIDEEQKEEKIVRSLFFPTQSFDDSRKIKPVLAGPFGGPGGNNWDDGVYSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDRHGGNGGTKTDTVAPYFVGSSIVCSVKLDFPDEYLTMIRGHYGSFVSFDKVFVRSLTFMSNKKKYGPYGVEQGTVFSFPTTEGKIVGFHGRSGLYLDAIGVYLKPMAIQTPSKAMIQSQNYVASKTESEGYSIIQGSVGQNYDIVLALRQKDEFKKPLPTTVSKQVSSSSSSESSDDESTVKRPVKKGLSKVENVVPCGPWGGSGGTPFDDGYYTGIRQINVSRNVGIVYIRVLYACDEESIWGVRAGGTGGFKHDKVILDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKTKYGPFGEAQGTPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKVTPVSRPPSSDIVPAAPPLLENENAPWTMKLAPSKGGALEEIARGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTRSLEAFCSIQIEYDRNKQSVWSVKHGGNSGTTIHRVKLNYPHEVLTCISGYYGYIGKDERQQAIKSLTFHTSRGKFGPFGEEVGSFFTSTTTEGKVVGFHGRSSLYLDAIGVHMQHWLGSQRASKSSLFKLF
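
The protein sequence corresponds to a translation of the coding sequence. As a rough illustration, the sketch below translates a DNA string with the standard genetic code. The helper name `translate` is hypothetical, and the use of the standard code (NCBI translation table 1) for this nuclear gene is an assumption rather than something stated on this page.

```python
# Standard genetic code (NCBI translation table 1), with codons
# enumerated in T, C, A, G order for each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon.

    Trailing bases that do not form a full codon are ignored;
    codons with ambiguous bases are rendered as 'X'.
    """
    cds = cds.upper()
    protein = []
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        residue = CODON_TABLE.get(cds[pos:pos + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

if __name__ == "__main__":
    # First 27 bases of the CDS listed for this gene.
    print(translate("ATGGAGAAACTGGCAATTCCTTGCCAA"))
```

Applied to the first 27 bases of the CDS, this yields MEKLAIPCQ, which matches the start of the protein sequence.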
Homology
BLAST of HG10004824 vs. NCBI nr
Match: XP_038884902.1 (pentatricopeptide repeat-containing protein At1g19720 [Benincasa hispida])

HSP 1 Score: 2558.9 bits (6631), Expect = 0.0e+00
Identity = 1288/1530 (84.18%), Positives = 1325/1530 (86.60%), Query Frame = 0
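
The percentages in an HSP summary line follow from the counts reported next to them: matching positions divided by the alignment length. A small sketch of that arithmetic, using a hypothetical helper `percent_of_alignment`, is:

```python
def percent_of_alignment(count, aligned_length):
    """Return count / aligned_length as a percentage with two decimals."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return f"{100.0 * count / aligned_length:.2f}"

if __name__ == "__main__":
    # Counts taken from the first HSP (XP_038884902.1) on this page.
    print(percent_of_alignment(1288, 1530))  # identical positions
    print(percent_of_alignment(1325, 1530))  # identical or conservative positions
```

The denominator is the alignment length including gap columns, so it can exceed the length of either sequence.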

Query: 1    MEKLAIPCQTNPPISVPASIIKAKPLKFSSKPIKTSIFFTQKFTTKFNDDNLSYLCSNGL 60
            MEKLAIPCQTNPPISVPASIIK KPLKFSSKP K+SIFFTQK TT+FNDD+LSYLCSNGL
Sbjct: 1    MEKLAIPCQTNPPISVPASIIKPKPLKFSSKPTKSSIFFTQKLTTRFNDDHLSYLCSNGL 60

Query: 61   LREAITSIDSMSKRGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQVNPFVETKL 120
            LREAIT+IDSMSKRGSKLSTN+YINLLQTCID DS+ELGRELH RMS VDQVNPFVETKL
Sbjct: 61   LREAITAIDSMSKRGSKLSTNSYINLLQTCIDTDSVELGRELHVRMSLVDQVNPFVETKL 120

Query: 121  ------------------------------------------------------------ 180
                                                                        
Sbjct: 121  VSMYAKCGFLKDARKVFDGMQERNLYTWSAMIGAYSREQRWKEVVELFFLMMGDGVLPDA 180

Query: 181  --------ACGNCEDLETVKLIHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLARKFFGN 240
                    ACGNCEDLETVKLIHSVVIRCGLSCY+RV+NSILTAFVKCGKL LARKFF N
Sbjct: 181  FLFPKILQACGNCEDLETVKLIHSVVIRCGLSCYMRVNNSILTAFVKCGKLSLARKFFEN 240

Query: 241  MDERDGVSWNAIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQLGNFNLV 300
            MDERD VS NA+IAGYCQKGNG+EARRLLDAMSDQGFKPGL+TYNIMIASYSQLGN +LV
Sbjct: 241  MDERDEVSCNAMIAGYCQKGNGNEARRLLDAMSDQGFKPGLITYNIMIASYSQLGNCSLV 300

Query: 301  IELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAWVEPNAITIASATSA 360
            +ELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILA VEPNAITIAS TSA
Sbjct: 301  LELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAGVEPNAITIASVTSA 360

Query: 361  CASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARHVFDTILEKDIYTW 420
            CASLKSLQKGLEIHCFAIKMGIAHE+LVGNSLIDMYSKCGKLEAARHVFDTILEKDIYTW
Sbjct: 361  CASLKSLQKGLEIHCFAIKMGIAHEVLVGNSLIDMYSKCGKLEAARHVFDTILEKDIYTW 420

Query: 421  NSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQAVNLFQIMEK 480
            NSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQA+NLFQIMEK
Sbjct: 421  NSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQAMNLFQIMEK 480

Query: 481  DGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTILSILPACANVMAEK 540
            DGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTILSILP C NVMAEK
Sbjct: 481  DGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTILSILPTCGNVMAEK 540

Query: 541  KVKEIHGCVLRRNLESELPIANSLIDTYAKSGNIQYSRTIFDGMSSKDIITWNSIIAGYI 600
            K+KEIHGCVLRRNLESELP+ANSLIDTYAKSGNIQYSRTIFDGM SKDIITWNSIIAGY+
Sbjct: 541  KIKEIHGCVLRRNLESELPVANSLIDTYAKSGNIQYSRTIFDGMPSKDIITWNSIIAGYV 600

Query: 601  LHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAGMVDKGRHVFSSITEEHQILPTL 660
            LHGCSDAAFHLFGQMKK GIRPNRGTLAS+IHAYGIAGMVDKGRHVFSSITEEHQILPTL
Sbjct: 601  LHGCSDAAFHLFGQMKKFGIRPNRGTLASIIHAYGIAGMVDKGRHVFSSITEEHQILPTL 660

Query: 661  DHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSLLTACRFHGNLHLAVQAAKRLHE 720
            DHY AMVDLYGRSGRLTDAIEFIEDMPIEPD SIWTSLLTACRFHGNLHLAVQA +RLHE
Sbjct: 661  DHYFAMVDLYGRSGRLTDAIEFIEDMPIEPDVSIWTSLLTACRFHGNLHLAVQAVERLHE 720

Query: 721  LEPDNHVIYRLLIQAYALYGNFEQALKVRKLGKESAMKKCTAQCWVEVRNKVHLFVTSDQ 780
            LEPDNHV+YRLLIQAYALYG FEQ LK RKLGKESAMKKCTAQCWVEVRNKVHLFVT +Q
Sbjct: 721  LEPDNHVVYRLLIQAYALYGKFEQTLKGRKLGKESAMKKCTAQCWVEVRNKVHLFVTGEQ 780

Query: 781  SKLDVLNTWIKSIEGKVKKFNNHHQLSIDEEQKEEKI----------------------- 840
            SKLDVLNTWIKSIEGKVKKFNNHH LSI+EEQKEEKI                       
Sbjct: 781  SKLDVLNTWIKSIEGKVKKFNNHHHLSIEEEQKEEKIGGFHCEKFAFAFGLIGSSHTPKS 840

Query: 841  ------------------------------------------------------------ 900
                                                                        
Sbjct: 841  IKIVKNLRICGDCHQMAKYISAAHECEIYLSDSNCLHHFKNGHCSCGDYWCYSFALHYLI 900

Query: 901  ----------VRSLFFPTQSFDDSRKIKPVLAGPFGGPGGNNWDDGVYSTIRQLVICHGA 960
                      VRSLFFPTQSFDDSRKIKP++AGPFGGPGG+NWDDGVYSTIRQLVICHGA
Sbjct: 901  AFNFFFLPQPVRSLFFPTQSFDDSRKIKPIMAGPFGGPGGSNWDDGVYSTIRQLVICHGA 960

Query: 961  GIDSIKIQYDVKGSSIWSDRHGGNGGTKTDTVAPYFVGSSIVCSVKLDFPDEYLTMIRGH 1020
            GIDSIKIQYDVKGSSIWSDRHGGNGGTKTDT             VKLDFPDEYLTMIRGH
Sbjct: 961  GIDSIKIQYDVKGSSIWSDRHGGNGGTKTDT-------------VKLDFPDEYLTMIRGH 1020

Query: 1021 YGSFVSFDKVFVRSLTFMSNKKKYGPYGVEQGTVFSFPTTEGKIVGFHGRSGLYLDAIGV 1080
            YGSFVSFDKVFVRSLTFMSNKKKYGPYGVEQGT+FSFP TEGKIVGFHGRSGLYLDAIGV
Sbjct: 1021 YGSFVSFDKVFVRSLTFMSNKKKYGPYGVEQGTIFSFPMTEGKIVGFHGRSGLYLDAIGV 1080

Query: 1081 YLKPMAIQTPSKAMIQSQNYVASKTESEGYSIIQGSVGQNYDIVLALRQKDEFKKPLPTT 1140
            YLKPMA Q+PSKAMIQSQNYVASKT+SEGYSIIQGSVGQNYDIVLA+RQKDEFKKPLPTT
Sbjct: 1081 YLKPMATQSPSKAMIQSQNYVASKTDSEGYSIIQGSVGQNYDIVLAVRQKDEFKKPLPTT 1140

Query: 1141 VSKQVSSSSSSESSDDESTVKRPVKKGLSKVENVVPCGPWGGSGGTPFDDGYYTGIRQIN 1200
            +SKQVSSSSSSESSDDESTVKRPVKKG S+VENVVPCGPWGGSGGTPFDDGYYTGIRQIN
Sbjct: 1141 ISKQVSSSSSSESSDDESTVKRPVKKGPSRVENVVPCGPWGGSGGTPFDDGYYTGIRQIN 1200

Query: 1201 VSRNVGIVYIRVLYACDEESIWGVRAGGTGGFKHDKVILDYPYEILTHVTGHYGPVMYMG 1260
            VSRNVGIVYIRVLYACDEESIWG RAGGTGGFK+DKVILDYPYEILTHVTGHYGPVMYMG
Sbjct: 1201 VSRNVGIVYIRVLYACDEESIWGGRAGGTGGFKNDKVILDYPYEILTHVTGHYGPVMYMG 1260

Query: 1261 PNVIKSLTFHTTKTKYGPFGEAQGTPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKVT 1320
            PNVIKSLTFHTTKTKYGPFGEAQGTPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKVT
Sbjct: 1261 PNVIKSLTFHTTKTKYGPFGEAQGTPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKVT 1320

Query: 1321 PVSRPPSSDIVPAAPPLLENENAPWTMKLAPSKGGALEEIARGVVKEPAPCGPGPWGGDG 1370
            PVSRPPSS IVPAAPP+LENENAPWT+KLAPSKGGALEEIARGVVK+PAPCGPGPWGGDG
Sbjct: 1321 PVSRPPSSGIVPAAPPVLENENAPWTVKLAPSKGGALEEIARGVVKQPAPCGPGPWGGDG 1380

BLAST of HG10004824 vs. NCBI nr
Match: XP_031737058.1 (pentatricopeptide repeat-containing protein At1g19720 [Cucumis sativus])

HSP 1 Score: 2485.3 bits (6440), Expect = 0.0e+00
Identity = 1247/1513 (82.42%), Positives = 1309/1513 (86.52%), Query Frame = 0

Query: 1    MEKLAIPCQTNPPISVPASIIKAKPLKFSSKPIKTSIFFTQKFTTKFNDDNLSYLCSNGL 60
            MEKLAIPCQTNPPIS PAS+IK +PLKFSSKPIKTSIFFT K T+KFNDD+LSYLCSNGL
Sbjct: 1    MEKLAIPCQTNPPISGPASVIKPRPLKFSSKPIKTSIFFTYKLTSKFNDDHLSYLCSNGL 60

Query: 61   LREAITSIDSMSKRGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQVNPFVETKL 120
            LREAIT+IDS+SKRGSKLSTNTYINLLQTCID  SIELGRELH RM  V +VNPFVETKL
Sbjct: 61   LREAITAIDSISKRGSKLSTNTYINLLQTCIDVGSIELGRELHVRMGLVHRVNPFVETKL 120

Query: 121  ------------------------------------------------------------ 180
                                                                        
Sbjct: 121  VSMYAKCGCLKDARKVFDGMQERNLYTWSAMIGAYSREQRWKEVVELFFLMMGDGVLPDA 180

Query: 181  --------ACGNCEDLETVKLIHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLARKFFGN 240
                    ACGNCEDLETVKLIHS+VIRCGLSCY+R+SNSILTAFVKCGKL LARKFFGN
Sbjct: 181  FLFPKILQACGNCEDLETVKLIHSLVIRCGLSCYMRLSNSILTAFVKCGKLSLARKFFGN 240

Query: 241  MDERDGVSWNAIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQLGNFNLV 300
            MDERDGVSWN +IAGYCQKGNGDEARRLLD MS+QGFKPGLVTYNIMIASYSQLG+ +LV
Sbjct: 241  MDERDGVSWNVMIAGYCQKGNGDEARRLLDTMSNQGFKPGLVTYNIMIASYSQLGDCDLV 300

Query: 301  IELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAWVEPNAITIASATSA 360
            I+LKKKMES+G+APDVYTWTSMISGF+QSSRISQALDFFK+MILA VEPN ITIASATSA
Sbjct: 301  IDLKKKMESVGLAPDVYTWTSMISGFSQSSRISQALDFFKKMILAGVEPNTITIASATSA 360

Query: 361  CASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARHVFDTILEKDIYTW 420
            CASLKSLQ GLEIHCFAIKMGIA E LVGNSLIDMYSKCGKLEAARHVFDTILEKD+YTW
Sbjct: 361  CASLKSLQNGLEIHCFAIKMGIARETLVGNSLIDMYSKCGKLEAARHVFDTILEKDVYTW 420

Query: 421  NSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQAVNLFQIMEK 480
            NSMIGGYCQAGY GKAYELFMRLRES VMPNVVTWN MISGCIQNGDEDQA++LFQIMEK
Sbjct: 421  NSMIGGYCQAGYGGKAYELFMRLRESTVMPNVVTWNAMISGCIQNGDEDQAMDLFQIMEK 480

Query: 481  DGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTILSILPACANVMAEK 540
            DG VKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNF+PNSVTILSILPACANVMAEK
Sbjct: 481  DGGVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFSPNSVTILSILPACANVMAEK 540

Query: 541  KVKEIHGCVLRRNLESELPIANSLIDTYAKSGNIQYSRTIFDGMSSKDIITWNSIIAGYI 600
            K+KEIHGCVLRRNLESEL +ANSL+DTYAKSGNI+YSRT+F+GMSSKDIITWNSIIAGYI
Sbjct: 541  KIKEIHGCVLRRNLESELAVANSLVDTYAKSGNIKYSRTVFNGMSSKDIITWNSIIAGYI 600

Query: 601  LHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAGMVDKGRHVFSSITEEHQILPTL 660
            LHGCSD+AF LF QM+ LGIRPNRGTLAS+IHAYGIAGMVDKGRHVFSSITEEHQILPTL
Sbjct: 601  LHGCSDSAFQLFDQMRNLGIRPNRGTLASIIHAYGIAGMVDKGRHVFSSITEEHQILPTL 660

Query: 661  DHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSLLTACRFHGNLHLAVQAAKRLHE 720
            DHYLAMVDLYGRSGRL DAIEFIEDMPIEPD SIWTSLLTACRFHGNL+LAV AAKRLHE
Sbjct: 661  DHYLAMVDLYGRSGRLADAIEFIEDMPIEPDVSIWTSLLTACRFHGNLNLAVLAAKRLHE 720

Query: 721  LEPDNHVIYRLLIQAYALYGNFEQALKVRKLGKESAMKKCTAQCWVEVRNKVHLFVTSDQ 780
            LEPDNHVIYRLL+QAYALYG FEQ LKVRKLGKESAMKKCTAQCWVEVRNKVHLFVT DQ
Sbjct: 721  LEPDNHVIYRLLVQAYALYGKFEQTLKVRKLGKESAMKKCTAQCWVEVRNKVHLFVTGDQ 780

Query: 781  SKLDVLNTWIKSIEGKVKKFNNHHQLSIDEEQKEE------------------------- 840
            SKLDVLNTWIKSIEGKVKKFNNHHQLSI+EE+KEE                         
Sbjct: 781  SKLDVLNTWIKSIEGKVKKFNNHHQLSIEEEEKEEKIGGFHCEKFAFAFGLIGSSHTRKS 840

Query: 841  -KIVRSL------------------------------------------------FFPT- 900
             KIV++L                                                 F T 
Sbjct: 841  IKIVKNLRMCVDCHQMAKYISAAYECEIYLSDSKCLHHFKNGHCSCGDYCLAENKLFNTL 900

Query: 901  -QSFDDSRKIKPVLAGPFGGPGGNNWDDGVYSTIRQLVICHGAGIDSIKIQYDVKGSSIW 960
             QSFDDSRKIKP++AGPFGGP GNNWDDGVYSTIRQL+ICHGAGIDSIKIQYDVKGSSIW
Sbjct: 901  RQSFDDSRKIKPIMAGPFGGPAGNNWDDGVYSTIRQLIICHGAGIDSIKIQYDVKGSSIW 960

Query: 961  SDRHGGNGGTKTDTVAPYFVGSSIVCSVKLDFPDEYLTMIRGHYGSFVSFDKVFVRSLTF 1020
            SDRHGGNGGTKTDT             VKLDFPDEYLTMIRGHYGSFVSFDKVFVRSLTF
Sbjct: 961  SDRHGGNGGTKTDT-------------VKLDFPDEYLTMIRGHYGSFVSFDKVFVRSLTF 1020

Query: 1021 MSNKKKYGPYGVEQGTVFSFPTTEGKIVGFHGRSGLYLDAIGVYLKPMAIQTPSKAMIQS 1080
            MSNKKKYGPYGVEQGT+FSFPTTEGKIVGFHGRSGLYLDAIGVYLKPMAIQ+PSKAMIQS
Sbjct: 1021 MSNKKKYGPYGVEQGTIFSFPTTEGKIVGFHGRSGLYLDAIGVYLKPMAIQSPSKAMIQS 1080

Query: 1081 QNYVASKTESEGYSIIQGSVGQNYDIVLALRQKDEFKKPLPTTVSKQVSSSSSSESSDDE 1140
            ++++ASKTE+EGYSIIQGSVGQNYDIVLA+RQKDEFK PLPTT+SKQVSSSSSSESSDDE
Sbjct: 1081 RDHLASKTENEGYSIIQGSVGQNYDIVLAVRQKDEFKTPLPTTISKQVSSSSSSESSDDE 1140

Query: 1141 STVKRPVKKGLSKVENVVPCGPWGGSGGTPFDDGYYTGIRQINVSRNVGIVYIRVLYACD 1200
            ST+KRPVKKG SKVENVVPCGPWGGSGGT FDDG Y+GIRQINVSRNVGIVYIRVLYACD
Sbjct: 1141 STIKRPVKKGPSKVENVVPCGPWGGSGGTVFDDGCYSGIRQINVSRNVGIVYIRVLYACD 1200

Query: 1201 EESIWGVRAGGTGGFKHDKVILDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKTKYG 1260
            EESIWG RAGGTGGFK+DKVI DYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTK KYG
Sbjct: 1201 EESIWGARAGGTGGFKYDKVIFDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKAKYG 1260

Query: 1261 PFGEAQGTPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKVTPVSRPPSSDIVPAAPPL 1320
            PFGEAQGTPFSTNVKEGKIVGFHGRKGLFLDALGVH+VEGKVTP+SRPPS DI+PAAPPL
Sbjct: 1261 PFGEAQGTPFSTNVKEGKIVGFHGRKGLFLDALGVHIVEGKVTPLSRPPSRDIIPAAPPL 1320

Query: 1321 LENENAPWTMKLAPSKGGALEEIARGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYL 1370
            LEN NAPWTMKLAPSK GALEE+ARGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYL
Sbjct: 1321 LENSNAPWTMKLAPSK-GALEEMARGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYL 1380

BLAST of HG10004824 vs. NCBI nr
Match: XP_022962565.1 (pentatricopeptide repeat-containing protein At1g19720 [Cucurbita moschata])

HSP 1 Score: 2370.9 bits (6143), Expect = 0.0e+00
Identity = 1204/1556 (77.38%), Positives = 1265/1556 (81.30%), Query Frame = 0

Query: 1    MEKLAIPCQTNPPISVPASIIKAKPLKFSSKPIKTSIFFTQKFTTKFNDDNLSYLCSNGL 60
            MEKLAIPCQT PPISVPASIIK KPLKFSSKP +T+IFFTQK ++K NDD+LSYLC +GL
Sbjct: 1    MEKLAIPCQTKPPISVPASIIKTKPLKFSSKPTQTTIFFTQKSSSKSNDDHLSYLCRHGL 60

Query: 61   LREAITSIDSMSKRGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQVNPFVETKL 120
            LREAI++IDSMS+ GSKLSTNTYINLLQTCIDADSIE+GRELH R+  VDQVNPFVETKL
Sbjct: 61   LREAISAIDSMSRHGSKLSTNTYINLLQTCIDADSIEVGRELHVRLCLVDQVNPFVETKL 120

Query: 121  ------------------------------------------------------------ 180
                                                                        
Sbjct: 121  VSMYAKCGFLKDARKVFDEMPERNLYTWSAMIGAYSREQRWKEVVELFFLMMGDGVLPDA 180

Query: 181  --------ACGNCEDLETVKLIHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLARKFFGN 240
                    ACGNCEDLET+KL+HSVVIRCGLSC +RVSNSILTA VKCG L LARKFF N
Sbjct: 181  FLFPRILQACGNCEDLETLKLMHSVVIRCGLSCSMRVSNSILTALVKCGNLSLARKFFEN 240

Query: 241  MDERDGVSWNAIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQLGNFNLV 300
            MDERD VSWNAIIAGYC+KG+GDEAR LLD M+DQGFKPGLVT NI+IASYSQLG  NLV
Sbjct: 241  MDERDEVSWNAIIAGYCRKGHGDEARTLLDTMNDQGFKPGLVTCNILIASYSQLGKCNLV 300

Query: 301  IELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAWVEPNAITIASATSA 360
            IELKKKMESMGI PDVYTWTSMISGFAQSSRI+ ALDFFKEMILA VEPNA+TI S TSA
Sbjct: 301  IELKKKMESMGITPDVYTWTSMISGFAQSSRINLALDFFKEMILAGVEPNAVTITSVTSA 360

Query: 361  CASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARHVFDTILEKDIYTW 420
            CASLKSLQKGLEIHC AIKMGIAH++LVGNSLIDMYSKCGKLEAA HVFDTILEKDIYTW
Sbjct: 361  CASLKSLQKGLEIHCLAIKMGIAHQVLVGNSLIDMYSKCGKLEAAHHVFDTILEKDIYTW 420

Query: 421  NSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQAVNLFQIMEK 480
            NSMIGGYCQ GYCGKAYELFMRLRESNVMPNVVTWNVMISGCI NGDEDQA+NLFQ+ME 
Sbjct: 421  NSMIGGYCQGGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIHNGDEDQAMNLFQMMEN 480

Query: 481  DGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTILSILPACANVMAEK 540
            D EV  NTASWNSLIAGYH+LGEKNKALAIFRQMQSLNFNPNSVTILSILP CANVMAEK
Sbjct: 481  DEEVNPNTASWNSLIAGYHRLGEKNKALAIFRQMQSLNFNPNSVTILSILPVCANVMAEK 540

Query: 541  KVKEIHGCVLRRNLESELPIANSLIDTYAKSGNIQYSRTIFDGMSSKDIITWNSIIAGYI 600
            K+KEIHGCVLRRNLESELP+ANSLIDTYAKSGNIQYSR IFDGMSSKDIITWNSIIAGYI
Sbjct: 541  KIKEIHGCVLRRNLESELPVANSLIDTYAKSGNIQYSRNIFDGMSSKDIITWNSIIAGYI 600

Query: 601  LHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAGMVDKGRHVFSSITEEHQILPTL 660
            LHGCSDAAFHLF QMK+ GIRPNRGTLAS+I+A GIAGMVD+GRHVFSSITEEHQILPTL
Sbjct: 601  LHGCSDAAFHLFDQMKRFGIRPNRGTLASIIYACGIAGMVDRGRHVFSSITEEHQILPTL 660

Query: 661  DHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSLLTACRFHGNLHLAVQAAKRLHE 720
            DHY AMVDLYGRSGRLTDAIEFIE+MP EPD SIWTSLLTA RFHGNLHLAV+AA+ L E
Sbjct: 661  DHYSAMVDLYGRSGRLTDAIEFIENMPTEPDVSIWTSLLTASRFHGNLHLAVRAAEHLLE 720

Query: 721  LEPDNHVIYRLLIQAYALYGNFEQALKVRKLGKESAMKKCTAQCWVEVRNKVHLFVTSDQ 780
            LEPDNHVIYRLLIQAYALYG  EQALKVRKLG+ESAMKKCTAQCWVEV NKV+ FV  D 
Sbjct: 721  LEPDNHVIYRLLIQAYALYGKSEQALKVRKLGRESAMKKCTAQCWVEVGNKVYFFVNGDH 780

Query: 781  SKLDVLNTWIKSIEGKVKKFNNHHQLSIDEEQKEEKI----------------------- 840
            SK+DVLNTWIK I GKVKKFNNHHQLSID+E KEEKI                       
Sbjct: 781  SKVDVLNTWIKGIVGKVKKFNNHHQLSIDDEPKEEKIGGFHCEKFAFAFGLIGSSHKPKR 840

Query: 841  ------------------------------------------------------------ 900
                                                                        
Sbjct: 841  IKIVKNLRICGDCHQMAKYVSEAHGCEIYLSDSKCLHHFKNGCCSCGDYCIILIAFRHYM 900

Query: 901  ---VRSLFFP--------------------------------TQSFDDSRKIKPVLAGPF 960
               V SL  P                                T + D SRKIKP+ AGPF
Sbjct: 901  LVWVHSLLIPHRPWHTAIVLHRHRSSRKSLLSLLVLSFVSVTTMNSDGSRKIKPISAGPF 960

Query: 961  GGPGGNNWDDGVYSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDRHGGNGGTKTDTVAPY 1020
            GG GGN WDDGV+STIRQLVICHGAGIDSIKIQYDVKGSSIWSD+HGGNGGTKTDT    
Sbjct: 961  GGTGGNYWDDGVFSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDKHGGNGGTKTDT---- 1020

Query: 1021 FVGSSIVCSVKLDFPDEYLTMIRGHYGSFVSFDKVFVRSLTFMSNKKKYGPYGVEQGTVF 1080
                     VKLDFPDEYLTMIRGHYGSFVSFDKV+VRSLTFMSNK+K+GPYGVE GT+F
Sbjct: 1021 ---------VKLDFPDEYLTMIRGHYGSFVSFDKVYVRSLTFMSNKRKFGPYGVELGTIF 1080

Query: 1081 SFPTTEGKIVGFHGRSGLYLDAIGVYLKPMAIQTPSKAMIQSQNYVASKTESEGYSIIQG 1140
            SFP TEGKIVGFHGRSGLYLDAIGVYLKPM IQTPSK MIQS NYVA K ESEGYSIIQG
Sbjct: 1081 SFPATEGKIVGFHGRSGLYLDAIGVYLKPMPIQTPSKGMIQSPNYVACKAESEGYSIIQG 1140

Query: 1141 SVGQNYDIVLALRQKDEFKKPLPTTVSKQVSSSSSSESSDDESTVKRPVKKGLSKVENVV 1200
            SVGQNYDIVLALRQKDEFKKPLP T+SKQVSSSSSSESSDDEST KRPVKKG SKVEN V
Sbjct: 1141 SVGQNYDIVLALRQKDEFKKPLPNTISKQVSSSSSSESSDDESTDKRPVKKGPSKVENAV 1200

Query: 1201 PCGPWGGSGGTPFDDGYYTGIRQINVSRNVGIVYIRVLYACDEESIWGVRAGGTGGFKHD 1260
            PCGPWGGSGGT FDDG+Y+GIR+INVSRNVGIVYI+VLYA DEESIWG RAGG GGFKHD
Sbjct: 1201 PCGPWGGSGGTTFDDGHYSGIREINVSRNVGIVYIKVLYAWDEESIWGTRAGGKGGFKHD 1260

Query: 1261 KVILDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKTKYGPFGEAQGTPFSTNVKEGK 1320
            KV+ DYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTK KYGPFGEA GTPFSTNVKEGK
Sbjct: 1261 KVVFDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKAKYGPFGEALGTPFSTNVKEGK 1320

Query: 1321 IVGFHGRKGLFLDALGVHLVEGKVTPVSRPPSSDIVPAA-PPLLENENAPWTMKLAPSKG 1370
            IVGFHGRKGLFLDALGVHLVEGKVTP SRPPSS+IVPAA PPLL NE  PWT K+APSKG
Sbjct: 1321 IVGFHGRKGLFLDALGVHLVEGKVTPASRPPSSEIVPAARPPLLGNELVPWTKKVAPSKG 1380

BLAST of HG10004824 vs. NCBI nr
Match: KAG6598470.1 (Pentatricopeptide repeat-containing protein, partial [Cucurbita argyrosperma subsp. sororia])

HSP 1 Score: 2311.2 bits (5988), Expect = 0.0e+00
Identity = 1169/1492 (78.35%), Positives = 1235/1492 (82.77%), Query Frame = 0

Query: 1    MEKLAIPCQTNPPISVPASIIKAKPLKFSSKPIKTSIFFTQKFTTKFNDDNLSYLCSNGL 60
            MEKLAIPCQT PPISVPASIIK KPLKFSSKP +T+IFFTQK ++K NDD+LSYLC +GL
Sbjct: 1    MEKLAIPCQTKPPISVPASIIKTKPLKFSSKPTQTTIFFTQKSSSKSNDDHLSYLCRHGL 60

Query: 61   LREAITSIDSMSKRGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQVNPFVETKL 120
            LREAI++IDSMS+ GSKLSTNTYINLLQTCIDADSIE+GRELH R+  VDQVNPFVETKL
Sbjct: 61   LREAISAIDSMSRHGSKLSTNTYINLLQTCIDADSIEVGRELHVRLCLVDQVNPFVETKL 120

Query: 121  ------------------------------------------------------------ 180
                                                                        
Sbjct: 121  VSMYAKCGFLKDARKVFDEMPERNLYTWSAMIGAYSREQRWKEVVELFFLMMGDGVLPDA 180

Query: 181  --------ACGNCEDLETVKLIHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLARKFFGN 240
                    ACGNCEDLET+KL+HSVVIRCGLSC +RVSNSILTA VKCG L LARKFF N
Sbjct: 181  FLFPRILQACGNCEDLETLKLMHSVVIRCGLSCSMRVSNSILTALVKCGNLSLARKFFEN 240

Query: 241  MDERDGVSWNAIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQLGNFNLV 300
            MDERD VSWNAIIAGYC+KG+GDEAR LLD M+DQGFKPGLVT NI+IASYSQLG  NLV
Sbjct: 241  MDERDEVSWNAIIAGYCRKGHGDEARTLLDTMNDQGFKPGLVTCNILIASYSQLGKCNLV 300

Query: 301  IELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAWVEPNAITIASATSA 360
            IELKKKMESMGI PDVYTWTSMISGFAQSSRI+ ALDFFKEMILA VEPNA+TI S TS 
Sbjct: 301  IELKKKMESMGITPDVYTWTSMISGFAQSSRINLALDFFKEMILAGVEPNAVTITSVTSV 360

Query: 361  CASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARHVFDTILEKDIYTW 420
            CASLKSLQKGLEIHC AIKMGIAH++LVGNSLIDMYSKCGKLEAA HVFDTILEKDIYTW
Sbjct: 361  CASLKSLQKGLEIHCLAIKMGIAHQVLVGNSLIDMYSKCGKLEAAHHVFDTILEKDIYTW 420

Query: 421  NSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQAVNLFQIMEK 480
            NSMIGGYCQ GYC                           GCI NGDEDQA+NLFQ+ME 
Sbjct: 421  NSMIGGYCQGGYC---------------------------GCIHNGDEDQAMNLFQMMEN 480

Query: 481  DGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTILSILPACANVMAEK 540
            D EV  NTASWNSLIAGYH+LGEKNKALAIFRQMQSLNFNPNSVTILSILP CANVMAEK
Sbjct: 481  DEEVNPNTASWNSLIAGYHRLGEKNKALAIFRQMQSLNFNPNSVTILSILPVCANVMAEK 540

Query: 541  KVKEIHGCVLRRNLESELPIANSLIDTYAKSGNIQYSRTIFDGMSSKDIITWNSIIAGYI 600
            K+KEIHGCVLRRNLESELP+ANSLIDTYAKSGNIQYSR IFDGMSSKDIITWNSIIAGYI
Sbjct: 541  KIKEIHGCVLRRNLESELPVANSLIDTYAKSGNIQYSRNIFDGMSSKDIITWNSIIAGYI 600

Query: 601  LHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAGMVDKGRHVFSSITEEHQILPTL 660
            LHGCSDAAFHLF QMK+ GIRPNRGTLAS+I+A GI+GMVD+GRHVFSSITEEHQILPTL
Sbjct: 601  LHGCSDAAFHLFDQMKRFGIRPNRGTLASIIYACGISGMVDRGRHVFSSITEEHQILPTL 660

Query: 661  DHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSLLTACRFHGNLHLAVQAAKRLHE 720
            DHY A+VDLYGRSGRLTDAIEFIE+MP EPD SIWTSLLTA RFHGNLHLAV+AA+RL E
Sbjct: 661  DHYSAVVDLYGRSGRLTDAIEFIENMPTEPDVSIWTSLLTASRFHGNLHLAVRAAERLLE 720

Query: 721  LEPDNHVIYRLLIQAYALYGNFEQALKVRKLGKESAMKKCTAQCWVEVRNKVHLFVTSDQ 780
            LEPDNHVIYRLLIQAYALYG  EQALKVRKLG+ESAMKKCTAQCWVEV NKV+ FV  D 
Sbjct: 721  LEPDNHVIYRLLIQAYALYGKSEQALKVRKLGRESAMKKCTAQCWVEVGNKVYFFVNGDH 780

Query: 781  SKLDVLNTWIKSIEGKVKKFNNHHQLSIDEEQKEE------------------------- 840
            SK+DVLNTWIK I GKVKKFNNHHQLSID+E KEE                         
Sbjct: 781  SKVDVLNTWIKGIVGKVKKFNNHHQLSIDDEPKEEKIGGFHCEKFAFAFGLIGSSHEPKR 840

Query: 841  -KIVRSL----------------------------FFPTQSFDDSRKIKPVLAGPFGGPG 900
             KIV++L                               T + D SRKIKP+ AGPFGG G
Sbjct: 841  IKIVKNLRICGDCHQMAKHRSSRKSFLSLLVLSFVSVTTMNSDGSRKIKPISAGPFGGTG 900

Query: 901  GNNWDDGVYSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDRHGGNGGTKTDTVAPYFVGS 960
            GN WDDGV+STIRQLVICHGAGIDSIKIQYDVKGSSIWSD+HGGNGGTKTDT        
Sbjct: 901  GNYWDDGVFSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDKHGGNGGTKTDT-------- 960

Query: 961  SIVCSVKLDFPDEYLTMIRGHYGSFVSFDKVFVRSLTFMSNKKKYGPYGVEQGTVFSFPT 1020
                 VKLDFPDEYLTMIRGHYGSFVSFDKV+VRSLTFMSNK+K+GPYGVE GT+FSFP 
Sbjct: 961  -----VKLDFPDEYLTMIRGHYGSFVSFDKVYVRSLTFMSNKRKFGPYGVELGTIFSFPA 1020

Query: 1021 TEGKIVGFHGRSGLYLDAIGVYLKPMAIQTPSKAMIQSQNYVASKTESEGYSIIQGSVGQ 1080
            TEGKIVGFHGRSGLYLDAIGVYLKPM IQTPSK MIQS NYVA K E+EGYSIIQGSVGQ
Sbjct: 1021 TEGKIVGFHGRSGLYLDAIGVYLKPMPIQTPSKGMIQSPNYVACKAENEGYSIIQGSVGQ 1080

Query: 1081 NYDIVLALRQKDEFKKPLPTTVSKQVSSSSSSESSDDESTVKRPVKKGLSKVENVVPCGP 1140
            NYDIVLALRQKDE KKPLP T+SKQVSSSSSSESSDDEST KRPVKKG SKVE  VPCGP
Sbjct: 1081 NYDIVLALRQKDELKKPLPNTISKQVSSSSSSESSDDESTDKRPVKKGPSKVETAVPCGP 1140

Query: 1141 WGGSGGTPFDDGYYTGIRQINVSRNVGIVYIRVLYACDEESIWGVRAGGTGGFKHDKVIL 1200
            WGGSGGT FDDG+Y+GIR+INVSRNVGIVYI+VLYA DEESIWG RAGG GGFKHDKV+ 
Sbjct: 1141 WGGSGGTTFDDGHYSGIREINVSRNVGIVYIKVLYAWDEESIWGTRAGGKGGFKHDKVVF 1200

Query: 1201 DYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKTKYGPFGEAQGTPFSTNVKEGKIVGF 1260
            DYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTK KYGPFGEA GTPFSTNVKEGKIVGF
Sbjct: 1201 DYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKAKYGPFGEALGTPFSTNVKEGKIVGF 1260

Query: 1261 HGRKGLFLDALGVHLVEGKVTPVSRPPSSDIVPAA-PPLLENENAPWTMKLAPSKGGALE 1320
            HGRKGLFLDALGVHLVEGKVTP SRPPSS+IVPAA PPLL NE  PWT K+APSKGG LE
Sbjct: 1261 HGRKGLFLDALGVHLVEGKVTPASRPPSSEIVPAARPPLLGNELVPWTKKVAPSKGGPLE 1320

Query: 1321 EIARGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTRSLEAFCSIQIEYDRNKQSV 1370
            EI RGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTRSLEAFCSIQIEYDRNKQSV
Sbjct: 1321 EITRGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTRSLEAFCSIQIEYDRNKQSV 1380

BLAST of HG10004824 vs. NCBI nr
Match: XP_023545984.1 (pentatricopeptide repeat-containing protein At1g19720-like isoform X2 [Cucurbita pepo subsp. pepo])

HSP 1 Score: 1998.0 bits (5175), Expect = 0.0e+00
Identity = 1035/1451 (71.33%), Positives = 1087/1451 (74.91%), Query Frame = 0

Query: 1    MEKLAIPCQTNPPISVPASIIKAKPLKFSSKPIKTSIFFTQKFTTKFNDDNLSYLCSNGL 60
            MEKLAIPCQT PPISVPASIIK KPLKFSSKP +T+IFFTQK ++K NDD+LSYLC +GL
Sbjct: 1    MEKLAIPCQTKPPISVPASIIKTKPLKFSSKPTQTTIFFTQKSSSKSNDDHLSYLCRHGL 60

Query: 61   LREAITSIDSMSKRGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQVNPFVETKL 120
            LREAI++IDSMS+ GSKLSTN YINLLQTCIDADSIE+GRELH R+  VDQVNPFVETKL
Sbjct: 61   LREAISAIDSMSRHGSKLSTNMYINLLQTCIDADSIEVGRELHVRLCLVDQVNPFVETKL 120

Query: 121  ------------------------------------------------------------ 180
                                                                        
Sbjct: 121  VSMYAKCGFLKDARKVFDEMPERNLYTWSAMIGAYSREQRWKEVVELFFLMMGDGVLPDA 180

Query: 181  --------ACGNCED-------------LETVKLIHSVVIRCGLSCYIRVSNSILTAFVK 240
                    ACGNCED             LET+KL+HSVVIRCGLSC +RVSNSILTA VK
Sbjct: 181  FLFPRIIQACGNCEDLETLKLMHSXCEELETLKLMHSVVIRCGLSCSMRVSNSILTALVK 240

Query: 241  CGKLCLARKFFGNMDERDGVSWNAIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIM 300
            CG L LARKFF NMDERD VSWNAIIAGYC+KG+GDEAR LLD M+DQGF PGLVT NI+
Sbjct: 241  CGNLSLARKFFENMDERDEVSWNAIIAGYCRKGHGDEARTLLDTMNDQGFNPGLVTCNIL 300

Query: 301  IASYSQLGNFNLVIELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAWV 360
            IASYSQLG  NLVIELKKKMESMGI PDVYTWTSMISGFAQSSRI+ ALDFFKEMILA V
Sbjct: 301  IASYSQLGKCNLVIELKKKMESMGITPDVYTWTSMISGFAQSSRINLALDFFKEMILAGV 360

Query: 361  EPNAITIASATSACASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARH 420
            EPNA+TI S TSACASLKSLQKGLEIHC AIKMGIAH++LVGNSLIDMYSKCGKLEAA H
Sbjct: 361  EPNAVTITSVTSACASLKSLQKGLEIHCLAIKMGIAHQVLVGNSLIDMYSKCGKLEAAHH 420

Query: 421  VFDTILEKDIYTWNSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGD 480
            VFDTILEKDIYTWNSMIGGYCQ GYCGKAYELFMRLRESNVMPNVVTWNVMISGCI NGD
Sbjct: 421  VFDTILEKDIYTWNSMIGGYCQGGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIHNGD 480

Query: 481  EDQAVNLFQIMEKDGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTIL 540
            EDQA+NLFQ+ME D EV  NTASWNSLIAGYH+LGEKNKALAIFRQMQSLNFNPNSVTIL
Sbjct: 481  EDQAMNLFQMMENDEEVNPNTASWNSLIAGYHRLGEKNKALAIFRQMQSLNFNPNSVTIL 540

Query: 541  SILPACANVMAEKKVKEIHGCVLRRNLESELPIANSLIDTYAKSGNIQYSRTIFDGMSSK 600
            SILP CANVMAEKK+KEIHGCVLRRNLESELP+ANSLIDTYAKSGNIQYSR IFDGMSSK
Sbjct: 541  SILPVCANVMAEKKIKEIHGCVLRRNLESELPVANSLIDTYAKSGNIQYSRNIFDGMSSK 600

Query: 601  DIITWNSIIAGYILHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAGMVDKGRHVF 660
            DIITWNSIIAGYILHGCSDAAFHLF QMK+ GIRPNR                       
Sbjct: 601  DIITWNSIIAGYILHGCSDAAFHLFDQMKRFGIRPNR----------------------- 660

Query: 661  SSITEEHQILPTLDHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSLLTACRFHGN 720
                                                                        
Sbjct: 661  ------------------------------------------------------------ 720

Query: 721  LHLAVQAAKRLHELEPDNHVIYRLLIQAYALYGNFEQALKVRKLGKESAMKKCTAQCWVE 780
                                                                        
Sbjct: 721  ------------------------------------------------------------ 780

Query: 781  VRNKVHLFVTSDQSKLDVLNTWIKSIEGKVKKFNNHHQLSIDEEQKEEKIVRSLFFPTQS 840
                                                              V+S FFP Q+
Sbjct: 781  --------------------------------------------------VKSSFFPAQN 840

Query: 841  FDDSRKIKPVLAGPFGGPGGNNWDDGVYSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDR 900
             D SRKIKP+ AGPFGG GGN WDDGV+STIRQLVICHGAGIDSIKIQYDVKGSSIWSD+
Sbjct: 841  SDGSRKIKPISAGPFGGTGGNYWDDGVFSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDK 900

Query: 901  HGGNGGTKTDTVAPYFVGSSIVCSVKLDFPDEYLTMIRGHYGSFVSFDKVFVRSLTFMSN 960
            HGGNGGTKTDT             VKLDFPDEYLTMIRGHYGSFVSFDKV+VRSLTFMSN
Sbjct: 901  HGGNGGTKTDT-------------VKLDFPDEYLTMIRGHYGSFVSFDKVYVRSLTFMSN 960

Query: 961  KKKYGPYGVEQGTVFSFPTTEGKIVGFHGRSGLYLDAIGVYLKPMAIQTPSKAMIQSQNY 1020
            K+K+GPYGVE GT+FSFP TEGKIVGFHGRSGLYLDAIGVYLKPM IQTPSK MIQS NY
Sbjct: 961  KRKFGPYGVELGTIFSFPATEGKIVGFHGRSGLYLDAIGVYLKPMPIQTPSKGMIQSPNY 1020

Query: 1021 VASKTESEGYSIIQGSVGQNYDIVLALRQKDEFKKPLPTTVSKQVSSSSSSESSDDESTV 1080
            VA K E+EGYSIIQGSVGQNYDIVLALRQKDEFKKPLP T+SKQVSSSSSSESSDDEST 
Sbjct: 1021 VACKAENEGYSIIQGSVGQNYDIVLALRQKDEFKKPLPNTISKQVSSSSSSESSDDESTD 1080

Query: 1081 KRPVKKGLSKVENVVPCGPWGGSGGTPFDDGYYTGIRQINVSRNVGIVYIRVLYACDEES 1140
            KR VKKG SKVEN VPCGPWGGSGGT FDDG+Y+GIR+INVSRNVGIVYI+VLYA DEES
Sbjct: 1081 KRLVKKGPSKVENAVPCGPWGGSGGTTFDDGHYSGIREINVSRNVGIVYIKVLYAWDEES 1140

Query: 1141 IWGVRAGGTGGFKHDKVILDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKTKYGPFG 1200
            IWG RAGG GGFKHDKV+ DYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTK KYGPFG
Sbjct: 1141 IWGTRAGGKGGFKHDKVVFDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKAKYGPFG 1200

Query: 1201 EAQGTPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKVTPVSRPPSSDIVPAA-PPLLE 1260
            EA GTPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKVTP SRPPSS+IVPAA PPLL 
Sbjct: 1201 EALGTPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKVTPASRPPSSEIVPAARPPLLG 1245

Query: 1261 NENAPWTMKLAPSKGGALEEIARGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTR 1320
            NE  PWT K+APSKGGALEEI RGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTR
Sbjct: 1261 NELVPWTKKVAPSKGGALEEITRGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTR 1245

Query: 1321 SLEAFCSIQIEYDRNKQSVWSVKHGGNSGTTIHRVKLNYPHEVLTCISGYYGYIGKDERQ 1370
            SLEAFCSIQIEYDRNKQSVWSVKHGGNSGT+IHRVKL+YPHEVLTCISGYYGY+GK ERQ
Sbjct: 1321 SLEAFCSIQIEYDRNKQSVWSVKHGGNSGTSIHRVKLDYPHEVLTCISGYYGYVGKGERQ 1245

BLAST of HG10004824 vs. ExPASy Swiss-Prot
Match: Q9FXH1 (Pentatricopeptide repeat-containing protein At1g19720 OS=Arabidopsis thaliana OX=3702 GN=DYW7 PE=2 SV=1)

HSP 1 Score: 749.6 bits (1934), Expect = 6.3e-215
Identity = 390/821 (47.50%), Positives = 530/821 (64.56%), Query Frame = 0

Query: 1   MEKLAIPC--QTNPPISVPASIIKAKPLKFSSKPIKTSIFFTQKFTTK-FNDDNLSYLCS 60
           MEKL +P   +T      PA +  +  L   S+  K ++ FT+K       D+   YLC 
Sbjct: 1   MEKLFVPSFPKTFLNYQTPAKVENSPELHPKSR--KKNLSFTKKKEPNIIPDEQFDYLCR 60

Query: 61  NGLLREAITSIDSMSKRGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQVNPFVE 120
           NG L EA  ++DS+ ++GSK+  +TY+ LL++CID+ SI LGR LH R     + + FVE
Sbjct: 61  NGSLLEAEKALDSLFQQGSKVKRSTYLKLLESCIDSGSIHLGRILHARFGLFTEPDVFVE 120

Query: 121 TKL--------------------------------------------------------- 180
           TKL                                                         
Sbjct: 121 TKLLSMYAKCGCIADARKVFDSMRERNLFTWSAMIGAYSRENRWREVAKLFRLMMKDGVL 180

Query: 181 -----------ACGNCEDLETVKLIHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLARKF 240
                       C NC D+E  K+IHSVVI+ G+S  +RVSNSIL  + KCG+L  A KF
Sbjct: 181 PDDFLFPKILQGCANCGDVEAGKVIHSVVIKLGMSSCLRVSNSILAVYAKCGELDFATKF 240

Query: 241 FGNMDERDGVSWNAIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQLGNF 300
           F  M ERD ++WN+++  YCQ G  +EA  L+  M  +G  PGLVT+NI+I  Y+QLG  
Sbjct: 241 FRRMRERDVIAWNSVLLAYCQNGKHEEAVELVKEMEKEGISPGLVTWNILIGGYNQLGKC 300

Query: 301 NLVIELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAWVEPNAITIASA 360
           +  ++L +KME+ GI  DV+TWT+MISG   +    QALD F++M LA V PNA+TI SA
Sbjct: 301 DAAMDLMQKMETFGITADVFTWTAMISGLIHNGMRYQALDMFRKMFLAGVVPNAVTIMSA 360

Query: 361 TSACASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARHVFDTILEKDI 420
            SAC+ LK + +G E+H  A+KMG   ++LVGNSL+DMYSKCGKLE AR VFD++  KD+
Sbjct: 361 VSACSCLKVINQGSEVHSIAVKMGFIDDVLVGNSLVDMYSKCGKLEDARKVFDSVKNKDV 420

Query: 421 YTWNSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQAVNLFQI 480
           YTWNSMI GYCQAGYCGKAYELF R++++N+ PN++TWN MISG I+NGDE +A++LFQ 
Sbjct: 421 YTWNSMITGYCQAGYCGKAYELFTRMQDANLRPNIITWNTMISGYIKNGDEGEAMDLFQR 480

Query: 481 MEKDGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTILSILPACANVM 540
           MEKDG+V+RNTA+WN +IAGY Q G+K++AL +FR+MQ   F PNSVTILS+LPACAN++
Sbjct: 481 MEKDGKVQRNTATWNLIIAGYIQNGKKDEALELFRKMQFSRFMPNSVTILSLLPACANLL 540

Query: 541 AEKKVKEIHGCVLRRNLESELPIANSLIDTYAKSGNIQYSRTIFDGMSSKDIITWNSIIA 600
             K V+EIHGCVLRRNL++   + N+L DTYAKSG+I+YSRTIF GM +KDIITWNS+I 
Sbjct: 541 GAKMVREIHGCVLRRNLDAIHAVKNALTDTYAKSGDIEYSRTIFLGMETKDIITWNSLIG 600

Query: 601 GYILHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAGMVDKGRHVFSSITEEHQIL 660
           GY+LHG    A  LF QMK  GI PNRGTL+S+I A+G+ G VD+G+ VF SI  ++ I+
Sbjct: 601 GYVLHGSYGPALALFNQMKTQGITPNRGTLSSIILAHGLMGNVDEGKKVFYSIANDYHII 660

Query: 661 PTLDHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSLLTACRFHGNLHLAVQAAKR 720
           P L+H  AMV LYGR+ RL +A++FI++M I+ +  IW S LT CR HG++ +A+ AA+ 
Sbjct: 661 PALEHCSAMVYLYGRANRLEEALQFIQEMNIQSETPIWESFLTGCRIHGDIDMAIHAAEN 720

Query: 721 LHELEPDNHVIYRLLIQAYALYGNFEQALKVRKLGKESAMKKCTAQCWVEVRNKVHLFVT 748
           L  LEP+N     ++ Q YAL     ++L+  K  +++ +KK   Q W+EVRN +H F T
Sbjct: 721 LFSLEPENTATESIVSQIYALGAKLGRSLEGNKPRRDNLLKKPLGQSWIEVRNLIHTFTT 780

BLAST of HG10004824 vs. ExPASy Swiss-Prot
Match: F4HQX1 (Jacalin-related lectin 3 OS=Arabidopsis thaliana OX=3702 GN=JAL3 PE=2 SV=1)

HSP 1 Score: 682.2 bits (1759), Expect = 1.2e-194
Identity = 332/611 (54.34%), Positives = 436/611 (71.36%), Query Frame = 0

Query: 767  KPVLAGPFGGPGGNNWDDGVYSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDRHGGNGGT 826
            KP   GP+GG  G+ WDDG+Y+T++Q++I HG+GIDSI+I+YD  GSS+WS++ GG GG 
Sbjct: 12   KPASLGPWGGQSGHAWDDGMYTTVKQIIIAHGSGIDSIQIEYDKNGSSVWSEKRGGKGGK 71

Query: 827  KTDTVAPYFVGSSIVCSVKLDFPDEYLTMIRGHYGSFVSFDKVFVRSLTFMSNKKKYGPY 886
            K D              VK D+P EYL  + G YGSF  +  + VRSLTF SN++KYGP+
Sbjct: 72   KFD-------------KVKFDYPHEYLISVNGTYGSFDVWGTICVRSLTFESNRRKYGPF 131

Query: 887  GVEQGTVFSFPTTEGKIVGFHGRSGLYLDAIGVYLKPMAIQT--PSKAMIQSQNYVASKT 946
            GV+ GT F+ P +  KI+GFHG++G YLDAIGV+ +P+  +    SK ++ S    +   
Sbjct: 132  GVDSGTFFALPKSGSKIIGFHGKAGWYLDAIGVHTQPIPKENNPSSKILLHSHQSFSQGD 191

Query: 947  ESEGYSIIQGSVGQNYDIVLALRQKDEFKKPLPTTVSKQVSSSSSSESSDDESTVKRPVK 1006
            +   YS++QGSVGQN+DIV+ LR+KD      PT  S +   S+ +E +  +  +    +
Sbjct: 192  KKHEYSVLQGSVGQNFDIVVTLRKKD------PTLPSFESRDSAGAEVT--KHKLVTDTE 251

Query: 1007 KGLSKVE-NVVPCGPWGGSGGTPFDDGYYTGIRQINVSRNVGIVYIRVLYACDEESIWGV 1066
            K  SK+E      GPWGG+GG  FDDG YTGIRQIN+SRNVGIV ++V Y    +++WG 
Sbjct: 252  KSQSKIEGGAKTYGPWGGTGGIMFDDGIYTGIRQINLSRNVGIVSMKVCYDFRGQAVWGS 311

Query: 1067 RAGGTGGFKHDKVILDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKTKYGPFGEAQG 1126
            + GG GGFKHDK++ DYP E+LTHVTG YGP+MYMGPNVIKSLTF T + K+GP+GE QG
Sbjct: 312  KHGGVGGFKHDKIVFDYPSEVLTHVTGTYGPLMYMGPNVIKSLTFRTNRGKHGPYGEEQG 371

Query: 1127 TPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKVTPVS-RPPSSDIVP-AAPPLLENEN 1186
              F+  + EGK+VGF GR+GLFLD++GVH++E K++ +    P + IVP       + EN
Sbjct: 372  PSFTHQMDEGKVVGFLGREGLFLDSIGVHVMECKISSLKPSSPHNAIVPHNNSGTAQIEN 431

Query: 1187 APWTMKLAPSKGGALEEIARGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTRSLE 1246
            +PW  KL  +  G  EE+ RGVVKEP P GPGPWGGDGG+ WDDGVFSGIKQI++TR  +
Sbjct: 432  SPWANKLVLAANGHGEEVDRGVVKEPTPSGPGPWGGDGGQAWDDGVFSGIKQIFVTRGND 491

Query: 1247 AFCSIQIEYDRNKQSVWSVKHGGNS-GTTIHRVKLNYPHEVLTCISGYYGYIGKDERQQA 1306
            A  SIQIEYDRN QSVWS+KHGG+S G   HR+K  YP E +TCISGYYG +   +R   
Sbjct: 492  AITSIQIEYDRNGQSVWSIKHGGDSNGVATHRIKFEYPDESITCISGYYGPLNNSDRYNV 551

Query: 1307 IKSLTFHTSRGKFGPFGEEVGSFFTSTTTEGKVVGFHGRSSLYLDAIGVHMQHWLGSQRA 1366
            +KSL+F+TSRG++GP+GEE G+FFTSTTT+GKV+GFHGRSS +LDAIGVHMQHWLG+ ++
Sbjct: 552  VKSLSFYTSRGRYGPYGEETGTFFTSTTTQGKVLGFHGRSSFHLDAIGVHMQHWLGNNKS 601

Query: 1367 --SKSSLFKLF 1370
              S++S FKLF
Sbjct: 612  YYSRASCFKLF 601

BLAST of HG10004824 vs. ExPASy Swiss-Prot
Match: Q9FM64 (Pentatricopeptide repeat-containing protein At5g55740, chloroplastic OS=Arabidopsis thaliana OX=3702 GN=CRR21 PE=2 SV=1)

HSP 1 Score: 355.1 bits (910), Expect = 3.5e-96
Identity = 215/805 (26.71%), Positives = 388/805 (48.20%), Query Frame = 0

Query: 26  LKFSSKPIKTSIFFTQKFTTKFNDD------------NLSYLCSNGLLREAITSIDSMSK 85
           L F++ P K     + K ++K +D+             +S LC NG ++EA++ +  M  
Sbjct: 4   LPFNTIPNKVPFSVSSKPSSKHHDEQAHSPSSTSYFHRVSSLCKNGEIKEALSLVTEMDF 63

Query: 86  RGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQV---NPFVETKLAC--GNCEDL 145
           R  ++    Y  +LQ C+    +  G+++H R+         N ++ETKL      C+ L
Sbjct: 64  RNLRIGPEIYGEILQGCVYERDLSTGKQIHARILKNGDFYARNEYIETKLVIFYAKCDAL 123

Query: 146 ETVKL------------------------------------------------------- 205
           E  ++                                                       
Sbjct: 124 EIAEVLFSKLRVRNVFSWAAIIGVKCRIGLCEGALMGFVEMLENEIFPDNFVVPNVCKAC 183

Query: 206 -----------IHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLARKFFGNMDERDGVSWN 265
                      +H  V++ GL   + V++S+   + KCG L  A K F  + +R+ V+WN
Sbjct: 184 GALKWSRFGRGVHGYVVKSGLEDCVFVASSLADMYGKCGVLDDASKVFDEIPDRNAVAWN 243

Query: 266 AIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQLG--------------- 325
           A++ GY Q G  +EA RL   M  QG +P  VT +  +++ + +G               
Sbjct: 244 ALMVGYVQNGKNEEAIRLFSDMRKQGVEPTRVTVSTCLSASANMGGVEEGKQSHAIAIVN 303

Query: 326 -------------NFNL---VIELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFF 385
                        NF     +IE  + +       DV TW  +ISG+ Q   +  A+   
Sbjct: 304 GMELDNILGTSLLNFYCKVGLIEYAEMVFDRMFEKDVVTWNLIISGYVQQGLVEDAIYMC 363

Query: 386 KEMILAWVEPNAITIASATSACASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKC 445
           + M L  ++ + +T+A+  SA A  ++L+ G E+ C+ I+     +I++ ++++DMY+KC
Sbjct: 364 QLMRLEKLKYDCVTLATLMSAAARTENLKLGKEVQCYCIRHSFESDIVLASTVMDMYAKC 423

Query: 446 GKLEAARHVFDTILEKDIYTWNSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMI 505
           G +  A+ VFD+ +EKD+  WN+++  Y ++G  G+A  LF  ++   V PNV+TWN++I
Sbjct: 424 GSIVDAKKVFDSTVEKDLILWNTLLAAYAESGLSGEALRLFYGMQLEGVPPNVITWNLII 483

Query: 506 SGCIQNGDEDQAVNLFQIMEKDGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNF 565
              ++NG  D+A ++F  M+  G +  N  SW +++ G  Q G   +A+   R+MQ    
Sbjct: 484 LSLLRNGQVDEAKDMFLQMQSSGIIP-NLISWTTMMNGMVQNGCSEEAILFLRKMQESGL 543

Query: 566 NPNSVTILSILPACANVMAEKKVKEIHGCVLRRNLESEL-PIANSLIDTYAKSGNIQYSR 625
            PN+ +I   L ACA++ +    + IHG ++R    S L  I  SL+D YAK G+I  + 
Sbjct: 544 RPNAFSITVALSACAHLASLHIGRTIHGYIIRNLQHSSLVSIETSLVDMYAKCGDINKAE 603

Query: 626 TIFDGMSSKDIITWNSIIAGYILHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAG 685
            +F      ++   N++I+ Y L+G    A  L+  ++ +G++P+  T+ +++ A   AG
Sbjct: 604 KVFGSKLYSELPLSNAMISAYALYGNLKEAIALYRSLEGVGLKPDNITITNVLSACNHAG 663

Query: 686 MVDKGRHVFSSITEEHQILPTLDHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSL 714
            +++   +F+ I  +  + P L+HY  MVDL   +G    A+  IE+MP +PD  +  SL
Sbjct: 664 DINQAIEIFTDIVSKRSMKPCLEHYGLMVDLLASAGETEKALRLIEEMPFKPDARMIQSL 723

BLAST of HG10004824 vs. ExPASy Swiss-Prot
Match: Q9LFI1 (Pentatricopeptide repeat-containing protein At3g53360, mitochondrial OS=Arabidopsis thaliana OX=3702 GN=PCMP-E86 PE=2 SV=1)

HSP 1 Score: 354.4 bits (908), Expect = 5.9e-96
Identity = 227/747 (30.39%), Positives = 365/747 (48.86%), Query Frame = 0

Query: 29  SSKPIKTSIFFTQKFTTKFNDDNLSYLCSNGLLREAITSIDSMSKRGS-KLSTNTYINLL 88
           +S+ + TS   +   T +  +D+++ LC +   REA+ + D   K  S K+   TYI+L+
Sbjct: 15  NSQILATSSVVSTIKTEELMNDHINSLCKSNFYREALEAFDFAQKNSSFKIRLRTYISLI 74

Query: 89  QTCIDADSIELGRELHDRMSSVDQVNPFVETKLACGNCEDLETVKLIHSVVIRCGLSCYI 148
             C  + S+  GR++HD + +               NC+  +T+                
Sbjct: 75  CACSSSRSLAQGRKIHDHILN--------------SNCK-YDTI---------------- 134

Query: 149 RVSNSILTAFVKCGKLCLARKFFGNMDERDGVSWNAIIAGYCQKGNGDEARRLLDAM--- 208
            ++N IL+ + KCG L  AR+ F  M ER+ VS+ ++I GY Q G G EA RL   M   
Sbjct: 135 -LNNHILSMYGKCGSLRDAREVFDFMPERNLVSYTSVITGYSQNGQGAEAIRLYLKMLQE 194

Query: 209 -------------------SDQGFKPGLVTYNIMIASYSQLGNFNLVIELKKKMESMGIA 268
                              SD G    L    I + S S L   N +I +  +   M  A
Sbjct: 195 DLVPDQFAFGSIIKACASSSDVGLGKQLHAQVIKLESSSHLIAQNALIAMYVRFNQMSDA 254

Query: 269 ---------PDVYTWTSMISGFAQSSRISQALDFFKEMILAWV-EPNAITIASATSACAS 328
                     D+ +W+S+I+GF+Q     +AL   KEM+   V  PN     S+  AC+S
Sbjct: 255 SRVFYGIPMKDLISWSSIIAGFSQLGFEFEALSHLKEMLSFGVFHPNEYIFGSSLKACSS 314

Query: 329 LKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARHVFDTILEKDIYTWNSM 388
           L     G +IH   IK  +A   + G SL DMY++CG L +AR VFD I   D  +WN +
Sbjct: 315 LLRPDYGSQIHGLCIKSELAGNAIAGCSLCDMYARCGFLNSARRVFDQIERPDTASWNVI 374

Query: 389 IGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQAVNLFQIMEKDG- 448
           I G    GY  +A  +F ++R S  +P+ ++   ++    +     Q + +   + K G 
Sbjct: 375 IAGLANNGYADEAVSVFSQMRSSGFIPDAISLRSLLCAQTKPMALSQGMQIHSYIIKWGF 434

Query: 449 ---------------------------EVKRNTA---SWNSLIAGYHQLGEKNKALAIFR 508
                                      E  RN A   SWN+++    Q  +  + L +F+
Sbjct: 435 LADLTVCNSLLTMYTFCSDLYCCFNLFEDFRNNADSVSWNTILTACLQHEQPVEMLRLFK 494

Query: 509 QMQSLNFNPNSVTILSILPACANVMAEKKVKEIHGCVLRRNLESELPIANSLIDTYAKSG 568
            M      P+ +T+ ++L  C  + + K   ++H   L+  L  E  I N LID YAK G
Sbjct: 495 LMLVSECEPDHITMGNLLRGCVEISSLKLGSQVHCYSLKTGLAPEQFIKNGLIDMYAKCG 554

Query: 569 NIQYSRTIFDGMSSKDIITWNSIIAGYILHGCSDAAFHLFGQMKKLGIRPNRGTLASLIH 628
           ++  +R IFD M ++D+++W+++I GY   G  + A  LF +MK  GI PN  T   ++ 
Sbjct: 555 SLGQARRIFDSMDNRDVVSWSTLIVGYAQSGFGEEALILFKEMKSAGIEPNHVTFVGVLT 614

Query: 629 AYGIAGMVDKGRHVFSSITEEHQILPTLDHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDG 688
           A    G+V++G  +++++  EH I PT +H   +VDL  R+GRL +A  FI++M +EPD 
Sbjct: 615 ACSHVGLVEEGLKLYATMQTEHGISPTKEHCSCVVDLLARAGRLNEAERFIDEMKLEPDV 674

Query: 689 SIWTSLLTACRFHGNLHLAVQAAKRLHELEPDNHVIYRLLIQAYALYGNFEQALKVRKLG 712
            +W +LL+AC+  GN+HLA +AA+ + +++P N   + LL   +A  GN+E A  +R   
Sbjct: 675 VVWKTLLSACKTQGNVHLAQKAAENILKIDPFNSTAHVLLCSMHASSGNWENAALLRSSM 729

BLAST of HG10004824 vs. ExPASy Swiss-Prot
Match: Q9SY02 (Pentatricopeptide repeat-containing protein At4g02750 OS=Arabidopsis thaliana OX=3702 GN=PCMP-H24 PE=3 SV=1)

HSP 1 Score: 352.1 bits (902), Expect = 2.9e-95
Identity = 225/738 (30.49%), Positives = 376/738 (50.95%), Query Frame = 0

Query: 52  LSYLCSNGLLREAITSIDSMSKRGSKLSTNTYINLLQT-----CIDADSIELGRELHDRM 111
           L Y   NGL R    +  + +    K +T T I   QT     C D+D            
Sbjct: 16  LHYTSLNGLKRRCNNAHGAANFHSLKRATQTQIQKSQTKPLLKCGDSD------------ 75

Query: 112 SSVDQVNPFVETKLACGNCEDLETVKLIHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLA 171
             + + N  + + +  G C   E +++   +     +S      N +++ +++ G+  LA
Sbjct: 76  --IKEWNVAISSYMRTGRCN--EALRVFKRMPRWSSVS-----YNGMISGYLRNGEFELA 135

Query: 172 RKFFGNMDERDGVSWNAIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQL 231
           RK F  M ERD VSWN +I GY +  N  +AR L + M ++     + ++N M++ Y+Q 
Sbjct: 136 RKLFDEMPERDLVSWNVMIKGYVRNRNLGKARELFEIMPER----DVCSWNTMLSGYAQN 195

Query: 232 GNFNLVIELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAWVEPNAITI 291
           G    V + +   + M    DV +W +++S + Q+S++ +A   FK             +
Sbjct: 196 G---CVDDARSVFDRMPEKNDV-SWNALLSAYVQNSKMEEACMLFKSR-------ENWAL 255

Query: 292 ASATSACASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARHVFDTILE 351
            S           +K +E   F   M +  +++  N++I  Y++ GK++ AR +FD    
Sbjct: 256 VSWNCLLGGFVKKKKIVEARQFFDSMNV-RDVVSWNTIITGYAQSGKIDEARQLFDESPV 315

Query: 352 KDIYTWNSMIGGYCQAGYCGKAYELFMRLRESN-------------------------VM 411
           +D++TW +M+ GY Q     +A ELF ++ E N                         VM
Sbjct: 316 QDVFTWTAMVSGYIQNRMVEEARELFDKMPERNEVSWNAMLAGYVQGERMEMAKELFDVM 375

Query: 412 P--NVVTWNVMISGCIQNGDEDQAVNLFQIMEKDGEVKRNTASWNSLIAGYHQLGEKNKA 471
           P  NV TWN MI+G  Q G   +A NLF  M      KR+  SW ++IAGY Q G   +A
Sbjct: 376 PCRNVSTWNTMITGYAQCGKISEAKNLFDKMP-----KRDPVSWAAMIAGYSQSGHSFEA 435

Query: 472 LAIFRQMQSLNFNPNSVTILSILPACANVMAEKKVKEIHGCVLRRNLESELPIANSLIDT 531
           L +F QM+      N  +  S L  CA+V+A +  K++HG +++   E+   + N+L+  
Sbjct: 436 LRLFVQMEREGGRLNRSSFSSALSTCADVVALELGKQLHGRLVKGGYETGCFVGNALLLM 495

Query: 532 YAKSGNIQYSRTIFDGMSSKDIITWNSIIAGYILHGCSDAAFHLFGQMKKLGIRPNRGTL 591
           Y K G+I+ +  +F  M+ KDI++WN++IAGY  HG  + A   F  MK+ G++P+  T+
Sbjct: 496 YCKCGSIEEANDLFKEMAGKDIVSWNTMIAGYSRHGFGEVALRFFESMKREGLKPDDATM 555

Query: 592 ASLIHAYGIAGMVDKGRHVFSSITEEHQILPTLDHYLAMVDLYGRSGRLTDAIEFIEDMP 651
            +++ A    G+VDKGR  F ++T+++ ++P   HY  MVDL GR+G L DA   +++MP
Sbjct: 556 VAVLSACSHTGLVDKGRQYFYTMTQDYGVMPNSQHYACMVDLLGRAGLLEDAHNLMKNMP 615

Query: 652 IEPDGSIWTSLLTACRFHGNLHLAVQAAKRLHELEPDNHVIYRLLIQAYALYGNFEQALK 711
            EPD +IW +LL A R HGN  LA  AA ++  +EP+N  +Y LL   YA  G +    K
Sbjct: 616 FEPDAAIWGTLLGASRVHGNTELAETAADKIFAMEPENSGMYVLLSNLYASSGRWGDVGK 675

Query: 712 VRKLGKESAMKKCTAQCWVEVRNKVHLFVTSDQ--SKLDVLNTWIKSIEGKVKKFNNHHQ 752
           +R   ++  +KK     W+E++NK H F   D+   + D +  +++ ++ ++KK     +
Sbjct: 676 LRVRMRDKGVKKVPGYSWIEIQNKTHTFSVGDEFHPEKDEIFAFLEELDLRMKKAGYVSK 711

BLAST of HG10004824 vs. ExPASy TrEMBL
Match: A0A6J1HFG7 (pentatricopeptide repeat-containing protein At1g19720 OS=Cucurbita moschata OX=3662 GN=LOC111462963 PE=3 SV=1)

HSP 1 Score: 2370.9 bits (6143), Expect = 0.0e+00
Identity = 1204/1556 (77.38%), Positives = 1265/1556 (81.30%), Query Frame = 0

Query: 1    MEKLAIPCQTNPPISVPASIIKAKPLKFSSKPIKTSIFFTQKFTTKFNDDNLSYLCSNGL 60
            MEKLAIPCQT PPISVPASIIK KPLKFSSKP +T+IFFTQK ++K NDD+LSYLC +GL
Sbjct: 1    MEKLAIPCQTKPPISVPASIIKTKPLKFSSKPTQTTIFFTQKSSSKSNDDHLSYLCRHGL 60

Query: 61   LREAITSIDSMSKRGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQVNPFVETKL 120
            LREAI++IDSMS+ GSKLSTNTYINLLQTCIDADSIE+GRELH R+  VDQVNPFVETKL
Sbjct: 61   LREAISAIDSMSRHGSKLSTNTYINLLQTCIDADSIEVGRELHVRLCLVDQVNPFVETKL 120

Query: 121  ------------------------------------------------------------ 180
                                                                        
Sbjct: 121  VSMYAKCGFLKDARKVFDEMPERNLYTWSAMIGAYSREQRWKEVVELFFLMMGDGVLPDA 180

Query: 181  --------ACGNCEDLETVKLIHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLARKFFGN 240
                    ACGNCEDLET+KL+HSVVIRCGLSC +RVSNSILTA VKCG L LARKFF N
Sbjct: 181  FLFPRILQACGNCEDLETLKLMHSVVIRCGLSCSMRVSNSILTALVKCGNLSLARKFFEN 240

Query: 241  MDERDGVSWNAIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQLGNFNLV 300
            MDERD VSWNAIIAGYC+KG+GDEAR LLD M+DQGFKPGLVT NI+IASYSQLG  NLV
Sbjct: 241  MDERDEVSWNAIIAGYCRKGHGDEARTLLDTMNDQGFKPGLVTCNILIASYSQLGKCNLV 300

Query: 301  IELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAWVEPNAITIASATSA 360
            IELKKKMESMGI PDVYTWTSMISGFAQSSRI+ ALDFFKEMILA VEPNA+TI S TSA
Sbjct: 301  IELKKKMESMGITPDVYTWTSMISGFAQSSRINLALDFFKEMILAGVEPNAVTITSVTSA 360

Query: 361  CASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARHVFDTILEKDIYTW 420
            CASLKSLQKGLEIHC AIKMGIAH++LVGNSLIDMYSKCGKLEAA HVFDTILEKDIYTW
Sbjct: 361  CASLKSLQKGLEIHCLAIKMGIAHQVLVGNSLIDMYSKCGKLEAAHHVFDTILEKDIYTW 420

Query: 421  NSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQAVNLFQIMEK 480
            NSMIGGYCQ GYCGKAYELFMRLRESNVMPNVVTWNVMISGCI NGDEDQA+NLFQ+ME 
Sbjct: 421  NSMIGGYCQGGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIHNGDEDQAMNLFQMMEN 480

Query: 481  DGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTILSILPACANVMAEK 540
            D EV  NTASWNSLIAGYH+LGEKNKALAIFRQMQSLNFNPNSVTILSILP CANVMAEK
Sbjct: 481  DEEVNPNTASWNSLIAGYHRLGEKNKALAIFRQMQSLNFNPNSVTILSILPVCANVMAEK 540

Query: 541  KVKEIHGCVLRRNLESELPIANSLIDTYAKSGNIQYSRTIFDGMSSKDIITWNSIIAGYI 600
            K+KEIHGCVLRRNLESELP+ANSLIDTYAKSGNIQYSR IFDGMSSKDIITWNSIIAGYI
Sbjct: 541  KIKEIHGCVLRRNLESELPVANSLIDTYAKSGNIQYSRNIFDGMSSKDIITWNSIIAGYI 600

Query: 601  LHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAGMVDKGRHVFSSITEEHQILPTL 660
            LHGCSDAAFHLF QMK+ GIRPNRGTLAS+I+A GIAGMVD+GRHVFSSITEEHQILPTL
Sbjct: 601  LHGCSDAAFHLFDQMKRFGIRPNRGTLASIIYACGIAGMVDRGRHVFSSITEEHQILPTL 660

Query: 661  DHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSLLTACRFHGNLHLAVQAAKRLHE 720
            DHY AMVDLYGRSGRLTDAIEFIE+MP EPD SIWTSLLTA RFHGNLHLAV+AA+ L E
Sbjct: 661  DHYSAMVDLYGRSGRLTDAIEFIENMPTEPDVSIWTSLLTASRFHGNLHLAVRAAEHLLE 720

Query: 721  LEPDNHVIYRLLIQAYALYGNFEQALKVRKLGKESAMKKCTAQCWVEVRNKVHLFVTSDQ 780
            LEPDNHVIYRLLIQAYALYG  EQALKVRKLG+ESAMKKCTAQCWVEV NKV+ FV  D 
Sbjct: 721  LEPDNHVIYRLLIQAYALYGKSEQALKVRKLGRESAMKKCTAQCWVEVGNKVYFFVNGDH 780

Query: 781  SKLDVLNTWIKSIEGKVKKFNNHHQLSIDEEQKEEKI----------------------- 840
            SK+DVLNTWIK I GKVKKFNNHHQLSID+E KEEKI                       
Sbjct: 781  SKVDVLNTWIKGIVGKVKKFNNHHQLSIDDEPKEEKIGGFHCEKFAFAFGLIGSSHKPKR 840

Query: 841  ------------------------------------------------------------ 900
                                                                        
Sbjct: 841  IKIVKNLRICGDCHQMAKYVSEAHGCEIYLSDSKCLHHFKNGCCSCGDYCIILIAFRHYM 900

Query: 901  ---VRSLFFP--------------------------------TQSFDDSRKIKPVLAGPF 960
               V SL  P                                T + D SRKIKP+ AGPF
Sbjct: 901  LVWVHSLLIPHRPWHTAIVLHRHRSSRKSLLSLLVLSFVSVTTMNSDGSRKIKPISAGPF 960

Query: 961  GGPGGNNWDDGVYSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDRHGGNGGTKTDTVAPY 1020
            GG GGN WDDGV+STIRQLVICHGAGIDSIKIQYDVKGSSIWSD+HGGNGGTKTDT    
Sbjct: 961  GGTGGNYWDDGVFSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDKHGGNGGTKTDT---- 1020

Query: 1021 FVGSSIVCSVKLDFPDEYLTMIRGHYGSFVSFDKVFVRSLTFMSNKKKYGPYGVEQGTVF 1080
                     VKLDFPDEYLTMIRGHYGSFVSFDKV+VRSLTFMSNK+K+GPYGVE GT+F
Sbjct: 1021 ---------VKLDFPDEYLTMIRGHYGSFVSFDKVYVRSLTFMSNKRKFGPYGVELGTIF 1080

Query: 1081 SFPTTEGKIVGFHGRSGLYLDAIGVYLKPMAIQTPSKAMIQSQNYVASKTESEGYSIIQG 1140
            SFP TEGKIVGFHGRSGLYLDAIGVYLKPM IQTPSK MIQS NYVA K ESEGYSIIQG
Sbjct: 1081 SFPATEGKIVGFHGRSGLYLDAIGVYLKPMPIQTPSKGMIQSPNYVACKAESEGYSIIQG 1140

Query: 1141 SVGQNYDIVLALRQKDEFKKPLPTTVSKQVSSSSSSESSDDESTVKRPVKKGLSKVENVV 1200
            SVGQNYDIVLALRQKDEFKKPLP T+SKQVSSSSSSESSDDEST KRPVKKG SKVEN V
Sbjct: 1141 SVGQNYDIVLALRQKDEFKKPLPNTISKQVSSSSSSESSDDESTDKRPVKKGPSKVENAV 1200

Query: 1201 PCGPWGGSGGTPFDDGYYTGIRQINVSRNVGIVYIRVLYACDEESIWGVRAGGTGGFKHD 1260
            PCGPWGGSGGT FDDG+Y+GIR+INVSRNVGIVYI+VLYA DEESIWG RAGG GGFKHD
Sbjct: 1201 PCGPWGGSGGTTFDDGHYSGIREINVSRNVGIVYIKVLYAWDEESIWGTRAGGKGGFKHD 1260

Query: 1261 KVILDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKTKYGPFGEAQGTPFSTNVKEGK 1320
            KV+ DYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTK KYGPFGEA GTPFSTNVKEGK
Sbjct: 1261 KVVFDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKAKYGPFGEALGTPFSTNVKEGK 1320

Query: 1321 IVGFHGRKGLFLDALGVHLVEGKVTPVSRPPSSDIVPAA-PPLLENENAPWTMKLAPSKG 1370
            IVGFHGRKGLFLDALGVHLVEGKVTP SRPPSS+IVPAA PPLL NE  PWT K+APSKG
Sbjct: 1321 IVGFHGRKGLFLDALGVHLVEGKVTPASRPPSSEIVPAARPPLLGNELVPWTKKVAPSKG 1380

BLAST of HG10004824 vs. ExPASy TrEMBL
Match: A0A6J1K2S7 (LOW QUALITY PROTEIN: uncharacterized protein LOC111491877 OS=Cucurbita maxima OX=3661 GN=LOC111491877 PE=3 SV=1)

HSP 1 Score: 1989.5 bits (5153), Expect = 0.0e+00
Identity = 1031/1466 (70.33%), Positives = 1088/1466 (74.22%), Query Frame = 0

Query: 1    MEKLAIPCQTNPPISVPASIIKAKPLKFSSKPIKTSIFFTQKFTTKFNDDNLSYLCSNGL 60
            MEKLAIPCQT PPISVPASIIK KPLKFSSKP +T+IFFTQK ++K NDD+LSYLC +GL
Sbjct: 1    MEKLAIPCQTKPPISVPASIIKTKPLKFSSKPTQTTIFFTQKTSSKSNDDHLSYLCRHGL 60

Query: 61   LREAITSIDSMSKRGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQVNPFVETKL 120
            LREAI +IDSMS+ GSKLSTNTYINLLQTCIDADSIE+GRELH R+  VDQVNPFVETKL
Sbjct: 61   LREAIAAIDSMSRHGSKLSTNTYINLLQTCIDADSIEVGRELHVRLCLVDQVNPFVETKL 120

Query: 121  ------------------------------------------------------------ 180
                                                                        
Sbjct: 121  VSMYAKCGFLKDARKVFDEMLERNLYTWSAMIGGYSREQRWTEVVELFFLMMGDGVLPDA 180

Query: 181  --------ACGNCEDLETVKLIHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLARKFFGN 240
                    ACGNCEDLET+KL+HSVVIRCGLSC +RVSNSILTA VKCG L LARKFF N
Sbjct: 181  FLFPRILQACGNCEDLETLKLMHSVVIRCGLSCSMRVSNSILTALVKCGNLSLARKFFEN 240

Query: 241  MDERDGVSWNAIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQLGNFNLV 300
            MDERD VSWNAIIAGYC+KG+GDEAR LLD M+DQGFKPGLVT NI+IASYSQLG  NLV
Sbjct: 241  MDERDEVSWNAIIAGYCRKGHGDEARTLLDTMNDQGFKPGLVTCNILIASYSQLGKCNLV 300

Query: 301  IELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAWVEPNAITIASATSA 360
            IELKKKMESMGI PDVYTWTSMISGFAQSSRI+ ALDFFKEMILA VEPNA+TI S +SA
Sbjct: 301  IELKKKMESMGITPDVYTWTSMISGFAQSSRINLALDFFKEMILAGVEPNAVTITSVSSA 360

Query: 361  CASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARHVFDTILEKDIYTW 420
            CASLKSLQKGLEIHC AIKMGIAH++LVGNSLIDMYSKCGKLEAA HVFDTILEKDIYTW
Sbjct: 361  CASLKSLQKGLEIHCLAIKMGIAHQVLVGNSLIDMYSKCGKLEAAHHVFDTILEKDIYTW 420

Query: 421  NSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQAVNLFQIMEK 480
            NSMIGGYCQ GYCGKAYELFMR+RESNVMPNVVTWNVMISGCI NGDEDQA+NLFQ+ME 
Sbjct: 421  NSMIGGYCQGGYCGKAYELFMRIRESNVMPNVVTWNVMISGCIHNGDEDQAMNLFQMMEN 480

Query: 481  DGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTILSILPACANVMAEK 540
            DGEV  NTASWNSLIAGYH+LGEKNKALAIFRQMQSLNFNPNSVTILSILP CANVMAEK
Sbjct: 481  DGEVNPNTASWNSLIAGYHRLGEKNKALAIFRQMQSLNFNPNSVTILSILPVCANVMAEK 540

Query: 541  KVKEIHGCVLRRNLESELPIANSLIDTYAKSGNIQYSRTIFDGMSSKDIITWNSIIAGYI 600
            K+KEIHGCVLRRNLE+ELP+ANSLIDTYAKSGNIQYSR IFDGM SKDIITWNSIIAGY 
Sbjct: 541  KIKEIHGCVLRRNLETELPVANSLIDTYAKSGNIQYSRNIFDGMLSKDIITWNSIIAGYT 600

Query: 601  LHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAGMVDKGRHVFSSITEEHQILPTL 660
            LHGCSDAAFHLF QMK+ GIRPNRGTLA  +                             
Sbjct: 601  LHGCSDAAFHLFDQMKRFGIRPNRGTLAICL----------------------------- 660

Query: 661  DHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSLLTACRFHGNLHLAVQAAKRLHE 720
                                                                        
Sbjct: 661  ------------------------------------------------------------ 720

Query: 721  LEPDNHVIYRLLIQAYALYGNFEQALKVRKLGKESAMKKCTAQCWVEVRNKVHLFVTSDQ 780
                                                                        
Sbjct: 721  ------------------------------------------------------------ 780

Query: 781  SKLDVLNTWIKSIEGKVKKFNNHHQLSIDEEQKEEKIVRSLFFPTQSFDDSRKIKPVLAG 840
                                                      FP Q+ D SRKIKP+ AG
Sbjct: 781  ------------------------------------------FPPQNSDGSRKIKPISAG 840

Query: 841  PFGGPGGNNWDDGVYSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDRHGGNGGTKTDTVA 900
            PFGG GGN WDDGV+STIRQLVICHGAGIDSIKIQYDVKGSSIWSD+HGGNGGTKTDT  
Sbjct: 841  PFGGTGGNYWDDGVFSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDKHGGNGGTKTDT-- 900

Query: 901  PYFVGSSIVCSVKLDFPDEYLTMIRGHYGSFVSFDKVFVRSLTFMSNKKKYGPYGVEQGT 960
                       VKLDFPDEYLTMIRGHYGSFVSFDKV+VRSLTFMSNK+K+GPYGVE GT
Sbjct: 901  -----------VKLDFPDEYLTMIRGHYGSFVSFDKVYVRSLTFMSNKRKFGPYGVELGT 960

Query: 961  VFSFPTTEGKIVGFHGRSGLYLDAIGVYLKPMAIQTPSKAMIQSQNYVASKTESEGYSII 1020
            +FSFP TEGKIVGFHGRSGLYLDAIGVYLKPM IQTPSK MIQS NYVA K ESEGYSII
Sbjct: 961  IFSFPATEGKIVGFHGRSGLYLDAIGVYLKPMPIQTPSKGMIQSPNYVACKAESEGYSII 1020

Query: 1021 QGSVGQNYDIVLALRQKDEFKKPLPTTVSKQVSSSSSSESSDDESTVK------------ 1080
            QGSVGQNYDIVLALRQKDEFK+PLP T+SKQVSSSSSSESSDDEST K            
Sbjct: 1021 QGSVGQNYDIVLALRQKDEFKRPLPNTISKQVSSSSSSESSDDESTDKVRRNLFLXRFYL 1080

Query: 1081 ----------------RPVKKGLSKVENVVPCGPWGGSGGTPFDDGYYTGIRQINVSRNV 1140
                            RPVKKG SKVEN VPCGPWGGSGGT FDDG+Y+GIR+INVSRNV
Sbjct: 1081 SCFXVNSNNSWXWSSQRPVKKGPSKVENAVPCGPWGGSGGTTFDDGHYSGIREINVSRNV 1140

Query: 1141 GIVYIRVLYACDEESIWGVRAGGTGGFKHDKVILDYPYEILTHVTGHYGPVMYMGPNVIK 1200
            GIVYI+VLYA DEESIWG RAGG GGFKHDKV+ DYPYEILT VTG+YGPVMYMGPNVIK
Sbjct: 1141 GIVYIKVLYAWDEESIWGTRAGGKGGFKHDKVVFDYPYEILTRVTGYYGPVMYMGPNVIK 1200

Query: 1201 SLTFHTTKTKYGPFGEAQGTPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKVTPVSRP 1260
            SLTFHTTK KYGP+GEA GTPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKV P SRP
Sbjct: 1201 SLTFHTTKAKYGPYGEALGTPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKVNPASRP 1260

Query: 1261 PSSDIVPAA-PPLLENENAPWTMKLAPSKGGALEEIARGVVKEPAPCGPGPWGGDGGKPW 1320
            PSS+IVPAA PPLL NE  PWT K+APSKGGALEEI RGVVKEPAPCGPGPWGGDGGKPW
Sbjct: 1261 PSSEIVPAAPPPLLGNELVPWTKKVAPSKGGALEEITRGVVKEPAPCGPGPWGGDGGKPW 1262

Query: 1321 DDGVFSGIKQIYLTRSLEAFCSIQIEYDRNKQSVWSVKHGGNSGTTIHRVKLNYPHEVLT 1370
            DDGVFSGIKQIYLTRSLEAFCSIQIEYDRNKQSVWSVKHGGNSGT+IHRVKL+YPHEVLT
Sbjct: 1321 DDGVFSGIKQIYLTRSLEAFCSIQIEYDRNKQSVWSVKHGGNSGTSIHRVKLDYPHEVLT 1262
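
The Identity and Positives figures in each HSP summary line are ratios of matching columns to alignment length, expressed as percentages. The following is a minimal sketch, not part of the database output, of how such a line could be parsed and its percentages rechecked. The function names `parse_hsp_stats` and `percent` are hypothetical, and the line layout is assumed to match the summary lines shown on this page.

```python
import re

# Assumed layout: "Identity = a/b (p%), Positives = c/d (q%), Query Frame = 0".
# The pattern also tolerates the "Postives" spelling seen in some exports.
_STAT = re.compile(r"(Identity|Posi?tives)\s*=\s*(\d+)/(\d+)\s*\(([\d.]+)%\)")


def parse_hsp_stats(line):
    """Return {'identity': (matches, length, pct), 'positives': (...)}."""
    stats = {}
    for name, num, den, pct in _STAT.findall(line):
        key = "identity" if name == "Identity" else "positives"
        stats[key] = (int(num), int(den), float(pct))
    return stats


def percent(matches, length):
    """Percentage of matching columns, rounded to two decimals."""
    return round(100.0 * matches / length, 2)


if __name__ == "__main__":
    line = "Identity = 1031/1466 (70.33%), Positives = 1088/1466 (74.22%), Query Frame = 0"
    for key, (matches, length, reported) in parse_hsp_stats(line).items():
        # Compare the reported percentage with the recomputed one.
        print(key, reported, percent(matches, length))
```

For the A0A6J1K2S7 hit, 1031 identical columns out of 1466 recomputes to 70.33%, which agrees with the reported value.
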

BLAST of HG10004824 vs. ExPASy TrEMBL
Match: A0A6J1CST1 (pentatricopeptide repeat-containing protein At1g19720-like isoform X3 OS=Momordica charantia OX=3673 GN=LOC111013974 PE=3 SV=1)

HSP 1 Score: 1989.2 bits (5152), Expect = 0.0e+00
Identity = 1026/1451 (70.71%), Positives = 1094/1451 (75.40%), Query Frame = 0

Query: 1    MEKLAIPCQTNPPISVPASIIKAKPLKFSSKPIKTSIFFTQKFTTKFNDDNLSYLCSNGL 60
            MEKLAIPCQT PPI VPASIIKAKPLKFS KP KT+IFFT K +TKFNDD+L YLC+NGL
Sbjct: 1    MEKLAIPCQTKPPIPVPASIIKAKPLKFSPKPSKTAIFFTHKISTKFNDDHLRYLCNNGL 60

Query: 61   LREAITSIDSMSKRGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQVNPFVETKL 120
            L E+IT+ID+MSKRGSK+ST+TYINLLQ+CID +SIE+GRELH R+  VDQVNPFVETKL
Sbjct: 61   LSESITAIDAMSKRGSKISTSTYINLLQSCIDVNSIEVGRELHVRVRLVDQVNPFVETKL 120

Query: 121  ------------------------------------------------------------ 180
                                                                        
Sbjct: 121  ISMYAKCGFLEDARKVFDGMRERNLYTWSAMIGAYSREQRWKEVVKLFFLMMGDGVLPDA 180

Query: 181  --------ACGNCEDLETVKLIHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLARKFFGN 240
                    ACGNCEDLETVKLIHSVVIRCG+SC++RVSNS+LTAFVKCGKL LARKFF N
Sbjct: 181  FLFPKILRACGNCEDLETVKLIHSVVIRCGMSCFMRVSNSVLTAFVKCGKLSLARKFFEN 240

Query: 241  MDERDGVSWNAIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQLGNFNLV 300
            MDERDGVSWNAII+ YCQKG+GDEARRLLDAMS++GF+PGLVT NI+IASYSQLGN NLV
Sbjct: 241  MDERDGVSWNAIISAYCQKGDGDEARRLLDAMSNEGFEPGLVTCNILIASYSQLGNCNLV 300

Query: 301  IELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAWVEPNAITIASATSA 360
            IELKKKMES+GI PDVYTWTSMISGFAQSSRISQALDFFKEMIL  VEPNAITI SATSA
Sbjct: 301  IELKKKMESLGITPDVYTWTSMISGFAQSSRISQALDFFKEMILCGVEPNAITITSATSA 360

Query: 361  CASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARHVFDTILEKDIYTW 420
            CASLKSLQ GLEIHCFA+KMGI+HE+LVGNSLIDMYSKCGKLEAARHVFD ILEKDI+TW
Sbjct: 361  CASLKSLQNGLEIHCFAVKMGISHEVLVGNSLIDMYSKCGKLEAARHVFDMILEKDIFTW 420

Query: 421  NSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQAVNLFQIMEK 480
            NSMIGGYCQAGYCGKAYELF+RLRES+V+PNVVTWNVMISGCIQNGDEDQA+NLFQIMEK
Sbjct: 421  NSMIGGYCQAGYCGKAYELFVRLRESDVLPNVVTWNVMISGCIQNGDEDQAMNLFQIMEK 480

Query: 481  DGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTILSILPACANVMAEK 540
            DGEVKRNTASWNSLIAG+ QLGEKNKALA+FRQMQ L FNPNSVTILSILPACA+VMAE+
Sbjct: 481  DGEVKRNTASWNSLIAGFQQLGEKNKALAVFRQMQFLYFNPNSVTILSILPACASVMAER 540

Query: 541  KVKEIHGCVLRRNLESELPIANSLIDTYAKSGNIQYSRTIFDGMSSKDIITWNSIIAGYI 600
            K+KEIHGCVLRRNLESELP+ANSLIDTYAKSGNIQYSRTIFDGMSSKDIITWNSIIAGYI
Sbjct: 541  KIKEIHGCVLRRNLESELPVANSLIDTYAKSGNIQYSRTIFDGMSSKDIITWNSIIAGYI 600

Query: 601  LHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAGMVDKGRHVFSSITEEHQILPTL 660
            LHGCSDAAF LF QMK+ GIRPNRGTLA                                
Sbjct: 601  LHGCSDAAFDLFDQMKRFGIRPNRGTLA-------------------------------- 660

Query: 661  DHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSLLTACRFHGNLHLAVQAAKRLHE 720
                                                                        
Sbjct: 661  ------------------------------------------------------------ 720

Query: 721  LEPDNHVIYRLLIQAYALYGNFEQALKVRKLGKESAMKKCTAQCWVEVRNKVHLFVTSDQ 780
                                                                        
Sbjct: 721  ------------------------------------------------------------ 780

Query: 781  SKLDVLNTWIKSIEGKVKKFNNHHQLSIDEEQKEEKIVRSLFFPTQSFDDSRKIKPVLAG 840
                                                 +   FF +QSFDDSRKIKPV  G
Sbjct: 781  -------------------------------------ICFCFFFSQSFDDSRKIKPVPGG 840

Query: 841  PFGGPGGNNWDDGVYSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDRHGGNGGTKTDTVA 900
            PFGGPGGNNW+DGV+ST+RQLVICHGAGIDSIKIQYDVKGSSIWSDRHGGNGGTKTDT  
Sbjct: 841  PFGGPGGNNWNDGVFSTVRQLVICHGAGIDSIKIQYDVKGSSIWSDRHGGNGGTKTDT-- 900

Query: 901  PYFVGSSIVCSVKLDFPDEYLTMIRGHYGSFVSFDKVFVRSLTFMSNKKKYGPYGVEQGT 960
                       VKL+ PDEYLTMIRGHYGSFVSF +VFVRSLTF+SNK+K+GPYGVE GT
Sbjct: 901  -----------VKLELPDEYLTMIRGHYGSFVSFGQVFVRSLTFVSNKRKFGPYGVELGT 960

Query: 961  VFSFPTTEGKIVGFHGRSGLYLDAIGVYLKPMAIQTPSKAMIQSQNYVASKTESEGYSII 1020
            VFSFP  EGKIVGFHGRSGLYLDAIGVYLKP+ +QTP KAMIQSQNYVA+KTE+E YSII
Sbjct: 961  VFSFPVAEGKIVGFHGRSGLYLDAIGVYLKPIQMQTPPKAMIQSQNYVANKTENEAYSII 1020

Query: 1021 QGSVGQNYDIVLALRQKDEFKKPLPTTVSKQVSSSSSSESSDDESTVK------------ 1080
            QGSVGQNYDIVLA+RQKDEF+KPLPTT SKQ SSSSSSESSD+ES  K            
Sbjct: 1021 QGSVGQNYDIVLAVRQKDEFRKPLPTTSSKQASSSSSSESSDEESIDKDRTQMMAGYGQS 1080

Query: 1081 -RPVKKGLSKVENVVPCGPWGGSGGTPFDDGYYTGIRQINVSRNVGIVYIRVLYACDEES 1140
             RPVKK  SKVENVVP GPWGGSGGT FDDG Y+GIRQINVSRNVGIVYIRVLYACDEE 
Sbjct: 1081 QRPVKKVPSKVENVVPYGPWGGSGGTAFDDGCYSGIRQINVSRNVGIVYIRVLYACDEEF 1140

Query: 1141 IWGVRAGGTGGFKHDKVILDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKTKYGPFG 1200
            IWG RAGGTGGFKHDKVI DYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKTKYGPFG
Sbjct: 1141 IWGSRAGGTGGFKHDKVIFDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKTKYGPFG 1200

Query: 1201 EAQGTPFSTNVKE-GKIVGFHGRKGLFLDALGVHLVEGKVTPVSRPPSSDIVPAAPPLLE 1260
            EA GTPFSTNV+E GK+VGFHGRKGLFLDALGVH+VEGKVTP+SRPP SDIVPA PP L 
Sbjct: 1201 EALGTPFSTNVREGGKVVGFHGRKGLFLDALGVHVVEGKVTPLSRPPCSDIVPAEPPSLG 1249

Query: 1261 NENAPWTMKLAPSKGGALEEIARGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTR 1320
             E+A W+ KLAPSKGG+ E +A GVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTR
Sbjct: 1261 TESAHWSKKLAPSKGGSAEAVAHGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTR 1249

Query: 1321 SLEAFCSIQIEYDRNKQSVWSVKHGGNSGTTIHRVKLNYPHEVLTCISGYYGYIGKDERQ 1370
            SLE FCSIQIEYDRNKQSVWSVKHGGN GTT+HRVKL YPHEVLTCISGYYGY+ KDERQ
Sbjct: 1321 SLEGFCSIQIEYDRNKQSVWSVKHGGNGGTTVHRVKLEYPHEVLTCISGYYGYVSKDERQ 1249

BLAST of HG10004824 vs. ExPASy TrEMBL
Match: A0A1R3IY37 (Mannose-binding lectin OS=Corchorus capsularis OX=210143 GN=CCACVL1_08980 PE=3 SV=1)

HSP 1 Score: 1622.8 bits (4201), Expect = 0.0e+00
Identity = 828/1494 (55.42%), Positives = 1054/1494 (70.55%), Query Frame = 0

Query: 1    MEKLAIPCQTNPPISVPASIIKAKPLKFSSKPIKTSIFFTQK-FTTKFNDDNLSYLCSNG 60
            ME + IPC + PPI +PA +      + S  P K +   ++K    K ++  L+YL  NG
Sbjct: 1    MENMMIPCTSKPPIIIPAKL--GNSTELSQFPTKLTFSNSRKTHNPKLSETYLNYLSRNG 60

Query: 61   LLREAITSIDSMSKRGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQVNPFVETK 120
             L EAI+++DS+++ GS++  +T+INLLQ CID  S++LGR+LH R+  V++ +PFVETK
Sbjct: 61   RLTEAISALDSIAQSGSQVRPSTFINLLQACIDLGSLDLGRKLHARIHLVEENDPFVETK 120

Query: 121  L----------------------------------------------------------- 180
            L                                                           
Sbjct: 121  LVSMYAKCGSLADARKVFDRMNGRNLYAWSAMIGACSRELRWKEVVKLFFLMMEEGVRPD 180

Query: 181  ---------ACGNCEDLETVKLIHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLARKFFG 240
                     AC NC D+ T +L+HS+VIR G+    RVSNS+L  + KCGK+  AR+FF 
Sbjct: 181  EILFTKILQACANCGDVRTGRLLHSLVIRLGMVSVARVSNSVLAVYAKCGKVRSARRFFD 240

Query: 241  NMDERDGVSWNAIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQLGNFNL 300
            NM+ERD V+WN++I  YCQKG+ DEA RL   MS +G +P L+T+NI+I SY+QLG  ++
Sbjct: 241  NMNERDRVTWNSMILAYCQKGDSDEAYRLFSGMSLEGIQPCLITWNILINSYNQLGQCDV 300

Query: 301  VIELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAWVEPNAITIASATS 360
             + L ++ME  GI PDV+TWTSMISG AQ+ R  QAL  FKEM LA ++PN +TI SA S
Sbjct: 301  AMGLVEEMEISGIIPDVFTWTSMISGLAQNGRRWQALCLFKEMYLAGIKPNGVTITSAVS 360

Query: 361  ACASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARHVFDTILEKDIYT 420
            A AS++ L  G EIH  A+KMG+   +LVGNSLIDMYSKCG+LEAAR VFD I EKD+Y+
Sbjct: 361  ASASMRVLNTGREIHSVALKMGVIDNVLVGNSLIDMYSKCGELEAARQVFDKIEEKDVYS 420

Query: 421  WNSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQAVNLFQIME 480
            WNSMI GYC AGYCGKAYELFM+++ES+V PNV+TWN MISG IQNGDED+A++LFQ ME
Sbjct: 421  WNSMIAGYCHAGYCGKAYELFMKMQESDVKPNVITWNSMISGYIQNGDEDRAMDLFQRME 480

Query: 481  KDGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTILSILPACANVMAE 540
            +DG+V+RNTASWN+LIAG+ QLGE +KA  +FRQMQS + +PNSVTILSILP CAN++A 
Sbjct: 481  RDGKVRRNTASWNTLIAGFVQLGEIDKAFGVFRQMQSCSISPNSVTILSILPGCANLIAS 540

Query: 541  KKVKEIHGCVLRRNLESELPIANSLIDTYAKSGNIQYSRTIFDGMSSKDIITWNSIIAGY 600
            KKVKEIHGCVLRRNL+  L I+NSLIDTYAKSGNI YSR IFDGMS++DII+WNSII GY
Sbjct: 541  KKVKEIHGCVLRRNLD-VLSISNSLIDTYAKSGNILYSRIIFDGMSARDIISWNSIIGGY 600

Query: 601  ILHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAGMVDKGRHVFSSITEEHQILPT 660
            +LHG SDAA  LF QM  LG++PNRGT  S+I A+GIAGM+D+G+ +FSSI + ++I+P 
Sbjct: 601  VLHGYSDAALDLFNQMCMLGLKPNRGTFLSIILAHGIAGMLDEGKQIFSSIRDNYEIIPA 660

Query: 661  LDHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSLLTACRFHGNLHLAVQAAKRLH 720
            ++HY AM+D+YGRSGRL +A+EFIE+MP EPD SIW SLLTA R H N+ LAV A + L 
Sbjct: 661  IEHYSAMIDVYGRSGRLEEAMEFIEEMPTEPDSSIWASLLTASRIHSNIALAVLAGESLL 720

Query: 721  ELEPDNHVIYRLLIQAYALYGNFEQALKVRKLGKESAMKKCTAQCWVEVRNKVHLFVTSD 780
            +LEP N VI +L+ Q YAL GN + + KVRKL KE+ +++     W+EVRN VH FV  D
Sbjct: 721  DLEPGNMVINQLMFQIYALCGNLDASSKVRKLEKENLLRRSLGHSWIEVRNTVHRFVNGD 780

Query: 781  QSK--LDVLNTWIKSIEGKVKKFNNHHQLSIDEEQKEE-------------KIVRSLFFP 840
            +SK   ++L +W++SI  +V   ++H  L I+EE+KEE              ++ S   P
Sbjct: 781  KSKPCSNLLYSWLESIGREVNADDHHGGLFIEEEEKEETGGIHSEKLALAFALIGSSSSP 840

Query: 841  --TQSFDDSRKI------------------------KPVLAGPFGGPGGNNWDDGVYSTI 900
              T++  ++  I                        KPV  GP+GG GG++WDDGVY+T+
Sbjct: 841  QSTETHLETSTIVLLCARDSSSQKSKADVASLGDDTKPVSVGPWGGQGGSSWDDGVYATV 900

Query: 901  RQLVICHGAGIDSIKIQYDVKGSSIWSDRHGGNGGTKTDTVAPYFVGSSIVCSVKLDFPD 960
            RQLVI HGAGIDSI+I+YD KG+SIWS +HGG  G+K D              VKLD+PD
Sbjct: 901  RQLVIAHGAGIDSIQIEYDNKGNSIWSRKHGGEAGSKID-------------KVKLDYPD 960

Query: 961  EYLTMIRGHYGSFVSFDKVFVRSLTFMSNKKKYGPYGVEQGTVFSFPTTEGKIVGFHGRS 1020
            E+LT I GHYGS        VRSLTF SN+K YGPYGVEQGT  SF    GKIVGF+G+S
Sbjct: 961  EFLTSIHGHYGSLFEGGPHLVRSLTFHSNRKTYGPYGVEQGT--SFSMNRGKIVGFYGKS 1020

Query: 1021 GLYLDAIGVYLKPMAIQTPSKAMIQSQNYVASKTESE--GYSIIQGSVGQNYDIVLALRQ 1080
            G YLDAIGV+LKP      SK ++ +QN+VA+   ++  G+ +IQGSVG++YDIVLA+RQ
Sbjct: 1021 GWYLDAIGVHLKPFTKLNHSKTILHTQNFVANANGADKVGFQVIQGSVGESYDIVLAVRQ 1080

Query: 1081 KDEFKKPLPTTVSKQVSSSSSSESSDD-----------ESTVKRPVKKGLSKV--ENVVP 1140
            +D +  PLP  +S+Q SSSSSS+ S D            +  K P K    KV  E V+ 
Sbjct: 1081 RDAYGNPLPKELSRQPSSSSSSDDSSDVEAKTKFKVSLPTPEKVPAKILPPKVLPEGVLT 1140

Query: 1141 CGPWGGSGGTPFDDGYYTGIRQINVSRNVGIVYIRVLYACDEESIWGVRAGGTGGFKHDK 1200
             GPWGG+GG  FDDG YTGIRQINVSRNVGIV ++V Y  D +++WG + GGTGGF+ DK
Sbjct: 1141 YGPWGGNGGVKFDDGTYTGIRQINVSRNVGIVSLKVCYDRDGQAVWGSKHGGTGGFRTDK 1200

Query: 1201 VILDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKTKYGPFGEAQGTPFSTNVKEGKI 1260
            +I DYP EILTH+TG +GP+MYMGPNVIKSLTFHT K K+GP+GE QG  F+  + EGKI
Sbjct: 1201 IIFDYPSEILTHITGTFGPLMYMGPNVIKSLTFHTNKGKHGPYGEEQGPSFTNKMDEGKI 1260

Query: 1261 VGFHGRKGLFLDALGVHLVEGKVTPVSRPPSSDIVPAAPPLLENENAPWTMKLAPSKGGA 1320
            VGFHGR+GLFLDA+GV ++EGKV P     S  I+P+   + E +N+PW+ KL  +K G 
Sbjct: 1261 VGFHGREGLFLDAIGVFVMEGKVPPPRPHFSQAIIPSERTIAEIDNSPWSNKLVLAKQGP 1320

Query: 1321 LEEIARGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTRSLEAFCSIQIEYDRNKQ 1370
            +EE+A GVVKEPAPCGPGPWGGDGG+PWDDGV+SGIKQI++T+S EA CSIQIEYDRN Q
Sbjct: 1321 VEELACGVVKEPAPCGPGPWGGDGGRPWDDGVYSGIKQIFITKSAEAICSIQIEYDRNGQ 1380

BLAST of HG10004824 vs. ExPASy TrEMBL
Match: A0A5D2CK41 (Uncharacterized protein OS=Gossypium darwinii OX=34276 GN=ES288_D05G209200v1 PE=3 SV=1)

HSP 1 Score: 1609.0 bits (4165), Expect = 0.0e+00
Identity = 815/1529 (53.30%), Positives = 1048/1529 (68.54%), Query Frame = 0

Query: 1    MEKLAIPCQTNPPISVPASIIKAKPLKFSSKPIKTSIFFTQ-KFTTKFNDDNLSYLCSNG 60
            ME L I C + PP+ +P         +FS    K S  +T+     K  D+++ YL  +G
Sbjct: 1    MENLMITCISKPPVIIPTKHDNLS--EFSQPQTKLSFTYTKNNKNPKITDNHVKYLARSG 60

Query: 61   LLREAITSIDSMSKRGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQVNPFVETK 120
             L EA+ ++DS++  GS++  NT+I+LLQ CID  S++LGR+LH R+  V + +PFVETK
Sbjct: 61   RLAEAVAALDSIALSGSQVRPNTFISLLQACIDFGSLDLGRKLHARIHLVKESDPFVETK 120

Query: 121  L----------------------------------------------------------- 180
            L                                                           
Sbjct: 121  LVSTYAKCGSFADARKVFDEMSQKNLYTWSAMIGAYSRVSRWKEVVELFFLMMEDGVLPD 180

Query: 181  ---------ACGNCEDLETVKLIHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLARKFFG 240
                     AC NC D+ T +L+HS+VIR G+ CY RVSNS+L  + KCGKL  AR+FF 
Sbjct: 181  EFLFPRILQACANCGDVRTGRLLHSLVIRLGMVCYTRVSNSVLAVYAKCGKLRSARRFFD 240

Query: 241  NMDERDGVSWNAIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQLGNFNL 300
             M+ERD V+WN+++  YCQKG  DEA +L + M  +G +P +V++NI+I SY+QLG  ++
Sbjct: 241  YMNERDRVTWNSMLLAYCQKGENDEAYKLFNGMWGEGIEPCIVSWNILINSYNQLGRCDV 300

Query: 301  VIELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAWVEPNAITIASATS 360
             + L K+MES  ++PDV+TWTSMISG AQ+ R  QAL  FKEM+LA ++PN +TI SA S
Sbjct: 301  ALGLMKEMESSRVSPDVFTWTSMISGLAQNGRRWQALFVFKEMLLAGIKPNGVTITSAVS 360

Query: 361  ACASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARHVFDTILEKDIYT 420
            ACASLK L+ GLEIH  A++MGI   +LVGNSLIDMY+KCG+LEAAR VFD I EKD+YT
Sbjct: 361  ACASLKVLKLGLEIHSIALRMGITDNVLVGNSLIDMYAKCGELEAARQVFDMIEEKDVYT 420

Query: 421  WNSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQAVNLFQIME 480
            WNSMI GYCQAGYCGKAYELF++++ES+V PNV+TWN MISG IQNGDED+A++LFQ +E
Sbjct: 421  WNSMIAGYCQAGYCGKAYELFIKMQESDVKPNVITWNTMISGYIQNGDEDRAMDLFQRIE 480

Query: 481  KDGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTILSILPACANVMAE 540
            +DG+++RNTASWN+LIAGY QLG  +KA  +FRQMQS + +PNSVTILSILP CAN++A 
Sbjct: 481  QDGKIRRNTASWNALIAGYVQLGAIDKAFGVFRQMQSCSISPNSVTILSILPGCANLIAT 540

Query: 541  KKVKEIHGCVLRRNLESELPIANSLIDTYAKSGNIQYSRTIFDGMSSKDIITWNSIIAGY 600
            KKVKEIHGC+LRR+LE  + I+NSLIDTYAKSGNI YSR IFDGM ++DII+WNSII GY
Sbjct: 541  KKVKEIHGCILRRDLEFVISISNSLIDTYAKSGNILYSRNIFDGMPTRDIISWNSIIGGY 600

Query: 601  ILHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAGMVDKGRHVFSSITEEHQILPT 660
            +LHGC DAA  LF QM+KLGI+PNRGT  S+I A GIA MVD+G+ +FSSI++ ++I+P 
Sbjct: 601  VLHGCFDAALDLFDQMRKLGIKPNRGTFLSIILARGIAKMVDEGKQIFSSISDNYEIIPA 660

Query: 661  LDHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSLLTACRFHGNLHLAVQAAKRLH 720
            ++HY AM+DLYGRSGRL +A+EFIEDMPIEPD S+WTSLLTA R H ++ LAV A +RL 
Sbjct: 661  IEHYSAMIDLYGRSGRLGEAMEFIEDMPIEPDSSVWTSLLTASRIHKDIALAVLAGERLL 720

Query: 721  ELEPDNHVIYRLLIQAYALYGNFEQALKVRKLGKESAMKKCTAQCWVEVRNKVHLFVTSD 780
            +LEP N V+ +L+ Q Y+L G  + + KVRKL KES +++     W+EVRN VH FVT D
Sbjct: 721  DLEPGNIVVNQLMYQIYSLCGKLDDSSKVRKLEKESTLRRSLGHSWIEVRNTVHAFVTGD 780

Query: 781  QSK--LDVLNTWIKSIEGKVKKFNNHHQLSIDEEQKEE---------------------- 840
            QSK   ++L++W+++I  +V   ++H    I+EE+KEE                      
Sbjct: 781  QSKPSSNLLHSWVQNITREVNIDDHHGGFFIEEEEKEEIGGIHSEKLAIAFALISSPSSP 840

Query: 841  ------------------------------------------------------------ 900
                                                                        
Sbjct: 841  QSIRIVKNIRMCRNCHLTAKGEERIFKFREFMIYDSPPCPCTVSRNFASQYPDRQKGWNI 900

Query: 901  --KIVRSLFFPTQSFDDSRKIKPVLAGPFGGPGGNNWDDGVYSTIRQLVICHGAGIDSIK 960
              + +R+LF  T         KPV  GP+GG GG +WDDGVY TIRQLVI HG+GIDS++
Sbjct: 901  SWQKIRALFIKTLLLSTEDDKKPVSVGPWGGQGGTSWDDGVYCTIRQLVIAHGSGIDSVQ 960

Query: 961  IQYDVKGSSIWSDRHGGNGGTKTDTVAPYFVGSSIVCSVKLDFPDEYLTMIRGHYGSFVS 1020
            I+YD KG+S+WS +HGGNGG+KTD              VKLDFPDE+LT I G+YGS   
Sbjct: 961  IEYDTKGNSLWSRKHGGNGGSKTD-------------KVKLDFPDEFLTSIHGYYGSLNQ 1020

Query: 1021 FDKVFVRSLTFMSNKKKYGPYGVEQGTVFSFPTTEGKIVGFHGRSGLYLDAIGVYLKPMA 1080
               + VRSLTF SN+K YGP+G+EQGT  SF   +GKIVGF GRSG YLDAIGVY KP+ 
Sbjct: 1021 RGPIIVRSLTFHSNRKAYGPFGIEQGT--SFSMNKGKIVGFRGRSGWYLDAIGVYSKPVL 1080

Query: 1081 IQTPSKAMIQSQNYVASKTESEGYSIIQGSVGQNYDIVLALRQKDEFKKPLPTTVSKQVS 1140
               PSK ++ +Q+  A+  E  GYS+IQGSVG++YDIVLA+RQ+D F  P P  + +Q S
Sbjct: 1081 KLNPSKPIVHAQSVAATGPEKSGYSVIQGSVGESYDIVLAVRQRDGFVNPQPRELIRQNS 1140

Query: 1141 SSSSSESSDDEST-----VKRPVKKGLSKVENVVPCGPWGGSGGTPFDDGYYTGIRQINV 1200
            SSSSS+   D  T      + P+K      E V+  GPWGG GGT FDDG YTGIRQI +
Sbjct: 1141 SSSSSDDLSDVETKSKVPFRTPMKVPPRLPEGVLTYGPWGGQGGTKFDDGTYTGIRQIVL 1200

Query: 1201 SRNVGIVYIRVLYACDEESIWGVRAGGTGGFKHDKVILDYPYEILTHVTGHYGPVMYMGP 1260
            SRNVGIV ++V Y  + +++WG + GGTGGFK ++++ DYP EILTH+TG + P+MYMGP
Sbjct: 1201 SRNVGIVSMKVCYDREGQAVWGSKHGGTGGFKTERIMFDYPSEILTHITGTFAPLMYMGP 1260

Query: 1261 NVIKSLTFHTTKTKYGPFGEAQGTPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKVTP 1320
            NVI+SLTF+T K K+GP+G+ QG  F+  + EGKIVGF GR+GLFLDA+GVH++EGKV P
Sbjct: 1261 NVIRSLTFYTNKGKHGPYGDEQGPSFTNKMNEGKIVGFLGREGLFLDAVGVHVMEGKVPP 1320

Query: 1321 VSRPPSSDIVPAAPPLLENENAPWTMKLAPSKGGALEEIARGVVKEPAPCGPGPWGGDGG 1370
                 S  I+ +  P+ E +N+PW+ KL  ++ G +EE+A GVVKEP+PCGPGPWGGDGG
Sbjct: 1321 PKPSYSQAIIQSERPIAEIDNSPWSNKLVLARRGPVEEVACGVVKEPSPCGPGPWGGDGG 1380

BLAST of HG10004824 vs. TAIR 10
Match: AT1G19720.1 (Pentatricopeptide repeat (PPR-like) superfamily protein)

HSP 1 Score: 749.6 bits (1934), Expect = 4.5e-216
Identity = 390/821 (47.50%), Positives = 530/821 (64.56%), Query Frame = 0

Query: 1   MEKLAIPC--QTNPPISVPASIIKAKPLKFSSKPIKTSIFFTQKFTTK-FNDDNLSYLCS 60
           MEKL +P   +T      PA +  +  L   S+  K ++ FT+K       D+   YLC 
Sbjct: 1   MEKLFVPSFPKTFLNYQTPAKVENSPELHPKSR--KKNLSFTKKKEPNIIPDEQFDYLCR 60

Query: 61  NGLLREAITSIDSMSKRGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQVNPFVE 120
           NG L EA  ++DS+ ++GSK+  +TY+ LL++CID+ SI LGR LH R     + + FVE
Sbjct: 61  NGSLLEAEKALDSLFQQGSKVKRSTYLKLLESCIDSGSIHLGRILHARFGLFTEPDVFVE 120

Query: 121 TKL--------------------------------------------------------- 180
           TKL                                                         
Sbjct: 121 TKLLSMYAKCGCIADARKVFDSMRERNLFTWSAMIGAYSRENRWREVAKLFRLMMKDGVL 180

Query: 181 -----------ACGNCEDLETVKLIHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLARKF 240
                       C NC D+E  K+IHSVVI+ G+S  +RVSNSIL  + KCG+L  A KF
Sbjct: 181 PDDFLFPKILQGCANCGDVEAGKVIHSVVIKLGMSSCLRVSNSILAVYAKCGELDFATKF 240

Query: 241 FGNMDERDGVSWNAIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQLGNF 300
           F  M ERD ++WN+++  YCQ G  +EA  L+  M  +G  PGLVT+NI+I  Y+QLG  
Sbjct: 241 FRRMRERDVIAWNSVLLAYCQNGKHEEAVELVKEMEKEGISPGLVTWNILIGGYNQLGKC 300

Query: 301 NLVIELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFFKEMILAWVEPNAITIASA 360
           +  ++L +KME+ GI  DV+TWT+MISG   +    QALD F++M LA V PNA+TI SA
Sbjct: 301 DAAMDLMQKMETFGITADVFTWTAMISGLIHNGMRYQALDMFRKMFLAGVVPNAVTIMSA 360

Query: 361 TSACASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKCGKLEAARHVFDTILEKDI 420
            SAC+ LK + +G E+H  A+KMG   ++LVGNSL+DMYSKCGKLE AR VFD++  KD+
Sbjct: 361 VSACSCLKVINQGSEVHSIAVKMGFIDDVLVGNSLVDMYSKCGKLEDARKVFDSVKNKDV 420

Query: 421 YTWNSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMISGCIQNGDEDQAVNLFQI 480
           YTWNSMI GYCQAGYCGKAYELF R++++N+ PN++TWN MISG I+NGDE +A++LFQ 
Sbjct: 421 YTWNSMITGYCQAGYCGKAYELFTRMQDANLRPNIITWNTMISGYIKNGDEGEAMDLFQR 480

Query: 481 MEKDGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNFNPNSVTILSILPACANVM 540
           MEKDG+V+RNTA+WN +IAGY Q G+K++AL +FR+MQ   F PNSVTILS+LPACAN++
Sbjct: 481 MEKDGKVQRNTATWNLIIAGYIQNGKKDEALELFRKMQFSRFMPNSVTILSLLPACANLL 540

Query: 541 AEKKVKEIHGCVLRRNLESELPIANSLIDTYAKSGNIQYSRTIFDGMSSKDIITWNSIIA 600
             K V+EIHGCVLRRNL++   + N+L DTYAKSG+I+YSRTIF GM +KDIITWNS+I 
Sbjct: 541 GAKMVREIHGCVLRRNLDAIHAVKNALTDTYAKSGDIEYSRTIFLGMETKDIITWNSLIG 600

Query: 601 GYILHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAGMVDKGRHVFSSITEEHQIL 660
           GY+LHG    A  LF QMK  GI PNRGTL+S+I A+G+ G VD+G+ VF SI  ++ I+
Sbjct: 601 GYVLHGSYGPALALFNQMKTQGITPNRGTLSSIILAHGLMGNVDEGKKVFYSIANDYHII 660

Query: 661 PTLDHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSLLTACRFHGNLHLAVQAAKR 720
           P L+H  AMV LYGR+ RL +A++FI++M I+ +  IW S LT CR HG++ +A+ AA+ 
Sbjct: 661 PALEHCSAMVYLYGRANRLEEALQFIQEMNIQSETPIWESFLTGCRIHGDIDMAIHAAEN 720

Query: 721 LHELEPDNHVIYRLLIQAYALYGNFEQALKVRKLGKESAMKKCTAQCWVEVRNKVHLFVT 748
           L  LEP+N     ++ Q YAL     ++L+  K  +++ +KK   Q W+EVRN +H F T
Sbjct: 721 LFSLEPENTATESIVSQIYALGAKLGRSLEGNKPRRDNLLKKPLGQSWIEVRNLIHTFTT 780

BLAST of HG10004824 vs. TAIR 10
Match: AT1G19715.1 (Mannose-binding lectin superfamily protein)

HSP 1 Score: 682.2 bits (1759), Expect = 8.8e-196
Identity = 332/611 (54.34%), Positives = 436/611 (71.36%), Query Frame = 0

Query: 767  KPVLAGPFGGPGGNNWDDGVYSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDRHGGNGGT 826
            KP   GP+GG  G+ WDDG+Y+T++Q++I HG+GIDSI+I+YD  GSS+WS++ GG GG 
Sbjct: 6    KPASLGPWGGQSGHAWDDGMYTTVKQIIIAHGSGIDSIQIEYDKNGSSVWSEKRGGKGGK 65

Query: 827  KTDTVAPYFVGSSIVCSVKLDFPDEYLTMIRGHYGSFVSFDKVFVRSLTFMSNKKKYGPY 886
            K D              VK D+P EYL  + G YGSF  +  + VRSLTF SN++KYGP+
Sbjct: 66   KFD-------------KVKFDYPHEYLISVNGTYGSFDVWGTICVRSLTFESNRRKYGPF 125

Query: 887  GVEQGTVFSFPTTEGKIVGFHGRSGLYLDAIGVYLKPMAIQT--PSKAMIQSQNYVASKT 946
            GV+ GT F+ P +  KI+GFHG++G YLDAIGV+ +P+  +    SK ++ S    +   
Sbjct: 126  GVDSGTFFALPKSGSKIIGFHGKAGWYLDAIGVHTQPIPKENNPSSKILLHSHQSFSQGD 185

Query: 947  ESEGYSIIQGSVGQNYDIVLALRQKDEFKKPLPTTVSKQVSSSSSSESSDDESTVKRPVK 1006
            +   YS++QGSVGQN+DIV+ LR+KD      PT  S +   S+ +E +  +  +    +
Sbjct: 186  KKHEYSVLQGSVGQNFDIVVTLRKKD------PTLPSFESRDSAGAEVT--KHKLVTDTE 245

Query: 1007 KGLSKVE-NVVPCGPWGGSGGTPFDDGYYTGIRQINVSRNVGIVYIRVLYACDEESIWGV 1066
            K  SK+E      GPWGG+GG  FDDG YTGIRQIN+SRNVGIV ++V Y    +++WG 
Sbjct: 246  KSQSKIEGGAKTYGPWGGTGGIMFDDGIYTGIRQINLSRNVGIVSMKVCYDFRGQAVWGS 305

Query: 1067 RAGGTGGFKHDKVILDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKTKYGPFGEAQG 1126
            + GG GGFKHDK++ DYP E+LTHVTG YGP+MYMGPNVIKSLTF T + K+GP+GE QG
Sbjct: 306  KHGGVGGFKHDKIVFDYPSEVLTHVTGTYGPLMYMGPNVIKSLTFRTNRGKHGPYGEEQG 365

Query: 1127 TPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKVTPVS-RPPSSDIVP-AAPPLLENEN 1186
              F+  + EGK+VGF GR+GLFLD++GVH++E K++ +    P + IVP       + EN
Sbjct: 366  PSFTHQMDEGKVVGFLGREGLFLDSIGVHVMECKISSLKPSSPHNAIVPHNNSGTAQIEN 425

Query: 1187 APWTMKLAPSKGGALEEIARGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTRSLE 1246
            +PW  KL  +  G  EE+ RGVVKEP P GPGPWGGDGG+ WDDGVFSGIKQI++TR  +
Sbjct: 426  SPWANKLVLAANGHGEEVDRGVVKEPTPSGPGPWGGDGGQAWDDGVFSGIKQIFVTRGND 485

Query: 1247 AFCSIQIEYDRNKQSVWSVKHGGNS-GTTIHRVKLNYPHEVLTCISGYYGYIGKDERQQA 1306
            A  SIQIEYDRN QSVWS+KHGG+S G   HR+K  YP E +TCISGYYG +   +R   
Sbjct: 486  AITSIQIEYDRNGQSVWSIKHGGDSNGVATHRIKFEYPDESITCISGYYGPLNNSDRYNV 545

Query: 1307 IKSLTFHTSRGKFGPFGEEVGSFFTSTTTEGKVVGFHGRSSLYLDAIGVHMQHWLGSQRA 1366
            +KSL+F+TSRG++GP+GEE G+FFTSTTT+GKV+GFHGRSS +LDAIGVHMQHWLG+ ++
Sbjct: 546  VKSLSFYTSRGRYGPYGEETGTFFTSTTTQGKVLGFHGRSSFHLDAIGVHMQHWLGNNKS 595

Query: 1367 --SKSSLFKLF 1370
              S++S FKLF
Sbjct: 606  YYSRASCFKLF 595

BLAST of HG10004824 vs. TAIR 10
Match: AT1G19715.3 (Mannose-binding lectin superfamily protein)

HSP 1 Score: 682.2 bits (1759), Expect = 8.8e-196
Identity = 332/611 (54.34%), Positives = 436/611 (71.36%), Query Frame = 0

Query: 767  KPVLAGPFGGPGGNNWDDGVYSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDRHGGNGGT 826
            KP   GP+GG  G+ WDDG+Y+T++Q++I HG+GIDSI+I+YD  GSS+WS++ GG GG 
Sbjct: 12   KPASLGPWGGQSGHAWDDGMYTTVKQIIIAHGSGIDSIQIEYDKNGSSVWSEKRGGKGGK 71

Query: 827  KTDTVAPYFVGSSIVCSVKLDFPDEYLTMIRGHYGSFVSFDKVFVRSLTFMSNKKKYGPY 886
            K D              VK D+P EYL  + G YGSF  +  + VRSLTF SN++KYGP+
Sbjct: 72   KFD-------------KVKFDYPHEYLISVNGTYGSFDVWGTICVRSLTFESNRRKYGPF 131

Query: 887  GVEQGTVFSFPTTEGKIVGFHGRSGLYLDAIGVYLKPMAIQT--PSKAMIQSQNYVASKT 946
            GV+ GT F+ P +  KI+GFHG++G YLDAIGV+ +P+  +    SK ++ S    +   
Sbjct: 132  GVDSGTFFALPKSGSKIIGFHGKAGWYLDAIGVHTQPIPKENNPSSKILLHSHQSFSQGD 191

Query: 947  ESEGYSIIQGSVGQNYDIVLALRQKDEFKKPLPTTVSKQVSSSSSSESSDDESTVKRPVK 1006
            +   YS++QGSVGQN+DIV+ LR+KD      PT  S +   S+ +E +  +  +    +
Sbjct: 192  KKHEYSVLQGSVGQNFDIVVTLRKKD------PTLPSFESRDSAGAEVT--KHKLVTDTE 251

Query: 1007 KGLSKVE-NVVPCGPWGGSGGTPFDDGYYTGIRQINVSRNVGIVYIRVLYACDEESIWGV 1066
            K  SK+E      GPWGG+GG  FDDG YTGIRQIN+SRNVGIV ++V Y    +++WG 
Sbjct: 252  KSQSKIEGGAKTYGPWGGTGGIMFDDGIYTGIRQINLSRNVGIVSMKVCYDFRGQAVWGS 311

Query: 1067 RAGGTGGFKHDKVILDYPYEILTHVTGHYGPVMYMGPNVIKSLTFHTTKTKYGPFGEAQG 1126
            + GG GGFKHDK++ DYP E+LTHVTG YGP+MYMGPNVIKSLTF T + K+GP+GE QG
Sbjct: 312  KHGGVGGFKHDKIVFDYPSEVLTHVTGTYGPLMYMGPNVIKSLTFRTNRGKHGPYGEEQG 371

Query: 1127 TPFSTNVKEGKIVGFHGRKGLFLDALGVHLVEGKVTPVS-RPPSSDIVP-AAPPLLENEN 1186
              F+  + EGK+VGF GR+GLFLD++GVH++E K++ +    P + IVP       + EN
Sbjct: 372  PSFTHQMDEGKVVGFLGREGLFLDSIGVHVMECKISSLKPSSPHNAIVPHNNSGTAQIEN 431

Query: 1187 APWTMKLAPSKGGALEEIARGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTRSLE 1246
            +PW  KL  +  G  EE+ RGVVKEP P GPGPWGGDGG+ WDDGVFSGIKQI++TR  +
Sbjct: 432  SPWANKLVLAANGHGEEVDRGVVKEPTPSGPGPWGGDGGQAWDDGVFSGIKQIFVTRGND 491

Query: 1247 AFCSIQIEYDRNKQSVWSVKHGGNS-GTTIHRVKLNYPHEVLTCISGYYGYIGKDERQQA 1306
            A  SIQIEYDRN QSVWS+KHGG+S G   HR+K  YP E +TCISGYYG +   +R   
Sbjct: 492  AITSIQIEYDRNGQSVWSIKHGGDSNGVATHRIKFEYPDESITCISGYYGPLNNSDRYNV 551

Query: 1307 IKSLTFHTSRGKFGPFGEEVGSFFTSTTTEGKVVGFHGRSSLYLDAIGVHMQHWLGSQRA 1366
            +KSL+F+TSRG++GP+GEE G+FFTSTTT+GKV+GFHGRSS +LDAIGVHMQHWLG+ ++
Sbjct: 552  VKSLSFYTSRGRYGPYGEETGTFFTSTTTQGKVLGFHGRSSFHLDAIGVHMQHWLGNNKS 601

Query: 1367 --SKSSLFKLF 1370
              S++S FKLF
Sbjct: 612  YYSRASCFKLF 601

BLAST of HG10004824 vs. TAIR 10
Match: AT1G19715.2 (Mannose-binding lectin superfamily protein)

HSP 1 Score: 656.4 bits (1692), Expect = 5.1e-188
Identity = 321/592 (54.22%), Positives = 423/592 (71.45%), Query Frame = 0

Query: 786  VYSTIRQLVICHGAGIDSIKIQYDVKGSSIWSDRHGGNGGTKTDTVAPYFVGSSIVCSVK 845
            +Y+T++Q++I HG+GIDSI+I+YD  GSS+WS++ GG GG K D              VK
Sbjct: 1    MYTTVKQIIIAHGSGIDSIQIEYDKNGSSVWSEKRGGKGGKKFD-------------KVK 60

Query: 846  LDFPDEYLTMIRGHYGSFVSFDKVFVRSLTFMSNKKKYGPYGVEQGTVFSFPTTEGKIVG 905
             D+P EYL  + G YGSF  +  + VRSLTF SN++KYGP+GV+ GT F+ P +  KI+G
Sbjct: 61   FDYPHEYLISVNGTYGSFDVWGTICVRSLTFESNRRKYGPFGVDSGTFFALPKSGSKIIG 120

Query: 906  FHGRSGLYLDAIGVYLKPMAIQT--PSKAMIQSQNYVASKTESEGYSIIQGSVGQNYDIV 965
            FHG++G YLDAIGV+ +P+  +    SK ++ S    +   +   YS++QGSVGQN+DIV
Sbjct: 121  FHGKAGWYLDAIGVHTQPIPKENNPSSKILLHSHQSFSQGDKKHEYSVLQGSVGQNFDIV 180

Query: 966  LALRQKDEFKKPLPTTVSKQVSSSSSSESSDDESTVKRPVKKGLSKVE-NVVPCGPWGGS 1025
            + LR+KD      PT  S +   S+ +E +  +  +    +K  SK+E      GPWGG+
Sbjct: 181  VTLRKKD------PTLPSFESRDSAGAEVT--KHKLVTDTEKSQSKIEGGAKTYGPWGGT 240

Query: 1026 GGTPFDDGYYTGIRQINVSRNVGIVYIRVLYACDEESIWGVRAGGTGGFKHDKVILDYPY 1085
            GG  FDDG YTGIRQIN+SRNVGIV ++V Y    +++WG + GG GGFKHDK++ DYP 
Sbjct: 241  GGIMFDDGIYTGIRQINLSRNVGIVSMKVCYDFRGQAVWGSKHGGVGGFKHDKIVFDYPS 300

Query: 1086 EILTHVTGHYGPVMYMGPNVIKSLTFHTTKTKYGPFGEAQGTPFSTNVKEGKIVGFHGRK 1145
            E+LTHVTG YGP+MYMGPNVIKSLTF T + K+GP+GE QG  F+  + EGK+VGF GR+
Sbjct: 301  EVLTHVTGTYGPLMYMGPNVIKSLTFRTNRGKHGPYGEEQGPSFTHQMDEGKVVGFLGRE 360

Query: 1146 GLFLDALGVHLVEGKVTPVS-RPPSSDIVP-AAPPLLENENAPWTMKLAPSKGGALEEIA 1205
            GLFLD++GVH++E K++ +    P + IVP       + EN+PW  KL  +  G  EE+ 
Sbjct: 361  GLFLDSIGVHVMECKISSLKPSSPHNAIVPHNNSGTAQIENSPWANKLVLAANGHGEEVD 420

Query: 1206 RGVVKEPAPCGPGPWGGDGGKPWDDGVFSGIKQIYLTRSLEAFCSIQIEYDRNKQSVWSV 1265
            RGVVKEP P GPGPWGGDGG+ WDDGVFSGIKQI++TR  +A  SIQIEYDRN QSVWS+
Sbjct: 421  RGVVKEPTPSGPGPWGGDGGQAWDDGVFSGIKQIFVTRGNDAITSIQIEYDRNGQSVWSI 480

Query: 1266 KHGGNS-GTTIHRVKLNYPHEVLTCISGYYGYIGKDERQQAIKSLTFHTSRGKFGPFGEE 1325
            KHGG+S G   HR+K  YP E +TCISGYYG +   +R   +KSL+F+TSRG++GP+GEE
Sbjct: 481  KHGGDSNGVATHRIKFEYPDESITCISGYYGPLNNSDRYNVVKSLSFYTSRGRYGPYGEE 540

Query: 1326 VGSFFTSTTTEGKVVGFHGRSSLYLDAIGVHMQHWLGSQRA--SKSSLFKLF 1370
             G+FFTSTTT+GKV+GFHGRSS +LDAIGVHMQHWLG+ ++  S++S FKLF
Sbjct: 541  TGTFFTSTTTQGKVLGFHGRSSFHLDAIGVHMQHWLGNNKSYYSRASCFKLF 571
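
The percentages in an HSP header follow directly from the match counts over the alignment length. As a minimal sketch of that arithmetic, using the AT1G19715.2 figures above (the helper name `percent` is an illustrative assumption, not part of any BLAST tool):

```python
def percent(matches: int, aligned: int) -> float:
    """Return matches/aligned as a percentage rounded to two decimals."""
    return round(100.0 * matches / aligned, 2)


# Counts taken from the AT1G19715.2 HSP header: 321 identical and
# 423 positive (identical or conservatively substituted) columns out of 592.
identity = percent(321, 592)
positives = percent(423, 592)
print(f"Identity = {identity:.2f}%, Positives = {positives:.2f}%")
```

The same calculation applies to any other HSP header on this page, for example 215/805 for the AT5G55740.1 match.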

BLAST of HG10004824 vs. TAIR 10
Match: AT5G55740.1 (Tetratricopeptide repeat (TPR)-like superfamily protein )

HSP 1 Score: 355.1 bits (910), Expect = 2.5e-97
Identity = 215/805 (26.71%), Positives = 388/805 (48.20%), Query Frame = 0

Query: 26  LKFSSKPIKTSIFFTQKFTTKFNDD------------NLSYLCSNGLLREAITSIDSMSK 85
           L F++ P K     + K ++K +D+             +S LC NG ++EA++ +  M  
Sbjct: 4   LPFNTIPNKVPFSVSSKPSSKHHDEQAHSPSSTSYFHRVSSLCKNGEIKEALSLVTEMDF 63

Query: 86  RGSKLSTNTYINLLQTCIDADSIELGRELHDRMSSVDQV---NPFVETKLAC--GNCEDL 145
           R  ++    Y  +LQ C+    +  G+++H R+         N ++ETKL      C+ L
Sbjct: 64  RNLRIGPEIYGEILQGCVYERDLSTGKQIHARILKNGDFYARNEYIETKLVIFYAKCDAL 123

Query: 146 ETVKL------------------------------------------------------- 205
           E  ++                                                       
Sbjct: 124 EIAEVLFSKLRVRNVFSWAAIIGVKCRIGLCEGALMGFVEMLENEIFPDNFVVPNVCKAC 183

Query: 206 -----------IHSVVIRCGLSCYIRVSNSILTAFVKCGKLCLARKFFGNMDERDGVSWN 265
                      +H  V++ GL   + V++S+   + KCG L  A K F  + +R+ V+WN
Sbjct: 184 GALKWSRFGRGVHGYVVKSGLEDCVFVASSLADMYGKCGVLDDASKVFDEIPDRNAVAWN 243

Query: 266 AIIAGYCQKGNGDEARRLLDAMSDQGFKPGLVTYNIMIASYSQLG--------------- 325
           A++ GY Q G  +EA RL   M  QG +P  VT +  +++ + +G               
Sbjct: 244 ALMVGYVQNGKNEEAIRLFSDMRKQGVEPTRVTVSTCLSASANMGGVEEGKQSHAIAIVN 303

Query: 326 -------------NFNL---VIELKKKMESMGIAPDVYTWTSMISGFAQSSRISQALDFF 385
                        NF     +IE  + +       DV TW  +ISG+ Q   +  A+   
Sbjct: 304 GMELDNILGTSLLNFYCKVGLIEYAEMVFDRMFEKDVVTWNLIISGYVQQGLVEDAIYMC 363

Query: 386 KEMILAWVEPNAITIASATSACASLKSLQKGLEIHCFAIKMGIAHEILVGNSLIDMYSKC 445
           + M L  ++ + +T+A+  SA A  ++L+ G E+ C+ I+     +I++ ++++DMY+KC
Sbjct: 364 QLMRLEKLKYDCVTLATLMSAAARTENLKLGKEVQCYCIRHSFESDIVLASTVMDMYAKC 423

Query: 446 GKLEAARHVFDTILEKDIYTWNSMIGGYCQAGYCGKAYELFMRLRESNVMPNVVTWNVMI 505
           G +  A+ VFD+ +EKD+  WN+++  Y ++G  G+A  LF  ++   V PNV+TWN++I
Sbjct: 424 GSIVDAKKVFDSTVEKDLILWNTLLAAYAESGLSGEALRLFYGMQLEGVPPNVITWNLII 483

Query: 506 SGCIQNGDEDQAVNLFQIMEKDGEVKRNTASWNSLIAGYHQLGEKNKALAIFRQMQSLNF 565
              ++NG  D+A ++F  M+  G +  N  SW +++ G  Q G   +A+   R+MQ    
Sbjct: 484 LSLLRNGQVDEAKDMFLQMQSSGIIP-NLISWTTMMNGMVQNGCSEEAILFLRKMQESGL 543

Query: 566 NPNSVTILSILPACANVMAEKKVKEIHGCVLRRNLESEL-PIANSLIDTYAKSGNIQYSR 625
            PN+ +I   L ACA++ +    + IHG ++R    S L  I  SL+D YAK G+I  + 
Sbjct: 544 RPNAFSITVALSACAHLASLHIGRTIHGYIIRNLQHSSLVSIETSLVDMYAKCGDINKAE 603

Query: 626 TIFDGMSSKDIITWNSIIAGYILHGCSDAAFHLFGQMKKLGIRPNRGTLASLIHAYGIAG 685
            +F      ++   N++I+ Y L+G    A  L+  ++ +G++P+  T+ +++ A   AG
Sbjct: 604 KVFGSKLYSELPLSNAMISAYALYGNLKEAIALYRSLEGVGLKPDNITITNVLSACNHAG 663

Query: 686 MVDKGRHVFSSITEEHQILPTLDHYLAMVDLYGRSGRLTDAIEFIEDMPIEPDGSIWTSL 714
            +++   +F+ I  +  + P L+HY  MVDL   +G    A+  IE+MP +PD  +  SL
Sbjct: 664 DINQAIEIFTDIVSKRSMKPCLEHYGLMVDLLASAGETEKALRLIEEMPFKPDARMIQSL 723

The following BLAST results are available for this feature:
Match Name      E-value   Identity  Description
XP_038884902.1  0.0e+00   84.18     pentatricopeptide repeat-containing protein At1g19720 [Benincasa hispida]
XP_031737058.1  0.0e+00   82.42     pentatricopeptide repeat-containing protein At1g19720 [Cucumis sativus]
XP_022962565.1  0.0e+00   77.38     pentatricopeptide repeat-containing protein At1g19720 [Cucurbita moschata]
KAG6598470.1    0.0e+00   78.35     Pentatricopeptide repeat-containing protein, partial [Cucurbita argyrosperma sub...
XP_023545984.1  0.0e+00   71.33     pentatricopeptide repeat-containing protein At1g19720-like isoform X2 [Cucurbita...

Match Name      E-value   Identity  Description
Q9FXH1          6.3e-215  47.50     Pentatricopeptide repeat-containing protein At1g19720 OS=Arabidopsis thaliana OX...
F4HQX1          1.2e-194  54.34     Jacalin-related lectin 3 OS=Arabidopsis thaliana OX=3702 GN=JAL3 PE=2 SV=1
Q9FM64          3.5e-96   26.71     Pentatricopeptide repeat-containing protein At5g55740, chloroplastic OS=Arabidop...
Q9LFI1          5.9e-96   30.39     Pentatricopeptide repeat-containing protein At3g53360, mitochondrial OS=Arabidop...
Q9SY02          2.9e-95   30.49     Pentatricopeptide repeat-containing protein At4g02750 OS=Arabidopsis thaliana OX...

Match Name      E-value   Identity  Description
A0A6J1HFG7      0.0e+00   77.38     pentatricopeptide repeat-containing protein At1g19720 OS=Cucurbita moschata OX=3...
A0A6J1K2S7      0.0e+00   70.33     LOW QUALITY PROTEIN: uncharacterized protein LOC111491877 OS=Cucurbita maxima OX...
A0A6J1CST1      0.0e+00   70.71     pentatricopeptide repeat-containing protein At1g19720-like isoform X3 OS=Momordi...
A0A1R3IY37      0.0e+00   55.42     Mannose-binding lectin OS=Corchorus capsularis OX=210143 GN=CCACVL1_08980 PE=3 S...
A0A5D2CK41      0.0e+00   53.30     Uncharacterized protein OS=Gossypium darwinii OX=34276 GN=ES288_D05G209200v1 PE=...

Match Name      E-value   Identity  Description
AT1G19720.1     4.5e-216  47.50     Pentatricopeptide repeat (PPR-like) superfamily protein
AT1G19715.1     8.8e-196  54.34     Mannose-binding lectin superfamily protein
AT1G19715.3     8.8e-196  54.34     Mannose-binding lectin superfamily protein
AT1G19715.2     5.1e-188  54.22     Mannose-binding lectin superfamily protein
AT5G55740.1     2.5e-97   26.71     Tetratricopeptide repeat (TPR)-like superfamily protein
InterPro
Analysis Name: InterPro Annotations of Bottle gourd (Hangzhou Gourd) v1
Date Performed: 2022-08-01
IPR Term  IPR Description  Source  Source Term  Source Description  Alignment
IPR001229  Jacalin-like lectin domain  SMART  SM00915  Jacalin_2
    coord: 779..923; e-value: 4.3E-29; score: 112.6
    coord: 1024..1155; e-value: 1.5E-35; score: 134.1
    coord: 1220..1353; e-value: 2.1E-41; score: 153.6
IPR001229  Jacalin-like lectin domain  PFAM  PF01419  Jacalin
    coord: 779..923; e-value: 7.8E-26; score: 90.6
    coord: 1220..1352; e-value: 3.6E-30; score: 104.6
    coord: 1024..1153; e-value: 6.6E-27; score: 94.1
IPR001229  Jacalin-like lectin domain  PROSITE  PS51752  JACALIN_LECTIN
    coord: 1013..1155; score: 43.25481
IPR001229  Jacalin-like lectin domain  PROSITE  PS51752  JACALIN_LECTIN
    coord: 1209..1353; score: 47.794342
IPR001229  Jacalin-like lectin domain  PROSITE  PS51752  JACALIN_LECTIN
    coord: 768..923; score: 46.546909
IPR002885  Pentatricopeptide repeat  PFAM  PF01535  PPR
    coord: 595..618; e-value: 0.055; score: 13.7
IPR002885  Pentatricopeptide repeat  PFAM  PF13041  PPR_2
    coord: 246..286; e-value: 1.6E-7; score: 31.4
    coord: 419..466; e-value: 8.0E-8; score: 32.4
    coord: 519..566; e-value: 7.4E-10; score: 38.9
    coord: 347..394; e-value: 1.1E-12; score: 48.0
    coord: 178..222; e-value: 2.2E-12; score: 47.0
IPR002885  Pentatricopeptide repeat  TIGRFAM  TIGR00756  TIGR00756
    coord: 179..211; e-value: 3.4E-8; score: 31.2
    coord: 214..248; e-value: 1.1E-7; score: 29.5
    coord: 249..283; e-value: 6.1E-7; score: 27.2
    coord: 350..384; e-value: 2.3E-6; score: 25.4
    coord: 422..454; e-value: 1.6E-4; score: 19.6
    coord: 385..415; e-value: 1.5E-5; score: 22.8
    coord: 522..555; e-value: 5.7E-7; score: 27.3
IPR002885  Pentatricopeptide repeat  PROSITE  PS51375  PPR
    coord: 177..211; score: 13.307076
IPR002885  Pentatricopeptide repeat  PROSITE  PS51375  PPR
    coord: 348..382; score: 12.58363
IPR002885  Pentatricopeptide repeat  PROSITE  PS51375  PPR
    coord: 247..281; score: 11.312119
IPR002885  Pentatricopeptide repeat  PROSITE  PS51375  PPR
    coord: 383..417; score: 11.290196
IPR002885  Pentatricopeptide repeat  PROSITE  PS51375  PPR
    coord: 419..453; score: 11.257313
IPR002885  Pentatricopeptide repeat  PROSITE  PS51375  PPR
    coord: 212..246; score: 11.005202
IPR002885  Pentatricopeptide repeat  PROSITE  PS51375  PPR
    coord: 520..554; score: 12.013642
IPR011990  Tetratricopeptide-like helical domain superfamily  GENE3D  1.25.40.10  Tetratricopeptide repeat domain
    coord: 571..694; e-value: 4.1E-16; score: 60.8
    coord: 477..570; e-value: 7.8E-19; score: 69.7
IPR011990  Tetratricopeptide-like helical domain superfamily  GENE3D  1.25.40.10  Tetratricopeptide repeat domain
    coord: 408..476; e-value: 2.3E-7; score: 32.5
IPR011990  Tetratricopeptide-like helical domain superfamily  GENE3D  1.25.40.10  Tetratricopeptide repeat domain
    coord: 279..400; e-value: 1.6E-26; score: 95.5
IPR011990  Tetratricopeptide-like helical domain superfamily  GENE3D  1.25.40.10  Tetratricopeptide repeat domain
    coord: 50..274; e-value: 2.2E-36; score: 127.8
IPR011990  Tetratricopeptide-like helical domain superfamily  SUPERFAMILY  48452  TPR-like
    coord: 396..680
IPR036404  Jacalin-like lectin domain superfamily  GENE3D  2.100.10.30  -
    coord: 758..921; e-value: 8.4E-52; score: 177.6
    coord: 1204..1352; e-value: 1.1E-48; score: 167.5
    coord: 1003..1154; e-value: 1.3E-48; score: 167.3
IPR036404  Jacalin-like lectin domain superfamily  SUPERFAMILY  51101  Mannose-binding lectins
    coord: 1209..1351
IPR036404  Jacalin-like lectin domain superfamily  SUPERFAMILY  51101  Mannose-binding lectins
    coord: 1015..1153
IPR036404  Jacalin-like lectin domain superfamily  SUPERFAMILY  51101  Mannose-binding lectins
    coord: 768..923
None  No IPR available  MOBIDB_LITE  mobidb-lite  disorder_prediction
    coord: 978..993
None  No IPR available  MOBIDB_LITE  mobidb-lite  disorder_prediction
    coord: 977..1004
None  No IPR available  PANTHER  PTHR47926:SF286  BNAA06G13890D PROTEIN
    coord: 121..313
None  No IPR available  PANTHER  PTHR47926  PENTATRICOPEPTIDE REPEAT-CONTAINING PROTEIN
    coord: 121..313
    coord: 50..120
None  No IPR available  PANTHER  PTHR47926  PENTATRICOPEPTIDE REPEAT-CONTAINING PROTEIN
    coord: 315..383
    coord: 455..747
None  No IPR available  PANTHER  PTHR47926:SF286  BNAA06G13890D PROTEIN
    coord: 315..383
    coord: 455..747
    coord: 50..120
IPR033734  Jacalin-like lectin domain, plant  CDD  cd09612  Jacalin
    coord: 782..922; e-value: 5.59206E-43; score: 151.18
IPR033734  Jacalin-like lectin domain, plant  CDD  cd09612  Jacalin
    coord: 1223..1352; e-value: 3.12465E-41; score: 146.172
IPR033734  Jacalin-like lectin domain, plant  CDD  cd09612  Jacalin
    coord: 1024..1153; e-value: 1.60063E-41; score: 146.943
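
Each `coord: start..end` entry in the InterPro table gives the span of a domain hit on the protein. The sketch below shows one way to turn such an entry into a domain length. It assumes the coordinates are 1-based and inclusive residue positions, which this page does not state. The function name `span_length` is hypothetical, and the three inputs are the PROSITE PS51752 (JACALIN_LECTIN) hits listed above.

```python
import re


def span_length(coord: str) -> int:
    """Length in residues of a 'coord: start..end' range.

    Assumes start and end are 1-based and inclusive (an assumption,
    not something stated on the page).
    """
    m = re.search(r"(\d+)\.\.(\d+)", coord)
    if m is None:
        raise ValueError(f"no start..end range found in {coord!r}")
    start, end = int(m.group(1)), int(m.group(2))
    return end - start + 1


# The three PROSITE PS51752 (JACALIN_LECTIN) hits from the table above.
jacalin_prosite = ["coord: 768..923", "coord: 1013..1155", "coord: 1209..1353"]
lengths = [span_length(c) for c in jacalin_prosite]
print(lengths)
```

Under that assumption the three jacalin-like domains each span roughly 140 to 160 residues, all located after the pentatricopeptide repeat region (residues 177..618 in the PPR rows).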

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name  Unique Name   Type
HG10004824.1  HG10004824.1  mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category            Term Accession  Term Name
molecular_function  GO:0030246      carbohydrate binding
molecular_function  GO:0005515      protein binding
molecular_function  GO:0008270      zinc ion binding