HG10018227 (gene) Bottle gourd (Hangzhou Gourd) v1

Overview
Name: HG10018227
Type: gene
Organism: Lagenaria siceraria cv. Hangzhou Gourd (Bottle gourd (Hangzhou Gourd) v1)
Description: polyadenylation and cleavage factor homolog 4 isoform X1
Location: Chr04: 1993935 .. 2002724 (+)
RNA-Seq Expression: HG10018227
Synteny: HG10018227
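
The Location value can be converted into a feature span. The sketch below is a minimal illustration, assuming the coordinates are 1-based and inclusive (a common genome-browser convention, but not stated on this page). The helper name `parse_location` is hypothetical.

```python
import re

def parse_location(text):
    """Split a location value such as 'Chr04: 1993935 .. 2002724 (+)'.

    Hypothetical helper; returns (chromosome, start, end, strand).
    """
    m = re.search(r"(\w+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)", text)
    if m is None:
        raise ValueError(f"unrecognized location: {text!r}")
    chrom, start, end, strand = m.groups()
    return chrom, int(start), int(end), strand

chrom, start, end, strand = parse_location("Chr04: 1993935 .. 2002724 (+)")

# Assumption: both endpoints are part of the feature (1-based, inclusive).
span = end - start + 1
print(chrom, strand, span)
```

Under that assumption the gene spans 8790 bp on the plus strand of Chr04. With 0-based, half-open coordinates the `+ 1` would be dropped.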
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

ATGACCCCTTTCATGGAATCGGAAAAGCTCTTAATTTCACGAGGAAACCCTAGAAATTCAGCATATCCATCCGACCGCCAACTCCCCACCACCAGCGGCAGGACTATGCCCAATGAGTTGCCACAAAAGCCTCCCCCTTCTATAGCTCACCGGTTTAGAGCTCAGTTAAAGCAGCGGGATGACGAATTCAGGGTTTCTGGCCATGATGTTGTGCCCCCTCCTACTGCTGAGGATATCGTGCAGTTGTACGACCTCATGTTGTCGGAGCTCACCTTTAATTCGAAGCCCATCATTACGGATCTCACGGTTCTTGCTGACGAGCAGAGAGAACATGGGAAGGGCATTGCTGACTTAATTTGTGCACGTATTCTCGAGGTCTGTTTCTGTAATTATGTTTTCATCTAATTTGAGTTGAGAATACTCTTTTGAACGTTTTAATTTCTGGTAGTATATGAAGGAAGTATATTCTGATATTTTCAAATGGGATTTCTCTCTATTTTTGGGGTTATCCATATATCACGTATTCCTTATTTTAGATTTGATGGGGATTTGGGCTATAAAATTTAGGCTTTTGTATGCTTTGATGATTATATTTTCTGAGACCTTGCGGATGTTCCAGTTCATCATTTCAAATTTTTGTGCAGGTTCCGGTTGAGCAAAAACTTCCTTCATTATATTTATTGGATAGCATCGTTAAGAATGTTGGGCACGAATACATCAGTTATTTCTCGTCTCGTTTACCTGAGGTATGCAAATTTTGGTTTCTGGTGCACACTCTTGTTTCTTATATTTTCAATAATCAAACCGATATTAGTTTGAATTTGACGAGGTTTGTATCAATTCTTTGTTCTATCTTTTGGTGCTAAGGTGTTTTGCGAGGCTTACAGGCAAGTTCATCCTAATTTGCATAATGCAATGCGCCACCTCTTTGGGACATGGGCAACTGTGTTTCCACCATCCATCATTCGGAAGATTGAAGCTCAACTTTCTCAGCTAACAGCACAAGAGTCGTCAAGTTTGACATCCTCAAGGGCTTCTGAATCTCCTCGGCCAACTCATGGCATTCATGTCAATCCAAAATACTTGCGTCAACTGGAACATTCAGTGGTGGATAAAGTGAGCATCTATCTCTTTTTCTACTTAGAATACAATATTGGGTTTGCTTCTGTCAATTTGAATATTTATCATGCAGCATCCCTTCTTGTCCAAAAAAATTCACCATGCATCCTTTTGTTGGCTGTTTGTCATTCCCCCCACCCTCTCATTTTGTGAATGCAACCAACCCAAGTGAACGAATGCTTTTTTTTCTTCTTCTTCTTAATGTATTAATTGGATTAGTCCATCCCAGCCATCTATTATTTATGTTGGGGCCTGGGGTTGTGTTTGGTTTAGATGTGTACAGAAGGTGAAAGAGATGATTTTATGAGGATTTTTATCTGTACATTGGTTTTGCTTTTTCAGATAGAGATTGAATATCTTGCTAGCCTTTACTAAAAAAATGAGGTCCTTTAGAGTTTAGCTCCTTGACACATTACTTGTAAACGATGACTATGAAGGCGAACATTCAAATTAAGTGCCTAGCACAAAGGGATGAAAATTTGGGTCTTTTCTATATGCATATGTGATGTTGAGATAATGGAGGAAAAAATTCCTTTTATAGAGACGATGTTAGCCTAGTCAAATTTAGCAAAATCATTGTTGGTGGGTATTAATACCTCTACTGGTGAAGATGTAGCTGCAGCCAGAATTGACATGCAAGGTTTGGTTCTTAGTCTCTTTCTTAAGTGAGTTTATCCCTTGGGTGGAATCCTACTAGATCTTCTAGCTTTGGGATCCTATTATGGAAAAAATTCCAATAAAGATGATGCAAGTGAAGCTTGGACTCCTAGTTTCGTAGTTAGCTTTTGCGTAATTTGTACTTCAAAATCTATTAGTTTTTAGGGGACTTGCAAAGCAAAGGTGACTCCAAAGATGGAATAAGTCATTAAGATTTAAGATTT
CTTTTGAGTAGACTTGACTCGATGGGCGATGCTTATGTTTCTATTTGGGGTGAGGAAGCCCTATGGAATTTATTGTATGAGGCATTTATTTAAAAATGTTTGAGTGTCCAAGCTAGCTTGTGCACATCTCAATTAATATCATGGTATAGTTGCCTGAACGCATCATATTTGGTTGTCAAAGAAACTTGTAGGGTCTTAAATCTTAGGTAGGCGGCTTCTAAGACTTGAATTTTTCTCCTGTAAGCTCATTTAGCTTTTCACTAGTCTGACCCATGGTGGTTGATTTTTTAGGAAATTTTGCGATGTAATTTTCCCTACACAACTGACCATAGGATTACATATTTTTTCGACAACACTCTTCATTCAAGCTGGGAGATGGAGTTTCTGTGAGGTTCATGGCTATTCTAATTTCATATTTTGTGATTCTTTTTAGAGGTTGTGTGCTTTGGTTAATTTCGTAGCGGTTGTTTCCTTCGATATACACTTTGAGGGTGGGGAAGTTGAGGTGTGGGTTCCTTTCAGCCAAACTTGATAATGTTATTCATTGATTGTCTCAAGACAGAAATATGAAATATGAAATGGAAGCTCAATCACTTGCGACTTTCTTGGGAAAACCTCTCGGCTGGTTAGCTGGGATGTGGTGGGAAAACCTGTGAGTTGGGGGGGGGGGGTTGGAATTCGAAAATTTAAGGTTACACAACAAAGCTTTAATGGCTTTGGCGGTTTGCTGTTGATTCTGAATCCTTGTGGCAGAAGATTATTGAGAGCAAGCATGGTCCCCATCCTTATGAGTGGGTGGTGAAAGGGGTTAATGGTACACACCGAAACCTATGGAGAGATATTTCTTCTCATCTCCCTTCTTTCTCGTATCTCGTTCTCTGTGTGTTGGGAGTTGGAAAGATCTTTGGGTGGGGGATAGACCCCTCTCCTCCGCGTTTCCTCGATTATAGCACCTATCCTCCCTTAAAAATTGTCGTGTGTCAGACTTTTTGGTTTGGTCGGGGAATTCAGTTTCTTTCTCCTTTGGTTTCTGTCGGTCTTTATCCAATAGGAAAACGACGGAGGTGGCCTCTCTTCTTTCTTCGATTGAGGGTTTTGACTTTAGGCTTGGGAGAAAAGATGTTCGGGTATGGAGTCTCTCCCCTACGGAGGGCTTCTTTTGTAAGTCTGTCTTTAGGTTTTTACTTGTTCCTTCTCCTGTTGTTGAGTCGGTCTTTGATTTTTTATGGAGGATTAAGACTCCAAAGAAAGTTAAGTTCTTTTCTTGGCAAGTTTTGCTCGGGCGTGTGAACACACGGACAGGTTTGTGAGAAAGATGCCTTTGTTATCACTTATCGTGATCTTGGTGCTGTATCCTTTGTCGGAAGGCGGAGGAAAGCTTGGATCACCTTCTATTGGAGTGTCAGTATGCAAGATCTGTGTGGAATGACTTCTTTCAGGAGCTTGATTTTGCGCTTGCTCGCTAAAGAGGTGTTCGTATGATGATCGAGGAGTTTCTCCTCCATCCGCCTTTCAAAGAAAAAGGGCGTTTTCTTTGGGTTGCGGGTGTGTGCGCGATTTTATGGGATATTTTGGGGGAGAGAAACAACAAAATTTTTCGTGGTGTGGAGAGAGATCCTAGCGAGGTTTGTTCTCTTGTGAGGTTTCATGTTTCTCTTTGGGCTTCAATTTTGAAGACTTTTTGTAATTATTTTCTTGGCAATATCTTACTTAGTTAGAAGCCCTTCCTCTGAGGGGTTTTGTGGGCTTGGTTTTTTTGTATGCCCTTGTATTCTTTCATTCTTTTCTCAATGAAAGCAGTTGTTTCAATAAAAAAAAATGTAAATACAAATGATTTTATATAGAATACGTCCAATCCTCCAACCTATAGTGCCCAAAATCCCCCAATCTTTATAGATAGTACATCCAAATTGTTGTCTCAGGCTTGGTTGGTGCTTTTTGCGCCAACAAATAGAGGGAGTGCTAAGCATTTGTTTTGTAAATGAAGTGAAAATGATAGT
ATATAGATAAGACATCCAAATTGTTCTCTTAGGCCTGGTTGGTGCTTTATGTGCCAACAAATATACAAATTGCTGAGAACTATGTTTTCTCTTGGTCTTCTACTCTCCTTGGGATATTCTATCCAATCCCTCTTTGTTGCTGAATAACACCTAGGACTTTATTTGTATGATGGTGTGCAACTTTTCGGATGTTGTAAGGTTAGGTTGGTTGTCTTCTAAGATTAGTCGAGGTGCATGCAAGCTGGAATGAACACGCACATATATAAAAAAGAAAGAAATAAATTGTGCAACTTTCTCTTTAAAAAAGAAAGGGAAGGTTTCATGGCCTTGCAATACTATTGTTTTTTCCACTTTTTTTGAAGACTTGGCTGAAGAGGAACCTTAGAATTCCAACGACAAGGAAATTTCTAACCCCCCATTTGGAACCAAGAAACTTTAAGAAATGATTTTTAATTCTATAATTAGTGTTTTAAAAAGCACTTCTACGTGCACTCCTCAGTGTTGGGCACAGGTTTGGCGCCGCCTCACCTTTCACAAAAGAAGGTAAGGATTAGGCACGCGCCTTTTGTGAAGCCTTGGGCCTTGGGGCTGTTTCATTATTTTAAAAAATAATAGTTATTAGGGTTTCTCCTTCGTTAATTTAAAAAAACAAAGTTTACTAAGCTTAATTACATAATTCTTTGTGTTTAGGGTTTTTACTTTTTTTTCATATTCTCACTTCAACTATGCGTTCTCTTCTTTATGTTGTACTATATATAGTGCCCCTCAAAAATAAAAAGTTCGTGCTTTTTTCTTACCTTGCTCTTAAGCCCTAGAAGACTATTGAGCTTTAGAAAACCTAGCTTGTGATTTGAAATTTGGGAAATGATTTGAACTTGTTTCTTGTGTTTGTAACGATAAATATTTGAAATGAATGTTGCTTGTGATTTGAAATATGGGAAATGATTTTAACTTGTTTATTGTGTTTGTAACGATAAATAGTTGAAATGAATGTATTTTAAATTGTGTTTGGATGGTCACCGAAAAAGAAAAGGAAGTTAAAAGTTTAATAACCTTGGATGTATTATGATTTTTTAATTGAAATTTGAACTATGACAAGGTTAAATAGTAGGTAATTTACCTCCAAATCCTGCAAAATCCATCCCCATTTTAGTTAGGATTTGGGAAATGATCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTATCTATCTATCTATATGTATATATGTAAATATTGTTATGAAAAAATCGATTTCAAGTCATTCTCTTTCTTGCGGATTCCGCAGTCATTTGAAATCCCTCAAAATTATGAGGTTCCTAACATGACATAAAGCAAAATTATGAGATTCTATTATCTTCGTTTTTTATAATTGGTCTTTTCTTTTTCAATTACTTTGTTAGTTTCGTATTAAAGTGATTGTAGTTAATAGAAGAGCCCTTTTATAATTCGTCTGCTTGTTTTGGTTGTTTGACTCGTTTTGTACTACACAATTTAAATTTGTTTTATGTCAGATAATATGGGGAAGATAAAACCACGTTCCTCAGATTTTTGGTGATTTTGTGCAGCTGATTATGTTCAAATATGAAATTCCTGTTCTATGTCGTGTGCCATCTATATATCTTTAAATGTGCACGTGAAATACATACGCAACACACACACACACCCCTCCCAACATGATTCCCTGACATTAATTTAATGCAATTAGAACTGTTAATTTTCTGTTTTCTTTTGAATATCTTGCATATATGGTAAAGAAATAACATACAAATATGTAACTAGCTTTTAGCTTTGCAGCATAACCAAGATGCAAGAGGGGCCTCAGCTCTAAAAGTTCATGATAAAAAGCTTGCTCCCGGATATGAAGAGTATGATTACGATCATGCAGATGTTCTTGAACATGGTGGATCTCAAGCATTTCGTTCAATGGGAAGCATGGGCCATGATTCTTTTGCTCTTGGAATAAATAAAACAAATATAAAGCTAGCCAAATCGTCTG
CGTCTTCAAGAATTGGACACAATAGACCTCTACAATCAGCTGGTGATGAACTTGAAGCAGTAAGAGCCTCACCCTCGCAGAATGTATATGATTATGAAGGTTCTAGAATGATTGATAGAATTGAGGATACTAATAAATGGAGAAGAAAACAATATCCTGACGATAATCTGAATGGACTTGAAAGTACTTCATATAATATTAGAAATGGACATGCACTTGAGGGACCAAGAGCTTTAATTGAAGCATATGGAAGTGATAAAGGAAAGGGTTATTTAAATGACAATCCTCCTCAGGCTGAACATTTTTCTATCAATGGTATAGACAACAAGGTGACTCCAGTAACATGGCAGAACACTGAAGAAGAAGAGTTTGATTGGGAAGATATGAGCCCCACATTAGCTGATAGAGGCAGAAATAATGATATGTTGAAGCCACCTGTCCCGCCTTCAAGATTTAGGACAAGAACAGGATTTGAAAGATCAAATGCTATGTCTATAGAGCCTGGAATGAGAAGCAATTGGTCTAGTCAGTGTCAGCTACCTACTATTGATTCCTCCATGGTTGTTGAAGATGTGGTCCAATCAACACCTGTATGTTTCCTGAACTTGTTTACTCTGTCATCTCACTATTGCCATTATTATCATTTGCTTCTGGCATTCTATTCTTGCTGTTCATCAAATGTTTCTAAAGTTTGTGACCAGGCGTTACTCTACAGTTTGATTTTTTTTTTTTTTTTTTTGGATTCCTATCATACACTAGTAGAATAAACCTGTTAGATTCCCATAGCGTGTAAAAGCTTTTTAATTGGTAATGTACATCCTTTTCTTCCACTAGTCCTTTCTTTTTTGCTTATTGGCATCACTTATTGAGGAACGTGTAAGACCCATATCTGTGTTCAGATTTGATAGCATGCATTCGGTTATTATTCTGTTTCTGTTGAGGAGTATGTAGTAGGATCTCTGCATTCCACTAGGTAATTATAGTATGGGTTTTCATTTTCCTATTTTCAGGATATTTGGAATATGCACAATCACATTTCTCAGACATCCCAGAACCTCATGAACAATAAAGGAGCAGGAAGAAATTTCCAGATGCCTTCGTTGGGGAGAGGCATAGCTTCATCTGGTGGTGAGAAGATGTCTCCTTTTGTTGACAAGCTTTTGACCAATGATGCTTTACATAGGCCCACTACCATTGCTTCGAGATTGGGTTCTTCTGGTCTTGACTCTAGCATGGAGTCGCAATCAATTGTACAATCTATGGGCCCAAGGCATCCTCTGAATCTTCCTAACTCTTGCCCACCCTCTAGACCTCCAATTTTTCCTGTACCAAGACACAATAAGAGTCAGTTTGAGTCTTTAAATGGTAGTAATTCTCTCATCAATCGTGCAAATAGGTCTTTTTTGCCTGAGCAGCAGATGAATAACATGAGAAATAAGGAGCTAAGTCTTACAACTAAGTCGCCACAAGTTGGCAATCAACATACTGGGCATATTCCTTTAACTCGGGGAAACCAATTGCCGGCCATCCCTTTAAAACCGCAATTTCTACCATCTCAGGACATGCAGGATAATTTAAATGCATCAACAGTACCTCCAGCATTACCGCATTTAATGGCACCATCTTTGAGTCAAGGATACATTTCACAAGGATATCGCCCTGTTATTAGTGAGTGTTTGTCAAGTTCTGCCCCTATTGGGCAATGGAATTTGCCTGTTCATAATAGCCCCAGTAACCCTTTTCTTTTACAAGGGGGGCCGCTGCCACCTCTTCCACCTGGGCCTCATCCTACATCTGCTCCGTCGATATCTCTCTCTCAAAAGGCAGGATCCCTTGTTCCTGGTCAGCAACCAGGAACTGCATTTTCTGGCCTGATAAGTTCTCTCATGGCCCAGGGTTTAATCTCATTGAACAATCAAGCTTCTGTACAGGTATATGTCTGGGTAGTATCCTTCTTAATAGCTTTAGTTTGGGCATTTAATTTTTTTACTGTTATATTT
TATCCACTAAGAGAGTTAAATGTTTAGGATTCTGTTGGGTTAGAATTCAATCCAGATGTACTCAAGGTGCGACATGAATCTGCAATAACTGCTCTATATGCTGATCTACCTAGACAATGCATGACCTGTGGCCTTCGATTCAAGACCCAGGAAGAGCATAGTAATCATATGGATTGGCATGTCACTAAAAACCGTATGTCAAAAAGTAGGAAGCAAAAGCCTTCTCGCAAGTGGTTTGTAAGTATAAGCATGTGGCTTAGCGGTGCAGAGGCTCTGGGAACGGAGGCAGTTCCAGGATTTTTGCCTGCTGAGGTCATTGTAGAGAAAAAAGATGATGAAGAACTGGCTGTTCCCGCTGACGAGGATCAGAAGACATGTGCATTATGTGGAGAACCTTTTGAGGATTTTTACAGTGATGAAACAGAGGAGTGGATGTATCGGGGCGCTGTCTACATGAATGCACCTGATGGACAAACAGCCGGCATGGATAGATCTCAGTTAGGGCCCATAGTGCATGCTAAATGCAGGACCGAAACTAATGTTGTTCCCTCCGAAAGTTTTGACCAAGATGAACCAGGGGTATGCTTATATCTTTTTTGTTATTCTCCCTTAGACCCAGGTCACCTGTGCATTCCTTGTAGTTCGCCCTCTAATTTGATTGCTTGTTAATGTATACCAGGTGTTTTATTAGCTGCAAGATTTTATGTGCTTTTTAAAACTCACATTTTTAAGTTTTTGACTGCAGGGATTAAGTGAAGAGGGTAATCGAAGAAAACGATTGCGGAGCTAG

mRNA sequence

ATGACCCCTTTCATGGAATCGGAAAAGCTCTTAATTTCACGAGGAAACCCTAGAAATTCAGCATATCCATCCGACCGCCAACTCCCCACCACCAGCGGCAGGACTATGCCCAATGAGTTGCCACAAAAGCCTCCCCCTTCTATAGCTCACCGGTTTAGAGCTCAGTTAAAGCAGCGGGATGACGAATTCAGGGTTTCTGGCCATGATGTTGTGCCCCCTCCTACTGCTGAGGATATCGTGCAGTTGTACGACCTCATGTTGTCGGAGCTCACCTTTAATTCGAAGCCCATCATTACGGATCTCACGGTTCTTGCTGACGAGCAGAGAGAACATGGGAAGGGCATTGCTGACTTAATTTGTGCACGTATTCTCGAGGTTCCGGTTGAGCAAAAACTTCCTTCATTATATTTATTGGATAGCATCGTTAAGAATGTTGGGCACGAATACATCAGTTATTTCTCGTCTCGTTTACCTGAGGTGTTTTGCGAGGCTTACAGGCAAGTTCATCCTAATTTGCATAATGCAATGCGCCACCTCTTTGGGACATGGGCAACTGTGTTTCCACCATCCATCATTCGGAAGATTGAAGCTCAACTTTCTCAGCTAACAGCACAAGAGTCGTCAAGTTTGACATCCTCAAGGGCTTCTGAATCTCCTCGGCCAACTCATGGCATTCATGTCAATCCAAAATACTTGCGTCAACTGGAACATTCAGTGGTGGATAAACATAACCAAGATGCAAGAGGGGCCTCAGCTCTAAAAGTTCATGATAAAAAGCTTGCTCCCGGATATGAAGAGTATGATTACGATCATGCAGATGTTCTTGAACATGGTGGATCTCAAGCATTTCGTTCAATGGGAAGCATGGGCCATGATTCTTTTGCTCTTGGAATAAATAAAACAAATATAAAGCTAGCCAAATCGTCTGCGTCTTCAAGAATTGGACACAATAGACCTCTACAATCAGCTGGTGATGAACTTGAAGCAGTAAGAGCCTCACCCTCGCAGAATGTATATGATTATGAAGGTTCTAGAATGATTGATAGAATTGAGGATACTAATAAATGGAGAAGAAAACAATATCCTGACGATAATCTGAATGGACTTGAAAGTACTTCATATAATATTAGAAATGGACATGCACTTGAGGGACCAAGAGCTTTAATTGAAGCATATGGAAGTGATAAAGGAAAGGGTTATTTAAATGACAATCCTCCTCAGGCTGAACATTTTTCTATCAATGGTATAGACAACAAGGTGACTCCAGTAACATGGCAGAACACTGAAGAAGAAGAGTTTGATTGGGAAGATATGAGCCCCACATTAGCTGATAGAGGCAGAAATAATGATATGTTGAAGCCACCTGTCCCGCCTTCAAGATTTAGGACAAGAACAGGATTTGAAAGATCAAATGCTATGTCTATAGAGCCTGGAATGAGAAGCAATTGGTCTAGTCAGTGTCAGCTACCTACTATTGATTCCTCCATGGTTGTTGAAGATGTGGTCCAATCAACACCTGATATTTGGAATATGCACAATCACATTTCTCAGACATCCCAGAACCTCATGAACAATAAAGGAGCAGGAAGAAATTTCCAGATGCCTTCGTTGGGGAGAGGCATAGCTTCATCTGGTGGTGAGAAGATGTCTCCTTTTGTTGACAAGCTTTTGACCAATGATGCTTTACATAGGCCCACTACCATTGCTTCGAGATTGGGTTCTTCTGGTCTTGACTCTAGCATGGAGTCGCAATCAATTGTACAATCTATGGGCCCAAGGCATCCTCTGAATCTTCCTAACTCTTGCCCACCCTCTAGACCTCCAATTTTTCCTGTACCAAGACACAATAAGAGTCAGTTTGAGTCTTTAAATGGTAGTAATTCTCTCATCAATCGTGCAAATAGGTCTTTTTTGCCTGAGCAGCAGATGAATAACATGAGAAATAAGGAGCTAAGTCTTACAACTAAGTCGCCACAAGTTGGCAATCAACATACTGGGCA
TATTCCTTTAACTCGGGGAAACCAATTGCCGGCCATCCCTTTAAAACCGCAATTTCTACCATCTCAGGACATGCAGGATAATTTAAATGCATCAACAGTACCTCCAGCATTACCGCATTTAATGGCACCATCTTTGAGTCAAGGATACATTTCACAAGGATATCGCCCTGTTATTAGTGAGTGTTTGTCAAGTTCTGCCCCTATTGGGCAATGGAATTTGCCTGTTCATAATAGCCCCAGTAACCCTTTTCTTTTACAAGGGGGGCCGCTGCCACCTCTTCCACCTGGGCCTCATCCTACATCTGCTCCGTCGATATCTCTCTCTCAAAAGGCAGGATCCCTTGTTCCTGGTCAGCAACCAGGAACTGCATTTTCTGGCCTGATAAGTTCTCTCATGGCCCAGGGTTTAATCTCATTGAACAATCAAGCTTCTGTACAGGATTCTGTTGGGTTAGAATTCAATCCAGATGTACTCAAGGTGCGACATGAATCTGCAATAACTGCTCTATATGCTGATCTACCTAGACAATGCATGACCTGTGGCCTTCGATTCAAGACCCAGGAAGAGCATAGTAATCATATGGATTGGCATGTCACTAAAAACCGTATGTCAAAAAGTAGGAAGCAAAAGCCTTCTCGCAAGTGGTTTGTAAGTATAAGCATGTGGCTTAGCGGTGCAGAGGCTCTGGGAACGGAGGCAGTTCCAGGATTTTTGCCTGCTGAGGTCATTGTAGAGAAAAAAGATGATGAAGAACTGGCTGTTCCCGCTGACGAGGATCAGAAGACATGTGCATTATGTGGAGAACCTTTTGAGGATTTTTACAGTGATGAAACAGAGGAGTGGATGTATCGGGGCGCTGTCTACATGAATGCACCTGATGGACAAACAGCCGGCATGGATAGATCTCAGTTAGGGCCCATAGTGCATGCTAAATGCAGGACCGAAACTAATGTTGTTCCCTCCGAAAGTTTTGACCAAGATGAACCAGGGGGATTAAGTGAAGAGGGTAATCGAAGAAAACGATTGCGGAGCTAG

Coding sequence (CDS)

ATGACCCCTTTCATGGAATCGGAAAAGCTCTTAATTTCACGAGGAAACCCTAGAAATTCAGCATATCCATCCGACCGCCAACTCCCCACCACCAGCGGCAGGACTATGCCCAATGAGTTGCCACAAAAGCCTCCCCCTTCTATAGCTCACCGGTTTAGAGCTCAGTTAAAGCAGCGGGATGACGAATTCAGGGTTTCTGGCCATGATGTTGTGCCCCCTCCTACTGCTGAGGATATCGTGCAGTTGTACGACCTCATGTTGTCGGAGCTCACCTTTAATTCGAAGCCCATCATTACGGATCTCACGGTTCTTGCTGACGAGCAGAGAGAACATGGGAAGGGCATTGCTGACTTAATTTGTGCACGTATTCTCGAGGTTCCGGTTGAGCAAAAACTTCCTTCATTATATTTATTGGATAGCATCGTTAAGAATGTTGGGCACGAATACATCAGTTATTTCTCGTCTCGTTTACCTGAGGTGTTTTGCGAGGCTTACAGGCAAGTTCATCCTAATTTGCATAATGCAATGCGCCACCTCTTTGGGACATGGGCAACTGTGTTTCCACCATCCATCATTCGGAAGATTGAAGCTCAACTTTCTCAGCTAACAGCACAAGAGTCGTCAAGTTTGACATCCTCAAGGGCTTCTGAATCTCCTCGGCCAACTCATGGCATTCATGTCAATCCAAAATACTTGCGTCAACTGGAACATTCAGTGGTGGATAAACATAACCAAGATGCAAGAGGGGCCTCAGCTCTAAAAGTTCATGATAAAAAGCTTGCTCCCGGATATGAAGAGTATGATTACGATCATGCAGATGTTCTTGAACATGGTGGATCTCAAGCATTTCGTTCAATGGGAAGCATGGGCCATGATTCTTTTGCTCTTGGAATAAATAAAACAAATATAAAGCTAGCCAAATCGTCTGCGTCTTCAAGAATTGGACACAATAGACCTCTACAATCAGCTGGTGATGAACTTGAAGCAGTAAGAGCCTCACCCTCGCAGAATGTATATGATTATGAAGGTTCTAGAATGATTGATAGAATTGAGGATACTAATAAATGGAGAAGAAAACAATATCCTGACGATAATCTGAATGGACTTGAAAGTACTTCATATAATATTAGAAATGGACATGCACTTGAGGGACCAAGAGCTTTAATTGAAGCATATGGAAGTGATAAAGGAAAGGGTTATTTAAATGACAATCCTCCTCAGGCTGAACATTTTTCTATCAATGGTATAGACAACAAGGTGACTCCAGTAACATGGCAGAACACTGAAGAAGAAGAGTTTGATTGGGAAGATATGAGCCCCACATTAGCTGATAGAGGCAGAAATAATGATATGTTGAAGCCACCTGTCCCGCCTTCAAGATTTAGGACAAGAACAGGATTTGAAAGATCAAATGCTATGTCTATAGAGCCTGGAATGAGAAGCAATTGGTCTAGTCAGTGTCAGCTACCTACTATTGATTCCTCCATGGTTGTTGAAGATGTGGTCCAATCAACACCTGATATTTGGAATATGCACAATCACATTTCTCAGACATCCCAGAACCTCATGAACAATAAAGGAGCAGGAAGAAATTTCCAGATGCCTTCGTTGGGGAGAGGCATAGCTTCATCTGGTGGTGAGAAGATGTCTCCTTTTGTTGACAAGCTTTTGACCAATGATGCTTTACATAGGCCCACTACCATTGCTTCGAGATTGGGTTCTTCTGGTCTTGACTCTAGCATGGAGTCGCAATCAATTGTACAATCTATGGGCCCAAGGCATCCTCTGAATCTTCCTAACTCTTGCCCACCCTCTAGACCTCCAATTTTTCCTGTACCAAGACACAATAAGAGTCAGTTTGAGTCTTTAAATGGTAGTAATTCTCTCATCAATCGTGCAAATAGGTCTTTTTTGCCTGAGCAGCAGATGAATAACATGAGAAATAAGGAGCTAAGTCTTACAACTAAGTCGCCACAAGTTGGCAATCAACATACTGGGCA
TATTCCTTTAACTCGGGGAAACCAATTGCCGGCCATCCCTTTAAAACCGCAATTTCTACCATCTCAGGACATGCAGGATAATTTAAATGCATCAACAGTACCTCCAGCATTACCGCATTTAATGGCACCATCTTTGAGTCAAGGATACATTTCACAAGGATATCGCCCTGTTATTAGTGAGTGTTTGTCAAGTTCTGCCCCTATTGGGCAATGGAATTTGCCTGTTCATAATAGCCCCAGTAACCCTTTTCTTTTACAAGGGGGGCCGCTGCCACCTCTTCCACCTGGGCCTCATCCTACATCTGCTCCGTCGATATCTCTCTCTCAAAAGGCAGGATCCCTTGTTCCTGGTCAGCAACCAGGAACTGCATTTTCTGGCCTGATAAGTTCTCTCATGGCCCAGGGTTTAATCTCATTGAACAATCAAGCTTCTGTACAGGATTCTGTTGGGTTAGAATTCAATCCAGATGTACTCAAGGTGCGACATGAATCTGCAATAACTGCTCTATATGCTGATCTACCTAGACAATGCATGACCTGTGGCCTTCGATTCAAGACCCAGGAAGAGCATAGTAATCATATGGATTGGCATGTCACTAAAAACCGTATGTCAAAAAGTAGGAAGCAAAAGCCTTCTCGCAAGTGGTTTGTAAGTATAAGCATGTGGCTTAGCGGTGCAGAGGCTCTGGGAACGGAGGCAGTTCCAGGATTTTTGCCTGCTGAGGTCATTGTAGAGAAAAAAGATGATGAAGAACTGGCTGTTCCCGCTGACGAGGATCAGAAGACATGTGCATTATGTGGAGAACCTTTTGAGGATTTTTACAGTGATGAAACAGAGGAGTGGATGTATCGGGGCGCTGTCTACATGAATGCACCTGATGGACAAACAGCCGGCATGGATAGATCTCAGTTAGGGCCCATAGTGCATGCTAAATGCAGGACCGAAACTAATGTTGTTCCCTCCGAAAGTTTTGACCAAGATGAACCAGGGGGATTAAGTGAAGAGGGTAATCGAAGAAAACGATTGCGGAGCTAG

Protein sequence

MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRDDEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLICARILEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLFGTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVVDKHNQDARGASALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFALGINKTNIKLAKSSASSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQYPDDNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNKVTPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPPVPPSRFRTRTGFERSNAMSIEPGMRSNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQMPSLGRGIASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQSMGPRHPLNLPNSCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRNKELSLTTKSPQVGNQHTGHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVPPALPHLMAPSLSQGYISQGYRPVISECLSSSAPIGQWNLPVHNSPSNPFLLQGGPLPPLPPGPHPTSAPSISLSQKAGSLVPGQQPGTAFSGLISSLMAQGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYADLPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEAVPGFLPAEVIVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPDGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDEPGGLSEEGNRRKRLRS
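
The relationship between the coding sequence and the protein sequence above can be checked by translating the CDS codon by codon. The following is a minimal sketch, assuming the standard nuclear genetic code applies. The function name `translate` and the two short fragments (the first and last codons of the CDS listed above) are chosen only for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":  # stop codon ends the polypeptide
            break
        protein.append(aa)
    return "".join(protein)

# First seven codons and last five codons of the CDS shown above.
print(translate("ATGACCCCTTTCATGGAATCG"))
print(translate("CGATTGCGGAGCTAG"))
```

The two fragments translate to the start (MTPFMES) and end (RLRS) of the listed protein sequence. Running the same function on the full CDS is one way to confirm that the two records agree.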
Homology
BLAST of HG10018227 vs. NCBI nr
Match: XP_038894060.1 (polyadenylation and cleavage factor homolog 4 isoform X3 [Benincasa hispida])

HSP 1 Score: 1919.8 bits (4972), Expect = 0.0e+00
Identity = 960/1011 (94.96%), Positives = 973/1011 (96.24%), Query Frame = 0

Query: 1    MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60
            MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD
Sbjct: 1    MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60

Query: 61   DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120
            DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC
Sbjct: 61   DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120

Query: 121  ARILEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180
            ARILEVPVEQKLPSLYLLDSIVKNVG EYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF
Sbjct: 121  ARILEVPVEQKLPSLYLLDSIVKNVGQEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180

Query: 181  GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240
            GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV
Sbjct: 181  GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240

Query: 241  DKHNQDARGASALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFALGINK 300
            DK   DARG SALKVHDKKLA GYEEYDYDHA+VLEHGG+QAF  + SM HDSFALG NK
Sbjct: 241  DKQIHDARGVSALKVHDKKLASGYEEYDYDHAEVLEHGGAQAFH-LRSMAHDSFALGTNK 300

Query: 301  TNIKLAKSSASSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQ 360
             NIKLAKSS SSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQ
Sbjct: 301  ANIKLAKSSPSSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQ 360

Query: 361  YPDDNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNKV 420
            YPDDNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNKV
Sbjct: 361  YPDDNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNKV 420

Query: 421  TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPPVPPSRFRTRTGFERSNAMSIEPGMR 480
            TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKP VPPSRF TRTGFERSNAMSIEPGMR
Sbjct: 421  TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPSVPPSRFVTRTGFERSNAMSIEPGMR 480

Query: 481  SNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQMPSLGRG 540
            SNWSSQ QLPTIDSSMV+EDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQ P LGRG
Sbjct: 481  SNWSSQVQLPTIDSSMVIEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQTPLLGRG 540

Query: 541  IASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQSMGPRHPLNLPN 600
            IA SGGEKMSPF DKLLTNDALHRPTTIASRLGSSGLDSSME QSIVQSMGPRHPLNLPN
Sbjct: 541  IALSGGEKMSPFADKLLTNDALHRPTTIASRLGSSGLDSSMELQSIVQSMGPRHPLNLPN 600

Query: 601  SCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRNKELSLTTKSPQV 660
            SCPPSRPPIFPVPRHNKS FESLNG NS INRANRSFLPEQQMNNMRNKELSLTTK PQV
Sbjct: 601  SCPPSRPPIFPVPRHNKSPFESLNGGNSFINRANRSFLPEQQMNNMRNKELSLTTKLPQV 660

Query: 661  GNQHTGHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVPPALPHLMAPSLSQGYISQG 720
            GNQHTGHIPLTRGNQL AIPLKPQFLPSQDMQDNL+AS VPPALPHLMAPSLSQGYISQG
Sbjct: 661  GNQHTGHIPLTRGNQLQAIPLKPQFLPSQDMQDNLSASVVPPALPHLMAPSLSQGYISQG 720

Query: 721  YRPVISECLSSSAPIGQWNLPVHNSPSNPFLLQGGPLPPLPPGPHPTSAPSISLSQKAGS 780
            +RP ISECLSSSAPIGQWNLPVHNSPSNP  LQGGPLPPLPPGPHPTS P+I + QKAGS
Sbjct: 721  HRPAISECLSSSAPIGQWNLPVHNSPSNPLHLQGGPLPPLPPGPHPTSIPTIPIPQKAGS 780

Query: 781  LVPGQQPGTAFSGLISSLMAQGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYADL 840
            LVPGQ+PGT FSGLISSLMAQGLISLNNQ SVQDSVGLEFNPDVLKVRHESAITALYADL
Sbjct: 781  LVPGQRPGTEFSGLISSLMAQGLISLNNQPSVQDSVGLEFNPDVLKVRHESAITALYADL 840

Query: 841  PRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEA 900
            PRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEA
Sbjct: 841  PRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEA 900

Query: 901  VPGFLPAEVIVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPD 960
            VPGFLP EVIVEKKDDEELAVPAD+DQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPD
Sbjct: 901  VPGFLPPEVIVEKKDDEELAVPADDDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPD 960

Query: 961  GQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDEPGGLSEEGNRRKRLRS 1012
            GQTAGMDRSQLGPIVHAKCRTETNVV SESF+Q+E GG+SEEGNRRKRLRS
Sbjct: 961  GQTAGMDRSQLGPIVHAKCRTETNVVTSESFEQEEQGGVSEEGNRRKRLRS 1010
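
The percentages in an HSP header are the ratios of matching positions to alignment length. The sketch below recomputes them from a header line in the format shown above. The function name `hsp_fractions` is hypothetical, and the pattern matches any `label = a/b` field, so it does not depend on how a label is spelled.

```python
import re

def hsp_fractions(header):
    """Return {label: (matches, aligned_length, percent)} for each 'label = a/b' field."""
    out = {}
    for label, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", header):
        matches, length = int(num), int(den)
        out[label] = (matches, length, round(100 * matches / length, 2))
    return out

header = "Identity = 960/1011 (94.96%), Positives = 973/1011 (96.24%), Query Frame = 0"
for label, (matches, length, pct) in hsp_fractions(header).items():
    print(f"{label}: {matches}/{length} = {pct}%")
```

The denominator here is the alignment length, including any gap columns, so it need not equal the length of either sequence.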

BLAST of HG10018227 vs. NCBI nr
Match: XP_038894058.1 (polyadenylation and cleavage factor homolog 4 isoform X1 [Benincasa hispida])

HSP 1 Score: 1895.9 bits (4910), Expect = 0.0e+00
Identity = 947/997 (94.98%), Positives = 959/997 (96.19%), Query Frame = 0

Query: 1   MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60
           MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD
Sbjct: 1   MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60

Query: 61  DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120
           DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC
Sbjct: 61  DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120

Query: 121 ARILEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180
           ARILEVPVEQKLPSLYLLDSIVKNVG EYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF
Sbjct: 121 ARILEVPVEQKLPSLYLLDSIVKNVGQEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180

Query: 181 GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240
           GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV
Sbjct: 181 GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240

Query: 241 DKHNQDARGASALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFALGINK 300
           DK   DARG SALKVHDKKLA GYEEYDYDHA+VLEHGG+QAF  + SM HDSFALG NK
Sbjct: 241 DKQIHDARGVSALKVHDKKLASGYEEYDYDHAEVLEHGGAQAFH-LRSMAHDSFALGTNK 300

Query: 301 TNIKLAKSSASSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQ 360
            NIKLAKSS SSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQ
Sbjct: 301 ANIKLAKSSPSSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQ 360

Query: 361 YPDDNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNKV 420
           YPDDNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNKV
Sbjct: 361 YPDDNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNKV 420

Query: 421 TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPPVPPSRFRTRTGFERSNAMSIEPGMR 480
           TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKP VPPSRF TRTGFERSNAMSIEPGMR
Sbjct: 421 TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPSVPPSRFVTRTGFERSNAMSIEPGMR 480

Query: 481 SNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQMPSLGRG 540
           SNWSSQ QLPTIDSSMV+EDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQ P LGRG
Sbjct: 481 SNWSSQVQLPTIDSSMVIEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQTPLLGRG 540

Query: 541 IASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQSMGPRHPLNLPN 600
           IA SGGEKMSPF DKLLTNDALHRPTTIASRLGSSGLDSSME QSIVQSMGPRHPLNLPN
Sbjct: 541 IALSGGEKMSPFADKLLTNDALHRPTTIASRLGSSGLDSSMELQSIVQSMGPRHPLNLPN 600

Query: 601 SCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRNKELSLTTKSPQV 660
           SCPPSRPPIFPVPRHNKS FESLNG NS INRANRSFLPEQQMNNMRNKELSLTTK PQV
Sbjct: 601 SCPPSRPPIFPVPRHNKSPFESLNGGNSFINRANRSFLPEQQMNNMRNKELSLTTKLPQV 660

Query: 661 GNQHTGHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVPPALPHLMAPSLSQGYISQG 720
           GNQHTGHIPLTRGNQL AIPLKPQFLPSQDMQDNL+AS VPPALPHLMAPSLSQGYISQG
Sbjct: 661 GNQHTGHIPLTRGNQLQAIPLKPQFLPSQDMQDNLSASVVPPALPHLMAPSLSQGYISQG 720

Query: 721 YRPVISECLSSSAPIGQWNLPVHNSPSNPFLLQGGPLPPLPPGPHPTSAPSISLSQKAGS 780
           +RP ISECLSSSAPIGQWNLPVHNSPSNP  LQGGPLPPLPPGPHPTS P+I + QKAGS
Sbjct: 721 HRPAISECLSSSAPIGQWNLPVHNSPSNPLHLQGGPLPPLPPGPHPTSIPTIPIPQKAGS 780

Query: 781 LVPGQQPGTAFSGLISSLMAQGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYADL 840
           LVPGQ+PGT FSGLISSLMAQGLISLNNQ SVQDSVGLEFNPDVLKVRHESAITALYADL
Sbjct: 781 LVPGQRPGTEFSGLISSLMAQGLISLNNQPSVQDSVGLEFNPDVLKVRHESAITALYADL 840

Query: 841 PRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEA 900
           PRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEA
Sbjct: 841 PRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEA 900

Query: 901 VPGFLPAEVIVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPD 960
           VPGFLP EVIVEKKDDEELAVPAD+DQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPD
Sbjct: 901 VPGFLPPEVIVEKKDDEELAVPADDDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPD 960

Query: 961 GQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDEPG 998
           GQTAGMDRSQLGPIVHAKCRTETNVV SESF+Q+E G
Sbjct: 961 GQTAGMDRSQLGPIVHAKCRTETNVVTSESFEQEEQG 996

BLAST of HG10018227 vs. NCBI nr
Match: XP_038894059.1 (polyadenylation and cleavage factor homolog 4 isoform X2 [Benincasa hispida])

HSP 1 Score: 1891.7 bits (4899), Expect = 0.0e+00
Identity = 947/997 (94.98%), Positives = 960/997 (96.29%), Query Frame = 0

Query: 1   MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60
           MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD
Sbjct: 1   MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60

Query: 61  DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120
           DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC
Sbjct: 61  DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120

Query: 121 ARILEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180
           ARILEVPVEQKLPSLYLLDSIVKNVG EYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF
Sbjct: 121 ARILEVPVEQKLPSLYLLDSIVKNVGQEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180

Query: 181 GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240
           GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV
Sbjct: 181 GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240

Query: 241 DKHNQDARGASALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFALGINK 300
           DK + DARG SALKVHDKKLA GYEEYDYDHA+VLEHGG+QAF  + SM HDSFALG NK
Sbjct: 241 DKIH-DARGVSALKVHDKKLASGYEEYDYDHAEVLEHGGAQAFH-LRSMAHDSFALGTNK 300

Query: 301 TNIKLAKSSASSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQ 360
            NIKLAKSS SSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQ
Sbjct: 301 ANIKLAKSSPSSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQ 360

Query: 361 YPDDNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNKV 420
           YPDDNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNKV
Sbjct: 361 YPDDNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNKV 420

Query: 421 TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPPVPPSRFRTRTGFERSNAMSIEPGMR 480
           TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKP VPPSRF TRTGFERSNAMSIEPGMR
Sbjct: 421 TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPSVPPSRFVTRTGFERSNAMSIEPGMR 480

Query: 481 SNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQMPSLGRG 540
           SNWSSQ QLPTIDSSMV+EDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQ P LGRG
Sbjct: 481 SNWSSQVQLPTIDSSMVIEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQTPLLGRG 540

Query: 541 IASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQSMGPRHPLNLPN 600
           IA SGGEKMSPF DKLLTNDALHRPTTIASRLGSSGLDSSME QSIVQSMGPRHPLNLPN
Sbjct: 541 IALSGGEKMSPFADKLLTNDALHRPTTIASRLGSSGLDSSMELQSIVQSMGPRHPLNLPN 600

Query: 601 SCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRNKELSLTTKSPQV 660
           SCPPSRPPIFPVPRHNKS FESLNG NS INRANRSFLPEQQMNNMRNKELSLTTK PQV
Sbjct: 601 SCPPSRPPIFPVPRHNKSPFESLNGGNSFINRANRSFLPEQQMNNMRNKELSLTTKLPQV 660

Query: 661 GNQHTGHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVPPALPHLMAPSLSQGYISQG 720
           GNQHTGHIPLTRGNQL AIPLKPQFLPSQDMQDNL+AS VPPALPHLMAPSLSQGYISQG
Sbjct: 661 GNQHTGHIPLTRGNQLQAIPLKPQFLPSQDMQDNLSASVVPPALPHLMAPSLSQGYISQG 720

Query: 721 YRPVISECLSSSAPIGQWNLPVHNSPSNPFLLQGGPLPPLPPGPHPTSAPSISLSQKAGS 780
           +RP ISECLSSSAPIGQWNLPVHNSPSNP  LQGGPLPPLPPGPHPTS P+I + QKAGS
Sbjct: 721 HRPAISECLSSSAPIGQWNLPVHNSPSNPLHLQGGPLPPLPPGPHPTSIPTIPIPQKAGS 780

Query: 781 LVPGQQPGTAFSGLISSLMAQGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYADL 840
           LVPGQ+PGT FSGLISSLMAQGLISLNNQ SVQDSVGLEFNPDVLKVRHESAITALYADL
Sbjct: 781 LVPGQRPGTEFSGLISSLMAQGLISLNNQPSVQDSVGLEFNPDVLKVRHESAITALYADL 840

Query: 841 PRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEA 900
           PRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEA
Sbjct: 841 PRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEA 900

Query: 901 VPGFLPAEVIVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPD 960
           VPGFLP EVIVEKKDDEELAVPAD+DQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPD
Sbjct: 901 VPGFLPPEVIVEKKDDEELAVPADDDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPD 960

Query: 961 GQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDEPG 998
           GQTAGMDRSQLGPIVHAKCRTETNVV SESF+Q+E G
Sbjct: 961 GQTAGMDRSQLGPIVHAKCRTETNVVTSESFEQEEQG 995

BLAST of HG10018227 vs. NCBI nr
Match: XP_008462986.1 (PREDICTED: polyadenylation and cleavage factor homolog 4 isoform X2 [Cucumis melo])

HSP 1 Score: 1863.2 bits (4825), Expect = 0.0e+00
Identity = 926/1012 (91.50%), Positives = 961/1012 (94.96%), Query Frame = 0

Query: 1    MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60
            MT FMESEKLLISRGNPRNSAYPSDR +PTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD
Sbjct: 1    MTRFMESEKLLISRGNPRNSAYPSDRPIPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60

Query: 61   DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120
            DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC
Sbjct: 61   DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120

Query: 121  ARILEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180
            ARILEVPV+QKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF
Sbjct: 121  ARILEVPVDQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180

Query: 181  GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240
            GTWATVFPPSIIRKIEAQLSQLTAQESS LTSSRASESPRPTHGIHVNPKYLRQLEHSVV
Sbjct: 181  GTWATVFPPSIIRKIEAQLSQLTAQESSGLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240

Query: 241  DKHNQDARGASALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFALGINK 300
            DKH QD+RG SA+KVHDKKLA GYEEYDYDHAD LEHGG+Q F SMGSMGHDSF+LG NK
Sbjct: 241  DKHTQDSRGTSAIKVHDKKLASGYEEYDYDHADALEHGGAQEFHSMGSMGHDSFSLGTNK 300

Query: 301  TNIKLAKSSASSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQ 360
             N+KLAKSS SSRIGH+RPLQS GDELE+VRASPSQNVYDYEGS+++DR EDTNKWRRKQ
Sbjct: 301  ANVKLAKSSLSSRIGHHRPLQSLGDELESVRASPSQNVYDYEGSKILDRNEDTNKWRRKQ 360

Query: 361  YPDDNLNGLEST-SYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNK 420
            YPDDN+NGLE+T SYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSI+GIDNK
Sbjct: 361  YPDDNMNGLENTSSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSISGIDNK 420

Query: 421  VTPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPPVPPSRFRTRTGFERSNAMSIEPGM 480
             TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKP VPPSRFRTR+GFERSNAM IEPGM
Sbjct: 421  ATPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPTVPPSRFRTRSGFERSNAMPIEPGM 480

Query: 481  RSNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQMPSLGR 540
            RSNWSSQ QLP IDSS+V+EDVV STPDIW MHNHISQTSQNLMNNKG GRNFQMP LGR
Sbjct: 481  RSNWSSQVQLPGIDSSIVIEDVVHSTPDIWKMHNHISQTSQNLMNNKGPGRNFQMPMLGR 540

Query: 541  GIASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQSMGPRHPLNLP 600
            GI SSGGEKMSP+ DKLLTNDALHRPT IASRLGSSGLDS+MESQSIVQSMGPRHPLNL 
Sbjct: 541  GITSSGGEKMSPYGDKLLTNDALHRPTNIASRLGSSGLDSNMESQSIVQSMGPRHPLNLS 600

Query: 601  NSCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRNKELSLTTKSPQ 660
            NSCPPSRPP+FPVPRHN SQFESLNGSNS +N ANR+FLPEQQMNN+RNKELSLTTKSPQ
Sbjct: 601  NSCPPSRPPVFPVPRHNTSQFESLNGSNSFMNSANRTFLPEQQMNNLRNKELSLTTKSPQ 660

Query: 661  VGNQHTGHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVPPALPHLMAPSLSQGYISQ 720
            VGNQHTGHIPLTRGNQL ++PLKPQFLPSQDMQDN + S VPP LPHL+APSLSQGYISQ
Sbjct: 661  VGNQHTGHIPLTRGNQLQSMPLKPQFLPSQDMQDNFSGSAVPPVLPHLIAPSLSQGYISQ 720

Query: 721  GYRPVISECLSSSAPIGQWNLPVHNSPSNPFLLQGGPLPPLPPGPHPTSAPSISLSQKAG 780
            G+RP  SE LSSSAPIGQWNL VHNS SNP  LQGGPLPPLPPGPHPTS P+I +SQK  
Sbjct: 721  GHRPANSEGLSSSAPIGQWNLSVHNSSSNPLHLQGGPLPPLPPGPHPTSGPTIPISQK-- 780

Query: 781  SLVPGQQPGTAFSGLISSLMAQGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYAD 840
              VPGQQPGTA SGLISSLMA+GLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYAD
Sbjct: 781  --VPGQQPGTAISGLISSLMARGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYAD 840

Query: 841  LPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTE 900
            LPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTE
Sbjct: 841  LPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTE 900

Query: 901  AVPGFLPAEVIVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAP 960
            AVPGFLPAEV+VEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAP
Sbjct: 901  AVPGFLPAEVVVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAP 960

Query: 961  DGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDEPGGLSEEGNRRKRLRS 1012
            DGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDE GG+SE+GNRRKRLRS
Sbjct: 961  DGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDE-GGVSEDGNRRKRLRS 1007

BLAST of HG10018227 vs. NCBI nr
Match: XP_008462960.1 (PREDICTED: polyadenylation and cleavage factor homolog 4 isoform X1 [Cucumis melo] >XP_008462968.1 PREDICTED: polyadenylation and cleavage factor homolog 4 isoform X1 [Cucumis melo])

HSP 1 Score: 1857.0 bits (4809), Expect = 0.0e+00
Identity = 926/1017 (91.05%), Positives = 961/1017 (94.49%), Query Frame = 0

Query: 1    MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60
            MT FMESEKLLISRGNPRNSAYPSDR +PTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD
Sbjct: 1    MTRFMESEKLLISRGNPRNSAYPSDRPIPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60

Query: 61   DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120
            DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC
Sbjct: 61   DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120

Query: 121  ARILEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180
            ARILEVPV+QKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF
Sbjct: 121  ARILEVPVDQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180

Query: 181  GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240
            GTWATVFPPSIIRKIEAQLSQLTAQESS LTSSRASESPRPTHGIHVNPKYLRQLEHSVV
Sbjct: 181  GTWATVFPPSIIRKIEAQLSQLTAQESSGLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240

Query: 241  DK-----HNQDARGASALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFA 300
            DK     H QD+RG SA+KVHDKKLA GYEEYDYDHAD LEHGG+Q F SMGSMGHDSF+
Sbjct: 241  DKLLALQHTQDSRGTSAIKVHDKKLASGYEEYDYDHADALEHGGAQEFHSMGSMGHDSFS 300

Query: 301  LGINKTNIKLAKSSASSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNK 360
            LG NK N+KLAKSS SSRIGH+RPLQS GDELE+VRASPSQNVYDYEGS+++DR EDTNK
Sbjct: 301  LGTNKANVKLAKSSLSSRIGHHRPLQSLGDELESVRASPSQNVYDYEGSKILDRNEDTNK 360

Query: 361  WRRKQYPDDNLNGLEST-SYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSIN 420
            WRRKQYPDDN+NGLE+T SYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSI+
Sbjct: 361  WRRKQYPDDNMNGLENTSSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSIS 420

Query: 421  GIDNKVTPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPPVPPSRFRTRTGFERSNAMS 480
            GIDNK TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKP VPPSRFRTR+GFERSNAM 
Sbjct: 421  GIDNKATPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPTVPPSRFRTRSGFERSNAMP 480

Query: 481  IEPGMRSNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQM 540
            IEPGMRSNWSSQ QLP IDSS+V+EDVV STPDIW MHNHISQTSQNLMNNKG GRNFQM
Sbjct: 481  IEPGMRSNWSSQVQLPGIDSSIVIEDVVHSTPDIWKMHNHISQTSQNLMNNKGPGRNFQM 540

Query: 541  PSLGRGIASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQSMGPRH 600
            P LGRGI SSGGEKMSP+ DKLLTNDALHRPT IASRLGSSGLDS+MESQSIVQSMGPRH
Sbjct: 541  PMLGRGITSSGGEKMSPYGDKLLTNDALHRPTNIASRLGSSGLDSNMESQSIVQSMGPRH 600

Query: 601  PLNLPNSCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRNKELSLT 660
            PLNL NSCPPSRPP+FPVPRHN SQFESLNGSNS +N ANR+FLPEQQMNN+RNKELSLT
Sbjct: 601  PLNLSNSCPPSRPPVFPVPRHNTSQFESLNGSNSFMNSANRTFLPEQQMNNLRNKELSLT 660

Query: 661  TKSPQVGNQHTGHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVPPALPHLMAPSLSQ 720
            TKSPQVGNQHTGHIPLTRGNQL ++PLKPQFLPSQDMQDN + S VPP LPHL+APSLSQ
Sbjct: 661  TKSPQVGNQHTGHIPLTRGNQLQSMPLKPQFLPSQDMQDNFSGSAVPPVLPHLIAPSLSQ 720

Query: 721  GYISQGYRPVISECLSSSAPIGQWNLPVHNSPSNPFLLQGGPLPPLPPGPHPTSAPSISL 780
            GYISQG+RP  SE LSSSAPIGQWNL VHNS SNP  LQGGPLPPLPPGPHPTS P+I +
Sbjct: 721  GYISQGHRPANSEGLSSSAPIGQWNLSVHNSSSNPLHLQGGPLPPLPPGPHPTSGPTIPI 780

Query: 781  SQKAGSLVPGQQPGTAFSGLISSLMAQGLISLNNQASVQDSVGLEFNPDVLKVRHESAIT 840
            SQK    VPGQQPGTA SGLISSLMA+GLISLNNQASVQDSVGLEFNPDVLKVRHESAIT
Sbjct: 781  SQK----VPGQQPGTAISGLISSLMARGLISLNNQASVQDSVGLEFNPDVLKVRHESAIT 840

Query: 841  ALYADLPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAE 900
            ALYADLPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAE
Sbjct: 841  ALYADLPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAE 900

Query: 901  ALGTEAVPGFLPAEVIVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAV 960
            ALGTEAVPGFLPAEV+VEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAV
Sbjct: 901  ALGTEAVPGFLPAEVVVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAV 960

Query: 961  YMNAPDGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDEPGGLSEEGNRRKRLRS 1012
            YMNAPDGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDE GG+SE+GNRRKRLRS
Sbjct: 961  YMNAPDGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDE-GGVSEDGNRRKRLRS 1012

BLAST of HG10018227 vs. ExPASy Swiss-Prot
Match: Q0WPF2 (Polyadenylation and cleavage factor homolog 4 OS=Arabidopsis thaliana OX=3702 GN=PCFS4 PE=1 SV=1)

HSP 1 Score: 649.8 bits (1675), Expect = 5.0e-185
Identity = 446/1013 (44.03%), Positives = 565/1013 (55.77%), Query Frame = 0

Query: 5    MESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQK--PPPSIAHRFRAQLKQRDDE 64
            M+SEK+L    NPR  +      + +TS + M  ELPQK  PPPS+  RF+A L QR+DE
Sbjct: 1    MDSEKIL----NPRLVS------INSTSRKGMSVELPQKPPPPPSLLDRFKALLNQREDE 60

Query: 65   FRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLICAR 124
            F   G + V PP+ ++IVQLY+++L ELTFNSKPIITDLT++A EQREHG+GIA+ IC R
Sbjct: 61   F--GGGEEVLPPSMDEIVQLYEVVLGELTFNSKPIITDLTIIAGEQREHGEGIANAICTR 120

Query: 125  ILEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLFGT 184
            ILE PVEQKLPSLYLLDSIVKN+G +Y  YFSSRLPEVFC AYRQ HP+LH +MRHLFGT
Sbjct: 121  ILEAPVEQKLPSLYLLDSIVKNIGRDYGRYFSSRLPEVFCLAYRQAHPSLHPSMRHLFGT 180

Query: 185  WATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVVDK 244
            W++VFPP ++RKI+ QL   +A   SS+    ASE  +PT GIHVNPKYLR+LE S  + 
Sbjct: 181  WSSVFPPPVLRKIDMQLQLSSAANQSSV---GASEPSQPTRGIHVNPKYLRRLEPSAAE- 240

Query: 245  HNQDARGA-SALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFALGINKT 304
               + RG  S+ +V+ +    GY +++    D LE   S       S   D F    N  
Sbjct: 241  --NNLRGINSSARVYGQNSLGGYNDFE----DQLESPSSL------SSTPDGFTRRSND- 300

Query: 305  NIKLAKSSASSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQY 364
                                          A+PS   ++Y   R   R ++  +WRRK+ 
Sbjct: 301  -----------------------------GANPSNQAFNYGMGRATSRDDEHMEWRRKE- 360

Query: 365  PDDNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNK-V 424
                         N+  G+  E PRALI+AYG D  K    + P +     +NG+ +K V
Sbjct: 361  -------------NLGQGNDHERPRALIDAYGVDTSKHVTINKPIR----DMNGMHSKMV 420

Query: 425  TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPPVPP-SRFRTRTGFERSNAMSIEPGM 484
            TP  WQNTEEEEFDWEDMSPTL DR R  + L+  VP     R R     ++   ++  +
Sbjct: 421  TP--WQNTEEEEFDWEDMSPTL-DRSRAGEFLRSSVPALGSVRARPRVGNTSDFHLDSDI 480

Query: 485  RSNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQMPSLGR 544
            ++  S Q +                  + W++  +   TS  +  +  AG++ ++ +   
Sbjct: 481  KNGVSHQLR------------------ENWSLSQNYPHTSNRV--DTRAGKDLKVLASSV 540

Query: 545  GIASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQSMGPRHPLNLP 604
            G+ SS  E  +P  D +           + SR G          +++     P      P
Sbjct: 541  GLVSSNSEFGAPPFDSI---------QDVNSRFG----------RALPDGTWPHLSARGP 600

Query: 605  NSCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRNKELSLTTKSPQ 664
            NS         PVP            S  L + AN    P   M+N R +   L     Q
Sbjct: 601  NS--------LPVP------------SAHLHHLAN----PGNAMSN-RLQGKPLYRPENQ 660

Query: 665  VGNQHTGHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVPPALPHLMAPSLSQGYISQ 724
            V   H     +T+ NQ+        +LPS       +++  P  +  L+       ++S 
Sbjct: 661  VSQSHLN--DMTQQNQMLV-----NYLPS-------SSAMAPRPMQSLLT------HVSH 720

Query: 725  GYRPVISECLSSSAPIGQWNLPVHNSPSNPFL-LQGGPLPPLPPGPHPTSAPSISLSQKA 784
            GY                   P H S   P L +QGG         HP S  S  LSQ  
Sbjct: 721  GY-------------------PPHGSTIRPSLSIQGG------EAMHPLS--SGVLSQIG 780

Query: 785  GSLVPGQQPGTAFSGLISSLMAQGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYA 844
             S    Q PG AFSGLI SLMAQGLISLNNQ + Q  +GLEF+ D+LK+R+ESAI+ALY 
Sbjct: 781  AS---NQPPGGAFSGLIGSLMAQGLISLNNQPAGQGPLGLEFDADMLKIRNESAISALYG 808

Query: 845  DLPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGT 904
            DLPRQC TCGLRFK QEEHS HMDWHVTKNRMSK+ KQ PSRKWFVS SMWLSGAEALG 
Sbjct: 841  DLPRQCTTCGLRFKCQEEHSKHMDWHVTKNRMSKNHKQNPSRKWFVSASMWLSGAEALGA 808

Query: 905  EAVPGFLPAEVIVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNA 964
            EAVPGFLP E   EKKDDE++AVPADEDQ +CALCGEPFEDFYSDETEEWMY+GAVYMNA
Sbjct: 901  EAVPGFLPTEPTTEKKDDEDMAVPADEDQTSCALCGEPFEDFYSDETEEWMYKGAVYMNA 808

Query: 965  PDGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDEPGGLSEEGNRRKRLRS 1012
            P+  T  MD+SQLGPIVHAKCR E+N            GG  EEG++RK++RS
Sbjct: 961  PEESTTDMDKSQLGPIVHAKCRPESN------------GGDMEEGSQRKKMRS 808

BLAST of HG10018227 vs. ExPASy Swiss-Prot
Match: Q9C710 (Polyadenylation and cleavage factor homolog 1 OS=Arabidopsis thaliana OX=3702 GN=PCFS1 PE=1 SV=1)

HSP 1 Score: 165.2 bits (417), Expect = 3.7e-39
Identity = 99/199 (49.75%), Positives = 119/199 (59.80%), Query Frame = 0

Query: 809 QASVQDS--VGLEF-NPDVLKVRHESAITALYADLPRQCMTCGLRFKTQEEHSNHMDWHV 868
           +AS  DS  VGL F NP  L VRHES I +LY+D+PRQC +CGLRFK QEEHS HMDWHV
Sbjct: 218 EASNSDSLPVGLSFDNPSSLNVRHESVIKSLYSDMPRQCSSCGLRFKCQEEHSKHMDWHV 277

Query: 869 TKNRMSKS-----RKQKPSRKWFVSISMWLSGAEALGTEAVPGFLPAEVIVEKKDDEE-- 928
            KNR  K+     ++ K SR W  S S+WL  A    T  V  F   E+  +K  DEE  
Sbjct: 278 RKNRSVKTTTRLGQQPKKSRGWLASASLWLCAATGGETVEVASF-GGEMQKKKGKDEEPK 337

Query: 929 -LAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPDGQTAGMDRSQLGPIVHA 988
            L VPADEDQK CALC EPFE+F+S E ++WMY+ AVY+            ++ G IVH 
Sbjct: 338 QLMVPADEDQKNCALCVEPFEEFFSHEDDDWMYKDAVYL------------TKNGRIVHV 397

Query: 989 KCRTETNVVPSESFDQDEP 997
           KC  E    P  + D  EP
Sbjct: 398 KCMPE----PRPAKDLREP 399

BLAST of HG10018227 vs. ExPASy Swiss-Prot
Match: Q9FIX8 (Polyadenylation and cleavage factor homolog 5 OS=Arabidopsis thaliana OX=3702 GN=PCFS5 PE=1 SV=1)

HSP 1 Score: 161.4 bits (407), Expect = 5.4e-38
Identity = 126/334 (37.72%), Positives = 165/334 (49.40%), Query Frame = 0

Query: 713 SQGYISQGYRPVISECLSSSAPIGQWNLPVHNSPSNPFLLQGGPLPPLPPGPHPTSA--P 772
           SQ Y + G   VI+   S + P    N    N+   PF++ G P P + P P P     P
Sbjct: 80  SQSYNNYGV-DVIASNSSFALPNNDSNT---NNYQKPFVVYGNPNPQIVPLPLPYRKLDP 139

Query: 773 SISLSQ-----------KAGSLVPG-------QQP--GTAFSGLISSLMAQGLI------ 832
             SL Q           ++ + VP        Q P   +    ++S  M Q ++      
Sbjct: 140 LDSLPQWVPNSTPNYPVRSSNFVPNTPDFTNVQNPMNHSNMVSVVSQSMHQPIVLSKELT 199

Query: 833 ----SLNN-------QASVQDS--VGLEF-NPDVLKVRHESAITALYADLPRQCMTCGLR 892
                LNN       +AS  DS  VGL F NP  L VRHES I +LY+D+PRQC +CG+R
Sbjct: 200 DLLSLLNNEKEKKTSEASNNDSLPVGLSFDNPSSLNVRHESVIKSLYSDMPRQCTSCGVR 259

Query: 893 FKTQEEHSNHMDWHVTKNRMSKS-----RKQKPSRKWFVSISMWLSGAEALGTEAVPGFL 952
           FK QEEHS HMDWHV KNR  K+     ++ K SR W  S S+WL      GT  V  F 
Sbjct: 260 FKCQEEHSKHMDWHVRKNRSVKTTTRLGQQPKKSRGWLASASLWLCAPTGGGTVEVASFG 319

Query: 953 PAEVIVEKKDDE---ELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPDGQ 997
             E+  + + D+   +  VPADEDQK CALC EPFE+F+S E ++WMY+ AVY+      
Sbjct: 320 GGEMQKKNEKDQVQKQHMVPADEDQKNCALCVEPFEEFFSHEADDWMYKDAVYL------ 379

BLAST of HG10018227 vs. ExPASy Swiss-Prot
Match: O94913 (Pre-mRNA cleavage complex 2 protein Pcf11 OS=Homo sapiens OX=9606 GN=PCF11 PE=1 SV=3)

HSP 1 Score: 98.6 bits (244), Expect = 4.3e-19
Identity = 57/158 (36.08%), Positives = 83/158 (52.53%), Query Frame = 0

Query: 77  EDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLICARILEVPVEQKLPSLY 136
           ED  + Y   L +LTFNSKP I  LT+LA+E     K I  LI A+  + P  +KLP +Y
Sbjct: 16  EDACRDYQSSLEDLTFNSKPHINMLTILAEENLPFAKEIVSLIEAQTAKAPSSEKLPVMY 75

Query: 137 LLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLFGTWATVFPPSIIRKIE 196
           L+DSIVKNVG EY++ F+  L   F   + +V  N   ++  L  TW  +FP   +  ++
Sbjct: 76  LMDSIVKNVGREYLTAFTKNLVATFICVFEKVDENTRKSLFKLRSTWDEIFPLKKLYALD 135

Query: 197 AQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQ 235
            +++ L             +     T  IHVNPK+L +
Sbjct: 136 VRVNSLDPAWPIKPLPPNVN-----TSSIHVNPKFLNK 168

BLAST of HG10018227 vs. ExPASy Swiss-Prot
Match: Q10237 (Uncharacterized protein C4G9.04c OS=Schizosaccharomyces pombe (strain 972 / ATCC 24843) OX=284812 GN=SPAC4G9.04c PE=4 SV=1)

HSP 1 Score: 82.4 bits (202), Expect = 3.2e-14
Identity = 59/159 (37.11%), Positives = 75/159 (47.17%), Query Frame = 0

Query: 78  DIVQL-YDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLICARILEVPVEQKLPSLY 137
           D+V+L Y   L +LTFNSKPII  LT +A E   +   I + I   I + P   KLP+LY
Sbjct: 2   DLVELDYLSALEDLTFNSKPIIHTLTYIAQENEPYAISIVNAIEKHIQKCPPNCKLPALY 61

Query: 138 LLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLFGTW----------ATV 197
           LLDSI KN+G  Y  +F   L   F  AY  V P L   +  L  TW            V
Sbjct: 62  LLDSISKNLGAPYTYFFGLHLFSTFMSAYTVVEPRLRLKLDQLLATWKQRPPNSSSLEPV 121

Query: 198 FPPSIIRKIEAQL----SQLTAQESSSLTSSRASESPRP 222
           F P +  KIE  L    S +   +S  L ++  S    P
Sbjct: 122 FSPIVTAKIENALLKYKSTILRHQSPLLANTSISSFSAP 160

BLAST of HG10018227 vs. ExPASy TrEMBL
Match: A0A1S3CI66 (polyadenylation and cleavage factor homolog 4 isoform X2 OS=Cucumis melo OX=3656 GN=LOC103501218 PE=4 SV=1)

HSP 1 Score: 1863.2 bits (4825), Expect = 0.0e+00
Identity = 926/1012 (91.50%), Positives = 961/1012 (94.96%), Query Frame = 0

Query: 1    MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60
            MT FMESEKLLISRGNPRNSAYPSDR +PTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD
Sbjct: 1    MTRFMESEKLLISRGNPRNSAYPSDRPIPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60

Query: 61   DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120
            DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC
Sbjct: 61   DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120

Query: 121  ARILEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180
            ARILEVPV+QKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF
Sbjct: 121  ARILEVPVDQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180

Query: 181  GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240
            GTWATVFPPSIIRKIEAQLSQLTAQESS LTSSRASESPRPTHGIHVNPKYLRQLEHSVV
Sbjct: 181  GTWATVFPPSIIRKIEAQLSQLTAQESSGLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240

Query: 241  DKHNQDARGASALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFALGINK 300
            DKH QD+RG SA+KVHDKKLA GYEEYDYDHAD LEHGG+Q F SMGSMGHDSF+LG NK
Sbjct: 241  DKHTQDSRGTSAIKVHDKKLASGYEEYDYDHADALEHGGAQEFHSMGSMGHDSFSLGTNK 300

Query: 301  TNIKLAKSSASSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQ 360
             N+KLAKSS SSRIGH+RPLQS GDELE+VRASPSQNVYDYEGS+++DR EDTNKWRRKQ
Sbjct: 301  ANVKLAKSSLSSRIGHHRPLQSLGDELESVRASPSQNVYDYEGSKILDRNEDTNKWRRKQ 360

Query: 361  YPDDNLNGLEST-SYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNK 420
            YPDDN+NGLE+T SYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSI+GIDNK
Sbjct: 361  YPDDNMNGLENTSSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSISGIDNK 420

Query: 421  VTPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPPVPPSRFRTRTGFERSNAMSIEPGM 480
             TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKP VPPSRFRTR+GFERSNAM IEPGM
Sbjct: 421  ATPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPTVPPSRFRTRSGFERSNAMPIEPGM 480

Query: 481  RSNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQMPSLGR 540
            RSNWSSQ QLP IDSS+V+EDVV STPDIW MHNHISQTSQNLMNNKG GRNFQMP LGR
Sbjct: 481  RSNWSSQVQLPGIDSSIVIEDVVHSTPDIWKMHNHISQTSQNLMNNKGPGRNFQMPMLGR 540

Query: 541  GIASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQSMGPRHPLNLP 600
            GI SSGGEKMSP+ DKLLTNDALHRPT IASRLGSSGLDS+MESQSIVQSMGPRHPLNL 
Sbjct: 541  GITSSGGEKMSPYGDKLLTNDALHRPTNIASRLGSSGLDSNMESQSIVQSMGPRHPLNLS 600

Query: 601  NSCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRNKELSLTTKSPQ 660
            NSCPPSRPP+FPVPRHN SQFESLNGSNS +N ANR+FLPEQQMNN+RNKELSLTTKSPQ
Sbjct: 601  NSCPPSRPPVFPVPRHNTSQFESLNGSNSFMNSANRTFLPEQQMNNLRNKELSLTTKSPQ 660

Query: 661  VGNQHTGHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVPPALPHLMAPSLSQGYISQ 720
            VGNQHTGHIPLTRGNQL ++PLKPQFLPSQDMQDN + S VPP LPHL+APSLSQGYISQ
Sbjct: 661  VGNQHTGHIPLTRGNQLQSMPLKPQFLPSQDMQDNFSGSAVPPVLPHLIAPSLSQGYISQ 720

Query: 721  GYRPVISECLSSSAPIGQWNLPVHNSPSNPFLLQGGPLPPLPPGPHPTSAPSISLSQKAG 780
            G+RP  SE LSSSAPIGQWNL VHNS SNP  LQGGPLPPLPPGPHPTS P+I +SQK  
Sbjct: 721  GHRPANSEGLSSSAPIGQWNLSVHNSSSNPLHLQGGPLPPLPPGPHPTSGPTIPISQK-- 780

Query: 781  SLVPGQQPGTAFSGLISSLMAQGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYAD 840
              VPGQQPGTA SGLISSLMA+GLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYAD
Sbjct: 781  --VPGQQPGTAISGLISSLMARGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYAD 840

Query: 841  LPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTE 900
            LPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTE
Sbjct: 841  LPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTE 900

Query: 901  AVPGFLPAEVIVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAP 960
            AVPGFLPAEV+VEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAP
Sbjct: 901  AVPGFLPAEVVVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAP 960

Query: 961  DGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDEPGGLSEEGNRRKRLRS 1012
            DGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDE GG+SE+GNRRKRLRS
Sbjct: 961  DGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDE-GGVSEDGNRRKRLRS 1007

BLAST of HG10018227 vs. ExPASy TrEMBL
Match: A0A1S3CJP9 (polyadenylation and cleavage factor homolog 4 isoform X1 OS=Cucumis melo OX=3656 GN=LOC103501218 PE=4 SV=1)

HSP 1 Score: 1857.0 bits (4809), Expect = 0.0e+00
Identity = 926/1017 (91.05%), Positives = 961/1017 (94.49%), Query Frame = 0

Query: 1    MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60
            MT FMESEKLLISRGNPRNSAYPSDR +PTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD
Sbjct: 1    MTRFMESEKLLISRGNPRNSAYPSDRPIPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60

Query: 61   DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120
            DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC
Sbjct: 61   DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120

Query: 121  ARILEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180
            ARILEVPV+QKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF
Sbjct: 121  ARILEVPVDQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180

Query: 181  GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240
            GTWATVFPPSIIRKIEAQLSQLTAQESS LTSSRASESPRPTHGIHVNPKYLRQLEHSVV
Sbjct: 181  GTWATVFPPSIIRKIEAQLSQLTAQESSGLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240

Query: 241  DK-----HNQDARGASALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFA 300
            DK     H QD+RG SA+KVHDKKLA GYEEYDYDHAD LEHGG+Q F SMGSMGHDSF+
Sbjct: 241  DKLLALQHTQDSRGTSAIKVHDKKLASGYEEYDYDHADALEHGGAQEFHSMGSMGHDSFS 300

Query: 301  LGINKTNIKLAKSSASSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNK 360
            LG NK N+KLAKSS SSRIGH+RPLQS GDELE+VRASPSQNVYDYEGS+++DR EDTNK
Sbjct: 301  LGTNKANVKLAKSSLSSRIGHHRPLQSLGDELESVRASPSQNVYDYEGSKILDRNEDTNK 360

Query: 361  WRRKQYPDDNLNGLEST-SYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSIN 420
            WRRKQYPDDN+NGLE+T SYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSI+
Sbjct: 361  WRRKQYPDDNMNGLENTSSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSIS 420

Query: 421  GIDNKVTPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPPVPPSRFRTRTGFERSNAMS 480
            GIDNK TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKP VPPSRFRTR+GFERSNAM 
Sbjct: 421  GIDNKATPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPTVPPSRFRTRSGFERSNAMP 480

Query: 481  IEPGMRSNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQM 540
            IEPGMRSNWSSQ QLP IDSS+V+EDVV STPDIW MHNHISQTSQNLMNNKG GRNFQM
Sbjct: 481  IEPGMRSNWSSQVQLPGIDSSIVIEDVVHSTPDIWKMHNHISQTSQNLMNNKGPGRNFQM 540

Query: 541  PSLGRGIASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQSMGPRH 600
            P LGRGI SSGGEKMSP+ DKLLTNDALHRPT IASRLGSSGLDS+MESQSIVQSMGPRH
Sbjct: 541  PMLGRGITSSGGEKMSPYGDKLLTNDALHRPTNIASRLGSSGLDSNMESQSIVQSMGPRH 600

Query: 601  PLNLPNSCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRNKELSLT 660
            PLNL NSCPPSRPP+FPVPRHN SQFESLNGSNS +N ANR+FLPEQQMNN+RNKELSLT
Sbjct: 601  PLNLSNSCPPSRPPVFPVPRHNTSQFESLNGSNSFMNSANRTFLPEQQMNNLRNKELSLT 660

Query: 661  TKSPQVGNQHTGHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVPPALPHLMAPSLSQ 720
            TKSPQVGNQHTGHIPLTRGNQL ++PLKPQFLPSQDMQDN + S VPP LPHL+APSLSQ
Sbjct: 661  TKSPQVGNQHTGHIPLTRGNQLQSMPLKPQFLPSQDMQDNFSGSAVPPVLPHLIAPSLSQ 720

Query: 721  GYISQGYRPVISECLSSSAPIGQWNLPVHNSPSNPFLLQGGPLPPLPPGPHPTSAPSISL 780
            GYISQG+RP  SE LSSSAPIGQWNL VHNS SNP  LQGGPLPPLPPGPHPTS P+I +
Sbjct: 721  GYISQGHRPANSEGLSSSAPIGQWNLSVHNSSSNPLHLQGGPLPPLPPGPHPTSGPTIPI 780

Query: 781  SQKAGSLVPGQQPGTAFSGLISSLMAQGLISLNNQASVQDSVGLEFNPDVLKVRHESAIT 840
            SQK    VPGQQPGTA SGLISSLMA+GLISLNNQASVQDSVGLEFNPDVLKVRHESAIT
Sbjct: 781  SQK----VPGQQPGTAISGLISSLMARGLISLNNQASVQDSVGLEFNPDVLKVRHESAIT 840

Query: 841  ALYADLPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAE 900
            ALYADLPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAE
Sbjct: 841  ALYADLPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAE 900

Query: 901  ALGTEAVPGFLPAEVIVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAV 960
            ALGTEAVPGFLPAEV+VEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAV
Sbjct: 901  ALGTEAVPGFLPAEVVVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAV 960

Query: 961  YMNAPDGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDEPGGLSEEGNRRKRLRS 1012
            YMNAPDGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDE GG+SE+GNRRKRLRS
Sbjct: 961  YMNAPDGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDE-GGVSEDGNRRKRLRS 1012

BLAST of HG10018227 vs. ExPASy TrEMBL
Match: A0A0A0LVG0 (CID domain-containing protein OS=Cucumis sativus OX=3659 GN=Csa_1G109350 PE=4 SV=1)

HSP 1 Score: 1848.2 bits (4786), Expect = 0.0e+00
Identity = 927/1012 (91.60%), Positives = 952/1012 (94.07%), Query Frame = 0

Query: 1    MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60
            MT FMESEKLLISRGNPRNS YPSDR +PTTSGRTMPNELPQKP PSIAHRFRAQLKQRD
Sbjct: 1    MTRFMESEKLLISRGNPRNSVYPSDRPIPTTSGRTMPNELPQKPAPSIAHRFRAQLKQRD 60

Query: 61   DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120
            DEFRVSGHDVVP PTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC
Sbjct: 61   DEFRVSGHDVVPLPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120

Query: 121  ARILEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180
            ARILEVPV+QKLPSLYLLDSIVKNVGHEYISYF+SRLPEVFCEAYRQVHPNLHNAMRHLF
Sbjct: 121  ARILEVPVDQKLPSLYLLDSIVKNVGHEYISYFASRLPEVFCEAYRQVHPNLHNAMRHLF 180

Query: 181  GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240
            GTWATVFPPSIIRKIEAQLSQLTAQESS LTSSRASESPRPTHGIHVNPKYLRQLEHSVV
Sbjct: 181  GTWATVFPPSIIRKIEAQLSQLTAQESSGLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240

Query: 241  DKHNQDARGASALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFALGINK 300
            DKH+QD+RG SA+KVHDKKLA GYEEYDYDHAD LEHGG Q F SMGSMGHDSF+LG NK
Sbjct: 241  DKHSQDSRGTSAIKVHDKKLASGYEEYDYDHADALEHGGPQGFHSMGSMGHDSFSLGTNK 300

Query: 301  TNIKLAKSSASSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQ 360
             NIKLAKSS SSRIG +RPLQS GDE E VRASPSQNVYDYEGS+MIDR EDTNKWRRKQ
Sbjct: 301  ANIKLAKSSLSSRIGPHRPLQSVGDEHETVRASPSQNVYDYEGSKMIDRNEDTNKWRRKQ 360

Query: 361  YPDDNLNGLEST-SYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNK 420
            YPDDNLNGLEST SYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSIN IDNK
Sbjct: 361  YPDDNLNGLESTSSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINVIDNK 420

Query: 421  VTPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPPVPPSRFRTRTGFERSNAMSIEPGM 480
             TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPPVPPSRFRTR+GFERSNAM IEPGM
Sbjct: 421  ATPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPPVPPSRFRTRSGFERSNAMPIEPGM 480

Query: 481  RSNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQMPSLGR 540
            RSNWSS  +LP IDSS+V+EDVV STPD WNMHNHISQTSQNLMNNKG GRNFQMP LGR
Sbjct: 481  RSNWSSPVRLPGIDSSIVIEDVVHSTPDNWNMHNHISQTSQNLMNNKGQGRNFQMPMLGR 540

Query: 541  GIASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQSMGPRHPLNLP 600
            GI SS GEKMSP+ DKLLTNDALHRPT IASRLGSSGLDSSMESQSIVQSMGPRHPLNL 
Sbjct: 541  GITSSVGEKMSPYGDKLLTNDALHRPTNIASRLGSSGLDSSMESQSIVQSMGPRHPLNLS 600

Query: 601  NSCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRNKELSLTTKSPQ 660
            NSCPPSRPPIFPVPRHN SQFESLNGSNS +N ANR+FLPEQQMNN+RNKELSLTTKSPQ
Sbjct: 601  NSCPPSRPPIFPVPRHNASQFESLNGSNSFMNCANRTFLPEQQMNNLRNKELSLTTKSPQ 660

Query: 661  VGNQHTGHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVPPALPHLMAPSLSQGYISQ 720
            VGNQHTGHIPLTRGNQL  +PLKPQFLPSQDMQDN + S VPP LPHLMAPSLSQGYISQ
Sbjct: 661  VGNQHTGHIPLTRGNQLQGMPLKPQFLPSQDMQDNFSGSAVPPVLPHLMAPSLSQGYISQ 720

Query: 721  GYRPVISECLSSSAPIGQWNLPVHNSPSNPFLLQGGPLPPLPPGPHPTSAPSISLSQKAG 780
            G+RP ISE LSSSAPIGQWNL VHNS SNP  LQGGPLPPLPPGPHPTS P+I +SQK  
Sbjct: 721  GHRPAISEGLSSSAPIGQWNLSVHNSSSNPLHLQGGPLPPLPPGPHPTSGPTIPISQK-- 780

Query: 781  SLVPGQQPGTAFSGLISSLMAQGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYAD 840
              VPGQQPGTA SGLISSLMA+GLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYAD
Sbjct: 781  --VPGQQPGTAISGLISSLMARGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYAD 840

Query: 841  LPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTE 900
            LPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTE
Sbjct: 841  LPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTE 900

Query: 901  AVPGFLPAEVIVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAP 960
            AVPGFLPAEV+VEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAP
Sbjct: 901  AVPGFLPAEVVVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAP 960

Query: 961  DGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDEPGGLSEEGNRRKRLRS 1012
            DGQTAGMD SQLGPIVHAKCRTETNVVPSESFDQDE GG+SEEGNRRKRLRS
Sbjct: 961  DGQTAGMDISQLGPIVHAKCRTETNVVPSESFDQDE-GGVSEEGNRRKRLRS 1007

BLAST of HG10018227 vs. ExPASy TrEMBL
Match: A0A5A7UC46 (Polyadenylation and cleavage factor-like protein 4 isoform X2 OS=Cucumis melo var. makuwa OX=1194695 GN=E5676_scaffold609G001150 PE=4 SV=1)

HSP 1 Score: 1821.2 bits (4716), Expect = 0.0e+00
Identity = 901/983 (91.66%), Positives = 934/983 (95.02%), Query Frame = 0

Query: 14   RGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRDDEFRVSGHDVVPP 73
            RGNPRNSAYPSDR +PTTSGRTMPNELPQKPPPSIAHRFRAQLKQRDDEFRVSGHDVVPP
Sbjct: 160  RGNPRNSAYPSDRPIPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRDDEFRVSGHDVVPP 219

Query: 74   PTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLICARILEVPVEQKLP 133
            PTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLICARILEVPV+QKLP
Sbjct: 220  PTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLICARILEVPVDQKLP 279

Query: 134  SLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLFGTWATVFPPSIIR 193
            SLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLFGTWATVFPPSIIR
Sbjct: 280  SLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLFGTWATVFPPSIIR 339

Query: 194  KIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVVDKHNQDARGASAL 253
            KIEAQLSQLTAQESS LTSSRASESPRPTHGIHVNPKYLRQLEHSVVDKH QD+RG SA+
Sbjct: 340  KIEAQLSQLTAQESSGLTSSRASESPRPTHGIHVNPKYLRQLEHSVVDKHTQDSRGTSAI 399

Query: 254  KVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFALGINKTNIKLAKSSASSR 313
            KVHDKKLA GYEEYDYDHAD LEHGG+Q F SMGSMGHDSF+LG NK N+KLAKSS SSR
Sbjct: 400  KVHDKKLASGYEEYDYDHADALEHGGAQEFHSMGSMGHDSFSLGTNKANVKLAKSSLSSR 459

Query: 314  IGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQYPDDNLNGLEST- 373
            IGH+RPLQS GDELE+VRASPSQNVYDYEGS+++DR EDTNKWRRKQYPDDN+NGLE+T 
Sbjct: 460  IGHHRPLQSLGDELESVRASPSQNVYDYEGSKILDRNEDTNKWRRKQYPDDNMNGLENTS 519

Query: 374  SYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNKVTPVTWQNTEEEE 433
            SYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSI+GIDNK TPVTWQNTEEEE
Sbjct: 520  SYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSISGIDNKATPVTWQNTEEEE 579

Query: 434  FDWEDMSPTLADRGRNNDMLKPPVPPSRFRTRTGFERSNAMSIEPGMRSNWSSQCQLPTI 493
            FDWEDMSPTLADRGRNNDMLKP VPPSRFRTR+GFERSNAM IEPGMRSNWSSQ QLP I
Sbjct: 580  FDWEDMSPTLADRGRNNDMLKPTVPPSRFRTRSGFERSNAMPIEPGMRSNWSSQVQLPGI 639

Query: 494  DSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQMPSLGRGIASSGGEKMSPF 553
            DSS+V+EDVV STPDIW MHNHISQTSQNLMNNKG GRNFQMP LGRGI SSGGEKMSP+
Sbjct: 640  DSSIVIEDVVHSTPDIWKMHNHISQTSQNLMNNKGPGRNFQMPMLGRGITSSGGEKMSPY 699

Query: 554  VDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQSMGPRHPLNLPNSCPPSRPPIFPV 613
             DKLLTNDALHRPT IASRLGSSGLDS+MESQSIVQSMGPRHPLNL NSCPPSRPP+FPV
Sbjct: 700  GDKLLTNDALHRPTNIASRLGSSGLDSNMESQSIVQSMGPRHPLNLSNSCPPSRPPVFPV 759

Query: 614  PRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRNKELSLTTKSPQVGNQHTGHIPLTR 673
            PRHN SQFESLNGSNS +N ANR+FLPEQQMNN+RNKELSLTTKSPQVGNQHTGHIPLTR
Sbjct: 760  PRHNTSQFESLNGSNSFMNSANRTFLPEQQMNNLRNKELSLTTKSPQVGNQHTGHIPLTR 819

Query: 674  GNQLPAIPLKPQFLPSQDMQDNLNASTVPPALPHLMAPSLSQGYISQGYRPVISECLSSS 733
            GNQL ++PLKPQFLPSQDMQDN + S VPP LPHL+APSLSQGYISQG+RP  SE LSSS
Sbjct: 820  GNQLQSMPLKPQFLPSQDMQDNFSGSAVPPVLPHLIAPSLSQGYISQGHRPANSEGLSSS 879

Query: 734  APIGQWNLPVHNSPSNPFLLQGGPLPPLPPGPHPTSAPSISLSQKAGSLVPGQQPGTAFS 793
            APIGQWNL VHNS SNP  LQGGPLPPLPPGPHPTS P+I +SQK    VPGQQPGTA S
Sbjct: 880  APIGQWNLSVHNSSSNPLHLQGGPLPPLPPGPHPTSGPTIPISQK----VPGQQPGTAIS 939

Query: 794  GLISSLMAQGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYADLPRQCMTCGLRFK 853
            GLISSLMA+GLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYADLPRQCMTCGLRFK
Sbjct: 940  GLISSLMARGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYADLPRQCMTCGLRFK 999

Query: 854  TQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEAVPGFLPAEVIVE 913
            TQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEAVPGFLPAEV+VE
Sbjct: 1000 TQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEAVPGFLPAEVVVE 1059

Query: 914  KKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPDGQTAGMDRSQLG 973
            KKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPDGQTAGMDRSQLG
Sbjct: 1060 KKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPDGQTAGMDRSQLG 1119

Query: 974  PIVHAKCRTETNVVPSESFDQDE 996
            PIVHAKCRTETNVVPSESFDQDE
Sbjct: 1120 PIVHAKCRTETNVVPSESFDQDE 1138

BLAST of HG10018227 vs. ExPASy TrEMBL
Match: A0A6J1EZ18 (polyadenylation and cleavage factor homolog 4-like isoform X2 OS=Cucurbita moschata OX=3662 GN=LOC111440036 PE=4 SV=1)

HSP 1 Score: 1732.6 bits (4486), Expect = 0.0e+00
Identity = 878/1011 (86.84%), Positives = 920/1011 (91.00%), Query Frame = 0

Query: 1    MTPFMESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQKPPPSIAHRFRAQLKQRD 60
            M PFMESEKLLISRGNPR  AY SDR LPTT+GR MPNELPQKP PSIAHRFRAQLKQRD
Sbjct: 1    MNPFMESEKLLISRGNPRTLAYTSDRPLPTTTGRAMPNELPQKPSPSIAHRFRAQLKQRD 60

Query: 61   DEFRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLIC 120
            DEFRVSG DV P PT EDIVQLY+LMLSELTFNSKPIITDLTVLA+EQREHGKGIADLIC
Sbjct: 61   DEFRVSGLDVAPLPTTEDIVQLYELMLSELTFNSKPIITDLTVLAEEQREHGKGIADLIC 120

Query: 121  ARILEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180
            +RILEVPV+QKLPSLYLLDSIVKNVGHEYI+YFSSRLPEVFCEAYRQVHPNLHNAMRHLF
Sbjct: 121  SRILEVPVDQKLPSLYLLDSIVKNVGHEYINYFSSRLPEVFCEAYRQVHPNLHNAMRHLF 180

Query: 181  GTWATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVV 240
            GTW+TVFPPSI+RKIEA+LSQLT QE+S+LTSSRASESPRPTHGIHVNPKYLRQLEHSV 
Sbjct: 181  GTWSTVFPPSILRKIEARLSQLTTQETSALTSSRASESPRPTHGIHVNPKYLRQLEHSVG 240

Query: 241  DKHNQDARGASALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFALGINK 300
            DKH  DARGAS LKVHDKKLAPGYEEYDYDHAD LEHGGSQAF SMGSM HDSF+LG NK
Sbjct: 241  DKHIPDARGASTLKVHDKKLAPGYEEYDYDHADGLEHGGSQAFNSMGSMSHDSFSLGTNK 300

Query: 301  TNIKLAKSSASSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQ 360
             NIKLAKSS SSRIGHNRPLQS GDELEAVRASPSQNVYDYEG RMI+R EDTNKWRRKQ
Sbjct: 301  ANIKLAKSSLSSRIGHNRPLQSVGDELEAVRASPSQNVYDYEGFRMINRNEDTNKWRRKQ 360

Query: 361  YPDDNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNKV 420
            YPDDNLNGLESTS+NIRNG ALEGPRALIEAYGSDKGKGYLNDNPPQAEHFS+NGIDNK+
Sbjct: 361  YPDDNLNGLESTSFNIRNGCALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSMNGIDNKM 420

Query: 421  TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPPVPPSRFRTRTGFERSNAMSIEPGMR 480
            TPVTWQNTEEEEFDWEDMSPTLADRGR+NDMLKPPVPPSRFRTR GF+RSNAMSIEPGMR
Sbjct: 421  TPVTWQNTEEEEFDWEDMSPTLADRGRSNDMLKPPVPPSRFRTRLGFDRSNAMSIEPGMR 480

Query: 481  SNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQMPSLGRG 540
            SN S Q                    D W+MH+H+SQTSQNLM+ KG G NFQ+P LGRG
Sbjct: 481  SNSSHQ--------------------DAWSMHSHLSQTSQNLMSTKGTGGNFQIPLLGRG 540

Query: 541  IASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQSMGPRHPLNLPN 600
            IASSGGEKMSPFVDKL TNDALHRP T+ASRLGSS LDSSMESQS+VQSMG RHP+NL +
Sbjct: 541  IASSGGEKMSPFVDKLPTNDALHRP-TVASRLGSSALDSSMESQSVVQSMGQRHPVNLSD 600

Query: 601  SCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRNKELSLTTKSPQV 660
            SCPPSRPP F VP HNKSQFESLNGSN+ INRANRSFLPEQQMNN+RNKELS TTKSPQV
Sbjct: 601  SCPPSRPP-FHVPGHNKSQFESLNGSNAFINRANRSFLPEQQMNNVRNKELSHTTKSPQV 660

Query: 661  GNQHTGHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVPPALPHLMAPSLSQGYISQG 720
            GNQH G I LT+GNQL  IPLKPQFLPSQDM D+ +AS VPP LPHLMAPSLSQGY SQG
Sbjct: 661  GNQHGGRILLTQGNQLQTIPLKPQFLPSQDMHDSFSASAVPPVLPHLMAPSLSQGYSSQG 720

Query: 721  YRPVISECLSSSAPIGQWNLPVHNSPSNPFLLQGGPLPPLPPGPHPTSAPSISLSQKAGS 780
             RP ISECLSSS PIGQWNLPVHNSPSNP  LQ GPLPPLP GPHPT      +SQ AGS
Sbjct: 721  LRPGISECLSSSVPIGQWNLPVHNSPSNPLHLQ-GPLPPLPAGPHPT------ISQNAGS 780

Query: 781  LVPGQQPGTAFSGLISSLMAQGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYADL 840
            LVPGQQPGTAFSGLISSLMAQGLISLNN+ASVQDSVG+EFNPDVLKVRH+SAITALYADL
Sbjct: 781  LVPGQQPGTAFSGLISSLMAQGLISLNNKASVQDSVGVEFNPDVLKVRHDSAITALYADL 840

Query: 841  PRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEA 900
            PRQCMTCGLRFKTQEEHSNHMDWHVT+NRMSKSRKQKPSRKWFVS SMWLSGAEALGTEA
Sbjct: 841  PRQCMTCGLRFKTQEEHSNHMDWHVTRNRMSKSRKQKPSRKWFVSTSMWLSGAEALGTEA 900

Query: 901  VPGFLPAEVIVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPD 960
            VPGFLPAEVIVEKKDDEELAVPADEDQKTCALCGEPF+DFYSDETEEWMYRGAVYMNAPD
Sbjct: 901  VPGFLPAEVIVEKKDDEELAVPADEDQKTCALCGEPFDDFYSDETEEWMYRGAVYMNAPD 960

Query: 961  GQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDEPGGLSEEGNRRKRLRS 1012
            GQTAGMDRSQLGPIVHAKCRTE+NVVPSESFDQDE  G+SEEG++RKRLRS
Sbjct: 961  GQTAGMDRSQLGPIVHAKCRTESNVVPSESFDQDEQRGVSEEGSQRKRLRS 982
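
The percentages in each HSP summary line follow directly from the residue counts given beside them (for example, 878/1011 is about 86.84%). The following is a minimal Python sketch for re-deriving them, assuming the summary line keeps the layout shown in this record. The function name `check_hsp_stats` is hypothetical and is not part of any BLAST tooling; the regular expression accepts any single word as the second label.

```python
import re

# Matches lines such as:
#   "Identity = 878/1011 (86.84%), Positives = 920/1011 (91.00%), Query Frame = 0"
# The second label is matched loosely (\w+) so spelling variants are tolerated.
HSP_STATS = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"\w+ = (\d+)/(\d+) \(([\d.]+)%\)"
)


def check_hsp_stats(line):
    """Recompute identity/positive percentages from the counts in an HSP line.

    Hypothetical helper for illustration only. Returns the recomputed values
    (rounded to two decimals) and whether they agree with the reported ones.
    """
    match = HSP_STATS.search(line)
    if match is None:
        raise ValueError("not an HSP summary line: %r" % line)
    id_n, id_len, id_pct, pos_n, pos_len, pos_pct = match.groups()
    identity = 100 * int(id_n) / int(id_len)
    positives = 100 * int(pos_n) / int(pos_len)
    # A value printed with two decimals may differ from the exact ratio
    # by at most half a unit in the last place.
    tolerance = 0.005 + 1e-9
    return {
        "identity_pct": round(identity, 2),
        "positives_pct": round(positives, 2),
        "consistent": (
            abs(identity - float(id_pct)) <= tolerance
            and abs(positives - float(pos_pct)) <= tolerance
        ),
    }


if __name__ == "__main__":
    print(check_hsp_stats(
        "Identity = 446/1013 (44.03%), Positives = 565/1013 (55.77%), Query Frame = 0"
    ))
```

A check of this kind only confirms that the reported percentages match the counts; it says nothing about the alignment itself.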

BLAST of HG10018227 vs. TAIR 10
Match: AT4G04885.1 (PCF11P-similar protein 4 )

HSP 1 Score: 649.8 bits (1675), Expect = 3.6e-186
Identity = 446/1013 (44.03%), Positives = 565/1013 (55.77%), Query Frame = 0

Query: 5    MESEKLLISRGNPRNSAYPSDRQLPTTSGRTMPNELPQK--PPPSIAHRFRAQLKQRDDE 64
            M+SEK+L    NPR  +      + +TS + M  ELPQK  PPPS+  RF+A L QR+DE
Sbjct: 1    MDSEKIL----NPRLVS------INSTSRKGMSVELPQKPPPPPSLLDRFKALLNQREDE 60

Query: 65   FRVSGHDVVPPPTAEDIVQLYDLMLSELTFNSKPIITDLTVLADEQREHGKGIADLICAR 124
            F   G + V PP+ ++IVQLY+++L ELTFNSKPIITDLT++A EQREHG+GIA+ IC R
Sbjct: 61   F--GGGEEVLPPSMDEIVQLYEVVLGELTFNSKPIITDLTIIAGEQREHGEGIANAICTR 120

Query: 125  ILEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLFGT 184
            ILE PVEQKLPSLYLLDSIVKN+G +Y  YFSSRLPEVFC AYRQ HP+LH +MRHLFGT
Sbjct: 121  ILEAPVEQKLPSLYLLDSIVKNIGRDYGRYFSSRLPEVFCLAYRQAHPSLHPSMRHLFGT 180

Query: 185  WATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESPRPTHGIHVNPKYLRQLEHSVVDK 244
            W++VFPP ++RKI+ QL   +A   SS+    ASE  +PT GIHVNPKYLR+LE S  + 
Sbjct: 181  WSSVFPPPVLRKIDMQLQLSSAANQSSV---GASEPSQPTRGIHVNPKYLRRLEPSAAE- 240

Query: 245  HNQDARGA-SALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFALGINKT 304
               + RG  S+ +V+ +    GY +++    D LE   S       S   D F    N  
Sbjct: 241  --NNLRGINSSARVYGQNSLGGYNDFE----DQLESPSSL------SSTPDGFTRRSND- 300

Query: 305  NIKLAKSSASSRIGHNRPLQSAGDELEAVRASPSQNVYDYEGSRMIDRIEDTNKWRRKQY 364
                                          A+PS   ++Y   R   R ++  +WRRK+ 
Sbjct: 301  -----------------------------GANPSNQAFNYGMGRATSRDDEHMEWRRKE- 360

Query: 365  PDDNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSINGIDNK-V 424
                         N+  G+  E PRALI+AYG D  K    + P +     +NG+ +K V
Sbjct: 361  -------------NLGQGNDHERPRALIDAYGVDTSKHVTINKPIR----DMNGMHSKMV 420

Query: 425  TPVTWQNTEEEEFDWEDMSPTLADRGRNNDMLKPPVPP-SRFRTRTGFERSNAMSIEPGM 484
            TP  WQNTEEEEFDWEDMSPTL DR R  + L+  VP     R R     ++   ++  +
Sbjct: 421  TP--WQNTEEEEFDWEDMSPTL-DRSRAGEFLRSSVPALGSVRARPRVGNTSDFHLDSDI 480

Query: 485  RSNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGAGRNFQMPSLGR 544
            ++  S Q +                  + W++  +   TS  +  +  AG++ ++ +   
Sbjct: 481  KNGVSHQLR------------------ENWSLSQNYPHTSNRV--DTRAGKDLKVLASSV 540

Query: 545  GIASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQSMGPRHPLNLP 604
            G+ SS  E  +P  D +           + SR G          +++     P      P
Sbjct: 541  GLVSSNSEFGAPPFDSI---------QDVNSRFG----------RALPDGTWPHLSARGP 600

Query: 605  NSCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRNKELSLTTKSPQ 664
            NS         PVP            S  L + AN    P   M+N R +   L     Q
Sbjct: 601  NS--------LPVP------------SAHLHHLAN----PGNAMSN-RLQGKPLYRPENQ 660

Query: 665  VGNQHTGHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVPPALPHLMAPSLSQGYISQ 724
            V   H     +T+ NQ+        +LPS       +++  P  +  L+       ++S 
Sbjct: 661  VSQSHLN--DMTQQNQMLV-----NYLPS-------SSAMAPRPMQSLLT------HVSH 720

Query: 725  GYRPVISECLSSSAPIGQWNLPVHNSPSNPFL-LQGGPLPPLPPGPHPTSAPSISLSQKA 784
            GY                   P H S   P L +QGG         HP S  S  LSQ  
Sbjct: 721  GY-------------------PPHGSTIRPSLSIQGG------EAMHPLS--SGVLSQIG 780

Query: 785  GSLVPGQQPGTAFSGLISSLMAQGLISLNNQASVQDSVGLEFNPDVLKVRHESAITALYA 844
             S    Q PG AFSGLI SLMAQGLISLNNQ + Q  +GLEF+ D+LK+R+ESAI+ALY 
Sbjct: 781  AS---NQPPGGAFSGLIGSLMAQGLISLNNQPAGQGPLGLEFDADMLKIRNESAISALYG 808

Query: 845  DLPRQCMTCGLRFKTQEEHSNHMDWHVTKNRMSKSRKQKPSRKWFVSISMWLSGAEALGT 904
            DLPRQC TCGLRFK QEEHS HMDWHVTKNRMSK+ KQ PSRKWFVS SMWLSGAEALG 
Sbjct: 841  DLPRQCTTCGLRFKCQEEHSKHMDWHVTKNRMSKNHKQNPSRKWFVSASMWLSGAEALGA 808

Query: 905  EAVPGFLPAEVIVEKKDDEELAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNA 964
            EAVPGFLP E   EKKDDE++AVPADEDQ +CALCGEPFEDFYSDETEEWMY+GAVYMNA
Sbjct: 901  EAVPGFLPTEPTTEKKDDEDMAVPADEDQTSCALCGEPFEDFYSDETEEWMYKGAVYMNA 808

Query: 965  PDGQTAGMDRSQLGPIVHAKCRTETNVVPSESFDQDEPGGLSEEGNRRKRLRS 1012
            P+  T  MD+SQLGPIVHAKCR E+N            GG  EEG++RK++RS
Sbjct: 961  PEESTTDMDKSQLGPIVHAKCRPESN------------GGDMEEGSQRKKMRS 808

BLAST of HG10018227 vs. TAIR 10
Match: AT2G36480.1 (ENTH/VHS family protein )

HSP 1 Score: 198.0 bits (502), Expect = 3.7e-50
Identity = 235/900 (26.11%), Positives = 359/900 (39.89%), Query Frame = 0

Query: 124 LEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLFGTW 183
           ++VP +QKLP+LYLLDSIVKN+G +YI YF +RLPEVF +AYRQV P +H+ MRHLFGTW
Sbjct: 1   MQVPSDQKLPTLYLLDSIVKNIGRDYIKYFGARLPEVFVKAYRQVDPPMHSNMRHLFGTW 60

Query: 184 ATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESP---RPTHGIHVNPKYLRQLEHSVV 243
             VF P  ++ IE +L      + S+   S A   P   RP H IHVNPKYL +      
Sbjct: 61  KGVFHPQTLQLIEKELGFNAKSDGSAAVVSTARAEPQSQRPPHSIHVNPKYLER------ 120

Query: 244 DKHNQDARGASALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFALGINK 303
            +  Q +     +     + AP         +D LE   S A          S+      
Sbjct: 121 -QRLQQSGRTKGMVTDVPETAPNLTR----DSDRLERVSSIA-------SGGSWVGPAKV 180

Query: 304 TNIKLAKSSASSRIGHNRPLQSAGDELEAVRASP--SQNVYDYEGSRMI-DRIEDTNKWR 363
            NI+  +    S   + + ++S   E +     P  S++V    GSR+  D  E      
Sbjct: 181 NNIRRPQRDLLSEPLYEKDIESIAGEYDYASDLPHNSRSVIKNVGSRITDDGCEKQWYGA 240

Query: 364 RKQYPD---DNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSIN 423
             + PD   D  +GL S S   R  +        +E+ G  +  G   D           
Sbjct: 241 TNRDPDLISDQRDGLHSKS---RTSNYATARVENLESSGPSRNIGVPYD----------- 300

Query: 424 GIDNKVTPVTWQNTEEEEFDWEDMSPTLADRG----RNNDMLKPPVPPSRFRTRTG-FER 483
                    +W+N+EEEEF W DM   L++         + L  P    R  +     +R
Sbjct: 301 ---------SWKNSEEEEFMW-DMHSRLSETDVATINPKNELHAPDESERLESENHLLKR 360

Query: 484 SNAMSIEPGM-RSNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGA 543
               +++P    +N ++       D S +      ST      +   + T + +      
Sbjct: 361 PRFSALDPRFDPANSTNSYSSEQKDPSSIGHWAFSST------NATSTATRKGIQPQPRV 420

Query: 544 GRNFQMPSLGRGIASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQ 603
             +  +PS G     SG ++ SP  D     +   +    A  L     D         Q
Sbjct: 421 ASSGILPSSG-----SGSDRQSPLHDSTSKQNVTKQDVRRAHSLPQR--DPRASRFPAKQ 480

Query: 604 SMGPRHPLNLPNSCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRN 663
           ++     + LP+S        F      +   E  +  ++  N    +   E       +
Sbjct: 481 NVPRDDSVRLPSSSSQ-----FKNTNMRELPVEIFDSKSAAENAPGLTLASEATGQPNMS 540

Query: 664 KELSLTTKSPQVGNQHT-------GHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVP 723
             L    KS  + N  T        H  +  G        KP+ LP     DNL      
Sbjct: 541 DLLEAVMKSGILSNNSTCGAIKEESHDEVNPGALTLPAASKPKTLPISLATDNL------ 600

Query: 724 PALPHLMAPSLSQGYISQGYRPVISECLSSSAPIGQWNLPVHNSPSNPF------LLQGG 783
                     L++  + Q   P++S   S +           +  S+P       L+  G
Sbjct: 601 ----------LARLKVEQSSAPLVSCAASLTGITSVQTSKEKSKASDPLSCLLSSLVSKG 660

Query: 784 -------PLPPLPPGPHPTSAPSISLSQKAGSLVPGQ-QPGTAFSGLISSLMAQGLI--S 843
                   LP  P      S    + S  + S+VP   QP     G  ++   +GL   S
Sbjct: 661 LISASKTELPSAPSITQEHSPDHSTNSSMSVSVVPADAQPSVLVKGPSTAPKVKGLAAPS 720

Query: 844 LNNQASVQDSVGLEFNPDVLKVRHESAITALYADLPRQCMTCGLRFKTQEEHSNHMDWHV 903
             +++  +D +GL+F  D ++  H S I++L+ DLP  C +C +R K +EE   HM+ H 
Sbjct: 721 ETSKSEPKDLIGLKFRADKIRELHPSVISSLFDDLPHLCTSCSVRLKQKEELDRHMELH- 780

Query: 904 TKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEAVPGFLPAEVIVEKKDDEELAVPADE 963
            K ++  S      R WF  +  W++   A   E  P +       E   ++  AV ADE
Sbjct: 781 DKKKLELSGTNSKCRVWFPKVDNWIA---AKAGELEPEYEEVLSEPESAIEDCQAVAADE 815

Query: 964 DQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPDGQTAGMDRSQLGPIVHAKCRTETNV 986
            Q  C LCGE FED++S E  +WM++GA Y+  P   +        GPIVH  C T +++
Sbjct: 841 TQCACILCGEVFEDYFSQEMAQWMFKGASYLTNPPANSEAS-----GPIVHTGCLTTSSL 815

BLAST of HG10018227 vs. TAIR 10
Match: AT2G36480.2 (ENTH/VHS family protein )

HSP 1 Score: 198.0 bits (502), Expect = 3.7e-50
Identity = 235/900 (26.11%), Positives = 359/900 (39.89%), Query Frame = 0

Query: 124 LEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLFGTW 183
           ++VP +QKLP+LYLLDSIVKN+G +YI YF +RLPEVF +AYRQV P +H+ MRHLFGTW
Sbjct: 1   MQVPSDQKLPTLYLLDSIVKNIGRDYIKYFGARLPEVFVKAYRQVDPPMHSNMRHLFGTW 60

Query: 184 ATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESP---RPTHGIHVNPKYLRQLEHSVV 243
             VF P  ++ IE +L      + S+   S A   P   RP H IHVNPKYL +      
Sbjct: 61  KGVFHPQTLQLIEKELGFNAKSDGSAAVVSTARAEPQSQRPPHSIHVNPKYLER------ 120

Query: 244 DKHNQDARGASALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFALGINK 303
            +  Q +     +     + AP         +D LE   S A          S+      
Sbjct: 121 -QRLQQSGRTKGMVTDVPETAPNLTR----DSDRLERVSSIA-------SGGSWVGPAKV 180

Query: 304 TNIKLAKSSASSRIGHNRPLQSAGDELEAVRASP--SQNVYDYEGSRMI-DRIEDTNKWR 363
            NI+  +    S   + + ++S   E +     P  S++V    GSR+  D  E      
Sbjct: 181 NNIRRPQRDLLSEPLYEKDIESIAGEYDYASDLPHNSRSVIKNVGSRITDDGCEKQWYGA 240

Query: 364 RKQYPD---DNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSIN 423
             + PD   D  +GL S S   R  +        +E+ G  +  G   D           
Sbjct: 241 TNRDPDLISDQRDGLHSKS---RTSNYATARVENLESSGPSRNIGVPYD----------- 300

Query: 424 GIDNKVTPVTWQNTEEEEFDWEDMSPTLADRG----RNNDMLKPPVPPSRFRTRTG-FER 483
                    +W+N+EEEEF W DM   L++         + L  P    R  +     +R
Sbjct: 301 ---------SWKNSEEEEFMW-DMHSRLSETDVATINPKNELHAPDESERLESENHLLKR 360

Query: 484 SNAMSIEPGM-RSNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGA 543
               +++P    +N ++       D S +      ST      +   + T + +      
Sbjct: 361 PRFSALDPRFDPANSTNSYSSEQKDPSSIGHWAFSST------NATSTATRKGIQPQPRV 420

Query: 544 GRNFQMPSLGRGIASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQ 603
             +  +PS G     SG ++ SP  D     +   +    A  L     D         Q
Sbjct: 421 ASSGILPSSG-----SGSDRQSPLHDSTSKQNVTKQDVRRAHSLPQR--DPRASRFPAKQ 480

Query: 604 SMGPRHPLNLPNSCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRN 663
           ++     + LP+S        F      +   E  +  ++  N    +   E       +
Sbjct: 481 NVPRDDSVRLPSSSSQ-----FKNTNMRELPVEIFDSKSAAENAPGLTLASEATGQPNMS 540

Query: 664 KELSLTTKSPQVGNQHT-------GHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVP 723
             L    KS  + N  T        H  +  G        KP+ LP     DNL      
Sbjct: 541 DLLEAVMKSGILSNNSTCGAIKEESHDEVNPGALTLPAASKPKTLPISLATDNL------ 600

Query: 724 PALPHLMAPSLSQGYISQGYRPVISECLSSSAPIGQWNLPVHNSPSNPF------LLQGG 783
                     L++  + Q   P++S   S +           +  S+P       L+  G
Sbjct: 601 ----------LARLKVEQSSAPLVSCAASLTGITSVQTSKEKSKASDPLSCLLSSLVSKG 660

Query: 784 -------PLPPLPPGPHPTSAPSISLSQKAGSLVPGQ-QPGTAFSGLISSLMAQGLI--S 843
                   LP  P      S    + S  + S+VP   QP     G  ++   +GL   S
Sbjct: 661 LISASKTELPSAPSITQEHSPDHSTNSSMSVSVVPADAQPSVLVKGPSTAPKVKGLAAPS 720

Query: 844 LNNQASVQDSVGLEFNPDVLKVRHESAITALYADLPRQCMTCGLRFKTQEEHSNHMDWHV 903
             +++  +D +GL+F  D ++  H S I++L+ DLP  C +C +R K +EE   HM+ H 
Sbjct: 721 ETSKSEPKDLIGLKFRADKIRELHPSVISSLFDDLPHLCTSCSVRLKQKEELDRHMELH- 780

Query: 904 TKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEAVPGFLPAEVIVEKKDDEELAVPADE 963
            K ++  S      R WF  +  W++   A   E  P +       E   ++  AV ADE
Sbjct: 781 DKKKLELSGTNSKCRVWFPKVDNWIA---AKAGELEPEYEEVLSEPESAIEDCQAVAADE 815

Query: 964 DQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPDGQTAGMDRSQLGPIVHAKCRTETNV 986
            Q  C LCGE FED++S E  +WM++GA Y+  P   +        GPIVH  C T +++
Sbjct: 841 TQCACILCGEVFEDYFSQEMAQWMFKGASYLTNPPANSEAS-----GPIVHTGCLTTSSL 815

BLAST of HG10018227 vs. TAIR 10
Match: AT2G36480.3 (ENTH/VHS family protein )

HSP 1 Score: 198.0 bits (502), Expect = 3.7e-50
Identity = 235/900 (26.11%), Positives = 359/900 (39.89%), Query Frame = 0

Query: 124 LEVPVEQKLPSLYLLDSIVKNVGHEYISYFSSRLPEVFCEAYRQVHPNLHNAMRHLFGTW 183
           ++VP +QKLP+LYLLDSIVKN+G +YI YF +RLPEVF +AYRQV P +H+ MRHLFGTW
Sbjct: 1   MQVPSDQKLPTLYLLDSIVKNIGRDYIKYFGARLPEVFVKAYRQVDPPMHSNMRHLFGTW 60

Query: 184 ATVFPPSIIRKIEAQLSQLTAQESSSLTSSRASESP---RPTHGIHVNPKYLRQLEHSVV 243
             VF P  ++ IE +L      + S+   S A   P   RP H IHVNPKYL +      
Sbjct: 61  KGVFHPQTLQLIEKELGFNAKSDGSAAVVSTARAEPQSQRPPHSIHVNPKYLER------ 120

Query: 244 DKHNQDARGASALKVHDKKLAPGYEEYDYDHADVLEHGGSQAFRSMGSMGHDSFALGINK 303
            +  Q +     +     + AP         +D LE   S A          S+      
Sbjct: 121 -QRLQQSGRTKGMVTDVPETAPNLTR----DSDRLERVSSIA-------SGGSWVGPAKV 180

Query: 304 TNIKLAKSSASSRIGHNRPLQSAGDELEAVRASP--SQNVYDYEGSRMI-DRIEDTNKWR 363
            NI+  +    S   + + ++S   E +     P  S++V    GSR+  D  E      
Sbjct: 181 NNIRRPQRDLLSEPLYEKDIESIAGEYDYASDLPHNSRSVIKNVGSRITDDGCEKQWYGA 240

Query: 364 RKQYPD---DNLNGLESTSYNIRNGHALEGPRALIEAYGSDKGKGYLNDNPPQAEHFSIN 423
             + PD   D  +GL S S   R  +        +E+ G  +  G   D           
Sbjct: 241 TNRDPDLISDQRDGLHSKS---RTSNYATARVENLESSGPSRNIGVPYD----------- 300

Query: 424 GIDNKVTPVTWQNTEEEEFDWEDMSPTLADRG----RNNDMLKPPVPPSRFRTRTG-FER 483
                    +W+N+EEEEF W DM   L++         + L  P    R  +     +R
Sbjct: 301 ---------SWKNSEEEEFMW-DMHSRLSETDVATINPKNELHAPDESERLESENHLLKR 360

Query: 484 SNAMSIEPGM-RSNWSSQCQLPTIDSSMVVEDVVQSTPDIWNMHNHISQTSQNLMNNKGA 543
               +++P    +N ++       D S +      ST      +   + T + +      
Sbjct: 361 PRFSALDPRFDPANSTNSYSSEQKDPSSIGHWAFSST------NATSTATRKGIQPQPRV 420

Query: 544 GRNFQMPSLGRGIASSGGEKMSPFVDKLLTNDALHRPTTIASRLGSSGLDSSMESQSIVQ 603
             +  +PS G     SG ++ SP  D     +   +    A  L     D         Q
Sbjct: 421 ASSGILPSSG-----SGSDRQSPLHDSTSKQNVTKQDVRRAHSLPQR--DPRASRFPAKQ 480

Query: 604 SMGPRHPLNLPNSCPPSRPPIFPVPRHNKSQFESLNGSNSLINRANRSFLPEQQMNNMRN 663
           ++     + LP+S        F      +   E  +  ++  N    +   E       +
Sbjct: 481 NVPRDDSVRLPSSSSQ-----FKNTNMRELPVEIFDSKSAAENAPGLTLASEATGQPNMS 540

Query: 664 KELSLTTKSPQVGNQHT-------GHIPLTRGNQLPAIPLKPQFLPSQDMQDNLNASTVP 723
             L    KS  + N  T        H  +  G        KP+ LP     DNL      
Sbjct: 541 DLLEAVMKSGILSNNSTCGAIKEESHDEVNPGALTLPAASKPKTLPISLATDNL------ 600

Query: 724 PALPHLMAPSLSQGYISQGYRPVISECLSSSAPIGQWNLPVHNSPSNPF------LLQGG 783
                     L++  + Q   P++S   S +           +  S+P       L+  G
Sbjct: 601 ----------LARLKVEQSSAPLVSCAASLTGITSVQTSKEKSKASDPLSCLLSSLVSKG 660

Query: 784 -------PLPPLPPGPHPTSAPSISLSQKAGSLVPGQ-QPGTAFSGLISSLMAQGLI--S 843
                   LP  P      S    + S  + S+VP   QP     G  ++   +GL   S
Sbjct: 661 LISASKTELPSAPSITQEHSPDHSTNSSMSVSVVPADAQPSVLVKGPSTAPKVKGLAAPS 720

Query: 844 LNNQASVQDSVGLEFNPDVLKVRHESAITALYADLPRQCMTCGLRFKTQEEHSNHMDWHV 903
             +++  +D +GL+F  D ++  H S I++L+ DLP  C +C +R K +EE   HM+ H 
Sbjct: 721 ETSKSEPKDLIGLKFRADKIRELHPSVISSLFDDLPHLCTSCSVRLKQKEELDRHMELH- 780

Query: 904 TKNRMSKSRKQKPSRKWFVSISMWLSGAEALGTEAVPGFLPAEVIVEKKDDEELAVPADE 963
            K ++  S      R WF  +  W++   A   E  P +       E   ++  AV ADE
Sbjct: 781 DKKKLELSGTNSKCRVWFPKVDNWIA---AKAGELEPEYEEVLSEPESAIEDCQAVAADE 815

Query: 964 DQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPDGQTAGMDRSQLGPIVHAKCRTETNV 986
            Q  C LCGE FED++S E  +WM++GA Y+  P   +        GPIVH  C T +++
Sbjct: 841 TQCACILCGEVFEDYFSQEMAQWMFKGASYLTNPPANSEAS-----GPIVHTGCLTTSSL 815

BLAST of HG10018227 vs. TAIR 10
Match: AT1G66500.1 (Pre-mRNA cleavage complex II )

HSP 1 Score: 165.2 bits (417), Expect = 2.7e-40
Identity = 99/199 (49.75%), Positives = 119/199 (59.80%), Query Frame = 0

Query: 809 QASVQDS--VGLEF-NPDVLKVRHESAITALYADLPRQCMTCGLRFKTQEEHSNHMDWHV 868
           +AS  DS  VGL F NP  L VRHES I +LY+D+PRQC +CGLRFK QEEHS HMDWHV
Sbjct: 218 EASNSDSLPVGLSFDNPSSLNVRHESVIKSLYSDMPRQCSSCGLRFKCQEEHSKHMDWHV 277

Query: 869 TKNRMSKS-----RKQKPSRKWFVSISMWLSGAEALGTEAVPGFLPAEVIVEKKDDEE-- 928
            KNR  K+     ++ K SR W  S S+WL  A    T  V  F   E+  +K  DEE  
Sbjct: 278 RKNRSVKTTTRLGQQPKKSRGWLASASLWLCAATGGETVEVASF-GGEMQKKKGKDEEPK 337

Query: 929 -LAVPADEDQKTCALCGEPFEDFYSDETEEWMYRGAVYMNAPDGQTAGMDRSQLGPIVHA 988
            L VPADEDQK CALC EPFE+F+S E ++WMY+ AVY+            ++ G IVH 
Sbjct: 338 QLMVPADEDQKNCALCVEPFEEFFSHEDDDWMYKDAVYL------------TKNGRIVHV 397

Query: 989 KCRTETNVVPSESFDQDEP 997
           KC  E    P  + D  EP
Sbjct: 398 KCMPE----PRPAKDLREP 399

The following BLAST results are available for this feature:
Match Name	E-value	Identity (%)	Description
XP_038894060.1	0.0e+00	94.96	polyadenylation and cleavage factor homolog 4 isoform X3 [Benincasa hispida]
XP_038894058.1	0.0e+00	94.98	polyadenylation and cleavage factor homolog 4 isoform X1 [Benincasa hispida]
XP_038894059.1	0.0e+00	94.98	polyadenylation and cleavage factor homolog 4 isoform X2 [Benincasa hispida]
XP_008462986.1	0.0e+00	91.50	PREDICTED: polyadenylation and cleavage factor homolog 4 isoform X2 [Cucumis mel...
XP_008462960.1	0.0e+00	91.05	PREDICTED: polyadenylation and cleavage factor homolog 4 isoform X1 [Cucumis mel...

Match Name	E-value	Identity (%)	Description
Q0WPF2	5.0e-185	44.03	Polyadenylation and cleavage factor homolog 4 OS=Arabidopsis thaliana OX=3702 GN...
Q9C710	3.7e-39	49.75	Polyadenylation and cleavage factor homolog 1 OS=Arabidopsis thaliana OX=3702 GN...
Q9FIX8	5.4e-38	37.72	Polyadenylation and cleavage factor homolog 5 OS=Arabidopsis thaliana OX=3702 GN...
O94913	4.3e-19	36.08	Pre-mRNA cleavage complex 2 protein Pcf11 OS=Homo sapiens OX=9606 GN=PCF11 PE=1 ...
Q10237	3.2e-14	37.11	Uncharacterized protein C4G9.04c OS=Schizosaccharomyces pombe (strain 972 / ATCC...

Match Name	E-value	Identity (%)	Description
A0A1S3CI66	0.0e+00	91.50	polyadenylation and cleavage factor homolog 4 isoform X2 OS=Cucumis melo OX=3656...
A0A1S3CJP9	0.0e+00	91.05	polyadenylation and cleavage factor homolog 4 isoform X1 OS=Cucumis melo OX=3656...
A0A0A0LVG0	0.0e+00	91.60	CID domain-containing protein OS=Cucumis sativus OX=3659 GN=Csa_1G109350 PE=4 SV...
A0A5A7UC46	0.0e+00	91.66	Polyadenylation and cleavage factor-like protein 4 isoform X2 OS=Cucumis melo va...
A0A6J1EZ18	0.0e+00	86.84	polyadenylation and cleavage factor homolog 4-like isoform X2 OS=Cucurbita mosch...

Match Name	E-value	Identity (%)	Description
AT4G04885.1	3.6e-186	44.03	PCF11P-similar protein 4
AT2G36480.1	3.7e-50	26.11	ENTH/VHS family protein
AT2G36480.2	3.7e-50	26.11	ENTH/VHS family protein
AT2G36480.3	3.7e-50	26.11	ENTH/VHS family protein
AT1G66500.1	2.7e-40	49.75	Pre-mRNA cleavage complex II
InterPro
Analysis Name: InterPro Annotations of Bottle gourd (Hangzhou Gourd) v1
Date Performed: 2022-08-01
IPR Term	IPR Description	Source	Source Term	Source Description	Alignment
IPR006569	CID domain	SMART	SM00582	558neu5	coord: 78..200; e-value: 1.8E-41; score: 153.8
IPR006569	CID domain	PFAM	PF04818	CID	coord: 87..194; e-value: 2.0E-11; score: 44.2
IPR006569	CID domain	PROSITE	PS51391	CID	coord: 75..203; score: 37.997066
IPR008942	ENTH/VHS	GENE3D	1.25.40.90		coord: 73..201; e-value: 5.9E-42; score: 144.8
IPR008942	ENTH/VHS	SUPERFAMILY	48464	ENTH/VHS domain	coord: 75..203
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 751..785
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 440..463
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 206..227
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 13..38
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 752..768
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 769..785
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 983..1011
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 1..47
None	No IPR available	PANTHER	PTHR15921:SF12	POLYADENYLATION AND CLEAVAGE FACTOR HOMOLOG 4	coord: 42..985
None	No IPR available	CDD	cd16982	CID_Pcf11	coord: 80..199; e-value: 3.39355E-54; score: 182.38
IPR045154	Protein PCF11-like	PANTHER	PTHR15921	PRE-MRNA CLEAVAGE COMPLEX II	coord: 42..985
IPR013087	Zinc finger C2H2-type	PROSITE	PS00028	ZINC_FINGER_C2H2_1	coord: 844..864
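
Several of the InterPro sources above place a CID / ENTH-VHS type domain at nearly the same N-terminal coordinates. The following is a minimal Python sketch for summarising that agreement, with the coordinates copied from the table. The helper name `span_summary` and the dictionary labels are illustrative assumptions, and treating these six hits as one region is an interpretation rather than something stated in the record.

```python
# Coordinates (1-based, inclusive) copied from the InterPro table above.
# Grouping these six hits together is an assumption made for illustration.
n_terminal_hits = {
    "SMART SM00582": (78, 200),
    "PFAM PF04818": (87, 194),
    "PROSITE PS51391": (75, 203),
    "GENE3D 1.25.40.90": (73, 201),
    "SUPERFAMILY 48464": (75, 203),
    "CDD cd16982": (80, 199),
}


def span_summary(hits):
    """Return (union, core) spans for a set of overlapping hits.

    union: smallest interval containing every hit.
    core:  interval shared by all hits, or None if they do not all overlap.
    """
    starts = [start for start, _ in hits.values()]
    ends = [end for _, end in hits.values()]
    union = (min(starts), max(ends))
    core_start, core_end = max(starts), min(ends)
    core = (core_start, core_end) if core_start <= core_end else None
    return union, core


if __name__ == "__main__":
    union, core = span_summary(n_terminal_hits)
    print("union of hits:", union)
    print("shared core:  ", core)
```

The union gives the widest extent any source assigns to the region, while the core is the stretch on which all sources agree.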

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name	Unique Name	Type
HG10018227.1	HG10018227.1	mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category	Term Accession	Term Name
biological_process	GO:0006379	mRNA cleavage
biological_process	GO:0006378	mRNA polyadenylation
biological_process	GO:0009911	positive regulation of flower development
biological_process	GO:0006369	termination of RNA polymerase II transcription
cellular_component	GO:0005737	cytoplasm
cellular_component	GO:0005849	mRNA cleavage factor complex
molecular_function	GO:0003729	mRNA binding
molecular_function	GO:0000993	RNA polymerase II complex binding