Cp4.1LG14g06200 (gene) Cucurbita pepo (MU‐CU‐16) v4.1

Overview
Name: Cp4.1LG14g06200
Type: gene
Organism: Cucurbita pepo (Cucurbita pepo (MU‐CU‐16) v4.1)
Description: Cleavage and polyadenylation specificity factor subunit 2
Location: Cp4.1LG14: 560647 .. 576144 (-)
RNA-Seq Expression: Cp4.1LG14g06200
Synteny: Cp4.1LG14g06200
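
The Location field packs the sequence ID, the start and end coordinates, and the strand into a single string. The following is a minimal sketch of how it could be split apart and the genomic span computed. It assumes the coordinates are 1-based and inclusive (the usual GFF convention, not stated on this page), and `parse_location` is a hypothetical helper name used only for illustration.

```python
import re

def parse_location(text):
    """Split 'SEQID: START .. END (STRAND)' into its four parts."""
    m = re.fullmatch(r"\s*(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*", text)
    if m is None:
        raise ValueError(f"unrecognised location: {text!r}")
    return m.group(1), int(m.group(2)), int(m.group(3)), m.group(4)

seqid, start, end, strand = parse_location("Cp4.1LG14: 560647 .. 576144 (-)")

# Assumption: 1-based inclusive coordinates, so the span is end - start + 1.
span = end - start + 1
print(seqid, start, end, strand, span)
```

Under that assumption the gene spans 15498 bp on the minus strand of Cp4.1LG14.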
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

Legend: polypeptide, exon, CDS, three_prime_UTR
ATGCCATGTGACTGTCGACTGGTACTAGCCGAGTAAGGAAATCAGTGATACCTAGCTGTTTTCGCCGGTTCTGAAACGTGTTAGGTTCTTAAGAAAATCGGTGGAATGGTTCATTAGAAGTTCTTCTTACATTTGAATGTAACAATCTTTTTTTGGATTAAAGGGTTCGGAGTGAATAACTCGCGAAAAGGCTTCTCAACAATTGGGCTACAAACTCTTGATTTTTGTTTACTTCATCGCCCTTAATACGTAATTGTGGTTTTTCAGGGTGGCATCGACGATTGATGCAGTTTTGATATCACATCCTGATACACTTCACCTCGGTGCCCTTCCATATGCCATGAAACAACTTGGACTTTCTGCTCCAGTATATTCCACTGAACCCGTGTATCGATTGGGCCTTCTTACAATGTATGATCAGTTTATAGCGAGGAAGGTAATTATTTTTTTCATATAATTTGGTTCCTTAAAGTTATGAATTAACCAAATCAAACAAGCTACTATACTATTTACCTGAACTACTTGCCATCAAGTTTCTTTAAAACTCAAGAACCTAGTTGTATAACCTATTGAATAGATCCTACTAAAGTAAGCTTGTATACTTGCGGTCAAGTTTGAAGTGTTGTGAAGAAAATAGTCCTTTTTGTTGTAGCCTTGTGAGCGTGGTAGACTCCTACAATTATGTGAGTGATGACTGACCAAGCAAGTTTTGGTTAATTTATCTAGGTCAATTATCACTCGCCAGTTTATGTGGGCTTGGACCCTCGAAATTTTTGCTGTACGTTGTTTTAAAGAATTTTAATTCTTTTTTGCTTGTGTGTGGGTGGGGGGTGGGGAATTATTGTCGGATCCAAGTTATATATTTCGTAATTACATGGTTGCTAATCGATTGCCGACATTTCATATTAGCATTGTTATGAGTAGATTTTTTTAAAAAATAATATTTAAGTCAGAATAATAGTCTCTGCTTCACAAATTTACCATTGATTCCTGATTGTTGTTTTCACTAGCTTTTCTTTACTGTCTTTGTGCTGGGTGTTGCCATATGGACACCTCCCCTTTCCCTTATGCAGCAAGTTTCAGAGTTTGATCTATTTACGCTGGATGATATCGATTCTGCTTTCCAAGTTGTAACCAGGCTAACATACTCCCAGAATCATCATCTTTCAGGTGCCTTTTCTTGGGCTTGTTTATTGTTAGATCAAAGCTAATCGATCCCTCTTTCTCTTGGTTGCCTAGTGATTGTATATGGTCTATACTGATGACCCACCTAATGTATCTTCGAAAATCATTCAGTCATTCATCTGAGAAGCACAAACATGTTATAGCATGATATGTTACAATAATGACTGTCCTTCCGGAATAAAAAGAAAAAAAGCGAGGGAACCTCAGGACACACCCATGACAACATATATGTTTTGAAATACTTTCTGTAAAGTATGTTATGATATAGGATTTAAGATATGTGTCACCTATGGGAAATTAAAGAAGCCAATGACGTGCTTCCAAATATGCTACATTTTATAACAAAAGCTTATAAAAGCAATGGAAAACCAAATATAAGTATTTTTATATTATTCAATGTCTTAAAAGAAAAGGAAATTCAGAATAAAATAAGTCCTACCATTGATATGCTTAAGTTTTTAATATTTGAACTCCTTATCGCCCCTCCTTTTGTATACATTTTTTGTTCAAAGCTTTTGTCTCTTGTTCACAGAAATTCCTATGCTCTCTTATTTTGTTGTATCCTCATGCTCGAGCTTGTTTCATTGTGTTGAAATAAAAAGAAATTATAAAAACAATTAACAATTGAGTTTGAATGTCCAATTAGCGGTAAAGTGTTAAGCTATCTAACTTATATTATATGAAGCATGGCCTACTTCTTAGACTGAATTGTTCATTCTCCATGGTTGTTCATGTAGCAATGTGATAATTTGTTAAAATTTTATTGATGAAAAAACCGTCTTTCTTTGAGAATAAATGAAAGCAATAAAAAGGAG
AGCCTCAATACGAAGGAGCAAAATCAATGGGCTAATAGAAACGACAAAGCTACAAAAAGAAGCTCCAATTTAAAATATATAAATTTCATTTTGCTAGATGAGAAAAAGCTTTCATTAAAAACAATAAAAGAATATGGAGGAAGCCAGAAAGCTTAACAAAACAATGGCTAACTTGAAACCACTCAATATTTTCAACAAAAGAAGTAGATATCTCAAAATAAAGGAAAAGTAAAGTCCACAAGCCAATAATGTACTGTTTTGCACTCAGGAAGAAGGTTAGGACTTACTATTTGTTTACCATTACTTAGGTTGTACATAACTATTTTCCCCCTTTTTTCTCTTCTCTGATACTTTATGACAGGCAAAGGAGAGGGAATAGTTATTGCACCTCATGTGGCTGGGCATTTATTGGGGGGAACCCTATGGAAGATAACTAAGGATGGAGAAGATGTTATATATGCTGTTGATTTTAACCACCGCAAGGAAAGGTATGGTAACCGTTATTACAAGCACCAACACCTGCGTTAAGACTACTCACACAAAAGTGTAATACACAATGCTTATTTAACTTACTATCAGGCATCTGAATGGAACCATTCTAGAGTCATTTGTGCGACCTGCTGTATTGATAACCGATGCTTATAATGCTCTAAATAATCAGCCTTACAGGCGTCAGAAGGACAAAGAATTTGGAGGTACTTAAAATCCTTCCAGAGATTCCATCTTTTGCTTCATTCCCCCCTCCCATGCTCTACTTGTAGCATAAAAGTATATTCTAAAAGTCCTTTGTTATGGATGCTTAGTCGTTTTTGCCAAGAATTGTTGATGGAAGTTATCATTTTGATTAACATAGATACTATTCAGAAGACCTTAAGAGCTAATGGAAATGTCTTACTTCCTGTTGATACTGCTGGGCGAGTGTTGGAGCTTATTCAAATTTTAGAATGGGTTAGTTGGTGAAAGACAAAACTTTTGTCTCCTCTTTTTTTGTTTAAATTTTTTTATGTCGTGTTAATTAAGCTATAATATTTAGTCTCATTGTTATTATTGCAGTACTGGGAAGAGGAAAGTTTAAATTTTCCCATTTTCTTTTTAACTTACGTCGCATCTAGCACAATTGATTATATCAAGAGTTTCCTAGAGTGGATGAGTGATTCAATAGCGAAGTCTTTTGAACACACACGGAACAACGCCTTTCTTCTCAAGTAAGTATTCTTCCAAGTATTACATTTTTTTTAATCTTTTGATAGGAAACAGGGGGTAGTTACCGGTCAATCCCATTGCCTTGAGGGTTCAAGTGCTACATATTTCACACTCGATAGCATGCTTTATTTACCTTTGATTAAAGAAATTGAGATAGACTAAAATAAGTTTTAGTGCAGCAGAGTGCTGTGTGAAGTCATACAAATGGAACTAGGAACTCTTACCAGGCTGTTGTTTTAAATCTTGCAGGCATGTCACCCTTCTAATAAACAAAAGTGAACTTGATAATGCTCCAGATGGACCAAAGGTTGATTGCAACAAATTGTTAATTTTGTTATATGTTTACCTCCTTTTGCTTGAGGTTTAAATGTATTGACGTCTATATGGGGTGTCTTATAGGTTGTTCTAGCATCAATGGCTAGTTTGGAAGCTGGTTACTCACATGACATTTTTGTTGAGTGGGCAACGGATGCCAAAAATCTCGTCCTTTTTTCTGAAAGAGGCCAGGTATTCTTTAGTAGCTTGTTTACATGATTATCTAGCCAGACTTATAGTGATTTTTCATGTCTTGGTCTATTTATCTTCAGTCCTCGGAACTGAGTAACTGTTCTTCAAAATTGATTTGTTACACCTTGCATAAATGCCGTTCCATCTTTGTTTCCATTTTCTAAACATCTTCCATTTGACTAGAGTTTCTTTAATAGAAAATCAAAACACTTTTAATATTATTCGTAGTTTCTTTCTTTGTTGTTCTTGTACAAGTTCTAGCAAGTCATTGATTCATTGGTAGTCGATTA
CTCTAGGTGCACTTCGTTTCCTTGTCTTTATCATTTTCCTCCTCGAAAAATTCCTTTGTGGCTGATGACTTAGATTTGTTAGGGAGCTTTTTTCCTTTTCATTCGATTTCCATCGTCCTTTGTTTGATAGGAAAATGACGGATGTCACAATCATTCTATCTTTGATAGAAGAGTTTGATTTTAGGACCGGGAGAAGGAATGTGTGTACTTGGAGCCCTAACCACCTCTTGCCTATGTATTATTTCATTTTTTCTCAATGAAAAGTTGTTTCTCTTAAAAAATGTATACTCTTGGTGCCATGGGCGTGTCTTGCTATGACTGTGCTTGGAAGTTGGAAAAGTAATTTTTTCTCCTTTCTTTCTTTCTCCCCCTTCTGCATATACCTGATCAACTGCATGCTTCTTTGAACTCTTATAATTTCTACACTTATATTGTGAATGTGTAACCCCCATCAGTTGGAAAATGTAGTTTGGAACTTTGGCCCGCATGCTTCAAGCAGATCCACCTCCCAAAGCTGTTAAGGTAACTGTGTCTAAGAGAGTCCCTTTGACCGGAGATGAGCTCGTTGCTTATGAAGAAGAGCAAAACAGGAAAAAGGAAGAAGCTCTTAAGGCTAGTTTGCTTAAGGAGGAACAATCTAAAGCATCACATGGAACTGATAACGATATTGGTGATCCAATGATCATTGATGCTAGCAGTAATGTAGCACCAGATGGTATGATGATTGATAACTGCAGTTTTCTAGACACGTGCAAACTTTTTTTAGCAAGGAAATTATGAGCATCCAAACTTTTTAACTTGAATTGGAAAAGACATTTTAAGTGAAAATAGGATTTGGTTGCATAAATATTAGCAAAATGTTTTCCCAAACTCTATCATGGAAACATAGAACACGAAGTGTACACTGATGTGCCAGGTTTTAAGATGGTTTAGAAATGCTATAAATATCCCCATTGCTTGGGCACCACCAGTTATTTTAGTTGAAGGTGTCACTATCTTTTCTTAGTCCTAAGAATATACATGCATTTCAGTGACACTAGAAATGAGGATCCTAACCTGCAATCATCAAATGGTCTGTGATTTATGTTTTGCTTTGAATCAATTTCCATTCATTTACACTTTACTGTAAAATCTCCCTTCATTTCTTTGTATTTGGCTAATGGACATCTAGTGGTTTAAACTGTGAAAAATTCCCCATACATGCCTATGATTGTGTTTTTTTATTGGAATTGTTTTAATGTACGTTTTAATAATTGTTAGTGGAATTGATTGGTTGCTGATTGTATCTTGCAGTAGTTGGTTCACATGGAGGTGCATACCGAGACATATTTATTGATGGTTTTGTTCCTCCTTCAACAAGCGTTTCTCCAATGTTTCCCTTTTATGAAAACACTTCCGCATGGGATGATTTTGGTGAAGTAATCAATCCTGATGATTATGTAATTAAGGATGAAGACATGGACCAATCAGCGCCGCATGTAAGATTCTTTGATGTAATTATCTGCATTTAATTCCTGGACACGTGTAAGATTGATTGAAATGAAATGTTAAATTGGTTGATAGGTTGTGGGTTGAAATAGAAAGCCTTTCTTGTGGTAAAATTTAGGAAACCGATTTGTTTTATTTATTTATTATTATTATTTTTTTGAGAATAGAAACCATTTTGTTCAATAACATTTGGAATGTAAGATCCTGCTTGGGCTGGTTTTAGGAATAGGATATATATATAGTTAGTTTTGAGATTTGTCAATATTATTAATGTTGATGAGTAACCAACATTGAGAGGAATGAAAAGTAACAGCTAAGGACTGCTATCAAATTCCCTGGGAACAGCCGTGTGAAGATCAAAATGCATCCATGTAGCCAATTCATATAGTTGGGACACACAGTCAAATGGACTTTTATGCCCCTCGTCATTTTTTGTTGTGCATGGATGATTGCTAATTTACTAGAATCATACTTACAGGGTGGTGTGGACGTGGATGGAAAACTAGATGAAA
CTGCTGCTAACTTGATTCTGGATATGAAGCCTTCAAAAGTTGTATCTAACGAATTGACAGTAAGTTGATCTCCTGTTAAAGTTCTGTTTAAGGTTTATATTGGTCGACTGAATTCACATGAATGTGTATGTAATGTTATTGTCTCCAGTATATGTATATAAAGTTTATTTTAAGGAAAATATTTGACTTAATAAGAGAAGGTTATAATGGGTTAATAAACTATCTCTCTGATCTGTCTTTGTGTGTGTATGTGTATAAATATATATATATTTTTTTTCTGATATATCACTTTTTTTTTAAGTGCTTAATATTTATACTTTCCATTGATCATCTTTGTACCTATGTTCAATTTGATGATACAATCAAGTTATTTATCTATTGATTGGCTCAATGATTTTTTCTGTAATTTCCCTATCCAGGTCCAAGTTAAATGCTCATTGCATTACATGGATTTCGAAGGTCGTTCAGATGGGAGATCAATTAAATCAATACTCTCCCACGTTGCTCCCTTGAAGCTTGTATGGTGTTTTTTTTACAACTGCTCTTGTGTTGTGTATGAGCTTGTGCACTTGTAACTATTGATTGTAGGGTGAAAGTTCAGTTTAGCTAAGTTGGTGTATTCTGAGAAATTTCCTCTTAGCCGGGTTCTTAAATTTCCACATCCGTGTAGTTGCAATAGCTTTTTGTAGACATTGTTCAGTGTTACCACAAAAACTACTTTAATCAACGATTCAACCGTTTGATTCTATTTATTTATTTAAATTGTTGGTTGAAGGTCTTGGTGCATGGAACTGCAGAGGCCACTGAGCATCTTAAGCAACATTGCCTTAAAAATGTCTGTCCCCATGTCTATGCCCCCCAAATTGAAGAAACGATTGATGTTACTTCTGATCTGTGTGCATATAAGGTACAGTCGAGACTATTCATTTCCTTTTCAGGTTGTCATTTCTAATATTGAGACTAACAAATCTCCGATTCTTGAATTTCAGGTACAACTTTCAGAGAAGCTGATGAGCAATGTGCTGTTTAAGAAGGTAAAGTTCTTGGAGATCCATGAAATTTTCCTCCAATATTAACTTGTATTTTGTATCAGTATAAGCAGTCTTCCCTACGATGCCTGACATAATTTTGAATACAGCTAGGAGATTATGAAATCTCTTGGCTTGATGCTGACGTAGGAAAGACCGAGAATGGAACGTTGTCTTTACTTCCCCTCTCAAAGGCGGCTTTGCCTCATAAATCTGTTCTTGTTGGGGATCTAAAAATGGCTGACTTCAAACAATTTCTTTCCAGCAAGGGAATACAGGTATCTTTGCCTGTGAATCCCTCCCTTGTCAATGGGAACGAATATCTCCTTTTGACATGTATAGACACTAAAGCTACAATAGAGCGCTAAAAGTGAACGATATGATCGTCCTATATGCATTGCAGGTTGAATTTGCTGGGGGTGCTTTGAGATGTGGCGAGTATGTTACCCTACGCAAGGTTTCAGATGCAAGTCAGAAGGTGAGAAATTCTAAAAAAAGGGGGAAATAGATAAGTTGAATATGGCTCTTAATGTTGCTTATCGAATTTGAACCATGTTTCAGCTTCATACCACAAATGCATGCATATTTGCTTGTGATGGTGTTGTTTGGCTAGTACCATTTTCATGGTTTGCTTCTAAGCATAAACATAGTATATTCAATTTATGCAGGGTGGTGGTTCTGGTACTCAACAAGTTGTCATCGAAGGGCCCTTATGTGAAGATTATTACAAAATTCGGGAGCTTTTGTATTCACAATTTTATTTGCTATAGTTGAGATGGGTCGTTGGACATGAAAGAGATCCAGCAATTTTAATGATAAACTGCCTACTCCAAAATGTATTAATCTTTTTTGTTCCTGAAATGTGTTCGAGTTATGTTTCACGGCTGAAATTAGAACAAAATTTCCCTTTTTTTTTTCCCATTGTAGCATATGAAACTGAAAAATGTAAGCTAACTCATGCTTCTATGGTTG
CAACACCATGGTAACCATAAGTTGATTGGTCTATGAAGATGCAGCCATAGCAAACCTTAACTTCGGTCAGGGAACTCTCAAAACCTATCGGATCGATTCTGTGGAGGTTTATTTCCCAGGGAATATGTTACTAGCGAAGCGTTTAGAATTATTGTTCATGTGCGTAAAATGTTTCTTTGAAATATATTCTAGATAGGCTGTAAAATGTTTCTTTGAAATATATTCTAGATGGGCTCTAATTTTTTTGGTATGGAAATGACTTCTAGCATAAGAATGAGTTGAGATTCCACTCGGTAAGGTTGCCATTTCTTATTACCTGAAATTTGTAATTATTAGAAATTTATTTTGGTAGAATTGTTGTAACTAATAAGAGTGAAATTTGAATTTTTTTTTCTCTATTTATGTAAAGTAATTTAATTAGTTAAAATATTATATTTTATTAAAAGCTCCACAATTTTCCTCTTCCACGTCGAAATAAAAAAATTTTATATAGTGAAACCATGAAACGTACAGACATATTCGTTCGGAGGGAGAGATAAGTAAAAAAAGATGTCATCCTTTTCTTCAATCCTTTTAACCCCAAAAACATTATTCTTGATGCTCTTCTGTAATTCAACCTCAATTTTTCTTTTCTCAATTTGAAGCGAGAATTATCCGCCATAGCTTCCTGAATCATAGCTCCTGTTGATTCTTCATTTTATTGTAAGATACTCAATCTGCTATTGGAGTAGGATTTTTCATGGCTTCCACTCCTCTTACGTGTTCACCGAGCTCTCTCCAGCTTAGGCTCGCTCTGAATAGCAAGAATTGCGGCAAATTCCCCTCAGTTCTTGTTCGGGCGAGAGTGACGAAATTGGATCCTTGGCTCCGCGTGATCTCCCGCCCTATTGTTCATAATGGCGTGATAATCGAGAGAGAAAATGGACTGCGTCGCAGTGGAGTTTGTTTTGCTGAGTCGGAGTCGACTACTGATGGATTCTCTGGGTGGTCGGAATCTGATTCCGGGGAGGAGGTTTTGGACTTACGGAGAAAGAAGTGGTTTGGAGGTACTTACTGCACTTCTTTAAACTTCCTTATGATTTTTCTTGGTGGTCTGTTTGATTGCTGAGCATAAAGGGAAACGAGAATCTAAATATCCGGTGGAGCTGTTATTTTCTATTTTCTTGGGCTTCTAGATATGATCCGCTTTGGCCCTGTTACGTATAGTCATTAACCCATTAACCTCACGGTTTTAGAACTTGTGGAGAGGTTTCTATTCCTTATATTTGTTTCCCTCTCCAACCAATATGAGATCTCACAATTCACCCTTCGGGGTCCAATATCCTCCAGTGTCAGGCCCCGATACCATTTATAACCGCACAAGCCCACCGCTAACAGATATTGTCCACTTGGCCTGTTACGTATTGTCGTCAATCTCATGGTTTTAAAACGCGCCTTTAGGAAGAGGTTTCTACACTTCTGTAAAAAAAATATTTCGTTCTTGCCTCCAATATGAGAACATGGGTTTGAATTATTGGAGAAATGCTTCTCGCCCCATGTTGAATTGCATTGCCATTGCCGTTATTTAGTTCTATATGATAAAACACTAATGCTGTAATGACCATGACATGGTAGATTTCAATTCCATGTGGAAGGTCCATTTGTAGTTGAAACGTTGATTGGTTATGCGTATAATGGTGTTTAGTGGCATCATTTAGTTCCTGGAAACAATTTAAGCCTTTTCATTTTGAGATTTTGACATTAATGTAGCTTTTGAAGTTTACCTTATGTGCTCGAGTACTCTTTTAAGTAATGAGTAGTAATGGTTTTGAAGTAGTTTCTTTCAGGGTTCGTGGGGATTGGAGTTACTGGATTCATCCTTGTCTCAGGAATCACCTTTGCAGCATGGTCAATAAACAAGCAGAATGGTTCCAGTAAGTAGTTTTCTTGTATACTTGCAAACTGTTCTTTGGCTTTGTTTCAATGCTCATGTTTATTGCCCCCAACTCTTTATAATTGAC
CTGTGTTCTTATCTGACTACTGCGCGTATATGTCTATGACAATTCATAGGACAAAAGCCGCAAATGGAGGCCTTAAGTATGCAGCAAGAATTATTGTTGGATTCTGATACTGGACATGATAGCCTTGATGAAGATGAAAAAGAAGATAGCAGTATGAATGCAGATGATGGAACTCTCGCTGGTAATTCAGGTAATCAAGAGGAATCTTCTTCATATACAGAAAATGATGTTGAATTCTCATCCAGCATTAGCAATAATGATGTCAATAATGTTGCTTCCTTGCTAGAAGATTTTCGATCTGATTCCTCGTTAGCTGTTACATTAGTTGCTTCCAGAAGTTTGTGTTCCTCGATCTCACCCGAATTTGAGATTGATGCTAAAGTAGCTTCTTGTTTTAAAGAAGTTAACGACTGTCATGAACCTGAAATGAATATATTAAAAGATGAACGAAATATTAAAACTGATATTCGGGATGAAACCGCTGACACGAGTGAAAATTATGATTTCAGCTCTAAGAGCTTGCCAGTGTATGATGATAGTTCATCAAACTATAATTCTGGCAACCAGGATGAGACATTTGGTCCTCCTGTAAATGAAATTACAGATTCTTCATTGCAAGAAATTTCTAGCATATGTAGCTACACAAAAGCCAAGGATTGTGGATTATTTGACAAAGAGACTGTGGCTGAATCACCCAAAGGATTTGAGCCCCACAAAAACTGCATCAACCATAGAACAGCAAATAGGAAGAGGATTATTTGAAGCGGCATTAGTCTGTATCACAGCTTATCCATTGGCAGATGTTCAAGTGAAAAATCATGAAACTATGATGAATAGTACTGCTGCCAAACCAGAACTACAAGGGATTTTATTTTCTTCTGCAGGTTTTCCTGCTCCTATGGTGTCTGCAGCTGTTAAAACACTTCCTGGCAAGGTCCTAGTTCCTGCAGCAGTGGATCAGGTTCAGGGGCAGGCATTGGCAGCACTGCAAGGTTTAAAGGTACTGTGTGATGCCTCTCCAAATGCTTCCTTCTGATTAACACACAGCTACTGGTTCAGATACTCAAGATAATGTATTTTAAAGATTGTAAATTCTTCAGGTGATAGAGGCTGACGTCGAACCTAGCGATCTATGTACTCGTCGTGAATATGCTCGTTGGTTGGTGTCTGCAAGCAGTGTTCTTTCAAGGTAATTATCATTGCAGCTTTAGGGTATTGAAATGGTGAACTTTTTTAGTTATGGTTTTCCTGAATATATCTTATGATACAATGGAATCGTTTTAATTCCTAGGAACACAACATTCAAAGTATACCAAGCAATGTATATACAGAATATTACTGAGCTTGCTTTTGATGATATTACACCAGAAGACCCTGATTTCGCATCTATTCAAGGTCAGTAGGAATTAGGATACTTGACTCATGGGATATGAGCTTTTTCTATTACACCTTCTGATTCACCTATATATTCCAGGTTTAGCAGAAGCTGGACTGATTTCAAGCAAGCTTTCGAGACATGATATTTCTTCGACGTTGGACGAAGACCAGGGTCCTTTTCATTTCTCTCCCGAAAGGTTATTCACCATTAATGATATTTCTTCGACGTTGGACTTCTTAACATATTCAATTGCATTGACTGGCCTTCTTTGATTTGTGGTTTCTTTCTCATCAACCATAATACGAGCTAAATTGCTTCTGCTGTTTATATTATGTCCAAAACTGCAGCCCTTTATCACGTCAAGATCTTGTGAGTTGGAAGATGGCCCTTGAAAAAAGACTGCTGCCAGAGGCAGACGGAAAGGTCAAATTATTTGAAGCTTGACTTCGACATCTTCACTTTTGTATATAAGAATCATCTTTGTGAGAGTTTGGCTTGACTGATATGCAGATGCTCCGCCAAGTTTCTGGATTTATTGATACTGATAAGATCCATCCAGATGCTTGTCCTGCACTTGTTGCTGATCTTTCTGCTGGAGAACAGGGAATAATAGCTCTTGCAT
TTGGTGCGTGCATAAAATTGCAATTCGCAAAAACATTCTCCTTAAGAAATTCATGTCAAAACTAACTTTGCTTCTTAAAGGATATACAAGGCTGTTCCAGCCGGATAAGCCAGTAACAAAAGCACAAGCTGCCATTGCTCTTGCAACAGGGGAGGCTTCTGATATGGTAAGCGAGGAGCTTGCAAGGATTGAAGCTGAATCAATGGCGGAAAATGCTGTTGCTGCACATAGTGCGTTGGTAGCTCAAGTTGAGAAAGATATTAATGGCAGCTTCGAGAAAGAACTTTCCATTGAAAGAGAAAAGGTTGAGGCTGTGGAGAAAATGGCAGAAGAGACAAAGCAAGAATTGGAAAGATTAAGATCCGAAAGAGAGAGAGATAATATCAGCTTGATGAGGGAACAAGCTGCTATTGAATCGGAAATGGAAGTTCTTTCAAAGTTAAGATATGAGTTGGAGGAACAGTTGCAAGGCCTCATGAGTAACAAAGTAGAGGTTTCCCATGAAAAGGAAAGAATCAACAAACTCAGGAAAGAAGCTGAAATTGAAAACCAGGAAATTTCCCGCTTGCAGTATGAGCTTGAAGTTGAGAGGAAGGCGTTGTCCATAGCCAGGTAAGGTTTTTGCAGCTCTTTTCTTGTTTGAGGCTGACACTGGACATGAAATGATTGGGTAAATAGCCCGTGATTTGTTTGTTTATCTTAATCCTTTGCCTCTGTTCATGTGAAATTTGAAGCATTTTTCTGCCATCAATGAAGATCTTCACCGTGCGAAGGAATCTGGTTTTACTTTCATCACCATGTCTTTTTGTTTCATCACGCAGAGCTTGGGCAGAGGATGAAGCAAAAAGAGCAAGAGAACAAGCAAAAGCACTTGAAGAGGCTAGAGATAGCTGGGAAAGGCGTGGCATCAAAGTAATGGTGGACAGCGATCTCCGTGAACAGGAATCAGCTGGCGATACTTGGCTCGATTCTAGCGAACAGTTTGCAGTCGAAGAAACTGTGGAAAGGGCCGAGAACTTATTGGCCAAGCTGAAAGGAATGGCTAGAGAAGTAGGAGGGAAATGCAGGGACATAATTGAGAAGATCATCCAGAAGATAGCATTAGTAGTATCAAACTTGAGACAATGGATTTCGAATGCTGGAGAACAGGGTGAAGATCTTAAGAATGTGGTCATTTCAAGGGCAAGTAGATCAGCAACTGAGGTGCAACAGAGCATTGCAGAGTTGAGGTTGGCCGTGAAGGTGGGAGCTAAGCGAGTTGTGGGAGATTGTAGGGAAGGAGTAGAGAAAATTACCCAAAAGTTTAGAACATAGAATGGTTAAGAAACTATTGATTTAGCAGCTGAGTGATGTAGTCGAATTCTTAAGGGCCGAACATCAACTTTTACAGATATCGATGGAATTCACACACTCGAACTCAATTAAATTCTAGACAACCCTTTATATAGGCATAAAAGTATTAATGTTATACCTGTATCAAGATGATTCATCTATCAAAATAGAAGATAAATCGAAGAGGATATGATCCAGATGAATCAAATGAACGAGTTATGAACCAAATGAACGATCCTGTTCATCTTCTACTGCTGGTTTAATCCCGTTCATCTTGTATGATTGTTTAATTTGGGCTTTTATCTTTTCTATTGGGCTCAATTTCATAGTTTCTAACAGAAGCCCATTAACGTTTGTGGAAGAAAGCCCAATAAAGCCCATTAACGTTTAGAAACCATCACCTGTCAATATTTTCTTATAAGCCGAAGCTCCCGAAACCCTAGTGAACCCTTTCGCTCGCCTCCGATCATTTGGTTAAAGTTTATCCGAGAAGGTACGTCGAGAATCCTTTCATGATCTTCGTTTCTTTTGCTTCTCCTTCTCCATCTTTGATGTACTATAATTGAACACAATATTTTCGAGTTTGGTTATTTGCTCCAGATTTTGATTCTTCTACTGACGTTTACTATGCATTCCCCTGGATTAAACGAGTAGTTCATAGTACTTTT
TTGTTTTCGGGTGAATAAGCAGTCGAGTTTTCTGTTATCGTTACTGTAGATGAACTTCTGATTCGCAGTATATTTGTTTTGTTATCGTATAAATGTATTTATTTTCTGGACAGGCGATGGCTCCGAAGCAGCCGAATACTGGTCTCTTCGTGGGACTTAACAAAGGACACATTGTTACGAAGAAAGAGTTGGCTCCCCGCCCCTCAGATCGTAAAGGAGTGAGTTAATTTTGTGTTTTGTTCAATGTACATTTTGTAATGATAGACATTCGTTAAATTTTTATTGAGTTGATTGTATGCATCTCTTTACTTATCTGGGATTTTTGTGGAGGATGAAGGTGGGGTTTACCTGTTGCGAAATGAAACTTGAGAGGGGTTATTTAGTAGGACTTTTTTTTATAAAAAAAAAAAAATATTTATCGTTTGTAGTTATTACTTTACAAAAGCTTAAAAGCAAGAGGATAGATTTTCATCATATATTTGAAGTCTTATCGAAGACCTGTTTAGTTCATATCAGTATCATTCAACGATAGTTTAGCTGCTACCTTATGGACACTTGCGCTCAATTAGATTGATGATGTTGGAAATTAGCAAAGAAGTGCACTTGTGGGGCTCTTGAGCATATGCATATTGTCAATTTGCTTCTTTATCATACAAATTTAAAGTGGTTTTGCTCTGGATTATTGGTTTTTTTTCAGTTATTTTAGTTGGTAGATACAAGCAATTGACTTTCATGATTTATGTTTGCCTGATTTTGATTGATACCATAGAAAAGTAGCAAAAGAGTTCTCTTCGTGAGGAATTTGATCCGGGAAGTTGCTGGGTTTGCACCATATGAGAAGAGAATCACTGAGCTTCTTAAAGTTGGGAAGGACAAGAGAGCACTGAAAGTGGCCAAGAGAAAGTTGGGAACTCACAAGAGAGCAAAGAAGAAGCGAGAGGAGATGTCCAGCGTTCTCCGCAAAATGAGGTAATTCTTAGAAAGGCTTTCAATTTCCCTTCTTATCATTGAATTTAATCGTCATTCAATTTTGACATGTTTTCAATGATTACACTTAATAAAAAAACACATTTCTAATGGGATTTTTTTCTCTGAATTAATTTTTGATACAGAGCTGGTGGAGGCGGCGAGAAGAAGAAATAAACTTGTGTGGTCAATTTTGAAGCTTCAAGTTCCATTTTGTTAGAGACATGTTCTAGATTGTTATTGTTGTTACCCAGAATGAAATTCAGTGCATTTTTTAATCTTGTGTTAGGTGAACTATGACAGGTATTAGAACGTATATTTTAATTTCTTTAAGTCTTCTTTTTTTAATGAACATTACAGCCATTTGTATCCCTTTTGATACATTTTCATTAGCTTGGAAGGAATGCAGTGTTTCATTAGCTTGAAATTATTGCACATGTGATTTTCTGACCAAAGGCACATGCTGAACACAGGCATTAGTTGAACTTGAATGTTGGAAGCTGAGAGATCTAAATTAAGTGGTAGCTTAAGT

mRNA sequence

ATGCCATTTTATAGCGAGGAAGGCAAAGGAGAGGGAATAGTTATTGCACCTCATGTGGCTGGGCATTTATTGGGGGGAACCCTATGGAAGATAACTAAGGATGGAGAAGATGTTATATATGCTGTTGATTTTAACCACCGCAAGGAAAGGCATCTGAATGGAACCATTCTAGAGTCATTTGTGCGACCTGCTGTATTGATAACCGATGCTTATAATGCTCTAAATAATCAGCCTTACAGGCGTCAGAAGGACAAAGAATTTGGAGATACTATTCAGAAGACCTTAAGAGCTAATGGAAATGTCTTACTTCCTGTTGATACTGCTGGGCGAGTGTTGGAGCTTATTCAAATTTTAGAATGGGAAACAGGGGGTAGTTACCGGCATGTCACCCTTCTAATAAACAAAAGTGAACTTGATAATGCTCCAGATGGACCAAAGTTTGGAACTTTGGCCCGCATGCTTCAAGCAGATCCACCTCCCAAAGCTGTTAAGGTAACTGTGTCTAAGAGAGTCCCTTTGACCGGAGATGAGCTCGTTGCTTATGAAGAAGAGCAAAACAGGAAAAAGGAAGAAGCTCTTAAGGCTAGTTTGCTTAAGGAGGAACAATCTAAAGCATCACATGGAACTGATAACGATATTGGTGATCCAATGATCATTGATGCTAGCAGTAATGTAGCACCAGATGTAGTTGGTTCACATGGAGGTGCATACCGAGACATATTTATTGATGGTTTTGTTCCTCCTTCAACAAGCGTTTCTCCAATGTTTCCCTTTTATGAAAACACTTCCGCATGGGATGATTTTGGTGAAGTAATCAATCCTGATGATTATGTAATTAAGGATGAAGACATGGACCAATCAGCGCCGCATGGTGGTGTGGACGTGGATGGAAAACTAGATGAAACTGCTGCTAACTTGATTCTGGATATGAAGCCTTCAAAAGTTGTATCTAACGAATTGACAGTCTTGGTGCATGGAACTGCAGAGGCCACTGAGCATCTTAAGCAACATTGCCTTAAAAATGTACAACTTTCAGAGAAGCTGATGAGCAATGTGCTGTTTAAGAAGGTTGAATTTGCTGGGGGTGCTTTGAGATGTGGCGAGTATGTTACCCTACGCAAGGTTTCAGATGCAAGTCAGAAGGGTGGTGGTTCTGATACTCAATCTGCTATTGGAGTAGGATTTTTCATGGCTTCCACTCCTCTTACGTGTTCACCGAGCTCTCTCCAGCTTAGGCTCGCTCTGAATAGCAAGAATTGCGGCAAATTCCCCTCAGTTCTTGTTCGGGCGAGAGTGACGAAATTGGATCCTTGGCTCCGCGTGATCTCCCGCCCTATTGTTCATAATGGCGTGATAATCGAGAGAGAAAATGGACTGCGTCGCAGTGGAGTTTGTTTTGCTGAGTCGGAGTCGACTACTGATGGATTCTCTGGGTGGTCGGAATCTGATTCCGGGGAGGAGGTTTTGGACTTACGGAGAAAGAAGTGGTTTGGAGGGTTCGTGGGGATTGGAGTTACTGGATTCATCCTTGTCTCAGGAATCACCTTTGCAGCATGGTCAATAAACAAGCAGAATGGTTCCAGACAAAAGCCGCAAATGGAGGCCTTAAGTATGCAGCAAGAATTATTGTTGGATTCTGATACTGGACATGATAGCCTTGATGAAGATGAAAAAGAAGATAGCAGTATGAATGCAGATGATGGAACTCTCGCTGGTTTTCCTGCTCCTATGGTGTCTGCAGCTGTTAAAACACTTCCTGGCAAGGTCCTAGTTCCTGCAGCAGTGGATCAGGTTCAGGGGCAGGCATTGGCAGCACTGCAAGGTTTAAAGGTGATAGAGGCTGACGTCGAACCTAGCGATCTATGTACTCGTCGTGAATATGCTCGTTGGTTGGTGTCTGCAAGCAGTGTTCTTTCAAGGAACACAACATTCAAAGTATACCAAGCAATGTATATACAGAATATTACTGAGCTTGCTTTTGATGATATTACACCAGAAGA
CCCTGATTTCGCATCTATTCAAGGTTTAGCAGAAGCTGGACTGATTTCAAGCAAGCTTTCGAGACATGATATTTCTTCGACGTTGGACGAAGACCAGGGGGAGGCTTCTGATATGGTAAGCGAGGAGCTTGCAAGGATTGAAGCTGAATCAATGGCGGAAAATGCTGTTGCTGCACATAGTGCGTTGGTAGCTCAAGTTGAGAAAGATATTAATGGCAGCTTCGAGAAAGAACTTTCCATTGAAAGAGAAAAGGTTGAGGCTGTGGAGAAAATGGCAGAAGAGACAAAGCAAGAATTGGAAAGATTAAGATCCGAAAGAGAGAGAGATAATATCAGCTTGATGAGGGAACAAGCTGCTATTGAATCGGAAATGGAAGTTCTTTCAAAGTTAAGATATGAGTTGGAGGAACAGTTGCAAGGCCTCATGAGTAACAAAGTAGAGGTTTCCCATGAAAAGGAAAGAATCAACAAACTCAGGAAAGAAGCTGAAATTGAAAACCAGGAAATTTCCCGCTTGCAGTATGAGCTTGAAGTTGAGAGGAAGGCGTTGTCCATAGCCAGAGCTTGGGCAGAGGATGAAGCAAAAAGAGCAAGAGAACAAGCAAAAGCACTTGAAGAGGCTAGAGATAGCTGGGAAAGGCGTGGCATCAAAGTAATGGTGGACAGCGATCTCCGTGAACAGGAATCAGCTGGCGATACTTGGCTCGATTCTAGCGAACAGTTTGCAGTCGAAGAAACTGTGGAAAGGGCCGAGAACTTATTGGCCAAGCTGAAAGGAATGGCTAGAGAAGTAGGAGGGAAATGCAGGGACATAATTGAGAAGATCATCCAGAAGATAGCATTAGTAGTATCAAACTTGAGACAATGGATTTCGAATGCTGGAGAACAGGGTGAAGATCTTAAGAATGTGGTCATTTCAAGGGCAAGTAGATCAGCAACTGAGGTGCAACAGAGCATTGCAGAGTTGAGGTTGGCCGTGAAGGCGATGGCTCCGAAGCAGCCGAATACTGGTCTCTTCGTGGGACTTAACAAAGGACACATTGTTACGAAGAAAGAGTTGGCTCCCCGCCCCTCAGATCGTAAAGGAAAAAGTAGCAAAAGAGTTCTCTTCGTGAGGAATTTGATCCGGGAAGTTGCTGGGTTTGCACCATATGAGAAGAGAATCACTGAGCTTCTTAAAGTTGGGAAGGACAAGAGAGCACTGAAAGTGGCCAAGAGAAAGTTGGGAACTCACAAGAGAGCAAAGAAGAAGCGAGAGGAGATGTCCAGCGTTCTCCGCAAAATGAGAGCTGGTGGAGGCGGCGAGAAGAAGAAATAAACTTGTGTGGTCAATTTTGAAGCTTCAAGTTCCATTTTGTTAGAGACATGTTCTAGATTGTTATTGTTGTTACCCAGAATGAAATTCAGTGCATTTTTTAATCTTGTGTTAGGTGAACTATGACAGGTATTAGAACGTATATTTTAATTTCTTTAAGTCTTCTTTTTTTAATGAACATTACAGCCATTTGTATCCCTTTTGATACATTTTCATTAGCTTGGAAGGAATGCAGTGTTTCATTAGCTTGAAATTATTGCACATGTGATTTTCTGACCAAAGGCACATGCTGAACACAGGCATTAGTTGAACTTGAATGTTGGAAGCTGAGAGATCTAAATTAAGTGGTAGCTTAAGT

Coding sequence (CDS)

ATGCCATTTTATAGCGAGGAAGGCAAAGGAGAGGGAATAGTTATTGCACCTCATGTGGCTGGGCATTTATTGGGGGGAACCCTATGGAAGATAACTAAGGATGGAGAAGATGTTATATATGCTGTTGATTTTAACCACCGCAAGGAAAGGCATCTGAATGGAACCATTCTAGAGTCATTTGTGCGACCTGCTGTATTGATAACCGATGCTTATAATGCTCTAAATAATCAGCCTTACAGGCGTCAGAAGGACAAAGAATTTGGAGATACTATTCAGAAGACCTTAAGAGCTAATGGAAATGTCTTACTTCCTGTTGATACTGCTGGGCGAGTGTTGGAGCTTATTCAAATTTTAGAATGGGAAACAGGGGGTAGTTACCGGCATGTCACCCTTCTAATAAACAAAAGTGAACTTGATAATGCTCCAGATGGACCAAAGTTTGGAACTTTGGCCCGCATGCTTCAAGCAGATCCACCTCCCAAAGCTGTTAAGGTAACTGTGTCTAAGAGAGTCCCTTTGACCGGAGATGAGCTCGTTGCTTATGAAGAAGAGCAAAACAGGAAAAAGGAAGAAGCTCTTAAGGCTAGTTTGCTTAAGGAGGAACAATCTAAAGCATCACATGGAACTGATAACGATATTGGTGATCCAATGATCATTGATGCTAGCAGTAATGTAGCACCAGATGTAGTTGGTTCACATGGAGGTGCATACCGAGACATATTTATTGATGGTTTTGTTCCTCCTTCAACAAGCGTTTCTCCAATGTTTCCCTTTTATGAAAACACTTCCGCATGGGATGATTTTGGTGAAGTAATCAATCCTGATGATTATGTAATTAAGGATGAAGACATGGACCAATCAGCGCCGCATGGTGGTGTGGACGTGGATGGAAAACTAGATGAAACTGCTGCTAACTTGATTCTGGATATGAAGCCTTCAAAAGTTGTATCTAACGAATTGACAGTCTTGGTGCATGGAACTGCAGAGGCCACTGAGCATCTTAAGCAACATTGCCTTAAAAATGTACAACTTTCAGAGAAGCTGATGAGCAATGTGCTGTTTAAGAAGGTTGAATTTGCTGGGGGTGCTTTGAGATGTGGCGAGTATGTTACCCTACGCAAGGTTTCAGATGCAAGTCAGAAGGGTGGTGGTTCTGATACTCAATCTGCTATTGGAGTAGGATTTTTCATGGCTTCCACTCCTCTTACGTGTTCACCGAGCTCTCTCCAGCTTAGGCTCGCTCTGAATAGCAAGAATTGCGGCAAATTCCCCTCAGTTCTTGTTCGGGCGAGAGTGACGAAATTGGATCCTTGGCTCCGCGTGATCTCCCGCCCTATTGTTCATAATGGCGTGATAATCGAGAGAGAAAATGGACTGCGTCGCAGTGGAGTTTGTTTTGCTGAGTCGGAGTCGACTACTGATGGATTCTCTGGGTGGTCGGAATCTGATTCCGGGGAGGAGGTTTTGGACTTACGGAGAAAGAAGTGGTTTGGAGGGTTCGTGGGGATTGGAGTTACTGGATTCATCCTTGTCTCAGGAATCACCTTTGCAGCATGGTCAATAAACAAGCAGAATGGTTCCAGACAAAAGCCGCAAATGGAGGCCTTAAGTATGCAGCAAGAATTATTGTTGGATTCTGATACTGGACATGATAGCCTTGATGAAGATGAAAAAGAAGATAGCAGTATGAATGCAGATGATGGAACTCTCGCTGGTTTTCCTGCTCCTATGGTGTCTGCAGCTGTTAAAACACTTCCTGGCAAGGTCCTAGTTCCTGCAGCAGTGGATCAGGTTCAGGGGCAGGCATTGGCAGCACTGCAAGGTTTAAAGGTGATAGAGGCTGACGTCGAACCTAGCGATCTATGTACTCGTCGTGAATATGCTCGTTGGTTGGTGTCTGCAAGCAGTGTTCTTTCAAGGAACACAACATTCAAAGTATACCAAGCAATGTATATACAGAATATTACTGAGCTTGCTTTTGATGATATTACACCAGAAGA
CCCTGATTTCGCATCTATTCAAGGTTTAGCAGAAGCTGGACTGATTTCAAGCAAGCTTTCGAGACATGATATTTCTTCGACGTTGGACGAAGACCAGGGGGAGGCTTCTGATATGGTAAGCGAGGAGCTTGCAAGGATTGAAGCTGAATCAATGGCGGAAAATGCTGTTGCTGCACATAGTGCGTTGGTAGCTCAAGTTGAGAAAGATATTAATGGCAGCTTCGAGAAAGAACTTTCCATTGAAAGAGAAAAGGTTGAGGCTGTGGAGAAAATGGCAGAAGAGACAAAGCAAGAATTGGAAAGATTAAGATCCGAAAGAGAGAGAGATAATATCAGCTTGATGAGGGAACAAGCTGCTATTGAATCGGAAATGGAAGTTCTTTCAAAGTTAAGATATGAGTTGGAGGAACAGTTGCAAGGCCTCATGAGTAACAAAGTAGAGGTTTCCCATGAAAAGGAAAGAATCAACAAACTCAGGAAAGAAGCTGAAATTGAAAACCAGGAAATTTCCCGCTTGCAGTATGAGCTTGAAGTTGAGAGGAAGGCGTTGTCCATAGCCAGAGCTTGGGCAGAGGATGAAGCAAAAAGAGCAAGAGAACAAGCAAAAGCACTTGAAGAGGCTAGAGATAGCTGGGAAAGGCGTGGCATCAAAGTAATGGTGGACAGCGATCTCCGTGAACAGGAATCAGCTGGCGATACTTGGCTCGATTCTAGCGAACAGTTTGCAGTCGAAGAAACTGTGGAAAGGGCCGAGAACTTATTGGCCAAGCTGAAAGGAATGGCTAGAGAAGTAGGAGGGAAATGCAGGGACATAATTGAGAAGATCATCCAGAAGATAGCATTAGTAGTATCAAACTTGAGACAATGGATTTCGAATGCTGGAGAACAGGGTGAAGATCTTAAGAATGTGGTCATTTCAAGGGCAAGTAGATCAGCAACTGAGGTGCAACAGAGCATTGCAGAGTTGAGGTTGGCCGTGAAGGCGATGGCTCCGAAGCAGCCGAATACTGGTCTCTTCGTGGGACTTAACAAAGGACACATTGTTACGAAGAAAGAGTTGGCTCCCCGCCCCTCAGATCGTAAAGGAAAAAGTAGCAAAAGAGTTCTCTTCGTGAGGAATTTGATCCGGGAAGTTGCTGGGTTTGCACCATATGAGAAGAGAATCACTGAGCTTCTTAAAGTTGGGAAGGACAAGAGAGCACTGAAAGTGGCCAAGAGAAAGTTGGGAACTCACAAGAGAGCAAAGAAGAAGCGAGAGGAGATGTCCAGCGTTCTCCGCAAAATGAGAGCTGGTGGAGGCGGCGAGAAGAAGAAATAA

Protein sequence

MPFYSEEGKGEGIVIAPHVAGHLLGGTLWKITKDGEDVIYAVDFNHRKERHLNGTILESFVRPAVLITDAYNALNNQPYRRQKDKEFGDTIQKTLRANGNVLLPVDTAGRVLELIQILEWETGGSYRHVTLLINKSELDNAPDGPKFGTLARMLQADPPPKAVKVTVSKRVPLTGDELVAYEEEQNRKKEEALKASLLKEEQSKASHGTDNDIGDPMIIDASSNVAPDVVGSHGGAYRDIFIDGFVPPSTSVSPMFPFYENTSAWDDFGEVINPDDYVIKDEDMDQSAPHGGVDVDGKLDETAANLILDMKPSKVVSNELTVLVHGTAEATEHLKQHCLKNVQLSEKLMSNVLFKKVEFAGGALRCGEYVTLRKVSDASQKGGGSDTQSAIGVGFFMASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPWLRVISRPIVHNGVIIERENGLRRSGVCFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITFAAWSINKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKEDSSMNADDGTLAGFPAPMVSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEADVEPSDLCTRREYARWLVSASSVLSRNTTFKVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISSTLDEDQGEASDMVSEELARIEAESMAENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNISLMREQAAIESEMEVLSKLRYELEEQLQGLMSNKVEVSHEKERINKLRKEAEIENQEISRLQYELEVERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAGDTWLDSSEQFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKIALVVSNLRQWISNAGEQGEDLKNVVISRASRSATEVQQSIAELRLAVKAMAPKQPNTGLFVGLNKGHIVTKKELAPRPSDRKGKSSKRVLFVRNLIREVAGFAPYEKRITELLKVGKDKRALKVAKRKLGTHKRAKKKREEMSSVLRKMRAGGGGEKKK
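
The protein sequence above corresponds to the CDS read codon by codon. As a rough sketch of that relationship, the snippet below translates a DNA string with the standard genetic code (an assumption; the page does not name the translation table) and checks it against the first and last codons of the CDS listed above. `translate` is a hypothetical helper, not part of any tool referenced on this page.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino = CODON_TABLE[cds[i:i + 3]]
        if amino == "*":  # stop codon ends the polypeptide
            break
        protein.append(amino)
    return "".join(protein)

# First eight codons and last five codons of the CDS shown above.
print(translate("ATGCCATTTTATAGCGAGGAAGGC"))  # MPFYSEEG
print(translate("GAGAAGAAGAAATAA"))           # EKKK
```

Both fragments agree with the start (MPFYSEEG) and end (EKKK) of the listed protein sequence.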
Homology
BLAST of Cp4.1LG14g06200 vs. ExPASy Swiss-Prot
Match: Q9LKF9 (Cleavage and polyadenylation specificity factor subunit 2 OS=Arabidopsis thaliana OX=3702 GN=CPSF100 PE=1 SV=2)

HSP 1 Score: 485.7 bits (1249), Expect = 1.4e-135
Identity = 293/584 (50.17%), Positives = 335/584 (57.36%), Query Frame = 0

Query: 4   YSEEGKGEGIVIAPHVAGHLLGGTLWKITKDGEDVIYAVDFNHRKERHLNGTILESFVRP 63
           Y   GKGEGIVIAPHVAGH+LGG++W+ITKDGEDVIYAVD+NHRKERHLNGT+L+SFVRP
Sbjct: 135 YHLSGKGEGIVIAPHVAGHMLGGSIWRITKDGEDVIYAVDYNHRKERHLNGTVLQSFVRP 194

Query: 64  AVLITDAYNAL-NNQPYRRQKDKEFGDTIQKTLRANGNVLLPVDTAGRVLELIQILE--W 123
           AVLITDAY+AL  NQ  R+Q+DKEF DTI K L   GNVLLPVDTAGRVLEL+ ILE  W
Sbjct: 195 AVLITDAYHALYTNQTARQQRDKEFLDTISKHLEVGGNVLLPVDTAGRVLELLLILEQHW 254

Query: 124 ETGG------------------------------------------SYRHVTLLINKSEL 183
              G                                            RHVTLLINK++L
Sbjct: 255 SQRGFSFPIYFLTYVSSSTIDYVKSFLEWMSDSISKSFETSRDNAFLLRHVTLLINKTDL 314

Query: 184 DNAPDGPK------------------------------------FGTLARMLQADPPPKA 243
           DNAP GPK                                    FGTLARMLQ+ PPPK 
Sbjct: 315 DNAPPGPKVVLASMASLEAGFAREIFVEWANDPRNLVLFTETGQFGTLARMLQSAPPPKF 374

Query: 244 VKVTVSKRVPLTGDELVAYEEEQNR-KKEEALKASLLKEEQSKASHGTDNDIGDPMIIDA 303
           VKVT+SKRVPL G+EL+AYEEEQNR K+EEAL+ASL+KEE++KASHG+D++  +PMIID 
Sbjct: 375 VKVTMSKRVPLAGEELIAYEEEQNRLKREEALRASLVKEEETKASHGSDDNSSEPMIID- 434

Query: 304 SSNVAPDVVGSHGGAYRDIFIDGFVPPSTSVSPMFPFYENTSAWDDFGEVINPDDYVIKD 363
            +    DV+GSHG AY+DI IDGFVPPS+SV+PMFP+Y+NTS WDDFGE+INPDDYVIKD
Sbjct: 435 -TKTTHDVIGSHGPAYKDILIDGFVPPSSSVAPMFPYYDNTSEWDDFGEIINPDDYVIKD 494

Query: 364 EDMDQSAPHGGVDVDGKLDETAANLILDMKPSKVVSNEL--------------------- 392
           EDMD+ A H G DVDG+LDE  A+L+LD +PSKV+SNEL                     
Sbjct: 495 EDMDRGAMHNGGDVDGRLDEATASLMLDTRPSKVMSNELIVTVSCSLVKMDYEGRSDGRS 554
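
The percentages in each hit summary are the match counts divided by the alignment length. A small sketch of that arithmetic, applied to the summary line of the Q9LKF9 hit, is given below. `recompute_percentages` is a hypothetical helper written for this illustration and is not part of BLAST.

```python
import re

def recompute_percentages(stats_line):
    """Recompute 'NN.NN' percentages from each 'label = matches/length' pair."""
    pairs = re.findall(r"(\w+) = (\d+)/(\d+)", stats_line)
    return {label: f"{100 * int(m) / int(n):.2f}" for label, m, n in pairs}

line = "Identity = 293/584 (50.17%), Positives = 335/584 (57.36%), Query Frame = 0"
print(recompute_percentages(line))
```

The recomputed values, 50.17 and 57.36, match the figures reported for this hit.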

BLAST of Cp4.1LG14g06200 vs. ExPASy Swiss-Prot
Match: Q652P4 (Cleavage and polyadenylation specificity factor subunit 2 OS=Oryza sativa subsp. japonica OX=39947 GN=Os09g0569400 PE=2 SV=1)

HSP 1 Score: 436.4 bits (1121), Expect = 9.5e-121
Identity = 272/577 (47.14%), Positives = 311/577 (53.90%), Query Frame = 0

Query: 9   KGEGIVIAPHVAGHLLGGTLWKITKDGEDVIYAVDFNHRKERHLNGTILESFVRPAVLIT 68
           KGEGIVIAPHVAGH LGGT+WKITKDGEDV+YAVDFNHRKERHLNGT L SFVRPAVLIT
Sbjct: 140 KGEGIVIAPHVAGHDLGGTVWKITKDGEDVVYAVDFNHRKERHLNGTALGSFVRPAVLIT 199

Query: 69  DAYNALNNQPYRRQKDKEFGDTIQKTLRANGNVLLPVDTAGRVLELIQI----------- 128
           DAYNALNN  Y+RQ+D++F D + K L   G+VLLP+DTAGRVLE++ I           
Sbjct: 200 DAYNALNNHVYKRQQDQDFIDALVKVLTGGGSVLLPIDTAGRVLEILLILEQYWAQRHLI 259

Query: 129 --------------------LEW---ETGGSYRH----------VTLLINKSELDNAPDG 188
                               LEW       S+ H          VT +INK EL+   D 
Sbjct: 260 YPIYFLTNVSTSTVDYVKSFLEWMNDSISKSFEHTRDNAFLLKCVTQIINKDELEKLGDA 319

Query: 189 PK------------------------------------FGTLARMLQADPPPKAVKVTVS 248
           PK                                    FGTLARMLQ DPPPKAVKVT+S
Sbjct: 320 PKVVLASMASLEVGFSHDIFVDMANEAKNLVLFTEKGQFGTLARMLQVDPPPKAVKVTMS 379

Query: 249 KRVPLTGDELVAYEEEQNR-KKEEALKASLLKEEQSKASHGTDNDIGDPMIIDASSNVAP 308
           KR+PL GDEL AYEEEQ R KKEEALKASL KEE+ KAS G++    DPM+IDAS++  P
Sbjct: 380 KRIPLVGDELKAYEEEQERIKKEEALKASLNKEEEKKASLGSNAKASDPMVIDASTSRKP 439

Query: 309 DVVGSHGGAYRDIFIDGFVPPSTSVSPMFPFYENTSAWDDFGEVINPDDYVIKDEDMDQS 368
              GS  G   DI IDGFVPPS+SV+PMFPF+ENTS WDDFGEVINP+DY++K E+MD +
Sbjct: 440 SNAGSKFGGNVDILIDGFVPPSSSVAPMFPFFENTSEWDDFGEVINPEDYLMKQEEMDNT 499

Query: 369 -APHGGVDVDGKLDETAANLILDMKPSKVVSNELT------------------------- 392
             P  G  +D  LDE +A L+LD  PSKV+SNE+T                         
Sbjct: 500 LMPGAGDGMDSMLDEGSARLLLDSTPSKVISNEMTVQVKCSLAYMDFEGRSDGRSVKSVI 559

BLAST of Cp4.1LG14g06200 vs. ExPASy Swiss-Prot
Match: Q9W799 (Cleavage and polyadenylation specificity factor subunit 2 OS=Xenopus laevis OX=8355 GN=cpsf2 PE=1 SV=1)

HSP 1 Score: 182.2 bits (461), Expect = 3.2e-44
Identity = 138/454 (30.40%), Positives = 204/454 (44.93%), Query Frame = 0

Query: 7   EGKGEGIVIAPHVAGHLLGGTLWKITKDG-EDVIYAVDFNHRKERHLNGTILESFVRPAV 66
           +GKG G+ I P  AGH++GGT+WKI KDG E+++YAVDFNH++E HLNG  LE   RP++
Sbjct: 138 KGKGHGLSITPLPAGHMIGGTIWKIVKDGEEEIVYAVDFNHKREIHLNGCSLEMINRPSL 197

Query: 67  LITDAYNALNNQPYRRQKDKEFGDTIQKTLRANGNVLLPVDTAGRVLELIQILE--WETG 126
           LITD++NA   QP R+Q+D++    + +TLR +GNVL+ VDTAGRVLEL Q+L+  W T 
Sbjct: 198 LITDSFNATYVQPRRKQRDEQLLTNVLETLRGDGNVLIAVDTAGRVLELAQLLDQIWRTK 257

Query: 127 GS---------------------------------------------YRHVTLLINKSEL 186
            +                                             +RH+TL    S+L
Sbjct: 258 DAGLGVYSLALLNNVSYNVVEFSKSQVEWMSDKLMRCFEDKRNNPFQFRHLTLCHGYSDL 317

Query: 187 DNAPDGPKF------------------------------------GTLARMLQADPPPKA 246
              P  PK                                     GTLAR L   P  + 
Sbjct: 318 ARVP-SPKVVLASQPDLECGFSRELFIQWCQDPKNSVILTYRTTPGTLARFLIDHPSERI 377

Query: 247 VKVTVSKRVPLTGDELVAYEEEQNRKKEEALKASLLKEEQSKASHGTDNDIGDPMIIDAS 306
           + + + KRV L G EL  Y E++  KKE A K    KE    +S   D+D+ + +    S
Sbjct: 378 IDIELRKRVKLEGKELEEYVEKEKLKKEAAKKLEQSKEADLDSS--DDSDVEEDIDQITS 437

Query: 307 SNVAPDVVGSHGGAYRDIFIDGFVPPSTSVSPMFPFYENTSAWDDFGEVINPDDYVIK-- 339
                D++  + G+ +      F   +    PMFP  E+   WD++GE+I P+D+++   
Sbjct: 438 HKAKHDLMMKNEGSRK----GSFFKQAKKSYPMFPAPEDRIKWDEYGEIIKPEDFLVPEL 497

BLAST of Cp4.1LG14g06200 vs. ExPASy Swiss-Prot
Match: Q10568 (Cleavage and polyadenylation specificity factor subunit 2 OS=Bos taurus OX=9913 GN=CPSF2 PE=1 SV=1)

HSP 1 Score: 181.4 bits (459), Expect = 5.5e-44
Identity = 140/456 (30.70%), Positives = 201/456 (44.08%), Query Frame = 0

Query: 7   EGKGEGIVIAPHVAGHLLGGTLWKITKDG-EDVIYAVDFNHRKERHLNGTILESFVRPAV 66
           +GKG G+ I P  AGH++GGT+WKI KDG E+++YAVDFNH++E HLNG  LE   RP++
Sbjct: 138 KGKGHGLSITPLPAGHMIGGTIWKIVKDGEEEIVYAVDFNHKREIHLNGCSLEMLSRPSL 197

Query: 67  LITDAYNALNNQPYRRQKDKEFGDTIQKTLRANGNVLLPVDTAGRVLELIQILE--WETG 126
           LITD++NA   QP R+Q+D++    + +TLR +GNVL+ VDTAGRVLEL Q+L+  W T 
Sbjct: 198 LITDSFNATYVQPRRKQRDEQLLTNVLETLRGDGNVLIAVDTAGRVLELAQLLDQIWRTK 257

Query: 127 GS---------------------------------------------YRHVTLLINKSEL 186
            +                                             +RH++L    S+L
Sbjct: 258 DAGLGVYSLALLNNVSYNVVEFSKSQVEWMSDKLMRCFEDKRNNPFQFRHLSLCHGLSDL 317

Query: 187 DNAPDGPKF------------------------------------GTLARMLQADPPPKA 246
              P  PK                                     GTLAR L  +P  K 
Sbjct: 318 ARVP-SPKVVLASQPDLECGFSRDLFIQWCQDPKNSIILTYRTTPGTLARFLIDNPSEKV 377

Query: 247 VKVTVSKRVPLTGDELVAYEEEQNRKKEEALKASLLKEEQSKASHGTD--NDIGDPMIID 306
            ++ + KRV L G EL  Y E++  KKE A K    KE    +S  +D   DI  P    
Sbjct: 378 TEIELRKRVKLEGKELEEYLEKEKLKKEAAKKLEQSKEADIDSSDESDAEEDIDQPSAHK 437

Query: 307 ASSNVAPDVVGSHGGAYRDIFIDGFVPPSTSVSPMFPFYENTSAWDDFGEVINPDDYVIK 339
              ++     GS  G+        F   +    PMFP  E    WD++GE+I P+D+++ 
Sbjct: 438 TKHDLMMKGEGSRKGS--------FFKQAKKSYPMFPAPEERIKWDEYGEIIKPEDFLVP 497

BLAST of Cp4.1LG14g06200 vs. ExPASy Swiss-Prot
Match: Q9P2I0 (Cleavage and polyadenylation specificity factor subunit 2 OS=Homo sapiens OX=9606 GN=CPSF2 PE=1 SV=2)

HSP 1 Score: 181.0 bits (458), Expect = 7.2e-44
Identity = 140/456 (30.70%), Positives = 201/456 (44.08%), Query Frame = 0

Query: 7   EGKGEGIVIAPHVAGHLLGGTLWKITKDG-EDVIYAVDFNHRKERHLNGTILESFVRPAV 66
           +GKG G+ I P  AGH++GGT+WKI KDG E+++YAVDFNH++E HLNG  LE   RP++
Sbjct: 138 KGKGHGLSITPLPAGHMIGGTIWKIVKDGEEEIVYAVDFNHKREIHLNGCSLEMLSRPSL 197

Query: 67  LITDAYNALNNQPYRRQKDKEFGDTIQKTLRANGNVLLPVDTAGRVLELIQILE--WETG 126
           LITD++NA   QP R+Q+D++    + +TLR +GNVL+ VDTAGRVLEL Q+L+  W T 
Sbjct: 198 LITDSFNATYVQPRRKQRDEQLLTNVLETLRGDGNVLIAVDTAGRVLELAQLLDQIWRTK 257

Query: 127 GS---------------------------------------------YRHVTLLINKSEL 186
            +                                             +RH++L    S+L
Sbjct: 258 DAGLGVYSLALLNNVSYNVVEFSKSQVEWMSDKLMRCFEDKRNNPFQFRHLSLCHGLSDL 317

Query: 187 DNAPDGPKF------------------------------------GTLARMLQADPPPKA 246
              P  PK                                     GTLAR L  +P  K 
Sbjct: 318 ARVP-SPKVVLASQPDLECGFSRDLFIQWCQDPKNSIILTYRTTPGTLARFLIDNPSEKI 377

Query: 247 VKVTVSKRVPLTGDELVAYEEEQNRKKEEALKASLLKEEQSKASHGTD--NDIGDPMIID 306
            ++ + KRV L G EL  Y E++  KKE A K    KE    +S  +D   DI  P    
Sbjct: 378 TEIELRKRVKLEGKELEEYLEKEKLKKEAAKKLEQSKEADIDSSDESDIEEDIDQPSAHK 437

Query: 307 ASSNVAPDVVGSHGGAYRDIFIDGFVPPSTSVSPMFPFYENTSAWDDFGEVINPDDYVIK 339
              ++     GS  G+        F   +    PMFP  E    WD++GE+I P+D+++ 
Sbjct: 438 TKHDLMMKGEGSRKGS--------FFKQAKKSYPMFPAPEERIKWDEYGEIIKPEDFLVP 497
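
The percentages on each HSP summary line are the match count divided by the alignment length (for the hit above, 140/456 and 201/456). As a minimal sketch, assuming the summary line keeps the layout shown on this page, the counts can be parsed and the percentages recomputed as below. The function name `parse_hsp_stats` and the returned keys are hypothetical and not part of any BLAST tool; the pattern also tolerates the label spelled without its second "i".

```python
import re

# Pattern for an HSP summary line in the layout used on this page.
# "Pos(?:i)?tives" accepts the label with or without its second "i".
HSP_STATS = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"Pos(?:i)?tives = (\d+)/(\d+) \(([\d.]+)%\)"
)


def parse_hsp_stats(line):
    """Return recomputed identity/positives percentages for one summary line.

    Hypothetical helper for illustration only.
    """
    m = HSP_STATS.search(line)
    if m is None:
        raise ValueError("not an HSP summary line")
    ident, ident_len, _, pos, pos_len, _ = m.groups()
    ident, ident_len, pos, pos_len = map(int, (ident, ident_len, pos, pos_len))
    return {
        # Percentages are recomputed from the counts rather than trusted.
        "identity_pct": round(100 * ident / ident_len, 2),
        "positives_pct": round(100 * pos / pos_len, 2),
        "alignment_length": ident_len,
    }


stats = parse_hsp_stats(
    "Identity = 140/456 (30.70%), Positives = 201/456 (44.08%), Query Frame = 0"
)
print(stats)
```

Recomputing from the counts, rather than reading the printed percentage, gives a quick consistency check when many hits are collected from pages like this one.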

BLAST of Cp4.1LG14g06200 vs. NCBI nr
Match: XP_023552429.1 (uncharacterized protein LOC111810089 isoform X1 [Cucurbita pepo subsp. pepo])

HSP 1 Score: 1060 bits (2740), Expect = 0.0
Identity = 599/701 (85.45%), Positives = 600/701 (85.59%), Query Frame = 0

Query: 397 MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPWLRVISRPIVHNGVIIERE 456
           MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPWLRVISRPIVHNGVIIERE
Sbjct: 1   MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPWLRVISRPIVHNGVIIERE 60

Query: 457 NGLRRSGVCFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF 516
           NGLRRSGVCFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF
Sbjct: 61  NGLRRSGVCFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF 120

Query: 517 AAWSINKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKEDSSMNADDGTLAG---F 576
           AAWSINKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKEDSSMNADDGTLAG   F
Sbjct: 121 AAWSINKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKEDSSMNADDGTLAGNSGF 180

Query: 577 PAPMVSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEADVEPSDLCTRREYARWLVS 636
           PAPMVSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEADVEPSDLCTRREYARWLVS
Sbjct: 181 PAPMVSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEADVEPSDLCTRREYARWLVS 240

Query: 637 ASSVLSRNTTFKVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS 696
           ASSVLSRNTTFKVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS
Sbjct: 241 ASSVLSRNTTFKVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS 300

Query: 697 TLDEDQG----------------------------------------------------- 756
           TLDEDQG                                                     
Sbjct: 301 TLDEDQGPFHFSPESPLSRQDLVSWKMALEKRLLPEADGKMLRQVSGFIDTDKIHPDACP 360

Query: 757 ------------------------------------------EASDMVSEELARIEAESM 816
                                                     EASDMVSEELARIEAESM
Sbjct: 361 ALVADLSAGEQGIIALAFGYTRLFQPDKPVTKAQAAIALATGEASDMVSEELARIEAESM 420

Query: 817 AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI 876
           AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI
Sbjct: 421 AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI 480

Query: 877 SLMREQAAIESEMEVLSKLRYELEEQLQGLMSNKVEVSHEKERINKLRKEAEIENQEISR 936
           SLMREQAAIESEMEVLSKLRYELEEQLQGLMSNKVEVSHEKERINKLRKEAEIENQEISR
Sbjct: 481 SLMREQAAIESEMEVLSKLRYELEEQLQGLMSNKVEVSHEKERINKLRKEAEIENQEISR 540

Query: 937 LQYELEVERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG 996
           LQYELEVERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG
Sbjct: 541 LQYELEVERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG 600

Query: 997 DTWLDSSEQFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKIALVVSNLRQWIS 999
           DTWLDSSEQFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKIALVVSNLRQWIS
Sbjct: 601 DTWLDSSEQFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKIALVVSNLRQWIS 660

BLAST of Cp4.1LG14g06200 vs. NCBI nr
Match: KAG7014766.1 (hypothetical protein SDJN02_22395 [Cucurbita argyrosperma subsp. argyrosperma])

HSP 1 Score: 1027 bits (2656), Expect = 0.0
Identity = 583/701 (83.17%), Positives = 590/701 (84.17%), Query Frame = 0

Query: 397 MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPWLRVISRPIVHNGVIIERE 456
           MASTPLTCSPSSLQLRLALNSKNCGKFPSV VRARVTKLDP LRVISRPIVHNGVIIERE
Sbjct: 1   MASTPLTCSPSSLQLRLALNSKNCGKFPSVHVRARVTKLDPRLRVISRPIVHNGVIIERE 60

Query: 457 NGLRRSGVCFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF 516
           NGLRRSGVCFAE ESTTDGFSGWSESDSGEE LDLRRKKWFGGFVGIGVTGFILVSGITF
Sbjct: 61  NGLRRSGVCFAELESTTDGFSGWSESDSGEEALDLRRKKWFGGFVGIGVTGFILVSGITF 120

Query: 517 AAWSINKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKEDSSMNADDGTLA---GF 576
           AAWSINKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKED+ +NADDG LA   G 
Sbjct: 121 AAWSINKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKEDNIVNADDGALADNSGV 180

Query: 577 PAPMVSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEADVEPSDLCTRREYARWLVS 636
           PAP+VSAAVKTLPG+VLVPAAVDQVQGQALAALQGLKVIEADVEPSDLCTRREYARWLVS
Sbjct: 181 PAPLVSAAVKTLPGEVLVPAAVDQVQGQALAALQGLKVIEADVEPSDLCTRREYARWLVS 240

Query: 637 ASSVLSRNTTFKVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS 696
           ASS LSRNTT KVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS
Sbjct: 241 ASSALSRNTTSKVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS 300

Query: 697 TLDEDQG----------------------------------------------------- 756
           TLDEDQG                                                     
Sbjct: 301 TLDEDQGPFHFSPESPLSRQDLVSWKMALEKRLLPEADGKMLRQVSGFIDTDKIHPDACP 360

Query: 757 ------------------------------------------EASDMVSEELARIEAESM 816
                                                     EASDMVSEELARIEAESM
Sbjct: 361 ALVADLSAGEQGIIALAFGYTRLFQPDKPVTKAQAAIALATGEASDMVSEELARIEAESM 420

Query: 817 AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI 876
           AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI
Sbjct: 421 AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI 480

Query: 877 SLMREQAAIESEMEVLSKLRYELEEQLQGLMSNKVEVSHEKERINKLRKEAEIENQEISR 936
           SLMREQAAIESEMEVLSKLRYELEEQLQGLMSNKVEVSHEKERINKLRKEAEIENQEISR
Sbjct: 481 SLMREQAAIESEMEVLSKLRYELEEQLQGLMSNKVEVSHEKERINKLRKEAEIENQEISR 540

Query: 937 LQYELEVERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG 996
           LQYELEVERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG
Sbjct: 541 LQYELEVERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG 600

Query: 997 DTWLDSSEQFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKIALVVSNLRQWIS 999
           DTWLDSSEQFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQK+A+VVSNLRQWIS
Sbjct: 601 DTWLDSSEQFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKMAIVVSNLRQWIS 660

BLAST of Cp4.1LG14g06200 vs. NCBI nr
Match: XP_022984998.1 (uncharacterized protein LOC111483097 isoform X1 [Cucurbita maxima])

HSP 1 Score: 1018 bits (2631), Expect = 0.0
Identity = 578/701 (82.45%), Positives = 589/701 (84.02%), Query Frame = 0

Query: 397 MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPWLRVISRPIVHNGVIIERE 456
           MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDP LRVISRPIVHNGVIIERE
Sbjct: 1   MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPRLRVISRPIVHNGVIIERE 60

Query: 457 NGLRRSGVCFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF 516
           NGLRRSGV FAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF
Sbjct: 61  NGLRRSGVSFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF 120

Query: 517 AAWSINKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKEDSSMNADDGTLA---GF 576
           AAWS+NKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKED+ + ADDG LA   G 
Sbjct: 121 AAWSMNKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKEDNIVKADDGALADNSGV 180

Query: 577 PAPMVSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEADVEPSDLCTRREYARWLVS 636
           PAP+VSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEAD+EPSDLCTRREYARWLVS
Sbjct: 181 PAPLVSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEADIEPSDLCTRREYARWLVS 240

Query: 637 ASSVLSRNTTFKVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS 696
           ASS LSRNTT KVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS
Sbjct: 241 ASSALSRNTTSKVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS 300

Query: 697 TLDEDQG----------------------------------------------------- 756
           TLDEDQG                                                     
Sbjct: 301 TLDEDQGSFHFSPESPLSRQDLVSWKMALEKRLLPEADGKMLRQVSGFIDTDKIHPDACP 360

Query: 757 ------------------------------------------EASDMVSEELARIEAESM 816
                                                     EASDMVSEELARIEAESM
Sbjct: 361 ALVADLSAGEQGIIALAFGYTRLFQPDKPVTKEQAAIALATGEASDMVSEELARIEAESM 420

Query: 817 AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI 876
           AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI
Sbjct: 421 AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI 480

Query: 877 SLMREQAAIESEMEVLSKLRYELEEQLQGLMSNKVEVSHEKERINKLRKEAEIENQEISR 936
           SLMREQAAIESEM VLSKLRYELEEQLQGLMSNKV+VS+EKERINKLRKEAEIENQE+SR
Sbjct: 481 SLMREQAAIESEMGVLSKLRYELEEQLQGLMSNKVQVSYEKERINKLRKEAEIENQELSR 540

Query: 937 LQYELEVERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG 996
           LQYEL+VERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG
Sbjct: 541 LQYELKVERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG 600

Query: 997 DTWLDSSEQFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKIALVVSNLRQWIS 999
           DTWLDSS+QFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKIALVVSNLRQWIS
Sbjct: 601 DTWLDSSKQFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKIALVVSNLRQWIS 660

BLAST of Cp4.1LG14g06200 vs. NCBI nr
Match: KAG6574185.1 (60S ribosomal protein L36-2, partial [Cucurbita argyrosperma subsp. sororia])

HSP 1 Score: 944 bits (2440), Expect = 0.0
Identity = 602/1074 (56.05%), Positives = 650/1074 (60.52%), Query Frame = 0

Query: 397  MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPWLRVISRPIVHNGVIIERE 456
            MASTP TCSP+SLQLRLALN KN  KFP + VRA V KLDP LRVI RPIVHN   I R 
Sbjct: 1    MASTPSTCSPNSLQLRLALNCKNSPKFPFLPVRATVRKLDPRLRVIFRPIVHNSAKIART 60

Query: 457  NGLRRSGVCFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF 516
            NGLRR+G+CFA S+S  DGFSGWSESDSGEE L+LR K W  G VGIG+TGFILVSGITF
Sbjct: 61   NGLRRNGICFAGSDSKADGFSGWSESDSGEEDLNLRTKNWLAGLVGIGITGFILVSGITF 120

Query: 517  AAWSINKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKEDSSMNADD--------- 576
            AAWSINKQN S+QK QM ALS  QELLLDSD+G+D L ED+KED+S+NADD         
Sbjct: 121  AAWSINKQNCSKQKAQMVALSTPQELLLDSDSGNDKLSEDQKEDNSVNADDEYHEEFSSY 180

Query: 577  ------------------------------------------------------------ 636
                                                                        
Sbjct: 181  TENDATLNKNRVGAVADVEELSGNDVESSSNNDNLNNVAFLQEDIQSDSSLAVTSVAYGS 240

Query: 637  ------------------------------------------------------------ 696
                                                                        
Sbjct: 241  SEIDSDVDSGFKDVNSGTEVLTSEPEMNDEPDNSAETKYDFSSEKLPVYDDSSSNYNSGY 300

Query: 697  ------------------------------------------------------------ 756
                                                                        
Sbjct: 301  QDETADPPVNETDDSSLHELGLVDKETVTESLEGVLNPGKTEQLLSEETASTIEQQIGRG 360

Query: 757  -------------------------------------GTL---AGFPAPMVSAAVKTLPG 816
                                                 G L   AG PAP+VSAAVKTLPG
Sbjct: 361  LSEAAFVSVTAYPLADDQEELNHETTMNSSAAEQELQGNLFSSAGVPAPLVSAAVKTLPG 420

Query: 817  KVLVPAAVDQVQGQALAALQGLKVIEADVEPSDLCTRREYARWLVSASSVLSRNTTFKVY 876
            KVLVPA VDQVQGQALAALQ LKVIE++VEPS LCTRREYARWLVSAS  LSRNT  KVY
Sbjct: 421  KVLVPAVVDQVQGQALAALQVLKVIESEVEPSGLCTRREYARWLVSASCALSRNTASKVY 480

Query: 877  QAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISSTLDEDQG------- 936
             AMYI+N+TELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDI S+LD+D+G       
Sbjct: 481  PAMYIENVTELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDILSSLDDDEGPFYFSPE 540

Query: 937  ------------------------------------------------------------ 996
                                                                        
Sbjct: 541  SPLSRQDLVSWKMALEKRQLPEADRKTLHQVSGFIDTDKIHPDACPALVADLSVGEHGII 600

Query: 997  ----------------------------EASDMVSEELARIEAESMAENAVAAHSALVAQ 1056
                                        EASD+VSEELARIEAESMAENAVAAHSALVAQ
Sbjct: 601  ALAFGYTRLFQPDKPVTKAQAAIALATGEASDIVSEELARIEAESMAENAVAAHSALVAQ 660

Query: 1057 VEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNISLMREQAAIESEME 1105
            VEKDIN SFEKELSIEREK +AVEKMAEE KQELERLRSERERDNI+LMRE+ AIESEME
Sbjct: 661  VEKDINASFEKELSIEREKADAVEKMAEEAKQELERLRSERERDNIALMRERTAIESEME 720

BLAST of Cp4.1LG14g06200 vs. NCBI nr
Match: XP_022140923.1 (uncharacterized protein LOC111011467 isoform X4 [Momordica charantia])

HSP 1 Score: 889 bits (2297), Expect = 1.09e-310
Identity = 504/701 (71.90%), Positives = 553/701 (78.89%), Query Frame = 0

Query: 397 MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPWLRVISRPIVHNGVIIERE 456
           MASTP TCSP SLQLRLALN KNC KFPSVLVRARV KLDP +R+   PIV+NG IIER 
Sbjct: 1   MASTPATCSPISLQLRLALNCKNCAKFPSVLVRARVRKLDPRVRMTCYPIVYNGAIIERA 60

Query: 457 NGLRRSGVCFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF 516
           NG RRSGVCFA S+ST DGFSGWSESDSGEEVLDLRRK WFGG VGIG+TGFILVSGITF
Sbjct: 61  NGQRRSGVCFARSDSTGDGFSGWSESDSGEEVLDLRRKTWFGGLVGIGITGFILVSGITF 120

Query: 517 AAWSINKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKEDSSMNADDGTLAG---F 576
           AAWSI+KQN SRQKPQMEALS QQELLLDSDTG+D L E+EKED+S+NADD TLAG    
Sbjct: 121 AAWSISKQNSSRQKPQMEALSTQQELLLDSDTGNDRLGENEKEDNSVNADDRTLAGKTGV 180

Query: 577 PAPMVSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEADVEPSDLCTRREYARWLVS 636
           PAP+ SAA+KTLPGKVLVPA VDQVQGQAL+ALQ LKVIEA+VEPSDLCTRREYARWLVS
Sbjct: 181 PAPLASAAIKTLPGKVLVPAVVDQVQGQALSALQVLKVIEAEVEPSDLCTRREYARWLVS 240

Query: 637 ASSVLSRNTTFKVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS 696
           ASS LSRNTT KVY AMYI+N+TELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDI S
Sbjct: 241 ASSALSRNTTSKVYPAMYIENVTELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDILS 300

Query: 697 TLDEDQG----------------------------------------------------- 756
           + DEDQG                                                     
Sbjct: 301 SFDEDQGPFYFSPESPLSRQDLVSWKMALEKRQLPEADRKMLHQVSGFIDTDKIHPDACP 360

Query: 757 ------------------------------------------EASDMVSEELARIEAESM 816
                                                     EASD+VSEELARIEAESM
Sbjct: 361 ALVADLSVGEHGIIALAFGYTRLFQPDKPVTKAQAAIALATGEASDIVSEELARIEAESM 420

Query: 817 AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI 876
           AENAVAAH ALVAQVEKDIN SFEK+LSIEREKV+AVEKMAEE KQELERLRSERER+N+
Sbjct: 421 AENAVAAHGALVAQVEKDINASFEKQLSIEREKVDAVEKMAEEAKQELERLRSERERENL 480

Query: 877 SLMREQAAIESEMEVLSKLRYELEEQLQGLMSNKVEVSHEKERINKLRKEAEIENQEISR 936
           +LM+E AAIESEMEV S+LR ELEEQLQGLMSNKVEVS+EKERINKLRKEAEIENQEI+R
Sbjct: 481 ALMKEHAAIESEMEVFSRLRNELEEQLQGLMSNKVEVSYEKERINKLRKEAEIENQEIAR 540

Query: 937 LQYELEVERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG 996
           LQYELEVERKALS+ARAWAEDEAKRAREQAKALEEARD WERRGIKV+VDSDLREQESAG
Sbjct: 541 LQYELEVERKALSMARAWAEDEAKRAREQAKALEEARDRWERRGIKVVVDSDLREQESAG 600

Query: 997 DTWLDSSEQFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKIALVVSNLRQWIS 999
           DTWLDSS+QF+V+ETV+RAENL+ KLK MA E+ GK ++I++KII+KIAL++SNLRQW+S
Sbjct: 601 DTWLDSSKQFSVKETVDRAENLMDKLKVMAAELRGKSKEIVDKIIEKIALLISNLRQWVS 660

BLAST of Cp4.1LG14g06200 vs. ExPASy TrEMBL
Match: A0A6J1JC97 (uncharacterized protein LOC111483097 isoform X1 OS=Cucurbita maxima OX=3661 GN=LOC111483097 PE=4 SV=1)

HSP 1 Score: 1018 bits (2631), Expect = 0.0
Identity = 578/701 (82.45%), Positives = 589/701 (84.02%), Query Frame = 0

Query: 397 MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPWLRVISRPIVHNGVIIERE 456
           MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDP LRVISRPIVHNGVIIERE
Sbjct: 1   MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPRLRVISRPIVHNGVIIERE 60

Query: 457 NGLRRSGVCFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF 516
           NGLRRSGV FAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF
Sbjct: 61  NGLRRSGVSFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF 120

Query: 517 AAWSINKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKEDSSMNADDGTLA---GF 576
           AAWS+NKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKED+ + ADDG LA   G 
Sbjct: 121 AAWSMNKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKEDNIVKADDGALADNSGV 180

Query: 577 PAPMVSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEADVEPSDLCTRREYARWLVS 636
           PAP+VSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEAD+EPSDLCTRREYARWLVS
Sbjct: 181 PAPLVSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEADIEPSDLCTRREYARWLVS 240

Query: 637 ASSVLSRNTTFKVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS 696
           ASS LSRNTT KVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS
Sbjct: 241 ASSALSRNTTSKVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS 300

Query: 697 TLDEDQG----------------------------------------------------- 756
           TLDEDQG                                                     
Sbjct: 301 TLDEDQGSFHFSPESPLSRQDLVSWKMALEKRLLPEADGKMLRQVSGFIDTDKIHPDACP 360

Query: 757 ------------------------------------------EASDMVSEELARIEAESM 816
                                                     EASDMVSEELARIEAESM
Sbjct: 361 ALVADLSAGEQGIIALAFGYTRLFQPDKPVTKEQAAIALATGEASDMVSEELARIEAESM 420

Query: 817 AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI 876
           AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI
Sbjct: 421 AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI 480

Query: 877 SLMREQAAIESEMEVLSKLRYELEEQLQGLMSNKVEVSHEKERINKLRKEAEIENQEISR 936
           SLMREQAAIESEM VLSKLRYELEEQLQGLMSNKV+VS+EKERINKLRKEAEIENQE+SR
Sbjct: 481 SLMREQAAIESEMGVLSKLRYELEEQLQGLMSNKVQVSYEKERINKLRKEAEIENQELSR 540

Query: 937 LQYELEVERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG 996
           LQYEL+VERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG
Sbjct: 541 LQYELKVERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG 600

Query: 997 DTWLDSSEQFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKIALVVSNLRQWIS 999
           DTWLDSS+QFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKIALVVSNLRQWIS
Sbjct: 601 DTWLDSSKQFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKIALVVSNLRQWIS 660

BLAST of Cp4.1LG14g06200 vs. ExPASy TrEMBL
Match: A0A6J1CJ48 (uncharacterized protein LOC111011467 isoform X4 OS=Momordica charantia OX=3673 GN=LOC111011467 PE=4 SV=1)

HSP 1 Score: 889 bits (2297), Expect = 5.29e-311
Identity = 504/701 (71.90%), Positives = 553/701 (78.89%), Query Frame = 0

Query: 397 MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPWLRVISRPIVHNGVIIERE 456
           MASTP TCSP SLQLRLALN KNC KFPSVLVRARV KLDP +R+   PIV+NG IIER 
Sbjct: 1   MASTPATCSPISLQLRLALNCKNCAKFPSVLVRARVRKLDPRVRMTCYPIVYNGAIIERA 60

Query: 457 NGLRRSGVCFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF 516
           NG RRSGVCFA S+ST DGFSGWSESDSGEEVLDLRRK WFGG VGIG+TGFILVSGITF
Sbjct: 61  NGQRRSGVCFARSDSTGDGFSGWSESDSGEEVLDLRRKTWFGGLVGIGITGFILVSGITF 120

Query: 517 AAWSINKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKEDSSMNADDGTLAG---F 576
           AAWSI+KQN SRQKPQMEALS QQELLLDSDTG+D L E+EKED+S+NADD TLAG    
Sbjct: 121 AAWSISKQNSSRQKPQMEALSTQQELLLDSDTGNDRLGENEKEDNSVNADDRTLAGKTGV 180

Query: 577 PAPMVSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEADVEPSDLCTRREYARWLVS 636
           PAP+ SAA+KTLPGKVLVPA VDQVQGQAL+ALQ LKVIEA+VEPSDLCTRREYARWLVS
Sbjct: 181 PAPLASAAIKTLPGKVLVPAVVDQVQGQALSALQVLKVIEAEVEPSDLCTRREYARWLVS 240

Query: 637 ASSVLSRNTTFKVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS 696
           ASS LSRNTT KVY AMYI+N+TELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDI S
Sbjct: 241 ASSALSRNTTSKVYPAMYIENVTELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDILS 300

Query: 697 TLDEDQG----------------------------------------------------- 756
           + DEDQG                                                     
Sbjct: 301 SFDEDQGPFYFSPESPLSRQDLVSWKMALEKRQLPEADRKMLHQVSGFIDTDKIHPDACP 360

Query: 757 ------------------------------------------EASDMVSEELARIEAESM 816
                                                     EASD+VSEELARIEAESM
Sbjct: 361 ALVADLSVGEHGIIALAFGYTRLFQPDKPVTKAQAAIALATGEASDIVSEELARIEAESM 420

Query: 817 AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI 876
           AENAVAAH ALVAQVEKDIN SFEK+LSIEREKV+AVEKMAEE KQELERLRSERER+N+
Sbjct: 421 AENAVAAHGALVAQVEKDINASFEKQLSIEREKVDAVEKMAEEAKQELERLRSERERENL 480

Query: 877 SLMREQAAIESEMEVLSKLRYELEEQLQGLMSNKVEVSHEKERINKLRKEAEIENQEISR 936
           +LM+E AAIESEMEV S+LR ELEEQLQGLMSNKVEVS+EKERINKLRKEAEIENQEI+R
Sbjct: 481 ALMKEHAAIESEMEVFSRLRNELEEQLQGLMSNKVEVSYEKERINKLRKEAEIENQEIAR 540

Query: 937 LQYELEVERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG 996
           LQYELEVERKALS+ARAWAEDEAKRAREQAKALEEARD WERRGIKV+VDSDLREQESAG
Sbjct: 541 LQYELEVERKALSMARAWAEDEAKRAREQAKALEEARDRWERRGIKVVVDSDLREQESAG 600

Query: 997 DTWLDSSEQFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKIALVVSNLRQWIS 999
           DTWLDSS+QF+V+ETV+RAENL+ KLK MA E+ GK ++I++KII+KIAL++SNLRQW+S
Sbjct: 601 DTWLDSSKQFSVKETVDRAENLMDKLKVMAAELRGKSKEIVDKIIEKIALLISNLRQWVS 660

BLAST of Cp4.1LG14g06200 vs. ExPASy TrEMBL
Match: A0A1S3AXY7 (uncharacterized protein LOC103484091 isoform X3 OS=Cucumis melo OX=3656 GN=LOC103484091 PE=4 SV=1)

HSP 1 Score: 888 bits (2295), Expect = 1.02e-310
Identity = 508/701 (72.47%), Positives = 548/701 (78.17%), Query Frame = 0

Query: 397 MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPWLRVISRPIVHNGVIIERE 456
           MAST  TCSPSSLQLRLALN  NCGKFPSV VRARV KLDP LR++ +PIVHNG   +R 
Sbjct: 1   MASTSPTCSPSSLQLRLALNCNNCGKFPSVFVRARVRKLDPRLRIVCQPIVHNGAKFDRG 60

Query: 457 NGLRRSGVCFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF 516
           NGLR +GVCFA SEST DGFSGWSESDS  EVLDLRRKKWFGG VGIG+TGFILVSGITF
Sbjct: 61  NGLRGTGVCFAGSESTADGFSGWSESDSQGEVLDLRRKKWFGGLVGIGITGFILVSGITF 120

Query: 517 AAWSINKQNGSRQKPQMEALSMQQELLLDSDTGHDSLDEDEKEDSSMNADDGTLAG---F 576
           AAWSINKQN SRQK QMEALS QQELLLDS+TG D L EDEKED+S++ADD T AG    
Sbjct: 121 AAWSINKQNSSRQKLQMEALSTQQELLLDSETGTDRLGEDEKEDNSVDADDETFAGKAGV 180

Query: 577 PAPMVSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEADVEPSDLCTRREYARWLVS 636
           PAP+VSAAVKTLPGKVLVPA VDQVQGQALAALQ LKVIE+DVEPSDLCTRREYARWLVS
Sbjct: 181 PAPLVSAAVKTLPGKVLVPAVVDQVQGQALAALQVLKVIESDVEPSDLCTRREYARWLVS 240

Query: 637 ASSVLSRNTTFKVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSRHDISS 696
           ASS LSRNTT KVY AMY++N+TELAFDDITP+DPDFASIQGLAEAGLISSKLSRHDISS
Sbjct: 241 ASSALSRNTTSKVYPAMYVENVTELAFDDITPQDPDFASIQGLAEAGLISSKLSRHDISS 300

Query: 697 TLDEDQG----------------------------------------------------- 756
           +LDEDQG                                                     
Sbjct: 301 SLDEDQGPLYFSPESLLSRQDLVSWKMALEKRQLPEADRKMLHQVSGFIDTDKIHPDACP 360

Query: 757 ------------------------------------------EASDMVSEELARIEAESM 816
                                                     EASD+VSEELARIEAESM
Sbjct: 361 AIVADLSVGEQGIIALAFGYTRLFQPDKPVTKAQAAIALATGEASDIVSEELARIEAESM 420

Query: 817 AENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERDNI 876
           AENAVAAHSALVAQVEKDIN SFEKELSIEREKVEAVE+MAEE KQELERLRSER RD++
Sbjct: 421 AENAVAAHSALVAQVEKDINASFEKELSIEREKVEAVERMAEEAKQELERLRSERARDSL 480

Query: 877 SLMREQAAIESEMEVLSKLRYELEEQLQGLMSNKVEVSHEKERINKLRKEAEIENQEISR 936
           +LM E+A++ESEMEVLS+LR ELEEQLQGLMSNKVEVS+EKERINKLRKEAEIENQEISR
Sbjct: 481 ALMMERASVESEMEVLSRLRSELEEQLQGLMSNKVEVSYEKERINKLRKEAEIENQEISR 540

Query: 937 LQYELEVERKALSIARAWAEDEAKRAREQAKALEEARDSWERRGIKVMVDSDLREQESAG 996
           LQYELEVERKALS+ARAWAEDEAKRAREQAKALEEARD WE+RGIKV+VDSDLREQES G
Sbjct: 541 LQYELEVERKALSMARAWAEDEAKRAREQAKALEEARDRWEKRGIKVVVDSDLREQESTG 600

Query: 997 DTWLDSSEQFAVEETVERAENLLAKLKGMAREVGGKCRDIIEKIIQKIALVVSNLRQWIS 999
           DTWLDSS+QF VEET +RAENL+ KLK MA EV GK RD+IEKIIQKIAL+VSNLRQWIS
Sbjct: 601 DTWLDSSKQFTVEETTDRAENLMEKLKRMAAEVRGKSRDVIEKIIQKIALLVSNLRQWIS 660

BLAST of Cp4.1LG14g06200 vs. ExPASy TrEMBL
Match: A0A438E2N4 (Cleavage and polyadenylation specificity factor subunit 2 OS=Vitis vinifera OX=29760 GN=CPSF100_0 PE=4 SV=1)

HSP 1 Score: 911 bits (2354), Expect = 7.39e-304
Identity = 687/1737 (39.55%), Positives = 809/1737 (46.57%), Query Frame = 0

Query: 4    YSEEGKGEGIVIAPHVAGHLLGGTLWKITKDGEDVIYAVDFNHRKERHLNGTILESFVRP 63
            Y   GKGEGIVIAPHVAGHLLGGT+WKITKDGEDVIYAVDFNHRKER LNGT+LESFVRP
Sbjct: 141  YHLFGKGEGIVIAPHVAGHLLGGTVWKITKDGEDVIYAVDFNHRKERLLNGTVLESFVRP 200

Query: 64   AVLITDAYNALNNQPYRRQKDKEFGDTIQKTLRANGNVLLPVDTAGRVLELIQILE---- 123
            AVLITDAYNALNNQP RRQ+D+EF D I KTLR +GNVLLPVDTAGRVLEL+ ILE    
Sbjct: 201  AVLITDAYNALNNQPSRRQRDQEFLDVILKTLRGDGNVLLPVDTAGRVLELMLILEQYWT 260

Query: 124  ---------------------------W-------------ETGGSYRHVTLLINKSELD 183
                                       W             +     +HVTLLI+KSEL+
Sbjct: 261  QHHLNYPIFFLTYVASSTIDYVKSFLEWMSDSIAKSFEHTRDNAFLLKHVTLLISKSELE 320

Query: 184  NAPDGPK---------------------------------FGTLARMLQADPPPKAVKVT 243
              PDGPK                                 F TLARMLQADPPPKAVKVT
Sbjct: 321  KVPDGPKHPWPVLEAGFSHDIFVEWATDAKNLVLFSERGQFATLARMLQADPPPKAVKVT 380

Query: 244  VSKRVPLTGDELVAYEEEQNR-KKEEALKASLLKEEQSKASHGTDNDIGDPMIIDASSNV 303
            +SKRVPL G+EL AYEEEQ R KKEEALKASL KE++ KAS G+DN +GDPM+ID ++  
Sbjct: 381  MSKRVPLVGEELAAYEEEQERIKKEEALKASLSKEDEMKASRGSDNKLGDPMVIDTTTTP 440

Query: 304  AP-DVVGSHGGAYRDIFIDGFVPPSTSVSPMFPFYENTSAWDDFGEVINPDDYVIKDEDM 363
            A  DV   H G +RDI IDGFVPPSTSV+PMFPFYEN+S WDDFGEVINP+DYVIKDEDM
Sbjct: 441  ASSDVAVPHVGGHRDILIDGFVPPSTSVAPMFPFYENSSEWDDFGEVINPEDYVIKDEDM 500

Query: 364  DQ-----------------SAP---------HGGVDVDGKLDETAANLILDMKPSKVVSN 423
            DQ                 S P         + G D++GKLDE AA+LI D  PSKV+SN
Sbjct: 501  DQATMQHLITASFVTFPAISVPKDFDWMNQHYVGDDLNGKLDEGAASLIFDTTPSKVISN 560

Query: 424  ELTV---------------------------------LVHGTAEATEHLKQHCLKNV--- 483
            ELTV                                 LVHG+AEATEHLKQHCLK+V   
Sbjct: 561  ELTVQVKCMLVYMDFEGRSDGRSIKSILSHVAPLKLVLVHGSAEATEHLKQHCLKHVCPH 620

Query: 484  ---------------------QLSEKLMSNVLFKK------------------------- 543
                                 QLSEKLMSNVLFKK                         
Sbjct: 621  VYAPQIGETIDVTSDLCAYKVQLSEKLMSNVLFKKLGDYEVAWVDAEVGKTESGSLSLLP 680

Query: 544  -------------------------------VEFAGGALRCGEYVTLRKVSDASQKGG-- 603
                                           VEF+GGALRCGEYVTLRKV DASQK    
Sbjct: 681  LSTPPPSHDTVFVGDIKMADFKQFLASKGIQVEFSGGALRCGEYVTLRKVGDASQKTSII 740

Query: 604  -----------------------------------GSDTQSAIGVGFF------------ 663
                                               G +       G F            
Sbjct: 741  YMILEFLRDEHCPLSEVANRWANIEKCAFKGVLPRGKEVAIFGYKGHFIWGRQCLSFLEA 800

Query: 664  ------------MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPWLRVISR 723
                        MAS     SPSS QLR +     C + P+V VR  V KLD  +RV+S 
Sbjct: 801  LISLLEFTLYSVMASVTTNWSPSSFQLRFSFQ---CRRSPAVFVRTHVRKLDRQVRVLSI 860

Query: 724  PIVHNGVIIERENGLRRSGVCFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIG 783
                NGV      G  R G  +  SES  D  SGWS SD  E+    ++K+W GG VG G
Sbjct: 861  AGDGNGV------GRHRDGNSWISSESKGDDLSGWSGSDGSEQYGKSQKKRWPGGMVGAG 920

Query: 784  VTGFILVSGITFAAWSINKQNGSRQKPQMEALSMQQE-------LLLDSDTGHD-----S 843
            V G +LV+G++FAA+S++KQN SR + QMEA+++Q E         L+S TG D     S
Sbjct: 921  VAGVVLVAGLSFAAFSLSKQNPSRPEKQMEAMTIQMEQGILQEDYSLESKTGTDAMPTPS 980

Query: 844  LDED-------------------------------------------------------- 903
            + ED                                                        
Sbjct: 981  IQEDMSGESPFDDTLVAGDSMSSSPGFCESDIVIDPIDTLSFNYSDASLAVGSSESSQLE 1040

Query: 904  --------------EKEDSSMNADD----------------------------------- 963
                          + + +++N+DD                                   
Sbjct: 1041 ENGDALKLVNSSIHDPDTTNLNSDDQGELLGSKGTENSNFSLESSSSSFPRTVDEDHYVH 1100

Query: 964  --------------------GTL------------------------------------- 994
                                GT                                      
Sbjct: 1101 SDKMLNEWKSIPNKSFVDANGTQHPVSEKEYLDLDELQKDIPNESYVKLHDLNASGIQDP 1160

BLAST of Cp4.1LG14g06200 vs. ExPASy TrEMBL
Match: A0A438IZ50 (Cleavage and polyadenylation specificity factor subunit 2 OS=Vitis vinifera OX=29760 GN=CPSF100_1 PE=4 SV=1)

HSP 1 Score: 902 bits (2330), Expect = 3.32e-300
Identity = 682/1720 (39.65%), Positives = 806/1720 (46.86%), Query Frame = 0

Query: 4    YSEEGKGEGIVIAPHVAGHLLGGTLWKITKDGEDVIYAVDFNHRKERHLNGTILESFVRP 63
            Y   GKGEGIVIAPHVAGHLLGGT+WKITKDGEDVIYAVDFNHRKER LNGT+LESFVRP
Sbjct: 166  YHLFGKGEGIVIAPHVAGHLLGGTVWKITKDGEDVIYAVDFNHRKERLLNGTVLESFVRP 225

Query: 64   AVLITDAYNALNNQPYRRQKDKEFGDTIQKTLRANGNVLLPVDTAGRVLELIQILE---- 123
            AVLITDAYNALNNQP RRQ+D+EF D I KTLR +GNVLLPVDTAGRVLEL+ ILE    
Sbjct: 226  AVLITDAYNALNNQPSRRQRDQEFLDVILKTLRGDGNVLLPVDTAGRVLELMLILEQYWT 285

Query: 124  ---------------------------W-------------ETGGSYRHVTLLINKSELD 183
                                       W             +     +HVTLLI+KSEL+
Sbjct: 286  QHHLNYPIFFLTYVASSTIDYVKSFLEWMSDSIAKSFEHTRDNAFLLKHVTLLISKSELE 345

Query: 184  NAPDGPK------------------------------------FGTLARMLQADPPPKAV 243
              PDGPK                                    F TLARMLQADPPPKAV
Sbjct: 346  KVPDGPKIVLASMASLEAGFSHDIFVEWATDAKNLVLFSERGQFATLARMLQADPPPKAV 405

Query: 244  KVTVSKRVPLTGDELVAYEEEQNR-KKEEALKASLLKEEQSKASHGTDNDIGDPMIIDAS 303
            KVT+SKRVPL G+EL AYEEEQ R KKEEALKASL KE++ KAS G+DN +GDPM+ID +
Sbjct: 406  KVTMSKRVPLVGEELAAYEEEQERIKKEEALKASLSKEDEMKASRGSDNKLGDPMVIDTT 465

Query: 304  SNVAP-DVVGSHGGAYRDIFIDGFVPPSTSVSPMFPFYENTSAWDDFGEVINPDDYVIKD 363
            +  A  DV   H G +RDI IDGFVPPSTSV+PMFPFYEN+S WDDFGEVINP+DYVIKD
Sbjct: 466  TPPASSDVAVPHVGGHRDILIDGFVPPSTSVAPMFPFYENSSEWDDFGEVINPEDYVIKD 525

Query: 364  EDMDQ-----------------SAP---------HGGVDVDGKLDETAANLILDMKPSKV 423
            EDMDQ                 S P         + G D++GKLDE AA+LI D  PSKV
Sbjct: 526  EDMDQATMQHLITASFVTFPAISVPKDFDWMNQHYVGDDLNGKLDEGAASLIFDTTPSKV 585

Query: 424  VSNELTV---------------------------------LVHGTAEATEHLKQHCLKNV 483
            +SNELTV                                 LVHG+AEATEHLKQHCLK+V
Sbjct: 586  ISNELTVQVKCMLVYMDFEGRSDGRSIKSILSHVAPLKLVLVHGSAEATEHLKQHCLKHV 645

Query: 484  ------------------------QLSEKLMSNVLFKKVE-------------------- 543
                                    QLSEKLMSNVLFKK+                     
Sbjct: 646  CPHVYAPQIGETIDVTSDLCAYKVQLSEKLMSNVLFKKLGDYEVAWVDAEVGKTESGSLS 705

Query: 544  --------------FAG----------------------GALRCGEYVTLRKVSDASQKG 603
                          F G                      G   CGEYVTLRKV DASQK 
Sbjct: 706  LLPLSTPPPSHDTVFVGDIKMADSSSFWQAKASRSSSLVGRCGCGEYVTLRKVGDASQKV 765

Query: 604  GG------------------------------------SDTQSAIGVGFF-----MASTP 663
                                                  S  ++ I +  F     MAS  
Sbjct: 766  ANRWANIEKCAFKGVLPRGKEVAIFGYKGHFIWGRQCLSFLEALISLLEFTLYSVMASVT 825

Query: 664  LTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPWLRVISRPIVHNGVIIERENGLRR 723
               SPSS QLR +     C + P+V VR  V KLD  +RV+S     NGV      G  R
Sbjct: 826  TNWSPSSFQLRFSFQ---CRRSPAVFVRTHVRKLDRQVRVLSIAGDGNGV------GRHR 885

Query: 724  SGVCFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITFAAWSI 783
             G  +  SES  D  SGWS SD  E+    ++K+W GG VG GV G +LV+G++FAA+S+
Sbjct: 886  DGNSWISSESKGDDLSGWSGSDGSEQYGKSQKKRWPGGMVGAGVAGVVLVAGLSFAAFSL 945

Query: 784  NKQNGSRQKPQMEALSMQQE-------LLLDSDTGHD-----SLDED------------- 843
            +KQN SR + QMEA+++Q E         L+S TG D     S+ ED             
Sbjct: 946  SKQNPSRPEKQMEAMTIQMEQGILQEDYSLESKTGTDAMPTPSIQEDMSGESPFDDTLVA 1005

Query: 844  ---------------------------------------------------------EKE 903
                                                                     + +
Sbjct: 1006 GDSMSSSPGFCESDIVIDPIDTLSFNYSDASLAVGSSESSQLEENGDALKLVNSSIHDAD 1065

Query: 904  DSSMNADD---------------------------------------------------- 963
             +++N+DD                                                    
Sbjct: 1066 TTNLNSDDQGELLGSKGTENSNFSLESSSSSFPRTVDEDHYVHSDKMLNEWKSIPNKSFV 1125

Query: 964  ---GTL------------------------------------------------------ 994
               GT                                                       
Sbjct: 1126 DANGTQHPVSEKEYLDLDELQKDIPNESYVKLHDLNASGIQDPVSDGEYLDPDELQKDIT 1185

BLAST of Cp4.1LG14g06200 vs. TAIR 10
Match: AT5G23880.1 (cleavage and polyadenylation specificity factor 100)

HSP 1 Score: 485.7 bits (1249), Expect = 9.7e-137
Identity = 293/584 (50.17%), Positives = 335/584 (57.36%), Query Frame = 0

Query: 4   YSEEGKGEGIVIAPHVAGHLLGGTLWKITKDGEDVIYAVDFNHRKERHLNGTILESFVRP 63
           Y   GKGEGIVIAPHVAGH+LGG++W+ITKDGEDVIYAVD+NHRKERHLNGT+L+SFVRP
Sbjct: 135 YHLSGKGEGIVIAPHVAGHMLGGSIWRITKDGEDVIYAVDYNHRKERHLNGTVLQSFVRP 194

Query: 64  AVLITDAYNAL-NNQPYRRQKDKEFGDTIQKTLRANGNVLLPVDTAGRVLELIQILE--W 123
           AVLITDAY+AL  NQ  R+Q+DKEF DTI K L   GNVLLPVDTAGRVLEL+ ILE  W
Sbjct: 195 AVLITDAYHALYTNQTARQQRDKEFLDTISKHLEVGGNVLLPVDTAGRVLELLLILEQHW 254

Query: 124 ETGG------------------------------------------SYRHVTLLINKSEL 183
              G                                            RHVTLLINK++L
Sbjct: 255 SQRGFSFPIYFLTYVSSSTIDYVKSFLEWMSDSISKSFETSRDNAFLLRHVTLLINKTDL 314

Query: 184 DNAPDGPK------------------------------------FGTLARMLQADPPPKA 243
           DNAP GPK                                    FGTLARMLQ+ PPPK 
Sbjct: 315 DNAPPGPKVVLASMASLEAGFAREIFVEWANDPRNLVLFTETGQFGTLARMLQSAPPPKF 374

Query: 244 VKVTVSKRVPLTGDELVAYEEEQNR-KKEEALKASLLKEEQSKASHGTDNDIGDPMIIDA 303
           VKVT+SKRVPL G+EL+AYEEEQNR K+EEAL+ASL+KEE++KASHG+D++  +PMIID 
Sbjct: 375 VKVTMSKRVPLAGEELIAYEEEQNRLKREEALRASLVKEEETKASHGSDDNSSEPMIID- 434

Query: 304 SSNVAPDVVGSHGGAYRDIFIDGFVPPSTSVSPMFPFYENTSAWDDFGEVINPDDYVIKD 363
            +    DV+GSHG AY+DI IDGFVPPS+SV+PMFP+Y+NTS WDDFGE+INPDDYVIKD
Sbjct: 435 -TKTTHDVIGSHGPAYKDILIDGFVPPSSSVAPMFPYYDNTSEWDDFGEIINPDDYVIKD 494

Query: 364 EDMDQSAPHGGVDVDGKLDETAANLILDMKPSKVVSNEL--------------------- 392
           EDMD+ A H G DVDG+LDE  A+L+LD +PSKV+SNEL                     
Sbjct: 495 EDMDRGAMHNGGDVDGRLDEATASLMLDTRPSKVMSNELIVTVSCSLVKMDYEGRSDGRS 554

BLAST of Cp4.1LG14g06200 vs. TAIR 10
Match: AT5G23890.1 (LOCATED IN: mitochondrion, chloroplast thylakoid membrane, chloroplast, plastid, chloroplast envelope; EXPRESSED IN: 24 plant structures; EXPRESSED DURING: 14 growth stages; CONTAINS InterPro DOMAIN/s: S-layer homology domain (InterPro:IPR001119); BEST Arabidopsis thaliana protein match is: unknown protein (TAIR:AT5G52410.2); Has 1807 Blast hits to 1807 proteins in 277 species: Archae - 0; Bacteria - 0; Metazoa - 736; Fungi - 347; Plants - 385; Viruses - 0; Other Eukaryotes - 339 (source: NCBI BLink). )

HSP 1 Score: 430.6 bits (1106), Expect = 3.7e-120
Identity = 276/556 (49.64%), Positives = 356/556 (64.03%), Query Frame = 0

Query: 546 SDTGHDSLD---EDEKEDSSM------NADDGTLAGFPAPMVSAAVKTLPGKVLVPAAVD 605
           +D   D L+   +DE +D+ M           + AG PAP +S  V   PGK+LVP A D
Sbjct: 377 TDGSKDELNIYSQDELDDNRMLLEIPSGGSAFSSAGIPAPFMSVIVN--PGKILVPVAAD 436

Query: 606 QVQGQALAALQGLKVIEADVEPSDLCTRREYARWLVSASSVLSRNTTFKVYQAMYIQNIT 665
           Q+Q QA AALQ LKVIE D +PSDLCTRREYARWL+SASS LSRNTT KVY AMYI+N+T
Sbjct: 437 QIQCQAFAALQVLKVIETDTQPSDLCTRREYARWLISASSALSRNTTSKVYPAMYIENVT 496

Query: 666 ELAFDDITPEDPDFASIQGLAEAGLISSKLSRHD-------------------------- 725
           ELAFDDITPEDPDF+SIQGLAEAGLI+SKLS  D                          
Sbjct: 497 ELAFDDITPEDPDFSSIQGLAEAGLIASKLSNRDLLDDVEGTFLFSPESLLSRQDLISWK 556

Query: 726 ------------------ISSTLDEDQ--------------------------------- 785
                             +S  +D D+                                 
Sbjct: 557 MALEKRQLPEADKKMLYKLSGFIDIDKINPDAWPSIIADLSTGEQGIAALAFGCTRLFQP 616

Query: 786 ---------------GEASDMVSEELARIEAESMAENAVAAHSALVAQVEKDINGSFEKE 845
                          GEASD+VSEELARIEAESMAE AV+AH+ALVA+VEKD+N SFEKE
Sbjct: 617 HKPVTKGQAAIALSSGEASDIVSEELARIEAESMAEKAVSAHNALVAEVEKDVNASFEKE 676

Query: 846 LSIEREKVEAVEKMAEETKQELERLRSERERDNISLMREQAAIESEMEVLSKLRYELEEQ 905
           LS+EREK+EAVEKMAE  K ELE+LR +RE +N++L++E+AA+ESEMEVLS+LR + EE+
Sbjct: 677 LSMEREKIEAVEKMAELAKVELEQLREKREEENLALVKERAAVESEMEVLSRLRRDAEEK 736

Query: 906 LQGLMSNKVEVSHEKERINKLRKEAEIENQEISRLQYELEVERKALSIARAWAEDEAKRA 965
           L+ LMSNK E++ EKER+  LRKEAE E+Q IS+LQYELEVERKALS+AR+WAE+EAK+A
Sbjct: 737 LEDLMSNKAEITFEKERVFNLRKEAEEESQRISKLQYELEVERKALSMARSWAEEEAKKA 796

Query: 966 REQAKALEEARDSWERRGIKVMVDSDLRE---QESAGDTWLDSSEQFAVEETVERAENLL 998
           REQ +ALEEAR  WE  G++V+VD DL+E   +E+     L+  E+ +VEET  RA+ L+
Sbjct: 797 REQGRALEEARKRWETNGLRVVVDKDLQETSSRETEQSIVLNEMERSSVEETERRAKTLM 856

BLAST of Cp4.1LG14g06200 vs. TAIR 10
Match: AT5G52410.2 (INVOLVED IN: biological_process unknown; LOCATED IN: chloroplast; EXPRESSED IN: 13 plant structures; EXPRESSED DURING: 6 growth stages; CONTAINS InterPro DOMAIN/s: S-layer homology domain (InterPro:IPR001119); BEST Arabidopsis thaliana protein match is: unknown protein (TAIR:AT5G23890.1); Has 35333 Blast hits to 34131 proteins in 2444 species: Archae - 798; Bacteria - 22429; Metazoa - 974; Fungi - 991; Plants - 531; Viruses - 0; Other Eukaryotes - 9610 (source: NCBI BLink). )

HSP 1 Score: 414.5 bits (1064), Expect = 2.7e-115
Identity = 312/755 (41.32%), Positives = 411/755 (54.44%), Query Frame = 0

Query: 397 MASTPLTCSPSSLQLRLALNSKNCGKFPSVLVRARVTKLDPWLRVISRPIVHNGVIIERE 456
           MAST  T +PSSLQLR+ALN     K P    RA++TKL   LR+          + +  
Sbjct: 1   MASTMATWTPSSLQLRIALN-HGIFKAPE---RAKMTKLSRRLRI--------SCVAQNA 60

Query: 457 NGLRRSGVCFAESESTTDGFSGWSESDSGEEVLDLRRKKWFGGFVGIGVTGFILVSGITF 516
              R SG       + +D F GW  +DSG++  + R   WF G +  GV G +L  G+T+
Sbjct: 61  EPGRDSG-----ESNGSDRFRGW--ADSGDDENNSRGGDWFKGTLLSGVAGMVLFVGLTY 120

Query: 517 AAWSINKQNGSRQKPQMEALSMQQ----ELLLDSDTGHDSLDEDEKE--------DSSMN 576
           AA S +K+N  R K ++   ++ +    ++  D + G+    +D +E        D S++
Sbjct: 121 AALS-SKRNVLRPKVEVMVTTVTKSSIDQISTDENEGNIVTSQDNQESYRAFPFLDVSLD 180

Query: 577 AD----------------------------------------DGT--------------- 636
           +                                         DG                
Sbjct: 181 SQVLSPDEIDVASKSTSTRKDNEEAEKASVSSAERYTSSTELDGVDTHTSQIPNEKQKAR 240

Query: 637 -LAGFPAPMVSAAVKTLPGKVLVPAAVDQVQGQALAALQGLKVIEADVEPSDLCTRREYA 696
              G PAP     V +L  K + P  VD VQ Q  AALQ LKVIE+D  P DLCTRRE+A
Sbjct: 241 RYTGIPAPSTVPQVDSL--KPIFPTVVDPVQSQMFAALQALKVIESDALPYDLCTRREFA 300

Query: 697 RWLVSASSVLSRNTTFKVYQAMYIQNITELAFDDITPEDPDFASIQGLAEAGLISSKLSR 756
           RW+VSAS+ LSRN+  KVY AMYI+N+TELAFDDITPEDPDF  IQGLAEAGLISSKLS 
Sbjct: 301 RWVVSASNTLSRNSASKVYPAMYIENVTELAFDDITPEDPDFPFIQGLAEAGLISSKLSN 360

Query: 757 HDISST--------------------------------------------LDEDQ----- 816
           +++ S+                                            LD D+     
Sbjct: 361 NNMPSSESSRVTFSPESPLTRQDLLSWKMALEFRQLPEADSKKLYQLSGFLDIDKINPEA 420

Query: 817 -------------------------------------------GEASDMVSEELARIEAE 876
                                                      G+A ++V EELARIEAE
Sbjct: 421 WPALIADLSAGEHGITALSFGRTRLFQPSKAVTKAQTAVSLAIGDAFEVVGEELARIEAE 480

Query: 877 SMAENAVAAHSALVAQVEKDINGSFEKELSIEREKVEAVEKMAEETKQELERLRSERERD 936
           +MAEN V AH+ LVAQVEKDIN SFEKEL  E+E V+AVEK+AEE K EL RLR E+E +
Sbjct: 481 AMAENVVCAHNELVAQVEKDINASFEKELLREKEIVDAVEKLAEEAKSELARLRVEKEEE 540

Query: 937 NISLMREQAAIESEMEVLSKLRYELEEQLQGLMSNKVEVSHEKERINKLRKEAEIENQEI 991
            ++L RE+ +IE+EME L+++R ELEEQLQ L SNK E+S+EKER ++L+K+ E ENQEI
Sbjct: 541 TLALERERTSIETEMEALARIRNELEEQLQSLASNKAEMSYEKERFDRLQKQVEDENQEI 600

BLAST of Cp4.1LG14g06200 vs. TAIR 10
Match: AT5G52410.1 (CONTAINS InterPro DOMAIN/s: S-layer homology domain (InterPro:IPR001119); BEST Arabidopsis thaliana protein match is: unknown protein (TAIR:AT5G23890.1); Has 30201 Blast hits to 17322 proteins in 780 species: Archae - 12; Bacteria - 1396; Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses - 0; Other Eukaryotes - 2996 (source: NCBI BLink). )

HSP 1 Score: 374.0 bits (959), Expect = 4.1e-103
Identity = 240/480 (50.00%), Positives = 304/480 (63.33%), Query Frame = 0

Query: 604 AALQGLKVIEADVEPSDLCTRREYARWLVSASSVLSRNTTFKVYQAMYIQNITELAFDDI 663
           AALQ LKVIE+D  P DLCTRRE+ARW+VSAS+ LSRN+  KVY AMYI+N+TELAFDDI
Sbjct: 3   AALQALKVIESDALPYDLCTRREFARWVVSASNTLSRNSASKVYPAMYIENVTELAFDDI 62

Query: 664 TPEDPDFASIQGLAEAGLISSKLSRHDISST----------------------------- 723
           TPEDPDF  IQGLAEAGLISSKLS +++ S+                             
Sbjct: 63  TPEDPDFPFIQGLAEAGLISSKLSNNNMPSSESSRVTFSPESPLTRQDLLSWKMALEFRQ 122

Query: 724 ---------------LDEDQ---------------------------------------- 783
                          LD D+                                        
Sbjct: 123 LPEADSKKLYQLSGFLDIDKINPEAWPALIADLSAGEHGITALSFGRTRLFQPSKAVTKA 182

Query: 784 --------GEASDMVSEELARIEAESMAENAVAAHSALVAQVEKDINGSFEKELSIEREK 843
                   G+A ++V EELARIEAE+MAEN V AH+ LVAQVEKDIN SFEKEL  E+E 
Sbjct: 183 QTAVSLAIGDAFEVVGEELARIEAEAMAENVVCAHNELVAQVEKDINASFEKELLREKEI 242

Query: 844 VEAVEKMAEETKQELERLRSERERDNISLMREQAAIESEMEVLSKLRYELEEQLQGLMSN 903
           V+AVEK+AEE K EL RLR E+E + ++L RE+ +IE+EME L+++R ELEEQLQ L SN
Sbjct: 243 VDAVEKLAEEAKSELARLRVEKEEETLALERERTSIETEMEALARIRNELEEQLQSLASN 302

Query: 904 KVEVSHEKERINKLRKEAEIENQEISRLQYELEVERKALSIARAWAEDEAKRAREQAKAL 963
           K E+S+EKER ++L+K+ E ENQEI RLQ ELEVER ALSIAR WA+DEA+RAREQAK L
Sbjct: 303 KAEMSYEKERFDRLQKQVEDENQEILRLQNELEVERNALSIARDWAKDEARRAREQAKVL 362

Query: 964 EEARDSWERRGIKVMVDSDLREQESAGD-TWLDSSEQFAVEETVERAENLLAKLKGMARE 991
           EEAR  WE+ G+KV+VDSDL EQ +  + TWL++ +Q  VE T++RA NL+AKLK MA++
Sbjct: 363 EEARGRWEKYGLKVIVDSDLHEQTTKTESTWLNAGKQNHVEGTMKRAGNLIAKLKKMAKD 422

BLAST of Cp4.1LG14g06200 vs. TAIR 10
Match: AT3G53740.2 (Ribosomal protein L36e family protein )

HSP 1 Score: 172.9 bits (437), Expect = 1.4e-42
Identity = 92/112 (82.14%), Positives = 101/112 (90.18%), Query Frame = 0

Query: 996  MAPKQPNTGLFVGLNKGHIVTKKELAPRPSDRKGKSSKRVLFVRNLIREVAGFAPYEKRI 1055
            M   Q  TGLFVGLNKGH+VT++ELAPRP  RKGK+SKR +F+RNLI+EVAG APYEKRI
Sbjct: 1    MTTPQVKTGLFVGLNKGHVVTRRELAPRPRSRKGKTSKRTIFIRNLIKEVAGQAPYEKRI 60

Query: 1056 TELLKVGKDKRALKVAKRKLGTHKRAKKKREEMSSVLRKMRAGGGG--EKKK 1106
            TELLKVGKDKRALKVAKRKLGTHKRAK+KREEMSSVLRKMR+GGGG  EKKK
Sbjct: 61   TELLKVGKDKRALKVAKRKLGTHKRAKRKREEMSSVLRKMRSGGGGATEKKK 112
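
The HSP summary lines above follow a fixed pattern, so the reported percentages can be rechecked from the raw counts (for example 293/584 = 50.17%). The following is a minimal Python sketch of such a check; the function name `parse_hsp` and the returned field names are hypothetical and not part of this page or of any BLAST tool, and the pattern accepts either spelling of the positives label.

```python
import re

# Hypothetical patterns for the two HSP summary lines shown on this page.
HSP_SCORE = re.compile(r"Score:\s*([\d.]+) bits \((\d+)\), Expect = ([\d.eE+-]+)")
HSP_IDENT = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"Posi?tives = (\d+)/(\d+) \(([\d.]+)%\)"
)

def parse_hsp(score_line, identity_line):
    """Parse one HSP summary and recompute its percentages from the counts."""
    s = HSP_SCORE.search(score_line)
    i = HSP_IDENT.search(identity_line)
    if s is None or i is None:
        raise ValueError("not an HSP summary line pair")
    matches, length = int(i.group(1)), int(i.group(2))
    positives = int(i.group(4))
    return {
        "bits": float(s.group(1)),
        "raw_score": int(s.group(2)),
        "expect": float(s.group(3)),
        "identity_pct": round(100.0 * matches / length, 2),
        "positives_pct": round(100.0 * positives / length, 2),
        "reported_identity_pct": float(i.group(3)),
    }

# Example input: the AT5G23880.1 HSP reported on this page.
hsp = parse_hsp(
    "HSP 1 Score: 485.7 bits (1249), Expect = 9.7e-137",
    "Identity = 293/584 (50.17%), Postives = 335/584 (57.36%), Query Frame = 0",
)
print(hsp["identity_pct"], hsp["positives_pct"])
```

Comparing `identity_pct` with `reported_identity_pct` is one simple way to catch a summary line that was damaged during extraction.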

The following BLAST results are available for this feature:
Match Name  E-value   Identity  Description
Q9LKF9      1.4e-135  50.17     Cleavage and polyadenylation specificity factor subunit 2 OS=Arabidopsis thalian...
Q652P4      9.5e-121  47.14     Cleavage and polyadenylation specificity factor subunit 2 OS=Oryza sativa subsp....
Q9W799      3.2e-44   30.40     Cleavage and polyadenylation specificity factor subunit 2 OS=Xenopus laevis OX=8...
Q10568      5.5e-44   30.70     Cleavage and polyadenylation specificity factor subunit 2 OS=Bos taurus OX=9913 ...
Q9P2I0      7.2e-44   30.70     Cleavage and polyadenylation specificity factor subunit 2 OS=Homo sapiens OX=960...
Match Name      E-value    Identity  Description
XP_023552429.1  0.0        85.45     uncharacterized protein LOC111810089 isoform X1 [Cucurbita pepo subsp. pepo]
KAG7014766.1    0.0        83.17     hypothetical protein SDJN02_22395 [Cucurbita argyrosperma subsp. argyrosperma]
XP_022984998.1  0.0        82.45     uncharacterized protein LOC111483097 isoform X1 [Cucurbita maxima]
KAG6574185.1    0.0        56.05     60S ribosomal protein L36-2, partial [Cucurbita argyrosperma subsp. sororia]
XP_022140923.1  1.09e-310  71.90     uncharacterized protein LOC111011467 isoform X4 [Momordica charantia]
Match Name  E-value    Identity  Description
A0A6J1JC97  0.0        82.45     uncharacterized protein LOC111483097 isoform X1 OS=Cucurbita maxima OX=3661 GN=L...
A0A6J1CJ48  5.29e-311  71.90     uncharacterized protein LOC111011467 isoform X4 OS=Momordica charantia OX=3673 G...
A0A1S3AXY7  1.02e-310  72.47     uncharacterized protein LOC103484091 isoform X3 OS=Cucumis melo OX=3656 GN=LOC10...
A0A438E2N4  7.39e-304  39.55     Cleavage and polyadenylation specificity factor subunit 2 OS=Vitis vinifera OX=2...
A0A438IZ50  3.32e-300  39.65     Cleavage and polyadenylation specificity factor subunit 2 OS=Vitis vinifera OX=2...
Match Name   E-value   Identity  Description
AT5G23880.1  9.7e-137  50.17     cleavage and polyadenylation specificity factor 100
AT5G23890.1  3.7e-120  49.64     LOCATED IN: mitochondrion, chloroplast thylakoid membrane, chloroplast, plastid,...
AT5G52410.2  2.7e-115  41.32     INVOLVED IN: biological_process unknown; LOCATED IN: chloroplast; EXPRESSED IN: ...
AT5G52410.1  4.1e-103  50.00     CONTAINS InterPro DOMAIN/s: S-layer homology domain (InterPro:IPR001119); BEST A...
AT3G53740.2  1.4e-42   82.14     Ribosomal protein L36e family protein
InterPro
Analysis Name: InterPro Annotations of Cucurbita pepo (Zucchini) v1
Date Performed: 2021-10-25
IPR Term   IPR Description                                       Source       Source Term      Source Description                 Alignment
None       No IPR available                                      COILS        Coil             Coil                               coord: 854..881
None       No IPR available                                      COILS        Coil             Coil                               coord: 910..930
None       No IPR available                                      COILS        Coil             Coil                               coord: 812..842
None       No IPR available                                      COILS        Coil             Coil                               coord: 745..783
None       No IPR available                                      COILS        Coil             Coil                               coord: 181..201
None       No IPR available                                      MOBIDB_LITE  mobidb-lite      disorder_prediction                coord: 1074..1105
None       No IPR available                                      MOBIDB_LITE  mobidb-lite      disorder_prediction                coord: 547..568
None       No IPR available                                      MOBIDB_LITE  mobidb-lite      disorder_prediction                coord: 1082..1105
None       No IPR available                                      PANTHER      PTHR33740:SF3    GPI-ANCHORED ADHESIN-LIKE PROTEIN  coord: 397..561; coord: 700..996
None       No IPR available                                      PANTHER      PTHR33740:SF3    GPI-ANCHORED ADHESIN-LIKE PROTEIN  coord: 560..699
None       No IPR available                                      PANTHER      PTHR33740        GPI-ANCHORED ADHESIN-LIKE PROTEIN  coord: 397..561; coord: 700..996
None       No IPR available                                      PANTHER      PTHR33740        GPI-ANCHORED ADHESIN-LIKE PROTEIN  coord: 560..699
IPR001279  Metallo-beta-lactamase                                PFAM         PF16661          Lactamase_B_6                      coord: 8..66; e-value: 1.9E-17; score: 63.3
IPR000509  Ribosomal protein L36e                                PFAM         PF01158          Ribosomal_L36e                     coord: 1003..1096; e-value: 9.6E-41; score: 137.9
IPR000509  Ribosomal protein L36e                                PROSITE      PS01190          RIBOSOMAL_L36E                     coord: 1050..1060
IPR036866  Ribonuclease Z/Hydroxyacylglutathione hydrolase-like  GENE3D       3.60.15.10                                          coord: 4..127; e-value: 9.9E-30; score: 105.3
IPR036866  Ribonuclease Z/Hydroxyacylglutathione hydrolase-like  SUPERFAMILY  56281            Metallo-hydrolase/oxidoreductase   coord: 8..342
IPR038097  Ribosomal protein L36e domain superfamily             GENE3D       1.10.10.1760     60S ribosomal protein L36          coord: 994..1100; e-value: 2.7E-42; score: 145.3

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name       Unique Name        Type
Cp4.1LG14g06200.1  Cp4.1LG14g06200.1  mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category            Term Accession  Term Name
biological_process  GO:0006412      translation
cellular_component  GO:0016021      integral component of membrane
cellular_component  GO:0005840      ribosome
molecular_function  GO:0003735      structural constituent of ribosome