Cp4.1LG00g03020 (gene) Cucurbita pepo (Zucchini)

Name: Cp4.1LG00g03020
Type: gene
Organism: Cucurbita pepo (Zucchini)
Description: Protein SET DOMAIN GROUP 41
Location: Cp4.1LG00 : 9929885 .. 9942171 (+)
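
The location field can be read as sequence ID, start, end, and strand. The sketch below parses that field and computes the span length. It assumes the coordinates are 1-based and inclusive (the usual GFF-style convention, which this record does not state), and the function names `parse_location` and `span_length` are hypothetical helpers, not part of any database API.

```python
import re


def parse_location(text):
    """Split 'SEQID : START .. END (STRAND)' into (seqid, start, end, strand)."""
    pattern = r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*"
    match = re.fullmatch(pattern, text)
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    seqid, start, end, strand = match.groups()
    return seqid, int(start), int(end), strand


def span_length(start, end):
    """Length of the span, assuming 1-based inclusive coordinates."""
    return end - start + 1


seqid, start, end, strand = parse_location("Cp4.1LG00 : 9929885 .. 9942171 (+)")
print(seqid, strand, span_length(start, end))  # prints: Cp4.1LG00 + 12287
```

Under that assumption the gene spans 12,287 bp on the forward strand of Cp4.1LG00.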
The following sequences are available for this feature:

Gene sequence (with intron)

CATTTTCAGTAAGGAGAGAGAAATGGAGATGGAAATGAGAGCAATGGAAGACATAGAAATGGCGGAAGACATTACTCCGCCGTTGCCTCCCCTCACCGCCGCTCTCCACGATGCCTTCCTCCTCACTCACTGCTCCTCCTGTTTCTCCCCTCTCCCAAATTCCTCAATTTCTCACTCTAATCTCCTCCGCTACTGCTCCCCTATATGCTCCCATTCCGATTCCCTCACCGCCGCCGTCTTCTCCACCGGTCAGTTCCCCTTCTCCGACACCTCCGACCTCCGCGCCTCCCTCCGCCTCCTCCACCTCCTTCTCTCTGATCCCTCCGCTTGGCGCTCTGCTCCTCCCGAGCGTATCTTTGGCCTTCTCACCAATCGGGAGAAATTGATGCTTGCTGATGACGATTCCGAGGTTTTCGTCAAGATTCGGGAAGGGTCCGACGCCATGGCCGCTTCCAGACGGACGAACTCTGCCGATATTCGCTATGACAACGCCTTGGAAGAGGCTATCCTGTGCCTTGTCTTGACCAACGCCGTCGAGGTTCAGGATTCGGTTGGCCGAACCATTGGGATTGCTGTGTACCATCCAACTTTCTGCTGGATCAATCACAGTTGTTCTCCCAATGCTTGTTACAGATTTGAAACTCCGTCGGATTCCATCAAGACGAGGCTACGGATTTCCCCCTTCTGTACTGACATTGGTACTGGTGAAGGAAGTTGCAGTCAAGTAATTGTTTCAACTACGAATAACTTTCATTTTCTTTATGGGCGTTTCTCTCCCTTTCTCGTGAAATTAGTGAGAGCTTTTGCATGATTGTTTATGGCAGATGAGTACTGTTCGTAGAAACTTTTCGCATTTCATTACAAAAGGTGTGTTTCTTCACGTTTGCAATCAGTTAAATGCTGTTGTTGTTGTCCCTGTTTGTATATATTCATGTGAATTTTGATGAAGATTTTCAGGGTTATGGTCCAAGAGTCATGGTTAGGAGTATAAAGAGTATAAGGAATGGCGAAGCAGTCACGATTGCATACTGTGACTTGTTGCAACCCAAGGTATTGCAGTTCTATCTTATTTTTGTTATCTTCCCTTCATAATATTTCAGTGGAAATTTCTGAACTAAGTTCCACATAAGCATGCCAAATTTCATTATGAGCTTGGTATATATTTCAACTCAAAGATTTGCAAGAGTAAATGAATGTGATGCTTGTGGAATATTGTAACGCCCCAGATCTGGGCTTTCCCTCAAGGCTTTAAAACGTGTATGCTAGGGGAAGGTTTCCACACCCTTATAAAGGGTGGTTTGTTCTCCTCCCCAACCAATGTGGGACATCACAATCCACCCCCCTTCGGGGCCCAGCGTCCTCGTTGGCACTCTTTCCTTCCTCCAATCGATGTGAGATTGCCCCCAAATTCACCCCCCTTTAGGGACCAGCGTCCTTACTGGTACACCGCCTCGTGTCTACCCCCTTCGGGGAACAGCGAGAAGGCTGGCACATCGTCCGATGTCTAACTCTGATACCATTTGTAAGGGTGTGGAAACCTTCCCCTAGCAGACACGTTTTAAAGCCTTAAGGGGAAGCCCAAAAGGGATTGTAACGACCCAGATCCGCCGCTAGCAGATATTGTCCTCTTTGAGCTTTCCCTTTCGGGCTTCCCCTCAAGGCTTTAAAACGCGTCTGCTAGGGGAAGGTTGCCACACCCTTATAAGGACTGTTTNATATCGTTGTTAATTTCATCTTCTTGTGGAATAGATTTGTGAGAAATTTGTTGATTTTATCTCAGGCAATGAGGCAGTCAGAATTGCGATCAAGATATAAATTTGTCTGCAGTTGCCAGCGATGTAGTGCCAAGCCCCCAACTTATGTGGACCATGCTTTGCAAGTAAGAAAAACTATTAAATTTTTTTTACAACTTGTTGTGTTTTTGAAATTTGGCTATGAATTCAAATTTTTCCTTAACAAAGATGAAAACCAATGTAAAGAAAATGTGAGGAATCAATCAC
ATATTTCAAAAACCAAAGCCAAATGCTAAAAAACTATGTAGTATCCTTAGCGATTAGGAGGAGTTTGGTATCACTTCTTGACATAGTGAAAATGTATTTGGCTTATTGAAAACATTCAAGATGTTTATGTAAGGTATCTCCTTGTAAAAGAGTGCTAGGGAGACGCTGGGCCTTGAAGGGGGTGGATTGTGATGTCCCACATTGGTTGGGGAGGAGAACAAAAAGACCCTTTATGAGGGTGTGGAAACCTTCCCCTAGTAGATGCGTTTTAAAGCCTTGAGGGGAATCTCGAAAGGGAAAGCCCAAAGAGGACAATATGTACTAGCGGTGGATCTAGGTCGTTACAAATGGTATCAGAGCCAAACACCGGATGATGTGCCAGCCTTCTCGCTGTTCCCCGAAGGGGGGTAGACACGAGCCAGTGTGCTAGTAAGGACGCTGGTCCCAAAGGCGGGTGGATTTGGTGGGGGTCTCATATCGATTGGAGAGAGGAACGAGTGCCAGCGAGGACGCTGGACACCAAAGGGGGGTGGATTGTTATGTCCCACATTGGTTATAAGGGTGCGGAAACCTTCCCCTAGCAGACGCATTTTAAAGCCTTGAGGGGAAGCCTGAAAGGAAAAGCCCAAAGAGGACAATATCTACTAGCAGTGGGTTTGGGTCGTTACATAAACTGCCCTCATCTGACTAACTTGCAGGCTCCATGTCTTTCTTGCCCCTCCTAGTGTACTAATTAGTAATTGGTGGCCTAACATCACCCGCTGGGGTCGGAGACATTGTTGAGTTGCTCCCATGTTGCTCCCAAGTTGCTTCATGACCTAGCAAGTTCTTCCAACTGACTAATAACTTAGTCATGCCCGTGGTGCTATTCTTTTGATAACTGAAAATTTCCTCACGTTCAACTTTCCATCCAAAATCACTGGTCTACACAGGCAGCTTGGGCTTACTTGATGGTTTTGTTCCCAGAGCCTTCTCATCTGAGAATCATGGAATACTTGATGTATTGTTGTCCCAAGTGGAAGTTGGAGTTTATAAGCCACTGGCCCAATCTTGTATTCGATGAAATAAGGTTCAAAGAACTTCAGGGCTAACTTCTCACTACATTTCTTTGGTGGTGATGCCTGTCTATAAGGACGAATTTTGAAGAAAATCCAATCACCTACTGCAAATTCCACTTCCCTTTGTTTCATCCGAGGAGGAGTTAGGTGTTTTTTTGGATTCCATACAATAGCTAAGGAGGAGGTGCCCTGCCATTCACCACTTTGAATATATGGAACAGTGGTGTTATGCCTGCCCAACACAACTATTGTATAACGTTCACCACGAAAACACCTTAGGTAATTTTCTACACATTTATTTTATTTATTTTCCTTTGTTCATTTGCCTGAGGGTGGTATGGAGTGCTTTGCCAGAGTTGAGTCCCCTGTATCTGAAGAAGTTCAGTCCAGAAATGACTGATAAAGATCTTATCATAGTCTGAAATGATTGAGTTTGGAAACCTATGTAACTCTACTACCTTTCTAATAAATAGGGATGCTTATGTCTTAGCTGTAAACCGATGTTTGATGGCAAAAAACTGAGCATATTTACCGAAATGGTCCACGACTACTAAGATGGTATCATGTCCTTCTGATCGTGGTAATCCTTCAATAAACCCCATAGCTGATTTGTTTCGTTGACAGACTACACAATCCTTCAATAAACCCCATGCCTTAACAATACAACTCACTAGTCAAATGTTTGTAGGACTGTAGGAAACCAGAGTGTCCTCCAATTACAGAGTGATGGTACATGTGTAGAGTAGATGGGATCAAAGTGGAAGTTTTTGATAATGGTAACCTCCCCTTGCATTTCAATGTTCCTTGATGAAAGGAGAATTTGGATAGCTTTTCCCTTTCGTCTTGCAACTGTTGGATGATCTTGCCGAGCTCGGGGTCATCTTCAACCTCTTCTTTAATCGTATCTACATCCAATATGGCAGGGGGACAAATGGGTGAGAC
TCACTAATACGAGCTGAGATTTAGTGTTGGGAACTTACGATCATGAAGAAGGCCCAATTGGTGAAGGGACCGGAACAATCCTAAGCTGATGCAGGTAAGCAACCACTAAAATGGCCTAGTGTGTTAACTCTAAGGGGAACTCATCTTCATCAAAGGATAATAGCTAGTGGCGAACTTGATCCCCACATGTACCATTACGATACAGAGTGAAGCAATGCAGCAATAGATACGTCGAGAATGGCCAACGAAAAGATTTTCGAAATTGAAATTTTTAGCTCAAAAGAAAAAGGACCTAGTTTTCAATCGCATTAGTCATGCTAGGGTATTGTAAGTACTAAGTCTTTGCCAGGGTATCAGAGCCAAATTAGTGAGGGAAGAGGAGTATTCAATCTCTAAGAATCAAGGTTGAATGCAGGAAGAAGATGAAAAATTCAAAGGAATTTGGAGTATGTTATGACTGGGAGAATAGCCCTTGTGTTCCAGGGGGTAAGACGCGGAAAATGCAGTAAGCCCATTGACAAGTTTGGGTGCTTAATCATACAAAAGATTCCAAAGAAAATTTTATAACAATAACCTGACATCTTTATTAAAGACATTCTAATAATGAAGAAAATGACAAACGAAGATTGATAACTTGTCGGCCTCACAATGAAGGCTCACATGTGAATTATTAATCTTTTGTCTGGGTGAATTATTGGTCTTTTGTCTTTGTGAATTATTAGTTTGCTTTAGGTTTTGTCTATAAAAAGACTTCTTGAAGGTATAGAGAGGCATTGGTCGATAGCTATCTCTAAAATCAATCAGAAACCAAGAGACGAAACCTTTCTACTCCATTTATCTTTTTTATTTTCGACATTTTAATGAATGAGTTCTTAAAAAAGTTTACCAAATCAAATCGTCTCAAATGTTCTAACCAAGCAGAAGAAAGTTTAGATGGGAGAGGGAAACCTCCACCATCTTGAATACTGTCAATCTAGAATACCATATTGCAAGAATTGAAGAAAAACTTCAAAATTGGACTCTTTCGAAGGTAGATCCTAATACTATTTGCTAATTTTCAACCTTCAATTTTGCTCAAAGATCATGCATCAAATTCTCGGAGAAAAATATCCCAATTCATAGCAATAGAGAATCTCTGAATCTTTTTTCAAAATCTCTTATACTCCAACACCGTGAAGAGTTTAATTATCTTCACGTCGGTCTTGTTCAAGTTGCGATAAAACCGTTATTCAGACTTGGGCTAGATATCCCTGTCTTTATCTCCTTTCGAGATAAGAGACATGAAGATTTCTCAAATTCCTTCCTGGGAATGGTTCAATATAATTTAGAGAATGTACCTGTCTATTTCAATTGCTATCTAAATTTTACTTTGTCACTCAAAGATCCGCATATATTATCATCCCTTATGCTGGACCTATAATAAAAAAAACTTGAACATCAAGGTTGAAACCCATTCTTCAGTCGTTATATTCAGAGGTTATTACAAGTTCATGAATACAAATATCTCTCCAAGAGCATTAAGATCCTCACCAAAATGATCCACTATGCTCATAGAAGCAAATATTGGGAAGTCGTGACCGTTCCAAAACCCTCCCTTAGGATCAAATAACTAAAAATACTCTCTGGAAAATAGAAGACGCCCGTTTTCTCAAAAGAAAAGAGTCTCAAAACCCTGTTCAAATCATCGAACATGATAACGGAAGCGTTGAAATAAAGTTCAGCAAAGAACCTTCGTCAAATCCAAAAGTGAAAAAATTCCTGAGTTCGAGACCGAGTATTTCAGGAATTTCAAGCTCAATATATGACCCTTTAAAAGTAAAGGATGTCAACTACGATCAAAGAAGAGCCTCGATCCACTATGAAGATGGCTCAAGATCTCCAATCCATACTGATATGGATACTCANAGGTAATTTTCTACACATTTATTTTATTTATTTTCCTTTGTTCATTTGCCTGAGGGTGGTATGGAGTGCTTTGCCAGAGTTGAGTCCCCTGTATC
TGAAGAAGTTCAGTCCAGAAATGACTGATAAAGATCTTATCATAGTCTGAAATGATTGAGTTTGGAAACCTATGTAACTCTACTACCTTTCTAATAAATAGGGATGCTTATGTCTTAGCTGTAAACCGATGTTTGATGGCAAAAAACTGAGCATATTTACCGAAATGGTCCACGACTACTAAGATGGTATCATGTCCTTCTGATCGTGGTAATCCTTCAATAAACCCCATAGCTGATTTGTTTCGTTGACAGACTACACAATCCTTCAATAAACCCCATGCCTTAACAATACAACTCACTAGTCAAATGTTTGTAGGACTGTAGGAAACCAGAGTGTCCTCCAATTACAGAGTGATGGTACATGTGTAGAGTAGATGGGATCAAAGTGGAAGTTTTTGATAATGGTAACCTCCCCTTGCATTTCAATGTTCCTTGATGAAAGGAGAATTTGGATAGCTTTTCCCTTTCGTCTTGCAACTGTTGGATGATCTTGCCGAGCTCGGGGTCATCTTCAACCTCTTCTTTAATCGTATCTACATCCAATATGGCAGGGGGACAAATGGGTGAGACTCACTAATACGAGCTGAGATTTAGTGTTGGGAACTTACGATCATGAAGAAGGCCCAATTGGTGAAGGGACCGGAACAATCCTAAGCTGATGCAGGTAAGCAACCACTAAAATGGCCTAGTGTGTTAACTCTAAGGGGAACTCATCTTCATCAAAGGATAATAGCTAGTGGCGAACTTGATCCCCACATGTACCATTACGATACAGAGTGAAGCAATGCAGCAATAGATACGTCGAGAATGGCCAACGAAAAGATTTTCGAAATTGAAATTTTTAGCTCAAAAGAAAAAGGACCTAGTTTTCAATCGCATTAGTCATGCTAGGGTATTGTAAGTACTAAGTCTTTGCCAGGGTATCAGAGCCAAATTAGTGAGGGAAGAGGAGTATTCAATCTCTAAGAATCAAGGTTGAATGCAGGAAGAAGATGAAAAATTCAAAGGAATTTGGAGTATGTTATGACTGGGAGAATAGCCCTTGTGTTCCAGGGGGTAAGACGCGGAAAATGCAGTAAGCCCATTGACAAGTTTGGGTGCTTAATCATACAAAAGATTCCAAAGAAAATTTTATAACAATAACCTGACATCTTTATTAAAGACATTCTAATAATGAAGAAAATGACAAACGAAGATTGATAACTTGTCGGCCTCACAATGAAGGCTCACATGTGAATTATTAATCTTTTGTCTGGGTGAATTATTGGTCTTTTGTCTTTGTGAATTATTAGTTTGCTTTAGGTTTTGTCTATAAAAAGACTTCTTGAAGGTATAGAGAGGCATTGGTCGATAGCTATCTCTAAAATCAATCAGAAACCAAGAGACGAAACCTTTCTACTCCATTTATCTTTTTTATTTTCGACATTTTAATGAATGAGTTCTTAAAAAAGTTTACCAAATCAAATCGTCTCAAATGTTCTAACCAAGCAGAAGAAAGTTTAGATGGGAGAGGGAAACCTCCACCATCTTGAATACTGTCAATCTAGAATACCATATTGCAAGAATTGAAGAAAAACTTCAAAATTGGACTCTTTCGAAGGTAGATCCTAATACTATTTGCTAATTTTCAACCTTCAATTTTGCTCAAAGATCATGCATCAAATTCTCGGAGAAAAATATCCCAATTCATAGCAATAGAGAATCTCTGAATCTTTTTTCAAAATCTCTTATACTCCAACACCGTGAAGAGTTTAATTATCTTCACGTCGGTCTTGTTCAAGTTGCGATAAAACCGTTATTCAGACTTGGGCTAGATATCCCTGTCTTTATCTCCTTTCGAGATAAGAGACATGAAGATTTCTCAAATTCCTTCCTGGGAATGGTTCAATATAATTTAGAGAATGTACCTGTCTATTTCAATTGCTATCTAAATTTTACTTTGTCACTCAAAGATCCGCATATATTATCATCCCTTATGCTGGACCTATAATAAAAAAAAC
TTGAACATCAAGGTTGAAACCCATTCTTCAGTCGTTATATTCAGAGGTTATTACAAGTTCATGAATACAAATATCTCTCCAAGAGCATTAAGATCCTCACCAAAATGATCCACTATGCTCATAGAAGCAAATATTGGGAAGTCGTGACCGTTCCAAAACCCTCCCTTAGGATCAAATAACTAAAAATACTCTCTGGAAAATAGAAGACGCCCGTTTTCTCAAAAGAAAAGAGTCTCAAAACCCTGTTCAAATCATCGAACATGATAACGGAAGCGTTGAAATAAAGTTCAGCAAAGAACCTTCGTCAAATCCAAAAGTGAAAAAATTCCTGAGTTCGAGACCGAGTATTTCAGGAATTTCAAGCTCAATATATGACCCTTTAAAAGTAAAGGATGTCAACTACGATCAAAGAAGAGCCTCGATCCACTATGAAGATGGCTCAAGATCTCCAATCCATACTGATATGGATACTCAATCTGTCTACGAAAGTCGCTAAACGTCATTAGATTGAACGATTGAACCATTCCCAGCAAGAAATTTGAGAAATCTTCATGTTTCGTGTCTTGAAAGGAGATAAAGATGGGGATATCTTCAGAGATTCCCTATTGCTATTAATTGAGATATTTTTCTCCGAGAATTTGATGCATGATCTTTGAGTAAAATTGAAGGTTGAAACTTAGTAAATAGTTTTAGTATCTACCTTCGGAATAGTCCAATTTGAAGTTTTTCTTCAATTCTGGCCATATGGTATTCTTGATTGACAGTATGGTGAGGCCTTTGTGCCCACTCTGTGTCCATCTGAACTTTCTTCTGCTTGGTTAGAAGATTTGAGACGATTTTATTTGGGGAACATTTTTAAGAACTTGTTTATTAAAGTGTCGAAAATAAGAAAGATAAGGGGAGTATAAAGGTTTCTTCTCTTGGTTTCTGATTGATTTTAGAGAGACCTATCGACCAATGCCTCTCTATACATTCAAGAAGTCTTTTTATAGACTCCAGATTTGAAAATAAACGTAAACTAAGAGATAAGATTAAATCTAGCAAGATAGGAACGATGCGAGATTTGTTTTTCTAAAAGTCTTAAATTAAGAAAAAACCTAAAGCAAACTAATAATTCACCGAGACAAAAGACTAATAATTCATAAAGACAAAAGATTAATAATTCACATGTGAGCTTTTATTGTGTGGCCAACAAGTTGTCAATCTTCATTTGTCATTTTCTTCTTTATTGAAGTGTCTTTTTAATAAAGATGTCAGGTTGTTGATATAAAATTTTCTTTGGAATCTTTCGTATGATTGAGCACCCAAACTTGCCAACGACTTACTGCACTTCCCGCGTCTTACCACATGGAACACCAAGGGCTATTCTCTCAGCTATCACATACTCCAAATTCCTCTAAATTTCTCTTATCTTCTTCCTGCGTTCAACCTTGATTCTTAGAGATTGAATACTCTTTCCTCACTAATTTGGCTCTGATACCACGACGAAGGCTTAGTATTTACAATACCCTAGCATAACTAATGCGATTGAACATGACTTAAGCTTAGTATTTACAATACCCTAATTCTTAGAGATTGAATACTCCTACTGCGCTTCATGCATCTTACCACCTGGAACACAAGAGTTATTCTCTCAGCTATCATATACTCCAAATTCCTCAAGGAAATAATACAACAGGAAAGCAGTAAAAGAAAGCGATAAAGTAGTACAAGGCATGAGTTTCAGCGTGTAACGGGAATGCGGCCATAGGCATCACGACTTGAACAAGCATGACCGTGACACTAAACAAATATATGAAAAACGTTAAAAGAGTGCTATTGTCCTTCCCTAACACGTATACCGCGCAACATTCCCTCAGCGGTCTCATAATAATTGGTGGGTGGTTAGATTTCTTGTGATTTCTGTTGTTCTTTCGAATAGTGTCGTAGTAAAATGATATGCATTTTAAATTTCTTCAGGAAATCTCTGCTGTCAATGTGGAATTGCTTGATTCAACTTC
CATTAGCAACTTTGATTATGACACTGCAATAGCAAGAATAGATGATTATGTCAACAATGCCATAGCTGAGTACCTGTCTATTGGTTCCTCTGAATCATGTTGTGAGAAGCTTCAAAACTTGCTTACTTTAGGTTTCTACGACGAGCAAGCGGAAGACGGGGACGGAAAACAGCTTCTTAACTTAAGGCTGCATCCCGTGCACTTCCTGTTGCTGAACGCGTACACTGCTCTAGCATCGGCTTACAAAGTCCGTTCATGGAATGGCGATGAAAATCAATGCAACGCTACGATGAGCAAAACAAGTGCAGCTTACTCCTTGTTCCTTGCAGGTGCTACTCATCATCTTTTTCTTTCTGAACCGTCTTTGATTGCTTCTGCTGCAAATTGTTGGGTTGTTGCTGGAGAGTCTTTGCTCATTCTTGTTAAACATAGCTCATTATGGGGCTCCAACACTTCAAAATCGAGCTCCCCTATGGGCGAAATAACGTGTTTAAACTGCTCATGGGTCGATAAGTTCAATACGAGTAGAATCCATGGTCGATCTATAGAAGCAGATTTTCGGGAGTTTTCGATTGGTATTTCTAATTGCATTGCTAATATTTCACAAAAATATTGGAGCTTTCTGGCTCATGAATGCTCATATTTGAAGGCTTTCACTGACCCCTTTGATTTCAGCTGGCCGAAGACGATCACGACATGTTCGAATTACCGTGATCGTTCGTGTGATTGTAGTAAAATTCAAGATGTTTCTGACCAAGACAGGCAATCTATCTTTGAGCTTGGTATCCATTGCTTATTCTATGGAGGTTATTTAGCAAGTATTTGTTATGGTCACCATTCACATTTGGCATCCCAGATTCAATGTATTTTACATGACATGAACTGATACAGTTATTTAGTAGAAATATAAGTTATTCTGCGATTTGAGTACACAATCAGTCATTTCCGACACATTATCCTTATGTTAATAGCAAATGGGTGACATGAAGATTTTTAGTCCTCTACTCTGCCAATTGAGCTGTATCGGCCATTTTCGATACATTAGGCTTATTTTAATATTAAATTTGAACTGGTGACATGAAAATTTTCAATTCTCTGTTCTACTAATTGAGCTATATCGGCCATTTCATTTCTGACATATTAGACTTATTTTAATAGCAGATTTGATTTCTTGCCATTTCTTTACGATGATTGACCCTTTTAGATTAGGAAAGAACCAATTTACCACTCGAATGTTTCATAAGATGATCATAGACGACTTAACTTTAATCTGGAGAGAGTTATAAACACTCATGTCTATGAGAGCTCCTAGCATAGGTGAAAATGATTTTTGTAGTCAATTTGATATGTCTATAATTTTGCGAGATGGAATTCATTCATTTTCATATAGAGAAAGTAGATAGAATTTACTCATTCCAAGATGGAACTCACTCATTTTCATATATGGAAAGTGGATGAATAGTTTTCTTGAGTGCTTATTTGGGGACTTGAATAATGGGCCCTATTCTCTCATTGGCTCGGGACGAATTTAGTTTATGGCTAGATCATAAACAAACTATTTATTAGAGAATAAGTGATACTTAAGGAAAAGATGTAACTCCACGGGTAAAATGAACTTTTGATCGAGTTATAATTTAACAACCCTATTTTGCTACTAAATAGAGCTACTATCATATGTATGACGTGACCTTTTTTATGAAGAAACGTTCATTTAAAATAATTCATAAAATAATGTCATGGACTTACACATTAAAAAAATATCAATAAATCCAAAAAAAAACAAATAAAAAAAATGAACAATTAGGAATCTCCATANCGCCACCATCATTCGTACATGACTGTCTTGCCTTTACCTGAAAGGTAAGAGTAGCACATGACTTGAGTATTTTAAGAAATACTTAATAGGTGACCCCACTATTGAGGTTGTGAAAGGTAAATATATGCGTGGGACTTATCATTTAAATCATCATTTCATTCTCTTTGGCGGCTTTCTAGTTTGA
GCTAAATTGGATATGTAGTACTTCTTTACCCACGACCCACATGTGCAAACGTGGACCCACCAGGTGGGTTCATGCACGTACGAGTCGTGTGTAGAATAGTAAATATACATTCAATGACTCGCTTAGGATAGTTACCTAAGAGAATACGACTTTAAAATGATAAGTCTAACTCATGTTGCTGTCTCTGCATTTACTAACCCCAATAGTGGGGTCACTTACTGAGTATTCCTTAAAATATTCAAGCTACGTATTACTTCTACTTTTTAGGTAAAGGCAAAACACCCGTG

mRNA sequence

CATTTTCAGTAAGGAGAGAGAAATGGAGATGGAAATGAGAGCAATGGAAGACATAGAAATGGCGGAAGACATTACTCCGCCGTTGCCTCCCCTCACCGCCGCTCTCCACGATGCCTTCCTCCTCACTCACTGCTCCTCCTGTTTCTCCCCTCTCCCAAATTCCTCAATTTCTCACTCTAATCTCCTCCGCTACTGCTCCCCTATATGCTCCCATTCCGATTCCCTCACCGCCGCCGTCTTCTCCACCGGTCAGTTCCCCTTCTCCGACACCTCCGACCTCCGCGCCTCCCTCCGCCTCCTCCACCTCCTTCTCTCTGATCCCTCCGCTTGGCGCTCTGCTCCTCCCGAGCGTATCTTTGGCCTTCTCACCAATCGGGAGAAATTGATGCTTGCTGATGACGATTCCGAGGTTTTCGTCAAGATTCGGGAAGGGTCCGACGCCATGGCCGCTTCCAGACGGACGAACTCTGCCGATATTCGCTATGACAACGCCTTGGAAGAGGCTATCCTGTGCCTTGTCTTGACCAACGCCGTCGAGGTTCAGGATTCGGTTGGCCGAACCATTGGGATTGCTGTGTACCATCCAACTTTCTGCTGGATCAATCACAGTTGTTCTCCCAATGCTTGTTACAGATTTGAAACTCCGTCGGATTCCATCAAGACGAGGCTACGGATTTCCCCCTTCTGTACTGACATTGGTACTGGTGAAGGAAGTTGCAGTCAAATGAGTACTGTTCGTAGAAACTTTTCGCATTTCATTACAAAAGATTTTCAGGGTTATGGTCCAAGAGTCATGGTTAGGAGTATAAAGAGTATAAGGAATGGCGAAGCAGTCACGATTGCATACTGTGACTTGTTGCAACCCAAGGAAATCTCTGCTGTCAATGTGGAATTGCTTGATTCAACTTCCATTAGCAACTTTGATTATGACACTGCAATAGCAAGAATAGATGATTATGTCAACAATGCCATAGCTGAGTACCTGTCTATTGGTTCCTCTGAATCATGTTGTGAGAAGCTTCAAAACTTGCTTACTTTAGGTTTCTACGACGAGCAAGCGGAAGACGGGGACGGAAAACAGCTTCTTAACTTAAGGCTGCATCCCGTGCACTTCCTGTTGCTGAACGCGTACACTGCTCTAGCATCGGCTTACAAAGTCCGTTCATGGAATGGCGATGAAAATCAATGCAACGCTACGATGAGCAAAACAAGTGCAGCTTACTCCTTGTTCCTTGCAGGTGCTACTCATCATCTTTTTCTTTCTGAACCGTCTTTGATTGCTTCTGCTGCAAATTGTTGGGTTGTTGCTGGAGAGTCTTTGCTCATTCTTGTTAAACATAGCTCATTATGGGGCTCCAACACTTCAAAATCGAGCTCCCCTATGGGCGAAATAACGTGTTTAAACTGCTCATGGGTCGATAAGTTCAATACGAGTAGAATCCATGGTCGATCTATAGAAGCAGATTTTCGGGAGTTTTCGATTGGTATTTCTAATTGCATTGCTAATATTTCACAAAAATATTGGAGCTTTCTGGCTCATGAATGCTCATATTTGAAGGCTTTCACTGACCCCTTTGATTTCAGCTGGCCGAAGACGATCACGACATGTTCGAATTACCGTGATCGTTCGTGTGATTGTAGTAAAATTCAAGATGTTTCTGACCAAGACAGGCAATCTATCTTTGAGCTTGGTATCCATTGCTTATTCTATGGAGGTTATTTAGCAAGTATTTGTTATGGTCACCATTCACATTTGGCATCCCAGATTCAATGTAAAGGCAAAACACCCGTG

Coding sequence (CDS)

ATGGAGATGGAAATGAGAGCAATGGAAGACATAGAAATGGCGGAAGACATTACTCCGCCGTTGCCTCCCCTCACCGCCGCTCTCCACGATGCCTTCCTCCTCACTCACTGCTCCTCCTGTTTCTCCCCTCTCCCAAATTCCTCAATTTCTCACTCTAATCTCCTCCGCTACTGCTCCCCTATATGCTCCCATTCCGATTCCCTCACCGCCGCCGTCTTCTCCACCGGTCAGTTCCCCTTCTCCGACACCTCCGACCTCCGCGCCTCCCTCCGCCTCCTCCACCTCCTTCTCTCTGATCCCTCCGCTTGGCGCTCTGCTCCTCCCGAGCGTATCTTTGGCCTTCTCACCAATCGGGAGAAATTGATGCTTGCTGATGACGATTCCGAGGTTTTCGTCAAGATTCGGGAAGGGTCCGACGCCATGGCCGCTTCCAGACGGACGAACTCTGCCGATATTCGCTATGACAACGCCTTGGAAGAGGCTATCCTGTGCCTTGTCTTGACCAACGCCGTCGAGGTTCAGGATTCGGTTGGCCGAACCATTGGGATTGCTGTGTACCATCCAACTTTCTGCTGGATCAATCACAGTTGTTCTCCCAATGCTTGTTACAGATTTGAAACTCCGTCGGATTCCATCAAGACGAGGCTACGGATTTCCCCCTTCTGTACTGACATTGGTACTGGTGAAGGAAGTTGCAGTCAAATGAGTACTGTTCGTAGAAACTTTTCGCATTTCATTACAAAAGATTTTCAGGGTTATGGTCCAAGAGTCATGGTTAGGAGTATAAAGAGTATAAGGAATGGCGAAGCAGTCACGATTGCATACTGTGACTTGTTGCAACCCAAGGAAATCTCTGCTGTCAATGTGGAATTGCTTGATTCAACTTCCATTAGCAACTTTGATTATGACACTGCAATAGCAAGAATAGATGATTATGTCAACAATGCCATAGCTGAGTACCTGTCTATTGGTTCCTCTGAATCATGTTGTGAGAAGCTTCAAAACTTGCTTACTTTAGGTTTCTACGACGAGCAAGCGGAAGACGGGGACGGAAAACAGCTTCTTAACTTAAGGCTGCATCCCGTGCACTTCCTGTTGCTGAACGCGTACACTGCTCTAGCATCGGCTTACAAAGTCCGTTCATGGAATGGCGATGAAAATCAATGCAACGCTACGATGAGCAAAACAAGTGCAGCTTACTCCTTGTTCCTTGCAGGTGCTACTCATCATCTTTTTCTTTCTGAACCGTCTTTGATTGCTTCTGCTGCAAATTGTTGGGTTGTTGCTGGAGAGTCTTTGCTCATTCTTGTTAAACATAGCTCATTATGGGGCTCCAACACTTCAAAATCGAGCTCCCCTATGGGCGAAATAACGTGTTTAAACTGCTCATGGGTCGATAAGTTCAATACGAGTAGAATCCATGGTCGATCTATAGAAGCAGATTTTCGGGAGTTTTCGATTGGTATTTCTAATTGCATTGCTAATATTTCACAAAAATATTGGAGCTTTCTGGCTCATGAATGCTCATATTTGAAGGCTTTCACTGACCCCTTTGATTTCAGCTGGCCGAAGACGATCACGACATGTTCGAATTACCGTGATCGTTCGTGTGATTGTAGTAAAATTCAAGATGTTTCTGACCAAGACAGGCAATCTATCTTTGAGCTTGGTATCCATTGCTTATTCTATGGAGGTTATTTAGCAAGTATTTGTTATGGTCACCATTCACATTTGGCATCCCAGATTCAATGTAAAGGCAAAACACCCGTG

Protein sequence

MEMEMRAMEDIEMAEDITPPLPPLTAALHDAFLLTHCSSCFSPLPNSSISHSNLLRYCSPICSHSDSLTAAVFSTGQFPFSDTSDLRASLRLLHLLLSDPSAWRSAPPERIFGLLTNREKLMLADDDSEVFVKIREGSDAMAASRRTNSADIRYDNALEEAILCLVLTNAVEVQDSVGRTIGIAVYHPTFCWINHSCSPNACYRFETPSDSIKTRLRISPFCTDIGTGEGSCSQMSTVRRNFSHFITKDFQGYGPRVMVRSIKSIRNGEAVTIAYCDLLQPKEISAVNVELLDSTSISNFDYDTAIARIDDYVNNAIAEYLSIGSSESCCEKLQNLLTLGFYDEQAEDGDGKQLLNLRLHPVHFLLLNAYTALASAYKVRSWNGDENQCNATMSKTSAAYSLFLAGATHHLFLSEPSLIASAANCWVVAGESLLILVKHSSLWGSNTSKSSSPMGEITCLNCSWVDKFNTSRIHGRSIEADFREFSIGISNCIANISQKYWSFLAHECSYLKAFTDPFDFSWPKTITTCSNYRDRSCDCSKIQDVSDQDRQSIFELGIHCLFYGGYLASICYGHHSHLASQIQCKGKTPV
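
The protein sequence above corresponds to a codon-by-codon translation of the CDS. As a minimal sketch, assuming the standard genetic code, the snippet below translates the first 14 codons of the CDS and reproduces the start of the protein. The `translate` helper is a hypothetical name written for this illustration; libraries such as Biopython provide equivalent functionality.

```python
# Standard genetic code in compact form: codons are ordered with each
# position cycling through T, C, A, G (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def translate(cds):
    """Translate a DNA coding sequence; unknown codons (e.g. with N) become 'X'."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if all(base in BASES for base in codon):
            index = (16 * BASES.index(codon[0])
                     + 4 * BASES.index(codon[1])
                     + BASES.index(codon[2]))
            protein.append(AMINO_ACIDS[index])
        else:
            protein.append("X")
    return "".join(protein)


# First 42 nucleotides of the CDS listed in this record.
cds_start = "ATGGAGATGGAAATGAGAGCAATGGAAGACATAGAAATGGCG"
print(translate(cds_start))  # prints: MEMEMRAMEDIEMA
```

The same function can be applied to the full CDS to check it against the complete protein sequence.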
BLAST of Cp4.1LG00g03020 vs. Swiss-Prot
Match: SDG41_ARATH (Protein SET DOMAIN GROUP 41 OS=Arabidopsis thaliana GN=SDG41 PE=2 SV=1)

HSP 1 Score: 316.6 bits (810), Expect = 5.7e-85
Identity = 227/620 (36.61%), Positives = 314/620 (50.65%), Query Frame = 1

Query: 3   MEMRAMEDIEMAEDITPPLPPLTAALHDAFLLTHCSSCFSPLPNSSISHSNLLRYCSPIC 62
           ME+RA EDIE+  D+ PPL PL ++L+D+FL +HCSSCFS LP S         YCS  C
Sbjct: 1   MEIRAAEDIEIRTDLFPPLSPLASSLYDSFLSSHCSSCFSLLPPSPPQPL----YCSAAC 60

Query: 63  SHSDSLTAAVFSTGQFPFSDT----SDLRASLRLLHLLLSDPSAWRSAPPERIFGLLTNR 122
           S +DS T    ++ QFP   T    SD+R SL LL+    D S+     P R+  LLTN 
Sbjct: 61  SLTDSFT----NSPQFPPEITPILPSDIRTSLHLLNSTAVDTSS----SPHRLNNLLTNH 120

Query: 123 EKLMLADDDSEVFVKIREGSDAMAASRRTNSADIRYDNALEEAILCLVLTNAVEVQDSVG 182
             LM    D  + V I   ++ +A   R+N    R +  LEEA +C VLTNAVEV DS G
Sbjct: 121 HLLMA---DPSISVAIHHAANFIATVIRSN----RKNTELEEAAICAVLTNAVEVHDSNG 180

Query: 183 RTIGIAVYHPTFCWINHSCSPNACYRFETPSDSIKTRLRISPFCTDIGTGEGSCSQMSTV 242
             +GIA+Y+ +F WINHSCSPN+CYRF     S            D+       S    +
Sbjct: 181 LALGIALYNSSFSWINHSCSPNSCYRFVNNRTSYH----------DVHVTNTETSSNLEL 240

Query: 243 RRNFSHFITKDFQGYGPRVMVRSIKSIRNGEAVTIAYCDLLQPKEI-------------- 302
           +            G GP+++VRSIK I++GE +T++Y DLLQP  +              
Sbjct: 241 QEQVCGTSLNSGNGNGPKLIVRSIKRIKSGEEITVSYIDLLQPTGLRQSDLWSKYRFMCN 300

Query: 303 ----SAVNVELLDS------------TSISNFD----YDTAIARIDDYVNNAIAEYLSIG 362
               +A     +DS            T++ +FD     D A+ +++DY+  AI ++LS  
Sbjct: 301 CGRCAASPPAYVDSILEGVLTLESEKTTVGHFDGSTNKDEAVGKMNDYIQEAIDDFLSDN 360

Query: 363 -SSESCCEKLQNLLTLGFYDEQAEDGDGKQLLNLRLHPVHFLLLNAYTALASAYKVRSWN 422
              ++CCE ++++L  G   ++       Q   LRLH  H++ LNAY  LA+AY++RS +
Sbjct: 361 IDPKTCCEMIESVLHHGIQFKE-----DSQPHCLRLHACHYVALNAYITLATAYRIRSID 420

Query: 423 GDENQCNATMSKTSAAYSLFLAGATHHLFLSEPSLIASAANCWVVAGESLLILVKHSSLW 482
             E      MS+ SAAYSLFLAG +HHLF +E S   SAA  W  AGE L  L     + 
Sbjct: 421 S-ETGIVCDMSRISAAYSLFLAGVSHHLFCAERSFAISAAKFWKNAGELLFDLAPKLLM- 480

Query: 483 GSNTSKSSSPMGEITCLNCSWVDKFNTSRIHGRSIEADFREFSIGISNCIANISQKYWSF 542
                   S   ++ C  C  ++  N+ R        D +E S  I +C+ +ISQ  WSF
Sbjct: 481 ------ELSVESDVKCTKCLMLETSNSHR--------DIKEKSRQILSCVRDISQVTWSF 540

Query: 543 LAHECSYLKAFTDPFDFSWPKTITTCSNYRDRSCDCSKIQDVSDQDRQSIFELGIHCLFY 584
           L   C YL+ F  P DFS    +T  +  R+ S   SK Q V      ++  L  HCL Y
Sbjct: 541 LTRGCPYLEKFRSPVDFS----LTRTNGEREES---SKDQTV------NVLLLSSHCLLY 557
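
The percentages in an HSP summary line are the match counts divided by the alignment length. The sketch below parses a summary line of the form used in this report and recomputes the percentages; `parse_hsp_counts` and `percent` are hypothetical helper names, and the parser is an assumption about this plain-text layout rather than a general BLAST parser.

```python
import re


def parse_hsp_counts(line):
    """Return {label: (matches, alignment_length)} for each 'Label = n/m' field."""
    fields = re.findall(r"(\w+)\s*=\s*(\d+)/(\d+)", line)
    return {label: (int(n), int(m)) for label, n, m in fields}


def percent(matches, length):
    """Percentage rounded to two decimals, as shown in the report."""
    return round(100.0 * matches / length, 2)


summary = "Identity = 227/620 (36.61%), Positives = 314/620 (50.65%), Query Frame = 1"
for label, (matches, length) in parse_hsp_counts(summary).items():
    print(label, percent(matches, length))
# prints:
# Identity 36.61
# Positives 50.65
```

"Query Frame = 1" is skipped because it has no `n/m` fraction.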

BLAST of Cp4.1LG00g03020 vs. TrEMBL
Match: A0A0A0KAK3_CUCSA (Uncharacterized protein OS=Cucumis sativus GN=Csa_6G014840 PE=4 SV=1)

HSP 1 Score: 728.0 bits (1878), Expect = 9.1e-207
Identity = 407/648 (62.81%), Positives = 460/648 (70.99%), Query Frame = 1

Query: 1   MEMEMRAMEDIEMAEDITPPLPPLTAALHDAFLLTHCSSCFSPLPNSSISHSNLLRYCSP 60
           MEMEM A+EDIEMAEDI+PPL PLT+ALHD+FL THCSSCFS LPN  ISHS  L YCS 
Sbjct: 1   MEMEMIAVEDIEMAEDISPPLFPLTSALHDSFLFTHCSSCFSLLPNPPISHSIPLHYCSL 60

Query: 61  ICS--HSDSLTAAVFSTGQFP--FSDTSDLRASLRLLHLLLSDPSAWRSAPPERIFGLLT 120
            CS  HSD LT A FS   FP   SDTSDLRASLRLLHLLLS PS   S PP+RI+GLLT
Sbjct: 61  KCSLSHSDPLTDAFFSIHPFPDASSDTSDLRASLRLLHLLLSHPSPSLSPPPDRIYGLLT 120

Query: 121 NREKLMLADDDSEVFVKIREGSDAMAASRRTNSADIRYDNALEEAILCLVLTNAVEVQDS 180
           NR KLM   +DSEVF+K+REG++A+AA RR N ADI    ALEEA+LCLVLTNAV+VQDS
Sbjct: 121 NRHKLMTPQNDSEVFLKLREGANAIAALRRKNYADIPPGTALEEAVLCLVLTNAVDVQDS 180

Query: 181 VGRTIGIAVYHPTFCWINHSCSPNACYRFETPSDSIKTRLRISPFCTDIGTGEGSCSQMS 240
           +G+TIGIAVY  TF WINHSCSPNACYRFETPSDS+ TR RI+P CTD  + EGSC QM 
Sbjct: 181 IGQTIGIAVYASTFSWINHSCSPNACYRFETPSDSVTTRFRIAPSCTDFMSDEGSCRQMG 240

Query: 241 TVRRNFSHFITKD--FQGYGPRVMVRSIKSIRNGEAVTIAYCDLLQPK------------ 300
            VR N   FI +     G GPRV+VRSIK I+ GEAVTIAYCDLLQPK            
Sbjct: 241 NVRSNILDFIREGALLNGNGPRVVVRSIKRIKKGEAVTIAYCDLLQPKARRQSELWSRYQ 300

Query: 301 ------EISAVNVELLD-------STSISNFDYDTAIARIDDYVNNAIAE---YLSIGSS 360
                   SAV +  +D       S  +   D  T I+  D   + A+     Y+    +
Sbjct: 301 FVCSCQRCSAVPLTYVDHALQEISSVKVELLD-STPISNFDH--DTAVRRIDEYVDNAIT 360

Query: 361 ES--------CCEKLQNLLTLGFYDEQAEDGDGKQLLNLRLHPVHFLLLNAYTALASAYK 420
           E         CCEKLQNLLT GF+DEQ EDG+GKQ ++LRLHP+HFLLLNAYTAL SAYK
Sbjct: 361 EYLSTSSPESCCEKLQNLLTFGFHDEQVEDGEGKQHVSLRLHPLHFLLLNAYTALTSAYK 420

Query: 421 VRSWN----------GDENQCNA-TMSKTSAAYSLFLAGATHHLFLSEPSLIASAANCWV 480
           VRS +           + N+ NA TM KTSAAY+LFLAGATH LFL EPSL+ASAANCWV
Sbjct: 421 VRSCDLVALSSEMDKDNGNRHNALTMGKTSAAYALFLAGATHRLFLFEPSLVASAANCWV 480

Query: 481 VAGESLLILVKHSSLWG--SNTSKSSSPMGEITCLNCSWVDKFNTSRIHGRSIEADFREF 540
           VAGESLLIL +HSSLW   +NTS    P+G+  C NCSWVD+FN SRIHG+ ++ADFREF
Sbjct: 481 VAGESLLILARHSSLWATTTNTSNWVFPLGKRMCYNCSWVDEFNASRIHGQPVQADFREF 540

Query: 541 SIGISNCIANISQKYWSFLAHECSYLKAFTDPFDFSWPKT--ITTCSNYRDRSCDCSKIQ 584
           SIGISNCIA+ISQK WS L H C YLKAFT PFDFSWPKT     C    D SC CSK Q
Sbjct: 541 SIGISNCIASISQKCWSSLTHGCPYLKAFTGPFDFSWPKTNEQDICGRGIDHSCACSKTQ 600

BLAST of Cp4.1LG00g03020 vs. TrEMBL
Match: A0A061FQH5_THECC (SET domain-containing protein, putative isoform 3 OS=Theobroma cacao GN=TCM_035633 PE=4 SV=1)

HSP 1 Score: 448.0 bits (1151), Expect = 1.8e-122
Identity = 281/629 (44.67%), Positives = 366/629 (58.19%), Query Frame = 1

Query: 2   EMEMRAMEDIEMAEDITPPLPPLTAALHDAFLLTHCSSCFSPLPNSSISH--SNLLRYCS 61
           EMEMRA +D++  +DITPP+ PL+++L+D+FL +HCSSCFSPLP  +  H   ++  YCS
Sbjct: 12  EMEMRAKQDLDYGQDITPPILPLSSSLYDSFLSSHCSSCFSPLP-PTFPHIPRHVPLYCS 71

Query: 62  PICSHSDSLTAAVFSTGQFPFS--DTSDLRASLRLLHLLLSDPSAWRSAPPERIFGLLTN 121
           P CS S S   +  +    P +  D+SDLR +LRLL  L S P         RI GLLTN
Sbjct: 72  PTCSSSHSPLHSSSAESLLPPTCPDSSDLRTALRLLQSLPSTPPHLH-----RIDGLLTN 131

Query: 122 REKLMLADDDSEVFVKIREGSDAMAASRRTNSADIRYDN---ALEEAILCLVLTNAVEVQ 181
               ML     EV  KIR+G+ AMAA+R++ + D    +    LEEA+L LV+TNAVEVQ
Sbjct: 132 HH--MLTSSSPEVAAKIRQGAIAMAAARKSRNRDNEGQSDGFLLEEAVLSLVITNAVEVQ 191

Query: 182 DSVGRTIGIAVYHPTFCWINHSCSPNACYRFETPS--------DSIKTRLRISPFCTDIG 241
           D  GR++GIAVY  +F WINHSCSPNACYRF   S        +   + LRI P      
Sbjct: 192 DKSGRSLGIAVYDLSFSWINHSCSPNACYRFSISSPHATLSFREDSSSTLRIVPSVL--- 251

Query: 242 TGEGSCSQMSTVRRNFSHFITKDFQGY--GPRVMVRSIKSIRNGEAVTIAYCDLLQPKEI 301
            GE  C   S V        TK  +GY  GP+++VRSIK IR GE V ++Y DLLQPKEI
Sbjct: 252 -GE-ECDACSCVEH------TKGNKGYELGPKIIVRSIKRIRKGEEVCVSYTDLLQPKEI 311

Query: 302 SAVNVELLDSTSISNFDYDTAIARIDDYVNNAIAEYLSIGSSESCCEKLQNLLTLGFYDE 361
           S  N+    S+   N   D A  R+  Y++  I E LS G  ESCCEKL+++L LG + E
Sbjct: 312 STCNLSFSSSSFDHNLYRDEASKRVYSYMDETITEVLSDGDPESCCEKLESILNLGLHIE 371

Query: 362 QAEDGDGKQLLNLRLHPVHFLLLNAYTALASAYKVRS-----WNGDENQCNA---TMSKT 421
           Q E  DGK LLN +LHP H L LNAYT L SAY++ S      + D ++C      M++T
Sbjct: 372 QVESKDGKSLLNFKLHPFHHLALNAYTTLTSAYRICSSDLLALHPDVDECQLKAFDMNRT 431

Query: 422 SAAYSLFLAGATHHLFLSEPSLIASAANCWVVAGESLLILVKHSSLWGSNTSKSSSPMGE 481
           SAAYSL LAGATH LF SE SLIASAAN W  AGESL+ L + SSLW     K   P+ E
Sbjct: 432 SAAYSLLLAGATHRLFCSESSLIASAANFWTNAGESLVTLAR-SSLWNLFV-KWGFPISE 491

Query: 482 IT------CLNCSWVDKFNTSRIHGRSIEADFREFSIGISNCIANISQKYWSFLAHECSY 541
           ++      C  CS +D F+T  I  ++   +F   S    +C++N++ K W FL   C Y
Sbjct: 492 VSTIAKHKCSKCSLMDIFDTKSILSQAQRVNFENISSDFLDCVSNMTAKIWRFLVRGCHY 551

Query: 542 LKAFTDPFDFSW-----------------PKTITTCSNYRDRSCDCSKIQDVSDQDRQSI 583
           L+ F DPFDF W                  K IT  S Y+ ++      Q  +++ R  +
Sbjct: 552 LEVFEDPFDFGWLVHTWDFHARANRNDEDSKFITEGSIYKHQA------QWYTNERRIHV 611

BLAST of Cp4.1LG00g03020 vs. TrEMBL
Match: A0A0D2UBQ4_GOSRA (Uncharacterized protein OS=Gossypium raimondii GN=B456_010G134600 PE=4 SV=1)

HSP 1 Score: 423.7 bits (1088), Expect = 3.7e-115
Identity = 263/607 (43.33%), Positives = 344/607 (56.67%), Query Frame = 1

Query: 3   MEMRAMEDIEMAEDITPPLPPLTAALHDAFLLTHCSSCFSPLPNSSISHSNLLRYCSPIC 62
           MEMRA +DIE+ +DITPPL PL+ +LHD+FL +HCSSCFSPL      H     YCS  C
Sbjct: 1   MEMRAKQDIEIGDDITPPLLPLSFSLHDSFLSSHCSSCFSPLSFPPSPHHYGSLYCSAPC 60

Query: 63  SHSDSLTAAVFSTGQFPFSD--TSDLRASLRLLHLLLSDPSAWRSAPPERIF--GLLTNR 122
           S S S  ++  +    P +   +SDLR +LRLL   LS PS   + P    F  GLLTN 
Sbjct: 61  SSSHSPISSSSAESFLPLTCPLSSDLRTALRLL---LSLPS---TCPHLHRFTNGLLTNY 120

Query: 123 EKLMLADDDSEVFVKIREGSDAMAASRRTN---SADIRYDNALEEAILCLVLTNAVEVQD 182
            KL       E   +IR+G+ AMAA+R+     S D   D  LEEA+LCLV+TNAVEVQD
Sbjct: 121 LKLT---SSPEFAAQIRQGAIAMAAARKLRKGLSLDQSDDVLLEEAVLCLVVTNAVEVQD 180

Query: 183 SVGRTIGIAVYHPTFCWINHSCSPNACYRFETP-------SDSIKTRLRISPFCTDIGTG 242
             GR++GIAVY P+F WINHSCSPNACYRF           +   + LRI P  ++   G
Sbjct: 181 ESGRSLGIAVYDPSFSWINHSCSPNACYRFIVSPPNATSFGEDSASALRIVPSVSEENFG 240

Query: 243 EGSCSQMSTVRRNFSHFITKDFQGYGPRVMVRSIKSIRNGEAVTIAYCDLLQPKEISAVN 302
             SCS+ +           K+   YGP++MVRSIK I+ GE V ++Y DLLQPKEI A N
Sbjct: 241 VCSCSEYN-----------KEGYKYGPKIMVRSIKRIKKGEEVCVSYTDLLQPKEILASN 300

Query: 303 VELLDSTSISNFDYDTAIARIDDYVNNAIAEYLSIGSSESCCEKLQNLLTLGFYDEQAED 362
                +    N   D A  ++  YV+    E+LS+G  ESCC+KL+++L  GF+ EQ E 
Sbjct: 301 PSFSSAGLDLNLYRDEANKKLSHYVDETNTEFLSVGDPESCCKKLESVLEGGFHVEQLES 360

Query: 363 GDGKQLLNLRLHPVHFLLLNAYTALASAYKVRSWN-------GDENQCNA-TMSKTSAAY 422
            DGK  LN + HP + + LN+Y  LASAY++RS +        DE+Q  A  MS+ SA Y
Sbjct: 361 EDGKSRLNCKFHPFNHIALNSYMTLASAYRIRSSDFLAFQSKTDESQLKAFEMSRISAGY 420

Query: 423 SLFLAGATHHLFLSEPSLIASAANCWVVAGESLLILVKHSSLWG--SNTSKSSSPMGEIT 482
           SL LAGATH+LF SE SLI SA N W  AGESLL  +  SS+W          S + +  
Sbjct: 421 SLLLAGATHYLFCSESSLIVSAVNFWKQAGESLL-TIAGSSVWNLLGLPKSELSTVVKYK 480

Query: 483 CLNCSWVDKFNTSRIHGRSIEADFREFSIGISNCIANISQKYWSFLAHECSYLKAFTDPF 542
           C  CS +D F    I  ++   +F   S     C+ + S K+W FL H C YL+ F DPF
Sbjct: 481 CSECSLMDIFGAKSILNQAERTNFENISSDFLACVRSASPKFWRFLIHGCHYLETFKDPF 540

Query: 543 DFSW---PKTITTCSNYRDRSCDCSKIQDVSDQDRQSIFELGIHCLFYGGYLASICYGHH 583
           DF W      +    ++     +C    +     R  I+++G+HCL YG  LA ICYG +
Sbjct: 541 DFRWLAHAHCVAEDVDFIKEDSNCEHHAEWYTNARTHIYKVGMHCLVYGVILAHICYGQN 586

BLAST of Cp4.1LG00g03020 vs. TrEMBL
Match: M5VHG1_PRUPE (Uncharacterized protein (Fragment) OS=Prunus persica GN=PRUPE_ppa023162mg PE=4 SV=1)

HSP 1 Score: 423.3 bits (1087), Expect = 4.8e-115
Identity = 277/659 (42.03%), Positives = 355/659 (53.87%), Query Frame = 1

Query: 3   MEMRAMEDIEMAEDITPPLPPLTAALHDAFLLTHCSSCFS--------------PLPNSS 62
           MEMRA EDIE+ EDITPPL PL  ALHD+ L +HCSSCFS              P P++ 
Sbjct: 1   MEMRAEEDIEIGEDITPPLTPLGFALHDSLLSSHCSSCFSLLPPHPFPPLHFTPPFPHNP 60

Query: 63  ISHSNLLRYCSPICSHSDS-----------LTAAVFSTGQFPFSDTSDLRASLRLLHLLL 122
               +   YCSP+CS SDS           L         +P  D+SDLRA+LRLLH L 
Sbjct: 61  HHVLSSSSYCSPLCSTSDSPLHVSSAELHLLHLLQSHPSTYPHGDSSDLRAALRLLHSLP 120

Query: 123 SDPSAWRSAPPERIFGLLTNREKLMLADDDSEVFVKIREGSDAMAASRRT-NSADIRYDN 182
           +      + P  RI GLLTN  K +  DD      +IR+G+ AM  +R+  + A   YD 
Sbjct: 121 A------TGPSARIAGLLTNHHKFLHHDDHH----RIRDGARAMFLARKMRDEAPNVYDA 180

Query: 183 ALEEAILCLVLTNAVEVQDSVGRTIGIAVYHPTFCWINHSCSPNACYRF------ETPSD 242
            LEEA LCLVLTNAVEVQD  GRT+GI+VY P+FCWINHSCSPNACYRF        P  
Sbjct: 181 VLEEAALCLVLTNAVEVQDKTGRTLGISVYGPSFCWINHSCSPNACYRFLVSPPPPPPCS 240

Query: 243 SIKTRLRISPFCTDIGTGEGSC--SQMSTVRRNFSHFITKDFQGYGPRVMVRSIKSIRNG 302
           + +T LRI+P    +G G  SC       +R  F   I      YGPRV+VRSIK I+ G
Sbjct: 241 AERTPLRIAP----LGQGTQSCGIDICCRLRVVFVAII------YGPRVIVRSIKRIKKG 300

Query: 303 EAVTIAYCDLLQPKEISAVNV----------------------ELLDSTSISNFDYDTAI 362
           E VT+ Y DLLQPK +    +                      ++L+  S +NF+  +  
Sbjct: 301 EEVTVTYTDLLQPKAMRQSELWSRYRFICSCTRCSASPLTYVDQVLEEISAANFNSSSLS 360

Query: 363 A-----------RIDDYVNNAIAEYLSIGSSESCCEKLQNLLTLGFYDEQAEDGDGKQLL 422
           +           R+ +Y+++AI +YLSIG  ES   +L+++LT G  D+Q+E  +    L
Sbjct: 361 SDINFNRDKATQRLTNYIDDAIDDYLSIGDPESSSVRLEHVLTQGLSDKQSECKEETSQL 420

Query: 423 NLRLHPVHFLLLNAYTALASAYKVRSWNGDENQCNA-TMSKTSAAYSLFLAGATHHLFLS 482
              LHP+H L LNAYT LA     +    D++  NA  +S+TS AYSL LAGATHHLF S
Sbjct: 421 TYWLHPLHHLSLNAYTTLAQPLYSKM---DDHLLNALDLSRTSTAYSLLLAGATHHLFRS 480

Query: 483 EPSLIASAANCWVVAGESLLILVKHSSLWGSNTSK-----SSSPMGEITCLNCSWVDKFN 542
           E SLI S AN W  AGESLL L + SS+W     +     + S  G+  C NCS  DKF 
Sbjct: 481 ESSLIVSVANFWSSAGESLLTLAR-SSVWSQFVQRDLPVSNPSSTGKYRCPNCSLADKFE 540

Query: 543 TSRIHGRSIEADFREFSIGISNCIANISQKYWSFLAHECSYLKAFTDPFDFSWPKTITTC 573
           T   HG+   ADF   S    +C+ N +Q  W+FL   C YL+   +P DFSW  T+   
Sbjct: 541 TDSFHGQVRYADFDYVSNEFVDCVTNFTQNVWNFLGLGCQYLRLVKNPIDFSWLGTVRYS 600

BLAST of Cp4.1LG00g03020 vs. TrEMBL
Match: A0A0D2R7I2_GOSRA (Uncharacterized protein OS=Gossypium raimondii GN=B456_010G134600 PE=4 SV=1)

HSP 1 Score: 423.3 bits (1087), Expect = 4.8e-115
Identity = 262/607 (43.16%), Positives = 343/607 (56.51%), Query Frame = 1

Query: 3   MEMRAMEDIEMAEDITPPLPPLTAALHDAFLLTHCSSCFSPLPNSSISHSNLLRYCSPIC 62
           MEMRA +DIE+ +DITPPL PL+ +LHD+FL +HCSSCFSPL      H     YCS  C
Sbjct: 1   MEMRAKQDIEIGDDITPPLLPLSFSLHDSFLSSHCSSCFSPLSFPPSPHHYGSLYCSAPC 60

Query: 63  SHSDSLTAAVFSTGQFPFSD--TSDLRASLRLLHLLLSDPSAWRSAPPERIF--GLLTNR 122
           S S S  ++  +    P +   +SDLR +LRLL   LS PS   + P    F  GLLTN 
Sbjct: 61  SSSHSPISSSSAESFLPLTCPLSSDLRTALRLL---LSLPS---TCPHLHRFTNGLLTNY 120

Query: 123 EKLMLADDDSEVFVKIREGSDAMAASRRTN---SADIRYDNALEEAILCLVLTNAVEVQD 182
            KL       E   +IR+G+ AMAA+R+     S D   D  LEEA+LCLV+TNAVEVQD
Sbjct: 121 LKLT---SSPEFAAQIRQGAIAMAAARKLRKGLSLDQSDDVLLEEAVLCLVVTNAVEVQD 180

Query: 183 SVGRTIGIAVYHPTFCWINHSCSPNACYRFETP-------SDSIKTRLRISPFCTDIGTG 242
             GR++GIAVY P+F WINHSCSPNACYRF           +   + LRI P  ++   G
Sbjct: 181 ESGRSLGIAVYDPSFSWINHSCSPNACYRFIVSPPNATSFGEDSASALRIVPSVSEENFG 240

Query: 243 EGSCSQMSTVRRNFSHFITKDFQGYGPRVMVRSIKSIRNGEAVTIAYCDLLQPKEISAVN 302
             SCS+ +     +          YGP++MVRSIK I+ GE V ++Y DLLQPKEI A N
Sbjct: 241 VCSCSEYNKGTEGYK---------YGPKIMVRSIKRIKKGEEVCVSYTDLLQPKEILASN 300

Query: 303 VELLDSTSISNFDYDTAIARIDDYVNNAIAEYLSIGSSESCCEKLQNLLTLGFYDEQAED 362
                +    N   D A  ++  YV+    E+LS+G  ESCC+KL+++L  GF+ EQ E 
Sbjct: 301 PSFSSAGLDLNLYRDEANKKLSHYVDETNTEFLSVGDPESCCKKLESVLEGGFHVEQLES 360

Query: 363 GDGKQLLNLRLHPVHFLLLNAYTALASAYKVRSWN-------GDENQCNA-TMSKTSAAY 422
            DGK  LN + HP + + LN+Y  LASAY++RS +        DE+Q  A  MS+ SA Y
Sbjct: 361 EDGKSRLNCKFHPFNHIALNSYMTLASAYRIRSSDFLAFQSKTDESQLKAFEMSRISAGY 420

Query: 423 SLFLAGATHHLFLSEPSLIASAANCWVVAGESLLILVKHSSLWG--SNTSKSSSPMGEIT 482
           SL LAGATH+LF SE SLI SA N W  AGESLL  +  SS+W          S + +  
Sbjct: 421 SLLLAGATHYLFCSESSLIVSAVNFWKQAGESLL-TIAGSSVWNLLGLPKSELSTVVKYK 480

Query: 483 CLNCSWVDKFNTSRIHGRSIEADFREFSIGISNCIANISQKYWSFLAHECSYLKAFTDPF 542
           C  CS +D F    I  ++   +F   S     C+ + S K+W FL H C YL+ F DPF
Sbjct: 481 CSECSLMDIFGAKSILNQAERTNFENISSDFLACVRSASPKFWRFLIHGCHYLETFKDPF 540

Query: 543 DFSW---PKTITTCSNYRDRSCDCSKIQDVSDQDRQSIFELGIHCLFYGGYLASICYGHH 583
           DF W      +    ++     +C    +     R  I+++G+HCL YG  LA ICYG +
Sbjct: 541 DFRWLAHAHCVAEDVDFIKEDSNCEHHAEWYTNARTHIYKVGMHCLVYGVILAHICYGQN 588

BLAST of Cp4.1LG00g03020 vs. TAIR10
Match: AT1G43245.1 (AT1G43245.1 SET domain-containing protein)

HSP 1 Score: 316.6 bits (810), Expect = 3.2e-86
Identity = 227/620 (36.61%), Positives = 314/620 (50.65%), Query Frame = 1

Query: 3   MEMRAMEDIEMAEDITPPLPPLTAALHDAFLLTHCSSCFSPLPNSSISHSNLLRYCSPIC 62
           ME+RA EDIE+  D+ PPL PL ++L+D+FL +HCSSCFS LP S         YCS  C
Sbjct: 1   MEIRAAEDIEIRTDLFPPLSPLASSLYDSFLSSHCSSCFSLLPPSPPQPL----YCSAAC 60

Query: 63  SHSDSLTAAVFSTGQFPFSDT----SDLRASLRLLHLLLSDPSAWRSAPPERIFGLLTNR 122
           S +DS T    ++ QFP   T    SD+R SL LL+    D S+     P R+  LLTN 
Sbjct: 61  SLTDSFT----NSPQFPPEITPILPSDIRTSLHLLNSTAVDTSS----SPHRLNNLLTNH 120

Query: 123 EKLMLADDDSEVFVKIREGSDAMAASRRTNSADIRYDNALEEAILCLVLTNAVEVQDSVG 182
             LM    D  + V I   ++ +A   R+N    R +  LEEA +C VLTNAVEV DS G
Sbjct: 121 HLLMA---DPSISVAIHHAANFIATVIRSN----RKNTELEEAAICAVLTNAVEVHDSNG 180

Query: 183 RTIGIAVYHPTFCWINHSCSPNACYRFETPSDSIKTRLRISPFCTDIGTGEGSCSQMSTV 242
             +GIA+Y+ +F WINHSCSPN+CYRF     S            D+       S    +
Sbjct: 181 LALGIALYNSSFSWINHSCSPNSCYRFVNNRTSYH----------DVHVTNTETSSNLEL 240

Query: 243 RRNFSHFITKDFQGYGPRVMVRSIKSIRNGEAVTIAYCDLLQPKEI-------------- 302
           +            G GP+++VRSIK I++GE +T++Y DLLQP  +              
Sbjct: 241 QEQVCGTSLNSGNGNGPKLIVRSIKRIKSGEEITVSYIDLLQPTGLRQSDLWSKYRFMCN 300

Query: 303 ----SAVNVELLDS------------TSISNFD----YDTAIARIDDYVNNAIAEYLSIG 362
               +A     +DS            T++ +FD     D A+ +++DY+  AI ++LS  
Sbjct: 301 CGRCAASPPAYVDSILEGVLTLESEKTTVGHFDGSTNKDEAVGKMNDYIQEAIDDFLSDN 360

Query: 363 -SSESCCEKLQNLLTLGFYDEQAEDGDGKQLLNLRLHPVHFLLLNAYTALASAYKVRSWN 422
              ++CCE ++++L  G   ++       Q   LRLH  H++ LNAY  LA+AY++RS +
Sbjct: 361 IDPKTCCEMIESVLHHGIQFKE-----DSQPHCLRLHACHYVALNAYITLATAYRIRSID 420

Query: 423 GDENQCNATMSKTSAAYSLFLAGATHHLFLSEPSLIASAANCWVVAGESLLILVKHSSLW 482
             E      MS+ SAAYSLFLAG +HHLF +E S   SAA  W  AGE L  L     + 
Sbjct: 421 S-ETGIVCDMSRISAAYSLFLAGVSHHLFCAERSFAISAAKFWKNAGELLFDLAPKLLM- 480

Query: 483 GSNTSKSSSPMGEITCLNCSWVDKFNTSRIHGRSIEADFREFSIGISNCIANISQKYWSF 542
                   S   ++ C  C  ++  N+ R        D +E S  I +C+ +ISQ  WSF
Sbjct: 481 ------ELSVESDVKCTKCLMLETSNSHR--------DIKEKSRQILSCVRDISQVTWSF 540

Query: 543 LAHECSYLKAFTDPFDFSWPKTITTCSNYRDRSCDCSKIQDVSDQDRQSIFELGIHCLFY 584
           L   C YL+ F  P DFS    +T  +  R+ S   SK Q V      ++  L  HCL Y
Sbjct: 541 LTRGCPYLEKFRSPVDFS----LTRTNGEREES---SKDQTV------NVLLLSSHCLLY 557

BLAST of Cp4.1LG00g03020 vs. NCBI nr
Match: gi|778709799|ref|XP_011656459.1| (PREDICTED: protein SET DOMAIN GROUP 41 [Cucumis sativus])

HSP 1 Score: 740.0 bits (1909), Expect = 3.3e-210
Identity = 410/646 (63.47%), Positives = 463/646 (71.67%), Query Frame = 1

Query: 1   MEMEMRAMEDIEMAEDITPPLPPLTAALHDAFLLTHCSSCFSPLPNSSISHSNLLRYCSP 60
           MEMEM A+EDIEMAEDI+PPL PLT+ALHD+FL THCSSCFS LPN  ISHS  L YCS 
Sbjct: 1   MEMEMIAVEDIEMAEDISPPLFPLTSALHDSFLFTHCSSCFSLLPNPPISHSIPLHYCSL 60

Query: 61  ICS--HSDSLTAAVFSTGQFP--FSDTSDLRASLRLLHLLLSDPSAWRSAPPERIFGLLT 120
            CS  HSD LT A FS   FP   SDTSDLRASLRLLHLLLS PS   S PP+RI+GLLT
Sbjct: 61  KCSLSHSDPLTDAFFSIHPFPDASSDTSDLRASLRLLHLLLSHPSPSLSPPPDRIYGLLT 120

Query: 121 NREKLMLADDDSEVFVKIREGSDAMAASRRTNSADIRYDNALEEAILCLVLTNAVEVQDS 180
           NR KLM   +DSEVF+K+REG++A+AA RR N ADI    ALEEA+LCLVLTNAV+VQDS
Sbjct: 121 NRHKLMTPQNDSEVFLKLREGANAIAALRRKNYADIPPGTALEEAVLCLVLTNAVDVQDS 180

Query: 181 VGRTIGIAVYHPTFCWINHSCSPNACYRFETPSDSIKTRLRISPFCTDIGTGEGSCSQMS 240
           +G+TIGIAVY  TF WINHSCSPNACYRFETPSDS+ TR RI+P CTD  + EGSC QM 
Sbjct: 181 IGQTIGIAVYASTFSWINHSCSPNACYRFETPSDSVTTRFRIAPSCTDFMSDEGSCRQMG 240

Query: 241 TVRRNFSHFITKDFQGYGPRVMVRSIKSIRNGEAVTIAYCDLLQPK-------------- 300
            VR N   FI +DFQG GPRV+VRSIK I+ GEAVTIAYCDLLQPK              
Sbjct: 241 NVRSNILDFIREDFQGNGPRVVVRSIKRIKKGEAVTIAYCDLLQPKARRQSELWSRYQFV 300

Query: 301 ----EISAVNVELLD-------STSISNFDYDTAIARIDDYVNNAIAE---YLSIGSSES 360
                 SAV +  +D       S  +   D  T I+  D   + A+     Y+    +E 
Sbjct: 301 CSCQRCSAVPLTYVDHALQEISSVKVELLD-STPISNFDH--DTAVRRIDEYVDNAITEY 360

Query: 361 --------CCEKLQNLLTLGFYDEQAEDGDGKQLLNLRLHPVHFLLLNAYTALASAYKVR 420
                   CCEKLQNLLT GF+DEQ EDG+GKQ ++LRLHP+HFLLLNAYTAL SAYKVR
Sbjct: 361 LSTSSPESCCEKLQNLLTFGFHDEQVEDGEGKQHVSLRLHPLHFLLLNAYTALTSAYKVR 420

Query: 421 SWN----------GDENQCNA-TMSKTSAAYSLFLAGATHHLFLSEPSLIASAANCWVVA 480
           S +           + N+ NA TM KTSAAY+LFLAGATH LFL EPSL+ASAANCWVVA
Sbjct: 421 SCDLVALSSEMDKDNGNRHNALTMGKTSAAYALFLAGATHRLFLFEPSLVASAANCWVVA 480

Query: 481 GESLLILVKHSSLWG--SNTSKSSSPMGEITCLNCSWVDKFNTSRIHGRSIEADFREFSI 540
           GESLLIL +HSSLW   +NTS    P+G+  C NCSWVD+FN SRIHG+ ++ADFREFSI
Sbjct: 481 GESLLILARHSSLWATTTNTSNWVFPLGKRMCYNCSWVDEFNASRIHGQPVQADFREFSI 540

Query: 541 GISNCIANISQKYWSFLAHECSYLKAFTDPFDFSWPKT--ITTCSNYRDRSCDCSKIQDV 584
           GISNCIA+ISQK WS L H C YLKAFT PFDFSWPKT     C    D SC CSK QDV
Sbjct: 541 GISNCIASISQKCWSSLTHGCPYLKAFTGPFDFSWPKTNEQDICGRGIDHSCACSKTQDV 600

BLAST of Cp4.1LG00g03020 vs. NCBI nr
Match: gi|659126234|ref|XP_008463080.1| (PREDICTED: protein SET DOMAIN GROUP 41 isoform X1 [Cucumis melo])

HSP 1 Score: 736.9 bits (1901), Expect = 2.8e-209
Identity = 409/650 (62.92%), Positives = 460/650 (70.77%), Query Frame = 1

Query: 3   MEMRAMEDIEMAEDITPPLPPLTAALHDAFLLTHCSSCFSPLPNSSISHSNLLRYCSPIC 62
           MEMRA+EDIEMAEDITPPL PLT+ALHD+FL THCSSCFS LPN  ISHS LL YCS  C
Sbjct: 1   MEMRALEDIEMAEDITPPLFPLTSALHDSFLSTHCSSCFSLLPNPPISHSPLLHYCSLKC 60

Query: 63  S--HSDSLTAAVFSTGQFP--FSDTSDLRASLRLLHL--LLSDPSAWRSAPPERIFGLLT 122
           S  HSD LTAA FS    P   SDTSDLRASLRLLHL  LLS PS   S PP RIFGLLT
Sbjct: 61  SLSHSDPLTAAFFSIHPLPDASSDTSDLRASLRLLHLHLLLSHPSPSLSPPPHRIFGLLT 120

Query: 123 NREKLMLADDDSEVFVKIREGSDAMAASRRTNSADIRYDNALEEAILCLVLTNAVEVQDS 182
           NR KLM   + SEVF+K+RE ++A+AA RR N ADI    ALEEA+LCLVLTNAV+VQDS
Sbjct: 121 NRHKLMTPQNGSEVFLKLREAANAIAALRRKNYADISPGTALEEAVLCLVLTNAVDVQDS 180

Query: 183 VGRTIGIAVYHPTFCWINHSCSPNACYRFETPSDSIKTRLRISPFCTDIGTGEGSCSQMS 242
           +G+TIGIAVY PTF WINHSCSPNACYRFETPSD   TR RI+P CTD  + EG+C QM 
Sbjct: 181 IGQTIGIAVYAPTFSWINHSCSPNACYRFETPSDFFTTRFRIAPSCTDFVSDEGTCRQMG 240

Query: 243 TVRRNFSHFITKDFQGYGPRVMVRSIKSIRNGEAVTIAYCDLLQPKEISAVNVELLDSTS 302
            VR N   F+ +DFQG GPRV+VRSIK I+ GEAVTIAYCDLLQPK           S  
Sbjct: 241 NVRSNILDFMREDFQGNGPRVVVRSIKRIKKGEAVTIAYCDLLQPK-------ARRQSEL 300

Query: 303 ISNFDYDTAIARID----DYVNNAIAEYLSIGSS-------------------------- 362
            S + +  +  R       YV++A+ E  ++                             
Sbjct: 301 WSRYQFVCSCQRCSAVPLTYVDHALQEISAVKVELLDSAPISNFDHDTAVRRIDEYVDNA 360

Query: 363 ----------ESCCEKLQNLLTLGFYDEQAEDGDGKQLLNLRLHPVHFLLLNAYTALASA 422
                     ESCCEKLQNLLT GF DEQ EDG+GKQ ++LRLHP HFLLLNAYTAL SA
Sbjct: 361 ITEYLSIGSPESCCEKLQNLLTFGFRDEQVEDGEGKQPVSLRLHPSHFLLLNAYTALTSA 420

Query: 423 YKVRSWN----------GDENQCNA-TMSKTSAAYSLFLAGATHHLFLSEPSLIASAANC 482
           YKVRS +           +EN+ NA TMSKTSAAY+LFLAGATHHLFL EPSLIASAANC
Sbjct: 421 YKVRSCDLLALSSEMDKDNENRHNALTMSKTSAAYALFLAGATHHLFLFEPSLIASAANC 480

Query: 483 WVVAGESLLILVKHSSLWG--SNTSKSSSPMGEITCLNCSWVDKFNTSRIHGRSIEADFR 542
           WVVAGESLLIL +HSSLW   +NTS    P+G+  C NCSWVD+FN SRIHGR I+ADFR
Sbjct: 481 WVVAGESLLILARHSSLWATTTNTSDWGFPLGKRMCSNCSWVDEFNGSRIHGRRIQADFR 540

Query: 543 EFSIGISNCIANISQKYWSFLAHECSYLKAFTDPFDFSWPKTI--TTCSNYRDRSCDCSK 584
           EFSIGISNCIA+IS+K WSFL H C YLKAFTDPFDFSWPKT       +  DRSC CSK
Sbjct: 541 EFSIGISNCIASISRKCWSFLTHGCPYLKAFTDPFDFSWPKTNDGDIGGHGIDRSCACSK 600

BLAST of Cp4.1LG00g03020 vs. NCBI nr
Match: gi|700190660|gb|KGN45864.1| (hypothetical protein Csa_6G014840 [Cucumis sativus])

HSP 1 Score: 728.0 bits (1878), Expect = 1.3e-206
Identity = 407/648 (62.81%), Positives = 460/648 (70.99%), Query Frame = 1

Query: 1   MEMEMRAMEDIEMAEDITPPLPPLTAALHDAFLLTHCSSCFSPLPNSSISHSNLLRYCSP 60
           MEMEM A+EDIEMAEDI+PPL PLT+ALHD+FL THCSSCFS LPN  ISHS  L YCS 
Sbjct: 1   MEMEMIAVEDIEMAEDISPPLFPLTSALHDSFLFTHCSSCFSLLPNPPISHSIPLHYCSL 60

Query: 61  ICS--HSDSLTAAVFSTGQFP--FSDTSDLRASLRLLHLLLSDPSAWRSAPPERIFGLLT 120
            CS  HSD LT A FS   FP   SDTSDLRASLRLLHLLLS PS   S PP+RI+GLLT
Sbjct: 61  KCSLSHSDPLTDAFFSIHPFPDASSDTSDLRASLRLLHLLLSHPSPSLSPPPDRIYGLLT 120

Query: 121 NREKLMLADDDSEVFVKIREGSDAMAASRRTNSADIRYDNALEEAILCLVLTNAVEVQDS 180
           NR KLM   +DSEVF+K+REG++A+AA RR N ADI    ALEEA+LCLVLTNAV+VQDS
Sbjct: 121 NRHKLMTPQNDSEVFLKLREGANAIAALRRKNYADIPPGTALEEAVLCLVLTNAVDVQDS 180

Query: 181 VGRTIGIAVYHPTFCWINHSCSPNACYRFETPSDSIKTRLRISPFCTDIGTGEGSCSQMS 240
           +G+TIGIAVY  TF WINHSCSPNACYRFETPSDS+ TR RI+P CTD  + EGSC QM 
Sbjct: 181 IGQTIGIAVYASTFSWINHSCSPNACYRFETPSDSVTTRFRIAPSCTDFMSDEGSCRQMG 240

Query: 241 TVRRNFSHFITKD--FQGYGPRVMVRSIKSIRNGEAVTIAYCDLLQPK------------ 300
            VR N   FI +     G GPRV+VRSIK I+ GEAVTIAYCDLLQPK            
Sbjct: 241 NVRSNILDFIREGALLNGNGPRVVVRSIKRIKKGEAVTIAYCDLLQPKARRQSELWSRYQ 300

Query: 301 ------EISAVNVELLD-------STSISNFDYDTAIARIDDYVNNAIAE---YLSIGSS 360
                   SAV +  +D       S  +   D  T I+  D   + A+     Y+    +
Sbjct: 301 FVCSCQRCSAVPLTYVDHALQEISSVKVELLD-STPISNFDH--DTAVRRIDEYVDNAIT 360

Query: 361 ES--------CCEKLQNLLTLGFYDEQAEDGDGKQLLNLRLHPVHFLLLNAYTALASAYK 420
           E         CCEKLQNLLT GF+DEQ EDG+GKQ ++LRLHP+HFLLLNAYTAL SAYK
Sbjct: 361 EYLSTSSPESCCEKLQNLLTFGFHDEQVEDGEGKQHVSLRLHPLHFLLLNAYTALTSAYK 420

Query: 421 VRSWN----------GDENQCNA-TMSKTSAAYSLFLAGATHHLFLSEPSLIASAANCWV 480
           VRS +           + N+ NA TM KTSAAY+LFLAGATH LFL EPSL+ASAANCWV
Sbjct: 421 VRSCDLVALSSEMDKDNGNRHNALTMGKTSAAYALFLAGATHRLFLFEPSLVASAANCWV 480

Query: 481 VAGESLLILVKHSSLWG--SNTSKSSSPMGEITCLNCSWVDKFNTSRIHGRSIEADFREF 540
           VAGESLLIL +HSSLW   +NTS    P+G+  C NCSWVD+FN SRIHG+ ++ADFREF
Sbjct: 481 VAGESLLILARHSSLWATTTNTSNWVFPLGKRMCYNCSWVDEFNASRIHGQPVQADFREF 540

Query: 541 SIGISNCIANISQKYWSFLAHECSYLKAFTDPFDFSWPKT--ITTCSNYRDRSCDCSKIQ 584
           SIGISNCIA+ISQK WS L H C YLKAFT PFDFSWPKT     C    D SC CSK Q
Sbjct: 541 SIGISNCIASISQKCWSSLTHGCPYLKAFTGPFDFSWPKTNEQDICGRGIDHSCACSKTQ 600

BLAST of Cp4.1LG00g03020 vs. NCBI nr
Match: gi|659126236|ref|XP_008463081.1| (PREDICTED: protein SET DOMAIN GROUP 41 isoform X2 [Cucumis melo])

HSP 1 Score: 589.0 bits (1517), Expect = 9.5e-165
Identity = 333/539 (61.78%), Positives = 374/539 (69.39%), Query Frame = 1

Query: 3   MEMRAMEDIEMAEDITPPLPPLTAALHDAFLLTHCSSCFSPLPNSSISHSNLLRYCSPIC 62
           MEMRA+EDIEMAEDITPPL PLT+ALHD+FL THCSSCFS LPN  ISHS LL YCS  C
Sbjct: 1   MEMRALEDIEMAEDITPPLFPLTSALHDSFLSTHCSSCFSLLPNPPISHSPLLHYCSLKC 60

Query: 63  S--HSDSLTAAVFSTGQFP--FSDTSDLRASLRLLHL--LLSDPSAWRSAPPERIFGLLT 122
           S  HSD LTAA FS    P   SDTSDLRASLRLLHL  LLS PS   S PP RIFGLLT
Sbjct: 61  SLSHSDPLTAAFFSIHPLPDASSDTSDLRASLRLLHLHLLLSHPSPSLSPPPHRIFGLLT 120

Query: 123 NREKLMLADDDSEVFVKIREGSDAMAASRRTNSADIRYDNALEEAILCLVLTNAVEVQDS 182
           NR KLM   + SEVF+K+RE ++A+AA RR N ADI    ALEEA+LCLVLTNAV+VQDS
Sbjct: 121 NRHKLMTPQNGSEVFLKLREAANAIAALRRKNYADISPGTALEEAVLCLVLTNAVDVQDS 180

Query: 183 VGRTIGIAVYHPTFCWINHSCSPNACYRFETPSDSIKTRLRISPFCTDIGTGEGSCSQMS 242
           +G+TIGIAVY PTF WINHSCSPNACYRFETPSD   TR RI+P CTD  + EG+C QM 
Sbjct: 181 IGQTIGIAVYAPTFSWINHSCSPNACYRFETPSDFFTTRFRIAPSCTDFVSDEGTCRQMG 240

Query: 243 TVRRNFSHFITKDFQGYGPRVMVRSIKSIRNGEAVTIAYCDLLQPKEISAVNVELLDSTS 302
            VR N   F+ +DFQG GPRV+VRSIK I+ GEAVTIAYCDLLQPK           S  
Sbjct: 241 NVRSNILDFMREDFQGNGPRVVVRSIKRIKKGEAVTIAYCDLLQPK-------ARRQSEL 300

Query: 303 ISNFDYDTAIARID----DYVNNAIAEYLSIGSS-------------------------- 362
            S + +  +  R       YV++A+ E  ++                             
Sbjct: 301 WSRYQFVCSCQRCSAVPLTYVDHALQEISAVKVELLDSAPISNFDHDTAVRRIDEYVDNA 360

Query: 363 ----------ESCCEKLQNLLTLGFYDEQAEDGDGKQLLNLRLHPVHFLLLNAYTALASA 422
                     ESCCEKLQNLLT GF DEQ EDG+GKQ ++LRLHP HFLLLNAYTAL SA
Sbjct: 361 ITEYLSIGSPESCCEKLQNLLTFGFRDEQVEDGEGKQPVSLRLHPSHFLLLNAYTALTSA 420

Query: 423 YKVRSWN----------GDENQCNA-TMSKTSAAYSLFLAGATHHLFLSEPSLIASAANC 482
           YKVRS +           +EN+ NA TMSKTSAAY+LFLAGATHHLFL EPSLIASAANC
Sbjct: 421 YKVRSCDLLALSSEMDKDNENRHNALTMSKTSAAYALFLAGATHHLFLFEPSLIASAANC 480

BLAST of Cp4.1LG00g03020 vs. NCBI nr
Match: gi|802658905|ref|XP_012080733.1| (PREDICTED: protein SET DOMAIN GROUP 41 isoform X2 [Jatropha curcas])

HSP 1 Score: 463.0 bits (1190), Expect = 7.8e-127
Identity = 289/616 (46.92%), Positives = 367/616 (59.58%), Query Frame = 1

Query: 1   MEMEMRAMEDIEMAEDITPPLPPLTAALHDAFLLTHCSSCFSPLPN--SSISHSNLLRYC 60
           MEMEM A EDI + EDIT PL PL+ +LHD+FL +HCS+CFSPLPN  SS S    L YC
Sbjct: 9   MEMEMEAGEDIGIGEDITLPLFPLSFSLHDSFLHSHCSACFSPLPNPHSSTSCPPFL-YC 68

Query: 61  SPICSHSDSLTAAVF-----STGQFPFSDTSDLRASLRLLHLLLSDPSAWRSAPPERIFG 120
           SPICS      A  +     S+   P S TSDLR +LRLLH L S      SA   RI G
Sbjct: 69  SPICSSLHFSCAEFYLLQSLSSVSAPPSSTSDLRVALRLLHSLPS-----LSAKDGRISG 128

Query: 121 LLTNREKLMLADDDSEVFVKIREGSDAMAASRRTNSADIRYDNA-------LEEAILCLV 180
           LLTNREKLM    D+E+F +IR+G+ A+AA+RR     +    A       LEE+ LCLV
Sbjct: 129 LLTNREKLMT---DNEIFTRIRDGAKAIAATRRLRDGKVAVTAANENDEVSLEESALCLV 188

Query: 181 LTNAVEVQDSVGRTIGIAVYHPTFCWINHSCSPNACYRFETPSDSIKTRLRISPFCTDIG 240
           LTNAVEVQD+ GRT+GIAVY  TF WINHSCSPNACYRF      + + L I+PF ++  
Sbjct: 189 LTNAVEVQDNEGRTLGIAVYDHTFSWINHSCSPNACYRF------LISPLSIAPFPSESR 248

Query: 241 TG---EGSCSQMSTVRRNFSHF-ITKDFQGYGPRVMVRSIKSIRNGEAVTIAYCDLLQPK 300
                 GS  + S     FS+  +TK    YGP ++VRSIK I+ GE VT+AY DLLQPK
Sbjct: 249 QAIVPAGSNGEKSA----FSNIELTKGHGEYGPMIVVRSIKRIKKGEKVTVAYTDLLQPK 308

Query: 301 EISAVNVELLDSTSISNFDYDTAIARIDDYVNNAIAEYLSIGSSESCCEKLQNLLTLGFY 360
           E +A N+    S+S  +F  D A   + DYV+  I EYLS G  ESCCEKL+++L LG  
Sbjct: 309 ETTAANLASSSSSSYHSFHRDVANRNLIDYVDEVITEYLSSGDPESCCEKLESVLVLGLL 368

Query: 361 DEQAEDGDGKQLLNLRLHPVHFLLLNAYTALASAYKVRS---------WNGDENQCNATM 420
           DE  E  +GK  L ++LHP+H L LNAY  LASAY++R+          NG + +    M
Sbjct: 369 DEPLETKEGKSQLTVKLHPLHHLALNAYMTLASAYRIRASEYLAVSSDTNGHQLEVFG-M 428

Query: 421 SKTSAAYSLFLAGATHHLFLSEPSLIASAANCWVVAGESLLILVKHS-----SLWGSNTS 480
            +T AAYS  LA A+HHLF  E SLIAS AN W  AGESLL L + S       W    S
Sbjct: 429 LRTGAAYSFLLAAASHHLFCFESSLIASVANFWTSAGESLLTLARSSLWDLFGKWELPES 488

Query: 481 KSSSPMGEITCLNCSWVDKFNTSRIHGRSIEADFREFSIGISNCIANISQKYWSFLAHEC 540
           K  S + +  C NCS +D+F  +  H  ++  DF   S    +CI + S++ WSFL  +C
Sbjct: 489 KHFS-LAKYKCSNCSLLDRFEANFSHCHAVNNDFENISSKFLDCITSFSREVWSFLIQDC 548

Query: 541 SYLKAFTDPFDFSWPKTITTCSNYRDRSCDCS-KIQDVSDQDRQSIFELGIHCLFYGGYL 584
           +YLK   DPF+ +   ++   SN  D   D   + +  ++++R +IF LG HCL  G  L
Sbjct: 549 NYLKLLNDPFNLN---SLGKLSNISDFVADSGYEAKKYANEERVTIFRLGFHCLLCGELL 600
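
The percentages in each HSP header above follow from the counts beside them: identical (or positive) positions divided by the alignment length. As a minimal sketch, assuming the header lines keep the layout shown on this page, the counts can be pulled out and the percentages recomputed in Python. The names `HSP_STATS` and `parse_hsp_stats` are hypothetical helpers introduced here for illustration and are not part of BLAST or of this database.

```python
import re

# Pattern for the per-HSP statistics line as printed on this page.
# "Posi?tives" accepts both the usual spelling and the "Postives" variant.
HSP_STATS = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"Posi?tives = (\d+)/(\d+) \(([\d.]+)%\)"
)


def parse_hsp_stats(line):
    """Return counts and recomputed percentages from one HSP statistics line."""
    match = HSP_STATS.search(line)
    if match is None:
        raise ValueError("not an HSP statistics line: %r" % line)
    identical, length, _, positives, pos_length, _ = match.groups()
    identical, length = int(identical), int(length)
    positives, pos_length = int(positives), int(pos_length)
    return {
        "identical": identical,
        "positives": positives,
        "aligned_length": length,
        # Percentage = matching positions / alignment length, two decimals.
        "identity_pct": round(100 * identical / length, 2),
        "positives_pct": round(100 * positives / pos_length, 2),
    }


stats = parse_hsp_stats(
    "Identity = 227/620 (36.61%), Positives = 314/620 (50.65%), Query Frame = 1"
)
print(stats["identity_pct"], stats["positives_pct"])  # 36.61 50.65
```

For the TAIR10 hit, 227 of 620 aligned positions are identical, which gives the 36.61% reported in the header.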

The following BLAST results are available for this feature:
Match Name | E-value | Identity (%) | Description
SDG41_ARATH | 5.7e-85 | 36.61 | Protein SET DOMAIN GROUP 41 OS=Arabidopsis thaliana GN=SDG41 PE=2 SV=1

Match Name | E-value | Identity (%) | Description
A0A0A0KAK3_CUCSA | 9.1e-207 | 62.81 | Uncharacterized protein OS=Cucumis sativus GN=Csa_6G014840 PE=4 SV=1
A0A061FQH5_THECC | 1.8e-122 | 44.67 | SET domain-containing protein, putative isoform 3 OS=Theobroma cacao GN=TCM_0356...
A0A0D2UBQ4_GOSRA | 3.7e-115 | 43.33 | Uncharacterized protein OS=Gossypium raimondii GN=B456_010G134600 PE=4 SV=1
M5VHG1_PRUPE | 4.8e-115 | 42.03 | Uncharacterized protein (Fragment) OS=Prunus persica GN=PRUPE_ppa023162mg PE=4 S...
A0A0D2R7I2_GOSRA | 4.8e-115 | 43.16 | Uncharacterized protein OS=Gossypium raimondii GN=B456_010G134600 PE=4 SV=1

Match Name | E-value | Identity (%) | Description
AT1G43245.1 | 3.2e-86 | 36.61 | SET domain-containing protein

Match Name | E-value | Identity (%) | Description
gi|778709799|ref|XP_011656459.1| | 3.3e-210 | 63.47 | PREDICTED: protein SET DOMAIN GROUP 41 [Cucumis sativus]
gi|659126234|ref|XP_008463080.1| | 2.8e-209 | 62.92 | PREDICTED: protein SET DOMAIN GROUP 41 isoform X1 [Cucumis melo]
gi|700190660|gb|KGN45864.1| | 1.3e-206 | 62.81 | hypothetical protein Csa_6G014840 [Cucumis sativus]
gi|659126236|ref|XP_008463081.1| | 9.5e-165 | 61.78 | PREDICTED: protein SET DOMAIN GROUP 41 isoform X2 [Cucumis melo]
gi|802658905|ref|XP_012080733.1| | 7.8e-127 | 46.92 | PREDICTED: protein SET DOMAIN GROUP 41 isoform X2 [Jatropha curcas]
The following terms have been associated with this gene:
Vocabulary: Molecular Function
Term | Definition
GO:0005515 | protein binding
Vocabulary: INTERPRO
Term | Definition
IPR001214 | SET_dom
GO Assignments
This gene is annotated with the following GO terms.
Category | Term Accession | Term Name
biological_process | GO:0008150 | biological_process
cellular_component | GO:0005575 | cellular_component
molecular_function | GO:0005515 | protein binding

The following mRNA feature(s) are a part of this gene:

Feature Name | Unique Name | Type
Cp4.1LG00g03020.1 | Cp4.1LG00g03020.1 | mRNA


Analysis Name: InterPro Annotations of Cucurbita pepo
Date Performed: 2017-12-02
IPR Term | IPR Description | Source | Source Term | Source Description | Alignment
IPR001214 | SET domain | PFAM | PF00856 | SET | coord: 161..275 score: 2.
None | No IPR available | GENE3D | G3DSA:2.170.270.10 |  | coord: 255..279 score: 1.1E-5; coord: 183..209 score: 1.
None | No IPR available | PANTHER | PTHR12197 | SET AND MYND DOMAIN CONTAINING | coord: 1..213 score: 2.8E-75; coord: 254..421 score: 2.8
None | No IPR available | PANTHER | PTHR12197:SF160 | PROTEIN SET DOMAIN GROUP 41 | coord: 1..213 score: 2.8E-75; coord: 254..421 score: 2.8
None | No IPR available | unknown | SSF82199 | SET domain | coord: 255..279 score: 1.66E-7; coord: 182..209 score: 1.66E-7; coord: 77..213 score: 6.41E-7; coord: 250..276 score: 6.4
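
Each `coord: start..end` entry above marks a matched region on the protein. A minimal sketch, assuming the coordinates are 1-based and inclusive (the page itself does not say so), extracts the ranges and computes how many residues each one spans. `COORD` and `coord_spans` are hypothetical names used only for this illustration.

```python
import re

# Matches entries such as "coord: 161..275" anywhere in a line of the table.
COORD = re.compile(r"coord: (\d+)\.\.(\d+)")


def coord_spans(text):
    """Return (start, end, length) for every coord entry found in text.

    Length assumes 1-based inclusive coordinates: end - start + 1.
    """
    return [
        (int(start), int(end), int(end) - int(start) + 1)
        for start, end in COORD.findall(text)
    ]


print(coord_spans("PF00856 SET coord: 161..275"))  # [(161, 275, 115)]
```

Under that assumption, the PFAM SET domain match at 161..275 covers 115 residues.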

The following gene(s) are orthologous to this gene:

None

The following gene(s) are paralogous to this gene:

None

The following block(s) are covering this gene:
Gene | Organism | Block
Cp4.1LG00g03020 | Cucumber (Chinese Long) v3 | cpecucB0016
Cp4.1LG00g03020 | Wax gourd | cpewgoB0006
Cp4.1LG00g03020 | Cucumber (Gy14) v1 | cgycpeB0730
Cp4.1LG00g03020 | Cucurbita maxima (Rimu) | cmacpeB693
Cp4.1LG00g03020 | Cucurbita moschata (Rifu) | cmocpeB645
Cp4.1LG00g03020 | Wild cucumber (PI 183967) | cpecpiB013
Cp4.1LG00g03020 | Cucumber (Chinese Long) v2 | cpecuB016
Cp4.1LG00g03020 | Melon (DHL92) v3.5.1 | cpemeB001
Cp4.1LG00g03020 | Cucumber (Gy14) v2 | cgybcpeB579
Cp4.1LG00g03020 | Melon (DHL92) v3.6.1 | cpemedB001