Cp4.1LG18g01390 (gene) Cucurbita pepo (Zucchini)

Name: Cp4.1LG18g01390
Type: gene
Organism: Cucurbita pepo (Zucchini)
Description: UPF0061 protein azo1574
Location: Cp4.1LG18 : 2990491 .. 3001774 (+)
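The location field can be read programmatically. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (the usual GFF convention, but not stated on this page); the function name `parse_location` is hypothetical and not part of any database API.

```python
import re

def parse_location(text):
    """Split 'SEQID : START .. END (STRAND)' into its parts.

    Assumption: coordinates are 1-based and inclusive, so the
    feature length is end - start + 1.
    """
    pattern = r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*"
    m = re.fullmatch(pattern, text)
    if m is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    start, end = int(m.group(2)), int(m.group(3))
    return {
        "seqid": m.group(1),
        "start": start,
        "end": end,
        "strand": m.group(4),
        "length": end - start + 1,
    }

loc = parse_location("Cp4.1LG18 : 2990491 .. 3001774 (+)")
print(loc["seqid"], loc["strand"], loc["length"])
```

Under that assumption the gene spans 11,284 bp on the forward strand of Cp4.1LG18.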
The following sequences are available for this feature:

Gene sequence (with intron)

Legend: five_prime_UTR, CDS, three_prime_UTR
AAAAGAAGTCCAACAAAAAACGGAATGTCATTGGTTTCCCATCTCTTCCCCAAGCTCTCGTTGCTTCCCCACATTTCTCTGCTCTGCCATGGCCACCGCCACCGCCACCGCCTCGGCTTAGTTCGCCGCCATTCGACTACACTCATTCACCGCCGTTCTCCTGCCTCTATTTTCTCTCCACCTTCACCACTCGTCGGCCACTCTCGCCACGGCCGCCGGAGAGTTTCCATGGACTCTGCTTCACCGGAGGTTTCGGCGTCTGTCGACTCCGTGGCGGATGGTTTGAAGAATCAAAGCTTGAATAGTGACGATCGTGGGGGTTTAGGTAGTGGAGTCGAGCACGGGGCGAAAAAGAAGCTCGAAGAACTTAACTGGGACAATTCATTTGTGAGAGAGCTGCCTGGCGATCCCCGTACGGATGTCCTTCCACGACAGGTACTATTTATTTGTTTTCAGCTTTTATTTGTTCTTTTTTGAAGTACAATTTTTTGGAAATTAGAATTGGGGTTGGTACTTGTATTTACAAGGAAATGTGGACCGTGAGGAATTATGTTTGTAATCAATTGAAAGCTCGAGTAGGAGCTCTTTTATATGCTATAGTTTAATGGCTTTTCCTTACTGCGAGTTTCAGTTGCTTCTCGGTNCCCACTATAGGCTCGAGAAACAAAGAGAAGATGTGTATGAACGAGATATACAGCAACAAGAAGAAGGAAAAGGATTGAATAGCGGAACTCCCGAACATTTGGCGATCTCAGATGTGTCGATATCGATGGTGACTCATTATTTCGATGAATCATTTCTTCGGACAGAAGAGGATTATGTAAACACTTACTCGAAATCTCACTTATCAGATTCCATTTTGGAAGACACAATTTTTTCTGAAGAATTCGCCATGATATAGCTGATCCATGCATAATATCACGAAAAATGGATACAAATTTTTGGCTGCTACTTATACTTAGTATCGGCAANGTGATAAAAAGGACTTTGAATTTGGTTTTTTCTGTTTGATAAGAGAACTCTCAAGGGTTTTTAGGCTTTTTTTTTTATATATTTTTTTTATGATAAGTTTTAATTAGTATTTTATTGTAAGCATGAGATCCATTCTAGATGCAATTAATTGATTGTGTTAAACTTCCATGCTGTTTTTCTTAATTTTGGGAAAAAGGGTTGTGCTTACCCCCTTCATTTTCATTTCCCAGTTCCTCTTCTGGTTTCGCATGGGTTATGAACCAATAGCTTTGATGGCTAGATTTGGTTTTATGGGTTAGTCTACTATTATGACTGATTAGTTTATCTACGACATGAATCTTAGGTGTTACATGCATGTTATTCAAATGTATTACCTTCAGTTGAAGTAGAAAGTCCTCAGCTTGTTGCTTGGTCAGAATCGGTTGCTGATTTGCTAGATTTGGATCTTCAAGAGTGAGTTTTCTTGTNCATTTTAAAAATTTAAAAGTATTTTTGAAATAAAGTATTGTGGTTTTTTACTCGAACCAAAATTAAATATATATATATATATATATATATATATTTTTTTTTTTTTTNGTTTCTTTTGCTTTCTGATAATTGATAGATGAGGTATTTACACTATGTACTATGAATTATGAAACATGGACATTTAGTTTGAGAAGCATCTTGAAGATCATAGTCTGATGCACTATACACTCTACTGGACACTTGAGATACAATTTAGAGGTCTTTTTTTTTATCATTATTGTTTACTTTTAAATAAAGTGTGAATATATTCCGATAGATTGAGGCAGAGCTTCTTGAAGGTCTTCTTTCTTTTGATGCAAAATAAATATACTTGTTCCTCACAAGAACATTACAAATTCCCGATATAGAAATCACTCGCGCAACTTGGAGGAATGCATACAACCTTCCTAATTTTGAAACAACTAGCAAAAAGAGTAGTGTTCAAACTCCTTATCTAAAGAGGTCGACCGAAAACAGTTATATTTTGCAAGATGAAAAACCCCATCCATCCCCTCAGAAGG
GCCTCTCTAAATCCTGATGTTTCTTTCTATCTATAGATTCCAGAAAACTGCCTTTACAACATAATTCCATAAAACCTTCCTCTTCTTTGTCTTTTGGTTTGCGATGCTCTATTTGAGGGACATTAGACTTGGTGGGCTTTTTTCTCTTCTTTGTGATTAACACCTGTACCAAGGGAGAAAGGAATCTAGGTTTTGGATTCGTGACTATTAGTGGGGCTTCTTTTGCAGTTCTTTTTTCTACATGTTGTCTATCCCTTCCTCGCTCTCAACTGCCTTTATTTTCCCCTATTTGGAAGGTTAAAATCTCTAGGAAGGGAATTTTTTTTGTGGCATGTTTGACATGGGAGATGTTCAAGGACACTCATCCTTTCATGTTGTACATTTGATAGTGCATTCTTTGTAAACGTTAGGTGGAGGATCTCCATCACTTGTCGTGGGGTTGCCAGTTTGGTTAAACTCTCTGGTTTCGGTGGATGAGCTCCTTTAGCTTGTGTTTGGCTTGGAATAGGGATAGTTACACTATGATTGAGGAGTGTTTACGAGTTCATTTTTTGGAGAAGGGAAAGGTCTCGTGGCATGCTAATTTCTTTGCTGTTTGGTGTGGTAATTGACGCAAGAGAAATAATATGATTTTTCATGAAGTTGAGGGGTCCTTTGATGAAGTTTGGGAGCTGGTGAGGTTTAATACTTCTTTGTGGGCATCAATCATTTGACCTTTTTGTAATTATGCTCGTTTTTTGTATGTCCTTCTTATTTTTTCATTTTTTTCCATGAAAGCTTGGTTTTTTAGAGTAAACATTAATTCATAAGATTTTAACTCCTTTGAAGAAAGTATCCACGATAACTGCATAACATTCTCTTTCAACAGGCAATGGAGCACCCCAATTTAAACGAAACAATTGTAGAGATCAATTCCAAAATTTTGGGCTGAAAAAACACCCGAAAAAAATAAAAAGTGATAAAAAAGTCTTGTATAACTCCTGATGCTTAAACAAAGCACATACATAGGACCCAACACCAAACCTAATCATTTTCTTTTGTACTCCTTCCCGTGTGGAGGTTGTGGCATGAAACAAATTGAAGAGTCTTCAAAGGAAAGAGTGAAAACTTTGAGGATTTTGGGACCTAGTCTGTGTTTTTGCCTCTACTTGGTTTTCTTTGTCCAATGATTTTTGCAATTTTAGCTTCGTCATATCAATGCCAATTGGAATTCCTCTTTGTAATCCAATGGCTTCTTAGAGATAGCTCATCCCCTTATTTGTATGAACTCTTTTTGAACAATATACTTCTCAAGAGCTGTTGAGGTCGCTGGAAGATAACCTGGACCGTTATGATTTGTTACTGTACGAGTGTTATAATGTAATTTTCTCTTTGTATTGTCTTAATCTGCTGTAGTCAATTATATGTCTGTGATATCCTCTCCCTTAGAGCTATTTTCTTCTTGCAAGAGCTGAAGGATTTTGCTTTTGTCACTGCAGATTTGAAAGGCCAGATTTCCCCCTCTTGTTCTCTGGCGCATCTCCATTAGTTGGAGTGTAAGTGTGAAATATTATTTGATTGACAGCACCAGCATTCTGATCTTCATTATCCCCTGCCTGCTGATATCGTGTATTACAGGTCACCTTATGCTCAATGTTACGGGGGCCATCAGTTTGGCATGTGGGCTGGGCAGTTGGGTGATGGTCGAGCAATAACCCTTGGAGAGATACTTAATTCCCGATCTGAAAGGTGGGAGTTGCAGTTAAAAGGTGCTGGGAAGACGCCATACAGTCGGTTTGCAGATGGCTTGGCTGTGCTACGAAGTAGCATCAGGGAGTTTCTTTGTAGTGAAGCAATGCATAGTCTTGGAATACCAACAACTCGTGCCCTTTGTCTTCTGACCACAGGAACATTTGTTACCCGAGACATGTTCTATGAGTACGTAGACAAAATAATCTGGTCAAGATGGCAGTTTAAGAAAATTTGTAGTTATAAGATGATCTTACATCTTCTCATGCATT
TGAATAGGATTTTCTAAAAAATTCAAGAATTCTTAATCTCTGAATATGTCGTTTTTTCTGCAAAACCTTTCTAACACAAATAAATTCCAATTGGCATTAATATGAACATGTTAGGATTTTTATTTTATTTTATTTGAAATCATGATTCATTGAGTACTTATATTTAGGTACTCTTTTGGTAGTTTTGAACATGTTAATCAACTAAAAAAATGTGAAGAGATATAAATTGATGCTGGTATGGAAATGCAATAATAAGTTCCGATTGTTTCAATTTGTAGTGGGAATCAAAAAGAAGAGCCTGGTGCAATTGTATGTCGAGTGGCTCAATCCTTTTTGCGTTTTGGATCCTTCCAAATTCATGCCTCTAGAGGGAAGGATGATTATAAAATTGTTCGGGCTTTAGCAGACTATGCGATTCGCCACCATTTTCCGCACTTTGAGAATATGAGCAGCAGTCAGAGCTTATCTTTCAGCACGGGTGATGAAGATAGTTCAGTTGTTGATCTCACTTCAAACAAGTATGCAGGTAATCTTTTCTGTTAAAAATTTGATATGACATAGATATGGTGGCGACATGTGAATTTTTAAAATGTAGGACATAGATACGGTGACTTGTCAATTTCTAAAAAAGTAGAACACAGACACACTAGGACACATTTTTTTCTTTTTAATAGAATATGTGTATTTGTTGATATATTAATACTTTTTTATGGATAAATCTTTTAAATTAACGTAAAAAAATACTTCACTAAAAAAAAACATACATAGAATGATAAGTCGATGAATTATTATTTTTTTTTTTAACAAAAAACAAACTTTTCATTGACAAAATGAGAAAAGACTAATGCTGTTCAAAATGCCACAAAAAGTGGCTTCGGATATAGAAAAATTGTTCAGGAACTATATGTGGAAAGATGACATACACCTTGTTCCATAGAGTATTATAAACCTCCCTGCAGAAAAGGGAGGACTCTGTCTTTTCTCGATAAAGAAGAAGCACAAAGCTCTCCTCGCAAAATGGATATGGAGATATCATCACAAAGAAAATGCCTAGTGGAGAAAACTTATAAAAGCTAAATATATTCCTACATCATGTAAAAATCCACCCCCACCATCTTCTGCAAAGTGGCCGTGGAAGTACATAAAGGAACATTTCATGCCTATTATTAAGCGACGGTTTCAGTCATTCTTCTCTCTCTCAAGACCTTGGTCTGGATTCGTGTTAAAATTTTATTCCTTTATGGCATTTGGCAGCAACCTGGAATTTGTTAGCAACCCAAGGACAGCATAGCCTATGTGATGAATCCAGTTTAACAGCTTACTTCTTTTCTTGTCTGTGAAGTTTTGTAGAGGCCAGCGCATGCGATCTTTTTATTTATTTATTTATTTTTGTTTATTTTTATTCATTTATAAATTTTTTGGGTAAGAAACTGAGCTTTTATGGAGAGAAGGGAAAGAAGTGTATGAAAAAATTCTCCATTTTATGGCATGGTTGATGTAGTTCTCCTTTTAGTTGCATTCAACTTGCTGTTTCTGAACAAAAAGGGTTTCACATTTTATCTTAATGACTCTTTTGTTTTTTATTTCCTTTACATGGTTGATTTATCAGCTTGGGCTGTAGAAGTTGCTGAGCGAACTGCTTCCTTAATAGCAAGTTGGCAGGGAGTTGGGTTCACACATGGTGTACTCAACACTGACAATATGAGCATCTTGGGTCTTACCATTGATTATGGTCCCTTCGGATTTTTGGATGCTTTTGATCCTAGTTATACGCCTAATACAACTGATCTTCCGGGCAGAAGATACTGTTTTGCAAATCAGCCAGATGTAGGCTTATGGAATATAGCCCAATTTGCTTCAACACTTTCAGCTGCTGAATTAATAAATGATAAAGAAGCAAACTATGCCATGGAGAGGTAGTTTTACCTTATCGAACTATGTTTTTTTCTAGAACTGTTGACCTATCATTGCACTCATAATATGACGTTTCCCCTCTCCTC
TTACAGATACGGAGACAAATTTATGGATGACTATCAAACAATCATGACCAAGAAAATTGGTCTACCAAAGTACAATAAACAGTTAATCAGCAAACTTCTCAACAACATGGCTGTTGACAAAGTTGATTATACAAATTTCTTTAGATCACTTGCCAATATCAAAGCTGATCCCAGCACCCCAGAGGAGGAGTTGCTGGTCCCTCTGAAGGCTGTTCTGTTAGATATGGGCAAGGAGCGTAAGGAAGCTTGGGTCAGCTGGGTAAAGACCTACATAGCGGAGGTATAGTGACCTTAAATACCCGATGGTTAAAAAGTAGCAGTCATGAATTAAAACTTTGACCTATTTCTGTTTCTACGTTGGTTCTGATCTGTTTTGCTGTTGTTCCTTTGCATTTTTAAAATCATGTTTCAGTATGTCAGCTTGTTGATTCGGAGTTAAACATTCTGAAGCTTTTGCATCACTATGGACGTCACATTTATAAGTTATATAACTTGCATGTCTATCTGCAATGGAGTTGGCATAGAATTATTACTAATGAGACTAGGAGGCCATTAGATGATTCAAGTAGCTGGAAACTGTTTCTCTCATGTCTTCAATTTCACAGGAGCAAGATGTAGTTGCCTTTTGCAATCTTTGTCATATCTTCCTTGTATTCTCTATTTCTCATTCCCATAAAGAAGGGATTCTAAGAGCTATTTCTTAAAAAAAATTATATACTTGTATCAAGTGCTCATAACTAATGAGGCCATATAGGATTATAGAGTCTGCCAGCAGGAGGAGTTATCATTTAGATTGACTTCGAGAAAGCTTATGCAAATTGGGATTTTTTGGATAAGGTCTTTGAAAAAGGGCTTGGTGTTAAGTGGAGATCCTGGATCTGGAGTGGCATTTGAATGGTGAAGTTCTCCATCTTTATCAACGATAACCTGAGAGGCAGGATTTAAGCTTCCAGATATCAAACAAGGTGATCCGCTCTCTTCTTATCTTTTCCTCTTGGTAGTGGATGTCCTAAGTAGATTGATCTAAAAAAGGATGGAGAGCAACATTTGGAGGTTTTTTCAGGTTGAGAAGCATAGTCGACAACTCGATTTTTTCTGTTCTGGAAAAGAGGAGTCATTTGGTAATTGAATCAAGTTTTATCGTTTTTCGAACCCATTTTTTGGGCCTGAGGACTAATAGAGGTAAATGTTAAATTTTGGGGATTTACTGCAACTCATTTGAGCTTAGTAGATGGGCATTTGGGTGTGTGTGAGGTGGATGCTTTGCCTTCGGCCTATCTTGGTTTTCCCCTTAGTCATAACCATATGAGCATTTCATTTTGGGACCTGATCATCAATAAGACTCGCTTGCCTCATGAAAAAAGAGCTTCTTTTTAGGAGGGGAAAGATTCACTTTCATTTAGTCGATGTTGAGTGGTATACCTACCTACTTCTTGTCTCTGTTGAGAGTCCCTAGCTCGATGAGTAAGTCGCTTGAGAATTTGAGGAGAAATTTCTTACGGGAGGGCATTAACGAGGGCAATGGGTTGAACCTCATGCACTTGGAGGTGGTTTCTAGGCTGTTGGACCCAGGGGGTTAGGGCGTTAGGCATTGGTAACCCGAGGGAGAGAAGCACAACCCCTTTTGGCTAAATGGTTCTGATGATTCTATCATAAATCCAACACTTTATATCACAAGGTTATTGTTACAAATACGGGTCTCATCCCTTTGATTGAACCAATGGTGGGGTTAAAGGAAGCTCCAAAATCATTGGAAATTTCGGCGAAGTTCCCTTAGCTTTCCTTGTTCATTCTGAATGTGGTGGGGATGAGAGGAATACCTATTTTTGGGAGGATAAGTGGCTGGGGGATAACACCTATATGCCTTATTTCCTCATTTATATCCGCTGTCTTCTTTCAGAAATCATTTCGTGGTGTCAATCCTTTCTTTGTCAGATTCTCTATCTTTTCCATCTTCTTTGGGTTTCTATTTTCCATTGACCAATTGGGAAATGACAGAT
ATCTTGGCTATTTTATCTTTGTTTTGTGAGTTTTGATTTACGTCCGGTAGGGGGACATTTACCTTTGGTTCGTAGTCCTTCCAAAGGCTTTTCTTGCTGCCTTTTCTTCGGTTGCCTGACCCCTTCCTATACTTTCTCCTTGGTGTGGAGGGTAAAAATATCAAAGTATGTGAAATTCTTTTGGCATGTTTTACATGGACGGGCCAACACCTTGGATTTGATCTCAGCCTGAGGTTTCTCCTTGGTTGGGTCTCACTGTTGTACTCTGCATGCAAGTGGATGAGGACCTTGATCATATCCTTTGAAGTTGTGTTTTTGCTTGGGCTGTTTGGTTTGTTTTCTTTGAGGCTTTAGCTATAGCTTTGTTAGCTCTCAAAGTTACAAGGATTCGATTGAGGAGCTATTAAAAGGCTATTTTTGGGGGCAAGCTGGAGTGTGCTATTTTGTGAGGTATATGGGGGCGAGAAACAATAGAATCTTCATGGAAGGAGATTTCCAACATGATGTTTGGTCTTTTGCTAGACTCTCTGTTTCTATCTTGGTCTATGTATCTAGGCCTTTTTGTAATTTCTCATTATGTCTTATTTTACTTTATTGAAGCCACTTCTTGAAGTTGACTCCCCTTTTGTGGGTTTATTTTATTTTTTTGTAAGCTTGTGTATTCTTTCATTTCTTCCCAATGGAAATCTGGTATTTCATGCCTCCAATAAAAGTATATATATATGTACTTGTTTCATATAAGAAAATTCCGTCTCCTATACTCTACCCAACTCAACGCAGTGAGTCTTTTTCAACTTCAATTTTTAATTTTATCAAACTTCAACTTTGAAAGGGTGGATAAATTTTTACTTTTTCTTATGACTAAATTTGAAATTGGCGACAAACCTTCCAATAAATCACACAATTTAATGGGGAGAAGGGTTTCTTGTTGAAGCTGGATCTGGAAAAAGCCTATGATAAGGTAGATTGGTCTTTTTAGACTCAGTGTTGGAGTGGAAAGGGTTTGGCACAAGGTGGAGAAGGTGGACAAGGGGTTGTTTATCTTGAGCCAATTTCTGTTTGGATTAACGGAAGACCTAGAGGGAAATTTTTTGTGAAGAGAGGGCTGAGACAAGGAGATTCCCTCTCTATTTCTTTTCACGTTAGCCAGAGATGCCCTTGGTAGAATTATCCACTTTTGTACTGAACGAAAGGAGATTGTTGAAGGGCTTTGTGGTTGGAAAATAAGGGGTTTTTGTTACTCACCTTCAGTATGCGGATGACACCTTAATTTTGGTGATGGGGAAGTTGAGAATTTAGAGGCTTTGTGGTTTCTCCGCAGATCTATTTGCAGGCCTCGGGATTAATCCTAAACTTGGGAAAAATAGCCATTGTGGGCATTAATGCCAGTCTTGAAGAGGTTGTTGATAAGGCCATAGATTGGGTGTGCGGTTCAAGAGCTCCCAATTTCTTATTTGGGATTCCCTTTAGGTGGAAACTCTCACACTATTGGTTTTTGGGAACCTTTGCTTGATAGATTTCAAATCAAGACAAGTGGAGGAGTCTTGCCATGTCTAAAGCTGGGAGGTTGACGTTGACGCAGCCAATGTTAAATAGCCTTCCTACATACTATCTCTGAAGGGCCCCCAAATGAAAATCTCTTGGTTTTTTTTTTTCTTCTCTCCTAAAATAAATATCTGGAATTCAATTTTCTTTGACCTTTGTCTACTATATATTTTTACTTCTGTTCAGGTGGTGTCGGTTTTCCTGGTTGTAGCTTGCTTAGTGCTTGGTACTTTGTTATACCTTCCTAGAAATTGATTGACCGTAGTAGGAGAGAATTGGAAGTCCATACCTTTCAATGGAATGACGTTGTGCAGGCATTAGCTTCAACTTAACTCATATCAATAATTAAATAATTCAATTTGTAATGCTAATTTTCTCATTGGCAAGGGCTTGGGGACTCTTGGTCATACCGGCTTAGAGGTCATAAGTTCAAACCTTGGGTATGCTTAATACCA
AAAATACTTCATATCTCCTGGTGGGGCCTTGGTGTGGCACTGGTGCCATCGGGTATGGTGGAGCTTTCTCGTGGTTATTATAAAAAAAGAAAGTAAAATAGACCTACCAATTCATAATTTCATTTTTTTTCCCTTCATTCTGTTTTCAGTATATTGAGTTAAATATTATATCTCTATAACAATCTGGAGTCCCCGCAAGCATCGAGGAAAGCCTGTTTAGGCATATACATTCTTTTCCAGTTTTCAATTTGGATGGGGAGAATGAAAATTATTAAACTGATATGATTATCATTTGCTAAAACTCATCCTCTTTATCTAATTTGAGGCAGTGTCTGCCATCCAATCATATAACCTCCAGGTTCTAACCAAACAACTTAATGCTCATCTCACGGTCCAGTTTGCGTTGGGCAGAAATCATGCCCAGATATTGGACTTGGTTCAAGTTTCAGAAAACTCTATAGTGCTACTAGATGTGTTTTCTGTTAGTAGCTGAATTTTCATAAGACACTACATTTAAGGAACGCTGGCCTGCTTTAAATATAATGATAATTATATTTTCCTTTATCTTATCCAGTATTCTATTGCTTTTGCAGCTTGCTGCAAGTGGCATATCAGATGAGGAGCGCAAGGCCTCCATGGATGCAATAAATCCTAAATATATTCTCAGGAACTACCTATGCCAGACTGCCATTGATGCAGCTGAGCAGGGTGATTTTGGAGAGGTTCGTCGACTGCTGAAGATAATGGAACGACCATTTGATGAGCAGCCAGAAATGGAAAAATATGCACGGTTGCCCCCAGCTTGGGCTTATCGGCCGGGCGTTTGTATGCTTTCTTGTTCCTCATGAGTTGATAGCTTGCTTTACGTTACACGAAAAGGGGGGAAATAATTTATTGATAGGATATTTGCTGTGTATATAATGTTGTCAAGTCAGTGTTGTAATCTCAAACAACAATAATGTTCAGTTATTGTTTTTCTCTATAATATGACCTGAAATTTAGACAGTTCTCATTTCCACTTCTACGTTCATATACATGCTTTTTTGTACTTGCATTACTGAGAGAAACGTTCTGTAATGTCCATACATATAGTCTTGACAGACAAGATTCAATTGATCCTTTCTATTTATTTGTATTCAGACATGCCTAGGATTTGTTCGCAAATTGCTGATTTTTATTGTTTAGAACAGAAATCATTGAAAACATTCAAGGATCTGCATAAAAAGCATTCTAGTGATGACTTCTCTGAATCTGTTCTTTGAGAATTATGACTTTTGTTTATTT

mRNA sequence

AAAAGAAGTCCAACAAAAAACGGAATGTCATTGGTTTCCCATCTCTTCCCCAAGCTCTCGTTGCTTCCCCACATTTCTCTGCTCTGCCATGGCCACCGCCACCGCCACCGCCTCGGCTTAGTTCGCCGCCATTCGACTACACTCATTCACCGCCGTTCTCCTGCCTCTATTTTCTCTCCACCTTCACCACTCGTCGGCCACTCTCGCCACGGCCGCCGGAGAGTTTCCATGGACTCTGCTTCACCGGAGGTTTCGGCGTCTGTCGACTCCGTGGCGGATGGTTTGAAGAATCAAAGCTTGAATAGTGACGATCGTGGGGGTTTAGGTAGTGGAGTCGAGCACGGGGCGAAAAAGAAGCTCGAAGAACTTAACTGGGACAATTCATTTGTGAGAGAGCTGCCTGGCGATCCCCGTACGGATGTCCTTCCACGACAGGTGTTACATGCATGTTATTCAAATGTATTACCTTCAGTTGAAGTAGAAAGTCCTCAGCTTGTTGCTTGGTCAGAATCGGTTGCTGATTTGCTAGATTTGGATCTTCAAGAATTTGAAAGGCCAGATTTCCCCCTCTTGTTCTCTGGCGCATCTCCATTAGTTGGAGTGTCACCTTATGCTCAATGTTACGGGGGCCATCAGTTTGGCATGTGGGCTGGGCAGTTGGGTGATGGTCGAGCAATAACCCTTGGAGAGATACTTAATTCCCGATCTGAAAGGTGGGAGTTGCAGTTAAAAGGTGCTGGGAAGACGCCATACAGTCGGTTTGCAGATGGCTTGGCTGTGCTACGAAGTAGCATCAGGGAGTTTCTTTGTAGTGAAGCAATGCATAGTCTTGGAATACCAACAACTCGTGCCCTTTGTCTTCTGACCACAGGAACATTTGTTACCCGAGACATGTTCTATGATGGGAATCAAAAAGAAGAGCCTGGTGCAATTGTATGTCGAGTGGCTCAATCCTTTTTGCGTTTTGGATCCTTCCAAATTCATGCCTCTAGAGGGAAGGATGATTATAAAATTGTTCGGGCTTTAGCAGACTATGCGATTCGCCACCATTTTCCGCACTTTGAGAATATGAGCAGCAGTCAGAGCTTATCTTTCAGCACGGGTGATGAAGATAGTTCAGTTGTTGATCTCACTTCAAACAAGTATGCAGCTTGGGCTGTAGAAGTTGCTGAGCGAACTGCTTCCTTAATAGCAAGTTGGCAGGGAGTTGGGTTCACACATGGTGTACTCAACACTGACAATATGAGCATCTTGGGTCTTACCATTGATTATGGTCCCTTCGGATTTTTGGATGCTTTTGATCCTAGTTATACGCCTAATACAACTGATCTTCCGGGCAGAAGATACTGTTTTGCAAATCAGCCAGATGTAGGCTTATGGAATATAGCCCAATTTGCTTCAACACTTTCAGCTGCTGAATTAATAAATGATAAAGAAGCAAACTATGCCATGGAGAGATACGGAGACAAATTTATGGATGACTATCAAACAATCATGACCAAGAAAATTGGTCTACCAAAGTACAATAAACAGTTAATCAGCAAACTTCTCAACAACATGGCTGTTGACAAAGTTGATTATACAAATTTCTTTAGATCACTTGCCAATATCAAAGCTGATCCCAGCACCCCAGAGGAGGAGTTGCTGGTCCCTCTGAAGGCTGTTCTGTTAGATATGGGCAAGGAGCGTAAGGAAGCTTGGGTCAGCTGGGTAAAGACCTACATAGCGGAGCTTGCTGCAAGTGGCATATCAGATGAGGAGCGCAAGGCCTCCATGGATGCAATAAATCCTAAATATATTCTCAGGAACTACCTATGCCAGACTGCCATTGATGCAGCTGAGCAGGGTGATTTTGGAGAGGTTCGTCGACTGCTGAAGATAATGGAACGACCATTTGATGAGCAGCCAGAAATGGAAAAATATGCACGGTTGCCCCCAGCTTGGGCTTATCGGCCGGGCGTTTGTATGCTTTCTTGTTCCTCATGAGTTGATAGCTTGCT
TTACGTTACACGAAAAGGGGGGAAATAATTTATTGATAGGATATTTGCTGTGTATATAATGTTGTCAAGTCAGTGTTGTAATCTCAAACAACAATAATGTTCAGTTATTGTTTTTCTCTATAATATGACCTGAAATTTAGACAGTTCTCATTTCCACTTCTACGTTCATATACATGCTTTTTTGTACTTGCATTACTGAGAGAAACGTTCTGTAATGTCCATACATATAGTCTTGACAGACAAGATTCAATTGATCCTTTCTATTTATTTGTATTCAGACATGCCTAGGATTTGTTCGCAAATTGCTGATTTTTATTGTTTAGAACAGAAATCATTGAAAACATTCAAGGATCTGCATAAAAAGCATTCTAGTGATGACTTCTCTGAATCTGTTCTTTGAGAATTATGACTTTTGTTTATTT

Coding sequence (CDS)

ATGTCATTGGTTTCCCATCTCTTCCCCAAGCTCTCGTTGCTTCCCCACATTTCTCTGCTCTGCCATGGCCACCGCCACCGCCACCGCCTCGGCTTAGTTCGCCGCCATTCGACTACACTCATTCACCGCCGTTCTCCTGCCTCTATTTTCTCTCCACCTTCACCACTCGTCGGCCACTCTCGCCACGGCCGCCGGAGAGTTTCCATGGACTCTGCTTCACCGGAGGTTTCGGCGTCTGTCGACTCCGTGGCGGATGGTTTGAAGAATCAAAGCTTGAATAGTGACGATCGTGGGGGTTTAGGTAGTGGAGTCGAGCACGGGGCGAAAAAGAAGCTCGAAGAACTTAACTGGGACAATTCATTTGTGAGAGAGCTGCCTGGCGATCCCCGTACGGATGTCCTTCCACGACAGGTGTTACATGCATGTTATTCAAATGTATTACCTTCAGTTGAAGTAGAAAGTCCTCAGCTTGTTGCTTGGTCAGAATCGGTTGCTGATTTGCTAGATTTGGATCTTCAAGAATTTGAAAGGCCAGATTTCCCCCTCTTGTTCTCTGGCGCATCTCCATTAGTTGGAGTGTCACCTTATGCTCAATGTTACGGGGGCCATCAGTTTGGCATGTGGGCTGGGCAGTTGGGTGATGGTCGAGCAATAACCCTTGGAGAGATACTTAATTCCCGATCTGAAAGGTGGGAGTTGCAGTTAAAAGGTGCTGGGAAGACGCCATACAGTCGGTTTGCAGATGGCTTGGCTGTGCTACGAAGTAGCATCAGGGAGTTTCTTTGTAGTGAAGCAATGCATAGTCTTGGAATACCAACAACTCGTGCCCTTTGTCTTCTGACCACAGGAACATTTGTTACCCGAGACATGTTCTATGATGGGAATCAAAAAGAAGAGCCTGGTGCAATTGTATGTCGAGTGGCTCAATCCTTTTTGCGTTTTGGATCCTTCCAAATTCATGCCTCTAGAGGGAAGGATGATTATAAAATTGTTCGGGCTTTAGCAGACTATGCGATTCGCCACCATTTTCCGCACTTTGAGAATATGAGCAGCAGTCAGAGCTTATCTTTCAGCACGGGTGATGAAGATAGTTCAGTTGTTGATCTCACTTCAAACAAGTATGCAGCTTGGGCTGTAGAAGTTGCTGAGCGAACTGCTTCCTTAATAGCAAGTTGGCAGGGAGTTGGGTTCACACATGGTGTACTCAACACTGACAATATGAGCATCTTGGGTCTTACCATTGATTATGGTCCCTTCGGATTTTTGGATGCTTTTGATCCTAGTTATACGCCTAATACAACTGATCTTCCGGGCAGAAGATACTGTTTTGCAAATCAGCCAGATGTAGGCTTATGGAATATAGCCCAATTTGCTTCAACACTTTCAGCTGCTGAATTAATAAATGATAAAGAAGCAAACTATGCCATGGAGAGATACGGAGACAAATTTATGGATGACTATCAAACAATCATGACCAAGAAAATTGGTCTACCAAAGTACAATAAACAGTTAATCAGCAAACTTCTCAACAACATGGCTGTTGACAAAGTTGATTATACAAATTTCTTTAGATCACTTGCCAATATCAAAGCTGATCCCAGCACCCCAGAGGAGGAGTTGCTGGTCCCTCTGAAGGCTGTTCTGTTAGATATGGGCAAGGAGCGTAAGGAAGCTTGGGTCAGCTGGGTAAAGACCTACATAGCGGAGCTTGCTGCAAGTGGCATATCAGATGAGGAGCGCAAGGCCTCCATGGATGCAATAAATCCTAAATATATTCTCAGGAACTACCTATGCCAGACTGCCATTGATGCAGCTGAGCAGGGTGATTTTGGAGAGGTTCGTCGACTGCTGAAGATAATGGAACGACCATTTGATGAGCAGCCAGAAATGGAAAAATATGCACGGTTGCCCCCAGCTTGGGCTTATCGGCCGGGCGTTTGTATGCTTTCTTGTTCCTCATGA

Protein sequence

MSLVSHLFPKLSLLPHISLLCHGHRHRHRLGLVRRHSTTLIHRRSPASIFSPPSPLVGHSRHGRRRVSMDSASPEVSASVDSVADGLKNQSLNSDDRGGLGSGVEHGAKKKLEELNWDNSFVRELPGDPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLDLQEFERPDFPLLFSGASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGKTPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGNQKEEPGAIVCRVAQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHFENMSSSQSLSFSTGDEDSSVVDLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPDVGLWNIAQFASTLSAAELINDKEANYAMERYGDKFMDDYQTIMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSLANIKADPSTPEEELLVPLKAVLLDMGKERKEAWVSWVKTYIAELAASGISDEERKASMDAINPKYILRNYLCQTAIDAAEQGDFGEVRRLLKIMERPFDEQPEMEKYARLPPAWAYRPGVCMLSCSS
BLAST of Cp4.1LG18g01390 vs. Swiss-Prot
Match: Y1574_AZOSB (UPF0061 protein azo1574 OS=Azoarcus sp. (strain BH72) GN=azo1574 PE=3 SV=1)

HSP 1 Score: 500.0 bits (1286), Expect = 4.0e-140
Identity = 277/553 (50.09%), Positives = 341/553 (61.66%), Query Frame = 1

Query: 112 LEELNWDNSFVRELPGDPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLD 171
           +  L +DN FVRELP DP T    RQV  A YS V P+  V +P LVA S  VA LL  D
Sbjct: 1   MRPLVFDNRFVRELPADPETGPHTRQVAGASYSRVNPT-PVAAPHLVAHSAEVAALLGWD 60

Query: 172 LQEFERPDFPLLFSGASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERW 231
             +   P+F  +F G   L G+ PYA CYGGHQFG WAGQLGDGRAITLGE+LN +  RW
Sbjct: 61  ESDIASPEFAEVFGGNRLLDGMEPYAACYGGHQFGNWAGQLGDGRAITLGEVLNGQGGRW 120

Query: 232 ELQLKGAGKTPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMF 291
           ELQLKGAG TPYSR ADG AVLRSSIREFLCSEAMH LG+PTTRAL L+ TG  V RDMF
Sbjct: 121 ELQLKGAGPTPYSRRADGRAVLRSSIREFLCSEAMHHLGVPTTRALSLVGTGEKVVRDMF 180

Query: 292 YDGNQKEEPGAIVCRVAQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHFENMSS 351
           YDGN + EPGAIVCRVA SF+RFG+F++ A+RG  D  ++  L D+ I   FP  E  + 
Sbjct: 181 YDGNPQAEPGAIVCRVAPSFIRFGNFELLAARG--DLDLLNRLIDFTIARDFPGIEGSAR 240

Query: 352 SQSLSFSTGDEDSSVVDLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDNMSILG 411
                               +K A W   V  RTA+++A W  VGF HGV+NTDNMSILG
Sbjct: 241 --------------------DKRARWFETVCARTATMVAHWMRVGFVHGVMNTDNMSILG 300

Query: 412 LTIDYGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPDVGLWNIAQFASTLSAAELINDKE 471
           LTIDYGP+G++D FDP +TPNTTD  GRRY F +QP +  WN+ Q A+ L  A      E
Sbjct: 301 LTIDYGPYGWVDNFDPGWTPNTTDAGGRRYRFGHQPRIANWNLLQLANALFPA--FGSTE 360

Query: 472 ANYA-MERYGDKFMDDYQTIMTKKIGLPKY---NKQLISKLLNNMAVDKVDYTNFFRSLA 531
           A  A +  Y + +  + + +   K+GL      +  ++  L   M   +VD T FFR+LA
Sbjct: 361 ALQAGLNTYAEVYDRESRAMTAAKLGLAALADADLPMVDALHGWMKRAEVDMTLFFRALA 420

Query: 532 NIKADPSTPEEELLVPLKAVLLD------MGKERKEAWVSWVKTYIAELAASGISDEERK 591
                    E +LL P  A+ LD         E  E +  W++ Y       G+  ++R+
Sbjct: 421 ---------EVDLLKPDPALFLDAFYDDAKRLETAEEFSGWLRLYADRCRQEGLDADQRR 480

Query: 592 ASMDAINPKYILRNYLCQTAIDAAEQGDFGEVRRLLKIMERPFDEQPEMEKYARLPPAWA 651
           A M+A NP+Y++RNYL Q AIDAAEQGD+G VR LL +M RP+DEQPE   YA+  P WA
Sbjct: 481 ARMNAANPRYVMRNYLAQQAIDAAEQGDYGPVRSLLDVMRRPYDEQPERAAYAQRRPDWA 519

Query: 652 -YRPGVCMLSCSS 654
             R G  MLSCSS
Sbjct: 541 RERAGCSMLSCSS 519
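The percentages in an HSP summary line follow directly from the match counts and the alignment length. The sketch below recomputes them for the first Swiss-Prot hit; the function name `hsp_fractions` is hypothetical, and the regular expression simply picks up every `label = matches/length` pair on the line.

```python
import re

def hsp_fractions(summary_line):
    """Return {label: (matches, length, percent)} for each
    'label = matches/length' pair in a BLAST HSP summary line."""
    result = {}
    for label, matches, length in re.findall(r"(\w+) = (\d+)/(\d+)", summary_line):
        m, n = int(matches), int(length)
        result[label] = (m, n, round(100.0 * m / n, 2))
    return result

line = "Identity = 277/553 (50.09%), Positives = 341/553 (61.66%), Query Frame = 1"
stats = hsp_fractions(line)
for label, (m, n, pct) in stats.items():
    print(f"{label}: {m}/{n} = {pct}%")
```

The recomputed values agree with the reported 50.09% and 61.66%: identity counts only exact residue matches, while the second figure also counts conservative substitutions (the `+` columns in the alignment midline).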

BLAST of Cp4.1LG18g01390 vs. Swiss-Prot
Match: Y3800_AROAE (UPF0061 protein AZOSEA38000 OS=Aromatoleum aromaticum (strain EbN1) GN=AZOSEA38000 PE=3 SV=1)

HSP 1 Score: 492.3 bits (1266), Expect = 8.3e-138
Identity = 265/556 (47.66%), Positives = 342/556 (61.51%), Query Frame = 1

Query: 112 LEELNWDNSFVRELPGDPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLD 171
           ++ L  DN FV ELPGDP      RQV  ACYS V+P+  V +P L+AWS  VA LL  D
Sbjct: 1   MKNLVLDNRFVHELPGDPNPSPDVRQVHGACYSRVMPT-PVSAPHLIAWSPEVAALLGFD 60

Query: 172 LQEFERPDFPLLFSGASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSE-- 231
             +   P+F  +F+G + + G+ PYA CYGGHQFG WAGQLGDGRAITLGE + +R +  
Sbjct: 61  ESDVRSPEFAAVFAGNALMPGMEPYAACYGGHQFGNWAGQLGDGRAITLGEAVTTRGDGH 120

Query: 232 --RWELQLKGAGKTPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVT 291
             RWELQLKGAG TPYSR ADG AVLRSSIREFLCSEAMH LG+PTTRALCL+ TG  V 
Sbjct: 121 TGRWELQLKGAGPTPYSRHADGRAVLRSSIREFLCSEAMHHLGVPTTRALCLVGTGEKVV 180

Query: 292 RDMFYDGNQKEEPGAIVCRVAQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHFE 351
           RDMFYDG  K EPGA+VCRVA SF+RFG+F+I  SRG  D  ++  L D+ I   FP   
Sbjct: 181 RDMFYDGRPKAEPGAVVCRVAPSFIRFGNFEIFTSRG--DEALLTRLVDFTIARDFPEL- 240

Query: 352 NMSSSQSLSFSTGDEDSSVVDLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDNM 411
                       G E ++       + A W  +V ERTA +IA W  VGF HGV+NTDNM
Sbjct: 241 ------------GGEPAT-------RRAEWFCKVCERTARMIAQWMRVGFVHGVMNTDNM 300

Query: 412 SILGLTIDYGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPDVGLWNIAQFASTL----SA 471
           SILGLTIDYGP+G++D FDP +TPNTTD  G+RY F NQP +  WN+ Q A+ L     A
Sbjct: 301 SILGLTIDYGPYGWIDNFDPGWTPNTTDAGGKRYRFGNQPHIAHWNLLQLANALYPVFGA 360

Query: 472 AELINDKEANYAMERYGDKFMDDYQTIMTKKIGLPKYNKQ---LISKLLNNMAVDKVDYT 531
           AE +++      ++ Y   F ++ + ++  K+G   +  +   L+  L   +   +VD T
Sbjct: 361 AEPLHE-----GLDLYARVFDEENRRMLAAKLGFEAFGDEDATLVETLHALLTRAEVDMT 420

Query: 532 NFFRSLANIKADPSTPEEELLVPLKAVLLDMGKER--KEAWVSWVKTYIAELAASGISDE 591
            FFR LA++  +  + +     PL+       K    +    SW+  Y           +
Sbjct: 421 IFFRGLASLDLEAPSID-----PLRDAFYSAEKAAVAEPEMNSWLAAYTKRTKQERTPGD 480

Query: 592 ERKASMDAINPKYILRNYLCQTAIDAAEQGDFGEVRRLLKIMERPFDEQPEMEKYARLPP 651
           +R+  M+A+NP+++LRNYL Q AIDAAEQG++  V  LL +M  P+DEQP  E++A   P
Sbjct: 481 QRRVRMNAVNPRFVLRNYLAQEAIDAAEQGEYALVSELLDVMRHPYDEQPGRERFAARRP 523

Query: 652 AWA-YRPGVCMLSCSS 654
            WA  R G  MLSCSS
Sbjct: 541 DWARNRAGCSMLSCSS 523

BLAST of Cp4.1LG18g01390 vs. Swiss-Prot
Match: Y1788_METFK (UPF0061 protein Mfla_1788 OS=Methylobacillus flagellatus (strain KT / ATCC 51484 / DSM 6875) GN=Mfla_1788 PE=3 SV=1)

HSP 1 Score: 486.1 bits (1250), Expect = 6.0e-136
Identity = 255/546 (46.70%), Positives = 338/546 (61.90%), Query Frame = 1

Query: 115 LNWDNSFVRELPGDPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLDLQE 174
           L +DN F+RELPGDP T    RQV  AC+S V+P+  V SP+L+A+S  + + L+L  +E
Sbjct: 2   LTFDNRFLRELPGDPETSNQLRQVYGACWSRVMPT-SVSSPKLLAYSHEMLEALELSEEE 61

Query: 175 FERPDFPLLFSGASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQ 234
              P +    +G   + G+ PYA CYGGHQFG WAGQLGDGRAI+LGE++N + +RWELQ
Sbjct: 62  IRSPAWVDALAGNGLMPGMEPYAACYGGHQFGHWAGQLGDGRAISLGEVVNRQGQRWELQ 121

Query: 235 LKGAGKTPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDG 294
           LKGAG TPYSR ADG AVLRSS+REFLCSEAMH LGIPTTRAL L+ TG  V RDMFYDG
Sbjct: 122 LKGAGVTPYSRMADGRAVLRSSVREFLCSEAMHHLGIPTTRALSLVQTGDVVIRDMFYDG 181

Query: 295 NQKEEPGAIVCRVAQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHFENMSSSQS 354
           + + E GAIVCRV+ SF+RFG+F+I A R  DD + ++ L D+ I   FP   N    + 
Sbjct: 182 HPQAEKGAIVCRVSPSFIRFGNFEIFAMR--DDKQTLQKLVDFTIDRDFPELRNYPEEER 241

Query: 355 LSFSTGDEDSSVVDLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTI 414
           L                   A W   +  RTA LIA W  VGF HGV+NTDNMSILGLTI
Sbjct: 242 L-------------------AEWFAIICVRTARLIAQWMRVGFVHGVMNTDNMSILGLTI 301

Query: 415 DYGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPDVGLWN---IAQFASTLSAAELINDKE 474
           DYGP+G++D FDP +TPNTTD  GRRYCF  QPD+  WN   +AQ   TL     I D+ 
Sbjct: 302 DYGPYGWVDNFDPGWTPNTTDAAGRRYCFGRQPDIARWNLERLAQALYTLKPEREIYDE- 361

Query: 475 ANYAMERYGDKFMDDYQTIMTKKIGLPKYNKQ---LISKLLNNMAVDKVDYTNFFRSLAN 534
               +  Y   + +++  ++  K G   +  +   L++++   M   ++D T FFR LA 
Sbjct: 362 ---GLMLYDQAYNNEWGAVLAAKFGFSAWRDEYEPLLNEVFGLMTQAEIDMTEFFRKLAL 421

Query: 535 IKADPSTPEEELLVPLKAVLLDMGKERKEAWVSWVKTYIAELAASGISDEERKASMDAIN 594
           +  D + P+  +L    A    + +  K  +  W+  Y     A G    ER+ +M+ +N
Sbjct: 422 V--DAAQPDLGIL-QSAAYSPALWETFKPRFSDWLGQYAQATLADGRDPAERREAMNRVN 481

Query: 595 PKYILRNYLCQTAIDAAEQGDFGEVRRLLKIMERPFDEQPEMEKYARLPPAWA-YRPGVC 654
           P+Y+LRNYL Q AID A+ GD   +  L+ ++ +P+DEQP  E++A L P WA ++ G  
Sbjct: 482 PRYVLRNYLAQQAIDLADTGDTSMIEALMDVLRKPYDEQPGKERFAALRPDWARHKAGCS 518

BLAST of Cp4.1LG18g01390 vs. Swiss-Prot
Match: Y683_TOLAT (UPF0061 protein Tola_0683 OS=Tolumonas auensis (strain DSM 9187 / TA4) GN=Tola_0683 PE=3 SV=1)

HSP 1 Score: 474.2 bits (1219), Expect = 2.4e-132
Identity = 256/543 (47.15%), Positives = 340/543 (62.62%), Query Frame = 1

Query: 115 LNWDNSFVRELPGDPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLDLQE 174
           L++DN F+RELPGDP T   PRQV  A +S V P+  V  PQL+A S  VA LL + L E
Sbjct: 4   LHFDNRFIRELPGDPLTLNQPRQVHAAFWSAVTPA-PVPQPQLIASSAEVAALLGISLAE 63

Query: 175 FERPDFPLLFSGASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQ 234
            ++P +    SG   L G+SP+A CYGGHQFG WAGQLGDGRAI+LGE++++   RWELQ
Sbjct: 64  LQQPAWVAALSGNGLLDGMSPFATCYGGHQFGNWAGQLGDGRAISLGELIHN-DLRWELQ 123

Query: 235 LKGAGKTPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDG 294
           LKGAG TPYSR  DG AVLRSSIREFLCSEAM  LG+PTTRAL L+ TG  + RDMFYDG
Sbjct: 124 LKGAGVTPYSRRGDGKAVLRSSIREFLCSEAMFHLGVPTTRALSLVLTGEQIWRDMFYDG 183

Query: 295 NQKEEPGAIVCRVAQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHFENMSSSQS 354
           N ++EPGAIVCRVA SF+RFG FQ+ A RG+ D  ++  L D+ I   FPH     S+Q 
Sbjct: 184 NPQQEPGAIVCRVAPSFIRFGHFQLPAMRGESD--LLNQLIDFTIDRDFPHL----SAQP 243

Query: 355 LSFSTGDEDSSVVDLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTI 414
            +   G                W  EV   TA L+  W  VGF HGV+NTDNMSILGLTI
Sbjct: 244 ATVRRG---------------VWFSEVCITTAKLMVEWTRVGFVHGVMNTDNMSILGLTI 303

Query: 415 DYGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPDVGLWNIAQFASTLSAAELINDKEANY 474
           DYGP+G++D FD ++TPNTTD  G RYCF  QP +  WN+ + A  L    + +      
Sbjct: 304 DYGPYGWVDNFDLNWTPNTTDAEGLRYCFGRQPAIARWNLERLAEALGTV-MTDHAILAQ 363

Query: 475 AMERYGDKFMDDYQTIMTKKIGLPKY---NKQLISKLLNNMAVDKVDYTNFFRSLANIKA 534
            +E + + F  +   ++  K+G  ++   + +L+++L + +   +VD T FFR LA +  
Sbjct: 364 GIEMFDETFAQEMAAMLAAKLGWQQWLPEDSELVNRLFDLLQQAEVDMTLFFRRLALV-- 423

Query: 535 DPSTPEEELLVPLKAVLLDMGKERKEAWVSWVKTYIAELAASGISDEERKASMDAINPKY 594
           D S P+  +L        D+  + + A+  W+  Y   + + G+   ER A M+ +NP Y
Sbjct: 424 DVSAPDLTVLADA-FYRDDLFCQHQPAFTQWLTNYSQRVLSEGVLPAERAARMNQVNPVY 483

Query: 595 ILRNYLCQTAIDAAEQGDFGEVRRLLKIMERPFDEQPEMEKYARLPPAWA-YRPGVCMLS 654
           +LRNYL Q  IDAAEQG++  +  LL+++ +P+ EQ   E YA+  P WA ++PG  MLS
Sbjct: 484 VLRNYLAQQVIDAAEQGNYQPIAELLEVLRQPYTEQSGKEAYAQKRPDWARHKPGCSMLS 519

BLAST of Cp4.1LG18g01390 vs. Swiss-Prot
Match: Y1510_NITMU (UPF0061 protein Nmul_A1510 OS=Nitrosospira multiformis (strain ATCC 25196 / NCIMB 11849) GN=Nmul_A1510 PE=3 SV=1)

HSP 1 Score: 470.7 bits (1210), Expect = 2.6e-131
Identity = 259/576 (44.97%), Positives = 341/576 (59.20%), Query Frame = 1

Query: 112 LEELNWDNSFVRELPGDPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLD 171
           L +  +DN FVR+LPGDP T  +PRQV +A Y+ V P+  V SP+L+AW++ V ++L + 
Sbjct: 18  LFDARFDNRFVRQLPGDPETRNVPRQVRNAGYTQVSPT-PVRSPRLLAWADEVGEMLGI- 77

Query: 172 LQEFERPDFPL-----LFSGASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNS 231
                RP  P+     + +G   L  + PYA  YGGHQFG WAGQLGDGRAITLGE+++ 
Sbjct: 78  ----ARPASPVSPAVEVLAGNRILPSMQPYAARYGGHQFGHWAGQLGDGRAITLGELISP 137

Query: 232 RSERWELQLKGAGKTPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFV 291
             +R+ELQLKGAGKTPYSR ADG AVLRSS+REFLCSEAMHSLG+PTTRAL L+ TG  V
Sbjct: 138 NDKRYELQLKGAGKTPYSRTADGRAVLRSSVREFLCSEAMHSLGVPTTRALSLVATGEAV 197

Query: 292 TRDMFYDGNQKEEPGAIVCRVAQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHF 351
            RDMFYDG+   EPGAIVCRV+ SFLRFG+F+I A++ + +  ++R LAD+ I  HFP  
Sbjct: 198 IRDMFYDGHPGAEPGAIVCRVSPSFLRFGNFEILAAQKEPE--LLRQLADFVIGEHFPEL 257

Query: 352 ENMSSSQSLSFSTGDEDSSVVDLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDN 411
            +      +                  YA W  EV  RT  L+A W  VGF HGV+NTDN
Sbjct: 258 ASSHRPPEV------------------YAKWFEEVCRRTGILVAHWMRVGFVHGVMNTDN 317

Query: 412 MSILGLTIDYGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPDVGLWNIAQFASTLSAAEL 471
           MSILGLTIDYGP+G+L+ FD  +TPNTTD  GRRYC+ NQP +  WN+ + A  L+   L
Sbjct: 318 MSILGLTIDYGPYGWLEGFDLHWTPNTTDAQGRRYCYGNQPKIAQWNLTRLAGALT--PL 377

Query: 472 INDKEA-NYAMERYGDKFMDDYQTIMTKKIGLPKY----NKQLISKLLNNMAVDKVDYTN 531
           I D  A  + +  +G+ F + +  ++  K+GL       +  L+S L   +   + D T 
Sbjct: 378 IEDDAALEHGLAVFGETFNNTWSGMLAAKLGLASLEHSDDDSLLSDLFETLQQVETDMTL 437

Query: 532 FFRSLANIKADPSTPEEELLVPLKAVLLDMGKERKEAWV--------------------- 591
           FFR L NI  +P +       P    L  + +      V                     
Sbjct: 438 FFRCLMNIPLNPISGNRATTFPAPENLESVDQMNDHGLVELFRPAFYDAHQAFSHAHLTR 497

Query: 592 --SWVKTYIAELAASGISDEERKASMDAINPKYILRNYLCQTAIDAAEQGDFGEVRRLLK 651
              W++ YIA +   G  +  R   M   NPKY+LRNYL Q AI+A E+GD   + RL++
Sbjct: 498 LAGWLRRYIARVRQEGEPEGLRYHRMSRANPKYVLRNYLAQQAIEALERGDDSVIIRLME 557

Query: 652 IMERPFDEQPEMEKYARLPPAWA-YRPGVCMLSCSS 654
           +++ P+DEQPE E  A   P WA  +PG   LSCSS
Sbjct: 558 MLKHPYDEQPEHEDLAARRPEWARNKPGCSALSCSS 565

BLAST of Cp4.1LG18g01390 vs. TrEMBL
Match: A0A0A0LXE5_CUCSA (Uncharacterized protein OS=Cucumis sativus GN=Csa_1G605720 PE=3 SV=1)

HSP 1 Score: 1202.2 bits (3109), Expect = 0.0e+00
Identity = 592/653 (90.66%), Positives = 616/653 (94.33%), Query Frame = 1

Query: 1   MSLVSHLFPKLSLLPHISLLCHGHRHRHRLGLVRRHSTTLIHRRSPASIFSPPSPLVGHS 60
           MSLVSHLFPK S+  +ISLLCHGHR    LGLVRR ST LI R  PAS  S PSPL  HS
Sbjct: 1   MSLVSHLFPKPSVFSNISLLCHGHR----LGLVRRRSTLLIRRHPPASFTSLPSPLPAHS 60

Query: 61  RHGRRRVSMDSASPEVSASVDSVADGLKNQSLNSDDRGGLGSGVEHGAKKKLEELNWDNS 120
           RHGRR++SMDSASPEVSASVDSVA+GLKNQSLN+DDR   GS + H  KKKLE+LNWDNS
Sbjct: 61  RHGRRKLSMDSASPEVSASVDSVAEGLKNQSLNNDDRVDGGSSINHATKKKLEDLNWDNS 120

Query: 121 FVRELPGDPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLDLQEFERPDF 180
           FVRELPGDPRTD++PR+VLHACYS VLPSVEV+SPQLVAWSESVADLLDLD QEFERPDF
Sbjct: 121 FVRELPGDPRTDIIPREVLHACYSKVLPSVEVQSPQLVAWSESVADLLDLDPQEFERPDF 180

Query: 181 PLLFSGASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGK 240
           PLLFSGASPLVG SPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGK
Sbjct: 181 PLLFSGASPLVGASPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGK 240

Query: 241 TPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGNQKEEP 300
           TPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGN KEEP
Sbjct: 241 TPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGNPKEEP 300

Query: 301 GAIVCRVAQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHFENMSSSQSLSFSTG 360
           GAIVCRVAQSFLRFGS+QIHASRGKDD+KIVRALADY IRHHFPH ENMSSSQS+SFSTG
Sbjct: 301 GAIVCRVAQSFLRFGSYQIHASRGKDDFKIVRALADYVIRHHFPHLENMSSSQSVSFSTG 360

Query: 361 DEDSSVVDLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFG 420
           + DSSVVDLTSNKYAAW VEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFG
Sbjct: 361 NTDSSVVDLTSNKYAAWTVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFG 420

Query: 421 FLDAFDPSYTPNTTDLPGRRYCFANQPDVGLWNIAQFASTLSAAELINDKEANYAMERYG 480
           FLDAFDPS+TPNTTDLPGRRYCFANQPD+GLWNIAQFASTLSAAELINDKEANYAMERYG
Sbjct: 421 FLDAFDPSFTPNTTDLPGRRYCFANQPDIGLWNIAQFASTLSAAELINDKEANYAMERYG 480

Query: 481 DKFMDDYQTIMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSLANIKADPSTPEEEL 540
           DKFMDDYQ IMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSL+N+KADPS PEEEL
Sbjct: 481 DKFMDDYQAIMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSLSNLKADPSIPEEEL 540

Query: 541 LVPLKAVLLDMGKERKEAWVSWVKTYIAELAASGISDEERKASMDAINPKYILRNYLCQT 600
           LVPLKAVLLD+GKERKEAWVSWVKTY+ ELA SGISDEERKASMDA+NPKYILRNYLCQT
Sbjct: 541 LVPLKAVLLDIGKERKEAWVSWVKTYMEELAGSGISDEERKASMDAVNPKYILRNYLCQT 600

Query: 601 AIDAAEQGDFGEVRRLLKIMERPFDEQPEMEKYARLPPAWAYRPGVCMLSCSS 654
           AIDAAEQGDFGEVR+LLKIMERPFDEQP MEKYARLPPAWAYRPGVCMLSCSS
Sbjct: 601 AIDAAEQGDFGEVRQLLKIMERPFDEQPGMEKYARLPPAWAYRPGVCMLSCSS 649

BLAST of Cp4.1LG18g01390 vs. TrEMBL
Match: W9QD90_9ROSA (UPF0061 protein azo1574 OS=Morus notabilis GN=L484_018405 PE=3 SV=1)

HSP 1 Score: 1032.3 bits (2668), Expect = 2.5e-298
Identity = 518/650 (79.69%), Positives = 565/650 (86.92%), Query Frame = 1

Query: 3   LVSHLFPKLSLLPHISLLCHGHRHRHRLGLVRRHSTTLIHRRSPASIFSPPSPLVGHSRH 62
           ++SH   K S  P  SL    H  R    L  RHS+             P  PL  H+  
Sbjct: 1   MLSHFSSKASSFPLPSL---SHLSRSAASLRHRHSSPKFPFYPSIPTKFPRLPLACHASS 60

Query: 63  GRRRVSMDSASP----EVSASVDSVADGLKNQSLNS--DDRG-GLGSGVEHGAKKKLEEL 122
           G   VSMDS SP    E +A+VDSVA  L+NQSL    D R  G G+G +  A+ KLE+L
Sbjct: 61  GG--VSMDSPSPSPSPETAAAVDSVARDLQNQSLRGGEDQRDDGFGNGAKR-ARLKLEDL 120

Query: 123 NWDNSFVRELPGDPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLDLQEF 182
           NWD+SFVRELPGDPRTD +PR+VLHACY+ VLPS EV+ PQLVAWSESVADLLDLD +EF
Sbjct: 121 NWDHSFVRELPGDPRTDTMPREVLHACYTKVLPSAEVDKPQLVAWSESVADLLDLDPKEF 180

Query: 183 ERPDFPLLFSGASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQL 242
           ERPDFPLLFSGASPLVG  PYAQCYGGHQFGMWAGQLGDGRAITLGE++NS+S+RWELQL
Sbjct: 181 ERPDFPLLFSGASPLVGAVPYAQCYGGHQFGMWAGQLGDGRAITLGEVINSKSQRWELQL 240

Query: 243 KGAGKTPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGN 302
           KGAGKTPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCL+TTG FVTRDMFYDGN
Sbjct: 241 KGAGKTPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLVTTGKFVTRDMFYDGN 300

Query: 303 QKEEPGAIVCRVAQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHFENMSSSQSL 362
            K+EPGAIVCRV+QSFLRFGS+QIHASRGK+D  IVRALADYAI+HHFPH ENM  S+SL
Sbjct: 301 PKDEPGAIVCRVSQSFLRFGSYQIHASRGKEDLGIVRALADYAIKHHFPHIENMDKSESL 360

Query: 363 SFSTGDEDSSVVDLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTID 422
           SFSTGDED SVVDLTSNKYAAWAVEVAERTASL+ASWQGVGFTHGVLNTDNMSILGLTID
Sbjct: 361 SFSTGDEDHSVVDLTSNKYAAWAVEVAERTASLVASWQGVGFTHGVLNTDNMSILGLTID 420

Query: 423 YGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPDVGLWNIAQFASTLSAAELINDKEANYA 482
           YGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPD+GLWNIAQFA+TLSAA+LI+DKEANYA
Sbjct: 421 YGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPDIGLWNIAQFATTLSAAQLIDDKEANYA 480

Query: 483 MERYGDKFMDDYQTIMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSLANIKADPST 542
           MERYG KFMD+YQ IMT+K+GLPKYNKQLISKLLNNMAVDKVDYTNFFR L+NI+ADPS 
Sbjct: 481 MERYGTKFMDEYQAIMTRKLGLPKYNKQLISKLLNNMAVDKVDYTNFFRLLSNIRADPSI 540

Query: 543 PEEELLVPLKAVLLDMGKERKEAWVSWVKTYIAELAASGISDEERKASMDAINPKYILRN 602
           PEEELL+PLKAVLLD+GKERKEAW+ WVK+YI ELAASGISDEERKASM+A+NPKYILRN
Sbjct: 541 PEEELLIPLKAVLLDIGKERKEAWIGWVKSYIEELAASGISDEERKASMNAVNPKYILRN 600

Query: 603 YLCQTAIDAAEQGDFGEVRRLLKIMERPFDEQPEMEKYARLPPAWAYRPG 646
           YLCQ+AIDAAEQGDFGEVRRLLK++ERPFDEQP MEK ARLPPAWAYRPG
Sbjct: 601 YLCQSAIDAAEQGDFGEVRRLLKVVERPFDEQPGMEKNARLPPAWAYRPG 644

BLAST of Cp4.1LG18g01390 vs. TrEMBL
Match: A0A067JMC6_JATCU (Uncharacterized protein OS=Jatropha curcas GN=JCGZ_22671 PE=3 SV=1)

HSP 1 Score: 1025.8 bits (2651), Expect = 2.3e-296
Identity = 488/588 (82.99%), Positives = 542/588 (92.18%), Query Frame = 1

Query: 67  VSMDSASPEVSASVDSVADGLKNQSLNSDDR-GGLGSGVEHGAKKKLEELNWDNSFVREL 126
           V MD++SPE + S+DS+A+ LKNQSL +DD+     +   +  K  LE+LNWD+SF+REL
Sbjct: 66  VPMDTSSPEATVSIDSLANDLKNQSLGADDKICNNNNSNANKVKLTLEDLNWDHSFIREL 125

Query: 127 PGDPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLDLQEFERPDFPLLFS 186
           PGD RTD +PRQVLHACY+ V PSVEVE+PQL+AWSESVA+ LDLD +EFERPDFPL+FS
Sbjct: 126 PGDSRTDTIPRQVLHACYTKVSPSVEVENPQLIAWSESVAEFLDLDPKEFERPDFPLIFS 185

Query: 187 GASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGKTPYSR 246
           G+SPL G  PYAQCYGGHQFGMWAGQLGDGRAITLGEILN +SERWELQLKGAGKTPYSR
Sbjct: 186 GSSPLAGALPYAQCYGGHQFGMWAGQLGDGRAITLGEILNLKSERWELQLKGAGKTPYSR 245

Query: 247 FADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGNQKEEPGAIVC 306
           FADGLAVLRSSIREFLCSEAMH LGIPTTRALCL+TTG +VTRDMFYDGN KEEPGAIVC
Sbjct: 246 FADGLAVLRSSIREFLCSEAMHHLGIPTTRALCLVTTGKYVTRDMFYDGNPKEEPGAIVC 305

Query: 307 RVAQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHFENMSSSQSLSFSTGDEDSS 366
           RVAQSFLRFGS+QIHASRG +D  IVRALADYAIRHHFPH ENM+ S+SLSFSTGDED S
Sbjct: 306 RVAQSFLRFGSYQIHASRGNEDLDIVRALADYAIRHHFPHIENMNKSESLSFSTGDEDHS 365

Query: 367 VVDLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFGFLDAF 426
           VVDLTSNKYAAW VEVAERTASL+ASWQGVGFTHGVLNTDNMSILGLTIDYGPFGFLDAF
Sbjct: 366 VVDLTSNKYAAWMVEVAERTASLVASWQGVGFTHGVLNTDNMSILGLTIDYGPFGFLDAF 425

Query: 427 DPSYTPNTTDLPGRRYCFANQPDVGLWNIAQFASTLSAAELINDKEANYAMERYGDKFMD 486
           DPSYTPNTTDLPGRRYCFANQPD+GLWNIAQF ++L+AA+LIND+EANYAMERYG KFMD
Sbjct: 426 DPSYTPNTTDLPGRRYCFANQPDIGLWNIAQFTASLTAAQLINDQEANYAMERYGTKFMD 485

Query: 487 DYQTIMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSLANIKADPSTPEEELLVPLK 546
           +YQ IMT+K+GLPKYNKQLI KLLNNMAVDKVDYTNFFR L+NIKADP+ PE+E+LVPLK
Sbjct: 486 EYQAIMTRKLGLPKYNKQLIGKLLNNMAVDKVDYTNFFRLLSNIKADPNIPEDEMLVPLK 545

Query: 547 AVLLDMGKERKEAWVSWVKTYIAELAASGISDEERKASMDAINPKYILRNYLCQTAIDAA 606
           AVLLD+GKERKEAW++WV++Y+ ELAASGISDE+RKA MD++NPKYILRNYLCQTAIDAA
Sbjct: 546 AVLLDIGKERKEAWINWVQSYVQELAASGISDEQRKAQMDSVNPKYILRNYLCQTAIDAA 605

Query: 607 EQGDFGEVRRLLKIMERPFDEQPEMEKYARLPPAWAYRPGVCMLSCSS 654
           EQGDFGEVRRLLK++ERP+DEQP MEKYARLPPAWAYRPGVCMLSCSS
Sbjct: 606 EQGDFGEVRRLLKLIERPYDEQPGMEKYARLPPAWAYRPGVCMLSCSS 653

BLAST of Cp4.1LG18g01390 vs. TrEMBL
Match: B9RI64_RICCO (Selenoprotein O, putative OS=Ricinus communis GN=RCOM_1576590 PE=3 SV=1)

HSP 1 Score: 1019.6 bits (2635), Expect = 1.7e-294
Identity = 497/591 (84.09%), Positives = 536/591 (90.69%), Query Frame = 1

Query: 67  VSMDSA-SPEVSAS---VDSVADGLKNQSLNSDDRGGLGSGVEHGAKKKLEELNWDNSFV 126
           VSMDS+ SPE +++   VDSV +  KNQSL  DD     +      K  L++LNWD+SFV
Sbjct: 65  VSMDSSGSPEAASTMSVVDSVTNDFKNQSLRDDDNNNKNNTTSK-VKSSLDDLNWDHSFV 124

Query: 127 RELPGDPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLDLQEFERPDFPL 186
           RELPGD RTD +PRQVLHAC+S V PS EVE+PQLVAWSESVA LLDLDL+EFERPDF L
Sbjct: 125 RELPGDSRTDTIPRQVLHACFSKVFPSAEVENPQLVAWSESVAVLLDLDLKEFERPDFAL 184

Query: 187 LFSGASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGKTP 246
            FSGAS LVG  PYAQCYGGHQFGMWAGQLGDGRAITLGEILNS+SERWELQLKGAGKTP
Sbjct: 185 KFSGASTLVGSLPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSKSERWELQLKGAGKTP 244

Query: 247 YSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGNQKEEPGA 306
           YSRFADGLAVLRSSIREFLCSEAMH LGIPTTRALCL+TTG +VTRDMFYDGN KEEPGA
Sbjct: 245 YSRFADGLAVLRSSIREFLCSEAMHHLGIPTTRALCLVTTGKYVTRDMFYDGNPKEEPGA 304

Query: 307 IVCRVAQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHFENMSSSQSLSFSTGDE 366
           IVCRVAQSFLRFGSFQIHASRGK+D+ IVRALADYAIRHHFPH +NM+ S+SLSFS G E
Sbjct: 305 IVCRVAQSFLRFGSFQIHASRGKEDFGIVRALADYAIRHHFPHIDNMTKSESLSFSMGAE 364

Query: 367 DSSVVDLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFGFL 426
           D S+VDLTSNKYAAW VEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFGFL
Sbjct: 365 DDSIVDLTSNKYAAWTVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFGFL 424

Query: 427 DAFDPSYTPNTTDLPGRRYCFANQPDVGLWNIAQFASTLSAAELINDKEANYAMERYGDK 486
           DAFDPSYTPNTTDLPGRRYCFANQPD+GLWNIAQF +TLS A+LINDKEANYAMERYG+K
Sbjct: 425 DAFDPSYTPNTTDLPGRRYCFANQPDIGLWNIAQFTATLSEAQLINDKEANYAMERYGNK 484

Query: 487 FMDDYQTIMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSLANIKADPSTPEEELLV 546
           FMD+YQ IMT+K+GLPKYNKQLISKLLNNMAVDKVDYTNFFR L+NIKADP+ PEEELLV
Sbjct: 485 FMDEYQAIMTRKLGLPKYNKQLISKLLNNMAVDKVDYTNFFRLLSNIKADPNIPEEELLV 544

Query: 547 PLKAVLLDMGKERKEAWVSWVKTYIAELAASGISDEERKASMDAINPKYILRNYLCQTAI 606
           PLKA LLD+GKERKEAW+SWV++Y+ ELAAS ISD+ERKA MDA+NPKYILRNYLCQTAI
Sbjct: 545 PLKAALLDIGKERKEAWISWVQSYVQELAASDISDDERKAQMDAVNPKYILRNYLCQTAI 604

Query: 607 DAAEQGDFGEVRRLLKIMERPFDEQPEMEKYARLPPAWAYRPGVCMLSCSS 654
           DAAEQGD GEVRRLLK+MERPFDEQP MEKYARLPPAWAYRPGVCMLSCSS
Sbjct: 605 DAAEQGDMGEVRRLLKLMERPFDEQPGMEKYARLPPAWAYRPGVCMLSCSS 654

BLAST of Cp4.1LG18g01390 vs. TrEMBL
Match: A0A061GSR9_THECC (Uncharacterized protein OS=Theobroma cacao GN=TCM_040408 PE=3 SV=1)

HSP 1 Score: 1016.9 bits (2628), Expect = 1.1e-293
Identity = 497/619 (80.29%), Positives = 546/619 (88.21%), Query Frame = 1

Query: 43  RRSPASIFSPP---SPLVGHSRH----GRRRVSMDSASPEVSASVDSVADGLKNQSLNSD 102
           R SP S FSP     P +  + H    G  R+   S  P  S SV+S+A+GLKNQSL   
Sbjct: 29  RSSPKSPFSPSLFLKPRLSLACHLSTGGSLRMDSPSPDPPSSLSVESIAEGLKNQSLTEQ 88

Query: 103 DRGGLGSGVEH-GAKKKLEELNWDNSFVRELPGDPRTDVLPRQVLHACYSNVLPSVEVES 162
           D   + + +++   K  LE+LNWD+SFVRELPGDPR+D +PRQVLHACY+ VLPS EVE+
Sbjct: 89  DNDNINNKIKNKNVKLGLEDLNWDHSFVRELPGDPRSDSIPRQVLHACYTKVLPSAEVEN 148

Query: 163 PQLVAWSESVADLLDLDLQEFERPDFPLLFSGASPLVGVSPYAQCYGGHQFGMWAGQLGD 222
           P+LVAWS+SVADLLDL+ +EFERPDFPL FSG SPL G  PYAQCYGGHQFG WAGQLGD
Sbjct: 149 PKLVAWSDSVADLLDLNPKEFERPDFPLKFSGVSPLAGAVPYAQCYGGHQFGTWAGQLGD 208

Query: 223 GRAITLGEILNSRSERWELQLKGAGKTPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTT 282
           GRAITLGEILNS+ ERWELQLKGAGKTPYSRFADGLAVLRSSIREFLCSEAMH LGIPTT
Sbjct: 209 GRAITLGEILNSKLERWELQLKGAGKTPYSRFADGLAVLRSSIREFLCSEAMHFLGIPTT 268

Query: 283 RALCLLTTGTFVTRDMFYDGNQKEEPGAIVCRVAQSFLRFGSFQIHASRGKDDYKIVRAL 342
           RALCL+TTG FVTRDMFYDGN KEEPGAIVCRVAQSFLRFGSFQIHASRG++D  IVR L
Sbjct: 269 RALCLVTTGKFVTRDMFYDGNPKEEPGAIVCRVAQSFLRFGSFQIHASRGEEDLGIVRDL 328

Query: 343 ADYAIRHHFPHFENMSSSQSLSFSTGDEDSSVVDLTSNKYAAWAVEVAERTASLIASWQG 402
           ADYAIRHHFPH EN+S S+SLSFSTGD+D SVVDLTSNKYAAW VEVAERTASL+A WQG
Sbjct: 329 ADYAIRHHFPHIENISKSESLSFSTGDDDHSVVDLTSNKYAAWIVEVAERTASLVARWQG 388

Query: 403 VGFTHGVLNTDNMSILGLTIDYGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPDVGLWNI 462
           VGFTHGVLNTDNMSILGLTIDYGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPD+GLWNI
Sbjct: 389 VGFTHGVLNTDNMSILGLTIDYGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPDIGLWNI 448

Query: 463 AQFASTLSAAELINDKEANYAMERYGDKFMDDYQTIMTKKIGLPKYNKQLISKLLNNMAV 522
           AQFASTL AA LINDKEANYAMERYG KFMDDYQ I+++K+GL KYNKQL++KLLNN+AV
Sbjct: 449 AQFASTLMAAHLINDKEANYAMERYGTKFMDDYQAIISQKLGLQKYNKQLVNKLLNNLAV 508

Query: 523 DKVDYTNFFRSLANIKADPSTPEEELLVPLKAVLLDMGKERKEAWVSWVKTYIAELAASG 582
           DKVDYTNFFRSL+NIKADP  PE+ELLVPLKAVLLD+G+ERKEAWVSWV++YI EL ASG
Sbjct: 509 DKVDYTNFFRSLSNIKADPGIPEDELLVPLKAVLLDIGRERKEAWVSWVQSYIQELVASG 568

Query: 583 ISDEERKASMDAINPKYILRNYLCQTAIDAAEQGDFGEVRRLLKIMERPFDEQPEMEKYA 642
           ISDEERKASMD++NPKY+LRNYLCQ+AIDAAE GDF EVRRLLK+MERP+DEQP MEKYA
Sbjct: 569 ISDEERKASMDSVNPKYVLRNYLCQSAIDAAELGDFREVRRLLKVMERPYDEQPGMEKYA 628

Query: 643 RLPPAWAYRPGVCMLSCSS 654
           RLPPAWAYRPGVCMLSCSS
Sbjct: 629 RLPPAWAYRPGVCMLSCSS 647

BLAST of Cp4.1LG18g01390 vs. TAIR10
Match: AT5G13030.1 (AT5G13030.1 unknown protein)

HSP 1 Score: 969.1 bits (2504), Expect = 1.3e-282
Identity = 462/586 (78.84%), Positives = 517/586 (88.23%), Query Frame = 1

Query: 68  SMDSASPEVSASVDSVADGLKNQSLNSDDRGGLGSGVEHGAKKKLEELNWDNSFVRELPG 127
           S  S +P   +S DS+A  L+NQSL     G +  GV+   KKKLE+ NWD+SFV+ELPG
Sbjct: 55  SSSSPTPVTDSSADSLAKDLQNQSL-----GAVDEGVK--IKKKLEDFNWDHSFVKELPG 114

Query: 128 DPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLDLQEFERPDFPLLFSGA 187
           DPRTDV+ R+VLHACYS V PSVEV+ PQLVAWS SVA+LLDLD +EFERPDFPL+ SGA
Sbjct: 115 DPRTDVISREVLHACYSKVSPSVEVDDPQLVAWSVSVAELLDLDPKEFERPDFPLMLSGA 174

Query: 188 SPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGKTPYSRFA 247
            PL G   YAQCYGGHQFGMWAGQLGDGRAITLGE+LNS+ ERWELQLKGAG+TPYSRFA
Sbjct: 175 KPLPGAMSYAQCYGGHQFGMWAGQLGDGRAITLGEVLNSKGERWELQLKGAGRTPYSRFA 234

Query: 248 DGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGNQKEEPGAIVCRV 307
           DGLAVLRSSIREFLCSE MH LGIPTTRALCLLTTG  VTRDMFYDGN KEEPGAIVCRV
Sbjct: 235 DGLAVLRSSIREFLCSETMHCLGIPTTRALCLLTTGQNVTRDMFYDGNPKEEPGAIVCRV 294

Query: 308 AQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHFENMSSSQSLSFSTGDEDSSVV 367
           +QSFLRFGS+QIHASRGK+D  IVR LADYAI+HHFPH E+M  S SLSF TGDED SVV
Sbjct: 295 SQSFLRFGSYQIHASRGKEDLDIVRKLADYAIKHHFPHIESMDRSDSLSFKTGDEDDSVV 354

Query: 368 DLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFGFLDAFDP 427
           DLTSNKYAAW VE+AERTA+L+A WQGVGFTHGVLNTDNMSILG TIDYGPFGFLDAFDP
Sbjct: 355 DLTSNKYAAWIVEIAERTATLVARWQGVGFTHGVLNTDNMSILGQTIDYGPFGFLDAFDP 414

Query: 428 SYTPNTTDLPGRRYCFANQPDVGLWNIAQFASTLSAAELINDKEANYAMERYGDKFMDDY 487
           SYTPNTTDLPGRRYCFANQPD+GLWNIAQF+ TL+ A+LIN KEANYAMERYGDKFMD+Y
Sbjct: 415 SYTPNTTDLPGRRYCFANQPDIGLWNIAQFSKTLAVAQLINQKEANYAMERYGDKFMDEY 474

Query: 488 QTIMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSLANIKADPSTPEEELLVPLKAV 547
           Q IM+KK+GL KYNK++ISKLLNNM+VDKVDYTNFFR LAN+KA+P+TPE ELL PLKAV
Sbjct: 475 QAIMSKKLGLTKYNKEVISKLLNNMSVDKVDYTNFFRLLANVKANPNTPENELLKPLKAV 534

Query: 548 LLDMGKERKEAWVSWVKTYIAELAASGISDEERKASMDAINPKYILRNYLCQTAIDAAEQ 607
           LLD+GKERKEAW+ W+++YI E+  S +SDEERKA MD++NPKYILRNYLCQ+AIDAAEQ
Sbjct: 535 LLDIGKERKEAWIKWMRSYIQEVGGSEVSDEERKARMDSVNPKYILRNYLCQSAIDAAEQ 594

Query: 608 GDFGEVRRLLKIMERPFDEQPEMEKYARLPPAWAYRPGVCMLSCSS 654
           GDF EV  L+++M+RP++EQP MEKYARLPPAWAYRPGVCMLSCSS
Sbjct: 595 GDFSEVNNLIRLMKRPYEEQPGMEKYARLPPAWAYRPGVCMLSCSS 633

BLAST of Cp4.1LG18g01390 vs. NCBI nr
Match: gi|449462599|ref|XP_004149028.1| (PREDICTED: selenoprotein O [Cucumis sativus])

HSP 1 Score: 1202.2 bits (3109), Expect = 0.0e+00
Identity = 592/653 (90.66%), Positives = 616/653 (94.33%), Query Frame = 1

Query: 1   MSLVSHLFPKLSLLPHISLLCHGHRHRHRLGLVRRHSTTLIHRRSPASIFSPPSPLVGHS 60
           MSLVSHLFPK S+  +ISLLCHGHR    LGLVRR ST LI R  PAS  S PSPL  HS
Sbjct: 1   MSLVSHLFPKPSVFSNISLLCHGHR----LGLVRRRSTLLIRRHPPASFTSLPSPLPAHS 60

Query: 61  RHGRRRVSMDSASPEVSASVDSVADGLKNQSLNSDDRGGLGSGVEHGAKKKLEELNWDNS 120
           RHGRR++SMDSASPEVSASVDSVA+GLKNQSLN+DDR   GS + H  KKKLE+LNWDNS
Sbjct: 61  RHGRRKLSMDSASPEVSASVDSVAEGLKNQSLNNDDRVDGGSSINHATKKKLEDLNWDNS 120

Query: 121 FVRELPGDPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLDLQEFERPDF 180
           FVRELPGDPRTD++PR+VLHACYS VLPSVEV+SPQLVAWSESVADLLDLD QEFERPDF
Sbjct: 121 FVRELPGDPRTDIIPREVLHACYSKVLPSVEVQSPQLVAWSESVADLLDLDPQEFERPDF 180

Query: 181 PLLFSGASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGK 240
           PLLFSGASPLVG SPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGK
Sbjct: 181 PLLFSGASPLVGASPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGK 240

Query: 241 TPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGNQKEEP 300
           TPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGN KEEP
Sbjct: 241 TPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGNPKEEP 300

Query: 301 GAIVCRVAQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHFENMSSSQSLSFSTG 360
           GAIVCRVAQSFLRFGS+QIHASRGKDD+KIVRALADY IRHHFPH ENMSSSQS+SFSTG
Sbjct: 301 GAIVCRVAQSFLRFGSYQIHASRGKDDFKIVRALADYVIRHHFPHLENMSSSQSVSFSTG 360

Query: 361 DEDSSVVDLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFG 420
           + DSSVVDLTSNKYAAW VEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFG
Sbjct: 361 NTDSSVVDLTSNKYAAWTVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFG 420

Query: 421 FLDAFDPSYTPNTTDLPGRRYCFANQPDVGLWNIAQFASTLSAAELINDKEANYAMERYG 480
           FLDAFDPS+TPNTTDLPGRRYCFANQPD+GLWNIAQFASTLSAAELINDKEANYAMERYG
Sbjct: 421 FLDAFDPSFTPNTTDLPGRRYCFANQPDIGLWNIAQFASTLSAAELINDKEANYAMERYG 480

Query: 481 DKFMDDYQTIMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSLANIKADPSTPEEEL 540
           DKFMDDYQ IMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSL+N+KADPS PEEEL
Sbjct: 481 DKFMDDYQAIMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSLSNLKADPSIPEEEL 540

Query: 541 LVPLKAVLLDMGKERKEAWVSWVKTYIAELAASGISDEERKASMDAINPKYILRNYLCQT 600
           LVPLKAVLLD+GKERKEAWVSWVKTY+ ELA SGISDEERKASMDA+NPKYILRNYLCQT
Sbjct: 541 LVPLKAVLLDIGKERKEAWVSWVKTYMEELAGSGISDEERKASMDAVNPKYILRNYLCQT 600

Query: 601 AIDAAEQGDFGEVRRLLKIMERPFDEQPEMEKYARLPPAWAYRPGVCMLSCSS 654
           AIDAAEQGDFGEVR+LLKIMERPFDEQP MEKYARLPPAWAYRPGVCMLSCSS
Sbjct: 601 AIDAAEQGDFGEVRQLLKIMERPFDEQPGMEKYARLPPAWAYRPGVCMLSCSS 649

BLAST of Cp4.1LG18g01390 vs. NCBI nr
Match: gi|659100510|ref|XP_008451128.1| (PREDICTED: selenoprotein O [Cucumis melo])

HSP 1 Score: 1189.1 bits (3075), Expect = 0.0e+00
Identity = 588/653 (90.05%), Positives = 612/653 (93.72%), Query Frame = 1

Query: 1   MSLVSHLFPKLSLLPHISLLCHGHRHRHRLGLVRRHSTTLIHRRSPASIFSPPSPLVGHS 60
           MSLVSHLFPK S+  +ISLLCHGHR    LGLV R ST LI R SP+S  S PS L  HS
Sbjct: 1   MSLVSHLFPKPSVFSNISLLCHGHR----LGLVPRRSTLLIRRHSPSSFTSLPSSLPAHS 60

Query: 61  RHGRRRVSMDSASPEVSASVDSVADGLKNQSLNSDDRGGLGSGVEHGAKKKLEELNWDNS 120
           RH RR++SMDSASPEVSASVDSVA+GLKNQSLN+DDR   GS + H  KKKLE+LNWDNS
Sbjct: 61  RHVRRKLSMDSASPEVSASVDSVAEGLKNQSLNNDDRVDGGSSINHATKKKLEDLNWDNS 120

Query: 121 FVRELPGDPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLDLQEFERPDF 180
           FVRELPGDPRTD++PR+VLHACYS VLPSVEV+SPQLVAWSESVA+LLDLD QEFERPDF
Sbjct: 121 FVRELPGDPRTDIIPREVLHACYSKVLPSVEVQSPQLVAWSESVANLLDLDPQEFERPDF 180

Query: 181 PLLFSGASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGK 240
           PLLFSGASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGK
Sbjct: 181 PLLFSGASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGK 240

Query: 241 TPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGNQKEEP 300
           TPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGN KEEP
Sbjct: 241 TPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGNPKEEP 300

Query: 301 GAIVCRVAQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHFENMSSSQSLSFSTG 360
           GAIVCRVAQSFLRFGS+QIHASRGKDDYKIVRALADY I HHFPH ENMSSSQS+SFSTG
Sbjct: 301 GAIVCRVAQSFLRFGSYQIHASRGKDDYKIVRALADYVIHHHFPHLENMSSSQSVSFSTG 360

Query: 361 DEDSSVVDLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFG 420
           + DSSVVDLTSNKYAAW VEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFG
Sbjct: 361 NTDSSVVDLTSNKYAAWTVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFG 420

Query: 421 FLDAFDPSYTPNTTDLPGRRYCFANQPDVGLWNIAQFASTLSAAELINDKEANYAMERYG 480
           FLDAFDPS+TPNTTDLPGRRYCFANQPD+GLWNIAQFASTLSAAELINDKEANYAMERYG
Sbjct: 421 FLDAFDPSFTPNTTDLPGRRYCFANQPDIGLWNIAQFASTLSAAELINDKEANYAMERYG 480

Query: 481 DKFMDDYQTIMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSLANIKADPSTPEEEL 540
           DKFMDDYQ IMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSL+NIKAD S PEEEL
Sbjct: 481 DKFMDDYQAIMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSLSNIKADSSIPEEEL 540

Query: 541 LVPLKAVLLDMGKERKEAWVSWVKTYIAELAASGISDEERKASMDAINPKYILRNYLCQT 600
           LVPLKAVLLD+GKERKEAWVSWVKTY+ ELA SGISDEERKASMD +NPKYILRNYLCQT
Sbjct: 541 LVPLKAVLLDIGKERKEAWVSWVKTYMEELAGSGISDEERKASMDVVNPKYILRNYLCQT 600

Query: 601 AIDAAEQGDFGEVRRLLKIMERPFDEQPEMEKYARLPPAWAYRPGVCMLSCSS 654
           AIDAAEQGDFGEVR+LLKIMERPFDEQP MEKYARLPPAWAYRPGVCMLSCSS
Sbjct: 601 AIDAAEQGDFGEVRQLLKIMERPFDEQPGMEKYARLPPAWAYRPGVCMLSCSS 649

BLAST of Cp4.1LG18g01390 vs. NCBI nr
Match: gi|1009118638|ref|XP_015875963.1| (PREDICTED: UPF0061 protein azo1574 [Ziziphus jujuba])

HSP 1 Score: 1038.1 bits (2683), Expect = 6.5e-300
Identity = 513/631 (81.30%), Positives = 555/631 (87.96%), Query Frame = 1

Query: 35  RHSTTLIHRRSPASI----FSPPSPL------VGHSRHGRRRVSMDSASPEVSASVDSVA 94
           R S  ++  R P+S     F P  PL      +      RR VSMDS SP+V+ SVDSVA
Sbjct: 24  RLSFAVLSLRRPSSASKFRFRPSLPLKFSGLSLACHLSSRRGVSMDSPSPDVAVSVDSVA 83

Query: 95  DGLKNQSLNSDDR--GGLGSGVEHGAKKKLEELNWDNSFVRELPGDPRTDVLPRQVLHAC 154
             L+NQSL +DD   G   S   + A+ KLE+L WD+SFVRELPGDPR+D++PR+VLH+C
Sbjct: 84  RDLQNQSLCNDDNHEGDDPSNGSNRARLKLEDLTWDHSFVRELPGDPRSDIIPREVLHSC 143

Query: 155 YSNVLPSVEVESPQLVAWSESVADLLDLDLQEFERPDFPLLFSGASPLVGVSPYAQCYGG 214
           Y+ V PS EVE PQLVAWSESVA+LLDLD +EFERPDFPLLF+GASPLVG  PYAQCYGG
Sbjct: 144 YTRVSPSAEVEKPQLVAWSESVAELLDLDPKEFERPDFPLLFTGASPLVGALPYAQCYGG 203

Query: 215 HQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGKTPYSRFADGLAVLRSSIREFLC 274
           HQFGMWAGQLGDGRAITLGE++NS+SERWELQLKGAGKT YSRFADGLAVLRSSIREFLC
Sbjct: 204 HQFGMWAGQLGDGRAITLGEVINSKSERWELQLKGAGKTAYSRFADGLAVLRSSIREFLC 263

Query: 275 SEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGNQKEEPGAIVCRVAQSFLRFGSFQIHAS 334
           SEAMHSLGIPTTRALCL TTG +VTRDMFYDGN K+EPGAIVCRVAQSFLRFGS+QIHAS
Sbjct: 264 SEAMHSLGIPTTRALCLATTGKYVTRDMFYDGNPKDEPGAIVCRVAQSFLRFGSYQIHAS 323

Query: 335 RGKDDYKIVRALADYAIRHHFPHFENMSSSQSLSFSTGDEDSSVVDLTSNKYAAWAVEVA 394
           RGK+D  IVRALADYAIRHHFPH ENMS S SLSFSTGDED SVVDLTSNKYAAW VEVA
Sbjct: 324 RGKEDLGIVRALADYAIRHHFPHIENMSKSDSLSFSTGDEDHSVVDLTSNKYAAWVVEVA 383

Query: 395 ERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFGFLDAFDPSYTPNTTDLPGRRYC 454
           ERTASL+A WQGVGFTHGVLNTDNMSILGLTIDYGPFGFLDAFDPSYTPNTTDLPGRRYC
Sbjct: 384 ERTASLLARWQGVGFTHGVLNTDNMSILGLTIDYGPFGFLDAFDPSYTPNTTDLPGRRYC 443

Query: 455 FANQPDVGLWNIAQFASTLSAAELINDKEANYAMERYGDKFMDDYQTIMTKKIGLPKYNK 514
           FANQPD+GLWNI QF STL  AELINDKEANYAMERYG K MD+YQ +MT+K+GLPKYNK
Sbjct: 444 FANQPDIGLWNIGQFTSTLLTAELINDKEANYAMERYGTKLMDEYQALMTRKLGLPKYNK 503

Query: 515 QLISKLLNNMAVDKVDYTNFFRSLANIKADPSTPEEELLVPLKAVLLDMGKERKEAWVSW 574
           QLISKLLNNMAVDKVDYTNFFRSL+N+KAD S PEEELL+PLKAVLLDMGKERKEAW+ W
Sbjct: 504 QLISKLLNNMAVDKVDYTNFFRSLSNVKADLSIPEEELLIPLKAVLLDMGKERKEAWIGW 563

Query: 575 VKTYIAELAASGISDEERKASMDAINPKYILRNYLCQTAIDAAEQGDFGEVRRLLKIMER 634
           VK YI ELAASGISDEERKASMDA+NPKYILRNYLCQ+AIDAAEQGDFGEVRRLLK+MER
Sbjct: 564 VKGYIEELAASGISDEERKASMDAVNPKYILRNYLCQSAIDAAEQGDFGEVRRLLKLMER 623

Query: 635 PFDEQPEMEKYARLPPAWAYRPGVCMLSCSS 654
           P+DEQP MEKYARLPPAWAYRPGVCMLSCSS
Sbjct: 624 PYDEQPGMEKYARLPPAWAYRPGVCMLSCSS 654

BLAST of Cp4.1LG18g01390 vs. NCBI nr
Match: gi|703065161|ref|XP_010087379.1| (UPF0061 protein azo1574 [Morus notabilis])

HSP 1 Score: 1032.3 bits (2668), Expect = 3.6e-298
Identity = 518/650 (79.69%), Positives = 565/650 (86.92%), Query Frame = 1

Query: 3   LVSHLFPKLSLLPHISLLCHGHRHRHRLGLVRRHSTTLIHRRSPASIFSPPSPLVGHSRH 62
           ++SH   K S  P  SL    H  R    L  RHS+             P  PL  H+  
Sbjct: 1   MLSHFSSKASSFPLPSL---SHLSRSAASLRHRHSSPKFPFYPSIPTKFPRLPLACHASS 60

Query: 63  GRRRVSMDSASP----EVSASVDSVADGLKNQSLNS--DDRG-GLGSGVEHGAKKKLEEL 122
           G   VSMDS SP    E +A+VDSVA  L+NQSL    D R  G G+G +  A+ KLE+L
Sbjct: 61  GG--VSMDSPSPSPSPETAAAVDSVARDLQNQSLRGGEDQRDDGFGNGAKR-ARLKLEDL 120

Query: 123 NWDNSFVRELPGDPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLDLQEF 182
           NWD+SFVRELPGDPRTD +PR+VLHACY+ VLPS EV+ PQLVAWSESVADLLDLD +EF
Sbjct: 121 NWDHSFVRELPGDPRTDTMPREVLHACYTKVLPSAEVDKPQLVAWSESVADLLDLDPKEF 180

Query: 183 ERPDFPLLFSGASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQL 242
           ERPDFPLLFSGASPLVG  PYAQCYGGHQFGMWAGQLGDGRAITLGE++NS+S+RWELQL
Sbjct: 181 ERPDFPLLFSGASPLVGAVPYAQCYGGHQFGMWAGQLGDGRAITLGEVINSKSQRWELQL 240

Query: 243 KGAGKTPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGN 302
           KGAGKTPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCL+TTG FVTRDMFYDGN
Sbjct: 241 KGAGKTPYSRFADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLVTTGKFVTRDMFYDGN 300

Query: 303 QKEEPGAIVCRVAQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHFENMSSSQSL 362
            K+EPGAIVCRV+QSFLRFGS+QIHASRGK+D  IVRALADYAI+HHFPH ENM  S+SL
Sbjct: 301 PKDEPGAIVCRVSQSFLRFGSYQIHASRGKEDLGIVRALADYAIKHHFPHIENMDKSESL 360

Query: 363 SFSTGDEDSSVVDLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTID 422
           SFSTGDED SVVDLTSNKYAAWAVEVAERTASL+ASWQGVGFTHGVLNTDNMSILGLTID
Sbjct: 361 SFSTGDEDHSVVDLTSNKYAAWAVEVAERTASLVASWQGVGFTHGVLNTDNMSILGLTID 420

Query: 423 YGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPDVGLWNIAQFASTLSAAELINDKEANYA 482
           YGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPD+GLWNIAQFA+TLSAA+LI+DKEANYA
Sbjct: 421 YGPFGFLDAFDPSYTPNTTDLPGRRYCFANQPDIGLWNIAQFATTLSAAQLIDDKEANYA 480

Query: 483 MERYGDKFMDDYQTIMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSLANIKADPST 542
           MERYG KFMD+YQ IMT+K+GLPKYNKQLISKLLNNMAVDKVDYTNFFR L+NI+ADPS 
Sbjct: 481 MERYGTKFMDEYQAIMTRKLGLPKYNKQLISKLLNNMAVDKVDYTNFFRLLSNIRADPSI 540

Query: 543 PEEELLVPLKAVLLDMGKERKEAWVSWVKTYIAELAASGISDEERKASMDAINPKYILRN 602
           PEEELL+PLKAVLLD+GKERKEAW+ WVK+YI ELAASGISDEERKASM+A+NPKYILRN
Sbjct: 541 PEEELLIPLKAVLLDIGKERKEAWIGWVKSYIEELAASGISDEERKASMNAVNPKYILRN 600

Query: 603 YLCQTAIDAAEQGDFGEVRRLLKIMERPFDEQPEMEKYARLPPAWAYRPG 646
           YLCQ+AIDAAEQGDFGEVRRLLK++ERPFDEQP MEK ARLPPAWAYRPG
Sbjct: 601 YLCQSAIDAAEQGDFGEVRRLLKVVERPFDEQPGMEKNARLPPAWAYRPG 644

BLAST of Cp4.1LG18g01390 vs. NCBI nr
Match: gi|802744158|ref|XP_012087438.1| (PREDICTED: selenoprotein O [Jatropha curcas])

HSP 1 Score: 1025.8 bits (2651), Expect = 3.3e-296
Identity = 488/588 (82.99%), Positives = 542/588 (92.18%), Query Frame = 1

Query: 67  VSMDSASPEVSASVDSVADGLKNQSLNSDDR-GGLGSGVEHGAKKKLEELNWDNSFVREL 126
           V MD++SPE + S+DS+A+ LKNQSL +DD+     +   +  K  LE+LNWD+SF+REL
Sbjct: 66  VPMDTSSPEATVSIDSLANDLKNQSLGADDKICNNNNSNANKVKLTLEDLNWDHSFIREL 125

Query: 127 PGDPRTDVLPRQVLHACYSNVLPSVEVESPQLVAWSESVADLLDLDLQEFERPDFPLLFS 186
           PGD RTD +PRQVLHACY+ V PSVEVE+PQL+AWSESVA+ LDLD +EFERPDFPL+FS
Sbjct: 126 PGDSRTDTIPRQVLHACYTKVSPSVEVENPQLIAWSESVAEFLDLDPKEFERPDFPLIFS 185

Query: 187 GASPLVGVSPYAQCYGGHQFGMWAGQLGDGRAITLGEILNSRSERWELQLKGAGKTPYSR 246
           G+SPL G  PYAQCYGGHQFGMWAGQLGDGRAITLGEILN +SERWELQLKGAGKTPYSR
Sbjct: 186 GSSPLAGALPYAQCYGGHQFGMWAGQLGDGRAITLGEILNLKSERWELQLKGAGKTPYSR 245

Query: 247 FADGLAVLRSSIREFLCSEAMHSLGIPTTRALCLLTTGTFVTRDMFYDGNQKEEPGAIVC 306
           FADGLAVLRSSIREFLCSEAMH LGIPTTRALCL+TTG +VTRDMFYDGN KEEPGAIVC
Sbjct: 246 FADGLAVLRSSIREFLCSEAMHHLGIPTTRALCLVTTGKYVTRDMFYDGNPKEEPGAIVC 305

Query: 307 RVAQSFLRFGSFQIHASRGKDDYKIVRALADYAIRHHFPHFENMSSSQSLSFSTGDEDSS 366
           RVAQSFLRFGS+QIHASRG +D  IVRALADYAIRHHFPH ENM+ S+SLSFSTGDED S
Sbjct: 306 RVAQSFLRFGSYQIHASRGNEDLDIVRALADYAIRHHFPHIENMNKSESLSFSTGDEDHS 365

Query: 367 VVDLTSNKYAAWAVEVAERTASLIASWQGVGFTHGVLNTDNMSILGLTIDYGPFGFLDAF 426
           VVDLTSNKYAAW VEVAERTASL+ASWQGVGFTHGVLNTDNMSILGLTIDYGPFGFLDAF
Sbjct: 366 VVDLTSNKYAAWMVEVAERTASLVASWQGVGFTHGVLNTDNMSILGLTIDYGPFGFLDAF 425

Query: 427 DPSYTPNTTDLPGRRYCFANQPDVGLWNIAQFASTLSAAELINDKEANYAMERYGDKFMD 486
           DPSYTPNTTDLPGRRYCFANQPD+GLWNIAQF ++L+AA+LIND+EANYAMERYG KFMD
Sbjct: 426 DPSYTPNTTDLPGRRYCFANQPDIGLWNIAQFTASLTAAQLINDQEANYAMERYGTKFMD 485

Query: 487 DYQTIMTKKIGLPKYNKQLISKLLNNMAVDKVDYTNFFRSLANIKADPSTPEEELLVPLK 546
           +YQ IMT+K+GLPKYNKQLI KLLNNMAVDKVDYTNFFR L+NIKADP+ PE+E+LVPLK
Sbjct: 486 EYQAIMTRKLGLPKYNKQLIGKLLNNMAVDKVDYTNFFRLLSNIKADPNIPEDEMLVPLK 545

Query: 547 AVLLDMGKERKEAWVSWVKTYIAELAASGISDEERKASMDAINPKYILRNYLCQTAIDAA 606
           AVLLD+GKERKEAW++WV++Y+ ELAASGISDE+RKA MD++NPKYILRNYLCQTAIDAA
Sbjct: 546 AVLLDIGKERKEAWINWVQSYVQELAASGISDEQRKAQMDSVNPKYILRNYLCQTAIDAA 605

Query: 607 EQGDFGEVRRLLKIMERPFDEQPEMEKYARLPPAWAYRPGVCMLSCSS 654
           EQGDFGEVRRLLK++ERP+DEQP MEKYARLPPAWAYRPGVCMLSCSS
Sbjct: 606 EQGDFGEVRRLLKLIERPYDEQPGMEKYARLPPAWAYRPGVCMLSCSS 653

The following BLAST results are available for this feature:
Match Name   E-value   Identity (%)  Description
Y1574_AZOSB  4.0e-140  50.09         UPF0061 protein azo1574 OS=Azoarcus sp. (strain BH72) GN=azo1574 PE=3 SV=1
Y3800_AROAE  8.3e-138  47.66         UPF0061 protein AZOSEA38000 OS=Aromatoleum aromaticum (strain EbN1) GN=AZOSEA380...
Y1788_METFK  6.0e-136  46.70         UPF0061 protein Mfla_1788 OS=Methylobacillus flagellatus (strain KT / ATCC 51484...
Y683_TOLAT   2.4e-132  47.15         UPF0061 protein Tola_0683 OS=Tolumonas auensis (strain DSM 9187 / TA4) GN=Tola_0...
Y1510_NITMU  2.6e-131  44.97         UPF0061 protein Nmul_A1510 OS=Nitrosospira multiformis (strain ATCC 25196 / NCIM...

Match Name        E-value   Identity (%)  Description
A0A0A0LXE5_CUCSA  0.0e+00   90.66         Uncharacterized protein OS=Cucumis sativus GN=Csa_1G605720 PE=3 SV=1
W9QD90_9ROSA      2.5e-298  79.69         UPF0061 protein azo1574 OS=Morus notabilis GN=L484_018405 PE=3 SV=1
A0A067JMC6_JATCU  2.3e-296  82.99         Uncharacterized protein OS=Jatropha curcas GN=JCGZ_22671 PE=3 SV=1
B9RI64_RICCO      1.7e-294  84.09         Selenoprotein O, putative OS=Ricinus communis GN=RCOM_1576590 PE=3 SV=1
A0A061GSR9_THECC  1.1e-293  80.29         Uncharacterized protein OS=Theobroma cacao GN=TCM_040408 PE=3 SV=1

Match Name   E-value   Identity (%)  Description
AT5G13030.1  1.3e-282  78.84         unknown protein

Match Name                         E-value   Identity (%)  Description
gi|449462599|ref|XP_004149028.1|   0.0e+00   90.66         PREDICTED: selenoprotein O [Cucumis sativus]
gi|659100510|ref|XP_008451128.1|   0.0e+00   90.05         PREDICTED: selenoprotein O [Cucumis melo]
gi|1009118638|ref|XP_015875963.1|  6.5e-300  81.30         PREDICTED: UPF0061 protein azo1574 [Ziziphus jujuba]
gi|703065161|ref|XP_010087379.1|   3.6e-298  79.69         UPF0061 protein azo1574 [Morus notabilis]
gi|802744158|ref|XP_012087438.1|   3.3e-296  82.99         PREDICTED: selenoprotein O [Jatropha curcas]
The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term       Definition
IPR003846  UPF0061
GO Assignments
This gene is annotated with the following GO terms.
Category            Term Accession  Term Name
biological_process  GO:0042967      acyl-carrier-protein biosynthetic process
biological_process  GO:0006464      cellular protein modification process
biological_process  GO:0009107      lipoate biosynthetic process
biological_process  GO:0008150      biological_process
cellular_component  GO:0009570      chloroplast stroma
cellular_component  GO:0005575      cellular_component
molecular_function  GO:0016415      octanoyltransferase activity
molecular_function  GO:0003674      molecular_function

The following mRNA feature(s) are a part of this gene:

Feature Name       Unique Name        Type
Cp4.1LG18g01390.1  Cp4.1LG18g01390.1  mRNA


Analysis Name: InterPro Annotations of Cucurbita pepo
Date Performed: 2017-12-02
IPR Term   IPR Description                         Source   Source Term     Source Description     Alignment
IPR003846  Uncharacterised protein family UPF0061  HAMAP    MF_00692        UPF0061                coord: 113..636; score: 4
IPR003846  Uncharacterised protein family UPF0061  PFAM     PF02696         UPF0061                coord: 140..619; score: 1.8E
None       No IPR available                        PANTHER  PTHR32057       FAMILY NOT NAMED       coord: 96..653; score: 9.4E
None       No IPR available                        PANTHER  PTHR32057:SF14  UPF0061 PROTEIN FMP40  coord: 96..653; score: 9.4E

The following gene(s) are paralogous to this gene:
Gene             Paralogue        Organism                   Block
Cp4.1LG18g01390  Cp4.1LG04g11910  Cucurbita pepo (Zucchini)  cpecpeB363