Sequences
The following sequences are available for this feature:
Gene sequence (with intron)
ATGAACAGTTTGTGCAGTTGGAGGAACTGCAAGATGATGCTAACCCAAATTGTGGAGGAAATTGTTACATTATACTACAGAGACTCATCCAGACTGATCCTCAGCATAGAGCAAGCACTGTAAGAGCAGCTTTTTATTTTCCTTTCTTGGCCCAAATTCTAATTAATTTTGGGCGTGTTTGTTAATTAATCTCCAAAATGCAGACAGAAGAGCCCTCTGGACTTGAGTAAGTTGGATGCACTTATGCACCAGTTTAAAGGAAGTAGCTCAAGGTAACACCCTAAATAAAATAATATTTTATTATTTAAAATGTAGTGAAAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGATGGAGTAGGTGTGAGGTTGGGCATTGGAGTATATACTCTGATGGGCACTTTAAAAATAATTAAAAATAGTAGCTAAACAAAAAATGTCACTGTTGAAAAAACAGACATTATATTATTCATGTTTCTTTTGTAATGGGTAATGGGTGCAGTATTGGAGCCAAAAAGGTGAAAGCTGAGTGCACACAGTTGAGGGAATATTGCAAGGCAGGAAGTGGAGAAGGGTAAATATCTATCATACTCATACATATATCTTTCTTCTTATTCCTTCAACGTGGACGGCTGAGATCAATTTCATCCCTTTCCCCTGCTCCTGCATTTTAAGTATATATCTATATTTCTCTCTCTTTTCTTCGAAATATGATCACTTTTAAATTATCAAACTACAAAGTAACATGGGCTTTCAGTTGCATGTATGCTTCAGTTTTTGCATGCATGCAATGGATCGAAACAAGGACAGGAGGAAGTAGGAACTCAATATTATTATTATTATTATTATTATTTACATCGCATGCTCACTTTCCTACATTAATAATAATAATAAAAAATAAAAGCACCAACTTGTTTAAGTTATTTTGAACTAATTAAAAATAAATATTCGTGAGTAGGGGTACACTTTGATATAGTTTATTTATCTACTTATTGGGATGAAAAAAGATGCTTGAGGACATTCCAACAACTGAAGAAAGAATATACAACTCTGAGAAAGAAGCTTGAAGCCTATTTTCAGGTTCTTATAAAGCTCCATTCTTTGGTTCCTTTTTTACTTTTTTTACTTTTACTTTTCAATAATATAATAAATAATCTGTAAACTAAGGTAACATTGAACTTTTGTGGTCTAATATTGCAGCTAGCAAGACAGGCTGGGCCCACTGAATCTGCCTGTCGGCCCAAGTGATCATATGTGGATGACAAATGGACGACAAAGCTGATATGTGGCAAGATGTGATTGGAGATCTATATGTTTCTCTACATTATGTAAGCTCCAATTATGGATTGGAGCAGCCTGTGGGGCTTAGCTTATGGGTTGCTTTTGTTTTGTTTTGTTTTGTTTGGTTCATGAATGCAATTTTAATTTTAATTTTAATTGGAAGTTTTGGTTTCTTTATGACGATTGTTTGGCCTTCTGATCTTTAACTTAAGACATATCCCATTCTCGGCCCAATTATTTAATTTGGATGTCGAGGGACAAAGCCTGTACTCTTCTTTTATTTTTTCAGATCAAAATGGATTCAACGAGAGAAAATATATGTTTCCATTCTTAAATTAAGGGAAATAGTTAGACAATAAACGACATCAAATTATGTCTTATTAACTTATATAGAAAAAAAAATCAATAATTAATGAATGTCTATTTCATTTGATAAGTCAAATATTTAAATCTCTTTCTTCTACATATCCTATCTAATTTTTTATTGAACAACTGTCTCGCTTACATAAAGATAGAAAAAAACTTAGGTACATTACAACACTTACTATTATACATTCCACTAAATTGTTATCATAATTTATTTATTTATTTATTTTTGGGTATGAGTAATGCGATGAGTATGCGTAAAAATAAAAAGGCAAAATCACAACAATAAAACCTGGACGGAATAATGTGATGACCGATGAGAAAAGGGCGTGACGTAGGTGGGAGAGATCGA
CGAAGGCCGCCCCCTCCGCTCGCATTTGCAAACCCAAAACTCTTTCTCCTACTTTTTTTCTACTCTCCCAATCACTCCACCTTCTTCTTTCTCTCTCTTAAGCCTTCCCTTCTGCTTCCTACTCCTCTGCACTTCCTTCTTAAGCTAGGGTTTCCTCGGTGATCGATTCAAATTGCAAGCTCTCTCTCTCGTCCAATCCCTCTTCATCTTCTCGGGACTCGGTTTTTTCTTTAAGGAAAACATCTTTTTGCATTGCTGCTTTTCGTGTTTCTTCCTTTCCCTTTGTCTCTTATCTTCTAGATGCGTTCATAGGGTTTTGGAGATTTCTTGTTTGGGAGGAGGGTTTCTGATGATTGGCCCGGAAATCTGACTCAACTTCAACTTACTGGTGATTTCGCGATTTTCTATCTTCGGGGGTTCTGCTTGCTCTAGTTTTTGGGGAGATATAGCTATGGCTTCTTACAGGCCATATCCTCCACAATCGTCTTTCGGTCCTGCACCTGGTCAAAACTCGATTCCGCCTCCACCAGCGCAATCGGCCGCCGTTCCAGCGCAACAGCGAGGAGGAGGTAGTCAGTATAATCAGAATTGGGGTGGTTATGGTGGTGACGGGTCTGTGCCTCCTGCTCCATCTTCATCGTATCCCCAAAATTACAACCAAGTCCATCAAAGTTCTAATTACAACCAGCAACATTATGGTCCCCGAGAAGCCAACACCCTCCACCTCCTCCTCCTCACCAGTCGTATCCTTATGCACCACAACCGCCTCCGCCCTCCTCCCGATTCATCCTATCCACCGCCTCCACCCCCACCAGCGCCCTCGCAACCTTCTAATCATTACTATCCCCCTTCACAGTATTCGCAGAGTAATCAAAATCAGCAGTCAATGCAGCCACCACCTCCGCCCTCATCTCCACCACCGAGCTCTTCAATTCCCCCACCTCCACCTCCAAATTCTCCACCACCTCCATCAGCGCCTCAGCAAAAAGCAGAGGGTACAAATATGGGAGCACACGAACGTGATAAAGGGGTTTCAAAGGATCCCTCATATGGCAGGCGTGAACGTGAAAATTCAAATCATGACAAACACCAGAGGCATTCTGGTCCCCCAATGCCTCCCAAGAAAGCAAACGGTCCTTCAGGAAGAATGGAGACAGATGATGAGAAAAGACTGAGGAAGAAAAGAGAGTTCGAAAAACAAAGGCAAGATGAGAGGCACAGACACCATCTTAAAGAATCCCAAAACACTATACTGCAAAAGACCCAGATGATATCTACTGGGAAGGGGCATGGATCAATTGCAGGGTCCCGAATGGGGAAAGGAAGGCCACTCCATTTCTCAGTGGTGAGAGGATAGAAAATAGGTTGAAGAAGCCAACAACATTTTGTGCAAGTTGAAGTGAGTATGTAGATACTTTCATTGTTATGATCTGGTTTTAACTCTTTCTATAATTATTGACGTGTACTAAGAGTTATTTGGATGAATCGATGAACCAATGTTTTTCTAGTAGAAATGATATTTCGTGCTTCTTCTCTTTTGATGTTTATTGATTTTTATTTCATGTGGCTAATGATTCCATGCATTTCAAAGGTTCCGGAACGAGCTTCCAGATACAAGTGCTCAGCCGAAGCTCATGTCACTACGGAAAGAGAAAGATCAGTATGTATTCTCTTCTTTTCCCTCCAAACTTTGAATAGATGTGCAGTTTGTGGGTTACTGATGATATGTGGGGTGCTTCTCCATTTATGTTATTTCTCCCTTCATCTCGCTGGAGTACTATTATGTTATGTTGGCGATCTAGAATTATGTGACGTAGTATAAAGATTATACTTGTTGGGATATTGAGCAACAAATGAAATTGTATAAATTAAGTGAAACGTCTTGCTAGACATAAGGGAAATAGAGCAAGGTGAGTAAGAGCACTAGAGTCTAGTTGTAATATATGAGAACTCAGAAAGCTAAAGTTTGATGCATTTGCTTATAAATGTTCACCTAGATA
GACAACCTTACTTGAATAGGATCTGTCCTATAGGTTGTAGAAATGTGTCGAGTTCTAAAAATGTTCACCTAGATAGAGTCCATCTCAAGAAGCTGGGTTGGCACTGTGTTCTTTTATCCTAAGTTGGAAGAAATGTAGTTTAGAATCATGTGACAGTTTAGCTCTCGGGTTTAATCTTCTTTCCAAAGCATTACTTGAAAGACATAAAGGGAAGCGGAAGTCATGTCATGTGTATTAACATTTGAGATTGGATGTTAACTTATTGTGTGAAAAGATAGTTGAAGCCACCACTGTGTTTAGGTTGTATTGGTTGTTTCAACTTAAAGTTATTAAATTTCTTGGTCTTACGGTGTTTTTATCTTTATTATCTTACAAATAGGTTTGACTTAAGTAATAAGAGGTGGCATCTTCTTTTTTTGGGGTTGCAGCTATACAAGATATACAATCACATCACTAGAGAAAACATACAAACCTCAGCTTTATGTGGAGCCAGATCTTGGAATACCCTCGATTTGCTTGACCTCAGCGTATACAAGTATGGATCATATAAATATTTTCTATAATGTGGCTATTCTTTGTGTGCCATTATTCTAACTTTTCTTTATGCATGTGTTTACTAGCCCTCCTAGTGTTAGAATTCCCCTTGCTCCTGAAGATGAGGAACTATTACGCGATGATGTATTGAAAACTCCAGTTAAAAAGGATGGTGGTATAAAAAGAAAAGAGCGTCCTACTGATAAAGGTGTTGCCTGGCTTGTTAAGACGCAGTACATCTCTCCTCTTAGCATTGAATCGGCGAAACAGGTATTGATATTTAGGTTCTTTCATACAAATAATTCTGTTGTAGTTGTGTCGGTGCTAACAAGGATTACATCTGCAGTCTTTGACTGAAAAACAGGCAAAAGAACTGCGAGAAATGAAGGGAGGGCGCAATATTCTTGAGAACCTCAATAATAGGTTTGTCTTTTAACTTTAAGAGAAGGGGAAAAAGAAAACCCAAAGTTATATGTCGTGTGATTATATCCTTCACTGAAAGATCTTTATATTTTCCAGGGAAAGGCAAATTAAGGAAATTGAGGCGTCATTCGAGGCATGCAAGTCACGCCCTGTTCATGCCACTAATAAGAATTTATATCCTGTAGAGGTTTTACCTCTTCTACCTGATTTTGATAGGTACAAAATACTTGAACGCTTGATTGTGAATTTGGTAAATAAATAATAGATTTATCATAGTACATTAGGGCAAGAATTTTGTGTCAACTTCTCTGCAAGGCTACAATACAACGTTAGGATTTAGATGGCAATTAGGTTGCATTCCTTTCTCCTGTGGCACTGGATGTTTCGTGTAAGCACTTGATTCACTTTTAAGAAAGTTTGATCGCTAGCAACAATGTTATCTCTGATAACGATCTTACATCATGTGCCGGCACTTTCTAATTTTGGCATTCCTATAATTTGTAACAGTTATGAGGAAGTTTGGAGGCAGTGTGCCTTATATCTCATTCGTTGGTGAAGTAGTTTCTTATTCAAAATGATATTGTTATTGCATCAGGTATGATGATCCATTGTGTGGTGGCGTTTGATAGCGCTCCCACAGCTGATTCAGAGACTTTCAATAAGTTAGACCAATCCATCCGTATTGCTCATGAATCACAGGTCAGTTCTTGAGGTCATTTATACTTCAAATAGATTGTTGAAAGGCTGGTTGCTAAAGTTCGTTTTGTTAACTCAGATATAAATGTTTTCTTTTTGCATTCCAAAAAAATATTATTGGTTTTAGTTTCAGTATTAGGCTCTTGTTTGATGTAGAATTTGTTTCAGATTTTTGTGCATCCGAGGGAATGGTTTATGTGGCATAATATTGATTGTCAGATAGGCAAGGGTTTCTGTGGATTAGATTGGGTAGGTTGGGAAGTATTTTGGACGTAATCCAAATGTTCAAGTTGGAAATTAGCCAACTTCAATAACCCTAGTTTATCACTTGTTTGAATCGGTAAGGA
CATCTATTTTTAAAGTTTGTACAGAAAAATATAGTGAGATAATTCAACAGTAGTTTTCCGATGCAATCATTAAATTATTTAACTTCTCCCTGTTGGGATGCCCGACAGAATTAAATTATCTTAATCAATGCTTTTAGACTTGATTCAATTGCTTTCTGGAAATTACTGGAGAGGAGGTATCATTGGTTGGAATTAAATTTAAAGAGGAGGTACATGTGTAATTGGTTTTCATAATAGGAAATGTCTGAGTAGAGACTGATTAATTAATTTGAGTTAGTTCAGTTCAACTCAAAAAATTAGTAGAACTACAACAATTTAATCTACTTCACTGAATGAGTTCAATTTATTTGACCTCAAATCAATTTAATAAGTTTATTACAGAACTTTTTTTTTTTTTTTTTTTTTTTTAAGAAACAAAACTTTTCATTAATAAAATGAAAAGAGATTATTGCTCAAATTACAAGGAAACAAAGCGAAAAGCCAGAACAAACTCAAGAGAGTCATGCTTTACAACGAATAATAGGCAGAATAAATAAAAGCCCTCCAATAAACATAAAATTTATTGAATTTGACTCTTCTAACTTTTCTGAAAAGTCCTAATCTTTAGTACTGTACCAGAATTATGTTGTTAACCTGCAGTATGAAATAGTTAGTTCTAACCCCACTTGTTGATCCTTACCAGGCGATAATGAAAAGTTACATGGCAACAGGCTCAGATCCTTCAAAACCTGAGAAATTTCTTGCATACATGGTTCCTCTCCAGATGAGGTTGGTCAATGTATTATATAATTTAGTATTATTATTTGCTATTGGAATTATAGTTTCTATTTTTCTTATCTTGCCAAAAGCAAAAGCCAAATGAAGTCCATAGCATTCTTAGCCATCTTCGCTATTCCAAGAACAAGAGGAAGATCTTAAACATAGTGTTGTAATCTATAAGGCAAAGATTAACAACACCTCCACTTAGTGTTTACGTTCCTTTATATGAAAAATCTCTTCTGTGTTTACCTTTCTTTTTATTGAAAGTCTCGTTCTCTCTCTCTCACACACACATACACACTCCTACGTGAAGCAAACTCAATTAATAAGTATCATTCTTATTTGAAGCAAGTTCATGTTATCCTCTTGTTTTGATTCAGCTATCAAAGGATATTTATGATGAACAAGAAGATGTCGCATATTCCTGGGTTCGTGAGTACCATTGGGATGTAAGTGCAGGCATTTTTAATTAGTGTGCAACATTAAGTTAGAATTTTGACTGCACGCTTGATGAGGTTTCTGATGACACTTTTAACAGGTACGGGGTGACAATGTGGATGACCCCACTACATATCTTGTTTCATTTGATGATGCAGAAGCTCGTTATGTGGTATTTTCATCTCTCTTGTATCTCTTTATAATCCATTGGAATTCACGTATGTATCATGGCTTGTTCTTGGTTTAGATATTTACTCATCTGTTCAATTTCAATTGTAGCCACTTCCTACAAAGCTTGTTCTTAGAAAAAAGAGGGCTAAAGAAGGGAGATCTAGTGATGAGGTTGAACATTTTCCTGCACCTGCAAGAGTGACTGTAAGGAGAAGACCAACTGTAGCTACGTTGGAAGTGAAGGATCCAGGGGTATAGTTTAAGTTTATGAACTTCAATTAGTATTACGATTACCCATGTTTATGGTTAGCGATTTTTTAATGTGGTGCTGCTTTCAGGTTTACTCAAATTCGAAAAGAGGATCAGATATTGAAGATGGTATAGGGAGATCACATAAACATGATAGACACCAAGACATGGATCAATACAGTGGAGCTGAAGACGAGATGTCTGATTGA
mRNA sequence
ATGAACAGTTTGTGCAGTTGGAGGAACTGCAAGATGATGCTAACCCAAATTGTGGAGGAAATTGTTACATTATACTACAGAGACTCATCCAGACTGATCCTCAGCATAGAGCAAGCACTTATTGGAGCCAAAAAGGTGAAAGCTGAGTGCACACAGTTGAGGGAATATTGCAAGGCAGGAAGTGGAGAAGGGCCATATCCTCCACAATCGTCTTTCGGTCCTGCACCTGGTCAAAACTCGATTCCGCCTCCACCAGCGCAATCGGCCGCCGTTCCAGCGCAACAGCGAGGAGGAGGTAGTCAGTATAATCAGAATTGGGGTGGTTATGGTGGTGACGGGTCTGTGCCTCCTGCTCCATCTTCATCGTATCCCCAAAATTACAACCAAGTCCATCAAAGTTCTAATTACAACCAGCAACATTATGGTCCCCGAGAAGCCAACACCCTCCACCTCCTCCTCCTCACCAGTCGTATCCTTATGCACCACAACCGCCTCCGCCCTCCTCCCGATTCATCCTATCCACCGCCTCCACCCCCACCAGCGCCCTCGCAACCTTCTAATCATTACTATCCCCCTTCACAGTATTCGCAGAGTAATCAAAATCAGCAGTCAATGCAGCCACCACCTCCGCCCTCATCTCCACCACCGAGCTCTTCAATTCCCCCACCTCCACCTCCAAATTCTCCACCACCTCCATCAGCGCCTCAGCAAAAAGCAGAGGGTACAAATATGGGAGCACACGAACGTGATAAAGGGGTTTCAAAGGATCCCTCATATGGCAGGCGTGAACGTGAAAATTCAAATCATGACAAACACCAGAGGCATTCTGGTCCCCCAATGCCTCCCAAGAAAGCAAACGGTCCTTCAGGAAGAATGGAGACAGATGATGAGAAAAGACTGAGGAAGAAAAGAGAGTTCGAAAAACAAAGGCAAGATGAGAGGCACAGACACCATCTTAAAGAATCCCAAAACACTATACTGCAAAAGACCCAGATGATATCTACTGGGAAGGGGCATGGATCAATTGCAGGGTCCCGAATGGGGAAAGGAAGGCCACTCCATTTCTCAGTGGTTCCGGAACGAGCTTCCAGATACAAGTGCTCAGCCGAAGCTCATGTCACTACGGAAAGAGAAAGATCACTATACAAGATATACAATCACATCACTAGAGAAAACATACAAACCTCAGCTTTATGTGGAGCCAGATCTTGGAATACCCTCGATTTGCTTGACCTCAGCGTATACAACCCTCCTAGTGTTAGAATTCCCCTTGCTCCTGAAGATGAGGAACTATTACGCGATGATGTATTGAAAACTCCAGTTAAAAAGGATGGTGGTATAAAAAGAAAAGAGCGTCCTACTGATAAAGGTGTTGCCTGGCTTGTTAAGACGCAGTACATCTCTCCTCTTAGCATTGAATCGGCGAAACAGTCTTTGACTGAAAAACAGGCAAAAGAACTGCGAGAAATGAAGGGAGGGCGCAATATTCTTGAGAACCTCAATAATAGGGAAAGGCAAATTAAGGAAATTGAGGCGTCATTCGAGGCATGCAAGTCACGCCCTGTTCATGCCACTAATAAGAATTTATATCCTGTAGAGGTTTTACCTCTTCTACCTGATTTTGATAGGTACAAAATACTTGAACGCTTGATTGTGAATTTGGTGGTGGCGTTTGATAGCGCTCCCACAGCTGATTCAGAGACTTTCAATAAGTTAGACCAATCCATCCGTATTGCTCATGAATCACAGCTATCAAAGGATATTTATGATGAACAAGAAGATGTCGCATATTCCTGGGTTCGTGAGTACCATTGGGATGTACGGGGTGACAATGTGGATGACCCCACTACATATCTTGTTTCATTTGATGATGCAGAAGCTCGTTATGTGCCACTTCCTACAAAGCTTGTTCTTAGAAAAAAGAGGGCTAAAGAAGGGAGATCTAGTGATGAGGTTGAACATTTTCCTGCACCTGCAAGAGTGACTGTAAGGAGAAGACC
AACTGTAGCTACGTTGGAAGTGAAGGATCCAGGGGTTTACTCAAATTCGAAAAGAGGATCAGATATTGAAGATGGTATAGGGAGATCACATAAACATGATAGACACCAAGACATGGATCAATACAGTGGAGCTGAAGACGAGATGTCTGATTGA
Coding sequence (CDS)
ATGAACAGTTTGTGCAGTTGGAGGAACTGCAAGATGATGCTAACCCAAATTGTGGAGGAAATTGTTACATTATACTACAGAGACTCATCCAGACTGATCCTCAGCATAGAGCAAGCACTTATTGGAGCCAAAAAGGTGAAAGCTGAGTGCACACAGTTGAGGGAATATTGCAAGGCAGGAAGTGGAGAAGGGCCATATCCTCCACAATCGTCTTTCGGTCCTGCACCTGGTCAAAACTCGATTCCGCCTCCACCAGCGCAATCGGCCGCCGTTCCAGCGCAACAGCGAGGAGGAGGTAGTCAGTATAATCAGAATTGGGGTGGTTATGGTGGTGACGGGTCTGTGCCTCCTGCTCCATCTTCATCGTATCCCCAAAATTACAACCAAGTCCATCAAAGTTCTAATTACAACCAGCAACATTATGGTCCCCGAGAAGCCAACACCCTCCACCTCCTCCTCCTCACCAGTCGTATCCTTATGCACCACAACCGCCTCCGCCCTCCTCCCGATTCATCCTATCCACCGCCTCCACCCCCACCAGCGCCCTCGCAACCTTCTAATCATTACTATCCCCCTTCACAGTATTCGCAGAGTAATCAAAATCAGCAGTCAATGCAGCCACCACCTCCGCCCTCATCTCCACCACCGAGCTCTTCAATTCCCCCACCTCCACCTCCAAATTCTCCACCACCTCCATCAGCGCCTCAGCAAAAAGCAGAGGGTACAAATATGGGAGCACACGAACGTGATAAAGGGGTTTCAAAGGATCCCTCATATGGCAGGCGTGAACGTGAAAATTCAAATCATGACAAACACCAGAGGCATTCTGGTCCCCCAATGCCTCCCAAGAAAGCAAACGGTCCTTCAGGAAGAATGGAGACAGATGATGAGAAAAGACTGAGGAAGAAAAGAGAGTTCGAAAAACAAAGGCAAGATGAGAGGCACAGACACCATCTTAAAGAATCCCAAAACACTATACTGCAAAAGACCCAGATGATATCTACTGGGAAGGGGCATGGATCAATTGCAGGGTCCCGAATGGGGAAAGGAAGGCCACTCCATTTCTCAGTGGTTCCGGAACGAGCTTCCAGATACAAGTGCTCAGCCGAAGCTCATGTCACTACGGAAAGAGAAAGATCACTATACAAGATATACAATCACATCACTAGAGAAAACATACAAACCTCAGCTTTATGTGGAGCCAGATCTTGGAATACCCTCGATTTGCTTGACCTCAGCGTATACAACCCTCCTAGTGTTAGAATTCCCCTTGCTCCTGAAGATGAGGAACTATTACGCGATGATGTATTGAAAACTCCAGTTAAAAAGGATGGTGGTATAAAAAGAAAAGAGCGTCCTACTGATAAAGGTGTTGCCTGGCTTGTTAAGACGCAGTACATCTCTCCTCTTAGCATTGAATCGGCGAAACAGTCTTTGACTGAAAAACAGGCAAAAGAACTGCGAGAAATGAAGGGAGGGCGCAATATTCTTGAGAACCTCAATAATAGGGAAAGGCAAATTAAGGAAATTGAGGCGTCATTCGAGGCATGCAAGTCACGCCCTGTTCATGCCACTAATAAGAATTTATATCCTGTAGAGGTTTTACCTCTTCTACCTGATTTTGATAGGTACAAAATACTTGAACGCTTGATTGTGAATTTGGTGGTGGCGTTTGATAGCGCTCCCACAGCTGATTCAGAGACTTTCAATAAGTTAGACCAATCCATCCGTATTGCTCATGAATCACAGCTATCAAAGGATATTTATGATGAACAAGAAGATGTCGCATATTCCTGGGTTCGTGAGTACCATTGGGATGTACGGGGTGACAATGTGGATGACCCCACTACATATCTTGTTTCATTTGATGATGCAGAAGCTCGTTATGTGCCACTTCCTACAAAGCTTGTTCTTAGAAAAAAGAGGGCTAAAGAAGGGAGATCTAGTGATGAGGTTGAACATTTTCCTGCACCTGCAAGAGTGACTGTAAGGAGAAGACC
AACTGTAGCTACGTTGGAAGTGAAGGATCCAGGGGTTTACTCAAATTCGAAAAGAGGATCAGATATTGAAGATGGTATAGGGAGATCACATAAACATGATAGACACCAAGACATGGATCAATACAGTGGAGCTGAAGACGAGATGTCTGATTGA
Protein sequence
MNSLCSWRNCKMMLTQIVEEIVTLYYRDSSRLILSIEQALIGAKKVKAECTQLREYCKAGSGEGPYPPQSSFGPAPGQNSIPPPPAQSAAVPAQQRGGGSQYNQNWGGYGGDGSVPPAPSSSYPQNYNQVHQSSNYNQQHYGPREANTLHLLLLTSRILMHHNRLRPPPDSSYPPPPPPPAPSQPSNHYYPPSQYSQSNQNQQSMQPPPPPSSPPPSSSIPPPPPPNSPPPPSAPQQKAEGTNMGAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKANGPSGRMETDDEKRLRKKREFEKQRQDERHRHHLKESQNTILQKTQMISTGKGHGSIAGSRMGKGRPLHFSVVPERASRYKCSAEAHVTTERERSLYKIYNHITRENIQTSALCGARSWNTLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDKGVAWLVKTQYISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEACKSRPVHATNKNLYPVEVLPLLPDFDRYKILERLIVNLVVAFDSAPTADSETFNKLDQSIRIAHESQLSKDIYDEQEDVAYSWVREYHWDVRGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRPTVATLEVKDPGVYSNSKRGSDIEDGIGRSHKHDRHQDMDQYSGAEDEMSD
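As a quick consistency check between the coding sequence and the protein sequence above, the sketch below translates the first 30 nucleotides of the CDS with the standard genetic code. The function name `translate` and the hand-built codon table are illustrative assumptions, not part of the database; a library such as Biopython would normally be used instead.

```python
# Standard genetic code, built from the conventional TCAG base ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        residue = CODON_TABLE[cds[pos:pos + 3].upper()]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 30 nucleotides of the CDS listed above.
cds_prefix = "ATGAACAGTTTGTGCAGTTGGAGGAACTGC"
print(translate(cds_prefix))
```

The printed residues should match the first ten residues of the protein sequence (MNSLCSWRNC). Passing the full CDS, with its line-wrapped pieces joined, would allow the same comparison over the whole protein.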
Homology
BLAST of HG10023520 vs. NCBI nr
Match:
XP_038898523.1 (protein PAF1 homolog [Benincasa hispida])
HSP 1 Score: 972.2 bits (2512), Expect = 2.4e-279
Identity = 547/701 (78.03%), Positives = 565/701 (80.60%), Query Frame = 0
Query: 65 PYPPQSSFGPAPGQNSIPPPPAQSAAVPAQQRGGGSQYNQNWGGYGGDGSVPPAPSSSYP 124
PYPPQSSFGPAPGQN +PPPP QSA+VPAQQRGGGSQYNQNWGGYGGDGS+PPA SSSYP
Sbjct: 6 PYPPQSSFGPAPGQNPVPPPPTQSASVPAQQRGGGSQYNQNWGGYGGDGSMPPAQSSSYP 65
Query: 125 QNYNQVHQSSNYNQQHYGPREANTLHLLLLTSRILMHHNRLRPPPDSSYPPPPPPPAPSQ 184
QNYNQ HQSSNY+QQHYGP + PPPDSSYPPPPPPPAPSQ
Sbjct: 66 QNYNQAHQSSNYHQQHYGPPRSQHPPPPPPNQSYPYAPQPPPPPPDSSYPPPPPPPAPSQ 125
Query: 185 PSNHYYPPSQYSQSNQNQQSMQPPPPPSSPPPSSSIPPPPPPNSPPPPSAPQQKAEGTNM 244
P N YYPPS QSMQPPPPPSSPPPSSSIPPPPPPNSPPP SAPQQKAEGTNM
Sbjct: 126 PPNLYYPPS---------QSMQPPPPPSSPPPSSSIPPPPPPNSPPPLSAPQQKAEGTNM 185
Query: 245 GAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKANGPSGRMETDDEKRLRKKR 304
GAHERDKGVSKDPSYGRR+RENSNHDKHQRHSGPPMPPKKANGPSGRMETDDEKRLRKKR
Sbjct: 186 GAHERDKGVSKDPSYGRRDRENSNHDKHQRHSGPPMPPKKANGPSGRMETDDEKRLRKKR 245
Query: 305 EFEKQRQDERHRHHLKESQNTILQKTQMISTGKGHGSIAGSRMGKGRPLHFSVVPERASR 364
EFEKQRQDERHRHHLKESQNTILQKTQM+STGKGHGSI GSRMG+ + F +R
Sbjct: 246 EFEKQRQDERHRHHLKESQNTILQKTQMLSTGKGHGSIVGSRMGERKATPFLSGERIENR 305
Query: 365 YK------CSA----EAHVTTERER--SLYKIYNHITRENI-------QTSALCGARSWN 424
K C E T+ + + SL K +H TR I +
Sbjct: 306 LKKPTTFLCKLKFRNELPDTSAQPKLMSLRKEKDHYTRYTITSLEKTYKPQLYVEPDLGI 365
Query: 425 TLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDKGVAWLVKTQ 484
LDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDKGVAWLVKTQ
Sbjct: 366 PLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDKGVAWLVKTQ 425
Query: 485 YISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEACKSRPVHAT 544
YISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEACKSRPVHAT
Sbjct: 426 YISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEACKSRPVHAT 485
Query: 545 NKNLYPVEVLPLLPDFDRYKILERLIVNLVVAFDSAPTADSETFNKLDQSIRIAHESQ-- 604
NKNLYPVEVLPLLPDFDRY +VVAFDSAPTADSETFNKLDQSIR HESQ
Sbjct: 486 NKNLYPVEVLPLLPDFDRYD-----DPFVVVAFDSAPTADSETFNKLDQSIRDTHESQAI 545
Query: 605 ---------------------------LSKDIYDEQEDVAYSWVREYHWDVRGDNVDDPT 664
LSKDIYDEQEDV+YSWVREYHWDVRGDNVDDPT
Sbjct: 546 MKSYMATGSDPSKPEKFLAYMVPSPDELSKDIYDEQEDVSYSWVREYHWDVRGDNVDDPT 605
Query: 665 TYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRPTVATLEVKD 718
TYLVSFDD EARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRPTVATLEVKD
Sbjct: 606 TYLVSFDDTEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRPTVATLEVKD 665
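The percentages on the summary line of the XP_038898523.1 hit above are simple ratios of matching positions to alignment length. A minimal sketch, using a hypothetical helper named `percent`, reproduces them from the reported counts:

```python
def percent(count: int, alignment_length: int) -> str:
    """Format count / alignment_length as a percentage with two decimals."""
    return f"{100 * count / alignment_length:.2f}%"


# Counts taken from the HSP summary line: 547 identical and 565 positive
# positions over an alignment length of 701.
identity = percent(547, 701)
positives = percent(565, 701)
print(identity, positives)
```

The same helper applies to the other hits in this section, for example 544/710 for the Cucurbita moschata match.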
BLAST of HG10023520 vs. NCBI nr
Match:
XP_022953373.1 (protein PAF1 homolog [Cucurbita moschata])
HSP 1 Score: 952.6 bits (2461), Expect = 2.0e-273
Identity = 544/710 (76.62%), Positives = 571/710 (80.42%), Query Frame = 0
Query: 65 PYPPQSSFGPAPGQNSIPPPPAQSAA-VPAQQRGGGSQYNQNWGGYGGDGSV-PPAPSSS 124
PYPPQSSFGP+PGQN IPPPPA AA VP QQR GGSQYNQNWGGYGGDGSV PPA SSS
Sbjct: 6 PYPPQSSFGPSPGQNPIPPPPAPPAASVPTQQR-GGSQYNQNWGGYGGDGSVPPPASSSS 65
Query: 125 YPQNYNQVHQSSNYNQQHYGPREAN------TLHLLLLTSRILMHHNRLRPPPDSSYPPP 184
YPQNYNQVHQSSNY+QQHYGP + H S PPPDSSYPPP
Sbjct: 66 YPQNYNQVHQSSNYHQQHYGPPRSQQPPPPPPPH----QSYPYAPQPPPPPPPDSSYPPP 125
Query: 185 PPPPAPSQPSNHYYPPSQYSQSNQNQQSMQ-PPPPPSSPPPSSSIPPPPPPNSPPPPSAP 244
PPPPA SQPS HY+PPSQY Q NQNQQS+Q PPPPPSSPPPSSSIPPPPPPNSPPPPSAP
Sbjct: 126 PPPPASSQPSQHYFPPSQYPQGNQNQQSIQPPPPPPSSPPPSSSIPPPPPPNSPPPPSAP 185
Query: 245 QQKAEGTNMGAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKANGPSGRMETD 304
QQK EG+++GAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKK+NGPSGR+ETD
Sbjct: 186 QQKTEGSSVGAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKSNGPSGRIETD 245
Query: 305 DEKRLRKKREFEKQRQDERHRHHLKESQNTILQKTQMISTGKGHGSIAGSRMGKGRPLHF 364
DEKR RKKREFEKQRQDERHRHHLKESQNTILQKTQM+STGKGHGSI GSRMG+ + F
Sbjct: 246 DEKRQRKKREFEKQRQDERHRHHLKESQNTILQKTQMLSTGKGHGSIVGSRMGERKATPF 305
Query: 365 SVVPERASRYK------CSA----EAHVTTERER--SLYKIYNHITRENI-------QTS 424
+R K C E T+ + + SL K +H TR I +
Sbjct: 306 LSGERIENRLKKPTTFLCKLKFRNELPDTSAQPKLMSLRKEKDHYTRYTITSLEKTYKPQ 365
Query: 425 ALCGARSWNTLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK 484
LDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK
Sbjct: 366 LYVEPDLGIPLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK 425
Query: 485 GVAWLVKTQYISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEA 544
GVAWLVKTQYISPLSIES KQSLTEKQAKELREMKGGRNILENLNNRER+IKEI+ASFEA
Sbjct: 426 GVAWLVKTQYISPLSIESTKQSLTEKQAKELREMKGGRNILENLNNRERKIKEIDASFEA 485
Query: 545 CKSRPVHATNKNLYPVEVLPLLPDFDRYKILERLIVNLVVAFDSAPTADSETFNKLDQSI 604
CKSRPVHATNKNLYPVEVLPLLPDFDRY +VVAFD+APTADSETFNKLDQSI
Sbjct: 486 CKSRPVHATNKNLYPVEVLPLLPDFDRYD-----DPFVVVAFDNAPTADSETFNKLDQSI 545
Query: 605 RIAHESQ-----------------------------LSKDIYDEQEDVAYSWVREYHWDV 664
R AHESQ LSKDIYDEQEDV+YSWVREYHWDV
Sbjct: 546 RDAHESQAIMKSYMATGSDPSKPEKFLAYMVPSPDELSKDIYDEQEDVSYSWVREYHWDV 605
Query: 665 RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRP 718
RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRS+DEVEHFPAPARVTVRRRP
Sbjct: 606 RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSNDEVEHFPAPARVTVRRRP 665
BLAST of HG10023520 vs. NCBI nr
Match:
XP_023547399.1 (protein PAF1 homolog [Cucurbita pepo subsp. pepo])
HSP 1 Score: 948.7 bits (2451), Expect = 2.8e-272
Identity = 542/710 (76.34%), Positives = 570/710 (80.28%), Query Frame = 0
Query: 65 PYPPQSSFGPAPGQNSIPPPPAQSAA-VPAQQRGGGSQYNQNWGGYGGDGSV-PPAPSSS 124
PYPPQSSFGP+PGQN IPPPPA AA VP QQR GGSQYNQNWGGYGGDGSV PPA SSS
Sbjct: 6 PYPPQSSFGPSPGQNPIPPPPAPPAASVPTQQR-GGSQYNQNWGGYGGDGSVPPPASSSS 65
Query: 125 YPQNYNQVHQSSNYNQQHYGPREAN------TLHLLLLTSRILMHHNRLRPPPDSSYPPP 184
YPQNYNQVHQSSN++QQHYGP + H S PPPDSSYPPP
Sbjct: 66 YPQNYNQVHQSSNFHQQHYGPPRSQQPPPPPPPH----QSYPYAPQPPPPPPPDSSYPPP 125
Query: 185 PPPPAPSQPSNHYYPPSQYSQSNQNQQSMQ-PPPPPSSPPPSSSIPPPPPPNSPPPPSAP 244
PPPPA SQPS HY+PPSQY Q NQNQQS+Q PPPPPSSPPPSSSIPPPPPPNSPPPPSAP
Sbjct: 126 PPPPASSQPSQHYFPPSQYPQGNQNQQSIQPPPPPPSSPPPSSSIPPPPPPNSPPPPSAP 185
Query: 245 QQKAEGTNMGAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKANGPSGRMETD 304
Q K EG+++GAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKK+NGPSGR+ETD
Sbjct: 186 QPKTEGSSVGAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKSNGPSGRIETD 245
Query: 305 DEKRLRKKREFEKQRQDERHRHHLKESQNTILQKTQMISTGKGHGSIAGSRMGKGRPLHF 364
DEKR RKKREFEKQRQDERHRHHLKESQNTILQKTQM+STGKGHGSI GSRMG+ + F
Sbjct: 246 DEKRQRKKREFEKQRQDERHRHHLKESQNTILQKTQMLSTGKGHGSIVGSRMGERKATPF 305
Query: 365 SVVPERASRYK------CSA----EAHVTTERER--SLYKIYNHITRENI-------QTS 424
+R K C E T+ + + SL K +H TR I +
Sbjct: 306 LSGERIENRLKKPTTFLCKLKFRNELPDTSAQPKLMSLRKEKDHYTRYTITSLEKTYKPQ 365
Query: 425 ALCGARSWNTLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK 484
LDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK
Sbjct: 366 LYVEPDLGIPLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK 425
Query: 485 GVAWLVKTQYISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEA 544
GVAWLVKTQYISPLSIES KQSLTEKQAKELREMKGGRNILENLNNRER+IKEI+ASFEA
Sbjct: 426 GVAWLVKTQYISPLSIESTKQSLTEKQAKELREMKGGRNILENLNNRERKIKEIDASFEA 485
Query: 545 CKSRPVHATNKNLYPVEVLPLLPDFDRYKILERLIVNLVVAFDSAPTADSETFNKLDQSI 604
CKSRPVHATNKNLYPVEVLPLLPDFDRY +VVAFD+APTADSETFNKLDQSI
Sbjct: 486 CKSRPVHATNKNLYPVEVLPLLPDFDRYD-----DPFVVVAFDNAPTADSETFNKLDQSI 545
Query: 605 RIAHESQ-----------------------------LSKDIYDEQEDVAYSWVREYHWDV 664
R AHESQ LSKDIYDEQEDV+YSWVREYHWDV
Sbjct: 546 RDAHESQAIMKSYMATGSDPSKPEKFLAYMVPSPDELSKDIYDEQEDVSYSWVREYHWDV 605
Query: 665 RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRP 718
RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRS+DEVEHFPAPARVTVRRRP
Sbjct: 606 RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSNDEVEHFPAPARVTVRRRP 665
BLAST of HG10023520 vs. NCBI nr
Match:
KAG7014045.1 (Protein PAF1-like protein, partial [Cucurbita argyrosperma subsp. argyrosperma])
HSP 1 Score: 948.0 bits (2449), Expect = 4.8e-272
Identity = 541/710 (76.20%), Positives = 570/710 (80.28%), Query Frame = 0
Query: 65 PYPPQSSFGPAPGQNSIPPPPAQSAA-VPAQQRGGGSQYNQNWGGYGGDGSV-PPAPSSS 124
PYPPQSSFGP+PGQN IPPPPA AA VP QQR GGSQYNQNWGGYGGDGSV PPA SSS
Sbjct: 6 PYPPQSSFGPSPGQNPIPPPPAPPAASVPTQQR-GGSQYNQNWGGYGGDGSVPPPASSSS 65
Query: 125 YPQNYNQVHQSSNYNQQHYGPREAN------TLHLLLLTSRILMHHNRLRPPPDSSYPPP 184
YPQNYNQVHQSSNY+QQHYGP + H S PPPDSSYPPP
Sbjct: 66 YPQNYNQVHQSSNYHQQHYGPPRSQQPPPPPPPH----QSYPYAPQPPPPPPPDSSYPPP 125
Query: 185 PPPPAPSQPSNHYYPPSQYSQSNQNQQSMQ-PPPPPSSPPPSSSIPPPPPPNSPPPPSAP 244
PPPPA SQPS HY+PPSQY Q +QNQQS+Q PPPPPSSPPPSSSIPPPPPPNSPPPPSAP
Sbjct: 126 PPPPASSQPSQHYFPPSQYPQGSQNQQSIQPPPPPPSSPPPSSSIPPPPPPNSPPPPSAP 185
Query: 245 QQKAEGTNMGAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKANGPSGRMETD 304
QQK EG+++G HERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKK+NGPSGR+ETD
Sbjct: 186 QQKTEGSSVGTHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKSNGPSGRIETD 245
Query: 305 DEKRLRKKREFEKQRQDERHRHHLKESQNTILQKTQMISTGKGHGSIAGSRMGKGRPLHF 364
DEKR RKKREFEKQRQDERHRHHLKESQNTILQKTQM+STGKGHGSI GSRMG+ + F
Sbjct: 246 DEKRQRKKREFEKQRQDERHRHHLKESQNTILQKTQMLSTGKGHGSIVGSRMGERKATPF 305
Query: 365 SVVPERASRYK------CSA----EAHVTTERER--SLYKIYNHITRENI-------QTS 424
+R K C E T+ + + SL K +H TR I +
Sbjct: 306 LSGERIENRLKKPTTFLCKLKFRNELPDTSAQPKLMSLRKEKDHYTRYTITSLEKTYKPQ 365
Query: 425 ALCGARSWNTLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK 484
LDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK
Sbjct: 366 LYVEPDLGIPLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK 425
Query: 485 GVAWLVKTQYISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEA 544
GVAWLVKTQYISPLSIES KQSLTEKQAKELREMKGGRNILENLNNRER+IKEI+ASFEA
Sbjct: 426 GVAWLVKTQYISPLSIESTKQSLTEKQAKELREMKGGRNILENLNNRERKIKEIDASFEA 485
Query: 545 CKSRPVHATNKNLYPVEVLPLLPDFDRYKILERLIVNLVVAFDSAPTADSETFNKLDQSI 604
CKSRPVHATNKNLYPVEVLPLLPDFDRY +VVAFD+APTADSETFNKLDQSI
Sbjct: 486 CKSRPVHATNKNLYPVEVLPLLPDFDRYD-----DPFVVVAFDNAPTADSETFNKLDQSI 545
Query: 605 RIAHESQ-----------------------------LSKDIYDEQEDVAYSWVREYHWDV 664
R AHESQ LSKDIYDEQEDV+YSWVREYHWDV
Sbjct: 546 RDAHESQAIMKSYMATGSDPSKPEKFLAYMVPSPDELSKDIYDEQEDVSYSWVREYHWDV 605
Query: 665 RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRP 718
RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRS+DEVEHFPAPARVTVRRRP
Sbjct: 606 RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSNDEVEHFPAPARVTVRRRP 665
BLAST of HG10023520 vs. NCBI nr
Match:
XP_022992172.1 (protein PAF1 homolog [Cucurbita maxima])
HSP 1 Score: 948.0 bits (2449), Expect = 4.8e-272
Identity = 542/710 (76.34%), Positives = 569/710 (80.14%), Query Frame = 0
Query: 65 PYPPQSSFGPAPGQNSIPPPPAQSAA-VPAQQRGGGSQYNQNWGGYGGDGSV-PPAPSSS 124
PYPPQSSFGP+PGQN IPPPPA AA VP QQR G SQYNQNWGGYGGDGSV PPA SSS
Sbjct: 6 PYPPQSSFGPSPGQNPIPPPPAPPAASVPTQQR-GSSQYNQNWGGYGGDGSVPPPASSSS 65
Query: 125 YPQNYNQVHQSSNYNQQHYGPREAN------TLHLLLLTSRILMHHNRLRPPPDSSYPPP 184
YPQNYNQVHQSSNY+QQHYGP + H S PPPDSSYPPP
Sbjct: 66 YPQNYNQVHQSSNYHQQHYGPPRSQQPPPPPPPH----QSYPYAPQPPPPPPPDSSYPPP 125
Query: 185 PPPPAPSQPSNHYYPPSQYSQSNQNQQSMQ-PPPPPSSPPPSSSIPPPPPPNSPPPPSAP 244
PPPPA SQPS HY+PPSQY Q NQNQQS+Q PPPPPSSPPPSSSIPPPPPPNSPPPPSAP
Sbjct: 126 PPPPASSQPSQHYFPPSQYPQGNQNQQSIQPPPPPPSSPPPSSSIPPPPPPNSPPPPSAP 185
Query: 245 QQKAEGTNMGAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKANGPSGRMETD 304
Q K EG+++GAHERDKGV+KDPSYGRRERENSNHDKHQRHSGPPMPPKK+NGPSGR+ETD
Sbjct: 186 QPKTEGSSVGAHERDKGVTKDPSYGRRERENSNHDKHQRHSGPPMPPKKSNGPSGRIETD 245
Query: 305 DEKRLRKKREFEKQRQDERHRHHLKESQNTILQKTQMISTGKGHGSIAGSRMGKGRPLHF 364
DEKR RKKREFEKQRQDERHRHHLKESQNTILQKTQM+STGKGHGSI GSRMG+ + F
Sbjct: 246 DEKRQRKKREFEKQRQDERHRHHLKESQNTILQKTQMLSTGKGHGSIVGSRMGERKATPF 305
Query: 365 SVVPERASRYK------CSA----EAHVTTERER--SLYKIYNHITRENI-------QTS 424
+R K C E T+ + + SL K +H TR I +
Sbjct: 306 LSGERIENRLKKPTTFLCKLKFRNELPDTSAQPKLMSLRKEKDHYTRYTITSLEKTYKPQ 365
Query: 425 ALCGARSWNTLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK 484
LDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK
Sbjct: 366 LYVEPDLGIPLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK 425
Query: 485 GVAWLVKTQYISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEA 544
GVAWLVKTQYISPLSIES KQSLTEKQAKELREMKGGRNILENLNNRER+IKEI+ASFEA
Sbjct: 426 GVAWLVKTQYISPLSIESTKQSLTEKQAKELREMKGGRNILENLNNRERKIKEIDASFEA 485
Query: 545 CKSRPVHATNKNLYPVEVLPLLPDFDRYKILERLIVNLVVAFDSAPTADSETFNKLDQSI 604
CKSRPVHATNKNLYPVEVLPLLPDFDRY +VVAFD+APTADSETFNKLDQSI
Sbjct: 486 CKSRPVHATNKNLYPVEVLPLLPDFDRYD-----DPFVVVAFDNAPTADSETFNKLDQSI 545
Query: 605 RIAHESQ-----------------------------LSKDIYDEQEDVAYSWVREYHWDV 664
R AHESQ LSKDIYDEQEDV+YSWVREYHWDV
Sbjct: 546 RDAHESQAIMKSYMATGSDPSKPEKFLAYMVPSPDELSKDIYDEQEDVSYSWVREYHWDV 605
Query: 665 RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRP 718
RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRP
Sbjct: 606 RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRP 665
BLAST of HG10023520 vs. ExPASy Swiss-Prot
Match:
F4HQA1 (Protein PAF1 homolog OS=Arabidopsis thaliana OX=3702 GN=VIP2 PE=1 SV=1)
HSP 1 Score: 431.8 bits (1109), Expect = 1.5e-119
Identity = 299/606 (49.34%), Positives = 370/606 (61.06%), Query Frame = 0
Query: 167 PPPDSSYPPPPPPPAPSQPSNHYYPPSQYSQSNQNQQSMQPPPPPSS-------PPPSSS 226
PPP S PPP PPP PS Y PP PPPPP + P +
Sbjct: 23 PPPPPSLPPPVPPPPPSHQPYSYPPP--------------PPPPPHAYYQQGPHYPQFNQ 82
Query: 227 I--PPPPPPNSPPPPSAPQQKAEGTNMGAHERDKGVSKDPSYGRRERENSNHDKHQRHSG 286
+ PPPPPP S PPP P + G ++ +KG SK GRRER + KH S
Sbjct: 83 LQAPPPPPPPSAPPPLVPDPP---RHQGPNDHEKGASK--QVGRRERAKPDPSKHHHRSH 142
Query: 287 PPMPPKKANGPSGRMETDDEKRLRKKREFEKQRQDERHRHHLKESQNTILQK-------- 346
P S ++ET++E+RLRKKRE EKQRQDE+HR +K S + + K
Sbjct: 143 LP--------HSKKIETEEERRLRKKRELEKQRQDEKHRQQMKNSHKSQMPKGHTEEKKP 202
Query: 347 TQMISTGKGHGSIAGSRMGKGRPLHFSVVPERASRYKCSAEAHVTTERERSLYKIYNHIT 406
T +++T + + + + +P+ +++ K +T +R++ + Y +
Sbjct: 203 TPLLTTDRVENRLKKPTTFICKLKFRNELPDPSAQLKL-----MTIKRDKDQFTKYTITS 262
Query: 407 RENIQTSALCGARSWN-TLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIK 466
E + + LDLLDLSVYNPP V+ PLAPEDEELLRDD TP+KKD GI+
Sbjct: 263 LEKLWKPKIFVEPDLGIPLDLLDLSVYNPPKVKAPLAPEDEELLRDDDAVTPIKKD-GIR 322
Query: 467 RKERPTDKGVAWLVKTQYISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIK 526
RKERPTDKG++WLVKTQYIS ++ ESA+QSLTEKQAKELREMKGG NIL NLNNRERQIK
Sbjct: 323 RKERPTDKGMSWLVKTQYISSINNESARQSLTEKQAKELREMKGGINILHNLNNRERQIK 382
Query: 527 EIEASFEACKSRPVHATNKNLYPVEVLPLLPDFDRYKILERLIVNLVVAFDSAPTADSET 586
+IEASFEACKSRPVHATNKNL PVEVLPLLP FDRY E+ + V FD AP ADSE
Sbjct: 383 DIEASFEACKSRPVHATNKNLQPVEVLPLLPYFDRYD--EQFV---VANFDGAPIADSEF 442
Query: 587 FNKLDQSIRIAHES-----------------------------QLSKDIYDEQEDVAYSW 646
F KLD SIR AHES +LSKDI+DE E+++Y+W
Sbjct: 443 FGKLDPSIRDAHESRAILKSYVVAGSDTANPEKFLAYMVPSLDELSKDIHDENEEISYTW 502
Query: 647 VREYHWDVRGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPA 706
VREY WDV+ N +DP TYLVSFD+ A Y+PLP +L LRKKRA+EGRSSDE+EHFP P+
Sbjct: 503 VREYLWDVQ-PNANDPGTYLVSFDNGTASYLPLPMRLNLRKKRAREGRSSDEIEHFPVPS 562
Query: 707 RVTVRRRPTVATLEVKDPGVY-------SNSKRGSDIEDGIGRSHKHDRHQDMDQYS-GA 718
RVTVRRR TV+ +E KD GVY S+ R + E G+GRS KH+ QD +QYS G
Sbjct: 563 RVTVRRRSTVSVIEHKDSGVYSSRVGASSSKMRRLEDEGGLGRSWKHEPEQDANQYSDGN 589
BLAST of HG10023520 vs. ExPASy TrEMBL
Match:
A0A6J1GN64 (protein PAF1 homolog OS=Cucurbita moschata OX=3662 GN=LOC111455942 PE=3 SV=1)
HSP 1 Score: 952.6 bits (2461), Expect = 9.5e-274
Identity = 544/710 (76.62%), Positives = 571/710 (80.42%), Query Frame = 0
Query: 65 PYPPQSSFGPAPGQNSIPPPPAQSAA-VPAQQRGGGSQYNQNWGGYGGDGSV-PPAPSSS 124
PYPPQSSFGP+PGQN IPPPPA AA VP QQR GGSQYNQNWGGYGGDGSV PPA SSS
Sbjct: 6 PYPPQSSFGPSPGQNPIPPPPAPPAASVPTQQR-GGSQYNQNWGGYGGDGSVPPPASSSS 65
Query: 125 YPQNYNQVHQSSNYNQQHYGPREAN------TLHLLLLTSRILMHHNRLRPPPDSSYPPP 184
YPQNYNQVHQSSNY+QQHYGP + H S PPPDSSYPPP
Sbjct: 66 YPQNYNQVHQSSNYHQQHYGPPRSQQPPPPPPPH----QSYPYAPQPPPPPPPDSSYPPP 125
Query: 185 PPPPAPSQPSNHYYPPSQYSQSNQNQQSMQ-PPPPPSSPPPSSSIPPPPPPNSPPPPSAP 244
PPPPA SQPS HY+PPSQY Q NQNQQS+Q PPPPPSSPPPSSSIPPPPPPNSPPPPSAP
Sbjct: 126 PPPPASSQPSQHYFPPSQYPQGNQNQQSIQPPPPPPSSPPPSSSIPPPPPPNSPPPPSAP 185
Query: 245 QQKAEGTNMGAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKANGPSGRMETD 304
QQK EG+++GAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKK+NGPSGR+ETD
Sbjct: 186 QQKTEGSSVGAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKSNGPSGRIETD 245
Query: 305 DEKRLRKKREFEKQRQDERHRHHLKESQNTILQKTQMISTGKGHGSIAGSRMGKGRPLHF 364
DEKR RKKREFEKQRQDERHRHHLKESQNTILQKTQM+STGKGHGSI GSRMG+ + F
Sbjct: 246 DEKRQRKKREFEKQRQDERHRHHLKESQNTILQKTQMLSTGKGHGSIVGSRMGERKATPF 305
Query: 365 SVVPERASRYK------CSA----EAHVTTERER--SLYKIYNHITRENI-------QTS 424
+R K C E T+ + + SL K +H TR I +
Sbjct: 306 LSGERIENRLKKPTTFLCKLKFRNELPDTSAQPKLMSLRKEKDHYTRYTITSLEKTYKPQ 365
Query: 425 ALCGARSWNTLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK 484
LDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK
Sbjct: 366 LYVEPDLGIPLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK 425
Query: 485 GVAWLVKTQYISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEA 544
GVAWLVKTQYISPLSIES KQSLTEKQAKELREMKGGRNILENLNNRER+IKEI+ASFEA
Sbjct: 426 GVAWLVKTQYISPLSIESTKQSLTEKQAKELREMKGGRNILENLNNRERKIKEIDASFEA 485
Query: 545 CKSRPVHATNKNLYPVEVLPLLPDFDRYKILERLIVNLVVAFDSAPTADSETFNKLDQSI 604
CKSRPVHATNKNLYPVEVLPLLPDFDRY +VVAFD+APTADSETFNKLDQSI
Sbjct: 486 CKSRPVHATNKNLYPVEVLPLLPDFDRYD-----DPFVVVAFDNAPTADSETFNKLDQSI 545
Query: 605 RIAHESQ-----------------------------LSKDIYDEQEDVAYSWVREYHWDV 664
R AHESQ LSKDIYDEQEDV+YSWVREYHWDV
Sbjct: 546 RDAHESQAIMKSYMATGSDPSKPEKFLAYMVPSPDELSKDIYDEQEDVSYSWVREYHWDV 605
Query: 665 RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRP 718
RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRS+DEVEHFPAPARVTVRRRP
Sbjct: 606 RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSNDEVEHFPAPARVTVRRRP 665
BLAST of HG10023520 vs. ExPASy TrEMBL
Match:
A0A6J1JP14 (protein PAF1 homolog OS=Cucurbita maxima OX=3661 GN=LOC111488581 PE=3 SV=1)
HSP 1 Score: 948.0 bits (2449), Expect = 2.3e-272
Identity = 542/710 (76.34%), Positives = 569/710 (80.14%), Query Frame = 0
Query: 65 PYPPQSSFGPAPGQNSIPPPPAQSAA-VPAQQRGGGSQYNQNWGGYGGDGSV-PPAPSSS 124
PYPPQSSFGP+PGQN IPPPPA AA VP QQR G SQYNQNWGGYGGDGSV PPA SSS
Sbjct: 6 PYPPQSSFGPSPGQNPIPPPPAPPAASVPTQQR-GSSQYNQNWGGYGGDGSVPPPASSSS 65
Query: 125 YPQNYNQVHQSSNYNQQHYGPREAN------TLHLLLLTSRILMHHNRLRPPPDSSYPPP 184
YPQNYNQVHQSSNY+QQHYGP + H S PPPDSSYPPP
Sbjct: 66 YPQNYNQVHQSSNYHQQHYGPPRSQQPPPPPPPH----QSYPYAPQPPPPPPPDSSYPPP 125
Query: 185 PPPPAPSQPSNHYYPPSQYSQSNQNQQSMQ-PPPPPSSPPPSSSIPPPPPPNSPPPPSAP 244
PPPPA SQPS HY+PPSQY Q NQNQQS+Q PPPPPSSPPPSSSIPPPPPPNSPPPPSAP
Sbjct: 126 PPPPASSQPSQHYFPPSQYPQGNQNQQSIQPPPPPPSSPPPSSSIPPPPPPNSPPPPSAP 185
Query: 245 QQKAEGTNMGAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKANGPSGRMETD 304
Q K EG+++GAHERDKGV+KDPSYGRRERENSNHDKHQRHSGPPMPPKK+NGPSGR+ETD
Sbjct: 186 QPKTEGSSVGAHERDKGVTKDPSYGRRERENSNHDKHQRHSGPPMPPKKSNGPSGRIETD 245
Query: 305 DEKRLRKKREFEKQRQDERHRHHLKESQNTILQKTQMISTGKGHGSIAGSRMGKGRPLHF 364
DEKR RKKREFEKQRQDERHRHHLKESQNTILQKTQM+STGKGHGSI GSRMG+ + F
Sbjct: 246 DEKRQRKKREFEKQRQDERHRHHLKESQNTILQKTQMLSTGKGHGSIVGSRMGERKATPF 305
Query: 365 SVVPERASRYK------CSA----EAHVTTERER--SLYKIYNHITRENI-------QTS 424
+R K C E T+ + + SL K +H TR I +
Sbjct: 306 LSGERIENRLKKPTTFLCKLKFRNELPDTSAQPKLMSLRKEKDHYTRYTITSLEKTYKPQ 365
Query: 425 ALCGARSWNTLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK 484
LDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK
Sbjct: 366 LYVEPDLGIPLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDK 425
Query: 485 GVAWLVKTQYISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEA 544
GVAWLVKTQYISPLSIES KQSLTEKQAKELREMKGGRNILENLNNRER+IKEI+ASFEA
Sbjct: 426 GVAWLVKTQYISPLSIESTKQSLTEKQAKELREMKGGRNILENLNNRERKIKEIDASFEA 485
Query: 545 CKSRPVHATNKNLYPVEVLPLLPDFDRYKILERLIVNLVVAFDSAPTADSETFNKLDQSI 604
CKSRPVHATNKNLYPVEVLPLLPDFDRY +VVAFD+APTADSETFNKLDQSI
Sbjct: 486 CKSRPVHATNKNLYPVEVLPLLPDFDRYD-----DPFVVVAFDNAPTADSETFNKLDQSI 545
Query: 605 RIAHESQ-----------------------------LSKDIYDEQEDVAYSWVREYHWDV 664
R AHESQ LSKDIYDEQEDV+YSWVREYHWDV
Sbjct: 546 RDAHESQAIMKSYMATGSDPSKPEKFLAYMVPSPDELSKDIYDEQEDVSYSWVREYHWDV 605
Query: 665 RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRP 718
RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRP
Sbjct: 606 RGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRP 665
BLAST of HG10023520 vs. ExPASy TrEMBL
Match:
A0A0A0KCT6 (Uncharacterized protein OS=Cucumis sativus OX=3659 GN=Csa_7G452360 PE=3 SV=1)
HSP 1 Score: 944.1 bits (2439), Expect = 3.4e-271
Identity = 542/708 (76.55%), Positives = 565/708 (79.80%), Query Frame = 0
Query: 65 PYPPQSSFGPAPGQNSIPPPPAQSAAVPAQQRGGG-SQYNQNWGGYGGDGSVPPAPSSSY 124
PYPPQSSFG AP QNSIPPP AQSA+V +QQRGG +QYNQNWG Y GD S PPAPSSSY
Sbjct: 6 PYPPQSSFGSAPAQNSIPPPSAQSASVSSQQRGGATTQYNQNWGTYAGDASAPPAPSSSY 65
Query: 125 PQNY-NQVHQSSNYNQQHYGPREANTLH-----LLLLTSRILMHHNRLRPPPDSSYPPPP 184
PQNY NQ+HQ+SNY+ Q YGP T H S PPPDSSYPPPP
Sbjct: 66 PQNYNNQLHQTSNYHHQQYGP--PRTQHPPPPPPPPHQSYPYAPQPPPPPPPDSSYPPPP 125
Query: 185 PPPAPSQPSNHYYPPSQYSQSNQNQQSMQPPPPPSSPPPSSSIPPPPPPNSPPPPSAPQQ 244
PPPA SQP N YYP SQYSQ NQNQQSMQPPPPPSSPPPSSSIPPPPPPNSPPPPSA QQ
Sbjct: 126 PPPATSQPPNLYYPSSQYSQGNQNQQSMQPPPPPSSPPPSSSIPPPPPPNSPPPPSASQQ 185
Query: 245 KAEGTNMGAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKANGPSGRMETDDE 304
KAEGTNMGAHERDKGV KDPSYGRR+RENSNHDKHQ+HSGPPMPPKKANGPSGRMETDDE
Sbjct: 186 KAEGTNMGAHERDKGVPKDPSYGRRDRENSNHDKHQKHSGPPMPPKKANGPSGRMETDDE 245
Query: 305 KRLRKKREFEKQRQDERHRHHLKESQNTILQKTQMISTGKGHGSIAGSRMGKGRPLHFSV 364
KRLRKKREFEKQRQDERHRHHLKESQNTILQKTQM+STGK HGSI GSRMG+ + F
Sbjct: 246 KRLRKKREFEKQRQDERHRHHLKESQNTILQKTQMLSTGKVHGSIVGSRMGERKATPFLS 305
Query: 365 VPERASRYK------CSA----EAHVTTERER--SLYKIYNHITRENI-------QTSAL 424
+R K C E T+ + + SL K +H TR I +
Sbjct: 306 GERIENRLKKPTTFLCKLKFRNELPDTSAQPKLMSLRKEKDHYTRYTITSLEKTYKPQLY 365
Query: 425 CGARSWNTLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDKGV 484
LDLLDLSVYNP SVR+PLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDKGV
Sbjct: 366 VEPDLGIPLDLLDLSVYNPSSVRMPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDKGV 425
Query: 485 AWLVKTQYISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEACK 544
AWLVKTQYISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEACK
Sbjct: 426 AWLVKTQYISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEACK 485
Query: 545 SRPVHATNKNLYPVEVLPLLPDFDRYKILERLIVNLVVAFDSAPTADSETFNKLDQSIRI 604
SRP+HATNKNLYPVEVLPLLPDFDRY +VVAFDSAPTADSETFNKLDQSIR
Sbjct: 486 SRPIHATNKNLYPVEVLPLLPDFDRYD-----DPFVVVAFDSAPTADSETFNKLDQSIRD 545
Query: 605 AHESQ-----------------------------LSKDIYDEQEDVAYSWVREYHWDVRG 664
AHESQ LSKDIYDEQEDV+YSWVREYHWDVRG
Sbjct: 546 AHESQAIMKSYMATSSDPSKPEKFLAYMVPSPDELSKDIYDEQEDVSYSWVREYHWDVRG 605
Query: 665 DNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRPTV 718
DNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRPTV
Sbjct: 606 DNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRPTV 665
BLAST of HG10023520 vs. ExPASy TrEMBL
Match:
A0A1S3CHF3 (LOW QUALITY PROTEIN: protein PAF1 homolog OS=Cucumis melo OX=3656 GN=LOC103500777 PE=3 SV=1)
HSP 1 Score: 943.7 bits (2438), Expect = 4.4e-271
Identity = 539/705 (76.45%), Positives = 565/705 (80.14%), Query Frame = 0
Query: 65 PYPPQSSFGPAPGQNSIPPPPAQSAAVPAQQRGGG-SQYNQNWGGYGGDGSVPPAPSSSY 124
PYPPQSSFG AP QNSIPPPP+QSA+ +QQRGG +QYNQNWG Y GD SVPPAPSSSY
Sbjct: 6 PYPPQSSFGSAPAQNSIPPPPSQSASASSQQRGGATTQYNQNWGAYAGDASVPPAPSSSY 65
Query: 125 PQNY-NQVHQSSNYNQQHYG-PREANTLHLLLLTSRILMHHNRLRPPPDSSYPPPPPPPA 184
PQNY NQ+HQ+SNY+ Q YG PR + S PPPDSSYPPPPPPPA
Sbjct: 66 PQNYNNQLHQTSNYHHQQYGTPRTQHPPPPPPHQSYPYAPQPPPPPPPDSSYPPPPPPPA 125
Query: 185 PSQPSNHYYPPSQYSQSNQNQQSMQPPPPPSSPPPSSSIPPPPPPNSPPPPSAPQQKAEG 244
PSQP N YYP SQYSQ NQNQQSMQPPPPPSSPPPSSSIPPPPPPNSPPPPSA QQKAEG
Sbjct: 126 PSQPPNLYYPSSQYSQGNQNQQSMQPPPPPSSPPPSSSIPPPPPPNSPPPPSASQQKAEG 185
Query: 245 TNMGAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKANGPSGRMETDDEKRLR 304
NMGAHERDKGVSKDPSYGRR+RENSNHDKHQ+HSGPPMPPKKANGPSGRMETDDEK+LR
Sbjct: 186 KNMGAHERDKGVSKDPSYGRRDRENSNHDKHQKHSGPPMPPKKANGPSGRMETDDEKKLR 245
Query: 305 KKREFEKQRQDERHRHHLKESQNTILQKTQMISTGKGHGSIAGSRMGKGRPLHFSVVPER 364
KKREFEKQRQDERHRHHLKESQNTILQKTQM+STGK HGSI GSRMG+ + F
Sbjct: 246 KKREFEKQRQDERHRHHLKESQNTILQKTQMLSTGKVHGSIVGSRMGERKATPFLSGERI 305
Query: 365 ASRYK------CSA----EAHVTTERER--SLYKIYNHITRENI-------QTSALCGAR 424
+R K C E T+ + + SL K +H TR I +
Sbjct: 306 ENRLKKPTTFLCKLKFRNELPDTSAQPKLMSLRKEKDHYTRYTITSLEKTYKPQLYVEPD 365
Query: 425 SWNTLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDKGVAWLV 484
LDLLDLSVYNPPSVR+PLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDKGVAWLV
Sbjct: 366 LGIPLDLLDLSVYNPPSVRVPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDKGVAWLV 425
Query: 485 KTQYISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEACKSRPV 544
KTQYISPLSIES KQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEACKSRP+
Sbjct: 426 KTQYISPLSIESTKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEACKSRPI 485
Query: 545 HATNKNLYPVEVLPLLPDFDRYKILERLIVNLVVAFDSAPTADSETFNKLDQSIRIAHES 604
HATNKNLYPVEVLPLLPDFDRY +VVAFDSAPTADSETFNKLDQSIR AHES
Sbjct: 486 HATNKNLYPVEVLPLLPDFDRYD-----DPFVVVAFDSAPTADSETFNKLDQSIRDAHES 545
Query: 605 Q-----------------------------LSKDIYDEQEDVAYSWVREYHWDVRGDNVD 664
Q LSKDIYDEQEDV+YSWVREYHWDVRGDNVD
Sbjct: 546 QAIMKSYMATGSDPSKPEKFLAYMVPSPDELSKDIYDEQEDVSYSWVREYHWDVRGDNVD 605
Query: 665 DPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRPTVATLE 718
DPTTYLVSFDD+EARYVPLPTKLVL KKRAKEGRSSDEVEHFPAPARVTVRRRPTVATLE
Sbjct: 606 DPTTYLVSFDDSEARYVPLPTKLVLXKKRAKEGRSSDEVEHFPAPARVTVRRRPTVATLE 665
BLAST of HG10023520 vs. ExPASy TrEMBL
Match:
A0A5A7UA23 (Protein PAF1-like protein OS=Cucumis melo var. makuwa OX=1194695 GN=E5676_scaffold347G00330 PE=3 SV=1)
HSP 1 Score: 943.3 bits (2437), Expect = 5.7e-271
Identity = 539/705 (76.45%), Positives = 565/705 (80.14%), Query Frame = 0
Query: 65 PYPPQSSFGPAPGQNSIPPPPAQSAAVPAQQRGGG-SQYNQNWGGYGGDGSVPPAPSSSY 124
PYPPQSSFG AP QNSIPPPP+QSA+ +QQRGG +QYNQNWG Y GD S PPAPSSSY
Sbjct: 6 PYPPQSSFGSAPAQNSIPPPPSQSASASSQQRGGATTQYNQNWGAYAGDASGPPAPSSSY 65
Query: 125 PQNY-NQVHQSSNYNQQHYG-PREANTLHLLLLTSRILMHHNRLRPPPDSSYPPPPPPPA 184
PQNY NQ+HQ+SNY+ Q YG PR + S PPPDSSYPPPPPPPA
Sbjct: 66 PQNYNNQLHQTSNYHHQQYGTPRTQHPPPPPPHQSYPYAPQPPPPPPPDSSYPPPPPPPA 125
Query: 185 PSQPSNHYYPPSQYSQSNQNQQSMQPPPPPSSPPPSSSIPPPPPPNSPPPPSAPQQKAEG 244
PSQP N YYP SQYSQ NQNQQSMQPPPPPSSPPPSSSIPPPPPPNSPPPPSA QQKAEG
Sbjct: 126 PSQPPNLYYPSSQYSQGNQNQQSMQPPPPPSSPPPSSSIPPPPPPNSPPPPSASQQKAEG 185
Query: 245 TNMGAHERDKGVSKDPSYGRRERENSNHDKHQRHSGPPMPPKKANGPSGRMETDDEKRLR 304
NMGAHERDKGVSKDPSYGRR+RENSNHDKHQ+HSGPPMPPKKANGPSGRMETDDEK+LR
Sbjct: 186 KNMGAHERDKGVSKDPSYGRRDRENSNHDKHQKHSGPPMPPKKANGPSGRMETDDEKKLR 245
Query: 305 KKREFEKQRQDERHRHHLKESQNTILQKTQMISTGKGHGSIAGSRMGKGRPLHFSVVPER 364
KKREFEKQRQDERHRHHLKESQNTILQKTQM+STGK HGSI GSRMG+ + F
Sbjct: 246 KKREFEKQRQDERHRHHLKESQNTILQKTQMLSTGKVHGSIVGSRMGERKATPFLSGERI 305
Query: 365 ASRYK------CSA----EAHVTTERER--SLYKIYNHITRENI-------QTSALCGAR 424
+R K C E T+ + + SL K +H TR I +
Sbjct: 306 ENRLKKPTTFLCKLKFRNELPDTSAQPKLMSLRKEKDHYTRYTITSLEKTYKPQLYVEPD 365
Query: 425 SWNTLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDKGVAWLV 484
LDLLDLSVYNPPSVR+PLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDKGVAWLV
Sbjct: 366 LGIPLDLLDLSVYNPPSVRVPLAPEDEELLRDDVLKTPVKKDGGIKRKERPTDKGVAWLV 425
Query: 485 KTQYISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEACKSRPV 544
KTQYISPLSIES KQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEACKSRP+
Sbjct: 426 KTQYISPLSIESTKQSLTEKQAKELREMKGGRNILENLNNRERQIKEIEASFEACKSRPI 485
Query: 545 HATNKNLYPVEVLPLLPDFDRYKILERLIVNLVVAFDSAPTADSETFNKLDQSIRIAHES 604
HATNKNLYPVEVLPLLPDFDRY +VVAFDSAPTADSETFNKLDQSIR AHES
Sbjct: 486 HATNKNLYPVEVLPLLPDFDRYD-----DPFVVVAFDSAPTADSETFNKLDQSIRDAHES 545
Query: 605 Q-----------------------------LSKDIYDEQEDVAYSWVREYHWDVRGDNVD 664
Q LSKDIYDEQEDV+YSWVREYHWDVRGDNVD
Sbjct: 546 QAIMKSYMATGSDPSKPEKFLAYMVPSPDELSKDIYDEQEDVSYSWVREYHWDVRGDNVD 605
Query: 665 DPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRPTVATLE 718
DPTTYLVSFDD+EARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRPTVATLE
Sbjct: 606 DPTTYLVSFDDSEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPARVTVRRRPTVATLE 665
BLAST of HG10023520 vs. TAIR 10
Match:
AT1G79730.1 (hydroxyproline-rich glycoprotein family protein)
HSP 1 Score: 431.8 bits (1109), Expect = 1.1e-120
Identity = 299/606 (49.34%), Positives = 370/606 (61.06%), Query Frame = 0
Query: 167 PPPDSSYPPPPPPPAPSQPSNHYYPPSQYSQSNQNQQSMQPPPPPSS-------PPPSSS 226
PPP S PPP PPP PS Y PP PPPPP + P +
Sbjct: 23 PPPPPSLPPPVPPPPPSHQPYSYPPP--------------PPPPPHAYYQQGPHYPQFNQ 82
Query: 227 I--PPPPPPNSPPPPSAPQQKAEGTNMGAHERDKGVSKDPSYGRRERENSNHDKHQRHSG 286
+ PPPPPP S PPP P + G ++ +KG SK GRRER + KH S
Sbjct: 83 LQAPPPPPPPSAPPPLVPDPP---RHQGPNDHEKGASK--QVGRRERAKPDPSKHHHRSH 142
Query: 287 PPMPPKKANGPSGRMETDDEKRLRKKREFEKQRQDERHRHHLKESQNTILQK-------- 346
P S ++ET++E+RLRKKRE EKQRQDE+HR +K S + + K
Sbjct: 143 LP--------HSKKIETEEERRLRKKRELEKQRQDEKHRQQMKNSHKSQMPKGHTEEKKP 202
Query: 347 TQMISTGKGHGSIAGSRMGKGRPLHFSVVPERASRYKCSAEAHVTTERERSLYKIYNHIT 406
T +++T + + + + +P+ +++ K +T +R++ + Y +
Sbjct: 203 TPLLTTDRVENRLKKPTTFICKLKFRNELPDPSAQLKL-----MTIKRDKDQFTKYTITS 262
Query: 407 RENIQTSALCGARSWN-TLDLLDLSVYNPPSVRIPLAPEDEELLRDDVLKTPVKKDGGIK 466
E + + LDLLDLSVYNPP V+ PLAPEDEELLRDD TP+KKD GI+
Sbjct: 263 LEKLWKPKIFVEPDLGIPLDLLDLSVYNPPKVKAPLAPEDEELLRDDDAVTPIKKD-GIR 322
Query: 467 RKERPTDKGVAWLVKTQYISPLSIESAKQSLTEKQAKELREMKGGRNILENLNNRERQIK 526
RKERPTDKG++WLVKTQYIS ++ ESA+QSLTEKQAKELREMKGG NIL NLNNRERQIK
Sbjct: 323 RKERPTDKGMSWLVKTQYISSINNESARQSLTEKQAKELREMKGGINILHNLNNRERQIK 382
Query: 527 EIEASFEACKSRPVHATNKNLYPVEVLPLLPDFDRYKILERLIVNLVVAFDSAPTADSET 586
+IEASFEACKSRPVHATNKNL PVEVLPLLP FDRY E+ + V FD AP ADSE
Sbjct: 383 DIEASFEACKSRPVHATNKNLQPVEVLPLLPYFDRYD--EQFV---VANFDGAPIADSEF 442
Query: 587 FNKLDQSIRIAHES-----------------------------QLSKDIYDEQEDVAYSW 646
F KLD SIR AHES +LSKDI+DE E+++Y+W
Sbjct: 443 FGKLDPSIRDAHESRAILKSYVVAGSDTANPEKFLAYMVPSLDELSKDIHDENEEISYTW 502
Query: 647 VREYHWDVRGDNVDDPTTYLVSFDDAEARYVPLPTKLVLRKKRAKEGRSSDEVEHFPAPA 706
VREY WDV+ N +DP TYLVSFD+ A Y+PLP +L LRKKRA+EGRSSDE+EHFP P+
Sbjct: 503 VREYLWDVQ-PNANDPGTYLVSFDNGTASYLPLPMRLNLRKKRAREGRSSDEIEHFPVPS 562
Query: 707 RVTVRRRPTVATLEVKDPGVY-------SNSKRGSDIEDGIGRSHKHDRHQDMDQYS-GA 718
RVTVRRR TV+ +E KD GVY S+ R + E G+GRS KH+ QD +QYS G
Sbjct: 563 RVTVRRRSTVSVIEHKDSGVYSSRVGASSSKMRRLEDEGGLGRSWKHEPEQDANQYSDGN 589
The following BLAST results are available for this feature:
Match Name | E-value | Identity | Description
F4HQA1 | 1.5e-119 | 49.34 | Protein PAF1 homolog OS=Arabidopsis thaliana OX=3702 GN=VIP2 PE=1 SV=1
Match Name | E-value | Identity | Description
A0A6J1GN64 | 9.5e-274 | 76.62 | protein PAF1 homolog OS=Cucurbita moschata OX=3662 GN=LOC111455942 PE=3 SV=1
A0A6J1JP14 | 2.3e-272 | 76.34 | protein PAF1 homolog OS=Cucurbita maxima OX=3661 GN=LOC111488581 PE=3 SV=1
A0A0A0KCT6 | 3.4e-271 | 76.55 | Uncharacterized protein OS=Cucumis sativus OX=3659 GN=Csa_7G452360 PE=3 SV=1
A0A1S3CHF3 | 4.4e-271 | 76.45 | LOW QUALITY PROTEIN: protein PAF1 homolog OS=Cucumis melo OX=3656 GN=LOC10350077...
A0A5A7UA23 | 5.7e-271 | 76.45 | Protein PAF1-like protein OS=Cucumis melo var. makuwa OX=1194695 GN=E5676_scaffo...
Match Name | E-value | Identity | Description
AT1G79730.1 | 1.1e-120 | 49.34 | hydroxyproline-rich glycoprotein family protein
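The Identity column in the tables above appears to be the matches-over-alignment-length ratio reported on each HSP statistics line, expressed as a percentage. The following is a minimal sketch, not part of the database's own tooling, of how such a line could be parsed and the percentages recomputed as a consistency check. The function name `parse_hsp_stats` and the regular expression are hypothetical, and the pattern is written to accept both "Positives" and the "Postives" spelling that some report versions emit.

```python
import re

# Hypothetical pattern for an HSP statistics line of the form
#   Identity = 542/710 (76.34%), Positives = 569/710 (80.14%), Query Frame = 0
# "Posi?tives" tolerates the misspelled "Postives" variant.
HSP_STATS = re.compile(
    r"Identity = (\d+)/(\d+) \([\d.]+%\), "
    r"Posi?tives = (\d+)/(\d+) \([\d.]+%\)"
)


def parse_hsp_stats(line):
    """Return (identity_pct, positives_pct) recomputed from the raw counts."""
    match = HSP_STATS.search(line)
    if match is None:
        raise ValueError("not an HSP statistics line: %r" % line)
    identical, ident_len, positive, pos_len = (int(g) for g in match.groups())
    # Recompute from the counts instead of trusting the printed percentages.
    identity_pct = round(100 * identical / ident_len, 2)
    positives_pct = round(100 * positive / pos_len, 2)
    return identity_pct, positives_pct


if __name__ == "__main__":
    line = "Identity = 299/606 (49.34%), Positives = 370/606 (61.06%), Query Frame = 0"
    print(parse_hsp_stats(line))
```

For the TAIR 10 hit, 299 identical positions over an alignment length of 606 gives 49.34%, which agrees with the value listed for AT1G79730.1 in the summary table.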