CmaCh16G006480.1 (mRNA) Cucurbita maxima (Rimu) v1.1

Overview
Name: CmaCh16G006480.1
Type: mRNA
Organism: Cucurbita maxima (Cucurbita maxima (Rimu) v1.1)
Description: Integrase catalytic domain-containing protein
Location: Cma_Chr16: 3365219 .. 3369415 (+)
Sequence length: 4197
RNA-Seq Expression: CmaCh16G006480.1
Synteny: CmaCh16G006480.1
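
The Location and Sequence length fields agree under the usual 1-based, inclusive coordinate convention: 3369415 - 3365219 + 1 = 4197. The following is a minimal sketch of that check, assuming the location string format used in this record. The names `parse_location` and `span_length` are hypothetical helpers, not part of any database API.

```python
import re

# Assumed format, taken from this record: "<seqid>: <start> .. <end> (<strand>)"
LOCATION_RE = re.compile(r"^(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)$")


def parse_location(text):
    """Split a location string into (seqid, start, end, strand)."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    seqid, start, end, strand = match.groups()
    return seqid, int(start), int(end), strand


def span_length(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


seqid, start, end, strand = parse_location("Cma_Chr16: 3365219 .. 3369415 (+)")
print(seqid, strand, span_length(start, end))  # Cma_Chr16 + 4197
```

A length of 4197 is also a multiple of three (1399 codons), as expected for a feature whose gene, mRNA and coding sequences all span the same interval.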
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

ATGTCAATCCCAATCAACACAATGTCTTCTCCGTCGATCAGTCAGGTGATTAGTGTTAAGCTTACACAAGAAAATTATCTACTGTGGTCTACCCAAATCCTTCCCTACTTGCGTAGCCAAAACCTTGTTGGTTTTGTGGATGGATCCATGCCTGCACCAAGCCAGACGATCGCCGTTGAACCAAGTGAAGAAATAGGGAATCGCAAAATTATCATCAACCCTGAGTTCACAGTCTGGTACCCCCAGGACCAGCTGGTACTCAGCCTCATCAACTCATCAGTCACTGAGGAGGTTCTCAGCACGATGGTTGGAATCACCACTGCACGAGAAGCCTGGATTACGCTGGAGCGACAATTTGCTTCCACATCTCGAGCAAGAGCAATGCAGATCCGTATGGAACTCTCTACTATCCAGAAAAAGGACATGACAATTGCTGACTACTTTCGTAAAGTAAAACATCTTGGTGATACACTTGCTGCCATTGGCAAGCGAATAGAGGATGAAGAACTCATCGCCTACATGCTGCAAGGACTTGGTCCAGATTATGATCCTCTAGTCACAAGCATTACAACCAGAACAGATGTATACACTGTCAGCGACGTGTATGCTCACATGCTGAGCTATGAGATGCGGCACTTGCGTAAGGGTACATTTGAGCAACTTTCATCTGCTAACAATGTCAATAGGATATCCATTCGTGGAGGTGCCAATGGAGGTCGAGGTAGTCGCGGTCGTAGTCGTCAGTTAAATAGTGGTCATGGACAATCAAGGCGTACTGTGAACAATCCTGGACGTCAACCATCAAAGACACAAAGCAGCTCAGGCATTGTCTGTCAGATTTGTGGTAAGCCCAATCACGATGCTTTGCAATGCTGGCACAGATTTGATCAGGCATATCAAGCCGAAAATAATCTCAAACAAGCAGCTTTGGCAACTAGTGGATACACTAGTGACACAAACTGGTATGTTGACACTGGAGCCACAGATCATATCACCAATGACCTAGAGAGGCTTACCACCAGAGAACGCTACACTGGCACCGACCAAATTCAGGTTGCAAATGGCGCAGGTTTGTCTATCTCTCATATTGGGAATTCATTAATTTCTGGTTCATCTCTTGTTCTGAAACATATCCTATATGCTCCTAAAATCAATAAGCACCTAATTTCAGTACAAAGACTAGCATCTGATAATAATGCTGTTGTAGAATTTCACCCAAACTATTTTTTGGTTAAGGACCGAGTCACGAAGAAACTCCTGCTCCACGGTAGATGTAAGAATGGCCTATACGTTCTACCGCATAATTTCAGTCAAGCCTTGCTGACAGCCAAACTTTCGAAAGAACAATGGCACAGAAGGCTAGGGCACCCTGCATCTCCAATTACCATTAGAATTCTACAAGATAATAATTTAGCTATAGATACTAATATTCCCTCTTCCTCAATTTGTAATGCTTGTCAATTAGGGAAAGCACATCAATTGCCATTTGGTTCTTCTCAGCATGTATCTACAGCACCCCTTCAATTAATTCACACTGATGTATGGGGTCCATCCATTGCGTCAGTAAATAATTCCAAATATTATGTTTCCTTTGTTGATGATTTTAGTCGTTATGTTTGGATTTACTTTCTGAGATGCAAATCTGATGTTGAGTCTGTGTTCCTTCAATTTCAAAAACATGTTGAAACTATGCTAAATACCAAAATTCGCTCCGTCCAATCAGATTGGGGGGGTGAATACCATCGGTTACACAATTATTTCAAATCCACAGGCATTGAACATCATATCTCCTGTCCTCACACACACCAGCAGAATGGGTTAGTCGAAAGAAAACACAGACACATTGTAGAAACTGGCCTTGCTTTACTCGCTCAAGCCAACATGCCTCTATCCTACTGGGATGAAGCTTTCAACACAGCTTGCTTTCTTATAAATAGAATGCCCAGCCGAACCATACAACAAGACACACCACTTCATAAATTGTTTGGTAAAAGTCCAGA
CTACTCCATGCTTAGGGTGTTTGGCTGTGCTTGCTGGCCTAATTTAAGGCCTTACAACAACAAGAAACTGAGTTTCAGAACTACTAGATGTATATTCTTGGGTTATAGTTCTTCTCATAAGGGATATAAATGCTTAAATAGAAGTACAGGACGTATTTACATCTCTAGAGACGTGGTTTTCGATGAAAATATTTTTCCTTTTGAAGAATCTAAGCCACCAAACAAAACCACAAATCCACATCATCCTGTTCTACTTCCAGCCTTAGCCAAACTTGCTAGTTTTTACACTGAAAATGCTCTTACAGATATTGAACCAGTTGTTAGTAATTCCCATATGAATGATGGTCAAACTGATAATATTGCTAGTGACAACTTGTCTGGTGTCAGCTTATCTTCTGCAGATAATACAAGAAGTTCAGAGGAAATTGCAGAATATGAAGCTGAGAGCAGTTCGATCAATGCTCAAAACCAAACTCATGAACATGTGTCTGATCAACCAACTGAAGCAGCTAGTCAACATCCAATGCGAACAAGGTTGAGAAATAACATTGTACAAGCTAAACAATTCACTGATGGAACTATCAGATATTCAGAAACCTCAAGAAAATTCGCAAGCGCTGTAACTATCACAACTCCGATCATAGAGACTGCTACTGAACCTCGAAACCTGCAGGAAGCCATGCAACATCCAAGATGGAGAGAAGCAATGAATGATGAGCTCTCAGCGCTAAAACGAAATGCCACTTGGGATCTAGTTCCACCCAAACCTGGAATAAATCTCATTGATAGTAAATGGGTGTATAAAGTGAAAAGAAAAGCAGATGGGTCAGTTGAAAGATTAAAAGCAAGATTAGTTGCCAAAGGATTCAAGCAAAGATTTGGTGTTGATTACACTGATACTTTTAGCCCTGTGATCAAACCGTCAACAATCAGGGTCATTCTTTCGCTAGCAGTAACCAAGGGCTGGAATATGAGACAAGTTGATATCCAAAATGCATTTTTGCATGGAATTCTGAAAGAGGAAGTGTACATGCGACAACCACCAGGATTTCAAGACTCAGCCAAACCAAAGAATTACATATGCAAGCTCAAGAAAGCCCTTTATGGCCTGAAACAAGCCCCAAAAGCTTGGCATTCAAGGTTGACTGGAAAACTTATTGAGTTAGGCTTCAAGGCTTCAGTAGCTGATTCATCTCTTTTTATTCTCAAAAACAGAGAGATAACTATCTATATGCTCATCTATGTTGATGATATAATTATTGTGAGCTCCTCTGATCAAGCAACCGAAAGGTTGATTCAGAAATTGAAAATAGATTTTGCAGTAAAAGATTTGGGTGATCTTGAGTATTTTCTGGGTATTGAAGTCAAGAAAACACGAGATGGTATCATACTGTCACAGAGACGATATGCTTTAGATTTGTTGAAAAGAGTAAACATGGAAAAATGCAAACCTATGTCTACACCAATGGGTTCTGCTGAAAAATTATTCAGAGAACAAGGAATACCCTTATCAGCTGAAGAACAATTCAAATACAGAAGTACAGTGGGAGCACTACAATATTTGACAATGACTAGGCCTGATTTGGCATTTGCTGTCAATAAAGTGTGTCAATATCTTCATACACCTACTGATGCTCATTGGGGTGCTGTGAAGAGAATTCTTCGTTATGTTAAAGGCACACTAGCATTAGGAGTGAAAATTCAGAAATCAACCATGATGTTGTCGGGGTTTTCTGATGCTGATTGGGCTGGTTGTCCCGATGATCGACGTTCAACTAGCGGCTTTGCTGTATTTCTTGGAGCAAATCTAATCTCATGGAGTTCCAGAAAACAGGCTACAGTGTCAAGATCAAGCACCGAAGCAGAATACAAGGCCATTGCGAATCTTACTGCAGAAATGATTTGGATCAAGTCATTACTGAAGGAACTGGGCGTGTATCAATCAAAGGCTCCTCGCCTCTGGTGTGACAACCTCGGAGCTACATATTTAACTTCAAATC
CAGTATTTCATGCTAGAACGAAACATATTGAAGTTGATTTTCATTTTGTTCGAGAACAAGTAGCACGTAAAGCAATGGAAGTTCGGTTCATTTCATCAAGTGATCAAGTAGCTGATATCCTGACAAAACCACTGTCTAAAACTCCTTTTACTACACATTGTAACAATCTCAACATGTACAAGACTTGTTGGGATTGA

mRNA sequence

ATGTCAATCCCAATCAACACAATGTCTTCTCCGTCGATCAGTCAGGTGATTAGTGTTAAGCTTACACAAGAAAATTATCTACTGTGGTCTACCCAAATCCTTCCCTACTTGCGTAGCCAAAACCTTGTTGGTTTTGTGGATGGATCCATGCCTGCACCAAGCCAGACGATCGCCGTTGAACCAAGTGAAGAAATAGGGAATCGCAAAATTATCATCAACCCTGAGTTCACAGTCTGGTACCCCCAGGACCAGCTGGTACTCAGCCTCATCAACTCATCAGTCACTGAGGAGGTTCTCAGCACGATGGTTGGAATCACCACTGCACGAGAAGCCTGGATTACGCTGGAGCGACAATTTGCTTCCACATCTCGAGCAAGAGCAATGCAGATCCGTATGGAACTCTCTACTATCCAGAAAAAGGACATGACAATTGCTGACTACTTTCGTAAAGTAAAACATCTTGGTGATACACTTGCTGCCATTGGCAAGCGAATAGAGGATGAAGAACTCATCGCCTACATGCTGCAAGGACTTGGTCCAGATTATGATCCTCTAGTCACAAGCATTACAACCAGAACAGATGTATACACTGTCAGCGACGTGTATGCTCACATGCTGAGCTATGAGATGCGGCACTTGCGTAAGGGTACATTTGAGCAACTTTCATCTGCTAACAATGTCAATAGGATATCCATTCGTGGAGGTGCCAATGGAGGTCGAGGTAGTCGCGGTCGTAGTCGTCAGTTAAATAGTGGTCATGGACAATCAAGGCGTACTGTGAACAATCCTGGACGTCAACCATCAAAGACACAAAGCAGCTCAGGCATTGTCTGTCAGATTTGTGGTAAGCCCAATCACGATGCTTTGCAATGCTGGCACAGATTTGATCAGGCATATCAAGCCGAAAATAATCTCAAACAAGCAGCTTTGGCAACTAGTGGATACACTAGTGACACAAACTGGTATGTTGACACTGGAGCCACAGATCATATCACCAATGACCTAGAGAGGCTTACCACCAGAGAACGCTACACTGGCACCGACCAAATTCAGGTTGCAAATGGCGCAGGTTTGTCTATCTCTCATATTGGGAATTCATTAATTTCTGGTTCATCTCTTGTTCTGAAACATATCCTATATGCTCCTAAAATCAATAAGCACCTAATTTCAGTACAAAGACTAGCATCTGATAATAATGCTGTTGTAGAATTTCACCCAAACTATTTTTTGGTTAAGGACCGAGTCACGAAGAAACTCCTGCTCCACGGTAGATGTAAGAATGGCCTATACGTTCTACCGCATAATTTCAGTCAAGCCTTGCTGACAGCCAAACTTTCGAAAGAACAATGGCACAGAAGGCTAGGGCACCCTGCATCTCCAATTACCATTAGAATTCTACAAGATAATAATTTAGCTATAGATACTAATATTCCCTCTTCCTCAATTTGTAATGCTTGTCAATTAGGGAAAGCACATCAATTGCCATTTGGTTCTTCTCAGCATGTATCTACAGCACCCCTTCAATTAATTCACACTGATGTATGGGGTCCATCCATTGCGTCAGTAAATAATTCCAAATATTATGTTTCCTTTGTTGATGATTTTAGTCGTTATGTTTGGATTTACTTTCTGAGATGCAAATCTGATGTTGAGTCTGTGTTCCTTCAATTTCAAAAACATGTTGAAACTATGCTAAATACCAAAATTCGCTCCGTCCAATCAGATTGGGGGGGTGAATACCATCGGTTACACAATTATTTCAAATCCACAGGCATTGAACATCATATCTCCTGTCCTCACACACACCAGCAGAATGGGTTAGTCGAAAGAAAACACAGACACATTGTAGAAACTGGCCTTGCTTTACTCGCTCAAGCCAACATGCCTCTATCCTACTGGGATGAAGCTTTCAACACAGCTTGCTTTCTTATAAATAGAATGCCCAGCCGAACCATACAACAAGACACACCACTTCATAAATTGTTTGGTAAAAGTCCAGA
CTACTCCATGCTTAGGGTGTTTGGCTGTGCTTGCTGGCCTAATTTAAGGCCTTACAACAACAAGAAACTGAGTTTCAGAACTACTAGATGTATATTCTTGGGTTATAGTTCTTCTCATAAGGGATATAAATGCTTAAATAGAAGTACAGGACGTATTTACATCTCTAGAGACGTGGTTTTCGATGAAAATATTTTTCCTTTTGAAGAATCTAAGCCACCAAACAAAACCACAAATCCACATCATCCTGTTCTACTTCCAGCCTTAGCCAAACTTGCTAGTTTTTACACTGAAAATGCTCTTACAGATATTGAACCAGTTGTTAGTAATTCCCATATGAATGATGGTCAAACTGATAATATTGCTAGTGACAACTTGTCTGGTGTCAGCTTATCTTCTGCAGATAATACAAGAAGTTCAGAGGAAATTGCAGAATATGAAGCTGAGAGCAGTTCGATCAATGCTCAAAACCAAACTCATGAACATGTGTCTGATCAACCAACTGAAGCAGCTAGTCAACATCCAATGCGAACAAGGTTGAGAAATAACATTGTACAAGCTAAACAATTCACTGATGGAACTATCAGATATTCAGAAACCTCAAGAAAATTCGCAAGCGCTGTAACTATCACAACTCCGATCATAGAGACTGCTACTGAACCTCGAAACCTGCAGGAAGCCATGCAACATCCAAGATGGAGAGAAGCAATGAATGATGAGCTCTCAGCGCTAAAACGAAATGCCACTTGGGATCTAGTTCCACCCAAACCTGGAATAAATCTCATTGATAGTAAATGGGTGTATAAAGTGAAAAGAAAAGCAGATGGGTCAGTTGAAAGATTAAAAGCAAGATTAGTTGCCAAAGGATTCAAGCAAAGATTTGGTGTTGATTACACTGATACTTTTAGCCCTGTGATCAAACCGTCAACAATCAGGGTCATTCTTTCGCTAGCAGTAACCAAGGGCTGGAATATGAGACAAGTTGATATCCAAAATGCATTTTTGCATGGAATTCTGAAAGAGGAAGTGTACATGCGACAACCACCAGGATTTCAAGACTCAGCCAAACCAAAGAATTACATATGCAAGCTCAAGAAAGCCCTTTATGGCCTGAAACAAGCCCCAAAAGCTTGGCATTCAAGGTTGACTGGAAAACTTATTGAGTTAGGCTTCAAGGCTTCAGTAGCTGATTCATCTCTTTTTATTCTCAAAAACAGAGAGATAACTATCTATATGCTCATCTATGTTGATGATATAATTATTGTGAGCTCCTCTGATCAAGCAACCGAAAGGTTGATTCAGAAATTGAAAATAGATTTTGCAGTAAAAGATTTGGGTGATCTTGAGTATTTTCTGGGTATTGAAGTCAAGAAAACACGAGATGGTATCATACTGTCACAGAGACGATATGCTTTAGATTTGTTGAAAAGAGTAAACATGGAAAAATGCAAACCTATGTCTACACCAATGGGTTCTGCTGAAAAATTATTCAGAGAACAAGGAATACCCTTATCAGCTGAAGAACAATTCAAATACAGAAGTACAGTGGGAGCACTACAATATTTGACAATGACTAGGCCTGATTTGGCATTTGCTGTCAATAAAGTGTGTCAATATCTTCATACACCTACTGATGCTCATTGGGGTGCTGTGAAGAGAATTCTTCGTTATGTTAAAGGCACACTAGCATTAGGAGTGAAAATTCAGAAATCAACCATGATGTTGTCGGGGTTTTCTGATGCTGATTGGGCTGGTTGTCCCGATGATCGACGTTCAACTAGCGGCTTTGCTGTATTTCTTGGAGCAAATCTAATCTCATGGAGTTCCAGAAAACAGGCTACAGTGTCAAGATCAAGCACCGAAGCAGAATACAAGGCCATTGCGAATCTTACTGCAGAAATGATTTGGATCAAGTCATTACTGAAGGAACTGGGCGTGTATCAATCAAAGGCTCCTCGCCTCTGGTGTGACAACCTCGGAGCTACATATTTAACTTCAAATC
CAGTATTTCATGCTAGAACGAAACATATTGAAGTTGATTTTCATTTTGTTCGAGAACAAGTAGCACGTAAAGCAATGGAAGTTCGGTTCATTTCATCAAGTGATCAAGTAGCTGATATCCTGACAAAACCACTGTCTAAAACTCCTTTTACTACACATTGTAACAATCTCAACATGTACAAGACTTGTTGGGATTGA

Coding sequence (CDS)

ATGTCAATCCCAATCAACACAATGTCTTCTCCGTCGATCAGTCAGGTGATTAGTGTTAAGCTTACACAAGAAAATTATCTACTGTGGTCTACCCAAATCCTTCCCTACTTGCGTAGCCAAAACCTTGTTGGTTTTGTGGATGGATCCATGCCTGCACCAAGCCAGACGATCGCCGTTGAACCAAGTGAAGAAATAGGGAATCGCAAAATTATCATCAACCCTGAGTTCACAGTCTGGTACCCCCAGGACCAGCTGGTACTCAGCCTCATCAACTCATCAGTCACTGAGGAGGTTCTCAGCACGATGGTTGGAATCACCACTGCACGAGAAGCCTGGATTACGCTGGAGCGACAATTTGCTTCCACATCTCGAGCAAGAGCAATGCAGATCCGTATGGAACTCTCTACTATCCAGAAAAAGGACATGACAATTGCTGACTACTTTCGTAAAGTAAAACATCTTGGTGATACACTTGCTGCCATTGGCAAGCGAATAGAGGATGAAGAACTCATCGCCTACATGCTGCAAGGACTTGGTCCAGATTATGATCCTCTAGTCACAAGCATTACAACCAGAACAGATGTATACACTGTCAGCGACGTGTATGCTCACATGCTGAGCTATGAGATGCGGCACTTGCGTAAGGGTACATTTGAGCAACTTTCATCTGCTAACAATGTCAATAGGATATCCATTCGTGGAGGTGCCAATGGAGGTCGAGGTAGTCGCGGTCGTAGTCGTCAGTTAAATAGTGGTCATGGACAATCAAGGCGTACTGTGAACAATCCTGGACGTCAACCATCAAAGACACAAAGCAGCTCAGGCATTGTCTGTCAGATTTGTGGTAAGCCCAATCACGATGCTTTGCAATGCTGGCACAGATTTGATCAGGCATATCAAGCCGAAAATAATCTCAAACAAGCAGCTTTGGCAACTAGTGGATACACTAGTGACACAAACTGGTATGTTGACACTGGAGCCACAGATCATATCACCAATGACCTAGAGAGGCTTACCACCAGAGAACGCTACACTGGCACCGACCAAATTCAGGTTGCAAATGGCGCAGGTTTGTCTATCTCTCATATTGGGAATTCATTAATTTCTGGTTCATCTCTTGTTCTGAAACATATCCTATATGCTCCTAAAATCAATAAGCACCTAATTTCAGTACAAAGACTAGCATCTGATAATAATGCTGTTGTAGAATTTCACCCAAACTATTTTTTGGTTAAGGACCGAGTCACGAAGAAACTCCTGCTCCACGGTAGATGTAAGAATGGCCTATACGTTCTACCGCATAATTTCAGTCAAGCCTTGCTGACAGCCAAACTTTCGAAAGAACAATGGCACAGAAGGCTAGGGCACCCTGCATCTCCAATTACCATTAGAATTCTACAAGATAATAATTTAGCTATAGATACTAATATTCCCTCTTCCTCAATTTGTAATGCTTGTCAATTAGGGAAAGCACATCAATTGCCATTTGGTTCTTCTCAGCATGTATCTACAGCACCCCTTCAATTAATTCACACTGATGTATGGGGTCCATCCATTGCGTCAGTAAATAATTCCAAATATTATGTTTCCTTTGTTGATGATTTTAGTCGTTATGTTTGGATTTACTTTCTGAGATGCAAATCTGATGTTGAGTCTGTGTTCCTTCAATTTCAAAAACATGTTGAAACTATGCTAAATACCAAAATTCGCTCCGTCCAATCAGATTGGGGGGGTGAATACCATCGGTTACACAATTATTTCAAATCCACAGGCATTGAACATCATATCTCCTGTCCTCACACACACCAGCAGAATGGGTTAGTCGAAAGAAAACACAGACACATTGTAGAAACTGGCCTTGCTTTACTCGCTCAAGCCAACATGCCTCTATCCTACTGGGATGAAGCTTTCAACACAGCTTGCTTTCTTATAAATAGAATGCCCAGCCGAACCATACAACAAGACACACCACTTCATAAATTGTTTGGTAAAAGTCCAGA
CTACTCCATGCTTAGGGTGTTTGGCTGTGCTTGCTGGCCTAATTTAAGGCCTTACAACAACAAGAAACTGAGTTTCAGAACTACTAGATGTATATTCTTGGGTTATAGTTCTTCTCATAAGGGATATAAATGCTTAAATAGAAGTACAGGACGTATTTACATCTCTAGAGACGTGGTTTTCGATGAAAATATTTTTCCTTTTGAAGAATCTAAGCCACCAAACAAAACCACAAATCCACATCATCCTGTTCTACTTCCAGCCTTAGCCAAACTTGCTAGTTTTTACACTGAAAATGCTCTTACAGATATTGAACCAGTTGTTAGTAATTCCCATATGAATGATGGTCAAACTGATAATATTGCTAGTGACAACTTGTCTGGTGTCAGCTTATCTTCTGCAGATAATACAAGAAGTTCAGAGGAAATTGCAGAATATGAAGCTGAGAGCAGTTCGATCAATGCTCAAAACCAAACTCATGAACATGTGTCTGATCAACCAACTGAAGCAGCTAGTCAACATCCAATGCGAACAAGGTTGAGAAATAACATTGTACAAGCTAAACAATTCACTGATGGAACTATCAGATATTCAGAAACCTCAAGAAAATTCGCAAGCGCTGTAACTATCACAACTCCGATCATAGAGACTGCTACTGAACCTCGAAACCTGCAGGAAGCCATGCAACATCCAAGATGGAGAGAAGCAATGAATGATGAGCTCTCAGCGCTAAAACGAAATGCCACTTGGGATCTAGTTCCACCCAAACCTGGAATAAATCTCATTGATAGTAAATGGGTGTATAAAGTGAAAAGAAAAGCAGATGGGTCAGTTGAAAGATTAAAAGCAAGATTAGTTGCCAAAGGATTCAAGCAAAGATTTGGTGTTGATTACACTGATACTTTTAGCCCTGTGATCAAACCGTCAACAATCAGGGTCATTCTTTCGCTAGCAGTAACCAAGGGCTGGAATATGAGACAAGTTGATATCCAAAATGCATTTTTGCATGGAATTCTGAAAGAGGAAGTGTACATGCGACAACCACCAGGATTTCAAGACTCAGCCAAACCAAAGAATTACATATGCAAGCTCAAGAAAGCCCTTTATGGCCTGAAACAAGCCCCAAAAGCTTGGCATTCAAGGTTGACTGGAAAACTTATTGAGTTAGGCTTCAAGGCTTCAGTAGCTGATTCATCTCTTTTTATTCTCAAAAACAGAGAGATAACTATCTATATGCTCATCTATGTTGATGATATAATTATTGTGAGCTCCTCTGATCAAGCAACCGAAAGGTTGATTCAGAAATTGAAAATAGATTTTGCAGTAAAAGATTTGGGTGATCTTGAGTATTTTCTGGGTATTGAAGTCAAGAAAACACGAGATGGTATCATACTGTCACAGAGACGATATGCTTTAGATTTGTTGAAAAGAGTAAACATGGAAAAATGCAAACCTATGTCTACACCAATGGGTTCTGCTGAAAAATTATTCAGAGAACAAGGAATACCCTTATCAGCTGAAGAACAATTCAAATACAGAAGTACAGTGGGAGCACTACAATATTTGACAATGACTAGGCCTGATTTGGCATTTGCTGTCAATAAAGTGTGTCAATATCTTCATACACCTACTGATGCTCATTGGGGTGCTGTGAAGAGAATTCTTCGTTATGTTAAAGGCACACTAGCATTAGGAGTGAAAATTCAGAAATCAACCATGATGTTGTCGGGGTTTTCTGATGCTGATTGGGCTGGTTGTCCCGATGATCGACGTTCAACTAGCGGCTTTGCTGTATTTCTTGGAGCAAATCTAATCTCATGGAGTTCCAGAAAACAGGCTACAGTGTCAAGATCAAGCACCGAAGCAGAATACAAGGCCATTGCGAATCTTACTGCAGAAATGATTTGGATCAAGTCATTACTGAAGGAACTGGGCGTGTATCAATCAAAGGCTCCTCGCCTCTGGTGTGACAACCTCGGAGCTACATATTTAACTTCAAATC
CAGTATTTCATGCTAGAACGAAACATATTGAAGTTGATTTTCATTTTGTTCGAGAACAAGTAGCACGTAAAGCAATGGAAGTTCGGTTCATTTCATCAAGTGATCAAGTAGCTGATATCCTGACAAAACCACTGTCTAAAACTCCTTTTACTACACATTGTAACAATCTCAACATGTACAAGACTTGTTGGGATTGA

Protein sequence

MSIPINTMSSPSISQVISVKLTQENYLLWSTQILPYLRSQNLVGFVDGSMPAPSQTIAVEPSEEIGNRKIIINPEFTVWYPQDQLVLSLINSSVTEEVLSTMVGITTAREAWITLERQFASTSRARAMQIRMELSTIQKKDMTIADYFRKVKHLGDTLAAIGKRIEDEELIAYMLQGLGPDYDPLVTSITTRTDVYTVSDVYAHMLSYEMRHLRKGTFEQLSSANNVNRISIRGGANGGRGSRGRSRQLNSGHGQSRRTVNNPGRQPSKTQSSSGIVCQICGKPNHDALQCWHRFDQAYQAENNLKQAALATSGYTSDTNWYVDTGATDHITNDLERLTTRERYTGTDQIQVANGAGLSISHIGNSLISGSSLVLKHILYAPKINKHLISVQRLASDNNAVVEFHPNYFLVKDRVTKKLLLHGRCKNGLYVLPHNFSQALLTAKLSKEQWHRRLGHPASPITIRILQDNNLAIDTNIPSSSICNACQLGKAHQLPFGSSQHVSTAPLQLIHTDVWGPSIASVNNSKYYVSFVDDFSRYVWIYFLRCKSDVESVFLQFQKHVETMLNTKIRSVQSDWGGEYHRLHNYFKSTGIEHHISCPHTHQQNGLVERKHRHIVETGLALLAQANMPLSYWDEAFNTACFLINRMPSRTIQQDTPLHKLFGKSPDYSMLRVFGCACWPNLRPYNNKKLSFRTTRCIFLGYSSSHKGYKCLNRSTGRIYISRDVVFDENIFPFEESKPPNKTTNPHHPVLLPALAKLASFYTENALTDIEPVVSNSHMNDGQTDNIASDNLSGVSLSSADNTRSSEEIAEYEAESSSINAQNQTHEHVSDQPTEAASQHPMRTRLRNNIVQAKQFTDGTIRYSETSRKFASAVTITTPIIETATEPRNLQEAMQHPRWREAMNDELSALKRNATWDLVPPKPGINLIDSKWVYKVKRKADGSVERLKARLVAKGFKQRFGVDYTDTFSPVIKPSTIRVILSLAVTKGWNMRQVDIQNAFLHGILKEEVYMRQPPGFQDSAKPKNYICKLKKALYGLKQAPKAWHSRLTGKLIELGFKASVADSSLFILKNREITIYMLIYVDDIIIVSSSDQATERLIQKLKIDFAVKDLGDLEYFLGIEVKKTRDGIILSQRRYALDLLKRVNMEKCKPMSTPMGSAEKLFREQGIPLSAEEQFKYRSTVGALQYLTMTRPDLAFAVNKVCQYLHTPTDAHWGAVKRILRYVKGTLALGVKIQKSTMMLSGFSDADWAGCPDDRRSTSGFAVFLGANLISWSSRKQATVSRSSTEAEYKAIANLTAEMIWIKSLLKELGVYQSKAPRLWCDNLGATYLTSNPVFHARTKHIEVDFHFVREQVARKAMEVRFISSSDQVADILTKPLSKTPFTTHCNNLNMYKTCWD
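
The protein sequence above is the conceptual translation of the CDS, whose 4197 nt correspond to 1399 codons, the last being the TGA stop. The following is a minimal sketch of that translation, assuming the standard genetic code (NCBI translation table 1). The function `translate` is an illustrative helper, and only the first and last few codons of the CDS are used as input.

```python
# Standard genetic code (NCBI table 1), enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a DNA coding sequence in frame 0; stop codons become '*'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First eight codons and last four codons of the CDS shown above.
print(translate("ATGTCAATCCCAATCAACACAATG"))  # MSIPINTM
print(translate("TGTTGGGATTGA"))              # CWD*
```

Both outputs match the start (MSIPINTM) and the end (CWD, followed by the stop codon) of the protein sequence listed above.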
Homology
BLAST of CmaCh16G006480.1 vs. ExPASy Swiss-Prot
Match: Q94HW2 (Retrovirus-related Pol polyprotein from transposon RE1 OS=Arabidopsis thaliana OX=3702 GN=RE1 PE=2 SV=1)

HSP 1 Score: 982.2 bits (2538), Expect = 5.9e-285
Identity = 579/1475 (39.25%), Positives = 839/1475 (56.88%), Query Frame = 0
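
The percentages in an HSP summary are plain ratios over the alignment length, which counts gap columns as well as residues. The following is a small sketch that recomputes them from the counts reported for this hit. The function name `percent` is illustrative only.

```python
def percent(count, alignment_length):
    """Share of alignment columns, as a percentage rounded to two decimals."""
    return round(100.0 * count / alignment_length, 2)


# Counts reported for the Q94HW2 (RE1) hit: 579 identical and 839 positive
# columns out of 1475 aligned columns.
print(percent(579, 1475))  # 39.25
print(percent(839, 1475))  # 56.88
```

The same calculation reproduces the figures for the other hits on this page, for example 565/1462 = 38.65% for Q9ZT94.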

Query: 5    INTMSSPSISQVISVKLTQENYLLWSTQILPYLRSQNLVGFVDGSMPAPSQTIAVEPSEE 64
            +N  S  +++     KLT  NYL+WS Q+        L GF+DGS   P  TI  + +  
Sbjct: 10   LNNTSILNVNMSNVTKLTSTNYLMWSRQVHALFDGYELAGFLDGSTTMPPATIGTDAAPR 69

Query: 65   IGNRKIIINPEFTVWYPQDQLVLSLINSSVTEEVLSTMVGITTAREAWITLERQFASTSR 124
                   +NP++T W  QD+L+ S +  +++  V   +   TTA + W TL + +A+ S 
Sbjct: 70   -------VNPDYTRWKRQDKLIYSAVLGAISMSVQPAVSRATTAAQIWETLRKIYANPSY 129

Query: 125  ARAMQIRMELSTIQKKDMTIADYFRKVKHLGDTLAAIGKRIEDEELIAYMLQGLGPDYDP 184
                Q+R +L    K   TI DY + +    D LA +GK ++ +E +  +L+ L  +Y P
Sbjct: 130  GHVTQLRTQLKQWTKGTKTIDDYMQGLVTRFDQLALLGKPMDHDEQVERVLENLPEEYKP 189

Query: 185  LVTSITTRTDVYTVSDVYAHMLSYEMRHLRKGTFEQL----SSANNVNRISIRGGANGGR 244
            ++  I  +    T+++++  +L++E + L   +   +    ++ ++ N  +     NG R
Sbjct: 190  VIDQIAAKDTPPTLTEIHERLLNHESKILAVSSATVIPITANAVSHRNTTTTNNNNNGNR 249

Query: 245  GSR--GRSRQLNSGHGQSRRTVNNPGRQPSKTQSSSGIVCQICGKPNHDALQCWHRFDQA 304
             +R   R+   NS   Q   T  +P    SK        CQICG   H A +C     Q 
Sbjct: 250  NNRYDNRNNNNNSKPWQQSSTNFHPNNNQSKPYLGK---CQICGVQGHSAKRCSQL--QH 309

Query: 305  YQAENNLKQ-----------AALATSGYTSDTNWYVDTGATDHITNDLERLTTRERYTGT 364
            + +  N +Q           A LA     S  NW +D+GAT HIT+D   L+  + YTG 
Sbjct: 310  FLSSVNSQQPPSPFTPWQPRANLALGSPYSSNNWLLDSGATHHITSDFNNLSLHQPYTGG 369

Query: 365  DQIQVANGAGLSISHIGNSLISGSS--LVLKHILYAPKINKHLISVQRLASDNNAVVEFH 424
            D + VA+G+ + ISH G++ +S  S  L L +ILY P I+K+LISV RL + N   VEF 
Sbjct: 370  DDVMVADGSTIPISHTGSTSLSTKSRPLNLHNILYVPNIHKNLISVYRLCNANGVSVEFF 429

Query: 425  PNYFLVKDRVTKKLLLHGRCKNGLYVLPHNFSQ-----ALLTAKLSKEQWHRRLGHPASP 484
            P  F VKD  T   LL G+ K+ LY  P   SQ     A  ++K +   WH RLGHPA  
Sbjct: 430  PASFQVKDLNTGVPLLQGKTKDELYEWPIASSQPVSLFASPSSKATHSSWHARLGHPAPS 489

Query: 485  ITIRILQDNNLAIDTNIPSSSICNACQLGKAHQLPFGSSQHVSTAPLQLIHTDVWGPSIA 544
            I   ++ + +L++         C+ C + K++++PF  S   ST PL+ I++DVW   I 
Sbjct: 490  ILNSVISNYSLSVLNPSHKFLSCSDCLINKSNKVPFSQSTINSTRPLEYIYSDVWSSPIL 549

Query: 545  SVNNSKYYVSFVDDFSRYVWIYFLRCKSDVESVFLQFQKHVETMLNTKIRSVQSDWGGEY 604
            S +N +YYV FVD F+RY W+Y L+ KS V+  F+ F+  +E    T+I +  SD GGE+
Sbjct: 550  SHDNYRYYVIFVDHFTRYTWLYPLKQKSQVKETFITFKNLLENRFQTRIGTFYSDNGGEF 609

Query: 605  HRLHNYFKSTGIEHHISCPHTHQQNGLVERKHRHIVETGLALLAQANMPLSYWDEAFNTA 664
              L  YF   GI H  S PHT + NGL ERKHRHIVETGL LL+ A++P +YW  AF  A
Sbjct: 610  VALWEYFSQHGISHLTSPPHTPEHNGLSERKHRHIVETGLTLLSHASIPKTYWPYAFAVA 669

Query: 665  CFLINRMPSRTIQQDTPLHKLFGKSPDYSMLRVFGCACWPNLRPYNNKKLSFRTTRCIFL 724
             +LINR+P+  +Q ++P  KLFG SP+Y  LRVFGCAC+P LRPYN  KL  ++ +C+FL
Sbjct: 670  VYLINRLPTPLLQLESPFQKLFGTSPNYDKLRVFGCACYPWLRPYNQHKLDDKSRQCVFL 729

Query: 725  GYSSSHKGYKCLNRSTGRIYISRDVVFDENIFPF--------------EESK-------- 784
            GYS +   Y CL+  T R+YISR V FDEN FPF               ES         
Sbjct: 730  GYSLTQSAYLCLHLQTSRLYISRHVRFDENCFPFSNYLATLSPVQEQRRESSCVWSPHTT 789

Query: 785  --------PPNKTTNPHHPVLLP------------ALAKLASFYTENALTDIEPVVSNSH 844
                    P    ++PHH    P            + + L S ++ +  +  EP     +
Sbjct: 790  LPTRTPVLPAPSCSDPHHAATPPSSPSAPFRNSQVSSSNLDSSFSSSFPSSPEPTAPRQN 849

Query: 845  MNDGQTDNIASDNLSGVSLSSADNTRSSEEIAEYEAESSSINAQNQTHEHVSDQPTEAAS 904
                 T    +   +  S +++ N  ++E  ++  A+S S  AQ+ +    S  PT +AS
Sbjct: 850  GPQPTTQPTQTQTQTHSSQNTSQNNPTNESPSQL-AQSLSTPAQSSSS---SPSPTTSAS 909

Query: 905  QH--------------PMRTRLRNNIVQA-----KQFTDGTIRYSETSRKFASAVTITTP 964
                            P   ++ NN  QA        T       + + K++ AV++   
Sbjct: 910  SSSTSPTPPSILIHPPPPLAQIVNNNNQAPLNTHSMGTRAKAGIIKPNPKYSLAVSLA-- 969

Query: 965  IIETATEPRNLQEAMQHPRWREAMNDELSALKRNATWDLVPPKPG-INLIDSKWVYKVKR 1024
                 +EPR   +A++  RWR AM  E++A   N TWDLVPP P  + ++  +W++  K 
Sbjct: 970  ---AESEPRTAIQALKDERWRNAMGSEINAQIGNHTWDLVPPPPSHVTIVGCRWIFTKKY 1029

Query: 1025 KADGSVERLKARLVAKGFKQRFGVDYTDTFSPVIKPSTIRVILSLAVTKGWNMRQVDIQN 1084
             +DGS+ R KARLVAKG+ QR G+DY +TFSPVIK ++IR++L +AV + W +RQ+D+ N
Sbjct: 1030 NSDGSLNRYKARLVAKGYNQRPGLDYAETFSPVIKSTSIRIVLGVAVDRSWPIRQLDVNN 1089

Query: 1085 AFLHGILKEEVYMRQPPGFQDSAKPKNYICKLKKALYGLKQAPKAWHSRLTGKLIELGFK 1144
            AFL G L ++VYM QPPGF D  +P NY+CKL+KALYGLKQAP+AW+  L   L+ +GF 
Sbjct: 1090 AFLQGTLTDDVYMSQPPGFIDKDRP-NYVCKLRKALYGLKQAPRAWYVELRNYLLTIGFV 1149

Query: 1145 ASVADSSLFILKNREITIYMLIYVDDIIIVSSSDQATERLIQKLKIDFAVKDLGDLEYFL 1204
             SV+D+SLF+L+  +  +YML+YVDDI+I  +        +  L   F+VKD  +L YFL
Sbjct: 1150 NSVSDTSLFVLQRGKSIVYMLVYVDDILITGNDPTLLHNTLDNLSQRFSVKDHEELHYFL 1209

Query: 1205 GIEVKKTRDGIILSQRRYALDLLKRVNMEKCKPMSTPMGSAEKLFREQGIPLSAEEQFKY 1264
            GIE K+   G+ LSQRRY LDLL R NM   KP++TPM  + KL    G  L+  +  +Y
Sbjct: 1210 GIEAKRVPTGLHLSQRRYILDLLARTNMITAKPVTTPMAPSPKLSLYSGTKLT--DPTEY 1269

Query: 1265 RSTVGALQYLTMTRPDLAFAVNKVCQYLHTPTDAHWGAVKRILRYVKGTLALGVKIQK-S 1324
            R  VG+LQYL  TRPD+++AVN++ Q++H PT+ H  A+KRILRY+ GT   G+ ++K +
Sbjct: 1270 RGIVGSLQYLAFTRPDISYAVNRLSQFMHMPTEEHLQALKRILRYLAGTPNHGIFLKKGN 1329

Query: 1325 TMMLSGFSDADWAGCPDDRRSTSGFAVFLGANLISWSSRKQATVSRSSTEAEYKAIANLT 1384
            T+ L  +SDADWAG  DD  ST+G+ V+LG + ISWSS+KQ  V RSSTEAEY+++AN +
Sbjct: 1330 TLSLHAYSDADWAGDKDDYVSTNGYIVYLGHHPISWSSKKQKGVVRSSTEAEYRSVANTS 1389

Query: 1385 AEMIWIKSLLKELGVYQSKAPRLWCDNLGATYLTSNPVFHARTKHIEVDFHFVREQVARK 1393
            +EM WI SLL ELG+  ++ P ++CDN+GATYL +NPVFH+R KHI +D+HF+R QV   
Sbjct: 1390 SEMQWICSLLTELGIRLTRPPVIYCDNVGATYLCANPVFHSRMKHIAIDYHFIRNQVQSG 1449

BLAST of CmaCh16G006480.1 vs. ExPASy Swiss-Prot
Match: Q9ZT94 (Retrovirus-related Pol polyprotein from transposon RE2 OS=Arabidopsis thaliana OX=3702 GN=RE2 PE=4 SV=1)

HSP 1 Score: 960.3 bits (2481), Expect = 2.4e-278
Identity = 565/1462 (38.65%), Positives = 821/1462 (56.16%), Query Frame = 0

Query: 20   KLTQENYLLWSTQILPYLRSQNLVGFVDGSMPAPSQTIAVEPSEEIGNRKIIINPEFTVW 79
            KLT  NYL+WS Q+        L GF+DGS P P  TI  +           +NP++T W
Sbjct: 25   KLTSTNYLMWSRQVHALFDGYELAGFLDGSTPMPPATIGTDAVPR-------VNPDYTRW 84

Query: 80   YPQDQLVLSLINSSVTEEVLSTMVGITTAREAWITLERQFASTSRARAMQIRMELSTIQK 139
              QD+L+ S I  +++  V   +   TTA + W TL + +A+ S     Q+R        
Sbjct: 85   RRQDKLIYSAILGAISMSVQPAVSRATTAAQIWETLRKIYANPSYGHVTQLRF------- 144

Query: 140  KDMTIADYFRKVKHLGDTLAAIGKRIEDEELIAYMLQGLGPDYDPLVTSITTRTDVYTVS 199
                I  +        D LA +GK ++ +E +  +L+ L  DY P++  I  +    +++
Sbjct: 145  ----ITRF--------DQLALLGKPMDHDEQVERVLENLPDDYKPVIDQIAAKDTPPSLT 204

Query: 200  DVYAHMLSYEMRHLRKGTFEQLSSANNVNRISIRG-GANGGRGSRGRSRQLNSGHGQSRR 259
            +++  +++ E + L   + E +    NV  ++ R    N  + +RG +R  N+ + +S  
Sbjct: 205  EIHERLINRESKLLALNSAEVVPITANV--VTHRNTNTNRNQNNRGDNRNYNNNNNRSNS 264

Query: 260  TVNNPGRQPSKTQSSSGIV--CQICGKPNHDALQC--WHRFDQAYQAENNLK-------Q 319
               +     S  +     +  CQIC    H A +C   H+F      + +         +
Sbjct: 265  WQPSSSGSRSDNRQPKPYLGRCQICSVQGHSAKRCPQLHQFQSTTNQQQSTSPFTPWQPR 324

Query: 320  AALATSGYTSDTNWYVDTGATDHITNDLERLTTRERYTGTDQIQVANGAGLSISHIGNSL 379
            A LA +   +  NW +D+GAT HIT+D   L+  + YTG D + +A+G+ + I+H G++ 
Sbjct: 325  ANLAVNSPYNANNWLLDSGATHHITSDFNNLSFHQPYTGGDDVMIADGSTIPITHTGSAS 384

Query: 380  I--SGSSLVLKHILYAPKINKHLISVQRLASDNNAVVEFHPNYFLVKDRVTKKLLLHGRC 439
            +  S  SL L  +LY P I+K+LISV RL + N   VEF P  F VKD  T   LL G+ 
Sbjct: 385  LPTSSRSLDLNKVLYVPNIHKNLISVYRLCNTNRVSVEFFPASFQVKDLNTGVPLLQGKT 444

Query: 440  KNGLYVLPHNFSQALL-----TAKLSKEQWHRRLGHPASPITIRILQDNNLAIDTNIPSS 499
            K+ LY  P   SQA+       +K +   WH RLGHP+  I   ++ +++L +       
Sbjct: 445  KDELYEWPIASSQAVSMFASPCSKATHSSWHSRLGHPSLAILNSVISNHSLPVLNPSHKL 504

Query: 500  SICNACQLGKAHQLPFGSSQHVSTAPLQLIHTDVWGPSIASVNNSKYYVSFVDDFSRYVW 559
              C+ C + K+H++PF +S   S+ PL+ I++DVW   I S++N +YYV FVD F+RY W
Sbjct: 505  LSCSDCFINKSHKVPFSNSTITSSKPLEYIYSDVWSSPILSIDNYRYYVIFVDHFTRYTW 564

Query: 560  IYFLRCKSDVESVFLQFQKHVETMLNTKIRSVQSDWGGEYHRLHNYFKSTGIEHHISCPH 619
            +Y L+ KS V+  F+ F+  VE    T+I ++ SD GGE+  L +Y    GI H  S PH
Sbjct: 565  LYPLKQKSQVKDTFIIFKSLVENRFQTRIGTLYSDNGGEFVVLRDYLSQHGISHFTSPPH 624

Query: 620  THQQNGLVERKHRHIVETGLALLAQANMPLSYWDEAFNTACFLINRMPSRTIQQDTPLHK 679
            T + NGL ERKHRHIVE GL LL+ A++P +YW  AF+ A +LINR+P+  +Q  +P  K
Sbjct: 625  TPEHNGLSERKHRHIVEMGLTLLSHASVPKTYWPYAFSVAVYLINRLPTPLLQLQSPFQK 684

Query: 680  LFGKSPDYSMLRVFGCACWPNLRPYNNKKLSFRTTRCIFLGYSSSHKGYKCLNRSTGRIY 739
            LFG+ P+Y  L+VFGCAC+P LRPYN  KL  ++ +C F+GYS +   Y CL+  TGR+Y
Sbjct: 685  LFGQPPNYEKLKVFGCACYPWLRPYNRHKLEDKSKQCAFMGYSLTQSAYLCLHIPTGRLY 744

Query: 740  ISRDVVFDENIFPF--------------EESKP--PNKTTNPHHPVLLPALAKLASFYTE 799
             SR V FDE  FPF               +S P  P+ TT P  P++LPA   L      
Sbjct: 745  TSRHVQFDERCFPFSTTNFGVSTSQEQRSDSAPNWPSHTTLPTTPLVLPAPPCLGPHLD- 804

Query: 800  NALTDIEPVVSNSHMNDGQTDNIASDNLSGVSLSSADNTRSSEEIAEYEAESSSINAQNQ 859
               T   P  S S +    T  ++S NL   S+SS     SSE  A            +Q
Sbjct: 805  ---TSPRPPSSPSPL---CTTQVSSSNLPSSSISSPS---SSEPTAPSHNGPQPTAQPHQ 864

Query: 860  THEHVSDQP--------TEAASQHPMRTRLRNNIVQAKQFTDGTIRYSETSRKFASA--- 919
            T    S+ P        + + +     + L  + + +      +   SE +   +S+   
Sbjct: 865  TQNSNSNSPILNNPNPNSPSPNSPNQNSPLPQSPISSPHIPTPSTSISEPNSPSSSSTST 924

Query: 920  -----VTITTPIIE----------------------------------TATEPRNLQEAM 979
                 V    PII+                                    +EPR   +AM
Sbjct: 925  PPLPPVLPAPPIIQVNAQAPVNTHSMATRAKDGIRKPNQKYSYATSLAANSEPRTAIQAM 984

Query: 980  QHPRWREAMNDELSALKRNATWDLV-PPKPGINLIDSKWVYKVKRKADGSVERLKARLVA 1039
            +  RWR+AM  E++A   N TWDLV PP P + ++  +W++  K  +DGS+ R KARLVA
Sbjct: 985  KDDRWRQAMGSEINAQIGNHTWDLVPPPPPSVTIVGCRWIFTKKFNSDGSLNRYKARLVA 1044

Query: 1040 KGFKQRFGVDYTDTFSPVIKPSTIRVILSLAVTKGWNMRQVDIQNAFLHGILKEEVYMRQ 1099
            KG+ QR G+DY +TFSPVIK ++IR++L +AV + W +RQ+D+ NAFL G L +EVYM Q
Sbjct: 1045 KGYNQRPGLDYAETFSPVIKSTSIRIVLGVAVDRSWPIRQLDVNNAFLQGTLTDEVYMSQ 1104

Query: 1100 PPGFQDSAKPKNYICKLKKALYGLKQAPKAWHSRLTGKLIELGFKASVADSSLFILKNRE 1159
            PPGF D  +P +Y+C+L+KA+YGLKQAP+AW+  L   L+ +GF  S++D+SLF+L+   
Sbjct: 1105 PPGFVDKDRP-DYVCRLRKAIYGLKQAPRAWYVELRTYLLTVGFVNSISDTSLFVLQRGR 1164

Query: 1160 ITIYMLIYVDDIIIVSSSDQATERLIQKLKIDFAVKDLGDLEYFLGIEVKKTRDGIILSQ 1219
              IYML+YVDDI+I  +     +  +  L   F+VK+  DL YFLGIE K+   G+ LSQ
Sbjct: 1165 SIIYMLVYVDDILITGNDTVLLKHTLDALSQRFSVKEHEDLHYFLGIEAKRVPQGLHLSQ 1224

Query: 1220 RRYALDLLKRVNMEKCKPMSTPMGSAEKLFREQGIPLSAEEQFKYRSTVGALQYLTMTRP 1279
            RRY LDLL R NM   KP++TPM ++ KL    G  L   +  +YR  VG+LQYL  TRP
Sbjct: 1225 RRYTLDLLARTNMLTAKPVATPMATSPKLTLHSGTKL--PDPTEYRGIVGSLQYLAFTRP 1284

Query: 1280 DLAFAVNKVCQYLHTPTDAHWGAVKRILRYVKGTLALGVKIQK-STMMLSGFSDADWAGC 1339
            DL++AVN++ QY+H PTD HW A+KR+LRY+ GT   G+ ++K +T+ L  +SDADWAG 
Sbjct: 1285 DLSYAVNRLSQYMHMPTDDHWNALKRVLRYLAGTPDHGIFLKKGNTLSLHAYSDADWAGD 1344

Query: 1340 PDDRRSTSGFAVFLGANLISWSSRKQATVSRSSTEAEYKAIANLTAEMIWIKSLLKELGV 1395
             DD  ST+G+ V+LG + ISWSS+KQ  V RSSTEAEY+++AN ++E+ WI SLL ELG+
Sbjct: 1345 TDDYVSTNGYIVYLGHHPISWSSKKQKGVVRSSTEAEYRSVANTSSELQWICSLLTELGI 1404

BLAST of CmaCh16G006480.1 vs. ExPASy Swiss-Prot
Match: P10978 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum OX=4097 PE=2 SV=1)

HSP 1 Score: 587.8 bits (1514), Expect = 3.2e-166
Identity = 420/1333 (31.51%), Positives = 667/1333 (50.04%), Query Frame = 0

Query: 79   WYPQDQLVLSLINSSVTEEVLSTMVGITTAREAWITLERQFASTSRARAMQIRMELSTIQ 138
            W   D+   S I   ++++V++ ++   TAR  W  LE  + S +    + ++ +L  + 
Sbjct: 52   WADLDERAASAIRLHLSDDVVNNIIDEDTARGIWTRLESLYMSKTLTNKLYLKKQLYALH 111

Query: 139  KKDMT-IADYFRKVKHLGDTLAAIGKRIEDEELIAYMLQGLGPDYDPLVTSITTRTDVYT 198
              + T    +      L   LA +G +IE+E+    +L  L   YD L T+I        
Sbjct: 112  MSEGTNFLSHLNVFNGLITQLANLGVKIEEEDKAILLLNSLPSSYDNLATTILHGKTTIE 171

Query: 199  VSDVYAHMLSYEMRHLRKGTFEQLSSANNVNRISIRGGANGGR-GSRGRSRQLNSGHGQS 258
            + DV + +L  E    +     Q        R   R   N GR G+RG+S+  +    ++
Sbjct: 172  LKDVTSALLLNEKMRKKPENQGQALITEGRGRSYQRSSNNYGRSGARGKSKNRSKSRVRN 231

Query: 259  RRTVNNPGRQPSKTQSSSGIVCQICGKPNHDALQCWHRFDQAYQAENNLKQAALATSGYT 318
                N PG       +      +  G+ N D      + +       N ++  +  SG  
Sbjct: 232  CYNCNQPGHFKRDCPNPRKGKGETSGQKNDDNTAAMVQNNDNVVLFINEEEECMHLSG-- 291

Query: 319  SDTNWYVDTGATDHITNDLERLTTRERYTGTD--QIQVANGAGLSISHIGNSLIS---GS 378
             ++ W VDT A+ H T   +      RY   D   +++ N +   I+ IG+  I    G 
Sbjct: 292  PESEWVVDTAASHHATPVRDLFC---RYVAGDFGTVKMGNTSYSKIAGIGDICIKTNVGC 351

Query: 379  SLVLKHILYAPKINKHLISVQRLASDNNAVVEFHPNYFLVKDRVTKKLLL--HGRCKNGL 438
            +LVLK + + P +  +LIS   +A D +    +  N    K R+TK  L+   G  +  L
Sbjct: 352  TLVLKDVRHVPDLRMNLIS--GIALDRDGYESYFANQ---KWRLTKGSLVIAKGVARGTL 411

Query: 439  YVLPHNFSQALLTA---KLSKEQWHRRLGHPASPITIRILQDNNLAIDTNIPSSSICNAC 498
            Y       Q  L A   ++S + WH+R+GH  S   ++IL   +L       +   C+ C
Sbjct: 412  YRTNAEICQGELNAAQDEISVDLWHKRMGH-MSEKGLQILAKKSLISYAKGTTVKPCDYC 471

Query: 499  QLGKAHQLPFGSSQHVSTAPLQLIHTDVWGP-SIASVNNSKYYVSFVDDFSRYVWIYFLR 558
              GK H++ F +S       L L+++DV GP  I S+  +KY+V+F+DD SR +W+Y L+
Sbjct: 472  LFGKQHRVSFQTSSERKLNILDLVYSDVCGPMEIESMGGNKYFVTFIDDASRKLWVYILK 531

Query: 559  CKSDVESVFLQFQKHVETMLNTKIRSVQSDWGGEY--HRLHNYFKSTGIEHHISCPHTHQ 618
             K  V  VF +F   VE     K++ ++SD GGEY       Y  S GI H  + P T Q
Sbjct: 532  TKDQVFQVFQKFHALVERETGRKLKRLRSDNGGEYTSREFEEYCSSHGIRHEKTVPGTPQ 591

Query: 619  QNGLVERKHRHIVETGLALLAQANMPLSYWDEAFNTACFLINRMPSRTIQQDTPLHKLFG 678
             NG+ ER +R IVE   ++L  A +P S+W EA  TAC+LINR PS  +  + P      
Sbjct: 592  HNGVAERMNRTIVEKVRSMLRMAKLPKSFWGEAVQTACYLINRSPSVPLAFEIPERVWTN 651

Query: 679  KSPDYSMLRVFGCACWPNLRPYNNKKLSFRTTRCIFLGYSSSHKGYKCLNRSTGRIYISR 738
            K   YS L+VFGC  + ++      KL  ++  CIF+GY     GY+  +    ++  SR
Sbjct: 652  KEVSYSHLKVFGCRAFAHVPKEQRTKLDDKSIPCIFIGYGDEEFGYRLWDPVKKKVIRSR 711

Query: 739  DVVFDENIFPFEESKPPNKTTNPHHPVLLPALAKLASFYTENALTDIEPVVSNSHMNDGQ 798
            DVVF E+                                      D+   V N  + +  
Sbjct: 712  DVVFRES--------------------------------EVRTAADMSEKVKNGIIPNFV 771

Query: 799  TDNIASDNLSGVSLSSADNTRSSEEIAEYEAESSSINAQNQTHEHVSDQPTEAASQH-PM 858
            T    S+N +    ++ + +   E+  E   +   ++   +  EH    PT+   QH P+
Sbjct: 772  TIPSTSNNPTSAESTTDEVSEQGEQPGEVIEQGEQLDEGVEEVEH----PTQGEEQHQPL 831

Query: 859  RTRLRNNIVQAKQFTDGTIRYSETSRKFASAVTITTPIIETATEPRNLQEAMQHP---RW 918
            R   R  +                SR++ S   +   +I    EP +L+E + HP   + 
Sbjct: 832  RRSERPRV---------------ESRRYPSTEYV---LISDDREPESLKEVLSHPEKNQL 891

Query: 919  REAMNDELSALKRNATWDLVPPKPGINLIDSKWVYKVKRKADGSVERLKARLVAKGFKQR 978
             +AM +E+ +L++N T+ LV    G   +  KWV+K+K+  D  + R KARLV KGF+Q+
Sbjct: 892  MKAMQEEMESLQKNGTYKLVELPKGKRPLKCKWVFKLKKDGDCKLVRYKARLVVKGFEQK 951

Query: 979  FGVDYTDTFSPVIKPSTIRVILSLAVTKGWNMRQVDIQNAFLHGILKEEVYMRQPPGFQD 1038
             G+D+ + FSPV+K ++IR ILSLA +    + Q+D++ AFLHG L+EE+YM QP GF+ 
Sbjct: 952  KGIDFDEIFSPVVKMTSIRTILSLAASLDLEVEQLDVKTAFLHGDLEEEIYMEQPEGFEV 1011

Query: 1039 SAKPKNYICKLKKALYGLKQAPKAWHSRLTGKLIELGFKASVADSSLFILKNREIT-IYM 1098
            + K K+ +CKL K+LYGLKQAP+ W+ +    +    +  + +D  ++  +  E   I +
Sbjct: 1012 AGK-KHMVCKLNKSLYGLKQAPRQWYMKFDSFMKSQTYLKTYSDPCVYFKRFSENNFIIL 1071

Query: 1099 LIYVDDIIIVSSSDQATERLIQKLKIDFAVKDLGDLEYFLGIEV--KKTRDGIILSQRRY 1158
            L+YVDD++IV        +L   L   F +KDLG  +  LG+++  ++T   + LSQ +Y
Sbjct: 1072 LLYVDDMLIVGKDKGLIAKLKGDLSKSFDMKDLGPAQQILGMKIVRERTSRKLWLSQEKY 1131

Query: 1159 ALDLLKRVNMEKCKPMSTPMGSAEKLFREQGIPLSAEE-----QFKYRSTVGALQY-LTM 1218
               +L+R NM+  KP+STP+    KL ++   P + EE     +  Y S VG+L Y +  
Sbjct: 1132 IERVLERFNMKNAKPVSTPLAGHLKLSKKM-CPTTVEEKGNMAKVPYSSAVGSLMYAMVC 1191

Query: 1219 TRPDLAFAVNKVCQYLHTPTDAHWGAVKRILRYVKGTLALGVKIQKSTMMLSGFSDADWA 1278
            TRPD+A AV  V ++L  P   HW AVK ILRY++GT    +    S  +L G++DAD A
Sbjct: 1192 TRPDIAHAVGVVSRFLENPGKEHWEAVKWILRYLRGTTGDCLCFGGSDPILKGYTDADMA 1251

Query: 1279 GCPDDRRSTSGFAVFLGANLISWSSRKQATVSRSSTEAEYKAIANLTAEMIWIKSLLKEL 1338
            G  D+R+S++G+        ISW S+ Q  V+ S+TEAEY A      EMIW+K  L+EL
Sbjct: 1252 GDIDNRKSSTGYLFTFSGGAISWQSKLQKCVALSTTEAEYIAATETGKEMIWLKRFLQEL 1311

Query: 1339 GVYQSKAPRLWCDNLGATYLTSNPVFHARTKHIEVDFHFVREQVARKAMEVRFISSSDQV 1384
            G++Q K   ++CD+  A  L+ N ++HARTKHI+V +H++RE V  ++++V  IS+++  
Sbjct: 1312 GLHQ-KEYVVYCDSQSAIDLSKNSMYHARTKHIDVRYHWIREMVDDESLKVLKISTNENP 1316

BLAST of CmaCh16G006480.1 vs. ExPASy Swiss-Prot
Match: P04146 (Copia protein OS=Drosophila melanogaster OX=7227 GN=GIP PE=1 SV=3)

HSP 1 Score: 503.4 bits (1295), Expect = 8.0e-141
Identity = 417/1459 (28.58%), Positives = 679/1459 (46.54%), Query Frame = 0

Query: 24   ENYLLWSTQILPYLRSQNLVGFVDGSMPAPSQTIAVEPSEEIGNRKIIINPEFTVWYPQD 83
            E Y +W  +I   L  Q+++  VDG MP                     N     W   +
Sbjct: 14   EKYAIWKFRIRALLAEQDVLKVVDGLMP---------------------NEVDDSWKKAE 73

Query: 84   QLVLSLINSSVTEEVLSTMVGITTAREAWITLERQFASTSRARAMQIRMELSTIQ-KKDM 143
            +   S I   +++  L+      TAR+    L+  +   S A  + +R  L +++   +M
Sbjct: 74   RCAKSTIIEYLSDSFLNFATSDITARQILENLDAVYERKSLASQLALRKRLLSLKLSSEM 133

Query: 144  TIADYFRKVKHLGDTLAAIGKRIEDEELIAYMLQGLGPDYDPLVTSITTRTDV-YTVSDV 203
            ++  +F     L   L A G +IE+ + I+++L  L   YD ++T+I T ++   T++ V
Sbjct: 134  SLLSHFHIFDELISELLAAGAKIEEMDKISHLLITLPSCYDGIITAIETLSEENLTLAFV 193

Query: 204  YAHMLSYEMRHLRKGTFEQLSSANNVNRISIRGGANGGRGSRGRSRQLNSGHGQSRRTVN 263
               +L  E+    K   +   ++  V    +    N  + +  ++R            V 
Sbjct: 194  KNRLLDQEI----KIKNDHNDTSKKVMNAIVHNNNNTYKNNLFKNR------------VT 253

Query: 264  NPGRQPSKTQSSSGIVCQICGKPNHDALQCWHRFDQAYQAEN--NLKQAALATS------ 323
             P ++  K  S   + C  CG+  H    C+H + +    +N  N KQ   ATS      
Sbjct: 254  KP-KKIFKGNSKYKVKCHHCGREGHIKKDCFH-YKRILNNKNKENEKQVQTATSHGIAFM 313

Query: 324  -------GYTSDTNWYVDTGATDHITNDLERLTTRERYTGTDQIQVANGAGLSISHIGN- 383
                       +  + +D+GA+DH+ ND E L        TD ++V     ++++  G  
Sbjct: 314  VKEVNNTSVMDNCGFVLDSGASDHLIND-ESLY-------TDSVEVVPPLKIAVAKQGEF 373

Query: 384  ---------SLISGSSLVLKHILYAPKINKHLISVQRLASDNNAVVEFHPNYFLVKDRVT 443
                      L +   + L+ +L+  +   +L+SV+RL  +    +EF       K  VT
Sbjct: 374  IYATKRGIVRLRNDHEITLEDVLFCKEAAGNLMSVKRL-QEAGMSIEFD------KSGVT 433

Query: 444  KKLLLHGRCKNGLYVLPH----------NFSQALLTAKLSK--EQWHRRLGHPASPITIR 503
                     KNGL V+ +          NF    + AK       WH R GH +    + 
Sbjct: 434  IS-------KNGLMVVKNSGMLNNVPVINFQAYSINAKHKNNFRLWHERFGHISDGKLLE 493

Query: 504  ILQDN---NLAIDTNIP-SSSICNACQLGKAHQLPF---GSSQHVSTAPLQLIHTDVWGP 563
            I + N   + ++  N+  S  IC  C  GK  +LPF       H+   PL ++H+DV GP
Sbjct: 494  IKRKNMFSDQSLLNNLELSCEICEPCLNGKQARLPFKQLKDKTHIK-RPLFVVHSDVCGP 553

Query: 564  -SIASVNNSKYYVSFVDDFSRYVWIYFLRCKSDVESVFLQFQKHVETMLNTKIRSVQSDW 623
             +  ++++  Y+V FVD F+ Y   Y ++ KSDV S+F  F    E   N K+  +  D 
Sbjct: 554  ITPVTLDDKNYFVIFVDQFTHYCVTYLIKYKSDVFSMFQDFVAKSEAHFNLKVVYLYIDN 613

Query: 624  GGEY--HRLHNYFKSTGIEHHISCPHTHQQNGLVERKHRHIVETGLALLAQANMPLSYWD 683
            G EY  + +  +    GI +H++ PHT Q NG+ ER  R I E    +++ A +  S+W 
Sbjct: 614  GREYLSNEMRQFCVKKGISYHLTVPHTPQLNGVSERMIRTITEKARTMVSGAKLDKSFWG 673

Query: 684  EAFNTACFLINRMPSRTI--QQDTPLHKLFGKSPDYSMLRVFGCACWPNLRPYNNKKLSF 743
            EA  TA +LINR+PSR +     TP      K P    LRVFG   + +++   NK+  F
Sbjct: 674  EAVLTATYLINRIPSRALVDSSKTPYEMWHNKKPYLKHLRVFGATVYVHIK---NKQGKF 733

Query: 744  --RTTRCIFLGYSSSHKGYKCLNRSTGRIYISRDVVFD------------ENIFPFEESK 803
              ++ + IF+GY  +  G+K  +    +  ++RDVV D            E +F  +  +
Sbjct: 734  DDKSFKSIFVGYEPN--GFKLWDAVNEKFIVARDVVVDETNMVNSRAVKFETVFLKDSKE 793

Query: 804  PPNKT-TNPHHPVL-------------LPALAKLASFYTENALTDIEPVVSNSHMNDG-Q 863
              NK   N    ++             +  L        +N   D   ++     N+  +
Sbjct: 794  SENKNFPNDSRKIIQTEFPNESKECDNIQFLKDSKESENKNFPNDSRKIIQTEFPNESKE 853

Query: 864  TDNIA----SDNLSGVSLSSADNTRSSEEIAEYEAESS-SINAQNQTHEHVS----DQPT 923
             DNI     S   +   L+ +   +  + + E +   + + + +++T EH+     D PT
Sbjct: 854  CDNIQFLKDSKESNKYFLNESKKRKRDDHLNESKGSGNPNESRESETAEHLKEIGIDNPT 913

Query: 924  EAASQHPMRTRLRNNIVQAKQFTDGTIRYSETSRKFASAVTITTPII-ETATEPRNLQEA 983
            +      +  R        +  T   I Y+E        V     I  +       +Q  
Sbjct: 914  KNDGIEIINRR------SERLKTKPQISYNEEDNSLNKVVLNAHTIFNDVPNSFDEIQYR 973

Query: 984  MQHPRWREAMNDELSALKRNATWDLVPPKPGINLIDSKWVYKVKRKADGSVERLKARLVA 1043
                 W EA+N EL+A K N TW +       N++DS+WV+ VK    G+  R KARLVA
Sbjct: 974  DDKSSWEEAINTELNAHKINNTWTITKRPENKNIVDSRWVFSVKYNELGNPIRYKARLVA 1033

Query: 1044 KGFKQRFGVDYTDTFSPVIKPSTIRVILSLAVTKGWNMRQVDIQNAFLHGILKEEVYMRQ 1103
            +GF Q++ +DY +TF+PV + S+ R ILSL +     + Q+D++ AFL+G LKEE+YMR 
Sbjct: 1034 RGFTQKYQIDYEETFAPVARISSFRFILSLVIQYNLKVHQMDVKTAFLNGTLKEEIYMRL 1093

Query: 1104 PPGFQDSAKPKNYICKLKKALYGLKQAPKAWHSRLTGKLIELGFKASVADSSLFILKNRE 1163
            P G   ++   + +CKL KA+YGLKQA + W       L E  F  S  D  ++IL    
Sbjct: 1094 PQGISCNS---DNVCKLNKAIYGLKQAARCWFEVFEQALKECEFVNSSVDRCIYILDKGN 1153

Query: 1164 I--TIYMLIYVDDIIIVSSSDQATERLIQKLKIDFAVKDLGDLEYFLGIEVKKTRDGIIL 1223
            I   IY+L+YVDD++I +          + L   F + DL ++++F+GI ++   D I L
Sbjct: 1154 INENIYVLLYVDDVVIATGDMTRMNNFKRYLMEKFRMTDLNEIKHFIGIRIEMQEDKIYL 1213

Query: 1224 SQRRYALDLLKRVNMEKCKPMSTPMGSAEKLFREQGIPLSAEEQFK--YRSTVGALQYLT 1283
            SQ  Y   +L + NME C  +STP+ S  K+  E    L+++E      RS +G L Y+ 
Sbjct: 1214 SQSAYVKKILSKFNMENCNAVSTPLPS--KINYEL---LNSDEDCNTPCRSLIGCLMYIM 1273

Query: 1284 M-TRPDLAFAVNKVCQYLHTPTDAHWGAVKRILRYVKGTLALGVKIQKSTMM---LSGFS 1343
            + TRPDL  AVN + +Y        W  +KR+LRY+KGT+ + +  +K+      + G+ 
Sbjct: 1274 LCTRPDLTTAVNILSRYSSKNNSELWQNLKRVLRYLKGTIDMKLIFKKNLAFENKIIGYV 1333

Query: 1344 DADWAGCPDDRRSTSGFAV-FLGANLISWSSRKQATVSRSSTEAEYKAIANLTAEMIWIK 1384
            D+DWAG   DR+ST+G+       NLI W++++Q +V+ SSTEAEY A+     E +W+K
Sbjct: 1334 DSDWAGSEIDRKSTTGYLFKMFDFNLICWNTKRQNSVAASSTEAEYMALFEAVREALWLK 1391

BLAST of CmaCh16G006480.1 vs. ExPASy Swiss-Prot
Match: P92519 (Uncharacterized mitochondrial protein AtMg00810 OS=Arabidopsis thaliana OX=3702 GN=AtMg00810 PE=4 SV=1)

HSP 1 Score: 218.4 bits (555), Expect = 5.1e-55
Identity = 112/228 (49.12%), Positives = 151/228 (66.23%), Query Frame = 0

Query: 1076 IYMLIYVDDIIIVSSSDQATERLIQKLKIDFAVKDLGDLEYFLGIEVKKTRDGIILSQRR 1135
            +Y+L+YVDDI++  SS+     LI +L   F++KDLG + YFLGI++K    G+ LSQ +
Sbjct: 1    MYLLLYVDDILLTGSSNTLLNMLIFQLSSTFSMKDLGPVHYFLGIQIKTHPSGLFLSQTK 60

Query: 1136 YALDLLKRVNMEKCKPMSTPMGSAEKLFREQGIPLSAEEQFKYRSTVGALQYLTMTRPDL 1195
            YA  +L    M  CKPMSTP+                 +   +RS VGALQYLT+TRPD+
Sbjct: 61   YAEQILNNAGMLDCKPMSTPL---PLKLNSSVSTAKYPDPSDFRSIVGALQYLTLTRPDI 120

Query: 1196 AFAVNKVCQYLHTPTDAHWGAVKRILRYVKGTLALGVKIQK-STMMLSGFSDADWAGCPD 1255
            ++AVN VCQ +H PT A +  +KR+LRYVKGT+  G+ I K S + +  F D+DWAGC  
Sbjct: 121  SYAVNIVCQRMHEPTLADFDLLKRVLRYVKGTIFHGLYIHKNSKLNVQAFCDSDWAGCTS 180

Query: 1256 DRRSTSGFAVFLGANLISWSSRKQATVSRSSTEAEYKAIANLTAEMIW 1303
             RRST+GF  FLG N+ISWS+++Q TVSRSSTE EY+A+A   AE+ W
Sbjct: 181  TRRSTTGFCTFLGCNIISWSAKRQPTVSRSSTETEYRALALTAAELTW 225

BLAST of CmaCh16G006480.1 vs. TAIR 10
Match: AT4G23160.1 (cysteine-rich RLK (RECEPTOR-like protein kinase) 8 )

HSP 1 Score: 404.8 bits (1039), Expect = 2.8e-112
Identity = 211/482 (43.78%), Positives = 304/482 (63.07%), Query Frame = 0

Query: 881  IETATEPRNLQEAMQHPRWREAMNDELSALKRNATWDLVPPKPGINLIDSKWVYKVKRKA 940
            I  A EP    EA +   W  AM+DE+ A++   TW++    P    I  KWVYK+K  +
Sbjct: 80   IAKAKEPSTYNEAKEFLVWCGAMDDEIGAMETTHTWEICTLPPNKKPIGCKWVYKIKYNS 139

Query: 941  DGSVERLKARLVAKGFKQRFGVDYTDTFSPVIKPSTIRVILSLAVTKGWNMRQVDIQNAF 1000
            DG++ER KARLVAKG+ Q+ G+D+ +TFSPV K +++++IL+++    + + Q+DI NAF
Sbjct: 140  DGTIERYKARLVAKGYTQQEGIDFIETFSPVCKLTSVKLILAISAIYNFTLHQLDISNAF 199

Query: 1001 LHGILKEEVYMRQPPGF---QDSAKPKNYICKLKKALYGLKQAPKAWHSRLTGKLIELGF 1060
            L+G L EE+YM+ PPG+   Q  + P N +C LKK++YGLKQA + W  + +  LI  GF
Sbjct: 200  LNGDLDEEIYMKLPPGYAARQGDSLPPNAVCYLKKSIYGLKQASRQWFLKFSVTLIGFGF 259

Query: 1061 KASVADSSLFILKNREITIYMLIYVDDIIIVSSSDQATERLIQKLKIDFAVKDLGDLEYF 1120
              S +D + F+     + + +L+YVDDIII S++D A + L  +LK  F ++DLG L+YF
Sbjct: 260  VQSHSDHTYFLKITATLFLCVLVYVDDIIICSNNDAAVDELKSQLKSCFKLRDLGPLKYF 319

Query: 1121 LGIEVKKTRDGIILSQRRYALDLLKRVNMEKCKPMSTPMGSAEKLFREQGIPLSAEEQFK 1180
            LG+E+ ++  GI + QR+YALDLL    +  CKP S PM  +       G      +   
Sbjct: 320  LGLEIARSAAGINICQRKYALDLLDETGLLGCKPSSVPMDPSVTFSAHSGGDF--VDAKA 379

Query: 1181 YRSTVGALQYLTMTRPDLAFAVNKVCQYLHTPTDAHWGAVKRILRYVKGTLALGV-KIQK 1240
            YR  +G L YL +TR D++FAVNK+ Q+   P  AH  AV +IL Y+KGT+  G+    +
Sbjct: 380  YRRLIGRLMYLQITRLDISFAVNKLSQFSEAPRLAHQQAVMKILHYIKGTVGQGLFYSSQ 439

Query: 1241 STMMLSGFSDADWAGCPDDRRSTSGFAVFLGANLISWSSRKQATVSRSSTEAEYKAIANL 1300
            + M L  FSDA +  C D RRST+G+ +FLG +LISW S+KQ  VS+SS EAEY+A++  
Sbjct: 440  AEMQLQVFSDASFQSCKDTRRSTNGYCMFLGTSLISWKSKKQQVVSKSSAEAEYRALSFA 499

Query: 1301 TAEMIWIKSLLKELGVYQSKAPRLWCDNLGATYLTSNPVFHARTKHIEVDFHFVREQVAR 1359
            T EM+W+    +EL +  SK   L+CDN  A ++ +N VFH RTKHIE D H VRE+   
Sbjct: 500  TDEMMWLAQFFRELQLPLSKPTLLFCDNTAAIHIATNAVFHERTKHIESDCHSVRERSVY 559

BLAST of CmaCh16G006480.1 vs. TAIR 10
Match: ATMG00810.1 (DNA/RNA polymerases superfamily protein )

HSP 1 Score: 218.4 bits (555), Expect = 3.7e-56
Identity = 112/228 (49.12%), Positives = 151/228 (66.23%), Query Frame = 0

Query: 1076 IYMLIYVDDIIIVSSSDQATERLIQKLKIDFAVKDLGDLEYFLGIEVKKTRDGIILSQRR 1135
            +Y+L+YVDDI++  SS+     LI +L   F++KDLG + YFLGI++K    G+ LSQ +
Sbjct: 1    MYLLLYVDDILLTGSSNTLLNMLIFQLSSTFSMKDLGPVHYFLGIQIKTHPSGLFLSQTK 60

Query: 1136 YALDLLKRVNMEKCKPMSTPMGSAEKLFREQGIPLSAEEQFKYRSTVGALQYLTMTRPDL 1195
            YA  +L    M  CKPMSTP+                 +   +RS VGALQYLT+TRPD+
Sbjct: 61   YAEQILNNAGMLDCKPMSTPL---PLKLNSSVSTAKYPDPSDFRSIVGALQYLTLTRPDI 120

Query: 1196 AFAVNKVCQYLHTPTDAHWGAVKRILRYVKGTLALGVKIQK-STMMLSGFSDADWAGCPD 1255
            ++AVN VCQ +H PT A +  +KR+LRYVKGT+  G+ I K S + +  F D+DWAGC  
Sbjct: 121  SYAVNIVCQRMHEPTLADFDLLKRVLRYVKGTIFHGLYIHKNSKLNVQAFCDSDWAGCTS 180

Query: 1256 DRRSTSGFAVFLGANLISWSSRKQATVSRSSTEAEYKAIANLTAEMIW 1303
             RRST+GF  FLG N+ISWS+++Q TVSRSSTE EY+A+A   AE+ W
Sbjct: 181  TRRSTTGFCTFLGCNIISWSAKRQPTVSRSSTETEYRALALTAAELTW 225

BLAST of CmaCh16G006480.1 vs. TAIR 10
Match: ATMG00820.1 (Reverse transcriptase (RNA-dependent DNA polymerase) )

HSP 1 Score: 112.5 bits (280), Expect = 2.8e-24
Identity = 55/112 (49.11%), Positives = 80/112 (71.43%), Query Frame = 0

Query: 873 AVTITTPIIETATEPRNLQEAMQHPRWREAMNDELSALKRNATWDLVPPKPGINLIDSKW 932
           ++TITT I     EP+++  A++ P W +AM +EL AL RN TW LVPP    N++  KW
Sbjct: 17  SLTITTTI---KKEPKSVIFALKDPGWCQAMQEELDALSRNKTWILVPPPVNQNILGCKW 76

Query: 933 VYKVKRKADGSVERLKARLVAKGFKQRFGVDYTDTFSPVIKPSTIRVILSLA 985
           V+K K  +DG+++RLKARLVAKGF Q  G+ + +T+SPV++ +TIR IL++A
Sbjct: 77  VFKTKLHSDGTLDRLKARLVAKGFHQEEGIYFVETYSPVVRTATIRTILNVA 125

BLAST of CmaCh16G006480.1 vs. TAIR 10
Match: ATMG00240.1 (Gag-Pol-related retrotransposon family protein )

HSP 1 Score: 80.5 bits (197), Expect = 1.2e-14
Identity = 42/88 (47.73%), Positives = 54/88 (61.36%), Query Frame = 0

Query: 1187 YLTMTRPDLAFAVNKVCQYLHTPTDAHWGAVKRILRYVKGTLALGVKIQ-KSTMMLSGFS 1246
            YLT+TRPDL FAVN++ Q+      A   AV ++L YVKGT+  G+     S + L  F+
Sbjct: 2    YLTITRPDLTFAVNRLSQFSSASRTAQMQAVYKVLHYVKGTVGQGLFYSATSDLQLKAFA 61

Query: 1247 DADWAGCPDDRRSTSGFAV-----FLGA 1269
            D+DWA CPD RRS +GF       FLGA
Sbjct: 62   DSDWASCPDTRRSVTGFCSLVPLWFLGA 89

BLAST of CmaCh16G006480.1 vs. TAIR 10
Match: AT1G34070.1 (CONTAINS InterPro DOMAIN/s: Retrotransposon gag protein (InterPro:IPR005162); BEST Arabidopsis thaliana protein match is: unknown protein (TAIR:AT5G48050.1); Has 648 Blast hits to 647 proteins in 29 species: Archae - 0; Bacteria - 0; Metazoa - 16; Fungi - 25; Plants - 607; Viruses - 0; Other Eukaryotes - 0 (source: NCBI BLink). )

HSP 1 Score: 79.7 bits (195), Expect = 2.0e-14
Identity = 66/261 (25.29%), Positives = 108/261 (41.38%), Query Frame = 0

Query: 17  ISVKLTQENYLLWSTQILPYLRSQNLVGFVDGSMPAPSQTIAVEPSEEIGNRKIIINPEF 76
           + + + + NY  W    L +  S +++G +DG++                   +  N   
Sbjct: 22  VMLDIEESNYDAWRELFLTHCLSFDVMGHIDGTL-------------------LPTNAND 81

Query: 77  TVWYPQDQLV-LSLINSSVTEEVLSTMVGITTAREAWITLERQFASTSRARAMQIRMELS 136
             W  +D +V LSL  +   ++   + V  +T+R+ W+ ++ QF +   ARA+++  EL 
Sbjct: 82  VNWQKRDGIVKLSLYGTLTPKQFQGSFVTSSTSRDIWLRIKNQFRNNKDARALRLDSELR 141

Query: 137 TIQKKDMTIADYFRKVKHLGDTLAAIGKRIEDEELIAYMLQGLGPDYDPLVTSITTRTDV 196
           T    DM +ADY+RK+K L D+L  +   + D  L+ Y+L GL P +D ++  I  R   
Sbjct: 142 TKDIGDMRVADYYRKMKKLADSLRNVDVPVTDRNLVMYVLNGLNPKFDNIINVIKHRQPF 201

Query: 197 YTVSDVYAHMLSYEMR-------------HLRKGTFEQLSSANNVNRISIRGGANGGRGS 256
            +  D    +   E R             H    T    S A  V      GG   G   
Sbjct: 202 PSFDDAATMLQEEEDRLKRAIKPNPTHVDHSSSSTVLACSEAPPVTNFQRSGGNQMGYRG 261

Query: 257 RGRSRQLNSGHGQSRRTVNNP 264
           RGR   +  G G      N P
Sbjct: 262 RGRGNNIFRGRGGRFSYYNMP 263
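
The percentages in each HSP header above follow from the raw counts: identical (or positive) positions divided by the alignment length. A minimal Python sketch of that arithmetic is given here; the function name `hsp_percentages` and the regular expression are illustrative assumptions, not part of any BLAST tooling.

```python
import re

# Accepts "Positives" as well as the "Postives" spelling seen on some result pages.
HSP_STATS = re.compile(
    r"Identity = (\d+)/(\d+) \([\d.]+%\), Posi?tives = (\d+)/(\d+) \([\d.]+%\)"
)

def hsp_percentages(header_line):
    """Return (identity %, positives %) recomputed from the raw counts."""
    match = HSP_STATS.search(header_line)
    if match is None:
        raise ValueError("not an HSP statistics line")
    identical, id_len, positive, pos_len = map(int, match.groups())
    # Each percentage is count / alignment length, rounded to two decimals.
    return (round(100 * identical / id_len, 2), round(100 * positive / pos_len, 2))

line = "Identity = 211/482 (43.78%), Positives = 304/482 (63.07%), Query Frame = 0"
print(hsp_percentages(line))
```

Recomputing the values this way is a quick consistency check between the counts and the reported percentages in a header line.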

The following BLAST results are available for this feature:
Match Name	E-value	Identity	Description
Q94HW2	5.9e-285	39.25	Retrovirus-related Pol polyprotein from transposon RE1 OS=Arabidopsis thaliana O...
Q9ZT94	2.4e-278	38.65	Retrovirus-related Pol polyprotein from transposon RE2 OS=Arabidopsis thaliana O...
P10978	3.2e-166	31.51	Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum...
P04146	8.0e-141	28.58	Copia protein OS=Drosophila melanogaster OX=7227 GN=GIP PE=1 SV=3
P92519	5.1e-55	49.12	Uncharacterized mitochondrial protein AtMg00810 OS=Arabidopsis thaliana OX=3702 ...

Match Name	E-value	Identity	Description
AT4G23160.1	2.8e-112	43.78	cysteine-rich RLK (RECEPTOR-like protein kinase) 8
ATMG00810.1	3.7e-56	49.12	DNA/RNA polymerases superfamily protein
ATMG00820.1	2.8e-24	49.11	Reverse transcriptase (RNA-dependent DNA polymerase)
ATMG00240.1	1.2e-14	47.73	Gag-Pol-related retrotransposon family protein
AT1G34070.1	2.0e-14	25.29	CONTAINS InterPro DOMAIN/s: Retrotransposon gag protein (InterPro:IPR005162); BE...
InterPro
Analysis Name: InterPro Annotations of Cucurbita maxima (Rimu) v1.1
Date Performed: 2021-10-25
IPR Term	IPR Description	Source	Source Term	Source Description	Alignment
IPR025724	GAG-pre-integrase domain	PFAM	PF13976	gag_pre-integrs	coord: 439..491; e-value: 1.4E-7; score: 31.2
IPR001584	Integrase, catalytic core	PFAM	PF00665	rve	coord: 504..600; e-value: 4.0E-8; score: 33.5
IPR001584	Integrase, catalytic core	PROSITE	PS50994	INTEGRASE	coord: 502..665; score: 22.256716
IPR029472	Retrotransposon Copia-like, N-terminal	PFAM	PF14244	Retrotran_gag_3	coord: 15..53; e-value: 1.4E-7; score: 31.2
None	No IPR available	PFAM	PF14223	Retrotran_gag_2	coord: 81..212; e-value: 3.6E-21; score: 75.4
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 229..273
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 812..843
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 777..843
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 243..273
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 777..805
None	No IPR available	PANTHER	PTHR45895	FAMILY NOT NAMED	coord: 319..1245
None	No IPR available	CDD	cd09272	RNase_HI_RT_Ty1	coord: 1243..1380; e-value: 6.44461E-75; score: 242.759
IPR036397	Ribonuclease H superfamily	GENE3D	3.30.420.10		coord: 498..673; e-value: 2.1E-34; score: 120.5
IPR013103	Reverse transcriptase, RNA-dependent DNA polymerase	PFAM	PF07727	RVT_2	coord: 913..1156; e-value: 1.2E-65; score: 221.5
IPR012337	Ribonuclease H-like superfamily	SUPERFAMILY	53098	Ribonuclease H-like	coord: 502..659
IPR043502	DNA/RNA polymerase superfamily	SUPERFAMILY	56672	DNA/RNA polymerases	coord: 912..1349
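
Each alignment entry above locates a domain on the polypeptide as `coord: start..end`. A minimal Python sketch for turning such an entry into a span and a length follows; it assumes the coordinates are 1-based and inclusive, and `domain_span` is a hypothetical helper name rather than part of any InterPro tooling.

```python
import re

# Finds the "coord: start..end" field anywhere in an alignment entry.
COORD = re.compile(r"coord:\s*(\d+)\.\.(\d+)")

def domain_span(entry):
    """Return (start, end, length) for an entry containing 'coord: start..end'."""
    match = COORD.search(entry)
    if match is None:
        raise ValueError("no coord field found")
    start, end = int(match.group(1)), int(match.group(2))
    # Assumption: both ends are inclusive, so the length is end - start + 1.
    return start, end, end - start + 1

# The integrase catalytic core match (PF00665) from the table above.
print(domain_span("coord: 504..600"))
```

Under the same assumption, spans from different sources can be compared directly, for example to see that the PROSITE integrase profile (502..665) contains the Pfam rve match (504..600).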

Relationships

This mRNA is a part of the following gene feature(s):

Feature Name	Unique Name	Type
CmaCh16G006480	CmaCh16G006480	gene


The following exon feature(s) are a part of this mRNA:

Feature Name	Unique Name	Type
CmaCh16G006480.1:exon:314	CmaCh16G006480.1:exon:314	exon


The following CDS feature(s) are a part of this mRNA:

Feature Name	Unique Name	Type
CmaCh16G006480.1:cds	CmaCh16G006480.1:cds	CDS


The following polypeptide feature(s) derive from this mRNA:

Feature Name	Unique Name	Type
CmaCh16G006480.1	CmaCh16G006480.1-protein	polypeptide


GO Annotation
GO Assignments
This mRNA is annotated with the following GO terms.
Category	Term Accession	Term Name
biological_process	GO:0015074	DNA integration
molecular_function	GO:0003676	nucleic acid binding