CmUC08G144660 (gene) Watermelon (USVL531) v1

Overview
Name: CmUC08G144660
Type: gene
Organism: Citrullus mucosospermus (Watermelon (USVL531) v1)
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Location: CmU531Chr08: 4314995 .. 4317927 (+)
RNA-Seq Expression: CmUC08G144660
Synteny: CmUC08G144660
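The Location field above packs the chromosome, start, end, and strand into one string. A minimal sketch of how it could be parsed, assuming the coordinates are 1-based and inclusive (the page does not state this); `parse_location` and `span_length` are hypothetical helper names, not part of any database API:

```python
import re

def parse_location(text):
    """Split 'SeqID: start .. end (strand)' into its four parts."""
    m = re.match(r"^\s*(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*$", text)
    if m is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    seqid, start, end, strand = m.groups()
    return seqid, int(start), int(end), strand

def span_length(start, end):
    """Length of the span, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1

seqid, start, end, strand = parse_location("CmU531Chr08: 4314995 .. 4317927 (+)")
print(seqid, strand, span_length(start, end))
```

Under that assumption the feature spans 2933 bases on the plus strand of CmU531Chr08.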
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

ATGGCTGAAAACCCCTCCTTCACCAATTTACTGAACCAAGTGACCTCCATCGAGCTTGAGAGAGGTAATTTCCTCCTCTGGCAGAACATTGTTCTCCCGATTCTAAGGAGTTATAAGCTTGAAGATGATCTCACGGGTAAGACGTCGGCGCCTGAGCATTCCATTATCATCCCACCTAGTGAAGAAGAACTTCAAAGACTTGTTCTTCCTAATCCTGAGCACGATATATGGCTAGCAGCTGATCAACTTCTAGTCGGATGGCTGTACAACTCAATGACTGCAGAAGTAACAACTCAAGTAATGGGATATGATGAAGCCAAACCGCTGTGGGATGCCATACAAGAATATTTTGGGATTCAATCACAATCTCAAGAAGACTATTATTGGCAAATTATGTAGCAAACAAGAAAAGAAAGCATGAAAATGGAAGAGTACCTAGCTATGATGAAGAAATATTAAGACAATCTAATATTAGCCGGATTACTAATGGATTTTAGGAGGTTCACTTCTCATGTGTTTGCCGGACTCGATGAAGAATTTAATCCCATTGTGTGTGTTATAAGAAGTCAAGACCTCACATGGAATAAAATCCAGTATGAACTCTTTTCTTATGAACAGAGACATGAGAAGCTTCAAGAACTCAAAGGGAGTTTGAATTTGCATCAACTATCCACCAACATCGTCTTCGAAGATAAGAGGAAGAATCACAACTCGTCACAGTCCAATCGAAGCAACAATCAGAATCGAGATAATAATAGAGGAAGAGGAAGGAATCGCTACTCACCATATCCACCACCCAACAAGCCAACTTATCAAATAAGTGGGAAATACGGACATGCCGCTAATGTATGTTACTCTCGCTACAACAGAAATGATGATGCCAATGCCACTTCTTCTGATAACAAGAAAAACAACTCTCCAACGGCTCTTATGGCTTTTCCCAAAACTGTAGACGACTCCACTTGGTACCTGGACAGTGGTGCAACAAACCACATCACCACCGATATTCAACGTCTATCCTTGAAGGGTACGTATCTAATTTCCACTACTATTATTGTGGGTGATGGAACGAAAGTTCCCATTGCTGCTTATGGAAATGCTAAATTATTTTGCACAACACTACCCATAAACTTAAAAGATGTCCTCTATGTTCCTCGTATGAAAAAGAATCTTATTAGCATTTCAAAGCTCACTTAAGATAATGAAGTCACTGTGGAATTTGATAATAACTTCTGTTTTATAAAGGCAAAGAAGTCCAGGGAAACATGCCTAATCGGGAAGCTTGAAAGCGGACTCTACTGTTTGGAAGTAGTTCCAAACAAAGCCTCTTTTCGTGAAGATAATAAGGATGCTGCTCGGTTGGTATTTCTTGGAGAAGCCACCAACTCTCATCCTCAAACTTTTGAAGACTTGGCTGCTGAAAAATCATCTAACTTTTGTACCAAGGATTTGTGGCACCAAAGATTAGGTCATCCTTCCTCTAGAATTCTAAACCAAGTCTTACACTCTTGTAATCTTGTAATATTTCTATTAACAGCAATGAAAAGGATGACTTTTGTGAGGCTTATAAATTCGAAAAGGTGCGTCTCTTCAAAATTCTCTCTCCTGAGCCAACAAGAAACTTGAACTTATACATGCTGATTTATGGGGACTAGCACCTATTTTGTCTACTAGTGGATATCGGTATTACATAGCATTTGTGGATGACTGCAATCGGTTTGTTTGGATTTATCTGCTTCAGACAAAAGGGGACACAGCCACAGTATTCAAACAATTCAAGAGTCTTGCCGAAAATTTATATTCTTCAAAAATTAAAACTCTAAGGTGTGATGGAGGAGGAGAATTCAAACTAGTCATTGAATATGCACACCAACAAGGCATTGAAGTTCAAATGACATGCCCATATACTTCTTCTCAAAATGGTCGAGTGGAACGTAAACACCGTCACAATGTTGAAACTGGTCTCACACTTCTTGCTCAAGCACAAATGCCATTAAATT
TCTGGTGGGAAGCCTTTCATACAGCTACATATCTTATTAACCGAATGCCTTCAAATGTCATCAATAATGAATGTCCGTACACTCTTCTCAAAGGACATTCTCCTGACTATACATGGCTTCGAACCGTCGGTTCCGCCTGTTTTCCCTATCTATGGCCATATCAACAACACAAATTCGACTTTCATACTACAAAATGTGTGTTCATTGAATATAGCGACCGCCACAAAGGATACAAGTTTCTCTCTTCAACAAGAAGAATATTTATCTCACGGCATGTCTACTTCAATGAGAAAGAATTTCCATATTCATCCATCTTCAATAAATCGGAGACCATCACTAACTCCAGCTCGGAAATTTCTTGGATACCAATTCCAAACATAGCAACCACTGCATCACCATCGTAATCCATCAATCAATACAGACAACAATTGTGATCTATCCGATCCAAGAGTTGGGACACCTGCAAGCCATGACTGATCCAACAATTGGTCTGATACAACCTATTCACCCTCTATTTCATCAAATTCTCCTAACTCACCAAATTCACCCCCTACGCAACCTTACCACCCTCCAACCCTCCTCCCACGGCTACCAACATCCATCCCATGACCACTCGGGCTAAGGGAGGTATTTTCAAACCAAAACAGGGAAAAGTTCATGCAACACGACGGAGAGCACGTTGGCCAATACTTGGAAGATGTGAAGGATGGTAAAACCAAGATCGCCGCCGGTGCACTGCTTCCTCACGAGATCATAAAGTCCTTGGACGACGGTGAGGAAGACTGTGGAGAAGTCGCAGAGCTTCAATGGAAGTGAATGGTGGATGACTAGTTGAAGAAAGGGAAATTGAGAAACTGCATTGCTGTTTGTGATGCGTCTGTACGGGATTCCATGGGTGTTTGTGTTGCTTTGGATCTTTTGGTTTCTGAATGA

mRNA sequence

ATGGCTGAAAACCCCTCCTTCACCAATTTACTGAACCAAGTGACCTCCATCGAGCTTGAGAGAGGTAATTTCCTCCTCTGGCAGAACATTGTTCTCCCGATTCTAAGGAGTTATAAGCTTGAAGATGATCTCACGGGTAAGACGTCGGCGCCTGAGCATTCCATTATCATCCCACCTAGTGAAGAAGAACTTCAAAGACTTGTTCTTCCTAATCCTGAGCACGATATATGGCTAGCAGCTGATCAACTTCTAGTCGGATGGCTGTACAACTCAATGACTGCAGAAGTAACAACTCAAGTAATGGGATATGATGAAGCCAAACCGCTGTGGGATGCCATACAAGAATATTTTGGGATTCAATCACAATCTCAAGAAGACTATTATTGGCAAATTATGAGGTTCACTTCTCATGTGTTTGCCGGACTCGATGAAGAATTTAATCCCATTAGACATGAGAAGCTTCAAGAACTCAAAGGGAGTTTGAATTTGCATCAACTATCCACCAACATCGTCTTCGAAGATAAGAGGAAGAATCACAACTCGTCACAGTCCAATCGAAGCAACAATCAGAATCGAGATAATAATAGAGGAAGAGGAAGGAATCGCTACTCACCATATCCACCACCCAACAAGCCAACTTATCAAATAAGTGGGAAATACGGACATGCCGCTAATGTATGTTACTCTCGCTACAACAGAAATGATGATGCCAATGCCACTTCTTCTGATAACAAGAAAAACAACTCTCCAACGGCTCTTATGGCTTTTCCCAAAACTGTAGACGACTCCACTTGGTACCTGGACAGTGGTGCAACAAACCACATCACCACCGATATTCAACGTCTATCCTTGAAGGTTCCCATTGCTGCTTATGGAAATGCTAAATTATTTTGCACAACACTACCCATAAACTTAAAAGATGTCCTCTATGTTCCTCATAATGAAGTCACTGTGGAATTTGATAATAACTTCTGTTTTATAAAGGCAAAGAAGTCCAGGGAAACATGCCTAATCGGGAAGCTTGAAAGCGGACTCTACTGTTTGGAAGTAGTTCCAAACAAAGCCTCTTTTCGTGAAGATAATAAGGATGCTGCTCGGTTGGTATTTCTTGGAGAAGCCACCAACTCTCATCCTCAAACTTTTGAAGACTTGGCTGCTGAAAAATCATCTAACTTTTGTACCAAGGATTTGTGGCACCAAAGATTAGGTCATCCTTCCTCTAGAATTCTAAACCAAAAACTTGAACTTATACATGCTGATTTATGGGGACTAGCACCTATTTTGTCTACTAGTGGATATCGGTATTACATAGCATTTGTGGATGACTGCAATCGGTTTGTTTGGATTTATCTGCTTCAGACAAAAGGGGACACAGCCACAGTATTCAAACAATTCAAGAGTCTTGCCGAAAATTTATATTCTTCAAAAATTAAAACTCTAAGGTGTGATGGAGGAGGAGAATTCAAACTAGTCATTGAATATGCACACCAACAAGGCATTGAAGTTCAAATGACATGCCCATATACTTCTTCTCAAAATGGTCGAGTGGAACGTAAACACCGTCACAATGTTGAAACTGGTCTCACACTTCTTGCTCAAGCACAAATGCCATTAAATTTCTGGTGGGAAGCCTTTCATACAGCTACATATCTTATTAACCGAATGCCTTCAAATGTCATCAATAATGAATGTCCGTACACTCTTCTCAAAGGACATTCTCCTGACTATACATGGCTTCGAACCGTCGGTTCCGCCTGTTTTCCCTATCTATGGCCATATCAACAACACAAATTCGACTTTCATACTACAAAATGTGTGTTCATTGAATATAGCGACCGCCACAAAGGATACAAGTTTCTCTCTTCAACAAGAAGAATATTTATCTCACGGCATGTCTACTTCAATGAGAAAGAATTTCCATATTCATCCATCTTCAATAAATCGGAGACCATCACTAACTCCAGCTCGGAAATTTCTTGGATACCAATTCCAAACATAGC
AACCACTGCATCACCATCGGAGGTATTTTCAAACCAAAACAGGGAAAAGTTCATGCAACACGACGGAGAGCACGTTGGCCAATACTTGGAAGATGTGAAGGATGGTAAAACCAAGATCGCCGCCGGTGCACTGCTTCCTCACGAGATCATAAAGTCCTTGGACGACGGTGAGGAAGACTGTGGAGAATTGAAGAAAGGGAAATTGAGAAACTGCATTGCTGTTTGTGATGCGTCTGTACGGGATTCCATGGGTGTTTGTGTTGCTTTGGATCTTTTGGTTTCTGAATGA

Coding sequence (CDS)

ATGGCTGAAAACCCCTCCTTCACCAATTTACTGAACCAAGTGACCTCCATCGAGCTTGAGAGAGGTAATTTCCTCCTCTGGCAGAACATTGTTCTCCCGATTCTAAGGAGTTATAAGCTTGAAGATGATCTCACGGGTAAGACGTCGGCGCCTGAGCATTCCATTATCATCCCACCTAGTGAAGAAGAACTTCAAAGACTTGTTCTTCCTAATCCTGAGCACGATATATGGCTAGCAGCTGATCAACTTCTAGTCGGATGGCTGTACAACTCAATGACTGCAGAAGTAACAACTCAAGTAATGGGATATGATGAAGCCAAACCGCTGTGGGATGCCATACAAGAATATTTTGGGATTCAATCACAATCTCAAGAAGACTATTATTGGCAAATTATGAGGTTCACTTCTCATGTGTTTGCCGGACTCGATGAAGAATTTAATCCCATTAGACATGAGAAGCTTCAAGAACTCAAAGGGAGTTTGAATTTGCATCAACTATCCACCAACATCGTCTTCGAAGATAAGAGGAAGAATCACAACTCGTCACAGTCCAATCGAAGCAACAATCAGAATCGAGATAATAATAGAGGAAGAGGAAGGAATCGCTACTCACCATATCCACCACCCAACAAGCCAACTTATCAAATAAGTGGGAAATACGGACATGCCGCTAATGTATGTTACTCTCGCTACAACAGAAATGATGATGCCAATGCCACTTCTTCTGATAACAAGAAAAACAACTCTCCAACGGCTCTTATGGCTTTTCCCAAAACTGTAGACGACTCCACTTGGTACCTGGACAGTGGTGCAACAAACCACATCACCACCGATATTCAACGTCTATCCTTGAAGGTTCCCATTGCTGCTTATGGAAATGCTAAATTATTTTGCACAACACTACCCATAAACTTAAAAGATGTCCTCTATGTTCCTCATAATGAAGTCACTGTGGAATTTGATAATAACTTCTGTTTTATAAAGGCAAAGAAGTCCAGGGAAACATGCCTAATCGGGAAGCTTGAAAGCGGACTCTACTGTTTGGAAGTAGTTCCAAACAAAGCCTCTTTTCGTGAAGATAATAAGGATGCTGCTCGGTTGGTATTTCTTGGAGAAGCCACCAACTCTCATCCTCAAACTTTTGAAGACTTGGCTGCTGAAAAATCATCTAACTTTTGTACCAAGGATTTGTGGCACCAAAGATTAGGTCATCCTTCCTCTAGAATTCTAAACCAAAAACTTGAACTTATACATGCTGATTTATGGGGACTAGCACCTATTTTGTCTACTAGTGGATATCGGTATTACATAGCATTTGTGGATGACTGCAATCGGTTTGTTTGGATTTATCTGCTTCAGACAAAAGGGGACACAGCCACAGTATTCAAACAATTCAAGAGTCTTGCCGAAAATTTATATTCTTCAAAAATTAAAACTCTAAGGTGTGATGGAGGAGGAGAATTCAAACTAGTCATTGAATATGCACACCAACAAGGCATTGAAGTTCAAATGACATGCCCATATACTTCTTCTCAAAATGGTCGAGTGGAACGTAAACACCGTCACAATGTTGAAACTGGTCTCACACTTCTTGCTCAAGCACAAATGCCATTAAATTTCTGGTGGGAAGCCTTTCATACAGCTACATATCTTATTAACCGAATGCCTTCAAATGTCATCAATAATGAATGTCCGTACACTCTTCTCAAAGGACATTCTCCTGACTATACATGGCTTCGAACCGTCGGTTCCGCCTGTTTTCCCTATCTATGGCCATATCAACAACACAAATTCGACTTTCATACTACAAAATGTGTGTTCATTGAATATAGCGACCGCCACAAAGGATACAAGTTTCTCTCTTCAACAAGAAGAATATTTATCTCACGGCATGTCTACTTCAATGAGAAAGAATTTCCATATTCATCCATCTTCAATAAATCGGAGACCATCACTAACTCCAGCTCGGAAATTTCTTGGATACCAATTCCAAACATAGC
AACCACTGCATCACCATCGGAGGTATTTTCAAACCAAAACAGGGAAAAGTTCATGCAACACGACGGAGAGCACGTTGGCCAATACTTGGAAGATGTGAAGGATGGTAAAACCAAGATCGCCGCCGGTGCACTGCTTCCTCACGAGATCATAAAGTCCTTGGACGACGGTGAGGAAGACTGTGGAGAATTGAAGAAAGGGAAATTGAGAAACTGCATTGCTGTTTGTGATGCGTCTGTACGGGATTCCATGGGTGTTTGTGTTGCTTTGGATCTTTTGGTTTCTGAATGA

Protein sequence

MAENPSFTNLLNQVTSIELERGNFLLWQNIVLPILRSYKLEDDLTGKTSAPEHSIIIPPSEEELQRLVLPNPEHDIWLAADQLLVGWLYNSMTAEVTTQVMGYDEAKPLWDAIQEYFGIQSQSQEDYYWQIMRFTSHVFAGLDEEFNPIRHEKLQELKGSLNLHQLSTNIVFEDKRKNHNSSQSNRSNNQNRDNNRGRGRNRYSPYPPPNKPTYQISGKYGHAANVCYSRYNRNDDANATSSDNKKNNSPTALMAFPKTVDDSTWYLDSGATNHITTDIQRLSLKVPIAAYGNAKLFCTTLPINLKDVLYVPHNEVTVEFDNNFCFIKAKKSRETCLIGKLESGLYCLEVVPNKASFREDNKDAARLVFLGEATNSHPQTFEDLAAEKSSNFCTKDLWHQRLGHPSSRILNQKLELIHADLWGLAPILSTSGYRYYIAFVDDCNRFVWIYLLQTKGDTATVFKQFKSLAENLYSSKIKTLRCDGGGEFKLVIEYAHQQGIEVQMTCPYTSSQNGRVERKHRHNVETGLTLLAQAQMPLNFWWEAFHTATYLINRMPSNVINNECPYTLLKGHSPDYTWLRTVGSACFPYLWPYQQHKFDFHTTKCVFIEYSDRHKGYKFLSSTRRIFISRHVYFNEKEFPYSSIFNKSETITNSSSEISWIPIPNIATTASPSEVFSNQNREKFMQHDGEHVGQYLEDVKDGKTKIAAGALLPHEIIKSLDDGEEDCGELKKGKLRNCIAVCDASVRDSMGVCVALDLLVSE
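The protein sequence above is the conceptual translation of the CDS. As a rough illustration, assuming the standard genetic code (the page does not name a translation table), the first and last codons of the CDS can be translated like this; `translate` is a hypothetical helper written for this sketch:

```python
# Standard genetic code, laid out in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(cds):
    """Translate a CDS codon by codon, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks an unknown codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First 18 and last 12 bases of the CDS shown above.
print(translate("ATGGCTGAAAACCCCTCC"))
print(translate("GTTTCTGAATGA"))
```

The two fragments give MAENPS and VSE, matching the start and end of the listed protein.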
Homology
BLAST of CmUC08G144660 vs. NCBI nr
Match: PNX76291.1 (gag/pol polyprotein - maize retrotransposon Hopscotch, partial [Trifolium pratense])

HSP 1 Score: 458.4 bits (1178), Expect = 1.2e-124
Identity = 296/825 (35.88%), Positives = 414/825 (50.18%), Query Frame = 0

Query: 2   AENPSFTNLLNQVTSIELERGNFLLWQNIVLPILRSYKLEDDLTGKTSAPEHSIIIPPSE 61
           A N +  N L    S++L+R N+ LWQ++VLPI+R  +L+  + GK   PE  I    S 
Sbjct: 4   AANSNHKNDLPSTVSVKLDRDNYPLWQSMVLPIIRGARLDGYMLGKKKCPEEFITAADSS 63

Query: 62  EELQRLVLPNPEHDIWLAADQLLVGWLYNSMTAEVTTQVMGYDEAKPLWDAIQEYFGIQS 121
           ++       NPE + W A DQ L+GWL NSMT  + TQ++  + +  LWD  Q   G  +
Sbjct: 64  KKF------NPEFEDWQAYDQQLLGWLRNSMTVGIATQLLHCETSMQLWDEAQSLAGAHT 123

Query: 122 QSQ------------------EDYYWQIMRFTS----------------HVFAGLDEEFN 181
           +SQ                  EDY  ++                         GLD E+N
Sbjct: 124 RSQITYLKSEFHSTRKGEMKMEDYLIKMKNLADKLKLAGNPISTSDLIIQTLNGLDSEYN 183

Query: 182 PI---------------------RHEKLQELKGSLNLHQLSTNIVFEDKRKNHNSSQSNR 241
           P+                        ++++L    NL   +T  V   K+ +H  ++ N 
Sbjct: 184 PVVVKLSDQTTLSWVDLQAQLLTFENRIEQLNSLTNLTLNATANV--AKKSDHRGNRFNS 243

Query: 242 SNNQNRDNNR-----------GRGRNRYSPYPPPNKPTYQISGKYGHAANVCYSRYNRND 301
           +NN    NN            GRGR R        K T Q+ G   H A  C+ R+++  
Sbjct: 244 NNNWRGSNNNWRGSNFRGWRGGRGRGR------SFKTTCQVCGLDNHIAIDCFYRFDKTY 303

Query: 302 DANATSSDNKKNNSPTALMAFPKTVDDSTWYLDSGATNHIT--TD-IQRLS--------- 361
             +  S++N K  S  A +A   +++D  WY DSGA+NH+T  TD  Q LS         
Sbjct: 304 SRSNHSANNDKQGSHNAFLASQNSIEDYDWYFDSGASNHVTHQTDKFQNLSEHHGKNSLI 363

Query: 362 ----LKVPIAAYGNAKLFCTTLPINLKDVLYVP--------------HNEVTVEFDNNFC 421
                K+ I A G++KL      +NL D+LYVP               N + VEFD N C
Sbjct: 364 VGNGEKLEIVATGSSKL----KSLNLHDILYVPKITKNLLSVSKLAADNNILVEFDENCC 423

Query: 422 FIKAKKSRETCLIGKLESGLYCLEVVPNKASFREDNKDAARLVFLGEATNSHPQTFEDLA 481
           F+K K + +  L G L+ GLY L             KD++  V +               
Sbjct: 424 FVKDKLTGKAILRGILKDGLYQL-----------SEKDSSAYVSI--------------- 483

Query: 482 AEKSSNFCTKDLWHQRLGHPSSRILN---------------------------------- 541
                    K+ WH++LGHP++++L+                                  
Sbjct: 484 ---------KESWHRKLGHPNNKVLDIVLKSCNVKLSPSDQFSFCEACQYGKMHFLPFKT 543

Query: 542 ------QKLELIHADLWGLAPILSTSGYRYYIAFVDDCNRFVWIYLLQTKGDTATVFKQF 601
                 + LEL+H D+WG API+S+SG++YY+ F+DD  RF WIY L+ K DTA  F QF
Sbjct: 544 SFSHAKEILELVHTDVWGPAPIISSSGFKYYVHFIDDFTRFTWIYPLKQKSDTAHAFIQF 603

Query: 602 KSLAENLYSSKIKTLRCDGGGEFKLVIEYAHQQGIEVQMTCPYTSSQNGRVERKHRHNVE 661
           K++ EN +S KIKT++CDGGGE+K V ++A + GI+ +M+CPYTS QNGR ERKHRH  E
Sbjct: 604 KNMVENQFSKKIKTIQCDGGGEYKPVQKHAIEAGIQFRMSCPYTSQQNGRAERKHRHIAE 663

Query: 662 TGLTLLAQAQMPLNFWWEAFHTATYLINRMPSNVINNECPYTLLKGHSPDYTWLRTVGSA 683
            GLTLLAQA+MPLN+WWEAF TA YLINR+PS+V +N+ PY+LL    PDY  L+  G A
Sbjct: 664 FGLTLLAQAKMPLNYWWEAFSTAVYLINRLPSSVTHNKSPYSLLHKREPDYNSLKPFGCA 723

BLAST of CmUC08G144660 vs. NCBI nr
Match: PNX94503.1 (putative retrotransposon Ty1-copia subclass protein, partial [Trifolium pratense])

HSP 1 Score: 444.1 bits (1141), Expect = 2.4e-120
Identity = 282/802 (35.16%), Positives = 400/802 (49.88%), Query Frame = 0

Query: 9   NLLNQVTSIELERGNFLLWQNIVLPILRSYKLEDDLTGKTSAPEHSIIIPPSEEELQRLV 68
           N L    S++L+R NF LW+++VLP++R  K +  + G    P+  +    + E++    
Sbjct: 10  NDLPSTVSVKLDRDNFPLWKSLVLPLIRGCKYDGYMLGTKKCPDQFVTSIDNTEKI---- 69

Query: 69  LPNPEHDIWLAADQLLVGWLYNSMTAEVTTQVMGYDEAKPLWDAIQEYFGIQSQSQ---- 128
             NP++  W A DQ L+GWL NSMT ++ TQV+  + +K LWD  Q   G  ++S+    
Sbjct: 70  --NPDYQDWQADDQALLGWLMNSMTVDIATQVLHCETSKQLWDEAQSLAGAHTRSRIIYL 129

Query: 129 --------------EDYYWQIMRFTS----------------HVFAGLDEEFNPI----- 188
                         E Y  ++                         GLD E+NP+     
Sbjct: 130 KSEFHNTHKREMKMEQYLAKMKNLADKLKLAGSPISSSDLMIQTLNGLDSEYNPVVVKLS 189

Query: 189 ----------------RHEKLQELKGSLNLHQLSTNIVFEDKRK---NHNSSQSNRSNNQ 248
                              +L +L    N++ L+ +  F  K +   N   S+     + 
Sbjct: 190 DQTNISWVDFQAQLLAFESRLDQLNNFNNIN-LNASANFASKNESGGNKFGSRGGWRGSN 249

Query: 249 NRDNNRGRGRNRYSPYPPPNKPTYQISGKYGHAANVCYSRYNRNDDANATSSDNKKNNSP 308
           +R    GRGR R S    P +P  QI GK+GH A  CY R++++       ++ + ++S 
Sbjct: 250 SRGMRGGRGRARMS---KPPRPICQICGKFGHTAAQCYYRFDKSYTEKNHYAEGEGSHS- 309

Query: 309 TALMAFPKTVDDSTWYLDSGATNHITTDIQRL----------------SLKVPIAAYGNA 368
            A +A P    D  WY DSGA+NH+T    +L                  K+ I A G+ 
Sbjct: 310 -AFVASPYHGQDYEWYFDSGASNHVTHQSGQLQDLNENNGKNSLLVGNGEKLKILASGST 369

Query: 369 KLFCTTLPINLKDVLYVPH--------------NEVTVEFDNNFCFIKAKKSRETCLIGK 428
           KL      +NL++VLYVP               N   VEFD N+C++K K + +  L G+
Sbjct: 370 KL----NDVNLRNVLYVPEITKNLLSVSKLTIDNNALVEFDENYCYVKDKLTGKALLKGR 429

Query: 429 LESGLYCLEVVPNKASFREDNKDAARLVFLGEATNSHPQTFEDLAAEKSSNFCTKDLWHQ 488
           L+ GLY L                        + N  P T +D  A  S     K++WH+
Sbjct: 430 LKDGLYQL------------------------SANKEPPTNKDPCAYIS----LKEIWHR 489

Query: 489 RLGHPSSRIL----------------------------------------NQKLELIHAD 548
           +LGHP++++L                                         + L+LIH D
Sbjct: 490 KLGHPNNKVLEKVLKDNNVKISPSDKFTFCEACQFGKLHLLPFKTSSSHAKEPLDLIHTD 549

Query: 549 LWGLAPILSTSGYRYYIAFVDDCNRFVWIYLLQTKGDTATVFKQFKSLAENLYSSKIKTL 608
           +WG APILS S ++YY+ F+DD +RF WI+ L+ K +T   F QFK+L EN ++ KIK +
Sbjct: 550 VWGPAPILSQSNFKYYVHFLDDFSRFTWIFPLKQKSETIHAFNQFKNLVENQFNKKIKVI 609

Query: 609 RCDGGGEFKLVIEYAHQQGIEVQMTCPYTSSQNGRVERKHRHNVETGLTLLAQAQMPLNF 668
           RCDGGGE+K V + A   GI+ QM+CPYTS QNGR ERKHRH  E GLTLLAQA+MPL++
Sbjct: 610 RCDGGGEYKPVQKCAIDSGIQFQMSCPYTSQQNGRAERKHRHVTELGLTLLAQAKMPLSY 669

Query: 669 WWEAFHTATYLINRMPSNVINNECPYTLLKGHSPDYTWLRTVGSACFPYLWPYQQHKFDF 679
           WWEAF TA YLINR+PS+V  NE PYTL+    PDYT L+  G AC+P L PY QHK  F
Sbjct: 670 WWEAFSTAVYLINRLPSSVNPNESPYTLVFKKEPDYTALKPFGCACYPCLKPYNQHKLQF 729

BLAST of CmUC08G144660 vs. NCBI nr
Match: GAU19483.1 (hypothetical protein TSUD_77270 [Trifolium subterraneum])

HSP 1 Score: 437.6 bits (1124), Expect = 2.2e-118
Identity = 279/827 (33.74%), Positives = 395/827 (47.76%), Query Frame = 0

Query: 9   NLLNQVTSIELERGNFLLWQNIVLPILRSYKLEDDLTGKTSAPEHSIIIPPSEEELQRLV 68
           N L    S++L+R N+ LW+++VLP++R  KL+  + G    PE  I    S +      
Sbjct: 11  NDLPSSVSVKLDRNNYPLWKSLVLPVIRGCKLDGYMLGTEGCPEEFITSSDSSKN----- 70

Query: 69  LPNPEHDIWLAADQLLVGWLYNSMTAEVTTQVMGYDEAKPLWDAIQEYFGIQSQSQ---- 128
             N     W A DQ L+GW+ NSMT E+ TQ++  + +K LWD  Q   G  ++SQ    
Sbjct: 71  -KNSAFVEWQANDQRLLGWMLNSMTTEIATQLLHCETSKQLWDEAQSLAGAHTRSQIIYL 130

Query: 129 --------------EDYYWQIMRFTS----------------HVFAGLDEEFNPI----- 188
                         EDY  ++                         GLD E+NP+     
Sbjct: 131 KSEFHSIRKGEMKMEDYLIKMKNLVDKLKLAGNPVSTSDLIIQTLNGLDSEYNPVVVKLS 190

Query: 189 ----------------RHEKLQELKGSLNLHQLSTNIVFEDKRKNHNSSQSNRSNNQNRD 248
                              ++++L    NL   +T  V         SS +N   + +R 
Sbjct: 191 DQTTLSWVDLQAQLLTFESRIEQLNNLTNLTLNATANVANRSDHRGKSSNNNWRGSNSRG 250

Query: 249 NNRGRGRNRYSPYPPPNKPTYQISGKYGHAANVCYSRYNRNDDANATSSDNKKNNSPTAL 308
              GRGR +    P       Q+ G   H A  C+ R+++    +  S+ + K  S  A 
Sbjct: 251 WRGGRGRGKSGKNP------CQVCGLSNHIAIDCFHRFDKTYSRSNHSAGHDKQGSHNAF 310

Query: 309 MAFPKTVDDSTWYLDSGATNHITTDIQRL----------------SLKVPIAAYGNAKLF 368
           +A   +V+D  WY DSGA+NH+T   ++                   K+ I A G++KL 
Sbjct: 311 LASQNSVEDYDWYFDSGASNHVTHQTEKFQDLTEHHGKNSLVVGNGEKLAILATGSSKL- 370

Query: 369 CTTLPINLKDVLYVPH--------------NEVTVEFDNNFCFIKAKKSRETCLIGKLES 428
                +NL D+LYVP+              N + VEFD N CF+K K + +  L G L+ 
Sbjct: 371 ---KSLNLHDILYVPNITKNLLSVSKLAADNNILVEFDENCCFVKDKLTGKVILKGLLKD 430

Query: 429 GLYCLEVVPNKASFREDNKDAARLVFLGEATNSHPQTFEDLAAEKSSNFCTKDLWHQRLG 488
           GLY L                         T  +P  F  +          K+ WH+RLG
Sbjct: 431 GLYQL-----------------------SGTKRNPSAFVSV----------KESWHRRLG 490

Query: 489 HPSSRILN----------------------------------------QKLELIHADLWG 548
           HP++++L+                                        + LEL+H D+WG
Sbjct: 491 HPNNKVLDKVLESCKVKVPPSDNFSFCEACQYGKMHLLPFKSSSSHAQEPLELVHTDVWG 550

Query: 549 LAPILSTSGYRYYIAFVDDCNRFVWIYLLQTKGDTATVFKQFKSLAENLYSSKIKTLRCD 608
            API+++SG++YY+ FVDD +RF WIY L+ K +T   F QFK+L EN ++ +IK ++CD
Sbjct: 551 PAPIMTSSGFKYYVHFVDDFSRFTWIYPLKQKSETVQAFIQFKNLTENQFNKRIKVIQCD 610

Query: 609 GGGEFKLVIEYAHQQGIEVQMTCPYTSSQNGRVERKHRHNVETGLTLLAQAQMPLNFWWE 668
           GGGE+K V + A + GI+ +M+CPYTS QNGR ERKHRH  E GLTLLAQAQMPL++WWE
Sbjct: 611 GGGEYKPVQKLAVEAGIQFRMSCPYTSQQNGRAERKHRHITEFGLTLLAQAQMPLHYWWE 670

Query: 669 AFHTATYLINRMPSNVINNECPYTLLKGHSPDYTWLRTVGSACFPYLWPYQQHKFDFHTT 706
           AF TA YLINR+PS V  NE PY+L+    PDY  L+T G AC+P L PY QHK  +HTT
Sbjct: 671 AFSTAVYLINRLPSQVTQNESPYSLMLQKEPDYKLLKTFGCACYPCLKPYNQHKLQYHTT 730

BLAST of CmUC08G144660 vs. NCBI nr
Match: GAU51268.1 (hypothetical protein TSUD_412550 [Trifolium subterraneum])

HSP 1 Score: 431.4 bits (1108), Expect = 1.6e-116
Identity = 278/785 (35.41%), Positives = 395/785 (50.32%), Query Frame = 0

Query: 2   AENPSFTNLLNQVTSIELERGNFLLWQNIVLPILRSYKLEDDLTGKTSAPEHSIIIPPSE 61
           A N    N L  + S++L+R N+ LW+++VL ++R  KL+  + G T  PE  +      
Sbjct: 4   AANSPKKNDLPSIISVKLDRDNYPLWKSLVLSLIRGCKLDGYILGTTECPEQFVTSADKS 63

Query: 62  EELQRLVLPNPEHDIWLAADQLLVGWLYNSMTAEVTTQVMGYDEAKPLWDAIQEYFGIQS 121
           +++      NP+   W+A DQ L+GWL NSM  ++ TQ++  + +K LWD  Q   G  +
Sbjct: 64  KKV------NPDFGDWIANDQALLGWLMNSMAIDIATQLLHCETSKQLWDETQSLAGAHT 123

Query: 122 QSQ------------------EDYYWQIMRFTS----------------HVFAGLDEEFN 181
           +S+                  E+Y  ++   +                     GLD E+N
Sbjct: 124 KSRITYLKSEFHNTRKGEMKMEEYLIKMKNLSDKLKLAGSPISNSDLMIQTLNGLDAEYN 183

Query: 182 PIRHEKLQELKGSLNLHQLSTN---IVFEDKRKNHNS--------------------SQS 241
           P+    + +L   +NL  +      + FE +    N+                    ++ 
Sbjct: 184 PV----VVKLSDQINLSWVDVQAQLLAFESRLDQFNNFSGLTLNASANFANKTEFRGNKF 243

Query: 242 NRSNNQNRDNNR----GRGRNRYSPYPPPNKPTYQISGKYGHAANVCYSRYNRNDDANAT 301
           N   N  R N R    GRG+ R S          Q+    GH A  C  R++R       
Sbjct: 244 NSRGNWRRSNFRGMRGGRGKGRMS------NTKCQVCNGTGHIAVDCSYRFDRPYTGRNY 303

Query: 302 SSDNKKNNSPTALMAFPKTVDDSTWYLDSGATNHITTDIQRL----------------SL 361
           S++  K  S +A +A P    D  WY DSGA NH+T    +                   
Sbjct: 304 STEADKQGSHSAFIASPYHGQDYEWYFDSGANNHVTHQTDKFQGFNEHNGKNSLMVGNGE 363

Query: 362 KVPIAAYGNAKLFCTTLPINLKDVLYVPH--------------NEVTVEFDNNFCFIKAK 421
           K+ I A G+ KL      +NL DVLYVP               N + VEFD N C +K K
Sbjct: 364 KLKIVASGSTKL----NNLNLHDVLYVPQITKNLLSVSKLTADNNILVEFDANCCSVKDK 423

Query: 422 KSRETCLIGKLESGLY-------CLEVVPNKASFRE----DNKDAARLVFLGEATNSHPQ 481
            + +T L G+L+ GLY       C+ +   ++  R+    +NK   +++       SH  
Sbjct: 424 LTGQTLLKGRLKDGLYQLSNKEPCVYMSVKESWHRKLGHPNNKVLDKVLKDCNVKISHSD 483

Query: 482 TFEDLAAEKSSNFCTKDLWHQRLGHPSSRILNQKLELIHADLWGLAPILSTSGYRYYIAF 541
            F    A      C     H     PSS  + + L LIH+D+WG APILS SG++YY+ F
Sbjct: 484 QFSFCEA------CQFGKLHLLPFKPSSSHVQEPLALIHSDVWGPAPILSPSGFKYYVHF 543

Query: 542 VDDCNRFVWIYLLQTKGDTATVFKQFKSLAENLYSSKIKTLRCDGGGEFKLVIEYAHQQG 601
           +DD +RF WI+ L+ K DT   F QFK+LAEN ++ KIK ++CDGGGE+K V + + + G
Sbjct: 544 IDDFSRFTWIFPLKQKSDTIHAFIQFKNLAENQFNKKIKIIQCDGGGEYKAVQKVSIEAG 603

Query: 602 IEVQMTCPYTSSQNGRVERKHRHNVETGLTLLAQAQMPLNFWWEAFHTATYLINRMPSNV 661
           I+ +M+CPYTS QNGR ERKHRH  E GLTLLAQA+MPL +WWEAF TA YLINR+PS+V
Sbjct: 604 IQFRMSCPYTSQQNGRAERKHRHVAELGLTLLAQAKMPLRYWWEAFSTAVYLINRLPSSV 663

Query: 662 INNECPYTLLKGHSPDYTWLRTVGSACFPYLWPYQQHKFDFHTTKCVFIEYSDRHKGYKF 681
             NE PY+L+    PDY  L+  G AC+P L PY QHK  FHTT+CVF+ YS+ HKGYK 
Sbjct: 664 NPNESPYSLMFKREPDYNALKPFGCACYPCLKPYNQHKLQFHTTRCVFVGYSNSHKGYKC 723

BLAST of CmUC08G144660 vs. NCBI nr
Match: GAU17915.1 (hypothetical protein TSUD_330400, partial [Trifolium subterraneum])

HSP 1 Score: 427.2 bits (1097), Expect = 3.0e-115
Identity = 268/736 (36.41%), Positives = 376/736 (51.09%), Query Frame = 0

Query: 2   AENPSFTNLLNQVTSIELERGNFLLWQNIVLPILRSYKLEDDLTGKTSAPEHSIIIPPSE 61
           A N S  N L     ++L+R N+ LW+++VLPI+R  +L+  + GK   PE  I    S 
Sbjct: 4   AANSSHKNDLPSTVFVKLDRDNYPLWKSMVLPIIRGARLDGYMLGKKECPEEFITAADSS 63

Query: 62  EELQRLVLPNPEHDIWLAADQLLVGWLYNSMTAEVTTQVMGYDEAKPLWDAIQEYFGIQS 121
           ++       NPE + W A DQ L+GWL NSMT  + TQ++  + +  LW+      G  +
Sbjct: 64  KKF------NPEFEDWQAYDQQLLGWLRNSMTVGIATQLLHCETSMQLWEEAHSLAGAHT 123

Query: 122 QSQ------------------EDYYWQIMRFTS----------------HVFAGLDEEFN 181
           +SQ                  EDY  ++                         GLD E+N
Sbjct: 124 RSQITYLKSEFHSTRKGEMKMEDYLIKMKNLADKLKLAGNPISTSDLIIQTLNGLDSEYN 183

Query: 182 PI---------------------RHEKLQELKGSLNLHQLSTNIVFEDKRKNHNSSQSNR 241
           P+                        ++++L    NL   +T  V    R NH  ++ N 
Sbjct: 184 PVVVKLSDQTTLIWVDLQAQLLTFENRIEQLNNLTNLTLNATANV--ANRSNHRGNRFNS 243

Query: 242 SNNQNRDNNR----GRGRNRYSPYPPPNKPTYQISGKYGHAANVCYSRYNRNDDANATSS 301
           +NN    N R    GRGR R S  P       Q+ G+  H A  C+ R+++    +  SS
Sbjct: 244 NNNWRGSNFRGWRGGRGRGRSSKTP------CQVCGRDNHIAIDCFYRFDKTYSRSNHSS 303

Query: 302 DNKKNNSPTALMAFPKTVDDSTWYLDSGATNHITTDIQRLSLKVPIAAYGNAKLFCTTLP 361
           +N K  S    +A   +V+D  WY DSGA+NH+T    +               F     
Sbjct: 304 NNDKQGSHNVFLASQNSVEDYDWYFDSGASNHVTHQTNK---------------FQDMAE 363

Query: 362 INLKDVLYVPHNEVTVEFDNNFCFIKAKKSRETCLIGKLESGLYCLEVVPNKASFREDNK 421
            + K+ L V + E           ++   +  T L G L+ GLY L             K
Sbjct: 364 HHGKNSLVVGNGEK----------LEIVATGRTILRGTLKDGLYQL-----------SEK 423

Query: 422 DAARLVFLGEATNSHPQTFEDLAAEKSSNFCTKDLWHQRLGHPSSRILNQKLELIHADLW 481
           D++  V                          K+ WH++LGHP+++   + LEL+H D+W
Sbjct: 424 DSSAYV------------------------SVKESWHRKLGHPNNK---EILELVHTDVW 483

Query: 482 GLAPILSTSGYRYYIAFVDDCNRFVWIYLLQTKGDTATVFKQFKSLAENLYSSKIKTLRC 541
           G API+S+SG++YY+ F+DD  RF WIY L+ K DTA  F QFK++ EN ++ +IKT++C
Sbjct: 484 GPAPIISSSGFKYYVHFIDDFTRFTWIYPLKQKSDTAHAFIQFKNMVENQFNKRIKTIQC 543

Query: 542 DGGGEFKLVIEYAHQQGIEVQMTCPYTSSQNGRVERKHRHNVETGLTLLAQAQMPLNFWW 601
           DGGGE+K V ++A + GI+ +M+CPYTS QNGR ERKHRH  E GLTLLAQA+MPLN+WW
Sbjct: 544 DGGGEYKAVQKHAIEAGIQFRMSCPYTSQQNGRAERKHRHIAEFGLTLLAQAKMPLNYWW 603

Query: 602 EAFHTATYLINRMPSNVINNECPYTLLKGHSPDYTWLRTVGSACFPYLWPYQQHKFDFHT 661
           EAF TA YLINR+PS V +NE PY+LL    PDY  L+  G AC+P L PY +HK  FHT
Sbjct: 604 EAFSTAVYLINRLPSPVTHNESPYSLLHKKEPDYNSLKPFGCACYPCLKPYNKHKLQFHT 660

Query: 662 TKCVFIEYSDRHKGYKFLSSTRRIFISRHVYFNEKEFPYSSIFNKS----ETITNSSSEI 675
           TKCVF+ YS+ HKGYK ++S  R+FISRHV FNE  FP+   F  +    +T+T S S  
Sbjct: 664 TKCVFLGYSNSHKGYKCVNSHGRVFISRHVVFNEDHFPFHDGFLNTRVPLKTLTGSPS-- 660

BLAST of CmUC08G144660 vs. ExPASy Swiss-Prot
Match: Q94HW2 (Retrovirus-related Pol polyprotein from transposon RE1 OS=Arabidopsis thaliana OX=3702 GN=RE1 PE=2 SV=1)

HSP 1 Score: 270.8 bits (691), Expect = 4.8e-71
Identity = 232/805 (28.82%), Positives = 336/805 (41.74%), Query Frame = 0

Query: 4   NPSFTNLLNQVTSIELERGNFLLWQNIVLPILRSYKLEDDLTGKTSAPEHSIIIPPSEEE 63
           N S  N +N     +L   N+L+W   V  +   Y+L   L G T+       +PP+   
Sbjct: 12  NTSILN-VNMSNVTKLTSTNYLMWSRQVHALFDGYELAGFLDGSTT-------MPPATIG 71

Query: 64  LQRLVLPNPEHDIWLAADQLLVGWLYNSMTAEVTTQVMGYDEAKPLWDAIQEYFG----- 123
                  NP++  W   D+L+   +  +++  V   V     A  +W+ +++ +      
Sbjct: 72  TDAAPRVNPDYTRWKRQDKLIYSAVLGAISMSVQPAVSRATTAAQIWETLRKIYANPSYG 131

Query: 124 ----IQSQSQE--------DYYWQ--IMRF---------------TSHVFAGLDEEFNPI 183
               +++Q ++        D Y Q  + RF                  V   L EE+ P+
Sbjct: 132 HVTQLRTQLKQWTKGTKTIDDYMQGLVTRFDQLALLGKPMDHDEQVERVLENLPEEYKPV 191

Query: 184 ---------------RHEKLQELKGSLNLHQLSTNIVFEDKRKNH-NSSQSNRSNNQNRD 243
                           HE+L   +  +     +T I       +H N++ +N +NN NR+
Sbjct: 192 IDQIAAKDTPPTLTEIHERLLNHESKILAVSSATVIPITANAVSHRNTTTTNNNNNGNRN 251

Query: 244 N---NRGRGRN---------RYSPYPPPNKP---TYQISGKYGHAANVCYSRYNRNDDAN 303
           N   NR    N          + P    +KP     QI G  GH+A  C    +     N
Sbjct: 252 NRYDNRNNNNNSKPWQQSSTNFHPNNNQSKPYLGKCQICGVQGHSAKRCSQLQHFLSSVN 311

Query: 304 ATSSDNKKNN-SPTALMAFPKTVDDSTWYLDSGATNHITTDIQRLSL------------- 363
           +    +      P A +A       + W LDSGAT+HIT+D   LSL             
Sbjct: 312 SQQPPSPFTPWQPRANLALGSPYSSNNWLLDSGATHHITSDFNNLSLHQPYTGGDDVMVA 371

Query: 364 ---KVPIAAYGNAKLFCTTLPINLKDVLYVPH--------------NEVTVEFDNNFCFI 423
               +PI+  G+  L   + P+NL ++LYVP+              N V+VEF      +
Sbjct: 372 DGSTIPISHTGSTSLSTKSRPLNLHNILYVPNIHKNLISVYRLCNANGVSVEFFPASFQV 431

Query: 424 KAKKSRETCLIGKLESGLYCLEVVPNKASFREDNKDAARLVFLGEATNSHPQTFEDLAAE 483
           K   +    L GK +  LY   +                        +S P +   L A 
Sbjct: 432 KDLNTGVPLLQGKTKDELYEWPI-----------------------ASSQPVS---LFAS 491

Query: 484 KSSNFCTKDLWHQRLGHPSSRILN------------------------------------ 543
            SS   T   WH RLGHP+  ILN                                    
Sbjct: 492 PSSK-ATHSSWHARLGHPAPSILNSVISNYSLSVLNPSHKFLSCSDCLINKSNKVPFSQS 551

Query: 544 -----QKLELIHADLWGLAPILSTSGYRYYIAFVDDCNRFVWIYLLQTKGDTATVFKQFK 603
                + LE I++D+W  +PILS   YRYY+ FVD   R+ W+Y L+ K      F  FK
Sbjct: 552 TINSTRPLEYIYSDVWS-SPILSHDNYRYYVIFVDHFTRYTWLYPLKQKSQVKETFITFK 611

Query: 604 SLAENLYSSKIKTLRCDGGGEFKLVIEYAHQQGIEVQMTCPYTSSQNGRVERKHRHNVET 663
           +L EN + ++I T   D GGEF  + EY  Q GI    + P+T   NG  ERKHRH VET
Sbjct: 612 NLLENRFQTRIGTFYSDNGGEFVALWEYFSQHGISHLTSPPHTPEHNGLSERKHRHIVET 671

Query: 664 GLTLLAQAQMPLNFWWEAFHTATYLINRMPSNVINNECPYTLLKGHSPDYTWLRTVGSAC 669
           GLTLL+ A +P  +W  AF  A YLINR+P+ ++  E P+  L G SP+Y  LR  G AC
Sbjct: 672 GLTLLSHASIPKTYWPYAFAVAVYLINRLPTPLLQLESPFQKLFGTSPNYDKLRVFGCAC 731

BLAST of CmUC08G144660 vs. ExPASy Swiss-Prot
Match: Q9ZT94 (Retrovirus-related Pol polyprotein from transposon RE2 OS=Arabidopsis thaliana OX=3702 GN=RE2 PE=4 SV=1)

HSP 1 Score: 267.7 bits (683), Expect = 4.0e-70
Identity = 217/757 (28.67%), Positives = 321/757 (42.40%), Query Frame = 0

Query: 8   TNLLNQVTS--IELERGNFLLWQNIVLPILRSYKLEDDLTGKTSAPEHSIIIPPSEEELQ 67
           TN+LN   S   +L   N+L+W   V  +   Y+L   L G T        +PP+     
Sbjct: 13  TNILNVNMSNVTKLTSTNYLMWSRQVHALFDGYELAGFLDGSTP-------MPPATIGTD 72

Query: 68  RLVLPNPEHDIWLAADQLLVGWLYNSMTAEVTTQVMGYDEAKPLWDAIQEYFGIQSQSQE 127
            +   NP++  W   D+L+   +  +++  V   V     A  +W+ +++ +   S    
Sbjct: 73  AVPRVNPDYTRWRRQDKLIYSAILGAISMSVQPAVSRATTAAQIWETLRKIYANPSYGHV 132

Query: 128 DYYWQIMRF---------------TSHVFAGLDEEFNPI---------------RHEKL- 187
                I RF                  V   L +++ P+                HE+L 
Sbjct: 133 TQLRFITRFDQLALLGKPMDHDEQVERVLENLPDDYKPVIDQIAAKDTPPSLTEIHERLI 192

Query: 188 -QELK----GSLNLHQLSTNIVFEDKRKNHNSSQSNRSNNQNRDNNRGRGRNRYSP---- 247
            +E K     S  +  ++ N+V   +  N N +Q+NR +N+N +NN  R  N + P    
Sbjct: 193 NRESKLLALNSAEVVPITANVV-THRNTNTNRNQNNRGDNRNYNNNNNRS-NSWQPSSSG 252

Query: 248 ------YPPPNKPTYQISGKYGHAANVCYSRYNRNDDANATSSDNKKNN-SPTALMAFPK 307
                  P P     QI    GH+A  C   +      N   S +      P A +A   
Sbjct: 253 SRSDNRQPKPYLGRCQICSVQGHSAKRCPQLHQFQSTTNQQQSTSPFTPWQPRANLAVNS 312

Query: 308 TVDDSTWYLDSGATNHITTDIQRLSL----------------KVPIAAYGNAKLFCTTLP 367
             + + W LDSGAT+HIT+D   LS                  +PI   G+A L  ++  
Sbjct: 313 PYNANNWLLDSGATHHITSDFNNLSFHQPYTGGDDVMIADGSTIPITHTGSASLPTSSRS 372

Query: 368 INLKDVLYVPH--------------NEVTVEFDNNFCFIKAKKSRETCLIGKLESGLYCL 427
           ++L  VLYVP+              N V+VEF      +K   +    L GK +  LY  
Sbjct: 373 LDLNKVLYVPNIHKNLISVYRLCNTNRVSVEFFPASFQVKDLNTGVPLLQGKTKDELYEW 432

Query: 428 EVVPNKASFREDNKDAARLVFLGEATNSHPQTFEDLAAEKSSNFCTKDLWHQRLGHPSSR 487
            +  ++A                            + A   S   T   WH RLGHPS  
Sbjct: 433 PIASSQA--------------------------VSMFASPCSK-ATHSSWHSRLGHPSLA 492

Query: 488 ILN-----------------------------------------QKLELIHADLWGLAPI 547
           ILN                                         + LE I++D+W  +PI
Sbjct: 493 ILNSVISNHSLPVLNPSHKLLSCSDCFINKSHKVPFSNSTITSSKPLEYIYSDVWS-SPI 552

Query: 548 LSTSGYRYYIAFVDDCNRFVWIYLLQTKGDTATVFKQFKSLAENLYSSKIKTLRCDGGGE 607
           LS   YRYY+ FVD   R+ W+Y L+ K      F  FKSL EN + ++I TL  D GGE
Sbjct: 553 LSIDNYRYYVIFVDHFTRYTWLYPLKQKSQVKDTFIIFKSLVENRFQTRIGTLYSDNGGE 612

Query: 608 FKLVIEYAHQQGIEVQMTCPYTSSQNGRVERKHRHNVETGLTLLAQAQMPLNFWWEAFHT 644
           F ++ +Y  Q GI    + P+T   NG  ERKHRH VE GLTLL+ A +P  +W  AF  
Sbjct: 613 FVVLRDYLSQHGISHFTSPPHTPEHNGLSERKHRHIVEMGLTLLSHASVPKTYWPYAFSV 672

BLAST of CmUC08G144660 vs. ExPASy Swiss-Prot
Match: P10978 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum OX=4097 PE=2 SV=1)

HSP 1 Score: 151.4 bits (381), Expect = 4.2e-35
Identity = 156/605 (25.79%), Positives = 239/605 (39.50%), Query Frame = 0

Query: 182 SQSNRSNNQNRDNNRGRGRNRYS-----------------PYPPPNKPTYQISGKYGHAA 241
           S    SNN  R   RG+ +NR                     P P K   + SG+     
Sbjct: 204 SYQRSSNNYGRSGARGKSKNRSKSRVRNCYNCNQPGHFKRDCPNPRKGKGETSGQ----- 263

Query: 242 NVCYSRYNRNDDANATSSDNKKN-----NSPTALMAFPKTVDDSTWYLDSGATNHITTDI 301
                   +NDD  A    N  N     N     M    +  +S W +D+ A++H  T +
Sbjct: 264 --------KNDDNTAAMVQNNDNVVLFINEEEECMHL--SGPESEWVVDTAASHH-ATPV 323

Query: 302 QRLSLKVPIAAYGNAKLFCTT-----------------LPINLKDVLYVPH------NEV 361
           + L  +     +G  K+  T+                   + LKDV +VP       + +
Sbjct: 324 RDLFCRYVAGDFGTVKMGNTSYSKIAGIGDICIKTNVGCTLVLKDVRHVPDLRMNLISGI 383

Query: 362 TVEFDNNFCFIKAKKSRETCLIGKLESGLYCLEVVPNKASFREDNKDAARLVFLGEATNS 421
            ++ D    +   +K R T      +  L   + V     +R               TN+
Sbjct: 384 ALDRDGYESYFANQKWRLT------KGSLVIAKGVARGTLYR---------------TNA 443

Query: 422 HPQTFEDLAAEKSSNFCTKDLWHQRLGHPSSR---ILNQK-------------------- 481
                E  AA+      + DLWH+R+GH S +   IL +K                    
Sbjct: 444 EICQGELNAAQDE---ISVDLWHKRMGHMSEKGLQILAKKSLISYAKGTTVKPCDYCLFG 503

Query: 482 -----------------LELIHADLWGLAPILSTSGYRYYIAFVDDCNRFVWIYLLQTKG 541
                            L+L+++D+ G   I S  G +Y++ F+DD +R +W+Y+L+TK 
Sbjct: 504 KQHRVSFQTSSERKLNILDLVYSDVCGPMEIESMGGNKYFVTFIDDASRKLWVYILKTKD 563

Query: 542 DTATVFKQFKSLAENLYSSKIKTLRCDGGGEF--KLVIEYAHQQGIEVQMTCPYTSSQNG 601
               VF++F +L E     K+K LR D GGE+  +   EY    GI  + T P T   NG
Sbjct: 564 QVFQVFQKFHALVERETGRKLKRLRSDNGGEYTSREFEEYCSSHGIRHEKTVPGTPQHNG 623

Query: 602 RVERKHRHNVETGLTLLAQAQMPLNFWWEAFHTATYLINRMPSNVINNECPYTLLKGHSP 661
             ER +R  VE   ++L  A++P +FW EA  TA YLINR PS  +  E P  +      
Sbjct: 624 VAERMNRTIVEKVRSMLRMAKLPKSFWGEAVQTACYLINRSPSVPLAFEIPERVWTNKEV 683

Query: 662 DYTWLRTVGSACFPYLWPYQQHKFDFHTTKCVFIEYSDRHKGYKFLSST-RRIFISRHVY 698
            Y+ L+  G   F ++   Q+ K D  +  C+FI Y D   GY+      +++  SR V 
Sbjct: 684 SYSHLKVFGCRAFAHVPKEQRTKLDDKSIPCIFIGYGDEEFGYRLWDPVKKKVIRSRDVV 743

BLAST of CmUC08G144660 vs. ExPASy Swiss-Prot
Match: P04146 (Copia protein OS=Drosophila melanogaster OX=7227 GN=GIP PE=1 SV=3)

HSP 1 Score: 112.1 bits (279), Expect = 2.8e-23
Identity = 81/301 (26.91%), Positives = 146/301 (48.50%), Query Frame = 0

Query: 410 LNQKLELIHADLWGLAPILSTSGYRYYIAFVDDCNRFVWIYLLQTKGDTATVFKQFKSLA 469
           + + L ++H+D+ G    ++     Y++ FVD    +   YL++ K D  ++F+ F + +
Sbjct: 477 IKRPLFVVHSDVCGPITPVTLDDKNYFVIFVDQFTHYCVTYLIKYKSDVFSMFQDFVAKS 536

Query: 470 ENLYSSKIKTLRCDGGGEF--KLVIEYAHQQGIEVQMTCPYTSSQNGRVERKHRHNVETG 529
           E  ++ K+  L  D G E+    + ++  ++GI   +T P+T   NG  ER  R   E  
Sbjct: 537 EAHFNLKVVYLYIDNGREYLSNEMRQFCVKKGISYHLTVPHTPQLNGVSERMIRTITEKA 596

Query: 530 LTLLAQAQMPLNFWWEAFHTATYLINRMPSNVI--NNECPYTLLKGHSPDYTWLRTVGSA 589
            T+++ A++  +FW EA  TATYLINR+PS  +  +++ PY +     P    LR  G+ 
Sbjct: 597 RTMVSGAKLDKSFWGEAVLTATYLINRIPSRALVDSSKTPYEMWHNKKPYLKHLRVFGAT 656

Query: 590 CFPYLWPYQQHKFDFHTTKCVFIEYSDRHKGYKFLSSTRRIFI-SRHVYFNEKEFPYSSI 649
            + ++   +Q KFD  + K +F+ Y     G+K   +    FI +R V  +E     S  
Sbjct: 657 VYVHI-KNKQGKFDDKSFKSIFVGYEP--NGFKLWDAVNEKFIVARDVVVDETNMVNSRA 716

Query: 650 FNKSETITNSSSEISWIPIPNIATTASPSEVFSNQNREKFMQHDGEHVGQYLEDVKDGKT 706
                     S E      PN +     +E F N+++E       +++ Q+L+D K+ + 
Sbjct: 717 VKFETVFLKDSKESENKNFPNDSRKIIQTE-FPNESKE------CDNI-QFLKDSKESEN 766

BLAST of CmUC08G144660 vs. ExPASy Swiss-Prot
Match: Q12491 (Transposon Ty2-B Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) OX=559292 GN=TY2B-B PE=3 SV=1)

HSP 1 Score: 67.0 bits (162), Expect = 1.0e-09
Identity = 38/150 (25.33%), Positives = 74/150 (49.33%), Query Frame = 0

Query: 412 QKLELIHADLWGLAPILSTSGYRYYIAFVDDCNRFVWIYLLQTKGDTA--TVFKQFKSLA 471
           +  + +H D++G    L  S   Y+I+F D+  RF W+Y L  + + +   VF    +  
Sbjct: 659 EPFQYLHTDIFGPVHHLPKSAPSYFISFTDEKTRFQWVYPLHDRREESILNVFTSILAFI 718

Query: 472 ENLYSSKIKTLRCDGGGEF--KLVIEYAHQQGIEVQMTCPYTSSQNGRVERKHRHNVETG 531
           +N +++++  ++ D G E+  K + ++   +GI    T    S  +G  ER +R  +   
Sbjct: 719 KNQFNARVLVIQMDRGSEYTNKTLHKFFTNRGITACYTTTADSRAHGVAERLNRTLLNDC 778

Query: 532 LTLLAQAQMPLNFWWEAFHTATYLINRMPS 558
            TLL  + +P + W+ A   +T + N + S
Sbjct: 779 RTLLHCSGLPNHLWFSAVEFSTIIRNSLVS 808

BLAST of CmUC08G144660 vs. ExPASy TrEMBL
Match: A0A2K3LCM1 (Gag/pol polyprotein-maize retrotransposon Hopscotch (Fragment) OS=Trifolium pratense OX=57577 GN=L195_g032236 PE=4 SV=1)

HSP 1 Score: 458.4 bits (1178), Expect = 5.9e-125
Identity = 296/825 (35.88%), Positives = 414/825 (50.18%), Query Frame = 0

Query: 2   AENPSFTNLLNQVTSIELERGNFLLWQNIVLPILRSYKLEDDLTGKTSAPEHSIIIPPSE 61
           A N +  N L    S++L+R N+ LWQ++VLPI+R  +L+  + GK   PE  I    S 
Sbjct: 4   AANSNHKNDLPSTVSVKLDRDNYPLWQSMVLPIIRGARLDGYMLGKKKCPEEFITAADSS 63

Query: 62  EELQRLVLPNPEHDIWLAADQLLVGWLYNSMTAEVTTQVMGYDEAKPLWDAIQEYFGIQS 121
           ++       NPE + W A DQ L+GWL NSMT  + TQ++  + +  LWD  Q   G  +
Sbjct: 64  KKF------NPEFEDWQAYDQQLLGWLRNSMTVGIATQLLHCETSMQLWDEAQSLAGAHT 123

Query: 122 QSQ------------------EDYYWQIMRFTS----------------HVFAGLDEEFN 181
           +SQ                  EDY  ++                         GLD E+N
Sbjct: 124 RSQITYLKSEFHSTRKGEMKMEDYLIKMKNLADKLKLAGNPISTSDLIIQTLNGLDSEYN 183

Query: 182 PI---------------------RHEKLQELKGSLNLHQLSTNIVFEDKRKNHNSSQSNR 241
           P+                        ++++L    NL   +T  V   K+ +H  ++ N 
Sbjct: 184 PVVVKLSDQTTLSWVDLQAQLLTFENRIEQLNSLTNLTLNATANV--AKKSDHRGNRFNS 243

Query: 242 SNNQNRDNNR-----------GRGRNRYSPYPPPNKPTYQISGKYGHAANVCYSRYNRND 301
           +NN    NN            GRGR R        K T Q+ G   H A  C+ R+++  
Sbjct: 244 NNNWRGSNNNWRGSNFRGWRGGRGRGR------SFKTTCQVCGLDNHIAIDCFYRFDKTY 303

Query: 302 DANATSSDNKKNNSPTALMAFPKTVDDSTWYLDSGATNHIT--TD-IQRLS--------- 361
             +  S++N K  S  A +A   +++D  WY DSGA+NH+T  TD  Q LS         
Sbjct: 304 SRSNHSANNDKQGSHNAFLASQNSIEDYDWYFDSGASNHVTHQTDKFQNLSEHHGKNSLI 363

Query: 362 ----LKVPIAAYGNAKLFCTTLPINLKDVLYVP--------------HNEVTVEFDNNFC 421
                K+ I A G++KL      +NL D+LYVP               N + VEFD N C
Sbjct: 364 VGNGEKLEIVATGSSKL----KSLNLHDILYVPKITKNLLSVSKLAADNNILVEFDENCC 423

Query: 422 FIKAKKSRETCLIGKLESGLYCLEVVPNKASFREDNKDAARLVFLGEATNSHPQTFEDLA 481
           F+K K + +  L G L+ GLY L             KD++  V +               
Sbjct: 424 FVKDKLTGKAILRGILKDGLYQL-----------SEKDSSAYVSI--------------- 483

Query: 482 AEKSSNFCTKDLWHQRLGHPSSRILN---------------------------------- 541
                    K+ WH++LGHP++++L+                                  
Sbjct: 484 ---------KESWHRKLGHPNNKVLDIVLKSCNVKLSPSDQFSFCEACQYGKMHFLPFKT 543

Query: 542 ------QKLELIHADLWGLAPILSTSGYRYYIAFVDDCNRFVWIYLLQTKGDTATVFKQF 601
                 + LEL+H D+WG API+S+SG++YY+ F+DD  RF WIY L+ K DTA  F QF
Sbjct: 544 SFSHAKEILELVHTDVWGPAPIISSSGFKYYVHFIDDFTRFTWIYPLKQKSDTAHAFIQF 603

Query: 602 KSLAENLYSSKIKTLRCDGGGEFKLVIEYAHQQGIEVQMTCPYTSSQNGRVERKHRHNVE 661
           K++ EN +S KIKT++CDGGGE+K V ++A + GI+ +M+CPYTS QNGR ERKHRH  E
Sbjct: 604 KNMVENQFSKKIKTIQCDGGGEYKPVQKHAIEAGIQFRMSCPYTSQQNGRAERKHRHIAE 663

Query: 662 TGLTLLAQAQMPLNFWWEAFHTATYLINRMPSNVINNECPYTLLKGHSPDYTWLRTVGSA 683
            GLTLLAQA+MPLN+WWEAF TA YLINR+PS+V +N+ PY+LL    PDY  L+  G A
Sbjct: 664 FGLTLLAQAKMPLNYWWEAFSTAVYLINRLPSSVTHNKSPYSLLHKREPDYNSLKPFGCA 723

BLAST of CmUC08G144660 vs. ExPASy TrEMBL
Match: A0A2K3MUJ9 (Putative retrotransposon Ty1-copia subclass protein (Fragment) OS=Trifolium pratense OX=57577 GN=L195_g017679 PE=4 SV=1)

HSP 1 Score: 444.1 bits (1141), Expect = 1.2e-120
Identity = 282/802 (35.16%), Positives = 400/802 (49.88%), Query Frame = 0

Query: 9   NLLNQVTSIELERGNFLLWQNIVLPILRSYKLEDDLTGKTSAPEHSIIIPPSEEELQRLV 68
           N L    S++L+R NF LW+++VLP++R  K +  + G    P+  +    + E++    
Sbjct: 10  NDLPSTVSVKLDRDNFPLWKSLVLPLIRGCKYDGYMLGTKKCPDQFVTSIDNTEKI---- 69

Query: 69  LPNPEHDIWLAADQLLVGWLYNSMTAEVTTQVMGYDEAKPLWDAIQEYFGIQSQSQ---- 128
             NP++  W A DQ L+GWL NSMT ++ TQV+  + +K LWD  Q   G  ++S+    
Sbjct: 70  --NPDYQDWQADDQALLGWLMNSMTVDIATQVLHCETSKQLWDEAQSLAGAHTRSRIIYL 129

Query: 129 --------------EDYYWQIMRFTS----------------HVFAGLDEEFNPI----- 188
                         E Y  ++                         GLD E+NP+     
Sbjct: 130 KSEFHNTHKREMKMEQYLAKMKNLADKLKLAGSPISSSDLMIQTLNGLDSEYNPVVVKLS 189

Query: 189 ----------------RHEKLQELKGSLNLHQLSTNIVFEDKRK---NHNSSQSNRSNNQ 248
                              +L +L    N++ L+ +  F  K +   N   S+     + 
Sbjct: 190 DQTNISWVDFQAQLLAFESRLDQLNNFNNIN-LNASANFASKNESGGNKFGSRGGWRGSN 249

Query: 249 NRDNNRGRGRNRYSPYPPPNKPTYQISGKYGHAANVCYSRYNRNDDANATSSDNKKNNSP 308
           +R    GRGR R S    P +P  QI GK+GH A  CY R++++       ++ + ++S 
Sbjct: 250 SRGMRGGRGRARMS---KPPRPICQICGKFGHTAAQCYYRFDKSYTEKNHYAEGEGSHS- 309

Query: 309 TALMAFPKTVDDSTWYLDSGATNHITTDIQRL----------------SLKVPIAAYGNA 368
            A +A P    D  WY DSGA+NH+T    +L                  K+ I A G+ 
Sbjct: 310 -AFVASPYHGQDYEWYFDSGASNHVTHQSGQLQDLNENNGKNSLLVGNGEKLKILASGST 369

Query: 369 KLFCTTLPINLKDVLYVPH--------------NEVTVEFDNNFCFIKAKKSRETCLIGK 428
           KL      +NL++VLYVP               N   VEFD N+C++K K + +  L G+
Sbjct: 370 KL----NDVNLRNVLYVPEITKNLLSVSKLTIDNNALVEFDENYCYVKDKLTGKALLKGR 429

Query: 429 LESGLYCLEVVPNKASFREDNKDAARLVFLGEATNSHPQTFEDLAAEKSSNFCTKDLWHQ 488
           L+ GLY L                        + N  P T +D  A  S     K++WH+
Sbjct: 430 LKDGLYQL------------------------SANKEPPTNKDPCAYIS----LKEIWHR 489

Query: 489 RLGHPSSRIL----------------------------------------NQKLELIHAD 548
           +LGHP++++L                                         + L+LIH D
Sbjct: 490 KLGHPNNKVLEKVLKDNNVKISPSDKFTFCEACQFGKLHLLPFKTSSSHAKEPLDLIHTD 549

Query: 549 LWGLAPILSTSGYRYYIAFVDDCNRFVWIYLLQTKGDTATVFKQFKSLAENLYSSKIKTL 608
           +WG APILS S ++YY+ F+DD +RF WI+ L+ K +T   F QFK+L EN ++ KIK +
Sbjct: 550 VWGPAPILSQSNFKYYVHFLDDFSRFTWIFPLKQKSETIHAFNQFKNLVENQFNKKIKVI 609

Query: 609 RCDGGGEFKLVIEYAHQQGIEVQMTCPYTSSQNGRVERKHRHNVETGLTLLAQAQMPLNF 668
           RCDGGGE+K V + A   GI+ QM+CPYTS QNGR ERKHRH  E GLTLLAQA+MPL++
Sbjct: 610 RCDGGGEYKPVQKCAIDSGIQFQMSCPYTSQQNGRAERKHRHVTELGLTLLAQAKMPLSY 669

Query: 669 WWEAFHTATYLINRMPSNVINNECPYTLLKGHSPDYTWLRTVGSACFPYLWPYQQHKFDF 679
           WWEAF TA YLINR+PS+V  NE PYTL+    PDYT L+  G AC+P L PY QHK  F
Sbjct: 670 WWEAFSTAVYLINRLPSSVNPNESPYTLVFKKEPDYTALKPFGCACYPCLKPYNQHKLQF 729

BLAST of CmUC08G144660 vs. ExPASy TrEMBL
Match: A0A2Z6MBG6 (Integrase catalytic domain-containing protein OS=Trifolium subterraneum OX=3900 GN=TSUD_77270 PE=4 SV=1)

HSP 1 Score: 437.6 bits (1124), Expect = 1.1e-118
Identity = 279/827 (33.74%), Positives = 395/827 (47.76%), Query Frame = 0

Query: 9   NLLNQVTSIELERGNFLLWQNIVLPILRSYKLEDDLTGKTSAPEHSIIIPPSEEELQRLV 68
           N L    S++L+R N+ LW+++VLP++R  KL+  + G    PE  I    S +      
Sbjct: 11  NDLPSSVSVKLDRNNYPLWKSLVLPVIRGCKLDGYMLGTEGCPEEFITSSDSSKN----- 70

Query: 69  LPNPEHDIWLAADQLLVGWLYNSMTAEVTTQVMGYDEAKPLWDAIQEYFGIQSQSQ---- 128
             N     W A DQ L+GW+ NSMT E+ TQ++  + +K LWD  Q   G  ++SQ    
Sbjct: 71  -KNSAFVEWQANDQRLLGWMLNSMTTEIATQLLHCETSKQLWDEAQSLAGAHTRSQIIYL 130

Query: 129 --------------EDYYWQIMRFTS----------------HVFAGLDEEFNPI----- 188
                         EDY  ++                         GLD E+NP+     
Sbjct: 131 KSEFHSIRKGEMKMEDYLIKMKNLVDKLKLAGNPVSTSDLIIQTLNGLDSEYNPVVVKLS 190

Query: 189 ----------------RHEKLQELKGSLNLHQLSTNIVFEDKRKNHNSSQSNRSNNQNRD 248
                              ++++L    NL   +T  V         SS +N   + +R 
Sbjct: 191 DQTTLSWVDLQAQLLTFESRIEQLNNLTNLTLNATANVANRSDHRGKSSNNNWRGSNSRG 250

Query: 249 NNRGRGRNRYSPYPPPNKPTYQISGKYGHAANVCYSRYNRNDDANATSSDNKKNNSPTAL 308
              GRGR +    P       Q+ G   H A  C+ R+++    +  S+ + K  S  A 
Sbjct: 251 WRGGRGRGKSGKNP------CQVCGLSNHIAIDCFHRFDKTYSRSNHSAGHDKQGSHNAF 310

Query: 309 MAFPKTVDDSTWYLDSGATNHITTDIQRL----------------SLKVPIAAYGNAKLF 368
           +A   +V+D  WY DSGA+NH+T   ++                   K+ I A G++KL 
Sbjct: 311 LASQNSVEDYDWYFDSGASNHVTHQTEKFQDLTEHHGKNSLVVGNGEKLAILATGSSKL- 370

Query: 369 CTTLPINLKDVLYVPH--------------NEVTVEFDNNFCFIKAKKSRETCLIGKLES 428
                +NL D+LYVP+              N + VEFD N CF+K K + +  L G L+ 
Sbjct: 371 ---KSLNLHDILYVPNITKNLLSVSKLAADNNILVEFDENCCFVKDKLTGKVILKGLLKD 430

Query: 429 GLYCLEVVPNKASFREDNKDAARLVFLGEATNSHPQTFEDLAAEKSSNFCTKDLWHQRLG 488
           GLY L                         T  +P  F  +          K+ WH+RLG
Sbjct: 431 GLYQL-----------------------SGTKRNPSAFVSV----------KESWHRRLG 490

Query: 489 HPSSRILN----------------------------------------QKLELIHADLWG 548
           HP++++L+                                        + LEL+H D+WG
Sbjct: 491 HPNNKVLDKVLESCKVKVPPSDNFSFCEACQYGKMHLLPFKSSSSHAQEPLELVHTDVWG 550

Query: 549 LAPILSTSGYRYYIAFVDDCNRFVWIYLLQTKGDTATVFKQFKSLAENLYSSKIKTLRCD 608
            API+++SG++YY+ FVDD +RF WIY L+ K +T   F QFK+L EN ++ +IK ++CD
Sbjct: 551 PAPIMTSSGFKYYVHFVDDFSRFTWIYPLKQKSETVQAFIQFKNLTENQFNKRIKVIQCD 610

Query: 609 GGGEFKLVIEYAHQQGIEVQMTCPYTSSQNGRVERKHRHNVETGLTLLAQAQMPLNFWWE 668
           GGGE+K V + A + GI+ +M+CPYTS QNGR ERKHRH  E GLTLLAQAQMPL++WWE
Sbjct: 611 GGGEYKPVQKLAVEAGIQFRMSCPYTSQQNGRAERKHRHITEFGLTLLAQAQMPLHYWWE 670

Query: 669 AFHTATYLINRMPSNVINNECPYTLLKGHSPDYTWLRTVGSACFPYLWPYQQHKFDFHTT 706
           AF TA YLINR+PS V  NE PY+L+    PDY  L+T G AC+P L PY QHK  +HTT
Sbjct: 671 AFSTAVYLINRLPSQVTQNESPYSLMLQKEPDYKLLKTFGCACYPCLKPYNQHKLQYHTT 730

BLAST of CmUC08G144660 vs. ExPASy TrEMBL
Match: A0A803QCY3 (Uncharacterized protein OS=Cannabis sativa OX=3483 PE=4 SV=1)

HSP 1 Score: 432.6 bits (1111), Expect = 3.5e-117
Identity = 277/767 (36.11%), Positives = 410/767 (53.46%), Query Frame = 0

Query: 5   PSFTNLLNQVTSIELERGNFLLWQNIVLPILRSYKLEDDLTGKTSAPEHSIIIPPSEEEL 64
           P   + L Q  S++L+  N+ LW+ +V  I+R ++L+  L G    P   +    +E+  
Sbjct: 30  PHHFSTLKQPFSLKLDMNNYSLWKTMVSTIVRGHRLDGFLNGTNVCPSEYVYTGSTEDGS 89

Query: 65  QRLVLPNPEHDIWLAADQLLVGWLYNSMTAEVTTQVMGYDEAKPLWDAIQEYFGIQSQSQ 124
           + +   NPE + W+  DQLL+GWLY+SMT  + T+VMG   A  LW A+++ +G  S+S+
Sbjct: 90  KTIKTLNPEFENWIVNDQLLMGWLYSSMTETIATEVMGSTSAAGLWHALEQLYGAHSKSK 149

Query: 125 EDYYWQIMRFTSHVFAGLDEEFNPIRHEKLQELKGSLNL-------HQLSTNIV------ 184
            D    +++ T     G       +R +K      SL L        QL+TN++      
Sbjct: 150 MDDTRTLIQTTK---KGGTPMIEYLRQKK--SWADSLALAGEPYPEAQLATNVLSRLDIN 209

Query: 185 -------------------------FEDK----RKNHNSSQSNRSNNQNRDNNRGRGRNR 244
                                    FE K     + +N+S+    N  N   +RGRGR  
Sbjct: 210 YLTLVLQIKARTKTSWQELQELLLSFESKVERLGRGNNNSRGRHFNRGNGGRSRGRGRTN 269

Query: 245 YSPYPPPNKPTYQISGKYGHAANVCYSRYNRNDDANATSSDN-----KKNNSPTALMAFP 304
            S      KPT Q+ GKY H+A VCY+ ++ +   +   S N     + NN+P+A +A P
Sbjct: 270 NS------KPTCQVCGKYDHSAVVCYNWFDDSYMGSDPHSSNQNKTGQNNNNPSAFIATP 329

Query: 305 KTVDDSTWYLDSGATNHITTD------------IQRLSL----KVPIAAYGNAKLFCTT- 364
           + +D   W+ DSGA+N+IT D             +++++    K+ I+ +GN KL+  T 
Sbjct: 330 EFLDSEAWFADSGASNNITADPSVIPQKQEYGGKEKVTVGNGDKLVISHFGNGKLYTKTG 389

Query: 365 ----------LPINLKDVLYV----PHNEVTVEFDNNFCFIKAKKSRETCLIGKLESGLY 424
                     +PI  K+ L V      N+V +EF +N CF+K   +R   L G L+ GLY
Sbjct: 390 QWLKLNEMLLVPIIAKNFLSVSKLTTDNDVIIEFHSNSCFVKDIATRRVLLQGMLKDGLY 449

Query: 425 CLEVVPNKASFREDNKDAARLVFLGEATNSHPQTFEDLAAEKSSNFCTKDLWHQ-RLGHP 484
            L+   NK+++   +       F+   ++ +P T +        + C     H     H 
Sbjct: 450 QLQTPRNKSAYLRFSTSK----FI--VSDCNPFTVDHFC-----DACQYGKSHSLPFKHS 509

Query: 485 SSRILNQKLELIHADLWGLAPILSTSGYRYYIAFVDDCNRFVWIYLLQTKGDTATVFKQF 544
           +S+ L + L+L+H DLWG +PI S   ++YY+ FVDDC RF WIY L+ K +    F  F
Sbjct: 510 NSKAL-KVLDLVHTDLWGPSPITSNQDFKYYVHFVDDCTRFTWIYPLKNKSEACDAFLAF 569

Query: 545 KSLAENLYSSKIKTLRCDGGGEFKLVIEYAHQQGIEVQMTCPYTSSQNGRVERKHRHNVE 604
           KSLAEN +  KIK LR DGGGE++++ ++    GI    +CP+TSSQNGR ERKHRH VE
Sbjct: 570 KSLAENQFERKIKALRTDGGGEYQVLSDFVVTHGINFHHSCPHTSSQNGRAERKHRHIVE 629

Query: 605 TGLTLLAQAQMPLNFWWEAFHTATYLINRMPSNVINNECPYTLLKGHSPDYTWLRTVGSA 664
            GLTLLAQ+ MPL +WW+AF TA YLINR+P+ +++++ P+ +L    PDY +L+T G A
Sbjct: 630 MGLTLLAQSIMPLKYWWDAFSTAVYLINRLPTPILDHKTPFEMLHKKIPDYKFLKTFGVA 689

Query: 665 CFPYLWPYQQHKFDFHTTKCVFIEYSDRHKGYKFLSSTRRIFISRHVYFNEKEFPYSSIF 683
           CFP L PYQ HKF FH+ KCV + YSD HKGYK LS T RI+I R V FNE EFP+   F
Sbjct: 690 CFPCLRPYQAHKFQFHSIKCVNLGYSDAHKGYKCLSPTGRIYILRDVVFNELEFPFQISF 749

BLAST of CmUC08G144660 vs. ExPASy TrEMBL
Match: A0A2Z6P4D5 (Integrase catalytic domain-containing protein OS=Trifolium subterraneum OX=3900 GN=TSUD_412550 PE=4 SV=1)

HSP 1 Score: 431.4 bits (1108), Expect = 7.8e-117
Identity = 278/785 (35.41%), Positives = 395/785 (50.32%), Query Frame = 0

Query: 2   AENPSFTNLLNQVTSIELERGNFLLWQNIVLPILRSYKLEDDLTGKTSAPEHSIIIPPSE 61
           A N    N L  + S++L+R N+ LW+++VL ++R  KL+  + G T  PE  +      
Sbjct: 4   AANSPKKNDLPSIISVKLDRDNYPLWKSLVLSLIRGCKLDGYILGTTECPEQFVTSADKS 63

Query: 62  EELQRLVLPNPEHDIWLAADQLLVGWLYNSMTAEVTTQVMGYDEAKPLWDAIQEYFGIQS 121
           +++      NP+   W+A DQ L+GWL NSM  ++ TQ++  + +K LWD  Q   G  +
Sbjct: 64  KKV------NPDFGDWIANDQALLGWLMNSMAIDIATQLLHCETSKQLWDETQSLAGAHT 123

Query: 122 QSQ------------------EDYYWQIMRFTS----------------HVFAGLDEEFN 181
           +S+                  E+Y  ++   +                     GLD E+N
Sbjct: 124 KSRITYLKSEFHNTRKGEMKMEEYLIKMKNLSDKLKLAGSPISNSDLMIQTLNGLDAEYN 183

Query: 182 PIRHEKLQELKGSLNLHQLSTN---IVFEDKRKNHNS--------------------SQS 241
           P+    + +L   +NL  +      + FE +    N+                    ++ 
Sbjct: 184 PV----VVKLSDQINLSWVDVQAQLLAFESRLDQFNNFSGLTLNASANFANKTEFRGNKF 243

Query: 242 NRSNNQNRDNNR----GRGRNRYSPYPPPNKPTYQISGKYGHAANVCYSRYNRNDDANAT 301
           N   N  R N R    GRG+ R S          Q+    GH A  C  R++R       
Sbjct: 244 NSRGNWRRSNFRGMRGGRGKGRMS------NTKCQVCNGTGHIAVDCSYRFDRPYTGRNY 303

Query: 302 SSDNKKNNSPTALMAFPKTVDDSTWYLDSGATNHITTDIQRL----------------SL 361
           S++  K  S +A +A P    D  WY DSGA NH+T    +                   
Sbjct: 304 STEADKQGSHSAFIASPYHGQDYEWYFDSGANNHVTHQTDKFQGFNEHNGKNSLMVGNGE 363

Query: 362 KVPIAAYGNAKLFCTTLPINLKDVLYVPH--------------NEVTVEFDNNFCFIKAK 421
           K+ I A G+ KL      +NL DVLYVP               N + VEFD N C +K K
Sbjct: 364 KLKIVASGSTKL----NNLNLHDVLYVPQITKNLLSVSKLTADNNILVEFDANCCSVKDK 423

Query: 422 KSRETCLIGKLESGLY-------CLEVVPNKASFRE----DNKDAARLVFLGEATNSHPQ 481
            + +T L G+L+ GLY       C+ +   ++  R+    +NK   +++       SH  
Sbjct: 424 LTGQTLLKGRLKDGLYQLSNKEPCVYMSVKESWHRKLGHPNNKVLDKVLKDCNVKISHSD 483

Query: 482 TFEDLAAEKSSNFCTKDLWHQRLGHPSSRILNQKLELIHADLWGLAPILSTSGYRYYIAF 541
            F    A      C     H     PSS  + + L LIH+D+WG APILS SG++YY+ F
Sbjct: 484 QFSFCEA------CQFGKLHLLPFKPSSSHVQEPLALIHSDVWGPAPILSPSGFKYYVHF 543

Query: 542 VDDCNRFVWIYLLQTKGDTATVFKQFKSLAENLYSSKIKTLRCDGGGEFKLVIEYAHQQG 601
           +DD +RF WI+ L+ K DT   F QFK+LAEN ++ KIK ++CDGGGE+K V + + + G
Sbjct: 544 IDDFSRFTWIFPLKQKSDTIHAFIQFKNLAENQFNKKIKIIQCDGGGEYKAVQKVSIEAG 603

Query: 602 IEVQMTCPYTSSQNGRVERKHRHNVETGLTLLAQAQMPLNFWWEAFHTATYLINRMPSNV 661
           I+ +M+CPYTS QNGR ERKHRH  E GLTLLAQA+MPL +WWEAF TA YLINR+PS+V
Sbjct: 604 IQFRMSCPYTSQQNGRAERKHRHVAELGLTLLAQAKMPLRYWWEAFSTAVYLINRLPSSV 663

Query: 662 INNECPYTLLKGHSPDYTWLRTVGSACFPYLWPYQQHKFDFHTTKCVFIEYSDRHKGYKF 681
             NE PY+L+    PDY  L+  G AC+P L PY QHK  FHTT+CVF+ YS+ HKGYK 
Sbjct: 664 NPNESPYSLMFKREPDYNALKPFGCACYPCLKPYNQHKLQFHTTRCVFVGYSNSHKGYKC 723

BLAST of CmUC08G144660 vs. TAIR 10
Match: AT5G13210.1 (Uncharacterised conserved protein UCP015417, vWA )

HSP 1 Score: 89.4 bits (220), Expect = 1.4e-17
Identity = 56/116 (48.28%), Positives = 69/116 (59.48%), Query Frame = 0

Query: 661 IPIPNIATTASPSEVFSNQNREKFMQHDGEHVGQYLEDVKDGKTKIAAGALLPHEIIKSL 720
           +P   +A+ A  S       +E F++HD E   QYL+D K GKTK+AAGA+LPHEII+ L
Sbjct: 372 LPYNRVASVAMKS------YKEIFLKHDAERFQQYLDDAKAGKTKVAAGAVLPHEIIREL 431

Query: 721 DDGE-EDCGEL----------KKGKLRNCIAVCDASVR---DSMGVCVALDLLVSE 763
           D G+     EL          +KG LRNCIAVCD S     + M VCVAL LLVSE
Sbjct: 432 DGGDGGQVAELQWKRTVDDMKEKGSLRNCIAVCDVSGSMNGEPMEVCVALGLLVSE 481

BLAST of CmUC08G144660 vs. TAIR 10
Match: AT5G43400.1 (Uncharacterised conserved protein UCP015417, vWA )

HSP 1 Score: 82.4 bits (202), Expect = 1.7e-15
Identity = 54/109 (49.54%), Positives = 64/109 (58.72%), Query Frame = 0

Query: 672 PSEVFSNQNREKFMQHDGEHVGQYLEDVKDGKTKIAAGALLPHEIIKSLDD--GEEDCGE 731
           PS    N  ++ F +HD E   ++LEDVK GK KIAAGALLPH+II  L+D  G E   E
Sbjct: 353 PSVAMKNY-KKLFEEHDSERFTEFLEDVKSGKKKIAAGALLPHQIINQLEDDSGSEVGAE 412

Query: 732 L-------------KKGKLRNCIAVCDASVRDS---MGVCVALDLLVSE 763
           +             KKGKL+N +AVCD S   S   M VCVAL LLVSE
Sbjct: 413 VAELQWARMVDDLAKKGKLKNSLAVCDVSGSMSGTPMEVCVALGLLVSE 460

BLAST of CmUC08G144660 vs. TAIR 10
Match: AT5G43390.1 (Uncharacterised conserved protein UCP015417, vWA )

HSP 1 Score: 81.6 bits (200), Expect = 2.9e-15
Identity = 50/106 (47.17%), Positives = 62/106 (58.49%), Query Frame = 0

Query: 672 PSEVFSNQNREKFMQHDGEHVGQYLEDVKDGKTKIAAGALLPHEIIKSL--DDGEEDCGE 731
           PS    N +  +F +HD E   ++LEDVK GK K+AAGALLPH+II  L  D   E+  E
Sbjct: 344 PSIAMQNYS-SRFAEHDSERFTEFLEDVKSGKKKMAAGALLPHQIISQLLNDSEGEEVAE 403

Query: 732 L----------KKGKLRNCIAVCDAS---VRDSMGVCVALDLLVSE 763
           L          KKGKL+N +A+CD S       M VC+AL LLVSE
Sbjct: 404 LQWARMVDDLAKKGKLKNSLAICDVSGSMAGTPMNVCIALGLLVSE 448

BLAST of CmUC08G144660 vs. TAIR 10
Match: AT3G24780.1 (Uncharacterised conserved protein UCP015417, vWA )

HSP 1 Score: 73.6 bits (179), Expect = 7.9e-13
Identity = 50/116 (43.10%), Positives = 64/116 (55.17%), Query Frame = 0

Query: 661 IPIPNIATTASPSEVFSNQNREKFMQHDGEHVGQYLEDVKDGKTKIAAGALLPHEIIKSL 720
           +P   +A+ A  S       +E F+  D +   QYL D K GKTKIAAGA+LPHEII+ L
Sbjct: 424 LPYNRVASVAMKS------YKEVFLYRDEKRFQQYLNDAKTGKTKIAAGAVLPHEIIREL 483

Query: 721 DDGE-EDCGEL----------KKGKLRNCIAVCDASVR---DSMGVCVALDLLVSE 763
           + G+     EL          +KG L NC+A+CD S     + M V VAL LLVSE
Sbjct: 484 NGGDGGKVAELQWKRMVDDLKEKGSLTNCMAICDVSGSMNGEPMEVSVALGLLVSE 533
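
The percentages on each "Identity = ..." line above are the matching (or positive-scoring) positions divided by the alignment length, for example 81/301 = 26.91%. The sketch below shows one way to pull those counts out of a summary line and recompute the percentages. The function names `parse_hsp_summary` and `percent` are illustrative and not part of any BLAST tool; the pattern accepts both the "Postives" spelling that some page exports carry and the standard "Positives".

```python
import re

# Pattern for the per-HSP summary line as it appears on this page.
# "Pos\w*ives" accepts both "Positives" and the "Postives" misspelling.
HSP_RE = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"Pos\w*ives = (\d+)/(\d+) \(([\d.]+)%\)"
)


def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to 2 decimals."""
    return round(100.0 * numerator / denominator, 2)


def parse_hsp_summary(line):
    """Extract identity/positive counts from one HSP summary line.

    Hypothetical helper for illustration; raises ValueError if the line
    does not look like an HSP summary.
    """
    match = HSP_RE.search(line)
    if match is None:
        raise ValueError("not an HSP summary line: %r" % line)
    identities, length, _, positives, pos_length, _ = match.groups()
    return {
        "identities": int(identities),
        "positives": int(positives),
        "aligned_length": int(length),
        "identity_pct": percent(int(identities), int(length)),
        "positive_pct": percent(int(positives), int(pos_length)),
    }


if __name__ == "__main__":
    example = "Identity = 81/301 (26.91%), Postives = 146/301 (48.50%), Query Frame = 0"
    print(parse_hsp_summary(example))
```

Recomputing the percentages from the raw counts is a cheap consistency check when these lines are copied between tools.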

The following BLAST results are available for this feature:
Match Name	E-value	Identity (%)	Description
PNX76291.1	1.2e-124	35.88	gag/pol polyprotein - maize retrotransposon Hopscotch, partial [Trifolium praten...
PNX94503.1	2.4e-120	35.16	putative retrotransposon Ty1-copia subclass protein, partial [Trifolium pratense...
GAU19483.1	2.2e-118	33.74	hypothetical protein TSUD_77270 [Trifolium subterraneum]
GAU51268.1	1.6e-116	35.41	hypothetical protein TSUD_412550 [Trifolium subterraneum]
GAU17915.1	3.0e-115	36.41	hypothetical protein TSUD_330400, partial [Trifolium subterraneum]

Match Name	E-value	Identity (%)	Description
Q94HW2	4.8e-71	28.82	Retrovirus-related Pol polyprotein from transposon RE1 OS=Arabidopsis thaliana O...
Q9ZT94	4.0e-70	28.67	Retrovirus-related Pol polyprotein from transposon RE2 OS=Arabidopsis thaliana O...
P10978	4.2e-35	25.79	Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum...
P04146	2.8e-23	26.91	Copia protein OS=Drosophila melanogaster OX=7227 GN=GIP PE=1 SV=3
Q12491	1.0e-09	25.33	Transposon Ty2-B Gag-Pol polyprotein OS=Saccharomyces cerevisiae (strain ATCC 20...

Match Name	E-value	Identity (%)	Description
A0A2K3LCM1	5.9e-125	35.88	Gag/pol polyprotein-maize retrotransposon Hopscotch (Fragment) OS=Trifolium prat...
A0A2K3MUJ9	1.2e-120	35.16	Putative retrotransposon Ty1-copia subclass protein (Fragment) OS=Trifolium prat...
A0A2Z6MBG6	1.1e-118	33.74	Integrase catalytic domain-containing protein OS=Trifolium subterraneum OX=3900 ...
A0A803QCY3	3.5e-117	36.11	Uncharacterized protein OS=Cannabis sativa OX=3483 PE=4 SV=1
A0A2Z6P4D5	7.8e-117	35.41	Integrase catalytic domain-containing protein OS=Trifolium subterraneum OX=3900 ...

Match Name	E-value	Identity (%)	Description
AT5G13210.1	1.4e-17	48.28	Uncharacterised conserved protein UCP015417, vWA
AT5G43400.1	1.7e-15	49.54	Uncharacterised conserved protein UCP015417, vWA
AT5G43390.1	2.9e-15	47.17	Uncharacterised conserved protein UCP015417, vWA
AT3G24780.1	7.9e-13	43.10	Uncharacterised conserved protein UCP015417, vWA

InterPro
Analysis Name: InterPro Annotations of Watermelon (USVL531) v1
Date Performed: 2022-01-31
IPR Term	IPR Description	Source	Source Term	Source Description	Alignment
IPR001584	Integrase, catalytic core	PFAM	PF00665	rve	coord: 413..509; e-value: 4.2E-8; score: 33.4
IPR001584	Integrase, catalytic core	PROSITE	PS50994	INTEGRASE	coord: 401..573; score: 18.164782
IPR011205	Uncharacterised conserved protein UCP015417, vWA	PFAM	PF11443	DUF2828	coord: 678..721; e-value: 2.6E-11; score: 42.4
IPR036397	Ribonuclease H superfamily	GENE3D	3.30.420.10	-	coord: 408..583; e-value: 2.3E-30; score: 107.3
None	No IPR available	MOBIDB_LITE	mobidb-lite	disorder_prediction	coord: 174..216
None	No IPR available	PANTHER	PTHR11439	GAG-POL-RELATED RETROTRANSPOSON	coord: 397..630
None	No IPR available	PANTHER	PTHR11439:SF324	RIBONUCLEASE H-LIKE DOMAIN, GAG-PRE-INTEGRASE DOMAIN, GAG-POLYPEPTIDE OF LTR COPIA-TYPE-RELATED	coord: 397..630
IPR012337	Ribonuclease H-like superfamily	SUPERFAMILY	53098	Ribonuclease H-like	coord: 411..567
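
The "coord: start..end" values in the InterPro table can be compared directly to see which predicted domains coincide. The sketch below assumes the coordinates are inclusive, 1-based positions on the predicted protein, which the page does not state explicitly. The helper names `domain_span` and `overlap_length` are illustrative.

```python
import re

# Matches the "coord: 413..509" fragments used in the InterPro table.
COORD_RE = re.compile(r"coord:\s*(\d+)\.\.(\d+)")


def domain_span(text):
    """Return (start, end, length) for the first 'coord: a..b' in text.

    Assumes inclusive coordinates, so length = end - start + 1.
    """
    match = COORD_RE.search(text)
    if match is None:
        raise ValueError("no coordinate found in %r" % text)
    start, end = int(match.group(1)), int(match.group(2))
    return start, end, end - start + 1


def overlap_length(a, b):
    """Number of positions shared by two inclusive (start, end) ranges."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return max(0, high - low + 1)


if __name__ == "__main__":
    rve = domain_span("PF00665 rve coord: 413..509")
    prosite = domain_span("PS50994 INTEGRASE coord: 401..573")
    duf = domain_span("PF11443 DUF2828 coord: 678..721")
    print("rve length:", rve[2])
    print("rve vs PROSITE overlap:", overlap_length(rve[:2], prosite[:2]))
    print("rve vs DUF2828 overlap:", overlap_length(rve[:2], duf[:2]))
```

On the values listed here, the Pfam rve hit lies entirely inside the PROSITE integrase range, while the DUF2828 hit does not overlap it.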

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name	Unique Name	Type
CmUC08G144660.1	CmUC08G144660.1	mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category	Term Accession	Term Name
biological_process	GO:0015074	DNA integration
molecular_function	GO:0003676	nucleic acid binding