MC07g_new0372 (gene) Bitter gourd (Dali-11) v1

Overview
Name: MC07g_new0372
Type: gene
Organism: Momordica charantia cv. Dali-11 (Bitter gourd (Dali-11) v1)
Description: Integrase catalytic domain-containing protein
Location: MC07: 14253175 .. 14257222 (+)
RNA-Seq Expression: MC07g_new0372
Synteny: MC07g_new0372
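
The Location field packs the chromosome, start, end, and strand into one string. The sketch below shows one way to split it and derive the genomic span. It assumes the coordinates are 1-based and inclusive, which the page does not state, and `parse_location` is a hypothetical helper name, not something the database provides.

```python
import re

def parse_location(text: str) -> dict:
    """Split a location string like 'MC07: 14253175 .. 14257222 (+)' into fields."""
    match = re.fullmatch(r"\s*(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised location: {text!r}")
    start, end = int(match.group(2)), int(match.group(3))
    return {
        "seqid": match.group(1),
        "start": start,
        "end": end,
        "strand": match.group(4),
        # Assumption: 1-based inclusive coordinates, so both endpoints count.
        "length": end - start + 1,
    }

loc = parse_location("MC07: 14253175 .. 14257222 (+)")
print(loc["seqid"], loc["strand"], loc["length"])  # MC07 + 4048
```

Under a 0-based, half-open convention the length would instead be `end - start`, so the convention should be confirmed before the value is used downstream.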
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

CGAAATCGAGGTAGAGGACGGTGGAACAACAACAATAGTCGGCAAATTTGTCAGGTGTGTGGTAAACCTGGACATTCAGCACTAACGTGCTACCATCGATTTGATAAGGAGTACAGGAACAATACACAAAGCCAGGGTAAAAACTTCAATGGCGACTCTAACCAGGGGGTTAACAACAACTCTGGACAAGGTACATCTTATGCCTTCACAGCAACCCAAAATAACAATCCTTTTTTGGCCAATCCAGAAACAGTGATAGACCCGAATTGGTATGTGGATAGTGGTGCTTCAAATCATGTCACCGCCGACTACAATAGTATGGTTCAACCTACTGAATATGGAGGTATGGAAAGAGTTACAGTAGGTAATGACGATAAATTAAAAATATCTCATGTTGACAAATCCTGTTTAGTTTCTGACGGTGGGTTGATCATGCTTGAAAATGTGTTGTGCGTACCTAACATAGCTAAAAATCTAGTTAGCGTATCTAAACTCGCTAAAGACAATAACGTATACCTTGAATTTCATGCTGATTCTTGTCTTGTAAAGGATATACGTTCGGGCAAGGTGGTGCTGAAAGGGGCTCTTAAGGATGGACTTTACCGCCTCAATACTGTTGGAGTAGTCATTGGGAGTACTTCGACTCCAGTTGACTGTGGCTTGGAGTTGGCTGCTAATAAAACTATTTGTTCTGTGTCTCTTCCCAAATCATCCAGTAGTATAAATGTTGTGGTGTCCAAGGACGTTTGGCATCGTCGACTTGGACATCCGTCTTCTCAAGTTTTTAGAAGTTTAATTAAACGTTGTAATCTGCCCTTGAAAGTTCATGATAATGTCAACTTTTGTGAAGCATACAAATATAGCAAATCTCATGGTCTGCCTTTCCCTCTATCTAGTTCACAAGCTACTGCTCCATTCATGTTAGTGCATACTGATCTATGGGGACCTGCACCGTTATGTCATCTGATGGGTATAGATACTATGTTCATTTTCTTGATGACTATAGCCGATTTGTATGGATTTATCCTTTGAAGTTAAAGAGTGACACACTTTCAGCATTTAATCATTTTACTACTATGATCAAGACTCAATTTGGCAGCCATATTAAAATGTTACAGTCTGACAATGGAGGAGAATATAAACGAGTCCATCAGTTATGCCATCAGTTGGGGATACAATCCAGATTTTCGTGCCCGTACACTTCTGCGCAAAATGGTCGAGCTGAGCGTAAACATCGCCATATTGTTGAGACCGGTCTCACTTTGCTCGCTCAAGCTTCCATGCCTCTAAGTTTTTGGTGGGAGGCCTTCCTGACATCAACCTTATTAATCAATGGTCTTCCTTCCCCTCTGCTTAATGGTAAGTCTCCAATGGAGTTATTGATACAACGAAGTCTTAACGTCTCCGAGTTGAGAATATTTGGGTGTGCATGCTATCCCTTTTTACGCCCTTACCATACTCATAAGTTTCAATTTCGAACTAACAGGTGTGTTTATCTTGGTCCAAGTCCAGCTCGTAAGGGTCATAAATGTCTCAGCTCATCAGGAAGAGTCTTCATCTCACGACATGTTCAATTCAATGAAGGTGATTATCCATTTGCTTCTGGATTTGGCCTTCAACAATCCACTGTGTCTGACAATTCTTTATCTCATTCCACCGCTGCTCCAAACCTACACACGTGGTTCAGTAGCCTGCCTACTCTTGAACCTGCTACTCACCCACCATCCAACACCCCTCACCCATGCCCACCAAATTTGCCTATCACCCATAACTCTCAGCCTCCTAGATCCCAGGCCCCAACAACCTCTCCAACCTCACCTCCCCAATCACGACCTAATAATCTAGAATTCCCCATAAACAGTCCCCTAAGCCATGAACCTTTACCAATTGACACATCTCCTATTTCTAGCTCATCCCACCAGCTCCCTAGTACGATTTCAGCGGACCAACATGCGTTCTGCACCCCCTCCAATCCACCATCTTTCCCTCTATCTCCACTA
CCCACACCTGAAGCATCTGACCTAAATATACCAACCTCGTCCCCTTTACCTTCTCTCTCTATTCCTGCACCGGAAACTACCTCAGTGGTTGAAAGCTTTATCGCACCTCAGCCTCCCCCACCATTTCAGTCTATCCACCCTATGATTACAAGAGGGAAAGTTGGAATATTCAAGCCCAAGGTGCTCCTATCCTACACTCCCACTGATTGGTCGGTAACAGAACCCACAACTGTTAAGGTTGCTCTTGCTACTCCCATCTGGAAATCAGCGATGGATTTGGAGTATAATGCTCTTATGCAAAATCAGACTTGGACCCTCGTCCCTCCTACTGGTTCAGTCGATGTAGTTGGGTGTAAATGGGTTTTCCGCATCAAACGCAATTTTGATGGCTCGATTCAACGCAACAAGGCACGGTTGGTGGCCAAAGGCTTTCATCAAAGTCCCGGTATTGACTTTTTTGAAACCTTCAGTCCTGTGGTCAAAGCCTCCACTATCCGAGTTGTTCTGTCATTAGCTGTATCTCGAGGCTGGAAACTACGACAACTTGACTTCAACAATGCCTTTCTCAACGACAAGTTAGATGAAGATGTCTATATGTCTCAACCTCCAGGATATGCCGATCCCAGATATCCGAATTATATCTGCAAACTACATAAAGCACTTTATGGCCTCAAACAAGCTCCTCGAGCTTGGAATGTCACTCTCAAATCTGCCTTGCTCTCTTGGGGCTTCACCAACAGTAGATCAGACACATCCTTGTTCATATATCACTGCGGTCCATCCATCATCCTCCTTCTGGTCTATGTCGATGATGTCATTGTCACTGGAAATAATGTTGCTCTCATAGACAGTCTTGTTGCCACACTAGATAAAACGTTTGCGTTAAAAGATCTTGGCTTGCTCAGTTATTTTCTTGGCCTTCAGGTCACTCATCTCCCTTCCAGAGTTCTTTAAACTCAGGCAAAATACATAGATGACGTGTTGCGTCGCCTGGATATGGAGGGCTTAAAGCCAGCCCCCTCCCCCACTGTATTGGGCAAACACTTGTCAATTTCTGATGGAGAGCCCATGAGTGATCCCTTTCTATACAGAAGCACTCTTGGTGCCCTTCAATATCTTACCAACACTCGGCCAGACATCGCGTATATTGTTAATCACCTGAGTCAATTTCTCAAACAGCCCACCGACATACATTGGCAAGCTGTGAAGCGGGTGTTACGCTACTTAAGTGGTACAAAACACATGGGCCTCCACATCCAACCAAGTGACACGGTCTCTCTCACAGCTTATTCTGATGCAGACTGAGCATCAAACATTGATGATCGCAAATCAATTGCTGCTTATTGTGTTTTCTTTAGAAACACTCTTGTCTAACAAACGGCTGTTGCTCGCTCCAGTACTGAGTCCGAATATCGTGCTCTTGCTCATGCTTCCGCTGAAATTATTTGGCTGCGACAACTCCTTGGTGAACTTGGTGTCAACGTTAGTTCTCCGCCCATTATTTGGTGTGACAATATCAGTGCTGGTGCTCTAGCAACTAATCCAGTCTTCCACGCTCGGACCAAGCACATTGAAATATATGTCCACTTTGTTCGGGATCACGTGTTACGCGGTGCTCTTGAAGTTCGTTATGTACCATCTGCTGATCAACTAGCCGATTGCCTGACTAAACCACTCACTCACTCTCAGTTCCACCTACTACGATCCAAACTCGGAGTGCTTGACCTACCCGCTCGTTTGCGGGGGGATGTTAACGTAACTTCAGCAAGGAAGTTGAAGCCACACACCACGTCATAGTCAAAACAAGAATATCTATTATTGTGCATTTTTGTTACAATATTTTTGTTAAAGTTTAGATTCTTTTTCTTCTAAACTTAGGATATACTTTTGTATAAATAGAGCCTTTCAGTGCCATCATAATACAGTGAAATATAAACATCATACCTTCTATGTGAAACTTCCTCTTGTGAAATTCTAACTGGCCCGACCTGAAAAATATA
GCCACAAAAACAATAAGTTATTTTAACACTTTGAAAATCCTATTATAC

mRNA sequence

ATGGGGACCTGCACCGTTATGTCATCTGATGGGTATAGATACTATGTTCATTTTCTTGATGACTATAGCCGATTTGTATGGATTTATCCTTTGAAGTTAAAGAGTGACACACTTTCAGCATTTAATCATTTTACTACTATGATCAAGACTCAATTTGGCAGCCATATTAAAATGTTACAGTCTGACAATGGAGGAGAATATAAACGAGTCCATCAGTTATGCCATCAGTTGGGGATACAATCCAGATTTTCGTGCCCGTACACTTCTGCGCAAAATGGTCGAGCTGAGCGTAAACATCGCCATATTGTTGAGACCGGTCTCACTTTGCTCGCTCAAGCTTCCATGCCTCTAAGTTTTTGGTGGGAGGCCTTCCTGACATCAACCTTATTAATCAATGGTCTTCCTTCCCCTCTGCTTAATGGTAAGTCTCCAATGGAGTTATTGATACAACGAAGTCTTAACGTCTCCGAGTTGAGAATATTTGGGTGTGCATGCTATCCCTTTTTACGCCCTTACCATACTCATAAGTTTCAATTTCGAACTAACAGGTGTGTTTATCTTGGTCCAAGTCCAGCTCGTAAGGGTCATAAATGTCTCAGCTCATCAGGAAGAGTCTTCATCTCACGACATGTTCAATTCAATGAAGGTGATTATCCATTTGCTTCTGGATTTGGCCTTCAACAATCCACTGTGTCTGACAATTCTTTATCTCATTCCACCGCTGCTCCAAACCTACACACGTGGTTCAGTAGCCTGCCTACTCTTGAACCTGCTACTCACCCACCATCCAACACCCCTCACCCATGCCCACCAAATTTGCCTATCACCCATAACTCTCAGCCTCCTAGATCCCAGGCCCCAACAACCTCTCCAACCTCACCTCCCCAATCACGACCTAATAATCTAGAATTCCCCATAAACAGTCCCCTAAGCCATGAACCTTTACCAATTGACACATCTCCTATTTCTAGCTCATCCCACCAGCTCCCTAGTACGATTTCAGCGGACCAACATGCGTTCTGCACCCCCTCCAATCCACCATCTTTCCCTCTATCTCCACTACCCACACCTGAAGCATCTGACCTAAATATACCAACCTCGTCCCCTTTACCTTCTCTCTCTATTCCTGCACCGGAAACTACCTCAGTGGTTGAAAGCTTTATCGCACCTCAGCCTCCCCCACCATTTCAGTCTATCCACCCTATGATTACAAGAGGGAAAGTTGGAATATTCAAGCCCAAGGTGCTCCTATCCTACACTCCCACTGATTGGTCGGTAACAGAACCCACAACTGTTAAGGTTGCTCTTGCTACTCCCATCTGGAAATCAGCGATGGATTTGGAGTATAATGCTCTTATGCAAAATCAGACTTGGACCCTCGTCCCTCCTACTGGTTCAGTCGATGTAGTTGGGTGTAAATGGGTTTTCCGCATCAAACGCAATTTTGATGGCTCGATTCAACGCAACAAGGCACGGTTGGTGGCCAAAGGCTTTCATCAAAGTCCCGGTATTGACTTTTTTGAAACCTTCAGTCCTGTGGTCAAAGCCTCCACTATCCGAGTTGTTCTGTCATTAGCTGTATCTCGAGGCTGGAAACTACGACAACTTGACTTCAACAATGCCTTTCTCAACGACAAGTTAGATGAAGATGTCTATATGTCTCAACCTCCAGGATATGCCGATCCCAGATATCCGAATTATATCTGCAAACTACATAAAGCACTTTATGGCCTCAAACAAGCTCCTCGAGCTTGGAATGTCACTCTCAAATCTGCCTTGCTCTCTTGGGGCTTCACCAACAGTAGATCAGACACATCCTTGTTCATATATCACTGCGGTCCATCCATCATCCTCCTTCTGGTCTATGTCGATGATGTCATTGTCACTGGAAATAATGTTGCTCTCATAGACAGTCTTGTTGCCACACTAGATAAAACGTTTGCGTTAAAAGATCTTGGCTTGCTCAGTTATTTTCTTGGCCTTCAGGTCACTCATCTCCC
TTCCAGAGTTCTTTAA

Coding sequence (CDS)

ATGGGGACCTGCACCGTTATGTCATCTGATGGGTATAGATACTATGTTCATTTTCTTGATGACTATAGCCGATTTGTATGGATTTATCCTTTGAAGTTAAAGAGTGACACACTTTCAGCATTTAATCATTTTACTACTATGATCAAGACTCAATTTGGCAGCCATATTAAAATGTTACAGTCTGACAATGGAGGAGAATATAAACGAGTCCATCAGTTATGCCATCAGTTGGGGATACAATCCAGATTTTCGTGCCCGTACACTTCTGCGCAAAATGGTCGAGCTGAGCGTAAACATCGCCATATTGTTGAGACCGGTCTCACTTTGCTCGCTCAAGCTTCCATGCCTCTAAGTTTTTGGTGGGAGGCCTTCCTGACATCAACCTTATTAATCAATGGTCTTCCTTCCCCTCTGCTTAATGGTAAGTCTCCAATGGAGTTATTGATACAACGAAGTCTTAACGTCTCCGAGTTGAGAATATTTGGGTGTGCATGCTATCCCTTTTTACGCCCTTACCATACTCATAAGTTTCAATTTCGAACTAACAGGTGTGTTTATCTTGGTCCAAGTCCAGCTCGTAAGGGTCATAAATGTCTCAGCTCATCAGGAAGAGTCTTCATCTCACGACATGTTCAATTCAATGAAGGTGATTATCCATTTGCTTCTGGATTTGGCCTTCAACAATCCACTGTGTCTGACAATTCTTTATCTCATTCCACCGCTGCTCCAAACCTACACACGTGGTTCAGTAGCCTGCCTACTCTTGAACCTGCTACTCACCCACCATCCAACACCCCTCACCCATGCCCACCAAATTTGCCTATCACCCATAACTCTCAGCCTCCTAGATCCCAGGCCCCAACAACCTCTCCAACCTCACCTCCCCAATCACGACCTAATAATCTAGAATTCCCCATAAACAGTCCCCTAAGCCATGAACCTTTACCAATTGACACATCTCCTATTTCTAGCTCATCCCACCAGCTCCCTAGTACGATTTCAGCGGACCAACATGCGTTCTGCACCCCCTCCAATCCACCATCTTTCCCTCTATCTCCACTACCCACACCTGAAGCATCTGACCTAAATATACCAACCTCGTCCCCTTTACCTTCTCTCTCTATTCCTGCACCGGAAACTACCTCAGTGGTTGAAAGCTTTATCGCACCTCAGCCTCCCCCACCATTTCAGTCTATCCACCCTATGATTACAAGAGGGAAAGTTGGAATATTCAAGCCCAAGGTGCTCCTATCCTACACTCCCACTGATTGGTCGGTAACAGAACCCACAACTGTTAAGGTTGCTCTTGCTACTCCCATCTGGAAATCAGCGATGGATTTGGAGTATAATGCTCTTATGCAAAATCAGACTTGGACCCTCGTCCCTCCTACTGGTTCAGTCGATGTAGTTGGGTGTAAATGGGTTTTCCGCATCAAACGCAATTTTGATGGCTCGATTCAACGCAACAAGGCACGGTTGGTGGCCAAAGGCTTTCATCAAAGTCCCGGTATTGACTTTTTTGAAACCTTCAGTCCTGTGGTCAAAGCCTCCACTATCCGAGTTGTTCTGTCATTAGCTGTATCTCGAGGCTGGAAACTACGACAACTTGACTTCAACAATGCCTTTCTCAACGACAAGTTAGATGAAGATGTCTATATGTCTCAACCTCCAGGATATGCCGATCCCAGATATCCGAATTATATCTGCAAACTACATAAAGCACTTTATGGCCTCAAACAAGCTCCTCGAGCTTGGAATGTCACTCTCAAATCTGCCTTGCTCTCTTGGGGCTTCACCAACAGTAGATCAGACACATCCTTGTTCATATATCACTGCGGTCCATCCATCATCCTCCTTCTGGTCTATGTCGATGATGTCATTGTCACTGGAAATAATGTTGCTCTCATAGACAGTCTTGTTGCCACACTAGATAAAACGTTTGCGTTAAAAGATCTTGGCTTGCTCAGTTATTTTCTTGGCCTTCAGGTCACTCATCTCCC
TTCCAGAGTTCTTTAA

Protein sequence

MGTCTVMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQSDNGGEYKRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSFWWEAFLTSTLLINGLPSPLLNGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKFQFRTNRCVYLGPSPARKGHKCLSSSGRVFISRHVQFNEGDYPFASGFGLQQSTVSDNSLSHSTAAPNLHTWFSSLPTLEPATHPPSNTPHPCPPNLPITHNSQPPRSQAPTTSPTSPPQSRPNNLEFPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPSFPLSPLPTPEASDLNIPTSSPLPSLSIPAPETTSVVESFIAPQPPPPFQSIHPMITRGKVGIFKPKVLLSYTPTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLVPPTGSVDVVGCKWVFRIKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVVLSLAVSRGWKLRQLDFNNAFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKALYGLKQAPRAWNVTLKSALLSWGFTNSRSDTSLFIYHCGPSIILLLVYVDDVIVTGNNVALIDSLVATLDKTFALKDLGLLSYFLGLQVTHLPSRVL
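
The protein sequence above corresponds to the CDS read three bases at a time. The sketch below shows that relationship, assuming the standard nuclear genetic code applies to this gene; `translate` is a hypothetical helper written for illustration. It is checked here only against the first and last codons of the CDS listed above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds: str) -> str:
    """Translate a coding sequence in frame 0, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon: the page's protein has no stop symbol
            break
        protein.append(aa)
    return "".join(protein)

# First seven codons of the CDS, and its final codons including the stop.
print(translate("ATGGGGACCTGCACCGTTATG"))  # MGTCTVM
print(translate("CCTTCCAGAGTTCTTTAA"))     # PSRVL
```

Both fragments match the start (`MGTCTVM`) and end (`PSRVL`) of the protein sequence on this page.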
Homology
BLAST of MC07g_new0372 vs. ExPASy Swiss-Prot
Match: Q94HW2 (Retrovirus-related Pol polyprotein from transposon RE1 OS=Arabidopsis thaliana OX=3702 GN=RE1 PE=2 SV=1)

HSP 1 Score: 491.1 bits (1263), Expect = 2.0e-137
Identity = 284/680 (41.76%), Positives = 396/680 (58.24%), Query Frame = 0

Query: 6    VMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQSDNGG 65
            ++S D YRYYV F+D ++R+ W+YPLK KS     F  F  +++ +F + I    SDNGG
Sbjct: 536  ILSHDNYRYYVIFVDHFTRYTWLYPLKQKSQVKETFITFKNLLENRFQTRIGTFYSDNGG 595

Query: 66   EYKRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSFWWEAFL 125
            E+  + +   Q GI    S P+T   NG +ERKHRHIVETGLTLL+ AS+P ++W  AF 
Sbjct: 596  EFVALWEYFSQHGISHLTSPPHTPEHNGLSERKHRHIVETGLTLLSHASIPKTYWPYAFA 655

Query: 126  TSTLLINGLPSPLLNGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKFQFRTNRCV 185
             +  LIN LP+PLL  +SP + L   S N  +LR+FGCACYP+LRPY+ HK   ++ +CV
Sbjct: 656  VAVYLINRLPTPLLQLESPFQKLFGTSPNYDKLRVFGCACYPWLRPYNQHKLDDKSRQCV 715

Query: 186  YLGPSPARKGHKCLS-SSGRVFISRHVQFNEGDYPFASGFGLQQSTVSDNSLSHSTAAPN 245
            +LG S  +  + CL   + R++ISRHV+F+E  +PF++              S    +P+
Sbjct: 716  FLGYSLTQSAYLCLHLQTSRLYISRHVRFDENCFPFSNYLATLSPVQEQRRESSCVWSPH 775

Query: 246  LHTWFSSLPTLEPATHPPS-NTPHPCPPNLPITHNSQPPRSQAPTTSPTSPPQSRPNNLE 305
                 ++LPT  P    PS + PH         H + PP S    ++P    Q   +NL+
Sbjct: 776  -----TTLPTRTPVLPAPSCSDPH---------HAATPPSS---PSAPFRNSQVSSSNLD 835

Query: 306  FPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPSFPLSPLPTPEASDLN 365
               +S     P P  T+P  +                 + +   + P +  P+  A  L+
Sbjct: 836  SSFSSSFPSSPEP--TAPRQNGPQPTTQPTQTQTQTHSSQNTSQNNPTNESPSQLAQSLS 895

Query: 366  IP----TSSPLPSLSIPAPETTSVVESFIAPQPPPPFQSI----------HPMITRGKVG 425
             P    +SSP P+ S  +  T+    S +   PPP  Q +          H M TR K G
Sbjct: 896  TPAQSSSSSPSPTTSASSSSTSPTPPSILIHPPPPLAQIVNNNNQAPLNTHSMGTRAKAG 955

Query: 426  IFKPKVLLSYTPTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLVPPTGS-VD 485
            I KP    S   +  + +EP T   AL    W++AM  E NA + N TW LVPP  S V 
Sbjct: 956  IIKPNPKYSLAVSLAAESEPRTAIQALKDERWRNAMGSEINAQIGNHTWDLVPPPPSHVT 1015

Query: 486  VVGCKWVFRIKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVVLSLAVS 545
            +VGC+W+F  K N DGS+ R KARLVAKG++Q PG+D+ ETFSPV+K+++IR+VL +AV 
Sbjct: 1016 IVGCRWIFTKKYNSDGSLNRYKARLVAKGYNQRPGLDYAETFSPVIKSTSIRIVLGVAVD 1075

Query: 546  RGWKLRQLDFNNAFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKALYGLKQAPRAWNVT 605
            R W +RQLD NNAFL   L +DVYMSQPPG+ D   PNY+CKL KALYGLKQAPRAW V 
Sbjct: 1076 RSWPIRQLDVNNAFLQGTLTDDVYMSQPPGFIDKDRPNYVCKLRKALYGLKQAPRAWYVE 1135

Query: 606  LKSALLSWGFTNSRSDTSLFIYHCGPSIILLLVYVDDVIVTGNNVALIDSLVATLDKTFA 665
            L++ LL+ GF NS SDTSLF+   G SI+ +LVYVDD+++TGN+  L+ + +  L + F+
Sbjct: 1136 LRNYLLTIGFVNSVSDTSLFVLQRGKSIVYMLVYVDDILITGNDPTLLHNTLDNLSQRFS 1195

Query: 666  LKDLGLLSYFLGLQVTHLPS 669
            +KD   L YFLG++   +P+
Sbjct: 1196 VKDHEELHYFLGIEAKRVPT 1196
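
The percentages in an HSP summary line follow directly from the two fractions it reports (matches over alignment length). The sketch below recomputes them, assuming the summary line keeps the layout used on this page; `hsp_percentages` is a hypothetical helper, not part of BLAST or of this database.

```python
import re

def hsp_percentages(summary: str) -> list[float]:
    """Recompute a percentage for every 'count/length' fraction in an HSP summary line."""
    fractions = re.findall(r"(\d+)/(\d+)", summary)
    return [round(100 * int(count) / int(length), 2) for count, length in fractions]

# Summary line of the Q94HW2 hit; only the fractions are parsed, not the labels.
line = "Identity = 284/680 (41.76%), Positives = 396/680 (58.24%), Query Frame = 0"
print(hsp_percentages(line))  # [41.76, 58.24]
```

The same call can be applied to the summary line of any other hit on the page to cross-check the printed percentages.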

BLAST of MC07g_new0372 vs. ExPASy Swiss-Prot
Match: Q9ZT94 (Retrovirus-related Pol polyprotein from transposon RE2 OS=Arabidopsis thaliana OX=3702 GN=RE2 PE=4 SV=1)

HSP 1 Score: 490.7 bits (1262), Expect = 2.6e-137
Identity = 297/685 (43.36%), Positives = 417/685 (60.88%), Query Frame = 0

Query: 6    VMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQSDNGG 65
            ++S D YRYYV F+D ++R+ W+YPLK KS     F  F ++++ +F + I  L SDNGG
Sbjct: 515  ILSIDNYRYYVIFVDHFTRYTWLYPLKQKSQVKDTFIIFKSLVENRFQTRIGTLYSDNGG 574

Query: 66   EYKRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSFWWEAFL 125
            E+  +     Q GI    S P+T   NG +ERKHRHIVE GLTLL+ AS+P ++W  AF 
Sbjct: 575  EFVVLRDYLSQHGISHFTSPPHTPEHNGLSERKHRHIVEMGLTLLSHASVPKTYWPYAFS 634

Query: 126  TSTLLINGLPSPLLNGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKFQFRTNRCV 185
             +  LIN LP+PLL  +SP + L  +  N  +L++FGCACYP+LRPY+ HK + ++ +C 
Sbjct: 635  VAVYLINRLPTPLLQLQSPFQKLFGQPPNYEKLKVFGCACYPWLRPYNRHKLEDKSKQCA 694

Query: 186  YLGPSPARKGHKCLS-SSGRVFISRHVQFNEGDYPFA-SGFGLQQSTVSDNSLSHSTAAP 245
            ++G S  +  + CL   +GR++ SRHVQF+E  +PF+ + FG     VS +    S +AP
Sbjct: 695  FMGYSLTQSAYLCLHIPTGRLYTSRHVQFDERCFPFSTTNFG-----VSTSQEQRSDSAP 754

Query: 246  N--LHTWFSSLPTLEPATHPPSNTPH----PCPPNLP-------ITHNSQPPRS-QAPTT 305
            N   HT   + P + PA  PP   PH    P PP+ P       ++ ++ P  S  +P++
Sbjct: 755  NWPSHTTLPTTPLVLPA--PPCLGPHLDTSPRPPSSPSPLCTTQVSSSNLPSSSISSPSS 814

Query: 306  S-PTSPPQSRPNNLEFPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPS 365
            S PT+P  + P     P  +  S+   PI  +P  +S    PS  S +Q++   P +P S
Sbjct: 815  SEPTAPSHNGPQPTAQPHQTQNSNSNSPILNNPNPNS----PSPNSPNQNS-PLPQSPIS 874

Query: 366  FPLSPLPTPEASDLNIPTSS-----PLPSLSIPAPETTSVVESFIAPQPPPPFQSIHPMI 425
             P  P P+   S+ N P+SS     PLP + +PAP    V       Q P    + H M 
Sbjct: 875  SPHIPTPSTSISEPNSPSSSSTSTPPLPPV-LPAPPIIQV-----NAQAP---VNTHSMA 934

Query: 426  TRGKVGIFKPKVLLSYTPTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLV-P 485
            TR K GI KP    SY  +  + +EP T   A+    W+ AM  E NA + N TW LV P
Sbjct: 935  TRAKDGIRKPNQKYSYATSLAANSEPRTAIQAMKDDRWRQAMGSEINAQIGNHTWDLVPP 994

Query: 486  PTGSVDVVGCKWVFRIKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVV 545
            P  SV +VGC+W+F  K N DGS+ R KARLVAKG++Q PG+D+ ETFSPV+K+++IR+V
Sbjct: 995  PPPSVTIVGCRWIFTKKFNSDGSLNRYKARLVAKGYNQRPGLDYAETFSPVIKSTSIRIV 1054

Query: 546  LSLAVSRGWKLRQLDFNNAFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKALYGLKQAP 605
            L +AV R W +RQLD NNAFL   L ++VYMSQPPG+ D   P+Y+C+L KA+YGLKQAP
Sbjct: 1055 LGVAVDRSWPIRQLDVNNAFLQGTLTDEVYMSQPPGFVDKDRPDYVCRLRKAIYGLKQAP 1114

Query: 606  RAWNVTLKSALLSWGFTNSRSDTSLFIYHCGPSIILLLVYVDDVIVTGNNVALIDSLVAT 665
            RAW V L++ LL+ GF NS SDTSLF+   G SII +LVYVDD+++TGN+  L+   +  
Sbjct: 1115 RAWYVELRTYLLTVGFVNSISDTSLFVLQRGRSIIYMLVYVDDILITGNDTVLLKHTLDA 1174

Query: 666  LDKTFALKDLGLLSYFLGLQVTHLP 668
            L + F++K+   L YFLG++   +P
Sbjct: 1175 LSQRFSVKEHEDLHYFLGIEAKRVP 1178

BLAST of MC07g_new0372 vs. ExPASy Swiss-Prot
Match: P10978 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum OX=4097 PE=2 SV=1)

HSP 1 Score: 251.1 bits (640), Expect = 3.4e-65
Identity = 197/669 (29.45%), Positives = 291/669 (43.50%), Query Frame = 0

Query: 2    GTCTVMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQS 61
            G   + S  G +Y+V F+DD SR +W+Y LK K      F  F  +++ + G  +K L+S
Sbjct: 490  GPMEIESMGGNKYFVTFIDDASRKLWVYILKTKDQVFQVFQKFHALVERETGRKLKRLRS 549

Query: 62   DNGGEY--KRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSF 121
            DNGGEY  +   + C   GI+   + P T   NG AER +R IVE   ++L  A +P SF
Sbjct: 550  DNGGEYTSREFEEYCSSHGIRHEKTVPGTPQHNGVAERMNRTIVEKVRSMLRMAKLPKSF 609

Query: 122  WWEAFLTSTLLINGLPSPLLNGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKFQF 181
            W EA  T+  LIN  PS  L  + P  +   + ++ S L++FGC  +  +      K   
Sbjct: 610  WGEAVQTACYLINRSPSVPLAFEIPERVWTNKEVSYSHLKVFGCRAFAHVPKEQRTKLDD 669

Query: 182  RTNRCVYLGPSPARKGHKCLSS-SGRVFISRHVQFNEGDYPFASGFGLQQSTVSDNSLSH 241
            ++  C+++G      G++       +V  SR V F E +   A+      S    N +  
Sbjct: 670  KSIPCIFIGYGDEEFGYRLWDPVKKKVIRSRDVVFRESEVRTAA----DMSEKVKNGI-- 729

Query: 242  STAAPNLHTWFSSLPTLEPATHPPSNTPHPCPPNLPITHNSQPPRSQAPTTSPTSPPQSR 301
                PN  T                         +P T N+  P S   TT   S    +
Sbjct: 730  ---IPNFVT-------------------------IPSTSNN--PTSAESTTDEVSEQGEQ 789

Query: 302  PNNLEFPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPSFPLSPLPTPE 361
            P                      +     QL   +   +H   T       PL     P 
Sbjct: 790  PGE--------------------VIEQGEQLDEGVEEVEHP--TQGEEQHQPLRRSERPR 849

Query: 362  ASDLNIPTSSPLPSLSIPAPETTSVVESFIAPQPPPPFQSIHPMITRGKVGIFKPKVLLS 421
                  P++                                               VL+S
Sbjct: 850  VESRRYPSTE---------------------------------------------YVLIS 909

Query: 422  YTPTDWSVTEPTTVKVALATP---IWKSAMDLEYNALMQNQTWTLVPPTGSVDVVGCKWV 481
                     EP ++K  L+ P       AM  E  +L +N T+ LV        + CKWV
Sbjct: 910  ------DDREPESLKEVLSHPEKNQLMKAMQEEMESLQKNGTYKLVELPKGKRPLKCKWV 969

Query: 482  FRIKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVVLSLAVSRGWKLRQ 541
            F++K++ D  + R KARLV KGF Q  GIDF E FSPVVK ++IR +LSLA S   ++ Q
Sbjct: 970  FKLKKDGDCKLVRYKARLVVKGFEQKKGIDFDEIFSPVVKMTSIRTILSLAASLDLEVEQ 1029

Query: 542  LDFNNAFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKALYGLKQAPRAWNVTLKSALLS 601
            LD   AFL+  L+E++YM QP G+      + +CKL+K+LYGLKQAPR W +   S + S
Sbjct: 1030 LDVKTAFLHGDLEEEIYMEQPEGFEVAGKKHMVCKLNKSLYGLKQAPRQWYMKFDSFMKS 1049

Query: 602  WGFTNSRSDTSLFIYHCGP-SIILLLVYVDDVIVTGNNVALIDSLVATLDKTFALKDLGL 661
              +  + SD  ++       + I+LL+YVDD+++ G +  LI  L   L K+F +KDLG 
Sbjct: 1090 QTYLKTYSDPCVYFKRFSENNFIILLLYVDDMLIVGKDKGLIAKLKGDLSKSFDMKDLGP 1049

Query: 662  LSYFLGLQV 664
                LG+++
Sbjct: 1150 AQQILGMKI 1049
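
Each alignment row prints a start coordinate, a run of residues (with `-` for gaps), and an end coordinate, so the number of non-gap residues should equal `end - start + 1`. The sketch below tests that for a single row. It assumes the row layout shown on this page, and `row_is_consistent` is a hypothetical helper name.

```python
import re

ROW_PATTERN = re.compile(r"(Query|Sbjct):\s*(\d+)\s+(\S+)\s+(\d+)")

def row_is_consistent(row: str) -> bool:
    """Return True when the non-gap residue count matches the printed coordinate span."""
    match = ROW_PATTERN.fullmatch(row.strip())
    if match is None:
        raise ValueError(f"not an alignment row: {row!r}")
    start, residues, end = int(match.group(2)), match.group(3), int(match.group(4))
    non_gap = sum(1 for symbol in residues if symbol != "-")
    return non_gap == end - start + 1

query_row = "Query: 6    VMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQSDNGG 65"
sbjct_row = "Sbjct: 1030 LDVKTAFLHGDLEEEIYMEQPEGFEVAGKKHMVCKLNKSLYGLKQAPRQWYMKFDSFMKS 1049"
print(row_is_consistent(query_row))  # True
print(row_is_consistent(sbjct_row))  # False
```

The second row, taken from the P10978 alignment above, prints 60 residues against a 20-position coordinate span, so its end coordinate should be treated with caution.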

BLAST of MC07g_new0372 vs. ExPASy Swiss-Prot
Match: P04146 (Copia protein OS=Drosophila melanogaster OX=7227 GN=GIP PE=1 SV=3)

HSP 1 Score: 206.8 bits (525), Expect = 7.4e-52
Identity = 189/690 (27.39%), Positives = 305/690 (44.20%), Query Frame = 0

Query: 2    GTCTVMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQS 61
            G  T ++ D   Y+V F+D ++ +   Y +K KSD  S F  F    +  F   +  L  
Sbjct: 490  GPITPVTLDDKNYFVIFVDQFTHYCVTYLIKYKSDVFSMFQDFVAKSEAHFNLKVVYLYI 549

Query: 62   DNGGEY--KRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSF 121
            DNG EY    + Q C + GI    + P+T   NG +ER  R I E   T+++ A +  SF
Sbjct: 550  DNGREYLSNEMRQFCVKKGISYHLTVPHTPQLNGVSERMIRTITEKARTMVSGAKLDKSF 609

Query: 122  WWEAFLTSTLLINGLPSPLL--NGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKF 181
            W EA LT+T LIN +PS  L  + K+P E+   +   +  LR+FG   Y  ++     KF
Sbjct: 610  WGEAVLTATYLINRIPSRALVDSSKTPYEMWHNKKPYLKHLRVFGATVYVHIK-NKQGKF 669

Query: 182  QFRTNRCVYLGPSPARKGHKCLSSSGRVFI-SRHVQFNEGDYPFASGFGLQQSTVSDNSL 241
              ++ + +++G  P   G K   +    FI +R V  +E +   +     +   + D+  
Sbjct: 670  DDKSFKSIFVGYEP--NGFKLWDAVNEKFIVARDVVVDETNMVNSRAVKFETVFLKDSKE 729

Query: 242  SHSTAAPNLHTWFSSLPTLEPATHPPSNTPHPCP-----PNLPITHNSQPPRSQAPTTSP 301
            S +   PN      S   ++  T  P N    C       +   + N   P         
Sbjct: 730  SENKNFPN-----DSRKIIQ--TEFP-NESKECDNIQFLKDSKESENKNFPNDSRKIIQT 789

Query: 302  TSPPQSRP-NNLEFPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPSFP 361
              P +S+  +N++F  +S  S++    ++       H   S  S                
Sbjct: 790  EFPNESKECDNIQFLKDSKESNKYFLNESKKRKRDDHLNESKGSG--------------- 849

Query: 362  LSPLPTPEASDLNIPTSSPLPSLSIPAPETTSVVESFIAPQPPPPFQSIHPMITRGKVGI 421
             +P  + E+      T+  L  + I  P     +E                 I   +   
Sbjct: 850  -NPNESRESE-----TAEHLKEIGIDNPTKNDGIE-----------------IINRRSER 909

Query: 422  FKPKVLLSYTPTDWSVTEPTTVKVALATPI---------------WKSAMDLEYNALMQN 481
             K K  +SY   D S+ +       +   +               W+ A++ E NA   N
Sbjct: 910  LKTKPQISYNEEDNSLNKVVLNAHTIFNDVPNSFDEIQYRDDKSSWEEAINTELNAHKIN 969

Query: 482  QTWTLVPPTGSVDVVGCKWVFRIKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVK 541
             TWT+     + ++V  +WVF +K N  G+  R KARLVA+GF Q   ID+ ETF+PV +
Sbjct: 970  NTWTITKRPENKNIVDSRWVFSVKYNELGNPIRYKARLVARGFTQKYQIDYEETFAPVAR 1029

Query: 542  ASTIRVVLSLAVSRGWKLRQLDFNNAFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKAL 601
             S+ R +LSL +    K+ Q+D   AFLN  L E++YM  P G +     + +CKL+KA+
Sbjct: 1030 ISSFRFILSLVIQYNLKVHQMDVKTAFLNGTLKEEIYMRLPQGIS--CNSDNVCKLNKAI 1089

Query: 602  YGLKQAPRAWNVTLKSALLSWGFTNSRSDTSLFIYHCG--PSIILLLVYVDDVIVTGNNV 661
            YGLKQA R W    + AL    F NS  D  ++I   G     I +L+YVDDV++   ++
Sbjct: 1090 YGLKQAARCWFEVFEQALKECEFVNSSVDRCIYILDKGNINENIYVLLYVDDVVIATGDM 1128

Query: 662  ALIDSLVATLDKTFALKDLGLLSYFLGLQV 664
              +++    L + F + DL  + +F+G+++
Sbjct: 1150 TRMNNFKRYLMEKFRMTDLNEIKHFIGIRI 1128

BLAST of MC07g_new0372 vs. ExPASy Swiss-Prot
Match: P92520 (Uncharacterized mitochondrial protein AtMg00820 OS=Arabidopsis thaliana OX=3702 GN=AtMg00820 PE=4 SV=1)

HSP 1 Score: 126.7 bits (317), Expect = 9.8e-28
Identity = 64/125 (51.20%), Positives = 83/125 (66.40%), Query Frame = 0

Query: 402 MITRGKVGIFKPKVLLSYTPTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLV 461
           M+TR K GI K     S T T     EP +V  AL  P W  AM  E +AL +N+TW LV
Sbjct: 1   MLTRSKAGINKLNPKYSLTITTTIKKEPKSVIFALKDPGWCQAMQEELDALSRNKTWILV 60

Query: 462 PPTGSVDVVGCKWVFRIKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRV 521
           PP  + +++GCKWVF+ K + DG++ R KARLVAKGFHQ  GI F ET+SPVV+ +TIR 
Sbjct: 61  PPPVNQNILGCKWVFKTKLHSDGTLDRLKARLVAKGFHQEEGIYFVETYSPVVRTATIRT 120

Query: 522 VLSLA 527
           +L++A
Sbjct: 121 ILNVA 125

BLAST of MC07g_new0372 vs. NCBI nr
Match: PNY02796.1 (copia protein (gag-int-pol protein), partial [Trifolium pratense])

HSP 1 Score: 550 bits (1417), Expect = 1.48e-180
Identity = 315/677 (46.53%), Positives = 403/677 (59.53%), Query Frame = 0

Query: 2   GTCTVMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQS 61
           G   +MS+ G++YYVHF+DD+SRF WIYPLK KS+T+ AF  F T+++ QF   IK++Q 
Sbjct: 232 GPAPIMSNSGFKYYVHFIDDFSRFTWIYPLKQKSETIHAFTQFKTLVENQFNKRIKIVQC 291

Query: 62  DNGGEYKRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSFWW 121
           D GGEYK V +L  + GIQ R SCPYTS QNGRAERKHRH+ E GLT+LAQA MPL +WW
Sbjct: 292 DGGGEYKAVQKLALEAGIQFRMSCPYTSQQNGRAERKHRHVAELGLTMLAQARMPLCYWW 351

Query: 122 EAFLTSTLLINGLPSPLLNGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKFQFRT 181
           EAF TS  LIN LPS +     P  L+ ++  + S L+ FGCACYP L+PY+ HK QF T
Sbjct: 352 EAFSTSVYLINRLPSSINQNACPYTLIYKKEPDYSVLKPFGCACYPCLKPYNKHKLQFHT 411

Query: 182 NRCVYLGPSPARKGHKCLSSSGRVFISRHVQFNEGDYPFASGFGLQQSTVSDNSLSHSTA 241
            RCV+LG S + KG+KC++S GR+F+SRHV FNE  +PF  GF                 
Sbjct: 412 TRCVFLGYSNSHKGYKCINSHGRIFVSRHVVFNEEHFPFHDGF----------------- 471

Query: 242 APNLHTWFSSLPTLEPATHPPSNTPHPCPPNLPITHNSQPPRSQAPTTSPTSPPQSRPNN 301
              L T  + L TL P              N PI      P + A  T          NN
Sbjct: 472 ---LDT-RNPLRTLTP--------------NDPILF----PLAPADGT----------NN 531

Query: 302 LEFPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPSFPLSPLPTPEASD 361
           ++ P N   +HE    D++ I SS          DQH                   E+SD
Sbjct: 532 IDDPENESFTHEEE--DSNSIHSSE---------DQH-------------------ESSD 591

Query: 362 LNIPTS-SPLPSL--------SIPAPETTSVVESFIAPQPPPPFQSIHPMITRGKVGIFK 421
             I TS S L S         S    ET+   E  I         + H M TR K GI K
Sbjct: 592 RLINTSESSLQSAIREEENNDSTETMETSRQNELEIGANSQENNTNTHLMRTRSKDGIHK 651

Query: 422 PKV-LLSYTPTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLVPPTGSVDVVG 481
           PK   +          EP  ++ AL+ P WK AMD+E+NALM N TWTLVP  G  +++ 
Sbjct: 652 PKQPYIGLVEAHTKDKEPENIREALSRPKWKEAMDIEFNALMSNHTWTLVPYQGQENIID 711

Query: 482 CKWVFRIKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVVLSLAVSRGW 541
            KWVF+ K   DGSI+R KARLVAKGF Q+ G+D+ ETFSPVVK+ST+R++LS+AV   W
Sbjct: 712 SKWVFKTKYKADGSIERRKARLVAKGFQQTAGLDYEETFSPVVKSSTVRIILSIAVHFNW 771

Query: 542 KLRQLDFNNAFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKALYGLKQAPRAWNVTLKS 601
           ++RQLD NNAFLN  L E V+M QP GY D   PN+ICKL KA+YGLKQAPRAW  +LK+
Sbjct: 772 EVRQLDINNAFLNGYLKETVFMHQPEGYLDSTKPNHICKLSKAIYGLKQAPRAWFDSLKN 829

Query: 602 ALLSWGFTNSRSDTSLFIYHCGPSIILLLVYVDDVIVTGNNVALIDSLVATLDKTFALKD 661
           AL++WGF N++SD+SLFI+       +LL+YVDD+IVTG+N   +++ +  L+  F+LKD
Sbjct: 832 ALVNWGFQNTKSDSSLFIHRDTNHFTILLIYVDDIIVTGSNTKFLETFIKQLNTVFSLKD 829

Query: 662 LGLLSYFLGLQVTHLPS 668
           LG L YFLG++V    S
Sbjct: 892 LGHLHYFLGIEVQRNAS 829

BLAST of MC07g_new0372 vs. NCBI nr
Match: GAU17915.1 (hypothetical protein TSUD_330400, partial [Trifolium subterraneum])

HSP 1 Score: 532 bits (1371), Expect = 1.49e-173
Identity = 297/663 (44.80%), Positives = 385/663 (58.07%), Query Frame = 0

Query: 2   GTCTVMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQS 61
           G   ++SS G++YYVHF+DD++RF WIYPLK KSDT  AF  F  M++ QF   IK +Q 
Sbjct: 407 GPAPIISSSGFKYYVHFIDDFTRFTWIYPLKQKSDTAHAFIQFKNMVENQFNKRIKTIQC 466

Query: 62  DNGGEYKRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSFWW 121
           D GGEYK V +   + GIQ R SCPYTS QNGRAERKHRHI E GLTLLAQA MPL++WW
Sbjct: 467 DGGGEYKAVQKHAIEAGIQFRMSCPYTSQQNGRAERKHRHIAEFGLTLLAQAKMPLNYWW 526

Query: 122 EAFLTSTLLINGLPSPLLNGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKFQFRT 181
           EAF T+  LIN LPSP+ + +SP  LL ++  + + L+ FGCACYP L+PY+ HK QF T
Sbjct: 527 EAFSTAVYLINRLPSPVTHNESPYSLLHKKEPDYNSLKPFGCACYPCLKPYNKHKLQFHT 586

Query: 182 NRCVYLGPSPARKGHKCLSSSGRVFISRHVQFNEGDYPFASGFGLQQSTVSDNSLSHSTA 241
            +CV+LG S + KG+KC++S GRVFISRHV FNE  +PF  GF                 
Sbjct: 587 TKCVFLGYSNSHKGYKCVNSHGRVFISRHVVFNEDHFPFHDGF----------------- 646

Query: 242 APNLHTWFSSLPTLEPATHPPSNTPHPCPPNLPITHNSQPPRSQAPTTSPTSPPQSRPNN 301
                     L T  P                                            
Sbjct: 647 ----------LNTRVP-------------------------------------------- 706

Query: 302 LEFPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPSFPLSPLPTPEASD 361
           L+    SP SH PL +   P SSS+      I+ +Q +            + L   + +D
Sbjct: 707 LKTLTGSPSSHFPLHV-AEPTSSSTESSEDNINTEQAS------------NELTQDDDAD 766

Query: 362 LNIPTSSPLPSLSIPAPETTSVVESFIAPQPPPPFQSIHPMITRGKVGIFKPKV-LLSYT 421
           +  P +  +P + + A   T                  H M TR K GI KPK+  +   
Sbjct: 767 VAAPDTRTVP-IEVEASNNT------------------HWMRTRSKDGIRKPKLPYIGLA 826

Query: 422 PTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLVPPTGSVDVVGCKWVFRIKR 481
                  EP   + AL  P WK AM  E+ ALM NQTWTL+P      ++  +WVF+IK 
Sbjct: 827 ENHIEEKEPGNAQEALRRPEWKEAMHKEFQALMTNQTWTLIPYQDQESIIDSEWVFKIKY 886

Query: 482 NFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVVLSLAVSRGWKLRQLDFNN 541
             DG+I+R KARLVAKGF Q+ G+ + ETFSPVVKASTIR++LS+AV   W+++QLD NN
Sbjct: 887 KADGTIERRKARLVAKGFQQTAGLGYEETFSPVVKASTIRIILSIAVHLNWEVKQLDINN 946

Query: 542 AFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKALYGLKQAPRAWNVTLKSALLSWGFTN 601
           AFLN  L E V+M+QP G+ DP  PN++CKL KA+YGL+QAPRAW  +LK+ALLSWGF N
Sbjct: 947 AFLNGNLKETVFMNQPEGFIDPTKPNHVCKLSKAIYGLRQAPRAWFDSLKNALLSWGFQN 966

Query: 602 SRSDTSLFIYHCGPSIILLLVYVDDVIVTGNNVALIDSLVATLDKTFALKDLGLLSYFLG 661
           ++SD+SLF       I  LL+YVDD+IVTGNN   +++ +  L+  F+LKDLG L YFLG
Sbjct: 1007 TKSDSSLFTLRGRDHITFLLIYVDDIIVTGNNTKFLETFIKQLNIVFSLKDLGNLHYFLG 966

Query: 662 LQV 663
           ++V
Sbjct: 1067 IEV 966

BLAST of MC07g_new0372 vs. NCBI nr
Match: GAU51268.1 (hypothetical protein TSUD_412550 [Trifolium subterraneum])

HSP 1 Score: 540 bits (1391), Expect = 1.04e-172
Identity = 297/666 (44.59%), Positives = 401/666 (60.21%), Query Frame = 0

Query: 2    GTCTVMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQS 61
            G   ++S  G++YYVHF+DD+SRF WI+PLK KSDT+ AF  F  + + QF   IK++Q 
Sbjct: 501  GPAPILSPSGFKYYVHFIDDFSRFTWIFPLKQKSDTIHAFIQFKNLAENQFNKKIKIIQC 560

Query: 62   DNGGEYKRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSFWW 121
            D GGEYK V ++  + GIQ R SCPYTS QNGRAERKHRH+ E GLTLLAQA MPL +WW
Sbjct: 561  DGGGEYKAVQKVSIEAGIQFRMSCPYTSQQNGRAERKHRHVAELGLTLLAQAKMPLRYWW 620

Query: 122  EAFLTSTLLINGLPSPLLNGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKFQFRT 181
            EAF T+  LIN LPS +   +SP  L+ +R  + + L+ FGCACYP L+PY+ HK QF T
Sbjct: 621  EAFSTAVYLINRLPSSVNPNESPYSLMFKREPDYNALKPFGCACYPCLKPYNQHKLQFHT 680

Query: 182  NRCVYLGPSPARKGHKCLSSSGRVFISRHVQFNEGDYPFASGFGLQQS---TVSDNSLSH 241
             RCV++G S + KG+KC++S GR+F+SRHV FNE  +PF  GF   ++   T++DNS   
Sbjct: 681  TRCVFVGYSNSHKGYKCINSHGRIFVSRHVIFNENHFPFHGGFLDTKNPLKTLTDNS--- 740

Query: 242  STAAPNLHTWFSSLPTLEPATHPPSNTPHPCPPNLPITHNSQPPRSQAPTTSPTSPPQSR 301
            S   P      ++   +EP  +  S+           TH+ +        +S  +  + +
Sbjct: 741  SILLPTCSAGATTQDAIEPDNNTTSDQ---------NTHSIE--------SSDNNENEEQ 800

Query: 302  PNNLEFPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPSFPLSPLPTPE 361
             ++ EF +N+              ++SS Q    I AD                     +
Sbjct: 801  VDSSEFFVNT--------------NNSSTQ---DIEADNSV------------------D 860

Query: 362  ASDLNIPTSSPLPSLSIPAPETTSVVESFIAPQPPPPFQSIHPMITRGKVGIFKPKV-LL 421
            + D N                  S +   I  Q      + H M TR K GI KPK+  +
Sbjct: 861  SEDRN-----------------NSTMTGTIQQQAQQDNSNTHWMRTRSKDGIHKPKIPYV 920

Query: 422  SYTPTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLVPPTGSVDVVGCKWVFR 481
                TD    EP +VK AL  P+WK AMD EY AL+ N TWTLVP     +++  KW+F+
Sbjct: 921  GMAETDSEEKEPKSVKEALGRPMWKEAMDKEYKALVSNHTWTLVPYQEQENIIDSKWIFK 980

Query: 482  IKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVVLSLAVSRGWKLRQLD 541
             K   DGSI+R KARLVAKGF Q+ G+DF ETFSPVVK+ST+R++L++AV   W++RQLD
Sbjct: 981  TKYKSDGSIERRKARLVAKGFQQTAGLDFGETFSPVVKSSTVRIILTIAVHFNWEVRQLD 1040

Query: 542  FNNAFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKALYGLKQAPRAWNVTLKSALLSWG 601
             NNAFLN KL E V+M QP GY D   PN+ICKL KA+YGLKQAPRAW  +L+S L++WG
Sbjct: 1041 INNAFLNGKLKETVFMHQPEGYIDAAKPNHICKLSKAIYGLKQAPRAWYDSLRSTLVNWG 1094

Query: 602  FTNSRSDTSLFIYHCGPSIILLLVYVDDVIVTGNNVALIDSLVATLDKTFALKDLGLLSY 661
            F N+++DTSLF          LL+YVDD+IVTG+N+  +++    L+  ++LKDLG L Y
Sbjct: 1101 FQNAKNDTSLFFLKGADHTTFLLIYVDDIIVTGSNIKFLEAFTNQLNTAYSLKDLGPLHY 1094

Query: 662  FLGLQV 663
            FLG++V
Sbjct: 1161 FLGVEV 1094

BLAST of MC07g_new0372 vs. NCBI nr
Match: PNX94503.1 (putative retrotransposon Ty1-copia subclass protein, partial [Trifolium pratense])

HSP 1 Score: 530 bits (1366), Expect = 2.03e-170
Identity = 291/666 (43.69%), Positives = 389/666 (58.41%), Query Frame = 0

Query: 2    GTCTVMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQS 61
            G   ++S   ++YYVHFLDD+SRF WI+PLK KS+T+ AFN F  +++ QF   IK+++ 
Sbjct: 508  GPAPILSQSNFKYYVHFLDDFSRFTWIFPLKQKSETIHAFNQFKNLVENQFNKKIKVIRC 567

Query: 62   DNGGEYKRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSFWW 121
            D GGEYK V +     GIQ + SCPYTS QNGRAERKHRH+ E GLTLLAQA MPLS+WW
Sbjct: 568  DGGGEYKPVQKCAIDSGIQFQMSCPYTSQQNGRAERKHRHVTELGLTLLAQAKMPLSYWW 627

Query: 122  EAFLTSTLLINGLPSPLLNGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKFQFRT 181
            EAF T+  LIN LPS +   +SP  L+ ++  + + L+ FGCACYP L+PY+ HK QF T
Sbjct: 628  EAFSTAVYLINRLPSSVNPNESPYTLVFKKEPDYTALKPFGCACYPCLKPYNQHKLQFHT 687

Query: 182  NRCVYLGPSPARKGHKCLSSSGRVFISRHVQFNEGDYPFASGFGLQQSTVSDNSLSHSTA 241
             RCV+LG S + KG+KC++S GRVF+SRHV FNE  +PF  GF   ++ +          
Sbjct: 688  TRCVFLGYSNSHKGYKCVNSHGRVFVSRHVVFNENHFPFQEGFLDTRNPIK-------VV 747

Query: 242  APNLHTWFSSLPTLEPATHPPSNTPHPCPPNLPITHNSQPPRSQAPTTSPTSPPQSRPNN 301
              +    F S P                     IT N                     N 
Sbjct: 748  TNDTPIGFPSFPAG-------------------ITTN---------------------NT 807

Query: 302  LEFPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPSFPLSPLPTPEASD 361
             E   N     EP   D + ++  S +  +    D++ F                 E  D
Sbjct: 808  AEATDNIVDQQEPELNDINTVADQSVESDTFEHTDENNFSNG--------------ETED 867

Query: 362  LNIPTSSPLPSLSIPAPETTSVVESFIAPQPPPPFQSI---HPMITRGKVGIFKPKV-LL 421
                      S      E+   +   I    PPP Q I   H M TR K G++KPK+  +
Sbjct: 868  ----------STEAAGRESMEEISQPITETNPPPQQDITNTHWMRTRSKAGVYKPKLPYI 927

Query: 422  SYTPTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLVPPTGSVDVVGCKWVFR 481
              T       EP +V  AL+ P W +AMD EY ALM N+TWTLVP  G  +V+  KW+F+
Sbjct: 928  GLTEEAKEGKEPESVSEALSIPEWLNAMDAEYKALMNNKTWTLVPFEGQENVISSKWIFK 987

Query: 482  IKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVVLSLAVSRGWKLRQLD 541
             K   DG+I+R KARLVA+GF Q+ G+D+ ETFSPVVK+ST+R++LS+AV   W++RQLD
Sbjct: 988  TKYKADGTIERRKARLVARGFQQTAGVDYDETFSPVVKSSTVRIILSIAVHLSWEVRQLD 1047

Query: 542  FNNAFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKALYGLKQAPRAWNVTLKSALLSWG 601
             NNAFLN  L E V+M QP GY D   P++IC+L+KA+YGLKQAPRAW   L+  LLSWG
Sbjct: 1048 INNAFLNGNLKESVFMHQPEGYIDQTKPHHICRLNKAIYGLKQAPRAWFDRLRHTLLSWG 1102

Query: 602  FTNSRSDTSLFIYHCGPSIILLLVYVDDVIVTGNNVALIDSLVATLDKTFALKDLGLLSY 661
            F N++SD+SLF+         LL+YVDD+I+TG+N   +++ ++ L+  F+LKDLG L Y
Sbjct: 1108 FQNTKSDSSLFVLKETDHTTFLLIYVDDIIITGSNNKFLEAFISQLNLVFSLKDLGNLHY 1102

Query: 662  FLGLQV 663
            FLG++V
Sbjct: 1168 FLGIEV 1102

BLAST of MC07g_new0372 vs. NCBI nr
Match: GAU19483.1 (hypothetical protein TSUD_77270 [Trifolium subterraneum])

HSP 1 Score: 533 bits (1372), Expect = 3.12e-170
Identity = 299/667 (44.83%), Positives = 387/667 (58.02%), Query Frame = 0

Query: 2    GTCTVMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQS 61
            G   +M+S G++YYVHF+DD+SRF WIYPLK KS+T+ AF  F  + + QF   IK++Q 
Sbjct: 501  GPAPIMTSSGFKYYVHFVDDFSRFTWIYPLKQKSETVQAFIQFKNLTENQFNKRIKVIQC 560

Query: 62   DNGGEYKRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSFWW 121
            D GGEYK V +L  + GIQ R SCPYTS QNGRAERKHRHI E GLTLLAQA MPL +WW
Sbjct: 561  DGGGEYKPVQKLAVEAGIQFRMSCPYTSQQNGRAERKHRHITEFGLTLLAQAQMPLHYWW 620

Query: 122  EAFLTSTLLINGLPSPLLNGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKFQFRT 181
            EAF T+  LIN LPS +   +SP  L++Q+  +   L+ FGCACYP L+PY+ HK Q+ T
Sbjct: 621  EAFSTAVYLINRLPSQVTQNESPYSLMLQKEPDYKLLKTFGCACYPCLKPYNQHKLQYHT 680

Query: 182  NRCVYLGPSPARKGHKCLSSSGRVFISRHVQFNEGDYPFASGFGLQQSTVSDNSLSHSTA 241
             RCV+LG S + KG+KCL+S GR+FISRHV FNE  +PF  GF   +S +       ST+
Sbjct: 681  TRCVFLGYSNSHKGYKCLNSHGRIFISRHVIFNEDHFPFHDGFLNTRSPLKTTINVPSTS 740

Query: 242  APNLHTWF----SSLPTLEPATHPPSNTPHPCPPNLPITHNSQPPRSQAPTTSPTSPPQS 301
             P          +S+P LE                           ++ P  + T   Q 
Sbjct: 741  FPLCTAGNVIDDASMPILE---------------------------AENPAETNTEDSQD 800

Query: 302  RPNNLEFPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPSFPLSPLPTP 361
              ++ E   N P               ++H+    I+  Q                    
Sbjct: 801  VNSDTEQTNNGPSEDN-----------TTHEETLDITQQQSVG----------------- 860

Query: 362  EASDLNIPTSSPLPSLSIPAPETTSVVESFIAPQPPPPFQSIHPMITRGKVGIFKPKV-L 421
            EAS  N  TS                                H + TR K GI KPK+  
Sbjct: 861  EASQ-NTNTS--------------------------------HAIHTRSKSGIHKPKLPY 920

Query: 422  LSYTPTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLVPPTGSVDVVGCKWVF 481
            +  T T     EP   K AL+ P+WK AM  E+ ALM N+TW LVP     ++V  KWVF
Sbjct: 921  IGLTETYKDTMEPANAKEALSRPLWKEAMQKEFEALMSNKTWILVPYQNQENIVDSKWVF 980

Query: 482  RIKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVVLSLAVSRGWKLRQL 541
            + K   DGS++R KARLVAKGF Q+ GID+ ETFSPV+KAST+R++LS+AV   W++RQL
Sbjct: 981  KTKYKPDGSLERRKARLVAKGFQQTAGIDYEETFSPVIKASTVRIILSIAVHLNWEVRQL 1040

Query: 542  DFNNAFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKALYGLKQAPRAWNVTLKSALLSW 601
            D NNAFLN  L E V+M QP G+ D   PN+ICKL KA+YGLKQAPRAW  +LK+ALL+W
Sbjct: 1041 DINNAFLNGHLKETVFMHQPEGFVDSTKPNHICKLSKAIYGLKQAPRAWFDSLKTALLNW 1079

Query: 602  GFTNSRSDTSLFIYHCGPSIILLLVYVDDVIVTGNNVALIDSLVATLDKTFALKDLGLLS 661
            GF N++SD+SLF+      I  LL+YVDD+IVTG+N   + + +  L+  F+LKDLG L 
Sbjct: 1101 GFQNTKSDSSLFLLKGKDHITFLLIYVDDIIVTGSNGKFLQAFIKQLNDAFSLKDLGHLH 1079

Query: 662  YFLGLQV 663
            YFLG++V
Sbjct: 1161 YFLGIEV 1079

BLAST of MC07g_new0372 vs. ExPASy TrEMBL
Match: A0A2K3NIC3 (Copia protein (Gag-int-pol protein) (Fragment) OS=Trifolium pratense OX=57577 GN=L195_g026116 PE=4 SV=1)

HSP 1 Score: 550 bits (1417), Expect = 7.18e-181
Identity = 315/677 (46.53%), Positives = 403/677 (59.53%), Query Frame = 0

Query: 2   GTCTVMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQS 61
           G   +MS+ G++YYVHF+DD+SRF WIYPLK KS+T+ AF  F T+++ QF   IK++Q 
Sbjct: 232 GPAPIMSNSGFKYYVHFIDDFSRFTWIYPLKQKSETIHAFTQFKTLVENQFNKRIKIVQC 291

Query: 62  DNGGEYKRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSFWW 121
           D GGEYK V +L  + GIQ R SCPYTS QNGRAERKHRH+ E GLT+LAQA MPL +WW
Sbjct: 292 DGGGEYKAVQKLALEAGIQFRMSCPYTSQQNGRAERKHRHVAELGLTMLAQARMPLCYWW 351

Query: 122 EAFLTSTLLINGLPSPLLNGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKFQFRT 181
           EAF TS  LIN LPS +     P  L+ ++  + S L+ FGCACYP L+PY+ HK QF T
Sbjct: 352 EAFSTSVYLINRLPSSINQNACPYTLIYKKEPDYSVLKPFGCACYPCLKPYNKHKLQFHT 411

Query: 182 NRCVYLGPSPARKGHKCLSSSGRVFISRHVQFNEGDYPFASGFGLQQSTVSDNSLSHSTA 241
            RCV+LG S + KG+KC++S GR+F+SRHV FNE  +PF  GF                 
Sbjct: 412 TRCVFLGYSNSHKGYKCINSHGRIFVSRHVVFNEEHFPFHDGF----------------- 471

Query: 242 APNLHTWFSSLPTLEPATHPPSNTPHPCPPNLPITHNSQPPRSQAPTTSPTSPPQSRPNN 301
              L T  + L TL P              N PI      P + A  T          NN
Sbjct: 472 ---LDT-RNPLRTLTP--------------NDPILF----PLAPADGT----------NN 531

Query: 302 LEFPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPSFPLSPLPTPEASD 361
           ++ P N   +HE    D++ I SS          DQH                   E+SD
Sbjct: 532 IDDPENESFTHEEE--DSNSIHSSE---------DQH-------------------ESSD 591

Query: 362 LNIPTS-SPLPSL--------SIPAPETTSVVESFIAPQPPPPFQSIHPMITRGKVGIFK 421
             I TS S L S         S    ET+   E  I         + H M TR K GI K
Sbjct: 592 RLINTSESSLQSAIREEENNDSTETMETSRQNELEIGANSQENNTNTHLMRTRSKDGIHK 651

Query: 422 PKV-LLSYTPTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLVPPTGSVDVVG 481
           PK   +          EP  ++ AL+ P WK AMD+E+NALM N TWTLVP  G  +++ 
Sbjct: 652 PKQPYIGLVEAHTKDKEPENIREALSRPKWKEAMDIEFNALMSNHTWTLVPYQGQENIID 711

Query: 482 CKWVFRIKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVVLSLAVSRGW 541
            KWVF+ K   DGSI+R KARLVAKGF Q+ G+D+ ETFSPVVK+ST+R++LS+AV   W
Sbjct: 712 SKWVFKTKYKADGSIERRKARLVAKGFQQTAGLDYEETFSPVVKSSTVRIILSIAVHFNW 771

Query: 542 KLRQLDFNNAFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKALYGLKQAPRAWNVTLKS 601
           ++RQLD NNAFLN  L E V+M QP GY D   PN+ICKL KA+YGLKQAPRAW  +LK+
Sbjct: 772 EVRQLDINNAFLNGYLKETVFMHQPEGYLDSTKPNHICKLSKAIYGLKQAPRAWFDSLKN 829

Query: 602 ALLSWGFTNSRSDTSLFIYHCGPSIILLLVYVDDVIVTGNNVALIDSLVATLDKTFALKD 661
           AL++WGF N++SD+SLFI+       +LL+YVDD+IVTG+N   +++ +  L+  F+LKD
Sbjct: 832 ALVNWGFQNTKSDSSLFIHRDTNHFTILLIYVDDIIVTGSNTKFLETFIKQLNTVFSLKD 829

Query: 662 LGLLSYFLGLQVTHLPS 668
           LG L YFLG++V    S
Sbjct: 892 LGHLHYFLGIEVQRNAS 829

BLAST of MC07g_new0372 vs. ExPASy TrEMBL
Match: A0A2Z6P4D5 (Integrase catalytic domain-containing protein OS=Trifolium subterraneum OX=3900 GN=TSUD_412550 PE=4 SV=1)

HSP 1 Score: 540 bits (1391), Expect = 5.02e-173
Identity = 297/666 (44.59%), Positives = 401/666 (60.21%), Query Frame = 0

Query: 2    GTCTVMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQS 61
            G   ++S  G++YYVHF+DD+SRF WI+PLK KSDT+ AF  F  + + QF   IK++Q 
Sbjct: 501  GPAPILSPSGFKYYVHFIDDFSRFTWIFPLKQKSDTIHAFIQFKNLAENQFNKKIKIIQC 560

Query: 62   DNGGEYKRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSFWW 121
            D GGEYK V ++  + GIQ R SCPYTS QNGRAERKHRH+ E GLTLLAQA MPL +WW
Sbjct: 561  DGGGEYKAVQKVSIEAGIQFRMSCPYTSQQNGRAERKHRHVAELGLTLLAQAKMPLRYWW 620

Query: 122  EAFLTSTLLINGLPSPLLNGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKFQFRT 181
            EAF T+  LIN LPS +   +SP  L+ +R  + + L+ FGCACYP L+PY+ HK QF T
Sbjct: 621  EAFSTAVYLINRLPSSVNPNESPYSLMFKREPDYNALKPFGCACYPCLKPYNQHKLQFHT 680

Query: 182  NRCVYLGPSPARKGHKCLSSSGRVFISRHVQFNEGDYPFASGFGLQQS---TVSDNSLSH 241
             RCV++G S + KG+KC++S GR+F+SRHV FNE  +PF  GF   ++   T++DNS   
Sbjct: 681  TRCVFVGYSNSHKGYKCINSHGRIFVSRHVIFNENHFPFHGGFLDTKNPLKTLTDNS--- 740

Query: 242  STAAPNLHTWFSSLPTLEPATHPPSNTPHPCPPNLPITHNSQPPRSQAPTTSPTSPPQSR 301
            S   P      ++   +EP  +  S+           TH+ +        +S  +  + +
Sbjct: 741  SILLPTCSAGATTQDAIEPDNNTTSDQ---------NTHSIE--------SSDNNENEEQ 800

Query: 302  PNNLEFPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPSFPLSPLPTPE 361
             ++ EF +N+              ++SS Q    I AD                     +
Sbjct: 801  VDSSEFFVNT--------------NNSSTQ---DIEADNSV------------------D 860

Query: 362  ASDLNIPTSSPLPSLSIPAPETTSVVESFIAPQPPPPFQSIHPMITRGKVGIFKPKV-LL 421
            + D N                  S +   I  Q      + H M TR K GI KPK+  +
Sbjct: 861  SEDRN-----------------NSTMTGTIQQQAQQDNSNTHWMRTRSKDGIHKPKIPYV 920

Query: 422  SYTPTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLVPPTGSVDVVGCKWVFR 481
                TD    EP +VK AL  P+WK AMD EY AL+ N TWTLVP     +++  KW+F+
Sbjct: 921  GMAETDSEEKEPKSVKEALGRPMWKEAMDKEYKALVSNHTWTLVPYQEQENIIDSKWIFK 980

Query: 482  IKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVVLSLAVSRGWKLRQLD 541
             K   DGSI+R KARLVAKGF Q+ G+DF ETFSPVVK+ST+R++L++AV   W++RQLD
Sbjct: 981  TKYKSDGSIERRKARLVAKGFQQTAGLDFGETFSPVVKSSTVRIILTIAVHFNWEVRQLD 1040

Query: 542  FNNAFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKALYGLKQAPRAWNVTLKSALLSWG 601
             NNAFLN KL E V+M QP GY D   PN+ICKL KA+YGLKQAPRAW  +L+S L++WG
Sbjct: 1041 INNAFLNGKLKETVFMHQPEGYIDAAKPNHICKLSKAIYGLKQAPRAWYDSLRSTLVNWG 1094

Query: 602  FTNSRSDTSLFIYHCGPSIILLLVYVDDVIVTGNNVALIDSLVATLDKTFALKDLGLLSY 661
            F N+++DTSLF          LL+YVDD+IVTG+N+  +++    L+  ++LKDLG L Y
Sbjct: 1101 FQNAKNDTSLFFLKGADHTTFLLIYVDDIIVTGSNIKFLEAFTNQLNTAYSLKDLGPLHY 1094

Query: 662  FLGLQV 663
            FLG++V
Sbjct: 1161 FLGVEV 1094

BLAST of MC07g_new0372 vs. ExPASy TrEMBL
Match: A0A2K3MUJ9 (Putative retrotransposon Ty1-copia subclass protein (Fragment) OS=Trifolium pratense OX=57577 GN=L195_g017679 PE=4 SV=1)

HSP 1 Score: 530 bits (1366), Expect = 9.81e-171
Identity = 291/666 (43.69%), Positives = 389/666 (58.41%), Query Frame = 0

Query: 2    GTCTVMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQS 61
            G   ++S   ++YYVHFLDD+SRF WI+PLK KS+T+ AFN F  +++ QF   IK+++ 
Sbjct: 508  GPAPILSQSNFKYYVHFLDDFSRFTWIFPLKQKSETIHAFNQFKNLVENQFNKKIKVIRC 567

Query: 62   DNGGEYKRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSFWW 121
            D GGEYK V +     GIQ + SCPYTS QNGRAERKHRH+ E GLTLLAQA MPLS+WW
Sbjct: 568  DGGGEYKPVQKCAIDSGIQFQMSCPYTSQQNGRAERKHRHVTELGLTLLAQAKMPLSYWW 627

Query: 122  EAFLTSTLLINGLPSPLLNGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKFQFRT 181
            EAF T+  LIN LPS +   +SP  L+ ++  + + L+ FGCACYP L+PY+ HK QF T
Sbjct: 628  EAFSTAVYLINRLPSSVNPNESPYTLVFKKEPDYTALKPFGCACYPCLKPYNQHKLQFHT 687

Query: 182  NRCVYLGPSPARKGHKCLSSSGRVFISRHVQFNEGDYPFASGFGLQQSTVSDNSLSHSTA 241
             RCV+LG S + KG+KC++S GRVF+SRHV FNE  +PF  GF   ++ +          
Sbjct: 688  TRCVFLGYSNSHKGYKCVNSHGRVFVSRHVVFNENHFPFQEGFLDTRNPIK-------VV 747

Query: 242  APNLHTWFSSLPTLEPATHPPSNTPHPCPPNLPITHNSQPPRSQAPTTSPTSPPQSRPNN 301
              +    F S P                     IT N                     N 
Sbjct: 748  TNDTPIGFPSFPAG-------------------ITTN---------------------NT 807

Query: 302  LEFPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPSFPLSPLPTPEASD 361
             E   N     EP   D + ++  S +  +    D++ F                 E  D
Sbjct: 808  AEATDNIVDQQEPELNDINTVADQSVESDTFEHTDENNFSNG--------------ETED 867

Query: 362  LNIPTSSPLPSLSIPAPETTSVVESFIAPQPPPPFQSI---HPMITRGKVGIFKPKV-LL 421
                      S      E+   +   I    PPP Q I   H M TR K G++KPK+  +
Sbjct: 868  ----------STEAAGRESMEEISQPITETNPPPQQDITNTHWMRTRSKAGVYKPKLPYI 927

Query: 422  SYTPTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLVPPTGSVDVVGCKWVFR 481
              T       EP +V  AL+ P W +AMD EY ALM N+TWTLVP  G  +V+  KW+F+
Sbjct: 928  GLTEEAKEGKEPESVSEALSIPEWLNAMDAEYKALMNNKTWTLVPFEGQENVISSKWIFK 987

Query: 482  IKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVVLSLAVSRGWKLRQLD 541
             K   DG+I+R KARLVA+GF Q+ G+D+ ETFSPVVK+ST+R++LS+AV   W++RQLD
Sbjct: 988  TKYKADGTIERRKARLVARGFQQTAGVDYDETFSPVVKSSTVRIILSIAVHLSWEVRQLD 1047

Query: 542  FNNAFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKALYGLKQAPRAWNVTLKSALLSWG 601
             NNAFLN  L E V+M QP GY D   P++IC+L+KA+YGLKQAPRAW   L+  LLSWG
Sbjct: 1048 INNAFLNGNLKESVFMHQPEGYIDQTKPHHICRLNKAIYGLKQAPRAWFDRLRHTLLSWG 1102

Query: 602  FTNSRSDTSLFIYHCGPSIILLLVYVDDVIVTGNNVALIDSLVATLDKTFALKDLGLLSY 661
            F N++SD+SLF+         LL+YVDD+I+TG+N   +++ ++ L+  F+LKDLG L Y
Sbjct: 1108 FQNTKSDSSLFVLKETDHTTFLLIYVDDIIITGSNNKFLEAFISQLNLVFSLKDLGNLHY 1102

Query: 662  FLGLQV 663
            FLG++V
Sbjct: 1168 FLGIEV 1102

BLAST of MC07g_new0372 vs. ExPASy TrEMBL
Match: A0A2Z6MBG6 (Integrase catalytic domain-containing protein OS=Trifolium subterraneum OX=3900 GN=TSUD_77270 PE=4 SV=1)

HSP 1 Score: 533 bits (1372), Expect = 1.51e-170
Identity = 299/667 (44.83%), Positives = 387/667 (58.02%), Query Frame = 0

Query: 2    GTCTVMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQS 61
            G   +M+S G++YYVHF+DD+SRF WIYPLK KS+T+ AF  F  + + QF   IK++Q 
Sbjct: 501  GPAPIMTSSGFKYYVHFVDDFSRFTWIYPLKQKSETVQAFIQFKNLTENQFNKRIKVIQC 560

Query: 62   DNGGEYKRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSFWW 121
            D GGEYK V +L  + GIQ R SCPYTS QNGRAERKHRHI E GLTLLAQA MPL +WW
Sbjct: 561  DGGGEYKPVQKLAVEAGIQFRMSCPYTSQQNGRAERKHRHITEFGLTLLAQAQMPLHYWW 620

Query: 122  EAFLTSTLLINGLPSPLLNGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKFQFRT 181
            EAF T+  LIN LPS +   +SP  L++Q+  +   L+ FGCACYP L+PY+ HK Q+ T
Sbjct: 621  EAFSTAVYLINRLPSQVTQNESPYSLMLQKEPDYKLLKTFGCACYPCLKPYNQHKLQYHT 680

Query: 182  NRCVYLGPSPARKGHKCLSSSGRVFISRHVQFNEGDYPFASGFGLQQSTVSDNSLSHSTA 241
             RCV+LG S + KG+KCL+S GR+FISRHV FNE  +PF  GF   +S +       ST+
Sbjct: 681  TRCVFLGYSNSHKGYKCLNSHGRIFISRHVIFNEDHFPFHDGFLNTRSPLKTTINVPSTS 740

Query: 242  APNLHTWF----SSLPTLEPATHPPSNTPHPCPPNLPITHNSQPPRSQAPTTSPTSPPQS 301
             P          +S+P LE                           ++ P  + T   Q 
Sbjct: 741  FPLCTAGNVIDDASMPILE---------------------------AENPAETNTEDSQD 800

Query: 302  RPNNLEFPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPSFPLSPLPTP 361
              ++ E   N P               ++H+    I+  Q                    
Sbjct: 801  VNSDTEQTNNGPSEDN-----------TTHEETLDITQQQSVG----------------- 860

Query: 362  EASDLNIPTSSPLPSLSIPAPETTSVVESFIAPQPPPPFQSIHPMITRGKVGIFKPKV-L 421
            EAS  N  TS                                H + TR K GI KPK+  
Sbjct: 861  EASQ-NTNTS--------------------------------HAIHTRSKSGIHKPKLPY 920

Query: 422  LSYTPTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLVPPTGSVDVVGCKWVF 481
            +  T T     EP   K AL+ P+WK AM  E+ ALM N+TW LVP     ++V  KWVF
Sbjct: 921  IGLTETYKDTMEPANAKEALSRPLWKEAMQKEFEALMSNKTWILVPYQNQENIVDSKWVF 980

Query: 482  RIKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVVLSLAVSRGWKLRQL 541
            + K   DGS++R KARLVAKGF Q+ GID+ ETFSPV+KAST+R++LS+AV   W++RQL
Sbjct: 981  KTKYKPDGSLERRKARLVAKGFQQTAGIDYEETFSPVIKASTVRIILSIAVHLNWEVRQL 1040

Query: 542  DFNNAFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKALYGLKQAPRAWNVTLKSALLSW 601
            D NNAFLN  L E V+M QP G+ D   PN+ICKL KA+YGLKQAPRAW  +LK+ALL+W
Sbjct: 1041 DINNAFLNGHLKETVFMHQPEGFVDSTKPNHICKLSKAIYGLKQAPRAWFDSLKTALLNW 1079

Query: 602  GFTNSRSDTSLFIYHCGPSIILLLVYVDDVIVTGNNVALIDSLVATLDKTFALKDLGLLS 661
            GF N++SD+SLF+      I  LL+YVDD+IVTG+N   + + +  L+  F+LKDLG L 
Sbjct: 1101 GFQNTKSDSSLFLLKGKDHITFLLIYVDDIIVTGSNGKFLQAFIKQLNDAFSLKDLGHLH 1079

Query: 662  YFLGLQV 663
            YFLG++V
Sbjct: 1161 YFLGIEV 1079

BLAST of MC07g_new0372 vs. ExPASy TrEMBL
Match: A0A396JWP9 (Putative RNA-directed DNA polymerase OS=Medicago truncatula OX=3880 GN=MtrunA17_Chr1g0213431 PE=4 SV=1)

HSP 1 Score: 533 bits (1373), Expect = 1.45e-169
Identity = 323/676 (47.78%), Positives = 411/676 (60.80%), Query Frame = 0

Query: 2    GTCTVMSSDGYRYYVHFLDDYSRFVWIYPLKLKSDTLSAFNHFTTMIKTQFGSHIKMLQS 61
            G   V SS GY Y++  +D YSR+ WIYPLKLKS TLS F +F TMI+ Q    I  +Q+
Sbjct: 582  GPAPVESSCGYTYFLTCVDAYSRYTWIYPLKLKSHTLSTFQNFKTMIELQLNHKITSVQT 641

Query: 62   DNGGEYKRVHQLCHQLGIQSRFSCPYTSAQNGRAERKHRHIVETGLTLLAQASMPLSFWW 121
            D GGE+    +  + LGI  RF+CP+T  QNG  ERKHRHIVETGLTLL+ A MPL FW 
Sbjct: 642  DGGGEFLPFTKYLNSLGITHRFTCPHTHHQNGSVERKHRHIVETGLTLLSHAQMPLKFWD 701

Query: 122  EAFLTSTLLINGLPSPLLNGKSPMELLIQRSLNVSELRIFGCACYPFLRPYHTHKFQFRT 181
             AFLT+T LIN LP+P+L  KSP  LL  +  +   L+ FGCAC+PFLRPY++HKF F +
Sbjct: 702  HAFLTATYLINRLPTPVLANKSPFFLLHLQFPDYKFLKSFGCACFPFLRPYNSHKFDFHS 761

Query: 182  NRCVYLGPSPARKGHKCLSSSGRVFISRHVQFNEGDYPFASGFGLQQ--STVSDNSLSHS 241
              CV+LG S + KG+KCL +SGR+FIS+ V FNE  +P+   F  Q+  S + D      
Sbjct: 762  KECVFLGYSNSHKGYKCLDASGRIFISKDVVFNEVKFPYLDLFPSQKVCSVLPDG----- 821

Query: 242  TAAPNLHTWFSSLPTLEPATHPPSNTPHPCPPNLPITHNSQPPRSQAPTTSP--TSPPQS 301
               P L T+   LPT      P S T          T NS  P++    + P   + P  
Sbjct: 822  ---PTLSTF---LPT------PVSTT---------FTVNSHTPQNSHSKSGPHIVNSPTP 881

Query: 302  RPNNLEFPINSPLSHEPLPIDTSPISSSSHQLPSTISADQHAFCTPSNPPSFPLSPLPTP 361
            + ++ EF   +P+S+ P     +P  SS H      S   H      NP   P++ L +P
Sbjct: 882  QTSHSEFVPTTPISNTP----QTPSISSHH------SESSHRNNVVLNP--TPITIL-SP 941

Query: 362  EASDLNIPTSSPLPSLSIPAPETTSVVESFIAPQPPPPFQSIHP-----MITRGKVGIFK 421
             AS      SSP  S S+ + ++T+      +  PPP    IHP     M TRGK GI +
Sbjct: 942  SASQ----NSSPESSASVTSSQSTN------SESPPPVPHRIHPQNCHTMRTRGKHGIVQ 1001

Query: 422  PKVLLSYTPTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLVPPTGSVDVVGC 481
            P++  +   T     EPTT K AL  P W  AM  EYNAL+ NQTW+LV    +   +GC
Sbjct: 1002 PRINPTLLLTH---VEPTTYKTALQDPKWHLAMQEEYNALLHNQTWSLVSLPANRLAIGC 1061

Query: 482  KWVFRIKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVVLSLAVSRGWK 541
            KWVFR+K N DG++ + KARLVAKGFHQ  G D+ ETFSPVVK  T+R VL+LAV+  W 
Sbjct: 1062 KWVFRVKENPDGTVNKYKARLVAKGFHQQAGFDYNETFSPVVKPVTVRTVLTLAVTYNWT 1121

Query: 542  LRQLDFNNAFLNDKLDEDVYMSQPPGYADPRYPNYICKLHKALYGLKQAPRAWNVTLKSA 601
            L+QLD NNAFLN  L E+VYM QPPG+      N +CKLHKALYGLKQAPRAW   LKS+
Sbjct: 1122 LQQLDVNNAFLNGVLTEEVYMVQPPGFESSD-KNLVCKLHKALYGLKQAPRAWFERLKSS 1181

Query: 602  LLSWGFTNSRSDTSLFIYHCGPSIILLLVYVDDVIVTGNNVALIDSLVATLDKTFALKDL 661
            LLS+GF +SR D SLF  H     I +LVYVDD+I+TGN+   I +LV  L+  F+LKDL
Sbjct: 1182 LLSFGFKSSRCDPSLFTLHTQAHCIFILVYVDDIIITGNSKLAIQNLVHQLNSEFSLKDL 1204

Query: 662  GLLSYFLGLQVTHLPS 668
            G+L YFLG++V H PS
Sbjct: 1242 GILDYFLGIEVHHSPS 1204

BLAST of MC07g_new0372 vs. TAIR 10
Match: AT4G23160.1 (cysteine-rich RLK (RECEPTOR-like protein kinase) 8)

HSP 1 Score: 217.6 bits (553), Expect = 3.0e-56
Identity = 109/240 (45.42%), Positives = 150/240 (62.50%), Query Frame = 0

Query: 428 EPTTVKVALATPIWKSAMDLEYNALMQNQTWTLVPPTGSVDVVGCKWVFRIKRNFDGSIQ 487
           EP+T   A    +W  AMD E  A+    TW +     +   +GCKWV++IK N DG+I+
Sbjct: 85  EPSTYNEAKEFLVWCGAMDDEIGAMETTHTWEICTLPPNKKPIGCKWVYKIKYNSDGTIE 144

Query: 488 RNKARLVAKGFHQSPGIDFFETFSPVVKASTIRVVLSLAVSRGWKLRQLDFNNAFLNDKL 547
           R KARLVAKG+ Q  GIDF ETFSPV K ++++++L+++    + L QLD +NAFLN  L
Sbjct: 145 RYKARLVAKGYTQQEGIDFIETFSPVCKLTSVKLILAISAIYNFTLHQLDISNAFLNGDL 204

Query: 548 DEDVYMSQPPGYA----DPRYPNYICKLHKALYGLKQAPRAWNVTLKSALLSWGFTNSRS 607
           DE++YM  PPGYA    D   PN +C L K++YGLKQA R W +     L+ +GF  S S
Sbjct: 205 DEEIYMKLPPGYAARQGDSLPPNAVCYLKKSIYGLKQASRQWFLKFSVTLIGFGFVQSHS 264

Query: 608 DTSLFIYHCGPSIILLLVYVDDVIVTGNNVALIDSLVATLDKTFALKDLGLLSYFLGLQV 664
           D + F+       + +LVYVDD+I+  NN A +D L + L   F L+DLG L YFLGL++
Sbjct: 265 DHTYFLKITATLFLCVLVYVDDIIICSNNDAAVDELKSQLKSCFKLRDLGPLKYFLGLEI 324

BLAST of MC07g_new0372 vs. TAIR 10
Match: ATMG00820.1 (Reverse transcriptase (RNA-dependent DNA polymerase))

HSP 1 Score: 126.7 bits (317), Expect = 7.0e-29
Identity = 64/125 (51.20%), Positives = 83/125 (66.40%), Query Frame = 0

Query: 402 MITRGKVGIFKPKVLLSYTPTDWSVTEPTTVKVALATPIWKSAMDLEYNALMQNQTWTLV 461
           M+TR K GI K     S T T     EP +V  AL  P W  AM  E +AL +N+TW LV
Sbjct: 1   MLTRSKAGINKLNPKYSLTITTTIKKEPKSVIFALKDPGWCQAMQEELDALSRNKTWILV 60

Query: 462 PPTGSVDVVGCKWVFRIKRNFDGSIQRNKARLVAKGFHQSPGIDFFETFSPVVKASTIRV 521
           PP  + +++GCKWVF+ K + DG++ R KARLVAKGFHQ  GI F ET+SPVV+ +TIR 
Sbjct: 61  PPPVNQNILGCKWVFKTKLHSDGTLDRLKARLVAKGFHQEEGIYFVETYSPVVRTATIRT 120

Query: 522 VLSLA 527
           +L++A
Sbjct: 121 ILNVA 125
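
The percentages on each HSP summary line follow directly from the residue counts, for example 64/125 = 51.20% identity for ATMG00820.1. A minimal Python sketch of that arithmetic is given here; the helper name `parse_hsp_stats` and the regular expression are illustrative assumptions, not part of the database or of any BLAST tool, and the pattern accepts the label spelled either "Positives" or "Postives".

```python
import re

# Hypothetical helper for illustration only: it reads the
# "Identity = a/b (p%), Positives = c/d (q%)" summary line of one HSP.
HSP_STATS = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"Posi?tives = (\d+)/(\d+) \(([\d.]+)%\)"
)


def parse_hsp_stats(line):
    """Recompute identity/positives percentages from the raw counts."""
    match = HSP_STATS.search(line)
    if match is None:
        raise ValueError("not an HSP summary line: %r" % line)
    ident, ident_len, ident_pct, pos, pos_len, pos_pct = match.groups()
    return {
        # Percentages recomputed from the counts, rounded to two decimals.
        "identity_pct": round(100 * int(ident) / int(ident_len), 2),
        "positives_pct": round(100 * int(pos) / int(pos_len), 2),
        # Percentages exactly as printed on the line, for comparison.
        "reported": (float(ident_pct), float(pos_pct)),
    }


stats = parse_hsp_stats(
    "Identity = 64/125 (51.20%), Positives = 83/125 (66.40%), Query Frame = 0"
)
print(stats)
```

Comparing the recomputed values with the `reported` tuple is a quick consistency check when the alignments are copied or re-parsed.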

The following BLAST results are available for this feature:
Match Name   E-value    Identity (%)  Description
Q94HW2       2.0e-137   41.76         Retrovirus-related Pol polyprotein from transposon RE1 OS=Arabidopsis thaliana O...
Q9ZT94       2.6e-137   43.36         Retrovirus-related Pol polyprotein from transposon RE2 OS=Arabidopsis thaliana O...
P10978       3.4e-65    29.45         Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum...
P04146       7.4e-52    27.39         Copia protein OS=Drosophila melanogaster OX=7227 GN=GIP PE=1 SV=3
P92520       9.8e-28    51.20         Uncharacterized mitochondrial protein AtMg00820 OS=Arabidopsis thaliana OX=3702 ...

Match Name   E-value    Identity (%)  Description
PNY02796.1   1.48e-180  46.53         copia protein (gag-int-pol protein), partial [Trifolium pratense]
GAU17915.1   1.49e-173  44.80         hypothetical protein TSUD_330400, partial [Trifolium subterraneum]
GAU51268.1   1.04e-172  44.59         hypothetical protein TSUD_412550 [Trifolium subterraneum]
PNX94503.1   2.03e-170  43.69         putative retrotransposon Ty1-copia subclass protein, partial [Trifolium pratense...
GAU19483.1   3.12e-170  44.83         hypothetical protein TSUD_77270 [Trifolium subterraneum]

Match Name   E-value    Identity (%)  Description
A0A2K3NIC3   7.18e-181  46.53         Copia protein (Gag-int-pol protein) (Fragment) OS=Trifolium pratense OX=57577 GN...
A0A2Z6P4D5   5.02e-173  44.59         Integrase catalytic domain-containing protein OS=Trifolium subterraneum OX=3900 ...
A0A2K3MUJ9   9.81e-171  43.69         Putative retrotransposon Ty1-copia subclass protein (Fragment) OS=Trifolium prat...
A0A2Z6MBG6   1.51e-170  44.83         Integrase catalytic domain-containing protein OS=Trifolium subterraneum OX=3900 ...
A0A396JWP9   1.45e-169  47.78         Putative RNA-directed DNA polymerase OS=Medicago truncatula OX=3880 GN=MtrunA17_...

Match Name   E-value    Identity (%)  Description
AT4G23160.1  3.0e-56    45.42         cysteine-rich RLK (RECEPTOR-like protein kinase) 8
ATMG00820.1  7.0e-29    51.20         Reverse transcriptase (RNA-dependent DNA polymerase)

InterPro
Analysis Name: InterPro Annotations of Bitter gourd (Dali-11) v1
Date Performed: 2021-10-25
IPR Term   IPR Description                                      Source       Source Term  Source Description   Alignment
IPR013103  Reverse transcriptase, RNA-dependent DNA polymerase  PFAM         PF07727      RVT_2                coord: 455..664; e-value: 4.5E-58; score: 196.7
IPR001584  Integrase, catalytic core                            PFAM         PF00665      rve                  coord: 8..88; e-value: 1.8E-12; score: 47.4
IPR001584  Integrase, catalytic core                            PROSITE      PS50994      INTEGRASE            coord: 1..152; score: 19.255964
IPR036397  Ribonuclease H superfamily                           GENE3D       3.30.420.10  -                    coord: 3..157; e-value: 2.1E-29; score: 104.4
None       No IPR available                                     MOBIDB_LITE  mobidb-lite  disorder_prediction  coord: 295..345
None       No IPR available                                     MOBIDB_LITE  mobidb-lite  disorder_prediction  coord: 259..277
None       No IPR available                                     MOBIDB_LITE  mobidb-lite  disorder_prediction  coord: 359..376
None       No IPR available                                     MOBIDB_LITE  mobidb-lite  disorder_prediction  coord: 253..376
None       No IPR available                                     PANTHER      PTHR45895    FAMILY NOT NAMED     coord: 8..217; coord: 428..664
IPR012337  Ribonuclease H-like superfamily                      SUPERFAMILY  53098        Ribonuclease H-like  coord: 8..146
IPR043502  DNA/RNA polymerase superfamily                       SUPERFAMILY  56672        DNA/RNA polymerases  coord: 455..663
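
The `coord` ranges above can be compared with the query ranges of the BLAST hits, for example the RVT_2 domain at 455..664 against the AT4G23160.1 alignment, which covers query residues 428..664. The Python sketch below does that comparison. It assumes the coordinates are 1-based and inclusive on the protein sequence, and the function names are hypothetical, chosen only for this illustration.

```python
def parse_coord(coord):
    """Turn an inclusive 'start..end' string into an (int, int) tuple."""
    start, end = coord.split("..")
    return int(start), int(end)


def span_length(coord):
    """Number of residues covered by an inclusive range."""
    start, end = parse_coord(coord)
    return end - start + 1


def overlap_length(coord_a, coord_b):
    """Number of residues shared by two inclusive ranges (0 if disjoint)."""
    a_start, a_end = parse_coord(coord_a)
    b_start, b_end = parse_coord(coord_b)
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


# RVT_2 (PF07727) domain length.
print(span_length("455..664"))
# Overlap between RVT_2 and the query range aligned to AT4G23160.1.
print(overlap_length("455..664", "428..664"))
# The rve (PF00665) domain does not overlap RVT_2.
print(overlap_length("8..88", "455..664"))
```

Under the inclusive-coordinate assumption, the whole RVT_2 domain falls inside the region aligned to AT4G23160.1, while the integrase core sits at the N-terminal end of the protein.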

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name     Unique Name      Type
MC07g_new0372.1  MC07g_new0372.1  mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category            Term Accession  Term Name
biological_process  GO:0015074      DNA integration
molecular_function  GO:0003676      nucleic acid binding