ClCG03G007900 (gene) Watermelon (Charleston Gray) v2.5

Overview
Name: ClCG03G007900
Type: gene
Organism: Citrullus lanatus subsp. vulgaris cv. Charleston Gray (Watermelon (Charleston Gray) v2.5)
Description: nuclear envelope pore membrane protein POM 121-like
Location: CG_Chr03: 8978993 .. 8982550 (+)
RNA-Seq Expression: ClCG03G007900
Synteny: ClCG03G007900
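
The Location field above can be turned into a genomic span with a few lines of code. The sketch below is illustrative only: the helper names `parse_location` and `span_length` are hypothetical, and it assumes the coordinates are 1-based and inclusive (as in GFF3), which this page does not state explicitly.

```python
import re

def parse_location(text):
    """Split 'SEQID: START .. END (STRAND)' into its four parts."""
    m = re.fullmatch(r"\s*(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*", text)
    if m is None:
        raise ValueError(f"unrecognised location: {text!r}")
    seqid, start, end, strand = m.groups()
    return seqid, int(start), int(end), strand

def span_length(start, end):
    """Length of a span, assuming 1-based inclusive coordinates."""
    return end - start + 1

seqid, start, end, strand = parse_location("CG_Chr03: 8978993 .. 8982550 (+)")
print(seqid, strand, span_length(start, end))  # CG_Chr03 + 3558
```

Under that assumption the gene spans 3558 bp on the plus strand of CG_Chr03.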
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

AAAAAAAAAAGAAAAAAGAAAAAAAAAGTGTGAGATGAAGAAAACTCAAGCCAAACGACAGCGTATTCCCAGGAGAGGACCTGGCGTTGCAGAGCTTGAGAAGATTTTAAAGGAGCAAGAAGGCGGCGGCGGTGACGGCGGTGCCGGCCAAGGTCATGACTCCTTCATCACTCAATTTCAAGACATTTCGCCATCGCCGTCGTTAAACCCTCCCCGGCCGCCACCGCCGCCGCCTCCTCCGCCACTTGTTCCAATAATCACGCCGCCGTGTGACTATGCTTCTTGGTCAAATAATCTTCCGTTATTCCCAACTCTTGAATTCATTCCCCCGCCTTTACCAACCGCCGCCGCCGCCGCTTCCTCCGCCAAAAAACCCTTGTTTCCAACAACCCGAATATCCGATTCCCAATTGAATTTTCCTTCGCATTTTTTCCCTAGCTTTCAATATTCTGCTTCTTCTCTCAATCTTGAATATTGTAATTCAATGGTGAGACAAGATGGAAAAAATTGTGAAATATTTTATGGATCTTCAAAATGAATTATATTAAATTACTGGAAATAAATTTTTCAGGTGAATATTAATCCAGGATCTGCCTCTTCGTATCCGTCAGCCACGTCATCGGCGGCGAGATATTTTCGTCAAATAGAGCACCCTTCAAGCCAAATATCAACTGATTTCAACAACATATGGACCTCACCGGAAGAACAAGAAAAGGCAAAATTACTTCTCACCTCTTTTTCTTTTAATTTTTATTTATTTCCAAAATGTCCAAACATATTTTTAAATTTACCAACCTAAAATACTCTCTCATTGGTTAATTAATATATGATTTTGTTTAATAATAGTTACTAATAAGTATATATATACGTGTGTGTGTGTGTGTGTGTGTTTAAATGTTAATTGACAGATGGTTAGTGCAAAGAGAGTGAGACCCTTTTTGGAAGAAAGCCATAGGGAAGGAAATGTAGAGAGCAGAGATCCAATTTTGACAAACATGGCAACAAAAGACTCATCTCCCTCTTCTTCTTCTTCTTCAATGGAAATGAATCCTTCTCCATTTCATTTCGATTCAAACTTCAGGTTGTCTTATTTCTTTCAATATCTCCTCTTTTTAGTTTTACTTTTAATTCAATAATGTTTTGTTTTAATCAAAATTTTGGGTACATTTGATCGAAGAATTTTTGGAAGTGGATTATGATTTATGCGCAATAAACCAAAGTGTTAATAGAGAGGCTAGAGTATTAAAAATCAAATCTTTATGCAAAATATTTTTTTTTTTTTTTAAATCTTTTGCACAATATAAATGATAAAAAGTATATTGTGTATTTAGTTTTAAATGTTGGTTTGGAACGTTGCAGATATATTTAGAAAATGGAGGGAAAAAATGTAAATAAACAACCCAAAAAATAAAAAGCTAGTAGAAACATTAAAACTTGGGATAAACTACCGTTTTATAACTCAAATCATTAAGATTAAGATTTTAGTAAAAAACACATGGGATTAGAGTTATTGATGTATGATATATGTCTAAACGAATTAAGGTATGCTCTTGTTCGTGAATAAAACATTTAATCTTTATTTATTTTTTTTTTTCAATTAATAGTGATAAATGTGTAAACAAGGGAAAGGGAAAAGATAAAGGTTGGTTATTTGGCTATTAATTTTGTTATTAGTATTATTATTATTTTGTGTGTGTAGGGGTACAAAAAGGGGTTTTGTTGGGCAGTTAATGAGCTCTATCAAAAGTAGTGGAAGCTATCAATTAGCAGAAGACAACAAATTGATGGTATTAGGATCTTCATCATCATCAGCTCCTAATGAAATTGCACCATTCAATTTCCATCTCCCCCAAGTAAGAATTATGTTTTCAACATTTTTTCTTCTTTTAACTTTATTTAACCTATGCATAATATGATCAACTCAGTTGAATTTGTGAATTTAATCTTGGAAAAGCTATAATTTCATCCATTAAACCCTAGTCTTAAATTAGGCAACCAAA
ATTTCAATACCCTTCTTTAAATCTTAATTTTAAAAACCCCAAAATTTGACCAAACAAAATATTCCAAATCAAGCCAGCTAGAAAAATGCTTAAACCTAAAAACATATGTCTAATCCAATACCAAAATGTCTGAAATCCAACCAAAAACAAACACCCAACATAGCTACAAAATGCACCTAAAAAATTAGCATCTCCCAGATTACTATAAGGAAAGTTTTCATTTTTTTTTTCAATCTTTATTCTAAATCAAAATTAAATTTTTATTGAATTGAATTGAATTCTAGACTTAGACTAAATGGTAGTATTAATAATTTTCTAAAATATTAGACAAAGTAATGGCTAATTAATTAAATTATAATCATTGATATATATTTAGTATGTTCTTTTCCTTTTTTAAATATTTGGAATTATTCTAAATATTTGCCGTCACGCCCTTTCTATTGTTAAGTTTGTGAGCATTATTTTATCTGATTTCATGATTATTTTCGGGTTTAAATATTTCAAGCATTTTCTAATTTAGTTTTTAAAGAGTGTTAATTTTTTAATACTACTTTGGTAATTCTACTTTGAAATCTGGTTTATTTTAGTTTTTGTACTTTCAAAATGTCCACATTAGTCCTTATGTTTTTAATTTTAGTTCATTCATTTTCATTTTTTAGGAGAAATTATCACCTTTTAGAAGTATAGGGACTAAAACGAACTAAGTCAAAAGTATAAGGATCAAATTGAACATTTTGAAAGTATAGGGGCCAAAATGATGATCCCTTCATTTTCATTTTTAAGGAAAAAAATTATCACTTTTTAGAAATATAGGGACCAAAATGAACTAAAGTTGATGTTACAAAAATTAAAATAGACATTTTAAAAGTACAAGATCCAAAATGAACCGAAATTAAAATTACAGGGACCAAAATCGAAAATACAAAAATTAAAGTAGTATTTAAATCTAAAATTAAAAGCGATAAATATTTAGTTTAAATAATATATATGAAGAGGGCAAGAATGGAAATTAACTAAGTGGAGAAATTATTTTGAAGGAAACTATGGAGGCTTCACAGCATATAGATGAAGGAGGATGTGCCTCAGATTACTACAAAGTTACATTTAACTCCTCCTCCTTTTATGAATCAAACTCAAACTCAAACTCAAAGGGCAACAAAGGCATTACCATCGGCTCTAAAGCCGGAGCCCGAGCCGAAGTCGAAGCCGAAGCTGAAGCCGAAGGCATCGATCTGAATTTGAAGCTGTAATTGTTTGTTTTGAACGAAAAAGCTTTGACAAAAGAGAGATTGCGATGCAATTTTATACCCAAAAATATAAGCATACATACAAACTAAGGCTCTTAATTTCTCCAATATTTAGTTGTAGACTACTTGTAGTTTTAGTTTTAGAGAGCTACATTTTAATAGACGGTTCCATGTGTTAACTTTAGAGCTAATTTAAGATAAATTATATGCAAAATCGTAAGTGGTGGGAGTCAAACGTGAGTATACTCCAACGATAATTGACTTGTATTTTGAGGTCAGATAATTAAGTTTACATGTTCTACATTTGACGT

mRNA sequence

AAAAAAAAAAGAAAAAAGAAAAAAAAAGTGTGAGATGAAGAAAACTCAAGCCAAACGACAGCGTATTCCCAGGAGAGGACCTGGCGTTGCAGAGCTTGAGAAGATTTTAAAGGAGCAAGAAGGCGGCGGCGGTGACGGCGGTGCCGGCCAAGGTCATGACTCCTTCATCACTCAATTTCAAGACATTTCGCCATCGCCGTCGTTAAACCCTCCCCGGCCGCCACCGCCGCCGCCTCCTCCGCCACTTGTTCCAATAATCACGCCGCCGTGTGACTATGCTTCTTGGTCAAATAATCTTCCGTTATTCCCAACTCTTGAATTCATTCCCCCGCCTTTACCAACCGCCGCCGCCGCCGCTTCCTCCGCCAAAAAACCCTTGTTTCCAACAACCCGAATATCCGATTCCCAATTGAATTTTCCTTCGCATTTTTTCCCTAGCTTTCAATATTCTGCTTCTTCTCTCAATCTTGAATATTGTAATTCAATGGTGAATATTAATCCAGGATCTGCCTCTTCGTATCCGTCAGCCACGTCATCGGCGGCGAGATATTTTCGTCAAATAGAGCACCCTTCAAGCCAAATATCAACTGATTTCAACAACATATGGACCTCACCGGAAGAACAAGAAAAGATGGTTAGTGCAAAGAGAGTGAGACCCTTTTTGGAAGAAAGCCATAGGGAAGGAAATGTAGAGAGCAGAGATCCAATTTTGACAAACATGGCAACAAAAGACTCATCTCCCTCTTCTTCTTCTTCTTCAATGGAAATGAATCCTTCTCCATTTCATTTCGATTCAAACTTCAGGGGTACAAAAAGGGGTTTTGTTGGGCAGTTAATGAGCTCTATCAAAAGTAGTGGAAGCTATCAATTAGCAGAAGACAACAAATTGATGGTATTAGGATCTTCATCATCATCAGCTCCTAATGAAATTGCACCATTCAATTTCCATCTCCCCCAAGAAACTATGGAGGCTTCACAGCATATAGATGAAGGAGGATGTGCCTCAGATTACTACAAAGTTACATTTAACTCCTCCTCCTTTTATGAATCAAACTCAAACTCAAACTCAAAGGGCAACAAAGGCATTACCATCGGCTCTAAAGCCGGAGCCCGAGCCGAAGTCGAAGCCGAAGCTGAAGCCGAAGGCATCGATCTGAATTTGAAGCTGTAATTGTTTGTTTTGAACGAAAAAGCTTTGACAAAAGAGAGATTGCGATGCAATTTTATACCCAAAAATATAAGCATACATACAAACTAAGGCTCTTAATTTCTCCAATATTTAGTTGTAGACTACTTGTAGTTTTAGTTTTAGAGAGCTACATTTTAATAGACGGTTCCATGTGTTAACTTTAGAGCTAATTTAAGATAAATTATATGCAAAATCGTAAGTGGTGGGAGTCAAACGTGAGTATACTCCAACGATAATTGACTTGTATTTTGAGGTCAGATAATTAAGTTTACATGTTCTACATTTGACGT

Coding sequence (CDS)

ATGAAGAAAACTCAAGCCAAACGACAGCGTATTCCCAGGAGAGGACCTGGCGTTGCAGAGCTTGAGAAGATTTTAAAGGAGCAAGAAGGCGGCGGCGGTGACGGCGGTGCCGGCCAAGGTCATGACTCCTTCATCACTCAATTTCAAGACATTTCGCCATCGCCGTCGTTAAACCCTCCCCGGCCGCCACCGCCGCCGCCTCCTCCGCCACTTGTTCCAATAATCACGCCGCCGTGTGACTATGCTTCTTGGTCAAATAATCTTCCGTTATTCCCAACTCTTGAATTCATTCCCCCGCCTTTACCAACCGCCGCCGCCGCCGCTTCCTCCGCCAAAAAACCCTTGTTTCCAACAACCCGAATATCCGATTCCCAATTGAATTTTCCTTCGCATTTTTTCCCTAGCTTTCAATATTCTGCTTCTTCTCTCAATCTTGAATATTGTAATTCAATGGTGAATATTAATCCAGGATCTGCCTCTTCGTATCCGTCAGCCACGTCATCGGCGGCGAGATATTTTCGTCAAATAGAGCACCCTTCAAGCCAAATATCAACTGATTTCAACAACATATGGACCTCACCGGAAGAACAAGAAAAGATGGTTAGTGCAAAGAGAGTGAGACCCTTTTTGGAAGAAAGCCATAGGGAAGGAAATGTAGAGAGCAGAGATCCAATTTTGACAAACATGGCAACAAAAGACTCATCTCCCTCTTCTTCTTCTTCTTCAATGGAAATGAATCCTTCTCCATTTCATTTCGATTCAAACTTCAGGGGTACAAAAAGGGGTTTTGTTGGGCAGTTAATGAGCTCTATCAAAAGTAGTGGAAGCTATCAATTAGCAGAAGACAACAAATTGATGGTATTAGGATCTTCATCATCATCAGCTCCTAATGAAATTGCACCATTCAATTTCCATCTCCCCCAAGAAACTATGGAGGCTTCACAGCATATAGATGAAGGAGGATGTGCCTCAGATTACTACAAAGTTACATTTAACTCCTCCTCCTTTTATGAATCAAACTCAAACTCAAACTCAAAGGGCAACAAAGGCATTACCATCGGCTCTAAAGCCGGAGCCCGAGCCGAAGTCGAAGCCGAAGCTGAAGCCGAAGGCATCGATCTGAATTTGAAGCTGTAA

Protein sequence

MKKTQAKRQRIPRRGPGVAELEKILKEQEGGGGDGGAGQGHDSFITQFQDISPSPSLNPPRPPPPPPPPPLVPIITPPCDYASWSNNLPLFPTLEFIPPPLPTAAAAASSAKKPLFPTTRISDSQLNFPSHFFPSFQYSASSLNLEYCNSMVNINPGSASSYPSATSSAARYFRQIEHPSSQISTDFNNIWTSPEEQEKMVSAKRVRPFLEESHREGNVESRDPILTNMATKDSSPSSSSSSMEMNPSPFHFDSNFRGTKRGFVGQLMSSIKSSGSYQLAEDNKLMVLGSSSSSAPNEIAPFNFHLPQETMEASQHIDEGGCASDYYKVTFNSSSFYESNSNSNSKGNKGITIGSKAGARAEVEAEAEAEGIDLNLKL
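
The relationship between the coding sequence and the protein sequence above can be spot-checked by translating a few codons. This is a minimal sketch, assuming the standard genetic code (NCBI translation table 1); the `translate` helper is hypothetical, and only the first six and last three codons of the CDS are used to keep the example short.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDDDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate an in-frame DNA string; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

print(translate("ATGAAGAAAACTCAAGCC"))  # first six codons of the CDS: MKKTQA
print(translate("AAGCTGTAA"))           # last three codons of the CDS: KL*
```

Both fragments agree with the start (MKKTQA) and end (KL) of the protein sequence listed above.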
Homology
BLAST of ClCG03G007900 vs. NCBI nr
Match: XP_038895159.1 (uncharacterized protein DDB_G0271670-like [Benincasa hispida])

HSP 1 Score: 517.7 bits (1332), Expect = 8.5e-143
Identity = 306/388 (78.87%), Positives = 318/388 (81.96%), Query Frame = 0

Query: 1   MKKTQAKRQRIPRRGPGVAELEKILKEQEGGGGDGGAGQGHDSFITQFQDISPSPSLNPP 60
           MKK QAKRQRIPRRGPGVAELEKILKEQEG      A Q   S   Q  +  P  SLNPP
Sbjct: 1   MKKAQAKRQRIPRRGPGVAELEKILKEQEG----AAAAQDIIS-PPQPTNTHPLSSLNPP 60

Query: 61  RPPPPPPPPPLVPIITPPCDY-ASWSNNLPLFPTLEFIPPPLP-TAAAAASSAKKPLFPT 120
           RPPPPPPPPPLV    P  DY ASWSNNLPLFPTLEFIPPPLP +AAA  +SAKKPLFPT
Sbjct: 61  RPPPPPPPPPLV----PTTDYNASWSNNLPLFPTLEFIPPPLPNSAAAVTASAKKPLFPT 120

Query: 121 TRISDSQLNFPSHFFPSFQYSASSLNLEYCNSMVNINPGSASSYPSATSSAARYFRQIEH 180
           TRISDSQL F  HFFPSFQYSASSLN+E  NSMVNINPGSASS PSATSSA RYFR+IEH
Sbjct: 121 TRISDSQLKFAPHFFPSFQYSASSLNVECYNSMVNINPGSASSCPSATSSAGRYFREIEH 180

Query: 181 PSSQISTDFNNIWTSPEEQEKMVSAKRVRPFLEESHREGNVESRDPILTNMATKD----- 240
           PSSQIS+DFNNIWTSPEEQEKMVSAKRVR FLEESHRE N+ES+ PI  NMATKD     
Sbjct: 181 PSSQISSDFNNIWTSPEEQEKMVSAKRVRAFLEESHREANIESKGPIFKNMATKDSSSSS 240

Query: 241 --SSPSSSSSSMEMNPSPFHFDSNFRGTKRGFVGQLMSSIKSSGSYQLAEDNKLMVLGSS 300
             SS SSSSSSMEMN SPF+FDSNFRGTKRGF GQLMSS K SGSYQLAED+ LM LGSS
Sbjct: 241 SSSSSSSSSSSMEMNHSPFNFDSNFRGTKRGFAGQLMSSTKRSGSYQLAEDSNLMALGSS 300

Query: 301 SSSAPNEIAPFNFHLPQETMEASQHIDEGGCASDYYKVTFN-SSSFYESNSNSNSKGNKG 360
           SSSAPNEIA FNFHLPQETMEASQH D GGC SDYYKVTFN SSS YE  SNSNSKGNK 
Sbjct: 301 SSSAPNEIAAFNFHLPQETMEASQHRD-GGCVSDYYKVTFNSSSSMYE--SNSNSKGNKE 360

Query: 361 ITIGSKAGARAEVEAEAEAEGIDLNLKL 379
             IGS AGA A+VE EAEA+GIDLNLKL
Sbjct: 361 TVIGSGAGAGAQVEDEAEADGIDLNLKL 376

BLAST of ClCG03G007900 vs. NCBI nr
Match: XP_023001726.1 (trinucleotide repeat-containing gene 18 protein-like [Cucurbita maxima] >XP_023001727.1 trinucleotide repeat-containing gene 18 protein-like [Cucurbita maxima])

HSP 1 Score: 419.9 bits (1078), Expect = 2.4e-113
Identity = 270/397 (68.01%), Positives = 292/397 (73.55%), Query Frame = 0

Query: 1   MKKTQAKRQRIPRRGPGVAELEKILKEQEGGGGDGGAGQGH-------DSFITQFQDISP 60
           MKKT  KR RIPRRGPGVAELEKILKEQEGG G G  GQ          SF    +  SP
Sbjct: 1   MKKTHPKRHRIPRRGPGVAELEKILKEQEGGNGGGNGGQDQADISSLPPSFFQHRRRHSP 60

Query: 61  S---PSLNPPRPPPPPPPPPLVPIITPPC--DYASWSNNLPLFPTLEFIPP--PLPTAAA 120
           S    SLNPPRPPPPPPPPPL+P   PP   DYASWS NLPLFPTL+FIPP  P PT AA
Sbjct: 61  SNTPSSLNPPRPPPPPPPPPLLP---PPLTRDYASWS-NLPLFPTLDFIPPALPTPTPAA 120

Query: 121 AASSAKKPLFPTTRISDSQLNFPSHFFPSFQYSASSLNLEYCNSMVNINPGSASSYPSAT 180
           AA++A+KPLFPTTR SD+Q+N P HFFP+FQYSASS NL     M+NI PG      SAT
Sbjct: 121 AAAAAEKPLFPTTRKSDAQINLPPHFFPTFQYSASSHNL-----MMNIIPG------SAT 180

Query: 181 SSAARYFRQIEHPSSQISTDFNNIWTSPEEQEKMVSAKRVRPFLEESHREGNVESRDPIL 240
           S A RY+RQIEHPSSQIST FN+ WTSPEEQ+KMVSAKRVRPFL+E HR+ N ESR P  
Sbjct: 181 SPATRYYRQIEHPSSQISTQFNHTWTSPEEQQKMVSAKRVRPFLDEGHRDPNAESRAPFF 240

Query: 241 TNMATKD---SSPSSSSSSMEMNPSPFHFDSNFRGTKRGFVGQLMSSIKSSGSYQLA-ED 300
           TNMATK+   SS SSSSSSM+MN SPF+ DSN RGTKRGF G+LM   K S  YQLA E 
Sbjct: 241 TNMATKESSSSSSSSSSSSMDMNRSPFNLDSNSRGTKRGFGGELMRCSKRSERYQLAGEV 300

Query: 301 NKLMVLGSSSSSAPNEI-APFNFHLPQETMEASQHIDEGGCASDYYKVTFNSSSFYESNS 360
           + LM LG SSSSAPNE+ A FN H PQETMEASQ+ DEGG ASDY KVTFNSSS   S  
Sbjct: 301 SHLMALG-SSSSAPNEVAAAFNIHHPQETMEASQYRDEGGSASDYNKVTFNSSS--SSLY 360

Query: 361 NSNSKGNKGITIGSKAGARAEVEAEAEAEGIDLNLKL 379
            S SKGNK I IGS      EVEAEAEAEGIDLNLKL
Sbjct: 361 ESKSKGNKEIVIGS----AFEVEAEAEAEGIDLNLKL 375

BLAST of ClCG03G007900 vs. NCBI nr
Match: XP_031738182.1 (uncharacterized protein LOC105434463 [Cucumis sativus] >XP_031738184.1 uncharacterized protein LOC105434463 [Cucumis sativus] >KGN64713.1 hypothetical protein Csa_013361 [Cucumis sativus])

HSP 1 Score: 411.4 bits (1056), Expect = 8.5e-111
Identity = 263/401 (65.59%), Positives = 293/401 (73.07%), Query Frame = 0

Query: 1   MKKTQAKRQRIPRRGPGVAELEKILKEQEGGGGDG--GAGQGHDSFITQFQDISPSP-SL 60
           M+KTQAKRQRIPRRGPGVAELEKILKEQE G           H +  +     +  P SL
Sbjct: 1   MRKTQAKRQRIPRRGPGVAELEKILKEQESGAATDHQHTSSPHTNSTSTAATTTTHPLSL 60

Query: 61  NPPRPPPPPPPPPLVPIITP-----PCDYASWSNNLPLFPTLEFIPPP-LPTAAAAASSA 120
           NPPRPPPPPPPPPLV I+TP     P DY +WSNNLPLFPTLEFIPPP LPT        
Sbjct: 61  NPPRPPPPPPPPPLVQIMTPSPPPLPRDYVAWSNNLPLFPTLEFIPPPYLPTVV-----T 120

Query: 121 KKPLFPTTRISDSQLNFPSHFFPSFQYSASSLNL-EYCNSMVNINPGSASSYPSATSSAA 180
           +KPLFPTTR+S+SQLN   +F PSFQYSASS N  +Y N MVN+N GS SS PSATSS+A
Sbjct: 121 EKPLFPTTRMSESQLNLAPYFLPSFQYSASSFNPDQYYNPMVNVNQGSGSSCPSATSSSA 180

Query: 181 -RYFRQIEHPSSQISTDFNNIWTSPEEQEKMVSAKRVRPFLEESHRE-------GNVESR 240
            R+FR+IEHPSSQISTDFNNIW SPEE+EKMV+AKRV PFLEESHRE        N+  +
Sbjct: 181 GRHFREIEHPSSQISTDFNNIWNSPEEEEKMVNAKRVIPFLEESHREEANNNNNNNIIEK 240

Query: 241 DPILTN-MATKDSSPSSSSSSMEMNPSPFHFDSNFRGTKRGFVGQ-LMSSIKSSGSYQLA 300
             +  N M TKDSS SSSSSSME N SPFHF SNFRGTKRG  GQ  MS+ K SG YQL 
Sbjct: 241 MRVENNIMGTKDSS-SSSSSSMETNCSPFHFHSNFRGTKRGLGGQSRMSNTKRSGRYQLG 300

Query: 301 -EDNKLMVLGSSSSSAPNEIAPFN-FHLPQETMEASQHIDEGGCASDYYKVTFNSSSFYE 360
            +++ LM LGSSSSSAPNEI  FN FHLPQETME  QH DEGGC SDYYK+ FNSS +  
Sbjct: 301 DQESNLMALGSSSSSAPNEIPTFNIFHLPQETMEVPQHRDEGGCPSDYYKLNFNSSLY-- 360

Query: 361 SNSNSNSKGNKGI-TIGSKAGARAEVEAEAEAEGIDLNLKL 379
             SNSN+KGN+ +  I S AG+    EAEAEAEGIDLNLKL
Sbjct: 361 -ESNSNTKGNREVGMISSGAGS----EAEAEAEGIDLNLKL 388

BLAST of ClCG03G007900 vs. NCBI nr
Match: XP_022927210.1 (nuclear envelope pore membrane protein POM 121-like [Cucurbita moschata] >XP_022927212.1 nuclear envelope pore membrane protein POM 121-like [Cucurbita moschata] >XP_022927213.1 nuclear envelope pore membrane protein POM 121-like [Cucurbita moschata])

HSP 1 Score: 406.8 bits (1044), Expect = 2.1e-109
Identity = 265/397 (66.75%), Positives = 286/397 (72.04%), Query Frame = 0

Query: 1   MKKTQAKRQRIPRRGPGVAELEKILKEQEGGGGDGGAGQGHD---------SFITQFQDI 60
           MKKTQ KR RIPRRGPGVAELEKILKEQEGG G G    G D         SF    +  
Sbjct: 1   MKKTQPKRHRIPRRGPGVAELEKILKEQEGGDGGGNGNGGQDQADISSLPPSFFQHRRRH 60

Query: 61  SPS---PSLNPPRPPPPPPPPPLVPIITPPC--DYASWSNNLPLFPTLEFIPPPLPTAAA 120
           SPS    SLNPPRPPPPPPPPPL+P   PP   DYASWS NLPLFPTL+FIPP LPT   
Sbjct: 61  SPSNTPSSLNPPRPPPPPPPPPLLP---PPLTRDYASWS-NLPLFPTLDFIPPALPTPTP 120

Query: 121 AASSAKKPLFPTTRISDSQLNFPSHFFPSFQYSASSLNLEYCNSMVNINPGSASSYPSAT 180
           AA++ +KPLFPTTR SD+Q+N P HFFP+FQYSASS      N M+NI PG      SAT
Sbjct: 121 AAAAVEKPLFPTTRKSDAQINLPPHFFPTFQYSASS-----HNPMMNIIPG------SAT 180

Query: 181 SSAARYFRQIEHPSSQISTDFNNIWTSPEEQEKMVSAKRVRPFLEESHREGNVESRDPIL 240
           S AARY+RQIEHPSSQIST FNN WTSPEEQ+KMVSAKRVRPFL+E HR+ N ESR P  
Sbjct: 181 SPAARYYRQIEHPSSQISTQFNNTWTSPEEQQKMVSAKRVRPFLDEGHRDPNAESRAPFF 240

Query: 241 TNMATKD---SSPSSSSSSMEMNPSPFHFDSNFRGTKRGFVGQLMSSIKSSGSYQLA-ED 300
           TNMATK+   SS SSSSSSM+MN SPF  DSN RGTKRGF G+LM   K S  YQLA E 
Sbjct: 241 TNMATKESSSSSSSSSSSSMDMNRSPFDLDSNSRGTKRGFGGELMRCSKRSERYQLAGEV 300

Query: 301 NKLMVLGSSSSSAPNEI-APFNFHLPQETMEASQHIDEGGCASDYYKVTFNSSSFYESNS 360
           + LM LG SSSSAPNE+ A FN H PQET EASQ+ D GG ASDY KVTFNSSS   S  
Sbjct: 301 SHLMALG-SSSSAPNEVAAAFNIHHPQETTEASQYRD-GGSASDYNKVTFNSSS--SSLY 360

Query: 361 NSNSKGNKGITIGSKAGARAEVEAEAEAEGIDLNLKL 379
            S  KGNK I IGS      EVEAEAEAEGIDL+LKL
Sbjct: 361 ESKLKGNKEIVIGS----GFEVEAEAEAEGIDLSLKL 374

BLAST of ClCG03G007900 vs. NCBI nr
Match: KAG6583971.1 (hypothetical protein SDJN03_19903, partial [Cucurbita argyrosperma subsp. sororia])

HSP 1 Score: 403.7 bits (1036), Expect = 1.8e-108
Identity = 265/399 (66.42%), Positives = 290/399 (72.68%), Query Frame = 0

Query: 1   MKKTQAKRQRIPRRGPGVAELEKILKEQEG--GGGDGGAGQGHDS------FITQFQDIS 60
           MKKT  KR RIPRRGPGVAELEKILKEQ+G  GGG+GG  Q H S      F  + +  S
Sbjct: 1   MKKTHPKRHRIPRRGPGVAELEKILKEQQGGHGGGNGGQDQAHISSLPPSFFQHRRRRHS 60

Query: 61  PS---PSLNPPRPPPPPPPPPLVPIITPPC--DYASWSNNLPLFPTLEFIPPPLPT---A 120
           PS    SLNPPRPPPPPPPPPL+P   PP   DYASWS NLPLFPTL+FIPP LPT    
Sbjct: 61  PSNTPSSLNPPRPPPPPPPPPLLP---PPLTRDYASWS-NLPLFPTLDFIPPALPTPTPT 120

Query: 121 AAAASSAKKPLFPTTRISDSQLNFPSHFFPSFQYSASSLNLEYCNSMVNINPGSASSYPS 180
            AAA++ +KPLFPTTR SD+Q+N P HFFP+FQYSASS NL     M+NI PG      S
Sbjct: 121 PAAAAAVEKPLFPTTRKSDAQINLPPHFFPTFQYSASSHNL-----MMNIIPG------S 180

Query: 181 ATSSAARYFRQIEHPSSQISTDFNNIWTSPEEQEKMVSAKRVRPFLEESHREGNVESRDP 240
           ATS AARY+RQIEHPSSQIST FNN WTSPEEQ+KMVSAKRVRPFL+E HR+ N ESR P
Sbjct: 181 ATSPAARYYRQIEHPSSQISTQFNNTWTSPEEQQKMVSAKRVRPFLDEGHRDPNAESRAP 240

Query: 241 ILTNMATKD---SSPSSSSSSMEMNPSPFHFDSNFRGTKRGFVGQLMSSIKSSGSYQLA- 300
             TNMATK+   SS SSSSSSM+MN SP   DSN RGTKRGF G+LM   K S  YQLA 
Sbjct: 241 FFTNMATKESSSSSSSSSSSSMDMNRSPLDLDSNSRGTKRGFGGELMRCSKRSERYQLAG 300

Query: 301 EDNKLMVLGSSSSSAPNEI-APFNFHLPQETMEASQHIDEGGCASDYYKVTFNSSSFYES 360
           E + LM LG SSSSAPNE+ A FN H PQETMEASQ+ D GG ASDY +VTFNSSS   S
Sbjct: 301 EVSHLMALG-SSSSAPNEVAAAFNIHHPQETMEASQYRDGGGSASDYNRVTFNSSS--SS 360

Query: 361 NSNSNSKGNKGITIGSKAGARAEVEAEAEAEGIDLNLKL 379
              S  KGNK I IGS      EVE EAEAEGIDL+LKL
Sbjct: 361 LYESKLKGNKEIVIGS----GFEVEVEAEAEGIDLSLKL 377

BLAST of ClCG03G007900 vs. ExPASy TrEMBL
Match: A0A6J1KHF7 (trinucleotide repeat-containing gene 18 protein-like OS=Cucurbita maxima OX=3661 GN=LOC111495777 PE=4 SV=1)

HSP 1 Score: 419.9 bits (1078), Expect = 1.2e-113
Identity = 270/397 (68.01%), Positives = 292/397 (73.55%), Query Frame = 0

Query: 1   MKKTQAKRQRIPRRGPGVAELEKILKEQEGGGGDGGAGQGH-------DSFITQFQDISP 60
           MKKT  KR RIPRRGPGVAELEKILKEQEGG G G  GQ          SF    +  SP
Sbjct: 1   MKKTHPKRHRIPRRGPGVAELEKILKEQEGGNGGGNGGQDQADISSLPPSFFQHRRRHSP 60

Query: 61  S---PSLNPPRPPPPPPPPPLVPIITPPC--DYASWSNNLPLFPTLEFIPP--PLPTAAA 120
           S    SLNPPRPPPPPPPPPL+P   PP   DYASWS NLPLFPTL+FIPP  P PT AA
Sbjct: 61  SNTPSSLNPPRPPPPPPPPPLLP---PPLTRDYASWS-NLPLFPTLDFIPPALPTPTPAA 120

Query: 121 AASSAKKPLFPTTRISDSQLNFPSHFFPSFQYSASSLNLEYCNSMVNINPGSASSYPSAT 180
           AA++A+KPLFPTTR SD+Q+N P HFFP+FQYSASS NL     M+NI PG      SAT
Sbjct: 121 AAAAAEKPLFPTTRKSDAQINLPPHFFPTFQYSASSHNL-----MMNIIPG------SAT 180

Query: 181 SSAARYFRQIEHPSSQISTDFNNIWTSPEEQEKMVSAKRVRPFLEESHREGNVESRDPIL 240
           S A RY+RQIEHPSSQIST FN+ WTSPEEQ+KMVSAKRVRPFL+E HR+ N ESR P  
Sbjct: 181 SPATRYYRQIEHPSSQISTQFNHTWTSPEEQQKMVSAKRVRPFLDEGHRDPNAESRAPFF 240

Query: 241 TNMATKD---SSPSSSSSSMEMNPSPFHFDSNFRGTKRGFVGQLMSSIKSSGSYQLA-ED 300
           TNMATK+   SS SSSSSSM+MN SPF+ DSN RGTKRGF G+LM   K S  YQLA E 
Sbjct: 241 TNMATKESSSSSSSSSSSSMDMNRSPFNLDSNSRGTKRGFGGELMRCSKRSERYQLAGEV 300

Query: 301 NKLMVLGSSSSSAPNEI-APFNFHLPQETMEASQHIDEGGCASDYYKVTFNSSSFYESNS 360
           + LM LG SSSSAPNE+ A FN H PQETMEASQ+ DEGG ASDY KVTFNSSS   S  
Sbjct: 301 SHLMALG-SSSSAPNEVAAAFNIHHPQETMEASQYRDEGGSASDYNKVTFNSSS--SSLY 360

Query: 361 NSNSKGNKGITIGSKAGARAEVEAEAEAEGIDLNLKL 379
            S SKGNK I IGS      EVEAEAEAEGIDLNLKL
Sbjct: 361 ESKSKGNKEIVIGS----AFEVEAEAEAEGIDLNLKL 375

BLAST of ClCG03G007900 vs. ExPASy TrEMBL
Match: A0A0A0LSF6 (Uncharacterized protein OS=Cucumis sativus OX=3659 GN=Csa_1G077140 PE=4 SV=1)

HSP 1 Score: 411.4 bits (1056), Expect = 4.1e-111
Identity = 263/401 (65.59%), Positives = 293/401 (73.07%), Query Frame = 0

Query: 1   MKKTQAKRQRIPRRGPGVAELEKILKEQEGGGGDG--GAGQGHDSFITQFQDISPSP-SL 60
           M+KTQAKRQRIPRRGPGVAELEKILKEQE G           H +  +     +  P SL
Sbjct: 1   MRKTQAKRQRIPRRGPGVAELEKILKEQESGAATDHQHTSSPHTNSTSTAATTTTHPLSL 60

Query: 61  NPPRPPPPPPPPPLVPIITP-----PCDYASWSNNLPLFPTLEFIPPP-LPTAAAAASSA 120
           NPPRPPPPPPPPPLV I+TP     P DY +WSNNLPLFPTLEFIPPP LPT        
Sbjct: 61  NPPRPPPPPPPPPLVQIMTPSPPPLPRDYVAWSNNLPLFPTLEFIPPPYLPTVV-----T 120

Query: 121 KKPLFPTTRISDSQLNFPSHFFPSFQYSASSLNL-EYCNSMVNINPGSASSYPSATSSAA 180
           +KPLFPTTR+S+SQLN   +F PSFQYSASS N  +Y N MVN+N GS SS PSATSS+A
Sbjct: 121 EKPLFPTTRMSESQLNLAPYFLPSFQYSASSFNPDQYYNPMVNVNQGSGSSCPSATSSSA 180

Query: 181 -RYFRQIEHPSSQISTDFNNIWTSPEEQEKMVSAKRVRPFLEESHRE-------GNVESR 240
            R+FR+IEHPSSQISTDFNNIW SPEE+EKMV+AKRV PFLEESHRE        N+  +
Sbjct: 181 GRHFREIEHPSSQISTDFNNIWNSPEEEEKMVNAKRVIPFLEESHREEANNNNNNNIIEK 240

Query: 241 DPILTN-MATKDSSPSSSSSSMEMNPSPFHFDSNFRGTKRGFVGQ-LMSSIKSSGSYQLA 300
             +  N M TKDSS SSSSSSME N SPFHF SNFRGTKRG  GQ  MS+ K SG YQL 
Sbjct: 241 MRVENNIMGTKDSS-SSSSSSMETNCSPFHFHSNFRGTKRGLGGQSRMSNTKRSGRYQLG 300

Query: 301 -EDNKLMVLGSSSSSAPNEIAPFN-FHLPQETMEASQHIDEGGCASDYYKVTFNSSSFYE 360
            +++ LM LGSSSSSAPNEI  FN FHLPQETME  QH DEGGC SDYYK+ FNSS +  
Sbjct: 301 DQESNLMALGSSSSSAPNEIPTFNIFHLPQETMEVPQHRDEGGCPSDYYKLNFNSSLY-- 360

Query: 361 SNSNSNSKGNKGI-TIGSKAGARAEVEAEAEAEGIDLNLKL 379
             SNSN+KGN+ +  I S AG+    EAEAEAEGIDLNLKL
Sbjct: 361 -ESNSNTKGNREVGMISSGAGS----EAEAEAEGIDLNLKL 388

BLAST of ClCG03G007900 vs. ExPASy TrEMBL
Match: A0A6J1EGJ1 (nuclear envelope pore membrane protein POM 121-like OS=Cucurbita moschata OX=3662 GN=LOC111434129 PE=4 SV=1)

HSP 1 Score: 406.8 bits (1044), Expect = 1.0e-109
Identity = 265/397 (66.75%), Positives = 286/397 (72.04%), Query Frame = 0

Query: 1   MKKTQAKRQRIPRRGPGVAELEKILKEQEGGGGDGGAGQGHD---------SFITQFQDI 60
           MKKTQ KR RIPRRGPGVAELEKILKEQEGG G G    G D         SF    +  
Sbjct: 1   MKKTQPKRHRIPRRGPGVAELEKILKEQEGGDGGGNGNGGQDQADISSLPPSFFQHRRRH 60

Query: 61  SPS---PSLNPPRPPPPPPPPPLVPIITPPC--DYASWSNNLPLFPTLEFIPPPLPTAAA 120
           SPS    SLNPPRPPPPPPPPPL+P   PP   DYASWS NLPLFPTL+FIPP LPT   
Sbjct: 61  SPSNTPSSLNPPRPPPPPPPPPLLP---PPLTRDYASWS-NLPLFPTLDFIPPALPTPTP 120

Query: 121 AASSAKKPLFPTTRISDSQLNFPSHFFPSFQYSASSLNLEYCNSMVNINPGSASSYPSAT 180
           AA++ +KPLFPTTR SD+Q+N P HFFP+FQYSASS      N M+NI PG      SAT
Sbjct: 121 AAAAVEKPLFPTTRKSDAQINLPPHFFPTFQYSASS-----HNPMMNIIPG------SAT 180

Query: 181 SSAARYFRQIEHPSSQISTDFNNIWTSPEEQEKMVSAKRVRPFLEESHREGNVESRDPIL 240
           S AARY+RQIEHPSSQIST FNN WTSPEEQ+KMVSAKRVRPFL+E HR+ N ESR P  
Sbjct: 181 SPAARYYRQIEHPSSQISTQFNNTWTSPEEQQKMVSAKRVRPFLDEGHRDPNAESRAPFF 240

Query: 241 TNMATKD---SSPSSSSSSMEMNPSPFHFDSNFRGTKRGFVGQLMSSIKSSGSYQLA-ED 300
           TNMATK+   SS SSSSSSM+MN SPF  DSN RGTKRGF G+LM   K S  YQLA E 
Sbjct: 241 TNMATKESSSSSSSSSSSSMDMNRSPFDLDSNSRGTKRGFGGELMRCSKRSERYQLAGEV 300

Query: 301 NKLMVLGSSSSSAPNEI-APFNFHLPQETMEASQHIDEGGCASDYYKVTFNSSSFYESNS 360
           + LM LG SSSSAPNE+ A FN H PQET EASQ+ D GG ASDY KVTFNSSS   S  
Sbjct: 301 SHLMALG-SSSSAPNEVAAAFNIHHPQETTEASQYRD-GGSASDYNKVTFNSSS--SSLY 360

Query: 361 NSNSKGNKGITIGSKAGARAEVEAEAEAEGIDLNLKL 379
            S  KGNK I IGS      EVEAEAEAEGIDL+LKL
Sbjct: 361 ESKLKGNKEIVIGS----GFEVEAEAEAEGIDLSLKL 374

BLAST of ClCG03G007900 vs. ExPASy TrEMBL
Match: A0A1S3B6J8 (uncharacterized protein DDB_G0271670-like OS=Cucumis melo OX=3656 GN=LOC103486731 PE=4 SV=1)

HSP 1 Score: 403.3 bits (1035), Expect = 1.1e-108
Identity = 264/402 (65.67%), Positives = 291/402 (72.39%), Query Frame = 0

Query: 1   MKKTQAKRQRIPRRGPGVAELEKILKEQEGGGGDGGAGQGHDSFITQFQDISPSPSLNPP 60
           M+KTQAKRQRIPRRGPGVAELEKILKEQE GG          +  T      P  SLNPP
Sbjct: 1   MRKTQAKRQRIPRRGPGVAELEKILKEQESGGASTDQHISPHTNSTSTATTHPL-SLNPP 60

Query: 61  RPPPPPPPPPLVPIIT-----PPCDYASWSNNLPLFPTLEFIPPP-LPTAAAAASSAKKP 120
            PPPPPPPPPLV I+T     PP DY SW+NNLPLFPTLEFIPPP LPT A     A+KP
Sbjct: 61  LPPPPPPPPPLVQIMTPSPTPPPRDYVSWTNNLPLFPTLEFIPPPYLPTVA-----AEKP 120

Query: 121 LFPTTRISD-SQLNFPSHFFPSFQYSASSLNL-EYCNSMVNINPGSASSYPSATSSAA-R 180
           LFPTTRIS+ SQLN   +F P+FQ+SASS N  +Y N MVNIN GS SS PSATSS+A R
Sbjct: 121 LFPTTRISEASQLNLAPYFLPTFQFSASSCNPDQYYNPMVNINQGSGSSCPSATSSSAGR 180

Query: 181 YFRQIEHPSSQISTDFNNIWTSPEEQEKMVSAKRVRPFLEESHR-EGNVESRDPIL---- 240
           Y R+IEHPSSQISTDFNNIWTSPEEQEKMV+AKRV PFLEESHR E N  + + I+    
Sbjct: 181 YLREIEHPSSQISTDFNNIWTSPEEQEKMVNAKRVIPFLEESHREEANNNNNNNIIEKMR 240

Query: 241 ---TNMATKD----SSPSSSSSSMEMNPSPFHFDSNFRGTKRGFVGQLM-SSIKSSGSYQ 300
                M TKD    SS SSSSSSME+N SPFH  SNFRGTKRG  GQLM ++ K SG YQ
Sbjct: 241 VENNTMGTKDSSSSSSSSSSSSSMEINCSPFHSHSNFRGTKRGSAGQLMKNNTKRSGRYQ 300

Query: 301 LA-EDNKLMVLGSSSSSAPNEIAPFN-FHLPQETMEASQHIDEGGCASDYYKVTFNSSSF 360
           L  +++ LM LGSSSSSAPNEI  FN FHLP+ETME  QH D GGC SD YK+ FNSS +
Sbjct: 301 LGDQESNLMALGSSSSSAPNEIPTFNIFHLPKETMEVPQHRD-GGCPSDNYKLNFNSSLY 360

Query: 361 YESNSNSNSKGNKGITIGSKAGARAEVEAEAEAEGIDLNLKL 379
               SNSN+KGN+ I I S AGA    EAEAE EGIDLNLKL
Sbjct: 361 ---ESNSNTKGNREIAISSGAGA----EAEAETEGIDLNLKL 388

BLAST of ClCG03G007900 vs. ExPASy TrEMBL
Match: A0A5A7UM91 (CREB-regulated transcription coactivator 1-like OS=Cucumis melo var. makuwa OX=1194695 GN=E6C27_scaffold96G002080 PE=4 SV=1)

HSP 1 Score: 232.3 bits (591), Expect = 3.4e-57
Identity = 144/208 (69.23%), Positives = 156/208 (75.00%), Query Frame = 0

Query: 1   MKKTQAKRQRIPRRGPGVAELEKILKEQEGGGGDGGAGQGHDSFITQFQDISPSPSLNPP 60
           M+KTQAKRQRIPRRGPGVAELEKILKEQE GG          +  T      P  SLNPP
Sbjct: 1   MRKTQAKRQRIPRRGPGVAELEKILKEQESGGASTDQHISPHTNSTSTATTHPL-SLNPP 60

Query: 61  RPPPPPPPPPLVPIIT-----PPCDYASWSNNLPLFPTLEFIPPP-LPTAAAAASSAKKP 120
            PPPPPPPPPLV I+T     PP DY SW+NNLPLFPTLEFIPPP LPT A     A+KP
Sbjct: 61  LPPPPPPPPPLVQIMTPSPTPPPRDYVSWTNNLPLFPTLEFIPPPYLPTVA-----AEKP 120

Query: 121 LFPTTRISD-SQLNFPSHFFPSFQYSASSLNL-EYCNSMVNINPGSASSYPSATSSAA-R 180
           LFPTTRIS+ SQLN   +F P+FQ+SASS N  +Y N MVNIN GS SS PSATSS+A R
Sbjct: 121 LFPTTRISEASQLNLAPYFLPTFQFSASSCNPDQYYNPMVNINQGSGSSCPSATSSSAGR 180

Query: 181 YFRQIEHPSSQISTDFNNIWTSPEEQEK 200
           Y R+IEHPSSQISTDFNNIWTSPEEQEK
Sbjct: 181 YLREIEHPSSQISTDFNNIWTSPEEQEK 202
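
The percentages in each HSP header above are plain ratios of matching columns to alignment length. The sketch below recomputes them for the first hit (XP_038895159.1); the regular expression and the `hsp_percentages` helper are hypothetical conveniences for this page's header layout, not part of any BLAST tool.

```python
import re

# Matches the 'Identity = a/b (x%), <label> = c/d (y%)' part of an HSP header.
# The label is matched loosely so that spelling variants are tolerated.
HSP_STATS = re.compile(
    r"Identity = (\d+)/(\d+) \([\d.]+%\), \w+ = (\d+)/(\d+) \([\d.]+%\)"
)

def hsp_percentages(line):
    """Return (identity %, positives %) recomputed from the raw counts."""
    m = HSP_STATS.search(line)
    if m is None:
        raise ValueError("no HSP statistics found in line")
    ident, ident_len, pos, pos_len = map(int, m.groups())
    return round(100 * ident / ident_len, 2), round(100 * pos / pos_len, 2)

header = "Identity = 306/388 (78.87%), Positives = 318/388 (81.96%), Query Frame = 0"
print(hsp_percentages(header))  # (78.87, 81.96)
```

The recomputed values match the percentages reported in the header, so 306 of the 388 aligned columns are identical and 318 are scored as conservative or identical.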

The following BLAST results are available for this feature:
BLAST vs. NCBI nr
Match Name        E-value     Identity (%)   Description
XP_038895159.1    8.5e-143    78.87          uncharacterized protein DDB_G0271670-like [Benincasa hispida]
XP_023001726.1    2.4e-113    68.01          trinucleotide repeat-containing gene 18 protein-like [Cucurbita maxima] >XP_0230...
XP_031738182.1    8.5e-111    65.59          uncharacterized protein LOC105434463 [Cucumis sativus] >XP_031738184.1 uncharact...
XP_022927210.1    2.1e-109    66.75          nuclear envelope pore membrane protein POM 121-like [Cucurbita moschata] >XP_022...
KAG6583971.1      1.8e-108    66.42          hypothetical protein SDJN03_19903, partial [Cucurbita argyrosperma subsp. sorori...

BLAST vs. ExPASy TrEMBL
Match Name        E-value     Identity (%)   Description
A0A6J1KHF7        1.2e-113    68.01          trinucleotide repeat-containing gene 18 protein-like OS=Cucurbita maxima OX=3661...
A0A0A0LSF6        4.1e-111    65.59          Uncharacterized protein OS=Cucumis sativus OX=3659 GN=Csa_1G077140 PE=4 SV=1
A0A6J1EGJ1        1.0e-109    66.75          nuclear envelope pore membrane protein POM 121-like OS=Cucurbita moschata OX=366...
A0A1S3B6J8        1.1e-108    65.67          uncharacterized protein DDB_G0271670-like OS=Cucumis melo OX=3656 GN=LOC10348673...
A0A5A7UM91        3.4e-57     69.23          CREB-regulated transcription coactivator 1-like OS=Cucumis melo var. makuwa OX=1...
InterPro
Analysis Name: InterPro Annotations of Watermelon (Charleston Gray) v2.5
Date Performed: 2022-01-31
IPR Term     IPR Description     Source        Source Term      Source Description     Alignment
None         No IPR available    MOBIDB_LITE   mobidb-lite      disorder_prediction    coord: 1..72
None         No IPR available    MOBIDB_LITE   mobidb-lite      disorder_prediction    coord: 224..249
None         No IPR available    MOBIDB_LITE   mobidb-lite      disorder_prediction    coord: 13..29
None         No IPR available    MOBIDB_LITE   mobidb-lite      disorder_prediction    coord: 55..72
None         No IPR available    MOBIDB_LITE   mobidb-lite      disorder_prediction    coord: 213..249
None         No IPR available    PANTHER       PTHR33388:SF19   PROTEIN SPEAR2         coord: 79..378
None         No IPR available    PANTHER       PTHR33388:SF19   PROTEIN SPEAR2         coord: 3..68
IPR040356    SPEAR family        PANTHER       PTHR33388        OS01G0212500 PROTEIN   coord: 79..378; coord: 3..68
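
Several of the mobidb-lite coordinate ranges above overlap, so counting distinct residues requires merging them first. The sketch below is one way to do that; the `merge_intervals` helper is hypothetical, and treating all five mobidb-lite rows as one pooled set of 1-based inclusive ranges is an assumption made for illustration (the rows may represent different disorder sub-types).

```python
def merge_intervals(intervals):
    """Merge overlapping 1-based inclusive (start, end) ranges."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range: extend it if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# mobidb-lite coordinates as listed in the InterPro table
mobidb = [(1, 72), (224, 249), (13, 29), (55, 72), (213, 249)]

regions = merge_intervals(mobidb)
covered = sum(end - start + 1 for start, end in regions)
print(regions, covered)  # [(1, 72), (213, 249)] 109
```

Under these assumptions the five rows collapse to two regions, 1..72 and 213..249, covering 109 residues in total.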

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name       Unique Name        Type
ClCG03G007900.2    ClCG03G007900.2    mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category              Term Accession    Term Name
molecular_function    GO:0003700        DNA-binding transcription factor activity