CmaCh04G000050 (gene) Cucurbita maxima (Rimu)

Name: CmaCh04G000050
Type: gene
Organism: Cucurbita maxima (Cucurbita maxima (Rimu))
Description: Myb/SANT-like DNA-binding domain protein
Location: Cma_Chr04 : 32612 .. 34758 (-)
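As a quick check on the Location field, the length of the genomic span can be derived from its coordinates. The sketch below assumes the coordinates are 1-based and inclusive at both ends, which is an assumption about this database's convention; `span_length` is an illustrative helper name, not part of any tool.

```python
def span_length(start: int, end: int) -> int:
    """Length of a feature given inclusive, 1-based start and end coordinates."""
    if end < start:
        raise ValueError("end must be >= start")
    return end - start + 1


# Coordinates taken from the Location field (Cma_Chr04, minus strand).
gene_length = span_length(32612, 34758)
print(gene_length)
```

The strand sign does not change the span length; it only indicates which strand the feature lies on.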
The following sequences are available for this feature:

Gene sequence (with intron)

CTATGTCTACTTCTTTTCTCTCTACTTCTCTCTTTTTTCTCTATCCATTCCTGAGTCTCTGAGATGCCAAAGCCATGTCGGAGCCTCCGACGACATCAACGGAGCCACCGCAGCAGCATCAGCATCAGCATCAGCATGAGCAGCAGAAACATCCCCACCATCTCCTACATTTACCCCTAATCCACTGCGGCGCATCCACGGGCACCACTGCCCGAATCAACGCTGCAGCAGCAACCTCACCCTCGACAGTAATAGTCCGAGAGTACCGCAAAGGGAACTGGAGTCTCCAAGAGACGATGATTCTGATAACCGCGAAAAAGCTGGACGAGGAACGGCGGAACAAGGCGGAACAAGGAACGGCCAGAAAGGGCAGCGAGCTGCGGTGGAAGTGGGTGGAAAACTACTGCTGGAGCCAGGGGTGCCAGCGGAGCCAAAATCAGTGCAACGACAAGTGGGATAACCTACTCCGCGACTACAAAAAAGTTCGGGAGCACGAATCCCGCGCGTGTGATCAATCCCAAATTCCGTCTTACTGGAAAATGGAAAAGCATGAGCGTAAGGACAACAATCTTCCTTCTAACATGGCCTTTGAGGTGTATCAGGCCTTAAACGACGTGGTTCAGAGGAAGCTCAGAGGTGTGTCTGTTGCTGTTGTTGCTGGCCCTCCGCCTCCTCCTTCCCCCACCGAGGGCGAGGCTGCGGCGGGGACTAGTTCCCCGGCGGCTTCAGGTGAGTGTGTTTTTGTTTGTGTGTGTGTGGGGTACTATTAAATTTGATTTTATTATATTATTTTCGGCCTGGACCTTTCATTTATTGTTTCTAAATTTGATTTTATTATATTATATATATTTACTTGGAAGAAATTTGAAAGAGAGAGAGAGAGAGAGAGTTAATAGTGTGGGTCAAAGTTGCAGGGTTTGACTCTGCCACAATTTTGAATAGGACCCTCCTTCCTCCTTTTCATTTCGTTATTCGCTTTCTAATCCTCTACGGACACCTCTACCATCTGCAGATGGGGCCTCCAAATCTTATCCCTCCGATTACGTTGGGTGATACAGTCATCCATATGAATTTAATATAGAGGGATTAATTTTGCAATCACAATTAATACAAATCTAAGCATGCAATCCCATAATTAAATATTACGTCCTTTCTATTTTCTTTTTATTCACTTTTATTGTTTACTTACTAATTTTGTCTACTTCTTCTTCTTCTTTTTAAAATCTCTTTATTTTACCGCTTCTTTAAAATTTATGGATGGATATATTTACCAAATGCAGATTCTTGAGCTTTATGTATATATATATATATAGTTGAAATTTGATTGTATGTATATATATATATATATAGAGTCGTCGTCGTCGTCAGGGAGGGAGTGGGGGCAGAAGAAAGAGAAGCGGGAGAGAAAGAGGAGAAGAGTGGGAAGAAGCATCGAAAGAAGCGCGTCGGCGGTGGCTCAAACGCTGCGGACCTGCGAGGAGCAGAGGGAGATCCGACACCAACAACTGATGGAGATTAAGAAACAGCGCCTTCAAATCCAAGAAGCCCGCAACCACATTCAGGGTCAAGGCATCGCCGACCTCGTGGCCGCGGTTGCCAACCTCTCCGGTACATGTGTATATATATATATAAATAAATAAATAAATAAAAAGAAAGGAATTAGGGTTTATGTTGACTTGGTCGGGGAATGTTGTGAGTGTAGGCATAAACAATAGAAGTAGAAGAAGAAGATCAGAAGAGTATGAATGTTTATACAGTGGAGAAGAGGTGAGAAGGTTGAAAGAACAAAACGAGGCAATGCAGGCTGAGCTTTCGAGCGTCAAGACTGAGCTTTCTCAACTCCGAGACCAAATGCCCTCTCTCGTGCGAACCGTGATGCACAATGTGATGCACAACATCCCTCCTCCTCCTTCCATGGTACTCTCTCCCTCTCCCTCTCTACGCATCCATCCATATTATTCTAAATAATATATATAATATTGACTGTATCTTCTGTTCAT
CTTTGTTTGTGTAGGACCCAGGTGGAGATGCTTACAAATAATTGGTTATTAATTCATTCACAAAATTACTACAATTCATCATTGTTTCATATCATATGGAGTTTAATGAATTTGTAAAACTTGTTTAATATATTTTATTTTTATTAT

mRNA sequence

CTATGTCTACTTCTTTTCTCTCTACTTCTCTCTTTTTTCTCTATCCATTCCTGAGTCTCTGAGATGCCAAAGCCATGTCGGAGCCTCCGACGACATCAACGGAGCCACCGCAGCAGCATCAGCATCAGCATCAGCATGAGCAGCAGAAACATCCCCACCATCTCCTACATTTACCCCTAATCCACTGCGGCGCATCCACGGGCACCACTGCCCGAATCAACGCTGCAGCAGCAACCTCACCCTCGACAGTAATAGTCCGAGAGTACCGCAAAGGGAACTGGAGTCTCCAAGAGACGATGATTCTGATAACCGCGAAAAAGCTGGACGAGGAACGGCGGAACAAGGCGGAACAAGGAACGGCCAGAAAGGGCAGCGAGCTGCGGTGGAAGTGGGTGGAAAACTACTGCTGGAGCCAGGGGTGCCAGCGGAGCCAAAATCAGTGCAACGACAAGTGGGATAACCTACTCCGCGACTACAAAAAAGTTCGGGAGCACGAATCCCGCGCGTGTGATCAATCCCAAATTCCGTCTTACTGGAAAATGGAAAAGCATGAGCGTAAGGACAACAATCTTCCTTCTAACATGGCCTTTGAGGTGTATCAGGCCTTAAACGACGTGGTTCAGAGGAAGCTCAGAGGTGTGTCTGTTGCTGTTGTTGCTGGCCCTCCGCCTCCTCCTTCCCCCACCGAGGGCGAGGCTGCGGCGGGGACTAGTTCCCCGGCGGCTTCAGAGTCGTCGTCGTCGTCAGGGAGGGAGTGGGGGCAGAAGAAAGAGAAGCGGGAGAGAAAGAGGAGAAGAGTGGGAAGAAGCATCGAAAGAAGCGCGTCGGCGGTGGCTCAAACGCTGCGGACCTGCGAGGAGCAGAGGGAGATCCGACACCAACAACTGATGGAGATTAAGAAACAGCGCCTTCAAATCCAAGAAGCCCGCAACCACATTCAGGGTCAAGGCATCGCCGACCTCGTGGCCGCGGTTGCCAACCTCTCCGGCATAAACAATAGAAGTAGAAGAAGAAGATCAGAAGAGTATGAATGTTTATACAGTGGAGAAGAGGTGAGAAGGTTGAAAGAACAAAACGAGGCAATGCAGGCTGAGCTTTCGAGCGTCAAGACTGAGCTTTCTCAACTCCGAGACCAAATGCCCTCTCTCGTGCGAACCGTGATGCACAATGTGATGCACAACATCCCTCCTCCTCCTTCCATGGACCCAGGTGGAGATGCTTACAAATAATTGGTTATTAATTCATTCACAAAATTACTACAATTCATCATTGTTTCATATCATATGGAGTTTAATGAATTTGTAAAACTTGTTTAATATATTTTATTTTTATTAT

Coding sequence (CDS)

ATGTCGGAGCCTCCGACGACATCAACGGAGCCACCGCAGCAGCATCAGCATCAGCATCAGCATGAGCAGCAGAAACATCCCCACCATCTCCTACATTTACCCCTAATCCACTGCGGCGCATCCACGGGCACCACTGCCCGAATCAACGCTGCAGCAGCAACCTCACCCTCGACAGTAATAGTCCGAGAGTACCGCAAAGGGAACTGGAGTCTCCAAGAGACGATGATTCTGATAACCGCGAAAAAGCTGGACGAGGAACGGCGGAACAAGGCGGAACAAGGAACGGCCAGAAAGGGCAGCGAGCTGCGGTGGAAGTGGGTGGAAAACTACTGCTGGAGCCAGGGGTGCCAGCGGAGCCAAAATCAGTGCAACGACAAGTGGGATAACCTACTCCGCGACTACAAAAAAGTTCGGGAGCACGAATCCCGCGCGTGTGATCAATCCCAAATTCCGTCTTACTGGAAAATGGAAAAGCATGAGCGTAAGGACAACAATCTTCCTTCTAACATGGCCTTTGAGGTGTATCAGGCCTTAAACGACGTGGTTCAGAGGAAGCTCAGAGGTGTGTCTGTTGCTGTTGTTGCTGGCCCTCCGCCTCCTCCTTCCCCCACCGAGGGCGAGGCTGCGGCGGGGACTAGTTCCCCGGCGGCTTCAGAGTCGTCGTCGTCGTCAGGGAGGGAGTGGGGGCAGAAGAAAGAGAAGCGGGAGAGAAAGAGGAGAAGAGTGGGAAGAAGCATCGAAAGAAGCGCGTCGGCGGTGGCTCAAACGCTGCGGACCTGCGAGGAGCAGAGGGAGATCCGACACCAACAACTGATGGAGATTAAGAAACAGCGCCTTCAAATCCAAGAAGCCCGCAACCACATTCAGGGTCAAGGCATCGCCGACCTCGTGGCCGCGGTTGCCAACCTCTCCGGCATAAACAATAGAAGTAGAAGAAGAAGATCAGAAGAGTATGAATGTTTATACAGTGGAGAAGAGGTGAGAAGGTTGAAAGAACAAAACGAGGCAATGCAGGCTGAGCTTTCGAGCGTCAAGACTGAGCTTTCTCAACTCCGAGACCAAATGCCCTCTCTCGTGCGAACCGTGATGCACAATGTGATGCACAACATCCCTCCTCCTCCTTCCATGGACCCAGGTGGAGATGCTTACAAATAA

Protein sequence

MSEPPTTSTEPPQQHQHQHQHEQQKHPHHLLHLPLIHCGASTGTTARINAAAATSPSTVIVREYRKGNWSLQETMILITAKKLDEERRNKAEQGTARKGSELRWKWVENYCWSQGCQRSQNQCNDKWDNLLRDYKKVREHESRACDQSQIPSYWKMEKHERKDNNLPSNMAFEVYQALNDVVQRKLRGVSVAVVAGPPPPPSPTEGEAAAGTSSPAASESSSSSGREWGQKKEKRERKRRRVGRSIERSASAVAQTLRTCEEQREIRHQQLMEIKKQRLQIQEARNHIQGQGIADLVAAVANLSGINNRSRRRRSEEYECLYSGEEVRRLKEQNEAMQAELSSVKTELSQLRDQMPSLVRTVMHNVMHNIPPPPSMDPGGDAYK
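The relationship between the coding sequence and the protein sequence can be illustrated with a small translation sketch. It assumes the standard genetic code and uses two short fragments copied from the CDS above (its first 36 bases and its last 27 bases, taken in the reading frame implied by the protein sequence). The names `CODON_TABLE` and `translate` are illustrative, not part of the database.

```python
# Standard genetic code, built in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(cds: str) -> str:
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# Fragments copied from the CDS above.
cds_start = "ATGTCGGAGCCTCCGACGACATCAACGGAGCCACCG"
cds_end = "GACCCAGGTGGAGATGCTTACAAATAA"

print(translate(cds_start))  # compare with the start of the protein sequence
print(translate(cds_end))    # compare with the end of the protein sequence
```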
BLAST of CmaCh04G000050 vs. Swiss-Prot
Match: ASR3_ARATH (Trihelix transcription factor ASR3 OS=Arabidopsis thaliana GN=ASR3 PE=1 SV=1)

HSP 1 Score: 71.6 bits (174), Expect = 2.1e-11
Identity = 41/122 (33.61%), Positives = 62/122 (50.82%), Query Frame = 1

Query: 61  VREYRKGNWSLQETMILITAKKLDEERRNKAEQGTARKGS---ELRWKWVENYCWSQGCQ 120
           V+  R   W+ QE ++LI  K++ E R  +        GS   E +W  V +YC   G  
Sbjct: 31  VKTARLPRWTRQEILVLIQGKRVAENRVRRGRAAGMALGSGQMEPKWASVSSYCKRHGVN 90

Query: 121 RSQNQCNDKWDNLLRDYKKVREHESRACDQSQIPSYWKMEKHERKDNNLPSNMAFEVYQA 180
           R   QC  +W NL  DYKK++E ES+  ++++  SYW M    R++  LP     EVY  
Sbjct: 91  RGPVQCRKRWSNLAGDYKKIKEWESQIKEETE--SYWVMRNDVRREKKLPGFFDKEVYDI 150
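The percentages in the HSP header follow directly from the match counts and the alignment length. A minimal sketch of that arithmetic, using the figures reported for this HSP (41 identical and 62 positive-scoring positions over 122 aligned columns); `percent` is an illustrative helper name.

```python
def percent(matches: int, aligned_length: int) -> float:
    """Percentage of aligned columns that match, rounded to two decimals."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return round(100.0 * matches / aligned_length, 2)


identity_pct = percent(41, 122)   # identical residues / alignment length
positive_pct = percent(62, 122)   # positive-scoring residues / alignment length
print(identity_pct, positive_pct)
```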

BLAST of CmaCh04G000050 vs. TrEMBL
Match: A0A0A0KME8_CUCSA (Uncharacterized protein OS=Cucumis sativus GN=Csa_5G261720 PE=4 SV=1)

HSP 1 Score: 375.6 bits (963), Expect = 7.5e-101
Identity = 221/321 (68.85%), Positives = 242/321 (75.39%), Query Frame = 1

Query: 1   MSEPPTTSTEPPQQHQHQHQHEQQKHPHHLLHLPLIHCGASTGTTARINAAAATSPSTVI 60
           MS+PPTTS+EPP  H          H  HL  LP+IH GA+ GT  R+N AAATS S VI
Sbjct: 1   MSDPPTTSSEPPHHH----------HQQHLPRLPVIHSGATGGT--RMNTAAATSSSAVI 60

Query: 61  VREYRKGNWSLQETMILITAKKLDEERRNKAEQG------TARKGSELRWKWVENYCWSQ 120
           VREYRKGNW+LQETMILITAKKLD+ERRNKA  G       ARKG ELRWKWVENYCWS 
Sbjct: 61  VREYRKGNWTLQETMILITAKKLDDERRNKANLGPSTVDPAARKGGELRWKWVENYCWSH 120

Query: 121 GCQRSQNQCNDKWDNLLRDYKKVREHESRACDQSQIPSYWKMEKHERKDNNLPSNMAFEV 180
           GCQRSQNQCNDKWDNLLRDYKKVRE+ESRACDQ QIPSYWKMEKHERKD NLPSNMAFEV
Sbjct: 121 GCQRSQNQCNDKWDNLLRDYKKVREYESRACDQ-QIPSYWKMEKHERKDKNLPSNMAFEV 180

Query: 181 YQALNDVVQRKL-------RGVSVAVVAGPPPPPSPTEGEAAAGTSSPAASESSSSSGRE 240
           YQALNDVVQRK            + ++  P PPPS       A T+SP  SE SSSSG E
Sbjct: 181 YQALNDVVQRKFSQKPSNSSNTGILLLPLPAPPPSALLPPPTA-TNSPQLSE-SSSSGTE 240

Query: 241 WGQKKEKRERKRRR----VGRSIERSASAVAQTLRTCEEQREIRHQQLMEIKKQRLQIQE 300
             +KKEK E KRR+    +GR IERS SA+ QTL +CEEQREIRHQQLME++K+RLQI+E
Sbjct: 241 SSEKKEKVEAKRRKMEDNIGRRIERSVSALGQTLHSCEEQREIRHQQLMELRKRRLQIEE 300

Query: 301 ARNHIQGQGIADLVAAVANLS 305
            RNHI  QGIADLVAAVANLS
Sbjct: 301 TRNHIHRQGIADLVAAVANLS 306

BLAST of CmaCh04G000050 vs. TrEMBL
Match: K7N4B1_SOYBN (Uncharacterized protein OS=Glycine max GN=GLYMA_20G184500 PE=4 SV=1)

HSP 1 Score: 258.5 bits (659), Expect = 1.3e-65
Identity = 156/326 (47.85%), Positives = 203/326 (62.27%), Query Frame = 1

Query: 26  HPHHLLHLPLIHCGASTGTTARINAAAATSPSTVIVREYRKGNWSLQETMILITAKKLDE 85
           H HH  H+PLI  GA+          A +S ST + REYRKGNW++QET+ILITAKKLD+
Sbjct: 12  HHHHNHHVPLIQGGAT----------APSSSSTTLAREYRKGNWTIQETLILITAKKLDD 71

Query: 86  ERRNKAEQG------TARKGSELRWKWVENYCWSQGCQRSQNQCNDKWDNLLRDYKKVRE 145
           ERR K          T R   ELRWKWVENYCWS GC RSQNQCNDKWDNLLRDYKKVR+
Sbjct: 72  ERRLKTPAACSTSTTTTRTSGELRWKWVENYCWSHGCLRSQNQCNDKWDNLLRDYKKVRD 131

Query: 146 HESRACD-----QSQIPSYWKMEKHERKDNNLPSNMAFEVYQALNDVVQRK--------L 205
           +ES++ D         PSYW + K +RK+ NLPSNM FEVYQ + DV+QRK         
Sbjct: 132 YESKSNDNDNNNNKHFPSYWTLNKQQRKEQNLPSNMVFEVYQTIADVLQRKQTQSQRQHQ 191

Query: 206 RGVSVAVVAGP------------PPPPSPTEGEAAAGTSSPAASESSSSSGREWGQK--- 265
           + +++ +V               PPPP P        +++P  SE S SSG E  +    
Sbjct: 192 QPLAIPLVTSSPSPLQTLPPPPLPPPPPPPPPPPPVSSTTPVGSERSESSGTEHSEDDDD 251

Query: 266 -KEKRERKRRRVGRSIERSASAVAQTLRTCEEQREIRHQQLMEIKKQRLQIQEARNHIQG 312
             E + RK + +G  I +SAS +A+ LR+CEE++E RH++++E++++R+Q++EARN +  
Sbjct: 252 GSESKRRKVKNLGSRIMQSASVLARALRSCEEKKEKRHREMIELEQRRIQMEEARNEVHR 311

BLAST of CmaCh04G000050 vs. TrEMBL
Match: A0A067KQ45_JATCU (Uncharacterized protein OS=Jatropha curcas GN=JCGZ_04860 PE=4 SV=1)

HSP 1 Score: 257.7 bits (657), Expect = 2.3e-65
Identity = 162/325 (49.85%), Positives = 202/325 (62.15%), Query Frame = 1

Query: 1   MSEPPTTSTEPPQQHQHQHQHEQQKHPHHLLHLPLIHCGASTGTTARINAAAATSPSTVI 60
           MS+PPT+   PPQ          Q  P    HLPL+   A+T TT        TS +   
Sbjct: 1   MSQPPTSIPPPPQP---------QPQPQPPSHLPLLPFSATTTTTP-------TSSN--- 60

Query: 61  VREYRKGNWSLQETMILITAKKLDEERRNKAEQGTARKGSELRWKWVENYCWSQGCQRSQ 120
            REYRKGNW++QET+ LITAKKLD+ERR+K    +  K  ELRWKWVENYCW+ GC RSQ
Sbjct: 61  -REYRKGNWTIQETLTLITAKKLDDERRSKPTVPSTSKPGELRWKWVENYCWAHGCYRSQ 120

Query: 121 NQCNDKWDNLLRDYKKVREHESRACDQSQIPSYWKMEKHERKDNNLPSNMAFEVYQALND 180
           NQCNDKWDNLLRDYKKVRE++SR+      PSYW ME+H+RK  NLPSNM+ EV++ALN 
Sbjct: 121 NQCNDKWDNLLRDYKKVREYQSRSDGSDSFPSYWTMERHQRKYYNLPSNMSLEVFEALNQ 180

Query: 181 VVQRKLRGVS---------------VAVVAG-PPPPPSPTEGEAAAGTSSPAASESSSSS 240
           VVQR+   ++               V VVA  P  P +  E    A    PA SE S SS
Sbjct: 181 VVQRRYTNITQQNVVVSPQQQQQQQVTVVADVPVSPVTLREVVPEALMDRPALSEGSESS 240

Query: 241 GREWGQKKEKRERKRRR-----VGRSIERSASAVAQTLRTCEEQREIRHQQLMEIKKQRL 300
             E   K +    KRRR     +G SI+ SAS +AQT+R CEE++E RHQ+LME +++RL
Sbjct: 241 ATESSDKHDSGGSKRRRMKNNNIGASIKHSASILAQTIRNCEEKKEKRHQELMEFEQRRL 300

Query: 301 QIQEARNHIQGQGIADLVAAVANLS 305
           Q++E RN +  QG+A+L  AV NLS
Sbjct: 301 QLEETRNEVNRQGMANLAMAVTNLS 305

BLAST of CmaCh04G000050 vs. TrEMBL
Match: A0A151SE87_CAJCA (Uncharacterized protein OS=Cajanus cajan GN=KK1_025039 PE=4 SV=1)

HSP 1 Score: 255.0 bits (650), Expect = 1.5e-64
Identity = 160/340 (47.06%), Positives = 204/340 (60.00%), Query Frame = 1

Query: 1   MSEPPTTSTEPPQQHQHQHQHEQQKHPHHLLHLPLIHCGASTGTTARINAAAATSPSTVI 60
           MS+P TT   PP              PHHL                 I   A  S S+ +
Sbjct: 1   MSDPSTTPLPPPPL---------LPSPHHL-----------------IQGGATASSSSSL 60

Query: 61  VREYRKGNWSLQETMILITAKKLDEERRNKAEQG--------TARKGSELRWKWVENYCW 120
            REYRKGNW++QET+ILITAKKLD+ERR K            TAR   ELRWKWVENYCW
Sbjct: 61  AREYRKGNWTIQETLILITAKKLDDERRLKTSHDPTRAACSTTARTSGELRWKWVENYCW 120

Query: 121 SQGCQRSQNQCNDKWDNLLRDYKKVREHE--SRACDQSQIPSYWKMEKHERKDNNLPSNM 180
           S GC RSQNQCNDKWDNLLRDYKKVR++E   +  ++   PSYW + K +RK++NLPSNM
Sbjct: 121 SHGCLRSQNQCNDKWDNLLRDYKKVRDYEFKQQQSNEKHFPSYWNLNKQQRKEHNLPSNM 180

Query: 181 AFEVYQALNDVVQRKL--------RGVSVAVVAG--------------PPPPPSPTEGEA 240
            F+VYQA+ +V+QRK         R  +V +V                PPPPP P     
Sbjct: 181 VFDVYQAITEVLQRKQTQPQAQTQRQPAVTLVTSSPLQTLPPPPPPPPPPPPPPPPPPPP 240

Query: 241 AAGTSSPAASESSSSSGREWGQKKEKRERKRRRV---GRSIERSASAVAQTLRTCEEQRE 300
              +++ A SE S SSG E  +  +  E KRR+V   G SI RSAS +A+ LR+CEE++E
Sbjct: 241 PVSSATQAVSERSESSGTEHSEDDDGSESKRRKVKNLGSSIMRSASVLARALRSCEEKKE 300

Query: 301 IRHQQLMEIKKQRLQIQEARNHIQGQGIADLVAAVANLSG 306
            RH++L+E++++R+Q++EARN +  QGIA LVAAV NLSG
Sbjct: 301 KRHRELIELEQRRIQMEEARNEVHRQGIATLVAAVTNLSG 314

BLAST of CmaCh04G000050 vs. TrEMBL
Match: V7BD78_PHAVU (Uncharacterized protein OS=Phaseolus vulgaris GN=PHAVU_007G100500g PE=4 SV=1)

HSP 1 Score: 252.7 bits (644), Expect = 7.3e-64
Identity = 163/348 (46.84%), Positives = 210/348 (60.34%), Query Frame = 1

Query: 1   MSEPPTTSTEPPQQHQHQHQHEQQKHPHHLLHLPLIHCGASTGTTARINAAAATSPSTVI 60
           MS+P TT    P      H+     HP     LPLI  GA+         AAA S S+ +
Sbjct: 1   MSDPSTTPLPHPPLLPEAHRQPLHHHP-----LPLIQ-GAT---------AAAPSSSSSL 60

Query: 61  VREYRKGNWSLQETMILITAKKLDEERRNKAEQ------------GTARKGSELRWKWVE 120
            REYRKGNW++QET+ILITAKKLD+ERR K                +AR   ELRWKWVE
Sbjct: 61  AREYRKGNWTIQETLILITAKKLDDERRLKTPHDPTRPACSSTTSSSARTSGELRWKWVE 120

Query: 121 NYCWSQGCQRSQNQCNDKWDNLLRDYKKVREHESRACDQSQ--------IPSYWKMEKHE 180
           NYCWS GC RSQNQCNDKWDNLLRDYKKVR++ES++  Q           PSYW + K +
Sbjct: 121 NYCWSHGCLRSQNQCNDKWDNLLRDYKKVRDYESKSQQQQHQQSHEIKHFPSYWTLNKQQ 180

Query: 181 RKDNNLPSNMAFEVYQALNDVVQRKLRG------------------VSVAVVAGPPPPPS 240
           RK+ NLPSNM +EVY A+ +V+QRK                     V++  V+ PPPPP 
Sbjct: 181 RKEQNLPSNMVYEVYHAITEVLQRKQTQPQLQSQTQTQRQPQQQPPVALITVSSPPPPPP 240

Query: 241 PTEGEAAAGTSSPAASESSSSSGREWGQK-----KEKRERKRRRVGRSIERSASAVAQTL 300
           P        +++PA SE S SSG E  +       E + RK + +G SI RSAS +A+ L
Sbjct: 241 PP-----VSSTTPAVSERSESSGTEHSEDDADDGSESKRRKVKNLGSSIMRSASVLARAL 300

Query: 301 RTCEEQREIRHQQLMEIKKQRLQIQEARNHIQGQGIADLVAAVANLSG 306
           R+CEE++E RH++L+E++++RLQ++EAR+ +  QGIA LVAAV NLSG
Sbjct: 301 RSCEEKKEKRHRELIELEQRRLQMEEARDEVHRQGIATLVAAVTNLSG 328

BLAST of CmaCh04G000050 vs. TAIR10
Match: AT1G31310.1 (AT1G31310.1 hydroxyproline-rich glycoprotein family protein)

HSP 1 Score: 162.2 bits (409), Expect = 6.6e-40
Identity = 94/198 (47.47%), Positives = 119/198 (60.10%), Query Frame = 1

Query: 52  AATSPSTVIVREYRKGNWSLQETMILITAKKLDEERRNKAEQGT----------ARKGSE 111
           A  S   V++REYRKGNW+L ETM+LI AK++D+ERR +   G           + K +E
Sbjct: 2   ADQSGGLVMMREYRKGNWTLNETMVLIEAKRMDDERRMRRSIGLPPPEQQQDIRSNKPAE 61

Query: 112 LRWKWVENYCWSQGCQRSQNQCNDKWDNLLRDYKKVREHESRACDQS------------- 171
           LRWKW+E+YCW +GC RSQNQCNDKWDNL+RDYKKVRE+E R  + S             
Sbjct: 62  LRWKWIEDYCWRKGCMRSQNQCNDKWDNLMRDYKKVREYERRRVESSITAGESSSSSAPA 121

Query: 172 -QIPSYWKMEKHERKDNNLPSNMAFEVYQALNDVVQRKLRGVSVAVVAGPPPPPSPTEGE 226
            +  SYWKMEK ERK+ +LPSNM  + YQAL +VV+ K    S AV A            
Sbjct: 122 GETASYWKMEKSERKERSLPSNMLPQTYQALFEVVESKTLPSSTAVTA-----------V 181

BLAST of CmaCh04G000050 vs. TAIR10
Match: AT2G35640.1 (AT2G35640.1 Homeodomain-like superfamily protein)

HSP 1 Score: 157.1 bits (396), Expect = 2.1e-38
Identity = 73/142 (51.41%), Positives = 102/142 (71.83%), Query Frame = 1

Query: 51  AAATSPSTVIVREYRKGNWSLQETMILITAKKLDEERR---NKAEQGTARKGSELRWKWV 110
           A  +S   +++RE RKGNW++ ET++LI AKK+D++RR   ++ +     K +ELRWKW+
Sbjct: 4   ADPSSGEQIVMRECRKGNWTVSETLVLIEAKKMDDQRRVRRSEKQPEGRNKPAELRWKWI 63

Query: 111 ENYCWSQGCQRSQNQCNDKWDNLLRDYKKVREHESRACDQS----QIPSYWKMEKHERKD 170
           E YCW +GC R+QNQCNDKWDNL+RDYKK+RE+E    + S       SYWKM+K ERK+
Sbjct: 64  EEYCWRRGCYRNQNQCNDKWDNLMRDYKKIREYERSRVESSFNTVTSSSYWKMDKTERKE 123

Query: 171 NNLPSNMAFEVYQALNDVVQRK 186
            NLPSNM  ++Y  L+++V RK
Sbjct: 124 KNLPSNMLPQIYDVLSELVDRK 145

BLAST of CmaCh04G000050 vs. TAIR10
Match: AT2G33550.1 (AT2G33550.1 Homeodomain-like superfamily protein)

HSP 1 Score: 71.6 bits (174), Expect = 1.2e-12
Identity = 41/122 (33.61%), Positives = 62/122 (50.82%), Query Frame = 1

Query: 61  VREYRKGNWSLQETMILITAKKLDEERRNKAEQGTARKGS---ELRWKWVENYCWSQGCQ 120
           V+  R   W+ QE ++LI  K++ E R  +        GS   E +W  V +YC   G  
Sbjct: 31  VKTARLPRWTRQEILVLIQGKRVAENRVRRGRAAGMALGSGQMEPKWASVSSYCKRHGVN 90

Query: 121 RSQNQCNDKWDNLLRDYKKVREHESRACDQSQIPSYWKMEKHERKDNNLPSNMAFEVYQA 180
           R   QC  +W NL  DYKK++E ES+  ++++  SYW M    R++  LP     EVY  
Sbjct: 91  RGPVQCRKRWSNLAGDYKKIKEWESQIKEETE--SYWVMRNDVRREKKLPGFFDKEVYDI 150

BLAST of CmaCh04G000050 vs. NCBI nr
Match: gi|778701746|ref|XP_004140413.2| (PREDICTED: trihelix transcription factor PTL-like [Cucumis sativus])

HSP 1 Score: 474.6 bits (1220), Expect = 1.7e-130
Identity = 282/407 (69.29%), Positives = 310/407 (76.17%), Query Frame = 1

Query: 1   MSEPPTTSTEPPQQHQHQHQHEQQKHPHHLLHLPLIHCGASTGTTARINAAAATSPSTVI 60
           MS+PPTTS+EPP  H          H  HL  LP+IH GA+ GT  R+N AAATS S VI
Sbjct: 1   MSDPPTTSSEPPHHH----------HQQHLPRLPVIHSGATGGT--RMNTAAATSSSAVI 60

Query: 61  VREYRKGNWSLQETMILITAKKLDEERRNKAEQG------TARKGSELRWKWVENYCWSQ 120
           VREYRKGNW+LQETMILITAKKLD+ERRNKA  G       ARKG ELRWKWVENYCWS 
Sbjct: 61  VREYRKGNWTLQETMILITAKKLDDERRNKANLGPSTVDPAARKGGELRWKWVENYCWSH 120

Query: 121 GCQRSQNQCNDKWDNLLRDYKKVREHESRACDQSQIPSYWKMEKHERKDNNLPSNMAFEV 180
           GCQRSQNQCNDKWDNLLRDYKKVRE+ESRACDQ QIPSYWKMEKHERKD NLPSNMAFEV
Sbjct: 121 GCQRSQNQCNDKWDNLLRDYKKVREYESRACDQ-QIPSYWKMEKHERKDKNLPSNMAFEV 180

Query: 181 YQALNDVVQRKLR-------GVSVAVVAGPPPPPSPTEGEAAAGTSSPAASESSSSSGRE 240
           YQALNDVVQRK            + ++  P PPPS       A T+SP  SESSSS G E
Sbjct: 181 YQALNDVVQRKFSQKPSNSSNTGILLLPLPAPPPSALLPPPTA-TNSPQLSESSSS-GTE 240

Query: 241 WGQKKEKRERKRRR----VGRSIERSASAVAQTLRTCEEQREIRHQQLMEIKKQRLQIQE 300
             +KKEK E KRR+    +GR IERS SA+ QTL +CEEQREIRHQQLME++K+RLQI+E
Sbjct: 241 SSEKKEKVEAKRRKMEDNIGRRIERSVSALGQTLHSCEEQREIRHQQLMELRKRRLQIEE 300

Query: 301 ARNHIQGQGIADLVAAVANLSGINNRSRRRRSEEYE-CLYSGEEVRRLKEQNEAMQAELS 360
            RNHI  QGIADLVAAVANLS   +  RR RSE YE CLYSGEEVR LKEQNEAMQAEL 
Sbjct: 301 TRNHIHRQGIADLVAAVANLSAGIDNDRRGRSEGYESCLYSGEEVRILKEQNEAMQAELM 360

Query: 361 SVKTELSQLRDQMPSLVRTVMHNVMHNIPPPP----SMDP---GGDA 383
           +VK ELSQLRDQMPSL++T+MHN++HNIPPPP    SMDP   GGDA
Sbjct: 361 NVKNELSQLRDQMPSLMQTMMHNMLHNIPPPPPSTSSMDPSGSGGDA 392

BLAST of CmaCh04G000050 vs. NCBI nr
Match: gi|659114054|ref|XP_008456886.1| (PREDICTED: uncharacterized protein LOC103496697 [Cucumis melo])

HSP 1 Score: 470.3 bits (1209), Expect = 3.2e-129
Identity = 280/403 (69.48%), Positives = 310/403 (76.92%), Query Frame = 1

Query: 1   MSEPPTTSTEPPQQHQHQHQHEQQKHPHHLLHLPLIHCGASTGTTARINAAAATSPSTVI 60
           MS+PPTTS+EPP      HQ +QQ    HL  LP+IH GAS  T  R+N AAATS S VI
Sbjct: 1   MSDPPTTSSEPPH-----HQQQQQ----HLPRLPVIHGGASGAT--RMNTAAATSSSAVI 60

Query: 61  VREYRKGNWSLQETMILITAKKLDEERRNKAEQG------TARKGSELRWKWVENYCWSQ 120
           VREYRKGNW+LQETMILITAKKLD+ERRNKA  G       ARKG ELRWKWVENYCWS 
Sbjct: 61  VREYRKGNWTLQETMILITAKKLDDERRNKANLGPSTVDPAARKGGELRWKWVENYCWSH 120

Query: 121 GCQRSQNQCNDKWDNLLRDYKKVREHESRACDQSQIPSYWKMEKHERKDNNLPSNMAFEV 180
           GCQRSQNQCNDKWDNLLRDYKKVRE+ESRACDQ QIPSYWKMEKHERKD NLPSNMAFEV
Sbjct: 121 GCQRSQNQCNDKWDNLLRDYKKVREYESRACDQ-QIPSYWKMEKHERKDKNLPSNMAFEV 180

Query: 181 YQALNDVVQRKL-------RGVSVAVVAGPPPPPSPTEGEAAAGTSSPAASESSSSSGRE 240
           YQALNDVVQRK            + ++  P PPPS T       T+SP  SE SSSSG E
Sbjct: 181 YQALNDVVQRKFSQKPSNSSNTGILLLPLPAPPPS-TLLPPPTATNSPQLSE-SSSSGTE 240

Query: 241 WGQKKEKRERKRRR----VGRSIERSASAVAQTLRTCEEQREIRHQQLMEIKKQRLQIQE 300
             +KKEK E KRR+    +GR IERS SA+ QTL +CEEQREIRHQQLME++K+RLQI+E
Sbjct: 241 SSEKKEKMEAKRRKMEDNIGRRIERSVSALGQTLHSCEEQREIRHQQLMELRKRRLQIEE 300

Query: 301 ARNHIQGQGIADLVAAVANLSGINNRSRRRRSEEYE-CLYSGEEVRRLKEQNEAMQAELS 360
            RNHI  QGIADLVAAVANLS   + +RR RSE YE CLYSGEEVR LKEQNEAMQAEL 
Sbjct: 301 TRNHIHRQGIADLVAAVANLSAGIDNNRRGRSEGYESCLYSGEEVRILKEQNEAMQAELM 360

Query: 361 SVKTELSQLRDQMPSLVRTVMHNVMHNIPPPP-----SMDPGG 381
           +VK ELSQLRDQMPSL++T+MH+++HNIPPPP     SMDP G
Sbjct: 361 NVKNELSQLRDQMPSLMQTMMHSMIHNIPPPPPPSTSSMDPSG 389

BLAST of CmaCh04G000050 vs. NCBI nr
Match: gi|700195606|gb|KGN50783.1| (hypothetical protein Csa_5G261720 [Cucumis sativus])

HSP 1 Score: 375.6 bits (963), Expect = 1.1e-100
Identity = 221/321 (68.85%), Positives = 242/321 (75.39%), Query Frame = 1

Query: 1   MSEPPTTSTEPPQQHQHQHQHEQQKHPHHLLHLPLIHCGASTGTTARINAAAATSPSTVI 60
           MS+PPTTS+EPP  H          H  HL  LP+IH GA+ GT  R+N AAATS S VI
Sbjct: 1   MSDPPTTSSEPPHHH----------HQQHLPRLPVIHSGATGGT--RMNTAAATSSSAVI 60

Query: 61  VREYRKGNWSLQETMILITAKKLDEERRNKAEQG------TARKGSELRWKWVENYCWSQ 120
           VREYRKGNW+LQETMILITAKKLD+ERRNKA  G       ARKG ELRWKWVENYCWS 
Sbjct: 61  VREYRKGNWTLQETMILITAKKLDDERRNKANLGPSTVDPAARKGGELRWKWVENYCWSH 120

Query: 121 GCQRSQNQCNDKWDNLLRDYKKVREHESRACDQSQIPSYWKMEKHERKDNNLPSNMAFEV 180
           GCQRSQNQCNDKWDNLLRDYKKVRE+ESRACDQ QIPSYWKMEKHERKD NLPSNMAFEV
Sbjct: 121 GCQRSQNQCNDKWDNLLRDYKKVREYESRACDQ-QIPSYWKMEKHERKDKNLPSNMAFEV 180

Query: 181 YQALNDVVQRKL-------RGVSVAVVAGPPPPPSPTEGEAAAGTSSPAASESSSSSGRE 240
           YQALNDVVQRK            + ++  P PPPS       A T+SP  SE SSSSG E
Sbjct: 181 YQALNDVVQRKFSQKPSNSSNTGILLLPLPAPPPSALLPPPTA-TNSPQLSE-SSSSGTE 240

Query: 241 WGQKKEKRERKRRR----VGRSIERSASAVAQTLRTCEEQREIRHQQLMEIKKQRLQIQE 300
             +KKEK E KRR+    +GR IERS SA+ QTL +CEEQREIRHQQLME++K+RLQI+E
Sbjct: 241 SSEKKEKVEAKRRKMEDNIGRRIERSVSALGQTLHSCEEQREIRHQQLMELRKRRLQIEE 300

Query: 301 ARNHIQGQGIADLVAAVANLS 305
            RNHI  QGIADLVAAVANLS
Sbjct: 301 TRNHIHRQGIADLVAAVANLS 306

BLAST of CmaCh04G000050 vs. NCBI nr
Match: gi|1009152822|ref|XP_015894304.1| (PREDICTED: trihelix transcription factor ASR3-like [Ziziphus jujuba])

HSP 1 Score: 268.9 bits (686), Expect = 1.4e-68
Identity = 170/325 (52.31%), Positives = 213/325 (65.54%), Query Frame = 1

Query: 1   MSEPPTTST----EPPQQHQHQHQHEQQKHPHHLLHLPLIHCGASTGTTARINAAAATSP 60
           MSEP TTS      PP   Q Q Q +QQ  PHH  H    H  A     A +    ++S 
Sbjct: 1   MSEPTTTSPAATISPPVNPQQQQQQQQQ--PHH--HRKSPHFSA-----ASVGPTTSSST 60

Query: 61  STVIVREYRKGNWSLQETMILITAKKLDEERRNKAEQG------TARKGSELRWKWVENY 120
           ST I REYRKGNW++QET+ILITAKKLDEERR KA         T+    ELRWKWVENY
Sbjct: 61  STPIEREYRKGNWTIQETLILITAKKLDEERRYKARSAPPDPTSTSTTKGELRWKWVENY 120

Query: 121 CWSQGCQRSQNQCNDKWDNLLRDYKKVREHESRACDQSQIPSYWKMEKHERKDNNLPSNM 180
           CWSQGC RS NQCNDKWDNLLRDYKKVRE+ES A  +  +PSYW MEK +RK  NLPSNM
Sbjct: 121 CWSQGCLRSSNQCNDKWDNLLRDYKKVREYESNAQSKPDLPSYWNMEKQDRKLRNLPSNM 180

Query: 181 AFEVYQALNDVVQRK-------LRGVSVAVVAGPPPPPSPTEGEAAAGTSSPAASESSSS 240
           A EV+QALN+V+QRK       LR      V+  P P +      A  TS+PA SE S S
Sbjct: 181 ALEVFQALNEVLQRKYSTQTTALRDPQTLSVSPSPAPLAARPLLPAPTTSAPAPSERSDS 240

Query: 241 SGREWGQKKEK----RERKRRRVGRSIERSASAVAQTLRTCEEQREIRHQQLMEIKKQRL 300
           SG E  +K E+    + +K  ++  SI+RSAS +A+TL+ CEE++E RH+++ME+++++L
Sbjct: 241 SGTEASEKDEETSDTKRKKHGKISSSIKRSASLLAKTLQNCEEKKEKRHREIMEMERKKL 300

Query: 301 QIQEARNHIQGQGIADLVAAVANLS 305
           +I+EA N +  QG+ +LV AVANLS
Sbjct: 301 EIEEAHNEVNRQGMVNLVGAVANLS 316

BLAST of CmaCh04G000050 vs. NCBI nr
Match: gi|571568693|ref|XP_006606274.1| (PREDICTED: trihelix transcription factor ASR3-like [Glycine max])

HSP 1 Score: 258.5 bits (659), Expect = 1.9e-65
Identity = 156/326 (47.85%), Positives = 203/326 (62.27%), Query Frame = 1

Query: 26  HPHHLLHLPLIHCGASTGTTARINAAAATSPSTVIVREYRKGNWSLQETMILITAKKLDE 85
           H HH  H+PLI  GA+          A +S ST + REYRKGNW++QET+ILITAKKLD+
Sbjct: 12  HHHHNHHVPLIQGGAT----------APSSSSTTLAREYRKGNWTIQETLILITAKKLDD 71

Query: 86  ERRNKAEQG------TARKGSELRWKWVENYCWSQGCQRSQNQCNDKWDNLLRDYKKVRE 145
           ERR K          T R   ELRWKWVENYCWS GC RSQNQCNDKWDNLLRDYKKVR+
Sbjct: 72  ERRLKTPAACSTSTTTTRTSGELRWKWVENYCWSHGCLRSQNQCNDKWDNLLRDYKKVRD 131

Query: 146 HESRACD-----QSQIPSYWKMEKHERKDNNLPSNMAFEVYQALNDVVQRK--------L 205
           +ES++ D         PSYW + K +RK+ NLPSNM FEVYQ + DV+QRK         
Sbjct: 132 YESKSNDNDNNNNKHFPSYWTLNKQQRKEQNLPSNMVFEVYQTIADVLQRKQTQSQRQHQ 191

Query: 206 RGVSVAVVAGP------------PPPPSPTEGEAAAGTSSPAASESSSSSGREWGQK--- 265
           + +++ +V               PPPP P        +++P  SE S SSG E  +    
Sbjct: 192 QPLAIPLVTSSPSPLQTLPPPPLPPPPPPPPPPPPVSSTTPVGSERSESSGTEHSEDDDD 251

Query: 266 -KEKRERKRRRVGRSIERSASAVAQTLRTCEEQREIRHQQLMEIKKQRLQIQEARNHIQG 312
             E + RK + +G  I +SAS +A+ LR+CEE++E RH++++E++++R+Q++EARN +  
Sbjct: 252 GSESKRRKVKNLGSRIMQSASVLARALRSCEEKKEKRHREMIELEQRRIQMEEARNEVHR 311

The following BLAST results are available for this feature:
Match Name   E-value   Identity (%)   Description
ASR3_ARATH   2.1e-11   33.61   Trihelix transcription factor ASR3 OS=Arabidopsis thaliana GN=ASR3 PE=1 SV=1

Match Name   E-value   Identity (%)   Description
A0A0A0KME8_CUCSA   7.5e-101   68.85   Uncharacterized protein OS=Cucumis sativus GN=Csa_5G261720 PE=4 SV=1
K7N4B1_SOYBN   1.3e-65   47.85   Uncharacterized protein OS=Glycine max GN=GLYMA_20G184500 PE=4 SV=1
A0A067KQ45_JATCU   2.3e-65   49.85   Uncharacterized protein OS=Jatropha curcas GN=JCGZ_04860 PE=4 SV=1
A0A151SE87_CAJCA   1.5e-64   47.06   Uncharacterized protein OS=Cajanus cajan GN=KK1_025039 PE=4 SV=1
V7BD78_PHAVU   7.3e-64   46.84   Uncharacterized protein OS=Phaseolus vulgaris GN=PHAVU_007G100500g PE=4 SV=1

Match Name   E-value   Identity (%)   Description
AT1G31310.1   6.6e-40   47.47   hydroxyproline-rich glycoprotein family protein
AT2G35640.1   2.1e-38   51.41   Homeodomain-like superfamily protein
AT2G33550.1   1.2e-12   33.61   Homeodomain-like superfamily protein

Match Name   E-value   Identity (%)   Description
gi|778701746|ref|XP_004140413.2|   1.7e-130   69.29   PREDICTED: trihelix transcription factor PTL-like [Cucumis sativus]
gi|659114054|ref|XP_008456886.1|   3.2e-129   69.48   PREDICTED: uncharacterized protein LOC103496697 [Cucumis melo]
gi|700195606|gb|KGN50783.1|   1.1e-100   68.85   hypothetical protein Csa_5G261720 [Cucumis sativus]
gi|1009152822|ref|XP_015894304.1|   1.4e-68   52.31   PREDICTED: trihelix transcription factor ASR3-like [Ziziphus jujuba]
gi|571568693|ref|XP_006606274.1|   1.9e-65   47.85   PREDICTED: trihelix transcription factor ASR3-like [Glycine max]

The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term   Definition
IPR009057   Homeobox-like_sf
IPR017877   Myb-like_dom

Vocabulary: Molecular Function
Term   Definition
GO:0003677   DNA binding

GO Assignments
This gene is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0008150 biological_process
cellular_component GO:0005575 cellular_component
cellular_component GO:0016021 integral component of membrane
molecular_function GO:0003677 DNA binding
molecular_function GO:0003674 molecular_function

The following mRNA feature(s) are a part of this gene:

Feature Name   Unique Name   Type
CmaCh04G000050.1   CmaCh04G000050.1   mRNA


Analysis Name: InterPro Annotations of Cucurbita maxima
Date Performed: 2017-05-20
IPR Term   IPR Description   Source   Source Term   Source Description   Alignment
IPR009057   Homeodomain-like   GENE3D   G3DSA:1.10.10.60   -   coord: 63..133; score: 9.
IPR017877   Myb-like domain   PROFILE   PS50090   MYB_LIKE   coord: 61..131; score: 7
None   No IPR available   unknown   Coil   Coil   coord: 327..354; scor
None   No IPR available   GENE3D   G3DSA:1.20.5.170   -   coord: 308..355; score: 2.
None   No IPR available   PANTHER   PTHR33492   FAMILY NOT NAMED   coord: 30..324; score: 1.9
None   No IPR available   PANTHER   PTHR33492:SF5   SUBFAMILY NOT NAMED   coord: 30..324; score: 1.9
None   No IPR available   PFAM   PF13837   Myb_DNA-bind_4   coord: 68..158; score: 2.7
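The `coord: start..end` values in the InterPro annotations give residue ranges on the protein. The sketch below shows one way to parse such a range and cut the corresponding region out of a protein string. It assumes the coordinates are 1-based and inclusive, which is an assumption about the annotation convention; `parse_coord`, `region_length`, and `slice_region` are illustrative names.

```python
import re

# Matches strings such as "coord: 63..133" anywhere in a line.
COORD_RE = re.compile(r"coord:\s*(\d+)\.\.(\d+)")


def parse_coord(text: str) -> tuple[int, int]:
    """Extract (start, end) from a 'coord: start..end' annotation string."""
    match = COORD_RE.search(text)
    if match is None:
        raise ValueError(f"no coordinate range found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def region_length(text: str) -> int:
    """Number of residues covered, assuming inclusive coordinates."""
    start, end = parse_coord(text)
    return end - start + 1


def slice_region(protein: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein string."""
    return protein[start - 1:end]


start, end = parse_coord("coord: 61..131")
print(start, end, region_length("coord: 61..131"))
```

Passing the full protein sequence from this record to `slice_region` together with a parsed range would return the residues covered by that annotation.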

The following gene(s) are paralogous to this gene:

None

The following block(s) are covering this gene:
Gene   Organism   Block
CmaCh04G000050   Wax gourd   cmawgoB0899
CmaCh04G000050   Wild cucumber (PI 183967)   cmacpiB751
CmaCh04G000050   Silver-seed gourd   carcmaB1030