CmaCh02G005450 (gene) Cucurbita maxima (Rimu) v1.1

Overview
Name: CmaCh02G005450
Type: gene
Organism: Cucurbita maxima (Cucurbita maxima (Rimu) v1.1)
Description: SMI1_KNR4 domain-containing protein
Location: Cma_Chr02: 3021760 .. 3022956 (+)
RNA-Seq Expression: CmaCh02G005450
Synteny: CmaCh02G005450
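
The Location field can be turned into a feature length with a little arithmetic. The sketch below assumes the coordinates are 1-based and inclusive (the usual GFF-style convention; the page itself does not state this), and the helper names `parse_location` and `span_length` are hypothetical, introduced only for illustration.

```python
import re


def parse_location(text):
    """Split a location string such as 'Cma_Chr02: 3021760 .. 3022956 (+)'
    into (seqid, start, end, strand)."""
    pattern = r"\s*(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*"
    match = re.fullmatch(pattern, text)
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    seqid, start, end, strand = match.groups()
    return seqid, int(start), int(end), strand


def span_length(start, end):
    """Length of a 1-based, inclusive interval (assumed convention)."""
    return end - start + 1


seqid, start, end, strand = parse_location("Cma_Chr02: 3021760 .. 3022956 (+)")
length = span_length(start, end)
codons = length // 3  # includes the stop codon if the span is a complete CDS

print(seqid, strand, length, codons)
```

Under that assumption the span is 1197 nt, which is 399 codons; one stop codon plus 398 residues agrees with the `coord: 1..398` range reported in the InterPro section.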
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

ATGGTCGACGTCGACCGGAGAATGACTGGTCTTAACCCGGCCCACCTCGCCGGCCTCCGCCGTCTTTCCGCCCGTGCTGCCGCTACCTCCACCGCCTCCTCCGCCCCCCTCCGCAACGGCCTCCTCTCCTTCTCCTCTCTTGCCGATGAAGTCCTCACCCATCTACGAAACTCCGGCGTCGACGTTCAACCCGGTCTCTCTGATGCGGAATTCGCCCGAGCTGAGGCTGAGTTCCGCTTCGCCTTTCCTCCTGATCTCCGGGCAGTTCTCTCCGCTGGCCTCCCCGTTGGCCCCGGTTTTCCTGACTGGCGCTCCTCCGGTGCACGCCTCCATCTTCGGTCCTCCCTCGACCTACCAATTGCTGCAATTTCCTTCCAAATCGCTAGAAACACTCTCTGGTCTCGCTCTTGGGGTCCCAGGCCGGCGGAACCGGAAAAGGCCTTACGGGTTGCGAGAAATTCGCTTAAACGAGCTCCTGTTTTGATCCCCATTTTCAACCATTGCTACATTCCCTGCAACCCGCCCTTGGCTGGGAATCCGATCTTCTTTGTCGATGAGAGTCGAGTTCTTTGCTGTGGCTTTGATTTGTCGGATTTCTTCGAACGGGAGTCTCTCTTTCGATGCTCTGCTTCCGATTCCGATTCGGGTCCTCTGTTTTCGAAACAGAGGTCTCTCGCCGAGAAGTCCTTTGGATCGTCCACGAATTTCTCGCGACGGAGCTTAGATTCGGGGGTTGTAAAAACGCCGAGGTGGGTCGAGTTTTGGAGCGACGCCGCCGTGGATCGGCGGCGGAGAAACTCGTCGTCGTCATCGAATTCCTCGCCGGATCGGTTCTTCGAGTTGCCAAGATCCGAAATCCCGAAGTGGGTTGGGGAGTATATTGGAGAATTAGGATCAGTTTTGAGATCGGGCGGGTGGAGTGAATCGGAGGTGGCGGAGATGGTGGATGTCTCAGCATCAGGATTCTTCGACGGCGAAATGGTTATGTTGGATAATCAGGCAGTTTTGGATGCTCTGCTTTTGAAAGTGGACCGGTTTTCCGGTTCGTTGAGACGGGCCGGGTGGAGCTCCGAAGAGGTTTCGGAGGCGTTTGGATTCGATTTTCGGCCGGAGAAGGAGAGGAAACCGGCGAAGAAGCTCTCGGCGGAATTGGTGGAGCGAATTGGGAAACTGGCTGAGTCGGTTTCCCGGTCATGA

mRNA sequence

ATGGTCGACGTCGACCGGAGAATGACTGGTCTTAACCCGGCCCACCTCGCCGGCCTCCGCCGTCTTTCCGCCCGTGCTGCCGCTACCTCCACCGCCTCCTCCGCCCCCCTCCGCAACGGCCTCCTCTCCTTCTCCTCTCTTGCCGATGAAGTCCTCACCCATCTACGAAACTCCGGCGTCGACGTTCAACCCGGTCTCTCTGATGCGGAATTCGCCCGAGCTGAGGCTGAGTTCCGCTTCGCCTTTCCTCCTGATCTCCGGGCAGTTCTCTCCGCTGGCCTCCCCGTTGGCCCCGGTTTTCCTGACTGGCGCTCCTCCGGTGCACGCCTCCATCTTCGGTCCTCCCTCGACCTACCAATTGCTGCAATTTCCTTCCAAATCGCTAGAAACACTCTCTGGTCTCGCTCTTGGGGTCCCAGGCCGGCGGAACCGGAAAAGGCCTTACGGGTTGCGAGAAATTCGCTTAAACGAGCTCCTGTTTTGATCCCCATTTTCAACCATTGCTACATTCCCTGCAACCCGCCCTTGGCTGGGAATCCGATCTTCTTTGTCGATGAGAGTCGAGTTCTTTGCTGTGGCTTTGATTTGTCGGATTTCTTCGAACGGGAGTCTCTCTTTCGATGCTCTGCTTCCGATTCCGATTCGGGTCCTCTGTTTTCGAAACAGAGGTCTCTCGCCGAGAAGTCCTTTGGATCGTCCACGAATTTCTCGCGACGGAGCTTAGATTCGGGGGTTGTAAAAACGCCGAGGTGGGTCGAGTTTTGGAGCGACGCCGCCGTGGATCGGCGGCGGAGAAACTCGTCGTCGTCATCGAATTCCTCGCCGGATCGGTTCTTCGAGTTGCCAAGATCCGAAATCCCGAAGTGGGTTGGGGAGTATATTGGAGAATTAGGATCAGTTTTGAGATCGGGCGGGTGGAGTGAATCGGAGGTGGCGGAGATGGTGGATGTCTCAGCATCAGGATTCTTCGACGGCGAAATGGTTATGTTGGATAATCAGGCAGTTTTGGATGCTCTGCTTTTGAAAGTGGACCGGTTTTCCGGTTCGTTGAGACGGGCCGGGTGGAGCTCCGAAGAGGTTTCGGAGGCGTTTGGATTCGATTTTCGGCCGGAGAAGGAGAGGAAACCGGCGAAGAAGCTCTCGGCGGAATTGGTGGAGCGAATTGGGAAACTGGCTGAGTCGGTTTCCCGGTCATGA

Coding sequence (CDS)

ATGGTCGACGTCGACCGGAGAATGACTGGTCTTAACCCGGCCCACCTCGCCGGCCTCCGCCGTCTTTCCGCCCGTGCTGCCGCTACCTCCACCGCCTCCTCCGCCCCCCTCCGCAACGGCCTCCTCTCCTTCTCCTCTCTTGCCGATGAAGTCCTCACCCATCTACGAAACTCCGGCGTCGACGTTCAACCCGGTCTCTCTGATGCGGAATTCGCCCGAGCTGAGGCTGAGTTCCGCTTCGCCTTTCCTCCTGATCTCCGGGCAGTTCTCTCCGCTGGCCTCCCCGTTGGCCCCGGTTTTCCTGACTGGCGCTCCTCCGGTGCACGCCTCCATCTTCGGTCCTCCCTCGACCTACCAATTGCTGCAATTTCCTTCCAAATCGCTAGAAACACTCTCTGGTCTCGCTCTTGGGGTCCCAGGCCGGCGGAACCGGAAAAGGCCTTACGGGTTGCGAGAAATTCGCTTAAACGAGCTCCTGTTTTGATCCCCATTTTCAACCATTGCTACATTCCCTGCAACCCGCCCTTGGCTGGGAATCCGATCTTCTTTGTCGATGAGAGTCGAGTTCTTTGCTGTGGCTTTGATTTGTCGGATTTCTTCGAACGGGAGTCTCTCTTTCGATGCTCTGCTTCCGATTCCGATTCGGGTCCTCTGTTTTCGAAACAGAGGTCTCTCGCCGAGAAGTCCTTTGGATCGTCCACGAATTTCTCGCGACGGAGCTTAGATTCGGGGGTTGTAAAAACGCCGAGGTGGGTCGAGTTTTGGAGCGACGCCGCCGTGGATCGGCGGCGGAGAAACTCGTCGTCGTCATCGAATTCCTCGCCGGATCGGTTCTTCGAGTTGCCAAGATCCGAAATCCCGAAGTGGGTTGGGGAGTATATTGGAGAATTAGGATCAGTTTTGAGATCGGGCGGGTGGAGTGAATCGGAGGTGGCGGAGATGGTGGATGTCTCAGCATCAGGATTCTTCGACGGCGAAATGGTTATGTTGGATAATCAGGCAGTTTTGGATGCTCTGCTTTTGAAAGTGGACCGGTTTTCCGGTTCGTTGAGACGGGCCGGGTGGAGCTCCGAAGAGGTTTCGGAGGCGTTTGGATTCGATTTTCGGCCGGAGAAGGAGAGGAAACCGGCGAAGAAGCTCTCGGCGGAATTGGTGGAGCGAATTGGGAAACTGGCTGAGTCGGTTTCCCGGTCATGA

Protein sequence

MVDVDRRMTGLNPAHLAGLRRLSARAAATSTASSAPLRNGLLSFSSLADEVLTHLRNSGVDVQPGLSDAEFARAEAEFRFAFPPDLRAVLSAGLPVGPGFPDWRSSGARLHLRSSLDLPIAAISFQIARNTLWSRSWGPRPAEPEKALRVARNSLKRAPVLIPIFNHCYIPCNPPLAGNPIFFVDESRVLCCGFDLSDFFERESLFRCSASDSDSGPLFSKQRSLAEKSFGSSTNFSRRSLDSGVVKTPRWVEFWSDAAVDRRRRNSSSSSNSSPDRFFELPRSEIPKWVGEYIGELGSVLRSGGWSESEVAEMVDVSASGFFDGEMVMLDNQAVLDALLLKVDRFSGSLRRAGWSSEEVSEAFGFDFRPEKERKPAKKLSAELVERIGKLAESVSRS
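
The relationship between the CDS and the protein sequence can be checked by translating codons. The following is a minimal sketch that assumes the standard nuclear genetic code; the `translate` function and `CODON_TABLE` name are illustrative, not part of any database tooling. Only the first 24 and last 12 nucleotides of the CDS above are used, to keep the example short.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed applicable here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds, to_stop=True):
    """Translate an in-frame coding sequence; unknown codons become 'X'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*" and to_stop:
            break  # stop codon terminates the polypeptide
        residues.append(aa)
    return "".join(residues)


cds_start = "ATGGTCGACGTCGACCGGAGAATG"  # first 24 nt of the CDS above
cds_end = "TCCCGGTCATGA"                # last 12 nt, ending in the TGA stop

print(translate(cds_start))  # MVDVDRRM
print(translate(cds_end))    # SRS
```

Both fragments match the corresponding ends of the protein sequence listed above (`MVDVDRRM...` and `...SRS`).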
Homology
BLAST of CmaCh02G005450 vs. TAIR 10
Match: AT3G50340.1 (unknown protein; BEST Arabidopsis thaliana protein match is: unknown protein (TAIR:AT5G67020.1); Has 128 Blast hits to 128 proteins in 39 species: Archae - 0; Bacteria - 46; Metazoa - 0; Fungi - 3; Plants - 76; Viruses - 0; Other Eukaryotes - 3 (source: NCBI BLink). )

HSP 1 Score: 589.0 bits (1517), Expect = 2.9e-168
Identity = 303/410 (73.90%), Positives = 352/410 (85.85%), Query Frame = 0

Query: 1   MVDVDRRMTGLNPAHLAGLRRLSARAAATSTASSAPLRNGLLSFSSLADEVLTHLRNSGV 60
           MVDVDRRMTGL PAH AGLRRLSARAAA +T +   +RN L+SFSSLAD+V++HL  S +
Sbjct: 1   MVDVDRRMTGLRPAHAAGLRRLSARAAAPTTPT---VRNSLVSFSSLADQVISHLHTSRI 60

Query: 61  DVQPGLSDAEFARAEAEFRFAFPPDLRAVLSAGLPVGPGFPDWRSSGARLHLRSSLDLPI 120
            VQPGL+D+EFARAEAEF FAFPPDLRAVL+AGLPVG GFPDWRS GARLHLR+ +DLPI
Sbjct: 61  QVQPGLTDSEFARAEAEFAFAFPPDLRAVLTAGLPVGAGFPDWRSPGARLHLRAMIDLPI 120

Query: 121 AAISFQIARNTLWSRSWGPRPAEPEKALRVARNSLKRAPVLIPIFNHCYIPCNPPLAGNP 180
           AA+SFQIARNTLWS+SWG RP++PEKALRVARN+LKRAP++IPIF+HCYIPCNP LAGNP
Sbjct: 121 AAVSFQIARNTLWSKSWGLRPSDPEKALRVARNALKRAPLMIPIFDHCYIPCNPSLAGNP 180

Query: 181 IFFVDESRVLCCGFDLSDFFERESLFRCSASDSDSGP-LFSKQRSLAEKSFG----SSTN 240
           +F++DE+R+ CCG DLSDFFERES+FR     SD+ P + +KQRS++EKS G    SS+N
Sbjct: 181 VFYIDETRIFCCGSDLSDFFERESVFR----GSDTCPVVLTKQRSVSEKSAGSSSSSSSN 240

Query: 241 FSRRSLDSGVV---KTPRWVEFWSDAAVDRRRRNS----SSSSNSSPDRFFELPRSEIPK 300
           FSR SLDSG V    TPRWVEFWSDAAVDRRRRNS    SSS +SSP+R+ +LPRSE PK
Sbjct: 241 FSRMSLDSGRVHGSSTPRWVEFWSDAAVDRRRRNSASSMSSSHSSSPERYLDLPRSETPK 300

Query: 301 WVGEYIGELGSVLRSGGWSESEVAEMVDVSASGFFDGEMVMLDNQAVLDALLLKVDRFSG 360
           WV +Y+  +GSVLR GGWSES+V ++V VSASGFF+GEMV+LDNQAVLDALLLK  RFS 
Sbjct: 301 WVDDYVNRIGSVLRGGGWSESDVDDIVHVSASGFFEGEMVILDNQAVLDALLLKAGRFSE 360

Query: 361 SLRRAGWSSEEVSEAFGFDFRPEKERKPAKKLSAELVERIGKLAESVSRS 399
           SLR+AGWSSEEVS+A GFDFRPEKE+KP KKLS ELV+RIGKLAESVSRS
Sbjct: 361 SLRKAGWSSEEVSDALGFDFRPEKEKKPVKKLSPELVQRIGKLAESVSRS 403
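
The percentages in each HSP header are the match counts divided by the alignment length, which includes gap columns. A short sketch of that arithmetic follows, together with an identity count over the last alignment block of this hit; the function names are hypothetical and this is not how BLAST itself is implemented.

```python
def percent(matches, aligned_length):
    """Percentage of alignment columns that match, formatted to two decimals."""
    return f"{100.0 * matches / aligned_length:.2f}"


def count_identities(query, subject):
    """Count columns where two equal-length aligned strings carry the same residue."""
    if len(query) != len(subject):
        raise ValueError("aligned strings must have equal length")
    return sum(1 for q, s in zip(query, subject) if q == s and q != "-")


# Counts taken from the three HSP headers on this page.
print(percent(303, 410), percent(352, 410))  # 73.90 85.85
print(percent(279, 404), percent(327, 404))  # 69.06 80.94
print(percent(87, 307), percent(136, 307))   # 28.34 44.30

# Final 50-column block of the AT3G50340.1 alignment.
query_block = "SLRRAGWSSEEVSEAFGFDFRPEKERKPAKKLSAELVERIGKLAESVSRS"
sbjct_block = "SLRKAGWSSEEVSDALGFDFRPEKEKKPVKKLSPELVQRIGKLAESVSRS"
print(count_identities(query_block, sbjct_block))  # 43
```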

BLAST of CmaCh02G005450 vs. TAIR 10
Match: AT5G67020.1 (unknown protein; BEST Arabidopsis thaliana protein match is: unknown protein (TAIR:AT3G50340.1); Has 1807 Blast hits to 1807 proteins in 277 species: Archae - 0; Bacteria - 0; Metazoa - 736; Fungi - 347; Plants - 385; Viruses - 0; Other Eukaryotes - 339 (source: NCBI BLink). )

HSP 1 Score: 544.3 bits (1401), Expect = 8.3e-155
Identity = 279/404 (69.06%), Positives = 327/404 (80.94%), Query Frame = 0

Query: 1   MVDVDRRMTGLNPAHLAGLRRLSARAAATSTASSAPLRNGLLSFSSLADEVLTHLRNSGV 60
           MVDVDRRMTGL PAH AGLRRLSARAAA ST +   +RN L SFS  AD+V+ HL+NSG+
Sbjct: 1   MVDVDRRMTGLTPAHAAGLRRLSARAAAPSTPT---IRNSLQSFSPFADKVINHLKNSGI 60

Query: 61  DVQPGLSDAEFARAEAEFRFAFPPDLRAVLSAGLPVGPGFPDWRSSGARLHLRSSLDLPI 120
            +QPGLSD EFAR EAEF F FPPDLR +LSAGL VG GFPDWRS GARLHLR+ +DLP+
Sbjct: 61  KIQPGLSDTEFARVEAEFGFTFPPDLRVILSAGLSVGAGFPDWRSPGARLHLRAMIDLPV 120

Query: 121 AAISFQIARNTLWSRSWGPRPAEPEKALRVARNSLKRAPVLIPIFNHCYIPCNPPLAGNP 180
           AA+SFQIA+N+LW +SWG +P +PEKALRVARN+LKRAP+LIPIF+HCYIPCNP LAGNP
Sbjct: 121 AAVSFQIAKNSLWCKSWGLKPPDPEKALRVARNALKRAPLLIPIFDHCYIPCNPSLAGNP 180

Query: 181 IFFVDESRVLCCGFDLSDFFERESLFRCSASDSDSGPLFSKQRSLAEKSFGSSTNFSRRS 240
           +FF+DE+R+ CCG DLS+FFERES FR   S      + +KQRS++EKS GSS+NFSRRS
Sbjct: 181 VFFIDETRIFCCGSDLSEFFERESAFR---SSEFFPRILTKQRSVSEKSAGSSSNFSRRS 240

Query: 241 LDSGVVK---TPRWVEFWSDAAVDRRRRNS---SSSSNSSPDRFFELPRSEIPKWVGEYI 300
           LD G        RWVEFWSDAAVDR RRNS   SSSS+SSPD    LP++E PKWV +Y+
Sbjct: 241 LDLGRANGAGKSRWVEFWSDAAVDRCRRNSASTSSSSSSSPD----LPKTETPKWVNQYV 300

Query: 301 GELGSVLRSGGWSESEVAEMVDVSASGFFDGEMVMLDNQAVLDALLLKVDRFSGSLRRAG 360
             +GSVLR GGWSES++ E++ VSASGFF+GEMV++DNQ VLD LLLK  R S SLR++G
Sbjct: 301 NRIGSVLRRGGWSESDIDEIIHVSASGFFEGEMVIIDNQTVLDVLLLKAGRISESLRKSG 360

Query: 361 WSSEEVSEAFGFDFRPEKERKPAKKLSAELVERIGKLAESVSRS 399
           WSSEEVS+A GFDFRPEKERKP KKLS  LVE+  KLAE VS+S
Sbjct: 361 WSSEEVSDALGFDFRPEKERKPVKKLSPMLVEQFEKLAEWVSQS 394

BLAST of CmaCh02G005450 vs. TAIR 10
Match: AT2G22790.1 (unknown protein; BEST Arabidopsis thaliana protein match is: unknown protein (TAIR:AT5G67020.1); Has 111 Blast hits to 111 proteins in 33 species: Archae - 0; Bacteria - 44; Metazoa - 0; Fungi - 0; Plants - 67; Viruses - 0; Other Eukaryotes - 0 (source: NCBI BLink). )

HSP 1 Score: 113.2 bits (282), Expect = 4.7e-25
Identity = 87/307 (28.34%), Positives = 136/307 (44.30%), Query Frame = 0

Query: 21  RLSARAAATSTASSAPLRNGLLSFSS--LADEVLTHLRN-SGVDVQPGLSDAEFARAEAE 80
           RL+   +     ++ P+R+  ++ SS      ++ H ++ +G  V PGL++ E +  E+ 
Sbjct: 4   RLAGIISPLGHITTDPIRSSSVNPSSPVYYKTIVNHFKSQTGNHVSPGLTNQEISAVESS 63

Query: 81  FRFAFPPDLRAVLSAGLPVGPGFPDWRSSGARLHLRSSLDLPIAAISFQIARNTLWSRSW 140
             F+FP DLR++L  GLPVG  FP+WR+       R++L LP+  +S  + RN  W  SW
Sbjct: 64  HGFSFPLDLRSILQTGLPVGTNFPNWRTGSN----RNNLLLPLLNLSQHVVRNGFWVDSW 123

Query: 141 GPRPAEPEKALRVARNSLKRAPVLIPIFNHCYIPCNPP-LAGNPIFFVDESRVLCCGFDL 200
           G RP    +AL + +  ++ APVL+P++   Y+P   P LAGNP+F +D   V     D+
Sbjct: 124 GIRPGNDAEALSLVKKLIEIAPVLVPVYGDFYVPSTTPNLAGNPVFQIDGDGVRELSCDV 183

Query: 201 SDFFERESLFRCSASDSDSGPLFSKQRSLAEKSFGSSTNFSRRSLDSGVVKTPRWVEFWS 260
             F                            K  G S   +    D    + PR VEFWS
Sbjct: 184 VGFL---------------------------KGIGRSETPTE---DRRRRRRPRRVEFWS 243

Query: 261 DAAVDRRRRNSSSSSNSSPDRFFELPRSEIPKW--------VGEYIGELGSVLRSGGWSE 316
           D A   R               F + R     W        +   + +    LR  GW+E
Sbjct: 244 DVAEGWR---------------FVVARDYTRDWWSALGFEGLTACLDDAFWKLREAGWTE 261

The following BLAST results are available for this feature:
Match Name     E-value     Identity (%)   Description
AT3G50340.1    2.9e-168    73.90          unknown protein; BEST Arabidopsis thaliana protein match is: unknown protein (TA...
AT5G67020.1    8.3e-155    69.06          unknown protein; BEST Arabidopsis thaliana protein match is: unknown protein (TA...
AT2G22790.1    4.7e-25     28.34          unknown protein; BEST Arabidopsis thaliana protein match is: unknown protein (TA...
InterPro
Analysis Name: InterPro Annotations of Cucurbita maxima (Rimu) v1.1
Date Performed: 2021-10-25
IPR Term   IPR Description    Source    Source Term      Source Description       Alignment
None       No IPR available   PANTHER   PTHR32011:SF8    BNAC08G21750D PROTEIN    coord: 1..398
None       No IPR available   PANTHER   PTHR32011        OS08G0472400 PROTEIN     coord: 1..398

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name        Unique Name         Type
CmaCh02G005450.1    CmaCh02G005450.1    mRNA