Cp4.1LG20g00040 (gene) Cucurbita pepo (Zucchini)

Name: Cp4.1LG20g00040
Type: gene
Organism: Cucurbita pepo (Zucchini)
Description: Protein of unknown function (DUF760)
Location: Cp4.1LG20 : 7734 .. 10045 (+)
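
The location field can be read programmatically. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive at both ends (the usual convention for GFF-style annotations, but not stated on this page); `parse_location` is a hypothetical helper name, not part of any database API.

```python
import re

def parse_location(text):
    """Split 'seqid : start .. end (strand)' into its four parts."""
    pattern = r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*"
    m = re.fullmatch(pattern, text)
    if m is None:
        raise ValueError(f"unrecognized location: {text!r}")
    seqid, start, end, strand = m.groups()
    return seqid, int(start), int(end), strand

seqid, start, end, strand = parse_location("Cp4.1LG20 : 7734 .. 10045 (+)")

# Assumption: both coordinates are inclusive, so the span is end - start + 1.
length = end - start + 1
print(seqid, strand, length)
```

Under that assumption the feature spans 2312 positions on the forward strand of Cp4.1LG20.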
The following sequences are available for this feature:

Gene sequence (with intron)

ATTATGGTATTGTATGTTTTGTTTATTTTATTTTTTAAATTCAAAATTAGGAATGATCCAAACCTCTCAAAGATTGTGGTGGAGAATATGTGTTAATAGATTCTATCTAGTTTTCTTGATGTAATGCACAGAGAAGAAAGCCAAGAGTGTCCACAGAAATTAGGAGATGCCTACAAAGATCTTTTTTTTTTTTTTGCATGTCGTTGCCGTCTCTCCGTCCATTCATACACAAACATACGCGTGCCCAAACGTGTTTCACAATTTCATTCTATAAACATCTCCTCTCCACAAACTGCACCACACTCTCCAGTTTCTAGTTTCTTCTTCCCAGGTTTGTATCTTTCATACTCCCTCGCTCGATGAAACTATTCATTCAGTTTCGCTCCCTTCATCAATTTTCATATGCCTTTTGCAGATCCATTCGACACAACACTACACTACTCAGAGACACCGCCATGGAAGCAGCCACTGGTTCAACTTCAACCCTTGCCATTGGAATTGGATCGCCATTTCGGGACACCGACCCCAGGCCCCCTGCCTCCCGTTCCCTTTATTTTCCTTCCGAATCCCTCATCTCTGTTCCTGTAAGCTGGAATCATTTTTCTTGTATTTCTCACTCCATATTTATCAATCATTTCATAGTCAATTTCTGTGTATCTCATTTTGATATCTTCTGTTTTTTTCCCAACTTTCCGTTGTCGCATCGCAGCATTATCGGTCCTTCGTTTCTCCATCAAAACTTGGAAAGAAGTCGATTACTCTCCCCTGTAGTGGCCGGGGTCGGGGATTGGGATTCCCAATGGTTAAAGCGTCTCTGTCTCCGGATCCGGATGGTTCTGCTGCCCAAATTGCTCCACTTCGGCTCCAGTCTCCAATTGGCCAGTTTCTGTCTCAAATCCTGACTACCCATCCCCACCTTCTTCCTGCAGCCGTCGACCAGCAGCTTCAACAGCTGCAGACCCAACGTCATGCTGAAGAACTAACTCAAGAGCCCTCCGCTTCAGCTACTCATGACATTGTCTTGTACAGGTTAGTTCCGCATCCCTTAGAGGCCGAGTGTCAACAGGTAAGGATGGATCTAGGAACGAAAATGTTTTAACAGTCGATTGAAGCATATTGTTCGATTAGAAGAGCACCGAAGCTGATGAAGCTTGTTTTTTTTTTTTTTTGGACAACCCTGCTCACGTCGCTTAGCGTTTGAATTTGTTAGGAGGATTGCTGAGGTTAAGGCAATTGAAAGGAAGAGGGCCTTAGAAGAGATATTATATGCAATGGTGGTGCAACGATTCATGGACGCCGATGTTCCTCTAATACCAGCTGTTGCCCCGTCGTCTACGGATCCATATGGCCGAGTTGACACATGGGCACGAGATGATGAAAAGCTGGAGCGGCTTCACTCGTCGGAAGCAAGCGAAATGATTCAGAACCACCTAGCGCTGGTTTTGGGGAATCGGATTGGTGACTTTGCGTCAGTAGCGCAGATAAGCAAACTAAGAGTGGGGCAGGTGTATGCTGCGTCTGTGATGTATGGGTACTTCCTCAAGCGAGTGGATGAGAGATTTCAGCTTGAGAAGACTGTGAAAGTGCTACCAGCCAGTGCAACTGTTGAGGGCTCCTTCTCCAATGCACCAGTGCATCCTGAAATCTCTTCCATGGCAGCTGAACAGGGAGATGTTAGTCCTGGGGAGTCGGGTATGGGGATCAAGCCCTCCCGACTGCGAACATACGTAATGTCATTTGATGGGGAGACACTGCAGAGATTTGCCACAATAAGGTCAAAAGAGGCCGTTAGCATCATTGAGAGACACACGGAGGCCTTGTTTGGAAGACCCCAGATTGCAATCACCCCGCAAGGAACAGTAGATACCTCCAAAGACGAGCTTATCAAAATCAGCTTTGGTGGGTTGAAGAGACTAGTTTTGGAAGCCGTGACTTTCGGTTCTTTCCTGTGGGATGTGGAGACGTATGTGGACTCCAGGTATCATTTTGTCATGAATTGA
GACATGAACTTACTCGTGCATTCTGGACAATGGATGCCTCCAGCGAATGTAATGTTAACTGTATATGTTTTGTTTAACTCAACCATTTACATTTTCACATTTTAAAATTATCTTATTTTTTTAAAGACAGAATGTCAAAATTGATGGACAGGAACTTTAAACGATACTTAACAAGTTCATATAATTTTAGAGGAGGAACTAAGTTACGTGAGAGATACAATTTAACAGAGTTTTGATATTTACATCTCAGAAGAAGAACGAAGTAACATGAATATGATACAAAGTTCAATATTTATTCAAATTTTTTTAGCA

mRNA sequence

ATTATGGTATTGTATGTTTTGTTTATTTTATTTTTTAAATTCAAAATTAGGAATGATCCAAACCTCTCAAAGATTGTGGTGGAGAATATGTGTTAATAGATTCTATCTAGTTTTCTTGATGTAATGCACAGAGAAGAAAGCCAAGAGTGTCCACAGAAATTAGGAGATGCCTACAAAGATCTTTTTTTTTTTTTTGCATGTCGTTGCCGTCTCTCCGTCCATTCATACACAAACATACGCGTGCCCAAACGTGTTTCACAATTTCATTCTATAAACATCTCCTCTCCACAAACTGCACCACACTCTCCAGTTTCTAGTTTCTTCTTCCCAGATCCATTCGACACAACACTACACTACTCAGAGACACCGCCATGGAAGCAGCCACTGGTTCAACTTCAACCCTTGCCATTGGAATTGGATCGCCATTTCGGGACACCGACCCCAGGCCCCCTGCCTCCCGTTCCCTTTATTTTCCTTCCGAATCCCTCATCTCTGTTCCTCATTATCGGTCCTTCGTTTCTCCATCAAAACTTGGAAAGAAGTCGATTACTCTCCCCTGTAGTGGCCGGGGTCGGGGATTGGGATTCCCAATGGTTAAAGCGTCTCTGTCTCCGGATCCGGATGGTTCTGCTGCCCAAATTGCTCCACTTCGGCTCCAGTCTCCAATTGGCCAGTTTCTGTCTCAAATCCTGACTACCCATCCCCACCTTCTTCCTGCAGCCGTCGACCAGCAGCTTCAACAGCTGCAGACCCAACGTCATGCTGAAGAACTAACTCAAGAGCCCTCCGCTTCAGCTACTCATGACATTGTCTTGTACAGGAGGATTGCTGAGGTTAAGGCAATTGAAAGGAAGAGGGCCTTAGAAGAGATATTATATGCAATGGTGGTGCAACGATTCATGGACGCCGATGTTCCTCTAATACCAGCTGTTGCCCCGTCGTCTACGGATCCATATGGCCGAGTTGACACATGGGCACGAGATGATGAAAAGCTGGAGCGGCTTCACTCGTCGGAAGCAAGCGAAATGATTCAGAACCACCTAGCGCTGGTTTTGGGGAATCGGATTGGTGACTTTGCGTCAGTAGCGCAGATAAGCAAACTAAGAGTGGGGCAGGTGTATGCTGCGTCTGTGATGTATGGGTACTTCCTCAAGCGAGTGGATGAGAGATTTCAGCTTGAGAAGACTGTGAAAGTGCTACCAGCCAGTGCAACTGTTGAGGGCTCCTTCTCCAATGCACCAGTGCATCCTGAAATCTCTTCCATGGCAGCTGAACAGGGAGATGTTAGTCCTGGGGAGTCGGGTATGGGGATCAAGCCCTCCCGACTGCGAACATACGTAATGTCATTTGATGGGGAGACACTGCAGAGATTTGCCACAATAAGGTCAAAAGAGGCCGTTAGCATCATTGAGAGACACACGGAGGCCTTGTTTGGAAGACCCCAGATTGCAATCACCCCGCAAGGAACAGTAGATACCTCCAAAGACGAGCTTATCAAAATCAGCTTTGGTGGGTTGAAGAGACTAGTTTTGGAAGCCGTGACTTTCGGTTCTTTCCTGTGGGATGTGGAGACGTATGTGGACTCCAGGTATCATTTTGTCATGAATTGAGACATGAACTTACTCGTGCATTCTGGACAATGGATGCCTCCAGCGAATGTAATGTTAACTGTATATGTTTTGTTTAACTCAACCATTTACATTTTCACATTTTAAAATTATCTTATTTTTTTAAAGACAGAATGTCAAAATTGATGGACAGGAACTTTAAACGATACTTAACAAGTTCATATAATTTTAGAGGAGGAACTAAGTTACGTGAGAGATACAATTTAACAGAGTTTTGATATTTACATCTCAGAAGAAGAACGAAGTAACATGAATATGATACAAAGTTCAATATTTATTCAAATTTTTTTAGCA

Coding sequence (CDS)

ATGGAAGCAGCCACTGGTTCAACTTCAACCCTTGCCATTGGAATTGGATCGCCATTTCGGGACACCGACCCCAGGCCCCCTGCCTCCCGTTCCCTTTATTTTCCTTCCGAATCCCTCATCTCTGTTCCTCATTATCGGTCCTTCGTTTCTCCATCAAAACTTGGAAAGAAGTCGATTACTCTCCCCTGTAGTGGCCGGGGTCGGGGATTGGGATTCCCAATGGTTAAAGCGTCTCTGTCTCCGGATCCGGATGGTTCTGCTGCCCAAATTGCTCCACTTCGGCTCCAGTCTCCAATTGGCCAGTTTCTGTCTCAAATCCTGACTACCCATCCCCACCTTCTTCCTGCAGCCGTCGACCAGCAGCTTCAACAGCTGCAGACCCAACGTCATGCTGAAGAACTAACTCAAGAGCCCTCCGCTTCAGCTACTCATGACATTGTCTTGTACAGGAGGATTGCTGAGGTTAAGGCAATTGAAAGGAAGAGGGCCTTAGAAGAGATATTATATGCAATGGTGGTGCAACGATTCATGGACGCCGATGTTCCTCTAATACCAGCTGTTGCCCCGTCGTCTACGGATCCATATGGCCGAGTTGACACATGGGCACGAGATGATGAAAAGCTGGAGCGGCTTCACTCGTCGGAAGCAAGCGAAATGATTCAGAACCACCTAGCGCTGGTTTTGGGGAATCGGATTGGTGACTTTGCGTCAGTAGCGCAGATAAGCAAACTAAGAGTGGGGCAGGTGTATGCTGCGTCTGTGATGTATGGGTACTTCCTCAAGCGAGTGGATGAGAGATTTCAGCTTGAGAAGACTGTGAAAGTGCTACCAGCCAGTGCAACTGTTGAGGGCTCCTTCTCCAATGCACCAGTGCATCCTGAAATCTCTTCCATGGCAGCTGAACAGGGAGATGTTAGTCCTGGGGAGTCGGGTATGGGGATCAAGCCCTCCCGACTGCGAACATACGTAATGTCATTTGATGGGGAGACACTGCAGAGATTTGCCACAATAAGGTCAAAAGAGGCCGTTAGCATCATTGAGAGACACACGGAGGCCTTGTTTGGAAGACCCCAGATTGCAATCACCCCGCAAGGAACAGTAGATACCTCCAAAGACGAGCTTATCAAAATCAGCTTTGGTGGGTTGAAGAGACTAGTTTTGGAAGCCGTGACTTTCGGTTCTTTCCTGTGGGATGTGGAGACGTATGTGGACTCCAGGTATCATTTTGTCATGAATTGA

Protein sequence

MEAATGSTSTLAIGIGSPFRDTDPRPPASRSLYFPSESLISVPHYRSFVSPSKLGKKSITLPCSGRGRGLGFPMVKASLSPDPDGSAAQIAPLRLQSPIGQFLSQILTTHPHLLPAAVDQQLQQLQTQRHAEELTQEPSASATHDIVLYRRIAEVKAIERKRALEEILYAMVVQRFMDADVPLIPAVAPSSTDPYGRVDTWARDDEKLERLHSSEASEMIQNHLALVLGNRIGDFASVAQISKLRVGQVYAASVMYGYFLKRVDERFQLEKTVKVLPASATVEGSFSNAPVHPEISSMAAEQGDVSPGESGMGIKPSRLRTYVMSFDGETLQRFATIRSKEAVSIIERHTEALFGRPQIAITPQGTVDTSKDELIKISFGGLKRLVLEAVTFGSFLWDVETYVDSRYHFVMN
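
The relationship between the coding sequence and the protein sequence can be checked with a short translation routine. This is a sketch only, assuming the standard genetic code; `translate` and `CODON_TABLE` are illustrative names, and only the first and last few codons of the CDS are used here rather than the full sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (third base varies fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate a DNA coding sequence codon by codon, ignoring a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))

# First six codons of the CDS listed above.
print(translate("ATGGAAGCAGCCACTGGT"))
# Last three codons of the CDS listed above, ending in the stop codon.
print(translate("ATGAATTGA"))
```

The first call yields `MEAATG`, matching the start of the protein sequence, and the second yields `MN*`, matching its final two residues followed by the stop.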
BLAST of Cp4.1LG20g00040 vs. Swiss-Prot
Match: UVB31_ARATH (UV-B-induced protein At3g17800, chloroplastic OS=Arabidopsis thaliana GN=At3g17800 PE=2 SV=1)

HSP 1 Score: 426.0 bits (1094), Expect = 4.6e-118
Identity = 230/350 (65.71%), Positives = 275/350 (78.57%), Query Frame = 1

Query: 77  ASLSPDPDGSAAQIAPLRLQSPIGQFLSQILTTHPHLLPAAVDQQLQQLQTQRHAEELTQ 136
           AS       S   IAPL+LQSP GQFLSQIL +HPHL+PAAV+QQL+QLQT R ++   +
Sbjct: 79  ASNDASSGSSPKPIAPLQLQSPAGQFLSQILVSHPHLVPAAVEQQLEQLQTDRDSQGQNK 138

Query: 137 EPSASATHDIVLYRRIAEVKAIERKRALEEILYAMVVQRFMDADVPLIPAVAPSSTDPYG 196
           + ++    DIVLYRRIAE+K  ER+R LEEILYA+VVQ+FM+A+V L+P+V+PSS DP G
Sbjct: 139 DSASVPGTDIVLYRRIAELKENERRRTLEEILYALVVQKFMEANVSLVPSVSPSS-DPSG 198

Query: 197 RVDTWARDDEKLERLHSSEASEMIQNHLALVLGNRIGDFASVAQISKLRVGQVYAASVMY 256
           RVDTW    EKLERLHS E  EMI NHLAL+LG+R+GD  SVAQISKLRVGQVYAASVMY
Sbjct: 199 RVDTWPTKVEKLERLHSPEMYEMIHNHLALILGSRMGDLNSVAQISKLRVGQVYAASVMY 258

Query: 257 GYFLKRVDERFQLEKTVKVLPA--------------SATVEGSFSNAPVHPEISSMAAEQ 316
           GYFLKRVD+RFQLEKT+K+LP               +AT + + S+   HPE+ + A   
Sbjct: 259 GYFLKRVDQRFQLEKTMKILPGGSDESKTSVEQAEGTATYQAAVSS---HPEVGAFA--- 318

Query: 317 GDVSPGESGMGIKPSRLRTYVMSFDGETLQRFATIRSKEAVSIIERHTEALFGRPQIAIT 376
           G VS    G  IKPSRLR+YVMSFD ETLQR+ATIRS+EAV IIE+HTEALFG+P+I IT
Sbjct: 319 GGVSAKGFGSEIKPSRLRSYVMSFDAETLQRYATIRSREAVGIIEKHTEALFGKPEIVIT 378

Query: 377 PQGTVDTSKDELIKISFGGLKRLVLEAVTFGSFLWDVETYVDSRYHFVMN 413
           P+GTVD+SKDE IKISFGG+KRLVLEAVTFGSFLWDVE++VD+RYHFV+N
Sbjct: 379 PEGTVDSSKDEQIKISFGGMKRLVLEAVTFGSFLWDVESHVDARYHFVLN 421
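
The percentages in an HSP header follow directly from the two counts reported with them. The sketch below recomputes them for the Swiss-Prot hit above; `percent` is a hypothetical helper, and the denominator is taken to be the alignment length as reported in the header.

```python
def percent(count, aligned_length):
    """Share of alignment columns, as a percentage rounded to two decimals."""
    return round(100.0 * count / aligned_length, 2)

# Counts taken from the HSP header: 230 identical and 275 positive-scoring
# columns out of 350 aligned columns.
identity = percent(230, 350)
positives = percent(275, 350)
print(identity, positives)
```

This reproduces the reported 65.71% identity and 78.57% positives.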

BLAST of Cp4.1LG20g00040 vs. TrEMBL
Match: A0A061FWL1_THECC (Uncharacterized protein isoform 2 OS=Theobroma cacao GN=TCM_013026 PE=4 SV=1)

HSP 1 Score: 491.9 bits (1265), Expect = 7.7e-136
Identity = 281/444 (63.29%), Positives = 323/444 (72.75%), Query Frame = 1

Query: 1   MEAATGSTSTLAIGIGSPFRDTDPRPPAS--RSLYFPSESLISVPHYRSFV--------- 60
           M+AAT S S     +GS    T  RPP+S  RS    +      PH+  F          
Sbjct: 1   MDAATASASV----VGSSM--TTRRPPSSVTRSAILTANE----PHFLRFAAKPRLPFSI 60

Query: 61  ---SPSKLGKKSITLPCSGRGRGLGFPMVKASLSPDPDGSAAQIAPLRLQSPIGQFLSQI 120
              SP    K        G  RG+   +V+AS SPD  G  A IAPL+++SPIGQFLSQI
Sbjct: 61  KHYSPLSYSKPQNRRMALGSRRGM---VVRASSSPDSAGPTAPIAPLQMESPIGQFLSQI 120

Query: 121 LTTHPHLLPAAVDQQLQQLQTQRHAEELTQEPSASATHDIVLYRRIAEVKAIERKRALEE 180
           L +HPHL+PAAV+QQL+QLQT R AEE  +EPSASA  D+VLYRRIAEVKA ERK+ALEE
Sbjct: 121 LISHPHLVPAAVEQQLEQLQTDRDAEEKKEEPSASAGTDLVLYRRIAEVKANERKKALEE 180

Query: 181 ILYAMVVQRFMDADVPLIPAVAPSSTDPYGRVDTWARDDEKLERLHSSEASEMIQNHLAL 240
           ILYA+VVQ+FMDA+V L+PA+ PSSTDP GRVD W  +++KLE LHS EA EMIQNHLAL
Sbjct: 181 ILYALVVQKFMDANVSLVPAMTPSSTDPSGRVDMWPSEEDKLELLHSPEAYEMIQNHLAL 240

Query: 241 VLGNRIGDFASVAQISKLRVGQVYAASVMYGYFLKRVDERFQLEKTVKVLP--------- 300
           +LGNR+GD  SVAQISKLRVGQVYAASVMYGYFLKRVD+RFQLEKT+K+LP         
Sbjct: 241 ILGNRLGDSTSVAQISKLRVGQVYAASVMYGYFLKRVDQRFQLEKTMKILPNASNGEESG 300

Query: 301 ---------ASATVEGSFSNAPVHPEISSMAAEQGDVSPGESGMGIKPSRLRTYVMSFDG 360
                     +A +  S+     HPE+SS +   G +SPG  G GIKP RLRTYVMSFDG
Sbjct: 301 VEQSVGEDMGTAGLGDSYKAVSSHPEVSSWS---GGISPGGFGHGIKPCRLRTYVMSFDG 360

Query: 361 ETLQRFATIRSKEAVSIIERHTEALFGRPQIAITPQGTVDTSKDELIKISFGGLKRLVLE 413
           ETLQ+FA IRSKEAVSIIE+HTEALFGRP+I ITPQGTVD+SKDELIKISF GLKRLVLE
Sbjct: 361 ETLQKFAAIRSKEAVSIIEKHTEALFGRPEIVITPQGTVDSSKDELIKISFNGLKRLVLE 420

BLAST of Cp4.1LG20g00040 vs. TrEMBL
Match: A0A061FX61_THECC (Uncharacterized protein isoform 1 OS=Theobroma cacao GN=TCM_013026 PE=4 SV=1)

HSP 1 Score: 478.4 bits (1230), Expect = 8.8e-132
Identity = 275/438 (62.79%), Positives = 317/438 (72.37%), Query Frame = 1

Query: 1   MEAATGSTSTLAIGIGSPFRDTDPRPPAS--RSLYFPSESLISVPHYRSFV--------- 60
           M+AAT S S     +GS    T  RPP+S  RS    +      PH+  F          
Sbjct: 1   MDAATASASV----VGSSM--TTRRPPSSVTRSAILTANE----PHFLRFAAKPRLPFSI 60

Query: 61  ---SPSKLGKKSITLPCSGRGRGLGFPMVKASLSPDPDGSAAQIAPLRLQSPIGQFLSQI 120
              SP    K        G  RG+   +V+AS SPD  G  A IAPL+++SPIGQFLSQI
Sbjct: 61  KHYSPLSYSKPQNRRMALGSRRGM---VVRASSSPDSAGPTAPIAPLQMESPIGQFLSQI 120

Query: 121 LTTHPHLLPAAVDQQLQQLQTQRHAEELTQEPSASATHDIVLYRRIAEVKAIERKRALEE 180
           L +HPHL+PAAV+QQL+QLQT R AEE  +EPSASA  D+VLYRRIAEVKA ERK+ALEE
Sbjct: 121 LISHPHLVPAAVEQQLEQLQTDRDAEEKKEEPSASAGTDLVLYRRIAEVKANERKKALEE 180

Query: 181 ILYAMVVQRFMDADVPLIPAVAPSSTDPYGRVDTWARDDEKLERLHSSEASEMIQNHLAL 240
           ILYA+VVQ+FMDA+V L+PA+ PSSTDP GRVD W  +++KLE LHS EA EMIQNHLAL
Sbjct: 181 ILYALVVQKFMDANVSLVPAMTPSSTDPSGRVDMWPSEEDKLELLHSPEAYEMIQNHLAL 240

Query: 241 VLGNRIGDFASVAQISKLRVGQVYAASVMYGYFLKRVDERFQLEKTVKVLP--------- 300
           +LGNR+GD  SVAQISKLRVGQVYAASVMYGYFLKRVD+RFQLEKT+K+LP         
Sbjct: 241 ILGNRLGDSTSVAQISKLRVGQVYAASVMYGYFLKRVDQRFQLEKTMKILPNASNGEESG 300

Query: 301 ---------ASATVEGSFSNAPVHPEISSMAAEQGDVSPGESGMGIKPSRLRTYVMSFDG 360
                     +A +  S+     HPE+SS +   G +SPG  G GIKP RLRTYVMSFDG
Sbjct: 301 VEQSVGEDMGTAGLGDSYKAVSSHPEVSSWS---GGISPGGFGHGIKPCRLRTYVMSFDG 360

Query: 361 ETLQRFATIRSKEAVSIIERHTEALFGRPQIAITPQGTVDTSKDELIKISFGGLKRLVLE 407
           ETLQ+FA IRSKEAVSIIE+HTEALFGRP+I ITPQGTVD+SKDELIKISF GLKRLVLE
Sbjct: 361 ETLQKFAAIRSKEAVSIIEKHTEALFGRPEIVITPQGTVDSSKDELIKISFNGLKRLVLE 420

BLAST of Cp4.1LG20g00040 vs. TrEMBL
Match: A0A0B0NHU4_GOSAR (Alanine--tRNA ligase OS=Gossypium arboreum GN=F383_01965 PE=4 SV=1)

HSP 1 Score: 478.0 bits (1229), Expect = 1.1e-131
Identity = 263/384 (68.49%), Positives = 307/384 (79.95%), Query Frame = 1

Query: 47  SFVSPSKLGKKSITLPCSGRGRGLGFPMVKASLSPDPDGSAAQIAPLRLQSPIGQFLSQI 106
           S VS SK   + + L   G  RG+   +VKAS SPD     AQIAPLR++SPIGQFLSQI
Sbjct: 56  SSVSYSKSRNRRVGL---GGRRGM---VVKASSSPDSAEPNAQIAPLRMESPIGQFLSQI 115

Query: 107 LTTHPHLLPAAVDQQLQQLQTQRHAEELTQEPSASATHDIVLYRRIAEVKAIERKRALEE 166
           L +HPHL+PAAV+QQL+QLQ+ R  +E  +EPSAS T D+VLYRRIAEVKA ERKRALEE
Sbjct: 116 LISHPHLVPAAVEQQLEQLQSDRDTDEKKEEPSASGT-DLVLYRRIAEVKANERKRALEE 175

Query: 167 ILYAMVVQRFMDADVPLIPAVAPSSTDPYGRVDTWARDDEKLERLHSSEASEMIQNHLAL 226
           ILYA+VVQ+FMDA++ L+PA+  SS DP GRVDTW   ++KLE+LHS+EA EMIQNHLAL
Sbjct: 176 ILYALVVQKFMDANISLVPAIT-SSADPSGRVDTWPSQEDKLEQLHSAEAHEMIQNHLAL 235

Query: 227 VLGNRIGDFASVAQISKLRVGQVYAASVMYGYFLKRVDERFQLEKTVKVLPASATVE--- 286
           +LGNR+GD  SVAQISKLRVGQVYAASVMYGYFL+RVD+RFQLEKT+KVLP+++  +   
Sbjct: 236 ILGNRLGDSTSVAQISKLRVGQVYAASVMYGYFLRRVDQRFQLEKTMKVLPSASDGDKSS 295

Query: 287 ---------------GSFSNAPVHPEISSMAAEQGDVSPGESGMGIKPSRLRTYVMSFDG 346
                           S+  A  H E+SS +   G +SPG  G GIKPSRLRTYVMSFDG
Sbjct: 296 IEQTVGDDTRPSGLGDSYQAASSHAEVSSWS---GGISPGGFGSGIKPSRLRTYVMSFDG 355

Query: 347 ETLQRFATIRSKEAVSIIERHTEALFGRPQIAITPQGTVDTSKDELIKISFGGLKRLVLE 406
           ETLQR+A+IRSKEAV IIE+HTEALFGRP+IAITPQGTVD+S DELIKISFGGLKRLVLE
Sbjct: 356 ETLQRYASIRSKEAVGIIEKHTEALFGRPEIAITPQGTVDSSNDELIKISFGGLKRLVLE 415

Query: 407 AVTFGSFLWDVETYVDSRYHFVMN 413
           AVTFGSFLWDVE++VDSRYHFVMN
Sbjct: 416 AVTFGSFLWDVESFVDSRYHFVMN 428

BLAST of Cp4.1LG20g00040 vs. TrEMBL
Match: A0A0D2SGC6_GOSRA (Uncharacterized protein OS=Gossypium raimondii GN=B456_005G134400 PE=4 SV=1)

HSP 1 Score: 474.2 bits (1219), Expect = 1.7e-130
Identity = 260/384 (67.71%), Positives = 307/384 (79.95%), Query Frame = 1

Query: 47  SFVSPSKLGKKSITLPCSGRGRGLGFPMVKASLSPDPDGSAAQIAPLRLQSPIGQFLSQI 106
           S VS SK   + + L   G  RG+   +VKAS SPD     AQIAPLR++SPIGQFLSQI
Sbjct: 50  SSVSYSKSRNRRMGL---GGRRGM---VVKASSSPDSAEPNAQIAPLRMESPIGQFLSQI 109

Query: 107 LTTHPHLLPAAVDQQLQQLQTQRHAEELTQEPSASATHDIVLYRRIAEVKAIERKRALEE 166
           L +HPHL+PAAV+QQL+QLQT R  +E  +EPSAS T D+VLYRRIAEVKA ERKRALEE
Sbjct: 110 LISHPHLVPAAVEQQLEQLQTDRDTDEKKEEPSASGT-DLVLYRRIAEVKANERKRALEE 169

Query: 167 ILYAMVVQRFMDADVPLIPAVAPSSTDPYGRVDTWARDDEKLERLHSSEASEMIQNHLAL 226
           ILYA+VVQ+FMDA++ L+PA+  SS DP GRVDTW   ++KLE++HS+EA EMIQNH+AL
Sbjct: 170 ILYALVVQKFMDANISLVPAIT-SSADPSGRVDTWPSQEDKLEQIHSAEAHEMIQNHVAL 229

Query: 227 VLGNRIGDFASVAQISKLRVGQVYAASVMYGYFLKRVDERFQLEKTVKVLPASATVE--- 286
           +LGNR+G+  SVAQISKLRVGQVYAASVMYGYFL+RVD+RFQLE+T+KVLP+++  +   
Sbjct: 230 ILGNRLGESTSVAQISKLRVGQVYAASVMYGYFLRRVDQRFQLERTMKVLPSASDDDKSS 289

Query: 287 ---------------GSFSNAPVHPEISSMAAEQGDVSPGESGMGIKPSRLRTYVMSFDG 346
                           S+  A  HPE+SS +   G +S G  G GIKPSRLRTYVMSFDG
Sbjct: 290 IEQTVGDDTRPSGLGDSYQAASSHPEVSSWS---GGISSGGFGSGIKPSRLRTYVMSFDG 349

Query: 347 ETLQRFATIRSKEAVSIIERHTEALFGRPQIAITPQGTVDTSKDELIKISFGGLKRLVLE 406
           ETLQR+A+IRSKEAV IIE+HTEALFGRP+IAITPQGTVD+S DELIKISFGGLKRLVLE
Sbjct: 350 ETLQRYASIRSKEAVGIIEKHTEALFGRPEIAITPQGTVDSSNDELIKISFGGLKRLVLE 409

Query: 407 AVTFGSFLWDVETYVDSRYHFVMN 413
           AVTFGSFLWDVE++VDSRYHFVMN
Sbjct: 410 AVTFGSFLWDVESFVDSRYHFVMN 422

BLAST of Cp4.1LG20g00040 vs. TrEMBL
Match: W9QPD8_9ROSA (Uncharacterized protein OS=Morus notabilis GN=L484_019353 PE=4 SV=1)

HSP 1 Score: 471.5 bits (1212), Expect = 1.1e-129
Identity = 265/402 (65.92%), Positives = 310/402 (77.11%), Query Frame = 1

Query: 40  ISVPHYRSFVSPSKLGKKSITLPCSGRGRGLGFPMVKASLSPDPDGSAAQIAPLRLQSPI 99
           +S+ H  S  S SKLG K I+    G  R   F +V+AS S D   S + IAPL+L+SP+
Sbjct: 44  LSMKHKTS--SRSKLGHKRISF---GSRR---FLLVRASTSSDSGSSDSPIAPLQLESPV 103

Query: 100 GQFLSQILTTHPHLLPAAVDQQLQQLQTQRHAE---------ELTQEPSASATHDIVLYR 159
           GQFLSQIL +HPHL+PAAV+QQL+QLQT R A          E ++EPSA+ T D+ LYR
Sbjct: 104 GQFLSQILMSHPHLVPAAVEQQLEQLQTDRDAAQQLQTDCDAEKSEEPSATGT-DLALYR 163

Query: 160 RIAEVKAIERKRALEEILYAMVVQRFMDADVPLIPAVAPSSTDPYGRVDTWARDDEKLER 219
           RIAEVKA ER++ALEEILYA+VVQ+FMDA+V L+P++  S++DP G VD+W   DEKLE+
Sbjct: 164 RIAEVKANERRKALEEILYALVVQKFMDANVSLVPSIETSASDPSGCVDSWPSQDEKLEQ 223

Query: 220 LHSSEASEMIQNHLALVLGNRIGDFASVAQISKLRVGQVYAASVMYGYFLKRVDERFQLE 279
           LHS EA EMIQNHLAL+LGNR+GD  SVAQISKLRVGQVYAASVMYGYFLKRVD+RFQLE
Sbjct: 224 LHSPEAYEMIQNHLALILGNRLGDSTSVAQISKLRVGQVYAASVMYGYFLKRVDQRFQLE 283

Query: 280 KTVKVLPASATVEG--------------------SFSNAPVHPEISSMAAEQGDVSPGES 339
           KT+K+LP   T++G                    SF  AP HPE+SS A   G  SPG  
Sbjct: 284 KTMKILP--NTLDGDDTNVQQAVGDDSRPLGGGESFQAAPSHPEVSSWA---GGTSPGGF 343

Query: 340 GMGIKPSRLRTYVMSFDGETLQRFATIRSKEAVSIIERHTEALFGRPQIAITPQGTVDTS 399
           G G+KPSRLRTYVMSFDGETLQR+ATIRSKEAVSIIE+HTEALFGRP+I ITPQGTVD+S
Sbjct: 344 GHGMKPSRLRTYVMSFDGETLQRYATIRSKEAVSIIEKHTEALFGRPEIVITPQGTVDSS 403

Query: 400 KDELIKISFGGLKRLVLEAVTFGSFLWDVETYVDSRYHFVMN 413
           KDELIKISF GLKRLVLEAVTFGSFLWDVE+YVD+RYHFV+N
Sbjct: 404 KDELIKISFAGLKRLVLEAVTFGSFLWDVESYVDARYHFVLN 431

BLAST of Cp4.1LG20g00040 vs. TAIR10
Match: AT3G17800.2 (AT3G17800.2 Protein of unknown function (DUF760))

HSP 1 Score: 426.0 bits (1094), Expect = 2.6e-119
Identity = 230/350 (65.71%), Positives = 275/350 (78.57%), Query Frame = 1

Query: 77  ASLSPDPDGSAAQIAPLRLQSPIGQFLSQILTTHPHLLPAAVDQQLQQLQTQRHAEELTQ 136
           AS       S   IAPL+LQSP GQFLSQIL +HPHL+PAAV+QQL+QLQT R ++   +
Sbjct: 85  ASNDASSGSSPKPIAPLQLQSPAGQFLSQILVSHPHLVPAAVEQQLEQLQTDRDSQGQNK 144

Query: 137 EPSASATHDIVLYRRIAEVKAIERKRALEEILYAMVVQRFMDADVPLIPAVAPSSTDPYG 196
           + ++    DIVLYRRIAE+K  ER+R LEEILYA+VVQ+FM+A+V L+P+V+PSS DP G
Sbjct: 145 DSASVPGTDIVLYRRIAELKENERRRTLEEILYALVVQKFMEANVSLVPSVSPSS-DPSG 204

Query: 197 RVDTWARDDEKLERLHSSEASEMIQNHLALVLGNRIGDFASVAQISKLRVGQVYAASVMY 256
           RVDTW    EKLERLHS E  EMI NHLAL+LG+R+GD  SVAQISKLRVGQVYAASVMY
Sbjct: 205 RVDTWPTKVEKLERLHSPEMYEMIHNHLALILGSRMGDLNSVAQISKLRVGQVYAASVMY 264

Query: 257 GYFLKRVDERFQLEKTVKVLPA--------------SATVEGSFSNAPVHPEISSMAAEQ 316
           GYFLKRVD+RFQLEKT+K+LP               +AT + + S+   HPE+ + A   
Sbjct: 265 GYFLKRVDQRFQLEKTMKILPGGSDESKTSVEQAEGTATYQAAVSS---HPEVGAFA--- 324

Query: 317 GDVSPGESGMGIKPSRLRTYVMSFDGETLQRFATIRSKEAVSIIERHTEALFGRPQIAIT 376
           G VS    G  IKPSRLR+YVMSFD ETLQR+ATIRS+EAV IIE+HTEALFG+P+I IT
Sbjct: 325 GGVSAKGFGSEIKPSRLRSYVMSFDAETLQRYATIRSREAVGIIEKHTEALFGKPEIVIT 384

Query: 377 PQGTVDTSKDELIKISFGGLKRLVLEAVTFGSFLWDVETYVDSRYHFVMN 413
           P+GTVD+SKDE IKISFGG+KRLVLEAVTFGSFLWDVE++VD+RYHFV+N
Sbjct: 385 PEGTVDSSKDEQIKISFGGMKRLVLEAVTFGSFLWDVESHVDARYHFVLN 427

BLAST of Cp4.1LG20g00040 vs. TAIR10
Match: AT1G48450.1 (AT1G48450.1 Protein of unknown function (DUF760))

HSP 1 Score: 415.2 bits (1066), Expect = 4.6e-116
Identity = 228/357 (63.87%), Positives = 273/357 (76.47%), Query Frame = 1

Query: 74  MVKASLSPDPDGSAAQIAPLRLQSPIGQFLSQILTTHPHLLPAAVDQQLQQLQTQRHAEE 133
           +VKAS S D   S   IAPL+L+SP+GQFLSQIL +HPHL+PAAV+QQL+QLQ  R AEE
Sbjct: 69  VVKASASGD--ASTESIAPLQLKSPVGQFLSQILVSHPHLVPAAVEQQLEQLQIDRDAEE 128

Query: 134 LTQEPSASATHDIVLYRRIAEVKAIERKRALEEILYAMVVQRFMDADVPLIPAVAPSSTD 193
            +++ S+    DIVLYRRIAEVK  ER+RALEEILYA+VVQ+FMDA+V L+P++  SS D
Sbjct: 129 QSKDASSVLGTDIVLYRRIAEVKEKERRRALEEILYALVVQKFMDANVTLVPSITSSSAD 188

Query: 194 PYGRVDTWARDDEKLERLHSSEASEMIQNHLALVLGNRIGDFASVAQISKLRVGQVYAAS 253
           P GRVDTW   D +LERLHS E  EMIQNHL+++L NR  D  +VAQISKL VGQVYAAS
Sbjct: 189 PSGRVDTWPTLDGELERLHSPEVYEMIQNHLSIILKNRTDDLTAVAQISKLGVGQVYAAS 248

Query: 254 VMYGYFLKRVDERFQLEKTVKVLP------------ASATVEGSF-SNAPVHPEISSMAA 313
           VMYGYFLKR+D+RFQLEKT+++LP            A   VE +F   A    +  S   
Sbjct: 249 VMYGYFLKRIDQRFQLEKTMRILPGGSDEGETSIEQAGRDVERNFYEEAEETYQAVSSNQ 308

Query: 314 EQGDVSPGESGMG-----IKPSRLRTYVMSFDGETLQRFATIRSKEAVSIIERHTEALFG 373
           + G    G +  G     +K SRL+TYVMSFDGETLQR+ATIRS+E+V IIE+HTEALFG
Sbjct: 309 DVGSFVGGINASGGFSSDMKQSRLKTYVMSFDGETLQRYATIRSRESVGIIEKHTEALFG 368

Query: 374 RPQIAITPQGTVDTSKDELIKISFGGLKRLVLEAVTFGSFLWDVETYVDSRYHFVMN 413
           RP+I ITPQGT+D+SKDE IKISF GLKRLVLEAVTFGSFLWDVE++VDSRYHFV+N
Sbjct: 369 RPEIVITPQGTIDSSKDEHIKISFKGLKRLVLEAVTFGSFLWDVESHVDSRYHFVLN 423

BLAST of Cp4.1LG20g00040 vs. TAIR10
Match: AT1G32160.1 (AT1G32160.1 Protein of unknown function (DUF760))

HSP 1 Score: 303.1 bits (775), Expect = 2.6e-82
Identity = 179/389 (46.02%), Positives = 253/389 (65.04%), Query Frame = 1

Query: 35  PSESLISVPHY-RSFVSPSKLGKKSITLPCSGRGRGLGFPMVKASLSPDPDGSAAQIAPL 94
           PS S   +P    SF  P KLG  S     +GRGR +    V+AS   D + + A +AP+
Sbjct: 28  PSSSPSLLPQRCHSFCIP-KLGSSSTNE--NGRGRSV---TVRASGDEDSNENFAPLAPV 87

Query: 95  RLQSPIGQFLSQILTTHPHLLPAAVDQQLQQLQTQRHAEELTQEPSASATHDIVLYRRIA 154
            L+SP+GQ L QIL THPHLLP  VD+QL++      AE  +++  +S+T DI L +RI+
Sbjct: 88  ELESPVGQLLEQILRTHPHLLPVTVDEQLEKFA----AESESRKADSSSTQDI-LQKRIS 147

Query: 155 EVKAIERKRALEEILYAMVVQRFMDADVPLIPAVAPSSTDPYGRVDTWARDDEKLERLHS 214
           EV+  ER++ L EI+Y +VV RF++  + +IP + P+S DP GR+D W   +EKLE +HS
Sbjct: 148 EVRDKERRKTLAEIIYCLVVHRFVEKGISMIPRIKPTS-DPAGRIDLWPNQEEKLEVIHS 207

Query: 215 SEASEMIQNHLALVLGN--RIGDFASVAQISKLRVGQVYAASVMYGYFLKRVDERFQLEK 274
           ++A EMIQ+HL+ VLG+   +G  +S+ QI K+++G++YAAS MYGYFL+RVD+R+QLE+
Sbjct: 208 ADAFEMIQSHLSSVLGDGPAVGPLSSIVQIGKIKLGKLYAASAMYGYFLRRVDQRYQLER 267

Query: 275 TVKVLPA--SATVEGSFSNAPVHP---EISSMAAEQGDVSPGESGMGIKPSR-----LRT 334
           T+  LP     T E     +P +P     S +  +  +  P E  +           LR+
Sbjct: 268 TMNTLPKRPEKTRERFEEPSPPYPLWDPDSLIRIQPEEYDPDEYAIQRNEDESSSYGLRS 327

Query: 335 YVMSFDGETLQRFATIRSKEAVSIIERHTEALFGRPQIAITPQGTVDTSKDELIKISFGG 394
           YV   D +TLQR+ATIRSKEA+++IE+ T+ALFGRP I I   G +DTS DE++ +S  G
Sbjct: 328 YVTYLDSDTLQRYATIRSKEAMTLIEKQTQALFGRPDIRILEDGKLDTSNDEVLSLSVSG 387

Query: 395 LKRLVLEAVTFGSFLWDVETYVDSRYHFV 411
           L  LVLEAV FGSFLWD E+YV+S+YHF+
Sbjct: 388 LAMLVLEAVAFGSFLWDSESYVESKYHFL 404

BLAST of Cp4.1LG20g00040 vs. TAIR10
Match: AT3G07310.1 (AT3G07310.1 Protein of unknown function (DUF760))

HSP 1 Score: 167.9 bits (424), Expect = 1.3e-41
Identity = 113/326 (34.66%), Positives = 173/326 (53.07%), Query Frame = 1

Query: 91  APLRLQSPIGQFLSQILTTHPHLLPAAVDQQLQQLQTQRHAEELTQEPSASATHDIVLYR 150
           APL  +S  G+FL  +L     L   A   +L+QL   R A  L +   +S + +  L+R
Sbjct: 61  APLEPRSAQGRFLRSVLLNKRQLFHYAAADELKQLADDREAA-LARMSLSSGSDEASLHR 120

Query: 151 RIAEVKAIERKRALEEILYAMVVQRFMDADVPLIPAVAPSSTDPYGRVDTWARDDEKLER 210
           RIAE+K    K A+++I+Y ++  ++ +  VPL+P ++    +  GR++ W   D +LE 
Sbjct: 121 RIAELKERYCKTAVQDIMYMLIFYKYSEIRVPLVPKLSRCIYN--GRLEIWPSKDWELES 180

Query: 211 LHSSEASEMIQNHLALVLGNRIG----DFASVAQISKLRVGQVYAASVMYGYFLKRVDER 270
           ++S +  E+I+ H++ V+G R+     D  +  QI KL + +VYAAS++YGYFLK    R
Sbjct: 181 IYSCDTLEIIKEHVSAVIGLRVNSCVTDNWATTQIQKLHLRKVYAASILYGYFLKSASLR 240

Query: 271 FQLEKTVKVLPASATVEGSFSNAPVHPEISSMAAEQGDVSPGESGMGIKPSRLRTYVMSF 330
            QLE ++  +  S  ++            + ++ +Q               +LR Y+  F
Sbjct: 241 HQLECSLSDIHGSGYLKSPIFGCSFTTGTAQISNKQ---------------QLRHYISDF 300

Query: 331 DGETLQRFATIRSKEAVSIIERHTEALFGRPQIAITPQGTVDTSKDELIKISFGGLKRLV 390
           D ETLQR A  R++EA ++IE+ + ALFG  +             DE I  SF  LKRLV
Sbjct: 301 DPETLQRCAKPRTEEARNLIEKQSLALFGTEE------------SDETIVTSFSSLKRLV 356

Query: 391 LEAVTFGSFLWDVETYVDSRYHFVMN 413
           LEAV FG+FLWD E YVD  Y    N
Sbjct: 361 LEAVAFGTFLWDTELYVDGAYKLKEN 356

BLAST of Cp4.1LG20g00040 vs. TAIR10
Match: AT5G48590.1 (AT5G48590.1 Protein of unknown function (DUF760))

HSP 1 Score: 128.6 bits (322), Expect = 8.6e-30
Identity = 120/381 (31.50%), Positives = 184/381 (48.29%), Query Frame = 1

Query: 38  SLISVPHYRSFVSPSKLGKKSITLPCSGRGRGLGFPMVKASLSPDPDGSAAQIAPLRLQS 97
           SL   P  R+FV  ++ G   + LP   + R     +V A+ S    G +   APL  +S
Sbjct: 10  SLPFPPSRRNFVKQNR-GGDCVFLPSRRKFRYDSLVVVSAASS----GQSID-APLVPRS 69

Query: 98  PIGQFLSQILTTHPHLLPAAVDQQLQQLQTQRHAEELTQEPSASATHDIVLYRRIAEVKA 157
           P G+FLS +L     L   AV   L+QL   + A  L++   +  + +  L+RRIA++K 
Sbjct: 70  PQGRFLSSVLVKKRQLFHFAVADLLKQLADDKEAS-LSRMFLSYGSDEASLHRRIAQLKE 129

Query: 158 IERKRALEEILYAMVVQRFMDADVPLIPAVAPSSTDPYGRVDTWARDDEKLERLHSSEAS 217
            + + A+E+I+Y +++ +F +  VPL+P +     +  GR++     D +LE +HS +  
Sbjct: 130 SDCQIAIEDIMYMLILYKFSEIRVPLVPKLPSCIYN--GRLEISPSKDWELESIHSFDVL 189

Query: 218 EMIQNHLALVLGNRIG----DFASVAQISKLRVGQVYAASVMYGY--FLKRVDERFQLEK 277
           E+I+ H   V+  R+     D  +  +I K R+ +VY ASV+  Y  FLK          
Sbjct: 190 ELIKEHSNAVISLRVNSSLTDDCATTEIDKNRLSKVYTASVL--YGYFLK---------- 249

Query: 278 TVKVLPASATVEGSFSNAPVHPEISSMAAEQGDVSPGESGMGIKPSRLRTYVMSFDGETL 337
                  SA++         H    S++   G  +           +LR Y+  FD + L
Sbjct: 250 -------SASLR--------HQLECSLSQHHGSFT----------KQLRHYISEFDPKIL 309

Query: 338 QRFATIRSKEAVSIIERHTEALFGRPQIAITPQGTVDTSKDELIKISFGGLKRLVLEAVT 397
           +R A  RS EA S+IE+ + ALFG  +           S  E I  SF  LKRL+LEAV 
Sbjct: 310 RRCAKPRSHEAKSLIEKQSLALFGPEE-----------SSKESIVTSFSSLKRLLLEAVA 333

Query: 398 FGSFLWDVETYVDSRYHFVMN 413
           FG+FLWD E YVD  +    N
Sbjct: 370 FGTFLWDTEEYVDGAFKLKEN 333

BLAST of Cp4.1LG20g00040 vs. NCBI nr
Match: gi|590666388|ref|XP_007036962.1| (Uncharacterized protein isoform 2 [Theobroma cacao])

HSP 1 Score: 491.9 bits (1265), Expect = 1.1e-135
Identity = 281/444 (63.29%), Positives = 323/444 (72.75%), Query Frame = 1

Query: 1   MEAATGSTSTLAIGIGSPFRDTDPRPPAS--RSLYFPSESLISVPHYRSFV--------- 60
           M+AAT S S     +GS    T  RPP+S  RS    +      PH+  F          
Sbjct: 1   MDAATASASV----VGSSM--TTRRPPSSVTRSAILTANE----PHFLRFAAKPRLPFSI 60

Query: 61  ---SPSKLGKKSITLPCSGRGRGLGFPMVKASLSPDPDGSAAQIAPLRLQSPIGQFLSQI 120
              SP    K        G  RG+   +V+AS SPD  G  A IAPL+++SPIGQFLSQI
Sbjct: 61  KHYSPLSYSKPQNRRMALGSRRGM---VVRASSSPDSAGPTAPIAPLQMESPIGQFLSQI 120

Query: 121 LTTHPHLLPAAVDQQLQQLQTQRHAEELTQEPSASATHDIVLYRRIAEVKAIERKRALEE 180
           L +HPHL+PAAV+QQL+QLQT R AEE  +EPSASA  D+VLYRRIAEVKA ERK+ALEE
Sbjct: 121 LISHPHLVPAAVEQQLEQLQTDRDAEEKKEEPSASAGTDLVLYRRIAEVKANERKKALEE 180

Query: 181 ILYAMVVQRFMDADVPLIPAVAPSSTDPYGRVDTWARDDEKLERLHSSEASEMIQNHLAL 240
           ILYA+VVQ+FMDA+V L+PA+ PSSTDP GRVD W  +++KLE LHS EA EMIQNHLAL
Sbjct: 181 ILYALVVQKFMDANVSLVPAMTPSSTDPSGRVDMWPSEEDKLELLHSPEAYEMIQNHLAL 240

Query: 241 VLGNRIGDFASVAQISKLRVGQVYAASVMYGYFLKRVDERFQLEKTVKVLP--------- 300
           +LGNR+GD  SVAQISKLRVGQVYAASVMYGYFLKRVD+RFQLEKT+K+LP         
Sbjct: 241 ILGNRLGDSTSVAQISKLRVGQVYAASVMYGYFLKRVDQRFQLEKTMKILPNASNGEESG 300

Query: 301 ---------ASATVEGSFSNAPVHPEISSMAAEQGDVSPGESGMGIKPSRLRTYVMSFDG 360
                     +A +  S+     HPE+SS +   G +SPG  G GIKP RLRTYVMSFDG
Sbjct: 301 VEQSVGEDMGTAGLGDSYKAVSSHPEVSSWS---GGISPGGFGHGIKPCRLRTYVMSFDG 360

Query: 361 ETLQRFATIRSKEAVSIIERHTEALFGRPQIAITPQGTVDTSKDELIKISFGGLKRLVLE 413
           ETLQ+FA IRSKEAVSIIE+HTEALFGRP+I ITPQGTVD+SKDELIKISF GLKRLVLE
Sbjct: 361 ETLQKFAAIRSKEAVSIIEKHTEALFGRPEIVITPQGTVDSSKDELIKISFNGLKRLVLE 420

BLAST of Cp4.1LG20g00040 vs. NCBI nr
Match: gi|568880748|ref|XP_006493269.1| (PREDICTED: UV-B-induced protein At3g17800, chloroplastic-like [Citrus sinensis])

HSP 1 Score: 480.3 bits (1235), Expect = 3.3e-132
Identity = 272/439 (61.96%), Positives = 323/439 (73.58%), Query Frame = 1

Query: 1   MEAATGSTSTLAIGIGSPFRDTDPRPPASRSLY-------FPSESLISVPHYRSFVSPSK 60
           MEAA  S +  +IG+ S         P   S+Y       F ++SL+ + HY S  +P  
Sbjct: 1   MEAAAASVARSSIGLHS-------HRPVLFSVYSGPDFIRFGTKSLLPIKHYSSVSNPKP 60

Query: 61  LGKKSITLPCSGRGRGLGFP---MVKASLSPDPDGSAAQIAPLRLQSPIGQFLSQILTTH 120
             ++          +G G     +V+AS S +  GS   IAPL+L+SP+GQFLSQIL +H
Sbjct: 61  RHRR----------KGFGSRRCMVVRASSSSESSGSMDPIAPLQLESPVGQFLSQILISH 120

Query: 121 PHLLPAAVDQQLQQLQTQRHAEELTQEPSASATHDIVLYRRIAEVKAIERKRALEEILYA 180
           PHL+PAAV+QQL+QLQT R AE+  +E SAS T ++VLYRRIAEVKA ER++ALEEILYA
Sbjct: 121 PHLVPAAVEQQLEQLQTDRDAEKHKEEASASGT-ELVLYRRIAEVKANERRKALEEILYA 180

Query: 181 MVVQRFMDADVPLIPAVAPSSTDPYGRVDTWARDDEKLERLHSSEASEMIQNHLALVLGN 240
           +VVQ+FMDA+V LIP++ PSS+D  GRVDTW   DE LE+LHSSEA EMIQNHLAL+LGN
Sbjct: 181 LVVQKFMDANVSLIPSITPSSSDSSGRVDTWLSQDENLEQLHSSEAYEMIQNHLALILGN 240

Query: 241 RIGDFASVAQISKLRVGQVYAASVMYGYFLKRVDERFQLEKTVKVLPASATV-------- 300
           R+GD  SVAQISKLRVGQVYAASVMYGYFLKRVD+RFQLEK++K+LP ++ V        
Sbjct: 241 RLGDSTSVAQISKLRVGQVYAASVMYGYFLKRVDQRFQLEKSMKILPDASDVEASGIQQV 300

Query: 301 ---------EGSFSNAPVHPEISSMAAEQGDVSPGESGMGIKPSRLRTYVMSFDGETLQR 360
                    EGS      HPE+SS +   G VSPG  G GIK SRLRTYVMSFDGETLQR
Sbjct: 301 VGDVTPTGAEGSHEALSSHPEVSSFS---GGVSPGGFGHGIKASRLRTYVMSFDGETLQR 360

Query: 361 FATIRSKEAVSIIERHTEALFGRPQIAITPQGTVDTSKDELIKISFGGLKRLVLEAVTFG 413
           +ATIRSKEAVSIIE+HTEALFGRP+I +TPQGTVD+S DE IKISF GLKRLVLEAVTFG
Sbjct: 361 YATIRSKEAVSIIEKHTEALFGRPEIVVTPQGTVDSSNDEQIKISFAGLKRLVLEAVTFG 418

BLAST of Cp4.1LG20g00040 vs. NCBI nr
Match: gi|590666384|ref|XP_007036961.1| (Uncharacterized protein isoform 1 [Theobroma cacao])

HSP 1 Score: 478.4 bits (1230), Expect = 1.3e-131
Identity = 275/438 (62.79%), Positives = 317/438 (72.37%), Query Frame = 1

Query: 1   MEAATGSTSTLAIGIGSPFRDTDPRPPAS--RSLYFPSESLISVPHYRSFV--------- 60
           M+AAT S S     +GS    T  RPP+S  RS    +      PH+  F          
Sbjct: 1   MDAATASASV----VGSSM--TTRRPPSSVTRSAILTANE----PHFLRFAAKPRLPFSI 60

Query: 61  ---SPSKLGKKSITLPCSGRGRGLGFPMVKASLSPDPDGSAAQIAPLRLQSPIGQFLSQI 120
              SP    K        G  RG+   +V+AS SPD  G  A IAPL+++SPIGQFLSQI
Sbjct: 61  KHYSPLSYSKPQNRRMALGSRRGM---VVRASSSPDSAGPTAPIAPLQMESPIGQFLSQI 120

Query: 121 LTTHPHLLPAAVDQQLQQLQTQRHAEELTQEPSASATHDIVLYRRIAEVKAIERKRALEE 180
           L +HPHL+PAAV+QQL+QLQT R AEE  +EPSASA  D+VLYRRIAEVKA ERK+ALEE
Sbjct: 121 LISHPHLVPAAVEQQLEQLQTDRDAEEKKEEPSASAGTDLVLYRRIAEVKANERKKALEE 180

Query: 181 ILYAMVVQRFMDADVPLIPAVAPSSTDPYGRVDTWARDDEKLERLHSSEASEMIQNHLAL 240
           ILYA+VVQ+FMDA+V L+PA+ PSSTDP GRVD W  +++KLE LHS EA EMIQNHLAL
Sbjct: 181 ILYALVVQKFMDANVSLVPAMTPSSTDPSGRVDMWPSEEDKLELLHSPEAYEMIQNHLAL 240

Query: 241 VLGNRIGDFASVAQISKLRVGQVYAASVMYGYFLKRVDERFQLEKTVKVLP--------- 300
           +LGNR+GD  SVAQISKLRVGQVYAASVMYGYFLKRVD+RFQLEKT+K+LP         
Sbjct: 241 ILGNRLGDSTSVAQISKLRVGQVYAASVMYGYFLKRVDQRFQLEKTMKILPNASNGEESG 300

Query: 301 ---------ASATVEGSFSNAPVHPEISSMAAEQGDVSPGESGMGIKPSRLRTYVMSFDG 360
                     +A +  S+     HPE+SS +   G +SPG  G GIKP RLRTYVMSFDG
Sbjct: 301 VEQSVGEDMGTAGLGDSYKAVSSHPEVSSWS---GGISPGGFGHGIKPCRLRTYVMSFDG 360

Query: 361 ETLQRFATIRSKEAVSIIERHTEALFGRPQIAITPQGTVDTSKDELIKISFGGLKRLVLE 407
           ETLQ+FA IRSKEAVSIIE+HTEALFGRP+I ITPQGTVD+SKDELIKISF GLKRLVLE
Sbjct: 361 ETLQKFAAIRSKEAVSIIEKHTEALFGRPEIVITPQGTVDSSKDELIKISFNGLKRLVLE 420

BLAST of Cp4.1LG20g00040 vs. NCBI nr
Match: gi|728831886|gb|KHG11329.1| (Alanine--tRNA ligase [Gossypium arboreum])

HSP 1 Score: 478.0 bits (1229), Expect = 1.6e-131
Identity = 263/384 (68.49%), Positives = 307/384 (79.95%), Query Frame = 1

Query: 47  SFVSPSKLGKKSITLPCSGRGRGLGFPMVKASLSPDPDGSAAQIAPLRLQSPIGQFLSQI 106
           S VS SK   + + L   G  RG+   +VKAS SPD     AQIAPLR++SPIGQFLSQI
Sbjct: 56  SSVSYSKSRNRRVGL---GGRRGM---VVKASSSPDSAEPNAQIAPLRMESPIGQFLSQI 115

Query: 107 LTTHPHLLPAAVDQQLQQLQTQRHAEELTQEPSASATHDIVLYRRIAEVKAIERKRALEE 166
           L +HPHL+PAAV+QQL+QLQ+ R  +E  +EPSAS T D+VLYRRIAEVKA ERKRALEE
Sbjct: 116 LISHPHLVPAAVEQQLEQLQSDRDTDEKKEEPSASGT-DLVLYRRIAEVKANERKRALEE 175

Query: 167 ILYAMVVQRFMDADVPLIPAVAPSSTDPYGRVDTWARDDEKLERLHSSEASEMIQNHLAL 226
           ILYA+VVQ+FMDA++ L+PA+  SS DP GRVDTW   ++KLE+LHS+EA EMIQNHLAL
Sbjct: 176 ILYALVVQKFMDANISLVPAIT-SSADPSGRVDTWPSQEDKLEQLHSAEAHEMIQNHLAL 235

Query: 227 VLGNRIGDFASVAQISKLRVGQVYAASVMYGYFLKRVDERFQLEKTVKVLPASATVE--- 286
           +LGNR+GD  SVAQISKLRVGQVYAASVMYGYFL+RVD+RFQLEKT+KVLP+++  +   
Sbjct: 236 ILGNRLGDSTSVAQISKLRVGQVYAASVMYGYFLRRVDQRFQLEKTMKVLPSASDGDKSS 295

Query: 287 ---------------GSFSNAPVHPEISSMAAEQGDVSPGESGMGIKPSRLRTYVMSFDG 346
                           S+  A  H E+SS +   G +SPG  G GIKPSRLRTYVMSFDG
Sbjct: 296 IEQTVGDDTRPSGLGDSYQAASSHAEVSSWS---GGISPGGFGSGIKPSRLRTYVMSFDG 355

Query: 347 ETLQRFATIRSKEAVSIIERHTEALFGRPQIAITPQGTVDTSKDELIKISFGGLKRLVLE 406
           ETLQR+A+IRSKEAV IIE+HTEALFGRP+IAITPQGTVD+S DELIKISFGGLKRLVLE
Sbjct: 356 ETLQRYASIRSKEAVGIIEKHTEALFGRPEIAITPQGTVDSSNDELIKISFGGLKRLVLE 415

Query: 407 AVTFGSFLWDVETYVDSRYHFVMN 413
           AVTFGSFLWDVE++VDSRYHFVMN
Sbjct: 416 AVTFGSFLWDVESFVDSRYHFVMN 428

BLAST of Cp4.1LG20g00040 vs. NCBI nr
Match: gi|697121438|ref|XP_009614692.1| (PREDICTED: uncharacterized protein LOC104107561 [Nicotiana tomentosiformis])

HSP 1 Score: 477.6 bits (1228), Expect = 2.1e-131
Identity = 252/358 (70.39%), Positives = 294/358 (82.12%), Query Frame = 1

Query: 75  VKASLSP-DPDGSAAQIAPLRLQSPIGQFLSQILTTHPHLLPAAVDQQLQQLQTQRHAEE 134
           ++ASLSP +  GSAA IAPL+L+SPIGQFLSQILT+HPHL+PAAVDQQL+QLQT+R +E+
Sbjct: 67  IRASLSPSESGGSAAPIAPLQLESPIGQFLSQILTSHPHLVPAAVDQQLEQLQTERDSEQ 126

Query: 135 LTQEPSASATHDIVLYRRIAEVKAIERKRALEEILYAMVVQRFMDADVPLIPAVAPSSTD 194
             +EPSA+ T DIVLYRRIAEVKA +RK+ALEEILYA+VVQ+FMDA+V L+PA++P S++
Sbjct: 127 QKEEPSATGT-DIVLYRRIAEVKANDRKKALEEILYALVVQKFMDANVSLVPAISPPSSE 186

Query: 195 PYGRVDTWARDDEKLERLHSSEASEMIQNHLALVLGNRIGDFASVAQISKLRVGQVYAAS 254
           P GR+DTW   D+K ERLHS+EA+EMIQNHLAL+LGNR+GD ++VAQISK RVGQVYAAS
Sbjct: 187 PSGRIDTWPSQDDKFERLHSAEANEMIQNHLALILGNRLGDNSAVAQISKFRVGQVYAAS 246

Query: 255 VMYGYFLKRVDERFQLEKTVKVLPASATVEG-------------------SFSNAPVHPE 314
           VMYGYFLKRVD+RFQLEKT+KVLP     E                    SF     HPE
Sbjct: 247 VMYGYFLKRVDQRFQLEKTMKVLPQGVDDEDSSIRQVGGEEIRSGDRSDTSFGVTQSHPE 306

Query: 315 ISSMAAEQGDVSPGESGMGIKPSRLRTYVMSFDGETLQRFATIRSKEAVSIIERHTEALF 374
           +SS +A  G    G  G GIKPSRLR YVMSFDGETLQR+ATIRSKEA+ IIE+HTEALF
Sbjct: 307 LSSWSA--GSAGTGGFGHGIKPSRLRNYVMSFDGETLQRYATIRSKEAIGIIEKHTEALF 366

Query: 375 GRPQIAITPQGTVDTSKDELIKISFGGLKRLVLEAVTFGSFLWDVETYVDSRYHFVMN 413
           GRP+I ITPQGTVD+SKDEL+KISFGGL RLVLEAVTFGSFLWDVE+YVDSRYHFV N
Sbjct: 367 GRPEIVITPQGTVDSSKDELLKISFGGLSRLVLEAVTFGSFLWDVESYVDSRYHFVAN 421
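
The Identity and Positives percentages in the HSP headers above are the matched-column counts divided by the alignment length. The following is a minimal Python sketch for re-deriving them from a header line; the function names (percent, check_hsp_stats) and the regular expression are illustrative assumptions, not part of any BLAST tool.

```python
import re

# Matches the HSP summary line; tolerates the "Postives" spelling as well.
HSP_STATS = re.compile(
    r"Identity = (\d+)/(\d+) \(([\d.]+)%\), "
    r"Posi?tives = (\d+)/(\d+) \(([\d.]+)%\)"
)


def percent(matches, length):
    """Return matches/length as a percentage rounded to two decimals."""
    return round(100.0 * matches / length, 2)


def check_hsp_stats(line):
    """Parse one HSP summary line and recompute both percentages."""
    m = HSP_STATS.search(line)
    if m is None:
        raise ValueError("not an HSP summary line: %r" % line)
    ident, ident_len, ident_pct, pos, pos_len, pos_pct = m.groups()
    return {
        "identity_reported": float(ident_pct),
        "identity_computed": percent(int(ident), int(ident_len)),
        "positives_reported": float(pos_pct),
        "positives_computed": percent(int(pos), int(pos_len)),
    }


if __name__ == "__main__":
    # Counts copied from the two NCBI nr HSP headers in this section.
    for counts in ("263/384 (68.49%)", "252/358 (70.39%)"):
        pos = {"263/384 (68.49%)": "307/384 (79.95%)",
               "252/358 (70.39%)": "294/358 (82.12%)"}[counts]
        line = "Identity = %s, Positives = %s, Query Frame = 1" % (counts, pos)
        print(check_hsp_stats(line))
```

For both hits shown here, the recomputed values agree with the reported ones to two decimal places.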

The following BLAST results are available for this feature:
Match Name                        E-value    Identity (%)  Description
UVB31_ARATH                       4.6e-118   65.71         UV-B-induced protein At3g17800, chloroplastic OS=Arabidopsis thaliana GN=At3g178...

Match Name                        E-value    Identity (%)  Description
A0A061FWL1_THECC                  7.7e-136   63.29         Uncharacterized protein isoform 2 OS=Theobroma cacao GN=TCM_013026 PE=4 SV=1
A0A061FX61_THECC                  8.8e-132   62.79         Uncharacterized protein isoform 1 OS=Theobroma cacao GN=TCM_013026 PE=4 SV=1
A0A0B0NHU4_GOSAR                  1.1e-131   68.49         Alanine--tRNA ligase OS=Gossypium arboreum GN=F383_01965 PE=4 SV=1
A0A0D2SGC6_GOSRA                  1.7e-130   67.71         Uncharacterized protein OS=Gossypium raimondii GN=B456_005G134400 PE=4 SV=1
W9QPD8_9ROSA                      1.1e-129   65.92         Uncharacterized protein OS=Morus notabilis GN=L484_019353 PE=4 SV=1

Match Name                        E-value    Identity (%)  Description
AT3G17800.2                       2.6e-119   65.71         Protein of unknown function (DUF760)
AT1G48450.1                       4.6e-116   63.87         Protein of unknown function (DUF760)
AT1G32160.1                       2.6e-82    46.02         Protein of unknown function (DUF760)
AT3G07310.1                       1.3e-41    34.66         Protein of unknown function (DUF760)
AT5G48590.1                       8.6e-30    31.50         Protein of unknown function (DUF760)

Match Name                        E-value    Identity (%)  Description
gi|590666388|ref|XP_007036962.1|  1.1e-135   63.29         Uncharacterized protein isoform 2 [Theobroma cacao]
gi|568880748|ref|XP_006493269.1|  3.3e-132   61.96         PREDICTED: UV-B-induced protein At3g17800, chloroplastic-like [Citrus sinensis]
gi|590666384|ref|XP_007036961.1|  1.3e-131   62.79         Uncharacterized protein isoform 1 [Theobroma cacao]
gi|728831886|gb|KHG11329.1|       1.6e-131   68.49         Alanine--tRNA ligase [Gossypium arboreum]
gi|697121438|ref|XP_009614692.1|  2.1e-131   70.39         PREDICTED: uncharacterized protein LOC104107561 [Nicotiana tomentosiformis]

The following terms have been associated with this gene:
Vocabulary: INTERPRO
Term       Definition
IPR008479  DUF760
GO Assignments
This gene is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0008152 metabolic process
biological_process GO:0046486 glycerolipid metabolic process
biological_process GO:0006071 glycerol metabolic process
biological_process GO:0019852 L-ascorbic acid metabolic process
biological_process GO:0055114 oxidation-reduction process
biological_process GO:0008150 biological_process
cellular_component GO:0005575 cellular_component
molecular_function GO:0016874 ligase activity
molecular_function GO:0005507 copper ion binding
molecular_function GO:0004371 glycerone kinase activity
molecular_function GO:0016787 hydrolase activity
molecular_function GO:0008447 L-ascorbate oxidase activity
molecular_function GO:0003674 molecular_function

The following mRNA feature(s) are a part of this gene:

Feature Name       Unique Name        Type
Cp4.1LG20g00040.1  Cp4.1LG20g00040.1  mRNA


Analysis Name: InterPro Annotations of Cucurbita pepo
Date Performed: 2017-12-02
IPR Term   IPR Description                     Source   Source Term    Source Description   Alignment
IPR008479  Protein of unknown function DUF760  PFAM     PF05542        DUF760               coord: 319..400, score: 3.3E-4
                                                                                            coord: 148..272, score: 2.9
None       No IPR available                    PANTHER  PTHR31808      FAMILY NOT NAMED     coord: 8..412, score: 3.4E
None       No IPR available                    PANTHER  PTHR31808:SF4  SUBFAMILY NOT NAMED  coord: 8..412, score: 3.4E

The following gene(s) are paralogous to this gene:

None

The following block(s) are covering this gene:
Gene             Organism                   Block
Cp4.1LG20g00040  Cucurbita pepo (Zucchini)  cpecpeB433
Cp4.1LG20g00040  Cucurbita maxima (Rimu)    cmacpeB484