Cp4.1LG20g08820 (gene) Cucurbita pepo (Zucchini)

Name: Cp4.1LG20g08820
Type: gene
Organism: Cucurbita pepo (Zucchini)
Description: THO complex subunit 4
Location: Cp4.1LG20 : 7058613 .. 7063185 (+)
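The location line can be turned into a span length with a few lines of Python. This is a minimal sketch, assuming the coordinates are 1-based and inclusive (the usual GFF convention; the page itself does not state this). The function name `parse_location` is illustrative and not part of any database API.

```python
import re

def parse_location(text):
    """Parse 'SEQID : START .. END (STRAND)' into a small dict."""
    m = re.fullmatch(r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*", text)
    if m is None:
        raise ValueError(f"unrecognized location: {text!r}")
    start, end = int(m.group(2)), int(m.group(3))
    return {
        "seqid": m.group(1),
        "start": start,
        "end": end,
        "strand": m.group(4),
        # Assumption: 1-based inclusive coordinates, so length = end - start + 1.
        "length": end - start + 1,
    }

loc = parse_location("Cp4.1LG20 : 7058613 .. 7063185 (+)")
print(loc["seqid"], loc["strand"], loc["length"])  # Cp4.1LG20 + 4573
```

Under that assumption the gene spans 4573 bp on the forward strand.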
The following sequences are available for this feature:

Gene sequence (with intron)

AGCAAATGACCAGATAGCTCTAATGGGTTCCCTTCACTACTCCCGTGAAGATGGTCTAAACGGTTCAGAAATACTCATCTTACTACATAAAACATTTACTTAGCCTACATTTAGATTCCGGTCTACTCAAACTCGGCCTAGTTCATAACCAAAATTTAGATTTGGATTTGGCATAAGGTTCTTGAGTTGTTAGATTGTTGCTAAATAATTTTTGAATATTCAAACCGATCTAACAAAGAAGTAATTTTAATGGGGTTTGATTTAGCATATTGCGCTGGATCTTCCTCTACCTTTGTCCATTGTTGAAACCCTAGGAATTCTCTCTCCGATCCACGACCACCACTCCTCCGCTCATCAATGGCAGAGCCTCTCGATATGAGCTTAGATGATATCATCAAGAACAACAAGAAATCCGGATCCTCGAACTTCAGGGGTCGTGGCGGAGCTTCTTCTGGACCAGGTCCCTCTCGCCGCTTTCGCAATCGCGGTCTTAATAGAGCAGCGCCCTATTCTACTGCCCAGGTGAAGGATTTGGGTTTTAACATCCCGATTTCGATGAATCTTACTACCATCGCCATCCTCTAGCTCTTGTTGTTGTTGTTGTTGTTCTTCTTCTTCTTCTTCTTCTACTACTACTTCTACCAGTTTCTCGTTCTTTTCAAAACGGACTTCGGGTATTTTGTTTTTAATGTTCTGAAAAACATTGGGGGTTTAGGCGCCGGAGACGGCTTGGTCACACGACATGTTTGTAGATCACGGTGCGGCATATCCTTCACAGCCTGCACGGGCCTCTGCTATCGAAACCGGCACCAAGCTTTATGTTTCTAATTTGGATTATGGTGTTTCCAACGAGGATATCAAGGTATTTCAATTTTGGATTTGAGGGCTTGAACATGAGAATTTTTTTTGTATGTATTGCTAAGATTTTATTGAAATTGTATATGCAGGAACTGTTTTCAGAAGTTGGTGATCTCAAACGATATTCTATCAATTATGATAAAAGTGGGAGGTCAAAGGTATGTTTAAAGTGTGAAGAACACAAGAGCTTTTGTTTCTTTTTAGCAATATTCTGGCTTTGGTTTGAGTTTTTGTTCTTTGTAGGGAACAGCAGAGATTGTTTTTTCACGACAAGCAGATGCGCTTGCTGCTATAAAGAGGTACAACAATGTTCAGCTAGATGGGAAGCACATGAAGTTGGAGATTGTAGGAACTAACATCATGACACCTGCTGTTCCTGCTTCTGCAAATGGCAATTTTGGGAATCCAAATGGATTTCGAAGAGGGTATACATTTTCTCTCTTCACTTGTCCTTATTTCTGATTTTTCCATTTGCGTTTACAATTGCAATCATCTATTAACGCGGCTAATGGTATGTGTTCAGAAGTAGATATGGATTTTATTTGCTTTATAATACTTTTGTAAGCCCATGATTGAAGTAATTGTTTTTAGAGTTGTATTGGATTTTGACGTAGATTTTTCAACATTTTCTTGTTATTCTGACTTAGTTCATAGTAAAATCATTTTTCATTGCTTTTTGACATGTCCGCTGAAAGTTTTGAAAGGATTAAATTGAACATAAAAATATTTAATTTCAACAACTCGTTAAGATATGTGTACTTAAAGGGAGGTATTTCTCGTTTGACACTATGGACATAGTAAATGGTAAACCATATGATGTTAATCGAAGAAAAATTGTTACACTTGTATGAGAAACTTAGGGATTATATATATCTAAATATCTATCTATCTATATATATATTATGATAAGTAGCCGAACCTTCCTTGAGAAAACTTGAAAGAATATTAGGGCATATAAAAGACTCACTAACGGAGGCAAAATTGTATCCATAATTTATTGTCAATTTTGTTTATGATGACAATGTGTTTCTAGCAGCAATTGTATTCAACAACTTTTATTTCTCAAATTTGCGTTTGATCGCATATAAAATTAGCTGCTTCCTAATGTTTATTAATATTTTTTAACAACTGTGGGTATCTGA
CCAAGTTTTCACACCTTGGCTAGTATTGTCAAAAAACCACATGAACCTACTTCATTTGGTGTTGGAATGCTGGGTGTTATTTTCATTGGGTATTTGAGTGTACAAATTGTCTATTTGACCACTAACTGAATTTGTTGTAGGTTAAAGCTTACAAGATTTCTATTCAAGTTCTCCAATAAACAAACGACATTAAGGGATCCAGTTCTGCCACATTTTATGCAAGAAAAAACTTAATATGGGTTCCTTCTTTTGCGCGTCAATCCTTTTAGAAAAGAGTCTAGGTAATGATAACTTGGATAGTTTAATGACCAAGATTATTGAATTGTAGATGAAAGATTCCAAGTTGCAAGAAGTTTGTGTGTCATTGTTGAGGTGTATGTTATTATGTGGAGCCTGTGGAGCCTTTGGGGAGGGAGTAATGGCAGGCTCATTAGAGGGCTTGAGTGGTCTTTGGTTTTGACCCCTTGTTAGGTTCATCATCTCTTTGTGGGTCCAGAATCAGCTTATGTTTTGATCATATGATTCTGTAAGTTTTGATATTTAAGCTTTTCCATATTGATGATGAATTATATGAAAAGGGTAAAATGTTTACGAAGCCTGCCTTATTAACATTTATCACACAGATATTTTAATTGTGCGATGTTGATAGAAGCCGTTCTTTCTTTACTTTGGTACCCGTATGACTTCTGCCTGCCTCTGAACAGTTCTGGTAGTTATTTGTGTACATGATCATGCATTACGTTGTTCTTTTTATGCTTCAGTTGGTGCTTAAGTATGAGGCTGATAATTATGTTTTATGTATTTTAATTGAAATTGCTGACATCTAGACTTTTGATCTGATTTTGATTGCTATTATTGATTCTCTTAATCAGCATGCACTTAATAAATTTTCATTTGATTTTAACAGTGGACATGTACTGGGCCGAAACCGGGGTGGTGGACGAGGACGTGGTCCTGGAAGAGGAGGGCGTGGACGTGGGAGTAGCAGAGGTCGTGGGGAGAAATTATCTGCCGAAGATCTAGATGCTGATTTGGAGAAATATCATGAAGAAGCCATGCAGATCAATTAAATCATTTGGTGTGATTTCCTGGGGCTTGATATTCCCAACTTCGTTAGGTGCTTGATGATAAATGTGATAAGGATACCATTTTCTATTTTGCGAGTCATATCATGAAGCCCTGACCTTGAGAGAACCACTAGACCTTGCTAGGATGTGGAGTTTGGGCTTCAATTGTTATTTGTTGCAGTAAACTACAATGAGGTGTTATATATGCTTGAATCTGTAATCTCCCGTTACTTATCTGTTTCTTAATCTTCTCCAATCCTATTTATTTTCCATGAAAGTTGCACTGGAACTGGGAAAGTCGCTCACACTCCTTGTCGACACGGAAGGATTTGGGCGCTTGACATGGGAAGTATTGGTTATAACATAATCAACCACATCTTCTGGTGTTGAGGCCAAAGTTCTGGTACACCGTTTCTAAATCATAATATTTCCTTGATGGAATGATAGATATAGCCGAGAATCTATTCTGAATATGTTGATTATTGGGTTCTGACTGCAGAATTTCTACTTATATTGCTTTGCAGCGTTTGGTTTTATGAATTAAAAGTAATGTGTTAAGATCTTCTACTTGTTGGCATGGAGAGGAAGGAAGGGGTGTGGAGATGATTACCAAGTAAGTGTTCAAGGCCACCGCTAGCAGATACTGTTCGCTTTAACCTGTTACGTATCCTCGTCAGCCTCACAGTTTTAAAACGCATCTACTAGAAAAATGGGTTTTTCACACCCTTATGAGGAATGTTTCGTTCTCCTCTCCAACCGACGTGGGATCTCACACAATTGGTATGCAGTTAGAATTCAACAACACTTACATTTATTTCAGTTTATTTCAGTATTGTGTATGCCTTTTACTATCATGTGGTTGTAGTAATGACTATTATAGGCTGCAGTCTCGTTTCTTCAGCATATTTTGTTTGAAAATATTCCAGAATTTTG
CATTCAATAATTAGAAGACCAGGCATATACCATGCAATAATGTTGAGTTTCTTCTGTATCTTAAGAAAGCCAAAAAAAGTAGGGGAATTATGCTACTCTGTCAAGTGTCATTTGAACTCCTAGCGTAGAAACTCGATTGCAGTTGGTCGAAGTGCACCGACATATCTGAGATTCAATTTAGATTTTACAAAACTGGTCCATTTTAGTTAGTTTTTGCTTTCTATAACATAGTTCAGATTTGTATTTTTGCTTCTATATTATGGTTATCATTCAGATTTCTACCTTGTCACTTATGTTTGAAGCCTTTTGGAATTCAATCTTAGATTAGTTTGTTTGTAGTTGATGAAGAAATGGAGTTGATGATCGATCATTACTAGGGTCATAATCGATTGAAAGATCAAGTCGACTTTTGTTTGCCATATCAATCATGGAGGTAAAAAATGATTTTACTCCCGGACCTTTTGCAATAGGTTTTGATTTTAGTATCAATAGTAAGGTGGAAATATTTGAATACAAGCTCATATTGCATATGAAAATTGGAAGGAAAATGTTCATGCCAAATGAAGCAGCTTT

mRNA sequence

AGCAAATGACCAGATAGCTCTAATGGGTTCCCTTCACTACTCCCGTGAAGATGGTCTAAACGGTTCAGAAATACTCATCTTACTACATAAAACATTTACTTAGCCTACATTTAGATTCCGGTCTACTCAAACTCGGCCTAGTTCATAACCAAAATTTAGATTTGGATTTGGCATAAGGTTCTTGAGTTGTTAGATTGTTGCTAAATAATTTTTGAATATTCAAACCGATCTAACAAAGAAGTAATTTTAATGGGGTTTGATTTAGCATATTGCGCTGGATCTTCCTCTACCTTTGTCCATTGTTGAAACCCTAGGAATTCTCTCTCCGATCCACGACCACCACTCCTCCGCTCATCAATGGCAGAGCCTCTCGATATGAGCTTAGATGATATCATCAAGAACAACAAGAAATCCGGATCCTCGAACTTCAGGGGTCGTGGCGGAGCTTCTTCTGGACCAGGTCCCTCTCGCCGCTTTCGCAATCGCGGTCTTAATAGAGCAGCGCCCTATTCTACTGCCCAGGCGCCGGAGACGGCTTGGTCACACGACATGTTTGTAGATCACGGTGCGGCATATCCTTCACAGCCTGCACGGGCCTCTGCTATCGAAACCGGCACCAAGCTTTATGTTTCTAATTTGGATTATGGTGTTTCCAACGAGGATATCAAGGAACTGTTTTCAGAAGTTGGTGATCTCAAACGATATTCTATCAATTATGATAAAAGTGGGAGGTCAAAGGGAACAGCAGAGATTGTTTTTTCACGACAAGCAGATGCGCTTGCTGCTATAAAGAGGTACAACAATGTTCAGCTAGATGGGAAGCACATGAAGTTGGAGATTGTAGGAACTAACATCATGACACCTGCTGTTCCTGCTTCTGCAAATGGCAATTTTGGGAATCCAAATGGATTTCGAAGAGGTGGACATGTACTGGGCCGAAACCGGGGTGGTGGACGAGGACGTGGTCCTGGAAGAGGAGGGCGTGGACGTGGGAGTAGCAGAGGTCGTGGGGAGAAATTATCTGCCGAAGATCTAGATGCTGATTTGGAGAAATATCATGAAGAAGCCATGCAGATCAATTAAATCATTTGGTGTGATTTCCTGGGGCTTGATATTCCCAACTTCGTTAGGTGCTTGATGATAAATGTGATAAGGATACCATTTTCTATTTTGCGAGTCATATCATGAAGCCCTGACCTTGAGAGAACCACTAGACCTTGCTAGGATGTGGAGTTTGGGCTTCAATTGTTATTTGTTGCAGTAAACTACAATGAGTTGCACTGGAACTGGGAAAGTCGCTCACACTCCTTGTCGACACGGAAGGATTTGGGCGCTTGACATGGGAAGTATTGGTTATAACATAATCAACCACATCTTCTGGTGTTGAGGCCAAAGTTCTGCGTTTGGTTTTATGAATTAAAAGTAATGTGTTAAGATCTTCTACTTGTTGGCATGGAGAGGAAGGAAGGGGTGTGGAGATGATTACCAAGTAAGTGTTCAAGGCCACCGCTAGCAGATACTGTTCGCTTTAACCTGTTACGTATCCTCGTCAGCCTCACAGTTTTAAAACGCATCTACTAGAAAAATGGGTTTTTCACACCCTTATGAGGAATGTTTCGTTCTCCTCTCCAACCGACGTGGGATCTCACACAATTGGTATGCAGTTAGAATTCAACAACACTTACATTTATTTCAGTTTATTTCAGTATTGTGTATGCCTTTTACTATCATGTGGTTGTAGTAATGACTATTATAGGCTGCAGTCTCGTTTCTTCAGCATATTTTGTTTGAAAATATTCCAGAATTTTGCATTCAATAATTAGAAGACCAGGCATATACCATGCAATAATGTTGAGTTTCTTCTGTATCTTAAGAAAGCCAAAAAAAGTAGGGGAATTATGCTACTCTGTCAAGTGTCATTTGAACTCCTAGCGTAGAAACTCGATTGCAGTTGGTCGAAGTGCACCGACATATCTGAGATTCAATTTAGATTTTACAAA
ACTGGTCCATTTTAGTTAGTTTTTGCTTTCTATAACATAGTTCAGATTTGTATTTTTGCTTCTATATTATGGTTATCATTCAGATTTCTACCTTGTCACTTATGTTTGAAGCCTTTTGGAATTCAATCTTAGATTAGTTTGTTTGTAGTTGATGAAGAAATGGAGTTGATGATCGATCATTACTAGGGTCATAATCGATTGAAAGATCAAGTCGACTTTTGTTTGCCATATCAATCATGGAGGAAAATGTTCATGCCAAATGAAGCAGCTTT

Coding sequence (CDS)

ATGGCAGAGCCTCTCGATATGAGCTTAGATGATATCATCAAGAACAACAAGAAATCCGGATCCTCGAACTTCAGGGGTCGTGGCGGAGCTTCTTCTGGACCAGGTCCCTCTCGCCGCTTTCGCAATCGCGGTCTTAATAGAGCAGCGCCCTATTCTACTGCCCAGGCGCCGGAGACGGCTTGGTCACACGACATGTTTGTAGATCACGGTGCGGCATATCCTTCACAGCCTGCACGGGCCTCTGCTATCGAAACCGGCACCAAGCTTTATGTTTCTAATTTGGATTATGGTGTTTCCAACGAGGATATCAAGGAACTGTTTTCAGAAGTTGGTGATCTCAAACGATATTCTATCAATTATGATAAAAGTGGGAGGTCAAAGGGAACAGCAGAGATTGTTTTTTCACGACAAGCAGATGCGCTTGCTGCTATAAAGAGGTACAACAATGTTCAGCTAGATGGGAAGCACATGAAGTTGGAGATTGTAGGAACTAACATCATGACACCTGCTGTTCCTGCTTCTGCAAATGGCAATTTTGGGAATCCAAATGGATTTCGAAGAGGTGGACATGTACTGGGCCGAAACCGGGGTGGTGGACGAGGACGTGGTCCTGGAAGAGGAGGGCGTGGACGTGGGAGTAGCAGAGGTCGTGGGGAGAAATTATCTGCCGAAGATCTAGATGCTGATTTGGAGAAATATCATGAAGAAGCCATGCAGATCAATTAA

Protein sequence

MAEPLDMSLDDIIKNNKKSGSSNFRGRGGASSGPGPSRRFRNRGLNRAAPYSTAQAPETAWSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYSINYDKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMTPAVPASANGNFGNPNGFRRGGHVLGRNRGGGRGRGPGRGGRGRGSSRGRGEKLSAEDLDADLEKYHEEAMQIN
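
The protein sequence above is the frame-1 translation of the CDS. The sketch below checks this on the first and last few codons of the CDS, assuming the standard genetic code (the page does not name a translation table). `translate` is an illustrative helper, not a library function.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(cds):
    """Translate a CDS in frame 1, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[pos:pos + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

cds_start = "ATGGCAGAGCCTCTCGATATGAGCTTAGAT"  # first 10 codons of the CDS
cds_end = "CAGATCAATTAA"                      # last 4 codons, including the stop
print(translate(cds_start))  # MAEPLDMSLD
print(translate(cds_end))    # QIN
```

Both fragments agree with the start (`MAEPLDMSLD`) and end (`QIN`) of the listed protein.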
BLAST of Cp4.1LG20g08820 vs. Swiss-Prot
Match: THO4A_ARATH (THO complex subunit 4A OS=Arabidopsis thaliana GN=ALY1 PE=1 SV=1)

HSP 1 Score: 248.1 bits (632), Expect = 1.0e-64
Identity = 150/248 (60.48%), Positives = 184/248 (74.19%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKSGSSNFRGRG-GASSGPGPSRRFR-NRGLNRAAPYSTAQAPE 60
           M+  LDMSLDD+I  N+KS       RG G+ SGPGP+RR   NR   R+APY +A+APE
Sbjct: 1   MSTGLDMSLDDMIAKNRKSRGGAGPARGTGSGSGPGPTRRNNPNRKSTRSAPYQSAKAPE 60

Query: 61  TAWSHDMFVDHGAAYPSQPARASA-IETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYS 120
           + W HDMF D    + S   R+SA IETGTKLY+SNLDYGV NEDIKELF+EVG+LKRY+
Sbjct: 61  STWGHDMFSDRSEDHRS--GRSSAGIETGTKLYISNLDYGVMNEDIKELFAEVGELKRYT 120

Query: 121 INYDKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMTPAVPASANG 180
           +++D+SGRSKGTAE+V+SR+ DALAA+K+YN+VQLDGK MK+EIVGTN+ T A P+    
Sbjct: 121 VHFDRSGRSKGTAEVVYSRRGDALAAVKKYNDVQLDGKPMKIEIVGTNLQTAAAPSGRPA 180

Query: 181 NFGNPNG--FRRGGHVLGRNRGGGRGRGPGRGGRGRGSSRGRG--EKLSAEDLDADLEKY 240
           N GN NG  +R G    G+ RGGGRG G GRGG GRG   G+G  EK+SAEDLDADL+KY
Sbjct: 181 N-GNSNGAPWRGGQGRGGQQRGGGRG-GGGRGGGGRGRRPGKGPAEKISAEDLDADLDKY 240

Query: 241 HEEAMQIN 242
           H   M+ N
Sbjct: 241 HSGDMETN 244
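
The percentages on each HSP statistics line are the listed counts divided by the alignment length (gap columns included). A small sketch that parses such a line and recomputes the percentages; the function name `parse_hsp_stats` is illustrative, and the regular expression depends only on the `N/M (P%)` pattern, not on the field labels.

```python
import re

def parse_hsp_stats(line):
    """Return (count, alignment_length, recomputed_percent, reported_percent) tuples."""
    stats = []
    for count, length, reported in re.findall(r"(\d+)/(\d+)\s*\((\d+(?:\.\d+)?)%\)", line):
        count, length = int(count), int(length)
        stats.append((count, length, round(100 * count / length, 2), float(reported)))
    return stats

line = "Identity = 150/248 (60.48%), Positives = 184/248 (74.19%), Query Frame = 1"
for count, length, computed, reported in parse_hsp_stats(line):
    print(count, length, computed, reported)
# 150 248 60.48 60.48
# 184 248 74.19 74.19
```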

BLAST of Cp4.1LG20g08820 vs. Swiss-Prot
Match: THO4B_ARATH (THO complex subunit 4B OS=Arabidopsis thaliana GN=ALY2 PE=1 SV=1)

HSP 1 Score: 244.2 bits (622), Expect = 1.5e-63
Identity = 163/290 (56.21%), Positives = 188/290 (64.83%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKK----------SGSSNFRGRGGASSGPGPSRRFRNRGLNRAAP 60
           M+  LDMSLDDIIK+N+K           G +N  GRGG+ S  GPSRRF NR   R AP
Sbjct: 1   MSGGLDMSLDDIIKSNRKPTGSRGRGGIGGGNNTGGRGGSGSNSGPSRRFANRVGARTAP 60

Query: 61  YSTA----QAPETAWSHDMFVDHG---AAYPSQPARA----SAIETGTKLYVSNLDYGVS 120
           YS      QA +  W +D+F       AA+           S+IETGTKLY+SNLDYGVS
Sbjct: 61  YSRPIQQQQAHDAMWQNDVFATDASVAAAFGHHQTAVVGGGSSIETGTKLYISNLDYGVS 120

Query: 121 NEDIKELFSEVGDLKRYSINYDKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKL 180
           NEDIKELFSEVGDLKRY I+YD+SGRSKGTAE+VFSR+ DALAA+KRYNNVQLDGK MK+
Sbjct: 121 NEDIKELFSEVGDLKRYGIHYDRSGRSKGTAEVVFSRRGDALAAVKRYNNVQLDGKLMKI 180

Query: 181 EIVGTNIMTPAVPASA-------------------NGNF-GNPNGFRRG---GHVLGRNR 240
           EIVGTN+  PA+P  A                   NGNF GN NG  RG   G  +GR R
Sbjct: 181 EIVGTNLSAPALPILATAQIPFPTNGILGNFNENFNGNFNGNFNGNFRGRGRGGFMGRPR 240

BLAST of Cp4.1LG20g08820 vs. Swiss-Prot
Match: THOC4_TAEGU (THO complex subunit 4 OS=Taeniopygia guttata GN=ALYREF PE=2 SV=1)

HSP 1 Score: 162.9 bits (411), Expect = 4.3e-39
Identity = 116/259 (44.79%), Positives = 150/259 (57.92%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKS-----GSSNFRGRGGASSGPGPSRR-----------FRNR- 60
           MA+ +DMSLDDIIK N+       G    RGRGG + G GP R             RNR 
Sbjct: 1   MADKMDMSLDDIIKLNRSQRGASRGGRGGRGRGGTARGGGPGRGGVGGGRAGGGPVRNRP 60

Query: 61  ------GLNRAAPYSTAQAPETAWSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLDYG 120
                 G NR APYS  +     W HD+F D G          + +ETG KL VSNLD+G
Sbjct: 61  VMARGGGRNRPAPYSRPKQLPEKWQHDLF-DSGFG------AGAGVETGGKLLVSNLDFG 120

Query: 121 VSNEDIKELFSEVGDLKRYSINYDKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHM 180
           VS+ DI+ELF+E G LK+ +++YD+SGRS GTA++ F R+ADAL A+K+YN V LDG+ M
Sbjct: 121 VSDADIQELFAEFGTLKKAAVHYDRSGRSLGTADVHFERKADALKAMKQYNGVPLDGRPM 180

Query: 181 KLEIVGTNIMTPAVPASANGNFGNPNGFRRGGHVLGRNRGGGRGRGP--GRGGRGRGSSR 235
            +++V + I T   PA +     N  G  R   VLG   GGG  RG   G  GRGRG+ R
Sbjct: 181 NIQLVTSQIDTQRRPAQSV----NRGGMTRNRGVLGGFGGGGNRRGTRGGNRGRGRGAGR 240

BLAST of Cp4.1LG20g08820 vs. Swiss-Prot
Match: THO4D_ARATH (THO complex subunit 4D OS=Arabidopsis thaliana GN=ALY4 PE=1 SV=1)

HSP 1 Score: 158.3 bits (399), Expect = 1.1e-37
Identity = 120/291 (41.24%), Positives = 157/291 (53.95%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNK--KSGSSNF-----RGRGGASSGPGPSRRFRNRGLNRAAPYST 60
           M+  L+M+LD+I+K  K  +SG         RGRGG   G GP+RR     +N      T
Sbjct: 1   MSGALNMTLDEIVKRGKTARSGGRGISRGRGRGRGGGGRGAGPARR-GPLAVNARPSSFT 60

Query: 61  AQAP-----ETAWSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLDYGVSNEDIKELFS 120
              P        W   +F D   A     A AS +E GT+L+V+NLD GV+NEDI+ELFS
Sbjct: 61  INKPVRRVRSLPWQSGLFEDGLRA-----AGASGVEVGTRLHVTNLDQGVTNEDIRELFS 120

Query: 121 EVGDLKRYSINYDKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMT 180
           E+G+++RY+I+YDK+GR  GTAE+V+ R++DA  A+K+YNNV LDG+ M+LEI+G N  +
Sbjct: 121 EIGEVERYAIHYDKNGRPSGTAEVVYPRRSDAFQALKKYNNVLLDGRPMRLEILGGNNSS 180

Query: 181 PA-VPASANGNFGNPNG-------FRRGGHVLGRNRGGGRGRGP---------------- 239
            A +    N N    NG        ++GG   GR RGG  GRGP                
Sbjct: 181 EAPLSGRVNVNVTGLNGRLKRTVVIQQGGGGRGRVRGGRGGRGPAPTVSRRLPIHNQQGG 240

BLAST of Cp4.1LG20g08820 vs. Swiss-Prot
Match: THOC4_MOUSE (THO complex subunit 4 OS=Mus musculus GN=Alyref PE=1 SV=3)

HSP 1 Score: 152.9 bits (385), Expect = 4.4e-36
Identity = 110/259 (42.47%), Positives = 146/259 (56.37%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKS----GSSNFRGRGGASSGPGPSRR-----------FRNR-- 60
           MA+ +DMSLDDIIK N+      G    RGR G+  G G + +            RNR  
Sbjct: 1   MADKMDMSLDDIIKLNRSQRGGRGGGRGRGRAGSQGGRGGAVQAAARVNRGGGPMRNRPA 60

Query: 61  --------GLNRAAPYSTAQAPETAWSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLD 120
                   G NR APYS  +     W HD+F D G          + +ETG KL VSNLD
Sbjct: 61  IARGAAGGGRNRPAPYSRPKQLPDKWQHDLF-DSGFG------GGAGVETGGKLLVSNLD 120

Query: 121 YGVSNEDIKELFSEVGDLKRYSINYDKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGK 180
           +GVS+ DI+ELF+E G LK+ +++YD+SGRS GTA++ F R+ADAL A+K+YN V LDG+
Sbjct: 121 FGVSDADIQELFAEFGTLKKAAVHYDRSGRSLGTADVHFERKADALKAMKQYNGVPLDGR 180

Query: 181 HMKLEIVGTNIMTPAVPASANGNFGNPNGFRRGGHVLGRNRGGGRGRGPGRGGRGRGSSR 235
            M +++V + I T   PA +    G       GG   G  R G RG   G  GRGRG+ R
Sbjct: 181 PMNIQLVTSQIDTQRRPAQSINRGGMTRNRGSGGFGGGGTRRGTRG---GSRGRGRGTGR 240

BLAST of Cp4.1LG20g08820 vs. TrEMBL
Match: A0A0A0LUQ3_CUCSA (Uncharacterized protein OS=Cucumis sativus GN=Csa_1G480690 PE=4 SV=1)

HSP 1 Score: 416.0 bits (1068), Expect = 3.1e-113
Identity = 221/250 (88.40%), Positives = 229/250 (91.60%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKSGSSNFRGRGGASSGPGPSRRFRNRGLNRAAPYSTAQAPETA 60
           MAEPLDMSLDDIIKNNKKSGSSNFR RGGASSGPGPSRRFRNRGLNRA PYST++APETA
Sbjct: 1   MAEPLDMSLDDIIKNNKKSGSSNFRARGGASSGPGPSRRFRNRGLNRATPYSTSKAPETA 60

Query: 61  WSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYSINY 120
           WSHDMFVDHGAAYPS P RASAIETGTKLYVSNLDYGVSNEDIKELFSEVGD+KRYSINY
Sbjct: 61  WSHDMFVDHGAAYPSHPPRASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDVKRYSINY 120

Query: 121 DKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMTPAVPASANGNFG 180
           DKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGK MKLEIVGTNI+TPAVPA +N +FG
Sbjct: 121 DKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKPMKLEIVGTNIVTPAVPAPSNASFG 180

Query: 181 NPNGFRRGGHVLGRNRGGGRGRGPGRG-GRGR--------GSSRGRGEKLSAEDLDADLE 240
           NPNGF RGG  +GRNRGGGRGRGPGRG GRGR        GS RG GEKLSAEDLDADL+
Sbjct: 181 NPNGFPRGGRAMGRNRGGGRGRGPGRGRGRGRGSGSGSGSGSGRGHGEKLSAEDLDADLD 240

Query: 241 KYHEEAMQIN 242
           KYHEEAMQIN
Sbjct: 241 KYHEEAMQIN 250

BLAST of Cp4.1LG20g08820 vs. TrEMBL
Match: A0A067LAK2_JATCU (Uncharacterized protein OS=Jatropha curcas GN=JCGZ_05675 PE=4 SV=1)

HSP 1 Score: 330.9 bits (847), Expect = 1.3e-87
Identity = 183/249 (73.49%), Positives = 201/249 (80.72%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKSGSSNFRGRGGASSGPGPSRRFRNRGLNRAAPYSTAQAPETA 60
           M+  LDMSLDDIIK+NKK GS N RGRG A SGPGP+RRF NR  NRAAPYSTA+APET 
Sbjct: 1   MSSALDMSLDDIIKSNKKPGSGNSRGRGRA-SGPGPTRRFTNRVANRAAPYSTAKAPETT 60

Query: 61  WSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYSINY 120
           W HDMF D G  Y  Q  RASAIETGTKLY+SNL+YGVSNEDIKELFSEVGDLKRY+I+Y
Sbjct: 61  WQHDMFTDQGMGYAGQGGRASAIETGTKLYISNLEYGVSNEDIKELFSEVGDLKRYTIHY 120

Query: 121 DKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMTPAVPASANGNFG 180
           D+SGRSKGTAE+VFSR+ DALAA+KRYNNVQLDGK MK+EIVGTNI TPA P++ANG FG
Sbjct: 121 DRSGRSKGTAEVVFSRRTDALAAVKRYNNVQLDGKPMKIEIVGTNIATPAAPSAANGTFG 180

Query: 181 NPNGFRRGGH----VLGRNR---GGGRGRGPGRG-GRGRGSSRGRGEKLSAEDLDADLEK 240
           + N   RGG      +GR R   GGGRG G GRG GRG G   GRGEK+SAEDLDADLEK
Sbjct: 181 SSNAVSRGGQGRGGAVGRQRGGSGGGRGFGRGRGRGRGGGGGGGRGEKVSAEDLDADLEK 240

Query: 241 YHEEAMQIN 242
           YH EAMQ N
Sbjct: 241 YHSEAMQTN 248

BLAST of Cp4.1LG20g08820 vs. TrEMBL
Match: A0A061H038_THECC (RNA-binding family protein OS=Theobroma cacao GN=TCM_041883 PE=4 SV=1)

HSP 1 Score: 324.3 bits (830), Expect = 1.2e-85
Identity = 170/242 (70.25%), Positives = 198/242 (81.82%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKSGSSNFRGRGGASSGPGPSRRFRNRGLNRAAPYSTAQAPETA 60
           M+  L+MSLDD+IK N+KSGS N RGRG   SGPGP+RRF NRG NR+ PY+ A+APET 
Sbjct: 1   MSSALEMSLDDLIKRNRKSGSGNSRGRG-RGSGPGPARRFPNRGANRSGPYTAAKAPETT 60

Query: 61  WSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYSINY 120
           W HDM+ D GAA+  Q  RASAIETGTKLY+SNLDYGVSN+DIKELF+EVGDLKR++I+Y
Sbjct: 61  WQHDMYSDKGAAFQGQAGRASAIETGTKLYISNLDYGVSNDDIKELFAEVGDLKRFTIHY 120

Query: 121 DKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMTPAVPASANGNFG 180
           D+SGRSKGTAE+VFSR+ DA+AA+KRYNNVQLDGK MK+EIVGTN+ TP  P++ NG FG
Sbjct: 121 DRSGRSKGTAEVVFSRRTDAMAAVKRYNNVQLDGKPMKIEIVGTNVATPGAPSAGNGAFG 180

Query: 181 NPNGFRRGGHVLGRNRGGGRGRGPGRG-GRGRGSSRGRGEKLSAEDLDADLEKYHEEAMQ 240
           N NG  RGGH  G   G  RG G GRG GRGRG  +GRGEK+SAEDLDA+LEKYH EAMQ
Sbjct: 181 NSNGAPRGGHGRGGGFGKQRGGGGGRGFGRGRGRGKGRGEKVSAEDLDAELEKYHSEAMQ 240

Query: 241 IN 242
            N
Sbjct: 241 TN 241

BLAST of Cp4.1LG20g08820 vs. TrEMBL
Match: W9QZL4_9ROSA (RNA and export factor-binding protein 2 OS=Morus notabilis GN=L484_001844 PE=4 SV=1)

HSP 1 Score: 321.6 bits (823), Expect = 8.0e-85
Identity = 177/249 (71.08%), Positives = 202/249 (81.12%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKSGSSNFRGRGGASSGPGPSRRFRNRGLNRAAPYSTA-QAPET 60
           M++PL MSLDDIIK NKK+GS N RGRG  S GPGP+RR  NR  NRAAPY+ A +APET
Sbjct: 1   MSDPLSMSLDDIIKTNKKTGSGNPRGRGRLSGGPGPARRVPNRAPNRAAPYAAAPKAPET 60

Query: 61  AWSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYSIN 120
            W HDM++D G A+ +Q  RASAI+TGTKLY+SNL+YGVSNEDIKELFSEVGDLKRY+I+
Sbjct: 61  TWQHDMYMDQGTAFAAQAGRASAIQTGTKLYISNLEYGVSNEDIKELFSEVGDLKRYAIH 120

Query: 121 YDKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMTPAV-PASANGN 180
           YD+SGRSKGTAE+VFSR+ADA+AA+KRYNNVQLDGK MK+EIVGTN+ TPA  P  ANG+
Sbjct: 121 YDRSGRSKGTAEVVFSRRADAVAAVKRYNNVQLDGKPMKIEIVGTNVATPAAPPPPANGS 180

Query: 181 FGNPNGFRRGGH-----VLGRNR-GGGRGRGPGRGGRGRGSSRGRGEKLSAEDLDADLEK 240
           FGN NG  RGG        GR R GGG GRGP R GRGRG  RG GEK+SA+DLDADLEK
Sbjct: 181 FGNSNGLPRGGQGRGGGAFGRPRGGGGGGRGP-RRGRGRGQGRGTGEKISADDLDADLEK 240

Query: 241 YHEEAMQIN 242
           YH EAMQ N
Sbjct: 241 YHAEAMQEN 248

BLAST of Cp4.1LG20g08820 vs. TrEMBL
Match: M5WJJ8_PRUPE (Uncharacterized protein OS=Prunus persica GN=PRUPE_ppa010358mg PE=4 SV=1)

HSP 1 Score: 319.3 bits (817), Expect = 4.0e-84
Identity = 180/253 (71.15%), Positives = 204/253 (80.63%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKSGSSNFRGRGGASSGPGPSRRFRNRGLNRAAPYSTAQAPETA 60
           M +PL+MSLDD+IK +KKSGS N RGRG AS GPGP+RR  NR  NR  PY+ A+APETA
Sbjct: 1   MQDPLNMSLDDLIKTSKKSGSGNARGRGRAS-GPGPARRLPNRAANRTTPYAAAKAPETA 60

Query: 61  WSHDMFVDHGAA-YPSQPARASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYSIN 120
           W HD++ D GAA +P+Q  RASAIETGTKLY+SNLDYGVSNEDIKELFSEVGDLKRY ++
Sbjct: 61  WQHDLYTDQGAAAFPAQAGRASAIETGTKLYISNLDYGVSNEDIKELFSEVGDLKRYGVH 120

Query: 121 YDKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMT----PAVPASA 180
           YD+SGRSKGTAE+VFSR+ DA+AA+KRYNNVQLDGK MK+EIVGTNI T    P +P +A
Sbjct: 121 YDRSGRSKGTAEVVFSRRPDAVAAVKRYNNVQLDGKPMKIEIVGTNISTPGGPPTLPPAA 180

Query: 181 NGNFGNPNGFRRGGH----VLGRNRGGGRGRGPGRGGRGRGS----SRGR-GEKLSAEDL 240
           NGNFGN NG  RGG       GR RGGG GRGP RGGRGRGS     RGR GEK+SAEDL
Sbjct: 181 NGNFGNSNGVPRGGQSRGGAFGRIRGGG-GRGPRRGGRGRGSGNGGGRGRGGEKVSAEDL 240

BLAST of Cp4.1LG20g08820 vs. TAIR10
Match: AT5G59950.5 (AT5G59950.5 RNA-binding (RRM/RBD/RNP motifs) family protein)

HSP 1 Score: 248.4 bits (633), Expect = 4.4e-66
Identity = 151/249 (60.64%), Positives = 185/249 (74.30%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKSGSSNFRGRG-GASSGPGPSRRFR-NRGLNRAAPYSTAQAPE 60
           M+  LDMSLDD+I  N+KS       RG G+ SGPGP+RR   NR   R+APY +A+APE
Sbjct: 1   MSTGLDMSLDDMIAKNRKSRGGAGPARGTGSGSGPGPTRRNNPNRKSTRSAPYQSAKAPE 60

Query: 61  TAWSHDMFVDHGAAYPSQPARASA-IETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYS 120
           + W HDMF D    + S   R+SA IETGTKLY+SNLDYGV NEDIKELF+EVG+LKRY+
Sbjct: 61  STWGHDMFSDRSEDHRS--GRSSAGIETGTKLYISNLDYGVMNEDIKELFAEVGELKRYT 120

Query: 121 INYDKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMTPAVPASANG 180
           +++D+SGRSKGTAE+V+SR+ DALAA+K+YN+VQLDGK MK+EIVGTN+ T A P+    
Sbjct: 121 VHFDRSGRSKGTAEVVYSRRGDALAAVKKYNDVQLDGKPMKIEIVGTNLQTAAAPSGRPA 180

Query: 181 NFGNPNG--FRRGGHVL-GRNRGGGRGRGPGRGGRGRGSSRGRG--EKLSAEDLDADLEK 240
           N GN NG  + RGG    G+ RGGGRG G GRGG GRG   G+G  EK+SAEDLDADL+K
Sbjct: 181 N-GNSNGAPWSRGGQGRGGQQRGGGRG-GGGRGGGGRGRRPGKGPAEKISAEDLDADLDK 240

Query: 241 YHEEAMQIN 242
           YH   M+ N
Sbjct: 241 YHSGDMETN 245

BLAST of Cp4.1LG20g08820 vs. TAIR10
Match: AT5G02530.1 (AT5G02530.1 RNA-binding (RRM/RBD/RNP motifs) family protein)

HSP 1 Score: 244.2 bits (622), Expect = 8.2e-65
Identity = 163/290 (56.21%), Positives = 188/290 (64.83%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKK----------SGSSNFRGRGGASSGPGPSRRFRNRGLNRAAP 60
           M+  LDMSLDDIIK+N+K           G +N  GRGG+ S  GPSRRF NR   R AP
Sbjct: 1   MSGGLDMSLDDIIKSNRKPTGSRGRGGIGGGNNTGGRGGSGSNSGPSRRFANRVGARTAP 60

Query: 61  YSTA----QAPETAWSHDMFVDHG---AAYPSQPARA----SAIETGTKLYVSNLDYGVS 120
           YS      QA +  W +D+F       AA+           S+IETGTKLY+SNLDYGVS
Sbjct: 61  YSRPIQQQQAHDAMWQNDVFATDASVAAAFGHHQTAVVGGGSSIETGTKLYISNLDYGVS 120

Query: 121 NEDIKELFSEVGDLKRYSINYDKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKL 180
           NEDIKELFSEVGDLKRY I+YD+SGRSKGTAE+VFSR+ DALAA+KRYNNVQLDGK MK+
Sbjct: 121 NEDIKELFSEVGDLKRYGIHYDRSGRSKGTAEVVFSRRGDALAAVKRYNNVQLDGKLMKI 180

Query: 181 EIVGTNIMTPAVPASA-------------------NGNF-GNPNGFRRG---GHVLGRNR 240
           EIVGTN+  PA+P  A                   NGNF GN NG  RG   G  +GR R
Sbjct: 181 EIVGTNLSAPALPILATAQIPFPTNGILGNFNENFNGNFNGNFNGNFRGRGRGGFMGRPR 240

BLAST of Cp4.1LG20g08820 vs. TAIR10
Match: AT5G37720.1 (AT5G37720.1 ALWAYS EARLY 4)

HSP 1 Score: 158.3 bits (399), Expect = 5.9e-39
Identity = 120/291 (41.24%), Positives = 157/291 (53.95%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNK--KSGSSNF-----RGRGGASSGPGPSRRFRNRGLNRAAPYST 60
           M+  L+M+LD+I+K  K  +SG         RGRGG   G GP+RR     +N      T
Sbjct: 1   MSGALNMTLDEIVKRGKTARSGGRGISRGRGRGRGGGGRGAGPARR-GPLAVNARPSSFT 60

Query: 61  AQAP-----ETAWSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLDYGVSNEDIKELFS 120
              P        W   +F D   A     A AS +E GT+L+V+NLD GV+NEDI+ELFS
Sbjct: 61  INKPVRRVRSLPWQSGLFEDGLRA-----AGASGVEVGTRLHVTNLDQGVTNEDIRELFS 120

Query: 121 EVGDLKRYSINYDKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMT 180
           E+G+++RY+I+YDK+GR  GTAE+V+ R++DA  A+K+YNNV LDG+ M+LEI+G N  +
Sbjct: 121 EIGEVERYAIHYDKNGRPSGTAEVVYPRRSDAFQALKKYNNVLLDGRPMRLEILGGNNSS 180

Query: 181 PA-VPASANGNFGNPNG-------FRRGGHVLGRNRGGGRGRGP---------------- 239
            A +    N N    NG        ++GG   GR RGG  GRGP                
Sbjct: 181 EAPLSGRVNVNVTGLNGRLKRTVVIQQGGGGRGRVRGGRGGRGPAPTVSRRLPIHNQQGG 240

BLAST of Cp4.1LG20g08820 vs. TAIR10
Match: AT1G66260.1 (AT1G66260.1 RNA-binding (RRM/RBD/RNP motifs) family protein)

HSP 1 Score: 147.5 bits (371), Expect = 1.0e-35
Identity = 114/300 (38.00%), Positives = 159/300 (53.00%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKSGSS------------NFRGRGGASSGPGPSRRFRNRGLNRA 60
           M++ L+M+LD+I+K +K   S+            + RGRGG +   G  R     G  R 
Sbjct: 1   MSDALNMTLDEIVKKSKSERSAAARSGGKGVSRKSGRGRGGPNGVVGGGR---GGGPVRR 60

Query: 61  APYSTAQAPETAWSHDMFVDHGAAYPSQPAR-----------ASAIETGTKLYVSNLDYG 120
            P +    P +++S +       + P Q               S +E GT +Y++NLD G
Sbjct: 61  GPLAVNTRPSSSFSINKLARRKRSLPWQNQNDLYEETLRAVGVSGVEVGTTVYITNLDQG 120

Query: 121 VSNEDIKELFSEVGDLKRYSINYDKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHM 180
           V+NEDI+EL++E+G+LKRY+I+YDK+GR  G+AE+V+ R++DA+ A+++YNNV LDG+ M
Sbjct: 121 VTNEDIRELYAEIGELKRYAIHYDKNGRPSGSAEVVYMRRSDAIQAMRKYNNVLLDGRPM 180

Query: 181 KLEIVGTNIMTPAVPASANGNFGNPNGFRR-----GGHVLGRNRGGGRGRGP-------- 240
           KLEI+G N  T + P +A  N    NG  +     G  V G   G GRG GP        
Sbjct: 181 KLEILGGN--TESAPVAARVNVTGLNGRMKRSVFIGQGVRGGRVGRGRGSGPSGRRLPLQ 240

Query: 241 ---------GRG---GRGRGSSRGRGEK-----------LSAEDLDADLEKYHEEAMQIN 242
                    GRG   GRGRG+  GRG K            SA DLD DLE YH EAM I+
Sbjct: 241 QNQQGGVTAGRGGFRGRGRGNGGGRGNKSGGRGGKKPVEKSAADLDKDLESYHAEAMNIS 295

BLAST of Cp4.1LG20g08820 vs. TAIR10
Match: AT2G37220.1 (AT2G37220.1 RNA-binding (RRM/RBD/RNP motifs) family protein)

HSP 1 Score: 53.1 bits (126), Expect = 2.7e-07
Identity = 27/80 (33.75%), Positives = 45/80 (56.25%), Query Frame = 1

Query: 81  SAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYSINYDK-SGRSKGTAEIVFSRQAD 140
           S   +G ++YV NL +GV +  ++ LFSE G +    + YD+ SGRSKG   + +    +
Sbjct: 198 SGAGSGNRVYVGNLSWGVDDMALESLFSEQGKVVEARVIYDRDSGRSKGFGFVTYDSSQE 257

Query: 141 ALAAIKRYNNVQLDGKHMKL 160
              AIK  +   LDG+ +++
Sbjct: 258 VQNAIKSLDGADLDGRQIRV 277

BLAST of Cp4.1LG20g08820 vs. NCBI nr
Match: gi|659115424|ref|XP_008457549.1| (PREDICTED: THO complex subunit 4A [Cucumis melo])

HSP 1 Score: 418.3 bits (1074), Expect = 9.0e-114
Identity = 220/245 (89.80%), Positives = 228/245 (93.06%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKSGSSNFRGRGGASSGPGPSRRFRNRGLNRAAPYSTAQAPETA 60
           MAEPLDMSLDDIIKNNKKSGSSNFR RGGASSGPGPSRRFRNRGLNRA PYST++APETA
Sbjct: 1   MAEPLDMSLDDIIKNNKKSGSSNFRARGGASSGPGPSRRFRNRGLNRATPYSTSKAPETA 60

Query: 61  WSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYSINY 120
           WSHDMFVDHGAAYPS P RASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYSINY
Sbjct: 61  WSHDMFVDHGAAYPSHPPRASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYSINY 120

Query: 121 DKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMTPAVPASANGNFG 180
           DKSGRSKGTAEI+FSR ADALAAIKRYNNVQLDGK MKLEIVGTNI+TPAVPA +N +FG
Sbjct: 121 DKSGRSKGTAEILFSRPADALAAIKRYNNVQLDGKPMKLEIVGTNIVTPAVPAPSNASFG 180

Query: 181 NPNGFRRGGHVLGRNRGGGRGRGPGRGGRGR----GSSRGRGEKLSAEDLDADLEKYHEE 240
           N NGF RGG  +GRNRGGGRGRGPGRGGRGR    GS RGRGEKLSAEDLDADL+KYHEE
Sbjct: 181 NHNGFPRGGRAMGRNRGGGRGRGPGRGGRGRGSGSGSGRGRGEKLSAEDLDADLDKYHEE 240

Query: 241 AMQIN 242
           AMQIN
Sbjct: 241 AMQIN 245

BLAST of Cp4.1LG20g08820 vs. NCBI nr
Match: gi|778661318|ref|XP_004149042.2| (PREDICTED: THO complex subunit 4A [Cucumis sativus])

HSP 1 Score: 416.0 bits (1068), Expect = 4.5e-113
Identity = 221/250 (88.40%), Positives = 229/250 (91.60%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKSGSSNFRGRGGASSGPGPSRRFRNRGLNRAAPYSTAQAPETA 60
           MAEPLDMSLDDIIKNNKKSGSSNFR RGGASSGPGPSRRFRNRGLNRA PYST++APETA
Sbjct: 1   MAEPLDMSLDDIIKNNKKSGSSNFRARGGASSGPGPSRRFRNRGLNRATPYSTSKAPETA 60

Query: 61  WSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYSINY 120
           WSHDMFVDHGAAYPS P RASAIETGTKLYVSNLDYGVSNEDIKELFSEVGD+KRYSINY
Sbjct: 61  WSHDMFVDHGAAYPSHPPRASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDVKRYSINY 120

Query: 121 DKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMTPAVPASANGNFG 180
           DKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGK MKLEIVGTNI+TPAVPA +N +FG
Sbjct: 121 DKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKPMKLEIVGTNIVTPAVPAPSNASFG 180

Query: 181 NPNGFRRGGHVLGRNRGGGRGRGPGRG-GRGR--------GSSRGRGEKLSAEDLDADLE 240
           NPNGF RGG  +GRNRGGGRGRGPGRG GRGR        GS RG GEKLSAEDLDADL+
Sbjct: 181 NPNGFPRGGRAMGRNRGGGRGRGPGRGRGRGRGSGSGSGSGSGRGHGEKLSAEDLDADLD 240

Query: 241 KYHEEAMQIN 242
           KYHEEAMQIN
Sbjct: 241 KYHEEAMQIN 250

BLAST of Cp4.1LG20g08820 vs. NCBI nr
Match: gi|720013236|ref|XP_010260105.1| (PREDICTED: THO complex subunit 4A [Nelumbo nucifera])

HSP 1 Score: 335.1 bits (858), Expect = 1.0e-88
Identity = 183/250 (73.20%), Positives = 206/250 (82.40%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKSGSSNFRGRGGASSGPGPSRRFRNRGLNRAAPYSTA---QAP 60
           M+  LDM+L+D+IKNNKKSG  NFRGRG   SGPGPSRRF NRG NR APYST    QAP
Sbjct: 1   MSSALDMTLEDLIKNNKKSGGGNFRGRG-RGSGPGPSRRFPNRGANRTAPYSTGKPVQAP 60

Query: 61  ETAWSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYS 120
           ++AW HDMF D  AAYP+Q AR SAIETGTKLY+SNL+YGVSNEDIKELFSEVGDLKRY+
Sbjct: 61  DSAWQHDMFTDQAAAYPAQAARTSAIETGTKLYISNLEYGVSNEDIKELFSEVGDLKRYT 120

Query: 121 INYDKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMTP-AVPASAN 180
           ++YD+SGRSKGTAE+VFSR+ADALAA+KRYNNVQLDGK MK+E+VGTNI TP AVP +AN
Sbjct: 121 VHYDRSGRSKGTAEVVFSRRADALAAVKRYNNVQLDGKPMKIEVVGTNIATPVAVPPAAN 180

Query: 181 GNFGNPNGFRRGGH----VLGRNRGGGRGRGPGRG-GRGRGSSRGRGEKLSAEDLDADLE 240
           G FGNPNG  R G      +GR+RGGG     GRG GRGRG  R RGE++SAEDLDADLE
Sbjct: 181 GGFGNPNGVPRSGQGRGGAMGRSRGGG-----GRGFGRGRGRRRDRGEQISAEDLDADLE 240

Query: 241 KYHEEAMQIN 242
           KYH EAMQIN
Sbjct: 241 KYHSEAMQIN 244

BLAST of Cp4.1LG20g08820 vs. NCBI nr
Match: gi|802553469|ref|XP_012064998.1| (PREDICTED: THO complex subunit 4A [Jatropha curcas])

HSP 1 Score: 330.9 bits (847), Expect = 1.9e-87
Identity = 183/249 (73.49%), Positives = 201/249 (80.72%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKSGSSNFRGRGGASSGPGPSRRFRNRGLNRAAPYSTAQAPETA 60
           M+  LDMSLDDIIK+NKK GS N RGRG A SGPGP+RRF NR  NRAAPYSTA+APET 
Sbjct: 1   MSSALDMSLDDIIKSNKKPGSGNSRGRGRA-SGPGPTRRFTNRVANRAAPYSTAKAPETT 60

Query: 61  WSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYSINY 120
           W HDMF D G  Y  Q  RASAIETGTKLY+SNL+YGVSNEDIKELFSEVGDLKRY+I+Y
Sbjct: 61  WQHDMFTDQGMGYAGQGGRASAIETGTKLYISNLEYGVSNEDIKELFSEVGDLKRYTIHY 120

Query: 121 DKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMTPAVPASANGNFG 180
           D+SGRSKGTAE+VFSR+ DALAA+KRYNNVQLDGK MK+EIVGTNI TPA P++ANG FG
Sbjct: 121 DRSGRSKGTAEVVFSRRTDALAAVKRYNNVQLDGKPMKIEIVGTNIATPAAPSAANGTFG 180

Query: 181 NPNGFRRGGH----VLGRNR---GGGRGRGPGRG-GRGRGSSRGRGEKLSAEDLDADLEK 240
           + N   RGG      +GR R   GGGRG G GRG GRG G   GRGEK+SAEDLDADLEK
Sbjct: 181 SSNAVSRGGQGRGGAVGRQRGGSGGGRGFGRGRGRGRGGGGGGGRGEKVSAEDLDADLEK 240

Query: 241 YHEEAMQIN 242
           YH EAMQ N
Sbjct: 241 YHSEAMQTN 248

BLAST of Cp4.1LG20g08820 vs. NCBI nr
Match: gi|719991357|ref|XP_010253243.1| (PREDICTED: THO complex subunit 4A-like isoform X1 [Nelumbo nucifera])

HSP 1 Score: 328.6 bits (841), Expect = 9.4e-87
Identity = 180/246 (73.17%), Positives = 203/246 (82.52%), Query Frame = 1

Query: 1   MAEPLDMSLDDIIKNNKKSGSSNFRGRGGASSGPGPSRRFRNRGLNRAAPYS---TAQAP 60
           M+  LDMSLDD+IKNNKKSG SNFRGRG   SGPGP+RR  NRG NR  PYS   + QAP
Sbjct: 1   MSSSLDMSLDDLIKNNKKSGGSNFRGRG-RGSGPGPARRAPNRGANRTTPYSMGKSVQAP 60

Query: 61  ETAWSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYS 120
           ++AW HDMF D  AAYP+Q +RASAIETGTKLY+SNLDYGVSN+DIKELFSEVGDLKRY+
Sbjct: 61  DSAWQHDMFTDQAAAYPAQASRASAIETGTKLYISNLDYGVSNDDIKELFSEVGDLKRYT 120

Query: 121 INYDKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMTPAV-PASAN 180
           I+YD+SGRSKGTAE+VFSR+ADALAA+KRYNNVQLDGK MK+EIVGTN+ TPAV PA+ N
Sbjct: 121 IHYDRSGRSKGTAEVVFSRRADALAAVKRYNNVQLDGKPMKIEIVGTNVTTPAVAPAATN 180

Query: 181 GNFGNPNGFRRGGHVLGRNRGGGRGRGPGRG-GRGRGSSRGRGEKLSAEDLDADLEKYHE 240
           G FGNPN   R     G+ RGG  GR  GRG GRGRG  RGR E+++AEDLDADLEKYH 
Sbjct: 181 GKFGNPNSVPRS----GQGRGGAIGRPRGRGFGRGRGRGRGRAEQITAEDLDADLEKYHS 240

Query: 241 EAMQIN 242
           EAMQIN
Sbjct: 241 EAMQIN 241

The following BLAST results are available for this feature:

Swiss-Prot
Match Name     E-value    Identity (%)   Description
THO4A_ARATH    1.0e-64    60.48          THO complex subunit 4A OS=Arabidopsis thaliana GN=ALY1 PE=1 SV=1
THO4B_ARATH    1.5e-63    56.21          THO complex subunit 4B OS=Arabidopsis thaliana GN=ALY2 PE=1 SV=1
THOC4_TAEGU    4.3e-39    44.79          THO complex subunit 4 OS=Taeniopygia guttata GN=ALYREF PE=2 SV=1
THO4D_ARATH    1.1e-37    41.24          THO complex subunit 4D OS=Arabidopsis thaliana GN=ALY4 PE=1 SV=1
THOC4_MOUSE    4.4e-36    42.47          THO complex subunit 4 OS=Mus musculus GN=Alyref PE=1 SV=3

TrEMBL
Match Name          E-value     Identity (%)   Description
A0A0A0LUQ3_CUCSA    3.1e-113    88.40          Uncharacterized protein OS=Cucumis sativus GN=Csa_1G480690 PE=4 SV=1
A0A067LAK2_JATCU    1.3e-87     73.49          Uncharacterized protein OS=Jatropha curcas GN=JCGZ_05675 PE=4 SV=1
A0A061H038_THECC    1.2e-85     70.25          RNA-binding family protein OS=Theobroma cacao GN=TCM_041883 PE=4 SV=1
W9QZL4_9ROSA        8.0e-85     71.08          RNA and export factor-binding protein 2 OS=Morus notabilis GN=L484_001844 PE=4 S...
M5WJJ8_PRUPE        4.0e-84     71.15          Uncharacterized protein OS=Prunus persica GN=PRUPE_ppa010358mg PE=4 SV=1

TAIR10
Match Name     E-value    Identity (%)   Description
AT5G59950.5    4.4e-66    60.64          RNA-binding (RRM/RBD/RNP motifs) family protein
AT5G02530.1    8.2e-65    56.21          RNA-binding (RRM/RBD/RNP motifs) family protein
AT5G37720.1    5.9e-39    41.24          ALWAYS EARLY 4
AT1G66260.1    1.0e-35    38.00          RNA-binding (RRM/RBD/RNP motifs) family protein
AT2G37220.1    2.7e-07    33.75          RNA-binding (RRM/RBD/RNP motifs) family protein

NCBI nr
Match Name                          E-value     Identity (%)   Description
gi|659115424|ref|XP_008457549.1|    9.0e-114    89.80          PREDICTED: THO complex subunit 4A [Cucumis melo]
gi|778661318|ref|XP_004149042.2|    4.5e-113    88.40          PREDICTED: THO complex subunit 4A [Cucumis sativus]
gi|720013236|ref|XP_010260105.1|    1.0e-88     73.20          PREDICTED: THO complex subunit 4A [Nelumbo nucifera]
gi|802553469|ref|XP_012064998.1|    1.9e-87     73.49          PREDICTED: THO complex subunit 4A [Jatropha curcas]
gi|719991357|ref|XP_010253243.1|    9.4e-87     73.17          PREDICTED: THO complex subunit 4A-like isoform X1 [Nelumbo nucifera]

The following terms have been associated with this gene:

Vocabulary: Molecular Function
Term          Definition
GO:0000166    nucleotide binding
GO:0003676    nucleic acid binding

Vocabulary: INTERPRO
Term          Definition
IPR025715     FoP_C
IPR012677     Nucleotide-bd_a/b_plait_sf
IPR000504     RRM_dom

GO Assignments
This gene is annotated with the following GO terms.
Category              Term Accession   Term Name
biological_process    GO:0008150       biological_process
cellular_component    GO:0005575       cellular_component
molecular_function    GO:0003676       nucleic acid binding
molecular_function    GO:0000166       nucleotide binding
molecular_function    GO:0003723       RNA binding

The following mRNA feature(s) are a part of this gene:

Feature Name         Unique Name          Type
Cp4.1LG20g08820.1    Cp4.1LG20g08820.1    mRNA


Analysis Name: InterPro Annotations of Cucurbita pepo
Date Performed: 2017-12-02
IPR Term    IPR Description                                  Source     Source Term          Source Description                       Alignment
IPR000504   RNA recognition motif domain                     PFAM       PF00076              RRM_1                                    coord: 89..157; score: 1.4
IPR000504   RNA recognition motif domain                     SMART      SM00360              rrm1_1                                   coord: 88..160; score: 4.3
IPR000504   RNA recognition motif domain                     PROFILE    PS50102              RRM                                      coord: 87..164; score: 16
IPR012677   Nucleotide-binding alpha-beta plait domain       GENE3D     G3DSA:3.30.70.330    -                                        coord: 76..161; score: 4.6
IPR012677   Nucleotide-binding alpha-beta plait domain       unknown    SSF54928             RNA-binding domain, RBD                  coord: 50..165; score: 1.17
IPR025715   Chromatin target of PRMT1 protein, C-terminal    PFAM       PF13865              FoP_duplication                          coord: 187..234; score: 5.
IPR025715   Chromatin target of PRMT1 protein, C-terminal    SMART      SM01218              FoP_duplication_2                        coord: 170..241; score: 5.2
None        No IPR available                                 unknown    Coil                 Coil                                     coord: 223..241; scor
None        No IPR available                                 PANTHER    PTHR19965            RNA AND EXPORT FACTOR BINDING PROTEIN    coord: 1..241; score: 3.1E
None        No IPR available                                 PANTHER    PTHR19965:SF29       SUBFAMILY NOT NAMED                      coord: 1..241; score: 3.1E
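
The `coord` values in the InterPro table can be used to cut the matching region out of the protein sequence. The following is a minimal sketch, assuming the coordinates are 1-based, inclusive residue positions on the 241-residue protein listed on this page. `domain_slice` is an illustrative helper name.

```python
import re

# Protein sequence of Cp4.1LG20g08820 as listed on this page (241 residues).
PROTEIN = (
    "MAEPLDMSLDDIIKNNKKSGSSNFRGRGGASSGPGPSRRFRNRGLNRAAPYSTAQAPETA"
    "WSHDMFVDHGAAYPSQPARASAIETGTKLYVSNLDYGVSNEDIKELFSEVGDLKRYSINY"
    "DKSGRSKGTAEIVFSRQADALAAIKRYNNVQLDGKHMKLEIVGTNIMTPAVPASANGNFG"
    "NPNGFRRGGHVLGRNRGGGRGRGPGRGGRGRGSSRGRGEKLSAEDLDADLEKYHEEAMQIN"
)

def domain_slice(protein, coord):
    """Return the residues covered by a 'coord: START..END' annotation."""
    m = re.fullmatch(r"\s*(?:coord:\s*)?(\d+)\.\.(\d+)\s*", coord)
    if m is None:
        raise ValueError(f"unrecognized coordinates: {coord!r}")
    start, end = int(m.group(1)), int(m.group(2))
    if not 1 <= start <= end <= len(protein):
        raise ValueError("coordinates fall outside the protein")
    # Assumption: 1-based inclusive positions, hence start - 1 for Python slicing.
    return protein[start - 1:end]

rrm = domain_slice(PROTEIN, "coord: 89..157")  # Pfam PF00076 (RRM_1) match
print(len(PROTEIN), len(rrm))  # 241 69
```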

The following gene(s) are paralogous to this gene:
Gene               Paralogue          Organism                     Block
Cp4.1LG20g08820    Cp4.1LG02g12950    Cucurbita pepo (Zucchini)    cpecpeB432