Cp4.1LG04g03250 (gene) Cucurbita pepo (Zucchini)

Name: Cp4.1LG04g03250
Type: gene
Organism: Cucurbita pepo (Zucchini)
Description: Protein thf1
Location: Cp4.1LG04 : 7319621 .. 7324748 (-)
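The location field above packs the linkage group, coordinates, and strand into one string. The following is a minimal sketch of how it could be split into parts. It assumes the coordinates are 1-based and inclusive, which the page itself does not state; `parse_location` is a hypothetical helper name, not part of any database API.

```python
import re

def parse_location(text):
    """Split 'SEQID : START .. END (STRAND)' into its parts.

    Assumption: coordinates are 1-based and inclusive, so the
    span length is end - start + 1.
    """
    m = re.fullmatch(
        r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*", text
    )
    if m is None:
        raise ValueError(f"unrecognized location: {text!r}")
    start, end = int(m.group(2)), int(m.group(3))
    return {
        "seqid": m.group(1),
        "start": start,
        "end": end,
        "strand": m.group(4),
        "length": end - start + 1,
    }

loc = parse_location("Cp4.1LG04 : 7319621 .. 7324748 (-)")
print(loc["seqid"], loc["strand"], loc["length"])  # Cp4.1LG04 - 5128
```

Under that assumption the gene spans 5128 bases on the minus strand of Cp4.1LG04.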
The following sequences are available for this feature:

Gene sequence (with intron)

Legend: five_prime_UTR, CDS, three_prime_UTR
CATGGCCTCTCAGTTTCGTCCTCTCTTGTTTCTTTCTTTTCTTTTCTTTTTTTTTTTTTTTTTTCCTCAATGAAAGCTTGGTTATTCATTAAAAGAAATGATATGCCAAACTTCCTNTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTCTCCCAGAAAAAATCTTCGACATGAAATCCTTTTTCTCTGGAAGTTTGTAAGATTTGCAAATTTCTTCTCATTCTTCTGCAATGGCGGCTGTTAATTCCGTGTCATTCTCAACGGTAAGTCAGTTTTCTGATCGAAGGTCGCCGGTTCCGTCGGCTCGTTCACTCGCCTCGAATTTCGACGGGTTTCGTTTCCGTTCTAGTGTTTTCTATCATCATTCGGGAGTTCGAACCTCGAGTTTCAGTTCTCGCTTGGTCATTCATTGCATGTCCGCCGGAACAGGTAGCTTCTGGCAGATTTGTTTTCCTTACTCGCTTACTAGACTGTTCCTTTGCTAGATTATGGTAGCAGAAACGAATTGGCTTGTTTCGATGATTTCGATGTGAACTTGTGTTTTTTTGTTCGTTAATTGTTGAGATGAATCGTTATGGACGGAAATTTATGTTTTAAGTAACAATTAGGCTTTCAATAGTTCGGTTTGATCAATTCTAAACTTGTTCAGTATGTTGAGATTTCTCTAGTTCGATTTCTTAGTATTTGAGTGTAGAAAATGATGTGCTATTTTCGGATTTAGATGTGACGACTGTAGCTGAGACTAAATTGAACTTTCTAAAGGCGTATAAACGACCTATCCCTAGCATATACAACTCTGTTCTGCAAGAGTTGATTGTGCAGCAGCATTTGATGAGGTATAAGAGGACGTACCGTTATGATCCTGTTTTCGCCCTCGGTTTTGTTACTGTATATGATCAGCTTATGGATGGGTACCCTAGCGATGAGGATCGGGAGGCCATTTTCCAAGCCTACATTAAGGCGTTGAATGAGGATCCAGAGCAATATAGGTTTGGCCATGTATTCCCCTTTGCTTTACCATCCCTGATCTCTATCTCTTCTTGCCATCTAATAAATTCACTAGCCTTGTTCATAGATAGTAAACATTGTTAAGTTGAAAAAAACATGACATGATTCCACTTACAGCAAAAATACGACGATTCTTATTCAACTTTTATCTTCTTTCCTAGTATTTCACATCTTAATTTGGACTTGGTTGTCTAAATGACTAGATTATTTGGTCAAATCACATGGTAATCTAATTAACGGATGTTTACTCTTTGTTCAATATCTCCAAGAGTGGAATTTTCTTCACTGGAAATTATATTCCATCCTTCGACCATTCTTCCAATAATTTCAAGAAGGTTTTTCTTCATTTTTAGTACTACTTTCTGTTATACCGTCCAGAATTGATGCTCAAAAATTGGAAGAGTGGGCTCGGTCTCAGTCTGCAGCTTCATTGGTTGAATTTGCATCAAAAGAAGGAGAAGTTGAGAGTGTTTTGAAAGACATTGCAGAACGAGCTGCGAGTAAGGGGAGTTTCAGTTACAGCCGATTTTTTGCGATTGGGCTATTTCGACTCCTCGAATTGGCAAATGCTACTGAACCCAGTATCTTGGAAAAGGTTTATTTTATCTTCCTCTTTGTGATATAGCTTTTAAGTAGATGAAGATTTTGAGATGATAGTATATCGAGCTTATTATTGCTAAATTAAACATTTACATGATAGCCAGCCCCCCTCTCATAGTACTTTTACTTGTAGATTGTATTCTTTCTCTGAGTGCTGCCATTCATAAAGTGTAATTGTGTTTCGTGCTCATTATTATCTTTTTACTCATTAGAGAAATTGGATCCTGATGGAAGAGTGCTTAACCTTATTCAGTTCGATAGAATTAGGTTCAAGGTACTCTTTCCTGCATTAGTCTCCATTTTACTTTTCTAGGTAACATCAATCTTTTTTGTAGTATTTGGTTGAAGAGAAATATGAGCTAACCATTTTTTGGAACATTCCCCTTTGCTCGG
GACAATCCCACATACACAATTACAGTGTTTTTGTGGGATTTCCTTCTCCTTTGTGGAGATTGTACTGATCTTGTGGATGTAGTTTATGCTCTTTAGTAGTCTCAGCTGGTCCTTTTCATTCAGTTTATTAGCTCTTGTGAAGGGAAACATACATATGTTCCACTGCCCCCATCAACAACAGTTATCATGAACTATTGCTTTAATTGGCCATCTCTATTTAGCGTGTCTATCTCCAAACATACTTTTATTTTTCTAAAACAGCTAAAGCCATGTAATGGAGATTTTACCAAGTGTTGACATCTAACACTTTGCTGCTTCTGAACAGCTCTGTGCCGCTTTAAATGTTGACAAAAAAAGTGTGGACCGAGACCTTGATGTATACCGCAACCTGCTTTCGAAGTTGGTTCAGGCGAAGGAGCTCCTAAAGGAATACGTTGATAGGTAAGAGATTTTTAAGTTCATGGTTCCTGGCCTGGTAGATTAGCATACCCACAATCTGCTCTTTCACACCACCATATTTTCTTGGATTTTATGAACCCATAATCGTTTTGAAAACAATCCAGTATTTGTCAAATCTTCTGCAACTCACCGTCATATCATTTATATTCACTAGATATCCACCGCTTGGCTGTGTTTTAAGATGAAAGTGATGTTTTATGTAAGTTTAAATGAAGTATTTTTGCATTCAATACAGTGGTGACCTTAGATTAAGTGGAAAATTTCGGGTCATTACATGGATCTTGAGTGAGGGTGGCTGTGTTTTAAGATGAAGGTGATGTTTTATGAAAGTTTAAATGAAGTATTTCTGCATTTAATGTTCATGATATGTATACAGTAGCGGCCTTAAATTTAGTAGAAAATTTCGGATCGTTACATGAATCTTGAGTGAGGCTGACTGTGTTCTAAAATGAAGGTGATATTTTATGAAAGTTTAAATGAAGTGTTTCTGCCTTCAATGTTCATGATATGTATACAGTGACGACCTTATATTAAGTAGAAAATTTCGGGTCGTTACATGGATCTTGGGTGAGGCTGGCTGTGTTTTAAGATCAAGGTGATGTTTTATGAAAATTTAAATGAAGTATTTTTGCATTCAATATTCATGATATGTATACAGTGGCAACCTTAAATTAAATAGAAAATTTCGGGTTGTTACATGGATCTTAAGTGAGGCTGGCTGTGTTTTAAGATGAAGGTGATGGTGATGTTTTATGAAAGTTTAAATGAAGTATTTCCGCCTTCAATGTTTATGATATGTATACAGTGGTGACCTTAAATTAAGTAAAAAATTTCGGATCGGTACATGGGTCTTGGGAGGCCTAAAAATGAGTCAGAATTGGGTGGATCATGGTATAATTAACCAGTGCTCTTCAATTCTGTTTTTGCCTATAGAGAGAAGAAGAAAAGAGATGAGAGGGCTGGATCACAGACAGCTAACGAGGCCGTAACAAAATGAATGATTGGGTGAATACAGCATAGAGAGTGGTTTTTGAGAGCTGATGACATCAATTGGAGCACTCACCGCCTAATTTGAGATAAGACTTTAAAGAGTTATAGCAATATTCTAAATACCTAATACATATGCATTTGTACTGTATTGTTGGGTCTTCTGCATTTGGTAAATTTTGTATTCTGCCAGCTTCTACTTTTCTTCTCATTTGTATTACATATTTTGAGTGCAATTGTCTATACAAATGCTTTCGCACTCNTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTGTATGTGCAACCATGTCTGGATTCTTTTGAGGGATTCTGATGAGGCATAAGTATATCTATTAGATCTTAGAAACCACGACTCTCTACAATGGTATGATAGTGTTTACTTTGCTTTTGGTTTCCTTAAAAGGCTTCAAACCAATGGAGATGTATTTCTTATAAACTTAGGAATCATTTCCTTAATTAGTGTGGGACTCTCTCCCAACGATTTTCAACAATCCTCCTTTCGAACAAAGTACACTACAGAGCCTCT
CCCGAGGCCTATGAAGCCCTCGAACAACCTCCCCTTAATTGAGGCTTGACTCCTTTCTCTAGAGTCTTCGAACAAAGTACACCATTTGTTTGACATTTGAGTCACTTTTGACTACACTTTTAAGGGTCCGAACTTGTTTGTTCGATATTTGAGGATTCTATTGACATGGCTAAGTTAAGGACATAACTCTGATACCATGTTAGGAATCACGGATCTCTACAATGGTATGATATTGTCCATTTTGAACATAAGGTCTCATGACTTTGCTATTTGGTTTTCCCAAAAGGTCTCGTACCAATGAAAATATATTATTTATTTATAGACTCGTGATCATTCCCTTAATTAGCTAATGTGGGACTCCCTCCCAACAATCCTCAACATTGAAGGCTTATATGAATATATTTTAGATGTTTAGTCTGTTGCATTTCGTCACCCGTCCTTCCTTTATCTTCATCACCTTCATATAGTAACAATAACTTACAATAGATGAAATTATACTTTTAATTCTTCGAACTTGGAACTAATGTTTATTTAGTCTGATTGGTTTTGTTCAATTTGGAACCAATTTATGATCGATTTTATCAGAGACGAAACTGATTCAACAAAATTCGAACTCTGAGAACATAGAAGGGAAGAGAGATTTGCTAGTGGATCGAGGCAATTGAACCCGCAAAATAATCTATATTTGTGAGATGAAGCTATCCTTGAAGGCACAACCAGTGAGAGTGAAGAACAACTGACAAATCTGCAACCCAGAACCGAGCAACGATCAAGGACTCTAACGGTGATTACTCTTAGGGATGAGTTCCAAATTCAAAATCGATCGAGGAAAAAAGTTCGAGACAAAGGTATGAACAAAATGTGAAATGGCACATGAGTGGTTGGAATGATTAATAAATCAAATGCAAATTATATAAAGTGATACGACATTATGTGGTAGGAGCTTCATACATAATATTATTGGGAATTTGGAGCTTTTTGTACATACGTATTGTAGTTTAACAACACATACATTATTCTATTGTGTTTATATTTGGTGGATGACCATGGCCATACGGTCAAAGCCAACCAAGCACTCGGTATGGCTATAACTTTAAAATATTCAACAAAGAGATCTCTCTCGACCAA

mRNA sequence

CATGGCCTCTCAGTTTCGTCCTCTCTTGTTTCTTTCTTTTCTTTTCTTTTTTTTTTTTTTTTTTCCTCAATGAAAGCTTGGTTATTCATTAAAAGAAATGATATGCCAAACTTCCTNTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTCTCCCAGAAAAAATCTTCGACATGAAATCCTTTTTCTCTGGAAGTTTGTAAGATTTGCAAATTTCTTCTCATTCTTCTGCAATGGCGGCTGTTAATTCCGTGTCATTCTCAACGGTAAGTCAGTTTTCTGATCGAAGGTCGCCGGTTCCGTCGGCTCGTTCACTCGCCTCGAATTTCGACGGGTTTCGTTTCCGTTCTAGTGTTTTCTATCATCATTCGGGAGTTCGAACCTCGAGTTTCAGTTCTCGCTTGGTCATTCATTGCATGTCCGCCGGAACAGATGTGACGACTGTAGCTGAGACTAAATTGAACTTTCTAAAGGCGTATAAACGACCTATCCCTAGCATATACAACTCTGTTCTGCAAGAGTTGATTGTGCAGCAGCATTTGATGAGGTATAAGAGGACGTACCGTTATGATCCTGTTTTCGCCCTCGGTTTTGTTACTGTATATGATCAGCTTATGGATGGGTACCCTAGCGATGAGGATCGGGAGGCCATTTTCCAAGCCTACATTAAGGCGTTGAATGAGGATCCAGAGCAATATAGAATTGATGCTCAAAAATTGGAAGAGTGGGCTCGGTCTCAGTCTGCAGCTTCATTGGTTGAATTTGCATCAAAAGAAGGAGAAGTTGAGAGTGTTTTGAAAGACATTGCAGAACGAGCTGCGAGTAAGGGGAGTTTCAGTTACAGCCGATTTTTTGCGATTGGGCTATTTCGACTCCTCGAATTGGCAAATGCTACTGAACCCAGTATCTTGGAAAAGCTCTGTGCCGCTTTAAATGTTGACAAAAAAAGTGTGGACCGAGACCTTGATGTATACCGCAACCTGCTTTCGAAGTTGGTTCAGGCGAAGGAGCTCCTAAAGGAATACGTTGATAGAGAGAAGAAGAAAAGAGATGAGAGGGCTGGATCACAGACAGCTAACGAGGCCGTAACAAAATGAATGATTGGGTGAATACAGCATAGAGAGTGGTTTTTGAGAGCTGATGACATCAATTGGAGCACTCACCGCCTAATTTGAGATAAGACTTTAAAGAGTTATAGCAATATTCTAAATACCTAATACATATGCATTTGTACTGTATTGTTGGGTCTTCTGCATTTGGTAAATTTTGTATTCTGCCAGCTTCTACTTTTCTTCTCATTTGTATTACATATTTTGAGTGCAATTGTCTATACAAATGCTTTCGCACTCNTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTGTATGTGCAACCATGTCTGGATTCTTTTGAGGGATTCTGATGAGGCATAAGTATATCTATTAGATCTTAGAAACCACGACTCTCTACAATGGTATGATAGTGTTTACTTTGCTTTTGGTTTCCTTAAAAGGCTTCAAACCAATGGAGATGTATTTCTTATAAACTTAGGAATCATTTCCTTAATTAGTGTGGGACTCTCTCCCAACGATTTTCAACAATCCTCCTTTCGAACAAAGTACACTACAGAGCCTCTCCCGAGGCCTATGAAGCCCTCGAACAACCTCCCCTTAATTGAGGCTTGACTCCTTTCTCTAGAGTCTTCGAACAAAGTACACCATTTGTTTGACATTTGAGTCACTTTTGACTACACTTTTAAGGGTCCGAACTTGTTTGTTCGATATTTGAGGATTCTATTGACATGGCTAAGTTAAGGACATAACTCTGATACCATGTTAGGAATCACGGATCTCTACAATGGTATGATATTGTCCATTTTGAACATAAGGTCTCATGACTTTGCTATTTGGTTTTCCCAAAAGGTCTCGTACCAATGAAAATATATTATTTATTTATAGACTCGTGATCATTCCCTTAATTAGCTAATGTGGGACTCCCT
CCCAACAATCCTCAACATTGAAGGCTTATATGAATATATTTTAGATGTTTAGTCTGTTGCATTTCGTCACCCGTCCTTCCTTTATCTTCATCACCTTCATATAGTAACAATAACTTACAATAGATGAAATTATACTTTTAATTCTTCGAACTTGGAACTAATGTTTATTTAGTCTGATTGGTTTTGTTCAATTTGGAACCAATTTATGATCGATTTTATCAGAGACGAAACTGATTCAACAAAATTCGAACTCTGAGAACATAGAAGGGAAGAGAGATTTGCTAGTGGATCGAGGCAATTGAACCCGCAAAATAATCTATATTTGTGAGATGAAGCTATCCTTGAAGGCACAACCAGTGAGAGTGAAGAACAACTGACAAATCTGCAACCCAGAACCGAGCAACGATCAAGGACTCTAACGGTGATTACTCTTAGGGATGAGTTCCAAATTCAAAATCGATCGAGGAAAAAAGTTCGAGACAAAGGTATGAACAAAATGTGAAATGGCACATGAGTGGTTGGAATGATTAATAAATCAAATGCAAATTATATAAAGTGATACGACATTATGTGGTAGGAGCTTCATACATAATATTATTGGGAATTTGGAGCTTTTTGTACATACGTATTGTAGTTTAACAACACATACATTATTCTATTGTGTTTATATTTGGTGGATGACCATGGCCATACGGTCAAAGCCAACCAAGCACTCGGTATGGCTATAACTTTAAAATATTCAACAAAGAGATCTCTCTCGACCAA

Coding sequence (CDS)

ATGGCGGCTGTTAATTCCGTGTCATTCTCAACGGTAAGTCAGTTTTCTGATCGAAGGTCGCCGGTTCCGTCGGCTCGTTCACTCGCCTCGAATTTCGACGGGTTTCGTTTCCGTTCTAGTGTTTTCTATCATCATTCGGGAGTTCGAACCTCGAGTTTCAGTTCTCGCTTGGTCATTCATTGCATGTCCGCCGGAACAGATGTGACGACTGTAGCTGAGACTAAATTGAACTTTCTAAAGGCGTATAAACGACCTATCCCTAGCATATACAACTCTGTTCTGCAAGAGTTGATTGTGCAGCAGCATTTGATGAGGTATAAGAGGACGTACCGTTATGATCCTGTTTTCGCCCTCGGTTTTGTTACTGTATATGATCAGCTTATGGATGGGTACCCTAGCGATGAGGATCGGGAGGCCATTTTCCAAGCCTACATTAAGGCGTTGAATGAGGATCCAGAGCAATATAGAATTGATGCTCAAAAATTGGAAGAGTGGGCTCGGTCTCAGTCTGCAGCTTCATTGGTTGAATTTGCATCAAAAGAAGGAGAAGTTGAGAGTGTTTTGAAAGACATTGCAGAACGAGCTGCGAGTAAGGGGAGTTTCAGTTACAGCCGATTTTTTGCGATTGGGCTATTTCGACTCCTCGAATTGGCAAATGCTACTGAACCCAGTATCTTGGAAAAGCTCTGTGCCGCTTTAAATGTTGACAAAAAAAGTGTGGACCGAGACCTTGATGTATACCGCAACCTGCTTTCGAAGTTGGTTCAGGCGAAGGAGCTCCTAAAGGAATACGTTGATAGAGAGAAGAAGAAAAGAGATGAGAGGGCTGGATCACAGACAGCTAACGAGGCCGTAACAAAATGA

Protein sequence

MAAVNSVSFSTVSQFSDRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVIHCMSAGTDVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALGFVTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFASKEGEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKSVDRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK
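The protein sequence above is the conceptual translation of the coding sequence. As a rough consistency check, the sketch below translates a CDS with the standard genetic code and stops at the first stop codon. It assumes the standard nuclear code applies to this gene; `translate` and `CODON_TABLE` are illustrative names, and only the first and last few codons of the CDS are used as input here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon.

    Codons containing ambiguous bases (for example N) become 'X'.
    A trailing partial codon is ignored.
    """
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First 11 codons and last 6 codons of the CDS listed above.
print(translate("ATGGCGGCTGTTAATTCCGTGTCATTCTCAACG"))  # MAAVNSVSFST
print(translate("GAGGCCGTAACAAAATGA"))                 # EAVTK
```

Both fragments agree with the start (MAAVNSVSFST) and end (EAVTK) of the listed protein sequence.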
BLAST of Cp4.1LG04g03250 vs. Swiss-Prot
Match: THF1_SOLTU (Protein THYLAKOID FORMATION1, chloroplastic OS=Solanum tuberosum GN=THF1 PE=2 SV=1)

HSP 1 Score: 386.3 bits (991), Expect = 2.8e-106
Identity = 204/288 (70.83%), Positives = 246/288 (85.42%), Query Frame = 1

Query: 1   MAAVNSVSFSTVSQFSDRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVIH 60
           MAAV SVSFS ++Q ++R+S V S+RS+    D FRFRS+  +    VR+S+ +SR V+H
Sbjct: 1   MAAVTSVSFSAITQSAERKSSVSSSRSI----DTFRFRSNFSFDSVNVRSSNSTSRFVVH 60

Query: 61  CMSAGT-DVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALG 120
           C S+   D+ TVA+TKL FL AYKRPIP++YN+VLQELIVQQHL RYK++Y+YDPVFALG
Sbjct: 61  CTSSSAADLPTVADTKLKFLTAYKRPIPTVYNTVLQELIVQQHLTRYKKSYQYDPVFALG 120

Query: 121 FVTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFAS 180
           FVTVYDQLM+GYPS+EDR AIF+AYI+AL EDPEQYR DAQKLEEWAR+Q+A +LV+F+S
Sbjct: 121 FVTVYDQLMEGYPSEEDRNAIFKAYIEALKEDPEQYRADAQKLEEWARTQNANTLVDFSS 180

Query: 181 KEGEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKS 240
           KEGE+E++ KDIA+RA +K  F YSR FA+GLFRLLELAN T+P+ILEKLCAALNV+KKS
Sbjct: 181 KEGEIENIFKDIAQRAGTKDGFCYSRLFAVGLFRLLELANVTDPTILEKLCAALNVNKKS 240

Query: 241 VDRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK 288
           VDRDLDVYRNLLSKLVQAKELLKEYV+REKKKR ER  +Q ANE VTK
Sbjct: 241 VDRDLDVYRNLLSKLVQAKELLKEYVEREKKKRGERE-TQKANETVTK 283

BLAST of Cp4.1LG04g03250 vs. Swiss-Prot
Match: THF1_ARATH (Protein THYLAKOID FORMATION 1, chloroplastic OS=Arabidopsis thaliana GN=THF1 PE=1 SV=1)

HSP 1 Score: 370.9 bits (951), Expect = 1.2e-101
Identity = 201/286 (70.28%), Positives = 243/286 (84.97%), Query Frame = 1

Query: 3   AVNSVSFSTVSQFSDRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVIHCM 62
           A++S+SF  + Q SD+ S   S+R LAS       R    +    + + S +S+ +IHCM
Sbjct: 5   AISSLSFPALGQ-SDKISNFASSRPLAS-----AIRICTKFSRLSLNSRS-TSKSLIHCM 64

Query: 63  SAGT-DVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALGFV 122
           S  T DV  V+ETK  FLKAYKRPIPSIYN+VLQELIVQQHLMRYK+TYRYDPVFALGFV
Sbjct: 65  SNVTADVPPVSETKSKFLKAYKRPIPSIYNTVLQELIVQQHLMRYKKTYRYDPVFALGFV 124

Query: 123 TVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFASKE 182
           TVYDQLM+GYPSD+DR+AIF+AYI+ALNEDP+QYRIDAQK+EEWARSQ++ASLV+F+SKE
Sbjct: 125 TVYDQLMEGYPSDQDRDAIFKAYIEALNEDPKQYRIDAQKMEEWARSQTSASLVDFSSKE 184

Query: 183 GEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKSVD 242
           G++E+VLKDIA RA SK  FSYSRFFA+GLFRLLELA+AT+P++L+KLCA+LN++KKSVD
Sbjct: 185 GDIEAVLKDIAGRAGSKEGFSYSRFFAVGLFRLLELASATDPTVLDKLCASLNINKKSVD 244

Query: 243 RDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK 288
           RDLDVYRNLLSKLVQAKELLKEYV+REKKK+ ERA SQ ANE ++K
Sbjct: 245 RDLDVYRNLLSKLVQAKELLKEYVEREKKKQGERAQSQKANETISK 283

BLAST of Cp4.1LG04g03250 vs. Swiss-Prot
Match: THF1_ORYSJ (Protein THYLAKOID FORMATION1, chloroplastic OS=Oryza sativa subsp. japonica GN=THF1 PE=2 SV=1)

HSP 1 Score: 357.8 bits (917), Expect = 1.1e-97
Identity = 194/288 (67.36%), Positives = 236/288 (81.94%), Query Frame = 1

Query: 1   MAAVNSVSFSTVSQFSDRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVIH 60
           MAA++S+ F+ + + +D R   PS  + A+         SV             SR V+ 
Sbjct: 1   MAAISSLPFAALRRAADCR---PSTAAAAAGAGAGAVVLSVRPRRG--------SRSVVR 60

Query: 61  CMSAGTDVT-TVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALG 120
           C++   DV  TVAETK+NFLK+YKRPI SIY++VLQEL+VQQHLMRYK TY+YD VFALG
Sbjct: 61  CVATAGDVPPTVAETKMNFLKSYKRPILSIYSTVLQELLVQQHLMRYKTTYQYDAVFALG 120

Query: 121 FVTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFAS 180
           FVTVYDQLM+GYPS+EDR+AIF+AYI ALNEDPEQYR DAQK+EEWARSQ+  SLVEF+S
Sbjct: 121 FVTVYDQLMEGYPSNEDRDAIFKAYITALNEDPEQYRADAQKMEEWARSQNGNSLVEFSS 180

Query: 181 KEGEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKS 240
           K+GE+E++LKDI+ERA  KGSFSYSRFFA+GLFRLLELANATEP+IL+KLCAALN++K+S
Sbjct: 181 KDGEIEAILKDISERAQGKGSFSYSRFFAVGLFRLLELANATEPTILDKLCAALNINKRS 240

Query: 241 VDRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK 288
           VDRDLDVYRN+LSKLVQAKELLKEYV+REKKKR+ER+ +  +NEAVTK
Sbjct: 241 VDRDLDVYRNILSKLVQAKELLKEYVEREKKKREERSETPKSNEAVTK 277

BLAST of Cp4.1LG04g03250 vs. Swiss-Prot
Match: THF1_ACAM1 (Protein Thf1 OS=Acaryochloris marina (strain MBIC 11017) GN=thf1 PE=3 SV=1)

HSP 1 Score: 152.1 bits (383), Expect = 9.0e-36
Identity = 84/219 (38.36%), Positives = 136/219 (62.10%), Query Frame = 1

Query: 67  DVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALGFVTVYDQ 126
           ++ TV++TK  F   + RP+ S+Y  V++EL+V+ HL+R    +RYDP+FALG  T +D+
Sbjct: 3   NLRTVSDTKRAFYSIHTRPVNSVYRRVVEELMVEMHLLRVNEDFRYDPIFALGVTTSFDR 62

Query: 127 LMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEF---ASKEG- 186
            MDGY  + D++AIF A  KA   DP Q + D Q+L E A+S+SA  ++++   A+  G 
Sbjct: 63  FMDGYQPENDKDAIFSAICKAQEADPVQMKKDGQRLTELAQSKSAQEMLDWITQAANSGG 122

Query: 187 -EVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELA--NATE-----PSILEKLCAALN 246
            E++  L++IA+       F YSR FAIGLF LLEL+  N T+        L  +C  LN
Sbjct: 123 DELQWQLRNIAQNP----KFKYSRLFAIGLFTLLELSEGNITQDEESLAEFLPNICTVLN 182

Query: 247 VDKKSVDRDLDVYRNLLSKLVQAKELLKEYVDREKKKRD 274
           + +  + +DL++YR  L K+ Q ++ + + ++ +KK+R+
Sbjct: 183 ISESKLQKDLEIYRGNLDKIAQVRQAMDDILEAQKKRRE 217

BLAST of Cp4.1LG04g03250 vs. Swiss-Prot
Match: THF1_TRIEI (Protein Thf1 OS=Trichodesmium erythraeum (strain IMS101) GN=thf1 PE=3 SV=1)

HSP 1 Score: 141.4 bits (355), Expect = 1.6e-32
Identity = 80/216 (37.04%), Positives = 129/216 (59.72%), Query Frame = 1

Query: 70  TVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALGFVTVYDQLMD 129
           TV++TK  F   + RPI SIYN V++EL+V+ HL+     Y Y+P +ALG VT +D+ M 
Sbjct: 6   TVSDTKKTFYHFHTRPINSIYNRVIEELLVEMHLISVNVDYSYNPFYALGVVTAFDRFMQ 65

Query: 130 GYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEF--ASKEGEVESV 189
           GY   ED+ +IF A I+   EDP +YR DA+ LE+ A   SA+ ++ +   SK  +    
Sbjct: 66  GYSPQEDKTSIFNALIQGQEEDPNKYRSDAKGLEDLAGKISASDILSWICLSKNIDNTQY 125

Query: 190 LKDIAERAASKGSFSYSRFFAIGLFRLLELANA-------TEPSILEKLCAALNVDKKSV 249
           L+D     +    F YSR FAIGLF LLE+ +             L+K+C +LN+ ++ +
Sbjct: 126 LQDDLRAISENSKFRYSRLFAIGLFTLLEIVDTELIKEQEKRTEALKKICQSLNLVEEKL 185

Query: 250 DRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERA 277
            +D+D+Y + L ++ QA+  +++ +   +KKR++R+
Sbjct: 186 LKDIDLYLSNLERVAQARSAMEDTLAAMRKKREKRS 221

BLAST of Cp4.1LG04g03250 vs. TrEMBL
Match: A0A0A0K3P0_CUCSA (Uncharacterized protein OS=Cucumis sativus GN=Csa_7G046130 PE=3 SV=1)

HSP 1 Score: 492.3 bits (1266), Expect = 4.1e-136
Identity = 256/287 (89.20%), Positives = 275/287 (95.82%), Query Frame = 1

Query: 1   MAAVNSVSFSTVSQFSDRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVIH 60
           MAAVNS+SFST++Q SDRR  +PS+RS +SNF GF FR+SVF H+S VR S+FSSR+VIH
Sbjct: 1   MAAVNSISFSTLNQCSDRRLLLPSSRSHSSNFHGFPFRTSVFTHYSRVRASTFSSRMVIH 60

Query: 61  CMSAGTDVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALGF 120
           CMSAGTDVTTVAETKLNFLKAYKRPIPSIYN+VLQELIVQQHLMRYKRTYRYDPVFALGF
Sbjct: 61  CMSAGTDVTTVAETKLNFLKAYKRPIPSIYNTVLQELIVQQHLMRYKRTYRYDPVFALGF 120

Query: 121 VTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFASK 180
           VTVYDQLM+GYPSDEDREAIFQAYIKALNEDPEQYRIDA+K EEWARSQ+AASLVEFAS+
Sbjct: 121 VTVYDQLMEGYPSDEDREAIFQAYIKALNEDPEQYRIDAKKFEEWARSQTAASLVEFASR 180

Query: 181 EGEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKSV 240
           EGEVES+LKDIAERA SKG+FSYSRFFAIGLFRLLELANATEPSILEKLCAALN+DKK V
Sbjct: 181 EGEVESILKDIAERAGSKGNFSYSRFFAIGLFRLLELANATEPSILEKLCAALNIDKKGV 240

Query: 241 DRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK 288
           DRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEA+TK
Sbjct: 241 DRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAITK 287

BLAST of Cp4.1LG04g03250 vs. TrEMBL
Match: A0A061E4M4_THECC (Photosystem II reaction center PSB29 protein OS=Theobroma cacao GN=TCM_007929 PE=3 SV=1)

HSP 1 Score: 438.7 bits (1127), Expect = 5.4e-120
Identity = 229/288 (79.51%), Positives = 260/288 (90.28%), Query Frame = 1

Query: 1   MAAVNSVSFSTVSQFS-DRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVI 60
           MAAV+S+S S + Q S DR+  VPSAR LASNF+G RFR+SV YH  GVR S+ +S  V+
Sbjct: 1   MAAVSSLSLSAIGQTSGDRKVNVPSARYLASNFEGLRFRTSVLYHSVGVRGSASASPSVV 60

Query: 61  HCMSAGTDVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALG 120
           HCM A TDV TV+ETKLNFLKAYKRPIPS+YN+VLQELIVQQHLMRYK TYRYD VFALG
Sbjct: 61  HCMCAATDVPTVSETKLNFLKAYKRPIPSVYNTVLQELIVQQHLMRYKWTYRYDAVFALG 120

Query: 121 FVTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFAS 180
           FVTVYDQLM+GYPSDEDR+AIFQAYIKAL EDP+QYRIDAQKLEEWARSQ+++SLVEF+S
Sbjct: 121 FVTVYDQLMEGYPSDEDRDAIFQAYIKALKEDPQQYRIDAQKLEEWARSQTSSSLVEFSS 180

Query: 181 KEGEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKS 240
           ++GEVE++LKDIAERA   GSFSYSRFFA+GLFRLLELANATEP++LEKLCAALN++K+S
Sbjct: 181 RDGEVEAILKDIAERAGRMGSFSYSRFFAVGLFRLLELANATEPTVLEKLCAALNINKRS 240

Query: 241 VDRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK 288
           VDRDLDVYRNLLSKLVQAKELLKEYVDREKKKR+ER+ SQ ANEAV K
Sbjct: 241 VDRDLDVYRNLLSKLVQAKELLKEYVDREKKKREERSESQKANEAVKK 288

BLAST of Cp4.1LG04g03250 vs. TrEMBL
Match: M5WTF3_PRUPE (Uncharacterized protein OS=Prunus persica GN=PRUPE_ppa009554mg PE=3 SV=1)

HSP 1 Score: 436.0 bits (1120), Expect = 3.5e-119
Identity = 226/287 (78.75%), Positives = 259/287 (90.24%), Query Frame = 1

Query: 1   MAAVNSVSFSTVSQFSDRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVIH 60
           MAAV S+SFS +SQ SDR+S + S R+LA N +G R R+S   ++ GVR SS SSR++IH
Sbjct: 1   MAAVASLSFSALSQCSDRKSVISSTRNLAYNSEGLRLRTSFSCNNGGVRASSSSSRMMIH 60

Query: 61  CMSAGTDVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALGF 120
           CMS  +   TVA+TKLNFLKAYKRPIPS+YN+VLQELIVQQHL++YK++YRYDPVFALGF
Sbjct: 61  CMSGASYAPTVADTKLNFLKAYKRPIPSVYNTVLQELIVQQHLIKYKKSYRYDPVFALGF 120

Query: 121 VTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFASK 180
           VTV+DQLMDGYPSDEDREAIFQAYI+ALNEDPEQYRIDAQKLEEWAR+Q+++SLVEF S+
Sbjct: 121 VTVFDQLMDGYPSDEDREAIFQAYIEALNEDPEQYRIDAQKLEEWARAQTSSSLVEFPSR 180

Query: 181 EGEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKSV 240
           EGE+E  LKDIAERAASKGSFSYSRFFA+GLFRLLELANATEP+ILEKLCAALN+DK+SV
Sbjct: 181 EGEIEGTLKDIAERAASKGSFSYSRFFAVGLFRLLELANATEPTILEKLCAALNIDKRSV 240

Query: 241 DRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK 288
           DRDLDVYRNLLSKLVQAKELLKEYV REKKKR+ER  +Q ANEAVTK
Sbjct: 241 DRDLDVYRNLLSKLVQAKELLKEYVAREKKKREERVENQKANEAVTK 287

BLAST of Cp4.1LG04g03250 vs. TrEMBL
Match: A0A0D2V4U5_GOSRA (Uncharacterized protein OS=Gossypium raimondii GN=B456_012G132100 PE=3 SV=1)

HSP 1 Score: 429.1 bits (1102), Expect = 4.2e-117
Identity = 223/288 (77.43%), Positives = 257/288 (89.24%), Query Frame = 1

Query: 1   MAAVNSVSFSTVSQFS-DRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVI 60
           MAAV+S+SF  + Q S DR+  VPSAR LASNF+GFRFR+S+ Y   G+R S+ +S  V 
Sbjct: 1   MAAVSSLSFPAIGQTSGDRKLNVPSARYLASNFEGFRFRTSLLYQSVGLRASTTASPSVF 60

Query: 61  HCMSAGTDVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALG 120
           +CMS  TD  TV+ETK +FLKAYKRPIPS+YN+VLQELIVQQHLMRYK+TYRYD VFALG
Sbjct: 61  YCMSTATDTPTVSETKSSFLKAYKRPIPSVYNTVLQELIVQQHLMRYKKTYRYDAVFALG 120

Query: 121 FVTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFAS 180
           FVTVYDQLM+GYPSDEDR+AIFQAYI AL EDP+QYR DAQKLEEWAR+Q+++SLVEF+S
Sbjct: 121 FVTVYDQLMEGYPSDEDRDAIFQAYINALKEDPQQYRADAQKLEEWARAQTSSSLVEFSS 180

Query: 181 KEGEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKS 240
           ++GEVE++LKDIAERA SKGSFSYSRFFAIGLFRLLELANATEP++LEKLCAALN+DK+S
Sbjct: 181 RDGEVEAILKDIAERAGSKGSFSYSRFFAIGLFRLLELANATEPTVLEKLCAALNIDKRS 240

Query: 241 VDRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK 288
           VDRDLDVYRNLLSKLVQAKELLKEYVDREKKKR+ER+ S  ANEAV K
Sbjct: 241 VDRDLDVYRNLLSKLVQAKELLKEYVDREKKKREERSESPKANEAVKK 288

BLAST of Cp4.1LG04g03250 vs. TrEMBL
Match: A0A0B0MJ75_GOSAR (Thylakoid formation 1, chloroplastic-like protein OS=Gossypium arboreum GN=F383_18808 PE=3 SV=1)

HSP 1 Score: 426.8 bits (1096), Expect = 2.1e-116
Identity = 221/288 (76.74%), Positives = 257/288 (89.24%), Query Frame = 1

Query: 1   MAAVNSVSFSTVSQFS-DRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVI 60
           MAAV+S+SF  + Q S DR+  VPS R LASNF+GFRFR+S+ Y   G+R S+ +S  V+
Sbjct: 1   MAAVSSLSFPAIGQTSGDRKLNVPSPRYLASNFEGFRFRTSLLYQSVGLRASTTASPSVV 60

Query: 61  HCMSAGTDVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALG 120
           +CMS  TD  TV+ETK +FLKAYKRPIPS+YN+VLQELIVQQHLMRYK+TYRYD VFALG
Sbjct: 61  YCMSTATDTPTVSETKSSFLKAYKRPIPSVYNTVLQELIVQQHLMRYKKTYRYDAVFALG 120

Query: 121 FVTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFAS 180
           FVTVYDQLM+GYPSDEDR+AIFQAYI AL EDP+QYR DAQKLEEWAR+Q+++SLV+F+S
Sbjct: 121 FVTVYDQLMEGYPSDEDRDAIFQAYINALKEDPQQYRADAQKLEEWARAQTSSSLVKFSS 180

Query: 181 KEGEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKS 240
           ++GEVE++LKDIAERA SKGSFSYSRFFAIGLFRLLELANATEP++LEKLCAALN+DK+S
Sbjct: 181 RDGEVEAILKDIAERAGSKGSFSYSRFFAIGLFRLLELANATEPTVLEKLCAALNIDKRS 240

Query: 241 VDRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK 288
           VDRDLDVYRNLLSKLVQAKELLKEYVDREKKKR+ER+ S  ANEAV K
Sbjct: 241 VDRDLDVYRNLLSKLVQAKELLKEYVDREKKKREERSESPKANEAVKK 288

BLAST of Cp4.1LG04g03250 vs. TAIR10
Match: AT2G20890.1 (AT2G20890.1 photosystem II reaction center PSB29 protein)

HSP 1 Score: 370.9 bits (951), Expect = 6.9e-103
Identity = 201/286 (70.28%), Positives = 243/286 (84.97%), Query Frame = 1

Query: 3   AVNSVSFSTVSQFSDRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVIHCM 62
           A++S+SF  + Q SD+ S   S+R LAS       R    +    + + S +S+ +IHCM
Sbjct: 5   AISSLSFPALGQ-SDKISNFASSRPLAS-----AIRICTKFSRLSLNSRS-TSKSLIHCM 64

Query: 63  SAGT-DVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALGFV 122
           S  T DV  V+ETK  FLKAYKRPIPSIYN+VLQELIVQQHLMRYK+TYRYDPVFALGFV
Sbjct: 65  SNVTADVPPVSETKSKFLKAYKRPIPSIYNTVLQELIVQQHLMRYKKTYRYDPVFALGFV 124

Query: 123 TVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFASKE 182
           TVYDQLM+GYPSD+DR+AIF+AYI+ALNEDP+QYRIDAQK+EEWARSQ++ASLV+F+SKE
Sbjct: 125 TVYDQLMEGYPSDQDRDAIFKAYIEALNEDPKQYRIDAQKMEEWARSQTSASLVDFSSKE 184

Query: 183 GEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKSVD 242
           G++E+VLKDIA RA SK  FSYSRFFA+GLFRLLELA+AT+P++L+KLCA+LN++KKSVD
Sbjct: 185 GDIEAVLKDIAGRAGSKEGFSYSRFFAVGLFRLLELASATDPTVLDKLCASLNINKKSVD 244

Query: 243 RDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK 288
           RDLDVYRNLLSKLVQAKELLKEYV+REKKK+ ERA SQ ANE ++K
Sbjct: 245 RDLDVYRNLLSKLVQAKELLKEYVEREKKKQGERAQSQKANETISK 283

BLAST of Cp4.1LG04g03250 vs. NCBI nr
Match: gi|659110691|ref|XP_008455361.1| (PREDICTED: protein THYLAKOID FORMATION1, chloroplastic [Cucumis melo])

HSP 1 Score: 506.5 bits (1303), Expect = 3.0e-140
Identity = 261/287 (90.94%), Positives = 280/287 (97.56%), Query Frame = 1

Query: 1   MAAVNSVSFSTVSQFSDRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVIH 60
           MAAVNS+SFST++Q SDRR PVPS+RSL+SNFDGFRFR+S+F H+S VR S+FSSR+VIH
Sbjct: 1   MAAVNSISFSTLNQCSDRRFPVPSSRSLSSNFDGFRFRTSLFTHYSRVRPSTFSSRMVIH 60

Query: 61  CMSAGTDVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALGF 120
           CMSAGTDVTTVAETKLNFLKAYKRPIPSIYN+VLQELIVQQHLMRYKRTYRYDPVFALGF
Sbjct: 61  CMSAGTDVTTVAETKLNFLKAYKRPIPSIYNTVLQELIVQQHLMRYKRTYRYDPVFALGF 120

Query: 121 VTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFASK 180
           VTVYDQLM+GYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQ+AASLVEFAS+
Sbjct: 121 VTVYDQLMEGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQTAASLVEFASR 180

Query: 181 EGEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKSV 240
           EGEVES+LKDIAERA SKG+FSYSRFFAIGLFRLLELANATEPSILEKLCAALN+DKK V
Sbjct: 181 EGEVESILKDIAERAGSKGNFSYSRFFAIGLFRLLELANATEPSILEKLCAALNIDKKGV 240

Query: 241 DRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK 288
           DRDLDVYRNLLSKLVQAKELLKEY+DREKKKRDERAGSQTANEA+TK
Sbjct: 241 DRDLDVYRNLLSKLVQAKELLKEYIDREKKKRDERAGSQTANEAITK 287

BLAST of Cp4.1LG04g03250 vs. NCBI nr
Match: gi|449438054|ref|XP_004136805.1| (PREDICTED: protein THYLAKOID FORMATION1, chloroplastic [Cucumis sativus])

HSP 1 Score: 492.3 bits (1266), Expect = 5.9e-136
Identity = 256/287 (89.20%), Positives = 275/287 (95.82%), Query Frame = 1

Query: 1   MAAVNSVSFSTVSQFSDRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVIH 60
           MAAVNS+SFST++Q SDRR  +PS+RS +SNF GF FR+SVF H+S VR S+FSSR+VIH
Sbjct: 1   MAAVNSISFSTLNQCSDRRLLLPSSRSHSSNFHGFPFRTSVFTHYSRVRASTFSSRMVIH 60

Query: 61  CMSAGTDVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALGF 120
           CMSAGTDVTTVAETKLNFLKAYKRPIPSIYN+VLQELIVQQHLMRYKRTYRYDPVFALGF
Sbjct: 61  CMSAGTDVTTVAETKLNFLKAYKRPIPSIYNTVLQELIVQQHLMRYKRTYRYDPVFALGF 120

Query: 121 VTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFASK 180
           VTVYDQLM+GYPSDEDREAIFQAYIKALNEDPEQYRIDA+K EEWARSQ+AASLVEFAS+
Sbjct: 121 VTVYDQLMEGYPSDEDREAIFQAYIKALNEDPEQYRIDAKKFEEWARSQTAASLVEFASR 180

Query: 181 EGEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKSV 240
           EGEVES+LKDIAERA SKG+FSYSRFFAIGLFRLLELANATEPSILEKLCAALN+DKK V
Sbjct: 181 EGEVESILKDIAERAGSKGNFSYSRFFAIGLFRLLELANATEPSILEKLCAALNIDKKGV 240

Query: 241 DRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK 288
           DRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEA+TK
Sbjct: 241 DRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAITK 287

BLAST of Cp4.1LG04g03250 vs. NCBI nr
Match: gi|694324363|ref|XP_009353207.1| (PREDICTED: protein THYLAKOID FORMATION1, chloroplastic-like [Pyrus x bretschneideri])

HSP 1 Score: 443.0 bits (1138), Expect = 4.1e-121
Identity = 228/287 (79.44%), Positives = 260/287 (90.59%), Query Frame = 1

Query: 1   MAAVNSVSFSTVSQFSDRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVIH 60
           MAAV S+SFS +SQ SDR+S V  AR+L SN +G RFR+S+  H+ G+R SS+SSR+V+H
Sbjct: 1   MAAVASLSFSALSQCSDRKSVVSPARNLGSNAEGIRFRTSISSHYGGIRASSWSSRMVVH 60

Query: 61  CMSAGTDVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALGF 120
           CM+  +D  TVA+TKLNFLKAYKRPIPS+YNSVLQELIVQQHLM+YKRTYRYDPVFALGF
Sbjct: 61  CMAGSSDSPTVADTKLNFLKAYKRPIPSVYNSVLQELIVQQHLMKYKRTYRYDPVFALGF 120

Query: 121 VTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFASK 180
           VTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYR DAQKLEEWAR+Q+++SLVEF S+
Sbjct: 121 VTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRTDAQKLEEWARAQTSSSLVEFPSR 180

Query: 181 EGEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKSV 240
           EGEVE+ LKDIAERAA K SFSYSRFFAIGLFRLLE+A ATEP++LEKLCAALN+DK+SV
Sbjct: 181 EGEVEAALKDIAERAAGKESFSYSRFFAIGLFRLLEVAKATEPTVLEKLCAALNIDKRSV 240

Query: 241 DRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK 288
           DRDLDVYRNLLSKLVQAKELL+EYV REKKKR+ERA +Q A+E VTK
Sbjct: 241 DRDLDVYRNLLSKLVQAKELLREYVAREKKKREERAETQKASETVTK 287

BLAST of Cp4.1LG04g03250 vs. NCBI nr
Match: gi|694320241|ref|XP_009350521.1| (PREDICTED: protein THYLAKOID FORMATION1, chloroplastic-like [Pyrus x bretschneideri])

HSP 1 Score: 442.6 bits (1137), Expect = 5.3e-121
Identity = 226/287 (78.75%), Positives = 260/287 (90.59%), Query Frame = 1

Query: 1   MAAVNSVSFSTVSQFSDRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVIH 60
           MAAV ++SFS +SQFSDR+S V S R+LASN DG RFR+S+  H+ G+R SS SSR+V+H
Sbjct: 1   MAAVAALSFSALSQFSDRKSVVSSTRNLASNADGLRFRTSISSHNGGIRASSSSSRMVVH 60

Query: 61  CMSAGTDVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALGF 120
           CM+  +D+ TVA+TKLNFLKAYKRPIPS+YNSVLQELIVQQHLM+YKRTYRYDPVFALGF
Sbjct: 61  CMAGSSDIPTVADTKLNFLKAYKRPIPSVYNSVLQELIVQQHLMKYKRTYRYDPVFALGF 120

Query: 121 VTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFASK 180
           VTVYDQLMDGYPSDEDR AIFQAY+KALNEDPEQYRIDAQKLEEWAR+Q+++SLVEF S+
Sbjct: 121 VTVYDQLMDGYPSDEDRVAIFQAYVKALNEDPEQYRIDAQKLEEWARAQTSSSLVEFPSR 180

Query: 181 EGEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKSV 240
           EG+VE+ LKDIAERAA K SFSYSRFFAIGLFRLLE+A ATEP++LEKLCAALN+DK+SV
Sbjct: 181 EGKVEAALKDIAERAAGKESFSYSRFFAIGLFRLLEVAKATEPTVLEKLCAALNIDKRSV 240

Query: 241 DRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK 288
           DRDLDVYRNLLSKLVQAKELL+EYV REKKKR+ER  +Q A+E V K
Sbjct: 241 DRDLDVYRNLLSKLVQAKELLREYVAREKKKREERVETQKASETVAK 287

BLAST of Cp4.1LG04g03250 vs. NCBI nr
Match: gi|658025445|ref|XP_008348123.1| (PREDICTED: protein THYLAKOID FORMATION1, chloroplastic-like [Malus domestica])

HSP 1 Score: 441.8 bits (1135), Expect = 9.1e-121
Identity = 228/287 (79.44%), Positives = 258/287 (89.90%), Query Frame = 1

Query: 1   MAAVNSVSFSTVSQFSDRRSPVPSARSLASNFDGFRFRSSVFYHHSGVRTSSFSSRLVIH 60
           MAAV S+SFS +SQ SDR+S V SAR+L SN +G RFR+S+  H+ G+R SS SSR+V+H
Sbjct: 1   MAAVASLSFSALSQCSDRKSVVSSARNLGSNAEGIRFRTSISSHYGGIRASSSSSRMVVH 60

Query: 61  CMSAGTDVTTVAETKLNFLKAYKRPIPSIYNSVLQELIVQQHLMRYKRTYRYDPVFALGF 120
           CM+  +D  TVA+TKLNFLKAYKRPIPS+YNSVLQELIVQQHLM+YKRTYRYDPVFALGF
Sbjct: 61  CMAGSSDAPTVADTKLNFLKAYKRPIPSVYNSVLQELIVQQHLMKYKRTYRYDPVFALGF 120

Query: 121 VTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRIDAQKLEEWARSQSAASLVEFASK 180
           VTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYR DAQKLEEWAR+Q+++SLVEF S+
Sbjct: 121 VTVYDQLMDGYPSDEDREAIFQAYIKALNEDPEQYRTDAQKLEEWARAQTSSSLVEFPSR 180

Query: 181 EGEVESVLKDIAERAASKGSFSYSRFFAIGLFRLLELANATEPSILEKLCAALNVDKKSV 240
           EGEVE  LKDIAERAA K SFSYSRFFAIGLFRLLE+A ATEP++LEKLCAALN+DK+SV
Sbjct: 181 EGEVEVALKDIAERAAGKESFSYSRFFAIGLFRLLEVAKATEPTVLEKLCAALNIDKRSV 240

Query: 241 DRDLDVYRNLLSKLVQAKELLKEYVDREKKKRDERAGSQTANEAVTK 288
           DRDLDVYRNLLSKLVQAKELL+EYV REKKKR+ER  +Q A+E VTK
Sbjct: 241 DRDLDVYRNLLSKLVQAKELLREYVAREKKKREERVETQKASETVTK 287

The following BLAST results are available for this feature:
Match Name    E-value     Identity   Description
THF1_SOLTU    2.8e-106    70.83      Protein THYLAKOID FORMATION1, chloroplastic OS=Solanum tuberosum GN=THF1 PE=2 SV...
THF1_ARATH    1.2e-101    70.28      Protein THYLAKOID FORMATION 1, chloroplastic OS=Arabidopsis thaliana GN=THF1 PE=...
THF1_ORYSJ    1.1e-97     67.36      Protein THYLAKOID FORMATION1, chloroplastic OS=Oryza sativa subsp. japonica GN=T...
THF1_ACAM1    9.0e-36     38.36      Protein Thf1 OS=Acaryochloris marina (strain MBIC 11017) GN=thf1 PE=3 SV=1
THF1_TRIEI    1.6e-32     37.04      Protein Thf1 OS=Trichodesmium erythraeum (strain IMS101) GN=thf1 PE=3 SV=1

Match Name          E-value     Identity   Description
A0A0A0K3P0_CUCSA    4.1e-136    89.20      Uncharacterized protein OS=Cucumis sativus GN=Csa_7G046130 PE=3 SV=1
A0A061E4M4_THECC    5.4e-120    79.51      Photosystem II reaction center PSB29 protein OS=Theobroma cacao GN=TCM_007929 PE...
M5WTF3_PRUPE        3.5e-119    78.75      Uncharacterized protein OS=Prunus persica GN=PRUPE_ppa009554mg PE=3 SV=1
A0A0D2V4U5_GOSRA    4.2e-117    77.43      Uncharacterized protein OS=Gossypium raimondii GN=B456_012G132100 PE=3 SV=1
A0A0B0MJ75_GOSAR    2.1e-116    76.74      Thylakoid formation 1, chloroplastic-like protein OS=Gossypium arboreum GN=F383_...

Match Name     E-value     Identity   Description
AT2G20890.1    6.9e-103    70.28      photosystem II reaction center PSB29 protein

Match Name                          E-value     Identity   Description
gi|659110691|ref|XP_008455361.1|    3.0e-140    90.94      PREDICTED: protein THYLAKOID FORMATION1, chloroplastic [Cucumis melo]
gi|449438054|ref|XP_004136805.1|    5.9e-136    89.20      PREDICTED: protein THYLAKOID FORMATION1, chloroplastic [Cucumis sativus]
gi|694324363|ref|XP_009353207.1|    4.1e-121    79.44      PREDICTED: protein THYLAKOID FORMATION1, chloroplastic-like [Pyrus x bretschneid...
gi|694320241|ref|XP_009350521.1|    5.3e-121    78.75      PREDICTED: protein THYLAKOID FORMATION1, chloroplastic-like [Pyrus x bretschneid...
gi|658025445|ref|XP_008348123.1|    9.1e-121    79.44      PREDICTED: protein THYLAKOID FORMATION1, chloroplastic-like [Malus domestica]
The following terms have been associated with this gene:
Vocabulary: Biological Process
Term          Definition
GO:0015979    photosynthesis
GO:0010207    photosystem II assembly
Vocabulary: INTERPRO
Term          Definition
IPR017499     Thf1
GO Assignments
This gene is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0015996 chlorophyll catabolic process
biological_process GO:0045038 protein import into chloroplast thylakoid membrane
biological_process GO:0015979 photosynthesis
biological_process GO:0042793 transcription from plastid promoter
biological_process GO:0010027 thylakoid membrane organization
biological_process GO:0010182 sugar mediated signaling pathway
biological_process GO:0009902 chloroplast relocation
biological_process GO:0006417 regulation of translation
biological_process GO:0035304 regulation of protein dephosphorylation
biological_process GO:0006364 rRNA processing
biological_process GO:0045037 protein import into chloroplast stroma
biological_process GO:0045893 positive regulation of transcription, DNA-templated
biological_process GO:0010207 photosystem II assembly
biological_process GO:0009773 photosynthetic electron transport in photosystem I
biological_process GO:0006655 phosphatidylglycerol biosynthetic process
biological_process GO:0019288 isopentenyl diphosphate biosynthetic process, methylerythritol 4-phosphate pathway
biological_process GO:0007186 G-protein coupled receptor signaling pathway
biological_process GO:0042742 defense response to bacterium
cellular_component GO:0009941 chloroplast envelope
cellular_component GO:0009570 chloroplast stroma
cellular_component GO:0009535 chloroplast thylakoid membrane
cellular_component GO:0009528 plastid inner membrane
cellular_component GO:0009527 plastid outer membrane
cellular_component GO:0010319 stromule
cellular_component GO:0005575 cellular_component
molecular_function GO:0003674 molecular_function
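The GO table above lists one term per row as category, accession, and term name. The sketch below shows one way such rows could be grouped by category for a quick count. `group_go_rows` is a hypothetical helper, and it assumes each row is whitespace-separated with the term name as the final, possibly multi-word, field. Only three rows from the table are used as sample input.

```python
from collections import defaultdict

def group_go_rows(rows):
    """Group 'category accession term name' rows by GO category."""
    grouped = defaultdict(list)
    for row in rows:
        # Split on the first two runs of whitespace only, so that
        # multi-word term names stay intact.
        category, accession, name = row.split(maxsplit=2)
        grouped[category].append((accession, name))
    return dict(grouped)

rows = [
    "biological_process GO:0015979 photosynthesis",
    "biological_process GO:0010207 photosystem II assembly",
    "cellular_component GO:0009535 chloroplast thylakoid membrane",
]
grouped = group_go_rows(rows)
for category, terms in grouped.items():
    print(category, len(terms))
```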

The following mRNA feature(s) are a part of this gene:

Feature Name         Unique Name          Type
Cp4.1LG04g03250.1    Cp4.1LG04g03250.1    mRNA


Analysis Name: InterPro Annotations of Cucurbita pepo
Date Performed: 2017-12-02
IPR Term     IPR Description     Source      Source Term      Source Description                             Alignment
IPR017499    Protein Thf1        HAMAP       MF_01843         Thf1                                           coord: 65..273; score: 2
IPR017499    Protein Thf1        PFAM        PF11264          ThylakoidFormat                                coord: 70..275; score: 1.9
IPR017499    Protein Thf1        TIGRFAMs    TIGR03060        TIGR03060                                      coord: 67..272; score: 7.8
None         No IPR available    unknown     Coil             Coil                                           coord: 247..274; scor
None         No IPR available    PANTHER     PTHR34793        FAMILY NOT NAMED                               coord: 2..287; score: 6.7E
None         No IPR available    PANTHER     PTHR34793:SF1    PROTEIN THYLAKOID FORMATION 1, CHLOROPLASTIC   coord: 2..287; score: 6.7E

The following gene(s) are paralogous to this gene:
Gene               Paralogue          Organism                     Block
Cp4.1LG04g03250    Cp4.1LG15g07800    Cucurbita pepo (Zucchini)    cpecpeB267