Cp4.1LG16g06550.1 (mRNA) Cucurbita pepo (Zucchini)

Name: Cp4.1LG16g06550.1
Type: mRNA
Organism: Cucurbita pepo (Zucchini)
Description: Pentatricopeptide repeat-containing protein, chloroplastic
Location: Cp4.1LG16 : 6984795 .. 6987797 (+)
Sequence length: 2194
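The Location field can be split into scaffold, start, end and strand, and the genomic span follows from the coordinates. The sketch below is illustrative only: the helper names `parse_location` and `span_length` are hypothetical, and the span calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re

def parse_location(text):
    """Split 'scaffold : start .. end (strand)' into its four parts."""
    m = re.fullmatch(r"\s*(\S+)\s*:\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*", text)
    if m is None:
        raise ValueError(f"unrecognised location: {text!r}")
    scaffold, start, end, strand = m.groups()
    return scaffold, int(start), int(end), strand

def span_length(start, end):
    # Assumption: coordinates are 1-based and inclusive.
    return end - start + 1

scaffold, start, end, strand = parse_location("Cp4.1LG16 : 6984795 .. 6987797 (+)")
print(scaffold, start, end, strand, span_length(start, end))
```

Under that assumption the span is 3003 positions, longer than the listed sequence length of 2194, which is consistent with the gene sequence containing an intron.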
The following sequences are available for this feature:

Gene sequence (with intron)

AAACGCTGTCATTGCGACCACTCACTCCGGCTTCCTACCATCACCATCGCTTAATCCTCTGCCAGAAGATTAGGGTTTATTTTCTCCGACCAGCAATTCAGTTCAGTTCAATTCGATTCGCCGGTGGTTGAAGGCTGGCTCTATGTAATTTTCTCTCCATCTCTTCCGCCTTGCGTTCGTTTCGTTTTCTGATGAGTTTGTACTGTTATTGGAATTTGGATTGCGTTCAAAATTGATGCGGTTGAAACCCTAGTGTTCTGGTTTTCTTTTGAGAAGTTATAGGCGGTGGACCGATTGATTGTTTCATTTCTTGCTTTACTTCATTCGGTTTTGAATTTTTCTCCACTCGGTTGAGATACCGTGACGCTTAAGTTCGCTGGTTATTCGAAATTTTAGTTTGAATTTAGTGATTTGGATTAGAAGTACGAAATGATTTGTGCCCAGGGCTTTACTCCGTTAACGCAATTTGGATTTTCGTTTTCTTTATCTTCTGGACTGAAATCTGAGAGGCTTGGGTTTTCTGCTCCCCAATTGTGTAGTCGTTCGCCGGTAAATTTTTGCTTTATGGTTTCTCGTATTACTTGCAATCACCAGAATTCTACTTTCTCTGTTTCGAGAGCTGGTAAGTTTCGGGACCTAAGGTTGTTCAAATCGGTTGAGTTGGACCAGTTCATCACGAGTGATGACGAAGATGAAATGGGAGATGGGTTTTTTGAGGCAATTGAGGAATTGGAACGAATGACGAGGGATCCATCGGACGTTCTTGAAGAAATGAACGACCGCCTTTCGGCGAGGGAATTTCAGCTAGTGCTGGTGTACTTCTCTCAAGAAGGTAGGGATTCGTGGTGTGCTCTTGAGGTTTTTGAGTGGCTCCAAAAGGAAAATCGGGTCGACAAGGAGACCATGGAGCTGATGGTGTCTATAATGTGTAGTTGGATCAAGAAGTTGGTCGAGGGACAACATAACGTCGGAGATGTGGTTGACCTTCTCGTGGATATGGATTGTGTAGGTTTGAAGCCCCATTTTAGCATGATAGAAAAGGTCATCTCTTTGTATTGGGATATGGGTGAGAAGGAAAAAGCAATTTCGTTCGTGAAGGAGGTCTTGGGACGCAAACTGGATTTTATGAAGGACAATTGGGAAGGGCATAAAGGAGGACCGAGCGGATATCTCGCATGGAAGATGATGGTAAGCCTTTAGCTAATCAGGTTTTCATATCTTTAGCCTTTAACCTTGCTTAAGAGTCAAATCAACCATAGTATTCTTTTGCTGTGTCTCTTTATTACATTGAAAATTCATATCTGAAATGTAAAGATCTTCAATAGCTGTTTACTAACTTTGGATGTATAATTTTGAATCCTGAAGTTCCTAACTGTAGAGATGCCTTACTCTGCCTCCTTTATCTTATTGGGTTTTGGTTATGTGCTATAGAGCGTCAGGTTTTTGTTGAGGTGACTGTGGTCAAGGTAGTCAAGGATATTTTCTGGCGTCGTGTTTGTCTAATCACAATTGAAGGCGTTCAGGAAGAACTTTGCAGGTTTTTATTAGAGAGCTAGATGAGTGTTCAGCACACTATGGTTTGGCTGTACTTTTGGGATATATGGTTAAGCTGCGTTTTAATGTGGTTAAGCATGGTTGAATCTATTGTCATGTTAGTGTTGGACATACTGAACCACATTGACTTCCTTTCAATGTTATTATTTGTGGCGTATTAGACTTCCATATCGTAGCGTAACTTATGGTATGTTCTTGCTTATGTTAGTAGCACTCCGACATATTTCATTTATATTTGAACCACATTGAATGCTGTATGAGACCAAGTTAATCCAATCTCGTAACTCTAGCACTGTGCTGCTACTTTTGTTCCCTTCACCAGTAATGGTAGTGGATTAGTTGCCCGAGGAGTATATATAAGTATTTTCGAATTGTTATGTCTGTTCCGCACGACCCCGAGTGAAAATTCAATTGCTGATAATCTTGAAGATCAATTCTTGTAGGTT
GATGGTGACTATAGGGGTGCAGTGAAAATGGTGCTGAATCTCAGAGAATCTGGATTAAAGCCAGAGGTTTACTGCTATCTTATTGCCATGACTGCTGTGGTTAAAGAGCTGAATGAATTTGCAAAAGCTCTTCGCAAACTGAAAAGTTATGCCAGAGACGGGATGGTGGCTGAACTCGATAAAGACAATGTTGAACTTGTCAAGAGGTATCAGTCGGAGCTTCTAGCTGATGGAGTACGGTTATCCAACTGGGTGCTTGACGAGGGAGGCTCTTCGAGTCACAGGGTGGTTCATGAGAGACTCCTTGCAATGTACATTTGTGCTGGGCAAGGTCTAGAGGCAGAGCGGCAGCTTTGGGAAATGAAGCTTGTAGGTAAGGAGGCTGATGCCGATCTCTACGATATCGTGCTAGCCATATGTGCTTCACAGAAGGAGACGAGAGCAATGAACCGGTTGCTTACCAGGATTGAGATTACGAGTCCCCGGCTTAAGAAGAAGAGTTTAACATGGCTACTAAGGGGTTACATAAAAGGAGGTCATTTCCGTGATGCTGCAGAAACATTAGTAAAAATGGTCAATTTGGGTTTTCTCCCAGAGTATTTGGACAGAGTAGCCGTGCTGCAAGGGCTTAGAAAACGAATTCGGGAACCTGAAAACGTCGAGACTTACCTCGATCTCTGCAAGTGTCTCTCTGATGCTAATCTAATTGGACCCAGTCTTGTATATTTGCACTTACAGAAGTACAAGCTTTGGGTCATTAAAATGCTTTGAAGAAGCCTCGATACCTCTCTGCACAGGCAGCTAATAAAGTAGAGCAGAAATCATTTATACAGCACCAGCACTTTTTTGGGTGCTTTTATATGTTGATTTTGTATAGTTTCAGGCAGGTGACTCTAGAAGCTCTTTAAGCCGACCCTGAAGACGAATACTTGTGTATATCTGTATATATATATAATCAGCTACTGTGCACAGAGAACCAATGTTACAGTGTATAGATAGATTG

mRNA sequence

AAACGCTGTCATTGCGACCACTCACTCCGGCTTCCTACCATCACCATCGCTTAATCCTCTGCCAGAAGATTAGGGTTTATTTTCTCCGACCAGCAATTCAGTTCAGTTCAATTCGATTCGCCGGTGGTTGAAGGCTGGCTCTATGTAATTTTCTCTCCATCTCTTCCGCCTTGCGTTCGTTTCGTTTTCTGATGAGTTTGTACTGTTATTGGAATTTGGATTGCGTTCAAAATTGATGCGGTTGAAACCCTAGTGTTCTGGTTTTCTTTTGAGAAGTTATAGGCGGTGGACCGATTGATTGTTTCATTTCTTGCTTTACTTCATTCGGTTTTGAATTTTTCTCCACTCGGTTGAGATACCGTGACGCTTAAGTTCGCTGGTTATTCGAAATTTTAGTTTGAATTTAGTGATTTGGATTAGAAGTACGAAATGATTTGTGCCCAGGGCTTTACTCCGTTAACGCAATTTGGATTTTCGTTTTCTTTATCTTCTGGACTGAAATCTGAGAGGCTTGGGTTTTCTGCTCCCCAATTGTGTAGTCGTTCGCCGGTAAATTTTTGCTTTATGGTTTCTCGTATTACTTGCAATCACCAGAATTCTACTTTCTCTGTTTCGAGAGCTGGTAAGTTTCGGGACCTAAGGTTGTTCAAATCGGTTGAGTTGGACCAGTTCATCACGAGTGATGACGAAGATGAAATGGGAGATGGGTTTTTTGAGGCAATTGAGGAATTGGAACGAATGACGAGGGATCCATCGGACGTTCTTGAAGAAATGAACGACCGCCTTTCGGCGAGGGAATTTCAGCTAGTGCTGGTGTACTTCTCTCAAGAAGGTAGGGATTCGTGGTGTGCTCTTGAGGTTTTTGAGTGGCTCCAAAAGGAAAATCGGGTCGACAAGGAGACCATGGAGCTGATGGTGTCTATAATGTGTAGTTGGATCAAGAAGTTGGTCGAGGGACAACATAACGTCGGAGATGTGGTTGACCTTCTCGTGGATATGGATTGTGTAGGTTTGAAGCCCCATTTTAGCATGATAGAAAAGGTCATCTCTTTGTATTGGGATATGGGTGAGAAGGAAAAAGCAATTTCGTTCGTGAAGGAGGTCTTGGGACGCAAACTGGATTTTATGAAGGACAATTGGGAAGGGCATAAAGGAGGACCGAGCGGATATCTCGCATGGAAGATGATGGTTGATGGTGACTATAGGGGTGCAGTGAAAATGGTGCTGAATCTCAGAGAATCTGGATTAAAGCCAGAGGTTTACTGCTATCTTATTGCCATGACTGCTGTGGTTAAAGAGCTGAATGAATTTGCAAAAGCTCTTCGCAAACTGAAAAGTTATGCCAGAGACGGGATGGTGGCTGAACTCGATAAAGACAATGTTGAACTTGTCAAGAGGTATCAGTCGGAGCTTCTAGCTGATGGAGTACGGTTATCCAACTGGGTGCTTGACGAGGGAGGCTCTTCGAGTCACAGGGTGGTTCATGAGAGACTCCTTGCAATGTACATTTGTGCTGGGCAAGGTCTAGAGGCAGAGCGGCAGCTTTGGGAAATGAAGCTTGTAGGTAAGGAGGCTGATGCCGATCTCTACGATATCGTGCTAGCCATATGTGCTTCACAGAAGGAGACGAGAGCAATGAACCGGTTGCTTACCAGGATTGAGATTACGAGTCCCCGGCTTAAGAAGAAGAGTTTAACATGGCTACTAAGGGGTTACATAAAAGGAGGTCATTTCCGTGATGCTGCAGAAACATTAGTAAAAATGGTCAATTTGGGTTTTCTCCCAGAGTATTTGGACAGAGTAGCCGTGCTGCAAGGGCTTAGAAAACGAATTCGGGAACCTGAAAACGTCGAGACTTACCTCGATCTCTGCAAGTGTCTCTCTGATGCTAATCTAATTGGACCCAGTCTTGTATATTTGCACTTACAGAAGTACAAGCTTTGGGTCATTAAAATGCTTTGAAGAAGCCTCGATACCTCTCTGCACAGGCAGCTAATAAA
GTAGAGCAGAAATCATTTATACAGCACCAGCACTTTTTTGGGTGCTTTTATATGTTGATTTTGTATAGTTTCAGGCAGGTGACTCTAGAAGCTCTTTAAGCCGACCCTGAAGACGAATACTTGTGTATATCTGTATATATATATAATCAGCTACTGTGCACAGAGAACCAATGTTACAGTGTATAGATAGATTG

Coding sequence (CDS)

ATGATTTGTGCCCAGGGCTTTACTCCGTTAACGCAATTTGGATTTTCGTTTTCTTTATCTTCTGGACTGAAATCTGAGAGGCTTGGGTTTTCTGCTCCCCAATTGTGTAGTCGTTCGCCGGTAAATTTTTGCTTTATGGTTTCTCGTATTACTTGCAATCACCAGAATTCTACTTTCTCTGTTTCGAGAGCTGGTAAGTTTCGGGACCTAAGGTTGTTCAAATCGGTTGAGTTGGACCAGTTCATCACGAGTGATGACGAAGATGAAATGGGAGATGGGTTTTTTGAGGCAATTGAGGAATTGGAACGAATGACGAGGGATCCATCGGACGTTCTTGAAGAAATGAACGACCGCCTTTCGGCGAGGGAATTTCAGCTAGTGCTGGTGTACTTCTCTCAAGAAGGTAGGGATTCGTGGTGTGCTCTTGAGGTTTTTGAGTGGCTCCAAAAGGAAAATCGGGTCGACAAGGAGACCATGGAGCTGATGGTGTCTATAATGTGTAGTTGGATCAAGAAGTTGGTCGAGGGACAACATAACGTCGGAGATGTGGTTGACCTTCTCGTGGATATGGATTGTGTAGGTTTGAAGCCCCATTTTAGCATGATAGAAAAGGTCATCTCTTTGTATTGGGATATGGGTGAGAAGGAAAAAGCAATTTCGTTCGTGAAGGAGGTCTTGGGACGCAAACTGGATTTTATGAAGGACAATTGGGAAGGGCATAAAGGAGGACCGAGCGGATATCTCGCATGGAAGATGATGGTTGATGGTGACTATAGGGGTGCAGTGAAAATGGTGCTGAATCTCAGAGAATCTGGATTAAAGCCAGAGGTTTACTGCTATCTTATTGCCATGACTGCTGTGGTTAAAGAGCTGAATGAATTTGCAAAAGCTCTTCGCAAACTGAAAAGTTATGCCAGAGACGGGATGGTGGCTGAACTCGATAAAGACAATGTTGAACTTGTCAAGAGGTATCAGTCGGAGCTTCTAGCTGATGGAGTACGGTTATCCAACTGGGTGCTTGACGAGGGAGGCTCTTCGAGTCACAGGGTGGTTCATGAGAGACTCCTTGCAATGTACATTTGTGCTGGGCAAGGTCTAGAGGCAGAGCGGCAGCTTTGGGAAATGAAGCTTGTAGGTAAGGAGGCTGATGCCGATCTCTACGATATCGTGCTAGCCATATGTGCTTCACAGAAGGAGACGAGAGCAATGAACCGGTTGCTTACCAGGATTGAGATTACGAGTCCCCGGCTTAAGAAGAAGAGTTTAACATGGCTACTAAGGGGTTACATAAAAGGAGGTCATTTCCGTGATGCTGCAGAAACATTAGTAAAAATGGTCAATTTGGGTTTTCTCCCAGAGTATTTGGACAGAGTAGCCGTGCTGCAAGGGCTTAGAAAACGAATTCGGGAACCTGAAAACGTCGAGACTTACCTCGATCTCTGCAAGTGTCTCTCTGATGCTAATCTAATTGGACCCAGTCTTGTATATTTGCACTTACAGAAGTACAAGCTTTGGGTCATTAAAATGCTTTGA

Protein sequence

MICAQGFTPLTQFGFSFSLSSGLKSERLGFSAPQLCSRSPVNFCFMVSRITCNHQNSTFSVSRAGKFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTRDPSDVLEEMNDRLSAREFQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWIKKLVEGQHNVGDVVDLLVDMDCVGLKPHFSMIEKVISLYWDMGEKEKAISFVKEVLGRKLDFMKDNWEGHKGGPSGYLAWKMMVDGDYRGAVKMVLNLRESGLKPEVYCYLIAMTAVVKELNEFAKALRKLKSYARDGMVAELDKDNVELVKRYQSELLADGVRLSNWVLDEGGSSSHRVVHERLLAMYICAGQGLEAERQLWEMKLVGKEADADLYDIVLAICASQKETRAMNRLLTRIEITSPRLKKKSLTWLLRGYIKGGHFRDAAETLVKMVNLGFLPEYLDRVAVLQGLRKRIREPENVETYLDLCKCLSDANLIGPSLVYLHLQKYKLWVIKML
BLAST of Cp4.1LG16g06550.1 vs. Swiss-Prot
Match: PP176_ARATH (Pentatricopeptide repeat-containing protein At2g30100, chloroplastic OS=Arabidopsis thaliana GN=At2g30100 PE=2 SV=2)

HSP 1 Score: 600.5 bits (1547), Expect = 1.7e-170
Identity = 299/474 (63.08%), Positives = 375/474 (79.11%), Query Frame = 1

Query: 48  SRITCNHQNSTFSVSRAGKFRDLRLFKSVELDQFITSDDED----EMGDGFFEAIEELER 107
           SRI CN + +      AGKFR++ L +SVELDQFITS++E+    E+G+GFFEAIEELER
Sbjct: 34  SRIICNLKLNY----SAGKFREMGLSRSVELDQFITSEEEEGEAEEIGEGFFEAIEELER 93

Query: 108 MTRDPSDVLEEMNDRLSAREFQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMV 167
           MTR+PSD+LEEMN RLS+RE QL+LVYF+QEGRDSWC LEVFEWL+KENRVD+E MELMV
Sbjct: 94  MTREPSDILEEMNHRLSSRELQLMLVYFAQEGRDSWCTLEVFEWLKKENRVDEEIMELMV 153

Query: 168 SIMCSWIKKLVEGQHNVGDVVDLLVDMDCVGLKPHFSMIEKVISLYWDMGEKEKAISFVK 227
           SIMC W+KKL+E + N   V DLL++MDCVGLKP FSM++KVI+LY +MG+KE A+ FVK
Sbjct: 154 SIMCGWVKKLIEDECNAHQVFDLLIEMDCVGLKPGFSMMDKVIALYCEMGKKESAVLFVK 213

Query: 228 EVLGRKLDFMKD-----NWEGHKGGPSGYLAWKMMVDGDYRGAVKMVLNLRESGLKPEVY 287
           EVL R+  F          EG KGGP GYLAWK MVDGDYR AV MV+ LR SGLKPE Y
Sbjct: 214 EVLRRRDGFGYSVVGGGGSEGRKGGPVGYLAWKFMVDGDYRKAVDMVMELRLSGLKPEAY 273

Query: 288 CYLIAMTAVVKELNEFAKALRKLKSYARDGMVAELDKDNVELVKRYQSELLADGVRLSNW 347
            YLIAMTA+VKELN   K LR+LK +AR G VAE+D  +  L+++YQSE L+ G++L+ W
Sbjct: 274 SYLIAMTAIVKELNSLGKTLRELKRFARAGFVAEIDDHDRVLIEKYQSETLSRGLQLATW 333

Query: 348 VLDEG--GSSSHRVVHERLLAMYICAGQGLEAERQLWEMKLVGKEADADLYDIVLAICAS 407
            ++EG    S   VVHERLLAMYICAG+G EAE+QLW+MKL G+E +ADL+DIV+AICAS
Sbjct: 334 AVEEGQENDSIIGVVHERLLAMYICAGRGPEAEKQLWKMKLAGREPEADLHDIVMAICAS 393

Query: 408 QKETRAMNRLLTRIEITSPRLKKKSLTWLLRGYIKGGHFRDAAETLVKMVNLGFLPEYLD 467
           QKE  A++RLLTR+E    + KKK+L+WLLRGY+KGGHF +AAETLV M++ G  PEY+D
Sbjct: 394 QKEVNAVSRLLTRVEFMGSQRKKKTLSWLLRGYVKGGHFEEAAETLVSMIDSGLHPEYID 453

Query: 468 RVAVLQGLRKRIREPENVETYLDLCKCLSDANLIGPSLVYLHLQKYKLWVIKML 511
           RVAV+QG+ ++I+ P +VE Y+ LCK L DA L+GP LVY+++ KYKLW++KM+
Sbjct: 454 RVAVMQGMTRKIQRPRDVEAYMSLCKRLFDAGLVGPCLVYMYIDKYKLWIVKMM 503
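
The percentages in the HSP summary are the match counts divided by the alignment length, and the counts themselves can be read off the middle line of each alignment block, where a letter marks an identical residue and `+` marks a conservative substitution. The sketch below is illustrative; `percent` and `count_midline` are hypothetical helper names, not part of any BLAST tool.

```python
def percent(matches, aligned_length):
    """Percentage of aligned columns that match, to two decimals."""
    return round(100.0 * matches / aligned_length, 2)

def count_midline(midline):
    """Return (identities, positives) for one alignment midline.

    Letters are identical residues; '+' marks a conservative substitution.
    Positives include the identities, as in the HSP summary line.
    """
    identities = sum(ch.isalpha() for ch in midline)
    similar = midline.count("+")
    return identities, identities + similar

# Figures from the Swiss-Prot hit above: 299 and 375 out of 474 columns.
print(percent(299, 474), percent(375, 474))
```

Summing `count_midline` over every block of an HSP should reproduce the totals in its summary line, provided the midline whitespace has survived copying intact.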

BLAST of Cp4.1LG16g06550.1 vs. TrEMBL
Match: A0A0A0KC35_CUCSA (Uncharacterized protein OS=Cucumis sativus GN=Csa_6G182120 PE=4 SV=1)

HSP 1 Score: 907.1 bits (2343), Expect = 9.4e-261
Identity = 453/510 (88.82%), Positives = 479/510 (93.92%), Query Frame = 1

Query: 1   MICAQGFTPLTQFGFSFSLSSGLKSERLGFSAPQLCSRSPVNFCFMVSRITCNHQNSTFS 60
           MICAQGFTPLTQFGFSFSLSS L+S+R GFS P+L         +MVS I+CN+Q+STFS
Sbjct: 1   MICAQGFTPLTQFGFSFSLSSPLESQRCGFSTPRL---------YMVSPISCNYQDSTFS 60

Query: 61  VSRAGKFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTRDPSDVLEEMNDRLS 120
           VSRA KFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTR+PSDVLEEMNDRLS
Sbjct: 61  VSRAAKFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTREPSDVLEEMNDRLS 120

Query: 121 AREFQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWIKKLVEGQHNV 180
           ARE QLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWIKKLVEG+HNV
Sbjct: 121 AREIQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWIKKLVEGRHNV 180

Query: 181 GDVVDLLVDMDCVGLKPHFSMIEKVISLYWDMGEKEKAISFVKEVLGRKLDFMKDNWEGH 240
           GDVVDLLVDMDCVGLKPHFSMIEKVISLYW+MGEKEKA+ FVKEVLGR L FMKD+WEGH
Sbjct: 181 GDVVDLLVDMDCVGLKPHFSMIEKVISLYWEMGEKEKAVFFVKEVLGRNLAFMKDDWEGH 240

Query: 241 KGGPSGYLAWKMMVDGDYRGAVKMVLNLRESGLKPEVYCYLIAMTAVVKELNEFAKALRK 300
           KGGPSGYLAWKMMVDGDYRGAVKMVL+LRESGL+PEVY YLIAMTAVVKELNEFAKALRK
Sbjct: 241 KGGPSGYLAWKMMVDGDYRGAVKMVLHLRESGLRPEVYSYLIAMTAVVKELNEFAKALRK 300

Query: 301 LKSYARDGMVAELDKDNVELVKRYQSELLADGVRLSNWVLDEGGSSSHRVVHERLLAMYI 360
           LK YARDG VAELDK+NVELV +YQ+ELLADGV+LSNWVL+EG SS   VVHERLLAMYI
Sbjct: 301 LKGYARDGFVAELDKNNVELVAKYQTELLADGVQLSNWVLEEGSSSIRGVVHERLLAMYI 360

Query: 361 CAGQGLEAERQLWEMKLVGKEADADLYDIVLAICASQKETRAMNRLLTRIEITSPRLKKK 420
           CAGQG+EAERQLWEMKLVGKEADADLYDIVLAICASQKET+AM RLLTRIEITSP +KKK
Sbjct: 361 CAGQGVEAERQLWEMKLVGKEADADLYDIVLAICASQKETKAMKRLLTRIEITSPMIKKK 420

Query: 421 SLTWLLRGYIKGGHFRDAAETLVKMVNLGFLPEYLDRVAVLQGLRKRIREPENVETYLDL 480
           SLTWLLRGYIKGGHFRDAA TLVKM+NLGFLPEYLDRVAVLQGLRK IREPE+V TYLDL
Sbjct: 421 SLTWLLRGYIKGGHFRDAAGTLVKMINLGFLPEYLDRVAVLQGLRKEIREPESVHTYLDL 480

Query: 481 CKCLSDANLIGPSLVYLHLQKYKLWVIKML 511
           CKCLSDANLIGPSLVYLHLQK+KLW+IKML
Sbjct: 481 CKCLSDANLIGPSLVYLHLQKHKLWIIKML 501

BLAST of Cp4.1LG16g06550.1 vs. TrEMBL
Match: F6GUM4_VITVI (Putative uncharacterized protein OS=Vitis vinifera GN=VIT_06s0004g07010 PE=4 SV=1)

HSP 1 Score: 738.4 bits (1905), Expect = 5.8e-210
Identity = 367/514 (71.40%), Positives = 433/514 (84.24%), Query Frame = 1

Query: 1   MICAQGFTPL----TQFGFSFSLSSGLKSERLGFSAPQLCSRSPVNFCFMVSRITCNHQN 60
           M  A GF       T+ GF+ S S  ++  RL    P+        +C   + I CNHQN
Sbjct: 1   MASAHGFASSLMSPTELGFTLSSSFSIQRPRL--IVPKFSRSFLGEYCSRATTI-CNHQN 60

Query: 61  STFSVSRAGKFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTRDPSDVLEEMN 120
             F V +  K R+ RLFKSVELDQF+TSDDEDEM +GFFEAIEELERMTR+PSDVLEEMN
Sbjct: 61  PRFVVPKRDKIREFRLFKSVELDQFLTSDDEDEMSEGFFEAIEELERMTREPSDVLEEMN 120

Query: 121 DRLSAREFQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWIKKLVEG 180
           DRLSARE QLVLVYFSQEGRDSWCALEVFEWL+KENRVDKETMELMVSIMCSW+KKL+EG
Sbjct: 121 DRLSARELQLVLVYFSQEGRDSWCALEVFEWLRKENRVDKETMELMVSIMCSWVKKLIEG 180

Query: 181 QHNVGDVVDLLVDMDCVGLKPHFSMIEKVISLYWDMGEKEKAISFVKEVLGRKLDFMKDN 240
           +H+VGDVVDLLVDMDCVGLKP FSMIEKVISLYW+M EKEKA+ FVKEVL R++ + +D+
Sbjct: 181 EHDVGDVVDLLVDMDCVGLKPGFSMIEKVISLYWEMEEKEKAVLFVKEVLRREIAYSEDD 240

Query: 241 WEGHKGGPSGYLAWKMMVDGDYRGAVKMVLNLRESGLKPEVYCYLIAMTAVVKELNEFAK 300
            +GHKGGP+GYLAWKMM +G+YRGAVK+V++LRESGLKPEVY YLIAMTAVVKELNEFAK
Sbjct: 241 GDGHKGGPTGYLAWKMMAEGNYRGAVKLVIHLRESGLKPEVYSYLIAMTAVVKELNEFAK 300

Query: 301 ALRKLKSYARDGMVAELDKDNVELVKRYQSELLADGVRLSNWVLDEGGSSSHRVVHERLL 360
           ALRKLK + + G++AELD +NVEL+++YQS+LLADGVRLS+WV+ EG S  H VV+ERLL
Sbjct: 301 ALRKLKGFTKSGLIAELDAENVELIEKYQSDLLADGVRLSSWVIQEGRSPLHGVVYERLL 360

Query: 361 AMYICAGQGLEAERQLWEMKLVGKEADADLYDIVLAICASQKETRAMNRLLTRIEITSPR 420
           AMYICAG+GLEAERQLWEMKLVGKEAD +LYDIVLAICAS+KE  A++RLLT +E+TS  
Sbjct: 361 AMYICAGRGLEAERQLWEMKLVGKEADRELYDIVLAICASKKEASAISRLLTGMEVTSSI 420

Query: 421 LKKKSLTWLLRGYIKGGHFRDAAETLVKMVNLGFLPEYLDRVAVLQGLRKRIREPENVET 480
            +KK+L+WLLRGYIKG HF DA+ET++KM++LG  PEYLDR AVLQGLR RI++  NVET
Sbjct: 421 RRKKTLSWLLRGYIKGSHFDDASETIIKMLDLGLCPEYLDRAAVLQGLRNRIQQTGNVET 480

Query: 481 YLDLCKCLSDANLIGPSLVYLHLQKYKLWVIKML 511
           YL LCK LSDANLIGP LVYL+++KYKLW++K +
Sbjct: 481 YLKLCKHLSDANLIGPCLVYLYIKKYKLWILKTI 511

BLAST of Cp4.1LG16g06550.1 vs. TrEMBL
Match: B9HNB9_POPTR (Ubiquitin family protein OS=Populus trichocarpa GN=POPTR_0009s07900g PE=4 SV=1)

HSP 1 Score: 732.3 bits (1889), Expect = 4.2e-208
Identity = 359/470 (76.38%), Positives = 416/470 (88.51%), Query Frame = 1

Query: 44  CFMVSRITCNHQNS---TFSVSRAGKFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEE 103
           C MVS I CN+Q      F V++  K R+ RLFKSVELDQ++TSDDE+EMG+GFFEAIEE
Sbjct: 31  CCMVSTIICNYQTPKRPNFVVAKTTKVREFRLFKSVELDQYVTSDDEEEMGEGFFEAIEE 90

Query: 104 LERMTRDPSDVLEEMNDRLSAREFQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETME 163
           LERMTR+PSD+LEEMNDRLSARE QLVLVYFSQEGRDSWCALEVFEWL+KENRVDKETME
Sbjct: 91  LERMTREPSDILEEMNDRLSARELQLVLVYFSQEGRDSWCALEVFEWLRKENRVDKETME 150

Query: 164 LMVSIMCSWIKKLVEGQHNVGDVVDLLVDMDCVGLKPHFSMIEKVISLYWDMGEKEKAIS 223
           LMVSIMCSW+KKL+EG+ +VGDVVDLLVDMDCVGLKP FSMIEKVISLYWDMG+KE A+S
Sbjct: 151 LMVSIMCSWVKKLIEGEQDVGDVVDLLVDMDCVGLKPSFSMIEKVISLYWDMGKKEGAVS 210

Query: 224 FVKEVLGRKLDFMKDNWEGHKGGPSGYLAWKMMVDGDYRGAVKMVLNLRESGLKPEVYCY 283
           FVKEVL R + +  D+ EG KGGP+GYL WKMMVDG+YR AVK+V++LRESGLKPE+Y Y
Sbjct: 211 FVKEVLRRGIAYSGDDGEGQKGGPTGYLTWKMMVDGNYRNAVKLVIHLRESGLKPEIYAY 270

Query: 284 LIAMTAVVKELNEFAKALRKLKSYARDGMVAELDKDNVELVKRYQSELLADGVRLSNWVL 343
           LIAMTAVVKELNEF+KALRKLK Y+R GMV ELD +NVELV++YQS+LLADGV LS+WV+
Sbjct: 271 LIAMTAVVKELNEFSKALRKLKGYSRSGMVTELDAENVELVEKYQSDLLADGVCLSSWVI 330

Query: 344 DEGGSSSHRVVHERLLAMYICAGQGLEAERQLWEMKLVGKEADADLYDIVLAICASQKET 403
            EG  + + VVHERLLAMYICAG+GL+AERQLWEMKLVGKEAD DLYDIVLAICASQKE 
Sbjct: 331 QEGSPALYGVVHERLLAMYICAGRGLDAERQLWEMKLVGKEADGDLYDIVLAICASQKEA 390

Query: 404 RAMNRLLTRIEITSPRLKKKSLTWLLRGYIKGGHFRDAAETLVKMVNLGFLPEYLDRVAV 463
            A+ RLLTRIE+ S   KKKSL+WLLRGYIKGGH+ +AAETL+KM++LG  P+YLDRVAV
Sbjct: 391 SAVARLLTRIEVASSMRKKKSLSWLLRGYIKGGHYGEAAETLIKMLDLGLSPDYLDRVAV 450

Query: 464 LQGLRKRIREPENVETYLDLCKCLSDANLIGPSLVYLHLQKYKLWVIKML 511
           +QGLRKRI++  NVE+YL LCK LSD NLIGPSLVYL+++KYKLW++K+L
Sbjct: 451 MQGLRKRIQQWGNVESYLKLCKRLSDVNLIGPSLVYLYIKKYKLWIMKLL 500

BLAST of Cp4.1LG16g06550.1 vs. TrEMBL
Match: W9R4F5_9ROSA (Uncharacterized protein OS=Morus notabilis GN=L484_011688 PE=4 SV=1)

HSP 1 Score: 728.8 bits (1880), Expect = 4.6e-207
Identity = 367/519 (70.71%), Positives = 433/519 (83.43%), Query Frame = 1

Query: 1   MICAQGFTPLTQFGF--SFSLSSGLKSERLGFSAPQLC-------SRSPVNFCFMVSRIT 60
           M  AQGFTPLT+ GF  S S SS   S  L  +   LC        R+   FC +   I 
Sbjct: 1   MASAQGFTPLTELGFPSSSSSSSSSSSNSLHRNRIFLCRMDENLWGRTSAKFCPV---IC 60

Query: 61  CNHQNSTFSVSRAGKFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTRDPSDV 120
           C  QN  F   +  K R+ RLF SVELDQF+TSDDE+EMG+GFFEAIEELERMTR+PSDV
Sbjct: 61  CKQQNPNFIAPKPSKLREFRLFTSVELDQFLTSDDEEEMGEGFFEAIEELERMTREPSDV 120

Query: 121 LEEMNDRLSAREFQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWIK 180
           LEEMNDRLSARE QLVLVYFSQEGRDSWCALEVFEWL+KENRVDKETMELMV++MCSW+K
Sbjct: 121 LEEMNDRLSARELQLVLVYFSQEGRDSWCALEVFEWLRKENRVDKETMELMVTLMCSWVK 180

Query: 181 KLVEGQHNVGDVVDLLVDMDCVGLKPHFSMIEKVISLYWDMGEKEKAISFVKEVLGRKLD 240
           KL+EG+H+VGDVVDLLVDM CVGL+P FSM+E VI LYW+MGEK +A+SFVKEVL R + 
Sbjct: 181 KLIEGEHDVGDVVDLLVDMACVGLRPGFSMMENVILLYWEMGEKGRAVSFVKEVLRRGIA 240

Query: 241 FMKDNWEGHKGGPSGYLAWKMMVDGDYRGAVKMVLNLRESGLKPEVYCYLIAMTAVVKEL 300
            ++D+ EG KGGP+GYLAWKMMV+G+Y  AVK+V+++RESGLKPEVY YLIAMTAVVKEL
Sbjct: 241 CLEDDGEGPKGGPTGYLAWKMMVEGNYMEAVKLVVDIRESGLKPEVYSYLIAMTAVVKEL 300

Query: 301 NEFAKALRKLKSYARDGMVAELDKDNVELVKRYQSELLADGVRLSNWVLDEGGSSSHRVV 360
           NEFAKALRKLK + R G+ AELD+++VEL+++YQS+LL DGVRLSNWV++EG +S + VV
Sbjct: 301 NEFAKALRKLKGFERAGLTAELDEESVELIEKYQSDLLDDGVRLSNWVIEEGITSLNGVV 360

Query: 361 HERLLAMYICAGQGLEAERQLWEMKLVGKEADADLYDIVLAICASQKETRAMNRLLTRIE 420
           HERLLAMYICAG+G+EAERQLW+MKLVGKEAD DLYDIVLAICASQKE RA+ RLLTR+ 
Sbjct: 361 HERLLAMYICAGRGIEAERQLWKMKLVGKEADGDLYDIVLAICASQKEGRAIARLLTRVN 420

Query: 421 ITSPRLKKKSLTWLLRGYIKGGHFRDAAETLVKMVNLGFLPEYLDRVAVLQGLRKRIREP 480
            +S   K+KSL+WLLRGYIKGGHF +AAET+VKM++LG  PEYLDR AVLQGLRKRI+ P
Sbjct: 421 FSSTLRKRKSLSWLLRGYIKGGHFDNAAETVVKMLDLGLCPEYLDRAAVLQGLRKRIKGP 480

Query: 481 ENVETYLDLCKCLSDANLIGPSLVYLHLQKYKLWVIKML 511
           + VETYL LCK LSD NLIGP L+YL+++KYKLW++KML
Sbjct: 481 DTVETYLKLCKHLSDYNLIGPCLIYLYIKKYKLWIMKML 516

BLAST of Cp4.1LG16g06550.1 vs. TrEMBL
Match: M5VXS0_PRUPE (Uncharacterized protein OS=Prunus persica GN=PRUPE_ppa004609mg PE=4 SV=1)

HSP 1 Score: 727.6 bits (1877), Expect = 1.0e-206
Identity = 371/513 (72.32%), Positives = 429/513 (83.63%), Query Frame = 1

Query: 1   MICAQGFTPLTQFGFSFSLSS--GLKSERLGFSAPQLCSRSPVNFCFMVSRITCNHQNST 60
           M  AQG   LT   F+       GL+    GFSA Q C R       +  RI C HQ   
Sbjct: 1   MASAQGLASLTHSLFAVKRQRFMGLR----GFSA-QSCGR-------VFPRI-CKHQKPN 60

Query: 61  FSVSRAGKFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTRDPSDVLEEMNDR 120
           F V+++ K RD RLFKSVELDQF+TSDDEDEMG+GFFEAIEELERMTR+PSDVLEEMNDR
Sbjct: 61  FIVAKSSKVRDFRLFKSVELDQFLTSDDEDEMGEGFFEAIEELERMTREPSDVLEEMNDR 120

Query: 121 LSAREFQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWIKKLVEGQH 180
           LSARE QLVLVYFSQEGRDSWCALEVFEWL+KENRVDKETM+LMVSIMCSW+KKL++ +H
Sbjct: 121 LSARELQLVLVYFSQEGRDSWCALEVFEWLRKENRVDKETMDLMVSIMCSWVKKLIQREH 180

Query: 181 NVGDVVDLLVDMDCVGLKPHFSMIEKVISLYWDMGEKEKAISFVKEVLGRKLDFMK-DNW 240
           ++GDVVDLLVDMDCVGLKP FSM+EKVISLYW+MGEKEKA+ FVKEVL R + + + D+ 
Sbjct: 181 DIGDVVDLLVDMDCVGLKPSFSMMEKVISLYWEMGEKEKAVLFVKEVLKRGIVYSEEDDT 240

Query: 241 EGHKGGPSGYLAWKMMVDGDYRGAVKMVLNLRESGLKPEVYCYLIAMTAVVKELNEFAKA 300
           +GHKGGP+GYLAWKMMV+G+YR +VK+V++LRESGLKPEVY YLIAMTAVVKELNE AKA
Sbjct: 241 DGHKGGPTGYLAWKMMVEGNYRDSVKLVIHLRESGLKPEVYSYLIAMTAVVKELNELAKA 300

Query: 301 LRKLKSYARDGMVAELDKDNVELVKRYQSELLADGVRLSNWVLDEGGSSSHRVVHERLLA 360
           LRKLK + R G++AE D +NV L+++YQS+LL+DGV+LSNWV+ EG SS H VVHERLLA
Sbjct: 301 LRKLKGFTRAGLIAEFDTENVGLIEKYQSDLLSDGVQLSNWVIQEGSSSLHGVVHERLLA 360

Query: 361 MYICAGQGLEAERQLWEMKLVGKEADADLYDIVLAICASQKETRAMNRLLTRIEITSPRL 420
           MYIC+G GLEAERQLWEMKLVGKEADADLYDIVLAICASQKE  A+ RLLTR E+TS   
Sbjct: 361 MYICSGHGLEAERQLWEMKLVGKEADADLYDIVLAICASQKEASAIGRLLTRTEVTSSLR 420

Query: 421 KKKSLTWLLRGYIKGGHFRDAAETLVKMVNLGFLPEYLDRVAVLQGLRKRIREPENVETY 480
           KKKSL+WLLRGYIKGGHF DAAET++KM++LG  PE+LDR AVLQGLRK I+E   V+TY
Sbjct: 421 KKKSLSWLLRGYIKGGHFDDAAETVIKMLDLGLCPEFLDRAAVLQGLRKSIQESGGVDTY 480

Query: 481 LDLCKCLSDANLIGPSLVYLHLQKYKLWVIKML 511
           L LCK LSDA+LIGP LVYL ++KYKLW+ KML
Sbjct: 481 LKLCKRLSDASLIGPCLVYLFIRKYKLWITKML 500

BLAST of Cp4.1LG16g06550.1 vs. TAIR10
Match: AT2G30100.1 (AT2G30100.1 pentatricopeptide (PPR) repeat-containing protein)

HSP 1 Score: 600.5 bits (1547), Expect = 9.6e-172
Identity = 299/474 (63.08%), Positives = 375/474 (79.11%), Query Frame = 1

Query: 48  SRITCNHQNSTFSVSRAGKFRDLRLFKSVELDQFITSDDED----EMGDGFFEAIEELER 107
           SRI CN + +      AGKFR++ L +SVELDQFITS++E+    E+G+GFFEAIEELER
Sbjct: 34  SRIICNLKLNY----SAGKFREMGLSRSVELDQFITSEEEEGEAEEIGEGFFEAIEELER 93

Query: 108 MTRDPSDVLEEMNDRLSAREFQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMV 167
           MTR+PSD+LEEMN RLS+RE QL+LVYF+QEGRDSWC LEVFEWL+KENRVD+E MELMV
Sbjct: 94  MTREPSDILEEMNHRLSSRELQLMLVYFAQEGRDSWCTLEVFEWLKKENRVDEEIMELMV 153

Query: 168 SIMCSWIKKLVEGQHNVGDVVDLLVDMDCVGLKPHFSMIEKVISLYWDMGEKEKAISFVK 227
           SIMC W+KKL+E + N   V DLL++MDCVGLKP FSM++KVI+LY +MG+KE A+ FVK
Sbjct: 154 SIMCGWVKKLIEDECNAHQVFDLLIEMDCVGLKPGFSMMDKVIALYCEMGKKESAVLFVK 213

Query: 228 EVLGRKLDFMKD-----NWEGHKGGPSGYLAWKMMVDGDYRGAVKMVLNLRESGLKPEVY 287
           EVL R+  F          EG KGGP GYLAWK MVDGDYR AV MV+ LR SGLKPE Y
Sbjct: 214 EVLRRRDGFGYSVVGGGGSEGRKGGPVGYLAWKFMVDGDYRKAVDMVMELRLSGLKPEAY 273

Query: 288 CYLIAMTAVVKELNEFAKALRKLKSYARDGMVAELDKDNVELVKRYQSELLADGVRLSNW 347
            YLIAMTA+VKELN   K LR+LK +AR G VAE+D  +  L+++YQSE L+ G++L+ W
Sbjct: 274 SYLIAMTAIVKELNSLGKTLRELKRFARAGFVAEIDDHDRVLIEKYQSETLSRGLQLATW 333

Query: 348 VLDEG--GSSSHRVVHERLLAMYICAGQGLEAERQLWEMKLVGKEADADLYDIVLAICAS 407
            ++EG    S   VVHERLLAMYICAG+G EAE+QLW+MKL G+E +ADL+DIV+AICAS
Sbjct: 334 AVEEGQENDSIIGVVHERLLAMYICAGRGPEAEKQLWKMKLAGREPEADLHDIVMAICAS 393

Query: 408 QKETRAMNRLLTRIEITSPRLKKKSLTWLLRGYIKGGHFRDAAETLVKMVNLGFLPEYLD 467
           QKE  A++RLLTR+E    + KKK+L+WLLRGY+KGGHF +AAETLV M++ G  PEY+D
Sbjct: 394 QKEVNAVSRLLTRVEFMGSQRKKKTLSWLLRGYVKGGHFEEAAETLVSMIDSGLHPEYID 453

Query: 468 RVAVLQGLRKRIREPENVETYLDLCKCLSDANLIGPSLVYLHLQKYKLWVIKML 511
           RVAV+QG+ ++I+ P +VE Y+ LCK L DA L+GP LVY+++ KYKLW++KM+
Sbjct: 454 RVAVMQGMTRKIQRPRDVEAYMSLCKRLFDAGLVGPCLVYMYIDKYKLWIVKMM 503

BLAST of Cp4.1LG16g06550.1 vs. NCBI nr
Match: gi|778713772|ref|XP_011657120.1| (PREDICTED: pentatricopeptide repeat-containing protein At2g30100, chloroplastic [Cucumis sativus])

HSP 1 Score: 907.1 bits (2343), Expect = 1.4e-260
Identity = 453/510 (88.82%), Positives = 479/510 (93.92%), Query Frame = 1

Query: 1   MICAQGFTPLTQFGFSFSLSSGLKSERLGFSAPQLCSRSPVNFCFMVSRITCNHQNSTFS 60
           MICAQGFTPLTQFGFSFSLSS L+S+R GFS P+L         +MVS I+CN+Q+STFS
Sbjct: 1   MICAQGFTPLTQFGFSFSLSSPLESQRCGFSTPRL---------YMVSPISCNYQDSTFS 60

Query: 61  VSRAGKFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTRDPSDVLEEMNDRLS 120
           VSRA KFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTR+PSDVLEEMNDRLS
Sbjct: 61  VSRAAKFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTREPSDVLEEMNDRLS 120

Query: 121 AREFQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWIKKLVEGQHNV 180
           ARE QLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWIKKLVEG+HNV
Sbjct: 121 AREIQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWIKKLVEGRHNV 180

Query: 181 GDVVDLLVDMDCVGLKPHFSMIEKVISLYWDMGEKEKAISFVKEVLGRKLDFMKDNWEGH 240
           GDVVDLLVDMDCVGLKPHFSMIEKVISLYW+MGEKEKA+ FVKEVLGR L FMKD+WEGH
Sbjct: 181 GDVVDLLVDMDCVGLKPHFSMIEKVISLYWEMGEKEKAVFFVKEVLGRNLAFMKDDWEGH 240

Query: 241 KGGPSGYLAWKMMVDGDYRGAVKMVLNLRESGLKPEVYCYLIAMTAVVKELNEFAKALRK 300
           KGGPSGYLAWKMMVDGDYRGAVKMVL+LRESGL+PEVY YLIAMTAVVKELNEFAKALRK
Sbjct: 241 KGGPSGYLAWKMMVDGDYRGAVKMVLHLRESGLRPEVYSYLIAMTAVVKELNEFAKALRK 300

Query: 301 LKSYARDGMVAELDKDNVELVKRYQSELLADGVRLSNWVLDEGGSSSHRVVHERLLAMYI 360
           LK YARDG VAELDK+NVELV +YQ+ELLADGV+LSNWVL+EG SS   VVHERLLAMYI
Sbjct: 301 LKGYARDGFVAELDKNNVELVAKYQTELLADGVQLSNWVLEEGSSSIRGVVHERLLAMYI 360

Query: 361 CAGQGLEAERQLWEMKLVGKEADADLYDIVLAICASQKETRAMNRLLTRIEITSPRLKKK 420
           CAGQG+EAERQLWEMKLVGKEADADLYDIVLAICASQKET+AM RLLTRIEITSP +KKK
Sbjct: 361 CAGQGVEAERQLWEMKLVGKEADADLYDIVLAICASQKETKAMKRLLTRIEITSPMIKKK 420

Query: 421 SLTWLLRGYIKGGHFRDAAETLVKMVNLGFLPEYLDRVAVLQGLRKRIREPENVETYLDL 480
           SLTWLLRGYIKGGHFRDAA TLVKM+NLGFLPEYLDRVAVLQGLRK IREPE+V TYLDL
Sbjct: 421 SLTWLLRGYIKGGHFRDAAGTLVKMINLGFLPEYLDRVAVLQGLRKEIREPESVHTYLDL 480

Query: 481 CKCLSDANLIGPSLVYLHLQKYKLWVIKML 511
           CKCLSDANLIGPSLVYLHLQK+KLW+IKML
Sbjct: 481 CKCLSDANLIGPSLVYLHLQKHKLWIIKML 501

BLAST of Cp4.1LG16g06550.1 vs. NCBI nr
Match: gi|659130631|ref|XP_008465268.1| (PREDICTED: pentatricopeptide repeat-containing protein At2g30100, chloroplastic [Cucumis melo])

HSP 1 Score: 903.3 bits (2333), Expect = 2.0e-259
Identity = 452/510 (88.63%), Positives = 478/510 (93.73%), Query Frame = 1

Query: 1   MICAQGFTPLTQFGFSFSLSSGLKSERLGFSAPQLCSRSPVNFCFMVSRITCNHQNSTFS 60
           MICAQGFTPLTQFGFSFSLSS L+++R GFS P+L         +MVS I+CN+Q+STFS
Sbjct: 1   MICAQGFTPLTQFGFSFSLSSPLETQRYGFSTPRL---------YMVSPISCNYQDSTFS 60

Query: 61  VSRAGKFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTRDPSDVLEEMNDRLS 120
           VSRA KFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTR+PSDVLEEMNDRLS
Sbjct: 61  VSRAAKFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTREPSDVLEEMNDRLS 120

Query: 121 AREFQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWIKKLVEGQHNV 180
           ARE QLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWI KLVEG+HNV
Sbjct: 121 AREIQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWINKLVEGRHNV 180

Query: 181 GDVVDLLVDMDCVGLKPHFSMIEKVISLYWDMGEKEKAISFVKEVLGRKLDFMKDNWEGH 240
           GDVVDLLVDMDCVGLKPHFSMIEKVISLYW+MGEKEKAI FVKEVLGR L FMKD+WEGH
Sbjct: 181 GDVVDLLVDMDCVGLKPHFSMIEKVISLYWEMGEKEKAIFFVKEVLGRNLAFMKDDWEGH 240

Query: 241 KGGPSGYLAWKMMVDGDYRGAVKMVLNLRESGLKPEVYCYLIAMTAVVKELNEFAKALRK 300
           KGGPSGYLAWKMMVDGDYRGAVKMVL+LRESGL+PEVY YLIAMTAVVKELNEFAKALRK
Sbjct: 241 KGGPSGYLAWKMMVDGDYRGAVKMVLHLRESGLRPEVYSYLIAMTAVVKELNEFAKALRK 300

Query: 301 LKSYARDGMVAELDKDNVELVKRYQSELLADGVRLSNWVLDEGGSSSHRVVHERLLAMYI 360
           LKSYARDG VAELDK+NVELV +YQ+ELLADGVRLSNWVL+EG SS H VVHERLLAMYI
Sbjct: 301 LKSYARDGYVAELDKNNVELVAKYQTELLADGVRLSNWVLEEGSSSIHGVVHERLLAMYI 360

Query: 361 CAGQGLEAERQLWEMKLVGKEADADLYDIVLAICASQKETRAMNRLLTRIEITSPRLKKK 420
           CAGQG+EAERQLWEMKL+GKEADADLYDIVLAICASQKE +AM RLLTRIEITSP +KKK
Sbjct: 361 CAGQGVEAERQLWEMKLLGKEADADLYDIVLAICASQKEIKAMKRLLTRIEITSPMIKKK 420

Query: 421 SLTWLLRGYIKGGHFRDAAETLVKMVNLGFLPEYLDRVAVLQGLRKRIREPENVETYLDL 480
           SLTWLLRGYIKGGHFRDAA T+VKM+NLGFLPEYLDRVAVLQGLRK IREPE V TYLDL
Sbjct: 421 SLTWLLRGYIKGGHFRDAAGTVVKMINLGFLPEYLDRVAVLQGLRKGIREPEIVHTYLDL 480

Query: 481 CKCLSDANLIGPSLVYLHLQKYKLWVIKML 511
           CKCLSDANLIGPSLVYLHLQK+KLW+IKML
Sbjct: 481 CKCLSDANLIGPSLVYLHLQKHKLWIIKML 501

BLAST of Cp4.1LG16g06550.1 vs. NCBI nr
Match: gi|731391774|ref|XP_010650876.1| (PREDICTED: pentatricopeptide repeat-containing protein At2g30100, chloroplastic isoform X2 [Vitis vinifera])

HSP 1 Score: 740.0 bits (1909), Expect = 2.9e-210
Identity = 368/514 (71.60%), Positives = 434/514 (84.44%), Query Frame = 1

Query: 1   MICAQGFTPL----TQFGFSFSLSSGLKSERLGFSAPQLCSRSPVNFCFMVSRITCNHQN 60
           M  A GF       T+ GF+ S S  ++  RL    P+        +C   + I CNHQN
Sbjct: 1   MASAHGFASSLMSPTELGFTLSSSFSIQRPRL--IVPKFSRSFLGEYCSRATTI-CNHQN 60

Query: 61  STFSVSRAGKFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTRDPSDVLEEMN 120
             F V +  K R+ RLFKSVELDQF+TSDDEDEM +GFFEAIEELERMTR+PSDVLEEMN
Sbjct: 61  PRFVVPKRDKIREFRLFKSVELDQFLTSDDEDEMSEGFFEAIEELERMTREPSDVLEEMN 120

Query: 121 DRLSAREFQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWIKKLVEG 180
           DRLSARE QLVLVYFSQEGRDSWCALEVFEWL+KENRVDKETMELMVSIMCSW+KKL+EG
Sbjct: 121 DRLSARELQLVLVYFSQEGRDSWCALEVFEWLRKENRVDKETMELMVSIMCSWVKKLIEG 180

Query: 181 QHNVGDVVDLLVDMDCVGLKPHFSMIEKVISLYWDMGEKEKAISFVKEVLGRKLDFMKDN 240
           +H+VGDVVDLLVDMDCVGLKP FSMIEKVISLYW+M EKEKA+ FVKEVL R++ + +D+
Sbjct: 181 EHDVGDVVDLLVDMDCVGLKPGFSMIEKVISLYWEMEEKEKAVLFVKEVLRREIAYSEDD 240

Query: 241 WEGHKGGPSGYLAWKMMVDGDYRGAVKMVLNLRESGLKPEVYCYLIAMTAVVKELNEFAK 300
            +GHKGGP+GYLAWKMMV+G+YRGAVK+V++LRESGLKPEVY YLIAMTAVVKELNEFAK
Sbjct: 241 GDGHKGGPTGYLAWKMMVEGNYRGAVKLVIHLRESGLKPEVYSYLIAMTAVVKELNEFAK 300

Query: 301 ALRKLKSYARDGMVAELDKDNVELVKRYQSELLADGVRLSNWVLDEGGSSSHRVVHERLL 360
           ALRKLK + + G++AELD +NVEL+++YQS+LLADGVRLS+WV+ EG S  H VV+ERLL
Sbjct: 301 ALRKLKGFTKSGLIAELDAENVELIEKYQSDLLADGVRLSSWVIQEGRSPLHGVVYERLL 360

Query: 361 AMYICAGQGLEAERQLWEMKLVGKEADADLYDIVLAICASQKETRAMNRLLTRIEITSPR 420
           AMYICAG+GLEAERQLWEMKLVGKEAD +LYDIVLAICAS+KE  A++RLLT +E+TS  
Sbjct: 361 AMYICAGRGLEAERQLWEMKLVGKEADRELYDIVLAICASKKEASAISRLLTGMEVTSSI 420

Query: 421 LKKKSLTWLLRGYIKGGHFRDAAETLVKMVNLGFLPEYLDRVAVLQGLRKRIREPENVET 480
            +KK+L+WLLRGYIKG HF DA+ET++KM++LG  PEYLDR AVLQGLR RI++  NVET
Sbjct: 421 RRKKTLSWLLRGYIKGSHFDDASETIIKMLDLGLCPEYLDRAAVLQGLRNRIQQTGNVET 480

Query: 481 YLDLCKCLSDANLIGPSLVYLHLQKYKLWVIKML 511
           YL LCK LSDANLIGP LVYL+++KYKLW++K +
Sbjct: 481 YLKLCKHLSDANLIGPCLVYLYIKKYKLWILKTI 511

BLAST of Cp4.1LG16g06550.1 vs. NCBI nr
Match: gi|225434512|ref|XP_002278434.1| (PREDICTED: pentatricopeptide repeat-containing protein At2g30100, chloroplastic isoform X1 [Vitis vinifera])

HSP 1 Score: 738.4 bits (1905), Expect = 8.3e-210
Identity = 367/514 (71.40%), Positives = 433/514 (84.24%), Query Frame = 1

Query: 1   MICAQGFTPL----TQFGFSFSLSSGLKSERLGFSAPQLCSRSPVNFCFMVSRITCNHQN 60
           M  A GF       T+ GF+ S S  ++  RL    P+        +C   + I CNHQN
Sbjct: 1   MASAHGFASSLMSPTELGFTLSSSFSIQRPRL--IVPKFSRSFLGEYCSRATTI-CNHQN 60

Query: 61  STFSVSRAGKFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTRDPSDVLEEMN 120
             F V +  K R+ RLFKSVELDQF+TSDDEDEM +GFFEAIEELERMTR+PSDVLEEMN
Sbjct: 61  PRFVVPKRDKIREFRLFKSVELDQFLTSDDEDEMSEGFFEAIEELERMTREPSDVLEEMN 120

Query: 121 DRLSAREFQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWIKKLVEG 180
           DRLSARE QLVLVYFSQEGRDSWCALEVFEWL+KENRVDKETMELMVSIMCSW+KKL+EG
Sbjct: 121 DRLSARELQLVLVYFSQEGRDSWCALEVFEWLRKENRVDKETMELMVSIMCSWVKKLIEG 180

Query: 181 QHNVGDVVDLLVDMDCVGLKPHFSMIEKVISLYWDMGEKEKAISFVKEVLGRKLDFMKDN 240
           +H+VGDVVDLLVDMDCVGLKP FSMIEKVISLYW+M EKEKA+ FVKEVL R++ + +D+
Sbjct: 181 EHDVGDVVDLLVDMDCVGLKPGFSMIEKVISLYWEMEEKEKAVLFVKEVLRREIAYSEDD 240

Query: 241 WEGHKGGPSGYLAWKMMVDGDYRGAVKMVLNLRESGLKPEVYCYLIAMTAVVKELNEFAK 300
            +GHKGGP+GYLAWKMM +G+YRGAVK+V++LRESGLKPEVY YLIAMTAVVKELNEFAK
Sbjct: 241 GDGHKGGPTGYLAWKMMAEGNYRGAVKLVIHLRESGLKPEVYSYLIAMTAVVKELNEFAK 300

Query: 301 ALRKLKSYARDGMVAELDKDNVELVKRYQSELLADGVRLSNWVLDEGGSSSHRVVHERLL 360
           ALRKLK + + G++AELD +NVEL+++YQS+LLADGVRLS+WV+ EG S  H VV+ERLL
Sbjct: 301 ALRKLKGFTKSGLIAELDAENVELIEKYQSDLLADGVRLSSWVIQEGRSPLHGVVYERLL 360

Query: 361 AMYICAGQGLEAERQLWEMKLVGKEADADLYDIVLAICASQKETRAMNRLLTRIEITSPR 420
           AMYICAG+GLEAERQLWEMKLVGKEAD +LYDIVLAICAS+KE  A++RLLT +E+TS  
Sbjct: 361 AMYICAGRGLEAERQLWEMKLVGKEADRELYDIVLAICASKKEASAISRLLTGMEVTSSI 420

Query: 421 LKKKSLTWLLRGYIKGGHFRDAAETLVKMVNLGFLPEYLDRVAVLQGLRKRIREPENVET 480
            +KK+L+WLLRGYIKG HF DA+ET++KM++LG  PEYLDR AVLQGLR RI++  NVET
Sbjct: 421 RRKKTLSWLLRGYIKGSHFDDASETIIKMLDLGLCPEYLDRAAVLQGLRNRIQQTGNVET 480

Query: 481 YLDLCKCLSDANLIGPSLVYLHLQKYKLWVIKML 511
           YL LCK LSDANLIGP LVYL+++KYKLW++K +
Sbjct: 481 YLKLCKHLSDANLIGPCLVYLYIKKYKLWILKTI 511

BLAST of Cp4.1LG16g06550.1 vs. NCBI nr
Match: gi|1009116101|ref|XP_015874593.1| (PREDICTED: pentatricopeptide repeat-containing protein At2g30100, chloroplastic [Ziziphus jujuba])

HSP 1 Score: 734.9 bits (1896), Expect = 9.2e-209
Identity = 370/511 (72.41%), Positives = 425/511 (83.17%), Query Frame = 1

Query: 1   MICAQGFTPLTQFGFSFSLSSGLKSERLGFSAPQLCSRSPVNFCFMVSRIT-CNHQNSTF 60
           M  A GFT L Q G S S SS     R    AP +C      F   V  I  C HQN  F
Sbjct: 1   MDSAHGFTSLIQLGLS-SPSSSFSLHRHPIFAPPICQNLAGRFYPRVCPIIYCRHQNPYF 60

Query: 61  SVSRAGKFRDLRLFKSVELDQFITSDDEDEMGDGFFEAIEELERMTRDPSDVLEEMNDRL 120
            V++  KFR+ RLFKSVELDQF+TSDDE+EMG+GFFEAIEELERMTR+PSDVLEEMN+RL
Sbjct: 61  IVTKQSKFREFRLFKSVELDQFLTSDDEEEMGEGFFEAIEELERMTREPSDVLEEMNERL 120

Query: 121 SAREFQLVLVYFSQEGRDSWCALEVFEWLQKENRVDKETMELMVSIMCSWIKKLVEGQHN 180
           SARE QLVLVYFSQEGRDSWCALEVFEWL+KENRVDKETMELMVSIMCSW+KKL+E + +
Sbjct: 121 SARELQLVLVYFSQEGRDSWCALEVFEWLRKENRVDKETMELMVSIMCSWVKKLIEEERD 180

Query: 181 VGDVVDLLVDMDCVGLKPHFSMIEKVISLYWDMGEKEKAISFVKEVLGRKLDFMKDNWEG 240
           VGDVVDLLVDMDCVGLKP FSM+EKVISLYW+MGEKE+++ FVKEVL R +   +D+ +G
Sbjct: 181 VGDVVDLLVDMDCVGLKPSFSMMEKVISLYWEMGEKERSVLFVKEVLRRGIACSEDDGDG 240

Query: 241 HKGGPSGYLAWKMMVDGDYRGAVKMVLNLRESGLKPEVYCYLIAMTAVVKELNEFAKALR 300
           HKGGP+GYLAWKMM +G+Y GAVK+V+N+RESGLKPEVY YLIAMTAVVKELNEFAKALR
Sbjct: 241 HKGGPTGYLAWKMMAEGNYMGAVKLVVNIRESGLKPEVYSYLIAMTAVVKELNEFAKALR 300

Query: 301 KLKSYARDGMVAELDKDNVELVKRYQSELLADGVRLSNWVLDEGGSSSHRVVHERLLAMY 360
           KLK +A+DG+VAE+D +N  L+K+YQS+LLA GVRLSNW+  EG SS   VVHERLLAMY
Sbjct: 301 KLKGFAKDGLVAEVDTENAGLIKKYQSDLLAVGVRLSNWITQEGSSSLSGVVHERLLAMY 360

Query: 361 ICAGQGLEAERQLWEMKLVGKEADADLYDIVLAICASQKETRAMNRLLTRIEITSPRLKK 420
           +CAG+GLEAERQLWEMKLVGKEAD DLYDIVLAICASQKE  A+ RLLTR+E TS   KK
Sbjct: 361 VCAGRGLEAERQLWEMKLVGKEADGDLYDIVLAICASQKEASAIARLLTRLEATSSLRKK 420

Query: 421 KSLTWLLRGYIKGGHFRDAAETLVKMVNLGFLPEYLDRVAVLQGLRKRIREPENVETYLD 480
           KSL+WLLRGYIKGGHF +AAET+ KM++LG  PEYLDR AVLQGLR+RI     +ETYL 
Sbjct: 421 KSLSWLLRGYIKGGHFDNAAETVFKMLDLGLPPEYLDRAAVLQGLRRRIHRSGGLETYLK 480

Query: 481 LCKCLSDANLIGPSLVYLHLQKYKLWVIKML 511
           LCK LSD NLIGP L+YL+++KYKLW+IKM+
Sbjct: 481 LCKRLSDNNLIGPCLLYLYIKKYKLWIIKMI 510
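
The percentages in the HSP header above follow directly from the raw counts, and the counts can be read off the alignment midline, where a letter marks an identical residue and `+` marks a conservative substitution. A minimal Python sketch of that arithmetic; the helper names `percent` and `midline_counts` are hypothetical and not part of any BLAST tool:

```python
def percent(count: int, length: int) -> float:
    """Return count/length as a percentage rounded to two decimals."""
    return round(100.0 * count / length, 2)


def midline_counts(midline: str) -> tuple[int, int]:
    """Count identities (letters) and positives (letters plus '+') in a BLAST midline."""
    identities = sum(1 for ch in midline if ch.isalpha())
    positives = identities + midline.count("+")
    return identities, positives


# Counts from the HSP header: 370 identical and 425 positive columns out of 511.
print(percent(370, 511))  # 72.41
print(percent(425, 511))  # 83.17

# Midline of the final alignment row (query residues 481..511).
print(midline_counts("LCK LSD NLIGP L+YL+++KYKLW+IKM+"))  # (22, 28)
```

Summing the per-row counts over every row of an HSP should reproduce the header totals, provided the midline text has not lost or gained whitespace during copying.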

The following BLAST results are available for this feature:
Match Name   E-value   Identity  Description
PP176_ARATH  1.7e-170  63.08     Pentatricopeptide repeat-containing protein At2g30100, chloroplastic OS=Arabidop...
Match Name        E-value   Identity  Description
A0A0A0KC35_CUCSA  9.4e-261  88.82     Uncharacterized protein OS=Cucumis sativus GN=Csa_6G182120 PE=4 SV=1
F6GUM4_VITVI      5.8e-210  71.40     Putative uncharacterized protein OS=Vitis vinifera GN=VIT_06s0004g07010 PE=4 SV=...
B9HNB9_POPTR      4.2e-208  76.38     Ubiquitin family protein OS=Populus trichocarpa GN=POPTR_0009s07900g PE=4 SV=1
W9R4F5_9ROSA      4.6e-207  70.71     Uncharacterized protein OS=Morus notabilis GN=L484_011688 PE=4 SV=1
M5VXS0_PRUPE      1.0e-206  72.32     Uncharacterized protein OS=Prunus persica GN=PRUPE_ppa004609mg PE=4 SV=1
Match Name   E-value   Identity  Description
AT2G30100.1  9.6e-172  63.08     pentatricopeptide (PPR) repeat-containing protein
Match Name                         E-value   Identity  Description
gi|778713772|ref|XP_011657120.1|   1.4e-260  88.82     PREDICTED: pentatricopeptide repeat-containing protein At2g30100, chloroplastic ...
gi|659130631|ref|XP_008465268.1|   2.0e-259  88.63     PREDICTED: pentatricopeptide repeat-containing protein At2g30100, chloroplastic ...
gi|731391774|ref|XP_010650876.1|   2.9e-210  71.60     PREDICTED: pentatricopeptide repeat-containing protein At2g30100, chloroplastic ...
gi|225434512|ref|XP_002278434.1|   8.3e-210  71.40     PREDICTED: pentatricopeptide repeat-containing protein At2g30100, chloroplastic ...
gi|1009116101|ref|XP_015874593.1|  9.2e-209  72.41     PREDICTED: pentatricopeptide repeat-containing protein At2g30100, chloroplastic ...
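
The NCBI nr match names above use the pipe-delimited form `gi|<GI number>|ref|<accession>|`. A small Python sketch for splitting such a name into its parts; the function name `parse_gi_name` is hypothetical, and the code assumes exactly this four-field layout:

```python
def parse_gi_name(name: str) -> dict[str, str]:
    """Split 'gi|<number>|<db>|<accession>|' into labelled fields."""
    fields = name.strip("|").split("|")
    if len(fields) != 4 or fields[0] != "gi":
        raise ValueError(f"unexpected identifier layout: {name!r}")
    _, gi_number, database, accession = fields
    return {"gi": gi_number, "db": database, "accession": accession}


print(parse_gi_name("gi|1009116101|ref|XP_015874593.1|"))
# {'gi': '1009116101', 'db': 'ref', 'accession': 'XP_015874593.1'}
```

The accession field (here `XP_015874593.1`) is the part that matches the `ref|...|` accession shown in the alignment heading for the same hit.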
The following terms have been associated with this mRNA:
Vocabulary: INTERPRO
Term       Definition
IPR002885  Pentatricopeptide_repeat
GO Assignments
This mRNA is annotated with the following GO terms.
Category Term Accession Term Name
biological_process GO:0008150 biological_process
cellular_component GO:0005575 cellular_component
molecular_function GO:0003674 molecular_function

This mRNA is a part of the following gene feature(s):

Feature Name     Unique Name      Type
Cp4.1LG16g06550  Cp4.1LG16g06550  gene


The following five_prime_UTR feature(s) are a part of this mRNA:

Feature Name                          Unique Name                           Type
Cp4.1LG16g06550.1:five_prime_utr:001  Cp4.1LG16g06550.1:five_prime_utr:001  five_prime_UTR


The following CDS feature(s) are a part of this mRNA:

Feature Name               Unique Name                Type
Cp4.1LG16g06550.1:cds:001  Cp4.1LG16g06550.1:cds:001  CDS
Cp4.1LG16g06550.1:cds:002  Cp4.1LG16g06550.1:cds:002  CDS


The following three_prime_UTR feature(s) are a part of this mRNA:

Feature Name                           Unique Name                            Type
Cp4.1LG16g06550.1:three_prime_utr:001  Cp4.1LG16g06550.1:three_prime_utr:001  three_prime_UTR


The following polypeptide feature(s) derives from this mRNA:

Feature Name       Unique Name                Type
Cp4.1LG16g06550.1  Cp4.1LG16g06550.1-protein  polypeptide


Analysis Name: InterPro Annotations of Cucurbita pepo
Date Performed: 2017-12-02
IPR Term   IPR Description           Source   Source Term      Source Description   Alignment
IPR002885  Pentatricopeptide repeat  PROFILE  PS51375          PPR                  coord: 418..452, score: 8
None       No IPR available          PANTHER  PTHR24015        FAMILY NOT NAMED     coord: 41..297, score: 1.5E-96; coord: 347..453, score: 1.5
None       No IPR available          PANTHER  PTHR24015:SF683  SUBFAMILY NOT NAMED  coord: 347..453, score: 1.5E-96; coord: 41..297, score: 1.5
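
The `coord` values in the InterPro alignments are `start..end` ranges. Assuming they are 1-based, inclusive residue positions on the polypeptide (an assumption; the page does not state the convention), the length of each matched region can be sketched in Python as follows. The name `span_length` is hypothetical:

```python
import re

# Matches the "start..end" part of strings such as "coord: 418..452".
COORD = re.compile(r"(\d+)\.\.(\d+)")


def span_length(coord: str) -> int:
    """Length of an inclusive, 1-based 'start..end' range."""
    match = COORD.search(coord)
    if match is None:
        raise ValueError(f"no start..end range in {coord!r}")
    start, end = int(match.group(1)), int(match.group(2))
    return end - start + 1


print(span_length("coord: 418..452"))  # 35
print(span_length("coord: 41..297"))   # 257
print(span_length("coord: 347..453"))  # 107
```

Under that assumption the PS51375 match spans 35 residues, which agrees with the 35-amino-acid length of a pentatricopeptide repeat motif.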