SCMPSP: Prediction and characterization of photosynthetic proteins based on a scoring card method

Tamara Vasylenko, Yi Fan Liou, Hong An Chen, Phasit Charoenkwan, Hui Ling Huang*, Shinn-Ying Ho

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

11 Scopus citations

Abstract

Background: Photosynthetic proteins (PSPs) greatly differ in their structure and function as they are involved in numerous subprocesses that take place inside an organelle called a chloroplast. Few studies predict PSPs from sequences due to their high variety of sequences and structues. This work aims to predict and characterize PSPs by establishing the datasets of PSP and non-PSP sequences and developing prediction methods. Results: A novel bioinformatics method of predicting and characterizing PSPs based on scoring card method (SCMPSP) was used. First, a dataset consisting of 649 PSPs was established by using a Gene Ontology term GO:0015979 and 649 non-PSPs from the SwissProt database with sequence identity <= 25%.- Several prediction methods are presented based on support vector machine (SVM), decision tree J48, Bayes, BLAST, and SCM. The SVM method using dipeptide features-performed well and yielded - a test accuracy of 72.31%. The SCMPSP method uses the estimated propensity scores of 400 dipeptides - as PSPs and has a test accuracy of 71.54%, which is comparable to that of the SVM method. The derived propensity scores of 20 amino acids were further used to identify informative physicochemical properties for characterizing PSPs. The analytical results reveal the following four characteristics of PSPs: 1) PSPs favour hydrophobic side chain amino acids; 2) PSPs are composed of the amino acids prone to form helices in membrane environments; 3) PSPs have low interaction with water; and 4) PSPs prefer to be composed of the amino acids of electron-reactive side chains. Conclusions: The SCMPSP method not only estimates the propensity of a sequence to be PSPs, it also discovers characteristics that further improve understanding of PSPs. The SCMPSP source code and the datasets used in this study are available at.

Original languageEnglish
Article number9999
JournalBMC Bioinformatics
Volume16
Issue number1
DOIs
StatePublished - 21 Jan 2015

Fingerprint Dive into the research topics of 'SCMPSP: Prediction and characterization of photosynthetic proteins based on a scoring card method'. Together they form a unique fingerprint.

Cite this