Designing predictors of halophilic and non-halophilic proteins using support vector machines

Hui Ling Huang*, Yerukala Sathipati Srinivasulu, Phasit Charoenkwan, Hua Chin Lee, Shinn-Ying Ho

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Scopus citations

Abstract

Identifying the molecular features that cause halophilicity in halostable organisms helps in understanding halophilic adaptation. In this study, we propose a machine learning method for predicting halophilic proteins. The study has six stages. First, we establish a non-redundant dataset of halophilic proteins collected from the NCBI, UniProtKB and EMBL-EBI databases. The dataset consists of 245 positive and negative proteins with sequence identity <25%. Second, the protein sequences are represented by three types of feature vector sets: amino acid composition, dipeptide composition, and physicochemical properties. Third, we propose three classifiers based on the support vector machine (SVM) to distinguish halophilic proteins from non-halophilic proteins. Fourth, each of the three classifiers achieves an independent test accuracy above 83%. Fifth, an inheritable bi-objective combinatorial genetic algorithm is used to select a set of 11 physicochemical properties (PCPs). Sixth, the abundant amino acids, the highly differing dipeptides (amino acid pairs) and the 11 informative PCPs support the analysis of halophilic and non-halophilic proteins.
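As a rough illustration of the first two sequence representations named in the abstract, the sketch below computes amino acid composition (20 values) and dipeptide composition (400 values) for one sequence. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the features are normalized or how non-standard residues are handled, so the fraction-based normalization, the skipping of non-standard residues, and all function names here are hypothetical. The physicochemical-property encoding and the SVM training step are not shown.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes) and all 400 ordered pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Fraction of each standard residue among the standard residues in seq.

    Assumption: non-standard symbols (e.g. X) are ignored.
    """
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each ordered adjacent residue pair in seq.

    Assumption: a pair is counted only if both residues are standard.
    """
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    if not valid:
        return [0.0] * len(DIPEPTIDES)
    counts = {d: 0 for d in DIPEPTIDES}
    for p in valid:
        counts[p] += 1
    return [counts[d] / len(valid) for d in DIPEPTIDES]


def feature_vector(seq):
    """Concatenate both encodings into one 420-dimensional vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)
```

Vectors of this kind, one per protein, could then be passed with their class labels to any SVM implementation; which kernel and parameters the authors used is not stated in the abstract.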

Original language: English
Title of host publication: Proceedings of the IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013
Pages: 230-237
Number of pages: 8
State: Published - 10 Oct 2013
Event: 10th Annual IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013 - Singapore, Singapore
Duration: 16 Apr 2013 → 19 Apr 2013

Publication series

Name: Proceedings of the IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013

Conference

Conference: 10th Annual IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013
Country: Singapore
City: Singapore
Period: 16/04/13 → 19/04/13

Keywords

  • Genetic algorithms
  • Halophilic proteins
  • Physicochemical properties
  • SVM
