Sequence analysis and rule development of predicting protein stability change upon mutation using decision tree model

Liang Tsung Huang, M. Michael Gromiha, Shinn-Ying Ho*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

25 Scopus citations

Abstract

Understanding the mechanism of protein stability change is one of the most challenging tasks. Recently, predicting the change in protein stability caused by single point mutations has become an interesting topic in molecular biology. However, it remains desirable to extract further knowledge from large databases to gain new insights into the nature of these changes. This paper presents an interpretable prediction tree method (named iPTREE-2) that can accurately predict changes in protein stability upon mutation from sequence-based information and analyze sequence characteristics from the viewpoint of composition and order. Because it is based on a regression tree algorithm, iPTREE-2 can identify important factors and develop rules for the purpose of data mining. On a dataset of 1859 different single point mutations from the thermodynamic database ProTherm, iPTREE-2 yields a correlation coefficient of 0.70 between predicted and experimental values. In the data mining task, detailed sequence analysis suggests that residue composition may be specific to different ranges of stability change and implies the existence of certain patterns. In building rules, we found that the residue at the mutation site in the wild-type protein and the residue that replaces it in the mutant protein play an important role. The present study demonstrates that iPTREE-2 can serve the purpose of predicting protein stability change, especially when more understandable knowledge is required.
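The general idea of a regression tree that predicts a stability change and also yields readable rules can be sketched as follows. This is a minimal illustration and not the iPTREE-2 implementation: the one-hot encoding of the wild-type and mutant residues, the greedy squared-error split, the helper names (`encode`, `build_tree`, `rules`, `pearson`) and the tree depth are all assumptions, and the data are synthetic values generated from an arbitrary toy signal, not ProTherm measurements.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def encode(wild, mutant):
    """One-hot encode wild-type and mutant residues as 40 binary features."""
    return ([1.0 if aa == wild else 0.0 for aa in AMINO_ACIDS]
            + [1.0 if aa == mutant else 0.0 for aa in AMINO_ACIDS])

def feature_name(j):
    return ("wild=" if j < 20 else "mutant=") + AMINO_ACIDS[j % 20]

def mean(values):
    return sum(values) / len(values)

def sse(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, depth=0, max_depth=4, min_leaf=5):
    """Greedy regression tree: split on the binary feature with the lowest squared error."""
    if depth == max_depth or len(y) < 2 * min_leaf:
        return {"value": mean(y)}
    best_cost, best_j = None, None
    for j in range(len(X[0])):
        yes = [t for row, t in zip(X, y) if row[j] > 0.5]
        no = [t for row, t in zip(X, y) if row[j] <= 0.5]
        if len(yes) < min_leaf or len(no) < min_leaf:
            continue
        cost = sse(yes) + sse(no)
        if best_cost is None or cost < best_cost:
            best_cost, best_j = cost, j
    if best_j is None:
        return {"value": mean(y)}
    yes_rows = [(r, t) for r, t in zip(X, y) if r[best_j] > 0.5]
    no_rows = [(r, t) for r, t in zip(X, y) if r[best_j] <= 0.5]
    return {
        "feature": best_j,
        "yes": build_tree([r for r, _ in yes_rows], [t for _, t in yes_rows],
                          depth + 1, max_depth, min_leaf),
        "no": build_tree([r for r, _ in no_rows], [t for _, t in no_rows],
                         depth + 1, max_depth, min_leaf),
    }

def predict(tree, row):
    while "value" not in tree:
        tree = tree["yes"] if row[tree["feature"]] > 0.5 else tree["no"]
    return tree["value"]

def rules(tree, path=()):
    """Yield (condition, predicted value) pairs, one per leaf."""
    if "value" in tree:
        yield (" and ".join(path) or "always"), tree["value"]
        return
    name = feature_name(tree["feature"])
    yield from rules(tree["yes"], path + (name,))
    yield from rules(tree["no"], path + ("not " + name,))

def pearson(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))

# Synthetic toy data: the signal below is arbitrary and is NOT a finding of the paper.
random.seed(0)
X, y = [], []
for _ in range(400):
    wild, mutant = random.sample(AMINO_ACIDS, 2)
    signal = (-1.5 if wild in "ILVF" else 0.0) + (1.0 if mutant in "ILVF" else 0.0)
    X.append(encode(wild, mutant))
    y.append(signal + random.gauss(0.0, 0.5))

# Simple hold-out split (an assumption; the abstract does not state the validation scheme).
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]
tree = build_tree(X_train, y_train)
r_test = pearson([predict(tree, row) for row in X_test], y_test)

print(f"held-out correlation: {r_test:.2f}")
for condition, value in rules(tree):
    print(f"if {condition}: predicted change = {value:+.2f}")
```

Each printed rule corresponds to one leaf of the tree, which is what makes this kind of model easier to interpret than a black-box predictor. The correlation printed here describes only the toy data and is not comparable to the 0.70 reported above.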

Original language: English
Pages (from-to): 879-890
Number of pages: 12
Journal: Journal of Molecular Modeling
Volume: 13
Issue number: 8
State: Published - 1 Aug 2007

Keywords

  • Bioinformatics
  • Data mining
  • Decision trees
  • Prediction
  • Protein stability
