Regularity of protein secondary structures and its prediction

Yen Wei Chu*, Chuen-Tsai Sun, Chung Yuan Huang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

We define a schema representation for visualizing the relationship between primary and secondary protein structures. On a training set with low sequence similarity, a steady-state genetic algorithm outperforms association rule mining at finding schemata with high discrimination and confidence. The discovered schemata can be offered to biologists for studying the regularity of protein secondary structures, and can also be applied to predict those structures. Because Q3 accuracy was poor in the previous study, we add a clustering method to the steady-state genetic algorithm. The clustering method plays two important roles: it generates part of the initial chromosomes for the genetic algorithm, and it assists the schemata in predicting protein secondary structures. In our tests, the new approach improves Q3 accuracy by 12% compared with previous efforts. We also present and discuss new examples of schemata with interesting biological meaning.
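For readers unfamiliar with the Q3 measure mentioned above, the following is a minimal sketch of how it is conventionally computed: the percentage of residues whose predicted three-state label (helix H, strand E, coil C) matches the observed label. The function name `q3_accuracy` and the string encoding of the labels are assumptions made for illustration; they are not taken from the article, and the authors' own evaluation code may differ.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose three-state label matches.

    Both arguments are strings over the alphabet H (helix), E (strand),
    C (coil), one character per residue. This encoding is an assumption
    made for this sketch.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")

    valid_states = set("HEC")
    if (set(predicted) | set(observed)) - valid_states:
        raise ValueError("labels must be one of H, E, or C")

    # Count positions where the predicted state equals the observed state.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    # Hypothetical six-residue example: five of six labels agree.
    print(round(q3_accuracy("HHHECC", "HHHCCC"), 2))
```

The sketch scores a single sequence. When several proteins are evaluated, Q3 is usually reported either per residue over the whole test set or as a per-protein average, and the two can differ.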

Original language: English
Pages (from-to): 380-387
Number of pages: 8
Journal: WSEAS Transactions on Systems
Volume: 5
Issue number: 2
State: Published - 1 Feb 2006

Keywords

  • Clustering
  • Data mining
  • Genetic algorithms
  • Knowledge discovery
  • Protein secondary structure
