Interpretable knowledge acquisition for predicting DNA-binding domains using an evolutionary fuzzy classifier method

Hui Ling Huang*, Chang Fang-Lin Chang, Shinn Jang Ho, Li Sun Shu, Shinn-Ying Ho

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

DNA-binding domains are functional proteins in a cell and play a vital role in various essential biological activities. It is desirable to predict and analyze novel proteins from protein sequences alone using machine learning approaches. Numerous prediction methods have been proposed that identify informative features and design effective classifiers. The support vector machine (SVM) is well recognized as an accurate and robust classifier. However, the black-box mechanism of the SVM offers low interpretability for biologists. A prediction method with interpretable features and interpretable prediction results is preferable. In this study, we propose an interpretable physicochemical property classifier (named iPPC) with an accurate and compact fuzzy rule base using a scatter partition of feature space for DNA-binding data analysis. In designing iPPC, the flexible membership functions, the fuzzy rules, and the selection of physicochemical properties are optimized simultaneously. An intelligent genetic algorithm (IGA) is used to efficiently solve this design problem, which has a large number of tuning parameters, so as to maximize prediction accuracy, minimize the number of selected features, and minimize the number of fuzzy rules. Using benchmark datasets of DNA-binding domains, iPPC obtains a training accuracy of 81% and a test accuracy of 79% with three fuzzy rules and two physicochemical properties. Compared with a decision tree method, which has a training accuracy of 77%, iPPC has a more compact and interpretable knowledge base. The two selected physicochemical properties are "Number of hydrogen bond donors" and "Helix-coil equilibrium constant" from the AAindex database.
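
To make the design described in the abstract concrete, the following is a minimal sketch of a fuzzy rule-based classifier over two sequence-derived features, with three rules. It is not the authors' iPPC implementation. Everything specific in it is an assumption made for illustration: the per-residue scores are invented placeholders (not AAindex values), the Gaussian membership shape stands in for the "flexible membership function", the rule parameters are hand-picked rather than evolved by IGA, and the weighted-sum `fitness` is only one possible way to combine the three stated objectives.

```python
import math

# Hypothetical per-residue scores for two placeholder properties.
# These numbers are invented for illustration; they are NOT AAindex values.
TOY_PROPERTIES = {
    "prop_a": {"A": 0.0, "K": 1.0, "R": 1.0, "E": 0.2, "L": 0.0, "G": 0.1},
    "prop_b": {"A": 0.9, "K": 0.6, "R": 0.5, "E": 0.8, "L": 0.9, "G": 0.1},
}


def encode(sequence, properties=TOY_PROPERTIES):
    """Map a sequence to one feature per property: the mean residue score."""
    features = []
    for name in sorted(properties):
        table = properties[name]
        scores = [table[res] for res in sequence if res in table]
        features.append(sum(scores) / len(scores) if scores else 0.0)
    return features


def gaussian(x, center, width):
    """Gaussian membership function (an assumed shape, not the paper's)."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


# Each rule holds one (center, width) pair per feature plus a class label.
# In a scatter partition, every rule owns its own fuzzy region of feature
# space instead of sharing a fixed grid. Parameters here are hand-picked.
TOY_RULES = [
    {"mf": [(0.8, 0.3), (0.5, 0.3)], "label": "binding"},
    {"mf": [(0.1, 0.3), (0.8, 0.3)], "label": "non-binding"},
    {"mf": [(0.1, 0.3), (0.2, 0.3)], "label": "non-binding"},
]


def firing_strength(features, rule):
    """Product of the membership degrees of all features for one rule."""
    strength = 1.0
    for x, (center, width) in zip(features, rule["mf"]):
        strength *= gaussian(x, center, width)
    return strength


def classify(features, rules=TOY_RULES):
    """Single-winner inference: the most strongly firing rule decides."""
    return max(rules, key=lambda rule: firing_strength(features, rule))["label"]


def fitness(accuracy, n_features, n_rules, w_feat=0.01, w_rule=0.01):
    """Assumed weighted sum: reward accuracy, penalize features and rules."""
    return accuracy - w_feat * n_features - w_rule * n_rules
```

With these toy values, `classify(encode("KRKR"))` falls in the region of the first rule, while `classify(encode("ALAL"))` falls in the region of the second. An evolutionary search such as IGA would tune the selected properties, the `(center, width)` pairs, and the rule set against an objective like `fitness`, which is where the compactness of the final rule base comes from.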

Original language: English
Title of host publication: Proceedings - 2011 IEEE International Conference on Computer Science and Automation Engineering, CSAE 2011
Pages: 295-299
Number of pages: 5
DOIs: https://doi.org/10.1109/CSAE.2011.5952854
State: Published - 25 Aug 2011
Event: 2011 IEEE International Conference on Computer Science and Automation Engineering, CSAE 2011 - Shanghai, China
Duration: 10 Jun 2011 → 12 Jun 2011

Publication series

Name: Proceedings - 2011 IEEE International Conference on Computer Science and Automation Engineering, CSAE 2011
Volume: 4

Conference

Conference: 2011 IEEE International Conference on Computer Science and Automation Engineering, CSAE 2011
Country: China
City: Shanghai
Period: 10/06/11 → 12/06/11

Keywords

  • DNA-binding
  • fuzzy classifier
  • genetic algorithm
  • knowledge acquisition
  • physicochemical properties
  • prediction

Cite this

Huang, H. L., Fang-Lin Chang, C., Ho, S. J., Shu, L. S., & Ho, S-Y. (2011). Interpretable knowledge acquisition for predicting DNA-binding domains using an evolutionary fuzzy classifier method. In Proceedings - 2011 IEEE International Conference on Computer Science and Automation Engineering, CSAE 2011 (pp. 295-299). [5952854] (Proceedings - 2011 IEEE International Conference on Computer Science and Automation Engineering, CSAE 2011; Vol. 4). https://doi.org/10.1109/CSAE.2011.5952854