Design of accurate predictors for DNA-binding sites in proteins using hybrid SVM-PSSM method

Shinn-Ying Ho*, Fu Chieh Yu, Chia Yun Chang, Hui Ling Huang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

32 Scopus citations

Abstract

In this paper, we investigate the design of accurate predictors for DNA-binding sites in proteins from amino acid sequences. We propose a hybrid method, SVM-PSSM, that combines a support vector machine (SVM) with evolutionary information about amino acid sequences, expressed as position-specific scoring matrices (PSSMs), to predict DNA-binding sites. Because binding and non-binding residues in proteins are markedly unequal in number, two additional weights, together with the SVM parameters, are analyzed and adopted to maximize net prediction (NP, the average of sensitivity and specificity) accuracy. To evaluate the generalization ability of SVM-PSSM, we additionally establish a DNA-binding dataset, PDC-59, consisting of 59 protein chains with low sequence identity to one another. Using the same six-fold cross-validation procedure and PSSM features, the SVM-based method achieves NP = 80.15% on the training dataset PDNA-62 and NP = 69.54% on the test dataset PDC-59. These results are much better than those of the existing neural network-based method, raising the NP values for training and test accuracy by up to 13.45% and 16.53%, respectively. Simulation results show that SVM-PSSM performs well in predicting DNA-binding sites of novel proteins from amino acid sequences.
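The NP measure defined in the abstract (the average of sensitivity and specificity) can be illustrated with a minimal Python sketch. The function name `net_prediction` and all residue counts below are hypothetical, chosen only for illustration; they are not taken from the paper. The example also shows why plain accuracy can mislead when non-binding residues greatly outnumber binding ones.

```python
def net_prediction(tp, fn, tn, fp):
    """Return (sensitivity, specificity, NP) from confusion-matrix counts.

    tp/fn count binding residues predicted correctly/incorrectly;
    tn/fp count non-binding residues predicted correctly/incorrectly.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one binding and one non-binding residue")
    sensitivity = tp / (tp + fn)   # fraction of binding residues recovered
    specificity = tn / (tn + fp)   # fraction of non-binding residues recovered
    return sensitivity, specificity, (sensitivity + specificity) / 2


# Hypothetical imbalanced data: 100 binding and 900 non-binding residues.
# Predictor A finds most binding residues at the cost of some false positives.
sens_a, spec_a, np_a = net_prediction(tp=70, fn=30, tn=810, fp=90)
acc_a = (70 + 810) / 1000

# Predictor B labels every residue as non-binding.
sens_b, spec_b, np_b = net_prediction(tp=0, fn=100, tn=900, fp=0)
acc_b = (0 + 900) / 1000

print(f"A: NP={np_a:.2f} accuracy={acc_a:.2f}")
print(f"B: NP={np_b:.2f} accuracy={acc_b:.2f}")
```

In this made-up case the trivial predictor B has the higher plain accuracy but a lower NP, because NP gives the binding and non-binding classes equal weight.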

Original language: English
Pages (from-to): 234-241
Number of pages: 8
Journal: BioSystems
Volume: 90
Issue number: 1
State: Published - 1 Jul 2007

Keywords

  • Amino acid sequence
  • DNA-binding prediction
  • Position-specific scoring matrices (PSSM)
  • Protein
  • Support vector machine (SVM)
