PredCRP: Predicting and analysing the regulatory roles of CRP from its binding sites in Escherichia coli

Ming Ju Tsai, Jyun Rong Wang, Chi Dung Yang, Kuo Ching Kao, Wen Lin Huang, Hsi Yuan Huang, Ching-Ping Tseng, Hsien Da Huang, Shinn-Ying Ho*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

12 Scopus citations


Cyclic AMP receptor protein (CRP), a global regulator in Escherichia coli, regulates more than 180 genes via two roles: activation and repression. Few methods are available for predicting the regulatory roles of transcription factors from their binding sites. This work proposes an accurate method, PredCRP, which derives an optimised model (named PredCRP-model) and a set of four interpretable rules (named PredCRP-ruleset) for predicting and analysing the regulatory roles of CRP from the sequences of CRP-binding sites. A dataset of 169 CRP-binding sites whose regulatory roles are strongly supported by evidence was compiled. The PredCRP-model, which combines 12 informative features of CRP-binding sites with a support vector machine, achieved training and test accuracies of 0.98 and 0.93, respectively. PredCRP-ruleset consists of two activation rules and two repression rules derived from the same 12 features using the decision tree method C4.5. This work further screened and identified 23 previously unobserved regulatory interactions in Escherichia coli. When these were validated using quantitative PCR, PredCRP-model and PredCRP-ruleset achieved test accuracies of 0.96 (=22/23) and 0.91 (=21/23), respectively. The proposed method is suitable for designing predictors of the regulatory roles of all global regulators in Escherichia coli. PredCRP can be accessed at
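
The validation accuracies quoted in the abstract follow directly from the reported counts (22 and 21 correct predictions out of 23 validated interactions). The short sketch below only reproduces that arithmetic; the function and variable names are hypothetical and are not taken from the PredCRP software.

```python
def accuracy(correct: int, total: int) -> float:
    """Fraction of predictions that agree with the experimental outcome."""
    if total <= 0:
        raise ValueError("total must be a positive integer")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    return correct / total


# Counts as reported in the abstract; the names are illustrative only.
N_VALIDATED = 23       # interactions checked by quantitative PCR
MODEL_CORRECT = 22     # correct predictions by PredCRP-model
RULESET_CORRECT = 21   # correct predictions by PredCRP-ruleset

model_acc = accuracy(MODEL_CORRECT, N_VALIDATED)
ruleset_acc = accuracy(RULESET_CORRECT, N_VALIDATED)

print(f"PredCRP-model:   {model_acc:.2f}")
print(f"PredCRP-ruleset: {ruleset_acc:.2f}")
```

Rounded to two decimals, the two ratios match the figures given in the abstract (0.96 and 0.91).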

Original language: English
Article number: 18648
Journal: Scientific Reports
Issue number: 1
State: Published - 1 Dec 2018

