Small non-coding RNA (sRNA) genes have been shown to play important regulatory roles in a variety of cellular processes. Predicting sRNA genes remains a great challenge for both experimental and computational approaches, because sRNAs are small in size, are not translated into proteins, and show variable stability. Most known sRNAs have been identified in Escherichia coli and have been shown to be conserved in closely related organisms. We have developed an integrative approach that searches highly conserved intergenic regions among related bacterial genomes for combinations of characteristics extracted from known E. coli sRNA genes. Support vector machines (SVMs) were then used with these characteristics to predict novel sRNA genes. (c) 2010 Elsevier Ltd. All rights reserved.
- Expert systems; Support vector machines; Machine learning; Bioinformatics; Non-coding RNA
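
The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear SVM trained by sub-gradient descent on the regularised hinge loss (the abstract does not state the kernel or solver), and the two features per candidate region (a conservation score and a GC fraction), the toy values, and the names `train_linear_svm` and `predict` are all hypothetical choices made for illustration.

```python
# Minimal linear SVM sketch using only the Python standard library.
# Assumption: features and data are invented for illustration; the
# abstract does not list the characteristics actually used.


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Fit weights w and bias b by sub-gradient descent on the
    L2-regularised hinge loss. Labels in y must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (dot(w, xi) + b)
            # The regulariser shrinks the weights on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # The hinge term contributes only when the margin is violated.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 (candidate sRNA) or -1 (background intergenic region)."""
    return 1 if dot(w, x) + b >= 0 else -1


# Hypothetical feature vectors: [conservation score, GC fraction].
X_train = [
    [0.90, 0.55], [0.85, 0.50], [0.95, 0.60],  # regions labelled as sRNA
    [0.20, 0.30], [0.30, 0.35], [0.10, 0.40],  # regions labelled as background
]
y_train = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X_train, y_train)

if __name__ == "__main__":
    for region in ([0.92, 0.58], [0.15, 0.32]):
        print(region, predict(w, b, region))
```

In practice an established SVM library and a kernel chosen by cross-validation would likely replace this hand-written trainer; the sketch only shows how per-region characteristics can feed a margin-based classifier.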