Selecting a minimal number of relevant genes from microarray data to design accurate tissue classifiers

Hui Ling Huang, Chong Cheng Lee, Shinn-Ying Ho*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

27 Scopus citations


To develop inexpensive diagnostic tests, it is essential to select a minimal number of relevant genes from microarray data while maximizing classification accuracy. However, simultaneously optimizing gene selection and classification accuracy is an intractable, large-scale parameter optimization problem. We propose an efficient evolutionary approach to gene selection from microarray data that can be combined with the optimal design of various multiclass classifiers. The proposed method (named GeneSelect) consists of three closely cooperating parts: an efficient encoding scheme for candidate solutions, a generalized fitness function, and an intelligent genetic algorithm (IGA). An existing hybrid approach based on a genetic algorithm and maximum likelihood classification (GA/MLHD) has previously been proposed to select a small number of relevant genes for accurate classification of samples. To evaluate the performance of GeneSelect, its gene selection is combined with the same maximum likelihood classification (named IGA/MLHD) for convenient comparison. IGA/MLHD is evaluated on 11 cancer-related human gene expression datasets. The simulation results show that IGA/MLHD is superior to GA/MLHD in terms of the number of selected genes, classification accuracy, and the robustness of both the selected genes and the accuracy.
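The general idea of combining a candidate encoding, a fitness function, and an evolutionary search can be illustrated with a minimal sketch. This is not the authors' GeneSelect, IGA, or MLHD classifier, none of which are specified in the abstract. It assumes a plain binary-mask encoding (one bit per gene), a standard genetic algorithm with one-point crossover and bit-flip mutation, and a leave-one-out nearest-centroid classifier as a stand-in for maximum likelihood classification. The names `fitness` and `select_genes`, the `penalty` weight, and the toy data are all hypothetical.

```python
import random


def nearest_centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the selected genes."""
    genes = [j for j, bit in enumerate(mask) if bit]
    if not genes:
        return 0.0  # no genes selected: treat as a useless classifier
    correct = 0
    for i in range(len(X)):
        # Per-class sums and counts, computed without the held-out sample i.
        sums, counts = {}, {}
        for k in range(len(X)):
            if k == i:
                continue
            counts[y[k]] = counts.get(y[k], 0) + 1
            acc = sums.setdefault(y[k], [0.0] * len(genes))
            for g, j in enumerate(genes):
                acc[g] += X[k][j]
        # Assign sample i to the class with the closest centroid.
        best, best_d = None, float("inf")
        for c, s in sums.items():
            d = sum((X[i][j] - s[g] / counts[c]) ** 2 for g, j in enumerate(genes))
            if d < best_d:
                best, best_d = c, d
        correct += best == y[i]
    return correct / len(X)


def fitness(X, y, mask, penalty=0.01):
    """Reward accuracy and penalize each selected gene (assumed weighting)."""
    return nearest_centroid_accuracy(X, y, mask) - penalty * sum(mask)


def select_genes(X, y, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Plain GA: keep the better half, refill with crossover plus mutation."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(X, y, m), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(X, y, m))


# Toy data (made up): gene 0 separates the two classes, the others are noise.
X = [
    [0.0, 0.5, 0.3, 0.9, 0.1, 0.4],
    [0.1, 0.2, 0.8, 0.1, 0.7, 0.6],
    [0.2, 0.9, 0.1, 0.5, 0.3, 0.2],
    [1.0, 0.4, 0.6, 0.2, 0.8, 0.5],
    [1.1, 0.7, 0.2, 0.8, 0.2, 0.3],
    [1.2, 0.1, 0.9, 0.4, 0.6, 0.7],
]
y = [0, 0, 0, 1, 1, 1]

best = select_genes(X, y)
print("selected genes:", [j for j, bit in enumerate(best) if bit])
```

The per-gene penalty is one simple way to fold both objectives into a single score. The trade-off between fewer genes and higher accuracy then depends entirely on the chosen weight, which would need tuning on real data.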

Original language: English
Pages (from-to): 78-86
Number of pages: 9
Issue number: 1
State: Published - 1 Jul 2007


  • Classification
  • Feature selection
  • Genetic algorithm
  • Maximum likelihood
  • Microarray
