Reduced support vector machines: A statistical theory

Yuh-Jye Lee*, Su Yun Huang

*Corresponding author for this work

Research output: Contribution to journal › Article

175 Scopus citations

Abstract

The reduced support vector machine (RSVM) was proposed for large data sets with two practical aims: overcoming computational difficulties and reducing model complexity. In this paper, we study the RSVM from the viewpoint of sampling design, its robustness, and the spectral analysis of the reduced kernel. We consider the nonlinear separating surface as a mixture of kernels. Instead of a full model, the RSVM uses a reduced mixture with kernels sampled from a certain candidate set. Our main results center on two major themes. One is the robustness of the random subset mixture model. The other is the spectral analysis of the reduced kernel. Robustness is judged by the following criteria: 1) a model variation measure; 2) the model bias (deviation) between the reduced model and the full model; and 3) the test power in distinguishing the reduced model from the full one. For the spectral analysis, we compare the eigenstructures of the full kernel matrix and the approximation kernel matrix, where the approximation kernels are generated by uniform random subsets. The small discrepancies between the two indicate that the approximation kernels retain most of the information in the full kernel that is relevant for learning tasks. We focus on the statistical theory of the reduced set method mainly in the context of the RSVM, but the use of a uniform random subset is not limited to the RSVM. The approach can act as a supplemental algorithm on top of a basic optimization algorithm, with the actual optimization taking place on the subset-approximated data. The statistical properties discussed in this paper remain valid in that setting.
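The spectral comparison described in the abstract can be illustrated with a small numerical sketch. This is not the authors' code or experimental setup: the Gaussian kernel, the synthetic data, the subset size `m`, the parameter `gamma`, and the function names are all assumptions chosen for illustration, and the Nyström form is used as the approximation kernel because it appears among the keywords.

```python
import numpy as np

def gaussian_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_from_uniform_subset(X, m, gamma=0.5, seed=0):
    """Build a reduced kernel from a uniform random subset of size m.

    Returns the subset indices, the rectangular reduced kernel (n x m),
    and the Nystrom approximation of the full n x n kernel matrix.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)  # uniform, without replacement
    K_nm = gaussian_kernel(X, X[idx], gamma)             # reduced kernel, n x m
    K_mm = K_nm[idx]                                     # kernel on the subset, m x m
    K_approx = K_nm @ np.linalg.pinv(K_mm) @ K_nm.T
    K_approx = 0.5 * (K_approx + K_approx.T)             # remove numerical asymmetry
    return idx, K_nm, K_approx

# Synthetic data (assumed for illustration only).
X = np.random.default_rng(42).normal(size=(200, 2))
K_full = gaussian_kernel(X, X)
idx, K_nm, K_approx = nystrom_from_uniform_subset(X, m=40)

# Compare leading eigenvalues and the overall relative discrepancy.
top_full = np.linalg.eigvalsh(K_full)[::-1][:5]
top_approx = np.linalg.eigvalsh(K_approx)[::-1][:5]
rel_err = np.linalg.norm(K_full - K_approx) / np.linalg.norm(K_full)

print("leading eigenvalues (full):  ", top_full)
print("leading eigenvalues (approx):", top_approx)
print("relative Frobenius error:    ", rel_err)
```

In this sketch only the `n x m` matrix `K_nm` has to be formed to fit a reduced model; the full `n x n` matrix is built solely to measure how close the approximation is.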

Original language: English
Pages (from-to): 1-13
Number of pages: 13
Journal: IEEE Transactions on Neural Networks
Volume: 18
Issue number: 1
State: Published - 1 Jan 2007

Keywords

  • Canonical angles
  • Kernel methods
  • Maximinity
  • Minimaxity
  • Model complexity
  • Monte Carlo sampling
  • Nyström approximation
  • Reduced set
  • Spectral analysis
  • Support vector machines (SVMs)
  • Uniform design
  • Uniform random subset
