Compact decision trees with cluster validity for speech recognition

Jen-Tzung Chien, Chih Hsien Huang, Shun Ju Chen

Research output: Contribution to journal › Article › peer-review


Abstract

A decision tree is built by successively splitting the observation frames of a phonetic unit according to the best phonetic questions. To prevent overly large tree models, a stopping criterion is required to suppress tree growth. It is crucial to exploit goodness-of-split criteria that choose the best question for node splitting and test whether splitting should be terminated, so that robust tree models can be established. In this study, we apply Hubert's Γ statistic as the node-splitting criterion and the T²-statistic as the stopping criterion. Hubert's Γ statistic is a cluster validity measure that characterizes the degree of clustering in the available data, which makes it useful for selecting the best question to split a tree node. Further, we examine the population closeness of the two child nodes at a given significance level: the T²-statistic tests whether the corresponding mean vectors are close together, and splitting is stopped when this hypothesis is validated. In continuous speech recognition experiments, the proposed methods achieve better recognition rates with smaller tree models than the maximum likelihood and minimum description length criteria.
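
To make the node-splitting criterion concrete, here is a minimal sketch of the normalized Hubert's Γ statistic as it is commonly defined in the cluster-validity literature: the Pearson correlation between the pairwise-distance matrix of the frames and the indicator matrix of a candidate partition. The function names, the Euclidean distance, and the toy data are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from scipy.spatial.distance import pdist

def normalized_hubert_gamma(X, labels):
    """Normalized Hubert's Gamma: Pearson correlation between the
    pairwise-distance matrix of X and the partition-indicator matrix
    (1 if a pair falls in different child nodes, else 0). A higher
    value indicates a sharper separation, so it can rank splits."""
    P = pdist(X)                                  # pairwise distances, condensed form
    i, j = np.triu_indices(len(X), k=1)           # all pairs with i < j
    Y = (labels[i] != labels[j]).astype(float)    # 1 if the pair straddles the split
    return np.corrcoef(P, Y)[0, 1]                # Pearson correlation = normalized Gamma

def best_question(X, questions):
    """questions: dict mapping a question name to the boolean mask of
    frames answering 'yes'. Returns the question maximizing Gamma."""
    return max(questions, key=lambda q: normalized_hubert_gamma(X, questions[q]))

# Toy usage: 10 two-dimensional "frames", two candidate phonetic questions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (5, 2)), rng.normal(4, 1, (5, 2))])
questions = {
    "Q1_good": np.array([True] * 5 + [False] * 5),  # matches the true structure
    "Q2_bad":  np.array([True, False] * 5),         # arbitrary split
}
print(best_question(X, questions))                  # expected: "Q1_good"
```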
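
Likewise, a sketch of the stopping test, assuming the standard two-sample Hotelling T² statistic with a pooled covariance and its usual F-distribution conversion; the significance level, helper name, and toy data are again hypothetical rather than from the paper.

```python
import numpy as np
from scipy import stats

def t2_should_stop(X1, X2, alpha=0.05):
    """Two-sample Hotelling T^2 test on the child-node samples.
    Returns True (stop splitting) when we cannot reject, at level
    alpha, that the two population mean vectors are equal."""
    n1, p = X1.shape
    n2 = X2.shape[0]
    d = X1.mean(axis=0) - X2.mean(axis=0)
    # Pooled within-node covariance estimate
    S = ((n1 - 1) * np.cov(X1, rowvar=False)
         + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    t2 = (n1 * n2) / (n1 + n2) * d @ np.linalg.solve(S, d)
    # Convert T^2 to an F statistic with (p, n1 + n2 - p - 1) dof
    f = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    p_value = stats.f.sf(f, p, n1 + n2 - p - 1)
    return p_value > alpha   # means statistically indistinguishable -> stop

# Toy usage: children drawn from the same distribution should stop splitting.
rng = np.random.default_rng(1)
A, B = rng.normal(0, 1, (40, 3)), rng.normal(0, 1, (40, 3))
print(t2_should_stop(A, B))  # likely True: no real separation between children
```

In a tree-growing loop, the two pieces would combine in the obvious way: apply the question with the highest Γ at each node, but keep the node as a leaf once the T² test accepts that the two child mean vectors coincide.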

Original language: English
Pages (from-to): 873-876
Number of pages: 4
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 1
DOIs
State: Published - 1 Jan 2002

