A decision tree is built by successively splitting the observation frames of a phonetic unit according to the best phonetic questions. To prevent overly large tree models, a stopping criterion is required to suppress tree growth. It is crucial to exploit goodness-of-split criteria both to choose the best question for splitting each node and to test whether splitting should terminate; with such criteria, robust tree models can be established. In this study, we apply Hubert's Γ statistic as the node-splitting criterion and the T²-statistic as the stopping criterion. Hubert's Γ statistic is a cluster validity measure that characterizes the degree of clustering in the available data, which makes it well suited to selecting the best question for expanding a tree node. Further, we examine the population closeness of the two child nodes at a given significance level: the T²-statistic tests whether the corresponding mean vectors are close together, and splitting stops when this hypothesis is accepted. In continuous speech recognition experiments, the proposed methods achieve better recognition rates with smaller tree models than the maximum likelihood and minimum description length criteria.
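The abstract does not give the exact formulations used in the paper, but the two statistics it names have standard forms. Below is a minimal illustrative sketch, assuming a raw (unnormalized) Hubert Γ computed between a pairwise-distance matrix and a cluster-membership indicator, and the usual two-sample Hotelling T² test with its F-distribution p-value; the function names and the specific normalizations are this sketch's own choices, not the authors'.

```python
import numpy as np
from scipy import stats

def hubert_gamma(D, labels):
    """Raw Hubert Gamma: average of D[i, j] * Q[i, j] over pairs i < j,
    where Q[i, j] = 1 when points i and j lie in different clusters.
    Larger values indicate that the split separates distant frames."""
    n = len(labels)
    total, m = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            q = 1.0 if labels[i] != labels[j] else 0.0
            total += D[i, j] * q
            m += 1
    return total / m

def hotelling_t2(X1, X2):
    """Two-sample Hotelling T^2 on the mean vectors of X1 and X2
    (rows = observations), with the standard F-transform p-value.
    A large p-value means the child means are statistically close,
    i.e. the split would not be worth making."""
    n1, n2 = len(X1), len(X2)
    p = X1.shape[1]
    d = X1.mean(axis=0) - X2.mean(axis=0)
    # Pooled covariance of the two child populations.
    S = ((n1 - 1) * np.cov(X1, rowvar=False)
         + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    t2 = (n1 * n2) / (n1 + n2) * d @ np.linalg.solve(S, d)
    # T^2 relates to an F(p, n1 + n2 - p - 1) statistic.
    f = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    pval = 1.0 - stats.f.cdf(f, p, n1 + n2 - p - 1)
    return t2, pval
```

In a tree-growing loop one would, for each node, pick the question maximizing `hubert_gamma` over the induced two-way partition, then stop splitting when `hotelling_t2` on the two child populations yields a p-value above the chosen significance level.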
Number of pages: 4
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
State: Published - 1 Jan 2002