Discriminative Analysis of Distortion Sequences in Speech Recognition

Pao Chung Chang, Sin-Horng Chen, Biing Hwang Juang

Research output: Contribution to journal › Article

6 Scopus citations

Abstract

In a traditional speech recognition system, the distance score between a test token and a reference pattern is obtained by simply averaging the distortion sequence resulting from matching the two patterns through a dynamic programming procedure. The final decision is made by choosing the reference pattern with the minimal average distance score. If we view the distortion sequence as a form of observed features, a decision rule based on a discriminant function designed specifically for the distortion sequence will obviously perform better than one based on the simple average distortion. We therefore suggest in this paper a linear discriminant function of the distortion sequence, that is, a weighted sum of its elements, to compute the distance score Δ instead of a direct average. Several adaptive algorithms are proposed to learn the discriminant weighting function. These include one heuristic method, two methods based on the error propagation algorithm [1], [2], and one method based on the generalized probabilistic descent (GPD) algorithm [3]. We study these methods in a speaker-independent speech recognition task involving utterances of the highly confusable English E-set (b, c, d, e, g, p, t, v, z). The results show that the best performance is obtained with the GPD method, which achieved 78.1% accuracy, compared with 67.6% for the traditional unweighted average method. Besides the experimental comparisons, an analytical discussion of the various training algorithms is also provided.
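
The contrast between the averaged score and a weighted score can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy distortion values, and the weights are all hypothetical, the sequences are assumed to be already normalized to a common length, and the weights are fixed by hand rather than learned by any of the training algorithms mentioned in the abstract.

```python
from typing import Dict, Optional, Sequence


def average_score(distortions: Sequence[float]) -> float:
    """Traditional score: plain mean of the distortion sequence."""
    return sum(distortions) / len(distortions)


def weighted_score(distortions: Sequence[float],
                   weights: Sequence[float]) -> float:
    """Linear discriminant score: weighted sum of the distortion sequence."""
    if len(distortions) != len(weights):
        raise ValueError("distortions and weights must have equal length")
    return sum(w * d for w, d in zip(weights, distortions))


def classify(distortions_by_ref: Dict[str, Sequence[float]],
             weights: Optional[Sequence[float]] = None) -> str:
    """Return the reference label with the minimal distance score.

    With weights=None the plain average is used; otherwise the same
    (hypothetical) weight vector is applied to every reference.
    """
    scores = {}
    for label, d in distortions_by_ref.items():
        if weights is None:
            scores[label] = average_score(d)
        else:
            scores[label] = weighted_score(d, weights)
    return min(scores, key=scores.get)


# Made-up distortion sequences of one test token against two references.
toy = {"b": [1.0, 1.0, 4.0], "d": [3.0, 1.0, 1.0]}

# Hypothetical weights that emphasize the first segment of the alignment.
emphasis = [0.8, 0.1, 0.1]

print(classify(toy))            # average: b = 2.0, d = 1.67, so "d" wins
print(classify(toy, emphasis))  # weighted: b = 1.3, d = 2.6, so "b" wins
```

With uniform weights 1/N the weighted score reduces to the plain average, so the traditional rule is a special case of the linear discriminant; the sketch only shows that a non-uniform weighting can change which reference attains the minimum.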

Original language: English
Pages (from-to): 326-333
Number of pages: 8
Journal: IEEE Transactions on Speech and Audio Processing
Volume: 1
Issue number: 3
State: Published - 1 Jan 1993
