This paper proposes a new method for training recurrent neural networks (RNNs) for speech recognition that avoids the difficulty of selecting appropriate target functions. A novel architecture for an RNN-based speech recognition system is also introduced to address problems in large-vocabulary speech recognition. The proposed RNN-based recognizer can absorb the temporal variation of speech patterns while retaining effective discrimination capability. The system was evaluated on two speech recognition tasks: recognizing 10 Mandarin digits and recognizing 54 confusable Mandarin syllables. Experimental results show that the proposed method outperforms both the continuous observation density hidden Markov model (HMM) method and an RNN recognizer trained with the extended backpropagation algorithm.
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
State: Published - 19 Apr 1994
Event: Proceedings of the 1994 IEEE International Conference on Acoustics, Speech and Signal Processing. Part 2 (of 6) - Adelaide, Aust
Duration: 19 Apr 1994 → 22 Apr 1994