Prosodic Information-Assisted DNN-based Mandarin Spontaneous-Speech Recognition

Yu Chih Deng, Cheng Hsin Lin, Yuan Fu Liao, Yih-Ru Wang, Sin Horng Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

This paper extends the method proposed in [1], replacing its traditional HMM-based ASR with a state-of-the-art DNN-based ASR. Prosodic information is used to assist DNN-based Mandarin spontaneous-speech recognition, in particular to alleviate the serious interference of disfluencies and paralinguistic phenomena during decoding. The approach adopts a sophisticated hierarchical prosodic model (HPM), composed of several break-syntax, break-acoustic, syllable prosodic, and prosodic state models, to rescore and improve the TDNN-f+RNNLM-based first-pass decoding output and, at the same time, to generate word, part-of-speech (POS), punctuation mark (PM), tone, break type, and prosodic state tags for further use. Experimental results show that the HPM-based system not only reduced the word error rate from the previous best value of 41.8% [1] to 21.2%, but also reliably detected the underlying POS, PMs, and tones (with error rates of 10.9%, 12.6%, and 2.3%, respectively). These results suggest that the proposed method is very promising for Mandarin spontaneous-speech recognition.
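To put the reported word error rates in perspective, the short sketch below works out the absolute and relative reductions implied by the two figures quoted in the abstract. Only the values 41.8% and 21.2% come from the source; the function and variable names are illustrative assumptions, and the relative reduction is a derived figure that the abstract itself does not state.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the relative reduction of an error rate, in percent."""
    if baseline <= 0:
        raise ValueError("baseline error rate must be positive")
    return 100.0 * (baseline - improved) / baseline


# Word error rates (in percent) quoted in the abstract.
wer_previous_best = 41.8  # previous best value, reported in [1]
wer_hpm_system = 21.2     # HPM-based system described here

# Absolute reduction in percentage points, and reduction relative to the baseline.
absolute = wer_previous_best - wer_hpm_system
relative = relative_reduction(wer_previous_best, wer_hpm_system)

print(f"absolute reduction: {absolute:.1f} points")  # 20.6 points
print(f"relative reduction: {relative:.1f}%")        # 49.3%
```

In other words, the two figures imply that the word error rate was roughly halved relative to the earlier result.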

Original language: English
Title of host publication: Proceedings of 2020 23rd Conference of the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 134-138
Number of pages: 5
ISBN (Electronic): 9781728198965
State: Published - 5 Nov 2020
Event: 23rd Conference of the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2020 - Virtual, Yangon, Myanmar
Duration: 5 Nov 2020 to 7 Nov 2020

Publication series

Name: Proceedings of 2020 23rd Conference of the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2020

Conference

Conference: 23rd Conference of the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2020
Country: Myanmar
City: Virtual, Yangon
Period: 5/11/20 to 7/11/20

Keywords

  • HPM
  • Mandarin Spontaneous-Speech Recognition
  • MCDC
  • RNNLM
  • TDNN-f
