A new model-based Mandarin-speech coding system

Chen Yu Chiang*, Jyh Her Yang, Ming Chieh Liu, Yih-Ru Wang, Yuan Fu Liao, Sin-Horng Chen

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

2 Scopus citations

Abstract

In this paper, a new model-based Mandarin-speech coding system is proposed. In the encoder, a prosody-enriched ASR with a hierarchical prosodic model (HPM) generates enriched transcriptions from the input speech, comprising linguistic features, prosodic tags, and spectral parameters. These features are sent to the decoder, which first uses the HPM to reconstruct four prosodic-acoustic features (syllable pitch contour, syllable duration, syllable energy level, and inter-syllable pause duration) from the linguistic features and prosodic tags, and then combines them with the spectral parameters to reconstruct the input speech signal with an HMM-based speech synthesizer. Experimental results show that the reconstructed speech has good quality at a low data rate of 543 bits/s.
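
To put the reported rate in perspective, the arithmetic below is a minimal sketch that converts 543 bits/s into a payload size for a given duration. The function names are hypothetical, and the 64 kbit/s reference rate (standard telephony PCM) is an illustrative assumption for comparison only; it does not come from the paper.

```python
# Data rate reported in the abstract.
RATE_BITS_PER_SECOND = 543


def payload_bytes(duration_seconds: float,
                  rate_bps: float = RATE_BITS_PER_SECOND) -> float:
    """Bytes needed to carry `duration_seconds` of speech at `rate_bps`."""
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    return duration_seconds * rate_bps / 8


def compression_ratio(reference_bps: float,
                      rate_bps: float = RATE_BITS_PER_SECOND) -> float:
    """How many times smaller `rate_bps` is than a reference bit rate."""
    return reference_bps / rate_bps


if __name__ == "__main__":
    # One minute of coded speech at 543 bits/s.
    print(payload_bytes(60))
    # Assumed comparison point: 64 kbit/s PCM (not a figure from the paper).
    print(round(compression_ratio(64_000), 2))
```

Under these assumptions, one minute of speech occupies roughly 4 kB, which is more than a hundred times smaller than the same minute at the assumed 64 kbit/s reference.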

Original language: English
Pages (from-to): 2561-2564
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
State: Published - 1 Dec 2011
Event: 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy
Duration: 27 Aug 2011 to 31 Aug 2011

Keywords

  • Enriched transcriptions
  • Hierarchical prosodic model
  • Model-based speech coding
  • Prosody-enriched ASR
