A new model-based prosody coder for Mandarin speech

Chen Yu Chiang, Yu Ping Hung, Sin-Horng Chen, Yih-Ru Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper proposes a novel parametric prosody coding approach for Mandarin speech. The encoder uses a hierarchical prosodic model (HPM) as a prosody-generating model to analyze the prosody of the input utterance and obtain a parametric representation of four prosodic-acoustic features for encoding: syllable pitch contour, syllable duration, syllable energy level, and syllable-juncture pause duration. In the decoder, the four prosodic-acoustic features are reconstructed by a synthesis operation using the decoded HPM parameters. The reconstructed prosodic features are then used in an HMM-based speech synthesizer to help generate the reconstructed speech. Experimental results show that the reconstructed speech has good quality at a low data rate of 114.9 bits/s for a speaker-dependent task. An informal listening test confirmed that the decoded speech sounded very fluent.
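
The encode/decode round trip and the bits-per-second accounting described in the abstract can be illustrated with a toy sketch. This is not the HPM: it applies plain uniform scalar quantization directly to four per-syllable features, and it reduces the pitch contour to a single level. Every name, value range, and bit allocation below is a hypothetical choice made for illustration, not a value taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class SyllableProsody:
    """Four per-syllable prosodic features (hypothetical simplification)."""
    log_f0_mean: float  # single pitch level standing in for a pitch contour
    duration_s: float   # syllable duration in seconds
    energy_db: float    # syllable energy level in dB
    pause_s: float      # pause duration at the following syllable juncture


# Hypothetical quantizer settings per feature: (minimum, maximum, bits).
QUANTIZERS = {
    "log_f0_mean": (4.0, 6.5, 5),
    "duration_s": (0.05, 0.50, 4),
    "energy_db": (30.0, 90.0, 4),
    "pause_s": (0.0, 1.0, 3),
}


def quantize(value, lo, hi, bits):
    """Map a value to the index of the nearest uniform quantization level."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)


def dequantize(index, lo, hi, bits):
    """Map a quantization index back to its reconstruction value."""
    levels = (1 << bits) - 1
    return lo + index / levels * (hi - lo)


def encode(syllable):
    """Encoder side: turn the features into integer codes."""
    return {name: quantize(getattr(syllable, name), *q)
            for name, q in QUANTIZERS.items()}


def decode(codes):
    """Decoder side: reconstruct the features from the integer codes."""
    return SyllableProsody(**{name: dequantize(codes[name], *QUANTIZERS[name])
                              for name in QUANTIZERS})


def bit_rate(n_syllables, total_seconds):
    """Data rate in bits/s for a fixed per-syllable bit budget."""
    bits_per_syllable = sum(bits for _, _, bits in QUANTIZERS.values())
    return bits_per_syllable * n_syllables / total_seconds


if __name__ == "__main__":
    original = SyllableProsody(5.3, 0.2, 62.0, 0.1)
    reconstructed = decode(encode(original))
    print(reconstructed)
    print(bit_rate(n_syllables=10, total_seconds=2.5), "bits/s")
```

In this sketch the data rate depends only on the per-syllable bit budget and the speaking rate. A model-based coder such as the one in the abstract instead transmits model parameters, which is how it can reach a rate as low as the reported 114.9 bits/s.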

Original language: English
Title of host publication: Proceedings - 2013 9th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2013
Publisher: IEEE Computer Society
Pages: 60-63
Number of pages: 4
ISBN (Print): 9780769551203
DOIs: https://doi.org/10.1109/IIH-MSP.2013.24
State: Published - 1 Jan 2013
Event: 9th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2013 - Beijing, China
Duration: 16 Oct 2013 to 18 Oct 2013

Publication series

Name: Proceedings - 2013 9th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2013

Conference

Conference: 9th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2013
Country: China
City: Beijing
Period: 16/10/13 to 18/10/13

Keywords

  • Prosodic model
  • Prosody coding


  • Cite this

    Chiang, C. Y., Hung, Y. P., Chen, S-H., & Wang, Y-R. (2013). A new model-based prosody coder for Mandarin speech. In Proceedings - 2013 9th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2013 (pp. 60-63). [6846580] (Proceedings - 2013 9th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2013). IEEE Computer Society. https://doi.org/10.1109/IIH-MSP.2013.24