Hierarchical prosody modeling of English speech and its application to TTS

Chung Yao Tsai, Chin Kuan Kuo, Yih-Ru Wang, Sin-Horng Chen, I. Bin Liao, Chen Yu Chiang

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

This paper proposes a hierarchical prosody modeling approach for English speech, extending the hierarchical prosodic model (HPM) approach previously proposed for Mandarin speech. The approach first designs a syllable-based statistical prosodic model that describes the relationships among the prosodic-acoustic features of the speech signal, the linguistic features of the associated text, and the prosodic tags representing the underlying prosodic structure of the speech. It then employs a prosody labeling and modeling algorithm to estimate the model parameters and label the prosodic tags of all training utterances simultaneously from a prosody-unlabeled speech corpus. Experimental results on a corpus containing many paragraph-length utterances from a female Chinese speaker who majored in English show that the inferred model parameters are all meaningful. The trained model is then used to generate prosodic information for a TTS system. An informal listening test shows that the synthetic speech sounds quite natural.

Original language: English
Title of host publication: Oriental COCOSDA 2014 - 17th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment / CASLRE (Conference on Asian Spoken Language Research and Evaluation)
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781479970940
DOIs
State: Published - 27 Feb 2014
Event: 17th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment, Oriental COCOSDA 2014 - Phuket, Thailand
Duration: 10 Sep 2014 to 12 Sep 2014

Publication series

Name: Oriental COCOSDA 2014 - 17th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment / CASLRE (Conference on Asian Spoken Language Research and Evaluation)

Conference

Conference: 17th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment, Oriental COCOSDA 2014
Country: Thailand
City: Phuket
Period: 10/09/14 to 12/09/14

Keywords

  • Hierarchical prosodic model
  • Prosody modeling
  • Text-to-Speech
