Speaker adaptation of speaking rate-dependent hierarchical prosodic model for Mandarin TTS

Po Chun Wang, I. Bin Liao, Chen Yu Chiang, Yih-Ru Wang, Sin-Horng Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Scopus citations

Abstract

This paper proposes a speaker adaptation method that adapts an existing speaking rate-dependent hierarchical prosodic model (SR-HPM) of a speaking rate (SR)-controlled Mandarin TTS system to a new speaker's data in order to realize a new voice. Two main problems are addressed: data sparseness, because the few adaptation utterances cover only a narrow range of normal speaking rates, and the absence of adaptation data in both the fast and slow speaking-rate ranges. The proposed method follows the SR-HPM training procedure: it first normalizes the prosodic-acoustic features of the new speaker's speech data, then trains an HPM with the prosody labeling and modeling algorithm, and finally refines the HPM into an SR-dependent model. Maximum a posteriori (MAP) adaptation with model parameter extrapolation is applied to cope with these two problems. Experimental results on a male speaker's adaptation data confirmed that the resulting adaptive SR-HPM has reasonable parameters covering a wide range of speaking rates, and hence can be used in the TTS system to generate prosodic-acoustic features for synthesizing the new speaker's voice at any given SR.

Original language: English
Title of host publication: Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, ISCSLP 2014
Editors: Thomas Fang Zheng, Haizhou Li, Minghui Dong, Jianhua Tao, Yanfeng Lu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 511-515
Number of pages: 5
ISBN (Electronic): 9781479942206
State: Published - 24 Oct 2014
Event: 9th International Symposium on Chinese Spoken Language Processing, ISCSLP 2014 - Singapore, Singapore
Duration: 12 Sep 2014 to 14 Sep 2014

Publication series

Name: Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, ISCSLP 2014

Conference

Conference: 9th International Symposium on Chinese Spoken Language Processing, ISCSLP 2014
Country: Singapore
City: Singapore
Period: 12/09/14 to 14/09/14

Keywords

  • hierarchical prosodic model
  • Mandarin TTS
  • prosodic-acoustic features
  • speaker adaptation
