On cross-dialect and -speaker adaptation of speaking rate-dependent hierarchical prosodic model for a Hakka text-to-speech system

Chen Yu Chiang, Hsiu Min Yu, Sin-Horng Chen

Research output: Contribution to journal › Conference article › peer-review

1 Scopus citation

Abstract

This paper presents an effective adaptation of an existing speaking rate-dependent hierarchical prosodic model (SR-HPM) for Mandarin to construct an SR-HPM for Hakka, another Chinese dialect. Based on cross-dialectal linguistic similarities in syntactic and prosodic structure, the adaptation is formulated as a maximum a posteriori (MAP) estimation problem with the existing Mandarin SR-HPM serving as an informative prior. In addition, because the well-trained Mandarin SR-HPM models the effects of speaking rate (SR) on prosodic-acoustic features, the SR-HPM developed for Hakka can generate satisfactory prosody at various SRs. The proposed approach was evaluated in a prosody generation experiment for an SR-controlled Hakka text-to-speech system, in which the Hakka SR-HPM was trained on a small Hakka corpus read within a narrow SR range. Results show that native Hakka speakers judged the generated Hakka prosody to be quite natural for SRs ranging from 3.3 to 6.7 syllables/sec.
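The MAP idea mentioned in the abstract can be illustrated with a generic textbook case. The sketch below is not the paper's SR-HPM formulation: it only shows how a single Gaussian mean can be adapted from a prior value toward the mean of a small sample. The function name `map_adapt_mean`, the `relevance` parameter (how many pseudo-observations the prior is worth), and all numbers in the usage lines are assumptions made for illustration.

```python
def map_adapt_mean(prior_mean, samples, relevance=10.0):
    """MAP estimate of a Gaussian mean under a Gaussian prior centred on prior_mean.

    Hypothetical helper for illustration only. `relevance` acts as the number
    of pseudo-observations contributed by the prior.
    """
    n = len(samples)
    if n == 0:
        # With no adaptation data, the estimate is the prior itself.
        return prior_mean
    sample_mean = sum(samples) / n
    # Interpolation weight grows toward 1 as more adaptation data arrive.
    weight = n / (n + relevance)
    return weight * sample_mean + (1.0 - weight) * prior_mean


# Made-up numbers: a prior mean from a well-trained source model and a
# handful of observations from a small target corpus.
prior = 0.20
target_observations = [0.24, 0.26, 0.25, 0.23]
adapted = map_adapt_mean(prior, target_observations, relevance=10.0)
print(adapted)
```

With few observations the estimate stays close to the prior, and with many it approaches the sample mean, which is the general reason an informative prior helps when the target corpus is small.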

Original language: English
Pages (from-to): 786-790
Number of pages: 5
Journal: Proceedings of the International Conference on Speech Prosody
Volume: 2016-January
DOIs
State: Published - 1 Jan 2016
Event: 8th Speech Prosody 2016 - Boston, United States
Duration: 31 May 2016 to 3 Jun 2016

Keywords

  • Chinese
  • Dialects with limited size of corpus
  • Hakka
  • Mandarin
  • Prosody generation
  • Text-to-speech systems
