A hybrid statistical/RNN approach to prosody synthesis for Taiwanese TTS

Sin-Horng Chen, Chen Chung Ho

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Scopus citations

Abstract

In this paper, a hybrid approach is proposed that incorporates statistical modeling of prosodic parameters into recurrent neural network (RNN)-based prosody synthesis for Min-Nan (Taiwanese) speech. It takes the syllable as the basic synthesis unit and constructs statistical models for syllable initial duration, syllable final duration, inter-syllable pause duration, syllable pitch contour, and syllable log-energy level. In training, it normalizes the prosodic parameters using these statistical models and uses the results to train an RNN prosody synthesizer. In synthesis, it denormalizes the RNN outputs using the same statistical models to generate all prosodic parameters required by the TTS system. The advantage of the approach is that the statistical models account for some of the affecting factors, relieving the RNN prosody synthesizer of having to model them.
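The normalize-then-denormalize step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the form of the statistical models, so this example assumes the simplest possible one: a per-factor mean and standard deviation for a single prosodic parameter, with z-score normalization. The class name `FactorStats`, the factor labels, and the duration values are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean, pstdev


class FactorStats:
    """Per-factor mean/std model for one prosodic parameter.

    Assumed form only: the paper's actual statistical models are not
    specified in the abstract.
    """

    def __init__(self):
        self.stats = {}  # factor label -> (mean, standard deviation)

    def fit(self, factors, values):
        # Group training values by their affecting factor.
        groups = defaultdict(list)
        for factor, value in zip(factors, values):
            groups[factor].append(value)
        for factor, group in groups.items():
            sd = pstdev(group)
            # Guard against zero spread so normalization never divides by 0.
            self.stats[factor] = (mean(group), sd if sd > 0 else 1.0)
        return self

    def normalize(self, factor, value):
        # Training side: remove the factor's effect before the RNN sees it.
        mu, sd = self.stats[factor]
        return (value - mu) / sd

    def denormalize(self, factor, z):
        # Synthesis side: restore the factor's effect on the RNN output.
        mu, sd = self.stats[factor]
        return z * sd + mu


# Made-up syllable final durations (ms), keyed by a hypothetical final type.
factors = ["a", "a", "a", "ang", "ang", "ang"]
durations = [100.0, 120.0, 140.0, 200.0, 220.0, 240.0]

model = FactorStats().fit(factors, durations)
targets = [model.normalize(f, d) for f, d in zip(factors, durations)]
restored = [model.denormalize(f, z) for f, z in zip(factors, targets)]
```

Here `targets` stands in for the normalized values an RNN would be trained to predict, and `restored` for the denormalized values passed on to the TTS system. Because the factor-dependent offset and scale are handled by the lookup table, the network only has to model the residual variation.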

Original language: English
Title of host publication: 6th International Conference on Spoken Language Processing, ICSLP 2000
Publisher: International Speech Communication Association
ISBN (Electronic): 7801501144, 9787801501141
State: Published - 1 Jan 2000
Event: 6th International Conference on Spoken Language Processing, ICSLP 2000 - Beijing, China
Duration: 16 Oct 2000 to 20 Oct 2000

Publication series

Name: 6th International Conference on Spoken Language Processing, ICSLP 2000

Conference

Conference: 6th International Conference on Spoken Language Processing, ICSLP 2000
Country: China
City: Beijing
Period: 16/10/00 to 20/10/00


  • Cite this

    Chen, S-H., & Ho, C. C. (2000). A hybrid statistical/RNN approach to prosody synthesis for Taiwanese TTS. In 6th International Conference on Spoken Language Processing, ICSLP 2000 (6th International Conference on Spoken Language Processing, ICSLP 2000). International Speech Communication Association.