Estimation of hidden speaking rate

Guan Tin Liou, Chen Yu Chiang, Yih-Ru Wang, Sin-Horng Chen

Research output: Contribution to journal › Conference article › peer-review

1 Scopus citation


This paper proposes the hidden speaking rate. Traditional raw speaking rate estimation simply averages the number of syllables or phones per second, with or without pauses. In contrast, the proposed hidden speaking rate is estimated by normalizing the effects of lexical information and prosodic structure, based on the existing speaking rate-dependent hierarchical prosodic model (SR-HPM). The significance of the hidden speaking rate is illustrated by an analysis of speaking rate estimation on a Mandarin speech database containing four parallel speech corpora of a female professional announcer at fast, normal, medium and slow speaking rates. A prosody generation experiment on the same speech corpus shows that the hidden speaking rate represents the speaker’s intended, or underlying, speaking rate more meaningfully and accurately than the conventional raw speaking rate.
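For orientation, the conventional raw speaking rate that the abstract contrasts against can be sketched as below. This is only an illustrative sketch of the baseline (syllables per second, with or without pauses); it does not implement the hidden speaking rate or SR-HPM, whose details are not given here. The `Segment` type, the `raw_speaking_rate` function, and the example timings are hypothetical names and values introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One time-aligned unit: a syllable, or a pause if is_pause is True.

    Hypothetical structure for illustration; times are in seconds.
    """
    start: float
    end: float
    is_pause: bool = False


def raw_speaking_rate(segments: List[Segment], include_pauses: bool = True) -> float:
    """Return syllables per second.

    include_pauses=True divides by total duration (syllables plus pauses);
    include_pauses=False divides by syllable duration only, which is often
    called the articulation rate.
    """
    syllables = [s for s in segments if not s.is_pause]
    counted = segments if include_pauses else syllables
    duration = sum(s.end - s.start for s in counted)
    if duration <= 0:
        raise ValueError("total duration must be positive")
    return len(syllables) / duration


# Made-up example: four 0.25 s syllables with a 0.5 s pause in the middle.
example = [
    Segment(0.00, 0.25),
    Segment(0.25, 0.50),
    Segment(0.50, 1.00, is_pause=True),
    Segment(1.00, 1.25),
    Segment(1.25, 1.50),
]

rate_with_pauses = raw_speaking_rate(example, include_pauses=True)
rate_without_pauses = raw_speaking_rate(example, include_pauses=False)
```

The two values differ only in the denominator, which is why the raw measure depends on how much pausing an utterance happens to contain.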

Original language: English
Pages (from-to): 592-596
Number of pages: 5
Journal: Proceedings of the International Conference on Speech Prosody
State: Published - 1 Jan 2018
Event: 9th International Conference on Speech Prosody, SP 2018 - Poznan, Poland
Duration: 13 Jun 2018 → 16 Jun 2018


  • Articulation rate
  • Mandarin
  • Prosody
  • Speaking rate
  • Speech rate
  • SR-HPM
  • Text-to-speech
