This paper proposes a model-based tone labeling method for Min-Nan/Taiwanese speech. It takes the mean and shape of the syllable pitch contour as two modeling units and considers some major factors that control their variations. By using the EM algorithm to estimate all parameters of the pitch mean and shape models from a speech database, the method determines the best tone sequence for every utterance in the database. Experimental results showed that it outperformed the VQ classification method, which suffers from interference from neighboring syllables and from the global prosodic phrase pattern.
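As a rough illustration of the two modeling units, the sketch below splits one syllable's pitch contour into a mean value and a mean-removed shape. The abstract does not say how the paper represents shape, so the mean-removal step, the function name `split_mean_and_shape`, and the sample values are all assumptions made for illustration, not the authors' formulation.

```python
def split_mean_and_shape(contour):
    """Split one syllable's pitch contour into (mean, shape).

    Assumption: "shape" is taken here as the contour with its mean removed.
    The paper's actual shape parameterization is not given in the abstract.
    """
    if not contour:
        raise ValueError("contour must contain at least one pitch sample")
    mean = sum(contour) / len(contour)
    shape = [value - mean for value in contour]
    return mean, shape


# Hypothetical pitch samples (Hz) for a single rising syllable.
contour = [100.0, 110.0, 120.0]
mean, shape = split_mean_and_shape(contour)
```

In this simplified view, two syllables with the same tone but different pitch levels would share a similar shape and differ only in mean, which is one reason for modeling the two quantities separately.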
|Journal|ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings|
|State|Published - 28 Sep 2004|
|Event|Proceedings - IEEE International Conference on Acoustics, Speech, and Signal Processing - Montreal, Que, Canada|
|Duration|17 May 2004 → 21 May 2004|