Multilingual speech corpora for TTS system development

Hsi Chun Hsiao*, Hsiu Min Yu, Yih-Ru Wang, Sin-Horng Chen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review


In this paper, four speech corpora collected in the Speech Lab of NCTU in recent years are discussed. They include a Mandarin tree-bank speech corpus, a Min-Nan speech corpus, a Hakka speech corpus, and a Chinese-English mixed speech corpus. Currently, they are used separately to develop a corpus-based Mandarin TTS system, a Min-Nan TTS system, a Hakka TTS system, and a Chinese-English bilingual TTS system. These systems will be integrated in the future to construct a multilingual TTS system covering the four primary languages used in Taiwan.

Original languageEnglish
Title of host publicationChinese Spoken Language Processing - 5th International Symposium, ISCSLP 2006, Proceedings
Number of pages12
StatePublished - 1 Dec 2006
Event5th International Symposium on Chinese Spoken Language Processing, ISCSLP 2006 - Singapore, Singapore
Duration: 13 Dec 200616 Dec 2006

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume4274 LNAI
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349


Conference5th International Symposium on Chinese Spoken Language Processing, ISCSLP 2006

Fingerprint Dive into the research topics of 'Multilingual speech corpora for TTS system development'. Together they form a unique fingerprint.

Cite this