TY - JOUR
T1 - Discriminative training of Gaussian mixture bigram models with application to Chinese dialect identification
AU - Tsai, W. H.
AU - Chang, Wen-Whei
PY - 2002/3/1
Y1 - 2002/3/1
N2 - This study focuses on parametric stochastic modeling of the characteristic sound features that distinguish languages from one another. It introduces a new stochastic model, the Gaussian mixture bigram model (GMBM), which exploits acoustic feature bigram statistics without requiring transcribed training data. For greater efficiency, a minimum classification error (MCE) algorithm is used for discriminative training of a GMBM-based Chinese dialect identification system. Simulation results demonstrate the effectiveness of the GMBM for dialect-specific acoustic modeling: with this model, the proposed system distinguishes among the three major Chinese dialects spoken in Taiwan with 94.4% accuracy.
AB - This study focuses on parametric stochastic modeling of the characteristic sound features that distinguish languages from one another. It introduces a new stochastic model, the Gaussian mixture bigram model (GMBM), which exploits acoustic feature bigram statistics without requiring transcribed training data. For greater efficiency, a minimum classification error (MCE) algorithm is used for discriminative training of a GMBM-based Chinese dialect identification system. Simulation results demonstrate the effectiveness of the GMBM for dialect-specific acoustic modeling: with this model, the proposed system distinguishes among the three major Chinese dialects spoken in Taiwan with 94.4% accuracy.
KW - Chinese dialect identification
KW - Gaussian mixture bigram model
KW - Minimum classification error algorithm
UR - http://www.scopus.com/inward/record.url?scp=0036497598&partnerID=8YFLogxK
U2 - 10.1016/S0167-6393(00)00090-X
DO - 10.1016/S0167-6393(00)00090-X
M3 - Article
AN - SCOPUS:0036497598
VL - 36
SP - 317
EP - 326
JO - Speech Communication
JF - Speech Communication
SN - 0167-6393
IS - 3-4
ER -