This study focuses on the parametric stochastic modeling of the characteristic sound features that distinguish languages from one another. It introduces a new stochastic model, the Gaussian mixture bigram model (GMBM), which exploits acoustic feature bigram statistics without requiring transcribed training data. For greater efficiency, a minimum classification error (MCE) algorithm is employed for discriminative training of a GMBM-based Chinese dialect identification system. Simulation results demonstrate the effectiveness of the GMBM for dialect-specific acoustic modeling: with this model, the proposed system distinguishes among the three major Chinese dialects spoken in Taiwan with 94.4% accuracy.
- Chinese dialect identification
- Gaussian mixture bigram model
- Minimum classification error algorithm