In this paper, a two-stage sample-based phone boundary detection algorithm is proposed. In the first stage, local sample-based acoustic parameters are used to pre-select phone boundary candidates. In the second stage, high-order statistics of the log-likelihood differences between the two adjacent speech segments around each boundary candidate are calculated to serve as a similarity measure for candidate verification. Experimental results on the TIMIT speech corpus show that equal error rates (EERs) of 8.6% and 7.6% were achieved for one-stage and two-stage sample-based phone boundary detection, respectively. Moreover, for the two-stage system, 42.1% and 81.9% of the detected boundaries were within 5- and 15-sample error tolerances of the manual labels.
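The tolerance figures above can be read as the share of detected boundaries that fall within a given number of samples of a manual label. The following is a minimal sketch of that measurement, not the authors' evaluation code: the function name `fraction_within_tolerance`, the nearest-label matching rule, and the sample indices in the usage lines are all assumptions made for illustration, since the abstract does not specify the matching procedure.

```python
from bisect import bisect_left


def fraction_within_tolerance(detected, reference, tolerance):
    """Return the fraction of detected boundaries that lie within
    `tolerance` samples of the nearest reference (manual) boundary.

    Assumption: each detected boundary is matched to its nearest
    reference boundary; the paper may use a different matching rule.
    """
    ref = sorted(reference)
    if not detected or not ref:
        return 0.0

    hits = 0
    for boundary in detected:
        # Locate the insertion point, then compare against the
        # neighbours on either side of it.
        i = bisect_left(ref, boundary)
        neighbours = ref[max(i - 1, 0): i + 1]
        nearest = min(abs(boundary - r) for r in neighbours)
        if nearest <= tolerance:
            hits += 1
    return hits / len(detected)


# Hypothetical sample indices, for illustration only.
detected = [1003, 2020, 3100]
manual = [1000, 2010, 3000]
within_5 = fraction_within_tolerance(detected, manual, 5)
within_15 = fraction_within_tolerance(detected, manual, 15)
```

With these made-up indices, one of three detections falls within 5 samples and two of three fall within 15 samples, which mirrors how a wider tolerance raises the reported percentage.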
|Number of pages||4|
|Journal||Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH|
|State||Published - 1 Dec 2011|
|Event||12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy (27 Aug 2011 → 31 Aug 2011)|
- Phone boundary detection
- Similarity measure