This paper proposes a two-stage sample-based phone boundary detection algorithm. In the first stage, local sample-based acoustic parameters are used to pre-select phone boundary candidates. In the second stage, high-order statistics of the log-likelihood differences between the two adjacent speech segments around each boundary candidate are calculated and serve as a similarity measure for candidate verification. Experimental results on the TIMIT speech corpus show that equal error rates (EERs) of 8.6% and 7.6% were achieved for one-stage and two-stage sample-based phone boundary detection, respectively. Moreover, for the two-stage system, 42.1% and 81.9% of the detected boundaries were within 5- and 15-sample error tolerances, respectively, of the manual labels.
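The tolerance figures quoted above can be illustrated with a minimal sketch. The abstract does not state the exact scoring protocol, so this assumes each detected boundary is compared with its nearest manually labeled boundary, with positions given as sample indices. The function name `fraction_within_tolerance` and the example values are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left


def fraction_within_tolerance(detected, reference, tolerance):
    """Fraction of detected boundaries within `tolerance` samples of the
    nearest reference boundary.

    Assumption: nearest-reference matching; the paper's actual scoring
    rule is not given in the abstract.
    """
    ref = sorted(reference)
    if not detected or not ref:
        return 0.0

    hits = 0
    for d in detected:
        # Index of the first reference boundary >= d.
        i = bisect_left(ref, d)
        # The nearest reference is either just before or at index i.
        neighbors = ref[max(i - 1, 0): i + 1]
        if min(abs(d - r) for r in neighbors) <= tolerance:
            hits += 1
    return hits / len(detected)


# Hypothetical sample positions, for illustration only.
detected = [100, 210, 330]
reference = [103, 200, 400]
within_5 = fraction_within_tolerance(detected, reference, 5)
within_15 = fraction_within_tolerance(detected, reference, 15)
```

With these made-up positions, one of the three detections falls within 5 samples of a reference boundary and two fall within 15 samples. The metric is normalized by the number of detected boundaries, which mirrors the wording "of the detected boundaries" in the abstract.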
|Pages (from - to)||413-416|
|Journal||Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH|
|Publication status||Published - 1 Dec 2011|
|Event||12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy|
Duration: 27 Aug 2011 → 31 Aug 2011