Wake-up-word detection by estimating formants from spatial eigenspace information

Jwu-Sheng Hu*, Ming Tang Lee, Yun Xuan Xiao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Wake-up-word (WUW) detection aims to detect a single word or phrase while rejecting all other words and sounds. For distant human-robot interaction (HRI), both the location of the target speaker and a unique command are required to activate the robot. This paper introduces a multi-channel speech interface that not only estimates the unknown locations of the sound sources but also strengthens the speech features used for WUW detection. A ring-shaped microphone array collects the speech signal. The spatial eigenspace information obtained by multiple signal classification (MUSIC) is used to estimate location-dependent formants and the direction of the target speaker. The formants estimated within a fixed time duration are grouped and evaluated using formant likelihood functions. A cascaded detector then makes the final decision. Experimental results demonstrate the usefulness of the proposed approach under several noisy conditions, including cases with simultaneous competing speech.
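
As an illustration of the direction-finding step mentioned in the abstract, the following is a minimal sketch of narrowband MUSIC on a uniform circular array. It is not the authors' implementation and does not cover the formant estimation or the cascaded detector. The array geometry (8 microphones, 0.1 m radius), the 1 kHz analysis frequency, the synthetic single-source signal, and all function names are assumptions made for this example.

```python
import numpy as np


def steering_vector(theta, freq, n_mics=8, radius=0.1, c=343.0):
    """Far-field steering vector of a uniform circular array (azimuth only).

    n_mics, radius and c are illustrative assumptions, not values from the paper.
    """
    mic_angles = 2 * np.pi * np.arange(n_mics) / n_mics
    # Arrival-time offset of each microphone relative to the array centre.
    delays = radius * np.cos(theta - mic_angles) / c
    return np.exp(1j * 2 * np.pi * freq * delays)


def music_spectrum(snapshots, freq, n_sources, grid):
    """MUSIC pseudospectrum over candidate azimuths in `grid` (radians).

    snapshots: complex array of shape (n_mics, n_snapshots) at one frequency.
    """
    n_mics, n_snap = snapshots.shape
    cov = snapshots @ snapshots.conj().T / n_snap  # spatial covariance estimate
    _, eigvecs = np.linalg.eigh(cov)               # eigenvalues in ascending order
    noise_subspace = eigvecs[:, : n_mics - n_sources]
    spectrum = np.empty(len(grid))
    for i, theta in enumerate(grid):
        a = steering_vector(theta, freq, n_mics=n_mics)
        proj = noise_subspace.conj().T @ a         # projection onto noise subspace
        spectrum[i] = 1.0 / max(np.vdot(proj, proj).real, 1e-12)
    return spectrum


def simulate_and_estimate(true_deg=60.0, freq=1000.0, n_mics=8, n_snap=200, seed=0):
    """Simulate one source in white noise and return the estimated azimuth (degrees)."""
    rng = np.random.default_rng(seed)
    a = steering_vector(np.deg2rad(true_deg), freq, n_mics=n_mics)
    source = rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)
    noise = 0.1 * (rng.standard_normal((n_mics, n_snap))
                   + 1j * rng.standard_normal((n_mics, n_snap)))
    x = np.outer(a, source) + noise
    grid = np.deg2rad(np.arange(0.0, 360.0, 1.0))  # 1-degree search grid
    spectrum = music_spectrum(x, freq, n_sources=1, grid=grid)
    return float(np.rad2deg(grid[np.argmax(spectrum)]))


if __name__ == "__main__":
    print("Estimated azimuth (degrees):", simulate_and_estimate())
```

The pseudospectrum peaks where the steering vector is nearly orthogonal to the noise subspace, which is why the eigendecomposition of the spatial covariance carries the direction information. A real speech front end would repeat this per frequency bin of a short-time Fourier transform rather than at a single frequency.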

Original language: English
Title of host publication: 2012 IEEE International Conference on Mechatronics and Automation, ICMA 2012
Pages: 2019-2024
Number of pages: 6
State: Published - 23 Oct 2012
Event: 2012 9th IEEE International Conference on Mechatronics and Automation, ICMA 2012 - Chengdu, China
Duration: 5 Aug 2012 → 8 Aug 2012

Publication series

Name: 2012 IEEE International Conference on Mechatronics and Automation, ICMA 2012

Conference

Conference: 2012 9th IEEE International Conference on Mechatronics and Automation, ICMA 2012
Country: China
City: Chengdu
Period: 5/08/12 → 8/08/12
