Wake-up-word (WUW) detection aims to detect a single word or phrase while rejecting all other words and sounds. In distant human-robot interaction (HRI), both the location of the target speaker and a unique command are required to activate the robot. This paper introduces a multi-channel speech interface that not only estimates the unknown locations of the sound sources but also strengthens the speech features used for WUW detection. A ring-shaped microphone array collects the speech signal. The spatial eigenspace information obtained by multiple signal classification (MUSIC) is used to estimate location-dependent formants and the direction of the target speaker. The formants estimated within a fixed time duration are grouped and evaluated using formant likelihood functions. A cascaded detector then makes the final decision. Experimental results demonstrate the usefulness of the proposed approach under several noisy conditions, including cases with simultaneous competing speech.
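
As a rough illustration of the direction-estimation step, the sketch below applies textbook narrowband MUSIC to simulated snapshots from a uniform circular array. It is not the paper's implementation: the array geometry (8 microphones, 0.10 m radius), the 1 kHz analysis frequency, the noise level, the source angles, and all function names are assumptions chosen for the example, and the formant estimation and cascaded detector are not covered.

```python
import numpy as np

def uca_steering(az_rad, n_mics, radius_m, freq_hz, c=343.0):
    """Far-field steering vectors of a uniform circular array, one column per azimuth."""
    mic_az = 2.0 * np.pi * np.arange(n_mics) / n_mics   # microphone positions on the ring
    k = 2.0 * np.pi * freq_hz / c                       # wavenumber
    phase = k * radius_m * np.cos(az_rad[None, :] - mic_az[:, None])
    return np.exp(1j * phase)                           # shape (n_mics, n_angles)

def music_spectrum(snapshots, n_sources, scan_rad, radius_m, freq_hz):
    """MUSIC pseudo-spectrum from narrowband snapshots of shape (n_mics, n_snapshots)."""
    n_mics, n_snap = snapshots.shape
    cov = snapshots @ snapshots.conj().T / n_snap       # sample spatial covariance
    _, vecs = np.linalg.eigh(cov)                       # eigenvalues in ascending order
    noise = vecs[:, : n_mics - n_sources]               # noise-subspace eigenvectors
    steer = uca_steering(scan_rad, n_mics, radius_m, freq_hz)
    # Steering vectors of true sources are nearly orthogonal to the noise subspace,
    # so the projection energy is small there and its inverse peaks.
    proj = np.sum(np.abs(noise.conj().T @ steer) ** 2, axis=0)
    return 1.0 / proj

def estimate_doas(snapshots, n_sources, radius_m, freq_hz):
    """Return the n_sources strongest pseudo-spectrum peaks on a 1-degree grid."""
    scan_deg = np.arange(360)
    p = music_spectrum(snapshots, n_sources, np.deg2rad(scan_deg), radius_m, freq_hz)
    # Local maxima with circular wrap-around at 0/360 degrees.
    is_peak = (p > np.roll(p, 1)) & (p >= np.roll(p, -1))
    peaks = scan_deg[is_peak]
    top = peaks[np.argsort(p[is_peak])[::-1][:n_sources]]
    return sorted(int(d) for d in top)

# Simulated scene (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n_mics, radius_m, freq_hz, n_snap = 8, 0.10, 1000.0, 200
true_deg = np.array([60.0, 200.0])                      # target and competing speaker

mixing = uca_steering(np.deg2rad(true_deg), n_mics, radius_m, freq_hz)
sources = (rng.standard_normal((2, n_snap))
           + 1j * rng.standard_normal((2, n_snap))) / np.sqrt(2)
sensor_noise = 0.1 * (rng.standard_normal((n_mics, n_snap))
                      + 1j * rng.standard_normal((n_mics, n_snap))) / np.sqrt(2)
x = mixing @ sources + sensor_noise

estimated_deg = estimate_doas(x, n_sources=2, radius_m=radius_m, freq_hz=freq_hz)
print("estimated azimuths (deg):", estimated_deg)
```

The sketch assumes the number of sources is known and works at a single frequency bin; a practical speech front end would have to estimate the source count and combine evidence across frequency.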