Speech signal-based emotion recognition and its application to entertainment robots

Kai-Tai Song*, Meng Ju Han, Shih Chieh Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

7 Scopus citations


By recognizing sensory information through touch, vision, or voice, a robot can interact with people in a more intelligent manner. In human-robot interaction (HRI), emotion recognition has been a popular research topic in recent years. This paper proposes a method for emotion recognition that uses a speech signal to recognize several basic human emotional states, for application in an entertainment robot. The proposed method combines voice signal processing and classification. First, end-point detection and frame setting are performed in the pre-processing stage. Then, the statistical features of the energy contour are computed. Fisher's linear discriminant analysis (FLDA) is used to enhance the recognition rate. In the final stage, a support vector machine (SVM) completes the emotional state classification. To determine the effectiveness of emotional HRI, an embedded system was constructed and integrated with a self-built entertainment robot. The experimental results show that the robot interacts with a person in a responsive manner. The average recognition rate for five emotional states is 73.8% on the database constructed in the authors' lab.
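The pre-processing and feature steps named in the abstract (frame setting, end-point detection, statistics of the energy contour) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' implementation: the function names, the frame length and hop, the relative-energy threshold used for end-point detection, and the particular statistics (mean, standard deviation, maximum, minimum, range) are hypothetical choices. The abstract does not say which values or statistics the paper uses. The FLDA and SVM stages are not shown.

```python
import math
import statistics


def frame_energies(signal, frame_len=256, hop=128):
    """Short-time energy (sum of squared samples) of each overlapping frame.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def detect_endpoints(energies, ratio=0.1):
    """Return (first, last) indices of frames whose energy exceeds
    ratio * max(energies), or None if no frame qualifies.

    A simple relative threshold is assumed here; the paper's actual
    end-point detector may differ.
    """
    if not energies:
        return None
    threshold = ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0], active[-1]


def energy_contour_features(energies):
    """Summary statistics of an energy contour (an assumed feature set)."""
    return {
        "mean": statistics.fmean(energies),
        "std": statistics.pstdev(energies),
        "max": max(energies),
        "min": min(energies),
        "range": max(energies) - min(energies),
    }


# Synthetic test signal: silence, a 200 Hz tone sampled at 8 kHz, silence.
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(2048)]
signal = [0.0] * 1024 + tone + [0.0] * 1024

energies = frame_energies(signal)
endpoints = detect_endpoints(energies)
first, last = endpoints
features = energy_contour_features(energies[first:last + 1])
print(endpoints, features)
```

In a full pipeline of the kind the abstract describes, a feature vector like this would be computed per utterance, projected with FLDA, and passed to an SVM classifier; in Python, scikit-learn's `LinearDiscriminantAnalysis` and `SVC` would be one possible way to prototype those two stages.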


  • emotion recognition
  • human-robot interaction
  • speech signal processing
