Emotion recognition has become an important research area for advanced human-robot interaction. By recognizing facial expressions, a robot can interact with a person in a friendlier manner. In this paper, we propose a bimodal emotion recognition system that combines image and speech information. A novel information fusion strategy assigns weights to the two feature modalities according to their recognition reliability. Each fusion weight is determined by the distance between the test sample and the classification hyperplane, together with the standard deviation of the training samples. After normalization by the mean distance between the training samples and the hyperplane, the fusion weight represents the classification reliability of the individual modality. In the subsequent bimodal SVM classification, the recognition result of the modality with the higher weight is selected. The complete procedure has been implemented on a DSP-based system that recognizes five facial expressions online in real time. The experimental results show a recognition rate of 86.9%, an improvement of 5% over using image information alone.
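The reliability-weighted fusion described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the per-modality SVM decision values are already available, normalizes the test sample's distance to the hyperplane by the mean training-sample distance, and selects the prediction of the more reliable modality. The standard-deviation term mentioned in the abstract is omitted here for brevity, and all function and variable names are hypothetical.

```python
import numpy as np

def reliability_weight(d_test, d_train):
    # Hypothetical sketch: reliability of one modality's classifier for a
    # given test sample, defined as the test sample's unsigned distance to
    # the SVM hyperplane normalized by the mean unsigned distance of the
    # training samples to the same hyperplane. A larger value means the
    # test sample lies further from the decision boundary, relative to
    # what is typical for this modality, so its prediction is trusted more.
    return abs(d_test) / np.mean(np.abs(np.asarray(d_train)))

def fuse_predictions(pred_image, w_image, pred_speech, w_speech):
    # Select the recognition result of the modality with the higher
    # reliability weight, as in the paper's fusion rule.
    return pred_image if w_image >= w_speech else pred_speech

# Toy usage with synthetic decision values (not from the paper):
d_train_image = [1.2, 0.8, 1.0, 1.4]   # training distances, image SVM
d_train_speech = [0.9, 1.1, 1.0, 1.0]  # training distances, speech SVM
w_img = reliability_weight(1.8, d_train_image)   # far from boundary
w_spc = reliability_weight(0.2, d_train_speech)  # near the boundary
result = fuse_predictions("happy", w_img, "sad", w_spc)
```

Under these synthetic values the image modality is far from its decision boundary while the speech modality is near its own, so the image prediction is selected.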