In this paper, we propose a new background music recommendation scheme for home videos, together with two new features that describe the short-term motion and tempo distributions in visual and aural content, respectively. Unlike previous research, which matched visual and aural content only at the perceptual level, we incorporate both textual semantics and content semantics when determining the degree of matching between a video and a song. The key idea is that the recommended music should carry semantics related to those of the input video, and that the rhythm of the music should be in harmony with the visual motion of the video. Accordingly, a few user-given tags and automatically annotated tags are compared against the lyrics of the songs to select candidate songs. We then use the proposed motion-direction histogram (MDH) and pitch tempo pattern (PTP) to perform a second round of selection. The user's music genre preference is also taken into account as a filtering mechanism at the beginning. A preliminary user evaluation shows that the proposed scheme is promising.
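The three stages described above (genre filtering, tag-to-lyrics candidate selection, and rhythm-based second-round selection) can be illustrated with a minimal sketch. It is not the implementation used in this work. The `Song` class, the tag-overlap score, the histogram-intersection similarity, and the sample data are all hypothetical, and the fixed-length lists are only stand-ins for MDH and PTP, whose actual definitions and matching rule are not given in this abstract.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Song:
    title: str
    genre: str
    lyric_words: Set[str]
    tempo_hist: List[float]  # hypothetical stand-in for a PTP-like descriptor


def tag_lyric_score(tags: Set[str], lyric_words: Set[str]) -> float:
    """Fraction of video tags that occur in the lyrics (assumed relation measure)."""
    if not tags:
        return 0.0
    return len(tags & lyric_words) / len(tags)


def histogram_intersection(a: List[float], b: List[float]) -> float:
    """Sum of bin-wise minima; an assumed similarity between two histograms."""
    return sum(min(x, y) for x, y in zip(a, b))


def recommend(video_tags, video_motion_hist, songs, preferred_genres, n_candidates=2):
    # Stage 0: keep only songs in the user's preferred genres.
    pool = [s for s in songs if s.genre in preferred_genres]
    # Stage 1: rank by tag-to-lyrics relation and keep the top candidates.
    pool.sort(key=lambda s: tag_lyric_score(video_tags, s.lyric_words), reverse=True)
    candidates = pool[:n_candidates]
    # Stage 2: re-rank candidates by similarity between the video's motion
    # descriptor and the song's tempo descriptor (assumed to be comparable).
    candidates.sort(
        key=lambda s: histogram_intersection(video_motion_hist, s.tempo_hist),
        reverse=True,
    )
    return candidates


# Hypothetical sample data, for illustration only.
songs = [
    Song("A", "pop", {"beach", "sun", "summer"}, [0.7, 0.2, 0.1]),
    Song("B", "pop", {"beach", "party"}, [0.1, 0.2, 0.7]),
    Song("C", "rock", {"beach", "sun", "family"}, [0.1, 0.2, 0.7]),
    Song("D", "pop", {"rain", "night"}, [0.1, 0.2, 0.7]),
]
result = recommend({"beach", "sun", "family"}, [0.1, 0.3, 0.6], songs, {"pop"})
titles = [s.title for s in result]
```

In this toy example, song C is removed by the genre filter even though its lyrics match the tags best, and song D is dropped in the first stage because its lyrics share no tags with the video. The second stage then ranks B ahead of A because B's tempo descriptor is closer to the video's motion descriptor.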