Spectro-temporal modulation based singing detection combined with pitch-based grouping for singing voice separation

Tse En Lin, Chung Chien Hsu, Yi Cheng Chen, Jian Hueng Chen, Tai-Shih Chi

Research output: Contribution to journal › Conference article

1 Scopus citations

Abstract

This paper proposes a singing-voice separation system for monaural recordings that cascades spectro-temporal modulation based singing voice detection with a Viterbi-based pitch tracking algorithm. To detect the singing voice, the spectro-temporal modulation energy related to voice harmonics is extracted using a spectro-temporal modulation analysis framework developed for the Fourier spectrogram. The singing voice is then separated from the background music by using a binary mask to group its estimated harmonics. The proposed system is evaluated on the MIR-1K dataset and is shown to outperform three other binary-mask based systems in the vocal/music separation task.
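The abstract does not give the exact grouping rule, so the following is only a minimal sketch of how a binary mask could group harmonics once a per-frame pitch track is available. It assumes an STFT magnitude spectrogram with linearly spaced bins, a pitch value of 0 for frames where no singing is detected, and a fixed tolerance around each harmonic. The function name `harmonic_binary_mask` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def harmonic_binary_mask(f0_hz, n_bins, sample_rate, n_fft, half_width_bins=1):
    """Return an (n_bins, n_frames) 0/1 mask keeping bins near harmonics of f0.

    f0_hz: per-frame pitch estimate in Hz; 0 marks a frame without singing
    (an assumed convention, standing in for the singing detection stage).
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    mask = np.zeros((n_bins, f0_hz.size), dtype=np.uint8)
    bin_hz = sample_rate / n_fft      # frequency spacing between STFT bins
    nyquist = sample_rate / 2.0
    for t, f0 in enumerate(f0_hz):
        if f0 <= 0:                   # non-vocal frame: column stays all zero
            continue
        harmonics = np.arange(f0, nyquist, f0)        # f0, 2*f0, ... below Nyquist
        centers = np.rint(harmonics / bin_hz).astype(int)
        for c in centers:
            lo = max(c - half_width_bins, 0)
            hi = min(c + half_width_bins + 1, n_bins)
            mask[lo:hi, t] = 1        # keep the harmonic bin and its neighbours
    return mask
```

Under these assumptions, multiplying the mixture spectrogram by `mask` would give a vocal estimate and multiplying by `1 - mask` an accompaniment estimate, with the mixture phase reused for resynthesis.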

Original language: English
Pages (from-to): 2920-2923
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
State: Published - 1 Jan 2013
Event: 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013 - Lyon, France
Duration: 25 Aug 2013 → 29 Aug 2013

Keywords

  • Pitch tracking
  • Singing voice detection
  • Singing voice separation
  • Spectro-temporal modulation
