Bayesian singing-voice separation

Po-Kai Yang, Chung-Chien Hsu, Jen-Tzung Chien

Research output: Contribution to conference › Paper › peer-review

12 Scopus citations


This paper presents a Bayesian nonnegative matrix factorization (NMF) approach to extracting the singing voice from background music accompaniment. In this approach, the NMF likelihood function is represented by a Poisson distribution, and the NMF parameters, consisting of basis and weight matrices, are characterized by exponential priors. A variational Bayesian expectation-maximization algorithm is developed to learn the variational parameters and model parameters for monaural source separation. A clustering algorithm is applied to establish two groups of bases: one for the singing voice and the other for the background music. Model complexity is controlled by adaptively selecting the number of bases for different mixed signals according to the variational lower bound. Model regularization is handled through uncertainty modeling via variational inference based on the marginal likelihood. Experimental results on the MIR-1K database show that the proposed method outperforms various unsupervised separation algorithms in terms of the global normalized source-to-distortion ratio.
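The pipeline described in the abstract (Poisson-likelihood NMF, grouping bases into voice/music sets, then reconstructing each source from the mixture) can be sketched as follows. This is a minimal illustration, not the paper's variational Bayesian method: it uses maximum-likelihood KL-NMF with multiplicative updates (the MLE counterpart of the Poisson likelihood, without exponential priors or the variational lower bound), and `voice_idx` stands in for the output of the unspecified clustering step. All function names are hypothetical.

```python
import numpy as np

def nmf_kl(V, K, n_iter=300, seed=0, eps=1e-9):
    """Factorize a nonnegative spectrogram V (F x T) as W @ H with
    K bases, minimizing the generalized KL divergence D_KL(V || WH).
    This is equivalent to maximum likelihood under a Poisson
    observation model, the likelihood used in the paper."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, K)) + eps
    H = rng.random((K, T)) + eps
    ones = np.ones((F, T))
    for _ in range(n_iter):
        # Standard multiplicative updates; each step does not
        # increase D_KL(V || WH).
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (ones @ H.T + eps)
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T @ ones + eps)
    return W, H

def separate(V, W, H, voice_idx):
    """Wiener-style masking: split the mixture spectrogram between
    the voice-basis group and the remaining (music) bases in
    proportion to their reconstructed spectra."""
    voice_part = W[:, voice_idx] @ H[voice_idx, :]
    total = W @ H + 1e-9
    voice = V * (voice_part / total)   # mask in [0, 1], so voice >= 0
    music = V - voice
    return voice, music

# Toy mixture: a low-rank nonnegative "spectrogram".
rng = np.random.default_rng(1)
V = rng.random((20, 2)) @ rng.random((2, 30)) + 1e-3

W, H = nmf_kl(V, K=4)
voice, music = separate(V, W, H, voice_idx=[0, 1])
```

In the paper the number of bases K is selected adaptively per mixture via the variational lower bound, and the voice/music split comes from clustering the learned bases; here both are fixed by hand for brevity.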

Original language: English
Number of pages: 6
State: Published - 1 Jan 2014
Event: 15th International Society for Music Information Retrieval Conference, ISMIR 2014 - Taipei, Taiwan
Duration: 27 Oct 2014 - 31 Oct 2014


Conference: 15th International Society for Music Information Retrieval Conference, ISMIR 2014

