Bayesian factorization and learning for monaural source separation

Jen-Tzung Chien, Po Kai Yang

Research output: Contribution to journal › Article

26 Scopus citations

Abstract

This paper presents a new Bayesian nonnegative matrix factorization (NMF) for monaural source separation. In this approach, the NMF reconstruction error is modeled by a Poisson distribution, and the NMF parameters, namely the basis and weight matrices, are given exponential priors. A variational Bayesian inference procedure is developed to learn both the variational parameters and the model parameters. The randomness in the separation process is represented explicitly, so that the system can be robust to model variations in heterogeneous environments. Importantly, the exponential prior parameters are used to impose sparseness in the basis representation. The variational lower bound of the log marginal likelihood is adopted as the objective to control model complexity. The dependencies of the variational objective on the model parameters are fully characterized in the derived closed-form solution. A clustering algorithm is applied to group the bases for unsupervised source separation. Experiments on speech/music separation and singing voice separation show that the proposed Bayesian NMF (BNMF) with adaptive basis representation outperforms NMF with a fixed number of bases, as well as other BNMFs, in terms of signal-to-distortion ratio and global normalized source-to-distortion ratio.
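To make the model in the abstract concrete, the following is a minimal sketch of NMF with a Poisson likelihood and exponential priors on the basis and weight matrices. It is not the variational Bayesian procedure of the paper: it computes only a MAP point estimate using standard multiplicative updates, and it keeps the prior rates fixed, whereas the abstract describes learning them. The names `map_poisson_nmf`, `neg_log_posterior`, `lam_b`, and `lam_w` are hypothetical and chosen for illustration.

```python
import numpy as np


def neg_log_posterior(X, B, W, lam_b, lam_w, eps=1e-12):
    """Negative log posterior up to constants (assumed form, not from the paper).

    Poisson likelihood X ~ Poisson(B @ W) plus exponential priors with
    fixed rates lam_b (on B) and lam_w (on W).
    """
    V = B @ W + eps
    return float(np.sum(V - X * np.log(V)) + lam_b * B.sum() + lam_w * W.sum())


def map_poisson_nmf(X, n_bases, lam_b=0.1, lam_w=0.1, n_iter=200, seed=0, eps=1e-12):
    """MAP estimate of nonnegative B (bases) and W (weights) with X ~ B @ W.

    Multiplicative updates for the Poisson (KL) objective; the exponential
    priors add the rates lam_b and lam_w to the denominators, which shrinks
    entries toward zero and so encourages sparseness.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    B = rng.random((n_rows, n_bases)) + eps
    W = rng.random((n_bases, n_cols)) + eps
    for _ in range(n_iter):
        # Update bases with weights held fixed.
        R = X / (B @ W + eps)
        B *= (R @ W.T) / (W.sum(axis=1)[None, :] + lam_b)
        # Update weights with bases held fixed.
        R = X / (B @ W + eps)
        W *= (B.T @ R) / (B.sum(axis=0)[:, None] + lam_w)
    return B, W
```

In a separation setting, `X` would be a nonnegative magnitude spectrogram (frequency bins by time frames). Larger values of `lam_b` push more basis entries toward zero, which loosely mirrors the sparseness role the abstract assigns to the exponential prior parameters.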

Original language: English
Pages (from-to): 185-195
Number of pages: 11
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 24
Issue number: 1
State: Published - 1 Jan 2016

Keywords

  • Bayesian learning
  • Model complexity
  • Monaural source separation
  • Nonnegative matrix factorization
