This paper presents a new Bayesian nonnegative matrix factorization (NMF) for monaural source separation. Using this approach, the reconstruction error based on NMF is represented by a Poisson distribution, and the NMF parameters, consisting of the basis and weight matrices, are characterized by the exponential priors. A variational Bayesian inference procedure is developed to learn variational parameters and model parameters. The randomness in separation process is faithfully represented so that the system robustness to model variations in heterogeneous environments could be achieved. Importantly, the exponential prior parameters are used to impose sparseness in basis representation. The variational lower bound of log marginal likelihood is adopted as the objective to control model complexity. The dependencies of variational objective on model parameters are fully characterized in the derived closed-form solution. A clustering algorithm is performed to find the groups of bases for unsupervised source separation. The experiments on speech/music separation and singing voice separation show that the proposed Bayesian NMF (BNMF) with adaptive basis representation outperforms the NMF with fixed number of bases and the other BNMFs in terms of signal-to-distortion ratio and the global normalized source to distortion ratio.
|Number of pages||11|
|Journal||IEEE/ACM Transactions on Audio Speech and Language Processing|
|State||Published - 1 Jan 2016|
- Bayesian learning
- Model complexity
- Monaural source separation
- Nonnegative matrix factorization