Variational domain adversarial learning for speaker verification

Youzhi Tu, Man Wai Mak, Jen Tzung Chien

Research output: Contribution to journal › Conference article

6 Scopus citations

Abstract

Domain mismatch arises when the distribution of the training data differs from that of the test data. This paper proposes a variational domain adversarial neural network (VDANN), which combines a variational autoencoder (VAE) with a domain adversarial neural network (DANN), to reduce domain mismatch. The DANN part retains speaker identity information and learns a feature space that is robust against domain mismatch, while the VAE part imposes variational regularization on the learned features so that they follow a Gaussian distribution. The representation produced by VDANN is therefore not only speaker-discriminative and domain-invariant but also Gaussian distributed, which is essential for the standard PLDA backend. Experiments on both SRE16 and SRE18-CMN2 show that VDANN outperforms the Kaldi baseline and the standard DANN. The results also suggest that VAE regularization is effective for domain adaptation.
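
To make the two ingredients named in the abstract concrete, the sketch below shows one way an encoder-side objective could combine them: a speaker cross-entropy term, an adversarial domain term entering with the opposite sign, and the closed-form KL divergence between a diagonal Gaussian and a standard normal as the variational regularizer. This is an illustration only, not the authors' formulation. The function names and the weights `alpha` and `beta` are hypothetical, and the VAE reconstruction term is omitted for brevity.

```python
import math


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for a single embedding."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


def cross_entropy(probs, label):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[label])


def encoder_objective(speaker_probs, speaker_label,
                      domain_probs, domain_label,
                      mu, log_var, alpha=1.0, beta=0.1):
    """Hypothetical encoder-side loss (assumed form, not taken from the paper).

    The encoder is rewarded for low speaker error, for HIGH domain error
    (hence the minus sign, the adversarial part), and for embeddings whose
    posterior stays close to a standard Gaussian.
    """
    speaker_loss = cross_entropy(speaker_probs, speaker_label)
    domain_loss = cross_entropy(domain_probs, domain_label)
    kl = gaussian_kl_to_standard_normal(mu, log_var)
    return speaker_loss - alpha * domain_loss + beta * kl


if __name__ == "__main__":
    loss = encoder_objective(
        speaker_probs=[0.7, 0.2, 0.1], speaker_label=0,
        domain_probs=[0.5, 0.5], domain_label=1,
        mu=[0.3, -0.1], log_var=[0.0, 0.0],
    )
    print(f"encoder objective: {loss:.4f}")
```

The KL term is zero exactly when the embedding posterior is already a standard Gaussian, which is the sense in which it pushes the learned features toward the distribution that a Gaussian PLDA backend assumes.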

Keywords

  • Domain adaptation
  • Domain adversarial training
  • Speaker verification
  • Variational autoencoder
