Information Maximized Variational Domain Adversarial Learning for Speaker Verification

Youzhi Tu, Man Wai Mak, Jen Tzung Chien

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Domain mismatch is a common problem in speaker verification. This paper proposes an information-maximized variational domain adversarial neural network (InfoVDANN) to reduce domain mismatch by incorporating an InfoVAE into domain adversarial training (DAT). DAT aims to produce speaker discriminative and domain-invariant features. The InfoVAE has two roles. First, it performs variational regularization on the learned features so that they follow a Gaussian distribution, which is essential for the standard PLDA backend. Second, it preserves mutual information between the features and the training set to extract extra speaker discriminative information. Experiments on both SRE16 and SRE18-CMN2 show that the InfoVDANN outperforms the recent VDANN, which suggests that increasing the mutual information between the latent features and input features enables the InfoVDANN to extract extra speaker information that is otherwise not possible.

Original languageEnglish
Title of host publication2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages6449-6453
Number of pages5
ISBN (Electronic)9781509066315
DOIs
StatePublished - May 2020
Event2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Barcelona, Spain
Duration: 4 May 20208 May 2020

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume2020-May
ISSN (Print)1520-6149

Conference

Conference2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020
CountrySpain
CityBarcelona
Period4/05/208/05/20

Keywords

  • adversarial training
  • domain adaptation
  • mutual information
  • Speaker verification
  • variational autoencoder

Fingerprint Dive into the research topics of 'Information Maximized Variational Domain Adversarial Learning for Speaker Verification'. Together they form a unique fingerprint.

  • Cite this

    Tu, Y., Mak, M. W., & Chien, J. T. (2020). Information Maximized Variational Domain Adversarial Learning for Speaker Verification. In 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings (pp. 6449-6453). [9053735] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2020-May). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICASSP40776.2020.9053735