TY - JOUR
T1 - Fast scoring for PLDA with uncertainty propagation via i-vector grouping
AU - Lin, Wei wei
AU - Mak, Man Wai
AU - Chien, Jen-Tzung
PY - 2017/9/1
Y1 - 2017/9/1
N2 - The i-vector/PLDA framework has gained huge popularity in text-independent speaker verification. This approach, however, lacks the ability to represent the reliability of i-vectors. As a result, the framework performs poorly when presented with utterances of arbitrary duration. To address this problem, a method called uncertainty propagation (UP) was proposed to explicitly model the reliability of an i-vector by an utterance-dependent loading matrix. However, the utterance-dependent matrix greatly complicates the evaluation of likelihood scores. As a result, PLDA with UP, or PLDA-UP for short, is far more computationally intensive than conventional PLDA. In this paper, we propose to group i-vectors with similar reliability and, for each group, to replace the utterance-dependent loading matrices with a single representative one. This arrangement allows us to pre-compute a set of representative matrices that cover all possible i-vectors, thereby greatly reducing the computational cost of PLDA-UP while preserving its ability to discriminate the reliability of i-vectors. Experiments on NIST 2012 SRE show that the proposed method performs as well as PLDA-UP while requiring only 3.18% of its scoring time.
AB - The i-vector/PLDA framework has gained huge popularity in text-independent speaker verification. This approach, however, lacks the ability to represent the reliability of i-vectors. As a result, the framework performs poorly when presented with utterances of arbitrary duration. To address this problem, a method called uncertainty propagation (UP) was proposed to explicitly model the reliability of an i-vector by an utterance-dependent loading matrix. However, the utterance-dependent matrix greatly complicates the evaluation of likelihood scores. As a result, PLDA with UP, or PLDA-UP for short, is far more computationally intensive than conventional PLDA. In this paper, we propose to group i-vectors with similar reliability and, for each group, to replace the utterance-dependent loading matrices with a single representative one. This arrangement allows us to pre-compute a set of representative matrices that cover all possible i-vectors, thereby greatly reducing the computational cost of PLDA-UP while preserving its ability to discriminate the reliability of i-vectors. Experiments on NIST 2012 SRE show that the proposed method performs as well as PLDA-UP while requiring only 3.18% of its scoring time.
KW - Duration mismatch
KW - Speaker verification
KW - Uncertainty propagation
KW - i-Vector/PLDA
UR - http://www.scopus.com/inward/record.url?scp=85015713449&partnerID=8YFLogxK
U2 - 10.1016/j.csl.2017.02.009
DO - 10.1016/j.csl.2017.02.009
M3 - Article
AN - SCOPUS:85015713449
VL - 45
SP - 503
EP - 515
JO - Computer Speech and Language
JF - Computer Speech and Language
SN - 0885-2308
ER -