Within-class covariance normalization for SVM-based speaker recognition

TLDR

The study extends within-class covariance normalization (WCCN), a technique for training generalized linear kernels, and presents a practical method for applying it to high-dimensional SVM-based speaker recognition. The method uses PCA to split the feature space into a low-dimensional PCA subspace and a high-dimensional complement, performs WCCN in the PCA subspace, then concatenates the normalized vectors with a weighted version of their PCA-complements. When integrated into a state-of-the-art MLLR-SVM system, the approach improves equal error rate (EER) by up to 22% and minimum decision cost function (DCF) by up to 28% relative to the previous baseline, and it substantially outperforms a variant that applies WCCN in the PCA subspace but discards the complement.

Abstract

This paper extends the within-class covariance normalization (WCCN) technique described in [1, 2] for training generalized linear kernels. We describe a practical procedure for applying WCCN to an SVM-based speaker recognition system where the input feature vectors reside in a high-dimensional space. Our approach involves using principal component analysis (PCA) to split the original feature space into two subspaces: a low-dimensional “PCA space” and a high-dimensional “PCA-complement space.” After performing WCCN in the PCA space, we concatenate the resulting feature vectors with a weighted version of their PCA-complements. When applied to a state-of-the-art MLLR-SVM speaker recognition system, this approach achieves improvements of up to 22% in EER and 28% in minimum decision cost function (DCF) over our previous baseline. We also achieve substantial improvements over an MLLR-SVM system that performs WCCN in the PCA space but discards the PCA-complement.
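
The procedure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `fit_pca_wccn` and `transform_pca_wccn`, the `ridge` regularizer, and the `complement_weight` parameter are hypothetical, and the pooled within-class covariance estimate and the Cholesky factorization are assumptions, since the abstract does not specify those details.

```python
import numpy as np


def fit_pca_wccn(X, labels, n_components, ridge=1e-6):
    """Fit a PCA basis and a WCCN transform in the PCA space (illustrative).

    X: (n_samples, n_features) training vectors.
    labels: one speaker label per row of X.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    mean = X.mean(axis=0)

    # PCA basis from the SVD of the centered data; rows of Vt are orthonormal.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    U = Vt[:n_components].T              # (n_features, n_components)
    Z = (X - mean) @ U                   # coordinates in the PCA space

    # Pooled within-class covariance in the PCA space (assumed estimator).
    W = np.zeros((n_components, n_components))
    for speaker in np.unique(labels):
        Zs = Z[labels == speaker]
        Zs = Zs - Zs.mean(axis=0)
        W += Zs.T @ Zs
    W /= len(Z)
    W += ridge * np.eye(n_components)    # assumption: small ridge keeps W invertible

    # Pick A with A @ A.T == inv(W); then Z @ A has (near-)identity
    # within-class covariance.
    A = np.linalg.cholesky(np.linalg.inv(W))
    return mean, U, A


def transform_pca_wccn(X, mean, U, A, complement_weight):
    """Concatenate WCCN-normalized PCA features with the weighted complement."""
    Xc = np.asarray(X, dtype=float) - mean
    Z = Xc @ U
    complement = Xc - Z @ U.T            # part of each vector orthogonal to the PCA space
    return np.hstack([Z @ A, complement_weight * complement])
```

A linear SVM trained on the output of `transform_pca_wccn` would then use a kernel that is within-class normalized in the PCA space and scaled by `complement_weight` squared in the complement. Setting `complement_weight` to zero corresponds to the variant that discards the PCA-complement.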
