T-Norm for Text-Dependent Commercial Speaker Verification Applications: Effect of Lexical Mismatch

Abstract

We describe a test-time score normalization technique (T-Norm) for text-dependent speaker verification that is robust to lexical mismatch. The main challenge in deploying T-Norm for a text-dependent task is the mismatch between the lexicon of the target speaker model in the application and that of the cohort speaker models. We demonstrate the negative effect of that mismatch in controlled experiments and propose a hybrid scoring scheme (T-Norm combined with a background model) to remedy it. In a lexically mismatched scenario, which is inherent to the deployment of T-Norm in a text-dependent system, hybrid scoring yields a 31% relative error rate reduction over T-Norm alone and a 22% relative error rate reduction over the baseline (no T-Norm) system.
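
For readers unfamiliar with the normalization, the following is a minimal sketch of T-Norm in its commonly described form: the target model's raw score on a test utterance is shifted and scaled by the mean and standard deviation of the scores that a set of cohort models give the same utterance. The function names, the use of the population standard deviation, and the example numbers are assumptions made for illustration. The abstract does not specify how the hybrid scheme combines T-Norm with the background model, so that part is not sketched here.

```python
from statistics import mean, pstdev


def t_norm(raw_score, cohort_scores):
    """Normalize a target-model score using cohort-model scores.

    raw_score: score of the test utterance against the target speaker model.
    cohort_scores: scores of the same utterance against each cohort model.
    Illustrative sketch only; not taken from the paper.
    """
    if len(cohort_scores) < 2:
        raise ValueError("need at least two cohort scores")
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)  # population std; a choice made for this sketch
    if sigma == 0:
        raise ValueError("cohort scores have zero variance")
    return (raw_score - mu) / sigma


def relative_reduction(baseline_error, new_error):
    """Relative error rate reduction of new_error with respect to baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical scores, chosen only to exercise the functions.
normalized = t_norm(2.0, [0.0, 1.0, 2.0, 1.0])
reduction = relative_reduction(0.10, 0.069)
```

With lexically mismatched cohorts, the cohort mean and standard deviation are estimated from models trained on different words than the target model, which is the mismatch the abstract identifies as harmful.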
