More is better: likelihood ratio-based forensic voice comparison with vocalic segmental cepstra frontends

Abstract

The suitability of vowel cepstral spectra for forensic voice comparison is explored within a likelihood ratio-based framework, and non-technical explanations are provided for some basic concepts of cepstral analysis and forensic voice comparison. Non-contemporaneous landline telephone recordings of 297 male Japanese speakers are compared using only two replicates per recording of each of their five read-out vowels. The features are fourteen cepstrally-mean-subtracted LPC cepstral coefficients modelling the spectral shape up to 5 kHz. When evaluated intrinsically with kernel density multivariate likelihood ratios, all 297 same-speaker comparisons are correctly evaluated as coming from the same speaker, and only 173 of the 43,956 different-speaker comparisons (0.4%) are incorrectly evaluated as coming from the same speaker. The log-likelihood ratio cost for this comparison is very low, at 0.013. Fusion with a speaker's long-term spectral data marginally improves the different-speaker error rate to 0.27% and the log-likelihood ratio cost to 0.009. It is concluded that the approach warrants further examination.
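The comparison counts and error rate quoted in the abstract can be checked with a few lines of arithmetic, and the log-likelihood ratio cost can be sketched alongside them. The following Python is a minimal illustration, not the authors' implementation. It assumes one same-speaker comparison per speaker and one different-speaker comparison per unordered pair of speakers (an assumption that reproduces the 297 and 43,956 figures), and it uses the usual definition of the log-likelihood ratio cost as the average of the mean base-2 log penalties over same-speaker and different-speaker comparisons. The function names and the toy likelihood ratios are hypothetical.

```python
import math


def comparison_counts(n_speakers):
    """Number of same-speaker and different-speaker comparisons.

    Assumes one same-speaker comparison per speaker and one
    different-speaker comparison per unordered pair of speakers.
    """
    same = n_speakers
    different = n_speakers * (n_speakers - 1) // 2
    return same, different


def error_rate_percent(n_errors, n_comparisons):
    """Proportion of misclassified comparisons, as a percentage."""
    return 100.0 * n_errors / n_comparisons


def cllr(same_speaker_lrs, different_speaker_lrs):
    """Log-likelihood ratio cost for two lists of likelihood ratios.

    Same-speaker LRs are penalised when small, different-speaker LRs
    when large. A system that always outputs LR = 1 scores exactly 1;
    values near 0 indicate well-behaved likelihood ratios.
    """
    ss_penalty = sum(math.log2(1.0 + 1.0 / lr) for lr in same_speaker_lrs)
    ds_penalty = sum(math.log2(1.0 + lr) for lr in different_speaker_lrs)
    return 0.5 * (ss_penalty / len(same_speaker_lrs)
                  + ds_penalty / len(different_speaker_lrs))


if __name__ == "__main__":
    same, different = comparison_counts(297)
    print(same, different)                  # 297 43956
    print(error_rate_percent(173, different))

    # Toy likelihood ratios, invented purely for illustration.
    print(cllr([1000.0, 250.0], [0.001, 0.02]))
```

With 173 errors out of 43,956 different-speaker comparisons the error rate is just under 0.4%, matching the rounded figure in the abstract. The cost function makes clear why a value such as 0.013 is considered very low: it lies far below the value of 1 obtained by an uninformative system that always reports a likelihood ratio of 1.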
