Developing usable speech criteria for speaker identification technology

Abstract

A "usable speech" extraction system was proposed (Yantorno, 1998) to separate co-channel speech into "usable" frames, that is, frames minimally corrupted by interfering speech. Studies indicate that a significant amount of co-channel speech can be considered "usable" for speaker identification (SID), so it is necessary to establish criteria for usable speech frames for SID. Voiced speech, of which usable speech consists entirely, is shown to be information-rich for SID. In addition, SID accuracy increases as the frame-based target-to-interferer ratio (TIR) increases when evaluated independently of the amount of available segments. Krishnamachari et al. (2000) developed a frame-based spectral autocorrelation ratio (SAR) technique for determining usable frames within co-channel speech. The ability of the SAR method to determine usable frames at various thresholds is examined. This paper investigates the effectiveness of a frame-based usable speech extraction technique for speaker identification.
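
As an illustration of the frame-based TIR criterion mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the target and interfering signals are available separately (as in a controlled experiment), uses non-overlapping frames, and defines TIR as the ratio of frame energies in dB. The function names, the frame length, and the threshold value are hypothetical choices made for this example only.

```python
import math


def frame_tir_db(target, interferer, frame_len):
    """Per-frame target-to-interferer ratio in dB (non-overlapping frames).

    Assumption: TIR is taken as 10*log10(target energy / interferer energy)
    within each frame; the abstract does not spell out the exact definition.
    """
    n_frames = min(len(target), len(interferer)) // frame_len
    tirs = []
    for i in range(n_frames):
        start = i * frame_len
        t_energy = sum(x * x for x in target[start:start + frame_len])
        i_energy = sum(x * x for x in interferer[start:start + frame_len])
        if t_energy == 0.0:
            tirs.append(-math.inf)   # no target energy: never usable
        elif i_energy == 0.0:
            tirs.append(math.inf)    # no interference in this frame
        else:
            tirs.append(10.0 * math.log10(t_energy / i_energy))
    return tirs


def usable_frames(tirs, threshold_db):
    """Indices of frames whose TIR meets a (hypothetical) usability threshold."""
    return [i for i, tir in enumerate(tirs) if tir >= threshold_db]


# Toy signals: frame 0 is dominated by the target, frame 1 is evenly mixed.
target = [1.0] * 8
interferer = [0.1] * 4 + [1.0] * 4
tirs = frame_tir_db(target, interferer, frame_len=4)
usable = usable_frames(tirs, threshold_db=10.0)
```

In operational co-channel speech the two signals are not available separately, which is why a measure computed from the mixture alone, such as the SAR technique cited above, is needed to estimate which frames are usable.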
