A new SNR-feature mapping for robust multistream speech recognition

Abstract

We describe a new model of CASA labelling which assigns to each time-frequency region a probability of being "clean" enough to feed a multistream recogniser adapted only to clean data. This labelling process is based on the harmonicity of speech. The probability is evaluated according to an SNR-feature mapping and the choice of an SNR decision threshold. This extends a previous method [1] based on the binary detection of noisy time-frequency regions, followed by partial recognition of the clean regions. The labelling process is suited to a new multistream recognition approach [5], since these probabilities serve to weight the streams' posteriors.

1. INTRODUCTION

We propose to label the time-frequency representation after an analysis of primitive features of the speech, such as harmonicity or binaural cues. First, the CASA (Computational Auditory Scene Analysis, see [9]) approach is based on the definition of multiple representations of the signal allowing the extraction o...
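
The labelling and weighting scheme outlined in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the harmonicity feature is assumed to lie in [0, 1], the SNR-feature mapping is assumed to be linear with a Gaussian estimation error, and the streams' posteriors are assumed to be merged by a normalised weighted sum. The names `prob_clean` and `combine_posteriors` and all parameter values are hypothetical.

```python
import math


def prob_clean(feature, snr_threshold_db,
               slope_db=40.0, offset_db=-10.0, sigma_db=5.0):
    """Probability that a time-frequency region is "clean".

    Assumption: SNR given the feature is Gaussian with mean
    slope_db * feature + offset_db (a hypothetical linear mapping)
    and standard deviation sigma_db. The result is
    P(SNR > snr_threshold_db | feature).
    """
    mean_snr_db = slope_db * feature + offset_db
    z = (mean_snr_db - snr_threshold_db) / sigma_db
    # Gaussian cumulative distribution function evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def combine_posteriors(stream_posteriors, weights):
    """Merge per-stream class posteriors with normalised weights.

    Assumption: a weighted sum of posteriors; the paper's actual
    combination rule may differ.
    """
    total = sum(weights)
    if total == 0.0:
        # No stream is trusted: fall back to equal weights.
        norm = [1.0 / len(weights)] * len(weights)
    else:
        norm = [w / total for w in weights]
    n_classes = len(stream_posteriors[0])
    return [
        sum(w * p[c] for w, p in zip(norm, stream_posteriors))
        for c in range(n_classes)
    ]


# Two streams (e.g. two frequency sub-bands) for one frame.
features = [0.9, 0.1]                   # hypothetical harmonicity values
weights = [prob_clean(f, snr_threshold_db=10.0) for f in features]
posteriors = [[0.8, 0.2], [0.4, 0.6]]   # hypothetical class posteriors
merged = combine_posteriors(posteriors, weights)
```

With these assumed values the strongly harmonic stream dominates the merged posterior, which mirrors the soft counterpart of discarding regions detected as noisy.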
