Publication | Closed Access
Speech intelligibility in reverberation with ideal binary masking: Effects of early reflections and signal-to-noise ratio threshold
Year: 2013 | Citations: 46 | References: 39
Engineering, Speech Intelligibility, Binary Masking, Speech Enhancement, Acoustic Modeling, Speech Recognition, Ideal Binary Mask, Speech Coding, Phonetics, Noise, Acoustic Signal Processing, Health Sciences, Distant Speech Recognition, Signal Processing, Speech Communication, Ideal Binary Masking, Early Reflections, Speech Processing, Speech Separation, Speech Perception
Ideal binary masking is a signal processing technique that separates a desired signal from a mixture by retaining only the time-frequency units where the signal-to-noise ratio (SNR) exceeds a predetermined threshold. In reverberant conditions the ideal binary mask can be defined in more than one way, because the early reflections of the target may be treated as either desired signal or noise. The ideal binary mask may therefore be parameterized by the reflection boundary, a predetermined division point between early and late reflections. Another important parameter is the local SNR threshold used to label each time-frequency unit as either target or background. Two experiments were designed to assess the impact of these two parameters on speech intelligibility with ideal binary masking for normal-hearing listeners in reverberant conditions. Experiment 1 shows that, to achieve intelligibility improvements, only the early reflections should be preserved by the binary mask. Moreover, it shows that the effective SNR should be accounted for when choosing the optimal range of the local threshold. Experiment 2 shows that with long reverberation times, intelligibility improvements are obtained only when the reflection boundary is 100 ms or less. The experiment also suggests that binary masking can be used for dereverberation.
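The two parameters named in the abstract, the reflection boundary and the local SNR threshold, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the time-frequency analysis, so a plain Hann-windowed STFT is assumed here. Measuring the reflection boundary from the strongest tap of the room impulse response (taken as the direct sound) is also an assumption. The names `stft_power`, `split_rir`, and `ideal_binary_mask`, along with the synthetic signal and impulse response, are hypothetical and serve only as illustration.

```python
import numpy as np


def stft_power(x, frame_len=512, hop=256):
    """Power spectrogram (frames x bins) from a Hann-windowed short-time FFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def split_rir(rir, fs, boundary_ms):
    """Split an impulse response into early and late parts.

    Assumption: the boundary is measured from the strongest tap,
    which is taken to be the direct sound.
    """
    direct = int(np.argmax(np.abs(rir)))
    cut = direct + int(round(boundary_ms * 1e-3 * fs))
    early = rir.copy()
    early[cut:] = 0.0
    late = rir - early
    return early, late


def ideal_binary_mask(target, interference, lc_db=0.0, eps=1e-12):
    """Return 1 where the local SNR in dB exceeds the threshold lc_db, else 0."""
    t_pow = stft_power(target)
    n_pow = stft_power(interference)
    local_snr_db = 10.0 * np.log10((t_pow + eps) / (n_pow + eps))
    return (local_snr_db > lc_db).astype(np.uint8)


# Synthetic stand-ins (hypothetical data, not from the study).
fs = 16000
rng = np.random.default_rng(0)
speech = rng.standard_normal(fs)  # 1 s of noise standing in for dry speech
t = np.arange(fs // 2) / fs
rir = 0.1 * rng.standard_normal(fs // 2) * np.exp(-t / 0.1)
rir[0] = 1.0  # direct sound

# Early reflections count as target, late reverberation as interference.
early, late = split_rir(rir, fs, boundary_ms=50.0)
target = np.convolve(speech, early)[: len(speech)]
interference = np.convolve(speech, late)[: len(speech)]

mask_0db = ideal_binary_mask(target, interference, lc_db=0.0)
mask_m6db = ideal_binary_mask(target, interference, lc_db=-6.0)
```

Moving `boundary_ms` shifts energy between the target and the interference, while `lc_db` controls how strict the labeling is: lowering the threshold can only retain more time-frequency units, never fewer. Applying the mask to the mixture and resynthesizing audio is left out of this sketch.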