DNN-supported Mask-based Convolutional Beamforming for Simultaneous Denoising, Dereverberation, and Source Separation

Abstract

In this article, we investigate an integrated mask-based convolutional beamforming method that performs denoising, dereverberation, and source separation simultaneously. Conventional mask-based source separation faces two difficulties: neural network-supported approaches struggle to perform denoising and dereverberation at the same time, and spatial clustering-based approaches cannot reliably solve the permutation problem in the presence of noise and reverberation. These difficulties greatly limit the applicability of mask-based source separation. To address them, we propose a method that integrates state-of-the-art techniques for mask-based beamforming into a single optimization framework. These techniques are frequency-domain Convolutional Neural Network-based utterance-level Permutation Invariant Training with a large receptive field (CNN-uPIT), noisy Complex Gaussian Mixture Model-based spatial clustering (noisyCGMM), and Weighted Power minimization Distortionless response (WPD) convolutional beamforming. Our experiments show that all of these components are essential for accurately estimating the desired speech signals in noisy, reverberant, multisource environments.
