Publication | Closed Access
TCNN: Temporal Convolutional Neural Network for Real-time Speech Enhancement in the Time Domain
Citations: 378
References: 21
Year: 2019
Unknown Venue
Convolutional Neural Network, Engineering, Speech Enhancement, Speech Recognition, Real-time Speech Enhancement, Noise, Robust Speech Recognition, Real-time Language, Health Sciences, Deep Learning, Distant Speech Recognition, Signal Processing, Speech Communication, Speech Technology, Multi-speaker Speech Recognition, Speech Separation, Speech Processing, Convolutional Layers, Speech Perception, Time Domain
This work proposes a fully convolutional neural network (CNN) for real-time speech enhancement in the time domain. The proposed CNN is an encoder-decoder architecture with an additional temporal convolutional module (TCM) inserted between the encoder and the decoder. We call this architecture a Temporal Convolutional Neural Network (TCNN). The encoder in the TCNN creates a low-dimensional representation of a noisy input frame. The TCM uses causal and dilated convolutional layers to exploit the encoder output of the current and previous frames. The decoder uses the TCM output to reconstruct the enhanced frame. The proposed model is trained in a speaker- and noise-independent way. Experimental results demonstrate that the proposed model gives consistently better enhancement results than a state-of-the-art real-time convolutional recurrent model. Moreover, since the model is fully convolutional, it has far fewer trainable parameters than earlier models.
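The causal, dilated convolutions that the TCM relies on can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract gives no kernel size, dilation rates, or layer count, so the kernel size of 3, the dilations 1, 2, 4, 8, the filter weights, and the names `causal_dilated_conv1d`, `receptive_field`, and `run_stack` are all assumptions chosen for illustration. Each "frame" is reduced to a single number here, whereas the real encoder output would be a feature vector.

```python
from typing import List, Sequence


def causal_dilated_conv1d(
    x: Sequence[float], weights: Sequence[float], dilation: int
) -> List[float]:
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation], with x[t] = 0 for t < 0."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding: no future frame is ever read
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Frames (current plus past) visible to a stack of causal dilated layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def run_stack(
    x: Sequence[float], weights: Sequence[float], dilations: Sequence[int]
) -> List[float]:
    """Apply one causal dilated layer per dilation rate, in order."""
    h = list(x)
    for d in dilations:
        h = causal_dilated_conv1d(h, weights, d)
    return h


# Assumed settings (not taken from the paper).
WEIGHTS = [0.5, 0.3, 0.2]
DILATIONS = [1, 2, 4, 8]

frames = [float(i) for i in range(20)]
clean_out = run_stack(frames, WEIGHTS, DILATIONS)

# Perturb one later frame; outputs before that index should not change.
perturbed = list(frames)
perturbed[15] += 100.0
perturbed_out = run_stack(perturbed, WEIGHTS, DILATIONS)
```

Under these assumptions, the outputs for frames 0 to 14 are identical in both runs, which is the causality property that makes frame-by-frame, real-time processing possible. Doubling the dilation per layer grows the temporal context exponentially with depth while the parameter count grows only linearly, which is consistent with the abstract's point about a small number of trainable parameters.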