Publication | Open Access
A Deep Attention Model for Environmental Sound Classification from Multi-Feature Data
Citations: 16 | References: 17 | Year: 2022
Topics: Music, Engineering, Machine Learning, Attention Weight, Environmental Sound Recognition, Environmental Sound Classification, Acoustic Modeling, Speech Recognition, Pattern Recognition, Environmental Sound, Audio Analysis, Noise, Deep Attention Model, Acoustic Signal Processing, Speech Signal Analysis, Health Sciences, Multi-feature Data, Audio Retrieval, Deep Learning, Audio Mining, Music Classification, Speech Acoustics, Speech Processing
Automated environmental sound recognition has clear engineering benefits: it allows audio to be sorted, curated, and searched. Unlike music and language, environmental sound is heavily contaminated by noise and lacks both the rhythm and melody of music and the semantic sequence of language, which makes it difficult to find common features that represent the wide variety of environmental sound signals. To improve recognition accuracy, this paper proposes a method based on multi-feature parameters and a time–frequency attention module. The method begins with a preprocessing step that extracts multi-feature parameters from the sound; these supplement the phase information that the Log-Mel spectrogram, used by current mainstream methods, discards, and so enhance the expressive power of the input features. A time–frequency attention module built from multiple convolutions then extracts attention weights from the input feature spectrogram, reducing interference from background noise and irrelevant frequency bands. Comparative experiments were conducted on three general datasets: the environmental sound classification datasets ESC-10 and ESC-50, and UrbanSound8K. The experiments showed that the proposed method outperforms the methods it was compared against.
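The idea of weighting a feature spectrogram along both time and frequency can be illustrated with a minimal NumPy sketch. This is not the paper's architecture: the function name `time_frequency_attention`, the mean pooling, the single 1-D convolution per axis, and the fixed smoothing kernels are all assumptions made for illustration. In a trained model the kernels would be learned and the module would use multiple convolutions.

```python
import numpy as np


def sigmoid(x):
    """Squash values into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def time_frequency_attention(spec, w_time, w_freq):
    """Apply a simplified time-frequency attention mask (illustrative only).

    spec   : 2-D array of shape (n_freq, n_time), e.g. a Log-Mel spectrogram.
    w_time : 1-D kernel convolved along the time axis (hypothetical weights).
    w_freq : 1-D kernel convolved along the frequency axis (hypothetical weights).
    Returns the re-weighted spectrogram and the attention weight map.
    """
    # Pool over frequency to get one value per time frame, and vice versa.
    time_profile = spec.mean(axis=0)  # shape (n_time,)
    freq_profile = spec.mean(axis=1)  # shape (n_freq,)

    # One 1-D convolution per axis, then a sigmoid to obtain weights in (0, 1).
    time_att = sigmoid(np.convolve(time_profile, w_time, mode="same"))
    freq_att = sigmoid(np.convolve(freq_profile, w_freq, mode="same"))

    # The outer product gives one weight per time-frequency bin.
    weights = np.outer(freq_att, time_att)  # shape (n_freq, n_time)
    return spec * weights, weights


# Usage with a random stand-in for a 64-band, 100-frame spectrogram.
rng = np.random.default_rng(0)
demo_spec = rng.standard_normal((64, 100))
demo_out, demo_weights = time_frequency_attention(
    demo_spec, np.ones(3) / 3, np.ones(5) / 5
)
```

Bins whose time frame and frequency band both receive low weights are attenuated, which is the sense in which such a mask can suppress background noise and irrelevant bands.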