Concepedia

Publication | Closed Access

Robust Sound Event Classification Using Deep Neural Networks

Citations: 265

References: 30

Year: 2015

TLDR

Sound event recognition matters for applications such as surveillance and machine hearing, but robust classification under real-world noise remains challenging. Methods carried over from speech recognition, based on mel-frequency cepstral coefficients, perform reasonably well, while spectrogram- or auditory-image techniques reportedly hold up better in noise. The study proposes a framework that compares auditory-image and spectrogram-image front-end features, using support vector machine and deep neural network classifiers. Evaluated on a standard robust classification task at several noise levels, the system compares very well with state-of-the-art methods.

Abstract

The automatic recognition of sound events by computers is an important aspect of emerging applications such as automated surveillance, machine hearing and auditory scene understanding. Recent advances in machine learning, as well as in computational models of the human auditory system, have contributed to advances in this increasingly popular research field. Robust sound event classification, the ability to recognise sounds under real-world noisy conditions, is an especially challenging task. Classification methods translated from the speech recognition domain, using features such as mel-frequency cepstral coefficients, have been shown to perform reasonably well for the sound event classification task, although spectrogram-based or auditory image analysis techniques reportedly achieve superior performance in noise. This paper outlines a sound event classification framework that compares auditory image front end features with spectrogram image-based front end features, using support vector machine and deep neural network classifiers. Performance is evaluated on a standard robust classification task in different levels of corrupting noise, and with several system enhancements, and shown to compare very well with current state-of-the-art classification techniques.
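The abstract mentions evaluation "in different levels of corrupting noise" but does not say how the noisy test material is produced. As a rough illustration only, the sketch below shows one common way to corrupt a clean signal at a chosen signal-to-noise ratio (SNR) using NumPy. The function names `mix_at_snr` and `measured_snr_db`, the 16 kHz sample rate, the sine-wave stand-in for a sound event, and the SNR values are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Add noise to a clean signal, scaled so the mixture has the requested SNR (dB).

    Hypothetical helper for illustration; not the paper's procedure.
    """
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)[: len(clean)]
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")

    p_clean = np.mean(clean ** 2)  # average power of the clean signal
    p_noise = np.mean(noise ** 2)  # average power of the unscaled noise
    if p_noise == 0:
        raise ValueError("noise has zero power")

    # Solve 10*log10(p_clean / (gain^2 * p_noise)) = snr_db for gain.
    gain = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + gain * noise


def measured_snr_db(clean, mixture):
    """SNR in dB of a mixture, given the clean signal it was built from."""
    residual = mixture - clean
    return 10 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))


# Assumed toy data: a 1-second 440 Hz tone at 16 kHz plus white Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000
clean = np.sin(2 * np.pi * 440 * t)
noise = rng.standard_normal(16000)

for snr in (20, 10, 0):  # assumed SNR conditions, in dB
    noisy = mix_at_snr(clean, noise, snr)
    print(snr, measured_snr_db(clean, noisy))
```

In a robustness test of this kind, a classifier would typically be evaluated on the corrupted signals at each SNR condition, so that accuracy can be compared as noise increases.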
