Abstract

In this paper, we propose a method for automatically detecting various types of snore sounds using image classification convolutional neural network (CNN) descriptors extracted from audio file spectrograms. The descriptors, denoted as deep spectrum features, are derived from forwarding spectrograms through very deep, task-independent, pre-trained CNNs. Specifically, activations of fully connected layers from two common image classification CNNs, AlexNet and VGG19, are used as feature vectors. Moreover, we investigate the impact of differing spectrogram colour maps and the two CNN architectures on the performance of the system. Results presented indicate that deep spectrum features extracted from the activations of the second fully connected layer of AlexNet using a viridis colour map are well suited to the task. This feature space, when combined with a support vector classifier, outperforms the more conventional knowledge-based features of 6 373 acoustic functionals used in the INTERSPEECH ComParE 2017 Snoring sub-challenge baseline system. In comparison to the baseline, unweighted average recall is increased from 40.6 % to 44.8 % on the development partition, and from 58.5 % to 67.0 % on the test partition.
