A Light-Weight Artificial Neural Network for Speech Emotion Recognition using Average Values of MFCCs and Their Derivatives

Abstract

Because embedded systems have limited memory and computational power, this work proposes a novel approach to constructing a useful feature set for improving speech emotion recognition (SER) systems. Mel Frequency Cepstral Coefficients (MFCCs) are widely used as features in SER systems. To reduce the number of parameters and the computational burden in SER applications, the average values of the MFCCs, concatenated with their delta and delta-delta coefficients, are used as the input features for an artificial neural network (ANN) classifier. The results demonstrate that the proposed features yield results comparable to state-of-the-art methods, with 87.8% on the EmoDB database and 82.3% on the RAVDESS database. Moreover, the number of parameters used in the classification model is significantly reduced.
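
The feature construction described in the abstract can be illustrated with a minimal sketch. It assumes the MFCCs have already been extracted as a list of per-frame coefficient vectors, and, following the title, it averages the MFCCs and both derivatives over time. The function names (`delta`, `mean_over_time`, `utterance_features`) are hypothetical, and the simple central-difference delta is an assumption: the abstract does not state the delta formula, window length, or number of coefficients the authors use.

```python
from statistics import fmean


def delta(frames):
    """Approximate the time derivative of a sequence of coefficient vectors.

    Assumption: a central difference with edge frames replicated; the
    abstract does not specify the exact delta formula.
    """
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]      # replicate the first frame at the start
        nxt = frames[min(t + 1, n - 1)]   # replicate the last frame at the end
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out


def mean_over_time(frames):
    """Average each coefficient across all frames."""
    return [fmean(column) for column in zip(*frames)]


def utterance_features(mfcc_frames):
    """Build one fixed-length vector: mean MFCC + mean delta + mean delta-delta."""
    d1 = delta(mfcc_frames)   # delta coefficients
    d2 = delta(d1)            # delta-delta coefficients
    return mean_over_time(mfcc_frames) + mean_over_time(d1) + mean_over_time(d2)


# Toy input: 3 frames with 2 coefficients each (real MFCCs would have more).
toy_mfcc = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]]
features = utterance_features(toy_mfcc)
```

Whatever the utterance length, the resulting vector has three times as many entries as there are MFCCs per frame. A classifier fed such a vector needs an input layer of that fixed size only, which is consistent with the reduced parameter count the abstract reports.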
