Publication | Closed Access
Automatic speech emotion recognition using recurrent neural networks with local attention
Citations: 719
References: 9
Year: 2017
Unknown Venue
Engineering, Machine Learning, Spoken Language Processing, Multimodal Sentiment Analysis, Recurrent Neural Network, Social Sciences, Speech Recognition, Natural Language Processing, Automatic Emotion Recognition, Affective Computing, Recurrent Neural Networks, Local Attention, Deep Learning, Speech Analysis, Speech Communication, Multi-speaker Speech Recognition, Speech Features, Speech Processing, Emotion, Emotion Recognition
Automatic emotion recognition from speech is a challenging task that relies heavily on the effectiveness of the speech features used for classification. In this work, we study the use of deep learning to automatically discover emotionally relevant features from speech. We show that a deep recurrent neural network can learn both the short-time frame-level acoustic features that are emotionally relevant and an appropriate temporal aggregation of those features into a compact utterance-level representation. Moreover, we propose a novel strategy for feature pooling over time that uses local attention to focus on specific regions of a speech signal that are more emotionally salient. The proposed solution is evaluated on the IEMOCAP corpus and is shown to provide more accurate predictions than existing emotion recognition algorithms.
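The pooling idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes a common form of attention pooling: each frame-level vector is scored by a dot product with a learned vector, the scores are normalized with a softmax over time, and the utterance-level representation is the weighted sum of the frames. The names `attention_pool`, `mean_pool`, `frames`, and `u` are hypothetical, and the toy numbers stand in for recurrent-network outputs.

```python
import math


def attention_pool(frames, u):
    """Pool frame vectors into one vector using softmax attention weights.

    frames: list of equal-length lists (stand-ins for frame-level RNN outputs).
    u: hypothetical learned scoring vector (an assumption, not from the source).
    Returns (pooled_vector, weights).
    """
    # One scalar relevance score per frame: dot product with u.
    scores = [sum(a * b for a, b in zip(frame, u)) for frame in frames]
    # Softmax over time, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the frames, computed dimension by dimension.
    dim = len(frames[0])
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights


def mean_pool(frames):
    """Plain average over time, for comparison with attention pooling."""
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]


# Toy utterance: three frames with two features each.
frames = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
pooled, weights = attention_pool(frames, u=[1.0, 1.0])
```

With a zero scoring vector every frame gets the same weight, so the sketch reduces to mean pooling. A non-zero `u` shifts weight toward the frames that score highest, which is the sense in which attention focuses on more salient regions.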