Publication | Closed Access
Sentiment-Aware Automatic Speech Recognition Pre-Training for Enhanced Speech Emotion Recognition
Citations: 18
References: 19
Year: 2022
Engineering, Machine Learning, Speech Corpus, Acoustic ASR Model, Spoken Language Processing, Multimodal Sentiment Analysis, Text Mining, Speech Recognition, Natural Language Processing, Data Science, Computational Linguistics, Affective Computing, Acoustic ASR, Language Studies, Speech Emotion Recognition, Speech Analysis, Speech Communication, Multi-speaker Speech Recognition, Speech Processing, Speech Perception, Emotion, Linguistics, Emotion Recognition
We propose a novel multi-task pre-training method for Speech Emotion Recognition (SER). We pre-train the SER model simultaneously on Automatic Speech Recognition (ASR) and sentiment classification tasks to make the acoustic ASR model more "emotion aware". We generate targets for the sentiment classification task using a text-to-sentiment model trained on publicly available data. Finally, we fine-tune the acoustic ASR model on emotion-annotated speech data. We evaluated the proposed approach on the MSP-Podcast dataset, where we achieved the best reported concordance correlation coefficient (CCC) of 0.41 for valence prediction.
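The abstract reports its result as a concordance correlation coefficient. As a rough illustration of that metric, the sketch below computes CCC using its standard definition, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), with population (1/N) statistics. The function name `concordance_cc` and the toy valence values are assumptions made for this example; the authors' actual evaluation code is not shown in the source and may differ in details.

```python
from statistics import fmean


def concordance_cc(pred, gold):
    """Concordance correlation coefficient between two equal-length sequences.

    Uses population (1/N) variance and covariance. Hypothetical helper for
    illustration; not taken from the paper.
    """
    pred = list(pred)
    gold = list(gold)
    if len(pred) != len(gold) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")

    mean_p = fmean(pred)
    mean_g = fmean(gold)
    var_p = fmean([(p - mean_p) ** 2 for p in pred])
    var_g = fmean([(g - mean_g) ** 2 for g in gold])
    cov = fmean([(p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)])

    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0:
        # Both sequences are constant and identical; CCC is undefined here.
        raise ValueError("CCC is undefined for identical constant sequences")
    return 2 * cov / denom


# Toy valence scores (made up for the example).
gold = [1.0, 2.0, 3.0]
perfect = concordance_cc([1.0, 2.0, 3.0], gold)   # exact agreement
shifted = concordance_cc([2.0, 3.0, 4.0], gold)   # same trend, constant offset
reversed_ = concordance_cc([3.0, 2.0, 1.0], gold) # opposite trend
```

Unlike Pearson correlation, CCC penalizes a constant offset between predictions and labels: the shifted predictions above correlate perfectly with the labels yet score below 1.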