Publication | Closed Access
Long-short term memory for emotional recognition with variable length speech
Citations: 13
References: 9
Year: 2018
Unknown Venue
Engineering, Machine Learning, Affective Neuroscience, Spoken Language Processing, Variable Length Speech, Recurrent Neural Network, Social Sciences, Speech Recognition, Data Science, Affective Computing, Memory, Robust Speech Recognition, Voice Recognition, Information Theory, Deep Learning, Speech Waveform, Speech Communication, Speech Technology, Speech Analysis, Speech Processing, Speech Perception, Emotion, Linguistics, Emotion Recognition, CASIA Database
Although many kinds of features have been used for speech emotion recognition, they are severely restricted because features of the same dimension are extracted from speech of different lengths. Therefore, frame-level features that preserve the temporal information in the speech waveform are extracted; their dimension changes dynamically with the length of the original speech. From the perspective of information theory, frame-level features lose less information than fixed-length features and are better suited as input to deep learning models with self-learning ability. A bidirectional long short-term memory (BiLSTM) network is applied as a classifier to process the variable-length features. Experimental results demonstrate that the proposed method significantly outperforms the INTERSPEECH 2010 features on the CASIA database.
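The contrast between frame-level and fixed-length features can be illustrated with a minimal sketch. The abstract does not list the actual frame-level features, so the two per-frame descriptors below (log energy and zero-crossing rate), the frame and hop sizes, and all function names are assumptions chosen for illustration only. The sketch shows that a frame-level sequence grows with utterance length, while utterance-level statistics collapse every utterance to the same dimension, and how variable-length sequences might be zero-padded with their true lengths recorded before being handed to a recurrent classifier.

```python
import math


def frame_signal(samples, frame_len=4, hop=2):
    """Split a 1-D signal into overlapping frames; a trailing partial frame is dropped."""
    if len(samples) < frame_len:
        return []
    n_frames = 1 + (len(samples) - frame_len) // hop
    return [samples[i * hop: i * hop + frame_len] for i in range(n_frames)]


def frame_features(frame):
    """Two toy per-frame descriptors (assumed, not from the paper): log energy and zero-crossing rate."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return [math.log(energy + 1e-10), crossings / (len(frame) - 1)]


def frame_level(samples, frame_len=4, hop=2):
    """Return a T x 2 feature sequence; T grows with the utterance length."""
    return [frame_features(f) for f in frame_signal(samples, frame_len, hop)]


def utterance_level(seq):
    """Collapse a T x D sequence into a fixed 2*D vector (mean and std per dimension)."""
    t, dims = len(seq), len(seq[0])
    means = [sum(row[d] for row in seq) / t for d in range(dims)]
    stds = [math.sqrt(sum((row[d] - means[d]) ** 2 for row in seq) / t)
            for d in range(dims)]
    return means + stds


def pad_batch(seqs):
    """Zero-pad sequences to the longest one and keep the true lengths,
    so that a recurrent model can ignore the padded steps."""
    max_len = max(len(s) for s in seqs)
    dims = len(seqs[0][0])
    padded = [s + [[0.0] * dims for _ in range(max_len - len(s))] for s in seqs]
    return padded, [len(s) for s in seqs]


# Two synthetic "utterances" of different duration (10 and 20 samples).
short = [math.sin(0.9 * n) for n in range(10)]
long = [math.sin(0.9 * n) for n in range(20)]

short_seq = frame_level(short)
long_seq = frame_level(long)
padded, lengths = pad_batch([short_seq, long_seq])

print(len(short_seq), len(long_seq))                                  # frames per utterance
print(len(utterance_level(short_seq)), len(utterance_level(long_seq)))  # fixed dimension
print(lengths)
```

With these settings the short utterance yields 4 frames and the long one 9, while both collapse to the same 4-value utterance-level vector. In a BiLSTM pipeline, the padded batch and the recorded lengths would typically be passed to the framework's sequence-packing utility (for example, `torch.nn.utils.rnn.pack_padded_sequence` in PyTorch) so that padding does not influence the recurrent states.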