Publication | Open Access
A Multi-View Interactive Approach for Multimodal Sarcasm Detection in Social Internet of Things with Knowledge Enhancement
Citations: 19 | References: 35 | Year: 2024
Keywords: Knowledge Enhancement, Engineering, Machine Learning, Multimodal Sarcasm Detection, Education, Multimodal Learning, Communication, Multimodal Sentiment Analysis, Language Processing, Natural Language Processing, Social Media, Data Science, Affective Computing, Multimodal Interaction, Multimodal Processing, Social Internet, Multimodal Signal Processing, Computer Science, Deep Learning, Sarcasm Detection, Social Computing, Social Medium Data, Annotation, Multimodal Analytics
Multimodal sarcasm detection is a developing research field in the social Internet of Things and underpins research in artificial intelligence and human psychology. Sarcastic comments posted on social media often imply people's real attitudes toward the events they are commenting on, reflecting their current emotional and psychological state. However, the limited memory of Internet of Things mobile devices poses challenges for deploying sarcasm detection models, and a large number of parameters also increases a model's inference time. Social networking platforms such as Twitter and WeChat generate large amounts of multimodal data, which provides more comprehensive information than unimodal data. Therefore, when studying sarcasm detection in the social Internet of Things, it is necessary to consider both inter-modal interaction and the number of model parameters. In this paper, we propose a lightweight multimodal interaction model with knowledge enhancement based on deep learning. By integrating visual commonsense knowledge into the sarcasm detection model, we enrich the semantic information of the image and text modal representations. Additionally, we develop a multi-view interaction method that facilitates interaction between modalities from different modal perspectives. The experimental results indicate that the proposed model outperforms the unimodal baselines and achieves performance comparable to multimodal baselines with far fewer parameters.
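The abstract's "multi-view interaction" between image and text modalities can be illustrated with a minimal sketch. This assumes a scaled dot-product cross-attention formulation, which is a common way to realize inter-modal interaction; the paper's actual mechanism may differ, and all function names, dimensions, and data here are illustrative, not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(queries, keys_values):
    """One view of cross-modal interaction: each token of one modality
    (e.g. text) attends over the other modality's tokens (e.g. image regions).
    This is a generic sketch, not the paper's exact formulation."""
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)   # (n_q, n_kv) similarity
    weights = softmax(scores, axis=-1)              # rows sum to 1
    return weights @ keys_values                    # (n_q, d) attended features

rng = np.random.default_rng(0)
text_feats = rng.standard_normal((6, 32))    # 6 text tokens, hypothetical dim 32
image_feats = rng.standard_normal((4, 32))   # 4 image regions, same dim

# Two interaction views (text->image and image->text), then a simple fusion.
text_aware_of_image = cross_modal_attention(text_feats, image_feats)
image_aware_of_text = cross_modal_attention(image_feats, text_feats)
fused = np.concatenate([text_aware_of_image.mean(axis=0),
                        image_aware_of_text.mean(axis=0)])  # (64,) joint vector
```

A classifier head over `fused` would then predict sarcastic vs. non-sarcastic; keeping the projection dimensions small is one way such a model stays lightweight for IoT deployment.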