Applying Machine Learning to Identify Anti-Vaccination Tweets during the COVID-19 Pandemic

Abstract

Anti-vaccination attitudes have been an issue since the development of the first vaccines. The increasing use of social media as a source of health information may contribute to vaccine hesitancy because anti-vaccination content is widely available on social media, including Twitter. Being able to identify anti-vaccination tweets could provide useful information for formulating strategies to reduce anti-vaccination sentiments among different groups. This study aims to evaluate the performance of different natural language processing models in identifying anti-vaccination tweets that were published during the COVID-19 pandemic. We compared the performance of bidirectional encoder representations from transformers (BERT) and bidirectional long short-term memory networks with pre-trained GloVe embeddings (Bi-LSTM) with classic machine learning methods, including support vector machine (SVM) and naïve Bayes (NB). On the test set, the BERT model achieved: accuracy = 91.6%, precision = 93.4%, recall = 97.6%, F1 score = 95.5%, and AUC = 84.7%. The Bi-LSTM model achieved: accuracy = 89.8%, precision = 44.0%, recall = 47.2%, F1 score = 45.5%, and AUC = 85.8%. SVM with a linear kernel achieved: accuracy = 92.3%, precision = 19.5%, recall = 78.6%, F1 score = 31.2%, and AUC = 85.6%. Complement NB achieved: accuracy = 88.8%, precision = 23.0%, recall = 32.8%, F1 score = 27.1%, and AUC = 62.7%. In conclusion, the BERT model outperformed the Bi-LSTM, SVM, and NB models in this task, most clearly on precision, recall, and F1 score. Moreover, the BERT model achieved excellent performance and can be used to identify anti-vaccination tweets in future studies.
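The F1 scores reported in the abstract can be related to the reported precision and recall values, since F1 is the harmonic mean of the two. The following is a minimal sketch, not the authors' evaluation code: the helper name `f1_from_precision_recall` and the `reported` dictionary are illustrative assumptions, and the only inputs are the percentages stated in the abstract.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1 score, i.e. the harmonic mean of precision and recall.

    Both arguments are fractions in [0, 1]. Hypothetical helper for
    illustration; it is not taken from the study's code.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) as stated in the abstract.
reported = {
    "BERT": (93.4, 97.6, 95.5),
    "Bi-LSTM": (44.0, 47.2, 45.5),
    "SVM (linear kernel)": (19.5, 78.6, 31.2),
    "Complement NB": (23.0, 32.8, 27.1),
}

# Recompute F1 (in percent) from the rounded precision and recall values.
recomputed = {
    name: 100 * f1_from_precision_recall(p / 100, r / 100)
    for name, (p, r, _) in reported.items()
}

for name, (p, r, f1_reported) in reported.items():
    print(
        f"{name}: precision={p:.1f}% recall={r:.1f}% "
        f"reported F1={f1_reported:.1f}% recomputed F1={recomputed[name]:.2f}%"
    )
```

Because the inputs are already rounded to one decimal place, the recomputed values can differ slightly from the reported ones. The comparison also shows why accuracy alone is a weak summary here: the SVM has the highest accuracy of the four models but a low F1 score, driven by its low precision.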
