Motives Unmasked (NLP analysis of COVID-19 tweets)

For my undergraduate thesis, I studied COVID-19 mask discourse and what it reveals about political polarization and social media’s role in accelerating it.

To do so, I pulled mask-related tweets from the Twitter API. Conventional sentiment analysis can’t distinguish a tweet that’s negative about masks from one that’s negative about anti-maskers, so I used a BERT-based model for aspect-based sentiment analysis to score sentiment toward “mask” as a target. That split the negative pile into genuinely anti-mask tweets and anti-anti-mask tweets. I then trained a separate Word2Vec model on each group and compared the nearest neighbors of “mask” to surface the themes each side associated with the word.
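The two steps above (splitting the negative pile by sentiment toward the target, then comparing the neighbors of “mask” in each group) can be illustrated with a minimal Python sketch. This is not the thesis code. The tweets and labels are made-up toy data, the function names are hypothetical, and the BERT-based scoring step is not reproduced: its output is assumed to be one label per tweet. Simple co-occurrence counts with cosine similarity stand in for the trained Word2Vec models so that the example runs with only the standard library.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy data: (text, overall sentiment, sentiment toward "mask").
# The labels are hand-assigned here; in the thesis they came from models.
TWEETS = [
    ("the science says a mask does not stop the virus", "negative", "negative"),
    ("no study shows a mask works so the science is unsettled", "negative", "negative"),
    ("refusing to wear a mask is selfish", "negative", "positive"),
    ("selfish people without a mask endanger everyone", "negative", "positive"),
    ("wearing my mask today and feeling fine", "positive", "positive"),
]


def split_negative_pile(tweets):
    """Split overall-negative tweets by their sentiment toward the target.

    Assumption: a negative tweet that is not negative toward "mask"
    is treated as anti-anti-mask.
    """
    anti_mask, anti_anti_mask = [], []
    for text, overall, toward_mask in tweets:
        if overall != "negative":
            continue  # only the negative pile is split
        if toward_mask == "negative":
            anti_mask.append(text)
        else:
            anti_anti_mask.append(text)
    return anti_mask, anti_anti_mask


def cooccurrence_vectors(texts, window=2):
    """Count, for each word, the words appearing within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for text in texts:
        tokens = text.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_neighbors(texts, target="mask", top_n=3):
    """Return the words whose context vectors are most similar to the target's."""
    vectors = cooccurrence_vectors(texts)
    if target not in vectors:
        return []
    scores = [
        (word, cosine(vectors[target], vec))
        for word, vec in vectors.items()
        if word != target
    ]
    # Sort by similarity, breaking ties alphabetically for stable output.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _ in scores[:top_n]]


anti_mask, anti_anti_mask = split_negative_pile(TWEETS)
print("anti-mask neighbors of 'mask':", nearest_neighbors(anti_mask))
print("anti-anti-mask neighbors of 'mask':", nearest_neighbors(anti_anti_mask))
```

On a real corpus, the hand-assigned labels would be replaced by model predictions and the co-occurrence vectors by embeddings trained separately on each group. The comparison itself, ranking each group's vocabulary by similarity to “mask”, would keep the same shape.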

This research thesis was awarded the Hoopes Prize, given annually to the best undergraduate theses and projects at Harvard.

Abstract:

While scientific consensus on the effectiveness of masks was established in the first few months of the COVID-19 pandemic, mask-wearing continued to be a divisive issue in the United States throughout 2020. This study asks: why have face masks been divisive, particularly along partisan lines? Through a novel analysis of Twitter conversation around masks using computational text analysis, I find a mismatch between what pro-mask voices think the other side believes and what that side actually expresses. Pro-mask sentiments condemn ‘anti-maskers’ for being selfish and holding faulty moral reasoning. However, anti-mask sentiments actually express concerns about the science of masks, rather than moral arguments such as holding liberty to a higher ideal than safety. These findings further the understanding of political polarization and its empowerment through social media, and offer guidance for public health messaging.