The coronavirus (COVID-19) pandemic has changed the lives of people around the world, and the arrival of vaccines has drawn both positive and negative reactions. In this article, I will introduce you to a data science project on Covid-19 vaccine sentiment analysis using Python.
Covid-19 Vaccine Sentiment Analysis
Media messages do not always align with science, as misinformation, baseless claims and rumours can spread quickly. This is what we saw with the introduction of the Covid-19 vaccine. In this data science project, we aim to analyze tweets about the Covid-19 vaccine to understand how people feel about it.
Twitter is a microblogging and social networking platform where users post and interact with messages called “tweets”. With more than 166 million daily users, Twitter is a valuable data source for any social media discussion related to national and global events. So, the dataset for the sentiment analysis task of the Covid-19 vaccine was collected from Twitter.
Data Science Project on Covid-19 Vaccine Sentiment Analysis
I will start the task of Covid-19 Vaccine Sentiment analysis by importing all the necessary Python libraries:
In the code below, I’ll be doing some text preprocessing on the text feature of our dataset, which contains the body of each tweet. Our goal is to perform sentiment analysis on clean text data to avoid noise and reading errors:
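The cleaning step described above can be sketched with the standard `re` module. This is a minimal sketch and not necessarily the exact preprocessing used in this project: the function name `clean_tweet` and the choice of what to strip (URLs, HTML tags, @mentions, the `#` symbol) are assumptions. Capitalization and punctuation are deliberately left alone, because VADER uses both as emphasis signals.

```python
import re

# Patterns for common tweet noise (an assumed, illustrative selection)
URL_RE = re.compile(r"https?://\S+|www\.\S+")
HTML_RE = re.compile(r"<[^>]+>")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Remove URLs, HTML tags and @mentions; keep case and punctuation."""
    text = URL_RE.sub(" ", text)
    text = HTML_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", "")  # keep the hashtag word, drop the symbol
    return WHITESPACE_RE.sub(" ", text).strip()


sample = "Got my #vaccine today! Thanks @nhs https://t.co/abc123"
print(clean_tweet(sample))  # prints: Got my vaccine today! Thanks
```

On a pandas DataFrame, a function like this would typically be applied to the tweet text column with `.apply(clean_tweet)`.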
Covid-19 Vaccine VADER Sentiment Analysis
VADER (Valence Aware Dictionary and sEntiment Reasoner) sentiment analysis relies on a lexicon that maps lexical features to emotional intensities called sentiment scores. A text’s sentiment score can be obtained by summing the intensity of each word in the text.
For example, words like “love”, “appreciate” and “happy” all convey a positive feeling. VADER is also smart enough to understand the basic context of such words, for instance recognizing “disliked” as a negative statement. It also takes emphasis from capital letters and punctuation into account, such as “ENJOY”. Now let’s prepare the data for VADER sentiment analysis:
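To make the lexicon idea concrete, here is a toy sketch of dictionary-based scoring. It is not the VADER implementation: the mini-lexicon `TOY_LEXICON`, its valence values, the `CAPS_BOOST` constant and the function `toy_sentiment` are all hypothetical and chosen only for illustration. It sums per-word intensities, adds extra weight to fully capitalized words, and squashes the total into the range -1 to 1.

```python
import math

# Hypothetical mini-lexicon; these valences are made up for illustration
TOY_LEXICON = {
    "love": 3.0,
    "happy": 2.5,
    "enjoy": 2.0,
    "disliked": -1.5,
    "awful": -3.0,
}
CAPS_BOOST = 0.7  # illustrative extra intensity for ALL-CAPS words


def toy_sentiment(text: str) -> float:
    """Sum word valences and squash the total into the open range (-1, 1)."""
    total = 0.0
    for token in text.split():
        word = token.strip(".,!?;:\"'")
        valence = TOY_LEXICON.get(word.lower())
        if valence is None:
            continue  # word carries no sentiment in the lexicon
        if word.isupper() and len(word) > 1:
            # Capitalization pushes the score further in its own direction
            valence += math.copysign(CAPS_BOOST, valence)
        total += valence
    return total / math.sqrt(total * total + 15)


print(toy_sentiment("I enjoy this") < toy_sentiment("I ENJOY this"))  # prints: True
print(toy_sentiment("I disliked it") < 0)  # prints: True
```

In a real project you would rely on an existing VADER implementation instead, for example the `SentimentIntensityAnalyzer` class from the `vaderSentiment` package, whose `polarity_scores` method returns negative, neutral, positive and compound scores for a text.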
Exploratory Data Analysis
You can observe that the sentiment scores appear roughly normally distributed; the negative and positive distributions are very similar, suggesting that there may be no significant difference in the strength of the positive and negative feelings in our data.
It is also clear that the dominant sentiment is neutral; oddly enough, most tweets do not lean clearly toward either positive or negative feelings.
Now Let’s Analyze Sentiments with Python
Now let’s start by finding the cut-off values for the most negative sentiments and the most positive sentiments:
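One common way to define such cut-offs is with percentiles of the sentiment scores. The sketch below assumes the 5th and 95th percentiles as thresholds; both the percentile choice and the function name `sentiment_cutoffs` are assumptions for illustration, not necessarily what this project uses.

```python
import statistics


def sentiment_cutoffs(scores, n=20):
    """Return the (low, high) cut points of the scores.

    With n=20 the data is split into 20 equal-probability groups, so the
    first cut point is the 5th percentile and the last is the 95th.
    """
    cuts = statistics.quantiles(scores, n=n, method="inclusive")
    return cuts[0], cuts[-1]


# Illustrative scores only; real values would come from the sentiment columns
scores = list(range(101))
low, high = sentiment_cutoffs(scores)
most_negative = [s for s in scores if s <= low]
most_positive = [s for s in scores if s >= high]
print(low, high)  # prints: 5.0 95.0
```

Tweets at or beyond these thresholds can then be treated as the most negative and the most positive ones for the plots that follow.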
Now let’s visualize the most negative and the most positive sentiments:
Finally, let’s have a look at the top 10 most negative and most positive sentiments:
Now let’s have a look at the correlation between the sentiment of the tweets and the other numeric features in the dataset:
ex.imshow(f_data[['user_followers','user_friends','user_favourites','user_verified','Positive Sentiment', 'Neutral Sentiment','Negative Sentiment']].corr('spearman'),title='Spearman Correlation')
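The `'spearman'` argument in the call above requests rank correlation: each column is replaced by the ranks of its values, and the ordinary Pearson correlation is computed on those ranks. The sketch below shows the idea in plain Python; the helper names `average_ranks` and `spearman` are hypothetical, and it assumes neither input is constant (otherwise the denominator would be zero).

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# A monotonic but non-linear relationship still scores close to 1
print(round(spearman([1, 2, 3, 4], [10, 100, 1000, 10000]), 6))  # prints: 1.0
```

Because it works on ranks, Spearman correlation is less sensitive to heavily skewed columns such as follower counts than Pearson correlation on the raw values.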
Unfortunately, we don’t see any significant correlation between the sentiment of a tweet and any other numeric feature in our dataset, especially those that describe users. I hope you liked this article on a data science project on Covid-19 vaccine sentiment analysis with Python. Feel free to ask your valuable questions in the comments section below.