Squid Game Sentiment Analysis using Python

Squid Game is currently one of the most popular shows on Netflix. It is trending so strongly that even people who have never watched a web series before are watching it. One reason is the volume of reviews and opinions that viewers share on social media. If you want to learn how to analyze what people feel about Squid Game, this article is for you. In it, I will take you through the task of Squid Game sentiment analysis using Python.


The dataset I am using for this task was downloaded from Kaggle. It was originally collected from Twitter while people were actively sharing their opinions about Squid Game. Let's start by importing the necessary Python libraries and the dataset:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator

data = pd.read_csv("squid_game.csv")
print(data.head())
                    user_name  user_location  ...               source is_retweet
0   the _ûndër-ratèd niggáh👊🏾            NaN  ...  Twitter for Android      False
1  Best uncle on planet earth            NaN  ...  Twitter for Android      False
2                      marcie            NaN  ...      Twitter Web App      False
3                    YoMo.Mdp  Any pronouns   ...      Twitter Web App      False
4             Laura Reactions         France  ...      Twitter Web App      False

[5 rows x 12 columns]

At first glance, I noticed null values in the “user_location” column. A user's location does not affect the sentiment analysis task, so I will drop this column:

data = data.drop(columns="user_location")

Now let's check whether the other columns contain any null values:

print(data.isnull().sum())
user_name              4
user_description    5211
user_created           0
user_followers         0
user_friends           0
user_favourites        0
user_verified          0
date                   0
text                   0
source                 0
is_retweet             0
dtype: int64

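The “user_description” column also contains null values, and it does not affect the sentiment analysis task either. So I'm going to drop this column as well, and then remove the few remaining rows that have a null value (the four rows with a missing “user_name”):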

data = data.drop(columns="user_description")
data = data.dropna()
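
To see what these two steps do, here is a minimal sketch on a small made-up DataFrame. The rows below are invented for illustration and are not from the Kaggle dataset. Dropping a column removes the nulls it holds without losing any rows, while dropna() removes every remaining row that still has a missing value:

```python
import pandas as pd

# Toy data, invented for illustration only (not taken from the real dataset).
toy = pd.DataFrame({
    "user_name": ["a", None, "c"],
    "user_description": [None, "fan", None],
    "text": ["great show", "too violent", "just started"],
})

# Dropping the column removes its nulls but keeps all three rows.
toy = toy.drop(columns="user_description")
rows_after_column_drop = len(toy)

# dropna() then removes the one row whose user_name is missing.
toy = toy.dropna()
rows_after_dropna = len(toy)

print(rows_after_column_drop, rows_after_dropna)
```

Dropping the mostly empty column first matters: calling dropna() before that would also discard every tweet whose author left the description blank.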

The “text” column contains the opinions of Twitter users about Squid Game. Because these are social media posts, the column needs to be cleaned before any analysis. So let's prepare it for the task of sentiment analysis:

import re
import string

import nltk
from nltk.corpus import stopwords

nltk.download('stopwords')
stemmer = nltk.SnowballStemmer("english")
stopword = set(stopwords.words('english'))

def clean(text):
    # Lowercase, then strip bracketed text, links, HTML tags, punctuation,
    # line breaks, and any token that contains a digit
    text = str(text).lower()
    text = re.sub(r'\[.*?\]', '', text)
    text = re.sub(r'https?://\S+|www\.\S+', '', text)
    text = re.sub(r'<.*?>+', '', text)
    text = re.sub('[%s]' % re.escape(string.punctuation), '', text)
    text = re.sub(r'\n', '', text)
    text = re.sub(r'\w*\d\w*', '', text)
    # Drop English stopwords, then stem the remaining words
    text = [word for word in text.split(' ') if word not in stopword]
    text = " ".join(text)
    text = [stemmer.stem(word) for word in text.split(' ')]
    text = " ".join(text)
    return text

data["text"] = data["text"].apply(clean)
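
To make the effect of each regex step easier to follow, here is a small sketch that applies the same substitutions to one made-up tweet. It is a simplified stand-in, not the function above: it skips stemming and uses a tiny hypothetical stopword list (TOY_STOPWORDS) so that it runs without the NLTK download. The sample tweet and the function name clean_without_nltk are also invented for illustration:

```python
import re
import string

# Hypothetical mini stopword list, used here only to avoid the NLTK download.
TOY_STOPWORDS = {"the", "is", "a", "so", "at"}

def clean_without_nltk(text):
    """Apply the same regex steps as the article's clean(), minus stemming."""
    text = str(text).lower()
    text = re.sub(r"\[.*?\]", "", text)                # text in square brackets
    text = re.sub(r"https?://\S+|www\.\S+", "", text)  # links
    text = re.sub(r"<.*?>+", "", text)                 # HTML-like tags
    text = re.sub("[%s]" % re.escape(string.punctuation), "", text)
    text = re.sub(r"\n", "", text)                     # line breaks
    text = re.sub(r"\w*\d\w*", "", text)               # tokens containing digits
    words = [w for w in text.split(" ") if w not in TOY_STOPWORDS]
    return " ".join(words)

# Made-up tweet, for illustration only.
sample = "The #SquidGame finale is SO good!! Watch at https://example.com [spoilers] 456"
cleaned = clean_without_nltk(sample)
print(cleaned.split())
```

Because the text is split on single spaces, removed tokens can leave extra spaces behind. That is why double spaces show up in some of the cleaned tweets printed later in this article.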

Now let’s take a look at the most used words in the Squid Game opinions using a word cloud. A word cloud is a data visualization tool that displays the most used words in a larger size. Here is how you can visualize the word cloud of the text column:

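# Combine all cleaned tweets into one string
text = " ".join(i for i in data.text)
# Use a separate name so the nltk "stopwords" import is not overwritten
wordcloud_stopwords = set(STOPWORDS)
wordcloud = WordCloud(stopwords=wordcloud_stopwords, background_color="white").generate(text)
plt.figure(figsize=(15, 10))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()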
[Figure: Squid Game Sentiment Analysis: word cloud]
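
A word cloud is essentially a picture of word frequencies. As a rough sketch of the counting behind it, the snippet below tallies words with Python's collections.Counter. The three tweets are invented for illustration, and this is only an approximation of the idea, not how the wordcloud library is implemented:

```python
from collections import Counter

# Hypothetical cleaned tweets, invented for illustration.
cleaned_tweets = [
    "squidgam time",
    "marbl episod squidgam ruin",
    "squidgam game player",
]

# Join the tweets and count every whitespace-separated token.
word_counts = Counter(" ".join(cleaned_tweets).split())

# The most frequent words are the ones a word cloud draws largest.
top_words = word_counts.most_common(3)
print(top_words)
```

Printing the top counts like this is a quick way to check the numbers behind the picture.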

Now let's move on to the sentiment analysis itself. Here I will add three more columns to the dataset, Positive, Negative, and Neutral, by calculating the sentiment scores of the text column:

nltk.download('vader_lexicon')
sentiments = SentimentIntensityAnalyzer()
data["Positive"] = [sentiments.polarity_scores(i)["pos"] for i in data["text"]]
data["Negative"] = [sentiments.polarity_scores(i)["neg"] for i in data["text"]]
data["Neutral"] = [sentiments.polarity_scores(i)["neu"] for i in data["text"]]
data = data[["text", "Positive", "Negative", "Neutral"]]
print(data.head())
                                                text  Positive  Negative  Neutral
0  life hit time poverti strike yougong yoo  let ...     0.173     0.108    0.719
1                    marbl episod squidgam  ruin 😭😭😭     0.000     0.487    0.513
2                                      squidgam time     0.000     0.000    1.000
3  blood  slideim join squidgam thing im alreadi ...     0.142     0.277    0.581
4  two first game player kill mask guy  bloodi ni...     0.000     0.461    0.539
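
The three scores above are the proportions of each text that VADER rates as positive, negative, or neutral. VADER's polarity_scores() also returns a "compound" value between -1 and 1, which this article does not use. As a hedged sketch, here is how a per-tweet label could be derived from it, assuming the commonly cited thresholds of 0.05 and -0.05. The function name label_from_compound and the sample scores are hypothetical:

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label.

    The 0.05 threshold is an assumed convention, not part of this article.
    """
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"

# Hypothetical compound scores, invented for illustration.
sample_scores = [0.62, -0.48, 0.0, 0.03]
labels = [label_from_compound(score) for score in sample_scores]
print(labels)
```

Labelling each tweet this way would let you count tweets per class instead of summing proportions, which can give a different picture of the overall mood.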

Now let's find out which sentiment dominates the opinions about Squid Game:

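# Total of each sentiment score across all tweets
x = sum(data["Positive"])
y = sum(data["Negative"])
z = sum(data["Neutral"])

def sentiment_score(a, b, c):
    # Print the label whose total is strictly the largest; ties fall to Neutral
    if (a > b) and (a > c):
        print("Positive 😊 ")
    elif (b > a) and (b > c):
        print("Negative 😠 ")
    else:
        print("Neutral 🙂 ")

sentiment_score(x, y, z)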
Neutral 🙂
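
If you want to reuse the result rather than only print it, one option is a variant that returns the label. The sketch below keeps the same comparison logic. The name sentiment_label is hypothetical, and the three totals are the ones this article reports, rounded to three decimals:

```python
def sentiment_label(positive, negative, neutral):
    """Return the label with the strictly largest total; ties fall to Neutral."""
    if positive > negative and positive > neutral:
        return "Positive"
    if negative > positive and negative > neutral:
        return "Negative"
    return "Neutral"

# Totals reported in this article, rounded to three decimals.
overall = sentiment_label(10604.559, 5171.334, 64233.118)
print(overall)
```

Returning a string makes the function easy to test and to apply to subsets of the data, for example tweets from a single day.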

So most of the opinions are Neutral. Now let's have a look at the total of each sentiment score before drawing any conclusion:

print("Positive: ", x)
print("Negative: ", y)
print("Neutral: ", z)
Positive:  10604.55899999976
Negative:  5171.334000000031
Neutral:  64233.11800000302

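The Neutral total is by far the largest, which is common for short tweets. Among the remaining scores, the Positive total (about 10,605) is roughly twice the Negative total (about 5,171), so we can say that the opinions that do express a sentiment about Squid Game lean positive.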
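
Raw totals are easier to compare as shares. The following sketch converts the three totals printed above (rounded to three decimals) into proportions and also computes the positive share once the neutral mass is set aside. The variable names are illustrative only:

```python
# Totals printed above, rounded to three decimals.
totals = {"Positive": 10604.559, "Negative": 5171.334, "Neutral": 64233.118}

# Share of each sentiment in the overall total.
grand_total = sum(totals.values())
shares = {label: value / grand_total for label, value in totals.items()}
for label, share in shares.items():
    print(f"{label}: {share:.1%}")

# Positive share among the non-neutral (polar) scores only.
polar_total = totals["Positive"] + totals["Negative"]
positive_share_of_polar = totals["Positive"] / polar_total
print(f"Positive share of polar scores: {positive_share_of_polar:.1%}")
```

Looking at both numbers together guards against over-reading the result: most of the scored text is neutral, and the positive lead applies only to the smaller part that carries sentiment.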

Summary

Squid Game is currently one of the most popular shows on Netflix, and one reason is the reviews and opinions that viewers share on social media. I hope you liked this article on Squid Game sentiment analysis using Python. Feel free to ask your questions in the comments section below.

Aman Kharwal

I'm a writer and data scientist on a mission to educate others about the incredible power of data 📈.
