WhatsApp Chat Analysis with Python

WhatsApp is one of the most widely used messaging applications, with more than 2 billion users worldwide. More than 65 billion messages are reportedly sent on WhatsApp every day, so your own chats with a friend, a customer, or a group of people are a rich source of data to analyze. In this article, I will take you through the task of WhatsApp Chat Analysis with Python.

WhatsApp Chat Analysis

You can use your WhatsApp data for many data science tasks, such as sentiment analysis, keyword extraction, named entity recognition, text analysis, and several other natural language processing tasks. What you can learn also depends on whose chat you are analyzing: conversations with customers, for example, can contain information that helps you solve business problems.

Before starting the task of WhatsApp chat analysis with Python, you need to export your WhatsApp data from your smartphone, which is very easy. To export a chat, open any chat with a person or a group and follow the steps below:

  1. If you have an iPhone, tap on the contact name or the group name. If you have an Android smartphone, tap on the three dots at the top.
  2. Then scroll to the bottom and tap on Export Chat.
  3. If it asks whether you want your chats with or without media, select without media for simplicity.
  4. Then email this chat to yourself and download it to your system.

So this is how you can easily get your WhatsApp chats with any person or a group for the task of WhatsApp chat analysis. In the section below, I will take you through WhatsApp chat analysis with Python.

WhatsApp Chat Analysis with Python

I hope you now understand how to get your WhatsApp data for the task of WhatsApp chat analysis with Python. Now let’s start by importing the necessary Python libraries:

The dataset we are using here requires a lot of preparation, so I suggest you take a look at your own data before starting this WhatsApp chat analysis task. As I have already explored the dataset, I’ll start by writing a few Python functions to prepare the data before importing it:
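The preparation functions themselves are not shown here, so the following is only a minimal sketch of what such helpers might look like. It assumes an Android-style export where each message starts with a line such as "18/06/2021, 22:47 - Aman: Hello"; the names LINE_START, starts_with_date_and_time and split_line are hypothetical, and the pattern will need adjusting for other phones and locales.

```python
import re

# Assumed line format: "18/06/2021, 22:47 - Aman: Hello"
# (date, time, " - ", author, ": ", message). This is an assumption about
# the export, not a fixed WhatsApp specification.
LINE_START = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{2,4}), (\d{1,2}:\d{2}(?:\s?[APap][Mm])?) - (.*)$"
)

def starts_with_date_and_time(line):
    """Return True if the line begins a new message (has a date/time prefix)."""
    return LINE_START.match(line) is not None

def split_line(line):
    """Split one export line into (date, time, author, message).

    The author is None for system notices that have no "Author: " part.
    A notice that happens to contain ": " would be mis-split by this sketch.
    """
    match = LINE_START.match(line)
    if match is None:
        raise ValueError("line does not start with a date and time")
    date, time, rest = match.groups()
    author, sep, message = rest.partition(": ")
    if not sep:
        return date, time, None, rest
    return date, time, author, message

print(split_line("18/06/2021, 22:47 - Aman: Hello"))
```

Lines that do not start with a date and time are usually continuations of the previous message, which is why the check is kept as a separate function.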

Now let’s import the data and prepare it so that we can use it in a pandas DataFrame:
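As a rough sketch of this step, the snippet below reads export lines into a DataFrame. The "Message" column name matches the code used later in this article; the "Date", "Time" and "Author" columns, the load_chat function, and the line format are assumptions made for illustration.

```python
import io
import re
import pandas as pd

# Assumed Android-style line format: "18/06/2021, 22:47 - Aman: Hello"
LINE_START = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2,4}), (\d{1,2}:\d{2}) - (.*)$")

def load_chat(lines):
    """Build a DataFrame with one row per message from an iterable of lines."""
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_START.match(line)
        if match:
            date, time, rest = match.groups()
            author, sep, message = rest.partition(": ")
            if not sep:  # system notice without an author
                author, message = None, rest
            records.append(
                {"Date": date, "Time": time, "Author": author, "Message": message}
            )
        elif records:
            # A line without a date prefix continues the previous message.
            records[-1]["Message"] += "\n" + line
    df = pd.DataFrame(records, columns=["Date", "Time", "Author", "Message"])
    # Drop system notices, which have no author.
    return df.dropna(subset=["Author"]).reset_index(drop=True)

# A small in-memory sample stands in for the exported text file.
sample = io.StringIO(
    "18/06/2021, 22:47 - Messages and calls are end-to-end encrypted.\n"
    "18/06/2021, 22:48 - Aman: Hi\n"
    "18/06/2021, 22:49 - Sapna: Hello\n"
    "how are you?\n"
)
df = load_chat(sample)
print(df)
```

With a real export you would pass an open file instead of the sample, for example the result of open("chat.txt", encoding="utf-8"), where chat.txt is a placeholder file name.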

Our dataset is completely ready now for the task of WhatsApp chat analysis with Python. Now let’s have a look at the last 20 messages and some other insights from the data:

[Figure: the last 20 messages of the chat]

Now let’s have a look at the total number of messages in this WhatsApp chat:

total_messages = df.shape[0]
print(total_messages)
1288

Now let’s have a look at the total number of media messages present in this chat:

media_messages = df[df["Message"]=='<Media omitted>'].shape[0]
print(media_messages)
11

Now let’s extract the emojis from the messages and have a look at the number of emojis present in this chat:
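One possible way to do this with only the standard library is sketched below. The Unicode ranges in EMOJI_PATTERN are an approximation rather than a complete list of emoji, and the function name extract_emojis and the sample messages are made up for illustration. Each code point is counted separately, so emoji built from several code points (skin tones, joined sequences) are counted more than once.

```python
import re

# Approximate emoji ranges; this is NOT a complete list of Unicode emoji.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\U0001F1E6-\U0001F1FF"  # regional indicator letters used in flags
    "]"
)

def extract_emojis(text):
    """Return a list of the emoji characters found in the text."""
    return EMOJI_PATTERN.findall(text)

# Hypothetical sample messages (escapes: U+1F600 grinning face, U+2764 heart).
messages = ["Good morning \U0001F600\U0001F600", "See you \u2764", "<Media omitted>"]
total_emojis = sum(len(extract_emojis(m)) for m in messages)
print(total_emojis)
```

With a DataFrame, the same function could be applied to the "Message" column, for example with df["Message"].apply(extract_emojis).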

367

Now let’s extract the URLs present in this chat and have a look at the final insights:
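A simple way to count links is a regular expression that looks for http or https addresses. This is a minimal sketch: URL_PATTERN and count_links are illustrative names, and the pattern will miss links written without a scheme, such as "www.example.com".

```python
import re

# Matches "http://" or "https://" followed by any run of non-space characters.
URL_PATTERN = re.compile(r"https?://\S+")

def count_links(messages):
    """Count the URLs found across an iterable of message strings."""
    return sum(len(URL_PATTERN.findall(message)) for message in messages)

# Hypothetical sample messages.
messages = [
    "Check this out https://example.com/page",
    "No link here",
    "Two links: http://a.example and https://b.example",
]
print(count_links(messages))
```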

Chats between Aman and Sapna
Total Messages:  1288
Number of Media Shared:  11
Number of Emojis Shared 367
Number of Links Shared 1

Now let’s prepare this data to analyze the messages sent by each person in more detail:
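A sketch of how such per-person statistics could be computed with pandas is shown below. It assumes an "Author" column next to the "Message" column, and author_stats is a hypothetical helper. The totals reported in this article (687 + 590 = 1277 = 1288 - 11) suggest that media placeholders are left out of the per-person message counts, so the sketch does the same. Emoji counts are omitted to keep it short.

```python
import pandas as pd

MEDIA = "<Media omitted>"
URL_REGEX = r"https?://\S+"

def author_stats(df):
    """Return one row of summary statistics per author."""
    is_media = df["Message"] == MEDIA
    text = df[~is_media]  # text messages only, media placeholders removed
    word_counts = text["Message"].str.split().str.len()
    return pd.DataFrame({
        "Messages Sent": text.groupby("Author").size(),
        "Average Words per message": word_counts.groupby(text["Author"]).mean(),
        "Media Messages Sent": is_media.groupby(df["Author"]).sum(),
        "Links Sent": text["Message"].str.count(URL_REGEX).groupby(text["Author"]).sum(),
    })

# Hypothetical sample data.
df = pd.DataFrame({
    "Author": ["Aman", "Sapna", "Aman", "Sapna"],
    "Message": ["Hi there", MEDIA, "See https://example.com now", "Hello"],
})
stats = author_stats(df)
print(stats)
```

An author who only sent media would get missing values in the text-based columns, which a real analysis may want to fill with zeros.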

Stats of Aman Kharwal -
Messages Sent 687
Average Words per message 6.165938864628821
Media Messages Sent 9
Emojis Sent 228
Links Sent 1

Stats of Sapna -
Messages Sent 590
Average Words per message 6.3830508474576275
Media Messages Sent 2
Emojis Sent 139
Links Sent 0

Now let’s prepare a visualization of the emojis present in the chat and the types of emojis sent between the two people. It can help in understanding the relationship between them:
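Whatever chart is used, it needs the frequency of each emoji as input. The sketch below shows one way to compute those counts with the standard library; the emoji ranges are approximate, and emoji_counts and the sample messages are illustrative only. The resulting pairs can then be passed to any plotting library, for example as labels and values of a pie chart.

```python
import re
from collections import Counter

# Approximate emoji ranges; this is NOT a complete list of Unicode emoji.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\U0001F1E6-\U0001F1FF]")

def emoji_counts(messages):
    """Return (emoji, count) pairs sorted from most to least frequent."""
    counter = Counter()
    for message in messages:
        counter.update(EMOJI_PATTERN.findall(message))
    return counter.most_common()

# Hypothetical sample (escapes: U+1F602 face with tears of joy, U+2764 heart).
messages = ["\U0001F602\U0001F602 ok", "thanks \u2764", "\U0001F602"]
counts = emoji_counts(messages)
print(counts)
```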

[Figure: emojis shared in a WhatsApp chat]

Now let’s have a look at the most used words in this WhatsApp chat by visualizing a word cloud:
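A word cloud draws each word at a size based on how often it appears, so the underlying data is a table of word frequencies. The sketch below computes such a table with the standard library. The stop word list is deliberately tiny and only illustrative, and word_frequencies is a hypothetical helper name.

```python
import re
from collections import Counter

# A very small illustrative stop word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "is", "to", "and", "i", "you"}

def word_frequencies(messages, stopwords=STOPWORDS):
    """Count lowercase words across messages, skipping media placeholders."""
    counter = Counter()
    for message in messages:
        if message == "<Media omitted>":
            continue
        words = re.findall(r"[a-z']+", message.lower())
        counter.update(word for word in words if word not in stopwords)
    return counter

# Hypothetical sample messages.
messages = ["I love Python", "Python is the best", "<Media omitted>"]
frequencies = word_frequencies(messages)
print(frequencies.most_common(3))
```

If the third-party wordcloud package is installed, a dictionary like this can be passed to WordCloud().generate_from_frequencies(frequencies) to draw the cloud. Filtering the messages by author first gives one table per person.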

[Figure: word cloud of the chat]

Now let’s have a look at the most used words by each person by visualizing two different word clouds:

Author name Aman Kharwal
[Figure: word cloud of Aman Kharwal's messages]
Author name Sapna
[Figure: word cloud of Sapna's messages]

Summary

So this is how we can easily analyze any WhatsApp chat between you and your friend, customer, or even a group of people. You can further use this data for many other tasks of natural language processing. I hope you liked this article on the task of WhatsApp chat analysis with Python. Feel free to ask your valuable questions in the comments section below.

Aman Kharwal

I'm a writer and data scientist on a mission to educate others about the incredible power of data📈.


10 Comments

  1. Your work is really amazing, and very helpful for beginners. I am following these projects for my learning and i can say that this stuff here is very helpful for people like me. I am from other domain background with “0” coding experience. I am trying to learn python to become a data analyst/scientist, and found these projects very helpful for my learning.

    Keep up the good work, so that people like me can be benefitted in their learning journey.

    Thanks a lot.

    I hope to share my success story soon as a data analyst.

  2. Really can not thank you enough Aman, I have started my Data Science and ML journey from non-coding background and trying to learn from your projects and practice them, these are really helpful. I am not sure why I am not being able to find any emojis from any chat!

    • I think there are some changes in the regular expression module that I have used here. I will explore the changes and then I will write a new article about this project.
