Text emotions classification is the task of assigning an emotion label to a piece of text based on its wording and context. One real-world example is the iPhone keyboard, which recommends the most relevant emoji for the text being typed. If you want to learn how to classify the emotions of a text, this article is for you. In this article, I will take you through the task of text emotions classification with Machine Learning using Python.
Text Emotions Classification
Text emotions classification is a natural language processing problem, and more specifically a text classification problem. Here we need to train a text classification model to classify the emotion of a text.
To solve this problem, we need labelled data of texts and their emotions. I found an ideal dataset to solve this problem on Kaggle. You can download the dataset from here.
In the section below, I’ll take you through how to train a text classification model for the task of Text Emotions Classification using Machine Learning and the Python programming language.
Text Emotions Classification using Python
I’ll start by importing the necessary Python libraries and the dataset:
    import pandas as pd
    import numpy as np
    import keras
    import tensorflow
    from keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences
    from sklearn.preprocessing import LabelEncoder
    from sklearn.model_selection import train_test_split
    from keras.models import Sequential
    from keras.layers import Embedding, Flatten, Dense

    # Each line of train.txt holds a text and its emotion, separated by ';'.
    # Note: if the file has no header row, read_csv uses its first line as
    # column names; passing header=None would keep that line as data.
    data = pd.read_csv("train.txt", sep=';')
    data.columns = ["Text", "Emotions"]
    print(data.head())
                                                    Text Emotions
    0  i can go from feeling so hopeless to so damned...  sadness
    1   im grabbing a minute to post i feel greedy wrong    anger
    2  i am ever feeling nostalgic about the fireplac...     love
    3                               i am feeling grouchy    anger
    4  ive been feeling a little burdened lately wasn...  sadness
As this is a problem of natural language processing, I’ll start by tokenizing the data:
    texts = data["Text"].tolist()
    labels = data["Emotions"].tolist()

    # Tokenize the text data
    tokenizer = Tokenizer()
    tokenizer.fit_on_texts(texts)
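To make the tokenization step more concrete, here is a minimal plain-Python sketch of the idea: build a word-to-index vocabulary ordered by frequency, then replace words with their indices. It is a simplification and not the Keras implementation. The helper names `build_word_index` and `to_sequences` and the sample sentences are hypothetical, and unlike the Keras `Tokenizer` this sketch does not strip punctuation and breaks frequency ties alphabetically. As in Keras, index 0 is left unused so it can serve as the padding value later.

```python
from collections import Counter

def build_word_index(texts):
    """Map each word to an integer index, most frequent word first (index 1)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Sort by descending count; ties are broken alphabetically (an assumption
    # of this sketch, chosen only to keep the result deterministic)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {word: index for index, (word, _) in enumerate(ordered, start=1)}

def to_sequences(texts, word_index):
    """Replace each known word with its index; unknown words are skipped."""
    return [[word_index[w] for w in text.lower().split() if w in word_index]
            for text in texts]

# Hypothetical sample sentences in the style of the dataset
sample_texts = ["i am feeling grouchy", "i am feeling happy today"]
word_index = build_word_index(sample_texts)
sample_sequences = to_sequences(sample_texts, word_index)
print(word_index)
print(sample_sequences)
```

Words that were not seen while building the vocabulary are simply dropped, which is also why a misspelled word in a new input contributes nothing to the prediction.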
Now we need to pad the sequences to the same length to feed them into a neural network. Here’s how we can pad the sequences of the texts to have the same length:
    sequences = tokenizer.texts_to_sequences(texts)
    max_length = max(len(seq) for seq in sequences)
    padded_sequences = pad_sequences(sequences, maxlen=max_length)
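The effect of padding can be illustrated with a small plain-Python sketch. By default, `pad_sequences` adds zeros at the start of short sequences and, when a sequence is longer than `maxlen`, keeps only its last `maxlen` items. The helper `pad_left` below is a hypothetical stand-in that mimics this default behaviour on lists; it is not the library function and assumes `maxlen` is at least 1.

```python
def pad_left(sequences, maxlen, value=0):
    """Left-pad (or left-truncate) every sequence to exactly maxlen items."""
    padded = []
    for seq in sequences:
        seq = seq[-maxlen:]  # keep the last maxlen items if the sequence is too long
        padded.append([value] * (maxlen - len(seq)) + seq)
    return padded

# Hypothetical token sequences of different lengths
example = [[3, 1], [5, 6, 7, 8], [2, 4, 9]]
print(pad_left(example, maxlen=3))
```

Every row now has the same length, which is what allows the sequences to be stacked into a single array for the network.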
Now I’ll use scikit-learn’s LabelEncoder to convert the classes from strings to a numerical representation:
    # Encode the string labels to integers
    label_encoder = LabelEncoder()
    labels = label_encoder.fit_transform(labels)
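As a rough illustration of what label encoding does, the sketch below maps each distinct label to an integer according to its position in the sorted list of classes, which is also how `LabelEncoder` orders its classes. The function name `fit_label_mapping` is hypothetical, and the sample labels are taken from the first rows of the data shown earlier.

```python
def fit_label_mapping(labels):
    """Return the sorted class names and a name -> integer id mapping."""
    classes = sorted(set(labels))
    return classes, {label: index for index, label in enumerate(classes)}

sample_labels = ["sadness", "anger", "love", "anger", "sadness"]
classes, label_to_id = fit_label_mapping(sample_labels)

encoded = [label_to_id[label] for label in sample_labels]  # strings -> integers
decoded = [classes[index] for index in encoded]            # integers -> strings
print(classes)
print(encoded)
```

Keeping the class list around matters because the same mapping is needed in reverse to turn a predicted integer back into an emotion name.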
We are now going to one-hot encode the labels. One-hot encoding transforms categorical labels into a binary representation where each label is a vector of all zeros except for a single 1 at the index of its class. This is necessary here because the categorical cross-entropy loss used to train the model below expects the targets in this format. So here is how we can one-hot encode the labels:
    # One-hot encode the labels
    one_hot_labels = keras.utils.to_categorical(labels)
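The transformation itself is simple enough to sketch in plain Python. The hypothetical helper `one_hot` below turns integer class ids into vectors with a single 1 and, like `to_categorical`, infers the number of classes from the largest id when it is not given. It is meant only to show the shape of the result, not to replace the Keras utility.

```python
def one_hot(label_ids, num_classes=None):
    """Turn integer class ids into one-hot vectors (lists of floats)."""
    if num_classes is None:
        num_classes = max(label_ids) + 1  # infer the class count from the largest id
    return [[1.0 if index == label else 0.0 for index in range(num_classes)]
            for label in label_ids]

# Hypothetical encoded labels for three classes
print(one_hot([2, 0, 1]))
```

Each row has as many entries as there are classes, which is why the output layer of the model uses the length of one of these vectors as its number of units.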
Text Emotions Classification Model
Now we will split the data into training and test sets:
    # Split the data into training and testing sets
    xtrain, xtest, ytrain, ytest = train_test_split(padded_sequences,
                                                    one_hot_labels,
                                                    test_size=0.2)
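The split above is random, so the exact rows that end up in each set change from run to run unless a seed is fixed (`train_test_split` accepts a `random_state` argument for this). The sketch below shows the underlying idea in plain Python: shuffle the row indices with a fixed seed, then cut them into a test part and a training part while keeping features and targets aligned. The function `shuffle_split`, its default seed, and the toy data are assumptions made for illustration, and the rounding of the test size may differ from scikit-learn's.

```python
import random

def shuffle_split(features, targets, test_size=0.2, seed=42):
    """Shuffle row indices with a fixed seed and split them into train/test."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(indices) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(rows, idx):
        return [rows[i] for i in idx]

    return (pick(features, train_idx), pick(features, test_idx),
            pick(targets, train_idx), pick(targets, test_idx))

# Toy data: each target is its feature times 10, so alignment is easy to check
toy_x = list(range(10))
toy_y = [value * 10 for value in toy_x]
x_tr, x_te, y_tr, y_te = shuffle_split(toy_x, toy_y)
print(len(x_tr), len(x_te))
```

With a fixed seed the same rows land in the test set on every run, which makes validation scores comparable between experiments.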
Now let’s define a neural network architecture for our classification problem and use it to train a model to classify emotions:
    # Define the model
    model = Sequential()
    model.add(Embedding(input_dim=len(tokenizer.word_index) + 1,
                        output_dim=128, input_length=max_length))
    model.add(Flatten())
    model.add(Dense(units=128, activation="relu"))
    model.add(Dense(units=len(one_hot_labels[0]), activation="softmax"))

    # Compile and train the model
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(xtrain, ytrain, epochs=10, batch_size=32,
              validation_data=(xtest, ytest))
    Epoch 1/10
    400/400 [==============================] - 12s 28ms/step - loss: 1.3766 - accuracy: 0.4693 - val_loss: 0.8994 - val_accuracy: 0.7028
    Epoch 2/10
    400/400 [==============================] - 11s 28ms/step - loss: 0.3783 - accuracy: 0.8862 - val_loss: 0.5440 - val_accuracy: 0.8338
    Epoch 3/10
    400/400 [==============================] - 11s 28ms/step - loss: 0.0681 - accuracy: 0.9831 - val_loss: 0.5799 - val_accuracy: 0.8281
    Epoch 4/10
    400/400 [==============================] - 11s 27ms/step - loss: 0.0278 - accuracy: 0.9941 - val_loss: 0.6063 - val_accuracy: 0.8272
    Epoch 5/10
    400/400 [==============================] - 11s 28ms/step - loss: 0.0173 - accuracy: 0.9962 - val_loss: 0.6683 - val_accuracy: 0.8281
    Epoch 6/10
    400/400 [==============================] - 11s 28ms/step - loss: 0.0164 - accuracy: 0.9968 - val_loss: 0.7021 - val_accuracy: 0.8250
    Epoch 7/10
    400/400 [==============================] - 13s 31ms/step - loss: 0.0135 - accuracy: 0.9972 - val_loss: 0.7059 - val_accuracy: 0.8238
    Epoch 8/10
    400/400 [==============================] - 12s 31ms/step - loss: 0.0127 - accuracy: 0.9977 - val_loss: 0.7705 - val_accuracy: 0.8163
    Epoch 9/10
    400/400 [==============================] - 11s 28ms/step - loss: 0.0127 - accuracy: 0.9971 - val_loss: 0.7710 - val_accuracy: 0.8181
    Epoch 10/10
    400/400 [==============================] - 11s 28ms/step - loss: 0.0110 - accuracy: 0.9975 - val_loss: 0.8234 - val_accuracy: 0.8206
    <keras.callbacks.History at 0x7fa6a85354f0>
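In the log above, the validation loss is lowest at epoch 2 (0.5440) and rises afterwards while the training accuracy approaches 1.0, which suggests the model starts to overfit early. One common response is early stopping: halt training once the validation loss has not improved for a few epochs. Keras offers a `keras.callbacks.EarlyStopping` callback for this. The sketch below does not use that callback; it only replays the `val_loss` values from the log in plain Python to show where such a rule would have stopped. The function name and the patience of 2 are assumptions chosen for illustration.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (stop_epoch, best_epoch), both 1-based, for a patience rule."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch  # new best validation loss
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch             # no improvement for `patience` epochs
    return len(val_losses), best_epoch           # never triggered: ran all epochs

# val_loss values copied from the training log above
logged_val_loss = [0.8994, 0.5440, 0.5799, 0.6063, 0.6683,
                   0.7021, 0.7059, 0.7705, 0.7710, 0.8234]
stop_epoch, best_epoch = early_stopping_epoch(logged_val_loss, patience=2)
print(stop_epoch, best_epoch)
```

Under this rule training would have ended after epoch 4, with epoch 2 as the best checkpoint, saving most of the remaining training time.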
Now let’s take a sentence as an input text and see how the model performs:
    input_text = "She didn't come today because she lost her dog yestertay!"

    # Preprocess the input text
    input_sequence = tokenizer.texts_to_sequences([input_text])
    padded_input_sequence = pad_sequences(input_sequence, maxlen=max_length)

    # Predict and map the most likely class index back to its emotion name
    prediction = model.predict(padded_input_sequence)
    predicted_label = label_encoder.inverse_transform([np.argmax(prediction[0])])
    print(predicted_label)
    1/1 [==============================] - 0s 145ms/step
    ['sadness']
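The last two lines of the prediction code do the decoding: the softmax layer returns one probability per class, `np.argmax` picks the index of the largest one, and `inverse_transform` maps that index back to the emotion name. A plain-Python sketch of the same step is shown below. The class list contains only the emotions visible in the data preview (the full dataset may contain more), and the probability values are made up for illustration rather than taken from the model.

```python
def decode_prediction(probabilities, classes):
    """Return the class name with the highest probability and that probability."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return classes[best], probabilities[best]

# Assumed class order (alphabetical, as a label encoder would produce)
example_classes = ["anger", "love", "sadness"]
# Illustrative softmax output, not real model output
example_probs = [0.1, 0.2, 0.7]

label, confidence = decode_prediction(example_probs, example_classes)
print(label, confidence)
```

Looking at the probability alongside the label can be useful in practice, since a prediction made with low confidence may deserve less trust than the label alone suggests.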
So this is how you can use Machine Learning for the task of text emotion classification using the Python programming language.
Summary
Text emotion classification is the task of assigning an emotion label to a piece of text based on its wording and context. One real-world example is the iPhone keyboard, which recommends the most relevant emoji for the text being typed. I hope you liked this article on Text Emotion Classification with Machine Learning using Python. Feel free to ask valuable questions in the comments section below.