Book Recommendation System using Python

Recommendation systems are one of the most popular applications of data science. A book recommendation system suggests books similar to the ones a user is already interested in. This article will take you through how to build a book recommendation system using Python.

A book recommendation system should recommend similar books based on the user’s interest. The dataset used here to build the system was collected from Kaggle. You can download the dataset from here.

Now let’s import the necessary Python libraries and the dataset we need for this task:

import numpy as np
import pandas as pd
from sklearn.feature_extraction import text
from sklearn.metrics.pairwise import linear_kernel

data = pd.read_csv("book_data.csv")
print(data.head())

                                        book_authors  ...                                          image_url
0                                    Suzanne Collins  ...  https://images.gr-assets.com/books/1447303603l...
1                         J.K. Rowling|Mary GrandPré  ...  https://images.gr-assets.com/books/1255614970l...
2                                         Harper Lee  ...  https://images.gr-assets.com/books/1361975680l...
3  Jane Austen|Anna Quindlen|Mrs. Oliphant|George...  ...  https://images.gr-assets.com/books/1320399351l...
4                                    Stephenie Meyer  ...  https://images.gr-assets.com/books/1361039443l...

[5 rows x 12 columns]

I will select three columns from the dataset for the rest of the task (book_title, book_desc, book_rating_count):

data = data[["book_title", "book_desc", "book_rating_count"]]
print(data.head())

                                  book_title  ... book_rating_count
0                           The Hunger Games  ...           5519135
1  Harry Potter and the Order of the Phoenix  ...           2041594
2                      To Kill a Mockingbird  ...           3745197
3                        Pride and Prejudice  ...           2453620
4                                   Twilight  ...           4281268

[5 rows x 3 columns]

Let’s have a look at the top 5 books in the dataset according to the number of ratings:

data = data.sort_values(by="book_rating_count", ascending=False)
top_5 = data.head()

import plotly.graph_objects as go

labels = top_5["book_title"]
values = top_5["book_rating_count"]
colors = ['gold','lightgreen']

# Pie chart of the five books with the most ratings
fig = go.Figure(data=[go.Pie(labels=labels, values=values)])
fig.update_layout(title_text="Top 5 Books by Number of Ratings")
fig.update_traces(hoverinfo='label+percent', textinfo='percent', textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3)))
fig.show()

Before moving forward, let’s check if the data has null values or not:

print(data.isnull().sum())

book_title              0
book_desc            1331
book_rating_count       0
dtype: int64

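The dataset has null values in the book description column. Let’s drop the rows having null values and reset the index, so that the index labels match the row positions again after the earlier sort:

data = data.dropna().reset_index(drop=True)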
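One thing to watch after sorting and dropping rows is that a DataFrame keeps its original index labels, so labels and row positions can drift apart. The following is a minimal sketch on a made-up three-row frame (the titles and values are hypothetical, not taken from the Kaggle dataset) that shows the gap and how `reset_index(drop=True)` closes it:

```python
import pandas as pd

# Hypothetical toy data, not taken from the Kaggle dataset
toy = pd.DataFrame({
    "book_title": ["A", "B", "C"],
    "book_desc": ["first", None, "third"],
    "book_rating_count": [10, 30, 20],
})

# Same steps as in the article: sort by rating count, then drop null rows
toy = toy.sort_values(by="book_rating_count", ascending=False).dropna()
labels_before = toy.index.tolist()   # original labels: out of order, with a gap

# After the reset, label i refers to row position i
toy = toy.reset_index(drop=True)
labels_after = toy.index.tolist()

print(labels_before, labels_after)
```

This matters later because the similarity matrix is addressed by row position, not by index label.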

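Now I will use the book description column as the feature to recommend similar books to the user. Each description is converted to a TF-IDF vector, and the linear kernel (dot product) between two vectors serves as the similarity score. Because TfidfVectorizer L2-normalizes each vector by default, this dot product equals the cosine similarity:

feature = data["book_desc"].tolist()
# The documents are passed to fit_transform(); stop_words="english" removes common English words
tfidf = text.TfidfVectorizer(stop_words="english")
tfidf_matrix = tfidf.fit_transform(feature)
# Dense (n_books x n_books) matrix of pairwise similarity scores
similarity = linear_kernel(tfidf_matrix, tfidf_matrix)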
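To make the similarity scores less of a black box, here is a rough sketch of TF-IDF and cosine similarity written with the standard library only. It is a simplification, not scikit-learn's implementation: it splits on whitespace and keeps stop words, and the three descriptions are made up for illustration. The smoothed idf formula is the one scikit-learn applies by default.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one L2-normalized TF-IDF dict per document (simplified sketch)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Term count times smoothed idf: ln((1 + n) / (1 + df)) + 1
        weights = {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1)
                   for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()})
    return vectors

def dot(u, v):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

# Hypothetical descriptions, made up for illustration
docs = [
    "a young wizard attends a school of magic",
    "a wizard school faces dark magic",
    "a detective solves a murder in london",
]
vecs = tfidf_vectors(docs)
sim_01 = dot(vecs[0], vecs[1])   # two fantasy blurbs
sim_02 = dot(vecs[0], vecs[2])   # fantasy vs. crime
print(round(sim_01, 3), round(sim_02, 3))
```

Since every vector has length 1, the dot product of a description with itself is 1, and descriptions that share more distinctive words score closer to 1 than those that share only common ones.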

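Now I will map each book title to its row position in the similarity matrix so that we can find similar books by giving the title of the book as an input. Using row positions rather than DataFrame index labels keeps the lookup aligned with the rows of the similarity matrix even if the labels have gaps or are out of order. Titles that appear more than once are reduced to their first occurrence, so each title maps to a single position:

# Map each title to its row position in tfidf_matrix and similarity
indices = pd.Series(range(len(data)), index=data['book_title'])
# Keep only the first position for titles that appear more than once
indices = indices[~indices.index.duplicated(keep='first')]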
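A detail that is easy to miss with a title lookup like this: `Series.drop_duplicates()` compares the values of the Series, not its index, so repeated titles survive it. The sketch below uses three hypothetical titles to contrast that with filtering on `Index.duplicated`, which is one way (an assumption about what is wanted here, namely one position per title) to keep a single entry per title:

```python
import pandas as pd

# Hypothetical titles; "Emma" appears twice
titles = pd.Series(["Emma", "Dune", "Emma"])
positions = pd.Series(range(len(titles)), index=titles)

# The values 0, 1, 2 are all distinct, so nothing is removed here
by_value = positions.drop_duplicates()

# Filtering on the index keeps the first position for each title
by_title = positions[~positions.index.duplicated(keep="first")]

print(len(by_value), len(by_title))
```

With a unique title index, looking up a title returns a single integer rather than a Series of several positions.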

Now here’s how to write a function to recommend similar books:

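def book_recommendation(title, similarity = similarity):
    # Row of the requested title in the similarity matrix
    index = indices[title]
    # Pair every book with its similarity to the requested book
    similarity_scores = list(enumerate(similarity[index]))
    # Most similar books first
    similarity_scores = sorted(similarity_scores, key=lambda x: x[1], reverse=True)
    # Leave out the requested book itself, then keep the top 5
    similarity_scores = [s for s in similarity_scores if s[0] != index][:5]
    bookindices = [i[0] for i in similarity_scores]
    return data['book_title'].iloc[bookindices]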

print(book_recommendation("Letters to a Secret Lover"))

21823    The Kabbalah of Jesus Christ, Part 1 The True ...
28960                     Seeing and Savoring Jesus Christ
17173                             Jesus and Moses in India
7944                                The Jesus I Never Knew
16976    Beautiful Outlaw: Experiencing the Playful, Di...
Name: book_title, dtype: object
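The ranking step inside the function can also be looked at in isolation. Sorting a full row of scores works, but when only a handful of results are needed, `heapq.nlargest` avoids sorting everything. This is a small sketch, not the method used above; the helper name `top_k_similar` and the score values are hypothetical:

```python
import heapq

def top_k_similar(scores, query_pos, k=5):
    """Return the positions of the k highest scores, skipping the query itself."""
    candidates = ((score, pos) for pos, score in enumerate(scores)
                  if pos != query_pos)
    best = heapq.nlargest(k, candidates)
    return [pos for score, pos in best]

# Hypothetical similarity row for the book at position 2
row = [0.10, 0.45, 1.00, 0.30, 0.40, 0.05]
print(top_k_similar(row, query_pos=2, k=3))
```

Skipping `query_pos` keeps the book from being recommended to itself, since its similarity with its own description is always the highest.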

Summary

A book recommendation system should recommend similar books according to the interest of the user. If you are a data science beginner, working on this project is a good way to learn about recommendation systems. You can find many more data science projects for practice from here. I hope you liked this article on how to build a book recommendation system using Python. Feel free to ask questions in the comments section below.