Book Recommendation System using Python

Recommendation systems are one of the most popular applications of data science. A book recommendation system suggests books similar to the ones a user is already interested in. This article walks through building one in Python, using the text of book descriptions to measure how similar two books are.

A book recommendation system should recommend similar books based on the user’s interest. The dataset for this task was collected from Kaggle and is read below from a file named book_data.csv.

Now let’s import the necessary Python libraries and the dataset we need for this task:

import numpy as np
import pandas as pd
from sklearn.feature_extraction import text
from sklearn.metrics.pairwise import linear_kernel

data = pd.read_csv("book_data.csv")
print(data.head())
                                        book_authors  ...                                          image_url
0                                    Suzanne Collins  ...  https://images.gr-assets.com/books/1447303603l...
1                         J.K. Rowling|Mary GrandPré  ...  https://images.gr-assets.com/books/1255614970l...
2                                         Harper Lee  ...  https://images.gr-assets.com/books/1361975680l...
3  Jane Austen|Anna Quindlen|Mrs. Oliphant|George...  ...  https://images.gr-assets.com/books/1320399351l...
4                                    Stephenie Meyer  ...  https://images.gr-assets.com/books/1361039443l...

[5 rows x 12 columns]
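The printout above hides most of the 12 columns behind "...", so it can help to list the column names and non-null counts before choosing which columns to keep. The following is a minimal sketch on a tiny invented stand-in for book_data.csv (the rows and the name `sample` are assumptions made for illustration, not data from the real file):

```python
import pandas as pd

# Tiny stand-in for book_data.csv; the rows are invented for illustration.
sample = pd.DataFrame({
    "book_authors": ["Author A", "Author B"],
    "book_desc": ["A story about a storm.", None],
    "book_rating_count": [120, 45],
    "book_title": ["Book One", "Book Two"],
})

# Show every column when printing instead of collapsing the middle ones into "...".
pd.set_option("display.max_columns", None)

# List the column names explicitly.
column_names = sample.columns.tolist()
print(column_names)

# dtypes and non-null counts show which columns are numeric and which have gaps.
summary = pd.DataFrame({
    "dtype": sample.dtypes.astype(str),
    "non_null": sample.notna().sum(),
})
print(summary)
```

On the real dataset the same calls would be made on `data` right after `pd.read_csv`.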

I will select three columns from the dataset for the rest of the task (book_title, book_desc, book_rating_count):

data = data[["book_title", "book_desc", "book_rating_count"]]
print(data.head())
                                  book_title  ... book_rating_count
0                           The Hunger Games  ...           5519135
1  Harry Potter and the Order of the Phoenix  ...           2041594
2                      To Kill a Mockingbird  ...           3745197
3                        Pride and Prejudice  ...           2453620
4                                   Twilight  ...           4281268

[5 rows x 3 columns]

Let’s have a look at the top 5 books in the dataset according to the number of ratings:

data = data.sort_values(by="book_rating_count", ascending=False)
top_5 = data.head()

import plotly.graph_objects as go

# Slice labels are the titles; slice sizes are the rating counts
labels = top_5["book_title"]
values = top_5["book_rating_count"]
colors = ['gold', 'lightgreen']

fig = go.Figure(data=[go.Pie(labels=labels, values=values)])
fig.update_layout(title_text="Top 5 Books by Number of Ratings")
fig.update_traces(hoverinfo='label+percent', textinfo='percent', textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3)))
fig.show()
[Figure: pie chart of the five books with the most ratings]
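The percentages in the pie chart are each book's share of the total rating count among the plotted books. As a rough sketch of that arithmetic, the block below reuses the five rows printed earlier in this article (they are the first rows of the file, not necessarily the five most-rated books) and uses `nlargest`, which is an alternative to `sort_values` followed by `head`. The names `rows`, `top_3` and `share` are illustrative:

```python
import pandas as pd

# The five rows printed earlier in the article (first rows of the file).
rows = pd.DataFrame({
    "book_title": [
        "The Hunger Games",
        "Harry Potter and the Order of the Phoenix",
        "To Kill a Mockingbird",
        "Pride and Prejudice",
        "Twilight",
    ],
    "book_rating_count": [5519135, 2041594, 3745197, 2453620, 4281268],
})

# nlargest picks the n rows with the highest value in the given column.
top_3 = rows.nlargest(3, "book_rating_count")
print(top_3)

# Share of each book in the total, i.e. what the pie chart shows as "percent".
share = rows["book_rating_count"] / rows["book_rating_count"].sum() * 100
print(share.round(1))
```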

Before moving forward, let’s check if the data has null values or not:

print(data.isnull().sum())
book_title              0
book_desc            1331
book_rating_count       0
dtype: int64

The dataset has null values in the book description column. Let’s drop the rows having null values:

data = data.dropna()
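Sorting and dropping rows both keep the original row labels, so after these steps the DataFrame index is no longer 0, 1, 2, and so on. That matters later, because the rows of a similarity matrix are addressed by position. The sketch below uses an invented four-row frame (the names `books` and `cleaned` are assumptions) to show the effect and one way to restore a positional index with `reset_index(drop=True)`:

```python
import pandas as pd

# Invented rows; one description is missing.
books = pd.DataFrame({
    "book_title": ["A", "B", "C", "D"],
    "book_desc": ["desc a", None, "desc c", "desc d"],
    "book_rating_count": [10, 40, 30, 20],
})

books = books.sort_values(by="book_rating_count", ascending=False)

# subset= limits the null check to the column that is actually used as a feature.
cleaned = books.dropna(subset=["book_desc"])
labels_before = cleaned.index.tolist()
print(labels_before)  # original labels, shuffled and with a gap

# Renumber the rows so that label and position agree again.
cleaned = cleaned.reset_index(drop=True)
labels_after = cleaned.index.tolist()
print(labels_after)
```

Whether to reset the index is a design choice. The alternative is to keep the labels and be careful to use positions wherever a NumPy array or matrix is indexed.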

Now I will use the book description column as the feature to recommend similar books to the user:

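feature = data["book_desc"].tolist()
# TfidfVectorizer's `input` parameter only accepts "filename", "file" or "content",
# so the descriptions are passed to fit_transform() instead
tfidf = text.TfidfVectorizer(stop_words="english")
tfidf_matrix = tfidf.fit_transform(feature)
# TF-IDF rows are L2-normalised by default, so this dot product is the cosine similarity
similarity = linear_kernel(tfidf_matrix, tfidf_matrix)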
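To see what the similarity matrix contains, here is a small sketch on three invented one-line descriptions (the texts and the names `docs`, `dot` and `cos` are assumptions for illustration). It checks that `linear_kernel` on TF-IDF vectors gives the same numbers as `cosine_similarity`, since `TfidfVectorizer` normalises each row to unit length by default:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Invented descriptions: two about wizards, one about a detective.
docs = [
    "a young wizard attends a school of magic",
    "a wizard school teaches magic to young students",
    "a detective investigates a murder in london",
]

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(docs)  # sparse matrix, one row per description

dot = linear_kernel(matrix, matrix)      # plain dot products
cos = cosine_similarity(matrix, matrix)  # dot products after normalising rows

print(np.round(dot, 2))
print(np.allclose(dot, cos))
```

Each diagonal entry is a description compared with itself, and entry [0, 1] is larger than entry [0, 2] because the first two descriptions share words while the first and third share none after stop-word removal. Note that the full matrix has one row and one column per book, so its size grows with the square of the number of books.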

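Now I will build a lookup from book title to row position in the similarity matrix, so that we can find similar books by giving the title of the book as an input:

# sort_values() and dropna() leave non-contiguous row labels, so store positions, not data.index
indices = pd.Series(range(len(data)), index=data['book_title'])
# Keep the first row for each title so that indices[title] is always a single integer
indices = indices[~indices.index.duplicated(keep='first')]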
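Two details of a title lookup are easy to get wrong: whether the stored numbers are row labels or row positions, and how repeated titles are handled. The sketch below compares both on an invented frame with non-contiguous labels and one repeated title (the names `books`, `by_label` and `positions` are assumptions):

```python
import pandas as pd

# Non-contiguous labels, as left behind by sort_values() and dropna().
books = pd.DataFrame(
    {"book_title": ["Dune", "Emma", "Dune", "Ulysses"]},
    index=[7, 2, 9, 4],
)

# Series.drop_duplicates() compares the VALUES (the labels 7, 2, 9, 4),
# which are all different, so the repeated title "Dune" survives.
by_label = pd.Series(books.index, index=books["book_title"]).drop_duplicates()
print(by_label["Dune"])  # a Series with two entries, not a single number

# Store positions 0..n-1 and drop repeated titles by looking at the index.
positions = pd.Series(range(len(books)), index=books["book_title"])
positions = positions[~positions.index.duplicated(keep="first")]
print(positions["Dune"], positions["Ulysses"])
```

Keeping the first occurrence of a repeated title is an arbitrary choice. Another option would be to make titles unique beforehand, for example by appending the author.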

Now here’s how to write a function to recommend similar books:

def book_recommendation(title, similarity = similarity):
    # Row of the requested title in the similarity matrix
    index = indices[title]
    similarity_scores = list(enumerate(similarity[index]))
    # Highest similarity first
    similarity_scores = sorted(similarity_scores, key=lambda x: x[1], reverse=True)
    # Skip the book itself, then keep the five closest matches
    similarity_scores = [s for s in similarity_scores if s[0] != index][:5]
    bookindices = [i[0] for i in similarity_scores]
    return data['book_title'].iloc[bookindices]

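Calling the function with a title from the dataset returns five titles. The output below is the one published with this article; the exact results depend on the dataset version and on how titles are mapped to rows of the similarity matrix, so your run may differ:

print(book_recommendation("Letters to a Secret Lover"))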
21823    The Kabbalah of Jesus Christ, Part 1 The True ...
28960                     Seeing and Savoring Jesus Christ
17173                             Jesus and Moses in India
7944                                The Jesus I Never Knew
16976    Beautiful Outlaw: Experiencing the Playful, Di...
Name: book_title, dtype: object
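The whole pipeline can be tried on a handful of rows before running it on the full dataset. The sketch below is a small variation rather than the code above: instead of building the full book-by-book similarity matrix, it computes only the similarity row for the requested title, which keeps memory use low when there are many books. The four books, their descriptions and the names `recommend` and `title_to_pos` are invented for illustration:

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# Invented books: two about magic, two about crime.
books = pd.DataFrame({
    "book_title": ["Wizard School", "Magic Academy", "London Murders", "The Detective"],
    "book_desc": [
        "a young wizard attends a school of magic",
        "students learn magic at a wizard academy",
        "a detective investigates a murder in london",
        "a london detective hunts a murder suspect",
    ],
})

tfidf_matrix = TfidfVectorizer(stop_words="english").fit_transform(books["book_desc"])

# Map each title to its row position in tfidf_matrix.
title_to_pos = pd.Series(range(len(books)), index=books["book_title"])


def recommend(title, top_n=2):
    """Return the top_n titles whose descriptions are closest to `title`."""
    pos = int(title_to_pos[title])
    # Similarity of this one book against all books: a single row of scores.
    scores = linear_kernel(tfidf_matrix[pos:pos + 1], tfidf_matrix).ravel()
    # Positions sorted by descending score, with the book itself removed.
    ranked = [i for i in scores.argsort()[::-1] if i != pos][:top_n]
    return books["book_title"].iloc[ranked].tolist()


print(recommend("Wizard School", top_n=1))
print(recommend("London Murders", top_n=1))
```

A title that is not in the lookup raises a `KeyError` here. A real application would probably want to catch that and return an empty list or a helpful message.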

Summary

A book recommendation system should recommend similar books according to the interest of the user. If you are a data science beginner, working on this project is a good way to learn about recommendation systems. I hope you liked this article on how to build a book recommendation system using Python. Feel free to ask questions in the comments section below.

Aman Kharwal

Data Strategist at Statso. My aim is to decode data science for the real world in the most simple words.
