News Recommendation System using Python

A recommendation system is a popular application of data science. Almost all the popular websites you visit use recommendation systems. As the name suggests, a news recommendation system is an application that recommends news articles based on the news a user is already reading. In this article, I will take you through how to create a News Recommendation System using Python.

How does a News Recommendation System Work?

When you visit a content website, it often recommends similar content based on what you are already watching or reading. Recommending content based on what the user is already consuming is a technique known as content-based filtering.

Many popular news websites use content-based recommendation systems, which measure how similar the article you are reading is to the other articles on the site and then recommend the most similar ones.
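
To make the idea concrete, here is a minimal sketch of content-based filtering on three headlines. The headlines are hypothetical and invented for illustration; they are not from the dataset used later. The sketch turns each headline into a TF-IDF vector and compares the vectors with cosine similarity, so headlines that share important words score higher.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy headlines, invented for illustration only
headlines = [
    "Stock markets rally as tech shares climb",
    "Tech shares lead stock markets higher",
    "Local bakery wins national bread award",
]

# Convert each headline into a TF-IDF vector, ignoring English stop words
vectors = TfidfVectorizer(stop_words="english").fit_transform(headlines)

# Pairwise cosine similarity: entry [i, j] compares headline i with headline j
scores = cosine_similarity(vectors)

# Similarity of the first headline to the other two
sim_to_second = scores[0, 1]
sim_to_third = scores[0, 2]
print(sim_to_second, sim_to_third)
```

The first two headlines share several words, so their score is higher than the score between the first and third headlines, which have no words in common.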

I hope you now understand how a news recommendation system works. In the section below, I will take you through how to build a News Recommendation System using the Python programming language.

News Recommendation System using Python

The dataset I am using to build a News Recommendation System is from Microsoft. As the data needed a lot of cleaning and preparation, I downloaded the data and prepared it for building a content-based recommendation system. You can download the dataset here (please download the data in CSV format).

Now let’s start with importing the necessary Python libraries and the dataset we need to build a News Recommendation System:

import numpy as np
import pandas as pd
from sklearn.feature_extraction import text
from sklearn.metrics.pairwise import cosine_similarity
import plotly.express as px
import plotly.graph_objects as go

data = pd.read_csv("News.csv")
print(data.head())
       ID News Category                                              Title  \
0  N88753     lifestyle  The Brands Queen Elizabeth, Prince Charles, an...   
1  N45436          news    Walmart Slashes Prices on Last-Generation iPads   
2  N23144        health                      50 Worst Habits For Belly Fat   
3  N86255        health  Dispose of unwanted prescription drugs during ...   
4  N93187          news  The Cost of Trump's Aid Freeze in the Trenches...   

                                             Summary  
0  Shop the notebooks, jackets, and more that the...  
1  Apple's new iPad releases bring big deals on l...  
2  These seemingly harmless habits are holding yo...  
3                                                NaN  
4  Lt. Ivan Molchanets peeked over a parapet of s...  

Let’s have a look at the news categories in this dataset:

# Types of News Categories
categories = data["News Category"].value_counts()
label = categories.index
counts = categories.values
figure = px.bar(x=label, y=counts,
                title="Types of News Categories")
figure.show()
Figure: News categories in the dataset (bar chart)

There are two ways to build a recommendation system using this dataset:

  1. If we choose the News Category column as the feature for finding similarities, the recommendations may not hold the user’s attention for long. Suppose a user is reading about a cricket match and gets recommendations about other sports such as wrestling, hockey, or football. These share a category but may be irrelevant to what the user is actually reading.
  2. The other way is to use the title or the summary as the feature for finding similarities. This gives more relevant recommendations because they are based on the content the user is already reading.

So we can use the title or the summary of the news article to find similarities with other news articles. Here I will use the title column. If you wish to use the summary column, first drop the rows with null values, as the summary column contains more than 5000 null values.
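
If you go the summary route, a minimal sketch of that cleaning step could look like the following. The small DataFrame here is a hypothetical stand-in that only reuses the column names of the dataset; in practice you would apply the same calls to `data`.

```python
import pandas as pd

# Hypothetical miniature stand-in for the dataset, with the same column names
sample = pd.DataFrame({
    "Title": ["Headline A", "Headline B", "Headline C"],
    "Summary": ["Summary of A", None, "Summary of C"],
})

# Count the missing summaries before deciding how to handle them
missing = sample["Summary"].isnull().sum()

# Drop rows without a summary and renumber the rows from 0, so that
# positions in a similarity matrix line up with the DataFrame index
cleaned = sample.dropna(subset=["Summary"]).reset_index(drop=True)
print(missing, len(cleaned))
```

Resetting the index matters because the similarity matrix is addressed by row position, so the index should run from 0 without gaps after rows are dropped.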

Below is how we can find similarities between the news articles by converting the texts of the title column into numerical vectors and then finding similarities between the numerical vectors using the cosine similarity algorithm:

feature = data["Title"].tolist()
# The documents are passed to fit_transform, not to the input parameter
tfidf = text.TfidfVectorizer(stop_words="english")
tfidf_matrix = tfidf.fit_transform(feature)
similarity = cosine_similarity(tfidf_matrix)
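
Note that the full similarity matrix has one entry for every pair of articles, so its memory use grows with the square of the number of rows. If that becomes a problem on your machine, one alternative (a sketch, not the approach used in the rest of this article) is to compute the similarities for a single article only when they are needed. The titles and the `similarity_row` helper below are hypothetical and used only for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy titles standing in for data["Title"].tolist()
titles = [
    "Walmart cuts prices on older tablets",
    "Best tablet deals at Walmart this week",
    "City council approves new park budget",
    "Tablet prices fall ahead of the holidays",
]

tfidf_matrix = TfidfVectorizer(stop_words="english").fit_transform(titles)

def similarity_row(position, matrix=tfidf_matrix):
    """Return the similarity of one article to every article (1-D array)."""
    # Compare a single sparse row against the whole matrix: n scores, not n*n
    return cosine_similarity(matrix[position], matrix).ravel()

row = similarity_row(0)
print(row.shape)
```

Each call returns the same values as the corresponding row of the full matrix, at the cost of recomputing them per request.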

Now I will build a Series that maps each title to its row position in the data so that we can look for content recommendations by giving the title as an input:

indices = pd.Series(data.index, index=data['Title'])
indices = indices[~indices.index.duplicated(keep="first")]
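
One detail worth checking in a title-to-position lookup like this is how repeated headlines are handled, since a title that appears twice would return two positions instead of one. The sketch below uses hypothetical titles to show that `Series.drop_duplicates` compares the values of the Series, while `Index.duplicated` compares the labels.

```python
import pandas as pd

# Hypothetical titles where one headline appears twice
titles = pd.Series(["Headline A", "Headline B", "Headline A"], name="Title")
lookup = pd.Series(titles.index, index=titles)

# drop_duplicates compares the values (0, 1, 2), which are all distinct,
# so the repeated label "Headline A" survives
by_values = lookup.drop_duplicates()

# Index.duplicated flags repeated labels; keeping the unflagged rows
# leaves exactly one position per title
by_labels = lookup[~lookup.index.duplicated(keep="first")]

print(len(by_values), len(by_labels))
```

With unique labels, looking up a title always returns a single integer position, which is what the recommendation function needs.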

Now below is how to build a News Recommendation System:

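def news_recommendation(Title, similarity=similarity):
    # Row position of the requested title
    index = indices[Title]
    # Pair every article position with its similarity to the requested title
    similarity_scores = list(enumerate(similarity[index]))
    # Most similar first
    similarity_scores = sorted(similarity_scores,
                               key=lambda x: x[1], reverse=True)
    # Keep the top 10; the first entry is normally the requested article itself
    similarity_scores = similarity_scores[0:10]
    newsindices = [i[0] for i in similarity_scores]
    return data['Title'].iloc[newsindices]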

print(news_recommendation("Walmart Slashes Prices on Last-Generation iPads"))
1           Walmart Slashes Prices on Last-Generation iPads
83827     Walmart's Black Friday 2019 ad: the best deals...
76024     Walmart Black Friday 2019 deals unveiled: Huge...
90316     US consumer prices up 0.4% in October; gasolin...
89588     Consumer prices rise most in 7 months on highe...
32839                   Inside the next generation of irons
37970     Walmart and Kroger Undercut Drugstore Chains' ...
100684    Nissan slashes full-year forecast as first-hal...
74916                    The Top Deals at Walmart Right Now
39634     Federal Reserve slashes interest rates for thi...
Name: Title, dtype: object
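
Notice that the first recommendation is the article the user is already reading, because every article is most similar to itself. If you would rather leave it out, a variation along the following lines could work. This is a sketch on hypothetical toy data; `recommend_excluding_self`, `toy`, `toy_similarity` and `toy_indices` are names I made up for illustration.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy data standing in for the News.csv DataFrame
toy = pd.DataFrame({"Title": [
    "Walmart cuts prices on older tablets",
    "Best tablet deals at Walmart this week",
    "City council approves new park budget",
    "Tablet prices fall ahead of the holidays",
]})

toy_similarity = cosine_similarity(
    TfidfVectorizer(stop_words="english").fit_transform(toy["Title"])
)
toy_indices = pd.Series(toy.index, index=toy["Title"])

def recommend_excluding_self(title, top_n=2):
    """Return the top_n most similar titles, leaving out the query itself."""
    if title not in toy_indices.index:
        # Unknown title: return an empty result instead of raising KeyError
        return toy["Title"].iloc[[]]
    position = toy_indices[title]
    scores = sorted(enumerate(toy_similarity[position]),
                    key=lambda pair: pair[1], reverse=True)
    # Filter by position rather than slicing off the first entry, because a
    # duplicate headline could tie with the query at similarity 1.0
    keep = [i for i, _ in scores if i != position][:top_n]
    return toy["Title"].iloc[keep]

result = recommend_excluding_self("Walmart cuts prices on older tablets")
print(result)
```

The check for unknown titles is optional, but it avoids an error when a user passes a headline that is not in the data.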

So this is how you can build a News Recommender System using the Python programming language.

Summary

Many popular news websites use content-based recommendation systems, which measure how similar the article you are reading is to the other articles on the site and then recommend the most similar ones. I hope you liked this article on how to build a News Recommender System using Python. Feel free to ask valuable questions in the comments section below.

Aman Kharwal

I'm a writer and data scientist on a mission to educate others about the incredible power of data📈.
