Hybrid Recommendation System using Python

A hybrid recommendation system combines two or more recommendation techniques so that the strengths of one offset the weaknesses of another. The goal is to give users more accurate, diverse, and personalized recommendations than any single technique would on its own. If you want to know how to build a hybrid recommendation system, this article is for you. In this article, I will take you through building a Hybrid Recommendation System using Python.

What is a Hybrid Recommendation System?

A hybrid recommendation system combines multiple recommendation techniques to provide more accurate and diverse recommendations to users. It uses the strengths of different approaches, such as collaborative filtering and content-based filtering, to overcome their limitations and improve the recommendation process.

You may have heard of collaborative filtering and content-based filtering before. Collaborative filtering analyzes user-item interactions and identifies similarities between users or items to make recommendations. It recommends items that users with similar preferences have liked or consumed. However, it struggles with new or niche items that have few user interactions.

On the other hand, content-based filtering focuses on features and characteristics of items to recommend similar items to users based on their preferences. It examines attributes like product descriptions, brands, categories, and user profiles. However, it may not capture the complexity of user preferences and may result in less diverse recommendations.

This is where a hybrid recommendation system helps. By combining both techniques in one system, each can cover for the limitations of the other. The collaborative filtering component captures the wisdom of the crowd, while the content-based filtering component takes into account the specific features and attributes of items. This combination allows the system to provide more accurate recommendations, especially when user-item interactions are sparse or when personalized recommendations are desired.

I hope you have now understood what a Hybrid Recommendation System is. In the section below, I’ll take you through how to build a hybrid recommendation system using Python. The dataset that we can use for this task is available here.

Hybrid Recommendation System using Python

Let’s start the task of building a hybrid recommendation system by importing the necessary Python libraries and the dataset:

import pandas as pd
data = pd.read_csv("fashion_products.csv")
print(data.head())
   User ID  Product ID Product Name   Brand         Category  Price    Rating  \
0       19           1        Dress  Adidas    Men's Fashion     40  1.043159   
1       97           2        Shoes     H&M  Women's Fashion     82  4.026416   
2       25           3        Dress  Adidas  Women's Fashion     44  3.337938   
3       57           4        Shoes    Zara    Men's Fashion     23  1.049523   
4       79           5      T-shirt  Adidas    Men's Fashion     79  4.302773   

    Color Size  
0   Black   XL  
1   Black    L  
2  Yellow   XL  
3   White    S  
4   Black    M  

So this data is based on fashion products for men, women, and kids. Our goal is to create two recommendation systems using collaborative and content-based filtering and then combine the recommendation techniques to build a recommendation system using a hybrid approach.

Next, let’s import the other Python libraries we will be using for the rest of the task:

from surprise import Dataset, Reader, SVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

In the above code, I have imported the Surprise library, which you may not have used before. We need it for its SVD algorithm. SVD stands for Singular Value Decomposition. Simply put, it is a matrix factorization technique commonly used in collaborative filtering algorithms. You can install Surprise on your system using the command mentioned below:

  • For terminal or command prompt: pip install scikit-surprise
  • For Colab Notebook: !pip install scikit-surprise

First Approach: Content-Based Filtering

Now let’s move forward by creating a recommendation system using content-based filtering:

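# Keep only the descriptive columns; .copy() avoids pandas' SettingWithCopyWarning
content_df = data[['Product ID', 'Product Name', 'Brand', 
                   'Category', 'Color', 'Size']].copy()
# Join every column into one text string per product
content_df['Content'] = content_df.apply(lambda row: ' '.join(row.dropna().astype(str)), axis=1)

# Use TF-IDF vectorizer to convert content into a matrix of TF-IDF features
tfidf_vectorizer = TfidfVectorizer()
content_matrix = tfidf_vectorizer.fit_transform(content_df['Content'])

# TF-IDF rows have unit length by default, so this dot product equals cosine similarity
content_similarity = linear_kernel(content_matrix, content_matrix)

# Prepare the ratings for Surprise (used in the collaborative filtering step).
# Note: from here on, `data` refers to the Surprise dataset, not the DataFrame
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(data[['User ID', 
                                  'Product ID', 
                                  'Rating']], reader)

def get_content_based_recommendations(product_id, top_n):
    # Row of the target product (content_df keeps the default 0..n-1 index)
    index = content_df[content_df['Product ID'] == product_id].index[0]
    similarity_scores = content_similarity[index]
    # Sort descending and skip the first entry, expected to be the product itself
    similar_indices = similarity_scores.argsort()[::-1][1:top_n + 1]
    # Return a plain Python list so it can be concatenated with other lists
    recommendations = content_df.loc[similar_indices, 'Product ID'].tolist()
    return recommendations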

In the above code, we are implementing the content-based filtering component of the hybrid recommender system. We started by selecting relevant features from the dataset, including the product ID, name, brand, category, colour, and size. Then we combined these features into a single “Content” column for each product.

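Next, we used the TF-IDF (Term Frequency-Inverse Document Frequency) vectorizer to convert the content into a TF-IDF feature matrix. Each entry in this matrix weights a word by how often it appears in a product's content and how rare it is across the whole corpus. Note that TfidfVectorizer's default tokenizer lowercases the text and ignores single-character tokens, so sizes such as S, M and L do not contribute to the similarity.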

We then calculated the similarity between products based on their content using the cosine similarity measure. This similarity matrix captures the similarity between each pair of products based on their content.
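The code above calls linear_kernel, which computes plain dot products. Because TfidfVectorizer scales each row to unit length by default, those dot products coincide with cosine similarity. The sketch below illustrates the idea without scikit-learn. The product descriptions are hypothetical, and the hand-rolled weighting is only written to resemble scikit-learn's default smoothed IDF, so treat it as an illustration rather than a replacement for the library.

```python
import math
from collections import Counter

# Hypothetical product descriptions, not taken from the dataset.
docs = [
    "dress adidas black xl",
    "dress adidas yellow xl",
    "shoes zara white s",
]

tokenized = [doc.split() for doc in docs]
n_docs = len(tokenized)

# Document frequency: in how many descriptions each word appears.
doc_freq = Counter(word for tokens in tokenized for word in set(tokens))

def tfidf_vector(tokens):
    """Return an L2-normalised {word: weight} mapping for one description."""
    counts = Counter(tokens)
    weights = {
        word: count * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1)
        for word, count in counts.items()
    }
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {word: w / norm for word, w in weights.items()}

def cosine(vec_a, vec_b):
    """Dot product of two unit-length sparse vectors, i.e. their cosine similarity."""
    return sum(weight * vec_b.get(word, 0.0) for word, weight in vec_a.items())

vectors = [tfidf_vector(tokens) for tokens in tokenized]
similarity = [[cosine(a, b) for b in vectors] for a in vectors]

for row in similarity:
    print([round(score, 3) for score in row])
```

The two dresses share three of their four words and score high against each other, while the shoes share no words with either dress and score zero.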

To get content-based recommendations, we first found the index of the target product in the similarity matrix. Then we sorted the similarity scores in descending order and selected the top N similar products. Finally, we returned the product IDs of the recommended products.
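Skipping the first sorted entry assumes the target product always ranks first against itself. When another product has identical content, the tie can place the target elsewhere. A minimal sketch of a more defensive selection, using a hypothetical helper named top_n_similar and made-up scores, could filter the target out by position instead:

```python
def top_n_similar(similarity_row, query_index, top_n):
    """Return the positions of the top_n highest scores, skipping the query item itself."""
    ranked = sorted(
        range(len(similarity_row)),
        key=lambda idx: similarity_row[idx],
        reverse=True,
    )
    return [idx for idx in ranked if idx != query_index][:top_n]

# Hypothetical similarity scores of item 2 against five items.
# Item 3 ties with item 2 at a score of 1.0.
row = [0.10, 0.80, 1.00, 1.00, 0.35]
print(top_n_similar(row, query_index=2, top_n=3))  # prints [3, 1, 4]
```

The same filtering idea should carry over to the NumPy version by removing the target index from the sorted indices before slicing.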

Second Approach: Collaborative Filtering

Now let’s move forward by creating a recommendation system using collaborative filtering:

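# Train SVD on every available rating
algo = SVD()
trainset = data.build_full_trainset()
algo.fit(trainset)

def get_collaborative_filtering_recommendations(user_id, top_n):
    # All (user, item) pairs that have no rating in the training set
    testset = trainset.build_anti_testset()
    # Keep only the pairs that belong to the target user
    testset = [pair for pair in testset if pair[0] == user_id]
    # Predict a rating for each unseen item and rank by the estimate
    predictions = algo.test(testset)
    predictions.sort(key=lambda x: x.est, reverse=True)
    recommendations = [prediction.iid for prediction in predictions[:top_n]]
    return recommendations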

In the above code, we implemented the collaborative filtering component of the hybrid recommender system using the SVD (Singular Value Decomposition) algorithm.

First, we initialized the SVD algorithm and trained it on the dataset. This step involves decomposing the user-item rating matrix to capture the underlying patterns and latent factors that drive user preferences.
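To give a feel for what "latent factors" means here, the sketch below trains a tiny matrix factorization model with stochastic gradient descent in plain Python. It follows the general form described in the Surprise documentation for SVD (global mean plus user bias, item bias, and a dot product of factor vectors), but it is not Surprise's implementation. The ratings, learning rate, regularization, and number of factors are all assumptions chosen for illustration.

```python
import math
import random

random.seed(0)

# Hypothetical (user, item, rating) triples; not taken from the dataset.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
n_users, n_items, n_factors = 3, 3, 2
learning_rate, regularization, n_epochs = 0.05, 0.02, 200

global_mean = sum(r for _, _, r in ratings) / len(ratings)
user_bias = [0.0] * n_users
item_bias = [0.0] * n_items
# Small random starting values for the latent factors.
user_factors = [[random.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
item_factors = [[random.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]

def predict(u, i):
    """Estimated rating: global mean + biases + dot product of latent factors."""
    dot = sum(p * q for p, q in zip(user_factors[u], item_factors[i]))
    return global_mean + user_bias[u] + item_bias[i] + dot

def rmse():
    """Root mean squared error over the known ratings."""
    return math.sqrt(sum((r - predict(u, i)) ** 2 for u, i, r in ratings) / len(ratings))

rmse_before = rmse()
for _ in range(n_epochs):
    for u, i, r in ratings:
        err = r - predict(u, i)
        # Nudge biases and factors in the direction that reduces the error.
        user_bias[u] += learning_rate * (err - regularization * user_bias[u])
        item_bias[i] += learning_rate * (err - regularization * item_bias[i])
        for f in range(n_factors):
            p, q = user_factors[u][f], item_factors[i][f]
            user_factors[u][f] += learning_rate * (err * q - regularization * p)
            item_factors[i][f] += learning_rate * (err * p - regularization * q)
rmse_after = rmse()

print(f"training RMSE: {rmse_before:.3f} -> {rmse_after:.3f}")
```

Once trained, predict(u, i) can be called for any user-item pair, including pairs that have no rating yet, which is what makes the model usable for recommendations.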

To generate collaborative filtering recommendations, we then created a test set composed of user-item pairs that were not present in the training set. We then filtered this test set to keep only the pairs belonging to the target user specified by user_id.

Next, we used the trained SVD model to predict the test set item ratings. These predictions represent the estimated ratings that the user would assign to the items.

The predictions are then sorted by their estimated ratings in descending order. We selected the top N items with the highest estimated ratings as collaborative filtering recommendations for the user.
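The steps above (collect unseen items, estimate a rating for each, sort, keep the top N) can be sketched independently of Surprise. In the example below, the interaction history is hypothetical and estimate_rating is a stand-in for a trained model that simply averages each item's observed ratings, so only the ranking logic mirrors the article's approach.

```python
# Hypothetical interaction history: user -> {item: rating}.
history = {
    "u1": {"A": 5.0, "B": 3.0},
    "u2": {"B": 4.0, "C": 2.0, "D": 5.0},
}
all_items = {item for rated in history.values() for item in rated}

def estimate_rating(user_id, item_id):
    """Stand-in for a trained model: here, the item's average observed rating."""
    observed = [rated[item_id] for rated in history.values() if item_id in rated]
    return sum(observed) / len(observed)

def recommend_unseen(user_id, top_n):
    """Rank the items the user has not rated yet by estimated rating."""
    unseen = all_items - set(history[user_id])
    scored = [(item, estimate_rating(user_id, item)) for item in unseen]
    # Best score first; ties are broken by item id to keep the result stable.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in scored[:top_n]]

print(recommend_unseen("u1", top_n=2))  # prints ['D', 'C']
```

Building the candidate list for a single user this way avoids generating unseen pairs for every user on each call, which may matter on larger datasets.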

And Finally, The Hybrid Approach

Now let’s combine content-based and collaborative filtering methods to build a recommendation system using the Hybrid method:

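def get_hybrid_recommendations(user_id, product_id, top_n):
    content_based_recommendations = get_content_based_recommendations(product_id, top_n)
    collaborative_filtering_recommendations = get_collaborative_filtering_recommendations(user_id, top_n)
    # Convert both results to lists before joining them: adding a NumPy array
    # to a list would sum the IDs element-wise instead of concatenating them
    combined = list(content_based_recommendations) + list(collaborative_filtering_recommendations)
    # Take the union of both lists to remove duplicates
    hybrid_recommendations = list(set(combined))
    return hybrid_recommendations[:top_n]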

In the above code, we combined content-based and collaborative filtering approaches to create a hybrid recommender system.

The get_hybrid_recommendations function takes the user_id, the product_id and the desired number of top_n recommendations as input.

First, it calls the get_content_based_recommendations function to retrieve a list of content-based recommendations for the specified product_id. These recommendations are based on the similarity between the characteristics of the given product and other products in the dataset.

Then it calls the get_collaborative_filtering_recommendations function to get a list of collaborative filtering recommendations for the specified user_id. These recommendations are generated by leveraging historical user-item interactions and estimating user preferences based on similar user behaviours.

Next, we combine the content-based and collaborative filtering recommendations by taking the union of the two lists, which removes duplicates, and return the first top_n product IDs. Since a Python set is unordered, the result mixes both sources but does not preserve their original rankings.
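One possible refinement, sketched here under the assumption that both input lists are ranked best-first, is to interleave the two lists so that the top picks from each source survive the final cut. The function name merge_recommendations and the product IDs are hypothetical, and this is only one of several ways to blend two recommenders.

```python
from itertools import chain, zip_longest

def merge_recommendations(content_based, collaborative, top_n):
    """Interleave two ranked lists, drop duplicates, and keep the first top_n."""
    missing = object()  # sentinel used when one list is shorter than the other
    interleaved = chain.from_iterable(
        zip_longest(content_based, collaborative, fillvalue=missing)
    )
    merged = []
    seen = set()
    for product_id in interleaved:
        if product_id is missing or product_id in seen:
            continue
        seen.add(product_id)
        merged.append(product_id)
    return merged[:top_n]

# Hypothetical product IDs.
print(merge_recommendations([12, 7, 30], [7, 45, 3, 9], top_n=5))
# prints [12, 7, 45, 30, 3]
```

Product 7 appears in both lists but only once in the result, and the output alternates between the two sources in rank order.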

Here’s how to use our hybrid recommendation system to recommend products based on the product that a user is viewing (SVD is initialized randomly, so the exact product IDs can vary between runs):

user_id = 6
product_id = 11
top_n = 10
recommendations = get_hybrid_recommendations(user_id, product_id, top_n)

print(f"Hybrid Recommendations for User {user_id} based on Product {product_id}:")
for i, recommendation in enumerate(recommendations):
    print(f"{i + 1}. Product ID: {recommendation}")

Hybrid Recommendations for User 6 based on Product 11:
1. Product ID: 928
2. Product ID: 131
3. Product ID: 451
4. Product ID: 837
5. Product ID: 875
6. Product ID: 594
7. Product ID: 1463
8. Product ID: 1688
9. Product ID: 601
10. Product ID: 1566

Summary

So this is how to create a hybrid recommendation system using Python. By combining collaborative filtering and content-based filtering, a hybrid system uses the strengths of each approach to offset the limitations of the other. I hope you liked this article on creating a Hybrid Recommendation System using Python. Feel free to ask questions in the comments section below.

Aman Kharwal
