Job Recommendation System using Python

A recommendation system is a popular application of Data Science that recommends personalized content based on users’ interests. Many of the popular websites you visit today use one. As the name suggests, a job recommendation system recommends jobs based on a user’s skills and desired role. If you want to learn how to build one, this article will take you through creating a Job Recommendation System using Python.

Job Recommendation System

A job recommendation system is an application that recommends jobs to a user according to the user’s skills and desired job role. LinkedIn is one of the most popular applications that uses a job recommendation system to help its users find jobs that match their skills and desired positions.

To build a job recommendation system, we need a dataset containing information about jobs, with features such as the required skills and the type of job. I found an ideal dataset on Kaggle for this task (downloaded from here).

If you want a cleaned version of the dataset, you can download it from here.

The section below takes you through creating a job recommendation system using Python.

Job Recommendation System using Python

Let’s start the task of creating a job recommendation system by importing the necessary Python libraries and the dataset:

import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity
import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS

data = pd.read_csv("jobs.csv")
print(data.head())
   Unnamed: 0                    Job Salary Job Experience Required  \
0           0   Not Disclosed by Recruiter               5 - 10 yrs   
1           1   Not Disclosed by Recruiter                2 - 5 yrs   
2           2   Not Disclosed by Recruiter                0 - 1 yrs   
3           3       2,00,000 - 4,00,000 PA.               0 - 5 yrs   
4           4   Not Disclosed by Recruiter                2 - 5 yrs   

                                          Key Skills  \
0                      Media Planning| Digital Media   
1   pre sales| closing| software knowledge| clien...   
2   Computer science| Fabrication| Quality check|...   
3                                  Technical Support   
4   manual testing| test engineering| test cases|...   

                                Role Category  \
0                                 Advertising   
1                                Retail Sales   
2                                         R&D   
3  Admin/Maintenance/Security/Datawarehousing   
4                        Programming & Design   

                                     Functional Area  \
0  Marketing , Advertising , MR , PR , Media Plan...   
1              Sales , Retail , Business Development   
2                           Engineering Design , R&D   
3  IT Software - Application Programming , Mainte...   
4                         IT Software - QA & Testing   

                                Industry                         Job Title  
0  Advertising, PR, MR, Event Management  Media Planning Executive/Manager  
1         IT-Software, Software Services           Sales Executive/Officer  
2                  Recruitment, Staffing                     R&D Executive  
3         IT-Software, Software Services        Technical Support Engineer  
4         IT-Software, Software Services                  Testing Engineer  

The dataset has an unnamed column. Let’s remove it and move further:

data = data.drop("Unnamed: 0",axis=1)
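
An "Unnamed: 0" column usually appears when a DataFrame was saved to CSV together with its row index. As a minimal sketch, assuming that is what happened here, the column can also be avoided at load time with the index_col parameter of read_csv. The two-row CSV text below is a hypothetical stand-in for jobs.csv, not the real data:

```python
import io
import pandas as pd

# Hypothetical two-row CSV that mimics the layout of jobs.csv:
# the first, unnamed column is just the saved row index.
csv_text = (
    ",Job Salary,Job Title\n"
    "0,Not Disclosed by Recruiter,Testing Engineer\n"
    "1,Not Disclosed by Recruiter,R&D Executive\n"
)

# index_col=0 tells pandas to use the first column as the index,
# so no "Unnamed: 0" column is created.
sample = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(sample.columns.tolist())
```

Either approach leaves the same feature columns, so dropping the column after loading works just as well.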

Now let’s check whether the dataset contains any null values:

print(data.isnull().sum())
Job Salary                 0
Job Experience Required    0
Key Skills                 0
Role Category              0
Functional Area            0
Industry                   0
Job Title                  0
dtype: int64
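
This dataset has no missing values, but another export of the data might. Missing entries in a text column would raise an error in later steps such as joining the skills into one string. The sketch below uses a hypothetical two-row sample (not the real data) to show one possible way to handle that case, by replacing missing skills with an empty string:

```python
import pandas as pd

# Hypothetical sample with one missing Key Skills entry
sample = pd.DataFrame({
    "Job Title": ["Testing Engineer", "R&D Executive"],
    "Key Skills": ["manual testing| test cases", None],
})

# Count missing values per column, as in the check above
print(sample.isnull().sum())

# Replace missing skills with an empty string so text steps still work
sample["Key Skills"] = sample["Key Skills"].fillna("")
```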

As the dataset doesn’t have any null values, let’s move further by exploring the skills mentioned in the Key Skills column:

text = " ".join(i for i in data["Key Skills"])
stopwords = set(STOPWORDS)
wordcloud = WordCloud(stopwords=stopwords, 
                      background_color="white").generate(text)
plt.figure( figsize=(15,10))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
Figure: word cloud of the key skills in the dataset
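
A word cloud splits multi-word skills such as "Media Planning" into separate words. To see exact counts per skill, one option is to split each entry on the "|" separator and count the pieces. This is a minimal sketch on a few hypothetical Key Skills strings written in the dataset’s format. With the real data, the list would come from data["Key Skills"]:

```python
from collections import Counter

# Hypothetical Key Skills strings in the dataset's pipe-separated format
key_skills = [
    "Media Planning| Digital Media",
    "manual testing| test engineering| test cases",
    "Technical Support",
    "Manual Testing| Technical Support",
]

# Split on "|", trim spaces, and lowercase so that
# "Manual Testing" and "manual testing" count as one skill
skill_counts = Counter(
    skill.strip().lower()
    for row in key_skills
    for skill in row.split("|")
    if skill.strip()
)

# Show the three most frequent skills with their counts
for skill, count in skill_counts.most_common(3):
    print(skill, count)
```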

Now let’s have a look at the functional areas mentioned in the dataset:

text = " ".join(i for i in data["Functional Area"])
stopwords = set(STOPWORDS)
wordcloud = WordCloud(stopwords=stopwords, 
                      background_color="white").generate(text)
plt.figure( figsize=(15,10))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
Figure: word cloud of the functional areas in the dataset

Now let’s have a look at the job titles mentioned in the dataset:

text = " ".join(i for i in data["Job Title"])
stopwords = set(STOPWORDS)
wordcloud = WordCloud(stopwords=stopwords, 
                      background_color="white").generate(text)
plt.figure( figsize=(15,10))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
Figure: word cloud of the job titles in the dataset

Creating a Content-Based Recommendation System

Now let’s move forward by creating a job recommendation system. The Key Skills column contains the skills required for each job role, so we can use it to recommend jobs. Here’s how to convert the Key Skills column into TF-IDF vectors and use cosine similarity to create a similarity matrix between all the job postings:

from sklearn.feature_extraction.text import TfidfVectorizer

feature = data["Key Skills"].tolist()
# Convert each Key Skills string into a TF-IDF vector
tfidf = TfidfVectorizer(stop_words="english")
tfidf_matrix = tfidf.fit_transform(feature)
# Pairwise cosine similarity between all job postings
similarity = cosine_similarity(tfidf_matrix)
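
To get a feel for what the similarity matrix contains, the same two steps can be run on a tiny example. This is a sketch on three hypothetical Key Skills strings (made up for illustration, not taken from the dataset):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical Key Skills strings in the dataset's pipe-separated format
skills = [
    "PHP| MVC| Laravel| WordPress",
    "Coding| WordPress| HTML| PHP",
    "Media Planning| Digital Media",
]

tfidf = TfidfVectorizer(stop_words="english")
tfidf_matrix = tfidf.fit_transform(skills)    # one row per posting
similarity = cosine_similarity(tfidf_matrix)  # postings x postings

# Entry [i, j] is the similarity between posting i and posting j
print(similarity.round(2))
```

The first two postings share the words "php" and "wordpress", so their score is above zero, while the third posting shares no words with them and scores zero. Each posting has a similarity of 1 with itself. Note that the default tokenizer works on single words, so a skill like "Media Planning" is treated as two separate terms.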

Now I will map each job title to the index of its first posting so that the users can find similar jobs according to the job they are looking for. Job titles repeat in the dataset, so only the first posting of each title is kept in the lookup:

indices = pd.Series(data.index, index=data['Job Title'])
indices = indices[~indices.index.duplicated(keep="first")]

Now here’s how to write a function to recommend jobs according to the skills required for the job role:

def jobs_recommendation(Title, similarity = similarity):
    index = indices[Title]
    # Pair every posting's row number with its similarity to the chosen job
    similarity_scores = list(enumerate(similarity[index]))
    # Sort by the similarity score (second item of each pair), highest first
    similarity_scores = sorted(similarity_scores, key=lambda x: x[1], reverse=True)
    similarity_scores = similarity_scores[0:5]
    jobindices = [i[0] for i in similarity_scores]
    return data[['Job Title', 'Job Experience Required', 
                 'Key Skills']].iloc[jobindices]

print(jobs_recommendation("Software Developer"))

This prints the Job Title, Job Experience Required, and Key Skills of the five postings whose key skills are most similar to those of the first "Software Developer" posting. The queried posting itself is usually among them, since its similarity with itself is 1.
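
The ranking step inside the function can be looked at in isolation. The sketch below uses a hypothetical helper, top_matches, and a hand-made list of scores (neither is part of the article’s code) to show how (index, score) pairs are ordered by score, and how the queried posting can be left out of its own recommendations:

```python
def top_matches(scores, query_index, top_n=3):
    """Return the top_n (index, score) pairs, skipping the query itself."""
    pairs = [(i, s) for i, s in enumerate(scores) if i != query_index]
    # Sort on the score (second element of each pair), highest first
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    return pairs[:top_n]

# Hypothetical similarity scores of posting 1 against five postings
row = [0.10, 1.00, 0.45, 0.00, 0.80]
print(top_matches(row, query_index=1))
# prints [(4, 0.8), (2, 0.45), (0, 0.1)]
```

Sorting on the score alone matters here: sorting the whole (index, score) pair would order the results by row number instead of by similarity.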

So this is how we can recommend jobs using the Python programming language.

Summary

A job recommender system is an application that recommends a job to a user according to the skills and the user’s desired job role. LinkedIn is one of the most popular applications that use a job recommender system to help its users find the best jobs according to their skills and desired roles. I hope you liked this article on creating a Jobs Recommender System using Python. Feel free to ask valuable questions in the comments section below.

Aman Kharwal