House Rent Prediction with Machine Learning

The rent of a house depends on many factors. With appropriate data and Machine Learning techniques, real estate platforms can match housing options to a customer’s budget. If you want to learn how to use Machine Learning to predict the rent of a house, this article is for you. I will take you through the task of House Rent Prediction with Machine Learning using Python.

House Rent Prediction

The rent of a housing property depends on factors such as:

  1. number of bedrooms, halls, and kitchens (BHK)
  2. size of the property
  3. the floor of the house
  4. area type
  5. area locality
  6. city
  7. furnishing status of the house

To build a house rent prediction system, we need data on the factors that affect the rent of a housing property. I found a dataset on Kaggle that includes all the features we need. You can download the dataset from here.

House Rent Prediction using Python

I will start the task of house rent prediction by importing the necessary Python libraries and the dataset:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go

data = pd.read_csv("House_Rent_Dataset.csv")
print(data.head())
    Posted On  BHK   Rent  Size            Floor    Area Type  \
0  2022-05-18    2  10000  1100  Ground out of 2   Super Area   
1  2022-05-13    2  20000   800       1 out of 3   Super Area   
2  2022-05-16    2  17000  1000       1 out of 3   Super Area   
3  2022-07-04    2  10000   800       1 out of 2   Super Area   
4  2022-05-09    2   7500   850       1 out of 2  Carpet Area   

              Area Locality     City Furnishing Status  Tenant Preferred  \
0                    Bandel  Kolkata       Unfurnished  Bachelors/Family   
1  Phool Bagan, Kankurgachi  Kolkata    Semi-Furnished  Bachelors/Family   
2   Salt Lake City Sector 2  Kolkata    Semi-Furnished  Bachelors/Family   
3               Dumdum Park  Kolkata       Unfurnished  Bachelors/Family   
4             South Dum Dum  Kolkata       Unfurnished         Bachelors   

   Bathroom Point of Contact  
0         2    Contact Owner  
1         1    Contact Owner  
2         1    Contact Owner  
3         1    Contact Owner  
4         1    Contact Owner  
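The Floor column in the preview above is text such as "Ground out of 2" or "1 out of 3", so it cannot be fed to a model directly, and the rest of this article leaves it out. As a minimal sketch, assuming every value follows the "<floor> out of <total>" pattern seen in these five rows and that "Ground" can be treated as floor 0, a hypothetical helper `parse_floor` could split it into two numbers:

```python
def parse_floor(text):
    """Split a string like '1 out of 3' into (floor, total_floors).

    Returns None when the text does not match the assumed pattern,
    so unexpected values can be inspected rather than silently guessed.
    """
    parts = text.split(" out of ")
    if len(parts) != 2:
        return None
    floor_label, total = parts[0].strip(), parts[1].strip()
    if not total.isdigit():
        return None
    if floor_label == "Ground":   # assumption: ground floor counts as 0
        floor = 0
    elif floor_label.isdigit():
        floor = int(floor_label)
    else:
        return None
    return floor, int(total)


print(parse_floor("Ground out of 2"))  # (0, 2)
print(parse_floor("1 out of 3"))       # (1, 3)
```

With pandas, something like `data["Floor"].apply(parse_floor)` would run the helper on every row; any rows that come back as None would need to be checked by hand.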

Before moving forward, let’s check if the data contains null values or not:

print(data.isnull().sum())
Posted On            0
BHK                  0
Rent                 0
Size                 0
Floor                0
Area Type            0
Area Locality        0
City                 0
Furnishing Status    0
Tenant Preferred     0
Bathroom             0
Point of Contact     0
dtype: int64

Let’s have a look at the descriptive statistics of the data:

print(data.describe())
               BHK          Rent         Size     Bathroom
count  4746.000000  4.746000e+03  4746.000000  4746.000000
mean      2.083860  3.499345e+04   967.490729     1.965866
std       0.832256  7.810641e+04   634.202328     0.884532
min       1.000000  1.200000e+03    10.000000     1.000000
25%       2.000000  1.000000e+04   550.000000     1.000000
50%       2.000000  1.600000e+04   850.000000     2.000000
75%       3.000000  3.300000e+04  1200.000000     2.000000
max       6.000000  3.500000e+06  8000.000000    10.000000

Now let’s have a look at the mean, median, highest, and lowest rent of the houses:

print(f"Mean Rent: {data.Rent.mean()}")
print(f"Median Rent: {data.Rent.median()}")
print(f"Highest Rent: {data.Rent.max()}")
print(f"Lowest Rent: {data.Rent.min()}")
Mean Rent: 34993.45132743363
Median Rent: 16000.0
Highest Rent: 3500000
Lowest Rent: 1200
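The mean is more than double the median, which suggests that a few very expensive listings are pulling the average up. One common way to quantify this is the interquartile-range (IQR) rule. The sketch below applies it to the quartiles printed by describe() above; the 1.5 multiplier is the usual convention, not something derived from this dataset:

```python
# Quartiles and maximum of Rent, copied from the describe() output above
q1 = 10_000.0
q3 = 33_000.0
max_rent = 3_500_000.0

iqr = q3 - q1                      # spread of the middle 50% of rents
upper_fence = q3 + 1.5 * iqr       # conventional cut-off for high outliers

print(f"IQR: {iqr}")
print(f"Upper fence: {upper_fence}")
print(f"Max rent is {max_rent / upper_fence:.1f}x the upper fence")
```

Under this rule, any rent above 67,500 would count as an outlier, and the highest rent in the data is far beyond that. Whether to remove such rows or, for example, model the logarithm of the rent is a judgment call that this article does not make.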

Now let’s have a look at the rent of the houses in different cities according to the number of bedrooms, halls, and kitchens:

figure = px.bar(data, x="City",
                y="Rent",
                color="BHK",
                title="Rent in Different Cities According to BHK")
figure.show()
[Figure: Rent in Different Cities According to BHK]
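One caveat about this kind of chart: as far as I understand, px.bar draws one segment per row and stacks the segments that share an x value, so the height of each city's bar is the sum of all rents listed there and grows with the number of listings. A rough sketch of an alternative is to aggregate first, for example by taking the median rent per city and BHK. The toy rows below are made up for illustration and are not taken from the dataset:

```python
import pandas as pd

# Toy rows made up for illustration; the real data comes from House_Rent_Dataset.csv
toy = pd.DataFrame({
    "City": ["Kolkata", "Kolkata", "Kolkata", "Mumbai", "Mumbai"],
    "BHK":  [2, 2, 3, 2, 2],
    "Rent": [10000, 20000, 30000, 40000, 60000],
})

# Median rent for each (City, BHK) pair, independent of how many listings there are
median_rent = toy.groupby(["City", "BHK"])["Rent"].median().reset_index()
print(median_rent)
```

The aggregated table could then be passed to px.bar in the same way as the raw data.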

Now let’s have a look at the rent of the houses in different cities according to the area type:

figure = px.bar(data, x="City",
                y="Rent",
                color="Area Type",
                title="Rent in Different Cities According to Area Type")
figure.show()
[Figure: Rent in Different Cities According to Area Type]

Now let’s have a look at the rent of the houses in different cities according to the furnishing status of the house:

figure = px.bar(data, x="City",
                y="Rent",
                color="Furnishing Status",
                title="Rent in Different Cities According to Furnishing Status")
figure.show()
[Figure: Rent in Different Cities According to Furnishing Status]

Now let’s have a look at the rent of the houses in different cities according to the size of the house:

figure = px.bar(data, x="City",
                y="Rent",
                color="Size",
                title="Rent in Different Cities According to Size")
figure.show()
[Figure: Rent in Different Cities According to Size]

Now let’s have a look at the number of houses available for rent in different cities according to the dataset:

cities = data["City"].value_counts()
label = cities.index
counts = cities.values
colors = ['gold','lightgreen']

fig = go.Figure(data=[go.Pie(labels=label, values=counts, hole=0.5)])
fig.update_layout(title_text='Number of Houses Available for Rent')
fig.update_traces(hoverinfo='label+percent', textinfo='value', textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3)))
fig.show()
[Figure: Number of Houses Available for Rent]

Now let’s have a look at the number of houses available for different types of tenants:

# Preference of Tenant
tenant = data["Tenant Preferred"].value_counts()
label = tenant.index
counts = tenant.values
colors = ['gold','lightgreen']

fig = go.Figure(data=[go.Pie(labels=label, values=counts, hole=0.5)])
fig.update_layout(title_text='Preference of Tenant in India')
fig.update_traces(hoverinfo='label+percent', textinfo='value', textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3)))
fig.show()
[Figure: Preference of Tenant in India]

House Rent Prediction Model

Now I will convert the categorical features we need for training into numerical values. Each city is replaced by a numeric code that appears to be the leading digits of that city’s PIN codes (for example, 7000 for Kolkata):

data["Area Type"] = data["Area Type"].map({"Super Area": 1, 
                                           "Carpet Area": 2, 
                                           "Built Area": 3})
data["City"] = data["City"].map({"Mumbai": 4000, "Chennai": 6000, 
                                 "Bangalore": 5600, "Hyderabad": 5000, 
                                 "Delhi": 1100, "Kolkata": 7000})
data["Furnishing Status"] = data["Furnishing Status"].map({"Unfurnished": 0, 
                                                           "Semi-Furnished": 1, 
                                                           "Furnished": 2})
data["Tenant Preferred"] = data["Tenant Preferred"].map({"Bachelors/Family": 2, 
                                                         "Bachelors": 1, 
                                                         "Family": 3})
print(data.head())
    Posted On  BHK   Rent  Size            Floor  Area Type  \
0  2022-05-18    2  10000  1100  Ground out of 2          1   
1  2022-05-13    2  20000   800       1 out of 3          1   
2  2022-05-16    2  17000  1000       1 out of 3          1   
3  2022-07-04    2  10000   800       1 out of 2          1   
4  2022-05-09    2   7500   850       1 out of 2          2   

              Area Locality  City  Furnishing Status  Tenant Preferred  \
0                    Bandel  7000                  0                 2   
1  Phool Bagan, Kankurgachi  7000                  1                 2   
2   Salt Lake City Sector 2  7000                  1                 2   
3               Dumdum Park  7000                  0                 2   
4             South Dum Dum  7000                  0                 1   

   Bathroom Point of Contact  
0         2    Contact Owner  
1         1    Contact Owner  
2         1    Contact Owner  
3         1    Contact Owner  
4         1    Contact Owner  
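Encoding a city as a single number implies an ordering (Kolkata at 7000 looks "larger" than Delhi at 1100) that has nothing to do with rent, and a model may try to use it. One-hot encoding is a common alternative that avoids this. The sketch below is not what the model in this article uses; it only shows the idea on a toy frame with made-up values:

```python
import pandas as pd

# Toy frame for illustration; column names follow the dataset, values are made up
toy = pd.DataFrame({
    "City": ["Kolkata", "Delhi", "Mumbai", "Delhi"],
    "Size": [1100, 800, 1000, 850],
})

# One 0/1 column per city instead of a single numeric code
encoded = pd.get_dummies(toy, columns=["City"], dtype=int)
print(encoded)
```

Each row then has exactly one city column set to 1, at the cost of one extra feature per city.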

Now I will split the data into training and test sets:

#splitting data
from sklearn.model_selection import train_test_split
x = np.array(data[["BHK", "Size", "Area Type", "City", 
                   "Furnishing Status", "Tenant Preferred", 
                   "Bathroom"]])
y = np.array(data[["Rent"]])

xtrain, xtest, ytrain, ytest = train_test_split(x, y, 
                                                test_size=0.10, 
                                                random_state=42)
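The selected features live on very different scales (BHK is a single digit, while Size and the City code run into the thousands), and neural networks usually train more smoothly on standardized inputs. The article trains on the raw values; as a hedged sketch of what scaling could look like, the hypothetical helper `standardize` below computes the mean and standard deviation on the training rows only and applies them to both splits. The toy arrays are made up:

```python
import numpy as np

def standardize(train, test):
    """Scale columns to zero mean and unit variance using training statistics only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0            # avoid division by zero for constant columns
    return (train - mean) / std, (test - mean) / std

# Toy feature rows (BHK, Size, City code), made up for illustration
toy_train = np.array([[2, 1100, 7000], [3, 800, 1100], [1, 500, 4000]], dtype=float)
toy_test = np.array([[2, 900, 5600]], dtype=float)

train_scaled, test_scaled = standardize(toy_train, toy_test)
print(train_scaled.mean(axis=0))   # each column is approximately 0
print(train_scaled.std(axis=0))    # each column is approximately 1
```

Fitting the statistics on the training split alone keeps information from the test rows out of training. scikit-learn's StandardScaler provides the same transformation.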

Now let’s train a house rent prediction model using an LSTM neural network:

from keras.models import Sequential
from keras.layers import Dense, LSTM
model = Sequential()
model.add(LSTM(128, return_sequences=True, 
               input_shape= (xtrain.shape[1], 1)))
model.add(LSTM(64, return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 lstm (LSTM)                 (None, 7, 128)            66560     
                                                                 
 lstm_1 (LSTM)               (None, 64)                49408     
                                                                 
 dense (Dense)               (None, 25)                1625      
                                                                 
 dense_1 (Dense)             (None, 1)                 26        
                                                                 
=================================================================
Total params: 117,619
Trainable params: 117,619
Non-trainable params: 0
_________________________________________________________________
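The parameter counts in the summary above can be checked by hand. A standard LSTM layer has four weight sets (the input, forget, and output gates plus the cell candidate), each with input weights, recurrent weights, and a bias. The helper names below are hypothetical and only reproduce that arithmetic:

```python
def lstm_params(units, input_dim):
    """Trainable parameters of a standard LSTM layer with biases.

    Four weight sets, each with input weights, recurrent weights, and a bias.
    """
    return 4 * (units * (input_dim + units) + units)


def dense_params(units, input_dim):
    """Weights plus one bias per unit."""
    return units * input_dim + units


layers = [
    lstm_params(128, 1),     # first LSTM sees 1 feature per time step
    lstm_params(64, 128),    # second LSTM sees the 128 outputs of the first
    dense_params(25, 64),
    dense_params(1, 25),
]
print(layers)        # [66560, 49408, 1625, 26]
print(sum(layers))   # 117619
```

The input dimension of 1 for the first layer comes from input_shape=(xtrain.shape[1], 1): the seven features are treated as seven time steps with one value each.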
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(xtrain, ytrain, batch_size=1, epochs=21)
Epoch 1/21
4271/4271 [==============================] - 35s 7ms/step - loss: 7038080512.0000
Epoch 2/21
4271/4271 [==============================] - 31s 7ms/step - loss: 6481502720.0000
Epoch 3/21
4271/4271 [==============================] - 31s 7ms/step - loss: 6180754944.0000
Epoch 4/21
4271/4271 [==============================] - 31s 7ms/step - loss: 5968361472.0000
Epoch 5/21
4271/4271 [==============================] - 30s 7ms/step - loss: 5770649088.0000
Epoch 6/21
4271/4271 [==============================] - 29s 7ms/step - loss: 5618835968.0000
Epoch 7/21
4271/4271 [==============================] - 30s 7ms/step - loss: 5440893952.0000
Epoch 8/21
4271/4271 [==============================] - 29s 7ms/step - loss: 5341533696.0000
Epoch 9/21
4271/4271 [==============================] - 30s 7ms/step - loss: 5182846976.0000
Epoch 10/21
4271/4271 [==============================] - 31s 7ms/step - loss: 5106288128.0000
Epoch 11/21
4271/4271 [==============================] - 30s 7ms/step - loss: 5076118528.0000
Epoch 12/21
4271/4271 [==============================] - 30s 7ms/step - loss: 5001080320.0000
Epoch 13/21
4271/4271 [==============================] - 31s 7ms/step - loss: 4941253120.0000
Epoch 14/21
4271/4271 [==============================] - 33s 8ms/step - loss: 4904356864.0000
Epoch 15/21
4271/4271 [==============================] - 29s 7ms/step - loss: 4854262784.0000
Epoch 16/21
4271/4271 [==============================] - 30s 7ms/step - loss: 4855796736.0000
Epoch 17/21
4271/4271 [==============================] - 36s 8ms/step - loss: 4764052480.0000
Epoch 18/21
4271/4271 [==============================] - 30s 7ms/step - loss: 4709226496.0000
Epoch 19/21
4271/4271 [==============================] - 31s 7ms/step - loss: 4702300160.0000
Epoch 20/21
4271/4271 [==============================] - 31s 7ms/step - loss: 4670900736.0000
Epoch 21/21
4271/4271 [==============================] - 31s 7ms/step - loss: 4755582976.0000
<keras.callbacks.History at 0x7fd1deb6c9d0>
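The loss values above are mean squared errors, so they are easier to read after taking the square root. The sketch below converts the final training loss into a root mean squared error (RMSE) and compares it with the variance of Rent from the describe() output, which is roughly the error of always predicting the mean rent. This is only a rough comparison, since the standard deviation was computed on the full dataset rather than on the training split:

```python
import math

final_mse = 4755582976.0      # loss reported for epoch 21 above
rent_std = 7.810641e+04       # std of Rent from describe() above

rmse = math.sqrt(final_mse)   # typical error, in the same units as Rent
baseline_mse = rent_std ** 2  # roughly the MSE of always predicting the mean rent

print(f"RMSE: {rmse:.0f}")
print(f"Baseline MSE (variance of Rent): {baseline_mse:.3e}")
print(f"Model MSE / baseline MSE: {final_mse / baseline_mse:.2f}")
```

An RMSE of about 69,000 against a median rent of 16,000 suggests the model is only modestly better than a constant prediction on the training data. The held-out xtest and ytest split is not used above; the same calculation could be applied to predictions on it to estimate the error on unseen houses.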

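Now here’s how to predict the rent of a housing property using the trained model. The inputs must use the same numeric codes as the encoding step, so the city is entered as its code from the City mapping, such as 1100 for Delhi: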

print("Enter House Details to Predict Rent")
a = int(input("Number of BHK: "))
b = int(input("Size of the House: "))
c = int(input("Area Type (Super Area = 1, Carpet Area = 2, Built Area = 3): "))
d = int(input("Pin Code of the City: "))
e = int(input("Furnishing Status of the House (Unfurnished = 0, Semi-Furnished = 1, Furnished = 2): "))
f = int(input("Tenant Type (Bachelors = 1, Bachelors/Family = 2, Only Family = 3): "))
g = int(input("Number of bathrooms: "))
features = np.array([[a, b, c, d, e, f, g]])
print("Predicted House Rent = ", model.predict(features))
Enter House Details to Predict Rent
Number of BHK: 3
Size of the House: 1100
Area Type (Super Area = 1, Carpet Area = 2, Built Area = 3): 2
Pin Code of the City: 1100
Furnishing Status of the House (Unfurnished = 0, Semi-Furnished = 1, Furnished = 2): 1
Tenant Type (Bachelors = 1, Bachelors/Family = 2, Only Family = 3): 3
Number of bathrooms: 2
Predicted House Rent =  [[34922.3]]
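Typing numeric codes by hand is error-prone. As a small sketch, the hypothetical helper `encode_house` below builds the same seven-value feature row from readable category names, reusing the mappings from the encoding step earlier in the article:

```python
# Mappings copied from the encoding step earlier in the article
AREA_TYPE = {"Super Area": 1, "Carpet Area": 2, "Built Area": 3}
CITY = {"Mumbai": 4000, "Chennai": 6000, "Bangalore": 5600,
        "Hyderabad": 5000, "Delhi": 1100, "Kolkata": 7000}
FURNISHING = {"Unfurnished": 0, "Semi-Furnished": 1, "Furnished": 2}
TENANT = {"Bachelors/Family": 2, "Bachelors": 1, "Family": 3}


def encode_house(bhk, size, area_type, city, furnishing, tenant, bathrooms):
    """Build the 7-value feature row in the order used for training.

    Raises KeyError if a category name is not in the mappings above.
    """
    return [bhk, size, AREA_TYPE[area_type], CITY[city],
            FURNISHING[furnishing], TENANT[tenant], bathrooms]


row = encode_house(3, 1100, "Carpet Area", "Delhi", "Semi-Furnished", "Family", 2)
print(row)  # [3, 1100, 2, 1100, 1, 3, 2]
```

The row matches the values entered in the session above, so wrapping it as `np.array([row])` should give the same input that was passed to model.predict.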

Summary

So this is how to use Machine Learning to predict the rent of a housing property. With appropriate data and Machine Learning techniques, many real estate platforms find the housing options according to the customer’s budget. I hope you liked this article on predicting house rent with Machine Learning using Python. Feel free to ask valuable questions in the comments section below.

Aman Kharwal