The rent of a house depends on a lot of factors. With appropriate data and Machine Learning techniques, many real estate platforms find the housing options according to the customer’s budget. So, if you want to learn how to use Machine Learning to predict the rent of a house, this article is for you. In this article, I will take you through the task of House Rent Prediction with Machine Learning using Python.
House Rent Prediction
The rent of a housing property depends on a lot of factors like:
- number of bedrooms, hall, and kitchen
- size of the property
- the floor of the house
- area type
- area locality
- City
- furnishing status of the house
To build a house rent prediction system, we need data based on the factors affecting the rent of a housing property. I found a dataset from Kaggle which includes all the features we need. You can download the dataset from here.
House Rent Prediction using Python
I will start the task of house rent prediction by importing the necessary Python libraries and the dataset:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go

data = pd.read_csv("House_Rent_Dataset.csv")
print(data.head())
    Posted On  BHK   Rent  Size            Floor    Area Type  \
0  2022-05-18    2  10000  1100  Ground out of 2   Super Area
1  2022-05-13    2  20000   800       1 out of 3   Super Area
2  2022-05-16    2  17000  1000       1 out of 3   Super Area
3  2022-07-04    2  10000   800       1 out of 2   Super Area
4  2022-05-09    2   7500   850       1 out of 2  Carpet Area

              Area Locality     City Furnishing Status  Tenant Preferred  \
0                    Bandel  Kolkata       Unfurnished  Bachelors/Family
1  Phool Bagan, Kankurgachi  Kolkata    Semi-Furnished  Bachelors/Family
2   Salt Lake City Sector 2  Kolkata    Semi-Furnished  Bachelors/Family
3               Dumdum Park  Kolkata       Unfurnished  Bachelors/Family
4             South Dum Dum  Kolkata       Unfurnished         Bachelors

   Bathroom Point of Contact
0         2    Contact Owner
1         1    Contact Owner
2         1    Contact Owner
3         1    Contact Owner
4         1    Contact Owner
Before moving forward, let’s check if the data contains null values or not:
print(data.isnull().sum())
Posted On            0
BHK                  0
Rent                 0
Size                 0
Floor                0
Area Type            0
Area Locality        0
City                 0
Furnishing Status    0
Tenant Preferred     0
Bathroom             0
Point of Contact     0
dtype: int64
Let’s have a look at the descriptive statistics of the data:
print(data.describe())
               BHK          Rent         Size     Bathroom
count  4746.000000  4.746000e+03  4746.000000  4746.000000
mean      2.083860  3.499345e+04   967.490729     1.965866
std       0.832256  7.810641e+04   634.202328     0.884532
min       1.000000  1.200000e+03    10.000000     1.000000
25%       2.000000  1.000000e+04   550.000000     1.000000
50%       2.000000  1.600000e+04   850.000000     2.000000
75%       3.000000  3.300000e+04  1200.000000     2.000000
max       6.000000  3.500000e+06  8000.000000    10.000000
Now let’s have a look at the mean, median, highest, and lowest rent of the houses:
print(f"Mean Rent: {data.Rent.mean()}")
print(f"Median Rent: {data.Rent.median()}")
print(f"Highest Rent: {data.Rent.max()}")
print(f"Lowest Rent: {data.Rent.min()}")
Mean Rent: 34993.45132743363
Median Rent: 16000.0
Highest Rent: 3500000
Lowest Rent: 1200
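The mean rent is more than twice the median, which suggests that a few very expensive listings pull the mean upward. Here is a minimal sketch of that effect. The rents below are made-up values chosen for illustration, not rows from the dataset:

```python
import statistics

# Hypothetical rents: five ordinary listings plus one extreme outlier.
rents = [8000, 10000, 16000, 20000, 33000, 3500000]

mean_rent = statistics.mean(rents)
median_rent = statistics.median(rents)
print(f"Mean Rent: {mean_rent}")
print(f"Median Rent: {median_rent}")

# Dropping the single outlier moves the mean far more than the median.
trimmed = rents[:-1]
trimmed_mean = statistics.mean(trimmed)
trimmed_median = statistics.median(trimmed)
print(f"Mean without outlier: {trimmed_mean}")
print(f"Median without outlier: {trimmed_median}")
```

For skewed data like this, the median is usually the safer summary of a typical rent.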
Now let’s have a look at the rent of the houses in different cities according to the number of bedrooms, halls, and kitchens:
figure = px.bar(data, x=data["City"],
                y = data["Rent"],
                color = data["BHK"],
                title="Rent in Different Cities According to BHK")
figure.show()

Now let’s have a look at the rent of the houses in different cities according to the area type:
figure = px.bar(data, x=data["City"],
                y = data["Rent"],
                color = data["Area Type"],
                title="Rent in Different Cities According to Area Type")
figure.show()

Now let’s have a look at the rent of the houses in different cities according to the furnishing status of the house:
figure = px.bar(data, x=data["City"],
                y = data["Rent"],
                color = data["Furnishing Status"],
                title="Rent in Different Cities According to Furnishing Status")
figure.show()

Now let’s have a look at the rent of the houses in different cities according to the size of the house:
figure = px.bar(data, x=data["City"],
                y = data["Rent"],
                color = data["Size"],
                title="Rent in Different Cities According to Size")
figure.show()
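One thing to keep in mind when reading these bar charts: when px.bar receives the raw rows, it stacks one segment per listing, so the height of each bar reflects the total rent of all listings in a city and not a typical rent. If you want a per-city figure that does not depend on how many listings a city has, you can aggregate first. Below is a small sketch using a made-up sample with the same column names as the dataset (the rows are hypothetical):

```python
import pandas as pd

# Hypothetical sample with the same column names as the dataset.
sample = pd.DataFrame({
    "City": ["Kolkata", "Kolkata", "Mumbai", "Mumbai", "Mumbai"],
    "BHK": [2, 2, 1, 2, 3],
    "Rent": [10000, 20000, 30000, 50000, 100000],
})

# Median rent per city: one number per city, independent of listing count.
median_by_city = sample.groupby("City")["Rent"].median()
print(median_by_city)

# Median rent per city and BHK, reshaped so that each BHK value is a column.
median_by_city_bhk = sample.groupby(["City", "BHK"])["Rent"].median().unstack()
print(median_by_city_bhk)
```

The aggregated result can then be passed to px.bar in place of the raw data.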

Now let’s have a look at the number of houses available for rent in different cities according to the dataset:
cities = data["City"].value_counts()
label = cities.index
counts = cities.values
colors = ['gold','lightgreen']

fig = go.Figure(data=[go.Pie(labels=label, values=counts, hole=0.5)])
fig.update_layout(title_text='Number of Houses Available for Rent')
fig.update_traces(hoverinfo='label+percent', textinfo='value', textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3)))
fig.show()

Now let’s have a look at the number of houses available for different types of tenants:
# Preference of Tenant
tenant = data["Tenant Preferred"].value_counts()
label = tenant.index
counts = tenant.values
colors = ['gold','lightgreen']

fig = go.Figure(data=[go.Pie(labels=label, values=counts, hole=0.5)])
fig.update_layout(title_text='Preference of Tenant in India')
fig.update_traces(hoverinfo='label+percent', textinfo='value', textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3)))
fig.show()

House Rent Prediction Model
Now I will convert all the categorical features into numerical features that we need to train a house rent prediction model:
data["Area Type"] = data["Area Type"].map({"Super Area": 1,
                                           "Carpet Area": 2,
                                           "Built Area": 3})
data["City"] = data["City"].map({"Mumbai": 4000, "Chennai": 6000,
                                 "Bangalore": 5600, "Hyderabad": 5000,
                                 "Delhi": 1100, "Kolkata": 7000})
data["Furnishing Status"] = data["Furnishing Status"].map({"Unfurnished": 0,
                                                           "Semi-Furnished": 1,
                                                           "Furnished": 2})
data["Tenant Preferred"] = data["Tenant Preferred"].map({"Bachelors/Family": 2,
                                                         "Bachelors": 1,
                                                         "Family": 3})
print(data.head())
    Posted On  BHK   Rent  Size            Floor  Area Type  \
0  2022-05-18    2  10000  1100  Ground out of 2          1
1  2022-05-13    2  20000   800       1 out of 3          1
2  2022-05-16    2  17000  1000       1 out of 3          1
3  2022-07-04    2  10000   800       1 out of 2          1
4  2022-05-09    2   7500   850       1 out of 2          2

              Area Locality  City  Furnishing Status  Tenant Preferred  \
0                    Bandel  7000                  0                 2
1  Phool Bagan, Kankurgachi  7000                  1                 2
2   Salt Lake City Sector 2  7000                  1                 2
3               Dumdum Park  7000                  0                 2
4             South Dum Dum  7000                  0                 1

   Bathroom Point of Contact
0         2    Contact Owner
1         1    Contact Owner
2         1    Contact Owner
3         1    Contact Owner
4         1    Contact Owner
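Two things are worth checking after a dictionary mapping like this. First, Series.map returns NaN for any value that is missing from the dictionary, so it is a good idea to count nulls again after mapping. Second, integer codes imply an ordering between categories that may not exist, which is why one-hot encoding is a common alternative. Here is a small sketch of both ideas on a made-up sample (the city "Pune" and the shortened mapping are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical sample; "Pune" is deliberately absent from the mapping.
sample = pd.DataFrame({"City": ["Mumbai", "Delhi", "Pune"]})

city_codes = {"Mumbai": 4000, "Delhi": 1100}
mapped = sample["City"].map(city_codes)

# Series.map returns NaN for keys that are missing from the dictionary.
unmapped_count = int(mapped.isnull().sum())
print(f"Unmapped values: {unmapped_count}")

# Alternative: one indicator column per city, with no implied ordering.
one_hot = pd.get_dummies(sample["City"], prefix="City")
print(one_hot.columns.tolist())
```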
Now I will split the data into training and test sets:
#splitting data
from sklearn.model_selection import train_test_split
x = np.array(data[["BHK", "Size", "Area Type", "City",
                   "Furnishing Status", "Tenant Preferred",
                   "Bathroom"]])
y = np.array(data[["Rent"]])

xtrain, xtest, ytrain, ytest = train_test_split(x, y,
                                                test_size=0.10,
                                                random_state=42)
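To see how many rows end up in each set, here is a quick worked calculation. It assumes, as I understand scikit-learn's behaviour, that the test count is rounded up and the remaining rows go to training. The row count comes from the descriptive statistics above:

```python
import math

n_samples = 4746   # row count reported by data.describe()
test_size = 0.10   # fraction passed to train_test_split

# Assumption: the test count is rounded up, the rest goes to training.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(f"Training rows: {n_train}")
print(f"Test rows: {n_test}")
```

With a batch size of 1, the number of training rows is also the number of steps you should expect to see per epoch during training.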
Now let’s train a house rent prediction model using an LSTM neural network:
from keras.models import Sequential
from keras.layers import Dense, LSTM

model = Sequential()
model.add(LSTM(128, return_sequences=True,
               input_shape= (xtrain.shape[1], 1)))
model.add(LSTM(64, return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 lstm (LSTM)                 (None, 7, 128)            66560
 lstm_1 (LSTM)               (None, 64)                49408
 dense (Dense)               (None, 25)                1625
 dense_1 (Dense)             (None, 1)                 26
=================================================================
Total params: 117,619
Trainable params: 117,619
Non-trainable params: 0
_________________________________________________________________
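The parameter counts in the summary can be verified by hand. A standard LSTM layer has four gates, each with input weights, recurrent weights, and a bias, while a Dense layer has one weight per input-output pair plus one bias per output. The helper functions below are my own names for illustration, not part of Keras:

```python
def lstm_params(units, input_dim):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * (units * (input_dim + units) + units)


def dense_params(units, input_dim):
    # One weight per input-output pair plus one bias per output unit.
    return input_dim * units + units


counts = [
    lstm_params(128, 1),    # lstm: 1 value per timestep
    lstm_params(64, 128),   # lstm_1: receives 128 values per timestep
    dense_params(25, 64),   # dense
    dense_params(1, 25),    # dense_1
]
total = sum(counts)
print(counts)
print(f"Total params: {total}")
```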
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(xtrain, ytrain, batch_size=1, epochs=21)
Epoch 1/21
4271/4271 [==============================] - 35s 7ms/step - loss: 7038080512.0000
Epoch 2/21
4271/4271 [==============================] - 31s 7ms/step - loss: 6481502720.0000
Epoch 3/21
4271/4271 [==============================] - 31s 7ms/step - loss: 6180754944.0000
Epoch 4/21
4271/4271 [==============================] - 31s 7ms/step - loss: 5968361472.0000
Epoch 5/21
4271/4271 [==============================] - 30s 7ms/step - loss: 5770649088.0000
Epoch 6/21
4271/4271 [==============================] - 29s 7ms/step - loss: 5618835968.0000
Epoch 7/21
4271/4271 [==============================] - 30s 7ms/step - loss: 5440893952.0000
Epoch 8/21
4271/4271 [==============================] - 29s 7ms/step - loss: 5341533696.0000
Epoch 9/21
4271/4271 [==============================] - 30s 7ms/step - loss: 5182846976.0000
Epoch 10/21
4271/4271 [==============================] - 31s 7ms/step - loss: 5106288128.0000
Epoch 11/21
4271/4271 [==============================] - 30s 7ms/step - loss: 5076118528.0000
Epoch 12/21
4271/4271 [==============================] - 30s 7ms/step - loss: 5001080320.0000
Epoch 13/21
4271/4271 [==============================] - 31s 7ms/step - loss: 4941253120.0000
Epoch 14/21
4271/4271 [==============================] - 33s 8ms/step - loss: 4904356864.0000
Epoch 15/21
4271/4271 [==============================] - 29s 7ms/step - loss: 4854262784.0000
Epoch 16/21
4271/4271 [==============================] - 30s 7ms/step - loss: 4855796736.0000
Epoch 17/21
4271/4271 [==============================] - 36s 8ms/step - loss: 4764052480.0000
Epoch 18/21
4271/4271 [==============================] - 30s 7ms/step - loss: 4709226496.0000
Epoch 19/21
4271/4271 [==============================] - 31s 7ms/step - loss: 4702300160.0000
Epoch 20/21
4271/4271 [==============================] - 31s 7ms/step - loss: 4670900736.0000
Epoch 21/21
4271/4271 [==============================] - 31s 7ms/step - loss: 4755582976.0000
<keras.callbacks.History at 0x7fd1deb6c9d0>
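The loss values look huge because mean squared error is measured in squared rupees. Taking the square root gives a root mean squared error in the same unit as the rent, which is easier to interpret. The sketch below uses the epoch 21 loss from the log and the standard deviation of Rent from the descriptive statistics. Treat the comparison as rough, since the loss is a training loss and the standard deviation covers the whole dataset:

```python
import math

final_mse = 4755582976.0   # training loss reported for epoch 21
rent_std = 78106.41        # std of Rent from data.describe() (7.810641e+04)

# RMSE is in the same unit as the target (rent).
rmse = math.sqrt(final_mse)
ratio = rmse / rent_std

print(f"RMSE: {rmse:.0f}")
print(f"RMSE / std of Rent: {ratio:.2f}")
```

A ratio close to 1 would mean the model is doing little better than always predicting the mean rent, so this is a useful number to track if you experiment with feature scaling or other models.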
Now here’s how to predict the rent of a housing property using the trained model:
print("Enter House Details to Predict Rent")
a = int(input("Number of BHK: "))
b = int(input("Size of the House: "))
c = int(input("Area Type (Super Area = 1, Carpet Area = 2, Built Area = 3): "))
d = int(input("Pin Code of the City: "))
e = int(input("Furnishing Status of the House (Unfurnished = 0, Semi-Furnished = 1, Furnished = 2): "))
f = int(input("Tenant Type (Bachelors = 1, Bachelors/Family = 2, Only Family = 3): "))
g = int(input("Number of bathrooms: "))
features = np.array([[a, b, c, d, e, f, g]])
print("Predicted House Price = ", model.predict(features))
Enter House Details to Predict Rent
Number of BHK: 3
Size of the House: 1100
Area Type (Super Area = 1, Carpet Area = 2, Built Area = 3): 2
Pin Code of the City: 1100
Furnishing Status of the House (Unfurnished = 0, Semi-Furnished = 1, Furnished = 2): 1
Tenant Type (Bachelors = 1, Bachelors/Family = 2, Only Family = 3): 3
Number of bathrooms: 2
Predicted House Price =  [[34922.3]]
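The model was declared with input_shape=(xtrain.shape[1], 1), which means it expects 7 timesteps with 1 value each. Whether a 2-D array is accepted as-is may depend on the Keras version, so making the shape explicit is a safe habit. Here is a minimal sketch using the same input values as the example above. The name lstm_input is my own:

```python
import numpy as np

# Same values as the example input above: one house, seven features.
features = np.array([[3, 1100, 2, 1100, 1, 3, 2]])
print(features.shape)

# Add a trailing axis so each of the 7 features is one timestep with 1 value.
lstm_input = features.reshape(features.shape[0], features.shape[1], 1)
print(lstm_input.shape)
```

The reshaped array could then be passed to model.predict in place of features.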
Summary
So this is how to use Machine Learning to predict the rent of a housing property. With appropriate data and Machine Learning techniques, many real estate platforms find the housing options according to the customer’s budget. I hope you liked this article on predicting house rent with Machine Learning using Python. Feel free to ask valuable questions in the comments section below.
sir on what basis you map cities as mumbai = 4000 etc
pincode 🙃