Covid-19 Deaths Prediction with Machine Learning

Covid-19 is one of the deadliest viral diseases of recent years. Mutations in the virus can make it more deadly or more infectious, and deaths tend to rise when there is a large wave of cases. We can use historical data on Covid-19 cases and deaths to predict the number of deaths in the future. So if you want to learn how to predict Covid-19 deaths with machine learning, this article is for you. In this article, I will take you through the task of Covid-19 deaths prediction with machine learning using Python.

Covid-19 Deaths Prediction (Case Study)

You are given a dataset of Covid-19 in India from 30 January 2020 to 18 January 2022. The dataset contains information about the daily confirmed cases and deaths. Below are all the columns of the dataset:

  1. Date: Contains the date of the record
  2. Date_YMD: Contains date in Year-Month-Day Format
  3. Daily Confirmed: Contains the daily confirmed cases of Covid-19
  4. Daily Deceased: Contains the daily deaths due to Covid-19

You need to use this historical data of covid-19 cases and deaths to predict the number of deaths for the next 30 days. You can download this dataset from here.

Covid-19 Deaths Prediction using Python

I hope you have now understood the problem statement above. Now I will import the necessary Python libraries and the dataset we need for the task of Covid-19 deaths prediction:

import pandas as pd
import numpy as np
data = pd.read_csv("COVID19 data for overall INDIA.csv")
print(data.head())
              Date    Date_YMD  Daily Confirmed  Daily Deceased
0  30 January 2020  2020-01-30                1               0
1  31 January 2020  2020-01-31                0               0
2  1 February 2020  2020-02-01                0               0
3  2 February 2020  2020-02-02                1               0
4  3 February 2020  2020-02-03                1               0
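
The Date_YMD column is read as plain text by default. As a minimal sketch (the helper name parse_dates and the small sample frame are illustrative assumptions, not part of the original code), it can be converted to real datetime values so that sorting and resampling behave as expected:

```python
import pandas as pd

def parse_dates(df: pd.DataFrame, date_col: str = "Date_YMD") -> pd.DataFrame:
    """Return a copy of df with date_col as datetime64, sorted by date."""
    out = df.copy()
    out[date_col] = pd.to_datetime(out[date_col], format="%Y-%m-%d")
    return out.sort_values(date_col).reset_index(drop=True)

# Illustrative sample in the same layout as the dataset (deliberately unsorted)
sample = pd.DataFrame({
    "Date_YMD": ["2020-01-31", "2020-01-30", "2020-02-01"],
    "Daily Confirmed": [0, 1, 0],
    "Daily Deceased": [0, 0, 0],
})
parsed = parse_dates(sample)
print(parsed.dtypes)
```

On the real dataset this could be applied as data = parse_dates(data); the rest of the article works with or without this step.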

Before moving forward, let’s have a quick look at whether this dataset contains any null values or not:

data.isnull().sum()
Date               0
Date_YMD           0
Daily Confirmed    0
Daily Deceased     0
dtype: int64
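
A null check does not reveal whether whole days are absent from a daily series. As a hedged sketch (missing_dates is a hypothetical helper and the demo frame is made up for illustration), the calendar days missing between the first and last record can be listed like this:

```python
import pandas as pd

def missing_dates(df: pd.DataFrame, date_col: str = "Date_YMD") -> pd.DatetimeIndex:
    """Return the calendar days absent between the first and last date in df."""
    dates = pd.to_datetime(df[date_col])
    full_range = pd.date_range(dates.min(), dates.max(), freq="D")
    return full_range.difference(pd.DatetimeIndex(dates))

# Illustrative frame with one skipped day (1 February 2020)
demo = pd.DataFrame({"Date_YMD": ["2020-01-30", "2020-01-31", "2020-02-02"]})
gaps = missing_dates(demo)
print(gaps)
```

An empty result means the series has one row for every day in its range, which is what a daily forecasting model normally expects.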

We don’t need the Date column because Date_YMD holds the same information in a cleaner format, so let’s drop it from our dataset:

data = data.drop("Date", axis=1)

Let’s have a look at the daily confirmed cases of Covid-19:

import plotly.express as px
fig = px.bar(data, x='Date_YMD', y='Daily Confirmed')
fig.show()
[Figure: Daily confirmed cases of Covid-19]

In the data visualization above, we can see a large wave of Covid-19 cases between April 2021 and May 2021.
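
Daily counts are noisy, so the timing of a wave is easier to pin down on a smoothed series. The sketch below uses a 7-day rolling mean and reports the date where it peaks; the function name rolling_peak, the window length and the synthetic demo data are assumptions for illustration only:

```python
import pandas as pd

def rolling_peak(df, value_col="Daily Confirmed", date_col="Date_YMD", window=7):
    """Return the rolling-mean series and the date on which it is highest."""
    series = df.set_index(pd.to_datetime(df[date_col]))[value_col]
    # Each value is the mean of the current day and the previous window-1 days
    smoothed = series.rolling(window, min_periods=window).mean()
    return smoothed, smoothed.idxmax()

# Synthetic demo: a one-week bump in an otherwise flat series
dates = pd.date_range("2021-04-01", periods=20, freq="D")
demo = pd.DataFrame({
    "Date_YMD": dates.strftime("%Y-%m-%d"),
    "Daily Confirmed": [10] * 8 + [100] * 7 + [10] * 5,
})
smoothed, peak_day = rolling_peak(demo)
print(peak_day)
```

Because the window looks backwards, the reported peak date is the last day of the highest 7-day stretch, not its midpoint.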

Covid-19 Death Rate Analysis

Now let’s visualize how the total number of deaths compares with the total number of confirmed cases:

cases = data["Daily Confirmed"].sum()
deceased = data["Daily Deceased"].sum()

labels = ["Confirmed", "Deceased"]
values = [cases, deceased]

fig = px.pie(data, values=values, 
             names=labels, 
             title='Daily Confirmed Cases vs Daily Deaths', hole=0.5)
fig.show()
[Figure: Covid-19 death rate]

Let’s calculate the death rate of Covid-19, that is, total deaths as a percentage of total confirmed cases:

death_rate = (data["Daily Deceased"].sum() / data["Daily Confirmed"].sum()) * 100
print(death_rate)
1.2840580507834722
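
The figure above is a single overall ratio for the whole period. As a rough sketch of how the same ratio could be tracked over time (monthly_death_rate is a hypothetical helper, and the demo numbers are invented, not taken from the dataset), the counts can be summed per month before dividing:

```python
import pandas as pd

def monthly_death_rate(df: pd.DataFrame, date_col: str = "Date_YMD") -> pd.Series:
    """Deaths as a percentage of confirmed cases, computed per calendar month."""
    indexed = df.set_index(pd.to_datetime(df[date_col]))
    monthly = indexed[["Daily Confirmed", "Daily Deceased"]].resample("MS").sum()
    # Months with zero confirmed cases produce NaN or inf and need separate handling
    return monthly["Daily Deceased"] / monthly["Daily Confirmed"] * 100

# Invented demo values covering two months
demo = pd.DataFrame({
    "Date_YMD": ["2021-01-10", "2021-01-20", "2021-02-05"],
    "Daily Confirmed": [100, 100, 200],
    "Daily Deceased": [1, 1, 4],
})
rate = monthly_death_rate(demo)
print(rate)
```

Deaths usually lag behind the cases that caused them, so a monthly ratio like this is only an approximation of the true fatality rate in any given month.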

Now let’s have a look at the daily deaths due to Covid-19:

import plotly.express as px
fig = px.bar(data, x='Date_YMD', y='Daily Deceased')
fig.show()
[Figure: Daily deaths due to Covid-19]

We can see that daily deaths are highest during the large wave of Covid-19 cases.
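
One way to put a number on this visual impression is to correlate deaths with cases from a few days earlier. The sketch below is only an illustration under assumptions: lagged_correlation is a hypothetical helper, the 21-day search range is arbitrary, and the demo data is synthetic with a built-in 10-day delay:

```python
import numpy as np
import pandas as pd

def lagged_correlation(df: pd.DataFrame, max_lag: int = 21) -> pd.Series:
    """Pearson correlation between deaths and cases shifted back by 0..max_lag days."""
    cases = df["Daily Confirmed"]
    deaths = df["Daily Deceased"]
    return pd.Series(
        {lag: cases.shift(lag).corr(deaths) for lag in range(max_lag + 1)},
        name="correlation",
    )

# Synthetic demo: deaths are 1% of the cases reported 10 days earlier
t = np.arange(120)
cases = pd.Series(1000 * np.exp(-((t - 50) / 10) ** 2))
demo = pd.DataFrame({
    "Daily Confirmed": cases,
    "Daily Deceased": cases.shift(10).fillna(0) * 0.01,
})
correlations = lagged_correlation(demo)
best_lag = correlations.idxmax()
print(best_lag)
```

On real data the best lag will be less clear-cut, and a high correlation does not by itself say anything about causes.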

Covid-19 Deaths Prediction Model

Now let’s move to the task of Covid-19 deaths prediction for the next 30 days. Here I will be using the AutoTS library, an automated machine learning library for time series forecasting. If you have never used this library before, you can install it by executing the command mentioned below in your command prompt or terminal:

  • pip install autots

Now here’s how to predict covid-19 deaths with machine learning for the next 30 days:

from autots import AutoTS
model = AutoTS(forecast_length=30, frequency='infer', ensemble='simple')
model = model.fit(data, date_col="Date_YMD", value_col='Daily Deceased', id_col=None)
prediction = model.predict()
forecast = prediction.forecast
print(forecast)
            Daily Deceased
2022-01-19      271.950000
2022-01-20      310.179787
2022-01-21      297.500000
2022-01-22      310.179787
2022-01-23      271.950000
2022-01-24      258.518302
2022-01-25      340.355520
2022-01-26      296.561343
2022-01-27      296.561343
2022-01-28      284.438262
2022-01-29      323.400000
2022-01-30      271.950000
2022-01-31      245.750000
2022-02-01      284.438262
2022-02-02      258.518302
2022-02-03      239.969607
2022-02-04      271.950000
2022-02-05      334.118953
2022-02-06      323.400000
2022-02-07      271.950000
2022-02-08      284.438262
2022-02-09      323.400000
2022-02-10      258.518302
2022-02-11      245.750000
2022-02-12      245.750000
2022-02-13      326.442185
2022-02-14      323.400000
2022-02-15      394.343619
2022-02-16      228.117431
2022-02-17      358.200000
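
A forecast is easier to judge next to a simple baseline. The sketch below is not part of AutoTS; it is a hand-written seasonal-naive baseline (repeat the last observed week) scored on the final 30 days of a series. The function names, the weekly season and the synthetic demo series are all assumptions for illustration:

```python
import numpy as np
import pandas as pd

def seasonal_naive_forecast(series: pd.Series, horizon: int = 30, season: int = 7) -> np.ndarray:
    """Repeat the last `season` observed values forward for `horizon` steps."""
    last_cycle = series.to_numpy()[-season:]
    reps = int(np.ceil(horizon / season))
    return np.tile(last_cycle, reps)[:horizon]

def holdout_mae(series: pd.Series, horizon: int = 30, season: int = 7) -> float:
    """Mean absolute error of the baseline on the last `horizon` observations."""
    train, test = series.iloc[:-horizon], series.iloc[-horizon:]
    forecast = seasonal_naive_forecast(train, horizon, season)
    return float(np.mean(np.abs(test.to_numpy() - forecast)))

# Synthetic demo: a perfectly weekly pattern, so the baseline makes no error
demo = pd.Series(np.tile([1, 2, 3, 4, 5, 6, 7], 10))
mae = holdout_mae(demo)
print(mae)
```

On the real dataset the same function could be called as holdout_mae(data["Daily Deceased"]). A model fitted on the same training window would need to beat this error on the held-out 30 days to be worth its extra complexity.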

Summary

So this is how we can predict Covid-19 deaths with machine learning using the Python programming language. We can use the historical data of Covid-19 cases and deaths to predict the number of deaths in the future. You can apply the same method to the latest dataset to predict Covid-19 deaths and waves. I hope you liked this article on Covid-19 deaths prediction with machine learning. Feel free to ask questions in the comments section below.

Aman Kharwal