Contact tracing is a process used by public health ministries to help stop the spread of infectious diseases, such as COVID-19, within a community. In this article, I will take you through the task of contact tracing with Machine Learning.
How Contact Tracing Works
Once a person tests positive for coronavirus, it is very important to identify other people who may have been infected by the diagnosed patient. To identify them, the authorities trace the patient's activity over the last 14 days. This process is called contact tracing. Depending on the country and the local authority, the search for contacts is carried out either manually or digitally.
In this article, I will be proposing a digital contact tracing algorithm that relies on GPS data, which can be used in contact tracing with machine learning.
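Because the algorithm relies on GPS data, "close contact" comes down to the distance between two latitude/longitude fixes. Here is a minimal sketch of that distance using the haversine formula. The helper name `haversine_km` and the Earth radius constant are my own choices for illustration, not part of any library:

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumed constant)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Distance between the two fixes recorded for David in the sample rows of the dataset
d = haversine_km(13.148953, 77.593651, 13.222397, 77.652828)
print(f"{d:.2f} km")
```

Two fixes count as a possible contact only when this distance falls under a chosen threshold, such as a couple of metres.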
Contact Tracing with Machine Learning
DBSCAN is a density-based data clustering algorithm that groups data points in a given space. The DBSCAN algorithm groups data points close to each other and marks outlier data points as noise. I will use the DBSCAN algorithm for the task of contact tracing with Machine Learning.
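To make the idea concrete, here is a small sketch with scikit-learn's `DBSCAN` on made-up 2D points. The points and the values of `eps` and `min_samples` are assumptions chosen only for illustration:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical toy points: two dense groups and one isolated point
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],   # dense group A
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],   # dense group B
    [10.0, 0.0],                          # isolated point
])

# eps: neighbourhood radius; min_samples: points needed to form a dense region
model = DBSCAN(eps=0.5, min_samples=2).fit(points)
labels = model.labels_  # one label per point; scikit-learn uses -1 for noise
print(labels)
```

Each dense group gets its own cluster label, while the isolated point is labelled -1, which is how scikit-learn marks noise.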
The dataset that I will use in this task is a JSON file, which can be easily downloaded from here. Now let's import the libraries that we need for this task, read the dataset, and look at the first few rows:
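A minimal sketch of the loading step, assuming the file is a JSON array of records with the four columns shown in the preview. To keep the snippet self-contained, it reads the five preview rows from an in-memory string. With the real file you would pass its path to `pd.read_json` instead (for example `pd.read_json("contacts.json")`, where the file name is hypothetical). The plotting snippets in this article additionally assume `import matplotlib.pyplot as plt` and `import seaborn as sns`.

```python
import io
import pandas as pd

# The five records from the preview, inlined so the example runs on its own;
# the real dataset contains more rows.
sample_json = """[
  {"id": "David", "timestamp": "2020-07-04 15:35:30", "latitude": 13.148953, "longitude": 77.593651},
  {"id": "David", "timestamp": "2020-07-04 16:35:30", "latitude": 13.222397, "longitude": 77.652828},
  {"id": "Frank", "timestamp": "2020-07-04 14:35:30", "latitude": 13.236507, "longitude": 77.693792},
  {"id": "Carol", "timestamp": "2020-07-04 21:35:30", "latitude": 13.163716, "longitude": 77.562842},
  {"id": "Ivan",  "timestamp": "2020-07-04 22:35:30", "latitude": 13.232095, "longitude": 77.580273}
]"""

# read_json accepts a path or a file-like object
df = pd.read_json(io.StringIO(sample_json))
print(df.head())
```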
      id            timestamp   latitude  longitude
0  David  2020-07-04 15:35:30  13.148953  77.593651
1  David  2020-07-04 16:35:30  13.222397  77.652828
2  Frank  2020-07-04 14:35:30  13.236507  77.693792
3  Carol  2020-07-04 21:35:30  13.163716  77.562842
4   Ivan  2020-07-04 22:35:30  13.232095  77.580273
Now, let's explore the dataset with a scatter plot of the recorded locations, with latitude on the x-axis, longitude on the y-axis, and the points colored by id:
plt.figure(figsize=(8,6))
sns.scatterplot(x='latitude', y='longitude', data=df, hue='id')
plt.legend(bbox_to_anchor=[1, 0.8])
plt.show()

Creating a Model for Contact Tracing with Machine Learning
Now let's create a model for contact tracing using the DBSCAN algorithm. We will fit a DBSCAN model to the location data, use it to generate clusters, and then identify possible infections by filtering the data in those clusters.
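Here is a minimal sketch of this step, under stated assumptions rather than as the definitive implementation. The contact distance of 6 feet, `min_samples=2`, and the toy coordinates are all assumptions. The toy `df` is an invented stand-in, so the snippet runs on its own. scikit-learn's haversine metric expects [latitude, longitude] in radians and measures distance in radians, so both the coordinates and `eps` are converted:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN

EARTH_RADIUS_KM = 6371.0088       # mean Earth radius (assumed constant)
CONTACT_DISTANCE_KM = 0.0018288   # 6 feet in kilometres (assumed threshold)

# Invented toy records standing in for the real dataset: Erin and Ivan are
# about one metre apart at one location; all other fixes are kilometres apart.
df = pd.DataFrame({
    "id":        ["Erin",   "Ivan",   "Carol",  "Erin",   "Frank"],
    "latitude":  [13.00000, 13.00001, 13.05000, 13.10000, 13.10000],
    "longitude": [77.60000, 77.60000, 77.65000, 77.70000, 77.80000],
})

# Haversine in scikit-learn works on radians, so convert degrees and scale eps
coords = np.radians(df[["latitude", "longitude"]].to_numpy())
model = DBSCAN(
    eps=CONTACT_DISTANCE_KM / EARTH_RADIUS_KM,
    min_samples=2,
    metric="haversine",
    algorithm="ball_tree",
).fit(coords)
df["cluster"] = model.labels_

def get_infected_names(input_name):
    """Return the other ids that share at least one cluster with input_name."""
    own = df.loc[df["id"] == input_name, "cluster"]
    own = own[own != -1].unique()  # ignore noise points
    same_cluster = df["cluster"].isin(own) & (df["id"] != input_name)
    return sorted(set(df.loc[same_cluster, "id"]))
```

Note that this sketch clusters on location only and ignores the timestamp column, so two people who visited the same spot at different times would still be grouped together. Filtering by a time window before clustering is one way to tighten it.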
Now, let’s generate clusters using our model:
# model is the fitted DBSCAN model; labels_ holds one cluster label per row (-1 marks noise)
labels = model.labels_
fig = plt.figure(figsize=(12,10))
sns.scatterplot(x=df['latitude'], y=df['longitude'], hue=['cluster-{}'.format(x) for x in labels])
plt.legend(bbox_to_anchor=[1, 1])
plt.show()

Tracing Infected People
To find people who may be infected by the patient, we’ll just call the get_infected_names function and enter a name from the dataset as a parameter:
print(get_infected_names("Erin"))
['Ivan']
From the above results, we can say that a clustering algorithm like DBSCAN can group data points without labeled examples or a preset number of clusters. I hope you liked this article on Contact Tracing with Machine Learning. Feel free to ask your valuable questions in the comments section below. You can also follow me on Medium to learn every topic of Machine Learning.