Query Data using Python

Querying data means requesting specific data from a dataset. If you are familiar with SQL, you already know what it means to query data, and with the pandas library you can query a dataset in Python in much the same way. So if you want to learn how to query data from a pandas DataFrame in Python, this article is for you. In this article, I will take you through how to query data using Python.

Query Data using Python

If you are familiar with SQL, you know how to query data from a database using SQL commands. Just like in SQL, you can also query data from a pandas DataFrame. The DataFrame.query() method lets us write query expressions that select rows from a dataset.

With this method, you write the query expression as a string. Let’s say you have a dataset with columns a, b, c, and d, and you want to find all the rows where the values of b are greater than the values of a. For this task, the query that you will write is data.query("b > a").
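To make the b > a example concrete, here is a minimal sketch on a small hand-made DataFrame. The values are made up for illustration and are not part of any real dataset; the second filter uses boolean indexing only to show that it selects the same rows.

```python
import pandas as pd

# Hypothetical toy data with columns a, b, c, and d (values made up for illustration)
data = pd.DataFrame({
    "a": [1, 5, 3, 8],
    "b": [4, 2, 9, 8],
    "c": [7, 7, 1, 0],
    "d": [2, 6, 6, 3],
})

# Keep only the rows where b is greater than a
result = data.query("b > a")
print(result)

# The same filter written with boolean indexing, for comparison
same_result = data[data["b"] > data["a"]]
print(result.equals(same_result))
```

The query string is usually shorter and easier to read than the boolean indexing form, especially when a condition involves several columns.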

I hope you now understand the query method of a pandas DataFrame. Now let’s look at an example of how to query data from a pandas DataFrame. I will first import a dataset for this task:

import pandas as pd
data = pd.read_csv("https://raw.githubusercontent.com/amankharwal/Website-data/master/class_grades.csv")
print(data.head())
                name  homewk1  homewk2  midterm  partic  exam
0    Bhirasri, Silpa       58       70       66      90    95
1      Brookes, John       63       65       74      75    99
2  Carleton, William       57        0       62      90    91
3       Carli, Guido       90       73       59      85    94
4   Cornell, William       73       56       77      95    46

As you can see, the dataset contains marks of students. Here I will use the query method to look at all the rows where the exam marks are less than 50:

print(data.query("exam < 50"))
               name  homewk1  homewk2  midterm  partic  exam
4  Cornell, William       73       56       77      95    46

So there is only one row in this dataset where the exam marks are less than 50. Now let’s have a look at all the rows where the midterm marks are higher than the marks in the final exam (the exam column):

print(data.query("midterm > exam"))
                name  homewk1  homewk2  midterm  partic  exam
4   Cornell, William       73       56       77      95    46
11     Smith, Sophia       70       93       77      75    75
17      Wells, Henry        0       60       68      85    57
18    Wheelock, Lucy       56       56       72      85    54
19       Yale, Elihu       53       71       77      90    59

So this is how you can query data using the Python programming language.
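The query method can also combine several conditions and refer to ordinary Python variables. The sketch below assumes a small hand-made DataFrame shaped like the class grades dataset; the student names, the marks, and the pass_mark variable are all hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical rows shaped like the class grades dataset (values made up)
grades = pd.DataFrame({
    "name": ["Student A", "Student B", "Student C", "Student D"],
    "midterm": [77, 62, 68, 59],
    "exam": [46, 91, 57, 94],
})

# Combine two conditions with "and" inside the query string
low_both = grades.query("midterm < 70 and exam < 60")
print(low_both)

# Reference a local Python variable inside the query string with @
pass_mark = 60
passed = grades.query("exam >= @pass_mark")
print(passed)
```

Using @ keeps the threshold in one place, so you can change pass_mark without editing the query string itself.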

Summary

So this is how you can query data from a pandas DataFrame in Python. Querying data means requesting specific data from a dataset. You can learn more about the DataFrame.query method from here. I hope you liked this article on how to query data from a pandas DataFrame. Feel free to ask valuable questions in the comments section below.

Aman Kharwal