Types of Data in Data Science

Data science combines statistics, data mining, and computer science to extract insights from data. Every data science task starts with data, so as a data science professional, you need to know what kind of data you are working with before you can solve a problem. In this article, I will introduce you to the types of data in Data Science that you need to know.

Types of Data in Data Science

There are four major types of data we deal with as Data Science professionals:

  1. Numerical data
  2. Categorical data
  3. Time series data
  4. Textual data

Let’s go through all the data types one by one.

Numerical Data

Numerical data is data expressed as numbers. A dataset containing discrete features (countable values) or continuous features (measured values) is a numerical dataset. Most machine learning algorithms expect numerical input, so if the data is not numerical, we need to convert it into numerical values before training a model.

Height and weight are examples of continuous numerical data, while the number of matches played and the runs scored are examples of discrete numerical data.
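The difference between discrete and continuous features can be sketched in a few lines of plain Python. The player records and the `feature_kind` helper below are hypothetical and made up for illustration; the rule "all integers means discrete" is a simplifying assumption, not a general test.

```python
from statistics import mean

# Hypothetical sample records; the values are invented for illustration.
players = [
    {"height_cm": 175.5, "weight_kg": 70.2, "matches": 12, "runs": 340},
    {"height_cm": 182.0, "weight_kg": 81.5, "matches": 8, "runs": 215},
    {"height_cm": 168.3, "weight_kg": 64.0, "matches": 15, "runs": 465},
]


def feature_kind(values):
    """Label a numeric column as discrete (all ints) or continuous (any float).

    Simplifying assumption: counts are stored as int, measurements as float.
    """
    return "discrete" if all(isinstance(v, int) for v in values) else "continuous"


# Turn the list of records into one list of values per feature.
columns = {name: [p[name] for p in players] for name in players[0]}

kinds = {name: feature_kind(values) for name, values in columns.items()}
averages = {name: mean(values) for name, values in columns.items()}

for name in columns:
    print(f"{name}: {kinds[name]}, mean = {averages[name]:.2f}")
```

Here height and weight come out as continuous, while matches and runs come out as discrete. Both kinds are already numbers, so they can be averaged or passed to a model without conversion.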

Categorical Data

Categorical data is a type of data whose values come from a limited set of two or more categories. Features with categories are useful as labels for training classification models. They are also useful for grouping data.

The income group of a person, gender of the person, and nationality of the person are some examples of categorical data.
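Since machine learning models are trained on numbers, categories have to be converted into numerical values first. One common option is one-hot encoding, where each category becomes its own 0/1 column. The sketch below is a minimal plain-Python version; the `one_hot_encode` helper and the sample income groups are assumptions for illustration, and in practice a library such as pandas or scikit-learn would usually handle this step.

```python
def one_hot_encode(values):
    """Return the sorted category list and one 0/1 row per input value."""
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[position[value]] = 1  # mark the column for this value's category
        rows.append(row)
    return categories, rows


# Hypothetical income groups for four people.
income_groups = ["low", "high", "middle", "low"]

categories, encoded = one_hot_encode(income_groups)

print(categories)
for group, row in zip(income_groups, encoded):
    print(group, row)
```

One-hot encoding treats the categories as unordered. For a feature with a natural order, such as income group, mapping the categories to ordered integers is another reasonable choice.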

Time Series Data

Time series data is a sequence of data points collected over time, where each value is tied to a point in time. Such datasets can be collected at intervals of years, months, days, hours, minutes, or even seconds. Time series datasets help analyze how data changes over time.

Stock price data, monthly sales data, and daily website traffic data are some examples of time-series data.
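To make the idea of analyzing change over time concrete, here is a small sketch using only the Python standard library. The daily website traffic figures are invented for illustration. It computes the day-over-day change and then regroups the same daily data by month.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily website traffic: (day, number of visits).
daily_visits = [
    (date(2024, 1, 30), 120),
    (date(2024, 1, 31), 135),
    (date(2024, 2, 1), 150),
    (date(2024, 2, 2), 140),
]

# Order matters in a time series, so sort by date first.
daily_visits.sort(key=lambda record: record[0])

# Change in visits from each day to the next.
changes = [
    current[1] - previous[1]
    for previous, current in zip(daily_visits, daily_visits[1:])
]

# The same data viewed at a coarser interval: totals per (year, month).
monthly_totals = defaultdict(int)
for day, visits in daily_visits:
    monthly_totals[(day.year, day.month)] += visits

print(changes)
print(dict(monthly_totals))
```

The same observations can be viewed at different intervals, which is why the collection interval (days, months, and so on) is part of what defines a time series dataset.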

Textual Data

Textual data is a collection of textual information like words and phrases. The source of textual information can be any piece of text. Such datasets are used to solve Natural Language Processing problems, where we train systems to understand human languages.

Tweets, comments, reviews, and the text of a book are some examples of textual data.
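Like categories, text has to be turned into numbers before a model can be trained on it. One simple approach is a bag-of-words representation, which counts how often each word of a shared vocabulary appears in each text. The sketch below assumes a very basic tokenizer (lowercase letters and apostrophes only) and two made-up reviews; real projects typically use a dedicated NLP library for this.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


# Hypothetical product reviews.
reviews = [
    "Great phone, great battery",
    "Battery life is poor",
]

tokenized = [tokenize(review) for review in reviews]

# Shared vocabulary: every distinct word across all reviews, in sorted order.
vocabulary = sorted({token for tokens in tokenized for token in tokens})

# One count vector per review, aligned with the vocabulary.
vectors = []
for tokens in tokenized:
    counts = Counter(tokens)
    vectors.append([counts[word] for word in vocabulary])

print(vocabulary)
for review, vector in zip(reviews, vectors):
    print(vector, review)
```

Each review becomes a list of numbers of the same length, so the result can be used like any other numerical dataset. Word order is lost in this representation, which is a known limitation of bag-of-words.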

Summary

As a data science professional, you need to know what kind of data you are working with to solve a problem. Numerical, categorical, time series, and textual data are the main types of data you should know. I hope you liked this article on the types of data in Data Science. Feel free to ask questions in the comments section below.

Aman Kharwal

I'm a writer and data scientist on a mission to educate others about the incredible power of data📈.


2 Comments

  1. Since we cannot train ML model with data that isn’t numeric, how can we change the different data types to numerical so that it becomes suitable for training ML models?
