Data science is all about using computational methods to get valuable and actionable information from raw data sets. In this article, I will introduce you to the types of analysis in data science.
Types of Analysis
The types of analysis in data science can be grouped into five types. Below is a list of the types of data science analysis in order of increasing difficulty:
- Descriptive Analysis
- Exploratory Analysis
- Inferential Analysis
- Predictive Analysis
- Causal Analysis
Now let’s go through all these types of analysis in Data Science one by one.
Descriptive analysis is the process of summarizing and organizing data so that it can be easily understood. It differs from inferential analysis in that it seeks to describe the data but does not attempt to make inferences from the sample to the whole population. Here, we generally describe only the data in the sample at hand.
Descriptive analysis is the first step in the analysis where you summarize and describe the data you have available using descriptive statistics, and its result is a simple presentation of your data.
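As a minimal sketch of what such a summary can look like, the snippet below computes a few common descriptive statistics with Python's standard `statistics` module. The `orders` list is a made-up sample used only for illustration.

```python
import statistics

# Hypothetical sample: daily order counts for one week (illustrative values only).
orders = [12, 15, 11, 19, 15, 22, 18]

# Summarize the sample itself; nothing here generalizes beyond these 7 values.
summary = {
    "count": len(orders),
    "mean": statistics.mean(orders),
    "median": statistics.median(orders),
    "stdev": statistics.stdev(orders),  # sample standard deviation
    "min": min(orders),
    "max": max(orders),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The result is just a compact presentation of the data you already have, which is all descriptive analysis aims for.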
Exploratory Data Analysis (EDA) explores the data to find relationships between measures that were not previously known. It tells us that a relationship exists, but not what causes it, so its findings are best used to formulate hypotheses. The relationships EDA uncovers do not prove causation, as captured by the saying "correlation does not imply causation".
It is useful for discovering new connections, forming hypotheses, and guiding study design and data collection.
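One common exploratory step is checking how strongly two measures move together. The sketch below computes a Pearson correlation coefficient by hand; the `pearson` helper and the two lists are hypothetical names and values chosen for illustration, not part of any particular library.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations around each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical measures (illustrative values only).
hours_studied = [1, 2, 3, 4, 5, 6]
exam_score = [52, 55, 61, 64, 70, 71]

r = pearson(hours_studied, exam_score)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 suggests a strong positive linear relationship worth investigating further. It is a hypothesis to test, not evidence that one measure causes the other.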
Inferential analysis extrapolates and generalizes information from a smaller sample to the larger group (the population) it was drawn from, in order to produce analyses and predictions. It works by using the sample data to estimate a value for the population and by giving a measure of the uncertainty in that estimate (for example, a standard error or a confidence interval).
The accuracy of the inference depends strongly on the sampling scheme; if the sample is not representative of the population, the generalization will be inaccurate. The idea of inferring something about a whole population from a smaller sample is quite intuitive. Many statistics you see in the media and on the internet are inferential: an estimate about a large group, or a prediction of an event, based on a small sample.
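To make the idea concrete, here is a rough sketch that estimates a population mean from a random sample and attaches a 95% confidence interval. The population is simulated with made-up parameters, and the interval uses a normal approximation, which is an assumption that is reasonable for a sample of this size but not for very small samples.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Simulated population of 100,000 heights in cm (made-up parameters).
population = [random.gauss(170, 10) for _ in range(100_000)]

# Draw a simple random sample; in practice this is all we would observe.
sample = random.sample(population, 200)

sample_mean = statistics.mean(sample)
# Standard error of the mean: uncertainty of the estimate, not spread of the data.
std_error = statistics.stdev(sample) / math.sqrt(len(sample))

# 95% interval using the normal approximation (assumed adequate for n = 200).
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * std_error
ci_high = sample_mean + z * std_error

print(f"estimate: {sample_mean:.2f} cm")
print(f"95% CI:   [{ci_low:.2f}, {ci_high:.2f}]")
```

Because `random.sample` gives every member of the population the same chance of selection, the sample is representative by construction. With a biased sampling scheme the same arithmetic would still run, but the interval would no longer say anything reliable about the population.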
Predictive analysis is the process of analyzing data to predict future events. With this information at their fingertips, organizations can find better ways to serve their customers. Moreover, they can also determine the number of items to keep in inventory and even detect fraud as it occurs, among other things.
Predictive analysis uses several analysis techniques like machine learning algorithms, data mining, statistics, and artificial intelligence. These are just a few of the practices needed to analyze data and develop an understanding of how past actions and behaviours can impact future results.
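As a small illustration of using past behaviour to predict a future result, the sketch below fits a straight line to historical sales by ordinary least squares and extrapolates one month ahead. The `fit_line` helper and the sales figures are hypothetical, and a linear trend is only one of many possible modelling assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: units sold in months 1 to 6 (illustrative values only).
months = [1, 2, 3, 4, 5, 6]
units_sold = [100, 108, 121, 130, 138, 151]

slope, intercept = fit_line(months, units_sold)

# Predict month 7, assuming the linear trend continues.
forecast = slope * 7 + intercept
print(f"trend: {slope:.2f} units per month")
print(f"forecast for month 7: {forecast:.0f} units")
```

A forecast like this could feed an inventory decision, but in practice you would also want to hold out some data to check how well the model predicts values it has not seen.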
Causal analysis involves discovering the causal relationship between variables: what happens to one variable when you change another. To find a causal relationship, you need to ask whether the observed correlations underlying your conclusion are valid, because just looking at the surface of the data will not uncover the hidden mechanisms behind those correlations.
Suppose you want to test a new drug that is meant to improve human strength and focus. To do so, you conduct a randomized controlled trial: participants are randomly assigned either to the new drug or to a placebo control, both groups take the same tests of strength, focus, and attention, and you compare the results to see how the drug affects the outcome.
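One way to analyze such a trial is a permutation test on the difference in group means, sketched below. The scores are invented for illustration, `permutation_p_value` is a hypothetical helper, and a permutation test is just one reasonable choice of method here.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical focus-test scores after random assignment (illustrative values only).
treatment = [78, 85, 81, 90, 76, 88, 84, 79]
placebo = [72, 75, 80, 70, 77, 74, 69, 78]

observed_diff = statistics.mean(treatment) - statistics.mean(placebo)

def permutation_p_value(a, b, n_permutations=5000):
    """Two-sided p-value: how often shuffled group labels give a
    difference in means at least as extreme as the observed one."""
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        random.shuffle(pooled)
        diff = statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):])
        if abs(diff) >= observed:
            extreme += 1
    return extreme / n_permutations

p_value = permutation_p_value(treatment, placebo)
print(f"observed difference in means: {observed_diff:.2f}")
print(f"permutation p-value: {p_value:.4f}")
```

The causal reading of the result comes from the random assignment, not from the arithmetic: because chance alone decided who received the drug, a difference that is unlikely under shuffled labels can be attributed to the drug rather than to some hidden difference between the groups.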
So these were all the types of analysis you need to know in Data Science. I hope you liked this article on the types of analysis in data science. Feel free to ask your questions in the comments section below.