Tag data science

All Data Science Libraries

If you are learning Python for data science, you should know that there are many Python frameworks worth learning. From reading a CSV file or an image dataset to training your machine learning model, or a…

Types of Jobs in Data Science

If you think that data engineers, data scientists, machine learning engineers, and data analysts are the only types of professions you get in data science, you are completely wrong. There are many types of data science jobs that are based…

Data Science Business Ideas

You must have noticed one thing common to all new businesses and startups these days: data. Almost every new business and startup that you see and draw inspiration from uses data science so much that they can figure out…

Unemployment Analysis with Python

Unemployment is measured by the unemployment rate, which is the number of people who are unemployed as a percentage of the total labour force. We have seen a sharp increase in the unemployment rate during Covid-19, so analyzing the unemployment…
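
The definition above can be sketched in a few lines of Python. The function name and the figures are hypothetical and chosen only for illustration; they do not come from the article or from any dataset it uses.

```python
def unemployment_rate(unemployed: int, labour_force: int) -> float:
    """Return the unemployed share of the labour force as a percentage."""
    if labour_force <= 0:
        raise ValueError("labour_force must be positive")
    return 100.0 * unemployed / labour_force


# Hypothetical figures, for illustration only.
rate = unemployment_rate(unemployed=6_000_000, labour_force=100_000_000)
print(f"{rate:.1f}%")  # 6.0%
```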

Data Science Project Ideas

Working on data science projects will help you improve your problem-solving skills, and a good collection of data science projects will strengthen your portfolio, which will leave a strong positive impact on your profile as a data scientist. So if…

How to do Data Storytelling?

Data storytelling is one of the most important soft skills that any data scientist should have. It means presenting the story behind the numbers generated by a business so that we can decide on the future course of action. If you…

Data Science Projects on Finance

Over the years, the use of data science has increased in finance for several financial analysis tasks that can help increase an organization’s profits. If you are looking for data science projects on finance to learn how to use data…

Python Libraries for Data Science

Python is such a popular programming language among data scientists because of its beginner-friendly syntax and the library support that we get for all data science tasks. So there are some Python libraries you need to learn…

Google Search Analysis with Python

Approximately 3.5 billion searches are performed on Google daily, which means that approximately 40,000 searches are performed every second on Google. So Google search is a great use case for analyzing data based on search queries. With that in mind,…
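
As a quick check of the arithmetic above, the per-second figure follows from dividing the quoted daily total by the number of seconds in a day. The constant names below are illustrative.

```python
SEARCHES_PER_DAY = 3.5e9        # daily figure quoted in the excerpt
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

searches_per_second = SEARCHES_PER_DAY / SECONDS_PER_DAY
print(round(searches_per_second))  # 40509, i.e. roughly 40,000
```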

Uber Trips Analysis using Python

Uber has been a major mode of travel for people living in urban areas. Some people don’t have their own vehicles, while others intentionally don’t drive theirs because of their busy schedules. So different kinds of people are using the…

Heatmap using Python (Tutorial)

A heatmap is used to visualize the relationship between the features to analyze correlation, variance, anomalies, and various other patterns between features in a dataset. In this article, I’ll walk you through a tutorial on how to visualize a heatmap…

Radar Plot using Python

A radar plot is also known as a spider plot or a star plot. It is used to display multivariate data as a two-dimensional visualization of quantitative features that are represented on axes coming from the centre. In this article,…

What are Data Products?

A lot of people think of a data product as a physical product like Amazon Alexa that uses data, but that’s not true at all. A data product is an output produced from statistical analysis that adds value to an…

How to Become a Data Analyst?

Data analysis is the process of collecting data, creating visualizations, interpreting the results using reporting tools, and then evaluating the results to understand whether they will solve a particular business problem. A data analyst is a good career option, as…

Data Science Tools

Data science is one of the best career options of the 21st century. You need to learn many skills to become a Data Scientist like Python, SQL, data visualization and many more. In addition to these skills, there are many…

How to Analyze Data?

Data analysis is the process of collecting raw data and then preparing and exploring it to identify and understand patterns, so that they can be used for further decision-making and machine learning model training. In this article, I will…

Types of Data Scientists

Small and medium-sized businesses typically have one or two data scientists, while big tech companies have more data scientists, who are divided into categories based on the tasks they’re best at. So, in this article, I’ll walk you…

Learn Data Science Step By Step

Data science is one of the best career options that doesn’t require any higher-level degrees or qualifications. Well, it’s good to have a master’s degree in a field related to business or analytics, but trust me, it’s not required. So how do…

Steps to Learn Data Science

Data scientist is now everyone’s dream job. First, ask yourself a question: do I want to become a data scientist? When you genuinely feel like learning new things, start following a learning path. In this article,…

Cohort Analysis with Python

A cohort is a group of subjects which share a defining feature. We can observe the behaviour of a cohort over time and compare it to other cohorts. In this article, I’m going to present a data science tutorial on…

Pandas DataFrame with Python

In Data Science, the most used data structures are the Series and the DataFrame which deal with arrays and tabular data respectively. In this article, I will walk you through a tutorial on pandas DataFrame with Python. What is a…

GeoPandas in Python

GeoPandas is an open-source Python package that offers the best combination of spatial data analysis and mapping functions in Python. In this article, I will introduce you to the concept of GeoPandas in the Python programming language. Introduction to GeoPandas…

Feature Selection in Machine Learning

Feature Selection means figuring out which signals you can use to identify patterns, and then integrating them into your training and scoring pipeline. In this article, I’ll walk you through what feature selection is and how it affects the formation…

Why Data is Valuable?

From changing buying habits to piloting elections, our data is more valuable to others than you can imagine. In this article, I’ll take you through why data is so valuable in the age of data science and machine learning.…

Data Science Certifications

In this article, I will take you through the best Data Science Certifications available for Data Scientists and Machine Learning Experts. There are countless certifications on the internet today, but I have found the best Data Science Certifications for you…

Data Preparation for Machine Learning

Like many categories of fruit, datasets almost always require some form of pre-cleaning and human manipulation before they are ready for digestion. For machine learning and data science more broadly, there are a large number of techniques for the process…

Proximity Analysis with Python

Proximity analysis is a way to analyze the locations of features by measuring the distance between them and other features in the area. The distance between point A and point B can be measured in a straight line or along…

Random Sampling with Python

Random sampling is a sampling technique in which each member of the population has an equal probability of being selected. A randomly selected sample is meant to be an unbiased representation of the total population. In this article, I’ll walk you…
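
A minimal sketch of simple random sampling with Python's standard `random` module follows. The population of 100 IDs, the seed, and the sample size are assumptions made for illustration, not details from the article.

```python
import random

# Hypothetical population: 100 member IDs.
population = list(range(1, 101))

# A fixed seed makes the draw repeatable; drop it for a fresh sample each run.
rng = random.Random(42)

# Draw 10 distinct members; every member is equally likely to be chosen.
sample = rng.sample(population, k=10)
print(sample)
```

`sample()` draws without replacement, so no member appears twice; `rng.choices(population, k=10)` would draw with replacement instead.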

BigQuery in Data Science

Well, sometimes to access big data you have to use BigQuery. It is important to understand that you are not only storing the data in the cloud, you are also using the data analysis tools in the cloud. You…

RFM Analysis with Python

RFM analysis is a marketing technique used to quantitatively determine who the best customers are by looking at how recently a customer bought (recency), how often they buy (frequency) and how much the customer is spending (monetary value). In RFM…
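
The three RFM quantities described above can be computed from a purchase log, as in the sketch below. The log, the customer names, and the reference date are all hypothetical, and this is only one simple way to derive the values, not necessarily the approach taken in the article.

```python
from datetime import date

# Hypothetical purchase log: (customer, purchase date, amount spent).
purchases = [
    ("alice", date(2021, 1, 5), 120.0),
    ("alice", date(2021, 3, 1), 80.0),
    ("bob", date(2021, 2, 10), 40.0),
]
today = date(2021, 3, 31)  # assumed reference date for recency

rfm = {}
for customer, day, amount in purchases:
    entry = rfm.setdefault(customer, {"last": day, "frequency": 0, "monetary": 0.0})
    entry["last"] = max(entry["last"], day)  # most recent purchase date
    entry["frequency"] += 1                  # number of purchases
    entry["monetary"] += amount              # total amount spent

for customer, entry in rfm.items():
    # Recency: days since the customer's most recent purchase.
    entry["recency_days"] = (today - entry["last"]).days
    print(customer, entry["recency_days"], entry["frequency"], entry["monetary"])
```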

ABC Analysis with Machine Learning

ABC analysis assumes that income-generating items in an inventory follow a Pareto distribution, where a very small percentage of items generate the most income. In this article, I’ll walk you through how we can perform ABC analysis with Machine Learning.…
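
To make the Pareto idea concrete, here is a rule-based sketch of ABC classification. It is not a machine learning model. The item revenues are hypothetical, and the 80% and 95% cumulative cut-offs are a common convention assumed here rather than values taken from the article.

```python
# Hypothetical revenue per inventory item.
revenue = {"A1": 700, "B2": 150, "C3": 80, "D4": 40, "E5": 30}

total = sum(revenue.values())
classes = {}
running = 0
# Walk the items from highest to lowest revenue, tracking the cumulative share.
for item, value in sorted(revenue.items(), key=lambda kv: kv[1], reverse=True):
    running += value
    share = running / total
    if share <= 0.80:      # assumed cut-off for class A
        classes[item] = "A"
    elif share <= 0.95:    # assumed cut-off for class B
        classes[item] = "B"
    else:
        classes[item] = "C"

print(classes)
```

With these figures a single item out of five accounts for 70% of revenue and is the only one placed in class A.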

Process of Data Science

In this article, I’ll walk you through the 5-step process of data science. Let me list these steps first and then walk you through each of them. The data science process includes:…

Diamonds Analysis with Python

In this article, I will analyze Diamonds with Python using data science tools. For this first problem, I want to choose a pretty simple dataset from Kaggle. You can easily download this dataset from here. Now let’s start with this…

Data Science Resume

During my time in data science and machine learning, I have met many employers and interviewers. So here in this article, I would like to share how to prepare your data science resume and even interviews for a Data Science…

Use of Data Science

Imagine a big pile of data. What does that tell you? Data collections are expected to continue to grow day by day, as is the time to give once again. This leads to unsupervised data storage, which has two obvious…

Data Cleaning with Python

When analyzing and modelling data, a significant amount of time is spent preparing the data: loading, cleansing, transforming, and reorganizing. These tasks are often reported to take 80% or more of an analyst’s time. Sometimes the way data is stored…

Keyword Research with Python

Google Trends is a keyword research tool that helps researchers, bloggers, digital marketers and others in the digital industry find how often a keyword is entered into the Google search engine over a given period. Google Trends…

Data Science and Data Engineering

Data science and data engineering are two different branches of the big data paradigm, an approach in which enormous speeds, varieties and volumes of structured, unstructured and semi-structured data are captured, processed, stored and analyzed using a set of techniques…

What is Data Mining?

The combined knowledge of statistics, data mining, and machine learning plays a major role in understanding the data and describing the data features to find the relationships and patterns between the data so that we can build a model for…

Role of Analytics In An Organization

The role of analytics in an organization has completely changed over the years. Earlier, higher-level officers and board members used to make most decisions based on their experience and knowledge. But now the statistics and historical performance are…

What is Big Data?

Big Data is a type of data that is in a huge quantity and is still growing rapidly. It is so large and complex that it is impossible to manage it using the traditional ways of computing and to store…

What is Image Segmentation?

In this article, I will take you through a brief explanation of Image Segmentation in Deep Learning. I will only explain the concept behind the image segmentation here in this article. If you want to go through the practical part…

Pipelines in Machine Learning

A Machine Learning Pipeline performs a complete workflow as an ordered sequence of the processes involved in a Machine Learning task. In most of the functions in Machine Learning, the data that you work with is rarely in a format for…

TensorBoard for Visualizations

TensorBoard is a great interactive visualization tool that you can use to view the learning curves during training, compare learning curves between multiple runs, visualize the computation graphs, analyze training statistics, view images generated by your model, visualize complex multidimensional…

Multiclass Classification

Where Binary Classification distinguishes between two classes, Multiclass Classification or Multinomial Classification can distinguish between more than two classes. Some algorithms such as SGD classifiers, Random Forest Classifiers, and Naive Bayes classification are capable of handling multiple classes natively. Others…

Next Word Prediction Model

Most smartphone keyboards offer next word prediction features; Google also uses next word prediction based on our browsing history. So preloaded data is also stored in the keyboard function of our smartphones to predict the next…

Binary Classification Model

Binary Classification is a type of classification model that has two class labels. For example, an email spam detection model has two class labels: spam or not spam. Most of the time the tasks of binary classification…

WordCloud with Python

You must have seen a cloud filled with words in a lot of Analysis tasks and machine learning projects. A WordCloud represents the importance of each word in a set of words by analyzing the frequency of terms. In this…

MySQL with Python

In this article, you will learn to create and manipulate databases, and you will also learn some operations for handling databases in MySQL with Python. You need to download and install MySQL from here, and after installing MySQL you also need…

Grid Search for Model Tuning

In this article, I will take you through a very powerful algorithm in Machine Learning, which is the Grid Search Algorithm. It is mostly used in hyperparameter tuning and model selection in Machine Learning. Here I will teach you how…

GPU Can Speed Up Models

In this article, we will look at how to speed up your models by using a GPU. We will also see how to split the computations across multiple devices, including the CPU and numerous GPU devices. Thanks to GPUs, instead…

NLP For WhatsApp Chats

Natural Language Processing or NLP is a field of Artificial Intelligence which focuses on enabling systems to understand and process human languages. In this article, I will use NLP to analyze my WhatsApp Chats. For some privacy reasons,…

PyTorch for Deep Learning

PyTorch is a library in Python which provides tools to build deep learning models. What Python does for programming, PyTorch does for deep learning. Python is a very flexible language for programming, and just like Python, the PyTorch library provides…

PDF with Python

In Data Science, you must have seen people reading CSV files and Excel files to work with the data, but what about a PDF? Python is a very high-level language, which is the reason it is mostly getting used…

PySpark in Machine Learning

PySpark is the Python API for the Apache Spark framework. Apache Spark is a component of the Hadoop ecosystem, which is now getting very popular among big data frameworks. Apache Spark is a very powerful component which…

K-Means in Machine Learning

Many clustering algorithms are available in Scikit-Learn and elsewhere, but perhaps the simplest to understand is an algorithm known as k-means clustering, which is implemented in sklearn.cluster.KMeans. Introduction The k-means algorithm searches for a pre-determined number of clusters within an unlabeled…

Employee Turnover Prediction

This article features the implementation of an employee turnover analysis that is built using Python’s Scikit-Learn library. In this article, I will use Logistic Regression and Random Forest Machine Learning algorithms. At the end of this article, you would be…

Time Series Forecasting

Many business activities are seasonal in nature, where most of the business depends on a particular time of festivals and holidays. Every business uses sales promotion techniques to increase the demand for their products and services, in order to…

TensorFlow Tutorial

TensorFlow is a powerful library for numerical computation, particularly well suited and fine-tuned for large-scale Machine Learning (but you could use it for anything else that requires heavy calculations). The Google Brain team developed it, and it powers many…

Reinforcement Learning

Reinforcement Learning (RL) is one of the most exciting fields of machine learning today, and also one of the oldest. It has been around since the 1950s, producing many exciting applications over the years, particularly in games (e.g., TD-Gammon, a Backgammon-playing…

Merging Datasets

Merging datasets is one of the high-performance features provided by pandas in Python. In this article, I will show how we can merge datasets in Python with the help of examples and real-world scenarios. For convenience, I…

Model Selection Technique

Evaluating a model is simple enough: just use a test set. But suppose you are hesitating in model selection between two types of models (say, a linear model and a polynomial model); how can you decide between them? One option…

Missing Data Handling

There is a big difference between the data you get to practice data science skills and the data you get in the real world. Honestly speaking, many datasets you will get in the process of real-world data science tasks…

Training and Test Sets

This article is for those who need to know the actual difference between the Training and Test sets that a dataset is split into when training and classifying models in Machine Learning. What is Training Data? All the…

Manifold Learning

Rotating, re-orienting, or stretching the piece of paper in three-dimensional space doesn’t change the flat geometry of the paper: such operations are akin to linear embeddings. If you bend, curl, or crumple the paper, it is still a two-dimensional manifold,…

PCA in Machine Learning

In this article, you will explore what is perhaps one of the most broadly used of unsupervised algorithms, principal component analysis (PCA). PCA is fundamentally a dimensionality reduction algorithm, but it can also be useful as a tool for visualization,…

Best Data Science Books

Below are some of the famous Data Science books that will help beginners explore more about Data Science and experienced practitioners gain deeper knowledge. I found these books really useful and highly recommend them. Best Data Science Books…

Decision Trees in Machine Learning

Decision Trees are versatile Machine Learning algorithms that can perform both classification and regression tasks, and even multi-output tasks. They are powerful algorithms, capable of fitting complex datasets. Decision trees are also the fundamental components of Random Forests, which are…

Understanding a Neural Network

What is a Neural Network? A Neural Network is a computational algorithm that is used in creating deep learning models for predictions and classifications. It is based on self-learning and training, rather than being explicitly programmed. Neural Networks are inspired by…

Data Visualization with Seaborn

Before learning Seaborn, you should know that matplotlib has proven to be an incredibly useful and popular visualization tool, but even avid users will admit it often leaves much to be desired. There are several valid complaints about Matplotlib that…

Linear Regression Model

Let’s train and run a linear regression model to make predictions. In this article, I will load the data, prepare it, create a scatter plot for visualization, and then train a linear regression model to make a prediction. I will…

Time Series Analysis and Forecasting with Python

Time Series Analysis comprises methods for analyzing time-series data to extract statistical features from it. Time Series Forecasting is used to train a machine learning model to predict future values from historical values. Broadly speaking, Time Series Analysis is used in training machine learning models for the economy, weather forecasting, stock price prediction, and sales forecasting. It can be said that Time Series Analysis is widely used on data with non-stationary features. Time Series Analysis and…

How to get a job in Data Science

Even if you know enough statistics, programming (especially Python), Machine Learning, etc., you should know that getting a job in Data Science is still a difficult task. Some people may have the best skill set, but one thing that…

Indian GDP Analysis with Python

Understanding GDP Gross domestic product (GDP) at current prices is the GDP at the market value of goods and services produced in a country during a year. In other words, GDP measures the monetary value of final goods and services…

Twitter Sentiment Analysis

Twitter Sentiment Analysis is the process of computationally identifying and categorizing the opinions expressed in tweets, especially in order to determine whether the writer’s attitude towards a particular topic, product, etc. is positive, negative, or neutral. In this article…

Matplotlib Tutorial for Data Science

This article is all about Matplotlib, the basic data visualization tool of the Python programming language for Data Science. Here I will discuss various plot types with Matplotlib and customization techniques associated with Data Science. Introduction to Matplotlib: Matplotlib is the basic…