In Machine Learning, Linear Regression Algorithm is a statistical technique for calculating the value of a dependent variable based on the value of an independent variable. The goal of linear regression is to find the best-fit line that describes the relationship between the dependent and the independent variable. So, if you are new to Machine Learning and want to know how the Linear Regression algorithm works, this article is for you. In this article, I will introduce how the Linear Regression algorithm works in simple words and its implementation using Python.
Here’s How Linear Regression Algorithm Works
Let’s understand how the Linear Regression algorithm works by taking an example of a real-time business problem.
Let’s say you are a bakery owner. You want to predict how much money you will make in a day based on the number of cupcakes you sell.
To predict this, you will gather data on the number of cupcakes you sell and how much money you earned in several days. Then, you will use a Linear Regression algorithm to find the best-fit line that describes the relationship between the number of cupcakes sold and the amount of money made.
Once you have the best-fit line, you can use it to predict how much money you will make for a different number of cupcakes sold. For example, if the line tells you that you make $50 by selling 20 cupcakes, then you can predict that selling 40 cupcakes will make $100 in a day.
That’s how the Linear Regression algorithm works. It finds the best-fit line that describes the relationship between the input variable and the output variable so we can predict the output variable based on a new input variable.
Implementation of Linear Regression Algorithm using Python
Now let’s see how to implement the linear regression algorithm using Python. To implement it using Python, we can use the scikit-learn library in Python, which provides the functionality of implementing all Machine Learning algorithms and concepts using Python.
Let’s first import the necessary Python libraries and create a sample data based on the example we discussed above:
import numpy as np import plotly.graph_objs as go from sklearn.linear_model import LinearRegression #sample data num_cupcakes = [10, 20, 30, 40, 50] money_made = [25, 50, 75, 100, 125]
Now let’s have a look at the relationship between the number of cupcakes sold and the amount of money made:
figure = go.Figure() figure.add_trace(go.Scatter(x=num_cupcakes, y=money_made, mode='markers',)) figure.update_layout(title='Relationship Between Cupcake Sales and Money Made', xaxis_title='Number of Cupcakes Sold (Independent Variable)', yaxis_title='Money Made (Dependent Variable)') figure.show()
Now here’s how to train a Machine Learning model using the linear regression algorithm:
#create a linear regression model x = np.array(num_cupcakes).reshape((-1, 1)) y = np.array(money_made) model = LinearRegression().fit(x, y)
Now here’s how we can predict the value of a dependent variable using the value of an independent variable using our Machine Learning model:
#predict the money made for 25 cupcakes sold new_num_cupcakes = [] new_money_made = model.predict(new_num_cupcakes) print("Predicted money made for the cupcakes sold:", new_money_made)
Predicted money made for the cupcakes sold: 62.5
So this is how the Linear Regression algorithm works.
Advantages & Disadvantages of Linear Regression
Here are some advantages and disadvantages of the Linear Regression algorithm that you should know:
- It’s easy to understand and use for making predictions in real-world problems.
- It can analyze the relationship between two variables and make predictions based on their relationship.
- It assumes a linear relationship between the input and output variables, which may not always be true.
- It’s sensitive to outliers and can be affected by multicollinearity.
I hope you have understood how the Linear Regression algorithm works and how to implement it using Python. The goal of linear regression is to find the best-fit line that describes the relationship between the dependent and the independent variable. I hope you liked this article. Feel free to ask valuable questions in the comments section below.