In Machine Learning, Decision Tree is an algorithm that works like a flowchart or tree structure to make decisions based on input data. It starts with a question or condition and then follows different branches based on the answers to subsequent questions until a prediction is made. So, if you are new to machine learning and want to know how the decision tree algorithm works, this article is for you. In this article, I will introduce how the decision tree algorithm works and how to implement it using Python.
Here’s How Decision Tree Algorithm Works
In Machine Learning, Decision Tree is an algorithm used to solve problems that require making decisions based on multiple criteria or characteristics. Let’s understand how the decision tree algorithm works by taking an example of a real-time business problem.
Suppose you are a bank that wants to determine if a person is likely to be approved for a loan based on their credit score. You collected data on past loan applicants, including their credit scores and whether their loan applications were approved.
A decision tree algorithm is like a flowchart that can help the bank make decisions based on a person’s credit score. In this problem, the decision tree algorithm will start with a “root” node representing the first question, such as whether a person’s credit score is above or below a certain threshold.
Depending on the answer, the algorithm follows branches to subsequent questions, such as income level or employment status, until a “leaf” node is reached.
The leaf node represents a decision, such as predicting whether to approve or deny a loan based on input characteristics.
So, a decision tree algorithm is like a flowchart that makes decisions based on a series of questions and answers. It starts at a “root” node with a question, branches to subsequent questions based on the answers, and finally reaches a “leaf” node representing a decision.
Implementation of Decision Tree Algorithm using Python
Now let’s see how to implement the Decision Tree algorithm using Python. To implement it using Python, we can use the scikit-learn library in Python, which provides the functionality of implementing all Machine Learning algorithms and concepts using Python.
Let’s first import the necessary Python libraries and create a sample data based on the example we discussed above:
from sklearn.tree import DecisionTreeClassifier from sklearn.model_selection import train_test_split import pandas as pd credit_scores = [650, 720, 580, 800, 690, 750, 600, 670, 710, 680] loan_approval = [1, 1, 0, 1, 0, 1, 0, 1, 1, 0] data = {"Credit_Score": credit_scores, "Loan_Approval": loan_approval} df = pd.DataFrame(data) print(df.head())
Credit_Score Loan_Approval 0 650 1 1 720 1 2 580 0 3 800 1 4 690 0
Now here’s how to train a Machine Learning model using the Decision Tree algorithm:
X = df[['Credit_Score']] y = df['Loan_Approval'] clf = DecisionTreeClassifier(random_state=42) clf.fit(X, y)
Below is how we can visualize the decision-making process of the decision tree algorithm:
from sklearn.tree import plot_tree import matplotlib.pyplot as plt # Plot the decision tree plt.figure(figsize=(10, 6)) plot_tree(clf, filled=True, rounded=True, feature_names=['Credit Score'], class_names=['Denied', 'Approved']) plt.show()

Now here’s how we can make predictions using the Decision Tree algorithm:
user_credit_score = float(input("Enter your credit score: ")) prediction = clf.predict([[user_credit_score]]) if prediction[0] == 1: print("Congratulations! Your loan application is likely to be approved.") else: print("We regret to inform you that your loan application is likely to be denied.")
Enter your credit score: 720 Congratulations! Your loan application is likely to be approved.
So this is how the Decision Tree algorithm works.
Advantages and Disadvantages of Decision Tree Algorithm
Here are some advantages and disadvantages of the Decision Tree algorithm that you should know:
Advantages:
- Decision trees produce easy-to-interpret decision rules, allowing stakeholders to understand and trust the decision-making process.
- Decision trees can effectively handle missing data by making decisions based on the data available for each division.
Disadvantages:
- Decision trees are prone to overfitting, especially when the tree becomes too deep or complex.
- Decision trees are sensitive to small changes in input data, which can result in different tree structures and decision rules.
Summary
So, a decision tree algorithm is like a flowchart that makes decisions based on a series of questions and answers. It starts at a “root” node with a question, branches to subsequent questions based on the answers, and finally reaches a “leaf” node representing a decision. I hope you liked this article on how the Decision Tree algorithm works and how to implement it using Python. Feel free to ask valuable questions in the comments section below.