Bankruptcy Prediction Model with Machine Learning

Bankruptcy is a state of insolvency in which a business or legal person cannot repay its debts to creditors. It is usually imposed by a court order, often at the request of the debtor. In this article, I will walk you through how to train a bankruptcy prediction model with machine learning using Python.

Bankruptcy is a concept from financial accounting. If you came to data science from a commerce background, you probably already know what it means: when a business or legal person fails to pay its debts to creditors and becomes insolvent, the situation is called bankruptcy.

Using machine learning algorithms, we can train a model to predict whether a company or a legal person is likely to go bankrupt in the future. In the section below, I will take you through a machine learning tutorial on how to train a model for bankruptcy prediction using the Python programming language.

Bankruptcy Prediction Model using Python

The dataset that I will be using for this task was collected from Kaggle and is provided by the Taiwan Economic Journal. Now let’s import the dataset and the necessary Python libraries to start training a bankruptcy prediction model:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

data = pd.read_csv("bank.csv")
data.head()
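
Before going further, it can help to check the size of the data, whether anything is missing, and how the two classes are balanced. Below is a minimal sketch of such a check. The `summarize` helper and the tiny `demo` frame are my own illustrations (the values are made up), so treat them as assumptions rather than part of the original dataset:

```python
import pandas as pd


def summarize(df: pd.DataFrame, target: str = "Bankrupt?") -> dict:
    """Collect a few basic facts worth checking before modelling."""
    return {
        "shape": df.shape,  # (rows, columns)
        "missing_values": int(df.isna().sum().sum()),  # total NaN cells
        # share of each class in the target column
        "class_share": df[target].value_counts(normalize=True).to_dict(),
    }


# Tiny synthetic frame standing in for the real data (values are made up).
demo = pd.DataFrame({
    "Bankrupt?": [0, 0, 0, 1],
    "ratio_a": [0.5, 0.4, None, 0.1],
    "ratio_b": [1.2, 1.1, 0.9, 0.3],
})

summary = summarize(demo)
print(summary)
```

On the real data you would call `summarize(data)` instead. If one class turns out to be much rarer than the other, keep that in mind when reading the accuracy score later.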

The dataset contains 96 columns. Let’s have a look at the correlations before training the model:

import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
sns.heatmap(data.corr())
plt.show()
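
With this many columns, a full heatmap can be hard to read. One alternative is to rank the features by their absolute correlation with the target. Here is a rough sketch of that idea. The `top_correlated` helper and the synthetic `demo` data are hypothetical and only serve to illustrate the approach:

```python
import numpy as np
import pandas as pd


def top_correlated(df: pd.DataFrame, target: str = "Bankrupt?", n: int = 10) -> pd.Series:
    """Rank features by absolute Pearson correlation with the target."""
    corr = df.corr()[target].drop(target)  # correlation of each feature with the label
    return corr.abs().sort_values(ascending=False).head(n)


# Synthetic stand-in data: one feature follows the label, one is pure noise.
rng = np.random.default_rng(0)
label = rng.integers(0, 2, size=200)
demo = pd.DataFrame({
    "Bankrupt?": label,
    "signal": label + rng.normal(0, 0.1, size=200),  # tracks the label closely
    "noise": rng.normal(0, 1, size=200),             # unrelated to the label
})

ranking = top_correlated(demo, n=2)
print(ranking)
```

On the real data you would call `top_correlated(data)`. Columns that never change produce `NaN` correlations, which `sort_values` places at the end by default.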

The “Bankrupt?” column is the target label, so I will separate it from the features used for training:

X = data.drop(["Bankrupt?"], axis="columns")
y = data["Bankrupt?"]

Now let’s split the dataset and use the logistic regression model to train the bankruptcy prediction model:

x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
logreg = LogisticRegression()
logreg.fit(x_train, y_train)
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=100,
                   multi_class='auto', n_jobs=None, penalty='l2',
                   random_state=None, solver='lbfgs', tol=0.0001, verbose=0,
                   warm_start=False)
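
Logistic regression with the default `lbfgs` solver tends to converge more reliably when the features are on a similar scale, and a stratified, seeded split makes the result easier to reproduce. The sketch below shows one way to set that up. It uses synthetic data from `make_classification` as a stand-in for `X` and `y`, and the 5% positive rate, `random_state=42` and `max_iter=1000` are assumptions chosen for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for X and y: roughly 5% positives (an assumed ratio).
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=20, weights=[0.95], random_state=42
)

# stratify keeps the class ratio similar in both splits;
# random_state makes the split reproducible.
x_tr, x_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=42
)

# Scaling inside a pipeline means the scaler is fitted on the training split only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(x_tr, y_tr)

test_accuracy = model.score(x_te, y_te)
print(f"test accuracy: {test_accuracy:.3f}")
```

To try this on the real data, pass `X` and `y` to `train_test_split` in place of `X_demo` and `y_demo`.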

Now let’s have a look at the accuracy score on the test set:

logreg.score(x_test, y_test)
0.9596774193548387
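
Accuracy on its own can be misleading when one class is much rarer than the other, because a model that always predicts the majority class can also score high. A simple sanity check is to compare the model against such a baseline and to look at recall and the confusion matrix as well. The sketch below does this on synthetic data. The 5% positive rate and the variable names are assumptions for illustration, not results from the bankruptcy dataset:

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data (about 5% positives, an assumed ratio).
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=20, weights=[0.95], random_state=0
)
x_tr, x_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=0
)

# Baseline that always predicts the most frequent class.
baseline = DummyClassifier(strategy="most_frequent").fit(x_tr, y_tr)
clf = LogisticRegression(max_iter=1000).fit(x_tr, y_tr)

baseline_pred = baseline.predict(x_te)
model_pred = clf.predict(x_te)

baseline_acc = accuracy_score(y_te, baseline_pred)
baseline_recall = recall_score(y_te, baseline_pred)  # share of positives found
model_acc = accuracy_score(y_te, model_pred)
model_recall = recall_score(y_te, model_pred)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_te, model_pred)

print(f"baseline accuracy={baseline_acc:.3f} recall={baseline_recall:.3f}")
print(f"model    accuracy={model_acc:.3f} recall={model_recall:.3f}")
print(cm)
```

If the model's accuracy is only slightly above the baseline, or its recall on the positive class is low, the accuracy figure says little about how well bankruptcies are actually detected.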

Conclusion

So the model performs well on the held-out test data, with an accuracy of about 96%. Because the split is not seeded, the exact figure will vary slightly from run to run. This is how we can use machine learning in finance. You can do a lot more with this dataset to explore more use cases of machine learning in finance. I hope you liked this article on how to train a bankruptcy prediction model with machine learning using Python. Feel free to ask your valuable questions in the comments section below.

Aman Kharwal