LightGBM is a gradient boosting framework in machine learning that uses tree-based learning algorithms. It is designed to be distributed and efficient, with faster training speed, lower memory usage and better accuracy. In this article, I will walk you through a tutorial on LightGBM in Machine Learning using Python.
What is LightGBM in Machine Learning?
In machine learning, the LightGBM classifier is part of the Boosting family, and today it is one of the most widely used classification models in the machine learning community. LightGBM is a powerful machine learning model that can be adapted to the task you are working on.
For example, if you are working on a regression problem you use the LightGBM Regressor model, and if you are working on a classification problem you use the LightGBM Classifier model.
In this article, I will focus on using the LightGBM model for a classification problem. But in general, it supports the following applications:
- binary classification
Now let’s start with a tutorial on LightGBM in Machine Learning. I’ll start this task by importing the necessary Python libraries and the dataset:
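Here is a hedged sketch of the loading step. The real file names are not given in this article, so the paths in the comments are hypothetical. To keep the snippet runnable on its own, it reads a tiny inline sample instead; the column names other than "color" and "type" are assumptions and the values are made up.

```python
import io

import pandas as pd

# Tiny inline stand-in for the training file so the snippet runs as-is.
# Column names other than "color" and "type" are assumed; values are made up.
sample_csv = io.StringIO(
    "id,bone_length,rotting_flesh,hair_length,has_soul,color,type\n"
    "0,0.35,0.35,0.47,0.78,clear,Ghoul\n"
    "1,0.58,0.43,0.53,0.44,green,Goblin\n"
    "2,0.47,0.35,0.81,0.79,black,Ghoul\n"
    "3,0.28,0.51,0.29,0.15,white,Ghost\n"
)
train_data = pd.read_csv(sample_csv)

# With the real dataset this would look something like:
# train_data = pd.read_csv("train.csv")  # hypothetical path
# test_data = pd.read_csv("test.csv")    # hypothetical path

print(train_data.shape)
```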
The dataset I’m using here is a classification dataset where the task is to classify a creature. There are 3 classes in the target variable: ghouls, goblins and ghosts. We will try to predict the class of the creature based on its independent features.
Now before moving forward let’s have a quick look at the training and test sets:
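A quick look usually means checking the first rows, the shape, missing values and the class balance. The sketch below does this on a small made-up frame; the feature column names are assumptions, and only "color" and "type" come from the description above.

```python
import pandas as pd

# Made-up stand-in rows; only "color" and "type" are named in the article
train_data = pd.DataFrame({
    "hair_length": [0.47, 0.53, 0.81, 0.29, 0.33, 0.61],
    "has_soul": [0.78, 0.44, 0.79, 0.15, 0.21, 0.50],
    "color": ["clear", "green", "black", "white", "clear", "green"],
    "type": ["Ghoul", "Goblin", "Ghoul", "Ghost", "Ghost", "Goblin"],
})

print(train_data.head())        # first rows
print(train_data.shape)         # (rows, columns)

missing_per_column = train_data.isnull().sum()     # missing values per column
class_counts = train_data["type"].value_counts()   # how balanced the classes are
print(missing_per_column)
print(class_counts)
```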
Let’s prepare train_data for the LightGBM classifier. I will convert the categorical “color” column, and also the target column, “type”, to integers. The color column is nominal, so we’ll be using one-hot encoding; in pandas, the “get_dummies” function does exactly this and is very easy to use.
The “type” column is also nominal, but because it is the target column we have to use a label encoder instead. I will show you 2 approaches to converting it. One of them is LabelEncoder and the other is the map function. You can use whichever you prefer.
The independent features are denoted by “X” and the dependent variable is denoted by “y”. 40% of train_data is held out for testing.
First of all, we need to define the parameters and the ranges of values to search. The parameters must be defined in a dictionary. The search will try every combination of the values we have set to find the settings that give the best score. You can add more parameters if you want, but remember, more parameters mean more time.
A total of 3,240,000 fits were performed, and the search took 8 hours 46 minutes. That is a very long time, and if you add more parameters it will take even longer.
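A minimal sketch of the one-hot step with get_dummies, on a made-up frame. The "hair_length" column and the colour values are assumptions used only to show what happens to the columns.

```python
import pandas as pd

# Made-up rows; "hair_length" and the colour values are illustrative
train_data = pd.DataFrame({
    "hair_length": [0.47, 0.53, 0.81, 0.64],
    "color": ["clear", "green", "black", "clear"],
})

# Replace "color" with one indicator column per colour value
encoded = pd.get_dummies(train_data, columns=["color"], prefix="color")
color_columns = sorted(c for c in encoded.columns if c.startswith("color_"))
print(encoded)
```

The original "color" column disappears and each distinct colour becomes its own indicator column, so no artificial ordering is imposed on the colours.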
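The two approaches can be sketched side by side. The exact label spellings ("Ghoul", "Goblin", "Ghost") are an assumption about how the classes appear in the data, and the mapping dictionary is my own choice, picked to match the sorted order that LabelEncoder uses.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Assumed label spellings, for illustration
types = pd.Series(["Ghoul", "Goblin", "Ghost", "Ghoul"], name="type")

# Approach 1: LabelEncoder assigns integers in sorted label order
encoder = LabelEncoder()
encoded_le = encoder.fit_transform(types)

# Approach 2: an explicit dictionary with the pandas map function
type_mapping = {"Ghost": 0, "Ghoul": 1, "Goblin": 2}
encoded_map = types.map(type_mapping)

print(list(encoded_le))
print(list(encoded_map))
# inverse_transform recovers the original labels from the integers
print(list(encoder.inverse_transform([0, 1, 2])))
```

The map approach makes the integer assigned to each class explicit, while LabelEncoder keeps an inverse_transform around for turning predictions back into names.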
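The split can be sketched with scikit-learn's train_test_split and test_size=0.4. The frame below is random stand-in data with assumed column names, and the random_state value is my own addition so the split is reproducible.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Random stand-in data with 100 rows; column names are assumptions
rng = np.random.default_rng(42)
train_data = pd.DataFrame({
    "bone_length": rng.random(100),
    "hair_length": rng.random(100),
    "type": rng.integers(0, 3, size=100),  # already label-encoded target
})

X = train_data.drop(columns=["type"])  # independent features
y = train_data["type"]                 # dependent variable

# Hold out 40% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=42
)
print(X_train.shape, X_test.shape)
```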
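To see why the count grows so quickly, you can multiply the number of parameter combinations by the number of cross-validation folds before starting a search. The grid and the fold count below are hypothetical, not the ones behind the figures above.

```python
from sklearn.model_selection import ParameterGrid

# Hypothetical grid: 3 values for each of 4 parameters
param_grid = {
    "learning_rate": [0.01, 0.05, 0.1],
    "n_estimators": [100, 200, 500],
    "num_leaves": [15, 31, 63],
    "max_depth": [-1, 5, 10],
}
cv_folds = 5  # assumed number of cross-validation folds

n_candidates = len(ParameterGrid(param_grid))  # 3 * 3 * 3 * 3 combinations
total_fits = n_candidates * cv_folds

# Adding one more parameter with 3 values triples the work
extended_grid = dict(param_grid, subsample=[0.6, 0.8, 1.0])
extended_fits = len(ParameterGrid(extended_grid)) * cv_folds

print(n_candidates, total_fits, extended_fits)
```

Every extra parameter multiplies the number of fits by the number of values it takes, which is how a grid reaches millions of fits.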
Finally, let’s take a look at which features the model considers most important.
You can get all the Python code used in this tutorial on the LightGBM Classifier in Machine Learning below.
I hope you liked this tutorial on LightGBM in Machine Learning using Python. Feel free to ask your questions in the comments section below.