Boosting is an algorithmic paradigm that arose from a theoretical question and has become a very practical machine learning tool. The boosting approach uses a generalization of linear predictors to address two major problems. In this article, I will introduce you to boosting algorithms and their types in Machine Learning.
Introduction to Boosting Algorithms
Boosting is a method of amplifying the accuracy of weak learners. The first problem addressed by boosting algorithms is the bias-complexity tradeoff: the error of an ERM learner can be decomposed into a sum of approximation error and estimation error. The more expressive the hypothesis class, the smaller the approximation error, but the larger the estimation error.
A learner is therefore faced with the problem of choosing a good compromise between these two considerations. The boosting paradigm allows the learner to have fluid control over this compromise. Learning begins with a base class (which may have a large approximation error), and as it progresses the class to which the predictor may belong grows.
The second problem that boosting addresses is the computational complexity of learning: the task of finding an ERM hypothesis may be computationally infeasible. A boosting algorithm amplifies the accuracy of weak learners.
Intuitively, one can think of a weak learner as an algorithm that uses a simple “rule of thumb” to produce a hypothesis that comes from an easy-to-learn hypothesis class and that performs slightly better than a random guess.
When a weak learner can be implemented efficiently, boosting provides a tool for aggregating these weak hypotheses to gradually approximate good predictors for larger, harder-to-learn classes.
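To make the “rule of thumb” idea concrete, here is a minimal sketch of one classic weak learner, a decision stump: a classifier that thresholds a single feature. The function name, toy data, and weighting scheme below are illustrative assumptions, not part of any particular library.

```python
import numpy as np

def train_stump(X, y, weights):
    """Pick the single feature, threshold, and polarity that minimize
    weighted 0/1 error -- a simple 'rule of thumb' weak learner."""
    best = (float("inf"), 0, 0.0, 1)  # (error, feature, threshold, polarity)
    n, d = X.shape
    for j in range(d):
        for thr in np.unique(X[:, j]):
            for polarity in (1, -1):
                pred = np.where(polarity * (X[:, j] - thr) >= 0, 1, -1)
                err = weights[pred != y].sum()
                if err < best[0]:
                    best = (err, j, thr, polarity)
    return best

# toy data: labels follow the sign of the first feature
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] > 0, 1, -1)
w = np.full(200, 1 / 200)            # uniform example weights
err, feature, thr, polarity = train_stump(X, y, w)
```

Boosting algorithms repeatedly call such a learner with reweighted examples, so each new stump focuses on the points the current ensemble gets wrong.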
Types of Boosting Algorithms
Hope you now understand what boosting algorithms are and what kinds of problems they are used for. Now let’s look at the types of boosting algorithms in Machine Learning.
AdaBoost is short for Adaptive Boosting. The AdaBoost algorithm produces a hypothesis that is a linear combination of simple hypotheses. In other words, AdaBoost relies on the family of hypothesis classes obtained by composing a linear predictor on top of simple classes.
AdaBoost allows us to control the trade-off between approximation and estimation errors by varying a single parameter. AdaBoost illustrates a general theme of broadening the expressiveness of linear predictors by composing them on top of other functions. You can learn the practical implementation of the AdaBoost algorithm from here.
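As a quick sketch of this single-parameter trade-off, here is AdaBoost via scikit-learn (an assumed dependency; the dataset is synthetic). `AdaBoostClassifier` uses decision stumps as its default weak learner, and `n_estimators` is the one knob that grows the class the final predictor may belong to.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# synthetic binary classification task
X, y = make_classification(n_samples=500, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

# n_estimators controls the approximation/estimation trade-off:
# more rounds = a richer linear combination of weak hypotheses
clf = AdaBoostClassifier(n_estimators=50, random_state=42)
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
```

Raising `n_estimators` shrinks the approximation error but eventually increases the estimation error, which is the fluid control over the trade-off described above.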
XGBoost is short for Extreme Gradient Boosting. When using gradient boosting for regression, the weak learners are regression trees, and each regression tree maps an input data point to one of its leaves, which contains a continuous score.
XGBoost minimizes a regularized objective function (with L1 and L2 penalties) that combines a convex loss function (based on the difference between the predicted and target outputs) and a penalty term for the complexity of the model (in other words, the regression tree functions).
Training proceeds iteratively, adding new trees that predict the residuals, or errors, of the previous trees; these are then combined with the previous trees to make the final prediction. It is called gradient boosting because it uses a gradient descent algorithm to minimize the loss when adding new models. You can learn more about the practical implementation of the XGBoost algorithm from here.
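The iterative residual-fitting loop can be sketched with scikit-learn’s `GradientBoostingRegressor`, which implements the same idea as XGBoost without the extra regularization and systems optimizations (using sklearn here is an assumption for a self-contained example; the dataset is synthetic).

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# synthetic regression task with noisy targets
X, y = make_regression(n_samples=500, n_features=10, noise=10.0,
                       random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# each of the n_estimators trees is fit to the residuals of the
# current ensemble; learning_rate shrinks each tree's contribution
model = GradientBoostingRegressor(n_estimators=200, learning_rate=0.1,
                                  max_depth=3, random_state=0)
model.fit(X_tr, y_tr)
r2 = model.score(X_te, y_te)
```

Each shallow tree is a weak regressor on its own; summing their shrunken predictions is what gradually drives the loss down.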
Hope you liked this article on boosting algorithms in machine learning and their types. Please feel free to ask your valuable questions in the comments section below.