In this article, I will take you through the use cases of the most important Machine Learning algorithms to help you decide which one to choose for your problem.
How To Choose a Machine Learning Algorithm?
There are already many Machine Learning algorithms, and researchers keep developing more. For a beginner, one of the biggest problems is deciding which algorithm to use for a given task.
The table below lists the most popular Machine Learning algorithms, the type of task each one handles, and its typical use cases, so that you can match an algorithm to your problem.
|Algorithm|Type|Use case|
|---|---|---|
|Linear Regression|Regression|Models a scalar target as a function of one or more quantitative features. Although the model is a linear combination of its inputs, features can first be transformed by nonlinear functions if the relationships are known or can be guessed.|
|Logistic Regression|Classification|Classifies observations based on quantitative features; predicts the target class or the probabilities of the target classes.|
|SVM|Classification/Regression|Classification based on separation in a high-dimensional space. Predicts target classes; target class probabilities require additional calculation. Regression uses a subset of the data (the support vectors), and performance is highly dependent on the data.|
|KNN|Classification/Regression|Targets are computed from the training examples that are “closest” to the test example according to a distance measure (e.g., Euclidean distance). For classification, the nearest training examples “vote”; for regression, their targets are averaged. Predictions rely only on a “local” subset of the data, but can be very accurate for some data sets.|
|Decision Trees|Classification/Regression|Training data is recursively split into subsets based on attribute value tests, and decision trees that predict targets are derived. Produces understandable models, but random forest and boosting algorithms almost always produce lower error rates.|
|Random Forest|Classification/Regression|An ensemble of decision trees is used to produce a stronger prediction than a single decision tree. For classification, the trees “vote”; for regression, their results are averaged.|
|Boosting|Classification/Regression|For multi-tree methods, boosting algorithms reduce the generalization error by adjusting the weights to give more weight to misclassified examples or (for regression) to those with larger residuals.|
|Naïve Bayes|Classification|A simple and scalable classification algorithm used in particular for text classification tasks (e.g., spam classification). It assumes independence between features (hence “naive”), which is rarely the case, but the algorithm works surprisingly well in specific cases. It uses Bayes’ theorem but is not “Bayesian” in the sense the term is used in statistics.|
|Neural Network|Classification/Regression|Used to estimate unknown functions based on a large number of inputs, trained via the backpropagation algorithm. Usually more complex and more computationally expensive than the other methods, but powerful for some problems. The basis for many deep learning methods.|
|XGBoost|Classification/Regression|A highly optimized and scalable implementation of boosted decision trees.|
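To make one row of the table concrete, here is a minimal sketch of KNN in plain Python. It illustrates the idea described above: find the closest training examples by Euclidean distance, then let them vote (classification) or average their targets (regression). The function names, the toy data, and the choice of `k=3` are all assumptions made for this illustration; in practice you would more likely use a library implementation.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(train_X, train_y, query, k):
    """Return the targets of the k training points closest to the query."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: euclidean(pair[0], query))
    return [target for _, target in ranked[:k]]


def knn_classify(train_X, train_y, query, k=3):
    """Classification: the k nearest neighbours vote on the class."""
    votes = Counter(k_nearest(train_X, train_y, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train_X, train_y, query, k=3):
    """Regression: average the targets of the k nearest neighbours."""
    neighbours = k_nearest(train_X, train_y, query, k)
    return sum(neighbours) / len(neighbours)


# Toy data (hypothetical): two well-separated clusters of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]
labels = ["low", "low", "low", "high", "high", "high"]
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

predicted_label = knn_classify(points, labels, (2.0, 2.0), k=3)
predicted_value = knn_regress(points, values, (2.0, 2.0), k=3)
print(predicted_label, predicted_value)  # low 2.0
```

Because the prediction depends only on the few nearest points, the result is “local”: the three points around (8, 8) have no influence on a query near (2, 2).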
I hope you liked this article on how to choose a Machine Learning algorithm. Feel free to ask your questions in the comments section below.