The Support Vector Machine (SVM) is a powerful and flexible class of supervised machine learning algorithms for classification and regression tasks. In this article, I will walk you through a machine learning tutorial on the Support Vector Machine using Python.
Support Vector Machine
In machine learning, support vector machines are a set of supervised learning algorithms that can be used for both classification and regression. In their basic form they are linear classifiers that separate the classes with a maximum-margin hyperplane, and kernels extend them to non-linear decision boundaries. Simply put, maximizing the margin, together with a regularization parameter, helps the Support Vector Machine resist overfitting and often gives good accuracy.
The great strength of SVM is that training is relatively simple: it is a convex optimization problem, so it has no local optima, unlike neural networks. It also scales relatively well to very high-dimensional data. Now in the section below, I will walk you through a tutorial on how to implement a Support Vector Machine on the very popular ‘Iris’ dataset.
Support Vector Machine Tutorial using Python
Support Vector Machine is one of the most widely used approaches for data modelling. Because it controls generalization through the margin rather than through the number of features, it copes well with high-dimensional inputs. Now let’s start with the task of implementing the SVM algorithm on a dataset. I’ll start by importing the dataset and the libraries needed for data visualization:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

iris = sns.load_dataset('iris')
set(iris['species'])
{'setosa', 'versicolor', 'virginica'}
iris.head()
| | sepal_length | sepal_width | petal_length | petal_width | species |
---|---|---|---|---|---|
0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |
Data Visualization:
Now let’s visualize some of the important features in the data to understand what we are working with:
sns.pairplot(iris,hue='species',palette='Dark2')

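setosa = iris[iris['species'] == 'setosa']
# Bivariate density of sepal width vs. sepal length for the setosa class only.
# Recent seaborn releases expect x=/y= keywords and fill= in place of shade=;
# the default thresh leaves the lowest-density region unshaded, as shade_lowest=False did.
sns.kdeplot(x=setosa['sepal_width'], y=setosa['sepal_length'], cmap="plasma", fill=True)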

To train the SVM classifier I will split the data into training and test sets:
from sklearn.model_selection import train_test_split

X = iris.drop('species', axis=1)
y = iris['species']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)
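The split above is random, so the numbers reported later will vary from run to run. As a minimal sketch of a reproducible variant, the example below fixes a seed and stratifies by class. It assumes scikit-learn's bundled copy of Iris (loaded with `load_iris`, which uses integer labels rather than species names), and the seed value `42` is an arbitrary choice for illustration, not something taken from this tutorial.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# scikit-learn's bundled Iris data: 150 rows, 4 features, integer class labels.
X, y = load_iris(return_X_y=True)

# random_state=42 is an arbitrary seed (an assumption for this sketch);
# stratify=y keeps the three classes equally represented in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
print(Counter(y_test))  # number of test samples per class
```

With a stratified split, each class contributes the same number of test samples, which makes per-class metrics easier to compare.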
Now let’s train the model. I will import the SVC model from scikit-learn and simply fit it to the training data:
from sklearn.svm import SVC
from sklearn.metrics import classification_report, confusion_matrix

svc_model = SVC()
svc_model.fit(X_train, y_train)
predictions = svc_model.predict(X_test)
print(confusion_matrix(y_test, predictions))
[[16  0  0]
 [ 0 14  0]
 [ 0  2 13]]
print(classification_report(y_test,predictions))
              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        16
  versicolor       0.88      1.00      0.93        14
   virginica       1.00      0.87      0.93        15

    accuracy                           0.96        45
   macro avg       0.96      0.96      0.95        45
weighted avg       0.96      0.96      0.96        45
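To see where the numbers in the report come from, the precision, recall and accuracy can be recomputed by hand from the confusion matrix printed earlier. The sketch below uses plain Python and assumes scikit-learn's convention that rows are true classes and columns are predicted classes; `per_class_metrics` is a hypothetical helper written for this illustration.

```python
# Confusion matrix from the first SVC run: rows = true class, columns = prediction.
labels = ["setosa", "versicolor", "virginica"]
cm = [
    [16, 0, 0],
    [0, 14, 0],
    [0, 2, 13],
]

def per_class_metrics(cm, labels):
    """Return {label: (precision, recall)} for a square confusion matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        true_pos = cm[i][i]
        predicted = sum(row[i] for row in cm)  # column sum: everything predicted as this class
        actual = sum(cm[i])                    # row sum: everything that truly is this class
        metrics[label] = (true_pos / predicted, true_pos / actual)
    return metrics

metrics = per_class_metrics(cm, labels)
# Accuracy is the diagonal (correct predictions) divided by all predictions.
accuracy = sum(cm[i][i] for i in range(len(cm))) / sum(map(sum, cm))

for label, (precision, recall) in metrics.items():
    print(f"{label:<12} precision={precision:.2f} recall={recall:.2f}")
print(f"accuracy={accuracy:.2f}")
```

The two virginica flowers predicted as versicolor are what lower virginica's recall (13 of 15) and versicolor's precision (14 of 16).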
So the model performed pretty well. Let’s see if we can tune the hyperparameters to do even better. An improvement is unlikely, and with such a small dataset you would probably be happy with these results, but I just want you to practice using GridSearch:
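A minimal sketch of such a search with scikit-learn's `GridSearchCV` could look like the following. The grid of `C` and `gamma` values, the 5-fold cross-validation and the seeded, stratified split are all assumptions made for this illustration, not the settings behind the results in this article. The block reloads Iris with `load_iris` so that it runs on its own; inside the tutorial notebook you could reuse the existing `X_train`, `X_test`, `y_train` and `y_test` instead.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42, stratify=y
)

# Hypothetical search grid: these C and gamma values are illustrative
# assumptions, not the values used for the article's results.
param_grid = {
    "C": [0.1, 1, 10, 100],
    "gamma": [1, 0.1, 0.01, 0.001],
}

# 5-fold cross-validation on the training set only; with the default
# refit=True, the best combination is retrained on all of X_train.
grid = GridSearchCV(SVC(), param_grid, cv=5)
grid.fit(X_train, y_train)

print(grid.best_params_)
grid_predictions = grid.predict(X_test)
print(confusion_matrix(y_test, grid_predictions))
print(classification_report(y_test, grid_predictions))
```

The results that follow are the ones reported for the article's own run, so a rerun with this sketch's grid and split may give slightly different numbers.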
[[16  0  0]
 [ 0 14  0]
 [ 0  0 15]]

              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        16
  versicolor       1.00      1.00      1.00        14
   virginica       1.00      1.00      1.00        15

    accuracy                           1.00        45
   macro avg       1.00      1.00      1.00        45
weighted avg       1.00      1.00      1.00        45
I hope you liked this machine learning tutorial on the Support Vector Machine. Feel free to ask your questions in the comments section below.