Support Vector Machine Tutorial using Python

The Support Vector Machine (SVM) is a powerful and flexible class of supervised machine learning algorithms for classification and regression tasks. In this article, I will walk you through a tutorial on the Support Vector Machine using Python.

Support Vector Machine

In machine learning, support vector machines are a set of supervised learning algorithms that can be used for both classification and regression. They belong to the family of generalized linear classifiers. Simply put, an SVM looks for the decision boundary with the largest margin between the classes, and this margin maximization, together with regularization, helps it resist overfitting and generalize well to unseen data.
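
To make the idea of a margin concrete, here is a minimal sketch on a tiny, made-up 2D dataset (the points and the names `X_toy`, `y_toy` and `toy_model` are assumptions for illustration, not part of the Iris tutorial). It fits a linear SVM and prints the support vectors, which are the training points that define the boundary:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two well-separated clusters in 2D
X_toy = np.array([[0, 0], [0, 1], [1, 0],
                  [3, 3], [3, 4], [4, 3]], dtype=float)
y_toy = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel keeps the decision boundary a straight line
toy_model = SVC(kernel='linear', C=1.0)
toy_model.fit(X_toy, y_toy)

# Only the points closest to the boundary end up as support vectors
print("Support vectors:\n", toy_model.support_vectors_)
print("Prediction for (0.5, 0.5):", toy_model.predict([[0.5, 0.5]])[0])
print("Prediction for (3.5, 3.5):", toy_model.predict([[3.5, 3.5]])[0])
```

Points far from the boundary do not affect the fitted model, which is one reason an SVM can stay compact even on larger datasets.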

A great strength of SVM is that training is comparatively straightforward: the optimization problem is convex, so it has no local optima, unlike neural networks. It also copes well with high-dimensional data. In the section below, I will walk you through a tutorial on how to implement a Support Vector Machine on the very popular ‘Iris’ dataset.

Support Vector Machine Tutorial using Python

Support Vector Machine is one of the most widely used approaches for data modelling. It controls model complexity through the margin rather than through the number of features, which helps it generalize even when the dimensionality is high. Now let’s start with the task of implementing the SVM algorithm on a dataset. I’ll start by importing the dataset and the libraries needed for data visualization:

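# Libraries for data handling and visualization
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Jupyter magic: render plots inline in the notebook
%matplotlib inline

# Load the Iris dataset bundled with seaborn
iris = sns.load_dataset('iris')
# The species column holds three classes:
# {'setosa', 'versicolor', 'virginica'}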
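
Before plotting, it can help to take a quick look at the table itself. The sketch below is one way to do that, assuming scikit-learn's bundled copy of Iris (so it runs without a download) with its columns renamed to match the seaborn version; the name `iris_df` is hypothetical:

```python
from sklearn.datasets import load_iris

# Assumption: use scikit-learn's bundled Iris data instead of sns.load_dataset
bunch = load_iris(as_frame=True)
iris_df = bunch.data.copy()
iris_df.columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
# Map the integer targets (0, 1, 2) to the species names
iris_df['species'] = bunch.target.map(dict(enumerate(bunch.target_names)))

print(iris_df.shape)                      # rows and columns
print(iris_df['species'].value_counts())  # samples per class
print(iris_df.describe())                 # summary statistics per feature
```

The dataset is small and balanced, which is worth keeping in mind when reading the evaluation scores later on.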

Data Visualization:

Now let’s visualize some of the important features in the data to understand what we are working with:

[Figure: pairplot of the Iris dataset]
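# Density of sepal width vs. sepal length for the setosa species
setosa = iris[iris['species'] == 'setosa']
# Recent seaborn versions take x/y as keywords and use fill/thresh
# in place of the older shade/shade_lowest arguments
sns.kdeplot(x=setosa['sepal_width'], y=setosa['sepal_length'],
            cmap="plasma", fill=True, thresh=0.05)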
[Figure: KDE plot of sepal width against sepal length for setosa]

To train the SVM classifier I will split the data into training and test sets:

from sklearn.model_selection import train_test_split
X = iris.drop('species',axis=1)
y = iris['species']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)
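
The split above is random, so the scores change from run to run. If you want a reproducible split with the same share of each species in both sets, a minimal sketch could look like this. The `stratify` and `random_state` arguments, the seed value, and the `_demo` variable names are my additions, and the data comes from scikit-learn's bundled copy of Iris:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: scikit-learn's bundled Iris data, features and labels separately
X_demo, y_demo = load_iris(return_X_y=True, as_frame=True)

# stratify keeps class proportions equal in both sets;
# random_state (arbitrary seed) makes the split repeatable
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.30, stratify=y_demo, random_state=42)

print(X_tr.shape, X_te.shape)            # sizes of the two sets
print(y_te.value_counts().sort_index())  # test samples per class
```

With 150 samples and `test_size=0.30`, 45 samples go to the test set, which matches the support total in the reports below.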

Now let’s train the model. I will import the SVC model from scikit-learn, fit it to the training data, and evaluate its predictions on the test set:

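from sklearn.svm import SVC
from sklearn.metrics import classification_report, confusion_matrix

# Fit an SVC with default settings, then predict on the held-out data
svc_model = SVC()
svc_model.fit(X_train, y_train)
predictions = svc_model.predict(X_test)
print(confusion_matrix(y_test, predictions))
print(classification_report(y_test, predictions))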
[[16  0  0]
 [ 0 14  0]
 [ 0  2 13]]
              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        16
  versicolor       0.88      1.00      0.93        14
   virginica       1.00      0.87      0.93        15

    accuracy                           0.96        45
   macro avg       0.96      0.96      0.95        45
weighted avg       0.96      0.96      0.96        45
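
The numbers in the report follow directly from the confusion matrix, where rows are the true classes and columns are the predicted classes. As a small worked check, the sketch below recomputes precision, recall and accuracy from the matrix printed above using NumPy (the variable names are mine):

```python
import numpy as np

# Confusion matrix from the output above (rows: true, columns: predicted)
cm = np.array([[16, 0, 0],
               [0, 14, 0],
               [0, 2, 13]])
labels = ['setosa', 'versicolor', 'virginica']

correct = np.diag(cm)                  # correctly classified per class
precision = correct / cm.sum(axis=0)   # divide by column sums (predicted counts)
recall = correct / cm.sum(axis=1)      # divide by row sums (true counts)
accuracy = correct.sum() / cm.sum()

for name, p, r in zip(labels, precision, recall):
    print(f"{name:>10}  precision={p:.2f}  recall={r:.2f}")
print(f"accuracy={accuracy:.2f}")
```

Two virginica flowers were predicted as versicolor, which lowers the recall of virginica (13 of 15) and the precision of versicolor (14 of 16).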

So the model performed pretty well. Let’s see if we can tune the hyperparameters to do even better. That is unlikely, and you would probably be happy with these results as the dataset is quite small, but I just want you to practice using GridSearch:
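
A grid search of this kind might look like the following. This is a minimal sketch: the grid values for `C` and `gamma`, the 5-fold cross-validation, the fixed `random_state`, and the `_gs` variable names are all assumptions, and it uses scikit-learn's bundled copy of Iris, so its output will not necessarily match the results shown next:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Assumption: bundled Iris data and a fixed, stratified 70/30 split
X_gs, y_gs = load_iris(return_X_y=True)
X_gs_train, X_gs_test, y_gs_train, y_gs_test = train_test_split(
    X_gs, y_gs, test_size=0.30, stratify=y_gs, random_state=42)

# Illustrative grid: every combination of C and gamma is tried
param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [1, 0.1, 0.01, 0.001]}

# 5-fold cross-validation on the training set picks the best combination,
# then GridSearchCV refits that model on the whole training set
grid_search = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=5)
grid_search.fit(X_gs_train, y_gs_train)

print("Best parameters:", grid_search.best_params_)
grid_predictions = grid_search.predict(X_gs_test)
print(confusion_matrix(y_gs_test, grid_predictions))
print(classification_report(y_gs_test, grid_predictions))
```

Because the search only ever sees the training set, the test set stays untouched until the final evaluation.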

[[16  0  0]
 [ 0 14  0]
 [ 0  0 15]]

              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        16
  versicolor       1.00      1.00      1.00        14
   virginica       1.00      1.00      1.00        15

    accuracy                           1.00        45
   macro avg       1.00      1.00      1.00        45
weighted avg       1.00      1.00      1.00        45

I hope you liked this tutorial on the Support Vector Machine in machine learning. Feel free to ask your valuable questions in the comments section below.

Aman Kharwal
