Support Vector Machines Tutorial — Learn to implement SVM in Python

Arpit Bhushan Sharma
4 min readMay 2, 2020

--

A few days ago, I was a little bit confuse about, how my Google Photos find-out the number of faces in my library and cluster them one by one in each cluster by faces.

Then I came to know about the Support Vector Machine algorithm of Machine Learning it makes a boundary between different faces and by K-Means Clustering it all makes different Clusters at a time.

Still, Confuse ?? Read the text below:

Introduction to Support Vector Machines

SVMs are the most popular algorithm for classification in machine learning algorithms. Their mathematical background is quintessential in building the foundational block for the geometrical distinction between the two classes. We will see how Support vector machines work by observing their implementation in Python and finally, we will look at some of the important applications.

Introduction to SVM

What is SVM?

Support Vector Machines are a type of supervised machine learning algorithm that provides analysis of data for classification and regression analysis. While they can be used for regression, SVM is mostly used for classification. We carry out plotting in the n-dimensional space. The value of each feature is also the value of the specified coordinate. Then, we find the ideal hyperplane that differentiates between the two classes.

These support vectors are the coordinate representations of individual observation. It is a frontier method for segregating the two classes.

How does SVM work?

The basic principle behind the working of Support vector machines is simple — Create a hyperplane that separates the dataset into classes. Let us start with a sample problem. Suppose that for a given dataset, you have to classify white balls from black balls. Your goal is to create a line that classifies the data into two classes, creating a distinction between white balls from black balls.

Prediction of SVM

While one can hypothesize a clear line that separates the two classes, there can be many lines that can do this job. Therefore, there is not a single line that you can agree on which can perform this task. Let us visualize some of the lines that can differentiate between the two classes as follows –

Getting Classifier for right SVM

In the above visualizations, we have a red straight line and a red curved line. Which one do you think would better differentiate the data into two classes? If you choose the red straight line, then it is the ideal line that partitions the two classes properly. However, we still have not concretized the fact that it is the universal line that would classify our data most efficiently.

At this point, you can’t miss learning about Artificial Neural NetworkAccording to SVM, we have to find the points that lie closest to both the classes. These points are known as support vectors. In the next step, we find the proximity between our dividing plane and the support vectors. The distance between the points and the dividing line is known as margin. The aim of an SVM algorithm is to maximize this very margin. When the margin reaches its maximum, the hyperplane becomes the optimal one.

SVM is Classifies and a data is differentiated between two

The SVM model tries to enlarge the distance between the two classes by creating a well-defined decision boundary. In the above case, our hyperplane divided the data. While our data was in 2 dimensions, the hyperplane was of 1 dimension. The decision boundary is made and the data is classified between two, For higher dimensions, say, an n-dimensional Euclidean Space, we have an n-1 dimensional subset that divides the space into two disconnected components.

--

--

Arpit Bhushan Sharma
Arpit Bhushan Sharma

Written by Arpit Bhushan Sharma

An AlphaCoder Guy, who loves Data Structures Algorithms and Machine Learning.

No responses yet