Unsupervised learning encompasses all types of machine learning where there is no known output and no teacher to instruct the learning algorithm. In this article, I’ll introduce you to unsupervised machine learning and its types.
What is Unsupervised Learning?
In unsupervised learning, the learning algorithm is shown only the input data and asked to extract knowledge from it; no known output is provided. Although there are many successful applications of these methods, they are generally harder to understand and evaluate than supervised methods.
For supervised and unsupervised learning tasks, it is important to have a representation of your input data that is understandable by a computer. It is often useful to think of your data as a table.
Every data point that you want to reason about (every email, every customer, every transaction) is a row, and every property that describes that data point (for example, a customer’s age, or the amount or location of a transaction) is a column.
You can describe users by their age, gender, when they created an account, and how often they purchased from your online store. You can describe the image of a tumour by the grayscale values of each pixel, or perhaps by using the size, shape and colour of the tumour.
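As a minimal sketch of this table view, the snippet below turns a few made-up customer records into a matrix with one row per customer and one column per property. The records, the property names and the use of NumPy are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical customer records: each dict is one data point (one row).
customers = [
    {"age": 34, "account_age_days": 420, "purchases_per_month": 2.5},
    {"age": 51, "account_age_days": 1310, "purchases_per_month": 0.4},
    {"age": 23, "account_age_days": 35, "purchases_per_month": 6.0},
]

# Fix a column order so every row lists its properties the same way.
feature_names = ["age", "account_age_days", "purchases_per_month"]

# X has one row per customer and one column per property.
X = np.array([[c[name] for name in feature_names] for c in customers], dtype=float)

print(X.shape)  # (3, 3)
```

A categorical property such as gender would first need to be encoded as numbers before it could be added as a column.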
Types of Unsupervised Learning
Now let’s look at the two types of unsupervised machine learning: transformations of the dataset and clustering.
Transformations of the Dataset:
Unsupervised transformations of a dataset are algorithms that create a new representation of the data that may be easier for humans or other machine learning algorithms to understand than the original representation.
A common application of unsupervised transformations is dimensionality reduction, which takes a high-dimensional representation of the data, consisting of many features, and finds a new way to represent it that summarizes the essential characteristics with fewer features. A typical use of dimensionality reduction is reducing the data to two dimensions for visualization.
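One well-known way to do this is principal component analysis (PCA). The article does not name a specific method, so treat the following as an illustrative sketch: it reduces synthetic 10-feature data to two dimensions using NumPy's singular value decomposition. The function name `pca_2d` and the random data are assumptions.

```python
import numpy as np

def pca_2d(X):
    """Project the rows of X onto their first two principal components."""
    # PCA works on centered data, so subtract each column's mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:2].T

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))  # synthetic data: 100 points, 10 features
X_2d = pca_2d(X)

print(X_2d.shape)  # (100, 2)
```

The two resulting columns can be passed directly to a scatter plot, which is the visualization use case mentioned above.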
Another application of unsupervised transformations is finding the parts or components that make up the data. One common example is the extraction of topics from collections of text documents.
Here the task is to find the unknown topics discussed across the collection and to learn which topics appear in each document. This can be useful for following the discussion of topics such as elections, gun control or pop stars on social media.
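Topic extraction can be sketched with non-negative matrix factorization (NMF), one of several techniques used for this; the article itself does not prescribe a method. The example factors a tiny made-up document-word count matrix `V` into document-topic weights `W` and topic-word weights `H` using the standard multiplicative updates. The vocabulary, the counts, the choice of two topics and the function name `nmf` are all assumptions.

```python
import numpy as np

def nmf(V, n_topics, n_iter=200, seed=0):
    """Approximate V (docs x words) as W (docs x topics) @ H (topics x words)."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = V.shape
    # Start from strictly positive random factors.
    W = rng.random((n_docs, n_topics)) + 0.1
    H = rng.random((n_topics, n_words)) + 0.1
    eps = 1e-9  # avoids division by zero
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical word counts: one row per document, one column per word.
vocab = ["vote", "election", "ballot", "guitar", "concert", "album"]
V = np.array([
    [3, 2, 1, 0, 0, 0],
    [2, 3, 2, 0, 0, 0],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 0, 1, 2, 3],
], dtype=float)

W, H = nmf(V, n_topics=2)

# Show the three highest-weighted words of each learned topic.
for k, row in enumerate(H):
    top_words = [vocab[i] for i in np.argsort(row)[::-1][:3]]
    print(f"topic {k}: {top_words}")
```

Each row of `H` describes one topic as weights over words, and each row of `W` says how strongly each topic appears in a document. The number of topics has to be chosen up front in this sketch.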
Clustering algorithms, on the other hand, partition data into separate groups of similar items. Take the example of uploading photos to a social networking site. To help organize your images, the site may want to group images showing the same person.
However, the site doesn’t know which images show whom, and it doesn’t know how many different people appear in your photo collection. A sensible approach is to extract all the faces and divide them into groups of faces that look alike. Hopefully, each group corresponds to one person, and the images can be grouped together for you.
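Because the number of people is not known in advance, the sketch below uses a simple stand-in that does not need a group count: single-linkage grouping with a distance threshold, where any two points closer than the threshold end up in the same group. This is not how a real photo site works. The two-dimensional "face" vectors, the threshold of 1.0 and the function name `group_by_distance` are assumptions made for illustration.

```python
import numpy as np

def group_by_distance(X, threshold):
    """Single-linkage grouping: points closer than `threshold` share a group."""
    n = len(X)
    parent = list(range(n))  # union-find forest, one tree per group

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Merge the groups of every pair of points that are close enough.
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(X[i] - X[j]) < threshold:
                parent[find(i)] = find(j)

    # Relabel group roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]

# Hypothetical face descriptors: similar faces have nearby vectors.
faces = np.array([
    [0.0, 0.1],
    [0.1, 0.0],
    [5.0, 5.1],
    [5.1, 4.9],
    [9.0, 0.0],
])

labels = group_by_distance(faces, threshold=1.0)
print(labels)  # [0, 0, 1, 1, 2]
```

The algorithm finds three groups without being told how many to look for. The price is that the threshold has to be chosen instead, and nothing confirms that each group really is one person.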
When To Use Unsupervised Learning?
Unsupervised machine learning algorithms are often used in an exploratory setting, when a data scientist wants to understand the data better, rather than as part of a larger automated system.
Another common application of unsupervised machine learning algorithms is as a preprocessing step for supervised algorithms. Learning a new representation of the data can sometimes improve the accuracy of supervised algorithms, or reduce memory and time consumption.
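To illustrate the preprocessing idea, here is a small constructed example in which rescaling the features changes the prediction of a one-nearest-neighbour classifier. The scaling is learned from the inputs alone, without labels. The data are made up so that the first feature carries the class information and the second is only large in magnitude. The helper names `fit_scaler`, `transform` and `predict_1nn` are hypothetical.

```python
import numpy as np

def fit_scaler(X):
    """Learn per-column mean and standard deviation from inputs only (no labels)."""
    return X.mean(axis=0), X.std(axis=0)

def transform(X, mean, std):
    """Rescale each column to zero mean and unit variance."""
    return (X - mean) / std

def predict_1nn(X_train, y_train, x):
    """Return the label of the training point closest to x."""
    distances = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(distances)]

# Constructed data: feature 0 separates the classes, feature 1 is large noise.
X_train = np.array([[1.0, 1000.0], [1.2, 5000.0], [3.0, 1200.0], [3.2, 5200.0]])
y_train = np.array([0, 0, 1, 1])
x_new = np.array([1.1, 5150.0])  # resembles class 0 on the informative feature

mean, std = fit_scaler(X_train)
raw_pred = predict_1nn(X_train, y_train, x_new)
scaled_pred = predict_1nn(
    transform(X_train, mean, std), y_train, transform(x_new, mean, std)
)

print(raw_pred, scaled_pred)  # 1 0
```

On the raw data the large second feature dominates the distance, so the new point is assigned to class 1. After rescaling, both features contribute comparably and the point is assigned to class 0. This shows only that a new representation can help, not that it always will.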
I hope you liked this introduction to unsupervised machine learning, its types, and when to use it. Feel free to ask your valuable questions in the comments section below.