Visualizing K-Means Clustering. How K-Means algorithm works

Visualizing K-Means Clustering tutorial

Introducing K-Means

K-Means is an algorithm that is used in a situation where you are given a dataset where each sample has a set of features, but no labels. In a situation like this we can try and find groups of data, which are similar to one another. Similar data points would stay close to each other. This way they create groups, or as we call it clusters.

K-Means is a clustering algorithm. Because K-Means makes inferences from dataset using only input vectors, without referring to known, or labeled samples, it makes it an unsupervised learning algorithm.

The goal when using K-Means is simple. Given a dataset, group similar data points together and discover underlying patterns.

In order to define the clusters K-Means uses centroids. Centroids are points that are used to describe the cluster. That way a point is considered to be in a particular cluster, is if it is closer to that cluster’s centroid than any other centroid.

On the image above we have three clusters of points (green, blue and yellow). Each cluster is described by one centroid (black point).

Visualizing K-Means Clustering

First we choose the number of clusters “K” (the number of groups we want to find in the data). Then the centroids are initialized at random. Once the centroids are initialized we are ready to do the first iteration.

The iteration consists of two steps. Firstly, we assign each point to a cluster whose centroid is nearest to it. In the second step, we calculate the centroid’s location as the mean (center) of all the points assigned to its cluster. And that’s it. We are repeating these two steps until the centroids stop moving, or no new data points are assigned to any cluster.

K-Means in action

Centroid initialization K-Means — *Image 2:* Initializing Centroids

In this image we see the two centroids (blue and red), as well as out data points (grey). The data points are grey because they do not belong to any cluster at the moment.

The important part you need to note is the initial centroid initialization. The value K (number of clusters) is two. One centroid is blue the other is red in color. In the first step, we are initializing them at a random location. At this point we are not assigning any data points to the clusters, we are just initializing the centroids.

Now we are ready to do the two step iteration. Firstly, we are assigning all data points to the closest cluster. The data points closest to the blue centroid are blue. Similarly, data points closest to the red centroid are red

Assigning data points to the closest cluster — *Image 3: Assigning data points to closest cluster*

After that, the second step is calculating the centroid position. In this example we are calculating the mean distance of all points belonging to that cluster.

Here are a couple of more examples:

*Image 4: Assigning data points to closest cluster*

Image 4: Assigning data points to closest cluster — *Image 5: Assigning data points to closest cluster*

The best way to visualize and learn about K-Means is to watch the video and debug the code.

12 thoughts on “Visualizing K-Means Clustering. How K-Means algorithm works”

Taufan

January 12, 2020 at 6:36 am

I need your hand .. Please send source code example.. thx a lot Master

Reply
vanco
January 19, 2020 at 12:50 pm
You can find the source code at: http://code-ai.mk/wp-content/uploads/2018/11/KMeansGUI.zip

Harsh

November 19, 2020 at 7:51 pm

Hi sir, please update the above link of source code. I request.

Reply
vanco
November 28, 2020 at 8:54 am
Here you can find the complete source code: http://code-ai.mk/wp-content/uploads/2018/11/KMeansGUI.zip

Buy Best Proxies

March 23, 2024 at 10:48 pm

whoah this blog is great i love reading your posts. Keep up the great work! You know, many people are searching around for this information, you could aid them greatly.

Buy Best Private Proxies

April 9, 2024 at 6:23 pm

Oh my goodness! an incredible article dude. Thank you However I am experiencing subject with ur rss . Don’t know why Unable to subscribe to it. Is there anyone getting equivalent rss drawback? Anyone who knows kindly respond. Thnkx

Usa Private Proxies

April 10, 2024 at 1:59 pm

I wish to get across my passion for your kindness for people that really need help with this field. Your very own commitment to getting the message along became wonderfully advantageous and have really made most people just like me to get to their aims. Your entire warm and helpful facts means a whole lot to me and still more to my mates. Best wishes; from each one of us.

Where To Buy Proxies

April 16, 2024 at 9:13 pm

Thankyou for sharing the information with us.

Gsa Ser Proxies

April 17, 2024 at 12:10 am

some really wonderful blog posts on this website , thanks for contribution.

Your Private Proxy Search

April 17, 2024 at 1:41 am

I would like to get across my appreciation for your kindness supporting individuals who have the need for help on your field. Your personal commitment to getting the solution all over became wonderfully informative and have surely enabled most people much like me to reach their goals. This helpful tutorial implies a great deal to me and far more to my peers. Thanks a ton; from all of us.

Sslprivateproxy

April 17, 2024 at 4:55 am

It’s laborious to seek out educated individuals on this topic, but you sound like you recognize what you’re speaking about! Thanks

Free Private Proxy List

April 17, 2024 at 5:22 am

I genuinely enjoy looking at on this website , it contains great posts.

DevInDeep