Visualizing K-Means Clustering. How K-Means algorithm works

Visualizing K-Means Clustering tutorial

Source Code Link: K-Means example

More on K-Means at: Tech In Deep

Introducing K-Means

K-Means is an algorithm for situations where you are given a dataset in which each sample has a set of features but no labels. In that case we can try to find groups of data points that are similar to one another. Similar data points lie close to each other, and in this way they form groups, which we call clusters.

K-Means is a clustering algorithm. Because it makes inferences from a dataset using only input vectors, without referring to known or labeled samples, it is an unsupervised learning algorithm.

The goal when using K-Means is simple. Given a dataset, group similar data points together and discover underlying patterns.

To define the clusters, K-Means uses centroids. A centroid is a point that describes a cluster: a data point is considered to belong to a particular cluster if it is closer to that cluster’s centroid than to any other centroid.
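The membership rule can be sketched in a few lines of Python. This is only an illustration, not the code from the linked project: the function name `nearest_centroid`, the sample coordinates, and the choice of Euclidean distance are assumptions made for the example.

```python
import math

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point`.

    Distance is plain Euclidean distance (an assumption for this sketch).
    """
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

# Three hypothetical centroids, one per cluster.
centroids = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

print(nearest_centroid((1.0, 2.0), centroids))  # 0
print(nearest_centroid((8.0, 9.0), centroids))  # 1
```

The point (1, 2) lands in cluster 0 because the centroid at (0, 0) is nearer to it than either of the other two.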

Image 1: K-Means clustering

In the image above we have three clusters of points (green, blue and yellow). Each cluster is described by one centroid (black point).

Visualizing K-Means Clustering

First we choose the number of clusters “K” (the number of groups we want to find in the data). Then the centroids are initialized at random. Once the centroids are initialized we are ready to do the first iteration.

Each iteration consists of two steps. First, we assign each point to the cluster whose centroid is nearest to it. Second, we recalculate each centroid’s location as the mean (center) of all the points assigned to its cluster. And that’s it. We repeat these two steps until the centroids stop moving or no data point changes its cluster.
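Put together, the whole procedure might look like the sketch below. It is a minimal illustration under stated assumptions, not the implementation from the linked source code: the names `kmeans`, `max_iterations` and `seed` are made up for the example, the starting centroids are drawn from the data points themselves (one common way to initialize at random), and a centroid with no assigned points is simply left where it is.

```python
import math
import random

def kmeans(points, k, max_iterations=100, seed=0):
    """Cluster `points` (tuples of floats) into `k` groups.

    Returns (centroids, assignments), where assignments[i] is the
    cluster index of points[i].
    """
    rng = random.Random(seed)
    # Initialization (assumption): use k distinct data points as centroids.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    assignments = None

    for _ in range(max_iterations):
        # Step 1: assign every point to the cluster with the nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Stop when no point changed its cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments

        # Step 2: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )

    return centroids, assignments

# Two well-separated groups of three points each.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (50.0, 50.0), (51.0, 50.0), (50.0, 51.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

Which group receives label 0 and which receives label 1 depends on the random starting centroids, so only the grouping itself is meaningful.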

K-Means in action

Image 2: Initializing Centroids

In this image we see the two centroids (blue and red), as well as our data points (grey). The data points are grey because they do not belong to any cluster yet.

The important part to note is the centroid initialization. The value of K (the number of clusters) is two: one centroid is blue and the other is red. In the first step we initialize them at random locations. At this point we are not assigning any data points to the clusters; we are just initializing the centroids.
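One possible way to place the centroids at random is to draw each coordinate uniformly from the range covered by the data. The sketch below assumes exactly that; the function name `init_centroids`, the `seed` parameter and the sample points are hypothetical, and the linked project may initialize differently.

```python
import random

def init_centroids(points, k, seed=None):
    """Return k random centroids inside the bounding box of `points`."""
    rng = random.Random(seed)
    dims = list(zip(*points))          # one tuple of values per dimension
    lows = [min(d) for d in dims]
    highs = [max(d) for d in dims]
    return [
        tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
        for _ in range(k)
    ]

# Grey (unassigned) data points and K = 2, as in the image.
data = [(1.0, 4.0), (2.0, 8.0), (6.0, 3.0), (9.0, 7.0)]
blue, red = init_centroids(data, k=2, seed=42)
print("blue centroid:", blue)
print("red centroid:", red)
```

No data point is touched here; the function only produces the two starting positions.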

Now we are ready to do the two-step iteration. First, we assign every data point to the closest cluster. The data points closest to the blue centroid become blue. Similarly, the data points closest to the red centroid become red.

Image 3: Assigning data points to closest cluster

After that, the second step is recalculating the centroid positions. Each centroid moves to the mean position of all the points belonging to its cluster.
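As a small worked example of this update step, the new centroid is the per-coordinate average of the cluster's points. The helper name `mean_point` and the three sample points are made up for illustration.

```python
def mean_point(points):
    """Return the per-coordinate mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

# Hypothetical points currently assigned to the blue cluster.
blue_points = [(1.0, 1.0), (2.0, 3.0), (3.0, 2.0)]

print(mean_point(blue_points))  # (2.0, 2.0)
```

Here x is (1 + 2 + 3) / 3 = 2 and y is (1 + 3 + 2) / 3 = 2, so the blue centroid moves to (2, 2).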

Here are a couple more examples of the assignment step:

Image 4: Assigning data points to closest cluster
Image 5: Assigning data points to closest cluster

The best way to visualize and learn about K-Means is to watch the video and debug the code.

12 thoughts on “Visualizing K-Means Clustering. How K-Means algorithm works”

  1. Reply
    Taufan
    January 12, 2020 at 6:36 am

    I need your hand .. Please send source code example.. thx a lot Master

    1. Reply
      vanco
      January 19, 2020 at 12:50 pm

      You can find the source code at: http://code-ai.mk/wp-content/uploads/2018/11/KMeansGUI.zip

  2. Reply
    Harsh
    November 19, 2020 at 7:51 pm

    Hi sir, please update the above link of source code. I request.

    1. Reply
      vanco
      November 28, 2020 at 8:54 am

      Here you can find the complete source code: http://code-ai.mk/wp-content/uploads/2018/11/KMeansGUI.zip
