Machine Learning Times
Machine Learning Times
EXCLUSIVE HIGHLIGHTS
Video – Alexa On The Edge – A Case Study in Customer-Obsessed Research from Susanj of Amazon
 Event: Machine Learning Week 2021 Keynote: Alexa On The Edge...
Why AI Isn’t Going to Replace Data Scientists Any Time Soon
 Should data scientists consider AI a threat to their...
“Doing AI” Is a Mistake that Detracts from Real Problem-Solving
  A note from Executive Editor Eric Siegel: Richard...
Getting the Green Light for a Machine Learning Project
  This article is based on the transcript of...
SHARE THIS:

5 months ago
Clustergam: Visualisation of Cluster Analysis

 
Originally published in MARTIN FLEISCHMANN, April 27, 2021.

When we want to do some cluster analysis to identify groups in our data, we often use algorithms like K-Means, which require the specification of a number of clusters. But the issue is that we usually don’t know how many clusters there are.

There are many methods on how to determine the correct number, like silhouettes or elbow plot, to name a few. But they usually don’t give much insight into what is happening between different options, so the numbers are a bit abstract.

Matthias Schonlau proposed another approach – a clustergram. Clustergram is a two-dimensional plot capturing the flows of observations between classes as you add more clusters. It tells you how your data reshuffles and how good your splits are. Tal Galili later implemented clustergram for K-Means in R. And I have used Tal’s implementation, ported it to Python and created clustergram – a Python package to make clustergrams.

clustergram currently supports K-Means and using scikit-learn (inlcuding Mini-Batch implementation) and RAPIDS.AI cuML (if you have a CUDA-enabled GPU), Gaussian Mixture Model (scikit-learn only) and hierarchical clustering based on scipy.hierarchy. Alternatively, we can create clustergram based on labels and data derived from alternative custom clustering algorithms. It provides a sklearn-like API and plots clustergram using matplotlib, which gives it a wide range of styling options to match your publication style.

To continue reading this article, click here.

Leave a Reply