Machine Learning Times
EXCLUSIVE HIGHLIGHTS
2 More Ways To Hybridize Predictive AI And Generative AI
  Originally published in Forbes Predictive AI and generative AI...
How To Overcome Predictive AI’s Everyday Failure
  Originally published in Forbes Executives know the importance of predictive...
Our Last Hope Before The AI Bubble Detonates: Taming LLMs
  Originally published in Forbes To know that we’re in...
The Agentic AI Hype Cycle Is Out Of Control — Yet Widely Normalized
  Originally published in Forbes I recently wrote about how...
SHARE THIS:

5 years ago
Clustergam: Visualisation of Cluster Analysis

 
Originally published in MARTIN FLEISCHMANN, April 27, 2021.

When we want to do some cluster analysis to identify groups in our data, we often use algorithms like K-Means, which require the specification of a number of clusters. But the issue is that we usually don’t know how many clusters there are.

There are many methods on how to determine the correct number, like silhouettes or elbow plot, to name a few. But they usually don’t give much insight into what is happening between different options, so the numbers are a bit abstract.

Matthias Schonlau proposed another approach – a clustergram. Clustergram is a two-dimensional plot capturing the flows of observations between classes as you add more clusters. It tells you how your data reshuffles and how good your splits are. Tal Galili later implemented clustergram for K-Means in R. And I have used Tal’s implementation, ported it to Python and created clustergram – a Python package to make clustergrams.

clustergram currently supports K-Means and using scikit-learn (inlcuding Mini-Batch implementation) and RAPIDS.AI cuML (if you have a CUDA-enabled GPU), Gaussian Mixture Model (scikit-learn only) and hierarchical clustering based on scipy.hierarchy. Alternatively, we can create clustergram based on labels and data derived from alternative custom clustering algorithms. It provides a sklearn-like API and plots clustergram using matplotlib, which gives it a wide range of styling options to match your publication style.

To continue reading this article, click here.

Comments are closed.