Dealing with Overconfidence in Neural Networks: Bayesian Approach

 
Originally published in Jonathan Ramkissoon Blog, July 29, 2020.

I trained a multi-class classifier on images of cats, dogs, and wild animals, then passed it an image of myself; the model was 98% confident I'm a dog. The problem isn't that I passed an inappropriate image, because models in the real world are fed all sorts of garbage. It's that the model is overconfident about an image far away from the training data. Instead, we would expect a more uniform distribution over the classes. This overconfidence makes it difficult to post-process model output (setting a threshold on predictions, etc.), which means it needs to be dealt with in the architecture.
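
To make the failure concrete, here is a tiny illustration (mine, not from the original post) of why a plain softmax head behaves this way: even a modest gap between logits is squashed into near-certainty, whereas far from the training data we would prefer something close to the uniform distribution mentioned above.

```python
# Illustration (assumed, not the post's code): softmax turns modest logit gaps
# into near-certain predictions, so an out-of-distribution input can still
# receive ~97% confidence instead of the near-uniform output we would prefer.
import numpy as np

def softmax(z):
    z = z - z.max()          # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum()

print(softmax(np.array([5.0, 1.0, 0.5])))  # ~[0.97, 0.02, 0.01] -- "confidently a dog"
print(np.full(3, 1 / 3))                   # the roughly uniform output we'd expect far from the data
```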

In this post I explore a Bayesian method for dealing with overconfident predictions on inputs far away from the training data in neural networks. The method, called last-layer Laplace approximation (LLLA), was proposed in a paper published at ICML 2020.
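
Before the details, here is a minimal sketch of the idea in PyTorch. It is my own illustration under stated assumptions, not the paper's code: the classifier's final layer is assumed to be `model.fc`, `feature_fn(x)` is assumed to return the penultimate features, and a diagonal generalized Gauss-Newton approximation stands in for the Kronecker-factored and full covariances considered in the paper. Only the last-layer weights get a Gaussian posterior; everything else stays at its trained value, and predictions average the softmax over samples from that posterior.

```python
# Sketch of last-layer Laplace approximation (LLLA), assuming a trained
# classifier whose final linear layer is `model.fc` (an assumption) and a
# `feature_fn` returning the penultimate features phi(x).
import torch
import torch.nn.functional as F

@torch.no_grad()
def fit_last_layer_laplace(model, feature_fn, loader, prior_precision=1.0):
    """Fit a diagonal Gaussian posterior over the last-layer weights."""
    W = model.fc.weight                                   # (classes, features), MAP estimate
    precision = torch.full_like(W, prior_precision)       # prior precision on each weight
    for x, _ in loader:
        phi = feature_fn(x)                               # (batch, features)
        p = F.softmax(phi @ W.T + model.fc.bias, dim=-1)  # (batch, classes)
        lam = (p * (1 - p)).T                             # diagonal GGN curvature per class
        precision += lam @ (phi ** 2)                     # accumulate (classes, features)
    return W, 1.0 / precision                             # posterior mean and diagonal variance

@torch.no_grad()
def llla_predict(model, feature_fn, x, mean, var, n_samples=100):
    """Monte Carlo predictive distribution under the last-layer posterior."""
    phi = feature_fn(x)
    probs = 0.0
    for _ in range(n_samples):
        W_s = mean + var.sqrt() * torch.randn_like(mean)  # sample last-layer weights
        probs = probs + F.softmax(phi @ W_s.T + model.fc.bias, dim=-1)
    return probs / n_samples
```

The appeal of this setup is that the Laplace approximation is fit after ordinary training, so the network's point predictions are untouched; only the predictive uncertainty changes, and far from the training data the averaged softmax moves toward uniform.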

Why is this a problem?

You might argue "you only trained the classifier on animals, of course it breaks when you show it a human", and you'd be right. However, in the real world we can't separate animal images from non-animal images before sending them to the model, so we need it to be robust to garbage input. The animal-human example replicates this on a small scale (one image). Properly quantifying uncertainty matters because we, the practitioners training the models, can't trust the model's ability to generalize if it assigns arbitrarily high confidence to garbage input.

Softmax Classifier

The 3-class classifier was trained on images of cats, dogs, and wild animals from a dataset available on Kaggle.
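
For context, here is a minimal sketch of the kind of 3-class softmax classifier described here, assuming the Kaggle images are arranged in class subfolders; the folder layout, backbone, and hyperparameters are my assumptions, not the post's code.

```python
# Sketch of a 3-class softmax classifier (assumed setup, not the post's code).
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
train_ds = datasets.ImageFolder("data/train", transform=tfm)        # cat / dog / wild subfolders
loader = torch.utils.data.DataLoader(train_ds, batch_size=64, shuffle=True)

model = models.resnet18(weights="IMAGENET1K_V1")                     # pretrained backbone (assumption)
model.fc = nn.Linear(model.fc.in_features, 3)                        # 3 animal classes

opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for epoch in range(3):
    for x, y in loader:
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)              # standard softmax cross-entropy
        loss.backward()
        opt.step()
```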

To continue reading this article, click here.
