Machine Learning Times
Machine Learning Times
EXCLUSIVE HIGHLIGHTS
Elon Musk Predicts Artificial General Intelligence In 2 Years. Here’s Why That’s Hype
 Originally published in Forbes, April 10, 2024 When OpenAI’s...
Survey: Machine Learning Projects Still Routinely Fail to Deploy
 Originally published in KDnuggets. Eric Siegel highlights the chronic...
Three Best Practices for Unilever’s Global Analytics Initiatives
    This article from Morgan Vawter, Global Vice...
Getting Machine Learning Projects from Idea to Execution
 Originally published in Harvard Business Review Machine learning might...
SHARE THIS:

4 years ago
Dealing with Overconfidence in Neural Networks: Bayesian Approach

 
Originally published in Jonathan Ramkissoon Blog, July 29, 2020.

I trained a multi-class classifier on images of cats, dogs and wild animals and passed an image of myself, it’s 98% confident I’m a dog. The problem isn’t that I passed an inappropriate image, because models in the real world are passed all sorts of garbage. It’s that the model is overconfident about an image far away from the training data. Instead we expect a more uniform distribution over the classes. The overconfidence makes it difficult to post-process model output (setting a threshold on predictions, etc.), which means it needs to be dealt with by the architecture.

In this post I explore a Bayesian method for dealing with overconfident predictions for inputs far away from training data in neural networks. The method is called last layer Laplace approximation (LLLA) and was proposed in this paper published in ICML 2020.

Why is this a problem?

You might argue “you only trained the classifier on animals, of course it breaks when you show it a human”, and you’re right. However, in the real world, we aren’t able to filter out animal images from non-animal images before sending it to the model, so we need it to be robust to garbage input. The animal-human example tries to replicate this on a small scale (one image). Properly quantifying uncertainty is important because we (as the practitioners training the models) can’t be confident in the model’s ability to generalize if it assigns arbitrarily high confidence to garbage input.

Softmax Classifier

The 3-class classifier was trained on images of cats, dogs and wild animals taken from Kaggle that can be downloaded here.

To continue reading this article, click here.

Leave a Reply