Machine Learning Times
Machine Learning Times
EXCLUSIVE HIGHLIGHTS
Data Analytics in Higher Education
 Universities confront many of the same marketing challenges as...
How Generative AI Helps Predictive AI
 Originally published in Forbes, August 21, 2024 This is the...
4 Ways Machine Learning Can Perpetuate Injustice and What to Do About It
 Originally published in Built In, July 12, 2024 When ML...
The Great AI Myth: These 3 Misconceptions Fuel It
 Originally published in Forbes, July 29, 2024 The hottest thing...
SHARE THIS:

4 months ago
AI is vulnerable to attack. Can it ever be used safely?

 

Originally published in nature, July 25, 2024.

The models that underpin artificial-intelligence systems such as ChatGPT can be subject to attacks that elicit harmful behaviour. Making them safe will not be easy.

In 2015, computer scientist Ian Goodfellow and his colleagues at Google described what could be artificial intelligence’s most famous failure. First, a neural network trained to classify images correctly identified a photograph of a panda. Then Goodfellow’s team added a small amount of carefully calculated noise to the image. The result was indistinguishable to the human eye, but the network now confidently asserted that the image was of a gibbon1.

This is an iconic instance of what researchers call adversarial examples: inputs carefully crafted to deceive neural-network classifiers. Initially, many researchers thought that the phenomenon revealed vulnerabilities that needed to be fixed before these systems could be deployed in the real world — a common concern was that if someone subtly altered a stop sign it could cause a self-driving car to crash. But these worries never materialized outside the laboratory. “There are usually easier ways to break some classification system than making a small perturbation in pixel space,” says computer scientist Nicholas Frosst. “If you want to confuse a driverless car, just take the stop sign down.”

Fears that road signs would be subtly altered might have been misplaced, but adversarial examples vividly illustrate how different AI algorithms are to human cognition. “They drive home the point that a neural net is doing something very different than we are,” says Frosst, who worked on adversarial examples at Google in Mountain View, California, before co-founding AI company Cohere in Toronto, Canada.

To continue reading this article, click here.

Comments are closed.