Since 2019, some Big Tech firms have implemented AI red teams to reveal shortcomings, biases, and security flaws.
It was a snowy day in February, and Amanda Minnich was attacking an AI system.
Given a single block of code, and no other details, she needed to hack into one of the most complex machine learning systems operated by a Microsoft partner. Minnich tried a few different approaches, first attempting to confuse the system with a single image, then with several. Finally, she made a last-ditch effort to hoodwink the AI by replaying a sequence of images on a constant loop. Minnich likened it to Ocean's Eleven, where the robbers fool security by swapping the live feed for older security-camera footage.
It worked: She was in, with control over the AI system.
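The replay trick Minnich describes can be sketched in a few lines. The snippet below is a minimal, illustrative Python sketch only: it assumes a target model that classifies incoming frames, and every name in it (`run_attack`, `classify_frame`, the recorded frames) is a hypothetical stand-in, not a detail of the actual system she tested. The core idea is simply that a model with no freshness check on its input stream can't tell a looped recording from a live feed.

```python
import itertools
from typing import Callable, Iterable, List

# Hypothetical sketch of the replay idea: instead of sending live frames,
# the attacker loops a pre-recorded sequence so the model keeps seeing
# "known good" inputs, like replayed security-camera footage in a heist film.

def replay_frames(recorded: List[bytes]) -> Iterable[bytes]:
    """Yield a captured frame sequence on an endless loop."""
    return itertools.cycle(recorded)

def run_attack(
    recorded: List[bytes],
    classify_frame: Callable[[bytes], str],
    max_frames: int = 1000,
) -> None:
    """Feed replayed frames to the target model in place of a live feed."""
    for i, frame in enumerate(replay_frames(recorded)):
        if i >= max_frames:
            break
        verdict = classify_frame(frame)  # the target ML system's prediction
        print(f"frame {i}: model verdict = {verdict}")

if __name__ == "__main__":
    # Toy demonstration: a dummy classifier that always answers "benign"
    # stands in for the real model under test.
    frames = [b"frame-a", b"frame-b", b"frame-c"]
    run_attack(frames, classify_frame=lambda f: "benign", max_frames=5)
```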
Microsoft congratulated Minnich—breaking into AI systems was her job, after all. As a member of Microsoft’s AI “red team,” Minnich helps stress-test the company’s ML systems—the models, the training data that fuels them, and the software that helps them operate.
“Red teams” are relatively new to AI. The term traces back to 1960s military simulations run by the Department of Defense and is now common in cybersecurity, where internal IT teams are tasked with thinking like adversaries to uncover system vulnerabilities. But since 2019, Big Tech companies like Microsoft, Meta, and Google have implemented versions of AI red teams to reveal shortcomings, bias, and security flaws in their machine learning systems.