OpenAI’s new system is adept at turning text into images. But researchers say it also reinforces stereotypes against women and people of color.
MARCELO RINESI REMEMBERS what it was like to watch Jurassic Park for the first time in a theater. The dinosaurs looked so convincing that they felt like the real thing, a special effects breakthrough that permanently shifted people’s perception of what’s possible. After two weeks of testing DALL-E 2, the CTO of the Institute for Ethics and Emerging Technologies thinks AI might be on the verge of its own Jurassic Park moment.
Last month, OpenAI introduced the second-generation version of DALL-E, an AI model trained on 650 million images and text captions. It can take in text and spit out images, whether that’s a “Dystopian Great Wave off Kanagawa as Godzilla eating Tokyo” or “Teddy bears working on new AI research on the moon in the 1980s.” It can create variations based on the style of a particular artist, like Salvador Dalí, or popular software like Unreal Engine. Photorealistic depictions that look like the real world, shared widely on social media by a select number of early testers, have given the impression that the model can create images of almost anything. “What people thought might take five to 10 years, we’re already in it. We are in the future,” says Vipul Gupta, a PhD candidate at Penn State who has used DALL-E 2.
But amid promotional depictions of koalas and pandas spreading on social media is a notable absence: people’s faces. As part of OpenAI’s “red team” process—in which external experts look for ways things can go wrong before the product’s broader distribution—AI researchers found that DALL-E 2’s depictions of people can be too biased for public consumption. Early tests by red team members and OpenAI have shown that DALL-E 2 leans toward generating images of white men by default, overly sexualizes images of women, and reinforces racial stereotypes.