Machine Learning Times
How Facebook is Using AI to Improve Photo Descriptions for People Who Are Blind or Visually Impaired

 
Originally published in Tech@Facebook, Jan 19, 2021.

When Facebook users scroll through their News Feed, they find all kinds of content — articles, friends’ comments, event invitations, and of course, photos. Most people are able to instantly see what’s in these images, whether it’s their new grandchild, a boat on a river, or a grainy picture of a band onstage. But many users who are blind or visually impaired (BVI) can also experience that imagery, provided it’s tagged properly with alternative text (or “alt text”). A screen reader can describe the contents of these images using a synthetic voice and enable people who are BVI to understand images in their Facebook feed.
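For readers unfamiliar with alt text: it is an attribute attached to an image in a web page's markup, and a screen reader speaks it aloud in place of the image. A minimal illustration of the idea (the helper name and caption string here are hypothetical, not Facebook's code):

```python
def image_with_alt(src: str, alt: str) -> str:
    """Build an HTML <img> tag whose alt attribute a screen reader
    can speak aloud when the image itself cannot be seen."""
    return f'<img src="{src}" alt="{alt}">'

# A screen reader encountering this tag would read the alt text aloud.
tag = image_with_alt("boat.jpg", "A boat on a river at sunset")
```

When a photo is posted without that attribute, a screen reader has nothing to say, which is the gap AAT fills.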

Unfortunately, many photos are posted without alt text, so in 2016 we introduced a new technology called automatic alternative text (AAT). AAT — which was recognized in 2018 with the Helen Keller Achievement Award from the American Foundation for the Blind — utilizes object recognition to generate descriptions of photos on demand so that blind or visually impaired individuals can more fully enjoy their News Feed. We’ve been improving it ever since and are excited to unveil the next generation of AAT.

The latest iteration of AAT represents multiple technological advances that improve the photo experience for our users. First and foremost, we’ve expanded the number of concepts that AAT can reliably detect and identify in a photo by more than 10x, which in turn means fewer photos without a description. Descriptions are also more detailed, with the ability to identify activities, landmarks, types of animals, and so forth — for example, “May be a selfie of 2 people, outdoors, the Leaning Tower of Pisa.”
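The example description above suggests a template filled in from detected concepts, with "May be" hedging the fact that detection is probabilistic. A rough sketch of that idea, assuming (label, score) pairs from an object recognizer and a confidence threshold; the function names and threshold value are illustrative assumptions, not Facebook's implementation:

```python
def describe(concepts, face_count=0, confidence_floor=0.8):
    """Assemble an AAT-style description from (label, score) pairs,
    keeping only concepts the recognizer is reasonably sure about."""
    kept = [label for label, score in concepts if score >= confidence_floor]
    parts = []
    if face_count:
        parts.append(f"selfie of {face_count} people" if face_count > 1
                     else "selfie of 1 person")
    parts.extend(kept)
    if not parts:
        # Fall back when nothing clears the confidence floor.
        return "May be an image"
    return "May be a " + ", ".join(parts)

print(describe([("outdoors", 0.95), ("the Leaning Tower of Pisa", 0.90)],
               face_count=2))
# → "May be a selfie of 2 people, outdoors, the Leaning Tower of Pisa"
```

Expanding the detectable concept vocabulary by 10x, as the article describes, means this kind of template fires on far more photos and with richer detail.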

