Machine Learning Times
How Facebook is Using AI to Improve Photo Descriptions for People Who Are Blind or Visually Impaired

Originally published in [email protected], Jan 19, 2021.

When Facebook users scroll through their News Feed, they find all kinds of content — articles, friends’ comments, event invitations, and of course, photos. Most people are able to instantly see what’s in these images, whether it’s their new grandchild, a boat on a river, or a grainy picture of a band onstage. But many users who are blind or visually impaired (BVI) can also experience that imagery, provided it’s tagged properly with alternative text (or “alt text”). A screen reader can describe the contents of these images using a synthetic voice and enable people who are BVI to understand images in their Facebook feed.
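The mechanics described above are simple: a screen reader announces an image's alt text when it is present and has only a generic fallback when it is missing. As a rough illustration (not Facebook's implementation; the markup and fallback wording here are hypothetical), a minimal sketch using Python's standard-library HTML parser:

```python
from html.parser import HTMLParser

class AltTextReader(HTMLParser):
    """Collects what a screen reader would announce for each image in a page."""

    def __init__(self):
        super().__init__()
        self.announcements = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            # With alt text, the reader can describe the photo;
            # without it, the user gets only a generic placeholder.
            self.announcements.append(alt if alt else "Unlabeled image")

reader = AltTextReader()
reader.feed('<img src="grandchild.jpg" alt="A baby smiling in a high chair">'
            '<img src="band.jpg">')
print(reader.announcements)
```

The second image, posted without alt text, yields nothing useful to announce, which is exactly the gap AAT was built to fill.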

Unfortunately, many photos are posted without alt text, so in 2016 we introduced a new technology called automatic alternative text (AAT). AAT, which was recognized in 2018 with the Helen Keller Achievement Award from the American Foundation for the Blind, uses object recognition to generate descriptions of photos on demand so that people who are blind or visually impaired can more fully enjoy their News Feed. We've been improving it ever since and are excited to unveil the next generation of AAT.

The latest iteration of AAT represents multiple technological advances that improve the photo experience for our users. First and foremost, we’ve expanded the number of concepts that AAT can reliably detect and identify in a photo by more than 10x, which in turn means fewer photos without a description. Descriptions are also more detailed, with the ability to identify activities, landmarks, types of animals, and so forth — for example, “May be a selfie of 2 people, outdoors, the Leaning Tower of Pisa.”
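The hedged phrasing of that example ("May be ...") suggests descriptions assembled from recognized concepts that clear some confidence bar. As a hypothetical sketch only (the function name, concept labels, scores, and threshold are all invented for illustration, not Facebook's actual pipeline), one way such a template could be filled in:

```python
def build_description(concepts, threshold=0.8):
    """Join sufficiently confident detected concepts into a hedged caption.

    `concepts` is a list of (label, confidence) pairs, as an object-recognition
    model might produce; low-confidence detections are dropped.
    """
    kept = [label for label, score in concepts if score >= threshold]
    if not kept:
        return None  # no reliable detections: leave the photo undescribed
    return "May be " + ", ".join(kept)

# Hypothetical model output for the selfie example from the article.
detections = [
    ("a selfie of 2 people", 0.93),
    ("outdoors", 0.88),
    ("the Leaning Tower of Pisa", 0.85),
    ("a boat", 0.20),  # below threshold, excluded
]
print(build_description(detections))
```

Expanding the set of recognizable concepts by more than 10x, as the article describes, directly increases how often this kind of template can be filled in at all, and with how much detail.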

To continue reading this article, click here.
