Machine Learning Times

How Image Search Works at Dropbox

 
Originally posted in Dropbox.tech, May 11, 2021

Photos are among the most common types of files in Dropbox, but searching for them by filename is even less productive than it is for text-based files.  When you’re looking for that photo from a picnic a few years ago, you surely don’t remember that the filename set by your camera was 2017-07-04 12.37.54.jpg.

Instead, you look at individual photos, or thumbnails of them, and try to identify objects or aspects that match what you’re searching for, whether that’s to recover a photo you’ve stored or to discover the perfect shot for a new campaign in your company’s archives. Wouldn’t it be great if Dropbox could pore through all those images for you instead, and call out those that best match a few descriptive words you supplied? That’s pretty much what our image search does.

In this post we’ll describe the core idea behind our image content search method, based on techniques from machine learning, then discuss how we built a performant implementation on Dropbox’s existing search infrastructure.

1. Our approach

Here’s a simple way to state the image search problem: find a relevance function that takes a (text) query q and an image j, and returns a relevance score s indicating how well the image matches the query.

s = f(q, j)

Given this function, when a user does a search we run it on all their images and return those that produce a score above a threshold, sorted by their scores.  We build this function using two key developments in machine learning: accurate image classification and word vectors.
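The search loop described above (score every image, keep those above a threshold, sort by score) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Dropbox's implementation: the names search_images and toy_relevance, the threshold value, and the per-image category scores are all hypothetical, and the toy relevance function is a stand-in for the real f, which this excerpt does not define.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical image representation: a mapping from category name to a
# classifier confidence. The real feature format is not given in the text.
Image = Dict[str, float]


def search_images(
    query: str,
    images: Dict[str, Image],
    relevance: Callable[[str, Image], float],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Score every image with s = f(q, j), keep scores above the
    threshold, and return (image_id, score) pairs sorted best-first."""
    scored = [(image_id, relevance(query, image)) for image_id, image in images.items()]
    hits = [(image_id, s) for image_id, s in scored if s > threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits


def toy_relevance(query: str, image: Image) -> float:
    """Placeholder for f: the image's score for the category named by
    the query, or 0.0 if the image has no such category."""
    return image.get(query, 0.0)


if __name__ == "__main__":
    # Made-up data for illustration only.
    user_images = {
        "2017-07-04 12.37.54.jpg": {"picnic": 0.9, "dog": 0.2},
        "IMG_0001.jpg": {"picnic": 0.4},
        "IMG_0002.jpg": {"beach": 0.8},
    }
    for image_id, score in search_images("picnic", user_images, toy_relevance, 0.3):
        print(image_id, score)
```

Passing the relevance function in as a parameter keeps the filtering and sorting logic independent of how f is actually computed, so a stronger scoring model could be swapped in without touching the search loop.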

