Google Just Published 25 Million Free Datasets

By: Tom Watterman, Data Scientist, Facebook

Originally published in Medium, January 23, 2020

Note: Google’s new dataset search tool was publicly released on January 23rd, 2020.

Here’s what you need to know about the largest data repository in the world.

Google recently released Dataset Search, a free tool for searching 25 million publicly available datasets.

The search tool includes filters that limit results by license (free or paid), format (CSV, images, etc.), and time of last update.

Each result also includes a description of the dataset’s contents as well as author citations.

Google’s dataset aggregation methodology differs from that of other dataset repositories like Amazon’s open data registry. Unlike repositories that curate and host the datasets themselves, Google neither curates the 25 million datasets nor provides direct access to them.

Instead, Google relies on the dataset publishers to use the open standards of schema.org to describe their dataset’s metadata. Google then indexes and makes that metadata searchable across publishers.
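To make the mechanism concrete, here is a minimal sketch of the kind of schema.org metadata a publisher might embed in a dataset's landing page. The property names (`name`, `description`, `license`, `creator`, `distribution`, `encodingFormat`, `contentUrl`) come from the schema.org `Dataset` and `DataDownload` types; the helper functions, the example.org URLs, and the dataset itself are hypothetical and only illustrate the shape of the markup, not any particular publisher's setup.

```python
import json


def build_dataset_jsonld(name, description, url, license_url,
                         creator_name, download_url, encoding_format):
    """Return a schema.org Dataset description as a JSON-LD-style dict.

    Hypothetical helper for illustration; only the schema.org property
    names are taken from the public vocabulary.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "license": license_url,
        "creator": {"@type": "Organization", "name": creator_name},
        # The publisher still hosts the file; the metadata only points to it.
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": encoding_format,
                "contentUrl": download_url,
            }
        ],
    }


def to_script_tag(metadata):
    """Wrap the metadata in the script tag used to embed JSON-LD in HTML."""
    body = json.dumps(metadata, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# All values below are made up (example.org is a reserved domain).
example = build_dataset_jsonld(
    name="City Air Quality Readings",
    description="Hourly air quality sensor readings for a hypothetical city.",
    url="https://example.org/datasets/air-quality",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    creator_name="Example Environmental Agency",
    download_url="https://example.org/datasets/air-quality.csv",
    encoding_format="text/csv",
)

print(to_script_tag(example))
```

The printed snippet would sit in the HTML of the page that describes the dataset. A crawler that understands schema.org can then read the license, format, and download location without the data itself ever leaving the publisher's servers.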

Because publishers still host the datasets themselves, for-profit publishers that conform to schema.org standards also have their datasets indexed by Google. Anecdotally, I found that about half of the datasets in the search results came from for-profit aggregators, with an even higher share when searching for market-related datasets.

Other popular dataset publishers on the platform include government agencies and research institutions. Google claims that US government agencies alone have published over 2 million datasets.

The Machine Learning Times © 2026 • 1221 State Street • Suite 12, 91940 • Santa Barbara, CA 93190
Produced by: Rising Media & Prediction Impact