I’m probably not the first person to write about the insane leverage that LLMs confer on engineers, but Stack Overflow’s 28% layoff really got me thinking about the future of human-generated data, especially in the context of a potential model collapse (whereby “models forget the true underlying data distribution” once they are trained on machine-generated data). I explore whether we are at “peak data” in terms of both the quality and percentage of human-generated data on the internet, how this might affect the efficacy of future AI models, and the potential solutions/product opportunities that exist.
In the pre-ChatGPT days, my workflow as a big-tech engineer tackling a brand-new project was to read a lot of docs, Google/Stack Overflow my way through whatever inevitably broke, and eventually get something serviceable working. Now, ChatGPT is my tireless companion/engineer, consistently generating well-thought-out solutions (and the occasional snarky response).
I’ll illustrate via an example:
I’m currently using Google Firebase (and Firestore) to handle a lot of our backend business logic and data storage. Because Firestore is a document-based database, I asked ChatGPT to generate an entire “data model” for our app (I’m migrating from vanilla Postgres). It generated the full data model on the first try, even reminding me about the need for data denormalization.
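To make the denormalization point concrete, here’s a hypothetical sketch (the collection and field names are my own invention, not the actual app’s schema) of how a relational users/posts setup flattens into Firestore-style documents, duplicating the author’s display name onto each post so a feed can render without a join:

```python
# Hypothetical denormalized Firestore-style documents (plain dicts for
# illustration). In Postgres, a post would store only author_id and the
# display name would come from a JOIN; in a document database we copy it.

user_doc = {
    "id": "user_123",
    "display_name": "Ada",
    "email": "ada@example.com",
}

post_doc = {
    "id": "post_456",
    "author_id": user_doc["id"],
    # Denormalized copy: must be kept in sync if the user renames themselves.
    "author_display_name": user_doc["display_name"],
    "body": "Hello, Firestore!",
}

def render_feed_item(post: dict) -> str:
    """Render a feed line from the post document alone -- no user lookup."""
    return f'{post["author_display_name"]}: {post["body"]}'

print(render_feed_item(post_doc))  # prints "Ada: Hello, Firestore!"
```

The trade-off is the usual one: reads get cheap (one document fetch per feed item), while writes that touch duplicated fields must fan out to every copy.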
It then proceeded to help me generate rules for field-level access.
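For context, Firestore security rules can approximate field-level access by comparing the incoming write (`request.resource.data`) against the stored document (`resource.data`). A minimal sketch, with a hypothetical `users` collection, that lets a signed-in user edit their own profile but never change the `email` field:

```
rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    match /users/{userId} {
      // Any signed-in user may read a profile.
      allow read: if request.auth != null;
      // Only the owner may update, and email must stay unchanged.
      allow update: if request.auth != null
                    && request.auth.uid == userId
                    && request.resource.data.email == resource.data.email;
    }
  }
}
```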