Originally published in MIT News, March 25, 2024
Researchers demonstrate a technique that can be used to probe a model to see what it knows about new subjects.
Large language models, such as those that power popular artificial intelligence chatbots like ChatGPT, are incredibly complex. Even though these models are being used as tools in many areas, such as customer support, code generation, and language translation, scientists still don’t fully grasp how they work.
The researchers found a surprising result: large language models (LLMs) often use a remarkably simple linear function to recover and decode stored facts. Moreover, the model uses the same decoding function for similar types of facts. A linear function, an equation with no exponents, captures a straightforward, straight-line relationship between two variables.
The researchers showed that, by identifying linear functions for different facts, they can probe the model to see what it knows about new subjects, and where within the model that knowledge is stored.
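The idea can be sketched with a small synthetic experiment. In this illustration, a "fact" is a pair of vectors (a subject's hidden representation and the corresponding object's representation), and the relation between them is assumed to be an affine map o ≈ Ws + b. We fit that map by least squares on known pairs, then apply it to a new subject's representation, mimicking the probing step. All names, dimensions, and data here are assumptions for illustration, not the researchers' actual method or model internals.

```python
import numpy as np

# Illustrative sketch: suppose a single relation (e.g. "plays instrument")
# really is affine, so that o = W_true @ s + b_true + noise, where s is a
# subject representation and o the object representation. All data is
# synthetic; the dimensionality and sample count are arbitrary choices.

rng = np.random.default_rng(0)
d = 16   # hidden-state dimensionality (illustrative)
n = 50   # number of known (subject, object) pairs

W_true = rng.normal(size=(d, d))
b_true = rng.normal(size=d)
S = rng.normal(size=(n, d))                                # subject reps
O = S @ W_true.T + b_true + 0.01 * rng.normal(size=(n, d)) # object reps

# Estimate the linear "relation decoder" by least squares on an
# augmented design matrix [S | 1] (the column of ones absorbs b).
X = np.hstack([S, np.ones((n, 1))])
coef, *_ = np.linalg.lstsq(X, O, rcond=None)
W_est, b_est = coef[:d].T, coef[d]

# Probe a new subject: apply the learned function to its representation
# and compare against the ground-truth relation.
s_new = rng.normal(size=d)
o_pred = W_est @ s_new + b_est
o_true = W_true @ s_new + b_true
print(np.allclose(o_pred, o_true, atol=0.1))
```

In a real model one would read `s` out of a hidden layer and check whether `o_pred` lands near the correct object's representation; here the recovery succeeds because the synthetic data was built to be linear, which is exactly the surprising property the study reports for many stored facts.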