Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance

Apr 27, 2022

By: Sharan Narang and Aakanksha Chowdhery

Originally published in Google AI Blog, April 4, 2022.

In recent years, large neural networks trained for language understanding and generation have achieved impressive results across a wide range of tasks. GPT-3 first showed that large language models (LLMs) can be used for few-shot learning and can achieve impressive results without large-scale task-specific data collection or model parameter updating. More recent LLMs, such as GLaM, LaMDA, Gopher, and Megatron-Turing NLG, achieved state-of-the-art few-shot results on many tasks by scaling model size, using sparsely activated modules, and training on larger datasets from more diverse sources. Yet much work remains in understanding the capabilities that emerge with few-shot learning as we push the limits of model scale.

Last year Google Research announced our vision for Pathways, a single model that could generalize across domains and tasks while being highly efficient. An important milestone toward realizing this vision was to develop the new Pathways system to orchestrate distributed computation for accelerators. In “PaLM: Scaling Language Modeling with Pathways”, we introduce the Pathways Language Model (PaLM), a 540-billion parameter, dense decoder-only Transformer model trained with the Pathways system, which enabled us to efficiently train a single model across multiple TPU v4 Pods. We evaluated PaLM on hundreds of language understanding and generation tasks, and found that it achieves state-of-the-art few-shot performance across most tasks, by significant margins in many cases.

The Machine Learning Times © 2026 • 1221 State Street • Suite 12, 91940 • Santa Barbara, CA 93190
Produced by: Rising Media & Prediction Impact
