Do Wide and Deep Networks Learn the Same Things?

Originally published in Google AI Blog, May 4, 2021.

A common practice for improving a neural network’s performance and tailoring it to available computational resources is to adjust the architecture’s depth and width. Indeed, popular families of neural networks, including EfficientNet, ResNet, and Transformers, consist of sets of architectures with flexible depths and widths. However, beyond the effect on accuracy, there is limited understanding of how these fundamental architecture design choices affect the model, such as their impact on its internal representations.

In “Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth”, we perform a systematic study of the similarity between wide and deep networks from the same architectural family through the lens of their hidden representations and final outputs. In very wide or very deep models, we find a characteristic block structure in their internal representations, and establish a connection between this phenomenon and model overparameterization. Comparisons across models demonstrate that those without the block structure show significant similarity between representations in corresponding layers, but those containing the block structure exhibit highly dissimilar representations. These properties of the internal representations in turn translate to systematically different errors at the class and example levels for wide and deep models when they are evaluated on the same test set.

Comparing Representation Similarity with CKA

We extended prior work on analyzing representations by leveraging our previously developed Centered Kernel Alignment (CKA) technique, which provides a robust, scalable way to determine the similarity between the representations learned by any pair of neural network layers. CKA takes as input the representations (i.e., the activation matrices) from two layers, and outputs a similarity score between 0 (not at all similar) and 1 (identical representations).
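To make the similarity score concrete, below is a minimal NumPy sketch of the linear variant of CKA. It is an illustration, not code from the paper: the function name and the random example activations are hypothetical, and in practice CKA on large models is typically estimated from minibatches of activations rather than full activation matrices.

```python
import numpy as np

def linear_cka(x, y):
    """Linear Centered Kernel Alignment between two activation matrices.

    x: array of shape (n_examples, n_features_x), activations of one layer
    y: array of shape (n_examples, n_features_y), activations of another layer
    Returns a scalar similarity score in [0, 1].
    """
    # Center each feature dimension over the examples.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)

    # Linear-kernel CKA: ||y^T x||_F^2 / (||x^T x||_F * ||y^T y||_F).
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    return cross / (np.linalg.norm(x.T @ x, ord="fro") *
                    np.linalg.norm(y.T @ y, ord="fro"))

# Hypothetical usage with random "activations".
rng = np.random.default_rng(0)
acts = rng.normal(size=(1024, 512))               # one layer's activations
q, _ = np.linalg.qr(rng.normal(size=(512, 512)))  # random orthogonal matrix

print(linear_cka(acts, acts))      # 1.0: identical representations
print(linear_cka(acts, acts @ q))  # ~1.0: invariant to orthogonal transforms
```

The second check illustrates a design property that makes CKA well suited to comparing layers across networks: the score depends only on the Gram matrices of the representations, so it is invariant to orthogonal transformations, and two layers that encode the same information in rotated coordinates still score near 1.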

To continue reading this article, click here.
