Summary of Why Cerebras CEO Andrew Feldman Built The World's Largest Computer Chip Podcast Episode by Odd Lots

Overview of Why Cerebras CEO Andrew Feldman Built The World's Largest Computer Chip

This Odd Lots episode features Bloomberg’s Tracy Alloway and Joe Weisenthal interviewing Andrew Feldman, CEO and cofounder of Cerebras, the company behind an unusually large “wafer-scale” AI chip. The conversation focuses on why Cerebras built a dinner-plate-sized processor, how its architecture is designed to speed up AI inference, and what that means for the economics of AI, supply chains, competition with Nvidia, and the broader semiconductor and data center landscape.

Why Cerebras Built a Giant Chip

The core technical idea

Cerebras’ chip is a wafer-scale design: the processor occupies most of a silicon wafer instead of being one of many small dies cut from it, so it is far larger than a conventional chip.
Feldman argues that larger chips can process more information in less time, which is especially valuable for AI workloads.
The goal is to reduce latency by keeping compute and memory physically close together.

Why size matters

Traditional AI systems often rely on connecting many smaller chips together.
Cerebras instead places much more of the system on one massive piece of silicon to minimize data movement between chips, which Feldman identifies as a major bottleneck.
Feldman says the chip is about 58 times larger than any previous chip and helps the company achieve major speed advantages.

The memory advantage

Cerebras’ design allows it to use fast memory more effectively.
The key tradeoff is that fast memory typically stores less per square millimeter; Cerebras compensates by devoting far more silicon area to it than a conventional chip can.
Feldman argues this is a major reason Cerebras can be 15x faster than the fastest GPU, and on some workloads 50x, 100x, or even 1,000x faster.
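
The area-versus-density tradeoff described in this section can be made concrete with a back-of-the-envelope sketch. The figures below are hypothetical placeholders chosen for illustration, not Cerebras or GPU specifications, and the function name is an assumption. For simplicity, the sketch also assumes both chips devote the same share of their area to fast memory.

```python
def on_chip_capacity_mb(density_mb_per_mm2: float, area_mm2: float) -> float:
    """Total on-chip memory is storage density multiplied by silicon area."""
    return density_mb_per_mm2 * area_mm2


# Hypothetical, illustrative numbers only (not vendor specifications).
FAST_MEMORY_DENSITY = 0.5  # MB per mm^2; fast memory stores less per unit area
CONVENTIONAL_AREA = 800.0  # mm^2, a stand-in for a conventional-sized chip
AREA_MULTIPLE = 58         # the size multiple Feldman cites in the episode

small = on_chip_capacity_mb(FAST_MEMORY_DENSITY, CONVENTIONAL_AREA)
large = on_chip_capacity_mb(FAST_MEMORY_DENSITY, CONVENTIONAL_AREA * AREA_MULTIPLE)

print(f"conventional chip: {small:.0f} MB of fast memory")
print(f"wafer-scale chip:  {large:.0f} MB of fast memory")
```

At a fixed density, capacity scales directly with area, which is the sense in which a larger chip offsets the lower density of fast memory.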

AI Inference vs. Training

What Cerebras focuses on

The discussion makes a sharp distinction between:
- Training: building AI models
- Inference: using AI models
Feldman says Cerebras can do both, but current demand is especially strong in inference.

Why inference matters now

He argues that by 2025 AI models had become useful enough that demand for actual usage exploded, and that this demand is continuing.
That means the market is increasingly about serving fast responses, not just building bigger models.
He also pushes back on the idea that speed matters less for agentic AI, arguing that speed is important in all productive work.

The Economics of Speed and Tokens

Speed has real value

Feldman says customers are willing to pay a premium for faster tokens because speed increases productivity.
He cites Anthropic’s premium fast-token service as evidence that the market values latency reduction.

Cost per token

GPUs are efficient at producing slow tokens cheaply, but become much more expensive and power-hungry when trying to serve fast tokens.
Cerebras claims its architecture makes fast tokens much cheaper than GPUs, while using far less power.
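
One way to see why fast tokens can cost more on shared hardware is a simple throughput model. This is a sketch under stated assumptions, not a model presented in the episode: it assumes a machine with a fixed hourly cost that can either serve many users slowly or a few users quickly. All numbers and names are hypothetical.

```python
def cost_per_million_tokens(
    hourly_cost_usd: float,
    tokens_per_sec_per_user: float,
    concurrent_users: int,
) -> float:
    """Spread a fixed hourly hardware cost over the tokens produced in an hour."""
    tokens_per_hour = tokens_per_sec_per_user * concurrent_users * 3600
    return hourly_cost_usd * 1_000_000 / tokens_per_hour


# Hypothetical, illustrative numbers only.
HOURLY_COST = 36.0  # USD per hour for the machine

# Many users, each receiving tokens slowly.
slow = cost_per_million_tokens(HOURLY_COST, tokens_per_sec_per_user=50, concurrent_users=100)
# Few users, each receiving tokens quickly.
fast = cost_per_million_tokens(HOURLY_COST, tokens_per_sec_per_user=500, concurrent_users=2)

print(f"slow tokens: ${slow:.2f} per million")
print(f"fast tokens: ${fast:.2f} per million")
```

In this toy model, raising per-user speed while serving fewer users at once lowers total throughput, so the same hourly cost is spread over fewer tokens.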

Why this matters

The interview frames AI economics as a question of:
- how much speed is worth,
- what users are willing to pay for it,
- and how much of today’s AI spending is wasteful versus necessary.

Building the Chip: Engineering and Manufacturing Challenges

Why no one had done it before

Feldman says every prior wafer-scale effort in the 75-year history of computing had failed.
Cerebras had to solve problems across:
- lithography,
- materials,
- packaging,
- power delivery,
- cooling,
- and software.

TSMC’s role

Cerebras worked closely with TSMC, which Feldman calls the greatest manufacturing company in the world.
Despite the complexity, he says Cerebras has so far received the wafers it needed.
He notes that the hardest part of the supply chain is not just chips, but data center buildings and powered real estate.

Supply constraints

Cerebras avoids several of the most constrained AI supply chain bottlenecks:
- HBM memory shortages
- CoWoS packaging constraints
- TSMC’s most pressured 3-nanometer capacity
It uses 5-nanometer manufacturing instead.

Competition: Open Source, Closed Source, and Nvidia

Open source vs. closed source models

Feldman says open source models are generally cheaper per unit of intelligence because users are not paying the training cost.
Closed source models are slightly better, but only by a few percentage points in his view.
He does not think the market will settle on a single dominant model type; he expects a diverse market with many players.

Nvidia and CUDA

Feldman argues that CUDA, Nvidia’s software ecosystem, was hugely important in the early AI era.
But he says CUDA is far less important for inference and is losing relevance in training too.
His view is that switching workloads from GPUs to Cerebras is relatively straightforward.

Customers, Cloud, and the Business Model

Cerebras’ cloud offering

Cerebras also operates its own cloud and provides access to open source models.
Feldman says open source deployment on Cerebras is attractive because it is fast and cheaper since users are not paying model-training costs.

Major customers

He discusses:
- OpenAI
- AWS
- G42, the UAE’s AI champion
- healthcare and pharma customers such as Mayo Clinic and GlaxoSmithKline
G42 is a major customer and investor, using Cerebras for both training and inference in various data centers.

AWS integration

Feldman says Cerebras chips will be accessible through AWS Bedrock, where customers can choose faster inference options.
He expects that service to carry a premium price.

Geopolitics, Export Controls, and China

Export controls matter more now

Feldman says export restrictions were less important a few years ago, but are now highly relevant to the company.
He supports limiting China’s access to advanced AI capabilities, arguing that strategically sensitive technology should not be broadly diffused.

His view on policy

He acknowledges the debate over whether the U.S. should restrict China’s access to the technology or compete by allowing it.
But he comes down on the side of restriction, even if it forecloses some markets for Cerebras.

IPO, Wealth, and Long-Term Hardware Building

The IPO

Cerebras recently completed a major IPO, and Feldman describes the process as the result of years of work rather than a sudden windfall.
He says becoming a billionaire was not the big emotional moment; what mattered more was that the IPO created hundreds of millionaires across the company.

Hardware takes time

Feldman emphasizes that hardware innovation is slow, expensive, and iterative.
He says he likes the discipline of “measure twice, cut once” and that building hardware is fundamentally different from software’s move-fast-and-iterate culture.

Key Takeaways

Cerebras’ thesis: Bigger chips can reduce latency and improve AI performance, especially for inference.
Inference is the immediate prize: As AI usage grows, fast response time becomes a major economic advantage.
Supply chain bottlenecks are shifting: The main constraints are increasingly data centers, power, and real estate, not just chip fabrication.
CUDA is not the whole moat: Feldman believes Nvidia’s software advantage is shrinking, especially in inference.
Open source is real competition: He sees open source models as cheaper and increasingly viable, though still slightly behind closed source models.
Geopolitics matters: Export controls, China, and foreign customers like G42 are now central to AI chip strategy.

Notable Insight

“Speed matters in all aspects of productive work.”

That idea runs through the interview: Cerebras is betting that AI customers will increasingly pay for lower latency, and that a radically different chip architecture can turn speed into a durable economic advantage.


Odd Lots by Bloomberg