Overview of Dylan Patel - The Infinite Demand for Tokens, Claude Mythos, and Supply Constraints - [Invest Like the Best, EP.468]
Patrick O’Shaughnessy interviews Dylan Patel (founder & CEO of SemiAnalysis) about the explosive rise in token consumption (pay-per-token inference) across businesses, why frontier models (Anthropic’s Mythos / Opus series) are intensely desired, and the supply-side bottlenecks (memory, logic, fab equipment, GPUs, CPUs, FPGAs) that will constrain scaling. Patel uses SemiAnalysis’s own transition from modest AI spend to a multi‑million dollar run rate as a lens to explain demand dynamics, the economics of tokens, implementation changes, and the broader economic and political implications.
Key takeaways
- Token demand is exploding: SemiAnalysis moved from tens of thousands of dollars to roughly a $7M/year run rate on Anthropic-style tokens in a few months; other firms are seeing similar “AI psychosis” adoption spikes.
- Frontier models (Mythos / latest Opus releases) create outsized willingness to pay because they unlock higher-value use cases and are more token-efficient despite higher per-token prices.
- Implementation difficulty has collapsed: ideas are cheap; executing them (via models + cloud infrastructure) is now much easier — that accelerates release cadence and use-case proliferation.
- Supply-side constraints are real and multi-layered: DRAM/NAND capacity growth is limited, fabs take years to build, ASML/equipment supply is booked out, GPUs/CPUs/FPGAs are in tight supply — all driving higher prices and extended hardware useful life.
- Tokenomics (who uses tokens, for what value, and how that value diffuses in the economy) is the biggest unknown — hard to measure with existing GDP/statistics (Patel calls this “phantom GDP”).
- Socio-political risk: concentrated access and rapid disruption may provoke public backlash, protests, and regulatory/political responses.
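To put the run-rate figure in the first takeaway in perspective, here is a minimal sketch of the annualization arithmetic. The only number taken from the summary is the roughly $7M/year run rate; the function names and the 30-day example spend are hypothetical.

```python
def annual_run_rate(spend_usd: float, period_days: float) -> float:
    """Annualize spend observed over a period of `period_days` days."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return spend_usd * 365 / period_days


def implied_daily_spend(annual_run_rate_usd: float) -> float:
    """Average daily spend implied by an annual run rate."""
    return annual_run_rate_usd / 365


# Roughly $7M/year is the figure quoted in the summary.
daily = implied_daily_spend(7_000_000)
print(f"Implied average daily token spend: ${daily:,.0f}")

# Hypothetical: a firm that spent $30k over the last 30 days.
print(f"Annualized: ${annual_run_rate(30_000, 30):,.0f}")
```

A run rate extrapolates recent spend, so during a fast ramp like the one described it can sit far above what was actually spent over the trailing twelve months.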
Demand-side insights (tokens, models, and business examples)
- Behavioral shift: heavy model use now spans roles, from developers to economists, energy analysts, and reverse-engineering teams, including staff who are not traditionally technical. Daily spends of thousands of dollars by individual users are common.
- Examples from SemiAnalysis:
  - Reverse-engineering tool: one person built a GPU-accelerated application (using Claude tokens) to map chip materials from SEM images, previously a whole team’s job.
  - “Phantom GDP” metric: a single economist built 2,000-task language-model evals to measure which tasks are automatable and the deflationary impact on labor.
  - Energy grid mapping: one person scraped and combined public APIs to build an entire U.S. grid/demand mapping in weeks.
- Frontier vs. GPT-class: users prefer frontier models (best capability) even if more expensive per token; frontier models can be more cost-effective per task due to token efficiency.
- Enterprise contracts matter: per-token enterprise deals relax rate limits and unlock more productive usage; firms with capital can secure outsized early access and volume.
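The claim that a pricier frontier model can still be cheaper per task comes down to simple arithmetic: cost per task is tokens per task times price per token. The sketch below illustrates this under stated assumptions. Every price and token count is hypothetical and chosen only for illustration; none are figures from the episode or from any vendor's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    name: str
    price_per_million_tokens: float  # USD, hypothetical
    tokens_per_task: float           # average tokens to finish one task, hypothetical


def cost_per_task(model: ModelProfile) -> float:
    """Cost of one completed task: tokens used times price per token."""
    return model.tokens_per_task * model.price_per_million_tokens / 1_000_000


def break_even_token_ratio(expensive: ModelProfile, cheap: ModelProfile) -> float:
    """Largest fraction of the cheap model's token count the expensive model
    can use per task while still matching the cheap model's cost per task."""
    return cheap.price_per_million_tokens / expensive.price_per_million_tokens


# Hypothetical profiles: the frontier model costs 3x more per token
# but finishes the same task with far fewer tokens.
frontier = ModelProfile("frontier", price_per_million_tokens=30.0, tokens_per_task=40_000)
cheaper = ModelProfile("cheaper", price_per_million_tokens=10.0, tokens_per_task=150_000)

for m in (frontier, cheaper):
    print(f"{m.name}: ${cost_per_task(m):.2f} per task")

print(f"break-even token ratio: {break_even_token_ratio(frontier, cheaper):.2f}")
```

With these made-up numbers the frontier model wins on cost per task because it uses less than one third of the cheaper model's tokens, which is the break-even ratio for a 3x price gap. Real comparisons would also need to account for task success rates, which this sketch ignores.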
Model progress & implications
- Mythos (Anthropic): described as a material step up (equivalent to moving from an L4 to L6 engineer in capability), selectively released and priced higher — labs may limit broad release due to safety and capacity concerns.
- Release cadence is compressing: better tooling, cheaper and more accessible implementation, and concentrated compute let labs iterate faster; capability growth plus efficiency gains keep lowering the cost of any fixed level of capability.
- Outcome: more powerful models + falling per-capability cost → rapid proliferation of new, higher-value use cases.
Supply-side constraints and bottlenecks
- Memory (DRAM/NAND)
  - Capacity increases are low double digits per year; lead-times push meaningful supply increases out to 2027–2028.
  - Result: DRAM prices likely to remain elevated (Patel expects doubling/tripling from current levels), margin expansion for memory vendors.
- Logic / Foundries (TSMC, Samsung)
  - Fab capacity has long lead times; TSMC CapEx is ramping (Patel suggests a multi-year, potentially enormous CapEx path), but downstream equipment and materials become chokepoints.
- Semiconductor equipment (ASML, Carl Zeiss, Lam, Applied)
  - Tools are sold out / booked; supplier capacity and optical components constrain fab expansion.
- GPUs and compute
  - H100 and other datacenter GPUs are in short supply; useful life of clusters appears to be extending (longer than commonly assumed), pushing up used prices and margins in the cloud/hardware layer.
- CPUs, FPGAs, ASICs
  - CPUs are critical for reinforcement learning environments and for running the surrounding orchestration/deployment code, so demand for CPUs is high.
  - FPGA density per AI rack is growing; ASICs/custom chips are gaining traction for efficiency; all increase complexity and demand for varied supply chains.
- Upstream materials & components
  - Copper foil, glass fibers, lasers, and specialty materials are tight; prepayments are becoming common, increasing returns on invested capital for suppliers.
- Supply reorientation speed
  - Historically, shortages can flip to gluts, but the current hardware + supply-chain complexity means longer lead times and multiple multi-year bottlenecks.
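To see why "low double digits" capacity growth matters, it helps to compound it against demand. The sketch below is illustrative only: the 12% supply growth rate is an assumed stand-in for "low double digits", and the 50% demand growth rate is purely hypothetical, not a figure from the episode.

```python
import math


def compound(base: float, annual_growth: float, years: float) -> float:
    """Value of `base` after growing at `annual_growth` per year for `years` years."""
    return base * (1 + annual_growth) ** years


def demand_to_supply_ratio(years: float, supply_growth: float, demand_growth: float) -> float:
    """Demand divided by supply, with both indexed to 1.0 at year 0."""
    return compound(1.0, demand_growth, years) / compound(1.0, supply_growth, years)


def years_to_multiple(annual_growth: float, multiple: float) -> float:
    """Years of compounding needed to reach `multiple` times the starting level."""
    return math.log(multiple) / math.log(1 + annual_growth)


SUPPLY_GROWTH = 0.12  # assumed stand-in for "low double digits"
DEMAND_GROWTH = 0.50  # hypothetical

for year in range(4):
    ratio = demand_to_supply_ratio(year, SUPPLY_GROWTH, DEMAND_GROWTH)
    print(f"year {year}: demand/supply index = {ratio:.2f}")

print(f"years for supply to double at 12%/yr: {years_to_multiple(SUPPLY_GROWTH, 2):.1f}")
```

Under these assumptions the gap widens every year, which is consistent with the summary's point that shortages may persist across several links of the chain rather than flipping quickly to a glut. Different growth assumptions would change the size of the gap but not the compounding logic.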
Business & strategy implications
- For product & data businesses:
  - Rapidly adopt frontier models to avoid being commoditized; focus on continually raising the bar (better data, better integrations).
  - Use tokens to build high-leverage products (not just replacing one hour/day of work with AI).
  - Secure enterprise-level access where possible to avoid rate limiting and to capture high-value use cases.
- For infrastructure & cloud providers:
  - Expect margin expansion and extended hardware lifecycles; prepare for sustained capacity demand across GPUs, CPUs, memory, and specialized chips.
- For investors & operators:
  - Key areas to monitor/consider: memory suppliers, fab equipment vendors (ASML, Lam), foundries (TSMC), GPU makers (NVIDIA), CPU and FPGA vendors, cloud providers, and companies that can capitalize on token economics.
  - Token-driven revenue growth may concentrate value among entities that can afford early/bulk access to frontier models.
- For labs/products (OpenAI, Anthropic, Google):
  - Manage releases carefully (safety, PR), but also balance enterprise access and pricing to avoid excessive concentration and political backlash.
- Measurement challenge:
  - Traditional GDP and productivity metrics undercount value created by token-enabled workflows; new metrics (like “phantom GDP”) are needed to capture deflationary and productivity effects.
Notable quotes / pithy insights
- “If you don't use more tokens, you'll never escape the permanent underclass.” — summarized warning about capture of value and widening gaps.
- “Ideas are cheap; implementation is very easy.” — on how execution cost has collapsed and the decision problem becomes choosing the right idea to implement.
- “Frontier lets them create the economically valuable things.” — why users gravitate to the latest, best models despite price.
- “Phantom GDP” — a framework for thinking about the hard-to-measure output/value generated by token-driven workflows.
Actionable items & questions to watch
- Short-term actions for firms:
  - Evaluate enterprise token contracts (to remove rate limits).
  - Identify high-leverage workflows to migrate to frontier models.
  - Track token spend vs. captured revenue (three checks: use more tokens, generate more value, capture some of that value).
- Key metrics & events to watch over the next 3–12 months:
  - Wider release/pricing of Mythos or equivalent frontier models and their token-efficiency vs. Opus/GPT-class models.
  - Memory (DRAM) price/lead-time reports and capacity guidance from major players (Micron, SK Hynix).
  - TSMC capex trajectory and ASML/tooling backlog updates.
  - GPU (H100/A100) pricing and used-cluster market dynamics.
  - Evidence of robotics/few-shot pre-trained models that dramatically increase physical automation.
  - Measured diffusion of token-generated value into measurable economic statistics (new metrics like phantom GDP).
  - Political/social responses: protests, regulation, or major public backlash.
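The three checks for tracking token spend against captured revenue (use more tokens, generate more value, capture some of that value) can be sketched as a simple period-over-period comparison. This is one possible reading, not a method described in the episode: the `PeriodSnapshot` type, its field names, and the rule that "capture" means any positive captured value are all assumptions, and the example figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodSnapshot:
    token_spend: float      # USD spent on tokens in the period
    value_generated: float  # USD estimate of value produced with those tokens
    value_captured: float   # USD of that value realized as revenue or savings


def three_checks(prev: PeriodSnapshot, curr: PeriodSnapshot) -> dict[str, bool]:
    """Evaluate the three checks between two consecutive periods."""
    return {
        "using_more_tokens": curr.token_spend > prev.token_spend,
        "generating_more_value": curr.value_generated > prev.value_generated,
        "capturing_some_value": curr.value_captured > 0,  # assumed threshold
    }


def captured_per_dollar(snapshot: PeriodSnapshot) -> float:
    """Value captured per dollar of token spend in one period."""
    if snapshot.token_spend <= 0:
        raise ValueError("token_spend must be positive")
    return snapshot.value_captured / snapshot.token_spend


# Hypothetical quarterly figures.
q1 = PeriodSnapshot(token_spend=50_000, value_generated=200_000, value_captured=80_000)
q2 = PeriodSnapshot(token_spend=120_000, value_generated=600_000, value_captured=150_000)

print(three_checks(q1, q2))
print(f"captured per token dollar: Q1 {captured_per_dollar(q1):.2f}, Q2 {captured_per_dollar(q2):.2f}")
```

In this made-up example all three checks pass even though value captured per token dollar falls from one quarter to the next, so the ratio is worth tracking alongside the checks. Estimating `value_generated` is the hard part in practice and ties back to the measurement problem the summary calls phantom GDP.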
Conclusion / framing
Dylan Patel argues we’re in an inflection where model capability, token economics, and implementation tooling converge to create enormous, fast-growing demand that outpaces multiple links in the hardware and supply chain. Firms that secure access to frontier models and learn to direct tokens at high-value problems will have big advantages; supply constraints (memory, fabs, GPUs, CPUs, equipment) will shape who can scale and at what margin. The economic output of this transition is both massive and currently poorly measured, and political/social friction is likely to rise as the impact becomes visible.