AI giveth and AI taketh CPU


by The Stack Overflow Podcast

32 min · May 8, 2026

Overview of AI giveth and AI taketh CPU

In this Stack Overflow Podcast interview from HumanX, host Ryan Donovan talks with Mark Papermaster, CTO of AMD, about how AI is reshaping hardware demand across CPUs, GPUs, memory, networking, and data center design. The conversation centers on AMD’s strategy of combining high-performance CPUs and GPUs, using chiplets and an open software stack to stay flexible as workloads shift from training to inference, from cloud to edge, and from general-purpose models to small language models and agentic workflows.

AMD’s AI Strategy: Performance Through Heterogeneous Computing

Papermaster frames AMD’s success as the result of a long-term focus on customer needs, product quality, and simplification.

Core strategy

  • Build leadership products across:
    • supercomputing
    • cloud
    • edge devices
    • PCs and embedded systems
  • Focus on what delivers real customer value
  • Keep innovation tied to listening to customers

Why AMD fits AI

  • AMD has deep experience in both CPUs and GPUs
  • AI workloads benefit from heterogeneous computing rather than relying on just one type of processor
  • The company has expanded beyond CPUs and GPUs into embedded and adaptive computing

CPU + GPU Integration and Chiplets

A major theme of the discussion is how AMD combines compute types efficiently.

What AMD has been doing for years

  • AMD has combined CPU and GPU technologies since 2011
  • Early implementations focused on PCs, gaming, and workstation graphics
  • The key idea: shared memory and coherent architecture reduce data movement and power use

What chiplets add

  • Instead of one large monolithic chip, AMD uses chiplets
  • Benefits include:
    • easier manufacturing
    • better yield
    • lower cost
    • more flexibility in product design
  • Chiplets allow AMD to mix and match components for:
    • servers
    • desktops
    • workstations
    • GPUs
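
The yield and cost benefits listed above can be illustrated with the textbook Poisson yield model, in which the chance that a die is defect-free falls exponentially with its area. This is a rough sketch and not AMD's own method: the defect density, the die areas, and the function names are all hypothetical, and the model ignores packaging cost, interconnect area, and assembly losses.

```python
import math


def poisson_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Expected fraction of defect-free dies under the Poisson yield model."""
    return math.exp(-area_cm2 * defects_per_cm2)


def silicon_per_good_product(total_area_cm2: float, n_dies: int,
                             defects_per_cm2: float) -> float:
    """Silicon area consumed per working product.

    Assumes the product is split into n_dies equal dies and that each die
    is tested before assembly, so one defect scraps only one small die.
    """
    die_area = total_area_cm2 / n_dies
    return total_area_cm2 / poisson_yield(die_area, defects_per_cm2)


if __name__ == "__main__":
    TOTAL_AREA = 8.0      # cm^2 of logic in the product (hypothetical)
    DEFECT_DENSITY = 0.1  # defects per cm^2 (hypothetical)

    for n in (1, 2, 4, 8):
        die_yield = poisson_yield(TOTAL_AREA / n, DEFECT_DENSITY)
        cost = silicon_per_good_product(TOTAL_AREA, n, DEFECT_DENSITY)
        print(f"{n} die(s): per-die yield {die_yield:.1%}, "
              f"silicon per good product {cost:.2f} cm^2")
```

With these made-up inputs, a single 8 cm^2 die yields roughly 45%, while each 1 cm^2 chiplet yields roughly 90%, so about half as much silicon is spent per working product.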

Data center design

  • AMD uses chiplets in large data center products to combine:
    • CPU compute
    • GPU compute
    • memory
    • I/O
  • This modular design helps AMD tailor systems to different workloads without redesigning everything from scratch

Open Ecosystem and ROCm Software Stack

Papermaster repeatedly emphasizes AMD’s preference for openness over lock-in.

ROCm and software control

  • AMD’s software stack is ROCm (the podcast transcript occasionally misrenders the name)
  • ROCm manages:
    • workload partitioning between CPU and GPU
    • compiler optimization
    • communication between devices
  • The stack is open, so developers can:
    • contribute code
    • fork it internally
    • avoid vendor lock-in

Why openness matters

  • Helps enterprise and hyperscale customers retain control
  • Makes AMD more attractive in mixed-vendor environments
  • Reduces the “moat” competitors may have had in AI software ecosystems

Workload Shifts: Training, Inference, and Agentic AI

A big part of the conversation is how AI workloads are changing and how AMD is adapting.

From training to inference

  • Earlier AI demand was dominated by training
  • Now, inference is growing rapidly and becoming more varied
  • AMD has adapted with different GPU configurations for:
    • high-performance computing
    • inference-heavy workloads

New inference patterns

Papermaster notes that inference is not one thing anymore. Different applications need different optimization goals:

  • low latency for “vibe coding” and interactive use
  • high throughput for larger batch workloads
  • large context handling for agentic workflows and long prompts
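
The tension between these three goals can be sketched with a toy cost model. Nothing here comes from the interview: the timing constants, the model shape, and the function names are hypothetical, chosen only to show why a larger batch raises throughput while adding latency, and why long contexts are largely a memory problem.

```python
def step_time_ms(batch_size: int, fixed_ms: float = 20.0,
                 per_request_ms: float = 2.0) -> float:
    """Time for one batched inference step: fixed overhead plus per-request work."""
    return fixed_ms + per_request_ms * batch_size


def throughput_rps(batch_size: int, fixed_ms: float = 20.0,
                   per_request_ms: float = 2.0) -> float:
    """Requests completed per second when every step runs a full batch."""
    return batch_size / (step_time_ms(batch_size, fixed_ms, per_request_ms) / 1000.0)


def kv_cache_bytes(context_len: int, n_layers: int = 32, n_kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_value: int = 2) -> int:
    """Key/value cache size for one sequence in a transformer decoder.

    The factor of 2 covers the separate key and value tensors per layer.
    The default model shape is a hypothetical example.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


if __name__ == "__main__":
    for batch in (1, 8, 32, 128):
        print(f"batch {batch:>3}: latency {step_time_ms(batch):6.1f} ms, "
              f"throughput {throughput_rps(batch):7.1f} req/s")

    for ctx in (4_000, 32_000, 128_000):
        print(f"context {ctx:>7}: KV cache {kv_cache_bytes(ctx) / 1e9:.2f} GB")
```

In this sketch an interactive workload would favor small batches, a batch workload would favor large ones, and an agentic workload with long prompts would be limited by how much cache memory each sequence needs.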

Small language models at the edge

  • Papermaster expects more workloads to move to:
    • small language models
    • edge devices
    • PCs and embedded systems
  • The cloud and large clusters will still matter for training and large-scale fine-tuning

Rack-Scale Systems and Data Center Scaling

The interview also covers AMD’s move beyond chips into rack-level architecture.

Rack-level optimization

  • AMD now designs around full systems, not just individual processors
  • Example: a rack-scale AI reference architecture with:
    • CPUs
    • GPUs
    • memory
    • networking
    • scale-up and scale-out connectivity

Why it matters

  • Large AI clusters need more than fast chips
  • They require carefully designed:
    • power delivery
    • cooling
    • networking
    • memory placement
    • interconnect strategy

Scaling up and out

  • One rack can serve as a building block
  • Multiple racks can connect into very large clusters
  • This supports everything from enterprise deployments to frontier-model training

Manufacturing, Supply Chain, and Bottlenecks

Papermaster makes clear that chip strategy is as much about supply chain planning as it is about design.

The real constraints

  • Semiconductor manufacturing is slow compared with software
  • Demand must be forecast years in advance
  • AMD works closely with partners like TSMC and memory suppliers

Chiplets help here too

  • Easier to manufacture
  • Better yield
  • More flexibility in production planning

Industry-wide pressure

  • AI has increased demand for:
    • GPUs
    • CPUs
    • memory
    • data center power
  • AMD expects this to also create pressure in consumer products like PCs and phones

Power Efficiency: “Tokens per Watt per Dollar”

A major recurring theme is energy efficiency.

AMD’s approach

Papermaster says AMD pursues efficiency gains at every layer of the stack:

  • transistor design
  • chip architecture
  • chiplet interconnects
  • packaging
  • power delivery
  • memory hierarchy
  • software optimization
  • data center controls

Key efficiency ideas

  • reduce data movement
  • use coherent CPU/GPU memory access
  • improve compiler and kernel efficiency
  • optimize agentic workflows
  • manage power spikes in data centers
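
The summary names the "tokens per watt per dollar" metric without defining it, so the following is only one plausible reading: sustained token throughput divided by power draw, then divided again by system cost. The two systems and all of their figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class System:
    name: str
    tokens_per_second: float  # sustained inference throughput
    power_watts: float        # average power draw under load
    cost_usd: float           # purchase price; an assumption, since cost could also mean total cost of ownership


def tokens_per_joule(s: System) -> float:
    """Tokens per second per watt, which is the same as tokens per joule."""
    return s.tokens_per_second / s.power_watts


def tokens_per_watt_per_dollar(s: System) -> float:
    """Energy efficiency normalized by what the system costs."""
    return tokens_per_joule(s) / s.cost_usd


if __name__ == "__main__":
    systems = [
        System("faster_but_hungrier", 12_000.0, 1_000.0, 40_000.0),
        System("slower_but_leaner", 8_000.0, 500.0, 25_000.0),
    ]
    for s in sorted(systems, key=tokens_per_watt_per_dollar, reverse=True):
        print(f"{s.name}: {tokens_per_joule(s):.1f} tokens/J, "
              f"{tokens_per_watt_per_dollar(s):.2e} tokens/J per dollar")
```

Under these made-up numbers the system with lower raw throughput ranks first, which is the point of the metric: raw speed alone does not decide the comparison.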

Future hardware directions

  • AMD is investing in photonic interconnects for future systems
  • It is also using 3D-stacked SRAM/cache techniques to improve performance and energy efficiency

AI Helping AMD Build Better Chips

One of the most interesting points in the interview is that AMD uses AI internally to improve chip design.

How AMD uses AI

  • Fine-tuned proprietary models trained on AMD’s design history
  • AI-assisted:
    • chip design
    • validation
    • compilation
    • kernel development
    • workflow optimization

What has changed recently

  • Earlier AI gains were mostly point improvements
  • More recently, agentic workflows have produced larger productivity gains
  • These systems can explore many more options than humans alone, sometimes finding unexpected performance wins

Future Outlook

Papermaster sees the next phase of AI hardware as one of increasing specialization and collaboration.

What’s next

  • More tailored inference optimization
  • More collaboration between hardware, software, and data center operators
  • More diverse AI workloads across:
    • finance
    • oil and gas
    • science and research
    • enterprise applications

AMD’s position

  • Continue offering both:
    • high-precision computing like FP32 and FP64
    • AI-friendly formats like FP4 and FP8
  • Keep customer choice central
  • Support everything from supercomputers to embedded edge systems
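
The trade-off between high-precision and AI-friendly formats can be sketched in two parts: how much memory a model's weights occupy at each width, and how much rounding error a narrower format introduces. The parameter count is hypothetical. Python's standard library has no FP8 or FP4 type, so the rounding demonstration uses 16-bit and 32-bit floats as stand-ins; the narrower formats lose more precision still.

```python
import struct

# Bits per value for the formats named in the summary.
FORMAT_BITS = {"FP64": 64, "FP32": 32, "FP8": 8, "FP4": 4}


def weight_bytes(n_params: int, fmt: str) -> float:
    """Memory needed to store n_params weights in the given format."""
    return n_params * FORMAT_BITS[fmt] / 8


def round_trip_error(value: float, struct_code: str) -> float:
    """Absolute error after storing a Python float (FP64) in a narrower format.

    struct_code is 'e' for 16-bit or 'f' for 32-bit IEEE 754 floats.
    """
    stored = struct.unpack(struct_code, struct.pack(struct_code, value))[0]
    return abs(stored - value)


if __name__ == "__main__":
    N_PARAMS = 7_000_000_000  # hypothetical model size
    for fmt in FORMAT_BITS:
        print(f"{fmt}: {weight_bytes(N_PARAMS, fmt) / 1e9:.1f} GB of weights")

    print(f"0.1 stored as 32-bit float: error {round_trip_error(0.1, 'f'):.2e}")
    print(f"0.1 stored as 16-bit float: error {round_trip_error(0.1, 'e'):.2e}")
```

Scientific and financial workloads that accumulate many operations tend to need the small errors of FP32 and FP64, while inference can often accept coarser values in exchange for moving far less data.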

Key Takeaways

  • AMD’s advantage comes from combining CPU + GPU + chiplets + open software
  • AI is shifting demand from pure training toward inference, agentic workflows, and small language models
  • Efficiency is now a core competitive metric: tokens per watt per dollar, not just raw speed
  • The data center is becoming a system-design problem, not just a chip-design problem
  • AMD sees openness and flexibility as a major differentiator versus more closed ecosystems

Notable Insight

“More than ever, the industry has to band together and collaborate to drive energy efficiency.”

Papermaster’s broader message is that AI progress depends on the whole stack working together: silicon, software, systems, and supply chain.