#496 – FFmpeg: The Incredible Technology Behind Video on the Internet

by Lex Fridman

4h 23m · May 6, 2026

Overview

This episode is a deep dive into FFmpeg, VLC, and the open-source ecosystem behind modern video and audio on the internet. Lex Fridman speaks with Jean-Baptiste Kempf and Kieran Kunhya about how these tools decode, encode, transcode, stream, and play nearly every media format in existence; why they matter to billions of users; how they’re maintained by a small volunteer core; and why low-level engineering, reverse engineering, and assembly still matter enormously in 2026.

The conversation is equal parts technical, historical, philosophical, and political. It covers the architecture of video playback, the role of containers and codecs, the importance of open source licensing, the culture of FFmpeg/VLC contributors, controversies around security reporting and big-tech behavior, and the future of multimedia in robotics, XR, and brain-computer interfaces.

What FFmpeg and VLC Actually Do

The basic media pipeline

The guests break down what happens when you press “play”:

  • A file or stream is fetched from a source like HTTP, disk, DVD, or network.
  • A demuxer/container parser separates tracks such as video, audio, and subtitles.
  • The relevant codec is detected and each stream is decoded, possibly using GPU hardware acceleration.
  • If needed, software fallback handles entropy decoding, inverse transforms, prediction, and reconstruction.
  • The raw frames and audio are then sent to the display and speaker outputs.
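As a rough illustration of the demuxing step above, here is a minimal Python sketch. The packet layout (1-byte stream id, 2-byte big-endian length, then payload) is invented for this example and does not match any real container; `mux` and `demux` are hypothetical helper names, not FFmpeg or VLC APIs.

```python
import struct
from collections import defaultdict

# Toy packet header (an assumption for this sketch): stream id, payload length.
HEADER = struct.Struct(">BH")

def mux(packets):
    """Serialize (stream_id, payload) pairs into one interleaved byte string."""
    return b"".join(HEADER.pack(sid, len(data)) + data for sid, data in packets)

def demux(blob):
    """Split an interleaved byte string back into per-stream payload lists."""
    tracks = defaultdict(list)
    offset = 0
    while offset < len(blob):
        if offset + HEADER.size > len(blob):
            raise ValueError("truncated packet header")
        sid, length = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        payload = blob[offset:offset + length]
        if len(payload) != length:
            raise ValueError("truncated packet payload")
        tracks[sid].append(payload)
        offset += length
    return dict(tracks)

VIDEO, AUDIO = 0, 1
blob = mux([(VIDEO, b"frame0"), (AUDIO, b"aac0"), (VIDEO, b"frame1")])
tracks = demux(blob)  # video and audio payloads are now in separate lists
```

Real containers add timestamps, seeking indexes, and codec metadata on top of this basic idea, and a real demuxer has to survive the broken inputs that the toy version simply rejects.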

Why this is so hard

The discussion emphasizes that media compression is not like ZIP compression:

  • Video/audio codecs discard information to match human perception.
  • Compression must exploit spatial and temporal redundancy.
  • Modern codecs use complex prediction, frequency-domain transforms, and quantization.
  • Real-world files are often mislabeled, broken, incomplete, or intentionally weird.
  • A huge part of VLC/FFmpeg’s value is that they don’t trust the filename; they inspect the actual bytes.
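The transform and quantization steps mentioned above can be sketched in a few lines of Python. This is a toy 1-D illustration under stated assumptions: real codecs use 2-D integer transforms, prediction, and entropy coding, and the step size of 10 and the sample row are arbitrary values chosen for the example.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(v * math.cos(math.pi / n * (i + 0.5) * k)
                               for i, v in enumerate(x)))
    return out

def idct(coeffs):
    """Inverse of dct() above."""
    n = len(coeffs)
    out = []
    for i in range(n):
        out.append(sum(math.sqrt((1 if k == 0 else 2) / n) * c
                       * math.cos(math.pi / n * (i + 0.5) * k)
                       for k, c in enumerate(coeffs)))
    return out

def quantize(coeffs, step):
    """Lossy step: keep only the nearest multiple of `step` for each coefficient."""
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    return [q * step for q in levels]

row = [100, 102, 104, 106, 108, 110, 112, 114]  # a smooth row of 8 pixel values
STEP = 10  # arbitrary quantizer step for the example

levels = quantize(dct(row), STEP)
restored = idct(dequantize(levels, STEP))
nonzero = sum(1 for q in levels if q != 0)
max_error = max(abs(a - b) for a, b in zip(row, restored))
print(levels, nonzero, round(max_error, 2))
```

For smooth content like this row, most quantized coefficients come out as zero, which is what makes the data cheap to store, while the restored pixels are close to but not identical to the originals. That is the sense in which media compression, unlike ZIP, deliberately discards information.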

FFmpeg, VLC, Containers, and Codecs

Container vs. codec

They clarify a common confusion:

  • Container = file format wrapper, like MP4, MOV, MKV, AVI.
  • Codec = the compression format inside, like H.264, AAC, AV1.

A file named .mp4 may contain something unexpected, and VLC/FFmpeg probe the file rather than trusting the extension.
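A minimal sketch of what "inspecting the actual bytes" can look like, in Python. The signatures checked here are well-known container magic numbers, but the function names are hypothetical and the logic is far simpler than the probing FFmpeg and VLC actually perform.

```python
def sniff_container(head: bytes) -> str:
    """Guess a container from the first bytes of a file (simplified)."""
    if head[4:8] == b"ftyp":
        return "mp4/mov (ISO BMFF family)"
    if head[:4] == b"\x1a\x45\xdf\xa3":
        return "matroska/webm (EBML header)"
    if head[:4] == b"RIFF" and head[8:12] == b"AVI ":
        return "avi"
    if head[:4] == b"OggS":
        return "ogg"
    if head[:3] == b"FLV":
        return "flv"
    return "unknown"

def sniff_file(path: str) -> str:
    """Read only the first bytes of a file and ignore its extension entirely."""
    with open(path, "rb") as f:
        return sniff_container(f.read(16))

# A file that claims to be .mp4 but starts with an EBML header is really Matroska/WebM.
mislabeled = b"\x1a\x45\xdf\xa3" + b"\x00" * 12
print(sniff_container(mislabeled))
```

The extension never enters the decision. A production prober also has to handle formats with no fixed signature, truncated files, and leading junk, which this sketch does not attempt.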

What makes a codec good

A codec is judged by tradeoffs such as:

  • Compression efficiency
  • Decode speed
  • Encode speed
  • Error resilience
  • Bit-exactness across implementations
  • Suitability for film, screen recording, animation, live streaming, or archival use

They explain that modern codecs are often really collections of tools, not a single algorithm, because different content types require different strategies.

Open Source as a Social Contract

The philosophy

The episode strongly frames FFmpeg and VLC as examples of the best kind of open source:

  • Built by volunteers
  • Maintained across borders and cultures
  • Shared for everyone’s benefit
  • Designed to make highly complex systems easy for ordinary users

Jean-Baptiste compares open source to sharing all three:

  • the cake,
  • the recipe,
  • and the oven instructions.

Why the license matters

A major theme is that licensing is not just legal—it is social:

  • Permissive licenses: MIT, BSD, Apache
  • Copyleft licenses: GPL, LGPL, AGPL
  • Licenses determine whether companies can build proprietary products on top of the code and whether modifications must be shared back.

They explain why relicensing is difficult: every contributor retains copyright over their contributions, so changing a project’s license means tracking down many people, sometimes hundreds.

VLC’s licensing history

Jean-Baptiste describes moving portions of VLC from GPL to LGPL to make the core engine usable in third-party apps and commercially viable integrations, including mobile platforms and app stores. He also explains that different VLC components and platforms ended up under different licenses depending on technical and legal constraints.

The History of VLC and FFmpeg

VLC’s origin

VLC began not as a single product, but as a university project:

  • A French engineering school wanted better network video playback.
  • Students built a local video streaming system.
  • That project evolved into VideoLAN, and the client became VLC.
  • Jean-Baptiste later helped move it into a sustainable nonprofit open-source structure.

FFmpeg’s lineage

The episode highlights the major figures who shaped FFmpeg:

  • Fabrice Bellard — foundational early work
  • Michael Niedermayer — major 2000s-era development
  • Reverse engineers and assembly specialists — crucial for handling obscure and proprietary codecs
  • Loren Merritt, Kostya, Henrik Gramner, Martin Storsjö, and many others — key contributors to optimization and support for hard-to-handle formats

Forks and reunification

They discuss the FFmpeg / Libav split as a normal open-source governance disagreement rather than a fundamental technical divergence. Over time, much of the work converged back into FFmpeg.

Assembly Language and “Real” Performance

Why assembly still matters

One of the episode’s strongest themes is that high-performance multimedia still depends on assembly:

  • SIMD instructions process many pixels or samples at once.
  • Handwritten assembly can massively outperform C in critical hot paths.
  • Modern compilers do not always match what expert humans can do when squeezing every last cycle from the hardware.

They repeatedly emphasize that FFmpeg and related code often achieve 10x to 60x improvements in critical routines through hand-optimized assembly.
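The idea of processing many pixels per instruction can be illustrated in Python with a classic "SIMD within a register" trick: eight 8-bit pixels are packed into one 64-bit integer and averaged with a handful of bitwise operations. This is only an illustration of the data-parallel idea, not real SIMD assembly and not code from FFmpeg; in Python it will not be faster than the scalar loop, and the function names are hypothetical.

```python
LOW_BITS_CLEARED = 0xFEFEFEFEFEFEFEFE  # clears the lowest bit of each byte lane

def average_scalar(a: bytes, b: bytes) -> bytes:
    """Floor-average two rows of 8-bit pixels one pixel at a time."""
    return bytes((x + y) >> 1 for x, y in zip(a, b))

def average_packed(a: bytes, b: bytes) -> bytes:
    """Floor-average 8 pixels per step by treating them as one 64-bit word."""
    if len(a) != len(b) or len(a) % 8 != 0:
        raise ValueError("rows must have equal length, a multiple of 8")
    out = bytearray()
    for i in range(0, len(a), 8):
        x = int.from_bytes(a[i:i + 8], "little")
        y = int.from_bytes(b[i:i + 8], "little")
        # Per lane: (x + y) >> 1 == (x & y) + ((x ^ y) >> 1). The mask stops
        # bits from leaking into the neighbouring lane during the shift.
        avg = (x & y) + (((x ^ y) & LOW_BITS_CLEARED) >> 1)
        out += avg.to_bytes(8, "little")
    return bytes(out)

row_a = bytes([0, 255, 255, 1, 254, 3, 128, 127])
row_b = bytes([255, 255, 0, 2, 255, 4, 127, 128])
print(list(average_packed(row_a, row_b)))
```

Real SIMD instruction sets provide such lane-wise operations directly and on much wider registers, which is where hand-written assembly in hot loops gets its leverage.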

Why this is a lost art

The guests argue that learning assembly teaches:

  • CPU architecture
  • registers
  • pipelining
  • caches
  • memory layout
  • calling conventions
  • vectorization

They believe understanding this makes engineers better, even if they mostly code in higher-level languages.

Educational efforts

Kieran describes an effort to teach assembly through real FFmpeg-style problems, not abstract syntax drills. The goal is to preserve the craft and lower the barrier to contribution.

Reverse Engineering: How Unsupported Formats Get Decoded

The craft

A lot of FFmpeg/VLC’s value comes from reverse engineering proprietary or obscure formats:

  • Disassembling binaries
  • Identifying decoding routines
  • Dumping raw frame data
  • Testing hypotheses against sample files
  • Achieving bit-exact output where possible

Examples

They discuss decoding:

  • old Windows Media / RealMedia formats
  • GoToMeeting recordings
  • CineForm
  • obscure game codecs
  • niche broadcast formats
  • rare DVD and Blu-ray edge cases

This work is described as equal parts archaeology, detective work, and low-level systems engineering.

Bit-Exactness and Testing

Why bit-exactness matters

For many codecs, different implementations must produce the same output bits:

  • It allows interoperability
  • It makes testing reliable
  • It helps maintain compatibility across platforms and hardware
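A sketch of how a bit-exactness check can work in practice, in Python: two implementations of the same reconstruction step are compared by hashing their output bytes. The functions and sample values are hypothetical and chosen for illustration; this is not FFmpeg's actual test code.

```python
import hashlib

def reconstruct_reference(pred, resid):
    """Add residuals to predictions and clamp to the 8-bit range."""
    return bytes(min(255, max(0, p + r)) for p, r in zip(pred, resid))

# Lookup table covering every possible sum from -255 to 510 (index = sum + 255).
CLAMP_TABLE = [0] * 255 + list(range(256)) + [255] * 255

def reconstruct_fast(pred, resid):
    """Same operation via a clamp table, standing in for an optimized path."""
    return bytes(CLAMP_TABLE[p + r + 255] for p, r in zip(pred, resid))

def reconstruct_buggy(pred, resid):
    """Wraps around instead of clamping: a typical subtle mismatch."""
    return bytes((p + r) & 0xFF for p, r in zip(pred, resid))

def frame_hash(frame: bytes) -> str:
    return hashlib.sha256(frame).hexdigest()

pred = bytes([10, 200, 250, 0, 128])
resid = [-20, 30, 10, -1, 5]

reference = frame_hash(reconstruct_reference(pred, resid))
print(frame_hash(reconstruct_fast(pred, resid)) == reference)   # outputs match
print(frame_hash(reconstruct_buggy(pred, resid)) == reference)  # outputs differ
```

Because the comparison is on exact bytes, a single wrong pixel anywhere in a frame changes the hash, so an optimized routine either matches the reference or fails the test.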

FFmpeg’s test infrastructure

They highlight FATE (FFmpeg Automated Testing Environment), which:

  • runs on many operating systems and architectures
  • checks compile correctness and codec behavior
  • catches compiler miscompilations and platform-specific bugs
  • is powered by volunteers’ machines

This reinforces the theme that FFmpeg is a global volunteer system supporting absurdly broad hardware diversity.

x264, AV1, AV2, H.265, and the Codec Wars

x264 as a watershed

x264 is presented as one of the most important encoders ever written:

  • H.264 became the dominant internet video format
  • x264 helped make high-definition video practical
  • It used perceptual optimization rather than purely mathematical quality metrics
  • It became a reference implementation for later codec development

Perceptual compression

They explain that the industry moved from optimizing metrics like PSNR to more human-centered methods:

  • psycho-visual tuning
  • adaptive quantization
  • motion prediction
  • better handling of animation and screen content
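For reference, PSNR is a simple function of mean squared error, and a short Python sketch shows why it is a weak proxy for perceived quality. The pixel values below are made up for the example, and `psnr` is a hypothetical helper.

```python
import math

def psnr(reference, distorted, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)

flat = [100] * 16             # original patch
spread = [102] * 16           # every pixel off by 2
spike = [100] * 15 + [108]    # one pixel off by 8, the rest untouched

print(round(psnr(flat, spread), 2), round(psnr(flat, spike), 2))
```

Both distortions have the same mean squared error and therefore the same PSNR (about 42.1 dB), even though a uniform brightness shift and a single outlier pixel can look very different to a viewer. Perceptual tuning tries to spend bits where the eye notices errors, which a metric like this cannot capture.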

AV1, AV2, and H.266/VVC

The newer standards are discussed as the next wave:

  • AV1: royalty-free, highly efficient, now widely deployed
  • AV2: next-generation AV standard, intended to improve compression further
  • H.265 / HEVC and H.266 / VVC: strong technical standards but tangled in patents and licensing costs

The guests estimate the next generation can often reduce bandwidth by around 30% or more for similar quality, but encoding gets dramatically more expensive.

Patents, Licensing Pools, and Why AV1 Exists

The patent minefield

They explain that multimedia is one of the most heavily patented areas in software:

  • codecs often contain many individually patentable ideas
  • patent pools can make deployment expensive
  • HEVC/H.265 licensing became so burdensome that major companies sought alternatives

Why AV1 matters

AV1 was designed to be:

  • royalty-free
  • good enough for mass deployment
  • a practical escape from patent licensing problems

That’s why organizations such as Google, Netflix, Amazon, Mozilla, and VideoLAN backed it.

Security, Bug Reports, and the Google / XZ Controversies

Google security reports on FFmpeg

The guests discuss a recent dispute in which Google used AI-assisted tooling to find and report vulnerabilities in FFmpeg. Their criticism is not that security research is bad, but that:

  • volunteer maintainers were overwhelmed
  • reports were wordy and alarmist
  • issues were announced publicly before fixes were ready
  • the burden of response fell on unpaid developers

They argue that if large companies rely on open source, they should contribute patches and funding, not just bug reports.

The broader point

They frame this as a mismatch of incentives:

  • security researchers get prestige, bounties, and visibility
  • maintainers get extra work and stress
  • open-source projects often have tiny cores supporting massive infrastructure

Burnout and the human cost

A major theme is maintainer burnout:

  • too many bug reports
  • too much pressure from companies
  • too much drama
  • not enough recognition or financial support

The conversation makes clear that this is a serious sustainability issue in open source.

VLC, Donations, and Refusing Millions of Dollars

Jean-Baptiste’s stance

Jean-Baptiste says he repeatedly refused lucrative offers to keep VLC:

  • free
  • open source
  • ad-free
  • non-tracking
  • not turned into a spyware or toolbar platform

He sees that choice as both moral and personal:

  • he wanted to be proud of what he built
  • he didn’t want to betray the community
  • he believed that selling out would undermine VLC’s spirit

Why that mattered

This is presented as a key reason VLC became iconic:

  • people trusted it
  • it worked
  • it didn’t spy on users
  • it represented freedom and technical excellence

Security, Backdoors, and Adversarial Use

VLC as a target

They discuss several examples of malicious or state-level misuse:

  • fake VLC downloads
  • malicious clones on search engines
  • attempts to bundle spyware
  • intelligence agencies or attackers modifying VLC with extra DLLs

The response

Their response is uncompromising:

  • VLC is open source and works offline
  • they don’t insert telemetry
  • they do not add backdoors
  • they care deeply about where users download the software from

They also describe a very defensive build/signing process, including offline compilation and double signing.

Sandboxing and Decomposing VLC into Safer Processes

Why sandboxing is hard

VLC is a modular system with hundreds of plugins and lots of external code:

  • hardware decoders
  • GPU drivers
  • third-party codecs
  • filters
  • network components

Because media files can be malicious, the team is working to split VLC into separate sandboxed processes for:

  • demuxing
  • decoding
  • filtering

The challenge is performance: multimedia involves enormous data throughput, so any sandboxing must preserve speed.
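To give a sense of the throughput involved, here is a back-of-the-envelope calculation in Python. The resolution, frame rate, and pixel format are example values chosen for illustration, not figures quoted in the episode.

```python
def raw_video_rate(width, height, fps, bytes_per_pixel=1.5):
    """Bytes per second of uncompressed video.

    1.5 bytes per pixel corresponds to 8-bit YUV 4:2:0, a common decoded format.
    """
    return width * height * bytes_per_pixel * fps

rate_4k60 = raw_video_rate(3840, 2160, 60)
print(f"{rate_4k60 / 1e6:.0f} MB/s")  # prints "746 MB/s"
```

Decoded 4K video at 60 frames per second is roughly three quarters of a gigabyte per second, so copying every frame across a process boundary is costly. Any process split has to move or share that data cheaply to keep playback smooth.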

Kyber: The Next Frontier — Ultra-Low Latency Control

What Kyber does

Jean-Baptiste introduces Kyber, an open-source, dual-licensed platform focused on real-time remote control of machines such as:

  • robots
  • drones
  • rovers
  • remote cars
  • teleoperation systems
  • cloud gaming systems

Core idea

The goal is to make distance disappear by minimizing latency:

  • video and audio must be transmitted with minimal delay
  • control inputs like mouse, keyboard, gamepad, and commands must stay synchronized
  • multiple camera streams and sensors must remain time-aligned

Technical targets

He aims for:

  • extremely low glass-to-glass latency
  • around 4 ms as a holy-grail target
  • advanced techniques like forward error correction
  • fast encoders and decoders optimized in the same spirit as FFmpeg/VLC
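Forward error correction, mentioned above, can be sketched in its simplest form with XOR parity: one extra packet per group lets the receiver rebuild a single lost packet without waiting a full round trip for a retransmission. This Python example is a generic illustration under that assumption, not a description of the scheme Kyber actually uses; the helper names are hypothetical, and packets in a group are assumed to be the same length.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(packets):
    """Build one parity packet for a group of equal-length packets."""
    return reduce(xor_bytes, packets)

def recover(received, parity):
    """Rebuild the single missing packet (marked None) from the rest plus parity."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one loss per group")
    present = [p for p in received if p is not None]
    rebuilt = reduce(xor_bytes, present, parity)
    repaired = list(received)
    repaired[missing[0]] = rebuilt
    return repaired

group = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(group)

received = [group[0], None, group[2]]  # the second packet was lost in transit
print(recover(received, parity) == group)
```

The trade-off is extra bandwidth for lower worst-case delay, which is usually the right exchange when the goal is real-time control rather than perfect delivery.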

The Future of Multimedia

Beyond video and audio

The guests argue multimedia will expand far beyond today’s formats:

  • volumetric video
  • point clouds
  • depth codecs
  • XR / VR streaming
  • haptics
  • odor channels
  • brain-computer interface data

Their view: anything that becomes a time-based sensory stream can become a multimedia format, and FFmpeg/VLC-like tools will eventually support it.

Archiving and cultural memory

A major part of the future is preservation:

  • video archives need robust, lossless formats
  • open source helps institutions with limited budgets
  • FFV1 and archival tooling preserve human history
  • “Rosetta Stone” style software is needed for future playback

They strongly emphasize that what gets preserved matters, especially in an era of AI-generated content and disposable digital media.

Notable Takeaways

  • FFmpeg and VLC are invisible infrastructure underpinning most media on the internet.
  • The core of modern media tech is built by small volunteer teams with extraordinary expertise.
  • Open source is a community and a social contract, not just a licensing model.
  • Assembly still matters for real-world performance in media and real-time systems.
  • Patents are one of the biggest barriers to efficient multimedia standards.
  • Burnout is real and maintainers need support, not just security reports and demands.
  • The future of multimedia is moving toward XR, robotics, teleoperation, and sensor-rich streams.
  • The conversation is as much a celebration of human craftsmanship as it is of codecs and software.

Final Message

The episode is ultimately a tribute to the people who quietly build and maintain the software that makes modern digital life possible. FFmpeg and VLC are presented not just as technical achievements, but as embodiments of a broader ideal: serious engineering in service of everyone.