Overview of Building Pi, and what makes self-modifying software so fascinating
This episode features Mario Zechner, creator of the minimalist AI coding agent Pi, and Armin Ronacher, creator of Flask and an early Pi contributor. The conversation explores how Pi was built, why self-modifying software becomes much more practical with AI agents, and how both guests are thinking about the trade-offs of agentic development. A major theme is that AI can accelerate coding, but it also amplifies complexity, reduces friction in ways that can be dangerous, and makes human judgment and review even more important.
Why Pi Exists
Mario’s motivation
- Pi was created after Mario got frustrated with existing AI coding agents, especially their instability and heavy-handed behavior.
- He wanted a minimal, predictable core: stable, deterministic behavior wrapped around the non-deterministic model.
- Pi is designed to be malleable: users can extend and reshape it rather than fight a rigid product.
Core design ideas
- Pi keeps the base system small and simple, with four core tools: read, write, edit, and bash.
- Its power comes from extension points that let users:
- add tools
- change compaction behavior
- alter the TUI
- even modify Pi itself
- The result is a coding agent that can be adapted to different workflows instead of forcing one workflow on everyone.
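The idea of a small core (read, write, edit, bash) plus extension points can be sketched in a few lines. This is a minimal illustration only and not Pi's actual API: `MiniAgent`, `register`, `call`, and the `word_count` extension are hypothetical names, the filesystem is an in-memory dictionary, and the bash tool is left out so the example stays deterministic.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

Tool = Callable[..., str]


@dataclass
class MiniAgent:
    """Hypothetical tiny agent core: a few built-in tools plus a registry."""

    files: Dict[str, str] = field(default_factory=dict)  # in-memory stand-in for disk
    tools: Dict[str, Tool] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Built-in tools; a real agent would also offer a shell tool.
        self.register("read", lambda path: self.files[path])
        self.register("write", self._write)
        self.register("edit", self._edit)

    def register(self, name: str, tool: Tool) -> None:
        """Extension point: add a new tool or override an existing one."""
        self.tools[name] = tool

    def call(self, name: str, **kwargs: str) -> str:
        if name not in self.tools:
            raise KeyError(f"unknown tool: {name}")
        return self.tools[name](**kwargs)

    def _write(self, path: str, content: str) -> str:
        self.files[path] = content
        return f"wrote {len(content)} chars to {path}"

    def _edit(self, path: str, old: str, new: str) -> str:
        text = self.files[path]
        if old not in text:
            raise ValueError(f"text to replace not found in {path}")
        self.files[path] = text.replace(old, new, 1)  # replace first match only
        return f"edited {path}"


agent = MiniAgent()
agent.call("write", path="notes.txt", content="hello world")
agent.call("edit", path="notes.txt", old="world", new="Pi")

# A user-supplied extension, added without touching the core.
agent.register("word_count", lambda path: str(len(agent.files[path].split())))

print(agent.call("read", path="notes.txt"))        # hello Pi
print(agent.call("word_count", path="notes.txt"))  # 2
```

The point of the design is that the core never needs to know about `word_count`; anything that fits the registry interface can be added or swapped by the user.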
How AI Changed Their View of Coding Agents
From skepticism to interest
- Both Mario and Armin were initially skeptical of early AI coding tools like GitHub Copilot.
- Early tools felt underwhelming or even concerning, especially around open-source licensing and code reuse.
- Their perspective shifted as models improved, especially once:
- tool calling/function calling became useful
- agentic workflows could inspect real codebases
- CLI-based agents like Claude Code became practical
What made the breakthrough
- The biggest change was not “smarter autocomplete,” but giving agents:
- access to the filesystem
- the ability to read and edit files directly
- a more realistic loop for doing actual work
- That made coding agents feel less like demos and more like useful engineering tools.
What They Learned From Teams Using Agents
Adoption pattern
- AI tools tend to get adopted during downtime, vacation, or free experimentation time.
- Teams often need 2–3 weeks of real use before the benefits click.
- Once adoption spreads, especially after holidays, usage rises quickly.
The downside: quality and complexity
- The guests repeatedly emphasized that agents:
- generate more code
- generate more mistakes
- increase review burden
- often create code that humans wouldn’t write
- The result is more PRs, larger PRs, and more cognitive load for reviewers.
Why humans still matter
- Humans feel the pain of complexity and eventually simplify, refactor, or reject bad design.
- Agents do not feel pain and therefore don’t naturally push back against complexity.
- This can lead to “automation bias,” where engineers trust outputs too quickly because the agent seems productive.
The “friction” argument
- Both guests argued that some friction in software development is intentional and valuable.
- Examples:
- approvals for high-risk changes
- review gates for critical services
- checks around migrations or security-sensitive actions
- Removing all friction may make agents faster, but it can also remove the pause that prevents bad decisions.
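The kind of intentional friction listed above can be pictured as an approval gate that lets routine changes through but pauses high-risk ones. The sketch below is an assumption-laden illustration, not something described in the episode: `Change`, `HIGH_RISK_KINDS`, and `apply_change` are hypothetical names, and the `approve` callback stands in for a human reviewer.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical risk categories, loosely based on the examples in the list above.
HIGH_RISK_KINDS = {"migration", "security", "critical_service"}


@dataclass(frozen=True)
class Change:
    description: str
    kind: str  # e.g. "refactor", "migration", "security"


def apply_change(change: Change, approve: Callable[[Change], bool]) -> str:
    """Apply low-risk changes directly; require approval for high-risk ones."""
    if change.kind in HIGH_RISK_KINDS and not approve(change):
        return "blocked"
    return "applied"


def deny_all(change: Change) -> bool:
    """Stand-in reviewer that rejects everything it is asked about."""
    return False


print(apply_change(Change("rename variable", "refactor"), deny_all))   # applied
print(apply_change(Change("drop users column", "migration"), deny_all))  # blocked
```

The gate costs nothing for routine work and only adds a pause where a mistake would be expensive, which is the trade-off the guests describe.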
MCP vs CLI: What They Actually Prefer
Their skepticism of MCP
- MCP (Model Context Protocol) was viewed as useful but flawed for some developer workflows.
- Main concerns:
- too much context pollution
- poor composability
- many server implementations are overly broad and awkward
- In practice, it often feels more like structured retrieval-augmented generation (RAG) than a truly composable execution layer.
Why CLI tools keep winning
- CLI workflows are easier to compose with pipes and code.
- Agents are already good at writing and running code, so a CLI-based approach often fits better.
- Their view: if a task involves combining multiple tools or services, code execution is often the cleaner abstraction.
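To illustrate the composability point, here is a minimal sketch of what an agent-written script might do: connect two command-line programs with a pipe and use the combined result. The two "tools" are hypothetical stand-ins run through the current Python interpreter, so the example does not depend on any particular CLI being installed; `pipe`, `list_issues`, and `only_bugs` are illustrative names.

```python
import subprocess
import sys
from typing import List


def pipe(producer: List[str], consumer: List[str]) -> str:
    """Run the equivalent of `producer | consumer` and return the consumer's stdout."""
    p1 = subprocess.Popen(producer, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(
        consumer, stdin=p1.stdout, stdout=subprocess.PIPE, text=True
    )
    p1.stdout.close()  # so the producer sees a closed pipe if the consumer exits early
    out, _ = p2.communicate()
    p1.wait()
    return out


# Stand-in for a CLI that lists issues, one per line.
list_issues = [
    sys.executable, "-c",
    "print('bug: crash on start'); print('feat: dark mode'); print('bug: typo in help')",
]

# Stand-in for a filter such as grep: keep only lines starting with 'bug:'.
only_bugs = [
    sys.executable, "-c",
    "import sys; [sys.stdout.write(l) for l in sys.stdin if l.startswith('bug:')]",
]

result = pipe(list_issues, only_bugs)
print(result)
```

Only the filtered output comes back to the caller, so intermediate data never has to pass through the model's context, which is one reason composition in code can be the cleaner abstraction.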
Bigger Takeaways
Self-modifying software is becoming real
- Pi demonstrates a new pattern: software that can be extended and altered by the agent itself.
- This may spread beyond coding tools into broader knowledge work.
Complexity is the enemy
- The more code agents create, the harder it becomes for both humans and agents to keep the system understandable.
- Eventually, the codebase can become too large for the model to fully grasp in one context window.
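A rough back-of-the-envelope check makes the context window limit concrete. The sketch below assumes about four characters per token, which is only a rule of thumb; real tokenizers differ by model and by content. `estimate_tokens`, `fits_in_context`, and the `reserve` parameter are hypothetical names, and the file contents are synthetic.

```python
from typing import Dict

CHARS_PER_TOKEN = 4  # assumption: rough rule of thumb, not an exact figure


def estimate_tokens(text: str) -> int:
    """Estimate the token count using ceiling division on the character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(files: Dict[str, str], context_window: int, reserve: int = 0) -> bool:
    """Check whether all files, plus tokens reserved for the conversation, fit."""
    total = sum(estimate_tokens(source) for source in files.values())
    return total + reserve <= context_window


# Synthetic codebase: 4,000 and 8,000 characters, roughly 1,000 and 2,000 tokens.
codebase = {"a.py": "x" * 4000, "b.py": "y" * 8000}

print(fits_in_context(codebase, context_window=4000))                # True
print(fits_in_context(codebase, context_window=4000, reserve=2000))  # False
```

Every file an agent adds pushes the total up, so a codebase that fits today may not fit after a few weeks of unchecked generation.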
Open source may need new bottlenecks
- AI-generated issues and PRs create volume without always creating responsibility.
- Maintaining quality may require new filters, queues, or trust mechanisms.
- Armin’s current approach includes auto-closing unknown PRs and asking for concise human-written issue reports instead.
The hype cycle will cool off
- Both guests think the industry is overextending AI’s current capabilities.
- They expect the excitement to normalize as teams run into real constraints:
- review bottlenecks
- maintainability issues
- quality regressions
- dependency on a few AI providers
Practical Takeaways
- Build for stability first, intelligence second.
- Keep humans in the loop for important code and architecture decisions.
- Use agents to remove pain, not to maximize output at any cost.
- Prefer tools that are easy to modify if you want them to fit real workflows.
- Do not confuse speed with progress: more code is not necessarily better code.
Books Mentioned
- Code by Charles Petzold — recommended as a foundational read for understanding how computers work.
- Breakneck: mentioned as a thought-provoking read about China and broader systems thinking; the author was not named in the transcript.
Final Thought
The episode’s core message is that AI agents are powerful, but they don’t replace engineering judgment. Pi is interesting not just because it is an AI coding agent, but because it is built around the idea that software should be modifiable, contextual, and human-governed. The future, as Mario and Armin see it, is not “agents everywhere” without constraints — it is better tools, better bottlenecks, and more deliberate use of automation.
