Overview: MCP and Code Mode (Interview)
Adam interviews Matt Carey from Cloudflare about agents, the Model Context Protocol (MCP), and Cloudflare’s “Code Mode” approach to making AI agents more scalable and useful. The core idea: instead of stuffing an MCP server with hundreds or thousands of discrete tools, Cloudflare lets the model write code against a small SDK and execute that code in a sandboxed server-side environment. That keeps context usage low, makes the system more flexible, and allows Cloudflare to expose its large API surface through a much simpler MCP interface.
What Cloudflare’s Code Mode Does
The problem it solves
Traditional MCP servers often map one API endpoint to one tool, which quickly becomes unwieldy:
- Tool lists get huge
- Context windows fill up fast
- Models get less effective as tool count grows
- Large platforms can only expose a tiny slice of their APIs
Matt argues that this is a mismatch between how agents work and how many services are structured.
Cloudflare’s approach
Cloudflare’s solution is to have the model:
- Search the API/spec when needed
- Write code against a generated SDK
- Execute that code on the server side in a safe sandbox
In practice, their MCP server uses just two main tools:
- search
- execute
That lets Cloudflare expose roughly 2,500 API endpoints while keeping the context footprint around 1,000 tokens.
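The two-tool pattern can be sketched roughly as follows. This is a minimal illustration, not Cloudflare's implementation: the three-entry SPEC, the FakeSDK class, and the convention that model-written code defines run(sdk) are all assumptions made for the example. The exec() call is only a placeholder for a real isolated sandbox and offers no actual security.

```python
from typing import Any

# Hypothetical, tiny stand-in for an API spec; a real spec is far larger.
SPEC = [
    {"method": "GET", "path": "/zones", "summary": "List zones"},
    {"method": "GET", "path": "/zones/{zone_id}/dns_records",
     "summary": "List DNS records for a zone"},
    {"method": "POST", "path": "/zones/{zone_id}/purge_cache",
     "summary": "Purge cached content"},
]


class FakeSDK:
    """Hypothetical SDK surface that model-written code calls into."""

    def request(self, method: str, path: str) -> dict[str, Any]:
        # A real SDK would make an authenticated HTTP call here.
        return {"method": method, "path": path, "ok": True}


def search(query: str) -> list[dict[str, str]]:
    """Tool 1: return spec entries that mention every word of the query."""
    words = query.lower().split()
    return [
        entry for entry in SPEC
        if all(w in (entry["summary"] + " " + entry["path"]).lower() for w in words)
    ]


def execute(code: str) -> Any:
    """Tool 2: run model-written code that defines run(sdk); return its result.

    WARNING: exec() with trimmed builtins is NOT a security boundary. It only
    stands in for the isolated server-side sandbox a real deployment needs.
    """
    namespace: dict[str, Any] = {"__builtins__": {"len": len, "range": range}}
    exec(code, namespace)
    return namespace["run"](FakeSDK())


# Code of the kind a model might write after searching the spec.
MODEL_CODE = '''
def run(sdk):
    zones = sdk.request("GET", "/zones")
    records = sdk.request("GET", "/zones/{zone_id}/dns_records")
    return [zones["path"], records["path"]]
'''

matches = search("dns records")
result = execute(MODEL_CODE)
```

The point of the sketch is that the tool list stays at two entries no matter how many endpoints SPEC describes; only the search results and the generated code grow with the task.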
Why it matters
The big win is that agents can now interact with an entire platform without being overloaded by a giant tool registry. Matt sees this as the right direction for agent systems generally: let the model operate in its native strength—writing code.
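A back-of-the-envelope comparison shows why the registry size matters. The endpoint count and the roughly 1,000-token footprint come from the interview; the per-tool token cost below is purely an assumed figure for illustration, and real tool schemas vary widely.

```python
def tool_registry_tokens(tool_count: int, tokens_per_tool: int) -> int:
    """Estimate context spent on tool definitions with one tool per endpoint."""
    return tool_count * tokens_per_tool


ENDPOINTS = 2_500              # figure quoted in the interview
CODE_MODE_TOKENS = 1_000       # figure quoted in the interview
ASSUMED_TOKENS_PER_TOOL = 200  # ASSUMPTION for illustration only

one_tool_per_endpoint = tool_registry_tokens(ENDPOINTS, ASSUMED_TOKENS_PER_TOOL)
ratio = one_tool_per_endpoint / CODE_MODE_TOKENS

print(f"one tool per endpoint: ~{one_tool_per_endpoint:,} tokens")
print(f"code mode:             ~{CODE_MODE_TOKENS:,} tokens ({ratio:.0f}x smaller)")
```

Under that assumption the one-tool-per-endpoint registry would cost about 500,000 tokens before the agent does any work, which is the overload the two-tool design avoids.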
MCP Explained: What It Is and What It Isn’t
Basic definition
Matt gives a simple explanation of MCP (Model Context Protocol):
- Created by Anthropic in late 2024
- Designed so agents like Claude can access external tools, resources, and prompts in a standardized way
- Originally local-first, later expanded to remote servers
The three building blocks
MCP is built around:
- Tools — callable actions/functions
- Prompts — server-provided instructions or guidance
- Resources — documents or data the client can pull in
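On the wire, MCP messages are JSON-RPC 2.0, and each building block has its own method family. The sketch below builds one request per building block; the tool name, prompt name, and resource URI are hypothetical, and the helper mcp_request is an illustration and not part of any MCP SDK.

```python
import json
from itertools import count
from typing import Any, Optional

_ids = count(1)  # simple incrementing request IDs


def mcp_request(method: str, params: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 request envelope of the kind MCP uses."""
    message: dict[str, Any] = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Tools: invoke a callable action ("get_weather" is a hypothetical tool).
call_tool = mcp_request(
    "tools/call", {"name": "get_weather", "arguments": {"city": "Lisbon"}}
)

# Prompts: fetch server-provided guidance ("code_review" is hypothetical).
get_prompt = mcp_request("prompts/get", {"name": "code_review"})

# Resources: pull in a document by URI (the URI is hypothetical).
read_resource = mcp_request("resources/read", {"uri": "file:///notes/readme.md"})

print(json.dumps(call_tool, indent=2))
```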
Common misconception
Matt’s view is that MCP itself is not broken; rather, it has often been applied too literally. People treated every action as a separate tool instead of recognizing that the model can often do better by writing code that composes actions.
How Matt Actually Works With AI
Claude Code / OpenCode workflow
Matt describes a very agent-forward workflow:
- He mostly uses terminal-based agents now
- He opens an IDE mainly for visual review
- He often runs the agent with --dangerously-skip-permissions enabled
- He relies on local safeguards, including a custom Git wrapper, to prevent risky actions like force-pushing
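The interview does not describe how Matt's Git wrapper works, so the following is only a guess at the general shape of such a safeguard. The function names and the blocking policy are assumptions, and the flag check is deliberately simple and not exhaustive.

```python
import subprocess
import sys

# Flags that force a push outright. Blocking --force-with-lease as well is a
# policy choice made for this sketch.
FORCE_FLAGS = {"-f", "--force"}
FORCE_PREFIXES = ("--force-with-lease", "--force-if-includes")


def is_blocked(args: list[str]) -> bool:
    """Return True for `git push` invocations that would force-push."""
    if "push" not in args:
        return False
    for arg in args[args.index("push") + 1:]:
        if arg in FORCE_FLAGS or arg.startswith(FORCE_PREFIXES):
            return True
        # A leading "+" on a refspec forces the update of that ref.
        if arg.startswith("+"):
            return True
    return False


def main(argv: list[str]) -> int:
    """Refuse force-pushes; pass everything else through to git."""
    if is_blocked(argv):
        print("refusing to force-push from an agent session", file=sys.stderr)
        return 1
    # If this script is installed under the name `git`, call the real binary
    # by absolute path here to avoid invoking the wrapper recursively.
    return subprocess.call(["git", *argv])
```

A wrapper like this would be placed ahead of the real git on the agent's PATH and invoked as main(sys.argv[1:]), so the agent keeps broad permissions while one destructive action stays out of reach.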
His prompting style
He prefers:
- Plain English
- Clear intent
- Minimal prompt-engineering theatrics
- Trusting the model to stay “in distribution”
He also notes that he gets better results when he already knows what he wants. If he approaches the model without direction, he feels like he’s just “scrolling” and getting dopamine without progress.
Planning loop
A recurring pattern in his workflow:
- Ask the model to draft a plan
- Ask it to review the plan for gaps and blind spots
- Ask for suggestions for each issue
- Tell it to “do it”
He views this as a practical combination of reflection and iterative planning.
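The loop amounts to a fixed sequence of prompts sent to the same conversation. In the sketch below, ask is a hypothetical callable that sends one message to a stateful agent session and returns its reply; the prompt wording paraphrases the steps above and is not quoted from the interview.

```python
from typing import Callable

PLAN_LOOP_PROMPTS = [
    "Draft a plan for: {task}",
    "Review that plan for gaps and blind spots.",
    "Suggest a fix for each issue you found.",
    "Do it.",
]


def run_plan_loop(task: str, ask: Callable[[str], str]) -> list[tuple[str, str]]:
    """Send the four prompts in order and return (prompt, reply) pairs."""
    transcript = []
    for template in PLAN_LOOP_PROMPTS:
        prompt = template.format(task=task)
        transcript.append((prompt, ask(prompt)))
    return transcript


# Stub standing in for a real agent session, so the sketch runs on its own.
def echo_agent(prompt: str) -> str:
    return f"(reply to: {prompt})"


transcript = run_plan_loop("add retry logic to the uploader", echo_agent)
```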
Memory, Agents, and the Future
What he means by “memory”
Matt is working on how agents should:
- Remember conversations during a session
- Persist useful context across sessions
- Support long-term learning
- Load things like skills or operating instructions on demand
Design goals
He wants memory to be:
- Flexible
- Provider-based
- Cross-runtime
- Compatible with different storage backends
He explicitly does not want to lock developers into one storage choice. He imagines support for things like:
- Durable Objects / SQLite
- PlanetScale
- Vector stores such as Vectorize
- Other future memory systems
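One way to read "provider-based" is a small interface that agent code depends on, with the storage backend swapped in behind it. The sketch below is not the Agents SDK API: MemoryProvider, its two methods, and both implementations are hypothetical names chosen for illustration, with Python's built-in sqlite3 standing in for a SQLite-backed store.

```python
import sqlite3
from typing import Optional, Protocol


class MemoryProvider(Protocol):
    """Hypothetical interface: the only thing agent code depends on."""

    def remember(self, key: str, value: str) -> None: ...
    def recall(self, key: str) -> Optional[str]: ...


class InMemoryProvider:
    """Session-scoped memory: gone when the process exits."""

    def __init__(self) -> None:
        self._items: dict[str, str] = {}

    def remember(self, key: str, value: str) -> None:
        self._items[key] = value

    def recall(self, key: str) -> Optional[str]:
        return self._items.get(key)


class SQLiteProvider:
    """Persistent memory: survives across sessions when given a file path."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS memory (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def remember(self, key: str, value: str) -> None:
        with self._db:  # commits on success
            self._db.execute(
                "INSERT OR REPLACE INTO memory (key, value) VALUES (?, ?)", (key, value)
            )

    def recall(self, key: str) -> Optional[str]:
        row = self._db.execute(
            "SELECT value FROM memory WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None


def note_preference(memory: MemoryProvider, value: str) -> Optional[str]:
    """Agent-side code: store a preference and return the previous one, if any."""
    previous = memory.recall("editor")
    memory.remember("editor", value)
    return previous
```

Because note_preference only sees the interface, moving from in-process memory to SQLite, or to some other backend, would not change the agent code.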
Why this matters
His goal is to make the Agents SDK a place where developers can build memory systems without being forced into one opinionated architecture.
AI Culture, Productivity, and the Human Side
On feeling behind
Matt acknowledges that the field is moving extremely fast, and even experienced builders often feel behind. He says many Cloudflare people are “permanently online” because the pace of change is so intense.
On work style and attention
He admits:
- He thinks about these problems constantly
- He has trouble multitasking
- He prefers a few high-quality agent threads over many at once
- AI has shifted him from “manual coding” to more architectural thinking
On collaboration
He agrees that AI-native work can make collaboration trickier, because one person can move very quickly. But he also emphasizes that the right goal is still to leave the codebase better than you found it—clean, tested, and understandable for others.
Tools, Demos, and Side Notes Worth Knowing
Useful tools mentioned
- Granola — used for note-taking and summarization; Matt praises it for improving workflows, especially for doctors and for writing outlines from spoken ideas
- Handy — open-source local voice-to-text
- Pydantic Monty — a Python interpreter aimed at safe code execution
- Pi / Pi Agent — inspiration for agent primitives
- Interaction’s Poke — inspiration for stateful workflows
Cloudflare demo culture
He mentions Cloudflare’s Friday demos as a major internal venue where prototypes get shown early and often become real products.
Key Takeaways
- Code Mode is Cloudflare’s answer to tool bloat in MCP: let the model write code, not just call tools one at a time.
- Two tools can be enough if they’re powerful: search and execute.
- Sandboxing is essential for safe agentic code execution.
- MCP is promising, but it works best when used as a protocol for flexible composition, not as a giant menu of endpoints.
- Memory is the next big frontier for agents, but it needs to be portable and configurable.
- AI-native development is changing how people think, not just how they code.
Practical Recommendations
- Read Cloudflare’s Code Mode blog post and experiment with the pattern.
- When building agent systems, prefer exposing an SDK that the model writes code against over creating a huge number of tools.
- Use sandboxed execution if you let models generate runnable code.
- If you’re building CLIs, make them more agent-friendly:
  - support non-interactive usage
  - provide machine-readable help
  - consider an --agent flag or markdown-based help output
- Start thinking about memory as infrastructure, not just a feature.
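The CLI recommendations above can be sketched with Python's argparse. Everything here is hypothetical: the deploytool program, its options, and the JSON layout of the help output are assumptions made to illustrate an --agent flag and non-interactive behavior, not an established convention.

```python
import argparse
import json
import sys

# Single source of truth for both the parser and the machine-readable help.
OPTIONS = [
    ("target", {"nargs": "?", "help": "Environment to deploy to"}),
    ("--yes", {"action": "store_true", "help": "Confirm without prompting"}),
    ("--agent", {"action": "store_true", "help": "Print help as JSON and exit"}),
]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="deploytool")
    for name, kwargs in OPTIONS:
        parser.add_argument(name, **kwargs)
    return parser


def agent_help() -> str:
    """Describe the CLI as JSON so an agent can parse it without scraping text."""
    options = [{"name": name, "help": kwargs["help"]} for name, kwargs in OPTIONS]
    return json.dumps({"prog": "deploytool", "options": options}, indent=2)


def main(argv: list[str]) -> int:
    args = build_parser().parse_args(argv)
    if args.agent:
        print(agent_help())
        return 0
    if args.target is None:
        print("error: target is required", file=sys.stderr)
        return 2
    if not args.yes:
        # Fail fast instead of calling input(): an agent cannot answer a prompt.
        print("error: pass --yes to confirm", file=sys.stderr)
        return 2
    print(f"deploying to {args.target}")
    return 0
```

Exiting with a clear error and a non-zero status, instead of waiting on a confirmation prompt, gives an agent something it can read and act on in the next step.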
