What's an AI Agent? And Why's That Hard to Define? (The Agents Season, Episode 1)

by Ben Jaffe and Katie Malone

19 min · April 20, 2026

Overview of What's an AI Agent? And Why's That Hard to Define? (The Agents Season, Episode 1)

Ben Jaffe introduces a multi-episode season of Linear Digressions that unpacks "AI agents." He argues the term is widely used but ambiguously defined, and lays out why clarifying that definition matters technically, socially, economically, and ethically. The episode works through progressively more complex examples to show what differentiates a plain LLM response from a genuinely agentic system, settles on an operational working definition (an observe → reason → act loop), and previews topics the season will cover: tool use, planning, failure modes, evaluation, economics, multi-agent systems, and a case study of an agent he built to help produce the podcast.

Key definitions — examples that illustrate agency

Responder (baseline)

  • Example: ChatGPT answering a single user question (e.g., podcast recommendations).
  • Characteristics: Single-turn response, no persistence, no action in the world beyond text shown to the user.
  • Not considered agentic in the sense used here.

Tool use / single action

  • Example: LLM that can call a web-search tool and include results in its reply.
  • Characteristics: Takes an external action (uses a tool) but typically performs one action, returns results and stops. No ongoing loop of observation → decision → action.
  • Distinction: Tool use alone isn’t sufficient for agency.
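
To make the "one action, then stop" pattern concrete, here is a minimal Python sketch. The names `answer_with_search` and `fake_search` are hypothetical, and the search tool is a canned stand-in rather than a real API; the point is only that the tool result is never fed back into a further decision.

```python
from typing import Callable


def answer_with_search(question: str, search: Callable[[str], list[str]]) -> str:
    """Single-shot tool use: one tool call, one reply, no follow-up loop."""
    results = search(question)  # the only external action taken
    if not results:
        return f"No results found for: {question}"
    # The result is reported as-is; nothing is re-examined or retried.
    return f"Top result for '{question}': {results[0]}"


def fake_search(query: str) -> list[str]:
    # Hypothetical stand-in for a web-search tool; returns canned data.
    return ["Linear Digressions", "Another podcast"]


print(answer_with_search("data science podcasts", fake_search))
```

Nothing here observes the outcome and chooses a next step, which is why this pattern falls short of agency under the episode's framing.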

Goal-directed multi-step agent (e.g., booking a flight)

  • Example: "Book me the cheapest flight to NYC next Thursday under $400; prefer United if possible."
  • Characteristics: Requires search, repeated queries, result evaluation, form-filling, handling confirmations, deciding when the task is complete.
  • Key features: A feedback loop where each action’s observation informs the next action; consequences in the real world (purchases, charges, travel).
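
A rough sketch of that feedback loop, under stated assumptions: the `Flight` type, the paged search results, and the reading of the request (stay under budget, take United when it qualifies, otherwise take the cheapest option) are all illustrative choices, not something specified in the episode. No real booking API is involved.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Flight:
    airline: str
    price: float


def choose_flight(flights: list[Flight], budget: float = 400.0,
                  preferred: str = "United") -> Optional[Flight]:
    """Pick the cheapest under-budget flight, favoring the preferred airline."""
    affordable = [f for f in flights if f.price < budget]
    if not affordable:
        return None
    preferred_options = [f for f in affordable if f.airline == preferred]
    pool = preferred_options or affordable
    return min(pool, key=lambda f: f.price)


def search_until_satisfied(pages: Iterable[list[Flight]], budget: float = 400.0,
                           preferred: str = "United") -> Optional[Flight]:
    """Each page of results (an observation) decides whether to keep searching."""
    seen: list[Flight] = []
    for page in pages:                      # act: request another page of results
        seen.extend(page)                   # observe: record what came back
        choice = choose_flight(seen, budget, preferred)
        if choice is not None and choice.airline == preferred:
            return choice                   # stop: preferred airline fits the budget
    return choose_flight(seen, budget, preferred)  # fall back to best seen so far


pages = [
    [Flight("Delta", 350.0), Flight("United", 450.0)],
    [Flight("United", 380.0), Flight("JetBlue", 300.0)],
]
print(search_until_satisfied(pages))  # Flight(airline='United', price=380.0)
```

The first page yields only an acceptable Delta fare, so the loop keeps going; the second page surfaces a United fare under budget and the loop stops. A real agent would also have to fill forms and handle confirmations, which this sketch leaves out.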

Persistent/background manager (Open Claw / email manager)

  • Example: An agent that autonomously reads and replies to emails according to policy, possibly without a human in the loop.
  • Characteristics: Ongoing relationship with the world rather than a single task; exercises delegated judgment; raises authorship/accountability questions.
  • Implication: More extreme form of agency—delegation of judgment and persistent consequences.

Working definition & theoretical grounding

  • Classic definition (Russell & Norvig): Agents perceive via sensors and act via actuators. Useful but broad.
  • Anthropic-style framing emphasizes actions with persistent consequences.
  • Working definition used in the season: the ReAct-style loop (observe, reason, act, repeat).
    • Why ReAct? The name combines "reasoning" and "acting," and the loop explicitly captures the iterative perception-decision-action cycle that distinguishes agents from mere responders or single-action tool users.
    • This loop is central to topics the season will examine (planning, stopping criteria, evaluation, failure modes).
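
The loop itself can be sketched in a few lines of Python. This is a toy illustration, not the implementation from any paper: `run_agent`, its three callbacks, and the `max_steps` budget are hypothetical names, and the "environment" is just a counter.

```python
from typing import Any, Callable, Optional


def run_agent(observe: Callable[[], Any],
              decide: Callable[[Any, list], Optional[Any]],
              act: Callable[[Any], Any],
              max_steps: int = 10) -> list[tuple[Any, Any, Any]]:
    """Observe, reason, act, repeat, until the policy says stop or the budget runs out."""
    history: list[tuple[Any, Any, Any]] = []
    for _ in range(max_steps):
        observation = observe()                 # observe the current state
        action = decide(observation, history)   # reason about what to do next
        if action is None:                      # stopping criterion: nothing left to do
            break
        result = act(action)                    # act on the world
        history.append((observation, action, result))
    return history


# Toy environment: increment a counter until it reaches 3.
state = {"count": 0}


def observe() -> int:
    return state["count"]


def decide(obs: int, history: list) -> Optional[str]:
    return None if obs >= 3 else "increment"


def act(action: str) -> int:
    state["count"] += 1
    return state["count"]


steps = run_agent(observe, decide, act)
print(len(steps), state["count"])  # 3 3
```

The `max_steps` cap is one crude answer to the stopping-criteria question: even if `decide` never returns `None`, the loop cannot run forever.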

Main issues and implications highlighted

  • Autonomy vs. control: How much judgment do you delegate? What stays with the human?
  • Consequences & accountability: Agents can cause real-world effects (charges, commitments). Who’s responsible when things go wrong?
  • Stopping criteria & completion: How does an agent know it's done? This is a hard planning problem.
  • Error detection & correction: Mistakes may occur outside human oversight and be hard to catch.
  • Evaluation difficulty: Agent behavior is complex, sequential, and context-dependent—hard to evaluate like a chatbot.
  • Economics & scalability: Running persistent, tool-using agents has resource and cost implications.
  • Multi-agent dynamics: Coordination, role division, and communication between agents add complexity (and opportunity).

Season roadmap (what’s coming)

  • Next episode: Transition from chatbots to agents; the emergence of tool use; detailed look at ReAct and related papers.
  • Subsequent topics across the season:
    • Planning and stopping criteria
    • Failure modes unique to agents
    • How to evaluate agent performance
    • Economics of operating agents
    • Multi-agent systems and coordination
    • Case study: Ben’s own agent used to help produce this podcast (what it does well, what he keeps for himself)
  • Recurring thread: references to the “Open Claw” example and the research literature.

Notable insights / quotes (paraphrased)

  • “The key question isn't whether a tool was used — it's whether the model is observing the results of its actions and deciding what to do next.”
  • “When an AI starts managing things on your behalf, authorship and accountability become murky: did you send that email or did your agent?”
  • “This cycle—observe, reason, act—is what makes agents unique and hard; it runs through everything we’ll cover this season.”
  • “When they work well, agents are magical. When they fail, there are many ways things can go wrong.”

Actionable next steps & resources

  • To see the host's agent in action: subscribe to the Linear Digressions newsletter on Substack for episode recaps, sources, and examples of the host’s agent-generated content.
  • Papers and references mentioned to follow up on:
    • The ReAct paper (observe → reason → act loop), the primary working lens for the season
    • Russell & Norvig (standard AI agent framing)
    • Anthropic’s framing on consequential actions
    • (Episode teasers mention MCP and the “Open Claw” example—look out for these in later episodes)
  • Practical suggestion for listeners: start thinking now about what tasks you might delegate versus what you should keep—designing that boundary is a core, practical question for working with agents.

If you want a concise checklist to evaluate whether something is an “agent” under this definition:

  • Does it repeatedly act based on observations (looping behavior)?
  • Does it make decisions toward a goal rather than just reply?
  • Does it take actions that have persistent, real-world effects?
  • Can it operate without (constant) human oversight?
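
For readers who like checklists as code, the four questions can be encoded as a simple tally. The `SystemTraits` type and the idea of counting "yes" answers are illustrative assumptions; the episode does not propose a numeric score or a threshold.

```python
from dataclasses import dataclass


@dataclass
class SystemTraits:
    loops_on_observations: bool   # repeatedly acts based on what it observes
    goal_directed: bool           # makes decisions toward a goal
    persistent_effects: bool      # actions have lasting real-world effects
    runs_unsupervised: bool       # operates without constant human oversight


def agency_score(traits: SystemTraits) -> int:
    """Count how many of the four checklist questions are answered 'yes'."""
    return sum([
        traits.loops_on_observations,
        traits.goal_directed,
        traits.persistent_effects,
        traits.runs_unsupervised,
    ])


chatbot = SystemTraits(False, False, False, False)
email_manager = SystemTraits(True, True, True, True)
print(agency_score(chatbot), agency_score(email_manager))  # 0 4
```

A tally like this is only a prompt for discussion: a system scoring 2 is not "half an agent," it just sits somewhere between the responder and the background manager described above.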

Enjoy the season. The next episode dives into ReAct and tool use.