How AI will change software engineering – with Martin Fowler

by Gergely Orosz

1h 48m · November 19, 2025

Overview

This episode features Martin Fowler (ThoughtWorks Chief Scientist) in conversation with Gergely Orosz about how large language models (LLMs) and generative AI are changing software engineering. Fowler frames the shift as the largest of his career, compares it to the move from assembly to high-level languages, and emphasizes that the biggest change is the move from deterministic to non-deterministic tooling. The discussion covers practical uses of LLMs today (prototyping, legacy understanding), risks (vibe coding, hallucinations), evolving workflows (DSLs, spec/AI combinations), and the continuing, in some cases increased, importance of practices like testing and refactoring.

Key takeaways

  • The core paradigm shift: the biggest change introduced by LLMs is non-determinism. The same input may produce different outputs, which forces new engineering habits: working with tolerances and verifying results rather than assuming correctness (see the sketch after this list).
  • Best immediate wins for LLMs:
    • Rapid prototyping / “vibe coding” for exploratory, disposable work.
    • Understanding legacy codebases via semantic analysis + RAG-like queries (graph DB + LLM).
    • Learning unfamiliar APIs / domains quickly (developer exploration).
  • Big cautions:
    • Vibe coding short-circuits the developer learning loop; don't rely on it for long-lived systems.
    • LLMs hallucinate and can “lie” about simple facts (dates, test results); always verify.
    • Team workflows need adaptation: treat LLM output like code from a highly‑productive but untrusted collaborator.
  • Refactoring becomes more important: more generated code → more need for disciplined small, behavior‑preserving changes to improve maintainability.
  • Promote rigorous ways to speak to models: building focused domain languages / DSLs or precise prompt languages helps get better, more reliable outputs.
  • Agile principles (short cycles, frequent feedback) remain the best bet; aim to shorten cycle time and keep humans in the loop for verification.
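
Fowler's point about tolerances and verification translates into a concrete coding pattern: wrap each non-deterministic call in deterministic checks and only accept output that passes. Below is a minimal Python sketch of that pattern; `generate_summary` and the specific invariants are illustrative assumptions, not anything specified in the episode.

```python
# Minimal sketch: verify non-deterministic output with deterministic checks.
# `generate_summary` is a hypothetical stand-in for any LLM-backed call.

def generate_summary(text: str) -> str:
    """Placeholder for a model call that may return different output each run."""
    raise NotImplementedError("wire up a real model client here")

def invariant_failures(source: str, summary: str) -> list[str]:
    """Deterministic checks that must hold regardless of what the model returned."""
    failures = []
    if not summary.strip():
        failures.append("summary is empty")
    if len(summary) > len(source):
        failures.append("summary is longer than the source")
    return failures

def summarize_with_verification(text: str, attempts: int = 3) -> str:
    """Retry the non-deterministic call until a candidate passes verification."""
    for _ in range(attempts):
        candidate = generate_summary(text)
        if not invariant_failures(text, candidate):
            return candidate
    raise RuntimeError("no candidate passed verification")
```

The same shape works for generated code: the invariants become the test suite, and "retry" becomes "reject the PR".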

Topics discussed

  • Background: Fowler’s career path, ThoughtWorks, creation/process of the ThoughtWorks Radar.
  • Radar & industry signals: AI and LLM tooling feature heavily on the Radar (adopt/assess rings); examples: pre-commit hooks, ClickHouse, vLLM, GenAI for legacy understanding.
  • Historical analogy: assembly → Fortran/C/etc. was an abstraction shift; LLMs are a comparable leap, but mainly because they introduce non-determinism.
  • Vibe coding:
    • Definition: accepting generated code without understanding or inspecting it.
    • Pros: fast exploration, accessible to non-developers.
    • Cons: erodes learning loop, produces brittle, hard-to-tweak artifacts.
  • Effective LLM uses:
    • Prototyping and UX exploration (fast iteration, throwaway experiments).
    • Reverse engineering/understanding large legacy systems via semantic graphs + LLM queries.
    • Assisting developers in unfamiliar stacks or APIs.
  • Limitations:
    • LLMs are poor at some deterministic transformations (e.g., safe, cross‑codebase refactors) compared to dedicated refactoring tools.
    • Token costs and model behavior make some otherwise simple tasks inefficient when routed through an LLM.
  • Teams & process:
    • Treat LLM outputs as PRs needing rigorous review and testing.
    • Prefer thin, rapid slices; verify each slice with tests and human checks.
    • Potential for DSLs / specification languages to act as better interfaces to LLMs and to bridge domain experts and code (a sketch of the idea follows this list).
  • Enterprise adoption: highly heterogeneous, with pockets of experimentation inside conservative organizations; heavily regulated domains (e.g., finance) demand cautious, controlled usage.
  • Economic context: AI hype/bubble overlapped with macroeconomic cooling (end of zero interest rates, layoffs), creating mixed signals for adoption and hiring.
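
One way to make the DSL/specification idea concrete is a small, structured spec that renders into a precise prompt, so intent is reviewable before it ever reaches a model. The `MigrationSpec` schema below is an illustrative assumption, not a design discussed in the episode; it is a minimal Python sketch of the pattern.

```python
# Minimal sketch: a tiny "specification language" as an interface to an LLM.
# The schema and prompt shape are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class MigrationSpec:
    source_framework: str
    target_framework: str
    must_preserve: list[str] = field(default_factory=list)
    forbidden: list[str] = field(default_factory=list)

def render_prompt(spec: MigrationSpec, code: str) -> str:
    """Turn a structured, reviewable spec into a precise prompt."""
    rules = "\n".join(f"- MUST preserve: {r}" for r in spec.must_preserve)
    bans = "\n".join(f"- MUST NOT use: {r}" for r in spec.forbidden)
    return (
        f"Migrate the following code from {spec.source_framework} "
        f"to {spec.target_framework}.\n{rules}\n{bans}\n\n{code}"
    )

spec = MigrationSpec(
    source_framework="Flask",
    target_framework="FastAPI",
    must_preserve=["route paths", "response JSON shapes"],
    forbidden=["global mutable state"],
)
print(render_prompt(spec, "app = Flask(__name__) ..."))
```

The spec object, not the free-form prompt, becomes the artifact the team reviews and versions.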

Notable quotes & distilled insights

  • “The biggest part of it is the shift from determinism to non-determinism.” — Martin Fowler
  • “When you’re using vibe coding you’re actually removing a very important part of something, which is the learning loop.” — Martin Fowler
  • “Treat every slice as a PR from a rather dodgy collaborator who’s very productive in line count but you can’t trust.” — paraphrase of Fowler’s LLM-as-collaborator mindset
  • Practical framing: use LLMs to get started or explore; then refactor and test thoroughly.

Practical recommendations / action items

For engineers

  • Use LLMs for prototyping and exploring ideas, but don’t let them replace understanding — read and test generated code.
  • Always add/require tests for any generated or modified code. “Don’t trust, verify.”
  • Treat generated code changes as untrusted PRs; enforce code review and CI gates.
  • Practice small, incremental refactorings (behavior-preserving steps) to improve LLM-generated artifacts (see the sketch after this list).
  • Consider designing small DSLs / precise prompt schemas to communicate intent to models more reliably.
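
As a concrete illustration of refactoring in behavior-preserving steps guarded by tests, here is a minimal Python sketch. The function, its "generated" shape, and the test are hypothetical examples, not code from the episode.

```python
# Minimal sketch: a small, behavior-preserving refactor guarded by a test.

def total_price(items):
    # Plausible "generated" shape: one dense expression.
    return sum(i["price"] * i["qty"] for i in items if i.get("qty", 0) > 0)

def total_price_refactored(items):
    """Same behavior, restructured into named steps for readability."""
    def line_total(item):
        return item["price"] * item["qty"]
    billable = [i for i in items if i.get("qty", 0) > 0]
    return sum(line_total(i) for i in billable)

def test_refactor_preserves_behavior():
    items = [{"price": 2.5, "qty": 4}, {"price": 9.0, "qty": 0}]
    assert total_price(items) == total_price_refactored(items) == 10.0

test_refactor_preserves_behavior()
```

The test pins behavior before the change, so each small step can be verified rather than trusted.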

For teams & managers

  • Invest in tools and processes for:
    • Semantic analysis of codebases (graph representations) + RAG queries to surface insights quickly.
    • Integrations for LLMs in IDEs/CI, but keep deterministic automation for heavyweight refactors.
  • Tighten feedback loops (shorter slices, faster deploy/observe cycles); measure outcomes.
  • Mentor junior engineers aggressively — AI can accelerate tasks but not the human judgment and context needed to evaluate output.
  • Establish guardrails and security review for AI usage, especially in regulated domains.

For orgs with legacy systems

  • Try LLM-assisted code understanding as a first pass: extract data flow, call graphs, and "what touches this data" queries to reduce ramp-up time (a minimal sketch follows this list).
  • Combine LLM prompts with deterministic tooling for safe large-scale migrations and refactors.
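
The deterministic first step of such a pipeline can be as simple as parsing the codebase into a call graph that an LLM (or a human) can then query. The sketch below uses Python's standard ast module on a toy source string; the graph-DB and RAG layers discussed in the episode are out of scope here.

```python
# Minimal sketch: deterministic call-graph extraction as a legacy-code first pass.

import ast

SOURCE = """
def load(path):
    return parse(read(path))

def parse(raw):
    return raw.split()
"""

def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the plain function names it calls."""
    tree = ast.parse(source)
    graph: dict[str, set[str]] = {}
    for fn in [n for n in tree.body if isinstance(n, ast.FunctionDef)]:
        calls = set()
        for node in ast.walk(fn):
            # Only direct name calls; method calls (Attribute) are skipped here.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                calls.add(node.func.id)
        graph[fn.name] = calls
    return graph

print(build_call_graph(SOURCE))  # {'load': {'parse', 'read'}, 'parse': set()}
```

Feeding a graph like this into a graph store plus an LLM is what enables "what touches this data" style queries at scale.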

Who should listen / value for different audiences

  • Senior engineers & architects: framing for architecture, refactoring, and team workflows.
  • Engineering managers / leads: practical guidance on governance, adoption patterns, and risk management.
  • Developers (junior → senior): tactical pointers on when to use LLMs, testing and learning practices, and career advice (mentor emphasis).
  • Enterprise technologists: perspective on cautious adoption and how pockets of experimentation can exist inside regulated orgs.

Resources & further reading mentioned

  • ThoughtWorks Radar (regularly updated industry signals; process described by Fowler)
  • Martin Fowler’s website and blog (he curates articles from practitioners)
  • Unmesh Joshi’s work on co-building abstractions and patterns for using LLMs
  • Simon Willison on testing and practical LLM usage
  • Sponsors / tools mentioned in the episode:
    • Statsig (experimentation, feature flags, analytics)
    • Linear (engineering quality rituals — “Quality Wednesdays”)
  • Books Fowler recommended:
    • Thinking, Fast and Slow — Daniel Kahneman (probability, cognitive biases)
    • The Power Broker — Robert Caro (power, institutions); also Caro’s LBJ biography for deeper reading

Quick practical checklist for teams adopting LLMs

  • Require tests for any generated code before merging.
  • Treat LLM outputs as “untrusted PRs” — enforce review workflows.
  • Use LLMs for exploration and legacy-code comprehension; avoid wholesale insertion of generated code into long-lived systems without review and refactoring.
  • Invest in developer mentoring and training on prompt discipline (or internal DSLs) and verification practices.
  • Monitor token usage and cost; put cost controls and confidence checks in place (a budget-guard sketch follows this list).
  • Combine LLM-driven ideas with deterministic toolchains for large, cross-codebase changes.
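
For the cost-control item, a per-task budget guard is one simple pattern. The rate and the character-based token estimate below are illustrative assumptions; real accounting should use the provider's tokenizer and published pricing.

```python
# Minimal sketch: a per-task cost guard for LLM calls. Rates are hypothetical.

class CostBudget:
    def __init__(self, limit_usd: float, usd_per_1k_tokens: float):
        self.limit_usd = limit_usd
        self.rate = usd_per_1k_tokens
        self.spent_usd = 0.0

    def charge(self, tokens: int) -> None:
        """Record spend and fail fast once the budget is exhausted."""
        self.spent_usd += tokens / 1000 * self.rate
        if self.spent_usd > self.limit_usd:
            raise RuntimeError(f"LLM budget exceeded: ${self.spent_usd:.2f}")

def count_tokens(text: str) -> int:
    """Crude stand-in: ~4 characters per token is a common rule of thumb."""
    return max(1, len(text) // 4)

budget = CostBudget(limit_usd=0.05, usd_per_1k_tokens=0.01)
budget.charge(count_tokens("some prompt text..."))
```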

This episode concisely ties together big-picture framing (non-determinism as the core challenge), practical advice (where LLMs help now), and process implications (refactoring, tests, team workflows). Fowler's central guidance: welcome the productivity gains, but keep human-in-the-loop rigor, preserving learning and verification as the foundation of quality software.