The tiny team trying to keep AI from destroying everything

Summary of The tiny team trying to keep AI from destroying everything

by The Verge

38 min · December 4, 2025

Overview of Decoder — "The tiny team trying to keep AI from destroying everything"

This episode of Decoder (The Verge), hosted by Nilay Patel with guest Hayden Field (senior AI reporter at The Verge), profiles Anthropic’s Societal Impacts Team: a tiny, nine-person group tasked with studying and publishing uncomfortable findings about how Anthropic’s AI (Claude), and chatbots more broadly, affect jobs, elections, mental health, and other social systems. The conversation covers the team’s research, its scope and influence, internal tensions (safety vs. business), and political pressure from the U.S. federal government’s “Preventing Woke AI” executive order.

Key takeaways

  • Team scope and rarity
    • Anthropic’s Societal Impacts Team is unusual in both its explicit mandate and its size: nine people in a company of roughly 2,000, focused specifically on societal-level research. No other leading lab has an equivalent dedicated team.
  • Inconvenient-truth research
    • The team publishes research that can be damning for Anthropic itself — e.g., gaps in safeguards, pornographic output, SEO-optimized spam networks, biased or inaccurate political opinions, and emotional reliance on chatbots.
  • Limited product power
    • The team can surface problems and share findings with product and safety teams, but it lacks clear authority to veto or delay product releases; its research is meant to drive concrete product changes, but that outcome is not guaranteed.
  • Business incentives vs. safety
    • Anthropic’s safety-first reputation helps win enterprise and government contracts. But commercial pressures (competition with OpenAI, fundraising, accepting controversial capital) create tensions that may limit how far safety work can constrain product decisions.
  • Political pressure
    • The Trump administration’s executive order targeting “woke AI” (framed as preventing “ideological agendas” in government-used AI) creates new regulatory and reputational pressure that could threaten trust/safety work or change how models are tuned.
  • Where Anthropic is positioned
    • Anthropic is more enterprise- and government-focused (Claude Code) than many consumer-facing rivals, which may insulate it from some consumer-culture conflicts — at least commercially.

What the Societal Impacts Team does (methods & outputs)

  • Data-driven monitoring
    • They built a tracker (a “Google Trends”-like tool) for how people use Claude: query trends, word clouds, and usage patterns to spot misuse and emergent behaviors (see the sketch after this list).
  • Research topics published so far
    • Election-related risks (models giving political opinions or inaccurate claims)
    • Economic impacts (which jobs are affected and adoption patterns)
    • Misuse patterns (coordinated bot spam, sexual-content generation)
    • Emotional impacts and AI “psychosis” — plans to do more social research and interviews
  • Collaboration
    • Regular interactions with trust & safety, and monthly meetings with the chief science officer; Mike Krieger (head of product) is reportedly receptive to incorporating findings into product changes.
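
The episode does not describe how the tracker works internally; the following is only a minimal, hypothetical Python sketch of the general idea behind a “Google Trends”-style usage monitor, assuming access to an anonymized log of per-day topic labels. Every name, record, and threshold here is an illustrative assumption, not Anthropic’s implementation.

```python
# Hypothetical sketch of a usage-trends tracker over anonymized query topics.
# Assumes records of (day, coarse topic label); all data below is made up.
from collections import Counter, defaultdict
from datetime import date

records = [
    (date(2025, 11, 1), "job applications"),
    (date(2025, 11, 1), "election questions"),
    (date(2025, 11, 2), "election questions"),
    (date(2025, 11, 2), "personal advice"),
]

# Aggregate counts per day per topic to surface trends over time.
trends: dict[date, Counter] = defaultdict(Counter)
for day, topic in records:
    trends[day][topic] += 1

# Flag topics whose daily share crosses an assumed review threshold (25%).
for day, counts in sorted(trends.items()):
    total = sum(counts.values())
    for topic, n in counts.most_common():
        share = n / total
        flag = " <-- review" if share >= 0.25 else ""
        print(f"{day} {topic}: {share:.0%}{flag}")
```

A real system would presumably operate on aggregated, privacy-preserving topic categories rather than raw conversations, and feed flagged trends to the trust & safety reviews described above.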

Notable findings and concerns (examples)

  • Fallible safeguards: Long conversations can break guardrails, and researchers documented ways users bypass monitoring to produce harmful content.
  • Political bias and misinformation: Models can present biased or incorrect political opinions that could influence voters.
  • Emotional dependency: Users seeking personal advice may develop emotional reliance on chatbots, raising mental-health concerns.
  • Misuse at scale: Networks of bots using the models for SEO spam or coordinated campaigns were detected.
  • Business tradeoffs: Anthropic accepted Saudi capital while acknowledging moral discomfort, which illustrates how funding needs and competition can shape decisions.

Industry and policy context

  • Anthropic vs. OpenAI
    • Anthropic was founded by ex-OpenAI researchers prioritizing safety. It markets itself as more safety-oriented and pro-regulation, which attracts enterprise and government customers.
    • However, a safety posture also serves as a business advantage and as a defense against regulation (the “we’re self-regulating” rhetoric).
  • Political environment
    • The “Preventing Woke AI” executive order signals government pressure to constrain model behavior in ways that align with the administration’s perspective; that may affect trust/safety and societal-research teams.
  • Historical parallels
    • The episode likens current dynamics to past moderation teams at social networks: initial investment in safety/research followed by cuts or pivots when political or business incentives shift.

Tensions and open questions

  • Independence & longevity
    • Can the team remain independent and continue publishing damning findings if those findings conflict with business or political goals?
  • Real-world product effect
    • How much of the team’s work will translate into concrete product changes (slowing releases, changing model responses)?
  • Response to regulation
    • Will the team survive or be marginalized under new federal requirements seeking “neutral” or non‑“woke” models?
  • Trade-offs of funding and competition
    • Will competition (need for capital, desire to stay relevant vs. OpenAI) push Anthropic to compromise safety principles over time?

Notable quotes

  • Dario Amodei (Anthropic CEO): “No bad person should ever benefit from our success.” (Used to illustrate the moral rhetoric vs. practical difficulty of enforcing it at scale.)
  • Paraphrase of team goal: They aim to “investigate and publish inconvenient truths” about how chatbots affect society.

What to watch next (actionable items / signals)

  • Will Anthropic’s research tangibly change product behavior? Monitor updates to Claude, product release cadence, and documented product changes tied to societal research.
  • Effects of the executive order: Track how federal procurement rules and model-tuning guidance change public-facing responses from Claude and from competitors’ models.
  • Team visibility and staffing: Watch whether Anthropic expands, maintains, or reduces the Societal Impacts Team and whether other labs create similar dedicated groups.
  • Industry adoption signals: Watch enterprise adoption (Claude Code leaderboard, enterprise testimonials) to see whether safety-first branding continues to convert customers despite political headwinds.

Why this matters

  • The episode highlights a rare case of an in-house group explicitly studying large-scale societal harms from AI and publishing uncomfortable findings about its own product. Their work illuminates real-world failure modes of chatbots and the messy incentives (business and political) that will determine whether such research shapes safer AI or becomes PR cover. The stakes include election integrity, labor markets, mental health, and the fundamental trajectory of how AI is governed and deployed.

Credits: Host Nilay Patel; guest Hayden Field (The Verge). Produced by Decoder (The Verge / Vox Media).