How to create agents that people actually want to use

Summary of How to create agents that people actually want to use

by The Stack Overflow Podcast

27 min · November 18, 2025

Overview

This Stack Overflow Podcast episode features Assaf Elovic, Head of AI at monday.com, discussing lessons from an early chatbot product, what went wrong, and how the team rebuilt their AI strategy into two complementary offerings: a contextual personal assistant (Monday Sidekick) and a platform for autonomous agents (Monday Agents). The conversation centers on product and UX design, user expectations, engineering trade-offs, and practical guidance for building agents people will actually adopt.

Key takeaways

  • Lab performance ≠ production performance: a model that scored ~80% in lab tests delivered ~20–30% accuracy in production for monday.com's first chatbot.
  • User expectations and product framing matter more than raw model capability. Name, entry point, and visible scope strongly shape what users expect.
  • Conversation is a means to an end: agents must support the "last mile" (completing the task inside the product) or risk context switching.
  • Two AI directions: personal assistants (context-aware Sidekick) and autonomous background agents (Agents platform) serve different needs and require different UX and safety patterns.
  • Control, observability, and safe escalation to humans are essential for adoption and trust.

What went wrong with the original product (Monday Expert)

  • Very broad, open-ended chatbot allowed an effectively unbounded range of user queries → impossible to create a representative golden dataset or a reliable evaluation suite up front.
  • Huge expectation mismatch: the product name ("Expert") and top-level entry point created the impression that the bot could do anything across the platform.
  • Production accuracy dropped to ~20–30% (vs ~80% in the lab) due to uncontrolled, diverse real-world queries and contextual complexity.
  • The fallback behavior (knowledge-base guidance) triggered too often, turning the bot into a frustrating support tool instead of an empowering assistant.

How monday.com rebuilt the experience

  • Rebranded: from "Monday Expert" to "Monday Sidekick" to set more realistic expectations (assistant, not omniscient expert).
  • Changed entry point: moved from top-level global nav to contextual entry at the board/task item level so users interact with the agent inside a specific context.
  • Closed the last mile: added built-in UI elements in conversation flows so generated outputs (emails, docs, etc.) can be sent or applied directly without copy-paste and context switching.
  • Introduced Monday Agents: platform for autonomous agents (voice, email, etc.) that can run background tasks (e.g., feedback collection), convert unstructured → structured data, and write back to monday boards.
  • Emphasized user-facing controls: testing mode, live calls to test agents, verbal feedback that updates prompts, and logging and review for each agent interaction (this feedback loop is sketched below).
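
A minimal sketch of that test-feedback-iterate loop, assuming creator feedback is folded into the agent's system prompt as an explicit instruction; the revise_prompt helper is hypothetical, not monday.com's actual mechanism:

```python
# Minimal sketch of the test-feedback-iterate loop described above:
# creator feedback is folded into the agent's system prompt before publishing.
# revise_prompt() is a hypothetical helper, not monday.com's mechanism.

def revise_prompt(system_prompt: str, feedback: str) -> str:
    """Append creator feedback as an explicit instruction. A production
    system might instead ask an LLM to rewrite the prompt coherently."""
    return f"{system_prompt}\nAdditional instruction from creator: {feedback}"

prompt = "You are a feedback-collection agent for monday.com boards."
for note in ["Keep calls under two minutes.",
             "Always confirm the ticket number before closing."]:
    prompt = revise_prompt(prompt, note)
print(prompt)
```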

Design & UX lessons (practical)

  • Start narrow and contextual: design agents around specific tasks and contexts, not open-ended command centers.
  • Set expectations up front: naming, UI boundaries, and kickoff messages should explicitly state scope and capabilities.
  • Provide end-to-end flows: ensure users can finish a task inside your product, both to measure completion and to avoid context switching to external LLM UIs.
  • Offer robust testing and tuning: let creators test agents live, give feedback, and iterate on system prompts before publishing.
  • Default to transparency: disclose when a user is talking to an AI (especially for voice) to avoid trust erosion.

Technical & platform considerations

  • Use off-the-shelf LLMs and voice models (Assaf mentions integrating providers such as ElevenLabs alongside LLM APIs) rather than building speech models from scratch.
  • Treat the conversation interface as an orchestrator that maps user input to tools/skills and executes actions (see the sketch after this list).
  • Build a flexible agent-builder platform that supports multiple channels (voice, email, web) and different integrations (CRM, ticketing, board updates).
  • Provide observability: logs, transcripts, and feedback loops so owners can monitor, review, and retrain agent behavior.
  • Prioritize safety: controls around what agents can do autonomously, clear escalation paths to humans, and product-level guardrails.
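
As a concrete illustration of the orchestrator idea, here is a minimal sketch. The tool names, the keyword-based router, and the logging line are all assumptions for illustration; a production system would route via an LLM function-calling API rather than keyword matching.

```python
# Sketch of a conversation orchestrator: map user input to a registered
# tool ("skill") and execute it inside the product context. Tool names
# and the routing heuristic are illustrative, not monday.com's API.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[dict], str]

# Each tool performs a real action and writes back to the board,
# closing the "last mile" instead of returning text to copy-paste.
TOOLS = [
    Tool("draft_email", "Write an email from the current task item",
         lambda ctx: f"Draft ready to send for item {ctx['item_id']}"),
    Tool("update_status", "Change the status of the current task item",
         lambda ctx: f"Status of item {ctx['item_id']} set to Done"),
]

def pick_tool(user_input: str) -> Optional[Tool]:
    """Stand-in for an LLM function-calling request: in production the
    model would pick a tool from the descriptions; here we keyword-match."""
    text = user_input.lower()
    for tool in TOOLS:
        if tool.name.split("_")[1] in text:  # crude match on "email"/"status"
            return tool
    return None  # no matching skill: fall back to chat or human escalation

def handle(user_input: str, context: dict) -> str:
    tool = pick_tool(user_input)
    if tool is None:
        return "I can't do that yet. Want me to connect you with support?"
    print(f"[log] tool={tool.name} input={user_input!r}")  # observability hook
    return tool.run(context)

print(handle("send an email about this task", {"item_id": 42}))
```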

Best practices for voice agents

  • Introduce the agent as AI at the start (e.g., "Hi, I'm Linda, the monday AI voice agent") to set expectations; a hypothetical config pulling these practices together follows this list.
  • Be narrow and scripted enough for predictable interactions; avoid trying to be a general human substitute.
  • Make handoff to humans easy and not antagonistic — help users reach a human quickly when needed.
  • Tune ASR and prompts for accents and noisy environments; voice agents can surpass humans on recognition and consistency in some cases.
  • Allow creators to select voice, tone, and accelerate iteration with live testing and immediate prompt updates based on verbal feedback.
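
A hypothetical agent-builder configuration illustrating these practices in one place; every field name here is invented for the sketch and is not the Monday Agents schema:

```python
# Hypothetical voice-agent definition illustrating the practices above.
# All field names are invented for this sketch, not the Monday Agents schema.
voice_agent = {
    "name": "Linda",
    "greeting": "Hi, I'm Linda, the monday AI voice agent.",  # disclose AI up front
    "scope": ["collect_feedback", "schedule_follow_up"],      # narrow, explicit task list
    "voice": {"provider": "elevenlabs", "tone": "friendly"},
    "asr": {"accent_robust": True, "noise_suppression": True},
    "escalation": {  # easy, non-antagonistic handoff to a human
        "triggers": ["asks_for_human", "low_confidence", "repeated_failure"],
        "handoff_message": "Let me connect you with a teammate right away.",
    },
    "logging": {"store_transcripts": True, "owner_review": True},
}
```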

Actionable recommendations (for product/engineering teams)

  • Reframe scope: choose a narrow set of tasks to automate and test thoroughly in production before expanding.
  • Instrument end-to-end success metrics (task completion, not just message correctness); a sketch follows this list.
  • Move entry points to context-specific surfaces to constrain expectations and improve relevance.
  • Build UI components that let users finish generated outputs without copy/paste or context switching.
  • Implement an authoring/test/publish cycle with logs, human review, and easy escalation flows.
  • Be explicit about AI identity and capability limits to maintain trust.
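
For the instrumentation point above, a sketch of measuring task completion rather than message correctness; the event names and the emit sink are assumptions for illustration:

```python
# Sketch of end-to-end success instrumentation: record whether the user
# finished the task in-product, not whether the reply merely "looked right".
# Event names and the emit() sink are assumptions for illustration.
import json
import time

def emit(event: str, **fields) -> None:
    """Stand-in for an analytics pipeline; here we just print JSON lines."""
    print(json.dumps({"event": event, "ts": time.time(), **fields}))

def record_flow(session_id: str, task: str, completed_in_product: bool) -> None:
    emit("agent_task_started", session=session_id, task=task)
    if completed_in_product:
        # e.g. the user clicked "Send" on the generated email inside the board
        emit("agent_task_completed", session=session_id, task=task)
    else:
        # the user copy-pasted elsewhere or abandoned: a last-mile failure
        emit("agent_task_abandoned", session=session_id, task=task)

record_flow("s-1", "draft_email", completed_in_product=True)
```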

Future outlook (monday.com view)

  • Two-pronged vision: personal assistants (make people 10–100x more productive) and autonomous agents (help businesses scale operations).
  • High SMB adoption observed as companies find clear, immediate ROI from automating tedious or repetitive workflows.
  • Ongoing investment in safety, control, and platform extensibility so customers can build custom agents (voice, email, web) that integrate with business systems.

Notable quotes

  • "If you have the best technology in the world, but users don't know how to use it, then it's meaningless."
  • "Conversation is a means to an end... it's great as a zero to one, but eventually the interface is critical."
  • "The last thing you want to do is leverage the fact that these voice agents can sound superhuman and treat them as such."

A quick checklist for launching an agent product:

  1. Define a narrow task and a success metric.
  2. Pick contextual entry points.
  3. Build end-to-end UI for the last mile.
  4. Provide testing and feedback tooling.
  5. Make AI identity and escalation explicit.
  6. Instrument and iterate from production data.