The Future is Agentic in Recommender Systems

by Kyle Polich

49m · April 25, 2026

Overview of The Future is Agentic in Recommender Systems

This episode of Data Skeptic explores how large language models and agents are reshaping recommender systems. Host Kyle Polich speaks with recommender systems researcher Yashar Deldjoo about the shift from classic ranking-based recommenders, like collaborative filtering, toward agentic systems that can converse, reason, use tools, remember context, and complete multi-step tasks. The discussion also previews Deldjoo’s upcoming book, Recommendations with Generative Models, and covers the growing importance of trustworthiness, safety, and evaluation in this new era.

Key Takeaways

Recommender systems are moving beyond ranking

  • Traditional recommenders mainly answer: “What items should be ranked highest for this user?”
  • Agentic recommenders expand the goal to: “What task should be done for this user?”
  • This shift enables more complex, constraint-aware assistance, such as planning travel, shopping, or finding restaurants based on many preferences at once.

LLMs bring two major advantages

  • Conversational interaction: Users can express preferences naturally, in multiple turns.
  • Broader knowledge and augmentation: LLMs can pull in external knowledge beyond the training data of a classic recommender system.

Collaborative filtering is not obsolete

  • Deldjoo argues classic methods still provide a strong, reliable baseline.
  • The future is likely hybrid: collaborative filtering plus LLMs/agents, rather than a full replacement.
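
One way to picture the hybrid idea is a weighted blend of a collaborative-filtering prediction and an LLM-derived relevance score. The sketch below is illustrative only and is not a method described in the episode: the ratings data, the `hybrid_score` function, the `alpha` weight, and the assumption that the LLM score arrives as a number between 0 and 1 are all hypothetical.

```python
from math import sqrt

# Toy ratings on a 1-5 scale (hypothetical data); 0 means "not rated".
RATINGS = {
    "ana": {"A": 5, "B": 3, "C": 0},
    "ben": {"A": 4, "B": 0, "C": 4},
    "cy": {"A": 0, "B": 2, "C": 5},
}


def cosine(u, v):
    """Cosine similarity between two users' rating vectors."""
    dot = sum(u[i] * v[i] for i in u)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cf_score(user, item):
    """User-based CF: similarity-weighted mean of other users' ratings."""
    num = den = 0.0
    for other, row in RATINGS.items():
        if other == user or row[item] == 0:
            continue
        sim = cosine(RATINGS[user], row)
        num += sim * row[item]
        den += sim
    return num / den if den else 0.0


def hybrid_score(user, item, llm_score, alpha=0.7):
    """Blend the CF prediction (rescaled to 0-1) with an LLM score in 0-1."""
    return alpha * cf_score(user, item) / 5.0 + (1 - alpha) * llm_score


# Assumed LLM relevance of 0.9 for item "C"; in practice this would come
# from a model call, which is deliberately left out of this sketch.
print(round(hybrid_score("ana", "C", llm_score=0.9), 3))
```

Setting `alpha=1.0` recovers the pure collaborative-filtering baseline, which makes it easy to compare the hybrid against the classic method.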

Agentic systems can do more than recommend

  • They can help with planning, explanation, data gathering, simulation, and evaluation.
  • In some domains, they may even generate candidate options that do not exist yet, supporting creativity and discovery.

Trustworthiness and Risk in Recommender Systems

Core trustworthiness dimensions

Deldjoo frames trustworthy AI in recommender systems around several dimensions:

  • Generalizability
  • Robustness
  • Privacy
  • Explainability
  • Fairness and bias

New risks introduced by generative models

  • Hallucination: The system may invent items, metadata, or facts.
  • Context drift: The model may lose track of the user’s original goal during a conversation.
  • Stereotype amplification: Because LLMs are trained on broad internet data, they can reproduce biased patterns.
  • Safety failures: In sensitive domains like medicine, persuasive but incorrect suggestions can be dangerous.

Adversarial concerns still matter

  • Attackers may try to manipulate recommendation outcomes for commercial gain.
  • Recommender systems must remain resilient to both targeted and untargeted attacks.
  • Robustness is easier to define and measure than fairness, which is often subjective and multi-stakeholder.

The Agentic Recommender System Framework

Three broad categories of agentic use

Deldjoo describes three ways agents fit into recommender systems:

  1. Agent as Recommender

    • The agent itself performs the recommendation.
    • This is the most direct replacement of a classic recommender.
  2. Agentic Augmentation

    • The agent helps the recommender system, but does not replace it.
    • Examples include data augmentation, tool use, and support for cold-start scenarios.
  3. Simulation and Evaluation

    • Agents simulate users or system interactions to test behavior before real deployment.
    • This can reduce reliance on costly online user studies.
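
The simulation idea in the third category can be sketched offline with a handful of simulated users. Everything here is a simplifying assumption: the catalog, the users, and the rule that a simulated user accepts any item sharing a preferred tag are hypothetical, and a rule-based stand-in replaces the LLM-driven user agent that the episode has in mind.

```python
# Hypothetical catalog: item -> descriptive tags.
CATALOG = {
    "trail_shoes": {"outdoor", "running"},
    "yoga_mat": {"indoor", "fitness"},
    "tent": {"outdoor", "camping"},
}

# Hypothetical simulated users, each with hidden tag preferences.
SIM_USERS = {
    "u1": {"outdoor"},
    "u2": {"fitness"},
    "u3": {"cooking"},
}


def popularity_recommender(user_id, k=2):
    """Stand-in recommender under test: ignores the user entirely."""
    return ["trail_shoes", "tent"][:k]


def simulated_accepts(user_prefs, item):
    """A simulated user accepts an item that shares a preferred tag."""
    return bool(user_prefs & CATALOG[item])


def hit_rate(recommender, users, k=2):
    """Fraction of simulated users who accept at least one top-k item."""
    hits = sum(
        any(simulated_accepts(prefs, item) for item in recommender(uid, k))
        for uid, prefs in users.items()
    )
    return hits / len(users)


print(hit_rate(popularity_recommender, SIM_USERS))
```

Swapping `popularity_recommender` for another function with the same signature lets two recommenders be compared on the same simulated population before any real users are involved.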

A formal view of agents

The paper, The Future is Agentic, frames an agent in terms of five core components:

  • An underlying LLM
  • An input space: what the agent can observe
  • An output space: rank lists, text, explanations, multimodal outputs, etc.
  • Tools/functions: external capabilities the agent can call
  • Memory: what the agent stores and retrieves
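
The components listed above map naturally onto a small data structure. This is a minimal sketch and not the paper's formalism: the `RecAgent` class, the `search_catalog` tool, and the `stub_llm` function are hypothetical names, and the stub stands in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RecAgent:
    llm: Callable[[str], str]                     # underlying model (stubbed)
    tools: Dict[str, Callable[[str], List[str]]]  # external capabilities
    memory: List[str] = field(default_factory=list)  # stored observations

    def act(self, observation: str) -> List[str]:
        """Input space: a text observation. Output space: a list of items."""
        self.memory.append(observation)
        # The model sees the remembered context and names the tool to call.
        tool_name = self.llm("\n".join(self.memory))
        tool = self.tools.get(tool_name)
        return tool(observation) if tool else []


def search_catalog(query: str) -> List[str]:
    """Hypothetical tool: keyword match over a tiny in-memory catalog."""
    catalog = ["vegan ramen bar", "steak house", "vegan bakery"]
    words = query.lower().split()
    return [name for name in catalog if any(w in name for w in words)]


def stub_llm(prompt: str) -> str:
    """Stand-in for a real LLM call: always picks the catalog search tool."""
    return "search_catalog"


agent = RecAgent(llm=stub_llm, tools={"search_catalog": search_catalog})
print(agent.act("vegan dinner"))  # ['vegan ramen bar', 'vegan bakery']
```

Keeping the model, tools, and memory as separate fields makes each one replaceable on its own, for example swapping the stub for a real model without touching the tools.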

Types of memory

  • Working / short-term memory: remembers the current conversation
  • Episodic / long-term memory: recalls specific past interactions
  • Semantic memory: stores accumulated user preferences and facts
  • Procedural memory: remembers repeated user workflows and common tasks
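
The four memory types can be sketched as four separate stores. The `AgentMemory` class, its field names, and the three-turn limit on working memory are assumptions made for illustration; the episode does not prescribe an implementation.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List


@dataclass
class AgentMemory:
    # Working memory: only the most recent turns of the current conversation.
    working: Deque[str] = field(default_factory=lambda: deque(maxlen=3))
    # Episodic memory: summaries of specific past sessions.
    episodic: List[str] = field(default_factory=list)
    # Semantic memory: accumulated preferences and facts about the user.
    semantic: Dict[str, str] = field(default_factory=dict)
    # Procedural memory: named workflows the user repeats.
    procedural: Dict[str, List[str]] = field(default_factory=dict)

    def add_turn(self, text: str) -> None:
        """Append a turn; the oldest turn is dropped once the limit is hit."""
        self.working.append(text)

    def end_session(self, summary: str) -> None:
        """Archive the session as an episode and clear working memory."""
        self.episodic.append(summary)
        self.working.clear()


mem = AgentMemory()
for turn in ["hi", "need a hotel", "in Lisbon", "under 100 euros"]:
    mem.add_turn(turn)
mem.semantic["budget"] = "under 100 euros per night"
mem.procedural["book_trip"] = ["search hotels", "compare", "confirm"]
print(list(mem.working))  # ['need a hotel', 'in Lisbon', 'under 100 euros']
mem.end_session("Looked for a budget hotel in Lisbon")
```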

Why Agents Feel Like the Future

From ranking to action

  • Classic systems say: “Here is the best list.”
  • Agentic systems say: “I can help you do the task.”

Example: travel planning

An agent can combine:

  • Budget
  • Family preferences
  • Eco-friendliness
  • Child-friendliness
  • Distance
  • Scheduling constraints

This is difficult for a traditional recommender alone, but natural for an agent that can gather and combine information from multiple tools.
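
Once an agent has gathered the information, combining the constraints can be as simple as filtering on hard requirements and ranking on soft ones. The sketch below assumes the tool calls have already returned structured options; the `Trip` fields, the `plan` function, the sample data, and the choice to treat eco-friendliness as a soft preference are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Trip:
    name: str
    price: float
    distance_km: float
    child_friendly: bool
    eco_score: float  # 0 (worst) to 1 (best); hypothetical scale


def plan(options: List[Trip], budget: float, max_km: float,
         need_child_friendly: bool) -> List[Trip]:
    """Drop options that violate a hard constraint, then rank the rest."""
    feasible = [
        t for t in options
        if t.price <= budget
        and t.distance_km <= max_km
        and (t.child_friendly or not need_child_friendly)
    ]
    # Soft preferences: greener first, cheaper as the tie-breaker.
    return sorted(feasible, key=lambda t: (-t.eco_score, t.price))


OPTIONS = [
    Trip("lake cabin", 900, 120, True, 0.8),
    Trip("city break", 700, 300, True, 0.5),
    Trip("ski resort", 1500, 250, True, 0.4),
    Trip("wine tour", 600, 90, False, 0.9),
]

for trip in plan(OPTIONS, budget=1000, max_km=350, need_child_friendly=True):
    print(trip.name)
```

With these inputs the ski resort is dropped for exceeding the budget and the wine tour for not being child-friendly, leaving the lake cabin ahead of the city break.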

Example: fashion recommendation

  • LLM-powered systems can help users visualize outfit combinations and generate ideas.
  • They can increase engagement and product discovery.
  • They may also improve “awareness” by exposing users to products they may want later.

The Book: Recommendations with Generative Models

What the book covers

Deldjoo’s upcoming book aims to answer:

  • What are generative models?
  • What are the major categories of generative models?
  • How should they be evaluated?
  • What social and ethical risks do they introduce?

Main structure

The book organizes generative recommendation approaches by modality:

  • ID-based / collaborative-style signals
  • Text-based / NLP models
  • Multimodal foundation models

Evaluation is broader now

The book emphasizes that evaluation is no longer just about accuracy:

  • Ranking quality still matters
  • But so do:
    • Hallucination
    • Latency
    • Safety
    • Robustness
    • Other system-level metrics
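
To make the contrast concrete, here is a sketch that scores one recommendation list on both ranking quality and hallucination. NDCG@k is used as a familiar ranking metric chosen for this example, not one named in the episode, and the hallucination rate is defined here, as an assumption, as the share of recommended items that do not exist in the catalog.

```python
from math import log2
from typing import Dict, List, Set


def ndcg_at_k(ranked: List[str], relevance: Dict[str, float], k: int) -> float:
    """Normalized discounted cumulative gain over the top-k positions."""
    dcg = sum(
        relevance.get(item, 0.0) / log2(pos + 2)
        for pos, item in enumerate(ranked[:k])
    )
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(rel / log2(pos + 2) for pos, rel in enumerate(ideal))
    return dcg / idcg if idcg else 0.0


def hallucination_rate(ranked: List[str], catalog: Set[str]) -> float:
    """Share of recommended items that are not in the catalog."""
    if not ranked:
        return 0.0
    return sum(item not in catalog for item in ranked) / len(ranked)


# Hypothetical output of a generative recommender; "ghost" is invented.
ranked = ["a", "ghost", "b"]
relevance = {"a": 3.0, "b": 2.0}
catalog = {"a", "b", "c"}

print(round(ndcg_at_k(ranked, relevance, k=3), 3))
print(round(hallucination_rate(ranked, catalog), 3))
```

The invented item lowers both numbers here, but the two metrics can diverge: a list can rank real items poorly with no hallucination, which is why they are reported separately.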

System composition matters

Many modern recommender pipelines are hybrid, such as:

  • Retrieval-augmented generation systems
  • Recommendation modules plus LLM reasoning
  • Tool-using agents with external data access

That makes evaluation more difficult, since you may need both:

  • End-to-end evaluation
  • Module-wise evaluation

Mitigating Risk and Building Better Systems

Practical ways to improve reliability

  • Prompt-level techniques: few-shot examples, in-context guidance
  • Fine-tuning / alignment: train models toward trust and safety goals
  • Safety layers: stronger guardrails and refusal behavior
  • Task-specific controls: especially important in high-stakes domains
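
A safety layer can be sketched as a thin wrapper around whatever the model proposes. This is a deliberately simple illustration and not a production guardrail: the keyword blocklist, the refusal message, and the `guarded_recommend` function are hypothetical, and real systems typically rely on trained classifiers and policy checks instead of exact word matching.

```python
from typing import List, Set, Union

SENSITIVE_TERMS = {"dosage", "medication", "diagnosis"}  # hypothetical list
REFUSAL = "I can't advise on that. Please consult a qualified professional."


def guarded_recommend(query: str, candidates: List[str],
                      catalog: Set[str]) -> Union[str, List[str]]:
    """Refuse sensitive queries; otherwise keep only items in the catalog."""
    # Whitespace tokenization is a simplification: punctuation attached to
    # a word (for example "dosage?") would slip past this check.
    if SENSITIVE_TERMS & set(query.lower().split()):
        return REFUSAL
    # Grounding step: drop any candidate the model may have invented.
    return [item for item in candidates if item in catalog]


catalog = {"trail_shoes", "yoga_mat"}
print(guarded_recommend("running gear", ["trail_shoes", "ghost_item"], catalog))
print(guarded_recommend("what dosage should I take", ["yoga_mat"], catalog))
```

The first call drops the invented `ghost_item`, and the second returns the refusal message without consulting the candidates at all.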

Key idea

Once you know the desired behavior, you can align the model more closely to it, but that usually requires additional training and ongoing safety work.

What’s Next for Yashar Deldjoo

  • Continuing work on the trustworthiness of recommender systems
  • Extending the generative recommender systems book
  • Writing an educational book on generative AI and language agents
  • Developing practical materials, including Python exercises, for students and practitioners

Notable Themes

The future is hybrid

The episode repeatedly reinforces that the future of recommender systems is likely not “LLMs instead of recommenders,” but rather:

  • LLMs plus recommenders
  • Agents plus collaborative filtering
  • Tool use plus memory plus ranking

Autonomy is increasing

Deldjoo frames the field as moving from:

  • passive systems
  • to interactive systems
  • to autonomous agents
  • and eventually toward more general agentic behavior

Memory is a major differentiator

The ability to remember user preferences and past interactions is one of the most promising and human-like aspects of agentic recommenders.

Bottom Line

This episode presents agentic AI as the next major evolution in recommender systems. Classic ranking models still matter, but LLM-powered agents can add conversation, memory, tool use, and multi-step reasoning. The biggest challenges ahead are not just performance, but trust, safety, hallucination control, and evaluation.