Summary of The Future is Agentic in Recommender Systems Podcast Episode by Data Skeptic

Overview of The Future is Agentic in Recommender Systems

This episode of Data Skeptic explores how large language models and agents are reshaping recommender systems. Host Kyle Polich speaks with recommender systems researcher Yashar Deldjoo about the shift from classic ranking-based recommenders, like collaborative filtering, toward agentic systems that can converse, reason, use tools, remember context, and complete multi-step tasks. The discussion also previews Deldjoo’s upcoming book, Recommendations with Generative Models, and covers the growing importance of trustworthiness, safety, and evaluation in this new era.

Key Takeaways

Recommender systems are moving beyond ranking

Traditional recommenders mainly answer: “What items should be ranked highest for this user?”
Agentic recommenders expand the goal to: “What task should be done for this user?”
This shift enables more complex, constraint-aware assistance, such as planning travel, shopping, or finding restaurants based on many preferences at once.

LLMs bring two major advantages

Conversational interaction: Users can express preferences naturally, in multiple turns.
Broader knowledge and augmentation: LLMs can pull in external knowledge beyond the training data of a classic recommender system.

Collaborative filtering is not obsolete

Deldjoo argues classic methods still provide a strong, reliable baseline.
The future is likely hybrid: collaborative filtering plus LLMs/agents, rather than a full replacement.
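
As a rough illustration of the hybrid idea, the sketch below lets an item-based collaborative filter propose candidates and then reorders them against a free-text request. The interaction data, the item descriptions, and the keyword-overlap `rerank` function are all made up for this example; `rerank` only stands in for the LLM step and is not a method described in the episode.

```python
from math import sqrt

# Toy implicit feedback (hypothetical): user -> set of liked items.
INTERACTIONS = {
    "ana": {"tent", "stove", "boots"},
    "ben": {"tent", "stove", "lantern"},
    "cy": {"boots", "lantern", "novel"},
}

# Hypothetical item descriptions used by the reranking step.
ITEM_TEXT = {
    "lantern": {"light", "camping"},
    "novel": {"reading", "fiction"},
}

def item_similarity(a, b):
    """Cosine similarity between two items over binary user vectors."""
    users_a = {u for u, items in INTERACTIONS.items() if a in items}
    users_b = {u for u, items in INTERACTIONS.items() if b in items}
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / sqrt(len(users_a) * len(users_b))

def cf_candidates(user, k=3):
    """Collaborative-filtering baseline: score unseen items by their
    summed similarity to the items the user already liked."""
    seen = INTERACTIONS[user]
    catalog = set().union(*INTERACTIONS.values())
    scores = {
        item: sum(item_similarity(item, liked) for liked in seen)
        for item in catalog - seen
    }
    # Sort by score (descending), then by name so ties are deterministic.
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]

def rerank(candidates, request_keywords, item_text):
    """Stand-in for an LLM reranker: stable sort by keyword overlap
    between the user's request and each item's description."""
    def overlap(item):
        return len(request_keywords & item_text.get(item, set()))
    return sorted(candidates, key=lambda i: -overlap(i))

candidates = cf_candidates("ana")
print(candidates)
print(rerank(candidates, {"reading"}, ITEM_TEXT))
```

The collaborative filter still decides which items are plausible at all; the second stage only changes their order in light of what the user just asked for.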

Agentic systems can do more than recommend

They can help with planning, explanation, data gathering, simulation, and evaluation.
In some domains, they may even generate candidate options that do not exist yet, supporting creativity and discovery.

Trustworthiness and Risk in Recommender Systems

Core trustworthiness dimensions

Deldjoo frames trustworthy AI in recommender systems around several dimensions:

Generalizability
Robustness
Privacy
Explainability
Fairness and bias

New risks introduced by generative models

Hallucination: The system may invent items, metadata, or facts.
Context drift: The model may lose track of the user’s original goal during a conversation.
Stereotype amplification: Because LLMs are trained on broad internet data, they can reproduce biased patterns.
Safety failures: In sensitive domains like medicine, persuasive but incorrect suggestions can be dangerous.
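
One simple guard against the hallucination risk is to check every generated item against the real catalog before showing it. The following is a minimal sketch under that assumption; the catalog contents and the function names are hypothetical.

```python
# Hypothetical catalog: item id -> display name.
CATALOG = {"sku-1": "Trail Runner 2", "sku-2": "Summit Jacket"}

def split_grounded(recommended_ids, catalog):
    """Separate ids that exist in the catalog from invented ones."""
    grounded = [i for i in recommended_ids if i in catalog]
    hallucinated = [i for i in recommended_ids if i not in catalog]
    return grounded, hallucinated

def hallucination_rate(recommended_ids, catalog):
    """Fraction of recommended ids that do not exist in the catalog."""
    if not recommended_ids:
        return 0.0
    _, hallucinated = split_grounded(recommended_ids, catalog)
    return len(hallucinated) / len(recommended_ids)

# Pretend these ids came back from a generative recommender.
model_output = ["sku-1", "sku-9", "sku-2", "sku-7"]
grounded, invented = split_grounded(model_output, CATALOG)
print(grounded, invented, hallucination_rate(model_output, CATALOG))
```

A check like this catches invented items only. Invented metadata or facts about real items would need a separate comparison against the catalog record.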

Adversarial concerns still matter

Attackers may try to manipulate recommendation outcomes for commercial gain.
Recommender systems must remain resilient to both targeted and untargeted attacks.
Robustness is easier to define and measure than fairness, which is often subjective and multi-stakeholder.
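
To make the point about measurability concrete, here is one way robustness to a targeted attack could be quantified: count how many rank positions a target item gains after fake profiles are injected. The popularity-based ranker and the profile data are assumptions chosen for brevity, not a setup taken from the episode.

```python
from collections import Counter

def popularity_ranking(profiles):
    """Rank items by how many profiles contain them (ties broken by name)."""
    counts = Counter(item for profile in profiles for item in profile)
    return sorted(counts, key=lambda i: (-counts[i], i))

def rank_shift(profiles, fake_profiles, target):
    """Positions gained by `target` after fake profiles are injected.
    A positive value means the attack pushed the item up the list."""
    before = popularity_ranking(profiles).index(target)
    after = popularity_ranking(profiles + fake_profiles).index(target)
    return before - after

# Hypothetical genuine user profiles.
genuine = [{"a", "b"}, {"a", "c"}, {"a", "b"}, {"d"}]
# Hypothetical attack: three fake profiles that only contain item "d".
attack = [{"d"}, {"d"}, {"d"}]

print(popularity_ranking(genuine))
print(rank_shift(genuine, attack, "d"))
```

A robust recommender would keep this shift small for a given attack budget, which is a single number that can be tracked over time. Fairness rarely reduces to one such number because different stakeholders weigh outcomes differently.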

The Agentic Recommender System Framework

Three broad categories of agentic use

Deldjoo describes three ways agents fit into recommender systems:

Agent as Recommender
- The agent itself performs the recommendation.
- This is the most direct replacement for a classic recommender.
Agentic Augmentation
- The agent helps the recommender system, but does not replace it.
- Examples include data augmentation, tool use, and support for cold-start scenarios.
Simulation and Evaluation
- Agents simulate users or system interactions to test behavior before real deployment.
- This can reduce reliance on costly online user studies.
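
The third category, simulation and evaluation, can be sketched with a rule-based stand-in for a simulated user. In an agentic setup an LLM would play the user; here each `SimulatedUser` simply holds hidden preference tags. All names and data below are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SimulatedUser:
    name: str
    liked_tags: frozenset  # hidden preferences the recommender cannot see

    def accepts(self, item_tags):
        """The simulated user accepts an item that shares any liked tag."""
        return bool(self.liked_tags & item_tags)

# Hypothetical catalog: item -> descriptive tags.
ITEMS = {
    "thriller_film": frozenset({"thriller"}),
    "nature_doc": frozenset({"documentary", "nature"}),
    "romcom": frozenset({"comedy", "romance"}),
}

USERS = [
    SimulatedUser("u1", frozenset({"thriller"})),
    SimulatedUser("u2", frozenset({"nature", "comedy"})),
]

def same_list_for_everyone(user_name):
    """Non-personalized recommender under test."""
    return ["thriller_film", "nature_doc", "romcom"]

def acceptance_rate(recommender, users, items, k=2):
    """Share of shown items (top-k per user) that simulated users accept."""
    shown = accepted = 0
    for user in users:
        for item in recommender(user.name)[:k]:
            shown += 1
            accepted += user.accepts(items[item])
    return accepted / shown if shown else 0.0

print(acceptance_rate(same_list_for_everyone, USERS, ITEMS))
```

Swapping in a different recommender function and rerunning the loop gives a quick offline comparison before any real users are involved.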

A formal view of agents

The paper “The Future is Agentic” frames an agent in terms of five core components:

An underlying LLM
An input space: what the agent can observe
An output space: rank lists, text, explanations, multimodal outputs, etc.
Tools/functions: external capabilities the agent can call
Memory: what the agent stores and retrieves
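
The components listed above map naturally onto a small data structure. The sketch below is one possible reading, not the paper's formalism: the `Agent` class, the `CALL <tool> <argument>` convention, and the stubbed `stub_llm` and `search` functions are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Agent:
    llm: Callable[[str], str]                   # underlying LLM (stubbed below)
    tools: Dict[str, Callable[[str], str]]      # external functions it may call
    memory: List[str] = field(default_factory=list)  # stored observations

    def act(self, observation: str) -> str:
        """Input space: a text observation. Output space: a text reply."""
        context = " | ".join(self.memory + [observation])
        decision = self.llm(context)
        if decision.startswith("CALL "):
            # Assumed convention: "CALL <tool_name> <argument>".
            _, tool_name, argument = decision.split(" ", 2)
            output = f"{tool_name} -> {self.tools[tool_name](argument)}"
        else:
            output = decision
        self.memory.append(observation)
        return output

def stub_llm(context: str) -> str:
    """Placeholder for a real model: requests a search once ramen is mentioned."""
    if "ramen" in context:
        return "CALL search ramen"
    return "What are you in the mood for?"

def search(query: str) -> str:
    """Placeholder tool; a real one would query a catalog or an API."""
    return f"top result for {query}"

agent = Agent(llm=stub_llm, tools={"search": search})
print(agent.act("hello"))
print(agent.act("ramen near me"))
```

Because the context is built from memory plus the new observation, earlier turns can influence later decisions, which is the property the memory component is meant to provide.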

Types of memory

Working / short-term memory: remembers the current conversation
Episodic / long-term memory: recalls specific past interactions
Semantic memory: stores accumulated user preferences and facts
Procedural memory: remembers repeated user workflows and common tasks
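
The four memory types could be kept in separate stores with different lifetimes. This is a minimal sketch of that idea; the `AgentMemory` class, its method names, and the three-turn working window are hypothetical choices, not something specified in the episode.

```python
from collections import Counter, deque
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    # Working memory: only the last few turns of the current conversation.
    working: deque = field(default_factory=lambda: deque(maxlen=3))
    # Episodic memory: summaries of specific past sessions.
    episodic: list = field(default_factory=list)
    # Semantic memory: accumulated facts and preferences about the user.
    semantic: dict = field(default_factory=dict)
    # Procedural memory: how often each workflow has been carried out.
    procedural: Counter = field(default_factory=Counter)

    def observe_turn(self, text):
        self.working.append(text)  # the oldest turn drops out when full

    def learn_fact(self, key, value):
        self.semantic[key] = value

    def end_session(self, summary, workflow):
        """Persist the session, count its workflow, reset the short-term buffer."""
        self.episodic.append(summary)
        self.procedural[workflow] += 1
        self.working.clear()

    def usual_workflow(self):
        """Most frequent workflow so far, or None if nothing is recorded."""
        if not self.procedural:
            return None
        return self.procedural.most_common(1)[0][0]

memory = AgentMemory()
for turn in ["hi", "table for two", "tonight", "vegetarian please"]:
    memory.observe_turn(turn)
memory.learn_fact("diet", "vegetarian")
print(list(memory.working))
memory.end_session("booked a vegetarian dinner", "book_restaurant")
print(memory.usual_workflow(), memory.semantic)
```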

Why Agents Feel Like the Future

From ranking to action

Classic systems say: “Here is the best list.”
Agentic systems say: “I can help you do the task.”

Example: travel planning

An agent can combine:

Budget
Family preferences
Eco-friendliness
Child-friendliness
Distance
Scheduling constraints

This is difficult for a traditional recommender alone, but natural for an agent that can gather and combine information from multiple tools.
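
A small sketch of what combining these constraints might look like once the information has been gathered: hard constraints (budget, distance, child-friendliness) filter the options, and soft preferences (eco-friendliness, price) order what remains. The trip data, field names, and scoring weights are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Trip:
    name: str
    price: int
    hours_away: float
    eco_score: float       # 0..1, higher is greener (hypothetical scale)
    child_friendly: bool

# Hypothetical options an agent might have collected from several tools.
TRIPS = [
    Trip("lake cabin", 900, 2.0, 0.8, True),
    Trip("city break", 1400, 1.0, 0.4, True),
    Trip("mountain hut", 700, 5.0, 0.9, False),
    Trip("beach resort", 1100, 3.0, 0.5, True),
]

def plan(trips, budget, max_hours, need_child_friendly, eco_weight=0.5):
    """Filter by hard constraints, then rank by a weighted soft score."""
    feasible = [
        t for t in trips
        if t.price <= budget
        and t.hours_away <= max_hours
        and (t.child_friendly or not need_child_friendly)
    ]

    def score(trip):
        price_score = 1 - trip.price / budget  # cheaper is better
        return eco_weight * trip.eco_score + (1 - eco_weight) * price_score

    return sorted(feasible, key=score, reverse=True)

for trip in plan(TRIPS, budget=1200, max_hours=4, need_child_friendly=True):
    print(trip.name)
```

Scheduling constraints are left out here; they would be one more hard filter of the same kind.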

Example: fashion recommendation

LLM-powered systems can help users visualize outfit combinations and generate ideas.
They can increase engagement and product discovery.
They may also improve “awareness” by exposing users to products they may want later.

The Book: Recommendations with Generative Models

What the book covers

Deldjoo’s upcoming book aims to answer:

What are generative models?
What are the major categories of generative models?
How should they be evaluated?
What social and ethical risks do they introduce?

Main structure

The book organizes generative recommendation approaches by modality:

ID-based / collaborative-style signals
Text-based / NLP models
Multimodal foundation models

Evaluation is broader now

The book emphasizes that evaluation is no longer just about accuracy:

Ranking quality still matters
But so do:
- Hallucination
- Latency
- Safety
- Robustness
- Other system-level metrics
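
To illustrate reporting several of these metrics side by side, the sketch below computes NDCG@k for ranking quality, a catalog-based hallucination rate, and wall-clock latency for a single request. NDCG is a standard ranking metric; the `evaluate` wrapper, the toy recommender, and the data are assumptions for this example.

```python
import math
import time

def dcg(relevances):
    """Discounted cumulative gain of a list of relevance scores."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg_at_k(ranked_items, relevance, k):
    """NDCG@k: DCG of the ranking divided by the DCG of an ideal ranking."""
    gains = [relevance.get(item, 0) for item in ranked_items[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

def evaluate(recommender, user, relevance, catalog, k=3):
    """Report ranking quality, hallucination rate, and latency together."""
    start = time.perf_counter()
    ranked = recommender(user)
    latency = time.perf_counter() - start
    top = ranked[:k]
    invented = sum(item not in catalog for item in top)
    return {
        "ndcg": ndcg_at_k(ranked, relevance, k),
        "hallucination_rate": invented / len(top) if top else 0.0,
        "latency_s": latency,
    }

def toy_recommender(user):
    """Hypothetical generative recommender that invents one item ("ghost")."""
    return ["a", "ghost", "b"]

report = evaluate(toy_recommender, "u1", {"a": 1, "b": 1}, {"a", "b", "c"})
print(report)
```

Here the ranking looks reasonable by NDCG alone, while the hallucination rate shows that a third of the displayed items do not exist, which is why a single accuracy number is no longer enough.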

System composition matters

Many modern recommender pipelines are hybrid, such as:

Retrieval-augmented generation systems
Recommendation modules plus LLM reasoning
Tool-using agents with external data access

That makes evaluation more difficult, since you may need both:

End-to-end evaluation
Module-wise evaluation
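
The difference between the two can be shown on a toy two-stage pipeline. In this sketch a keyword retriever feeds a deliberately naive selection step that stands in for LLM reasoning; measuring both stages shows where a failure comes from. The index, the test cases, and both functions are hypothetical.

```python
# Hypothetical document index: id -> description.
INDEX = {
    "d1": "vegan ramen downtown",
    "d2": "vegan burger uptown",
    "d3": "steak house downtown",
}

def retrieve(query, index, k=2):
    """Module 1: rank documents by shared words with the query (ties by id)."""
    words = set(query.split())
    ranked = sorted(
        index, key=lambda d: (-len(words & set(index[d].split())), d)
    )
    return ranked[:k]

def choose(candidates):
    """Module 2: naive stand-in for LLM reasoning, always picks the first."""
    return candidates[0] if candidates else None

def evaluate_pipeline(cases, index):
    """Return module-wise retrieval recall and end-to-end accuracy."""
    retrieval_hits = end_to_end_hits = 0
    for case in cases:
        candidates = retrieve(case["query"], index)
        retrieval_hits += case["target"] in candidates
        end_to_end_hits += choose(candidates) == case["target"]
    n = len(cases)
    return {
        "retrieval_recall": retrieval_hits / n,
        "end_to_end_accuracy": end_to_end_hits / n,
    }

CASES = [
    {"query": "vegan ramen", "target": "d1"},
    {"query": "downtown dinner", "target": "d3"},
]

print(evaluate_pipeline(CASES, INDEX))
```

In this toy run retrieval finds the target every time, yet the end-to-end score is lower, which points at the selection step rather than the retriever. An end-to-end number alone would not reveal that.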

Mitigating Risk and Building Better Systems

Practical ways to improve reliability

Prompt-level techniques: few-shot examples, in-context guidance
Fine-tuning / alignment: train models toward trust and safety goals
Safety layers: stronger guardrails and refusal behavior
Task-specific controls: especially important in high-stakes domains
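
As an example of the prompt-level techniques, the sketch below assembles a few-shot prompt that also restricts the model to a known catalog. The wording of the instructions, the example pairs, and the `build_prompt` helper are assumptions; no particular LLM API is called.

```python
# Hypothetical few-shot examples: (user request, desired assistant answer).
FEW_SHOT = [
    ("something to read on a long flight",
     "Recommend: paperback novel. Reason: light and needs no charging."),
    ("a gift for a new camper",
     "Recommend: headlamp. Reason: useful on every trip."),
]

def build_prompt(user_request, examples, catalog_names):
    """Assemble instructions, few-shot examples, and the new request."""
    lines = [
        "You are a shopping assistant.",
        "Only recommend items from this catalog: "
        + ", ".join(catalog_names) + ".",
        "If nothing fits, say so instead of inventing an item.",
        "",
    ]
    for request, answer in examples:
        lines += [f"User: {request}", f"Assistant: {answer}", ""]
    # The prompt ends with an open assistant turn for the model to complete.
    lines += [f"User: {user_request}", "Assistant:"]
    return "\n".join(lines)

prompt = build_prompt(
    "a gift for a hiker",
    FEW_SHOT,
    ["paperback novel", "headlamp", "trail map"],
)
print(prompt)
```

Prompt-level guidance like this is cheap to change but offers no guarantees, which is why it is usually combined with the other measures in the list.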

Key idea

Once you know the desired behavior, you can align the model more closely to it, but that usually requires additional training and ongoing safety work.

What’s Next for Yashar Deldjoo

Continuing work on the trustworthiness of recommender systems
Extending the generative recommender systems book
Writing an educational book on generative AI and language agents
Developing practical materials, including Python exercises, for students and practitioners

Notable Themes

The future is hybrid

The episode repeatedly reinforces that the future of recommender systems is likely not “LLMs instead of recommenders,” but rather:

LLMs plus recommenders
Agents plus collaborative filtering
Tool use plus memory plus ranking

Autonomy is increasing

Deldjoo frames the field as moving from passive systems to interactive systems to autonomous agents, and eventually toward more general agentic behavior.

Memory is a major differentiator

The ability to remember user preferences and past interactions is one of the most promising and human-like aspects of agentic recommenders.

Bottom Line

This episode presents agentic AI as the next major evolution in recommender systems. Classic ranking models still matter, but LLM-powered agents can add conversation, memory, tool use, and multi-step reasoning. The biggest challenges ahead are not just performance, but trust, safety, hallucination control, and evaluation.


Data Skeptic by Kyle Polich