Overview of The Future is Agentic in Recommender Systems
This episode of Data Skeptic explores how large language models and agents are reshaping recommender systems. Host Kyle Polich speaks with recommender systems researcher Yashar Deldjoo about the shift from classic ranking-based recommenders, like collaborative filtering, toward agentic systems that can converse, reason, use tools, remember context, and complete multi-step tasks. The discussion also previews Deldjoo’s upcoming book, Recommendations with Generative Models, and covers the growing importance of trustworthiness, safety, and evaluation in this new era.
Key Takeaways
Recommender systems are moving beyond ranking
- Traditional recommenders mainly answer: “What items should be ranked highest for this user?”
- Agentic recommenders expand the goal to: “What task should be done for this user?”
- This shift enables more complex, constraint-aware assistance, such as planning travel, shopping, or finding restaurants based on many preferences at once.
LLMs bring two major advantages
- Conversational interaction: Users can express preferences naturally, in multiple turns.
- Broader knowledge and augmentation: LLMs can pull in external knowledge beyond the training data of a classic recommender system.
Collaborative filtering is not obsolete
- Deldjoo argues classic methods still provide a strong, reliable baseline.
- The future is likely hybrid: collaborative filtering plus LLMs/agents, rather than a full replacement.
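One way to picture the hybrid idea is a simple score blend. The sketch below is an illustration only, not a method described in the episode: `hybrid_rank`, the `alpha` weight, and `fake_llm_score` (a stand-in for a real LLM call) are all hypothetical names, and both scores are assumed to lie in [0, 1].

```python
from typing import Callable, Dict, List


def hybrid_rank(
    cf_scores: Dict[str, float],
    llm_score: Callable[[str], float],
    alpha: float = 0.7,
    k: int = 3,
) -> List[str]:
    """Blend collaborative-filtering scores with a second relevance signal.

    alpha weights the CF score; (1 - alpha) weights the LLM-derived score.
    """
    blended = {
        item: alpha * cf + (1.0 - alpha) * llm_score(item)
        for item, cf in cf_scores.items()
    }
    # Highest blended score first; ties broken by item id for determinism.
    return sorted(blended, key=lambda item: (-blended[item], item))[:k]


def fake_llm_score(item: str) -> float:
    """Hypothetical stand-in for an LLM judging fit to a stated preference."""
    return 1.0 if "vegan" in item else 0.0


cf_scores = {"steakhouse": 0.9, "vegan_bistro": 0.6, "vegan_cafe": 0.5, "diner": 0.4}
top = hybrid_rank(cf_scores, fake_llm_score, alpha=0.5, k=2)
print(top)
```

With `alpha=1.0` the function falls back to the pure collaborative-filtering ranking, which matches the point that the classic baseline stays in place and the LLM signal is layered on top.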
Agentic systems can do more than recommend
- They can help with planning, explanation, data gathering, simulation, and evaluation.
- In some domains, they may even generate candidate options that do not exist yet, supporting creativity and discovery.
Trustworthiness and Risk in Recommender Systems
Core trustworthiness dimensions
Deldjoo frames trustworthy AI in recommender systems around several dimensions:
- Generalizability
- Robustness
- Privacy
- Explainability
- Fairness and bias
New risks introduced by generative models
- Hallucination: The system may invent items, metadata, or facts.
- Context drift: The model may lose track of the user’s original goal during a conversation.
- Stereotype amplification: Because LLMs are trained on broad internet data, they can reproduce biased patterns.
- Safety failures: In sensitive domains like medicine, persuasive but incorrect suggestions can be dangerous.
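A common first guard against invented items is to check generated item IDs against the real catalog. The following is a minimal sketch under that assumption; `split_grounded`, the catalog contents, and the item IDs are hypothetical and not taken from the episode.

```python
from typing import Iterable, List, Set, Tuple


def split_grounded(
    generated: Iterable[str], catalog: Set[str]
) -> Tuple[List[str], List[str]]:
    """Separate generated item ids into those in the catalog and those not."""
    grounded: List[str] = []
    hallucinated: List[str] = []
    for item_id in generated:
        if item_id in catalog:
            grounded.append(item_id)
        else:
            hallucinated.append(item_id)
    return grounded, hallucinated


catalog = {"book_101", "book_202", "book_303"}
generated = ["book_202", "book_999", "book_101"]  # "book_999" does not exist

grounded, hallucinated = split_grounded(generated, catalog)
hallucination_rate = len(hallucinated) / len(generated)
print(grounded, hallucinated, round(hallucination_rate, 3))
```

A check like this only catches invented items. Invented metadata or facts about real items would need a separate comparison against catalog fields.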
Adversarial concerns still matter
- Attackers may try to manipulate recommendation outcomes for commercial gain.
- Recommender systems must remain resilient to both targeted and untargeted attacks.
- Robustness is easier to define and measure than fairness, which is often subjective and depends on the perspectives of multiple stakeholders.
The Agentic Recommender System Framework
Three broad categories of agentic use
Deldjoo describes three ways agents fit into recommender systems:
- Agent as Recommender
  - The agent itself performs the recommendation.
  - This is the most direct replacement of a classic recommender.
- Agentic Augmentation
  - The agent helps the recommender system, but does not replace it.
  - Examples include data augmentation, tool use, and support for cold-start scenarios.
- Simulation and Evaluation
  - Agents simulate users or system interactions to test behavior before real deployment.
  - This can reduce reliance on costly online user studies.
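The simulation category can be sketched as an offline loop over simulated users. In this illustration the simulated users are simple rule-based stand-ins (each accepts any item in a hidden liked set) rather than LLM-driven agents; `simulate`, `popularity_recommender`, and the user data are hypothetical.

```python
from typing import Callable, Dict, List, Set


def simulate(
    recommend: Callable[[str], List[str]],
    simulated_users: Dict[str, Set[str]],
) -> float:
    """Return the fraction of simulated users who accept at least one item."""
    accepted = 0
    for user_id, liked in simulated_users.items():
        # A simulated user "accepts" if any recommended item is in its liked set.
        if any(item in liked for item in recommend(user_id)):
            accepted += 1
    return accepted / len(simulated_users)


simulated_users = {
    "u1": {"jazz"},
    "u2": {"rock", "metal"},
    "u3": {"opera"},
    "u4": {"pop"},
}


def popularity_recommender(user_id: str) -> List[str]:
    """Hypothetical baseline that ignores the user and returns popular genres."""
    return ["pop", "rock"]


acceptance_rate = simulate(popularity_recommender, simulated_users)
print(acceptance_rate)
```

Swapping in a different `recommend` function lets candidate systems be compared on the same simulated population before any real users are involved.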
A formal view of agents
The paper The Future is Agentic frames agents in terms of five core components:
- An underlying LLM
- An input space: what the agent can observe
- An output space: ranked lists, text, explanations, multimodal outputs, etc.
- Tools/functions: external capabilities the agent can call
- Memory: what the agent stores and retrieves
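The components listed above can be sketched as a small data structure. This is an illustrative reading, not the paper's formal definition: the `Agent` class, its method names, and `stub_llm` (a stand-in for a real model call) are assumptions, and memory is reduced to a plain list of text observations.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Agent:
    llm: Callable[[str], str]  # underlying LLM: prompt in, text out
    tools: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    memory: List[str] = field(default_factory=list)

    def observe(self, observation: str) -> None:
        """Input space: here, plain-text observations stored in memory."""
        self.memory.append(observation)

    def call_tool(self, name: str, *args: Any) -> Any:
        """Tools/functions: look up an external capability by name and call it."""
        return self.tools[name](*args)

    def act(self, request: str) -> str:
        """Output space: here, a single text response from the LLM."""
        context = " | ".join(self.memory)
        return self.llm(f"context: {context}\nrequest: {request}")


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model; echoes the last prompt line."""
    return "response to " + prompt.splitlines()[-1]


agent = Agent(
    llm=stub_llm,
    tools={"search_catalog": lambda query: [f"{query}_item_1", f"{query}_item_2"]},
)
agent.observe("user prefers sci-fi")
candidates = agent.call_tool("search_catalog", "scifi")
reply = agent.act("recommend a book")
print(candidates, reply)
```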
Types of memory
- Working / short-term memory: remembers the current conversation
- Episodic / long-term memory: recalls specific past interactions
- Semantic memory: stores accumulated user preferences and facts
- Procedural memory: remembers repeated user workflows and common tasks
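To make the four memory types concrete, here is a minimal sketch of how they might be held in one structure. The `AgentMemory` class, its field names, and the choice of containers (a bounded deque, a list, a dict, a counter) are assumptions made for illustration, not a design described in the episode.

```python
from collections import Counter, deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List


@dataclass
class AgentMemory:
    # Working memory: recent turns of the current conversation (bounded).
    working: Deque[str] = field(default_factory=lambda: deque(maxlen=4))
    # Episodic memory: records of specific past sessions.
    episodic: List[dict] = field(default_factory=list)
    # Semantic memory: accumulated preferences and facts about the user.
    semantic: Dict[str, str] = field(default_factory=dict)
    # Procedural memory: how often each workflow has been carried out.
    procedural: Counter = field(default_factory=Counter)

    def add_turn(self, utterance: str) -> None:
        self.working.append(utterance)

    def learn_fact(self, key: str, value: str) -> None:
        self.semantic[key] = value

    def end_session(self, summary: str, workflow: str) -> None:
        """Archive the conversation as an episode and reset working memory."""
        self.episodic.append({"summary": summary, "turns": list(self.working)})
        self.procedural[workflow] += 1
        self.working.clear()


memory = AgentMemory()
memory.add_turn("I need a hotel in Lisbon")
memory.add_turn("Under 150 euros a night")
memory.learn_fact("budget_per_night", "150 EUR")
memory.end_session(summary="booked Lisbon hotel", workflow="hotel_booking")
print(memory.episodic, memory.semantic, memory.procedural)
```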
Why Agents Feel Like the Future
From ranking to action
- Classic systems say: “Here is the best list.”
- Agentic systems say: “I can help you do the task.”
Example: travel planning
An agent can combine:
- Budget
- Family preferences
- Eco-friendliness
- Child-friendliness
- Distance
- Scheduling constraints
This is difficult for a traditional recommender alone, but natural for an agent that can gather and combine information from multiple tools.
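The constraint-combining step can be sketched as a filter over candidate options. In this illustration the option list stands in for results an agent would gather from several tools; `TripOption`, `plan`, and all of the data are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TripOption:
    name: str
    price: float
    distance_km: float
    eco_friendly: bool
    child_friendly: bool


def plan(
    options: List[TripOption],
    max_price: float,
    max_distance_km: float,
    need_eco: bool,
    need_child: bool,
) -> List[TripOption]:
    """Keep options that satisfy every constraint, cheapest first."""
    feasible = [
        o
        for o in options
        if o.price <= max_price
        and o.distance_km <= max_distance_km
        and (o.eco_friendly or not need_eco)
        and (o.child_friendly or not need_child)
    ]
    return sorted(feasible, key=lambda o: (o.price, o.name))


options = [
    TripOption("lake_cabin", 800, 120, True, True),
    TripOption("city_hotel", 600, 40, False, True),
    TripOption("eco_lodge", 950, 300, True, True),
    TripOption("beach_resort", 1500, 200, True, False),
]

result = plan(options, max_price=1000, max_distance_km=250, need_eco=True, need_child=True)
print([o.name for o in result])
```

Hard filters are the simplest choice here. Scheduling constraints or soft preferences would likely need a scoring function instead of a yes/no test.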
Example: fashion recommendation
- LLM-powered systems can help users visualize outfit combinations and generate ideas.
- They can increase engagement and product discovery.
- They may also improve “awareness” by exposing users to products they may want later.
The Book: Recommendations with Generative Models
What the book covers
Deldjoo’s upcoming book aims to answer:
- What are generative models?
- What are the major categories of generative models?
- How should they be evaluated?
- What social and ethical risks do they introduce?
Main structure
The book organizes generative recommendation approaches by modality:
- ID-based / collaborative-style signals
- Text-based / NLP models
- Multimodal foundation models
Evaluation is broader now
The book emphasizes that evaluation is no longer just about accuracy:
- Ranking quality still matters
- But so do:
  - Hallucination
  - Latency
  - Safety
  - Robustness
  - Other system-level metrics
System composition matters
Many modern recommender pipelines are hybrid, such as:
- Retrieval-augmented generation systems
- Recommendation modules plus LLM reasoning
- Tool-using agents with external data access
That makes evaluation more difficult, since you may need both:
- End-to-end evaluation
- Module-wise evaluation
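The difference between the two evaluation views can be shown with one metric applied at two points in a pipeline. This is a sketch under assumptions: recall@k is chosen here as an example metric, and the retrieval output, final list, and relevant set are made-up data for a hypothetical retrieve-then-rerank system.

```python
from typing import List, Set


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant items that appear in the top-k of a ranked list."""
    if not relevant:
        return 0.0
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


relevant = {"a", "b", "c"}

# Module-wise: output of the retrieval stage (candidate pool of six items).
retrieved = ["a", "x", "b", "y", "c", "z"]
# End-to-end: final top-3 list shown to the user after re-ranking.
final = ["x", "a", "y"]

retrieval_recall = recall_at_k(retrieved, relevant, k=6)
end_to_end_recall = recall_at_k(final, relevant, k=3)
print(retrieval_recall, round(end_to_end_recall, 3))
```

In this made-up case retrieval finds every relevant item but the final list keeps only one, so the module-wise view points at the re-ranking stage as the place where quality is lost. The end-to-end number alone would not show that.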
Mitigating Risk and Building Better Systems
Practical ways to improve reliability
- Prompt-level techniques: few-shot examples, in-context guidance
- Fine-tuning / alignment: train models toward trust and safety goals
- Safety layers: stronger guardrails and refusal behavior
- Task-specific controls: especially important in high-stakes domains
Key idea
Once you know the desired behavior, you can align the model more closely to it, but that usually requires additional training and ongoing safety work.
What’s Next for Yashar Deldjoo
- Continuing work on the trustworthiness of recommender systems
- Extending the generative recommender systems book
- Writing an educational book on generative AI and language agents
- Developing practical materials, including Python exercises, for students and practitioners
Notable Themes
The future is hybrid
The episode repeatedly reinforces that the future of recommender systems is likely not “LLMs instead of recommenders,” but rather:
- LLMs plus recommenders
- Agents plus collaborative filtering
- Tool use plus memory plus ranking
Autonomy is increasing
Deldjoo frames the field as moving from:
- passive systems
- to interactive systems
- to autonomous agents
- and eventually toward more general agentic behavior
Memory is a major differentiator
The ability to remember user preferences and past interactions is one of the most promising and human-like aspects of agentic recommenders.
Bottom Line
This episode presents agentic AI as the next major evolution in recommender systems. Classic ranking models still matter, but LLM-powered agents can add conversation, memory, tool use, and multi-step reasoning. The biggest challenges ahead are not just performance, but trust, safety, hallucination control, and evaluation.
