Overview of Data Skeptic — "Video Recommendations in Industry" (host: Kyle Polich)
This episode of Data Skeptic features Corey Zuckman (Silence No Good; content curator currently at Sling TV) discussing the intersection of human curation and machine-driven recommendation systems in music, film, TV, and podcast discovery. The conversation covers curator workflows, where humans add value versus where algorithms excel, the notion of “algatorial” (algorithm + editorial), practical product and metrics trade-offs, cold-start and homepage design, and how generative AI and exploding content volume are changing discovery.
Guest background
- Corey runs the long-running music blog Silence No Good (started 2009) and curates across music, festivals, podcasting, film and television.
- Professionally, he works in content curation at Sling TV and previously at other streaming/audio companies (including Apple, TuneIn).
- He blends hands-on tastemaker curation with data-informed, product-side editorial work.
What a curator does (workflow and responsibilities)
- Described via the “CODE” pattern: Capture → Organize → Distill → Express.
  - Capture: surface candidate items (music, shows, podcasts).
  - Organize/filter: remove irrelevant/paid programming, fix metadata.
  - Distill/contextualize: craft titles, descriptions, artwork to explain why to consume.
  - Express/analyze: publish the collection, measure resonance, and iterate on daily/weekly cadences (the full loop is sketched as a pipeline after this list).
- Key curator strengths: contextual knowledge, cultural relevance, trend awareness, and the ability to craft narratives or thematic groupings that algorithms may miss.
- Curators often focus on non-personalized editorial collections; algorithms then personalize ranking from those curated pools.
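As a concrete illustration, here is a minimal Python sketch of that Capture → Organize → Distill → Express loop treated as a data pipeline. The `Item` schema, field names, and helper functions are hypothetical assumptions, not anything described in the episode:

```python
from dataclasses import dataclass

@dataclass
class Item:
    title: str
    genre: str
    is_paid_programming: bool = False
    blurb: str = ""   # editorial context added at the Distill step
    plays: int = 0    # engagement signal read back when analyzing resonance

def capture(feeds: list[list[Item]]) -> list[Item]:
    """Capture: surface candidate items from every source feed."""
    return [item for feed in feeds for item in feed]

def organize(items: list[Item]) -> list[Item]:
    """Organize/filter: drop paid programming and patch obvious metadata gaps."""
    kept = []
    for item in items:
        if item.is_paid_programming:
            continue
        item.genre = item.genre.strip().lower() or "unknown"
        kept.append(item)
    return kept

def distill(items: list[Item], theme: str) -> list[Item]:
    """Distill: attach the editorial 'why consume this' context for a themed collection."""
    for item in items:
        item.blurb = f"Part of our '{theme}' picks: {item.title}"
    return items

def express_and_analyze(items: list[Item]) -> list[Item]:
    """Express: publish, then rank by resonance so the next pass can iterate."""
    return sorted(items, key=lambda i: i.plays, reverse=True)
```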
Algatorial: mixing algorithms + editorial
- “Algatorial” = algorithm + editorial. The sweet spot is human + machine collaboration:
  - Humans curate and clean datasets, add context, set priorities or guardrails.
  - Algorithms personalize ordering per user, scale sorting across many users, and handle repetitive organization.
- Humans can propose features or weightings; ML teams then decide how to integrate them, ideally with a human-in-the-loop feedback loop (see the sketch after this list).
- Use cases: large seed lists (e.g., “songs to sing in the car”) created by editors and then personalized by ML.
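A minimal sketch of that division of labor: editors build the seed pool, and a simple model reranks it per user. The tag-weight scoring scheme and all names here are illustrative assumptions, not the method used at any company mentioned:

```python
# Editor-curated seed pool for a themed collection (e.g., car sing-alongs).
seed_pool = [
    {"title": "Song A", "tags": {"pop": 1.0, "upbeat": 0.8}},
    {"title": "Song B", "tags": {"rock": 1.0, "driving": 0.9}},
    {"title": "Song C", "tags": {"pop": 0.6, "driving": 0.7}},
]

def personalize(pool: list[dict], user_affinity: dict[str, float]) -> list[dict]:
    """Rank the editorially curated pool by a per-user tag-affinity score."""
    def score(item: dict) -> float:
        return sum(user_affinity.get(tag, 0.0) * w for tag, w in item["tags"].items())
    return sorted(pool, key=score, reverse=True)

# Two users see the same editorial pool in different orders.
print(personalize(seed_pool, {"rock": 0.9, "driving": 0.7}))  # Song B first
print(personalize(seed_pool, {"pop": 0.9, "upbeat": 0.5}))    # Song A first
```

The editorial layer controls what is eligible; the algorithmic layer only controls the order, which is the core of the algatorial split.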
Discovery trade-offs: explore vs exploit, surprise & delight
- Discovery is a push/pull: keep users in their comfort zones (exploit) while nudging them into novel but relevant areas (explore); a toy bandit-style sketch follows this list.
- “Surprise” is the hard-to-measure component; “delight” can often be proxied by engagement but true delight is emotional and costly to measure directly.
- Implicit signals (watch %/time) are useful proxies; explicit signals are valuable but have friction (users often provide negative feedback more readily than positive).
- Short-term metrics (clicks, watch hours) are proxies; long-term metrics like retention and whether users consistently discover new, meaningful content matter more.
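A toy epsilon-greedy sketch of that explore/exploit balance, assuming predicted watch percentage as the implicit-signal proxy. The 10% explore rate and all item data are made up for illustration:

```python
import random

def next_recommendation(comfort_zone: list[dict],
                        novel_candidates: list[dict],
                        epsilon: float = 0.1) -> dict:
    """With probability epsilon, explore a novel item; otherwise exploit."""
    if novel_candidates and random.random() < epsilon:
        return random.choice(novel_candidates)  # explore: the surprise/delight bet
    # exploit: highest predicted engagement from the user's comfort zone
    return max(comfort_zone, key=lambda i: i["predicted_watch_pct"])

comfort_zone = [
    {"title": "Familiar Show", "predicted_watch_pct": 0.82},
    {"title": "Usual Genre Pick", "predicted_watch_pct": 0.74},
]
novel_candidates = [{"title": "Adjacent-Genre Gem", "predicted_watch_pct": 0.40}]

print(next_recommendation(comfort_zone, novel_candidates))
```

A production system would use a full bandit with learned reward estimates; the point here is only that the explore rate is an explicit dial on the push/pull trade-off.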
Cold-start and the value of curation
- Cold-start (for new users or new items) is a continuous warm-up problem: limited data makes personalization and novel-but-relevant suggestions difficult.
- Editorial insight helps bootstrap both item discovery and early exposure, especially when external signals (IMDb buzz, industry trends) suggest an item will be popular (a prior-blending sketch follows this list).
- Popularity biases are useful signals but can create positive feedback loops; editorial oversight can help break or balance those loops.
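One way to make both points concrete is Bayesian-style smoothing, where an editorial prior acts like pseudo-observations: it bootstraps cold items, then fades as real engagement accumulates, which also damps runaway popularity loops. This formulation is an assumption for illustration, not something specified in the episode:

```python
def blended_score(editorial_prior: float, engaged: int, impressions: int,
                  prior_weight: int = 100) -> float:
    """Blend an editor's prior estimate with the observed engagement rate.
    prior_weight behaves like pseudo-impressions backing the editor's call."""
    return (prior_weight * editorial_prior + engaged) / (prior_weight + impressions)

# Day 1: no behavioral data, so the editorially hyped item can still win a slot.
print(blended_score(editorial_prior=0.70, engaged=0, impressions=0))         # 0.70
# Later: 10,000 impressions of real behavior dominate the editorial prior.
print(blended_score(editorial_prior=0.70, engaged=3200, impressions=10000))  # ~0.32
```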
Homepage experience & UX considerations
- The homepage (especially the above-the-fold hero area) drives most engagement; it must reduce friction while balancing relevance and discovery.
- Goal: make discovery immediate with minimal friction (some users want one-click perfect picks, others want to explore).
- Corey advocates for more immersive, narrative-driven presentation (beyond rows of tiles) and better connective experiences across content to avoid scatterbrained browsing.
Scale, content deluge, and generative AI
- Increased creator access and generative AI produce a massive quantity of content. Curators need:
  - Quantity (to support personalization and diversity)
  - Diversity
  - Quality
- Generative AI will produce both a lot of low-quality content and some highly personalized gems; the resulting “flood” makes curation and guardrails more important (amplification vs. moderation).
- Editorial/discovery teams must decide what to amplify, where to open the platform, and how to target cohorts (a toy triage gate follows this list).
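A toy triage gate for that amplify/moderate decision. The signal names and thresholds are invented for illustration:

```python
def triage(item: dict) -> str:
    """Route incoming content: apply guardrails first, then decide what to amplify."""
    if item["policy_risk"] > 0.8:
        return "moderate"   # guardrail: keep it off discovery surfaces
    if item["quality_score"] > 0.7 and item["cohort_fit"] > 0.5:
        return "amplify"    # promote on editorial surfaces for the target cohort
    return "hold"           # neither promote nor remove; let organic signals accrue

print(triage({"policy_risk": 0.1, "quality_score": 0.9, "cohort_fit": 0.8}))  # amplify
```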
Limits and what needs improving
- Metadata quality: incorrect or sparse metadata breaks recommendation signals (a minimal validation sketch follows this list).
- Metric quality: know what your proxies actually measure, and build systems that distinguish unique from repeat users and true satisfaction from passive consumption.
- Systems thinking: better cross-product narrative/continuity (connective tissue across recommendations) is desirable.
- Trust and privacy: ideal personalization requires more signals, but users must trust platforms not to misuse data.
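To make the metadata point concrete, here is a minimal validation pass over a hypothetical catalog record; fields this sparse or malformed are exactly what quietly breaks downstream recommendation signals:

```python
REQUIRED_FIELDS = ("title", "genre", "release_year", "description")

def metadata_issues(item: dict) -> list[str]:
    """Return a list of problems so editors can triage the worst records first."""
    issues = [f"missing {f}" for f in REQUIRED_FIELDS if not item.get(f)]
    year = item.get("release_year")
    if isinstance(year, int) and not 1900 <= year <= 2100:
        issues.append(f"implausible release_year: {year}")
    if len(item.get("description") or "") < 20:
        issues.append("description too short to support discovery")
    return issues

print(metadata_issues({"title": "Pilot", "genre": "", "release_year": 3024}))
# ['missing genre', 'missing description',
#  'implausible release_year: 3024', 'description too short to support discovery']
```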
Practical tactics & organizational advice
- Collaboration and empathy: get buy-in by understanding other teams’ valid perspectives (marketing, creators, product).
- Human-in-the-loop tooling: use LLMs and narrow-domain tools (Corey mentions NotebookLM) to synthesize research and assist curation while minimizing hallucination.
- A/B testing caveats: most metrics are proxies with noisy signals; choose experiments and evaluation windows carefully (short-term lift vs. long-term retention; a toy illustration follows this list).
- Start with cleaning and improving data/metadata — it has outsized impact on downstream recommendations.
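A toy illustration of the proxy-metric caveat, with invented numbers: a treatment can win the short-term click proxy while losing the long-term retention metric:

```python
def summarize(arm: dict) -> tuple[float, float]:
    """Compute a short-term proxy (CTR) and a long-term metric (retention)."""
    return arm["clicks"] / arm["impressions"], arm["retained"] / arm["users"]

control   = {"impressions": 100_000, "clicks": 4_000, "users": 10_000, "retained": 6_500}
treatment = {"impressions": 100_000, "clicks": 4_600, "users": 10_000, "retained": 6_100}

for name, arm in (("control", control), ("treatment", treatment)):
    ctr, retention = summarize(arm)
    print(f"{name}: CTR={ctr:.1%}, 30-day retention={retention:.1%}")
# treatment wins the short-term proxy (4.6% vs 4.0% CTR)
# but loses the long-term metric (61.0% vs 65.0% retention).
```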
Notable quotes
- “Curation is a form of creation.”
- “Algatorial” = the practical blend of algorithmic personalization and editorial curation.
- “Discovery is actually a type of friction — but a good type of friction.”
Resources & links mentioned
- Silence No Good — silencenogood.net / .com (Corey’s blog and social: Instagram/TikTok @SilenceNoGood)
- NotebookLM (used to synthesize papers safely)
- Newsletter recommended by Corey: “Top Information Retrieval Papers of the Week” (weekly research digest)
Note: the episode includes a sponsor ad for DeleteMe (service to remove your personal data from data brokers).
Key takeaways (for builders and product teams)
- Use editorial curation to provide context, clean metadata and seed high-quality candidate pools; use algorithms to personalize at scale.
- Treat discovery as a balance: optimize for short-term signals but measure long-term retention and meaningful discovery.
- Invest in metadata quality, metrics clarity, and tools that enable human-in-the-loop adjustments (including narrow LLM workflows).
- Expect generative AI to increase the quantity and variability of content; curation, guardrails, and cohort targeting become more important.
- Align cross-functional stakeholders early; show how editorial actions map to business and user outcomes to ship faster.
If you want to learn both practical curation techniques and product/ML trade-offs in recommendation systems, this episode gives hands-on industry perspective from someone working at the intersection of editorial and engineering.
