Connecting the dots for accurate AI

by The Stack Overflow Podcast

31 min · May 12, 2026

Overview of Connecting the dots for accurate AI

In this live Stack Overflow Podcast conversation from HumanX, host Ryan Donovan speaks with Philip Rathle, CTO of Neo4j, about why enterprise AI agents need more than a model and a vector database. The core argument: for high-stakes use cases, AI systems need a knowledge and context layer built on connected data—often best represented as a graph—to improve accuracy, explainability, access control, and deterministic reasoning.

Why LLMs Alone Aren’t Enough

Philip explains that while LLMs can be useful, they have important limits:

  • They only know what was in training data up to a cutoff date.
  • They are stochastic and can be wrong without warning.
  • They don’t naturally handle privacy, regulation, or access control.
  • They are poorly suited for regulated or mission-critical decisions.

His point: for enterprise agents, intelligence should come from the model plus live, structured context about the real world.

The Role of Context and Knowledge

The conversation distinguishes between:

  • Data: raw inputs
  • Context: the subset of information relevant to a specific decision
  • Knowledge: connected, structured understanding that helps systems reason accurately

Philip argues that the best AI systems are built around a context layer that can draw from the full set of company knowledge, while respecting silos and permissions where needed.

Why RAG Helps, but Isn’t Enough

They discuss the rise of retrieval-augmented generation (RAG) as a first step beyond pure LLMs. RAG improves results by supplying external information, but Philip says it often falls short for enterprise needs because it lacks:

  • Strong explainability
  • Native access control
  • Deterministic reasoning
  • Rich relationship context

He also notes a common failure mode: stuffing more data into the prompt can make answers worse. The fix is usually not more context but better, more targeted context.

Graph RAG: Connecting Data Instead of Just Retrieving Text

A major theme is graph RAG, where retrieval is augmented not just with vectors, but with a knowledge graph.

What a graph adds

A graph can model:

  • Entities and relationships
  • Directionality
  • Hierarchies and networks
  • Multi-hop reasoning
  • Access rules and data lineage

This makes it much easier to pull back the exact connected context relevant to a question, rather than a pile of semantically similar text chunks.
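
To make "exact connected context" concrete, here is a minimal sketch of multi-hop retrieval over a toy graph. The entities, relationship names, the `RESTRICTED` access rule, and the `connected_context` function are all hypothetical and chosen for illustration; a real system would run this kind of traversal inside a graph database rather than over a Python list.

```python
from collections import deque

# Hypothetical toy graph: directed (source, relationship, target) triples.
EDGES = [
    ("Alice", "WORKS_ON", "Project Apollo"),
    ("Project Apollo", "USES", "Billing Service"),
    ("Billing Service", "OWNED_BY", "Payments Team"),
    ("Bob", "WORKS_ON", "Project Zeus"),
]

# Hypothetical access rule: nodes that a given role is not allowed to see.
RESTRICTED = {"analyst": {"Payments Team"}}


def connected_context(start, max_hops, role):
    """Breadth-first walk along outgoing edges, up to max_hops from start,
    skipping any edge whose target the role may not see."""
    blocked = RESTRICTED.get(role, set())
    facts = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for src, rel, dst in EDGES:
            if src != node or dst in blocked:
                continue
            facts.append((src, rel, dst))
            if dst not in seen:
                seen.add(dst)
                queue.append((dst, depth + 1))
    return facts


facts_for_engineer = connected_context("Alice", 3, "engineer")
facts_for_analyst = connected_context("Alice", 3, "analyst")
```

In this sketch the engineer gets the full three-hop chain from Alice to the Payments Team, the analyst gets only the first two facts, and nothing about Bob is returned to either, because his facts are not connected to the starting entity.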

Why it improves results

Philip says graph-based context usually leads to:

  • 30–80% accuracy improvement
  • 30–50% smaller context windows
  • Lower latency and cost
  • Better fit for smaller models

Deterministic Reasoning vs. Probabilistic Reasoning

The episode makes a strong distinction between:

  • LLM reasoning: flexible, probabilistic, sometimes wrong
  • Graph reasoning: deterministic, explainable, and reliable

For some use cases, there is exactly one correct answer, and the graph should compute it directly.

Examples mentioned

  • Ultimate beneficial owner in finance: can be resolved deterministically through graph traversal
  • Uber: uses a rules graph to determine driver eligibility across cities and regulations
  • Walmart: uses graph-based reasoning to support employee career and job-path questions

These examples show where “close enough” is not acceptable.
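
As an illustration of the ultimate-beneficial-owner case, the following sketch computes each person's effective stake by multiplying ownership fractions along every path and summing the results. The company names, percentages, and the 25% threshold are assumptions made up for this example (actual thresholds depend on jurisdiction), and the code assumes the ownership graph has no cycles.

```python
# Hypothetical ownership graph: owner -> list of (owned entity, fraction held).
OWNS = {
    "Jane": [("HoldCo A", 0.6), ("HoldCo B", 1.0)],
    "Omar": [("HoldCo A", 0.4)],
    "HoldCo A": [("Target Ltd", 0.5)],
    "HoldCo B": [("Target Ltd", 0.5)],
}


def effective_stake(owner, target):
    """Sum, over every ownership path from owner to target, the product of
    the fractions along that path. Assumes the graph has no cycles."""
    if owner == target:
        return 1.0
    return sum(
        fraction * effective_stake(held, target)
        for held, fraction in OWNS.get(owner, [])
    )


def beneficial_owners(target, people, threshold=0.25):
    """Return the people whose effective stake meets the (assumed) threshold."""
    stakes = {person: effective_stake(person, target) for person in people}
    return {person: stake for person, stake in stakes.items() if stake >= threshold}


owners = beneficial_owners("Target Ltd", ["Jane", "Omar"])
```

The same inputs always produce the same answer: Jane holds roughly 80% of Target Ltd through two holding companies, while Omar's roughly 20% falls below the assumed threshold.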

Graphs as the Symbolic Layer in Neuro-Symbolic AI

Philip frames graphs as the symbolic / left-brain complement to the LLM’s stochastic / right-brain behavior.

This combination—often called neuro-symbolic AI—is especially powerful for production systems because it pairs:

  • Human-curated structure
  • Rules and constraints
  • Queryable relationships
  • Real-time decisioning
  • Model-generated language and summarization

Why Graph Databases Are Fast

The discussion also goes deep on performance.

Key technical advantages

Neo4j stores relationships so that each node points directly to its neighbors, a design known as index-free adjacency. Once you find a starting node, traversing to connected nodes means following those pointers rather than repeating joins or index lookups.

Benefits include:

  • Very fast graph traversal
  • Often orders of magnitude faster than relational or generic NoSQL approaches for connected queries
  • Much less hardware needed in many cases
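
The idea behind index-free adjacency can be sketched in a few lines. This is a conceptual illustration only, not Neo4j's actual storage format; the `Node` class and both hop functions are hypothetical. The first traversal follows direct neighbor references, while the second stands in for a join-based approach by re-scanning an edge table on every hop.

```python
class Node:
    """A node that holds direct references to its neighbors."""

    def __init__(self, name):
        self.name = name
        self.out = []  # direct references to neighboring Node objects


def build_graph(pairs):
    """Build Node objects from (source, destination) name pairs."""
    nodes = {}
    for src, dst in pairs:
        for name in (src, dst):
            if name not in nodes:
                nodes[name] = Node(name)
        nodes[src].out.append(nodes[dst])
    return nodes


def hop_by_pointer(start, hops):
    """Follow neighbor references; the work depends on the edges touched,
    not on the total size of the graph."""
    frontier = [start]
    for _ in range(hops):
        frontier = [nxt for node in frontier for nxt in node.out]
    return [node.name for node in frontier]


def hop_by_join(rows, start_name, hops):
    """Join-style alternative: every hop scans the whole edge table."""
    frontier = [start_name]
    for _ in range(hops):
        frontier = [dst for name in frontier for src, dst in rows if src == name]
    return frontier


PAIRS = [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")]
graph = build_graph(PAIRS)
via_pointers = hop_by_pointer(graph["a"], 3)
via_joins = hop_by_join(PAIRS, "a", 3)
```

Both functions return the same three-hop result, but the join-style version touches every row of the table on each hop, so its cost grows with the size of the data set rather than with the size of the neighborhood being explored.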

They also talk about recent improvements in storage and sharding that reduce memory pressure while preserving graph traversal speed.

How Data Gets Into the Graph

Philip says LLMs have made graph bootstrapping much easier.

Ingestion sources

  • Structured databases
  • SQL sources
  • Analytics platforms like Snowflake or BigQuery
  • Unstructured text

Tools can now infer schema, extract entities, and create relationships automatically, dramatically reducing the manual work once required.

Common pattern

  • Identify terms/entities from existing ontologies
  • Extract them from text
  • Create relationships based on sentence structure
  • Use the graph as a living knowledge layer
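
The pattern above can be sketched with a deliberately simple keyword matcher. The ontology, the verb mapping, and the `extract_triples` function are hypothetical; the tools Philip describes rely on LLMs for this step, so treat the regular-expression approach here as a stand-in that only shows the shape of the pipeline.

```python
import re

# Hypothetical ontology: known term -> entity type.
ONTOLOGY = {
    "aspirin": "Drug",
    "ibuprofen": "Drug",
    "headache": "Symptom",
    "inflammation": "Symptom",
}

# Hypothetical mapping from verbs to relationship types.
VERBS = {"treats": "TREATS", "reduces": "REDUCES"}


def extract_triples(text):
    """For each sentence containing a known verb with a known term before it
    and after it, emit a (subject, RELATIONSHIP, object) triple."""
    triples = []
    for sentence in re.split(r"[.!?]\s*", text.lower()):
        words = sentence.split()
        for i, word in enumerate(words):
            if word not in VERBS:
                continue
            subjects = [w for w in words[:i] if w in ONTOLOGY]
            objects = [w for w in words[i + 1:] if w in ONTOLOGY]
            if subjects and objects:
                # Nearest known term on each side of the verb.
                triples.append((subjects[-1], VERBS[word], objects[0]))
    return triples


sample = "Aspirin treats headache. Ibuprofen often reduces inflammation. The weather is nice."
triples = extract_triples(sample)
```

Each resulting triple maps directly onto two nodes and one relationship in the graph, and sentences with no known terms contribute nothing.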

Graph + Vector: Better Together

The interview emphasizes that graphs and vectors are not competing approaches.

  • Vectors are useful for semantic similarity
  • Graphs are useful for structure, relationships, and reasoning

A strong system often uses both:

  • Vector search to find likely candidates
  • Graph traversal to refine, connect, and reason
  • Graph embeddings / graph vectors to capture topology and network behavior
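
A minimal sketch of the first two steps, assuming toy three-dimensional embeddings and a hand-written link table. The document IDs, vectors, linked entities, and the `hybrid_retrieve` function are all invented for illustration; a production system would use a real embedding model and a graph database.

```python
import math

# Hypothetical documents with toy 3-dimensional embeddings.
EMBEDDINGS = {
    "doc_refunds": [0.9, 0.1, 0.0],
    "doc_shipping": [0.1, 0.9, 0.0],
    "doc_privacy": [0.0, 0.1, 0.9],
}

# Hypothetical graph links from each document to related entities.
LINKS = {
    "doc_refunds": ["Policy 12", "Finance Team"],
    "doc_shipping": ["Carrier Contract"],
    "doc_privacy": ["GDPR", "Legal Team"],
}


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def hybrid_retrieve(query_vec, top_k=1):
    """Step 1: rank documents by cosine similarity to the query vector.
    Step 2: attach the graph entities directly linked to each top hit."""
    ranked = sorted(
        EMBEDDINGS,
        key=lambda doc: cosine(query_vec, EMBEDDINGS[doc]),
        reverse=True,
    )
    return {doc: LINKS.get(doc, []) for doc in ranked[:top_k]}


result = hybrid_retrieve([1.0, 0.2, 0.0])
```

Here the vector step selects the refunds document as the closest match, and the graph step adds the policy and team connected to it, context that similarity search alone would not surface.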

Why Graph Queries Matter for AI

A notable point is that AI models often do a better job generating graph queries than SQL for complex business questions.

Why

  • Graph query language is closer to natural language patterns
  • Complex questions can be expressed more tersely
  • Graph queries are often easier for non-technical users to understand
  • They execute much faster for multi-hop relationship questions

Philip argues that graph databases are especially suited to questions about:

  • Networks
  • Hierarchies
  • Paths
  • Journeys
  • Connected decision-making

Resources Mentioned

Philip closes by pointing listeners to:

  • Neo4j Aura – free, fully managed version
  • Neo4j Desktop – local development option
  • GraphAcademy – learning resources
  • deeplearning.ai courses with Andrew Ng
  • graphrag.com/research – collection of graph RAG papers and research

Main Takeaways

  • LLMs alone are not enough for high-stakes enterprise AI.
  • The best AI systems need a context and knowledge layer grounded in real, connected data.
  • Graph RAG improves accuracy, explainability, and efficiency over vector-only RAG.
  • Some business problems require deterministic answers, not probabilistic guesses.
  • Graphs pair naturally with LLMs in a neuro-symbolic architecture.
  • For complex connected data, graph queries can be both more expressive and faster than SQL.