Three retrieval architectures for different AI query types

Who: Posted by @_avichawla, a data science educator known for visual explainers on machine learning workflows. The content is original to Chawla, with no separate underlying author.

What's new: This post lays out three distinct approaches to retrieval-augmented generation (standard RAG, Graph RAG, and Agentic RAG), arguing that these are not a maturity ladder but three tools suited to three distinct problem shapes.

How it works: Standard RAG converts documents into vector embeddings and retrieves whichever chunks of text are numerically closest to the query. This breaks down whenever a question requires connecting facts that do not individually resemble the query. The post calls this the "missing middle" problem and illustrates it with a maintenance chain: checkout service to payments API to cluster-3 to Friday downtime. Graph RAG addresses this by having an LLM scan the documents during setup and draw a map of named things and the links between them, so retrieval follows the connections rather than hunting for textual similarity. Agentic RAG scraps the fixed pipeline entirely: the LLM acts as a decision-maker at query time, picking which tools and sources to consult and in what sequence, which suits open-ended tasks with no predictable retrieval path.
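
The "missing middle" can be reproduced in a few lines. The sketch below is illustrative only and is not taken from Chawla's post: the three sentences, the query, the bag-of-words `embed` function (a stand-in for a real embedding model), and the hand-written `graph` (a stand-in for the map an LLM would extract during setup) are all assumptions made for the example.

```python
import math
import re
from collections import Counter, deque

# Hypothetical mini-corpus mirroring the maintenance chain in the post.
docs = [
    "The checkout service calls the payments API.",
    "The payments API runs on cluster-3.",
    "cluster-3 has scheduled downtime on Friday.",
]
query = "Is the checkout service affected on Friday?"

STOPWORDS = {"the", "is", "on", "a", "has", "to"}

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    tokens = re.findall(r"[a-z0-9-]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

# Standard RAG: rank chunks by similarity to the query and keep the top 2.
q_vec = embed(query)
scores = [cosine(q_vec, embed(d)) for d in docs]
top2 = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:2]

# Graph RAG: entities and relations, written by hand here. In a real system
# an LLM would extract them from the documents at setup time (assumption).
graph = {
    "checkout service": [("calls", "payments API")],
    "payments API": [("runs on", "cluster-3")],
    "cluster-3": [("has", "Friday downtime")],
}

def find_path(graph, start, goal):
    """Breadth-first search that follows links instead of text similarity."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _relation, neighbor in graph.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

path = find_path(graph, "checkout service", "Friday downtime")

print("similarity top-2:", [docs[i] for i in top2])
print("graph path:", " -> ".join(path))
```

In this toy setup the middle sentence, the one linking the payments API to cluster-3, shares no content words with the query, so it scores zero and never reaches the top 2, while the graph traversal recovers the full chain. A real embedding model would give the middle chunk a nonzero score, but the same ranking problem can still arise.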

The numbers: Chawla notes that memory use can be cut 32-fold; the technique and its implementation detail are covered in a separate linked article rather than here.
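
The technique's name did not survive in this summary, so the following is an assumption rather than a description of Chawla's method. One approach that produces exactly a 32-fold reduction is binary quantization, which keeps a single sign bit for each 32-bit float dimension of an embedding. A minimal sketch of that arithmetic, with a hypothetical index of 1,000,000 vectors of 768 dimensions:

```python
def index_bytes(num_vectors, dims, bits_per_dim):
    """Raw storage for an embedding index, ignoring metadata and overhead."""
    return num_vectors * dims * bits_per_dim // 8

def binarize(vector):
    """Pack the sign of each dimension into one integer (1 bit per dimension)."""
    bits = 0
    for x in vector:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits

def hamming(a, b):
    """Count differing bits; used in place of cosine distance on binary codes."""
    return bin(a ^ b).count("1")

# Hypothetical index size, chosen only to make the ratio concrete.
float32_size = index_bytes(1_000_000, 768, 32)  # 3,072,000,000 bytes
binary_size = index_bytes(1_000_000, 768, 1)    # 96,000,000 bytes
ratio = float32_size // binary_size

code_a = binarize([0.3, -1.2, 0.0, 2.5])   # sign bits 1, 0, 0, 1
code_b = binarize([0.9, 0.4, -0.1, -2.0])  # sign bits 1, 1, 0, 0
print(ratio, hamming(code_a, code_b))
```

The trade-off in such a scheme is precision: sign bits discard magnitude, so systems that use it commonly re-rank the top candidates with the original vectors.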

Why it matters: The practical value is the clear taxonomy: teams often reach for the most complex architecture by default, but standard RAG is faster and cheaper for simple lookups, Graph RAG earns its overhead only when queries require multi-hop reasoning, and Agentic RAG is warranted only when retrieval logic itself must be dynamic. Matching architecture to query type avoids unnecessary cost and latency.