RAG in 2024: The Evolution of AI-Powered Knowledge Retrieval
Editor’s note: Laurie Voss is a speaker for ODSC West this October 29th-31st. Be sure to check out his talk, “RAG in 2024: Advancing to Agents,” there!
In the rapidly evolving landscape of artificial intelligence, Retrieval-Augmented Generation (RAG) has emerged as a powerful technique for enhancing AI models with external knowledge. However, as we push the boundaries of what AI can do, we’re discovering that basic RAG has its limitations. While RAG is necessary, it’s not sufficient: we argue that advanced knowledge retrieval requires agentic strategies.
Basic RAG
RAG works by combining a large language model with a knowledge base. When given a query, the system retrieves relevant information from its database and uses it to generate a response. This approach has proven incredibly useful for tasks that require up-to-date or specialized knowledge.
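To make this concrete, here is a minimal sketch of a basic RAG pipeline using LlamaIndex. The ./data folder, the default OpenAI-backed models, and the example query are placeholder assumptions for illustration, not something prescribed by the discussion above.

```python
# A minimal basic-RAG sketch with LlamaIndex: index a folder of documents,
# retrieve the most relevant chunks for a query, and let the LLM answer
# grounded in them. Paths and the query string are placeholders.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("./data").load_data()   # load local files
index = VectorStoreIndex.from_documents(documents)        # embed and store chunks
query_engine = index.as_query_engine()                    # retrieval + generation in one call

print(query_engine.query("What changed in the latest release?"))
```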
However, basic RAG systems struggle with more complex tasks. They often fall short when asked to:
- Summarize large documents: a query can't match against an entire document at once; the system needs to be aware enough of the task it was given to switch to a different strategy
- Compare multiple pieces of information: a single retrieval is unlikely to surface all the context you need
- Answer multi-part questions: a query like "what is the population of the largest city?" is really two questions: what is the largest city, and what is its population?
Advancing Beyond Basic RAG
To overcome these challenges, researchers and developers are taking two main approaches: enhancing data quality and increasing query sophistication. While improving data quality is crucial, the real game-changer is the introduction of agentic RAG systems.
Agentic RAG introduces the concept of AI agents — autonomous systems that can plan, reason, and take actions to achieve specific goals. These agents go beyond simple retrieval and generation, incorporating advanced features that make them more flexible and powerful.
Components of Agentic Systems
Agentic systems are built on several key components:
- Routing: Agents can intelligently select the most appropriate tool or method to answer a query. For instance, a RouterQueryEngine can choose between different types of search or summarization tools based on the nature of the question (see the sketch after this list).
- Conversation Memory: Unlike basic RAG, agentic systems can maintain context across multiple interactions, leading to more coherent and contextually relevant responses.
- Query Planning: Complex queries are broken down into simpler sub-queries that can be processed in parallel, allowing for more comprehensive and accurate responses.
- Tool Use: Agents can interact with external APIs and data sources, adapting queries as needed. This allows them to access a wider range of information and perform actions beyond simple text generation.
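As a concrete illustration of the routing component, here is a hedged sketch using LlamaIndex's RouterQueryEngine. The data folder, tool descriptions, and example query are placeholders, and it assumes the default OpenAI-backed LLM selector.

```python
# A routing sketch: an LLM-based selector picks between a vector index
# (targeted fact lookup) and a summary index (whole-document summaries).
from llama_index.core import SimpleDirectoryReader, SummaryIndex, VectorStoreIndex
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
from llama_index.core.tools import QueryEngineTool

documents = SimpleDirectoryReader("./data").load_data()

vector_tool = QueryEngineTool.from_defaults(
    query_engine=VectorStoreIndex.from_documents(documents).as_query_engine(),
    description="Useful for answering specific questions about the documents.",
)
summary_tool = QueryEngineTool.from_defaults(
    query_engine=SummaryIndex.from_documents(documents).as_query_engine(),
    description="Useful for summarizing entire documents.",
)

# The router asks an LLM which tool best fits the incoming query.
router = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[vector_tool, summary_tool],
)
print(router.query("Summarize the annual report."))
```

The tool descriptions do the heavy lifting here: the selector chooses based on how well the query matches each description, so writing them precisely is part of the design.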
Agent Reasoning Strategies
Agentic systems employ various reasoning strategies to tackle complex tasks:
- Sequential Reasoning: This includes approaches like the ReAct (Reasoning + Action) pattern, where the agent alternates between thinking about its next step and taking action (see the sketch after this list).
- DAG-based Reasoning: The agent creates a comprehensive plan from start to finish, like a flowchart. It can also reflect on its progress and adjust the plan as needed.
- Tree-based Reasoning: For open-ended tasks, the agent explores multiple possible paths, balancing between exploring new options and exploiting promising leads.
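To show what sequential, ReAct-style reasoning looks like in practice, here is a sketch using LlamaIndex's ReActAgent. The toy arithmetic tools, the OpenAI model choice, and the example question are assumptions for illustration only.

```python
# A ReAct sketch: the agent alternates between a "thought" about what to do
# next and an "action" (a tool call), observing the result each time.
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI

def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b

def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b

tools = [FunctionTool.from_defaults(fn=multiply), FunctionTool.from_defaults(fn=add)]

# verbose=True prints each thought/action/observation step as the agent reasons.
agent = ReActAgent.from_tools(tools, llm=OpenAI(model="gpt-4o-mini"), verbose=True)
print(agent.chat("What is (3 + 4) * 5? Use the tools."))
```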
Advanced Features of Agentic Systems
Modern agentic systems also incorporate advanced features that enhance their usability and effectiveness:
- Observability: Agents can be instrumented to provide insights into their decision-making processes, aiding in debugging and improvement (see the sketch after this list).
- Controllability: Users can exert fine-grained control over agent actions, which is crucial for human-in-the-loop scenarios.
- Customizability: Agent behaviors can be modified and extended to suit specific use cases.
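As a small example of observability, this sketch enables LlamaIndex's "simple" global handler, which prints each LLM call's prompt and completion so you can trace an agent's decisions. Treat it as one possible setup rather than the only way to instrument an agent.

```python
# Enable a basic trace of all LLM traffic. Production setups would typically
# plug in a dedicated observability backend instead of printing to stdout.
import llama_index.core

llama_index.core.set_global_handler("simple")
# From here on, queries and agent runs print their intermediate prompts and responses.
```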
AI Agents in LlamaIndex
To adapt to this new reality, we have made agentic capabilities first-class citizens in LlamaIndex: routing, memory, planning, and tool use are all key primitives in our library. Some reasoning strategies are built in, while others are available as downloadable plug-ins from LlamaHub, our registry of open-source AI software. Observability, controllability, and customizability are all foundational to how our framework operates.
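For instance, the query-planning primitive can be exercised with the SubQuestionQueryEngine, which decomposes a multi-part question like the earlier "population of the largest city" example into sub-questions and synthesizes the answers. In this sketch the data source, tool description, and query are placeholders.

```python
# A query-planning sketch: the engine generates sub-questions, answers each
# against the wrapped index, and combines the results into a final response.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.tools import QueryEngineTool

documents = SimpleDirectoryReader("./data").load_data()
cities_tool = QueryEngineTool.from_defaults(
    query_engine=VectorStoreIndex.from_documents(documents).as_query_engine(),
    description="Facts about world cities, including their size and population.",
)

planner = SubQuestionQueryEngine.from_defaults(query_engine_tools=[cities_tool])
print(planner.query("What is the population of the largest city?"))
```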
In our talk at ODSC West, we'll go in depth on the problems that led to this evolution of knowledge retrieval and how we solved them, including step-by-step guides to building agents with the LlamaIndex framework. We hope to see you there!
About the Author/ODSC West 2024 Speaker on Knowledge Retrieval:
Laurie Voss is VP of Developer Relations at LlamaIndex, the framework for connecting your data to LLMs. He has been a developer for 27 years and was co-founder of npm, Inc. He believes passionately in making the web bigger, better, and more accessible for everyone.
Originally posted on OpenDataScience.com