Key Takeaways From Week 2 of the AI Builders Summit — RAG
We wrapped up week 2 of our first-ever AI Builders Summit! With hundreds of people tuning in virtually from around the world, our world-class instructors showed how to build, evaluate, and get the most out of large language models. Here’s a recap of each session from this week. If you feel like you’re missing out, you can still sign up for the remaining weeks of the month-long AI Builders Summit and get these sessions on-demand.
Database Patterns for RAG: Single Collections vs Multi-tenancy
JP Hwang, Technical Curriculum Developer at Weaviate
RAG enhances the utility of LLMs by grounding their outputs in external data. This hybrid approach addresses limitations like outdated or incomplete model knowledge, making it a game-changer for applications requiring up-to-date information. JP contrasted two database patterns for RAG. The single-collection pattern is particularly suited to scenarios where all users share a common dataset: it relies on an object store to house raw objects, such as text or documents, plus indexes — inverted and vector-based — that make searches fast.
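A minimal sketch of the single-collection pattern described above — one shared object store for raw documents plus a vector index over their embeddings. This is illustrative only, not the Weaviate API, and the toy embeddings are hand-picked; a real system would use model-generated embeddings (e.g. from Cohere or Ollama).

```python
import math

class SingleCollection:
    """Toy single-collection store: all users query one shared dataset."""

    def __init__(self):
        self.objects = {}   # object store: id -> raw document
        self.vectors = {}   # vector index: id -> embedding

    def add(self, doc_id, text, embedding):
        self.objects[doc_id] = text
        self.vectors[doc_id] = embedding

    def search(self, query_embedding, top_k=1):
        # Brute-force cosine similarity; real vector databases use
        # approximate indexes (e.g. HNSW) for speed at scale.
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb)

        ranked = sorted(self.objects,
                        key=lambda i: cosine(self.vectors[i], query_embedding),
                        reverse=True)
        return [(i, self.objects[i]) for i in ranked[:top_k]]

collection = SingleCollection()
collection.add("a", "How do I return an item?", [1.0, 0.1, 0.0])
collection.add("b", "What are your shipping rates?", [0.0, 1.0, 0.2])
print(collection.search([0.9, 0.2, 0.0]))  # a "return process" style query
```

The point of the pattern is that every query scans the same collection; access control and tenant separation are handled elsewhere (or not at all), which is what the multi-tenancy pattern later addresses.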
During the workshop, participants set up a vector database using Weaviate, configured embeddings through tools like Cohere and Ollama, and built a RAG application to analyze 50,000 customer service conversations. They explored tasks like semantic searches for patterns (“return processes”) and hybrid queries that combine vector and traditional searches. A key highlight was the discussion on scaling vector databases efficiently, where techniques like vector caching and quantization reduce memory usage without compromising accuracy.
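The quantization idea mentioned above can be sketched with simple scalar quantization: compress float32 embedding components to one byte each for a roughly 4x memory saving at a small accuracy cost. The fixed clip range here is an assumption for illustration; production systems (including Weaviate's scalar quantization) calibrate the range from the data.

```python
def quantize(vector, lo=-1.0, hi=1.0):
    """Map each float in [lo, hi] to an int code in [0, 255] (one byte)."""
    scale = 255.0 / (hi - lo)
    return bytes(round((min(max(x, lo), hi) - lo) * scale) for x in vector)

def dequantize(codes, lo=-1.0, hi=1.0):
    """Recover an approximation of the original floats from the byte codes."""
    scale = (hi - lo) / 255.0
    return [lo + c * scale for c in codes]

v = [0.12, -0.55, 0.98, -0.07]
codes = quantize(v)
approx = dequantize(codes)

# One byte per dimension instead of four (float32): ~4x less memory,
# with only a small per-component reconstruction error.
print(len(codes), max(abs(a - b) for a, b in zip(v, approx)))
```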
Inside Multimodal RAG
Suman Debnath, Principal AI/ML Advocate at Amazon Web Services
Multi-tenancy is designed for scenarios where data isolation is crucial, such as SaaS platforms or applications handling sensitive information. Each tenant’s data is stored separately, ensuring security, compliance, and resource efficiency. This pattern prevents accidental cross-querying or data leakage while reducing the overhead of querying large shared datasets.
Participants configured a multi-tenant vector database by enabling multi-tenancy in Weaviate and simulating a SaaS use case with isolated datasets for five customers. This allowed them to run tenant-specific RAG workflows while maintaining strict data privacy. Use cases for this approach include analyzing private datasets like customer support tickets or managing sensitive data in industries such as healthcare and finance.
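The isolation guarantee can be sketched as follows — this is an illustrative data structure, not the Weaviate API (where multi-tenancy is enabled per collection): each tenant gets its own shard, and every query is scoped to exactly one tenant, so cross-tenant reads are impossible by construction.

```python
class MultiTenantStore:
    """Toy multi-tenant store: one isolated shard per tenant."""

    def __init__(self):
        self._tenants = {}  # tenant name -> isolated object store

    def add_tenant(self, tenant):
        self._tenants[tenant] = {}

    def insert(self, tenant, doc_id, text):
        self._tenants[tenant][doc_id] = text

    def query(self, tenant, keyword):
        # Only the named tenant's shard is ever scanned, so queries are
        # also cheaper than scanning one large shared dataset.
        shard = self._tenants[tenant]
        return [t for t in shard.values() if keyword in t]

store = MultiTenantStore()
for customer in ["acme", "globex"]:
    store.add_tenant(customer)
store.insert("acme", "t1", "ticket: refund for order 1042")
store.insert("globex", "t1", "ticket: reset my password")

# "acme" never sees "globex" data, even with an identical query.
print(store.query("acme", "ticket"))
print(store.query("globex", "ticket"))
```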
Secure Your RAG Pipelines with Fine-Grained Authorization
Sohan Maheshwar, Lead Developer Advocate at authzed.com
Evan Corkrean, Sr. Solutions Engineer at authzed.com
The speakers described two approaches for integrating authorization into RAG workflows. The first, post-filter authorization, involves embedding metadata into vector databases to validate user permissions after data retrieval. The second, pre-filter authorization, queries an authorization system before accessing vector databases to proactively filter unauthorized embeddings. Both methods balance security with performance, with pre-filtering excelling in scenarios requiring stringent control.
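The two patterns can be contrasted in a few lines. This sketch substitutes a toy permission map for a real authorization system (e.g. SpiceDB) and a list of scored results for a vector database; the names are illustrative, not from the workshop.

```python
# Toy authorization data: which documents each user may view,
# and a pre-computed list of (doc, similarity score) search results.
PERMISSIONS = {"alice": {"doc1", "doc2"}, "bob": {"doc2"}}
RESULTS = [("doc1", 0.95), ("doc2", 0.90), ("doc3", 0.80)]

def post_filter(user, results):
    """Post-filter: retrieve first, then drop anything the user may not see."""
    allowed = PERMISSIONS.get(user, set())
    return [(d, s) for d, s in results if d in allowed]

def pre_filter(user, search):
    """Pre-filter: ask the authorization system first, then search only
    the permitted documents."""
    allowed = PERMISSIONS.get(user, set())
    return search(allowed)

def search_within(allowed):
    # Stand-in for a vector search restricted to permitted embeddings.
    return [(d, s) for d, s in RESULTS if d in allowed]

print(post_filter("bob", RESULTS))       # filters after retrieval
print(pre_filter("bob", search_within))  # never retrieves unauthorized docs
```

Both return the same answers here; the difference is that pre-filtering never pulls unauthorized embeddings out of the database at all, which is why it excels when control must be stringent.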
Participants implemented a secure RAG pipeline using SpiceDB, Pinecone, and OpenAI. They modeled a ReBAC schema, added metadata to embeddings, and demonstrated access control. For example, unauthorized users were blocked from retrieving embeddings linked to restricted documents, ensuring compliance with data privacy standards.
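A ReBAC schema of the kind modeled in the session can be written in SpiceDB's schema language. This is a minimal illustrative schema with a single viewer relation; the workshop's actual schema may have been richer.

```
definition user {}

definition document {
    relation viewer: user

    // A user may retrieve embeddings for a document only if they can view it.
    permission view = viewer
}
```

With relationships like `document:report#viewer@user:alice` written to SpiceDB, the RAG pipeline can check `view` on each document before (pre-filter) or after (post-filter) retrieval.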
Evaluating Retrieval-Augmented Generation and LLM-as-a-Judge Methodologies
Stefan Webb, Developer Advocate at Zilliz
Stefan discussed how LLMs can act as judges to evaluate RAG systems. He highlighted the critical role of evaluation in ensuring the quality and reliability of RAG pipelines, emphasizing scalable methods for robust assessments.
Webb introduced the concept of using LLMs as judges, which allows for the evaluation of RAG outputs without requiring ground truths. Techniques included pairwise response comparisons and scoring based on faithfulness, relevance, and accuracy. However, challenges such as position and verbosity biases can affect evaluations, necessitating fine-tuning and bias mitigation strategies. The session concluded with hands-on implementation using tools like Milvus and RAGAS to construct and evaluate RAG pipelines.
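The pairwise-comparison idea, together with a simple mitigation for the position bias mentioned above, can be sketched as follows: judge each pair twice with the answer order swapped and keep the verdict only when both passes agree. The judge here is a stub; in practice it would be an LLM call scoring faithfulness, relevance, and accuracy (e.g. orchestrated via RAGAS).

```python
def stub_judge(question, answer_a, answer_b):
    """Stand-in for an LLM judge call. It prefers the longer answer,
    which conveniently mimics the verbosity bias the session warned about."""
    return "A" if len(answer_a) >= len(answer_b) else "B"

def debiased_compare(question, ans1, ans2, judge=stub_judge):
    """Pairwise comparison with order-swapping to cancel position bias."""
    first = judge(question, ans1, ans2)    # ans1 shown in position A
    second = judge(question, ans2, ans1)   # order swapped
    second = "A" if second == "B" else "B" # map verdict back to original labels
    return first if first == second else "tie"

q = "What is the return window?"
verdict = debiased_compare(
    q, "30 days.", "Returns are accepted within 30 days of delivery.")
print(verdict)
```

A judge that always picks position A (pure position bias) now produces a "tie" instead of a spurious win, which is the point of the swap.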
From Reviews to Insights: RAG and Structured Generation in Practice
Cameron Pfiffer, Developer Relations Engineer at .txt
Cameron’s session explored the practical applications of RAG pipelines, particularly for transforming unstructured data into structured insights. Attendees gained hands-on experience building modular RAG systems capable of handling text and images.
Pfiffer demonstrated how to use vector databases like Milvus and Hugging Face models for indexing, retrieval, and re-ranking. The session focused on optimizing retrieval processes and leveraging fine-tuned prompts to enhance output relevance. By employing advanced re-ranking techniques with LLMs, participants saw how to generate structured insights from raw reviews, showcasing the flexibility and scalability of RAG systems in real-world applications.
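The retrieve → re-rank → structure flow described above can be sketched end to end. This uses a stub overlap scorer in place of an LLM re-ranker and fills a plain dict in place of a constrained-generation library such as .txt's Outlines (which would guarantee the schema directly from an LLM); the reviews and field names are invented for illustration.

```python
# Toy retrieved passages, standing in for a vector-search result set.
RETRIEVED = [
    "Battery dies in two hours, very disappointed.",
    "Love the screen, battery life could be better.",
    "Shipping was slow but the product is fine.",
]

def stub_rerank_score(query, passage):
    """Stand-in for an LLM relevance score: count query-term overlap."""
    return sum(word in passage.lower() for word in query.lower().split())

def rerank(query, passages, top_k=2):
    """Keep only the top_k passages by (stub) relevance score."""
    return sorted(passages,
                  key=lambda p: stub_rerank_score(query, p),
                  reverse=True)[:top_k]

def to_structured_insight(query, passages):
    """Target shape of the structured output; a constrained-generation
    library would produce this schema directly from an LLM."""
    return {"topic": query, "evidence": passages, "n_reviews": len(passages)}

top = rerank("battery life", RETRIEVED)
insight = to_structured_insight("battery life", top)
print(insight)
```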
Week 3 — AI Agents
On January 29th and 30th, we’ll be focusing on AI Agents during the AI Builders Summit! Sessions include:
- Building agentic RAG with LlamaIndex Workflows
- Modern AI Agents from A-Z: Building Agentic AI to Perform Complex Tasks
- Using World Models to build AI agents for Optimal Decision Making
- LLM Engineering Masterclass: Select and apply LLMs using RAG, fine-tuning and Agentic AI
- Building and Evaluating Agents with LangGraph and RAG
You can register now to catch the next few weeks of the virtual summit, as well as this week’s RAG sessions and last week’s LLM sessions on-demand. If you’re looking for even more hands-on AI training, you can register for ODSC East this May 13th-15th and get access to the AI Builders Summit included!