Key Takeaways From Week 1 of the AI Builders Summit — LLMs

ODSC - Open Data Science
5 min read · Jan 22, 2025


We wrapped up week 1 of our first-ever AI Builders Summit! With hundreds of people tuning in virtually from all around the world, our world-class instructors showed how to build, evaluate, and make the most of large language models. Here’s a recap of each session from this week. If you feel like you’re missing out, you can still sign up for the remaining weeks of the month-long AI Builders Summit and get these sessions on-demand.

Transforming Enterprise AI with Small Language Models

Julien Simon, Chief Evangelist at Arcee.ai

The first session featured Julien Simon from Arcee.ai, who discussed building cost-efficient, high-performance AI workflows with small language models. Julien dove into techniques such as model distillation and merging to create tailored, efficient models.

He demonstrated these methods through live coding and explained their applications in enterprise settings. The session aimed to show how small open-source models can outperform large closed models in accuracy and cost-efficiency, emphasizing hands-on practice with provided tools and resources.

Fine-tune Your Own Open-Source SLMs

Devvret Rishi, CEO of Predibase, and Chloe Leung, ML Solutions Architect of Predibase

This session featured Dev and Chloe from Predibase discussing effective methods for fine-tuning and deploying custom AI models using their platform. The session was divided into two parts: theoretical background and a hands-on workshop.

Dev went through the motivations, challenges, and solutions provided by Predibase, including the benefits of fine-tuning small, task-specific language models with LoRA (Low-Rank Adaptation), and introduced Turbo LoRA for improved performance.

The session also covered deploying models via shared and private serverless deployments on Predibase’s platform, and introduced additional optimizations such as FP8 quantization. Chloe then presented an in-depth hands-on tutorial on using the Predibase UI and Python SDK to fine-tune models and implement these optimizations.
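To ground the LoRA idea mentioned above: rather than updating a full weight matrix, LoRA learns two small low-rank matrices whose product is added to the frozen weights. The NumPy sketch below illustrates the parameter savings with hypothetical dimensions; it is a conceptual illustration, not Predibase’s implementation.

```python
import numpy as np

# LoRA in a nutshell: instead of updating a full weight matrix W
# (d_out x d_in), learn B (d_out x r) and A (r x d_in) with rank
# r << min(d_out, d_in); the adapted forward pass uses W + B @ A.
# Dimensions below are illustrative placeholders.
d_out, d_in, r = 1024, 1024, 8

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen base weights
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection (zero init)

x = rng.standard_normal(d_in)
y = W @ x + B @ (A @ x)  # adapter output is zero at initialization

full_params = d_out * d_in
lora_params = r * (d_out + d_in)
print(f"trainable params: {lora_params:,} vs {full_params:,} "
      f"({100 * lora_params / full_params:.2f}%)")
```

With these numbers the adapter trains roughly 1.6% of the parameters of a full update, which is why LoRA makes fine-tuning small task-specific models so cheap.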

Building High Accuracy LLMs using a Mixture of Memory Experts

Ryan Compton, Solutions Engineer at Lamini

Here, Ryan presented on building high-accuracy large language models (LLMs) using a technique called Mixture of Memory Experts (MoME). The session included an introduction to the company Lamini, its focus on accuracy in LLMs, and the various sectors it works with.

Compton explained MoME and how it combines existing toolkits to enhance performance.

The talk covered technical details such as LLM tuning strategies, the architecture of Lamini’s solutions, and practical examples, including an interactive coding example demonstrating the effectiveness of memory tuning and inference. He also explained the pipeline for evaluating and refining models using the Python SDK and REST APIs.

Evaluation-Driven Development: Best Practices and Pitfalls when building with AI

Raza Habib, CEO and Cofounder of Humanloop

Raza Habib from Humanloop focused on eval-driven development in AI. Raza outlined the importance of placing evaluation at the center of AI development, emphasizing three key practices: frequent data inspection, detailed logging, and incorporating domain experts. He explained the workflow for effective eval-driven development, including setting up an end-to-end pipeline, using LLMs as judges, and the necessity of human feedback for model alignment. Raza demonstrated this through a practical example using a RAG-based QA system.

The session advocated iterative improvement, balanced between fast, code-based assertions and more in-depth evaluations using LLMs and human reviewers, ensuring robust AI development and continuous monitoring in production.
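The balance Raza describes, fast code-based assertions backed by deeper LLM-based review, can be sketched as a simple evaluation loop. In this sketch, `call_llm_judge` is a hypothetical stand-in for a real model call; in practice you would prompt an LLM to rate each answer.

```python
# A minimal sketch of evaluation-driven development: cheap deterministic
# checks run on every change, and an LLM judge scores answers that pass.
# `call_llm_judge` is a hypothetical stub, not a real API.

def code_based_checks(answer: str) -> bool:
    """Fast assertions: answer is non-empty and not absurdly long."""
    return bool(answer.strip()) and len(answer) < 2000

def call_llm_judge(question: str, answer: str) -> int:
    """Stubbed judge so the sketch runs offline; a real judge would
    prompt an LLM to rate the answer on a 1-5 scale."""
    key = question.lower().split()[-1].rstrip("?")
    return 5 if key in answer.lower() else 2

def evaluate(dataset):
    results = []
    for item in dataset:
        passed = code_based_checks(item["answer"])
        score = call_llm_judge(item["question"], item["answer"]) if passed else 0
        results.append({"passed": passed, "score": score})
    return results

dataset = [
    {"question": "Which database supports RAG?", "answer": ""},
    {"question": "What is the capital of France?",
     "answer": "Paris is the capital of France."},
]
results = evaluate(dataset)
print(results)
```

The empty answer fails the cheap check and never reaches the (more expensive) judge; the second answer passes both stages. Human reviewers would then audit a sample of judge scores, closing the loop Raza describes.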

Cracking the Code: How to Choose the Right LLM for Your Project

Ivan Lee, CEO and Founder of Datasaur

In the last session of week 1, Ivan provided a comprehensive guide on selecting and deploying the best LLM for your project. This session explored Datasaur’s capabilities for model evaluation, fine-tuning, and deployment. Ivan discussed the complexities of comparing models like OpenAI’s GPT-4o, Llama, and Gemini, emphasizing real-world attributes such as cost, speed, and accuracy.

He demonstrated how to create a sandbox environment to test different models, deploy them using APIs, and fine-tune them to meet specific project requirements. Lastly, Ivan introduced automated and manual evaluation metrics to ensure a model’s performance aligns with organizational needs. The session also touched on responsible AI practices and the integration of guardrails for data privacy and security.
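One way to frame the cost/speed/accuracy trade-off Ivan describes is a weighted score per candidate model. The sketch below uses entirely hypothetical scores and weights (not real benchmark numbers or Datasaur output) just to show the shape of the comparison.

```python
# Weighted model selection sketch. All scores are hypothetical
# placeholders normalized to [0, 1], higher is better ("cost" here
# means cost-efficiency). Weights reflect project priorities.
weights = {"cost": 0.3, "speed": 0.2, "accuracy": 0.5}

candidates = {
    "model-a": {"cost": 0.9, "speed": 0.6, "accuracy": 0.95},
    "model-b": {"cost": 0.3, "speed": 0.9, "accuracy": 0.85},
}

scores = {
    name: sum(weights[attr] * value for attr, value in attrs.items())
    for name, attrs in candidates.items()
}
best = max(scores, key=scores.get)
print(best, scores)
```

Changing the weights (say, prioritizing speed for a latency-sensitive chatbot) can flip the ranking, which is why Ivan stresses testing models against your own project requirements rather than leaderboard numbers alone.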

Week 2 — RAG

On January 22nd and 23rd, we’ll be focusing on RAG during the AI Builders Summit! Sessions include:

  • Database Patterns for RAG: Single Collections vs Multi-tenancy
  • Inside multimodal RAG
  • Secure Your RAG Pipelines with Fine-Grained Authorization
  • Evaluating Retrieval-Augmented Generation and LLM-as-a-Judge Methodologies
  • From Reviews to Insights: RAG and Structured Generation in Practice

You can register now for the next few weeks of the virtual summit and catch this week’s LLM sessions on-demand! If you’re looking for even more hands-on AI training, you can register for ODSC East this May 13th-15th and get access to the AI Builders Summit included!



Written by ODSC - Open Data Science

Our passion is bringing thousands of the best and brightest data scientists together under one roof for an incredible learning and networking experience.
