AI in Production
Video
Anatomy of a Software 3.0 Company // Sarah Guo // AI in Production Keynote
If software 2.0 was about designing data collection for neural network training, software 3.0 is about manipulating foundation models at a system level to create great end-user experiences. AI-native applications are “GPT wrappers” the way SaaS companies are database wrappers. This talk discusses the huge design space for software 3.0 applications and explains Conviction’s framework for value, defensibility and strategy in specifically assessing these companies.
# MLOps
# DevOps
# LLM Operations
# Machine Learning


Sarah Guo & Demetrios Brinkmann · Feb 17th, 2024
35:21
Video
Building the Next Generation of Reliable AI // Shreya Rajpal // AI in Production Keynote
In this talk, Shreya will share a candid look back at a year dedicated to developing reliable AI tools in the open-source community. The talk will explore which tools and techniques have proven effective and which ones have not, providing valuable insights from real-world experiences. Additionally, Shreya will offer predictions on the future of AI tooling, identifying emerging trends and potential breakthroughs. This presentation is designed for anyone interested in the practical aspects of AI development and the evolving landscape of open-source technology, offering both reflections on past lessons and forward-looking perspectives.
# AI
# MLOps Tools


Shreya Rajpal & Demetrios Brinkmann · Feb 17th, 2024
30:27
Video
Navigating through Retrieval Evaluation to demystify LLM Wonderland // Atita Arora // AI in Production
This session covers the pivotal role of retrieval evaluation in Large Language Model (LLM)-based applications like RAG, emphasizing its direct impact on the quality of generated responses. We explore the correlation between retrieval accuracy and answer quality, highlighting the significance of meticulous evaluation methodologies. - Atita Arora
# LLM
# Evaluation
# AI
# ML


Atita Arora & Demetrios Brinkmann · Feb 18th, 2024
12:53
Video
Productionizing Health Insurance Appeal Generation // Holden Karau // AI in Production Talk
This talk will cover how we fine-tuned a model to generate health insurance appeals. If you've ever received a health insurance denial and felt frustrated, this topic should resonate with you. Even if you haven't experienced this, come and learn about our adventures in using different cloud resources for fine-tuning and, ultimately, deploying on-premises Kubernetes in Fremont, CA. This includes the unexpected challenge of fitting graphics cards into the servers.
# Finetuning
# LLM
# Kubernetes


Holden Karau & Demetrios Brinkmann · Feb 18th, 2024
27:11
Video
Charting LLMOps Odyssey: Challenges and Adaptations
In this presentation, Yinxi Zhang navigates the iterative development of Large Language Model (LLM) applications and the intricacies of LLMOps design. They emphasize the importance of anchoring LLM development in practical business use cases and a deep understanding of one's own data. Continuous Integration and Continuous Deployment (CI/CD) should be a core component for LLM pipeline deployment, just as in Machine Learning Operations (MLOps). However, the unique challenges posed by LLMs include addressing data security, API governance, the imperative need for GPU infrastructure in inference, integration with external vector databases, and the absence of clear evaluation rubrics. The audience is invited to join as Yinxi illuminates strategies to overcome these challenges and make strategic adaptations. Yinxi's journey includes reference architectures for the seamless productionization of RAGs on the Databricks Lakehouse platform.
# LLM
# LLMOps
# Design ML
# RAGs


Yinxi Zhang & Demetrios Brinkmann · Feb 18th, 2024
38:53
Video
Vision Pipelines in Production: Serving & Optimisations
The discussion will center on transitioning from solution development to production, particularly focusing on vision models. Topics explored include fine-tuning LoRAs, upscaling pipelines, constraints-based generations, and step-by-step enhancements to achieve optimal performance and quality for a production-ready service.
# Vision models
# upscaling pipelines
# Finetuning


Biswaroop Bhattacharjee & Demetrios Brinkmann · Feb 22nd, 2024
13:24
Video
RagSys: RAG is Just RecSys in Disguise // Chang She // AI in Production Lightning Talk
What once was old is new again. With increasing experience in RAG, more attention is being directed towards improving retrieval quality. RAG pipelines increasingly resemble recommender pipelines, incorporating features such as hybrid search and reranking. This lightning talk will briefly examine the parallels between the two approaches and demonstrate how to implement hybrid reranking with LanceDB to enhance retrieval quality.
# RAG
# Hybrid reranking
# AI


Chang She & Demetrios Brinkmann · Feb 22nd, 2024
12:45
Video
Helix - Fine Tuning for Llamas // Kai Davenport // AI in Production Lightning Talk
A quick rundown of Helix and how it helps you fine-tune text and image AI, all using the latest open source models. Kai will discuss some of the issues that cropped up when creating and running a fine-tuning-as-a-service platform.
# Finetuning
# Open Source
# AI


Kai Davenport & Demetrios Brinkmann · Feb 22nd, 2024
13:37
Video
Graphs and Language // Louis Guitton // AI in Production Lightning Talk
"It is possible to build KGs with LMs through prompt engineering. But are we boiling the ocean? Can we improve the quality of the generated graph elements by using - dare I say it - SLMs (small language models)?"
# KG
# LLMs
# Prompt Engineering


Louis Guitton & Demetrios Brinkmann · Feb 22nd, 2024
11:28
Video
From Robotics to AI NPCs // Nyla Worker // AI in Production Talk
Nyla delves into the intersection of robotics, simulation, and AI techniques, particularly in the context of powering Non-Player Characters (NPCs) in games using multi-modal Large Language Models (LLMs). Drawing from the principles of training AIs in simulation environments, the talk explores how Convai utilizes these technologies to enable NPCs to perform actions and react dynamically to their virtual environments. Viewers can expect insights into the methodologies employed and the practical applications derived from robotics research.
# npc
# Multimodal LLM App
# AI
# GAMING


Nyla Worker & Demetrios Brinkmann · Feb 22nd, 2024
25:54
Video
From Research to Production: Fine-Tuning & Aligning LLMs // Philipp Schmid // AI in Production
Discover the essential steps in transitioning LLMs from research to production, with a focus on effective fine-tuning and alignment strategies. This session delves into how to fine-tune & evaluate LLMs with Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF)/Direct Preference Optimization (DPO), and their practical applications for aligning LLMs with production goals.
# LLM
# Fine-tuning LLMs
# dpo
# Evaluation


Philipp Schmid & Demetrios Brinkmann · Feb 25th, 2024
38:03
Video
The State of Production Machine Learning in 2024 // Alejandro Saucedo // AI in Production
As the number of production machine learning use-cases increases, we find ourselves facing new and bigger challenges where more is at stake. Because of this, it's critical to identify the key areas to focus our efforts, so we can ensure we're able to transition from machine learning models to reliable production machine learning systems that are robust and scalable. In this talk we dive into the state of production machine learning in 2024, and we will cover the concepts that make production machine learning so challenging, as well as some of the recommended tools available to tackle these challenges. We will take a deep dive into the production ML tooling ecosystem and into best practices that have been abstracted from production use-cases of machine learning operations at scale, as well as how to leverage tools that will allow us to deploy, explain, secure, monitor and scale production machine learning systems.
# LLM Use Cases
# LLM in Production
# MLOPs tooling


Alejandro Saucedo & Demetrios Brinkmann · Feb 25th, 2024
33:13
Video
LLMOps and GenAI at Enterprise Scale - Challenges and Opportunities
Generative AI is not going anywhere, but many organizations are struggling to translate a very active research and development activity from POC to production solutions. In this brief talk, I'll highlight some of the challenges I think we need to overcome if we want to deploy GenAI solutions at scale and I'll also talk about some of the opportunities this presents.
# LLMs
# GenAI
# NatWest


Andy McMahon & Adam Becker · Feb 27th, 2024
13:04
Video
Explaining ChatGPT to Anyone in 10 Minutes
Over the past few years, we have witnessed a rapid evolution of generative large language models (LLMs), culminating in the creation of unprecedented tools like ChatGPT. Generative AI has now become a popular topic among both researchers and the general public. Now more than ever before, it is important that researchers and engineers (i.e., those building the technology) develop an ability to communicate the nuances of their creations to others. A failure to communicate the technical aspects of AI in an understandable and accessible manner could lead to widespread public skepticism (e.g., research on nuclear energy went down a comparable path) or the enactment of overly-restrictive legislation that hinders forward progress in our field. Within this talk, we will take a small step towards solving these issues by proposing and outlining a simple, three-part framework for understanding and explaining generative LLMs.
# LLMs
# ChatGPT
# Rebuy Engine


Cameron Wolfe & Demetrios Brinkmann · Feb 27th, 2024
11:16
Video
Building a Python-Centric Feature Platform to Power Production AI Applications
In this talk, Matt walks through Tecton's journey to build a platform that can reliably power large-scale real-time AI applications while requiring nothing more than Python.
# AI Applications
# Python
# Tecton


Matt Bleifer & Adam Becker · Feb 27th, 2024
27:11
Video
Graduating from Proprietary to Open Source Models in Production
Model endpoints are a good way to prototype ML-powered applications. But in a production environment, you need security, privacy, compliance, reliability, and control over your model inference, as well as high-quality results, low latency, and reasonable cost at scale. Learn how AI-native companies, from startups to enterprises, are using open source ML models to power core production workloads performantly at scale.
# Machine Learning
# Open Source
# Baseten


Philip Kiely & Demetrios Brinkmann · Feb 27th, 2024
23:16
Video
Building AI Products across Multiple Domains: Commonalities & Non-Commonalities
"I will walk through some of the key things that I have noticed in the space of Applied AI - what it takes to build AI into products and surfaces that do not contain it. How do you persuade partners of value? How do you get things done? What pitfalls might you run into, and how do you solve them?"
# AI Products
# Applied AI
# Uber
12:32
Video
Model Merging and Mixtures of Experts
Model merging has recently become extremely popular in the open-source community. The idea of merging several fine-tuned models, or combining them into a Mixture of Experts (MoE), has led to new state-of-the-art LLMs. This talk introduces the main concepts around model merging and how to implement it using the mergekit library. It provides a notebook to create your own models and directly upload them to the Hugging Face Hub.
# Model Merging
# (MoE)
# JP Morgan Chase


Maxime Labonne & Adam Becker · Mar 4th, 2024
11:17
Video
Data Labeling Best Practices
Data labeling is a key part of fine-tuning open-source LLMs. However, poor labeling practices can hurt your LLM's performance. This lightning talk will cover data labeling best practices for hiring, preparing your data, and managing your data labelers.
# Data Labeling
# Fine-tuning LLMs
# Textmine


Charles Brecque & Adam Becker · Mar 4th, 2024
12:59
Video
A Survey of Production RAG Pain Points and Solutions
Large Language Models (LLMs) are revolutionizing how users can search for, interact with, and generate new content. There's been an explosion of interest around Retrieval Augmented Generation (RAG), enabling users to build applications such as chatbots, document search, workflow agents, and conversational assistants using LLMs on their private data. While setting up naive RAG is straightforward, building production RAG is very challenging. There are parameters and failure points along every stage of the stack that an AI engineer must solve in order to bring their app to production. This talk will cover the overall landscape of pain points and solutions around building production RAG, and also paint a picture of how this architecture will evolve over time.
# LLMs
# RAG
# LlamaIndex


Jerry Liu & Demetrios Brinkmann · Feb 28th, 2024
30:00
Video
LLM Use Cases in Production Panel
From startups achieving significant value with minor capabilities to AI revolutionizing sales calls and raising sales by 30%, we explore a series of interesting real-world use cases. Understanding the objectives and complexities of various industries, exploring the challenges of launching products, and highlighting the vital integration of the human touch with technology, this episode is a treasure trove of insights.
# LLM Use Cases
# Startups
# hello.theresidesk.com
# chaptr.xyz
# dataindependent.com



Greg Kamradt, Agnieszka Mikołajczyk-Bareła, Jason Liu & 2 more speakers · Feb 28th, 2024
30:49
Video
Evaluating Large Language Models for Production
In the rapidly evolving field of natural language processing, the evaluation of Large Language Models (LLMs) has become a critical area of focus. We will explore the importance of a robust evaluation strategy for LLMs and the challenges associated with traditional metrics such as ROUGE and BLEU. We will conclude the talk with some nontraditional metrics, such as correctness, faithfulness, and freshness, that are becoming increasingly important in the evaluation of LLMs.
# Evaluation
# Large Language Models
# You.com


Zairah Mustahsan & Demetrios Brinkmann · Mar 4th, 2024
19:14
Video
Productionizing AI: How to Think From the End
As builders, engineers, and creators, we often think about the full life-cycle of a machine learning or AI project from the start: gathering and cleaning the data, then training and evaluating a model. But what about the experiential qualities of an AI product that we want our user to be able to experience on the front end? Join me to learn about the foundational questions I ask myself and my team while building products that incorporate LLMs.
# Productionizing AI
# LLMs
# Bainbridge Capital


Annie Condon & Demetrios Brinkmann · Mar 4th, 2024
11:11
Video
Reliable Hallucination Detection in Large Language Models
Hallucination detection is a critical step toward understanding the trustworthiness of modern language models (LMs). To achieve this goal, we re-examine existing detection approaches based on the self-consistency of LMs and uncover two types of hallucinations, 1) question-level and 2) model-level, which cannot be effectively identified through self-consistency checks alone. Building upon this discovery, we propose a novel sampling-based method, semantic-aware cross-check consistency (SAC3), that expands on the principle of self-consistency checking. Our SAC3 approach incorporates additional mechanisms to detect both question-level and model-level hallucinations by leveraging advances including semantically equivalent question perturbation and cross-model response consistency checking. Through extensive and systematic empirical analysis, we demonstrate that SAC3 outperforms the state of the art in detecting both non-factual and factual statements across multiple question-answering and open-domain generation benchmarks.
# Hallucinations
# LLMs
# Intuit


Jiaxin Zhang & Adam Becker · Mar 4th, 2024
35:24
Video
Seeing Like a Language Model
What do language models see when they read and generate text? What are the terms by which they process the world? In this talk, I'll share some encouraging updates on my continuing exploration of how embeddings represent meaning, enabled by recent breakthroughs in interpretability research, and how these insights might help us build better capabilities for retrieval-augmented LLM systems and imagine more natural interfaces for reading and writing.
# LLMs
# Embedding
# Notion


Linus Lee & Demetrios Brinkmann · Mar 4th, 2024
33:44
Video
The Intersection of Graphs and Large Language Models
I intend to explore the intersection of graphs and Large Language Models (LLMs) and the benefits of combining them, delving into the engineering aspects while also touching on the practical applications from my startup's perspective. This talk will highlight my recent work and findings on the superiority of Retrieval Augmented Generation (RAG) Knowledge Graphs over traditional RAG with vector databases, underlining the profound implications of their interaction.
# Graphs
# Large Language Models
# Fribl


Anthony Alcaraz & Adam Becker · Mar 4th, 2024
15:08
Video
Opportunities and Challenges of Self-Hosting LLMs
LLM deployment is notoriously tricky, leaving ML teams with little time left to focus on driving business value. So what can we do? If you run or are a part of a data science team working with LLMs, this one’s for you.
# LLMs
# ML Teams
# TitanML


Meryem Arik & Adam Becker · Mar 4th, 2024
9:47
Video
The Future of RAG
New LLMs are constantly appearing in the AI landscape, and retrieval augmented generation (RAG) has become a dominant LLM design pattern. What will the future bring? Join Contextual AI VP Product Aditya Bindal for a deep dive into the next generation of foundation models that prioritize customization and privacy.
# Artifact Storage
# LLM Design Pattern
# ContextualAI


Aditya Bindal & Adam Becker · Mar 6th, 2024
37:41
Video
Lessons from Building LLM-based Social Media Products
The goal of the talk is to learn how to harness generative AI to build the right products for your users, efficiently. It'll cover learnings from different stages of a product, from the idea exploration stage, to hardware capacity planning, iterating on early versions, building early trust with your users, and finally measuring success over the long term.
# LLMs
# Social Media Products
# LinkedIn

Faizaan Charania · Mar 6th, 2024
19:46
Video
Making Sense of LLMOps
Lots of companies are investing time and money in LLMs, some even have customer-facing applications, but what about some common sense? Impact assessment | Risk assessment | Maturity assessment.
# LLMOps
# Ahold Delhaize
# Booking.com


Maria Vechtomova & Başak Tuğçe Eskili · Mar 6th, 2024
25:35
