MLOps Community

AI in Production

Video

Anatomy of a Software 3.0 Company // Sarah Guo // AI in Production Keynote

If software 2.0 was about designing data collection for neural network training, software 3.0 is about manipulating foundation models at a system level to create great end-user experiences. AI-native applications are “GPT wrappers” the way SaaS companies are database wrappers. This talk discusses the huge design space for software 3.0 applications and explains Conviction’s framework for value, defensibility, and strategy in assessing these companies.
# MLOps
# DevOps
# LLM Operations
# Machine Learning
Sarah Guo & Demetrios Brinkmann · Feb 17th, 2024
35:21
Video

Building the Next Generation of Reliable AI // Shreya Rajpal // AI in Production Keynote

In this talk, Shreya will share a candid look back at a year dedicated to developing reliable AI tools in the open-source community. The talk will explore which tools and techniques have proven effective and which ones have not, providing valuable insights from real-world experiences. Additionally, Shreya will offer predictions on the future of AI tooling, identifying emerging trends and potential breakthroughs. This presentation is designed for anyone interested in the practical aspects of AI development and the evolving landscape of open-source technology, offering both reflections on past lessons and forward-looking perspectives.
# AI
# MLOps Tools
Shreya Rajpal & Demetrios Brinkmann · Feb 17th, 2024
30:27
Video

Navigating through Retrieval Evaluation to demystify LLM Wonderland // Atita Arora // AI in Production

This session covers the pivotal role of retrieval evaluation in Large Language Model (LLM)-based applications like RAG, emphasizing its direct impact on the quality of the responses generated. We explore the correlation between retrieval accuracy and answer quality, highlighting the significance of meticulous evaluation methodologies. - Atita Arora
# LLM
# Evaluation
# AI
# ML
Atita Arora & Demetrios Brinkmann · Feb 18th, 2024
12:53
Video

Productionizing Health Insurance Appeal Generation // Holden Karau // AI in Production Talk

This talk will cover how we fine-tuned a model to generate health insurance appeals. If you've ever received a health insurance denial and felt frustrated, this topic should resonate with you. Even if you haven't experienced this, come and learn about our adventures in using different cloud resources for fine-tuning and, ultimately, deploying on-premises Kubernetes in Fremont, CA. This includes the unexpected challenge of fitting graphics cards into the servers.
# Finetuning
# LLM
# Kubernetes
Holden Karau & Demetrios Brinkmann · Feb 18th, 2024
27:11
Video

Charting LLMOps Odyssey: Challenges and Adaptations

In this presentation, Yinxi Zhang navigates the iterative development of Large Language Model (LLM) applications and the intricacies of LLMOps design. They emphasize the importance of anchoring LLM development in practical business use cases and a deep understanding of one's own data. Continuous Integration and Continuous Deployment (CI/CD) should be a core component for LLM pipeline deployment, just as in Machine Learning Operations (MLOps). However, the unique challenges posed by LLMs include addressing data security, API governance, the imperative need for GPU infrastructure in inference, integration with external vector databases, and the absence of clear evaluation rubrics. The audience is invited to join as Yinxi illuminates strategies to overcome these challenges and make strategic adaptations. Yinxi's journey includes reference architectures for the seamless productionization of RAGs on the Databricks Lakehouse platform.
# LLM
# LLMOps
# Design ML
# RAGs
Yinxi Zhang & Demetrios Brinkmann · Feb 18th, 2024
38:53
Video

Vision Pipelines in Production: Serving & Optimisations

The discussion will center on transitioning from solution development to production, with a particular focus on vision models. Topics explored include fine-tuning LoRAs, upscaling pipelines, constraint-based generation, and step-by-step enhancements to achieve optimal performance and quality for a production-ready service.
# Vision models
# upscaling pipelines
# Finetuning
Biswaroop Bhattacharjee & Demetrios Brinkmann · Feb 22nd, 2024
13:24
Video

RagSys: RAG is Just RecSys in Disguise // Chang She // AI in Production Lightning Talk

What once was old is new again. With increasing experience in RAG, more attention is being directed towards improving retrieval quality. RAG pipelines are evolving to resemble recommender pipelines, incorporating features such as hybrid search and reranking. This lightning talk will briefly examine the parallels between the two approaches and demonstrate how to implement hybrid reranking with LanceDB to enhance retrieval quality.
# RAG
# Hybrid reranking
# AI
Chang She & Demetrios Brinkmann · Feb 22nd, 2024
12:45
Video

Helix - Fine Tuning for Llamas // Kai Davenport // AI in Production Lightning Talk

A quick rundown of Helix and how it helps you fine-tune text and image AI using the latest open source models. Kai will discuss some of the issues that cropped up when creating and running a fine-tuning-as-a-service platform.
# Finetuning
# Open Source
# AI
Kai Davenport & Demetrios Brinkmann · Feb 22nd, 2024
13:37
Video

Graphs and Language // Louis Guitton // AI in Production Lightning Talk

"It is possible to build KGs with LMs through prompt engineering. But are we boiling the ocean? Can we improve the quality of the generated graph elements by using - dare I say it - SLMs (small language models)?"
# KG
# LLMs
# Prompt Engineering
Louis Guitton & Demetrios Brinkmann · Feb 22nd, 2024
11:28
Video

From Robotics to AI NPCs // Nyla Worker // AI in Production Talk

Nyla delves into the intersection of robotics, simulation, and AI techniques, particularly in the context of powering Non-Player Characters (NPCs) in games using multi-modal Large Language Models (LLMs). Drawing from the principles of training AIs in simulation environments, the talk explores how Convai utilizes these technologies to enable NPCs to perform actions and react dynamically to their virtual environments. Viewers can expect insights into the methodologies employed and the practical applications derived from robotics research.
# npc
# Multimodal LLM App
# AI
# GAMING
Nyla Worker & Demetrios Brinkmann · Feb 22nd, 2024
25:54
Video

From Research to Production: Fine-Tuning & Aligning LLMs // Philipp Schmid // AI in Production

Discover the essential steps in transitioning LLMs from research to production, with a focus on effective fine-tuning and alignment strategies. This session delves into how to fine-tune & evaluate LLMs with Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF)/Direct Preference Optimization (DPO), and their practical applications for aligning LLMs with production goals.
# LLM
# Fine-tuning LLMs
# dpo
# Evaluation
Philipp Schmid & Demetrios Brinkmann · Feb 25th, 2024
38:03
Video

The State of Production Machine Learning in 2024 // Alejandro Saucedo // AI in Production

As the number of production machine learning use-cases increases, we find ourselves facing new and bigger challenges where more is at stake. Because of this, it's critical to identify the key areas on which to focus our efforts, so we can transition from machine learning models to reliable production machine learning systems that are robust and scalable. In this talk we dive into the state of production machine learning in 2024, covering the concepts that make production machine learning so challenging, as well as some of the recommended tools available to tackle these challenges. We will take a deep dive into the production ML tooling ecosystem and into best practices that have been abstracted from production use-cases of machine learning operations at scale, as well as how to leverage tools that will allow us to deploy, explain, secure, monitor and scale production machine learning systems.
# LLM Use Cases
# LLM in Production
# MLOPs tooling
Alejandro Saucedo & Demetrios Brinkmann · Feb 25th, 2024
33:13
Video

LLMOps and GenAI at Enterprise Scale - Challenges and Opportunities

Generative AI is not going anywhere, but many organizations are struggling to take very active research and development work from POC to production solutions. In this brief talk, I'll highlight some of the challenges I think we need to overcome if we want to deploy GenAI solutions at scale, and I'll also talk about some of the opportunities this presents.
# LLMs
# GenAI
# NatWest
Andy McMahon & Adam Becker · Feb 27th, 2024
13:04
Video

Explaining ChatGPT to Anyone in 10 Minutes

Over the past few years, we have witnessed a rapid evolution of generative large language models (LLMs), culminating in the creation of unprecedented tools like ChatGPT. Generative AI has now become a popular topic among both researchers and the general public. Now more than ever, it is important that researchers and engineers (i.e., those building the technology) develop an ability to communicate the nuances of their creations to others. A failure to communicate the technical aspects of AI in an understandable and accessible manner could lead to widespread public skepticism (e.g., research on nuclear energy went down a comparable path) or the enactment of overly restrictive legislation that hinders forward progress in our field. In this talk, we will take a small step towards solving these issues by proposing and outlining a simple, three-part framework for understanding and explaining generative LLMs.
# LLMs
# ChatGPT
# Rebuy Engine
Cameron Wolfe & Demetrios Brinkmann · Feb 27th, 2024
11:16
Video

Building a Python-Centric Feature Platform to Power Production AI Applications

In this talk, Matt walks through Tecton's journey to build a platform that can reliably power large-scale real-time AI applications while requiring nothing more than Python.
# AI Applications
# Python
# Tecton
Matt Bleifer & Adam Becker · Feb 27th, 2024
27:11
Video

Graduating from Proprietary to Open Source Models in Production

Model endpoints are a good way to prototype ML-powered applications. But in a production environment, you need security, privacy, compliance, reliability, and control over your model inference, as well as high-quality results, low latency, and reasonable cost at scale. Learn how AI-native companies, from startups to enterprises, are using open source ML models to power core production workloads performantly at scale.
# Machine Learning
# Open Source
# Baseten
Philip Kiely & Demetrios Brinkmann · Feb 27th, 2024
23:16
Video

Building AI Products across Multiple Domains: Commonalities & Non-Commonalities

"I will walk through some of the key things that I have noticed in the space of Applied AI - what it takes to build AI into products and surfaces that do not contain it. How do you persuade partners of value? How do you get things done? What pitfalls might you run into, and how do you solve them?"
# AI Products
# Applied AI
# Uber
12:32
Video

Model Merging and Mixtures of Experts

Model merging has recently become extremely popular in the open-source community. The idea of merging several fine-tuned models, or combining them into a Mixture of Experts (MoE), led to new state-of-the-art LLMs. This talk introduces the main concepts around model merging and how to implement it using the mergekit library. It provides a notebook to create your own models and directly upload them on the Hugging Face Hub.
# Model Merging
# (MoE)
# JP Morgan Chase
Maxime Labonne & Adam Becker · Mar 4th, 2024
11:17
Video

Data Labeling Best Practices

Data labeling is a key part of fine-tuning open-source LLMs. However, poor labeling practices can hurt your LLM's performance. This lightning talk will cover data labeling best practices, from hiring to preparing your data and managing your data labelers.
# Data Labeling
# Fine-tuning LLMs
# Textmine
Charles Brecque & Adam Becker · Mar 4th, 2024
12:59
Video

A Survey of Production RAG Pain Points and Solutions

Large Language Models (LLMs) are revolutionizing how users can search for, interact with, and generate new content. There's been an explosion of interest around Retrieval Augmented Generation (RAG), enabling users to build applications such as chatbots, document search, workflow agents, and conversational assistants using LLMs on their private data. While setting up naive RAG is straightforward, building production RAG is very challenging. There are parameters and failure points along every stage of the stack that an AI engineer must solve in order to bring their app to production. This talk will cover the overall landscape of pain points and solutions around building production RAG, and also paint a picture of how this architecture will evolve over time.
# LLMs
# RAG
# LlamaIndex
Jerry Liu & Demetrios Brinkmann · Feb 28th, 2024
30:00
Video

LLM Use Cases in Production Panel

From startups achieving significant value with minor capabilities to AI revolutionizing sales calls and raising sales by 30%, we explore a series of interesting real-world use cases. Understanding the objectives and complexities of various industries, exploring the challenges of launching products, and highlighting the vital integration of the human touch with technology, this episode is a treasure trove of insights.
# LLM Use Cases
# Startups
# hello.theresidesk.com
# chaptr.xyz
# dataindependent.com
Greg Kamradt, Agnieszka Mikołajczyk-Bareła, Jason Liu & 2 more speakers · Feb 28th, 2024
30:49
Video

Evaluating Large Language Models for Production

In the rapidly evolving field of natural language processing, the evaluation of Large Language Models (LLMs) has become a critical area of focus. We will explore the importance of a robust evaluation strategy for LLMs and the challenges associated with traditional metrics such as ROUGE and BLEU. We will conclude the talk with some nontraditional metrics, such as correctness, faithfulness, and freshness, that are becoming increasingly important in the evaluation of LLMs.
# Evaluation
# Large Language Models
# You.com
Zairah Mustahsan & Demetrios Brinkmann · Mar 4th, 2024
19:14
Video

Productionizing AI: How to Think From the End

As builders, engineers, and creators, we are often thinking about starting the full life-cycle of a machine learning or AI project from gathering data, cleaning the data, and training and evaluating a model. But what about the experiential qualities of an AI product that we want our user to be able to experience on the front end? Join me to learn about the foundational questions I ask myself and my team while building products that incorporate LLMs.
# Productionizing AI
# LLMs
# Bainbridge Capital
Annie Condon & Demetrios Brinkmann · Mar 4th, 2024
11:11
Video

Reliable Hallucination Detection in Large Language Models

Hallucination detection is a critical step toward understanding the trustworthiness of modern language models (LMs). To achieve this goal, we re-examine existing detection approaches based on the self-consistency of LMs and uncover two types of hallucinations, 1) question-level and 2) model-level, which cannot be effectively identified through self-consistency checks alone. Building upon this discovery, we propose a novel sampling-based method, semantic-aware cross-check consistency (SAC3), that expands on the principle of self-consistency checking. Our SAC3 approach incorporates additional mechanisms to detect both question-level and model-level hallucinations by leveraging advances including semantically equivalent question perturbation and cross-model response consistency checking. Through extensive and systematic empirical analysis, we demonstrate that SAC3 outperforms the state of the art in detecting both non-factual and factual statements across multiple question-answering and open-domain generation benchmarks.
# Hallucinations
# LLMs
# Intuit
Jiaxin Zhang & Adam Becker · Mar 4th, 2024
35:24
Video

Seeing Like a Language Model

What do language models see when they read and generate text? What are the terms by which they process the world? In this talk, I'll share some encouraging updates on my continuing exploration of how embeddings represent meaning, enabled by recent breakthroughs in interpretability research, and how these insights might help us build better capabilities for retrieval-augmented LLM systems and imagine more natural interfaces for reading and writing.
# LLMs
# Embedding
# Notion
Linus Lee & Demetrios Brinkmann · Mar 4th, 2024
33:44
Video

The Intersection of Graphs and Large Language Models

This talk covers the intersection of graphs and Large Language Models (LLMs). I intend to explore the benefits of combining graphs with LLMs, delving into the engineering aspects while also touching on the practical applications from my startup's perspective. The talk will highlight my recent work and findings on the superiority of knowledge-graph-based Retrieval Augmented Generation (RAG) over traditional RAG with vector databases, underlining the profound implications of their interaction.
# Graphs
# Large Language Models
# Fribl
Anthony Alcaraz & Adam Becker · Mar 4th, 2024
15:08
Video

Opportunities and Challenges of Self-Hosting LLMs

LLM deployment is notoriously tricky, leaving ML teams with little time left to focus on driving business value. So what can we do? If you run or are a part of a data science team working with LLMs, this one’s for you.
# LLMs
# ML Teams
# TitanML
Meryem Arik & Adam Becker · Mar 4th, 2024
9:47
Video

The Future of RAG

New LLMs are constantly appearing in the AI landscape, and retrieval augmented generation (RAG) has become a dominant LLM design pattern. What will the future bring? Join Contextual AI VP Product Aditya Bindal for a deep dive into the next generation of foundation models that prioritize customization and privacy.
# Artifact Storage
# LLM Design Pattern
# ContextualAI
Aditya Bindal & Adam Becker · Mar 6th, 2024
37:41
Video

Lessons from Building LLM-based Social Media Products

This talk aims to show how to harness generative AI to build the right products for your users, efficiently. It covers lessons from different stages of a product: idea exploration, hardware capacity planning, iterating on early versions, building early trust with your users, and finally measuring success over the long term.
# LLMs
# Social Media Products
# LinkedIn
Faizaan Charania · Mar 6th, 2024
19:46
Video

Making Sense of LLMOps

Lots of companies are investing time and money in LLMs, some even have customer-facing applications, but what about some common sense? Impact assessment | Risk assessment | Maturity assessment.
# LLMOps
# Ahold Delhaize
# Booking.com
Maria Vechtomova & Başak Tuğçe Eskili · Mar 6th, 2024
25:35