Join us for two days of conversations with some of our favorite people at the forefront of using LLMs in the wild, plus an in-person workshop in San Francisco, hosted by Anyscale, on how to build and deploy LLM-based apps.
There will be over 50 speakers from Stripe, Meta, Canva, Databricks, Anthropic, Cohere, Redis, LangChain, Chroma, Humanloop, and many more.
This all started after we put together the LLM in Production survey and realized there are still lots of unknowns when working with LLMs, especially at scale. We open-sourced all the responses, and we decided that if no one else was going to talk about working with LLMs in a non-over-hyped way, we would have to.
Let's discover how to use these damn probabilistic models in the best ways possible without sacrificing the necessary software design building blocks.
Expect all the fun and learnings from the first one. DOUBLED.
And remember, there will be some sweeeet sweet swag giveaways.
Huge shoutout to all the sponsors of this event; find more info about them below.
Join us in San Francisco for this LLM-based applications workshop, hosted by Anyscale, where you'll use libraries like Ray, HuggingFace, and LangChain to build LLM-based applications on top of open-source code, models, and data. You'll learn about scaling, fine-tuning, and inference for LLMs, along with their trade-offs, and how to use embedding models and vector stores. This is a great opportunity to learn how modern deployment tools can run your application online and continually improve it.
Learn more here: https://home.mlops.community/public/events/ray-workshop-2023-06-15
Plus a little tl;dr summary of the LLM in Production survey report.
Large language models are fluent text generators, but they often make errors, which makes them difficult to deploy in high-stakes applications. Using them in more complicated pipelines, such as retrieval pipelines or agents, exacerbates the problem. In this talk, Matei will cover emerging techniques in the field of “LLMOps” — how to build, tune and maintain LLM-based applications with high quality. The simplest tools are ones to test and visualize LLM results, some of which are now being incorporated into MLOps frameworks like MLflow. However, there are also rich techniques emerging to “program” LLM pipelines and control LLMs’ outputs to achieve desired goals.
Matei will discuss Demonstrate-Search-Predict (DSP) from his group as an example programming framework that can automatically improve an LLM-based application based on feedback, along with other open-source tools for controlling outputs and generating better training and evaluation data for LLMs. This talk is based on Databricks' experience deploying LLMs in many applications, including the QA bot on their public website, internal QA bots, code assistants, and others, all of which are making their way into their MLOps products and MLflow.
What do we need to be aware of when building for production? In this talk, we will explore the key challenges that arise when taking an LLM to production.
The journey from LLM PoCs to production deployment is fraught with unique challenges, from maintaining model reliability to effectively managing costs. In this talk, we delve deep into these complexities, outlining design patterns for successful LLM production, the role of vector databases, strategies to enhance reliability, and cost-effective methodologies.
Language models are very complex, which introduces several challenges for interpretability. The large amounts of data required to train these black-box language models make it even harder to understand why a language model generates a particular output. In the past, transformer models were typically evaluated using perplexity, BLEU score, or human evaluation.
However, LLMs amplify the problem even further: their generative nature makes them more susceptible to hallucinations and factual inaccuracies. Evaluation therefore becomes an important concern.
Building a chatbot is not easy... Or is it? We need:
An embedding model that translates questions into vectors, a vector database to search, and an LLM to generate the answers.
We can orchestrate the job using LangChain with minimal development.
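As a rough sketch of how those three pieces fit together (a minimal example assuming the 2023-era LangChain API, OpenAI credentials, and FAISS installed; the documents and model choice are illustrative):

```python
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Illustrative documents; in practice these come from your knowledge base.
docs = [
    "Our return policy lasts 30 days from the date of delivery.",
    "Support is available 24/7 via in-app chat.",
]

# 1) Embed the documents and index them in a vector store.
vectorstore = FAISS.from_texts(docs, OpenAIEmbeddings())

# 2) Wire a retriever and an LLM into a question-answering chain.
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
    retriever=vectorstore.as_retriever(search_kwargs={"k": 2}),
)

# 3) Ask a question; the chain retrieves context and generates an answer.
print(qa.run("How long do I have to return an item?"))
```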
Using Wardley Maps, we can understand value chains and map out the landscape, then use that understanding to develop strategies and decide where to target our efforts.
Take a moment to randomly match with others in this event by participating in the networking sessions. To access the random introductions click on the match tab in the left sidebar.
Bring your prompts to the chat cause we will be improvising songs from the audience's suggestions!
Proprietary LLMs are difficult for enterprises to adopt because of security and data privacy concerns. Open-source LLMs can circumvent many of these problems. While open LLMs are incredibly exciting, they're also a nightmare to deploy and operate in the cloud. Aqueduct enables you to run open LLMs in a few lines of vanilla Python on any cloud infrastructure that you use.
In the last LLM in Production event, I spoke on some of the ways we've seen people use a vector database for large language models. This included use cases like information/context retrieval, conversational memory for chatbots, and semantic caching.
These are great and make for flashy demos; however, using them in production isn't trivial. Oftentimes, the less flashy side of these use cases presents huge challenges, such as: Advice on prompts? How do I chunk up text? What if I need HIPAA compliance? On-premise? What if I change my embeddings model? What index type? How do I do A/B tests? Which cloud platform or model API should I use? Deployment strategies? How can I inject features from my feature platform? LangChain or LlamaIndex or RelevanceAI???
This talk distills more than a year of deploying Redis for these customer use cases into 20 minutes.
This workshop focuses on the crucial task of constructing and managing datasets specifically designed for reinforcement learning from human feedback (RLHF) and large language model (LLM) fine-tuning. We will explore the utilization of Argilla, an open-source data platform that facilitates the integration of human and machine feedback. Participants will learn effective strategies for dataset construction, including techniques for data curation and annotation. The workshop aims to equip attendees with the necessary knowledge and skills to enhance the performance and adaptability of RLHF and LLM models through the use of Argilla's powerful data management capabilities.
As Foundation Models (FMs) continue to grow in size, innovations continue to push the boundaries of what these models can do on language and image tasks. This talk will describe our work on applying foundation models to structured data-wrangling tasks like data cleaning and integration. We will present our results to evaluate FMs' out-of-the-box capabilities on these tasks, as well as discuss challenges and solutions that these models present for production deployment.
There are key areas we must be aware of when working with LLMs. High costs and low latency requirements are just the tip of the iceberg. In this panel we will hear about common pitfalls and challenges we must keep in mind when building on top of LLMs.
How do you use a Large Language Model (LLM) to create memes? We'll discuss ImgFlip's unique dataset, the selection and fine-tuning of a commercially usable LLM, and the associated challenges. Of course, we'll also demonstrate the model prototype itself. We will also discuss the challenges we anticipate in productionizing an LLM used by millions of users.
Autonomous AI agents have gotten a lot of attention recently, but they're mostly just toys. What are the primitives that we need to build more reliable agents, and what are the main business use cases that agentic automation will enable over the next few years?
It’s silly to think of training and using large LANGUAGE models without any input from the study of language itself. Linguistics is not the only field of knowledge that can improve LLMs, since they sit at the intersection of several fields; however, it can help us not only improve current model performance but also see clearly where future improvements will come from.
Humanloop has now seen hundreds of companies go on the journey from playground to production. In this talk, we'll share case studies of what has and hasn't worked: the common pitfalls, emerging best practices, and suggestions for how to plan in such a quickly evolving space.
This session provides an overview of the evolving landscape of Generative AI, with a focus on the latest trends and technologies that shape this field. Designed with startups in mind, the talk offers practical insights on how to adapt and leverage these advancements to enhance their products. Attendees will acquire valuable knowledge to navigate the dynamic landscape of Generative AI, enabling them to stay up-to-date and harness untapped potential for the success of their startups.
Here’s the truth: troubleshooting models based on unstructured data is notoriously difficult. The measures typically used for drift in tabular data do not extend to unstructured data. The general challenge with measuring unstructured data drift is that you need to understand the change in relationships inside the unstructured data itself. In short, you need to understand the data in a deeper way before you can understand drift and performance degradation.
In this presentation, Claire Long will present findings from research on ways to measure vector/embedding drift for image and language models. With lessons learned from testing different approaches (including Euclidean and Cosine distance) across billions of streams and use cases, she will dive into how to detect whether two unstructured language datasets are different — and, if so, how to understand that difference using techniques such as UMAP.
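As a rough illustration of the kind of measurement involved (a minimal numpy sketch under simplifying assumptions, not the method presented in the talk), one way to start is comparing the centroids of a reference embedding set and a production window:

```python
import numpy as np

def centroid_drift(reference: np.ndarray, production: np.ndarray):
    """Compare two embedding sets (n_samples x dim) via their centroids."""
    ref_c, prod_c = reference.mean(axis=0), production.mean(axis=0)
    euclidean = np.linalg.norm(ref_c - prod_c)
    cosine = 1.0 - np.dot(ref_c, prod_c) / (
        np.linalg.norm(ref_c) * np.linalg.norm(prod_c)
    )
    return euclidean, cosine

# Illustrative data: a baseline set vs. a deliberately shifted production window.
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=(1000, 384))
production = rng.normal(0.3, 1.0, size=(1000, 384))
print(centroid_drift(reference, production))
```

Centroid distance is only one coarse signal; visualizing both sets with something like UMAP, as the talk describes, helps explain what actually changed.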
Put down the screen for a moment, close your eyes, and bliss out in between the sessions.
Take a moment to randomly match with others in this event by participating in the networking sessions. To access the random introductions click on the match tab in the left sidebar.
While we've seen great progress on Open Source LLMs, we haven't seen the same level of progress on systems to serve those LLMs in production contexts. In this presentation, I work through some of the challenges of taking open source models and serving them in production.
The rapid adoption of large language models (LLMs) is transforming how businesses communicate, learn, and work, making AI safety and security a priority. This captivating and insightful talk will delve into the challenges and risks associated with LLM adoption and unveil AIShield.GuArdIan, a game-changing technology that enables businesses to leverage ChatGPT-like AI without compromising compliance. AIShield.GuArdIan's unique approach ensures legal, policy, ethical, role-based, and usage-based compliance, allowing companies to harness the power of LLMs safely. Join us on this riveting journey as we reshape the future of AI, empowering industries to unlock the full potential of LLMs securely and responsibly. Don't miss this opportunity to be at the forefront of responsible AI usage: reserve your seat today and take the first step towards a secure AI-powered future!
Access to foundational models is at every developer’s fingertips through commercial solutions or in the open source. However, these models are not competent enough to perform specialized tasks. Differentiation becomes more challenging in this world. We’ll walk you through how you can develop custom models using fine-tuning and data-driven techniques such as self-refinement to create differentiated AI products that solve problems that were previously unattainable.
Large Language Models are an especially exciting opportunity for Operations: they excel at answering questions, completing sentences, and summarizing text while requiring ~100x less training data than the previous generation of models.
In this talk, Sophie will discuss lessons learned productionising Stripe's first application of large language models: providing answers to user questions for Stripe Support.
Large Language Models require a new set of tools... or do they? K8s is a beast and we like it that way. How can we best leverage all the battle-hardened tech that k8s has to offer to make sure our LLMs go brrrrrrr? Let's talk about it in this chat.
Document Question-Answering is a popular LLM use-case. LangChain makes it easy to assemble LLM components (e.g., models and retrievers) into chains that support question-answering. But it is not always obvious how to (1) evaluate the answer quality and (2) use that evaluation to guide improved QA chain settings (e.g., chunk size, retrieved docs count) or components (e.g., model or retriever choice). We recently released an open-source, hosted app to address these limitations (see blog post here). We have used it to compare the performance of various retrieval methods, including Anthropic's 100k context length model (blog post here). This talk will discuss our results and future plans.
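As a rough, library-agnostic illustration of that kind of evaluation (not the released app itself), one option is to have an LLM grade each predicted answer against a reference; the `call_llm` helper below is a hypothetical stand-in for whatever model you use as the grader:

```python
GRADING_PROMPT = """You are grading a question-answering system.
Question: {question}
Reference answer: {reference}
Model answer: {prediction}
Reply with GRADE: CORRECT or GRADE: INCORRECT, then a one-line reason."""

def grade(call_llm, question: str, reference: str, prediction: str) -> bool:
    """Return True if the grader LLM judges the prediction correct."""
    verdict = call_llm(GRADING_PROMPT.format(
        question=question, reference=reference, prediction=prediction))
    return "GRADE: CORRECT" in verdict.upper()

# Sweep chain settings (chunk size, k, retriever, model) over a fixed eval set
# and compare the fraction of answers graded correct for each configuration.
```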
Retrieval augmented generation with embeddings and LLMs has become an important workflow for AI applications.
While embedding-based retrieval is very powerful for applications like 'chat with my documents', users and developers should be aware of key limitations, and techniques to mitigate them.
The impressive reasoning abilities of LLMs can be an attractive proposition for many businesses, but using foundational models through APIs can be slow, with unpredictable latency. Self-hosting models can be an attractive alternative, but how do you choose which model to use, and if you have a latency or inference budget, how do you make the model fit? We will discuss how pseudo-labeling, knowledge distillation, pruning, and quantization can ensure the highest efficiency possible.
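As one concrete example of those techniques, here is a minimal PyTorch sketch of the classic knowledge-distillation objective, where a small student model is trained to match a larger teacher's softened outputs alongside the usual hard labels (the temperature and mixing weight are illustrative assumptions, not recommendations from the talk):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Blend soft-target KL (teacher -> student) with hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```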
You think you've got prompting skills? Been reading too many Reddit threads and thinking you can crack the code? Well, let's see what you are capable of!
Come join us on track one for some fun games and great swag giveaways.
Generalized models solve general problems. The real value comes from training a large language model (LLM) on your own data and fine-tuning it to deliver on your specific ML task.
Now you can build your own custom LLM, trained on your data and fine-tuned for your generative or predictive task, in ten lines of code with Predibase and Ludwig, the low-code deep learning framework developed and open-sourced by Uber, now maintained as part of the Linux Foundation. Using Ludwig's declarative approach to model customization, you can take a pre-trained large language model like LLaMA and tune it to output data specific to your organization, with outputs conforming to an exact schema. This makes building LLMs fast, easy, and economical.
In this session, Travis Addair, CTO of Predibase and co-maintainer of open-source Ludwig, will share how LLMs can be tailored to solve specific tasks from classification to content generation, and how you can get started building a custom LLM in just a few lines of code.
You can’t build robust systems with inconsistent, unstructured text output from LLMs. Moreover, LLM integrations scare corporate lawyers, finance departments, and security professionals due to hallucinations, cost, lack of compliance (e.g., HIPAA), leaked IP/PII, and “injection” vulnerabilities. This talk will cover some practical methodologies for getting consistent, structured output from compliant AI systems. These systems, driven by open access models and various kinds of LLM wrappers, can help you delight customers AND navigate the increasing restrictions on "GPT" models.
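By way of illustration, here is a hedged, library-agnostic sketch of one common pattern for consistent structured output: request JSON, validate it, and retry with the validation error appended. The `call_llm` helper and the required keys are hypothetical placeholders, not a specific product's API:

```python
import json

REQUIRED_KEYS = {"sentiment", "confidence"}  # hypothetical schema

def structured_call(call_llm, prompt: str, max_retries: int = 3) -> dict:
    """Keep asking until the model returns JSON with the required keys."""
    attempt = prompt
    for _ in range(max_retries):
        raw = call_llm(attempt)
        try:
            parsed = json.loads(raw)
            if REQUIRED_KEYS.issubset(parsed):
                return parsed
            error = f"missing keys: {REQUIRED_KEYS - parsed.keys()}"
        except json.JSONDecodeError as exc:
            error = f"invalid JSON: {exc}"
        attempt = (f"{prompt}\n\nYour previous reply failed validation "
                   f"({error}). Reply with valid JSON only.")
    raise ValueError("model never produced valid structured output")
```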
Copilots embedded within SaaS applications have become one of the dominant ways of leveraging LLMs within products. In this lightning talk, I’ll review some of the dominant UI paradigms and features, general design patterns and system architectures, and top challenges and future frontiers of production copilot systems.
Large Language Models (LLMs) have shown remarkable capabilities in domains such as question-answering and information recall, but every so often, they just make stuff up. In this talk, we'll take a look at “LLM Hallucinations" and explore strategies to keep LLMs grounded and reliable in real-world applications.
We’ll start by walking through an example implementation of an "LLM-powered Support Center" to illustrate the problems caused by hallucinations. Next, I'll demonstrate how leveraging a searchable knowledge base can ensure that the assistant delivers trustworthy responses. We’ll wrap up by exploring the scalability of this approach and its potential impact on the future of AI-driven applications.
Many researchers have recently proposed different approaches to building recommender systems using LLMs. These methods convert different recommendation tasks into either language understanding or language generation templates. This talk highlights some of the recent work done on this theme.
This talk covers the lessons my team has learned building Code Suggestions, with reference to the model, ML infra, evaluation, compute, and cost.
It's gonna be special! We promise!
How will we teach large models to behave in organisations at scale? We'll be discussing both the technical and the user-experience challenges of hundreds of humans influencing one agent. Who must it listen to? How must new learnings be represented? How can we make labeling LLMs an ongoing collaboration among the people using them?
What do MLOps and LLMOps have in common? What has changed? Are these just new buzzwords, or is there validity in calling this ops something new?
Writing art prompts can be challenging, and that's why LLMs are the best prompters for AI art. In this talk, we will explore how LLMs make fantastic prompt artists, capable of constructing very expressive art prompts that lead to striking works of art across use cases.
Take a moment to randomly match with others in this event by participating in the networking sessions. To access the random introductions click on the match tab in the left sidebar.
Get out your family game night skills
For two whole years working with a large LLM deployment, I always felt uncomfortable. How is my system performing? Do my users like the outputs? Who needs help? Probabilistic systems can make this really hard to understand. In this talk, we'll discuss practical, implementable steps to secure your LLM system and gain confidence while deploying to production.
This talk describes how we think about collecting RLHF data at Surge. We highlight the risks of collecting low-quality data for RLHF and describe some of the practical strategies we use in our full-stack RLHF data collection product.
There has been remarkable progress in harnessing the power of LLMs for complex applications. However, the development of LLMs poses several challenges, such as their inherent brittleness and the complexities of obtaining consistent and accurate outputs. In this presentation, we present Guardrails AI as a pioneering solution that empowers developers with a robust LLM development framework, enhanced control mechanisms, and improved model performance, fostering the creation of more effective and responsible applications.
In this quick talk, Omar will talk about RLHF, one of the techniques behind ChatGPT and other successful ML models. Omar will also talk about efficient training techniques (PEFT), on-device ML, and optimizations.
Take a moment to randomly match with others in this event by participating in the networking sessions. To access the random introductions click on the match tab in the left sidebar.
Cat Cow and some Down Dog to take your minds off the LLM hallucinations.
The main problems faced by LLMs such as hallucinations, lack of domain knowledge, or outdated info are all data problems. How do we fix these data problems? Add a layer on top of the LLM with the ability to search the data we need to use.
As new LLM-driven applications reach production we need to revisit some of our traditional AI Governance frameworks. Diego will provide a brief introduction on what is changing in a critical step to seeing more of these applications go live.
Evaluating the performance of language models (LLMs) is a pressing issue for companies working with generative AI. Defining what makes a model "good" and measuring its performance are challenging due to the diverse range of LLM applications. Existing evaluation methods, including benchmarks and user preference comparisons, have limitations in scalability and objectivity. The future of LLM evaluation lies in scaling testing with machine learning systems, such as reward models that capture user preferences, and simulating user sessions to generate comprehensive test cases. These approaches will help developers select models, create effective prompts, ensure compliance, and enhance LLM quality.
LLMs are tremendously flexible, but can they bring additional value for classification tasks on tabular datasets?
I investigated whether LLM-based label predictions can be an alternative to typical machine learning classification algorithms for tabular data, by translating the tabular data into natural language and fine-tuning an LLM on it.
This talk compares results of LLM and XGBoost predictions.
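As a rough illustration of that translation step (the column names and the churn task below are made up for the example, not the speaker's dataset), each tabular row can be serialized into a natural-language prompt that the fine-tuned LLM completes with a class label:

```python
def row_to_prompt(row: dict) -> str:
    """Serialize one tabular row into a natural-language classification prompt."""
    features = ", ".join(f"{col} is {val}" for col, val in row.items())
    return f"The customer's {features}. Will this customer churn? Answer yes or no:"

example = {"age": 42, "plan": "premium", "monthly_spend": 83.5, "support_tickets": 4}
print(row_to_prompt(example))
# The customer's age is 42, plan is premium, monthly_spend is 83.5,
# support_tickets is 4. Will this customer churn? Answer yes or no:
```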
We will talk about best practices to ensure security, reliability, scalability and speed of LLM deployments.
Perplexity AI is an answer engine that aims to deliver accurate answers to questions using LLMs. Perplexity's CEO Aravind Srinivas will introduce the product and discuss some of the challenges associated with building LLMs.
'Cause every once in a while you just gotta move ya body.
Take a moment to randomly match with others in this event by participating in the networking sessions. To access the random introductions click on the match tab in the left sidebar.
It’s clear that test-driven development plays a pivotal role in prompt engineering, potentially even more so than in traditional software engineering. By embracing TDD, product builders can effectively address the unique challenges presented by AI systems and create reliable, predictable, and high-performing products that harness the power of AI.
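As a minimal, pytest-style sketch of what test-driven prompt engineering can look like (the `summarize` function is a hypothetical wrapper around your own model call, and the assertions are illustrative):

```python
def summarize(text: str) -> str:
    """Hypothetical wrapper around the LLM call using the prompt under test."""
    raise NotImplementedError("call your model here")

def test_summary_is_short():
    summary = summarize("LLM observability matters because ... " * 20)
    assert len(summary.split()) <= 50

def test_summary_keeps_key_entity():
    summary = summarize("Acme Corp reported record revenue in Q2.")
    assert "Acme" in summary

def test_summary_does_not_invent_numbers():
    summary = summarize("The team shipped the feature last week.")
    assert not any(ch.isdigit() for ch in summary)
```

Running these on every prompt change gives the same fast feedback loop TDD provides in traditional software.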
Large language models (LLMs) have revolutionized AI, breaking down barriers to entry to cutting-edge AI applications, ranging from sophisticated chat-bots to content creation engines.
LLMs also provide an easy-to-use and efficient way to perform high quality feature extraction of any language tasks for downstream consumption. This significantly reduces the time to bring models into production as well as operational cost.
At Canva, part of our product development involves understanding the content of our users, both consumers (who search for our content) and creators (who produce it). This talk will explore two examples of how LLMs such as GPT-3.5 have been leveraged to help solve these tasks with higher accuracy, greater velocity, and reduced cost.
What are some of the key differences in using 100M vs 100B parameter models in production? In this talk, Denys from Voiceflow will cover how their MLOps processes have differed between smaller transformer models and LLMs. He'll walk through how the four main production models Voiceflow uses differ, and the processes and product planning behind each one. The talk will cover prompt testing, automated training, real-time inference, and more!
Track 1