LLMs in Production Conference Part II

# LLM in Production

# Production Models

# Voiceflow

# voiceflow.com

LLMs vs LMs in Prod

What are some of the key differences between using 100M-parameter and 100B-parameter models in production? In this talk, Denys from Voiceflow will cover how their MLOps processes have differed between smaller transformer models and LLMs. He'll walk through how the four main production models Voiceflow uses differ, and the processes and product planning behind each one. The talk will cover prompt testing, automated training, real-time inference, and more!

Denys Linkov · Jul 21st, 2023


Maxime Beauchemin & Demetrios Brinkmann · Jul 21st, 2023

Taming AI Product Development Through Test-driven Prompt Engineering

It’s clear that test-driven development plays a pivotal role in prompt engineering, potentially even more so than in traditional software engineering. By embracing TDD, product builders can effectively address the unique challenges presented by AI systems and create reliable, predictable, and high-performing products that harness the power of AI.

# LLM in Production

# AI Product Development

# Preset

Gerred Dillon · Jul 21st, 2023

Enabling Defense Missions with Local LLMs

Despite the power of best-in-class Large Language Models and Generative AI, the use of hosted APIs for models in highly sensitive and regulated environments is being challenged. From fine-tuning and embedding sensitive data to creating small models in edge and air-gapped environments, users in these regulated environments need production-ready ways to run and observe models. Beyond that, both the software and the models being deployed need to have the Authorization to Operate in every one of these environments. LeapfrogAI is an open-source, open-contribution set of tools designed to meet the challenging requirements of these environments. Come learn what makes these environments so rigorous and what work is going on to enable Defense missions to use Generative AI safely and successfully, and hear more about how LeapfrogAI enables these missions.

# LLM in Production

# Local LLMs

# Defense Unicorn

Travis Cline & Demetrios Brinkmann · Jul 21st, 2023

MLOps LLM Stack Hackathon Winner: Exploring the MLOps Community Trends

A quick run-through of our recent project to visualize and explore MLOps community trends by building interactive tools that show Slack message content in a new light.

# LLM in Production

# LLM Stack

# Virta

Bradley Heilbrun & Demetrios Brinkmann · Jul 21st, 2023

Preemption Chaos and Optimizing Server Startup

GPU-enabled hosts are a significant driver of cloud costs for teams serving LLMs in production. Preemptible instances can provide significant savings but generally aren’t fit for highly available services. This lightning talk tells the story of how Replit switched to preemptible GKE nodes, tamed the ensuing chaos, and saved buckets of cash while improving uptime.

# LLM in Production

# Optimizing Server Startup

# Repl.it

Xin Liang & Demetrios Brinkmann · Jul 21st, 2023

LLM-based Feature Extraction for Operational Optimization

Large language models (LLMs) have revolutionized AI, breaking down barriers to entry to cutting-edge AI applications, ranging from sophisticated chatbots to content creation engines.

# LLM in Production

# LLM-based Feature Extraction

# Canva

Vipul Ved Prakash · Jul 21st, 2023

Building RedPajama

Creating a new LLM is a difficult and expensive process, and there are several aspects that we need to get right: (1) a broad training dataset, (2) a strong base model, (3) a well-aligned instruction dataset, (4) a carefully designed moderation subsystem, and (5) cost-effective training infrastructure coupled with an efficient software stack. Together’s central thesis is that these processes can be open-sourced, and we can harness the power of community to build and improve models, in the same way great open-source software has been built for decades. In this talk, I will introduce RedPajama, an open-source effort driven by Together and collaborators, and show how to build an LLM with the power of community.

# LLM in Production

# RedPajama

# Together

Aravind Srinivas & Demetrios Brinkmann · Jul 21st, 2023

Using LLMs to Power Consumer Search at Scale

Perplexity AI is an answer engine that aims to deliver accurate answers to questions using LLMs. Perplexity's CEO Aravind Srinivas will introduce the product and discuss some of the challenges associated with building LLMs.

# LLM in Production

# Power Consumer Search

# Perplexity AI

David Hershey, Daniel Jeffries & Demetrios Brinkmann · Jul 17th, 2023

Fireside Chat - The Future of LLMs

Evaluating the performance of large language models (LLMs) is a pressing issue for companies working with generative AI. Defining what makes a model "good" and measuring its performance are challenging due to the diverse range of LLM applications. Existing evaluation methods, including benchmarks and user preference comparisons, have limitations in scalability and objectivity. The future of LLM evaluation lies in scaling testing with machine learning systems, such as reward models that capture user preferences, and simulating user sessions to generate comprehensive test cases. These approaches will help developers select models, create effective prompts, ensure compliance, and enhance LLM quality.

# Future of LLMs

# LLM in Production

# LLM Applications

Chunting Zhou · Jul 17th, 2023

LIMA: Less is More for Alignment

How do you turn a language model into a chatbot without any user interactions? LIMA is a LLaMA-based model fine-tuned on only 1,000 curated prompts and responses, which produces shockingly good responses, with no user data, no model distillation, and no RLHF. What does this tell us about language model alignment? In this talk, Chunting shares what we have learned throughout the process.

# LLM in Production

# LIMA

# FAIR Labs

Mathieu Bastian · Jul 17th, 2023

From an API to a ChatGPT Plugin

We built GetYourGuide's ChatGPT plugin in about a week with a few engineers. It was an interesting experience that we would love to share.

# LLM in Production

# Chat GPT Plugin

# GetYourGuide
