LLMs in Production Conference Part II
# LLM in Production
# Local LLMs
# Defense Unicorn
Enabling Defense Missions with Local LLMs
Despite the power of best-in-class Large Language Models and Generative AI, the use of hosted APIs for models in highly sensitive and regulated environments is being challenged. From fine-tuning and embedding sensitive data to creating small models in edge and air-gapped environments, users in these regulated environments need production-ready ways to run and observe models. Beyond that, both the software and the models being deployed need an Authorization to Operate in every one of these environments. LeapfrogAI is an open-source, open-contribution set of tools designed to meet the challenging requirements of these environments. Come learn what makes these environments so rigorous, what work is under way to enable Defense missions to use Generative AI safely and successfully, and how LeapfrogAI enables these missions.
Gerred Dillon · Jul 21st, 2023
Denys Linkov · Jul 21st, 2023
What are some of the key differences between using 100M and 100B parameter models in production? In this talk, Denys from Voiceflow will cover how their MLOps processes have differed between smaller transformer models and LLMs. He'll walk through how the four main production models Voiceflow uses differ, and the processes and product planning behind each one. The talk will cover prompt testing, automated training, real-time inference, and more!
# LLM in Production
# Production Models
# Voiceflow
# voiceflow.com
Maxime Beauchemin & Demetrios Brinkmann · Jul 21st, 2023
It’s clear that test-driven development plays a pivotal role in prompt engineering, potentially even more so than in traditional software engineering. By embracing TDD, product builders can effectively address the unique challenges presented by AI systems and create reliable, predictable, and high-performing products that harness the power of AI.
# LLM in Production
# AI Product Development
# Preset
Travis Cline & Demetrios Brinkmann · Jul 21st, 2023
A quick run-through of our recent project to visualize and explore MLOps community trends by building interactive tools that show Slack message content in a new light.
# LLM in Production
# LLM Stack
# Virta
Vipul Ved Prakash · Jul 21st, 2023
Creating a new LLM is a difficult and expensive process, and there are several aspects that we need to get right: (1) a broad training dataset, (2) a strong base model, (3) a well-aligned instruction dataset, (4) a carefully designed moderation subsystem, and (5) cost-effective training infrastructure coupled with an efficient software stack. Together’s central thesis is that these processes can be open-sourced, and we can harness the power of community to build and improve models, in the same way great open-source software has been built for decades. In this talk, I will introduce RedPajama, an open-source effort driven by Together and collaborators, and show how to build an LLM with the power of community.
# LLM in Production
# RedPajama
# Together
Aravind Srinivas & Demetrios Brinkmann · Jul 21st, 2023
Perplexity AI is an answer engine that aims to deliver accurate answers to questions using LLMs. Perplexity's CEO Aravind Srinivas will introduce the product and discuss some of the challenges associated with building LLMs.
# LLM in Production
# Power Consumer Search
# Perplexity AI
Bradley Heilbrun & Demetrios Brinkmann · Jul 21st, 2023
GPU-enabled hosts are a significant driver of cloud costs for teams serving LLMs in production. Preemptible instances can provide significant savings but generally aren’t fit for highly available services. This lightning talk tells the story of how Replit switched to preemptible GKE nodes, tamed the ensuing chaos, and saved buckets of cash while improving uptime.
# LLM in Production
# Optimizing Server Startup
# Repl.it
Xin Liang & Demetrios Brinkmann · Jul 21st, 2023
Large language models (LLMs) have revolutionized AI, breaking down barriers to entry to cutting-edge AI applications, ranging from sophisticated chatbots to content creation engines.
# LLM in Production
# LLM-based Feature Extraction
# Canva
Mathieu Bastian · Jul 17th, 2023
We built GetYourGuide's ChatGPT plugin in about a week with a few engineers. It was an interesting experience that we would love to share.
# LLM in Production
# Chat GPT Plugin
# GetYourGuide
Chunting Zhou · Jul 17th, 2023
How do you turn a language model into a chatbot without any user interactions?
LIMA is a LLaMa-based model fine-tuned on only 1,000 curated prompts and responses, which produces shockingly good responses.
* No user data
* No model distillation
* No RLHF
What does this tell us about language model alignment?
In this talk, Chunting shares what we have learned throughout the process.
# LLM in Production
# LIMA
# FAIR Labs
David Hershey, Daniel Jeffries & Demetrios Brinkmann · Jul 17th, 2023
Evaluating the performance of large language models (LLMs) is a pressing issue for companies working with generative AI. Defining what makes a model "good" and measuring its performance are challenging due to the diverse range of LLM applications. Existing evaluation methods, including benchmarks and user preference comparisons, have limitations in scalability and objectivity. The future of LLM evaluation lies in scaling testing with machine learning systems, such as reward models that capture user preferences, and simulating user sessions to generate comprehensive test cases. These approaches will help developers select models, create effective prompts, ensure compliance, and enhance LLM quality.
# Future of LLMs
# LLM in Production
# LLM Applications