MLOps Community
# LLM Use Cases
# LLM in Production
# MLOps Tooling

The State of Production Machine Learning in 2024 // Alejandro Saucedo // AI in Production

As the number of production machine learning use cases increases, we find ourselves facing new and bigger challenges where more is at stake. Because of this, it's critical to identify the key areas to focus our efforts on, so we can ensure we're able to transition from machine learning models to reliable production machine learning systems that are robust and scalable. In this talk we dive into the state of production machine learning in 2024, covering the concepts that make production machine learning so challenging, as well as some of the recommended tools available to tackle these challenges. We will take a deep dive into the production ML tooling ecosystem, explore best practices abstracted from production use cases of machine learning operations at scale, and show how to leverage tools that will allow us to deploy, explain, secure, monitor and scale production machine learning systems.
Alejandro Saucedo & Demetrios Brinkmann · Feb 25th, 2024
Gerred Dillon · Jul 21st, 2023
Despite the power of best-in-class Large Language Models and Generative AI, the use of hosted APIs for models in highly sensitive and regulated environments is being challenged. From fine-tuning and embedding sensitive data to creating small models in edge and air-gapped environments, users in these regulated environments need production-ready ways to run and observe models. Beyond that, both the software and the models being deployed need an Authorization to Operate in every one of these environments. LeapfrogAI is an open-source, open-contribution set of tools designed to meet the challenging requirements of these environments. Come learn about what makes these environments so rigorous, the work going on to enable Defense missions to use Generative AI safely and successfully, and how LeapfrogAI enables these missions.
# LLM in Production
# Local LLMs
# Defense Unicorn
23:07
Maxime Beauchemin & Demetrios Brinkmann · Jul 21st, 2023
It’s clear that test-driven development plays a pivotal role in prompt engineering, potentially even more so than in traditional software engineering. By embracing TDD, product builders can effectively address the unique challenges presented by AI systems and create reliable, predictable, and high-performing products that harness the power of AI.
# LLM in Production
# AI Product Development
# Preset
21:26
Denys Linkov
Denys Linkov · Jul 21st, 2023
What are some of the key differences between using 100M and 100B parameter models in production? In this talk, Denys from Voiceflow will cover how their MLOps processes have differed between smaller transformer models and LLMs. He'll walk through how the four main production models Voiceflow uses differ, and the processes and product planning behind each one. The talk will cover prompt testing, automated training, real-time inference, and more!
# LLM in Production
# Production Models
# Voiceflow
# voiceflow.com
24:44
Travis Cline & Demetrios Brinkmann · Jul 21st, 2023
A quick run-through of our recent project to visualize and explore the MLOps community trends by building interactive tools to see Slack message content in new lights.
# LLM in Production
# LLM Stack
# Virta
10:38
Vipul Ved Prakash
Vipul Ved Prakash · Jul 21st, 2023
Creating a new LLM is a difficult and expensive process, and there are several aspects that we need to get right: (1) a broad training dataset, (2) a strong base model, (3) a well-aligned instruction dataset, (4) a carefully designed moderation subsystem, and (5) cost-effective training infrastructure coupled with an efficient software stack. Together's central thesis is that these processes can be open-sourced, and we can harness the power of community to build and improve models, in the same way great open-source software has been built for decades. In this talk, I will introduce RedPajama, an open-source effort driven by Together and collaborators, and show how to build an LLM with the power of community.
# LLM in Production
# RedPajama
# Together
27:52
Xin Liang & Demetrios Brinkmann · Jul 21st, 2023
Large language models (LLMs) have revolutionized AI, breaking down barriers to entry to cutting-edge AI applications, ranging from sophisticated chatbots to content creation engines.
# LLM in Production
# LLM-based Feature Extraction
# Canva
27:02
Bradley Heilbrun & Demetrios Brinkmann · Jul 21st, 2023
GPU-enabled hosts are a significant driver of cloud costs for teams serving LLMs in production. Preemptible instances can provide significant savings but generally aren’t fit for highly available services. This lightning talk tells the story of how Replit switched to preemptible GKE nodes, tamed the ensuing chaos, and saved buckets of cash while improving uptime.
# LLM in Production
# Optimizing Server Startup
# Repl.it
12:42
Aravind Srinivas & Demetrios Brinkmann · Jul 21st, 2023
Perplexity AI is an answer engine that aims to deliver accurate answers to questions using LLMs. Perplexity's CEO Aravind Srinivas will introduce the product and discuss some of the challenges associated with building LLMs.
# LLM in Production
# Power Consumer Search
# Perplexity AI
37:04
LLMs are tremendously flexible, but can they bring additional value for classification tasks on tabular datasets? I investigated whether LLM-based label predictions can be an alternative to typical machine learning classification algorithms for tabular data by translating the tabular data into natural language to fine-tune an LLM. This talk compares the results of LLM and XGBoost predictions.
# LLM in Production
# LLM XGBoost
# INWT Statistics
10:51
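The tabular-to-text translation this abstract describes could be sketched as follows; the column names, sentence template, and function name here are illustrative assumptions, not the speaker's actual implementation:

```python
def row_to_text(row: dict) -> str:
    """Serialize one tabular record into a natural-language sentence,
    producing text an LLM could be fine-tuned on (template is illustrative)."""
    parts = [f"The {col.replace('_', ' ')} is {val}" for col, val in row.items()]
    return ". ".join(parts) + "."

# Example: a hypothetical row from a tabular dataset
row = {"age": 42, "job": "teacher", "account_balance": 1500}
print(row_to_text(row))
# The age is 42. The job is teacher. The account balance is 1500.
```

In a setup like this, each serialized row would be paired with its class label to build a fine-tuning dataset, while the same raw rows feed an XGBoost baseline for comparison.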
Chunting Zhou · Jul 17th, 2023
How do you turn a language model into a chatbot without any user interactions? LIMA is a LLaMA-based model fine-tuned on only 1,000 curated prompts and responses, which produces shockingly good responses: no user data, no model distillation, no RLHF. What does this tell us about language model alignment? In this talk, Chunting shares what we have learned throughout the process.
# LLM in Production
# LIMA
# FAIR Labs
11:18