MLOps Community
LLMs in Production Conference Part III
# Open Source AI
# LLMs
# Prem AI

The State of Open Source AI: Deployment Engines, Licences, & Hardware

Following the LLMs in Production survey results, we picked the three hottest topics to bring some clarity to the current fast-paced mess of Open Source innovation: deployment (starting and maintaining a project is an ongoing unsolved problem), licenses (recently "solved"/permissive for a lot of good architectures, but will remain a mess for data for years), and hardware (the cost-vs-capability ratio is decreasing, but far more importantly, software portability is finally actually being solved!). Find out what, why, and how in this talk.
Casper da Costa-Luis
Adam Becker
Casper da Costa-Luis & Adam Becker · Nov 1st, 2023
Alex Cabrera
Adam Becker
Alex Cabrera & Adam Becker · Nov 1st, 2023

Authoring Interactive, Shareable AI Evaluation Reports with Zeno

LLMs and foundation models are unlocking thousands of possibilities for AI-driven products, from writing assistants to art platforms. Despite their abilities, these AI systems are complex and can fail in significant ways, such as producing hallucinations, biased outputs, and more. In this talk, I’ll introduce Zeno, an interactive platform for creating and sharing in-depth evaluations of complex AI systems. Zeno lets users explore the inputs and outputs of any AI system, from text to audio and image models, and create interactive reports. We envision Zeno being the go-to tool both AI developers and auditors use to share reproducible evaluations of AI systems.
# AI Evaluation Reports
# Foundation Models
# Zeno
Yuze Ma
Adam Becker
Yuze Ma & Adam Becker · Oct 31st, 2023

Behind the Scenes: The Challenges of Building AI Applications

As foundational models grow rapidly in size and capability, the gap between research and production has yet to be closed. To iterate on models and implementations faster, the toolchain matters a great deal. This talk will cover the techniques and experience gathered along this journey, from training vanilla foundational models and comparing optimizations to optimizing the user-side system experience.
# AI Applications
# Foundational Models
# Lepton AI
Finn Howell
Adam Becker
Finn Howell & Adam Becker · Oct 31st, 2023

Evaluating LLMs for AI Risk

How do you write a stress test for an LLM? This talk explores cutting-edge techniques to red-team generative AI and build validation engines that algorithmically probe models for security, ethics, and safety issues. Attendees will learn a framework to manage AI risk spanning the model lifecycle, from data collection through production.
# LLMs Evaluation
# AI Risk
# Robust Intelligence
Benjamin Harvey
Adam Becker
Benjamin Harvey & Adam Becker · Oct 26th, 2023

AI Squared: Breaking LLMs out of the Chat Application

AI Squared is an AI platform designed for product owners, data scientists, and enterprise leaders. We empower you to accelerate both predictive and generative AI projects, measure their benefits, and drive significant revenue growth and cost reduction. The largest gap in the LLM developer stack is creating different experiences for users to leverage LLM results. Many companies have only utilized generative AI inside of chat applications. AI Squared has developed a framework that empowers our customers to harvest context and connect to additional content as well as other AI models from across the organization while integrating these insights directly into currently existing tools and applications.
# AI Platform
# LLM Developer Stack
# AI Squared
Azul Garza
Adam Becker
Azul Garza & Adam Becker · Oct 26th, 2023

TimeGPT: The First Foundation Model for Time Series

Time series—data ordered chronologically—constitutes the underlying fabric of systems, enterprises, and institutions. Its impact spans from measuring ocean tides to tracking the daily closing value of the Dow Jones. This type of data representation is indispensable in sectors such as finance, healthcare, meteorology, and social sciences. However, the current theoretical and practical understanding of time series hasn't yet achieved a consensus among practitioners that mirrors the widespread acclaim for generative models in other fundamental domains of the human condition, like language and perception. Our field is still divided and highly specialized. Efforts in forecasting science have fallen short of fulfilling the promises of genuinely universal pre-trained models. In this talk, we will introduce TimeGPT, the first pre-trained foundation model for time series forecasting that can produce accurate predictions across various domains and applications without additional training. A general pre-trained model constitutes a groundbreaking innovation that opens the path to a new paradigm for forecasting practice that is more accessible and accurate, less time-consuming, and drastically less computationally demanding. We will show how to use TimeGPT in a live demo.
# TimeGPT
# Time Series
# Nixtla
Harini Kannan
Adam Becker
Harini Kannan & Adam Becker · Oct 26th, 2023

Product Strategy for LLM features when LLM isn’t your Product

The rapid adoption of Large Language Models (LLMs) has swept across various industries, inspiring companies to incorporate them into their products, regardless of the industry's domain. Unlike organizations like OpenAI, Google, or Microsoft, most entities view LLMs as powerful tools rather than standalone products. In this context, it remains paramount to uphold core product principles where customer satisfaction reigns supreme. In this talk, we will see how to pick practical use cases for using LLMs, where they are not essentially user-facing chatbots but valuable tools to build new features into existing products - with examples from the cybersecurity domain. We will also see a sample product strategy for how to plan and roadmap LLM features, along with key performance metrics to gauge success.
# LLM features
# Product Strategy
# LLMs
Noble Ackerson
Adam Becker
Noble Ackerson & Adam Becker · Oct 26th, 2023

GenAI: An Unreliable Information Store

Embark on an enlightening journey with Noble as he tackles the challenges of integrating Large Language Models (LLMs) into enterprise environments. Understand the inherent unreliability of these models and explore innovative solutions, ranging from when to use vector databases to when to use retrieval augmented generation, that aim to enhance the trustworthiness of LLMs in crucial applications.
# Large Language Models
# Generative AI
# VENTERA CORPORATION
Greg Diamos
Adam Becker
Greg Diamos & Adam Becker · Oct 26th, 2023

Finetuning LLMs

An opinionated view of how to build production LLMs.
# Fine-tuning LLMs
# Building Production
# PowerML, Inc
Julia Kroll
Adam Becker
Julia Kroll & Adam Becker · Oct 26th, 2023

Speed and Sensibility: Balancing Latency and UX in Generative AI

Conversational AI demands low latency for a seamless dialogue between humans and AI. However, engineers face the dilemma that some latency is inherently required in order to process human speech and craft a response. Some incremental wins to shave off milliseconds involve trade-offs against how the AI response could be enriched during the additional processing time. Others simply refactor out inefficiency to obtain more performant results from AI devtools. This talk presents best practices of designing streaming speech-to-text applications, as well as reasons to accept extra latency for the sake of an enhanced product experience.
# Conversational AI
# Humans and AI
# Deepgram
Yifei Feng
Philipp Moritz
Adam Becker
Yifei Feng, Philipp Moritz & Adam Becker · Oct 26th, 2023

Building RAG-based LLM Applications for Production

In this talk, we will cover how to develop and deploy RAG-based LLM applications for production. We will cover how the major workloads (data loading and preprocessing, embedding, serving) can be scaled on a cluster, how different configurations can be evaluated and how the application can be deployed. We will also give an introduction to Anyscale Endpoints which offers a cost-effective solution for serving popular open-source models.
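The workloads named above (embedding, retrieval, prompt assembly) can be sketched as a minimal RAG loop. This is a generic illustration, not Anyscale's implementation: the bag-of-words embedding and the helper names (`embed`, `retrieve`, `build_prompt`) are placeholders; a production system would use a learned embedding model, a vector store, and distributed serving.

```python
# Minimal retrieval-augmented generation (RAG) sketch: embed documents,
# retrieve the most similar ones for a query, and assemble an augmented
# prompt. The embedding is a toy bag-of-words vector for illustration.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the top-k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble the augmented prompt that would be sent to an LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "Ray Serve deploys scalable model endpoints.",
    "Embeddings map text into a vector space.",
    "Paris is the capital of France.",
]
print(build_prompt("How do embeddings represent text?", docs))
```

In a cluster setting, each stage (data loading and preprocessing, embedding, serving) would be scaled out independently; the control flow above stays the same.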
# LLM Applications
# RAG
# Anyscale