MLOps Community
LLMs in Production Conference
# LLM in Production
# Cost Optimization
# Cost Performance

Cost Optimization and Performance

This panel discussion explores the cost of running large language models (LLMs) in production and potential ways to reduce it. The panelists discuss the benefits of bringing LLMs in-house, such as latency optimization and greater control, and examine optimization methods such as structured pruning and knowledge distillation. OctoML's platform is mentioned as a tool for automatically deploying custom models and selecting the most appropriate hardware for them. Overall, the discussion offers insights into the challenges of managing LLMs and strategies for overcoming them.
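The panel mentions knowledge distillation as a cost-reduction method. As a rough illustration (not taken from the talk), the core of distillation is training a small "student" model to match the temperature-softened output distribution of a large "teacher"; a minimal sketch of that loss, assuming plain logit lists rather than any particular framework:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature yields softer targets."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's soft distribution to the student's.

    In practice this term is minimized alongside the ordinary
    cross-entropy on hard labels, weighted by a mixing coefficient.
    """
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(t, s) if p > 0)
```

The loss is zero when the student exactly reproduces the teacher's distribution and grows as the two diverge, which is what lets a much smaller model absorb the teacher's behavior on a target domain.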
Lina Weichbrodt, Luis Ceze, Jared Zoneraich, Daniel Campos & Mario Kostelac · Apr 27th, 2023
Cameron Feenstra · Apr 27th, 2023

Using LLMs to Punch Above your Weight!

As a small business, competing with large incumbents can be a daunting challenge. They have more money, more people, and more data, but they can also be inflexible and slow to adopt new technologies. In this talk, we will explore how small businesses can use the power of large language models (LLMs) to compete with large incumbents, particularly in industries like insurance. We will present two examples of how we are using LLMs at Anzen to streamline insurance underwriting and analyze employment agreements, and we will discuss ideas for future applications. By harnessing the power of LLMs, small businesses can level the playing field and compete more effectively with larger companies.
# LLM in Production
# Anzen
Hannes Hapke · Apr 27th, 2023

What is the role of Machine Learning Engineers in the time of GPT-4 and Bard?

With the fast pace of innovation and the release of large language models like Bard and GPT-4, the role of data scientists and machine learning engineers is changing rapidly. APIs from Google, OpenAI, and other companies democratize access to machine learning but also commoditize some machine learning projects. In his talk, Hannes will explain the state of the ML world and which machine learning projects are in danger of being replaced by third-party APIs. He will walk the audience through a framework for determining whether an API could replace your current machine learning project and for evaluating machine learning APIs in terms of data privacy and AI bias. Furthermore, Hannes will dive deep into how you can hone your machine learning knowledge for future projects.
# GPT4
# Digits Financial, Inc.
Deepankar Mahapatro · Apr 27th, 2023

Taking LangChain Apps to Production with LangChain-serve

Scalable, serverless deployments of LangChain apps on the cloud without sacrificing the ease and convenience of local development, plus streaming experiences without worrying about infrastructure.
# LLM in Production
# LangChain
# LangChain-serve
Willem Pienaar · Apr 27th, 2023

Emerging Patterns for LLMs in Production

As the landscape of large language models (LLMs) advances at an unprecedented rate, novel techniques are constantly emerging to make LLMs faster, safer, and more reliable in production. This talk explores some of the latest patterns that builders have adopted when integrating LLMs into their products.
# LLM in Production
# In-Stealth
Adam will highlight potential negative user outcomes that can arise when adding LLM-driven capabilities to an existing product. He will also discuss strategies and best practices that can be used to ensure a high-quality user experience for customers.
# LLM-driven Products
# Autoblocks
Ashe Magalhaes · Apr 27th, 2023

Agentic Relationship Management

Today, our personal and professional networks have reached an unprecedented level of complexity. The last two decades of tech innovation have connected us to more people, across more communication channels, spanning a wider range of contexts than ever before. Our growing networks have become intractable to manage as individuals, let alone teams.
# Tech Innovation
# LLM in Production
# Hearth AI
Despite their advanced capabilities, large language models (LLMs) are often too slow and resource-intensive for use at scale in voice applications, particularly for large-scale audio or low-latency real-time processing. SlimFast addresses this challenge by introducing Domain Specific Language Models (DSLMs) that are distilled from LLMs on specific data domains and tasks. SlimFast provides a practical solution for real-world applications, offering blazingly fast and resource-conscious models while maintaining high performance on speech intelligence tasks. We demo a new ASR-DSLM pipeline that we recently built, which performs summarization on call center audio.
# LLM in Production
# SlimFast
# Deepgram
Adept AI is developing a software collaborator that uses LLMs to perform software tasks described in natural language. However, LLMs can suffer from overconfidence, hallucinations, and a lack of self-awareness, which can lead to incorrect actions. Jacob highlights an example of how the model can take a wrong action and emphasizes the importance of implementing safety checks such as action reversibility and content filters. By incorporating safety checks, Adept AI aims to improve the model's capabilities and ensure that it moves in the right direction.
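The safety checks mentioned above (action reversibility and content filters) can be pictured as a gate that sits between the model's proposed action and its execution. The sketch below is a hypothetical illustration of that pattern, not Adept AI's implementation; the `DISALLOWED` terms and the decision labels are made up for the example:

```python
from dataclasses import dataclass

@dataclass
class Action:
    """A model-proposed action: a description plus whether it can be undone."""
    description: str
    reversible: bool

# Hypothetical content-filter terms; a real filter would be far richer.
DISALLOWED = {"password", "credit card"}

def content_filter_ok(action: Action) -> bool:
    """Reject actions whose description mentions disallowed content."""
    text = action.description.lower()
    return not any(term in text for term in DISALLOWED)

def gate(action: Action) -> str:
    """Decide how to handle a proposed action: block it, execute it,
    or escalate irreversible steps to the user for confirmation."""
    if not content_filter_ok(action):
        return "block"
    if not action.reversible:
        return "ask_user"  # irreversible actions need human confirmation
    return "execute"
```

The key design choice is that irreversibility does not block an action outright; it routes it to a human, which keeps the agent useful while limiting the blast radius of a hallucinated step.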
# LLM in Production
# Adept AI
One of the biggest challenges of getting LLMs into production is their sheer size and computational complexity. This talk explores how smaller, specialized models can be used in most cases to produce equally good results while being significantly cheaper and easier to deploy.
# LLM in Production
# LLM Deployments
# TitanML
Large language models (LLMs) are starting to revolutionize how users search for, interact with, and generate new content. There is one challenge, though: how do users easily apply LLMs to their own data? LLMs are pre-trained on enormous amounts of publicly available natural language data, but they don't inherently know about your personal or organizational data. LlamaIndex solves this by providing a central data interface for your LLMs. In this talk, we cover the tools that LlamaIndex offers (both simple and advanced) to ingest and index your data for LLM use.
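The ingest-and-index workflow the abstract describes generally follows a chunk, embed, and retrieve pattern: private documents are split into chunks, each chunk is embedded, and at query time the best-matching chunks are retrieved and placed in the LLM's prompt. The toy sketch below illustrates that pattern only; it is not LlamaIndex's API, and the word-overlap "embedding" stands in for a real embedding model:

```python
def chunk(text: str, size: int = 40) -> list[str]:
    """Split a document into fixed-size chunks (real pipelines split on
    sentences or tokens and usually add overlap between chunks)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> set[str]:
    """Toy 'embedding': the set of lowercased words. Real indexes use
    dense vectors produced by an embedding model."""
    return set(text.lower().split())

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Rank chunks by word overlap with the query and return the top k;
    these would be prepended to the LLM prompt as context."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: len(q & embed(c)), reverse=True)
    return ranked[:k]
```

Swapping the toy `embed` for a real embedding model and the overlap score for cosine similarity gives the standard retrieval-augmented setup that data frameworks like LlamaIndex build on.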
# LLM in Production
# LlamaIndex
The Birth and Growth of Spark: An Open Source Success Story
Matei Zaharia
DevTools for Language Models: Unlocking the Future of AI-Driven Applications
Diego Oppenheimer
MLflow Pipelines: Opinionated ML Pipelines in MLflow
Xiangrui Meng
Age of Industrialized AI
Daniel Jeffries