MLOps Community
LLMs in Production Conference
# Large Language Models

Large Language Models have taken the world by storm. But what are the real use cases? What are the challenges in productionizing them?

In this event, you will hear from practitioners about how they are dealing with challenges such as cost optimization, latency requirements, trust in model outputs, and debugging.

You will also get the opportunity to join workshops that will teach you how to set up your use cases and skip over all the headaches.

Speakers
Meryem Arik – Co-founder @ titanML
Linus Lee – Research Engineer @ Notion
Lina Weichbrodt – Freelance Machine Learning Development + Consulting @ Pragmatic Machine Learning Consulting
Shreya Rajpal – Creator @ Guardrails AI
Daniel Jeffries – Managing Director @ AI Infrastructure Alliance
Raza Habib – CEO and Co-founder @ Humanloop
Harrison Chase – CEO @ LangChain
Saahil Jain – Engineer @ You.com
Alex Ratner – CEO and Co-founder @ Snorkel AI
Justin Uberti – CTO and Co-founder @ Fixie
Hanlin Tang – CTO @ MosaicML
Mario Kostelac – Staff Machine Learning Engineer @ Intercom
Vin Vashishta – CEO @ V-Squared
Willem Pienaar – Founder @ In-Stealth
Jared Zoneraich – Founder @ PromptLayer
Cameron Feenstra – Principal Engineer @ Anzen
Ashe Magalhaes – Founder @ Hearth AI
Luis Ceze – CEO and Co-founder @ OctoML
Eli Mernit – CEO / Founder @ Beam
Diego Oppenheimer – Partner @ Factory
Gevorg Karapetyan – Co-founder and CTO @ ZERO Systems
Demetrios Brinkmann – Chief Happiness Engineer @ MLOps Community
Tanmay Chopra – Machine Learning Engineer @ Neeva
Jon Turow – Partner @ Madrona
Daniel Campos – Research Scientist @ Neeva
Jerry Liu – Co-Founder/CEO @ LlamaIndex
Jacob van Gogh – Member of Technical Staff @ Adept AI
Andrew Seagraves – VP of Research @ Deepgram
Hannes Hapke – Machine Learning Engineer @ Digits Financial, Inc.
Pascal Brokmeier – Lead Data Engineer @ McKinsey and Company
Torgyn Erland – Data Scientist @ QuantumBlack, AI by McKinsey
Samuel Partee – Principal Applied AI Engineer @ Redis
Vikram Chatterji – Co-founder and CEO @ Galileo
Deepankar Mahapatro – Engineering Manager @ Jina AI
Adam Nolte – CTO and Co-founder @ Autoblocks
Daniel Herde – Lead Data Scientist @ QuantumBlack, AI by McKinsey
Braden Hancock – Co-founder and Head of Technology @ Snorkel AI
Viktoriia Oliinyk – Data Scientist @ QuantumBlack, AI by McKinsey
Agenda

The conference ran across three tracks: Track 1, Track 2, and Workshops. All times are GMT.
3:00 PM – 3:10 PM
Opening / Closing

Welcome

Demetrios Brinkmann
3:10 PM – 3:40 PM
Keynote

DevTools for Language Models: Unlocking the Future of AI-Driven Applications

In this talk, we explore the thriving ecosystem of tools and technologies emerging around large language models (LLMs) such as GPT-3. As the LLM landscape enters the "Holy $#@!" phase of exponential growth, a surge of developers are building remarkable product experiences on top of these models, giving rise to a rich collection of DevTools. We delve into the current state of LLM DevTools, their significance, and future prospects. We also examine the challenges and opportunities involved in building intelligent features using LLMs, discussing the role of experimentation, prompting, knowledge retrieval, and vector databases. Moreover, we consider the next set of challenges faced by teams looking to scale their LLM features, such as data labeling, fine-tuning, monitoring, observability, and testing. Drawing parallels with previous waves of machine learning DevTools, we predict the trajectory of this rapidly maturing market and the potential impact on the broader AI landscape. Join us in this exciting discussion to learn about the future of AI-driven applications and the tools that will enable their success.

Diego Oppenheimer
Demetrios Brinkmann
3:40 PM – 4:10 PM
Keynote

Age of Industrialized AI

The rise of LLMs means we're entering an era where intelligent agents with natural language will invade every kind of software on Earth. But how do we fix them when they hallucinate? How do we put guardrails around them? How do we protect them from giving away our secrets or falling prey to social engineering? We're on the cusp of a brand new era of incredible capabilities, but we've also got new attack vectors and problems that will change how we build and defend our systems. We'll talk about how we can solve some of these problems now and what we can do in the future to solve them better.

Daniel Jeffries
4:10 PM – 4:20 PM
Lightning Talk

Reasoning Machines: Differentiating LLM Apps with Ops at 4 Levels of the Stack

Are apps built on large language models "just a thin wrapper" that others can quickly replicate, or can they be more defensible? This talk explores how to build moats and amazing new products with these "reasoning machines", by using the entire stack around the LLMs as well as the models themselves.

Justin Uberti
Jon Turow
4:10 PM – 4:40 PM
Panel Discussion

Data Privacy and Security

Diego Oppenheimer
Gevorg Karapetyan
Vin Vashishta
Saahil Jain
Shreya Rajpal
4:20 PM – 4:30 PM
Lightning Talk

Why specialized NLP models might be the secret to easier LLM deployment

One of the biggest challenges of getting LLMs in production is their sheer size and computational complexity. This talk explores how smaller specialised models can be used in most cases to produce equally good results while being significantly cheaper and easier to deploy.
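
To make the trade-off concrete, here is a minimal sketch (not from the talk) of the kind of substitution the abstract describes: a small, task-specific model from the Hugging Face Hub handling a classification task that might otherwise be routed to a hosted LLM. The model name is just an illustrative choice.

```python
# A minimal sketch: a small task-specific model standing in for a general LLM.
# Assumes the `transformers` package; the model below is only an example.
from transformers import pipeline

# A few-hundred-MB sentiment model runs on CPU in milliseconds,
# with no per-token API cost and no external service dependency.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

print(classifier("The deployment went smoothly and latency dropped by half."))
# [{'label': 'POSITIVE', 'score': 0.99...}]
```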

Meryem Arik
4:30 PM – 4:40 PM
Lightning Talk

How LlamaIndex Can Connect your LLMs with your External Data

Large Language Models (LLMs) are starting to revolutionize how users can search for, interact with, and generate new content. There is one challenge though: how do users easily apply LLMs to their own data? LLMs are pre-trained with enormous amounts of publicly available natural language data, but they don't inherently know about your personal or organizational data. LlamaIndex solves this by providing a central data interface for your LLMs. In this talk, we cover the tools that LlamaIndex offers (both simple and advanced) to ingest and index your data for LLM use.
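
For readers unfamiliar with the workflow, here is a minimal sketch of the ingest-and-query loop described above, using the open-source llama_index package. Module paths and class names have shifted between releases, so treat this as illustrative rather than canonical; the default configuration also assumes an OpenAI API key in the environment for embeddings and completions.

```python
# A minimal ingest-and-query sketch with LlamaIndex (llama_index.core layout).
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("./my_docs").load_data()  # your own files
index = VectorStoreIndex.from_documents(documents)          # embed + index them

query_engine = index.as_query_engine()
print(query_engine.query("What does our vacation policy say about carry-over?"))
```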

Jerry Liu
4:40 PM – 4:50 PM
1:1 networking

Improvised Musical Break

Demetrios Brinkmann
4:50 PM – 5:20 PM
Presentation

Efficiently Scaling and Deploying LLMs

Hanlin Tang
4:50 PM – 5:20 PM
Presentation

What is the role of Machine Learning Engineers in the time of GPT-4 and Bard?

With the fast pace of innovation and the release of Large Language Models like Bard or GPT-4, the role of data scientists and machine learning engineers is rapidly changing. APIs from Google, OpenAI, and other companies democratize access to machine learning but also commoditize some machine learning projects.

In his talk, Hannes will explain the state of the ML world and which machine learning projects are in danger of being replaced by 3rd party APIs. He will walk the audience through a framework to determine if an API could replace your current machine-learning project and how to evaluate Machine Learning APIs in terms of data privacy and AI bias. Furthermore, Hannes will dive deep into how you can hone your machine-learning knowledge for future projects.

Hannes Hapke
4:50 PM – 6:20 PM
Workshop

Building Fast AI Prototypes

This event is for those who want to learn how to quickly prototype and ship AI apps. We're going to share a few cheat codes: you'll learn how to use pre-trained ML models, how to build generative AI apps, chatbots, and conversational agents, and how to use GPUs for maximum productivity. You don't want to miss this!
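
To give a flavor of what fast prototyping can look like (an assumed stack, not the workshop's actual material), the sketch below wraps a pre-trained summarization model in a small Gradio web UI; the model and layout are arbitrary choices.

```python
# A quick prototype: a pre-trained summarizer behind a shareable web UI.
import gradio as gr
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

def summarize(text: str) -> str:
    # Return only the generated summary string.
    return summarizer(text, max_length=120, min_length=30)[0]["summary_text"]

demo = gr.Interface(
    fn=summarize,
    inputs=gr.Textbox(lines=10, label="Paste text"),
    outputs="text",
    title="Quick summarizer prototype",
)

if __name__ == "__main__":
    demo.launch()  # pass share=True for a temporary public link
```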

Eli Mernit
5:20 PM – 5:30 PM
Lightning Talk

No rose without a thorn - Obstacles to Successful LLM Deployments

LLMs have garnered immense attention in a short span of time, with their capabilities usually shown to the world in scenarios that demand little precision, like demos and MVPs. But as we all know, deploying to prod is a whole other ballgame. In this talk, we'll discuss some pitfalls to expect when deploying LLMs to production use cases, both at the terminal (direct-to-user) layer and at intermediate layers. We'll approach the topic through both infrastructural and output-focused lenses and explore potential solutions to challenges ranging from foundation model downtime and latency concerns to output variability and prompt injections.
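
As one illustrative mitigation for the infrastructure side of that list (not taken from the talk), the sketch below wraps any provider call in retries with exponential backoff and a graceful fallback, so transient downtime degrades the experience instead of breaking it. The function names are hypothetical.

```python
# Retries with backoff plus a fallback answer for provider downtime.
import time

def call_llm_with_fallback(call_primary, prompt, retries=3, base_delay=1.0,
                           fallback="Sorry, I can't answer that right now."):
    """`call_primary` is any function prompt -> str that may raise on failure."""
    for attempt in range(retries):
        try:
            return call_primary(prompt)
        except Exception:
            # Exponential backoff before retrying the (possibly down) provider.
            time.sleep(base_delay * 2 ** attempt)
    return fallback  # degrade gracefully instead of surfacing an error page
```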

Tanmay Chopra
5:20 PM – 5:50 PM
Presentation

Challenges and opportunities in building Data Science solutions with LLMs

In this roundtable, we will share our experiences with LLMs across a number of real-world applications, including what it takes to build systems around LLMs in a rapidly changing landscape. We will discuss the challenges of productionizing LLM-based solutions, how to evaluate output quality, and the implications for risk and compliance.

Pascal Brokmeier
Torgyn Erland
Daniel Herde
Viktoriia Oliinyk
5:30 PM – 5:40 PM
Lightning Talk

Emerging Patterns for LLMs in Production

As the landscape of large language models (LLMs) advances at an unprecedented rate, novel techniques are constantly emerging to make LLMs faster, safer, and more reliable in production. This talk explores some of the latest patterns that builders have adopted when integrating LLMs into their products.
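
One widely used pattern of this kind, sketched here purely for illustration (not necessarily one the talk covers), is caching completions for repeated prompts so identical requests cost nothing after the first call; the helper names are hypothetical.

```python
# Cache completions keyed by a hash of the prompt to cut cost and latency.
import hashlib

_cache: dict[str, str] = {}

def cached_completion(call_llm, prompt: str) -> str:
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = call_llm(prompt)  # only pay for the first identical call
    return _cache[key]
```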

Willem Pienaar
5:40 PM – 5:50 PM
Lightning Talk

LangChain: Enabling LLMs to Use Tools

This talk will cover everything related to getting LLMs to use tools. It will discuss why enabling tool use is important, different types of tools, popular prompting strategies for using tools, and what difficulties still exist.
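
For a feel of the mechanics, here is a hand-rolled sketch of the tool-use loop in its simplest form (the general pattern, not LangChain's actual API): the model is prompted to emit either an "Action: tool: input" line or a "Final: answer" line, the program runs the matching tool, and the observation is appended to the context. The tool set and output format are illustrative.

```python
# A bare-bones tool-use loop around any `call_llm(prompt) -> str` function.
import re

TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # toy tool
    "search": lambda q: f"(pretend search results for: {q})",
}

def run_agent(call_llm, question: str, max_steps: int = 5) -> str:
    # The system prompt (not shown) instructs the model to reply with either
    # "Action: <tool>: <input>" or "Final: <answer>".
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = call_llm(transcript)
        transcript += reply + "\n"
        if reply.startswith("Final:"):
            return reply[len("Final:"):].strip()
        match = re.match(r"Action:\s*(\w+):\s*(.*)", reply)
        if match:
            tool, arg = match.group(1), match.group(2)
            observation = TOOLS.get(tool, lambda _: "unknown tool")(arg)
            transcript += f"Observation: {observation}\n"
    return "No answer within the step budget."
```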

Harrison Chase
5:50 PM – 6:00 PM
Lightning Talk

Beyond the Hype: Ensuring Accuracy and Quality in LLM-driven Products

Adam will highlight potential negative user outcomes that can arise when adding LLM-driven capabilities to an existing product. He will also discuss strategies and best practices that can be used to ensure a high-quality user experience for customers.

Adam Nolte
5:50 PM – 6:20 PM
Presentation

Generative Interfaces beyond Chat

Chat-based interfaces to LLMs are the command lines of modern generative AI systems, and we should be more imaginative and ambitious when thinking about how we'll interact with them in the future. I'll share 5 concrete big ideas for designing good interactions with LLMs and other generative AI models that I've found helpful in my own work, and hopefully in the process enable you to start thinking outside of the chat box (hah, literally!) when building your own products.

Linus Lee
6:00 PM – 6:20 PM
Lightning Talk

Vector Databases and Large Language models

Generative models such as ChatGPT have changed many product roadmaps. Interfaces and user experiences can now be re-imagined and often drastically simplified to something resembling a Google search bar where the input is natural language. However, some models remain behind APIs without the ability to re-train on contextually appropriate data. Even when the model weights are publicly available, re-training or fine-tuning is often expensive, requires expertise, and is ill-suited to problem domains with constant updates. How, then, can such APIs be used when the data needed to generate an accurate output was not present in the training set because it is constantly changing?

Vector embeddings represent the impression a model has of some, likely unstructured, data. When combined with a vector database or search algorithm, embeddings can be used to retrieve information that provides context for a generative model. Such embeddings, linked to specific information, can be updated in real time, providing generative models with a continually up-to-date, external body of knowledge.

Suppose you wanted to build a product that could answer questions about internal company documentation as an onboarding tool for new employees. For large enterprises especially, re-training a model on this ever-changing body of knowledge would be untenable in terms of cost-to-benefit ratio. Instead, using a vector database to retrieve context for prompts allows for point-in-time correctness of the generated output. It also helps prevent model "hallucinations", since models can be instructed to provide no answer when the vector search returns results below some confidence threshold. In this talk we will demonstrate the validity of this approach through examples. We will provide instructions, code, and other assets that are open source and available on GitHub.
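
A minimal sketch of that retrieval flow is below. The embedding and completion functions are placeholders for whatever models or APIs you use, the documents live in an in-memory NumPy array rather than a real vector database such as Redis, and the confidence threshold implements the "provide no answer" behavior described above.

```python
# Retrieval-augmented prompting with a confidence threshold (illustrative).
import numpy as np

def retrieve(query_vec, doc_vecs, docs, k=3):
    # Cosine similarity between the query and every stored document embedding.
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9)
    top = np.argsort(-sims)[:k]
    return [docs[i] for i in top], float(sims[top[0]])

def answer(question, embed, complete, docs, doc_vecs, min_score=0.3):
    context, score = retrieve(embed(question), doc_vecs, docs)
    if score < min_score:
        return "I don't know."  # refuse rather than hallucinate
    prompt = ("Answer using only the context below.\n\n"
              + "\n\n".join(context)
              + f"\n\nQuestion: {question}")
    return complete(prompt)
```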

Samuel Partee
6:20 PM – 6:40 PM
1:1 networking

Prompt Hacking Competition

Prizes and swag for the person who can pull off the nastiest prompt injections. Join us on 'Track 1'!

Demetrios Brinkmann
6:40 PM – 7:10 PM
Presentation

Solving the Last Mile Problem of Foundation Models with Data-Centric AI

Today, large language or “foundation” models (FMs) represent one of the most powerful new ways to build AI models; however, they still struggle to achieve production-level accuracy out of the box on complex, high-value, and/or dynamic use cases, often “hallucinating” facts, propagating data biases, and misclassifying domain-specific edge cases. This “last mile” problem is always the hardest part of shipping real AI applications, especially in the enterprise. And while FMs provide powerful foundations, they do not “build the house”.

Alex Ratner
7:10 PM – 7:20 PM
Lightning Talk

Taking LangChain Apps to Production with langchain-serve

Scalable, serverless deployments of LangChain apps in the cloud without sacrificing the ease and convenience of local development, plus streaming experiences without worrying about infrastructure.
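
The sketch below is not langchain-serve's API; it only illustrates the underlying idea of exposing a chain-like callable over HTTP so it can be deployed to a serverless or containerized runtime. The FastAPI setup and the my_chain placeholder are assumptions for the example.

```python
# Serving a chain-like callable over HTTP (generic illustration).
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

def my_chain(question: str) -> str:   # stand-in for a real LangChain chain
    return f"echo: {question}"

class Ask(BaseModel):
    question: str

@app.post("/ask")
def ask(body: Ask) -> dict:
    return {"answer": my_chain(body.question)}

# Run locally with:  uvicorn app:app --reload
```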

Deepankar Mahapatro
7:10 PM – 7:40 PM
Panel Discussion

Cost Optimization and Performance

Lina Weichbrodt
Luis Ceze
Jared Zoneraich
Daniel Campos
 Mario Kostelac
7:40 PM – 7:50 PM
1:1 networking

Guided Meditation Break

Demetrios Brinkmann
7:50 PM – 8:20 PM
Presentation

Building Defensible Products with LLMs

LLMs unlock a huge range of new product possibilities but with everyone using the same base models, how can you build something differentiated? In this talk, we'll look at case studies of companies that have and haven't got it right and draw lessons for what you can do.

Raza Habib
7:50 PM – 8:20 PM
Presentation

Want high performing LLMs? Hint: It is all about your data

Building LLMs that work well in production, at scale, can be a slow, iterative, costly and unpredictable process. While new LLMs emerge each day, similar to what we saw in the Transformers era, models are getting increasingly commoditized – the differentiator and key ingredient for high performing models will be the data you feed them.

This talk focuses on the criticality of ensuring data scientists work with high quality data across the ML workflow, the importance of pre-training and the common gotchas to avoid in the process.

Vikram Chatterji
8:20 PM – 8:30 PM
Lightning Talk

Agentic Relationship Management

Ashe Magalhaes
8:20 PM – 8:50 PM
Presentation

Using LLMs to Punch Above your Weight!

As a small business, competing with large incumbents can be a daunting challenge. They have more money, more people, and more data, but they can also be inflexible and slow to adopt new technologies. In this talk, we will explore how small businesses can use the power of large language models (LLMs) to compete with large incumbents, particularly in industries like insurance. We will present two examples of how we are using LLMs at Anzen to streamline insurance underwriting and analyze employment agreements and discuss ideas for future applications. By harnessing the power of LLMs, small businesses can level the playing field and compete more effectively with larger companies.

Cameron Feenstra
8:20 PM – 9:20 PM
Workshop

Making LLMs faster, cheaper, and smarter

While Large Language Models (LLMs) such as GPT-4 come pre-loaded with loads of useful general knowledge, they're rarely able to be deployed directly due to gaps in quality (lack of specialization) and deployability (too large and expensive). In this workshop, we'll demonstrate how the data development platform Snorkel Flow can be used to distill the relevant knowledge from state-of-the-art LLMs into smaller specialist models that are faster, cheaper, and higher quality.
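
The general distillation idea can be sketched in a few lines (a generic illustration, not Snorkel Flow's API): have the large model label raw text once, then train a small, cheap specialist on those labels. The helper names and the scikit-learn student model are arbitrary choices.

```python
# Distillation by labeling: an LLM labels data, a small model learns from it.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def distill(texts, label_with_llm):
    """`label_with_llm` is any function text -> label (e.g. a prompted LLM call)."""
    labels = [label_with_llm(t) for t in texts]   # expensive, but done only once
    student = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    student.fit(texts, labels)                    # cheap to train and to serve
    return student

# student = distill(raw_support_tickets, label_with_llm)
# student.predict(["Where can I download my invoice?"])  # fast, no API call
```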

Braden Hancock
8:30 PM – 8:40 PM
Lightning Talk

SLMFast: Unleashing Speech Intelligence through Domain-Specific Language Models (DSLMs)

Despite their advanced capabilities, Large Language Models (LLMs) are often too slow and resource-intensive for use at scale in voice applications, particularly for large-scale audio or low-latency real-time processing. SlimFast addresses this challenge by introducing Domain-Specific Language Models (DSLMs) that are distilled from LLMs on specific data domains and tasks. SlimFast provides a practical solution for real-world applications, offering blazingly fast and resource-conscious models while maintaining high performance on speech intelligence tasks. We demo a new ASR-DSLM pipeline that we recently built, which performs summarization on call center audio.
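
The overall pipeline shape is easy to picture with open models (a generic sketch, not Deepgram's SLMFast stack): an ASR model turns audio into a transcript, and a compact summarization model condenses it. The model names are illustrative, and long recordings would need to be chunked first.

```python
# Generic ASR -> summarization pipeline for call audio (illustrative).
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def summarize_call(audio_path: str) -> str:
    transcript = asr(audio_path)["text"]              # speech -> text
    return summarizer(transcript, max_length=120)[0]["summary_text"]

# print(summarize_call("call_center_sample.wav"))
```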

Andrew Seagraves
8:40 PM – 8:50 PM
Lightning Talk

Sharing the Wheel: Guiding LLMs While Staying in the Driver's Seat

Jacob van Gogh
Sponsors

Diamond Level

Snorkel AI

Gold Level

Tecton
Petuum
QuantumBlack, AI by McKinsey
Wallaroo
Union

Silver Level

Community Level

The Turing Post
Big Brain
Event has finished
April 13, 3:00 PM, GMT
Online
Organized by
MLOps Community