Large Language Models have taken the world by storm. But what are the real use cases? What are the challenges in productionizing them?
In this event, you will hear from practitioners about how they are dealing with things such as cost optimization, latency requirements, trust of output and debugging.
You will also get the opportunity to join workshops that will teach you how to set up your use cases and skip over all the headaches.
If software 2.0 was about designing data collection for neural network training, software 3.0 is about manipulating foundation models at a system level to create great end-user experiences. AI-native applications are “GPT wrappers” the way SaaS companies are database wrappers. This talk discusses the huge design space for software 3.0 applications and explains Conviction’s framework for value, defensibility and strategy in specifically assessing these companies.
In this talk, Shreya will share a candid look back at a year dedicated to developing reliable AI tools in the open-source community. The talk will explore which tools and techniques have proven effective and which ones have not, providing valuable insights from real-world experiences. Additionally, Shreya will offer predictions on the future of AI tooling, identifying emerging trends and potential breakthroughs. This presentation is designed for anyone interested in the practical aspects of AI development and the evolving landscape of open-source technology, offering both reflections on past lessons and forward-looking perspectives.
This talk will cover how we fine-tuned a model to generate health insurance appeals. If you've ever gotten a health insurance denial and just kind of given up, hopefully the topic speaks to you. Even if you have not, come learn about our adventures in using different cloud resources for fine-tuning and, finally, an on-prem Kubernetes-based deployment in Fremont, CA, including when the graphics cards would not fit in the servers.
This session examines the pivotal role of retrieval evaluation in Large Language Model (LLM)-based applications like RAG, emphasizing its direct impact on the quality of generated responses. We explore the correlation between retrieval accuracy and answer quality, highlighting the significance of meticulous evaluation methodologies.
It is possible to build knowledge graphs (KGs) with LLMs through prompt engineering. But are we boiling the ocean? Can we improve the quality of the generated graph elements by using - dare I say it - SLMs (small language models)?
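As a rough illustration of the prompt-engineering approach the talk questions, here is a minimal sketch of triple extraction for KG construction. The prompt template and parser are hypothetical stand-ins, not the speaker's method; any language model, large or small, could fill the `{text}` slot's completion.

```python
# A hypothetical prompt asking a language model to emit KG triples
# in a fixed, easy-to-parse line format.
KG_EXTRACTION_PROMPT = """Extract (subject, predicate, object) triples from the text.
Return one triple per line as: subject | predicate | object

Text: {text}
Triples:"""

def parse_triples(llm_output: str) -> list[tuple[str, str, str]]:
    """Parse the model's line-oriented output; skip malformed lines."""
    triples = []
    for line in llm_output.strip().splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:
            triples.append(tuple(parts))
    return triples

# Simulated model output (no real LLM call here):
sample = "Ada Lovelace | collaborated with | Charles Babbage"
print(parse_triples(sample))  # [('Ada Lovelace', 'collaborated with', 'Charles Babbage')]
```

The graph-quality question the talk raises lives largely in that parser: how many lines come back malformed, duplicated, or hallucinated is one practical way to compare an LLM against an SLM on the same prompt.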
Over the past few years, we have witnessed a rapid evolution of generative large language models (LLMs), culminating in the creation of unprecedented tools like ChatGPT. Generative AI has now become a popular topic among both researchers and the general public. Now more than ever before, it is important that researchers and engineers (i.e., those building the technology) develop an ability to communicate the nuances of their creations to others. A failure to communicate the technical aspects of AI in an understandable and accessible manner could lead to widespread public scepticism (e.g., research on nuclear energy went down a comparable path) or the enactment of overly-restrictive legislation that hinders forward progress in our field. Within this talk, we will take a small step towards solving these issues by proposing and outlining a simple, three-part framework for understanding and explaining generative LLMs.
Lots of companies are investing time and money in LLMs, some even have customer-facing applications, but what about some common sense? Impact assessment | Risk assessment | Maturity assessment.
As builders, engineers, and creators, we often think about the full life-cycle of a machine learning or AI project: gathering data, cleaning the data, and training and evaluating a model. But what about the experiential qualities of an AI product that we want our users to experience on the front end? Join me to learn about the foundational questions I ask myself and my team while building products that incorporate LLMs.
In the rapidly evolving field of natural language processing, the evaluation of Large Language Models (LLMs) has become a critical area of focus. We will explore the importance of a robust evaluation strategy for LLMs and the challenges associated with traditional metrics such as ROUGE and BLEU. We will conclude the talk with some nontraditional metrics, such as correctness, faithfulness, and freshness, that are becoming increasingly important in the evaluation of LLMs.
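A minimal sketch of the challenge with n-gram metrics that this abstract alludes to: a simplified ROUGE-1-style overlap score (not the official ROUGE implementation) can rank a factually wrong answer above a correct paraphrase, which is why metrics like correctness and faithfulness matter.

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1: a simplified stand-in for ROUGE-1."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

reference = "The Eiffel Tower is located in Paris"
wrong = "The Eiffel Tower is located in Rome"     # factually wrong, high word overlap
right = "You can find the Eiffel Tower in Paris"  # correct paraphrase, lower overlap

print(rouge1_f1(wrong, reference))  # ~0.86: the wrong answer scores higher
print(rouge1_f1(right, reference))  # ~0.67
```

The wrong answer shares six of seven reference words, so it outscores the correct paraphrase; an n-gram metric has no notion of the one word that actually matters.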
We will discuss how we can go from a developed solution to production in the context of vision models, exploring fine-tuning LoRAs, upscaling pipelines, constraint-based generation, and step-by-step improvement of overall performance and quality for a production-ready service.
In this presentation, we navigate the iterative development of Large Language Model (LLM) applications and the intricacies of LLMOps design. We emphasize the importance of anchoring LLM development in practical business use cases and a deep understanding of your own data. Continuous Integration and Continuous Deployment (CI/CD) should be a core component of LLM pipeline deployment, just as in Machine Learning Operations (MLOps). However, the unique challenges posed by LLMs include addressing data security, API governance, the imperative need for GPU infrastructure in inference, integration with external vector databases, and the absence of clear evaluation rubrics. Join us as we illuminate strategies to overcome these challenges and make strategic adaptations. Our journey includes reference architectures for the seamless productionization of RAGs on the Databricks Lakehouse platform.
I'll be talking about the challenges of evaluating language models, how to address them, what metrics you can use, and what datasets are available. We'll also discuss the difficulties of continuous evaluation in production and common pitfalls.
Takeaways: A call to action to contribute to public evaluation datasets and a more concerted effort from the community to reduce harmful bias.
A quick rundown of Helix and how it helps you fine-tune text and image AI, all using the latest open-source models. Kai will discuss some of the issues that cropped up when creating and running a fine-tuning-as-a-service platform.
GitHub Copilot, based on GPT, is truly a game changer when it comes to automating code generation, boosting developer productivity by more than 100%. In this session, you will learn what GitHub Copilot is, and you will build a console web app in about 10 minutes with GitHub Copilot!
What's old is new again. As we gain more experience with RAG, we're starting to pay more attention to improving retrieval quality. From hybrid search to reranking, RAG pipelines are starting to look more and more like recommender pipelines. In this lightning talk we'll take a brief look at the parallels between the two, and we'll check out how to do hybrid reranking with LanceDB to improve your retrieval quality.
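To make "hybrid reranking" concrete before the talk: a common way to fuse keyword-search and vector-search result lists is reciprocal rank fusion (RRF). The sketch below is the bare fusion idea in plain Python, not LanceDB's API; the doc ids and result lists are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked result lists (best first) with reciprocal rank fusion.

    Each document scores 1 / (k + rank) per list it appears in;
    k=60 is the smoothing constant commonly used with RRF.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["doc3", "doc1", "doc7"]    # hypothetical keyword-search order
vector_hits = ["doc1", "doc9", "doc3"]  # hypothetical embedding-search order
print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
# ['doc1', 'doc3', 'doc9', 'doc7']
```

Note how `doc1` wins by appearing near the top of both lists: exactly the behavior that makes hybrid pipelines resemble the candidate-generation-plus-ranking shape of recommender systems.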
Mihail Eric, one of the community members, is a founder by day and a stand-up comic by night!
A year ago, with the introduction of GPT-4, the sphere of machine learning was transformed completely. These advancements and LLMs unlocked the capability to address previously unsolvable problems, but also commoditized machine learning.
In this session, we will delve into the intricacies of the emerging LLMOps Stack, exploring the tools and best practices that empower organizations to harness the full potential of LLMs.
Large language models (LLMs) can unlock great productivity in software engineering, but it's important to acknowledge their limitations, particularly in generating robust code. This talk, "Accelerate ML Production with Agents," discusses applying the abstraction of LLMs with tools to tackle complex challenges. Agents have the potential to streamline the orchestration of ML workflows and simplify customization and deployment processes.
Large Language Models (LLMs) are revolutionizing how users can search for, interact with, and generate new content. There's been an explosion of interest around Retrieval Augmented Generation (RAG), enabling users to build applications such as chatbots, document search, workflow agents, and conversational assistants using LLMs on their private data.
While setting up naive RAG is straightforward, building production RAG is very challenging. There are parameters and failure points along every stage of the stack that an AI engineer must solve in order to bring their app to production.
This talk will cover the overall landscape of pain points and solutions around building production RAG, and also paint a picture of how this architecture will evolve over time.
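To ground the "parameters and failure points along every stage" framing, here is a toy naive-RAG skeleton. Every helper is a hypothetical stand-in (the embedding is a character-frequency vector, not a real model, and no LLM is called); the point is only to show the stages where production RAG can fail: chunking, retrieval, and prompt construction.

```python
def chunk(text: str, size: int = 200) -> list[str]:
    """Stage 1: split documents; chunk size is itself a tunable failure point."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> list[float]:
    """Toy embedding: 26-dim letter-frequency vector (stand-in for a real model)."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Stage 2: similarity search; recall here bounds answer quality downstream."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]

def build_prompt(query: str, context: list[str]) -> str:
    """Stage 3: prompt construction, before handing off to an LLM for generation."""
    return "Answer using only this context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"

docs = ["paris is in france", "tokyo is in japan"]
print(build_prompt("Where is Paris?", retrieve("paris france", docs, top_k=1)))
```

In a production system each of these stages gets its own parameters and its own evaluation (chunking strategy, embedding model, top-k, reranking, prompt format), which is where the complexity this talk covers comes from.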