Join us for two days of talking with some of our favorite people at the forefront of using LLMs in the wild, and an in-person workshop in San Francisco on how to build and deploy LLM-based apps, hosted by Anyscale.
There will be over 50 speakers from Stripe, Meta, Canva, Databricks, Anthropic, Cohere, Redis, LangChain, Chroma, Humanloop, and so many more.
This all started after we put together the LLMs in-production survey and realized there are still lots of unknowns when dealing with LLMs, especially at scale. We open-sourced all the responses and decided that if no one else was going to talk about working with LLMs in a non-over-hyped way, we would have to.
Let's discover how to use these damn probabilistic models in the best ways possible without sacrificing the necessary software design building blocks.
Expect all the fun and learnings from the first one. DOUBLED.
And remember, there will be some sweeeet sweet swag giveaways.
Huge shoutout to all the sponsors of this event; find more info about them below.
The journey from LLM PoCs to production deployment is fraught with unique challenges, from maintaining model reliability to effectively managing costs. In this talk, we delve deep into these complexities, outlining design patterns for successful LLM production, the role of vector databases, strategies to enhance reliability, and cost-effective methodologies.
Building a chatbot is not easy... Or is it? We need:
An embedding model that translates questions into vectors. A vector database to search. An LLM to generate the answers.
We can orchestrate the job using LangChain with minimal development.
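The three components above can be sketched end to end. This is a minimal toy sketch, not the talk's actual implementation: the bag-of-words `embed` function stands in for a real embedding model, the linear-scan `VectorStore` stands in for a real vector database, and the final string formatting stands in for the LLM generation call.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words counts. A real system would call a
    # sentence-embedding model here instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    # Toy vector database: a linear scan over embedded documents.
    def __init__(self, docs):
        self.docs = [(d, embed(d)) for d in docs]

    def search(self, query: str, k: int = 1):
        q = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]

def answer(question: str, store: VectorStore) -> str:
    context = store.search(question, k=1)[0]
    # Stand-in for the LLM call: a real chain would send the retrieved
    # context plus the question to a model for generation.
    return f"Based on our docs: {context}"
```

In practice an orchestration framework like LangChain wires these same three pieces together for you, which is why the development effort stays small.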
Using Wardley Maps, we can understand value chains and map out the landscape. We can then use this to develop strategies and understand where to target our efforts.
Take a moment to randomly match with others in this event by participating in the networking sessions. To access the random introductions click on the match tab in the left sidebar.
Proprietary LLMs are difficult for enterprises to adopt because of security and data privacy concerns. Open-source LLMs can circumvent many of these problems. While open LLMs are incredibly exciting, they're also a nightmare to deploy and operate in the cloud. Aqueduct enables you to run open LLMs in a few lines of vanilla Python on any cloud infrastructure that you use.
There are key areas we must be aware of when working with LLMs. High costs and low latency requirements are just the tip of the iceberg. In this panel we will hear about common pitfalls and challenges we must keep in mind when building on top of LLMs.
It’s silly to think of training and using large LANGUAGE models without any input from the study of language itself. Linguistics is not the only field of knowledge that can improve LLMs, since they sit at the intersection of several fields; however, it can help us not only improve current model performance but also see clearly where future improvements will come from.
This session provides an overview of the evolving landscape of Generative AI, with a focus on the latest trends and technologies that shape this field. Designed with startups in mind, the talk offers practical insights on how to adapt and leverage these advancements to enhance their products. Attendees will acquire valuable knowledge to navigate the dynamic landscape of Generative AI, enabling them to stay up-to-date and harness untapped potential for the success of their startups.
Here’s the truth: troubleshooting models based on unstructured data is notoriously difficult. The measures typically used for drift in tabular data do not extend to unstructured data. The general challenge with measuring unstructured data drift is that you need to understand the change in relationships inside the unstructured data itself. In short, you need to understand the data in a deeper way before you can understand drift and performance degradation.
In this presentation, Claire Long will present findings from research on ways to measure vector/embedding drift for image and language models. With lessons learned from testing different approaches (including Euclidean and cosine distance) across billions of streams and use cases, she will dive into how to detect whether two unstructured language datasets are different — and, if so, how to understand that difference using techniques such as UMAP.
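One common baseline for the drift measurement described above is to compare the centroids of two embedding sets with the distance metrics the talk mentions. This is a simplified sketch of that idea, not the speaker's actual method; the `embedding_drift` helper and its centroid-based approach are illustrative assumptions.

```python
import math

def centroid(embeddings):
    # Average each dimension across a set of embedding vectors.
    dim = len(embeddings[0])
    n = len(embeddings)
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def embedding_drift(baseline, production):
    # Drift score: distance between the centroid of the baseline
    # (e.g. training-time) embeddings and the centroid of the
    # current production embeddings. Larger means more drift.
    return euclidean(centroid(baseline), centroid(production))
```

A monitoring job would compute this score on a rolling window of production embeddings and alert when it exceeds a threshold calibrated on the baseline data.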
The rapid adoption of large language models (LLMs) is transforming how businesses communicate, learn, and work, prioritizing AI safety and security. This captivating and insightful talk will delve into the challenges and risks associated with LLM adoption and unveil AIShield.GuArdIan – a game-changing technology that enables businesses to leverage ChatGPT-like AI without compromising compliance. AIShield.GuArdIan's unique approach ensures legal, policy, ethical, role-based, and usage-based compliance, allowing companies to harness the power of LLMs safely. Join us on this riveting journey as we reshape the future of AI, empowering industries to unlock the full potential of LLMs securely and responsibly. Don't miss this opportunity to be at the forefront of responsible AI usage – reserve your seat today and take the first step towards a secure AI-powered future!
Large Language Models are an especially exciting opportunity for Operations: they excel at answering questions, completing sentences, and summarizing text while requiring ~100x less training data than the previous generation of models.
In this talk, Sophie will discuss lessons learned productionising Stripe’s first application of Large Language Modelling: providing answers to user questions for Stripe Support.
This lightning talk explores the challenges encountered in offering Large Language Models as a Service. As LLMs are becoming increasingly larger and more proficient, there are certain challenges that arise which need to be addressed to ensure the efficient and reliable delivery of LLMs as a Service. This talk delves into key challenges such as scalability, model optimization, cost-effectiveness, and data privacy.
Document question-answering is a popular LLM use case. LangChain makes it easy to assemble LLM components (e.g., models and retrievers) into chains that support question-answering. But it is not always obvious how to (1) evaluate the answer quality and (2) use this evaluation to guide improved QA chain settings (e.g., chunk size, retrieved docs count) or components (e.g., model or retriever choice). We recently released an open-source, hosted app to address these limitations (see blog post here). We have used this to compare the performance of various retrieval methods, including Anthropic's 100k context length model (blog post here). This talk will discuss our results and future plans.
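The evaluation loop described above can be sketched in a few lines. This is a toy harness, not the released app: the substring-based `grade_answer` stands in for the model-graded evaluation a production harness would typically use, and `qa_chain` is any callable that maps a question to an answer.

```python
def grade_answer(predicted: str, reference: str) -> bool:
    # Toy grader: checks whether the reference answer appears in the
    # prediction. A production harness would typically ask an LLM to
    # grade semantic equivalence instead.
    return reference.lower() in predicted.lower()

def evaluate_chain(qa_chain, eval_set):
    # eval_set: list of (question, reference_answer) pairs.
    # Returns the fraction of questions answered correctly, which can
    # then be compared across chunk sizes, retrievers, or models.
    results = [grade_answer(qa_chain(q), ref) for q, ref in eval_set]
    return sum(results) / len(results)
```

Running `evaluate_chain` over the same eval set with different chain configurations gives the side-by-side comparison the talk describes.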
The impressive reasoning abilities of LLMs can be an attractive proposition for many businesses, but using foundation models through APIs can be slow, with unpredictable latency. Self-hosting models can be an attractive alternative, but how do you choose which model to use, and if you have a latency or inference budget, how do you make it fit? We will discuss how pseudo-labeling, knowledge distillation, pruning, and quantization can ensure the highest efficiency possible.
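Of the techniques listed above, quantization is the simplest to illustrate. This is a minimal sketch of symmetric 8-bit weight quantization on a plain Python list; real implementations operate on tensors with per-channel scales.

```python
def quantize_int8(weights):
    # Symmetric 8-bit quantization: map each float weight to an
    # integer in [-127, 127] using a single scale factor.
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    # Recover approximate float weights for computation.
    return [q * scale for q in quantized]
```

Storing each weight as one byte instead of four shrinks the model roughly 4x, at the cost of the small rounding error visible when you dequantize; pruning and distillation trade accuracy for speed in analogous ways.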
You can’t build robust systems with inconsistent, unstructured text output from LLMs. Moreover, LLM integrations scare corporate lawyers, finance departments, and security professionals due to hallucinations, cost, lack of compliance (e.g., HIPAA), leaked IP/PII, and “injection” vulnerabilities. This talk will cover some practical methodologies for getting consistent, structured output from compliant AI systems. These systems, driven by open access models and various kinds of LLM wrappers, can help you delight customers AND navigate the increasing restrictions on "GPT" models.
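One practical methodology in the spirit of the talk is schema validation with re-prompting: reject any LLM output that is not well-formed, and retry until it is. This is an illustrative sketch, not the speaker's actual approach; the `REQUIRED_KEYS` schema and the `llm` callable are assumptions.

```python
import json

REQUIRED_KEYS = {"sentiment", "confidence"}  # hypothetical output schema

def parse_structured(raw: str):
    # Reject anything that is not valid JSON with the expected keys.
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not REQUIRED_KEYS <= data.keys():
        return None
    return data

def structured_call(llm, prompt, retries=3):
    # Re-prompt until the model returns parseable, schema-conforming
    # JSON, so downstream systems never see free-form text.
    for _ in range(retries):
        parsed = parse_structured(llm(prompt))
        if parsed is not None:
            return parsed
    raise ValueError("no valid structured output after retries")
```

Wrapping every model call this way is what turns an LLM from an unpredictable text generator into a component other systems can depend on.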
Large Language Models (LLMs) have shown remarkable capabilities in domains such as question-answering and information recall, but every so often, they just make stuff up. In this talk, we'll take a look at “LLM Hallucinations" and explore strategies to keep LLMs grounded and reliable in real-world applications.
We’ll start by walking through an example implementation of an "LLM-powered Support Center" to illustrate the problems caused by hallucinations. Next, I'll demonstrate how leveraging a searchable knowledge base can ensure that the assistant delivers trustworthy responses. We’ll wrap up by exploring the scalability of this approach and its potential impact on the future of AI-driven applications.
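The grounding strategy described above can be reduced to a simple rule: answer only from a retrieved document, and decline when nothing in the knowledge base matches well enough. This is a toy sketch of that rule, not the talk's implementation; the lexical `similarity` function stands in for embedding similarity, and the threshold is an illustrative assumption.

```python
import math
from collections import Counter

def similarity(a: str, b: str) -> float:
    # Toy lexical cosine similarity, standing in for the embedding
    # similarity a real searchable knowledge base would use.
    wa, wb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(wa[t] * wb.get(t, 0) for t in wa)
    na = math.sqrt(sum(v * v for v in wa.values()))
    nb = math.sqrt(sum(v * v for v in wb.values()))
    return dot / (na * nb) if na and nb else 0.0

def grounded_answer(question, knowledge_base, threshold=0.3):
    # Answer only from the best-matching document; if nothing clears
    # the threshold, decline rather than let the model improvise
    # (i.e., hallucinate).
    best = max(knowledge_base, key=lambda doc: similarity(question, doc))
    if similarity(question, best) < threshold:
        return "I don't know - no supporting document found."
    return f"According to our docs: {best}"
```

The key design choice is the explicit refusal path: a support assistant that says "I don't know" is far less costly than one that confidently invents a refund policy.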