Large Language Models have taken the world by storm. But what are the real use cases? What are the challenges in productionizing them?
In this event, you will hear from practitioners about how they are dealing with challenges such as cost optimization, latency requirements, trust in model output, and debugging.
You will also get the opportunity to join workshops that will teach you how to set up your use cases and skip over all the headaches.
In this talk, we explore the thriving ecosystem of tools and technologies emerging around large language models (LLMs) such as GPT-3. As the LLM landscape enters the "Holy $#@!" phase of exponential growth, a surge of developers are building remarkable product experiences on top of these models, giving rise to a rich collection of DevTools. We delve into the current state of LLM DevTools, their significance, and future prospects. We also examine the challenges and opportunities involved in building intelligent features using LLMs, discussing the role of experimentation, prompting, knowledge retrieval, and vector databases. Moreover, we consider the next set of challenges faced by teams looking to scale their LLM features, such as data labeling, fine-tuning, monitoring, observability, and testing. Drawing parallels with previous waves of machine learning DevTools, we predict the trajectory of this rapidly maturing market and the potential impact on the broader AI landscape. Join us in this exciting discussion to learn about the future of AI-driven applications and the tools that will enable their success.
The rise of LLMs means we're entering an era where intelligent agents with natural language interfaces will invade every kind of software on Earth. But how do we fix them when they hallucinate? How do we put guardrails around them? How do we protect them from giving away our secrets or falling prey to social engineering? We're on the cusp of a brand new era of incredible capabilities, but we've also got new attack vectors and problems that will change how we build and defend our systems. We'll talk about how we can solve some of these problems now and what we can do in the future to solve them better.
Are apps built on large-language models "just a thin wrapper" that others can quickly replicate, or can they be more defensible? This talk explores how to build moats and amazing new products with these "reasoning machines", by using the entire stack around the LLMs as well as the models themselves.
One of the biggest challenges of getting LLMs into production is their sheer size and computational complexity. This talk explores how smaller specialised models can, in most cases, produce equally good results while being significantly cheaper and easier to deploy.
Large Language Models (LLMs) are starting to revolutionize how users search for, interact with, and generate new content. There is one challenge, though: how do users easily apply LLMs to their own data? LLMs are pre-trained on enormous amounts of publicly available natural language data, but they don't inherently know about your personal or organizational data. LlamaIndex solves this by providing a central data interface for your LLMs. In this talk, we cover the tools that LlamaIndex offers (both simple and advanced) to ingest and index your data for LLM use.
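The ingest-and-index pattern the talk describes can be sketched in a few lines. This is a deliberately simplified illustration (a keyword inverted index in plain Python), not LlamaIndex's actual API; real pipelines embed chunks with a model and store the vectors in a vector database:

```python
def chunk(text, size=200):
    """Split a document into fixed-size chunks for indexing."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def build_index(docs):
    """Ingest: map each lowercase token to the chunks containing it."""
    index = {}
    for doc in docs:
        for piece in chunk(doc):
            for token in piece.lower().split():
                index.setdefault(token, []).append(piece)
    return index

def query(index, question):
    """Retrieve candidate chunks to place into the LLM prompt as context."""
    hits = []
    for token in question.lower().split():
        hits.extend(index.get(token, []))
    return hits
```

For example, indexing an internal document and then querying it returns the relevant chunk, which would be prepended to the prompt so the LLM can answer from your data rather than from its pre-training alone.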
Join us on 'Track 1'!
With the fast pace of innovation and the release of Large Language Models like Bard and GPT-4, the role of data scientists and machine learning engineers is rapidly changing. APIs from Google, OpenAI, and other companies democratize access to machine learning but also commoditize some machine learning projects.
In his talk, Hannes will explain the state of the ML world and which machine learning projects are in danger of being replaced by 3rd party APIs. He will walk the audience through a framework to determine if an API could replace your current machine-learning project and how to evaluate Machine Learning APIs in terms of data privacy and AI bias. Furthermore, Hannes will dive deep into how you can hone your machine-learning knowledge for future projects.
This event is for those who want to learn how to quickly prototype and ship AI apps. We're going to share a few cheat codes: you'll learn how to use pre-trained ML models, how to build generative AI apps, chatbots, and conversational agents, and how to use GPUs for maximum productivity. You don't want to miss this!
LLMs have garnered immense attention in a short span of time, with their capabilities usually demonstrated to the world in scenarios that demand little precision, like demos and MVPs. But as we all know, deploying to prod is a whole other ballgame. In this talk, we'll discuss pitfalls to expect when deploying LLMs to production use cases, both at the terminal layer (direct-to-user) and at intermediate layers. We'll approach the topic through both infrastructural and output-focused lenses and explore potential solutions to challenges ranging from foundation-model downtime and latency concerns to output variability and prompt injections.
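One common mitigation for the output-variability problem mentioned above can be sketched as a validate-and-retry loop. This is a minimal sketch with hypothetical function names, not any particular framework's API:

```python
def generate_with_validation(prompt, call_model, validate, retries=3):
    """Call a model client and retry until the output passes validation.

    call_model and validate are caller-supplied: e.g. call_model wraps an
    LLM API request, and validate checks the output parses as expected.
    """
    for _ in range(retries):
        out = call_model(prompt)
        if validate(out):
            return out
    # Fall back explicitly rather than passing malformed output downstream.
    return None
```

In practice the validator might check that the response is parseable JSON, matches a schema, or stays within length limits; returning `None` forces the calling layer to handle failure deliberately.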
In this roundtable, we will share our experiences with LLMs across a number of real-world applications, including what it takes to build systems around LLMs in a rapidly changing landscape. We will discuss the challenges around productionising LLM-based solutions, evaluating output quality, and the implications around risk and compliance.
As the landscape of large language models (LLMs) advances at an unprecedented rate, novel techniques are constantly emerging to make LLMs faster, safer, and more reliable in production. This talk explores some of the latest patterns that builders have adopted when integrating LLMs into their products.
This talk will cover everything related to getting LLMs to use tools. It will discuss why enabling tool use is important, different types of tools, popular prompting strategies for using tools, and what difficulties still exist.
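As a minimal illustration of what "using tools" looks like mechanically, here is a sketch of the dispatch step, assuming a made-up JSON tool-call format rather than any specific framework's convention:

```python
import json

# Hypothetical tool registry: maps tool names to plain Python callables.
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
    "lookup": lambda key: {"capital_of_france": "Paris"}.get(key, "unknown"),
}

def dispatch(model_output: str) -> str:
    """Parse a JSON tool call emitted by the model and run the named tool.

    Assumed format (an illustration, not a standard):
    {"tool": "<name>", "input": "<argument string>"}
    """
    try:
        call = json.loads(model_output)
        tool = TOOLS[call["tool"]]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Model produced something that isn't a well-formed tool call.
        return "error: unrecognized tool call"
    return tool(call["input"])
```

The result string would be fed back into the prompt so the model can continue reasoning with the tool's answer; the hard parts the talk covers (prompting the model to emit valid calls, choosing among many tools) sit upstream of this loop.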
Adam will highlight potential negative user outcomes that can arise when adding LLM-driven capabilities to an existing product. He will also discuss strategies and best practices that can be used to ensure a high-quality user experience for customers.
Chat-based interfaces are the command-line interfaces of modern generative AI systems, and we should be more imaginative and ambitious when thinking about how we'll interact with LLMs in the future. I'll share five concrete big ideas for designing good interactions with LLMs and other generative AI models that I've found helpful in my own work, and hopefully in the process enable you to start thinking outside of the chat box (hah, literally!) when building your own products.
Generative models such as ChatGPT have changed many product roadmaps. Interfaces and user experiences can now be re-imagined and often drastically simplified to something resembling a Google search bar where the input is natural language. However, some models remain behind APIs without the ability to re-train on contextually appropriate data. Even when the model weights are publicly available, re-training or fine-tuning is often expensive, requires expertise, and is ill-suited to problem domains with constant updates. How, then, can such APIs be used when the data needed to generate an accurate output was not present in the training set because it is constantly changing?

Vector embeddings represent the impression a model has of some, likely unstructured, data. When combined with a vector database or search algorithm, embeddings can be used to retrieve information that provides context for a generative model. Such embeddings, linked to specific information, can be updated in real time, providing generative models with a continually up-to-date, external body of knowledge. Suppose you wanted to build a product that could answer questions about internal company documentation as an onboarding tool for new employees. For large enterprises especially, re-training a model on this ever-changing body of knowledge would be untenable in terms of cost-to-benefit ratio. Instead, using a vector database to retrieve context for prompts allows for point-in-time correctness of generated output. This also prevents model "hallucinations", as models can be instructed to provide no answer when the vector search returns results below some confidence threshold.

In this talk we will demonstrate the validity of this approach through examples. We will provide instructions, code, and other assets that are open source and available on GitHub.
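The confidence-threshold idea described above can be sketched with a toy in-memory "database" and hand-made vectors standing in for a real embedding model and vector store (all names and numbers here are illustrative assumptions):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "vector database": (embedding, source text) pairs. A real system
# would compute embeddings with a model and query a vector store.
DOCS = [
    ([0.9, 0.1, 0.0], "VPN setup guide: install the client, then sign in."),
    ([0.1, 0.8, 0.2], "Expense policy: submit receipts within 30 days."),
]

def retrieve_context(query_vec, threshold=0.8):
    """Return the best-matching document, or None below the threshold,
    so the generative model can be instructed to give no answer."""
    best = max(DOCS, key=lambda d: cosine(query_vec, d[0]))
    score = cosine(query_vec, best[0])
    return best[1] if score >= threshold else None
```

When `retrieve_context` returns `None`, the prompt tells the model to say it doesn't know, which is exactly the hallucination guard described in the abstract.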
Join us on 'Track 1'! Prizes and swag for the person who can pull off the nastiest prompt injections.
Today, large language or "foundation" models (FMs) represent one of the most powerful new ways to build AI models; however, they still struggle to achieve production-level accuracy out of the box on complex, high-value, and/or dynamic use cases, often "hallucinating" facts, propagating data biases, and misclassifying domain-specific edge cases. This "last mile" problem is always the hardest part of shipping real AI applications, especially in the enterprise; while FMs provide powerful foundations, they do not "build the house".
Scalable, serverless deployments of LangChain apps on the cloud, without sacrificing the ease and convenience of local development. Streaming experiences without worrying about infrastructure.
LLMs unlock a huge range of new product possibilities but with everyone using the same base models, how can you build something differentiated? In this talk, we'll look at case studies of companies that have and haven't got it right and draw lessons for what you can do.
Building LLMs that work well in production, at scale, can be a slow, iterative, costly, and unpredictable process. While new LLMs emerge each day, much as we saw in the Transformer era, models are getting increasingly commoditized; the differentiator and key ingredient for high-performing models will be the data you feed them.
This talk focuses on how critical it is that data scientists work with high-quality data across the ML workflow, the importance of pre-training, and the common gotchas to avoid along the way.
As a small business, competing with large incumbents can be a daunting challenge. They have more money, more people, and more data, but they can also be inflexible and slow to adopt new technologies. In this talk, we will explore how small businesses can use the power of large language models (LLMs) to compete with large incumbents, particularly in industries like insurance. We will present two examples of how we are using LLMs at Anzen to streamline insurance underwriting and analyze employment agreements and discuss ideas for future applications. By harnessing the power of LLMs, small businesses can level the playing field and compete more effectively with larger companies.
While Large Language Models (LLMs) such as GPT-4 come pre-loaded with loads of useful general knowledge, they're rarely able to be deployed directly due to gaps in quality (lack of specialization) and deployability (too large and expensive). In this workshop, we'll demonstrate how the data development platform Snorkel Flow can be used to distill the relevant knowledge from state-of-the-art LLMs into smaller specialist models that are faster, cheaper, and higher quality.
Despite their advanced capabilities, Large Language Models (LLMs) are often too slow and resource-intensive for use at scale in voice applications, particularly for large-scale audio or low-latency real-time processing. SlimFast addresses this challenge by introducing Domain Specific Language Models (DSLMs) that are distilled from LLMs on specific data domains and tasks. SlimFast provides a practical solution for real-world applications, offering blazingly fast and resource-conscious models while maintaining high performance on speech intelligence tasks. We demo a new ASR-DSLM pipeline that we recently built, which performs summarization on call center audio.