MLOps Community

Productionizing AI: How to Think From the End

Posted Mar 04, 2024 | Views 344
# Productionizing AI
# LLMs
# Bainbridge Capital
SPEAKERS
Annie Condon
Principal Data Scientist @ Bainbridge Capital

With over 7 years of experience in the data science and machine learning fields, Annie is a compassionate translator of the digital realm. She has applied her skills in data analysis, data engineering, and machine learning to build solutions that support military service members, educators, students, and environmental causes. She has helped companies like Northrop Grumman, Capital One, and Bainbridge deliver innovative and impactful AI and data science solutions that drive value and enhance customer experience. Her mission is to make data science and artificial intelligence an approachable topic, creating a community of individuals eager to explore and harness the power of data science and AI responsibly.

Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.

SUMMARY

As builders, engineers, and creators, we often think about the full life cycle of a machine learning or AI project: gathering data, cleaning the data, and training and evaluating a model. But what about the experiential qualities of an AI product that we want our user to experience on the front end? Join me to learn about the foundational questions I ask myself and my team while building products that incorporate LLMs.

TRANSCRIPT

Productionizing AI: How to Think From the End

AI in Production

Slides: https://drive.google.com/file/d/1M86RiOXzL1T21TzZ4FDAAZfzflRc4lNn/view?usp=drive_link

Demetrios [00:00:05]: I have the great pleasure of bringing back onto the stage Annie. Hey.

Annie Condon [00:00:11]: Hi. Thanks for your patience.

Demetrios [00:00:14]: This is a true testament of dedication and determination and how it pays off. And ain't no permissions going to hold us back today.

Annie Condon [00:00:25]: No. Do you see my screen?

Demetrios [00:00:29]: I do. I'm going to bring it onto the stage and I'm going to let you have ten minutes, and I will come and see you in ten minutes. See you.

Annie Condon [00:00:40]: Cool. Sounds good. Yeah. And I'll try and make this like a lightning, lightning talk. So, yeah. Hi, everyone, I'm Annie. Thanks for your patience in getting me up here, and thank you to Demetrios and the MLOps Community team for creating this event and this space.

Annie Condon [00:00:57]: It's been so awesome. And even just like, the past two talks were super relevant to what I'm doing. I'm technically a data scientist, but I'm not really sure if I'm a data scientist anymore. I'm kind of having an identity crisis with this crazy world of AI. But, yeah, I've worked at really large enterprises, I've worked for startups, and I'm currently working on building recommendation systems at a private equity company. So I guess I'm kind of like a practical application of trying to figure out some of the questions from the last two or three talks that were just up here. So, yeah, I'm speaking to you from the perspective of someone who, like, two years ago was just a meager data scientist, training and deploying regression and tree-based models. And I'm like that person who is always running to try and catch the train, and I always feel like I'm late. And that's really been my experience with LLMs, but we all have to adapt.

Annie Condon [00:02:07]: I've been adapting by signing up for 15 Udemy courses on generative AI and then never taking them, and then trying to build an LLM app in a weekend on a time crunch. So, yeah, I'm going to take some of my own advice, and I'm going to think from the end of this presentation, which is now less than ten minutes from now. And I hope that at the end of this, not only do you feel like a winner, but you have some key questions to think about when deploying LLMs in production, whether that's on your own, if you've never done it before, or if your company or organization is looking to deploy LLMs in production. And I hope that you feel a sense of empowerment that you can get production experience with LLMs if I did. As humans, we're not really built to think from the end, and that's what makes product managers' jobs so hard, right? So I'm going to try and do that with an example that's loosely based on my experience right now of preparing to deploy LLMs. So maybe a business leader says, I want a user to find a match within the first five recommendations that they see in the application. In my case, a lot of our users can be boomers. So people who might not have grown up in the tech age, and they're not used to using applications.

Annie Condon [00:03:41]: So we have like a really quick minute for them to either enjoy their experience or not enjoy their experience. And let's assume that these recommendations in our app are developed using llms. So already with this business need, we're thinking about the user's experience, how good the recommendations are, what data a new user would need to input to receive a good recommendation, and how quickly the app can produce a recommendation based on data or even new data. So in the data science lifecycle, which is something that I'm most accustomed to and kind of like what I grew up on, it can be somewhat linear and with the end goal being get good model performance and go back and get quality data, engineer better features. And so this thinking can be somewhat linearly. And even where I currently work, we're not even at a stage where we're able to fine tune llms or even use Rag, we're just using inference. But we have naturally leaned on llms for nearly every step of the data science lifecycle because it's just so convenient and so useful. We mostly have unprocessed text data as our inputs and we don't currently have labels.

Annie Condon [00:05:11]: So using llms, it's like a given that we're going to use it in our data science lifecycle, llms have proven to be useful for cleaning up text data that's scraped from the Internet, engineering features and creating similarity searches. So getting there and getting some recommendations is just one part of this business need that we have. But once we have those good recommendations and we've sort of been through that data science lifecycle, if we go back to that business need, that only covers just a small portion of that business need, which is getting good recommendations. So some of the other questions that we need to ask if we're going to have llms anywhere in the lifecycle of our product. So as an example, we use OpenAI API as part of our data cleaning process. There are rate limits on the API and it's costly. So how long will it take to process data users at once doing things like tokenization and text embeddings. That's different from the realm of traditional data science pre processing that I've done in production.

Annie Condon [00:06:31]: So what is the compute on something like that? What's the compute on running pre trained llms like Bert going back to the evaluation piece, how do we evaluate this and collect the right data points to see if the predictions that our llms are making or the similarity searches that they're doing are what we want the user to experience? And then also if the similarity search or the LLM is utilizing a description, for example, of a business or of a person who's using the application, how do we collect quality data from the user so that we know that it's going to produce a good recommendation? We can't have a user come in and just give us a one sentence description and hope that that will be quality enough to have them have a good experience. So how much do we guide that? And I've watched data scientists eyes glaze over when they suddenly realize that the cool results that they produced using an LLM be reproduced and automated in the world of the app. So I'll just skip ahead here because I think ultimately I really wanted to give an example of how you could simple, quick, and dirty deploy an LLM, which is kind of like what I'm trying to do right now. But I realized that in some ways it's not simple. And so where I'm at is I'm working on deploying our models in AWS cloud. No particular tie to AWS, it's just the cloud service that we use so that I can kind of test the limits of all these questions that. So just as like whether you're on your own right now trying to figure out how to deploy llms, or you work for a company that's interested in deploying llms. The cool thing is that a lot of the cloud services, especially AWS, offer a lot of integrations with llms.

Annie Condon [00:08:48]: So you could create an AWS account if you don't already have one, and use the sagemaker integrations with hugging face to deploy your llms. That means you don't even have to download the model artifacts to deploy an LLM. Also, AWS Bedrock allows you to invoke an LLM on the bedrock runtime. And just yesterday, deep learning AI released a free course on creating serverless LLM apps with bedrock. So again, this is something where you don't need to get a job to then get the experience that you need to deploy llms, you can go out and play with some of this stuff yourself, even if it seems a little bit scary or you haven't really explored this world yet. And even if you just tried this, you would be at where I'm at right now. So I guess in conclusion, this is kind of like deciding to have kids. There's never a right time where you'll be ready to start deploying llms.

Annie Condon [00:09:59]: So go play with it and. Yeah, thank you.

Demetrios [00:10:06]: I love this quote at the end. This is so awesome. And I was trying to tweet what you were saying. Like, "I wanted to give a quick and dirty way of deploying LLMs, except I realized there's not really a quick and dirty way of doing this." It's amazing, even if the marketing teams want you to think that there is. That's the problem with it, I think.

Annie Condon [00:10:29]: Yes. Absolutely.

Demetrios [00:10:31]: Yes. If you ask any marketing team right now, they will tell you that it's easy. It's like that classic meme where half the horse is drawn with stick figures and then the other half is drawn.

Annie Condon [00:10:46]: Yeah, exactly.

Demetrios [00:10:48]: Really detailed, in depth. And so it's like how to draw a horse and draw that. So, annie, awesome. Thank you so much for persisting and giving us this talk. It was awesome. I really appreciate you coming on here.

Annie Condon [00:11:03]: Thank you.
