MLOps Community

Building LLM Applications for Production

Posted Jun 20, 2023 | Views 10.1K
# LLM in Production
# LLMs
# Claypot AI
Chip Huyen
CEO @ Claypot AI

Chip Huyen is a co-founder of Claypot AI, a platform for real-time machine learning. Previously, she was with Snorkel AI and NVIDIA. She teaches CS 329S: Machine Learning Systems Design at Stanford. She’s the author of the book Designing Machine Learning Systems, an Amazon bestseller in AI.

Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.


What do we need to be aware of when building for production? In this talk, we explore the key challenges that arise when taking an LLM to production.


Link to the slides


How could you put me right after Matei? I can't top that. I just wanted to get us all on the same screen for a moment, because we're all gonna be at this LLM Avalanche event in two weeks. So if you're in San Francisco, I'm gonna say it again: I hope to see you there at the LLM Avalanche event. I'm flying to San Francisco specifically for that.

It's gonna be a blast, and I'm gonna get to meet both of you in person, face-to-face, for the first time after we've been hanging out virtually for the last three years. Yeah, it should be fun. I can't wait to put the picture with you both on Instagram. There we go. Matei, thank you so much for talking to us about DSP and all this great knowledge.

I look forward to seeing you in a few weeks, man. Yep, thanks a lot. See you. So, Chip, hey! Oh my God, I'm still very nervous. Whoa, I know it's gonna be incredible. Do not feel nervous at all.

Chip's Blog

I just want to say that every time I try to write something about what's happening in MLOps, whether that's real-time streaming or now this report,

it feels like you beat me to the punch. Then I see what you write and I go, oh, how am I supposed to even do anything now? This is so good. So if people do not follow Chip's blog, we're gonna leave a link to it in the chat, because I imagine there are only like two people on here who don't know who you are and don't follow your blog.

It is so good and so refreshing whenever you drop something. In-depth research is really what you do, and the way you put it all together and explain it, ah, I love it. So that is my intro, Demetrios. I really like you; you don't have to say so many nice things about me. So that's what I'm gonna do right now.

I'm gonna leave it at that, and I'm gonna let you jump on and give your full talk, and then we'll come back for questions. Yeah. How do I see the chat from the audience? I'll post the link for you in the behind-the-scenes channel so that you can just jump in, and then people can talk to you through that.

And also, I did not set up your Slido thing yet, but I'm going to, and I'll let you know when it's ready. I will try to do that right now as you're talking. So Chip's got a little interactivity in her talk, and if I can do my part, it's gonna be cool. Yeah, no, don't worry, it's not a big part.

I usually love talking to the audience and seeing their questions and stuff. But cool, don't worry about the Slido. Thank you so much, Demetrios. I'll text you when it's ready. Don't worry, I'm gonna make it happen. This is all me. I'll see you in like 20, 25 minutes, however long it takes.

Okay. So can you all see me? Cool. Wow, that's an interesting experience, not being able to see the audience. Hi, my name is Chip. Thank you, Demetrios, for this very warm welcome. And I love the community; you guys are amazing, so much more organized than our community. We do run a little MLOps Discord community.

So here's a link if you wanna check it out. So today is a little bit different from the original title of the talk. I thought it would be more fun to just go over open challenges, because everything is changing so fast, and I worry that if I show people how to build applications, it's gonna be outdated.

Challenges in MLOps

So I'm just trying to focus on some of the challenges that I see today, ranked by how I perceive their importance. And I would love to hear from everyone how you guys are thinking about them.

Challenge 1 Consistency

So challenge one is consistency. Whenever we use an application, users expect a certain level of consistency.

And the second aspect of consistency is how to ensure downstream applications can run without breaking. A lot of the time when we use an LLM, we don't just call the LLM and get a response back; we might want to process the response for downstream applications. I think this is a pretty well-known problem at this point.

Because of its stochastic nature, an LLM can sometimes take the same input and produce different outputs. You can enforce determinism by, say, setting temperature to zero. However, that doesn't fix the cases where a small change in the input leads to a very big change in the output.

For example, here I'm just using ChatGPT with a very simple prompt, like "give a review score from one to five," and you can see that I changed the input a little bit and the output is completely different. That's just part of the stochastic nature of LLMs.

And of course, part of the problem is how we build downstream applications on top of LLMs. I was doing this myself: I wanted to use the LLM to generate a score and then use an application to parse that score, and I realized this is really hard, because there's no way to enforce the output schema of LLMs.
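Since the output schema can't be enforced, a common application-side pattern is to constrain the prompt, set temperature to zero, and then validate and retry. A minimal sketch; `call_llm` is a hypothetical stand-in for whatever API client you actually use:

```python
import re

def parse_score(text):
    """Extract an integer review score in [1, 5] from raw LLM output.
    Returns None when no valid score can be recovered."""
    match = re.search(r"\b([1-5])\b", text)
    return int(match.group(1)) if match else None

def get_score(call_llm, review, max_retries=3):
    """Call the model (temperature=0 for determinism) and validate the
    output, retrying when parsing fails, since the model's output
    schema cannot be enforced directly."""
    prompt = f"Rate this review from 1 to 5. Reply with a single digit.\n{review}"
    for _ in range(max_retries):
        raw = call_llm(prompt, temperature=0)
        score = parse_score(raw)
        if score is not None:
            return score
    return None  # caller must decide how to handle persistent failures
```

The retry loop is the key design choice: instead of trusting the model to respect the format, the application treats every response as untrusted input and validates it.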

Challenge 2 Hallucinations

The second challenge is hallucination, and I would say that's one of the biggest challenges preventing companies from adopting LLMs. This is very interesting: there's a story of a teacher who used ChatGPT to write essays and then assigned his students to correct them, and they found that every single one of these essays had hallucinations.

This is very dangerous for tasks where factuality is very important. One example is, of course, tasks that require legal or HR work, with a lot of contract processing, but also tasks that involve some kind of code writing. For example, today LLMs actually

perform very poorly on text-to-SQL generation. Here is the BIRD-SQL leaderboard, and you can see that the best model achieves under 50% accuracy, which is very, very bad. There are multiple hypothesized reasons why; we don't really know, and I haven't met anyone who knows for sure why LLMs hallucinate.

But there are hypotheses. One, by DeepMind, says it's due to the model's lack of understanding of the cause and effect of its actions. Another hypothesis, pushed by several people from OpenAI, is that it's caused by a mismatch between the LLM's internal knowledge and the labeler's internal knowledge.

Sorry, this is early for me, I haven't had enough coffee and I'm getting tongue-twisted. Okay: the model's internal knowledge versus the labeler's internal knowledge. This mismatch appears in the phase when we try to fine-tune the model and we ask labelers to annotate answers to questions.

Labelers might know things that the model doesn't know. So if we teach the model to generate responses based on information the labeler knows but the model doesn't, we're essentially teaching the model to hallucinate, and that's really hard to correct.

I go into this more in my blog post. We probably don't have time to unpack it in this talk, but feel free to reach out if you wanna chat more about it.

Challenge 3 Privacy

Another challenge is privacy. It's challenging whether you build or buy. Say you want to build your own chatbot to let users talk to your data: how do you ensure that this chatbot doesn't accidentally reveal sensitive information?

Here's an example. I didn't update it to the newest one, but there's a whole sport of trying to jailbreak ChatGPT, and you can see a lot of examples like that. Companies that provide APIs, like OpenAI, spend a lot of effort making sure their answers don't accidentally reveal information.

But if you build a chatbot in-house, that is your responsibility: you need to make sure it doesn't accidentally reveal any PII. And if you buy, then you have the whole challenge of depending on the AI provider's compliance. Right now OpenAI retains data for 30 days,

and it really depends on your industry whether that is acceptable to you or not.
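One common mitigation when building in-house is to scrub obvious PII from user text before it ever reaches a third-party API. A minimal sketch; the regex patterns here are illustrative only, since production systems typically rely on trained NER-based PII detectors:

```python
import re

# Illustrative patterns only; real systems use trained PII detectors,
# not regexes, and cover many more categories.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact_pii(text):
    """Replace each detected PII span with a [TYPE] placeholder before
    the text leaves your own infrastructure."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Redacting before the API call means even a provider retaining data for 30 days never sees the raw identifiers.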

Challenge 4 Context Length

Another big, big challenge is context length. I think we've had a lot of discussions on whether prompt engineering is gonna go away. As I see it, I don't think in-context learning will ever go away, but there are certain hacks around prompt engineering that might not last.

The reason in-context learning will never go away is that a lot of conversations and questions are very context-dependent. There's a very interesting study in the SituatedQA paper, and they found that roughly 16.5% of questions require context. For example, someone might ask, "What is the best

Chinese restaurant?" The context here might depend very much on location: are you asking in the US or in China? Or, if no context is provided, what is the assumed context for the question?

This is very common in use cases like document processing and summarization, or any use case that involves things like genes or proteins. In the table here you can see a study of 90th-percentile input lengths, and for some use cases the context can be hundreds of thousands of tokens.

And I know there have been a lot of studies on making models work with long context lengths. However, there's still a question: a model might be able to take a hundred thousand tokens as input, but how efficiently can it actually use those tokens? I think that's still an open question.
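Until long-context models are reliably usable, the common workaround is to split long inputs into overlapping chunks that each fit the context window, then summarize or retrieve over them. A minimal sketch, using whitespace-separated words as a stand-in for real tokens (a real system would count tokens with the model's own tokenizer):

```python
def chunk_text(text, max_tokens=512, overlap=64):
    """Split text into overlapping chunks that each fit the model's
    context window. Whitespace words stand in for real tokens here;
    production code would use the model's actual tokenizer."""
    assert overlap < max_tokens
    words = text.split()
    if not words:
        return []
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary visible in both neighboring chunks, at the cost of some duplicated tokens.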

Challenge 5 Data Drift

One thing I'm really glad about is that ChatGPT made everyone realize the importance of data drift, because when you ask it a question, it says, "My knowledge cutoff is such-and-such, that's why I cannot answer the question." The same SituatedQA paper, which I think is a really, really good paper,

shows that models trained on data collected in the past fail to generalize to answering questions asked in the present, even when you provide the evidence in the context. The performance drop is about 15%, which is quite significant. And here there's a pretty interesting breakdown of how quickly things go outdated:

from one week, to six months, to two years, to fifty years. Here are some examples.

Challenge 6 Model Updates and Compatibility

So this leads us to the next challenge. Say the data has drifted because the world changes; now we might have the same model with new data. Or, in another case, there might be a new model architecture or a new training technique.

The transformer architecture has been incredibly sticky, but how long is that gonna last? In many cases we're gonna have new models. Imagine you're a company using a lot of prompts, and those prompts have been tested to work with the current model. If you swap out the model underneath, how well do the prompts still work on the new model?

I think there's very little study around this area right now. It's actually pretty funny: I just saw this discussed on the front page of Hacker News a couple of days ago.
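One practical hedge here is a prompt regression suite: before swapping models, re-run every production prompt against the candidate model and flag the ones that break. A minimal sketch; `call_model` and `check` are hypothetical stand-ins for your API client and your per-prompt assertion:

```python
def run_prompt_regression(prompts, call_model, old_name, new_name, check):
    """Run each prompt against the old and new model and report cases
    where the new model breaks a prompt the old one handled.
    `call_model(model_name, prompt) -> str` and `check(output) -> bool`
    are supplied by the caller."""
    regressions = []
    for prompt in prompts:
        old_ok = check(call_model(old_name, prompt))
        new_ok = check(call_model(new_name, prompt))
        if old_ok and not new_ok:
            regressions.append(prompt)
    return regressions
```

The `check` functions can be as simple as "output parses as JSON" or "contains a digit from 1 to 5"; even crude checks catch many silent breakages from a model swap.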

Challenge 7 LLMs on the Edge

Another very, very big challenge is LLMs on the edge. I see a lot of use cases where you just don't want to send data outside of a location. For example, we talk to a lot of healthcare companies that have many different medical centers, and either the data is sensitive and you don't want to send patient data out, or many of these medical centers are located in regions where the internet is not reliable.

In a life-and-death situation, you just don't want to make someone's life depend on whether the internet connection goes through or not. There are also autonomous vehicles: there are gonna be a lot of cars on the road, and each car has its own compute engine, so it should probably run its own model.

And there are drive-through voice bots. I also think there's a very big dream that I see a lot of startups chasing: build everyone a personal ChatGPT that can be trained on their own data and run on their own MacBook. The dream is that if the LLM runs on your own MacBook, you're not worried about sharing sensitive information, so you can be more open with it and it can be more helpful to you.

However, there are a lot of challenges here as well. First, the model has to be able to run inference on-device, so the device has to be powerful enough to support the model's inference. And there are a lot of techniques to quantize the model or otherwise make inference faster on the device.
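To make the quantization point concrete, here is a toy sketch of symmetric int8 quantization: one float scale plus one signed byte per weight, roughly a 4x saving over float32. Real on-device stacks use far more sophisticated schemes (per-channel scales, 4-bit formats), so treat this as illustration only:

```python
def quantize_int8(weights):
    """Symmetric int8 quantization of a list of float weights:
    store one float scale plus one signed byte per weight."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0
    q = [round(w / scale) for w in weights]  # each value fits in [-127, 127]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return [qi * scale for qi in q]
```

The round trip is lossy, which is the whole trade-off: smaller and faster on-device inference in exchange for a small, bounded approximation error per weight.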

Another challenge is training. It's one thing to do inference on-device, but how do you update the model as things change? Say a newer model or training technique comes out that can improve performance; how would we update each model on-device?

On-device training is still a bottleneck today. It really depends on the model size and architecture, but a lot of it is still bottlenecked by compute, memory, and the technology for continual learning and evaluation. And if you train on servers, when you don't want to do inference on a server due to data sensitivity, then you run into the same problem of having to send data back to a server somewhere

to update the model. Another thing: say I've used a model and fine-tuned it on a lot of my on-device data. If the base model gets updated, how can I continue fine-tuning the new base model on the data I've generated in the past?

I was on a panel recently and there was a question about whether there's a way to do something like federated learning, so that you can update a model on on-device data without having to send the data back.

And that's a very big challenge. Another question is choosing a model size. I think it's related to on-device, because if the dream is to eventually run a model on-device, we need to think carefully about how big the model should be. And there are a lot of factors to consider when choosing a model size.

I've seen this trade-off between model performance and cost, and of course that trade-off changes a lot over time. I also just want to bring up a point about counting model parameters. Today we just say "5 billion parameters, 10 billion parameters,"

as if the number of parameters is the only thing that matters, but I think it's highly variable. First of all, it depends on whether a model is sparse or not: a hundred-billion-parameter sparse model is gonna be very different from a hundred-billion-parameter dense model.
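For dense models, a useful back-of-envelope check when choosing a size is the weight memory alone: parameter count times bytes per parameter. A small sketch, ignoring activations and KV cache, which add more on top:

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Approximate memory needed just to hold the weights
    (activations and KV cache are extra)."""
    return n_params * bytes_per_param / 1e9

# A dense 7B model: ~28 GB in float32, ~14 GB in float16, ~7 GB in int8.
# This arithmetic is why quantization matters for running on a laptop.
```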

I haven't seen a lot of research on sparse versus dense models in LLMs, but I would be very interested in learning more about that. Another thing I haven't seen talked about as much is LLMs for non-English languages. As you can probably guess by now from all my tongue-twisting in this talk, I'm not a native speaker. I'm from Vietnam, and one of the first things I tried with ChatGPT was to use it for Vietnamese.

It didn't perform well at all. One thing: when I tried to translate "generative AI" into Vietnamese, it actually translated it into "reproductive AI." There are many, many examples of it not doing well. There have actually been several very interesting papers studying LLM performance on standardized tasks in different languages,

and you can see that it doesn't perform well for a lot of non-English languages, especially low-resource languages. Another big challenge is the tokenization process. Here's a very interesting study showing the median token length for different languages in GPT-4, and you can see it's pretty bad for low-resource languages like Burmese.

For the same given input, if the tokenization produces a lot of tokens, that can affect both latency and cost. I'd be interested in seeing how much cost and latency increase for different languages. A lot of APIs today charge people by the output token length,
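The cost effect is easy to sketch: if the same content tokenizes into k times as many tokens in one language, the per-request price scales by roughly k. The tokens-per-character ratios below are made up for illustration; real numbers come from measuring a specific tokenizer on each language:

```python
# Hypothetical tokens-per-character ratios, for illustration only.
TOKENS_PER_CHAR = {"english": 0.25, "vietnamese": 0.5, "burmese": 2.0}

def estimated_cost(text_chars, language, usd_per_1k_tokens=0.002):
    """Rough API cost for the same amount of content in each language:
    more tokens per character means proportionally higher cost."""
    tokens = text_chars * TOKENS_PER_CHAR[language]
    return tokens / 1000 * usd_per_1k_tokens
```

Under these illustrative ratios, the same thousand characters of Burmese would cost eight times as much as English, before even considering the latency of generating more tokens.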

so if a language just has a lot of tokens, the cost is clearly gonna be a lot higher. Another very interesting question I've seen a lot of discussion about is the efficiency of chat as an interface, of chat as a universal interface. I meant for this to be more of a discussion, but I can't see you guys, so

maybe answer in your head: would you prefer a search interface or a chat interface? I've seen people complaining, telling me they find chat not very efficient. If you want something, you have to give it context, and you have to go back and forth to get exactly what you're looking for, whereas with search, once you have learned the DSL of search, you can find exactly what you're looking for.

I do agree that chat is not efficient, but I do think it's a very, very robust API. The reason is that with chat you can type in anything, and you get back a response. It might not be a very helpful response, but you get back something.

This is probably going on a tangent here, Chip, sorry. I think the stream might have stopped. I don't know if people missed out on it; I'm gonna see. Uh oh. Where we last saw you was on challenge number five, data drift, and I'm going to start the stream up again just to make sure we're all good.

But sorry about that. No worries. I'm now on my phone, so we may have to adjust in a minute. Wait, is it my internet, or what happened? No, it's all on my side, don't worry. My internet went out. All right, it works. We can see. And people are back, they're back online. Oh my God.

People, I just said so many intelligent things on the spot, and you just totally missed it. No, no, don't worry. I think everybody said it was fine and they saw it all. Okay, so it was just me that went out. Oh, I'm overreacting. Okay, so we're all good. The chat is saying no, just kidding: the same, without many intelligent things.

Yeah, so they heard it, they appreciated it. I'm gonna get off the stage before I continue to look like an idiot. I will say, though, that the Slido is not going to happen, because I need you to grant me access to that. Okay, but we can do that during Q&A and then we can do the Slido. All right.

See you later. Go ahead, sorry to break the flow. Yeah, no worries. Okay. So the point I was making is that chat is not efficient, but it's very robust: it can take pretty much any input and give you back some output, even if the output is not that helpful. So, yeah, sorry, I see some Slack messages saying everything's great.

Cool. So the tangent I was going into is evolution. If you look back, we have been saying it's survival of the fittest, but I don't think humans are actually optimized for being efficient. Humans are actually optimized to be robust; I think survival is not of the fittest.

Okay, I feel like I probably do not have time to go into this in detail, but basically there's a lot of discussion about this. What's very important is to have an interface that is robust. And the discussion of chat as a universal interface is not new.

It has been around for a while; Dan Grover had great discussions about it back in 2015, because in China and a lot of countries in Asia, people are very, very familiar with chat as the universal interface, and they do everything in chat. Also, there are a few studies I have seen, which I can't name because they're confidential, but basically people have studied

whether people prefer the search or the chat interface, and it really depends on how much the users have been exposed to each interface. For example, in a country where a lot of the population has not been exposed to the internet for a long time, people actually prefer the chat interface, because it just feels more natural and easier to use.

Okay. So the last big challenge is the data bottleneck. We have seen a lot of models use a ton of data, and there have been some studies, for example by DeepMind, showing that a lot of data is still underutilized: the models still have the capacity to

learn from even more data. And we seem to be running out of internet data. There's a very interesting study showing that the rate of growth in training dataset size is much faster than the rate of new data being generated. At some point, which I think the study puts around 2026, we will actually run out of publicly available data.

And on top of that, the internet is being rapidly populated with AI-generated text. So in the future, if you crawl internet data to train new LLMs, they will likely be trained on text generated by the existing LLMs. That's definitely something to think about. If there's anything we've seen through generative AI, it's that good data is essential to any company that leverages AI.

And we have seen, since AI came around, what, 70 years ago, many hype cycles. Maybe today generative AI is the hype, and we don't know what the next hype is gonna be. But one thing that's consistent throughout is data: data is just very important.

We see that for a lot of companies, in the face of generative AI, the first thing they do is figure out their data story: aggregate existing data across departments and sources, or update their data terms of use. You can see Stack Overflow and Reddit responded pretty fast, but not fast enough;

maybe they could have done it years ago. Or they put guardrails around data quality and governance. So yes, that is pretty much it, and here is a summary of all these challenges. Do reach out if you have any questions; I'm on LinkedIn and Twitter, I'm also on the Discord, and I have a book.

But yeah, that's it for me. Awesome. A lot of people were saying that they have your book and they loved it, and I wholeheartedly agree. I think it is amazing, and I always wonder how you're able to write so much and also run a company. It's super cool. So I think there are gonna be some questions coming through in the chat.

I don't wanna cut you off because of my little interlude, and I wanna let people ask. Luckily, I learned from the first time we had this conference: there's a little bit of cushion now in the breaks, so we have more flexibility when it comes to timing. All right. So I can ask you some questions, and in the meantime, do you wanna give me access to that

Slido while you're answering? I don't have a Slido. No, no, sorry, you give me access to the slides, the document. Yeah, the slides. Also, people were asking if the slides are gonna be available anywhere, so yes, we are most definitely gonna have the slides for everyone to check out.

And we will throw those in the description of the recording once it's out. So while there are questions coming through, who's got questions in the chat? Feel free. Oh, somebody's asking if there's gonna be a next edition of the book. Perhaps.

Yeah. Oh my God, I'm so sorry, this is a new talk. I got so nervous about going after Matei that I couldn't sleep last night. I was like, oh, I need to come up with something. Sorry; what happens when I don't get sleep is that I just do not speak very well. Somehow the immigrant side in me comes out when I don't sleep.

So, ah, maybe the idea is drinking coffee beforehand. The other thing is, you did mention in the talk, in the part I was on for, because I got kicked off and thought everyone got kicked off for a little bit and was afraid the conference stopped, but I saw that you were talking about how there's not, what was it,

there aren't models that are useful in other languages, and especially the languages that are less spoken, or that have fewer speakers, it's much harder to find LLMs that are useful with them. I think I wanna correct that slightly. So it's not proportional.

There are languages that have a lot of speakers, but they're not represented equally in the corpus of training data. For some languages with hundreds of millions of speakers, they're under 1%, or sorry, 0.0-something percent, of Common Crawl.

Ah, I see. Okay, good. Well, should we get into some questions, Chip? You wanna hear what the chat has to say? Yeah. So I just want to follow up on the non-English models. I like to go on tangents; I see this as the evolution of language models, right?

Both architecture-wise and methods-wise: we went from count-based methods to neural networks, and we went from single-task to multi-task, right? With single-task, before, we had one model to do machine translation and another to do sentiment analysis.

And now we have models that generalize to multiple tasks. Another axis is actually going from a model for each language to a model for multiple languages. So I think it's very funny that we say "a language model" and not "a languages model"; "language" is singular.

As I was trying to write about language modeling, it was really hard to say, okay, this is a model to model languages, because somehow it's as if everything is one single language that the model can understand. But it doesn't understand some languages as well as others, and I see a lot of countries working really hard to build language models that work very well

for their own language. Yeah, super. Somebody was asking if you know of any projects going on in that field. Oh, I think it's very big in Vietnam; I think it's a government priority there. I'm from Vietnam, so I'm more aware of the effort there. If you're Vietnamese, you should totally check it out.

There has been a very, very big push, both by companies and by the open-source community in Vietnam. Hmm. And have you tried LLMs for annotating articles? What do you think about that? Annotating articles, what does that mean? Yeah, I think it's annotating data, basically, from articles, like newspaper articles or stuff like that.

Like summarizing articles or fact-checking; I think it's about actual data labeling. That's what I understand from the question, but John, feel free to put more context around it if I'm not saying it correctly. So you mean, is this about data synthesis?

Yeah, I think so, from what I gather from the question. This is the question word for word, and then I'll tell you how I interpret it, and you can interpret it however you want: "Have you tried LLMs for annotating articles? What do you think about that?" The way I interpret it is that you're using LLMs to do data labeling, for later training another model.

Yeah, I think that's pretty common. It's a form of distillation; that's the whole premise behind Alpaca, right? You use a larger, better model to generate outputs and then you make a smaller model copy that behavior. I see that a lot.
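The Alpaca-style recipe is simple to sketch: have a large "teacher" model annotate raw text, then fine-tune a smaller student on the resulting pairs. `teacher` here is a hypothetical callable standing in for a real API client:

```python
def build_distillation_set(teacher, unlabeled_texts, prompt_template):
    """Use a large 'teacher' model to annotate raw articles, producing
    (input, label) pairs for fine-tuning a smaller student model.
    `teacher(prompt) -> str` is a stand-in for a real API client."""
    dataset = []
    for text in unlabeled_texts:
        label = teacher(prompt_template.format(text=text))
        dataset.append({"input": text, "label": label})
    return dataset
```

The resulting dataset is only as good as the teacher's annotations, so in practice people also filter or spot-check the generated labels before training the student.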

Cool. No, I think it's funny, because I was talking to a friend, asking him, how do you think about the future of prompt engineering? And he said, prompt engineering matters as long as human-to-human communication matters, right? Because for human-to-human communication,

Collaboration with other disciplines

sometimes we still need to put in a lot of context for the other person to understand. It's the same thing with prompt engineering: it's a matter of putting in things so that the LLM can understand. Awesome. So we have a panel coming up, but there are some incredible questions coming through the chat.

It generally happens like that, right? It gets a little delayed, and then people are typing, so it gets even more delayed. But I want to ask these questions because they're awesome. Do you think that LLM developers are cooperating adequately with experts from other disciplines, such as linguistics, sociology, and ethics?

That is a very big question; how can we measure that? I mean, NLP is an example of how different disciplines came together, right, linguistics and computer science. If you look at the history of NLP, it's definitely the marriage of statistics, computer science, and linguistics.


So it's a very fascinating field, and I can see this kind of adoption in other fields as well. I'm very excited for gene sequencing, right? People are treating it like a language model, and I can see that if we can bring in a lot of life scientists, it could be pretty cool.

Yeah, completely agree. So there are all kinds of other incredible questions that came through the chat, and Chip, you have the link to the chat, I think, so you can jump over there. Also, Chip is in the MLOps Community Slack, and if anyone wants to ask her any questions, just tag her in the community conference channel.

If you're in there, you can start a thread. I think that's it. Chip, I'll see you in a few weeks for the first time, right? Oh my God, yes. I cannot wait for it. I dunno why, but whenever I see Demetrios, you just seem like so much fun, like a fun person to party with. Uh oh. Those are some big shoes.

Yeah, I got lots of expectations. I don't know if I can keep that up; let's lower the bar, please. My best party trick is a dance move. Okay, see you. Yeah. All right. See you, Chip. Thanks. This was awesome.

