Real LLM Success Stories: How They Actually Work
Alex is a Software Engineer based in the Netherlands, working as a Machine Learning Engineer at ZenML. He previously was awarded a PhD in History (specialism: War Studies) from King's College London and has authored several critically acclaimed books based on his research work in Afghanistan.
At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building Lego houses with his daughter.
Alex Strick van Linschoten, a machine learning engineer at ZenML, joins the MLOps Community podcast to discuss his comprehensive database of real-world LLM use cases. Drawing inspiration from Evidently AI, Alex created the database to organize fragmented information on LLM usage, covering everything from common chatbot implementations to innovative applications across sectors. They discuss the technical challenges and successes in deploying LLMs, emphasizing the importance of foundational MLOps practices. The episode concludes with a call for community contributions to further enrich the database and collective knowledge of LLM applications.
Alex Strick van Linschoten [00:00:00]: My name is Alex Strick van Linschoten. I'm a machine learning engineer at ZenML and I don't drink any coffee. I drink green tea. Not always, but yeah, a good cup of jasmine green tea or something.
Demetrios [00:00:15]: Good people of the universe. Welcome back to the MLOps Community podcast. Today we've got a very special episode going over the database of real-world LLM use cases. That's right. There are all kinds of AI use cases that Alex consolidated into one place, and I appreciate him doing that for us. We get into the conversation about what exactly he learned while putting together this massive database and how he did it. So without further ado, let's get into it.
Demetrios [00:00:56]: I think the only time I've ever seen something like this done this well is when the folks at Evidently AI put together a gigantic database of different ways that ML and AI are being used. They took a lot of disparate data sources: blog posts that have been out there, folks that they've talked to, and probably some people that are using their open source client. You all did something similar at ZenML and put together a database, but for LLMs specifically and people that are using LLMs in production. Can you explain how you went about it? And what a huge undertaking.
Alex Strick van Linschoten [00:01:43]: Sure, yeah. And I'm so glad you brought up the Evidently databases. They have two databases, I think, and they were totally an inspiration for us. There's all of this stuff going on, and you see people posting these little blogs occasionally, or just random things, or obviously all of the conversations you're hosting with the MLOps community as well. Just this rich source of data. And we're all sort of trying to figure out exactly how this stuff works in production.
Alex Strick van Linschoten [00:02:15]: What's the spectrum, from mega company down to five or six people just trying to start something new? My background: I used to be a historian, so I'm kind of like a, I don't know, not a hamster, a squirrel, just hoarding all of these things. So I'd been keeping a list of all of these links as I went through, and at a certain point you reach the point where it makes sense to put it out there. And obviously, we can talk about the details, but there are summaries for each of the posts, which would have been impossible for me to do manually, or at least not without a mega budget. LLMs themselves helped make all of that stuff easier.
Alex Strick van Linschoten [00:03:13]: It was a big undertaking in the sense that it's not like someone else had done the work of collecting them together, but in terms of overall time, it didn't take that long to put together.
Demetrios [00:03:26]: And now there's one place. I know, at least for myself, I always try to add any quality blog posts that I see out there on the Internet to the MLOps Community newsletter as a hidden gem. And it's true that it goes out as a hidden gem, and then if you missed that week, you missed it for good. But here you have it, and you can always go back to it and reference it. And I really like what you said about how it tries to cover the gamut, from what small teams are doing to what large enterprises are doing. Did you notice any repetitive use cases? Because the other thing that's a little bit new here is how varied the use cases are. It's not like traditional ML, where we kind of have it figured out and we know there's going to be some fraud detection, some recommender systems, maybe some loan scoring or some classification. But with LLMs, it's the wild, wild west in terms of how you get value for your company.
Demetrios [00:04:45]: And there's whole task forces at enterprises that are trying to figure that out right now.
Alex Strick van Linschoten [00:04:50]: Yeah, I mean, variety is definitely the word. There are a ton of different use cases. I think there are kind of two broad categories, and one is probably much bigger. The bigger one is: let's go with something that we see everyone else doing, so let's build a chatbot. And the chatbot is either customer service or it's chat-with-your-data. Those are, broadly speaking, the two most common, and they can come in different flavors.
Alex Strick van Linschoten [00:05:18]: Some of them have like an agentic, you know, color to it. Other ones. Yeah. Are like completely internal. Other ones are customer facing and so on. Those are kind of the. Yeah. Some CEO has seen like some other company has done this or someone internally has built like this PoC demo on like a streamlit app or whatever, which looks crazy impressive.
Alex Strick van Linschoten [00:05:47]: Like, let's roll it out. It's already working. Right. And so you see, little did they know those were kind of the most common ones. And then you have like a smaller cluster of things which are either. Yeah. Which are just like. I don't know.
Alex Strick van Linschoten [00:06:05]: I didn't know whether there's any unifying thread apart from the fact that it's like the companies which are really like pushing the envelope technically, they're really driving innovation, they're figuring out like stuff think, I don't know, think what copilot was doing like a few years back, like really like driving a path unto themselves or at some companies doing some like innovative stuff with perhaps with agents now or mixing stuff around content generation or whatever, which you can't really categorize and probably maybe there aren't like a thousand other companies who would want to do this but like it really works for them. And so. Yeah, but I mean it takes a certain kind of company and a certain amount of like risk appetite for you to get into like I'm just gonna go and do my own thing. Even though everyone else is doing chatbots.
Demetrios [00:07:03]: Yeah, it's easy to do the chatbots because it's working for the majority of folks, or at least it's better than what we're used to as far as chatbots go. So why not? And I like that you show there are different flavors of chatbots. Maybe it's agentic, maybe it's internal, maybe it's externally facing. It's folks that are hacking around on something, or it's a full-on rolled-out project that got buy-in from leadership. So you see every flavor when people write blog posts about it, or talk about it, or come on the podcast and share. And I wonder how much is still obfuscated, because when I think about a company like Uber, they have so many use cases that they aren't able to talk about all of them, right? They're not able to show every single way that random departments are using LLMs and what they're doing with it. And it makes me think a lot about how we've traditionally had the data governance role, and now the AI governance role is just such a beast, because in an enterprise with a thousand-plus people, or God forbid 10,000 people, imagine how many different instances of AI they're running, and how many repetitive licenses they're paying for, or repetitive workflows. That just gives me anxiety thinking about it.
Alex Strick van Linschoten [00:08:59]: Yeah, certainly. The things that people are putting out, particularly at Uber or wherever, are generally things which make them look good or make their teams look good, even if it's a failure. It's like: we caught the failure, right? Or we had good processes to catch the failure. And lots of respect for companies that include details of where they messed something up. One which comes to mind is Weights & Biases, who developed their internal, sorry, support chatbot, and they've been really great at building in public and sharing how they built their evals and so on. They shared: oh yeah, we got something wrong about how we did our evals, and we needed to spend several thousand dollars just redoing everything because we made some mistake. So it's nice when people put actual money on it. It would be great if there was a bit more normalization of sharing failures and paths which didn't work.
Alex Strick van Linschoten [00:10:04]: I guess there's a shared internally and maybe for a mega corporation that is good that, that that's the way, you know, that's a good thing for them. And yeah, maybe it's expected. But yeah, obviously it would be nice to, to, to see all the didn't work out along the way.
Demetrios [00:10:25]: Were there any other patterns that you saw as you were putting this together and whether it goes towards common design patterns or how folks or the majority of use cases are doing evaluation this way or that type of thing that you noticed after reading so many and you're like, well this seems to be the flavor of the day. Maybe it is the most useful.
Alex Strick van Linschoten [00:10:48]: Hmm. Yeah, so, so I mean, yeah, lots of, lots of I guess smaller insights rather than big insights. I mean, I guess if there was a big insight, it's like all of the tried and tested things that we know about like software engineering, DevOps, all of these kinds of principles, all of that stuff is super important. And you better get all of that stuff right. Otherwise this magic that you're building on top is, is not going to work or it's not going to work reliably. And you know, we thought a lot, a lot about like exactly what to call this database at the moment. We settled for LLM Ops because this is kind of what the community seems to be settling for. Microsoft is trying really hard to push this term Genai Ops, which is just like too many syllables and no one else is using it.
Demetrios [00:11:41]: So somebody told me I may get canceled for saying this, but somebody came on here the other day and said oh yeah, we're doing GenOps. And I was like, that sounds like you're missing. You're. You're pretty progressive there, huh? Like Is there some. It just made me be like Gen Ops doesn't sound like what I think you want it to mean.
Alex Strick van Linschoten [00:12:07]: Yeah, yeah, we have this term LLM Ops, which is sort of, I guess like the ways to think about and the ways to do all of this stuff we're doing around Gen AI and to be honest, like most of it is around LLMs. It's still not in the video domain. They're starting to do people trying to do things with multimodality but like that's still a little bit like future facing and same with, same with image generation. But really a lot of the stuff which underpins this is MLOps. And even as I'm sure you're very well aware, there's still a lot of people who say MLOPS is not a thing, it's just DevOps and you can go all the way back down. It's why I wanted to say software engineering best practices, which a lot of this stuff is. So, so that's kind of one thing. Like the fundamentals still really matter.
Demetrios [00:13:13]: And what were some of those smaller insights kind of.
Alex Strick van Linschoten [00:13:17]: I'm very interested in tracking like the extent to which people are actually using this in production every year. It's like this is going to be the year of the agent. I hear this now about 2025 and you put on a great conference recently around this and some of the use cases and it's still relatively, I mean there are some success stories of like companies who are doing things but quite often there's not enough technical detail to know. It's not like you're just like unleashing customers workflows just like completely unbounded. Like let the agents handle it. It's like no, everything has been super constrained down like as much as possible. And so yeah, it does seem seen that we're not quite there with making agents work reliably. And it's unclear to me not working on a ton of these projects like exactly where the bottleneck is.
Alex Strick van Linschoten [00:14:23]: Yeah. In terms of that. But the places where people are managing to, to get this kind of to work is as I said like really, really constraining down exactly the specific tasks or specifications for, for agents. Klarna had a really kind of huge, huge win there where they, they, they, they amplified or supported their customer service agents. I think they calculated that they would gain, what was it, like 30, $40 million in profit on the basis of this deployment. They reduced the time of customers or whatever in waiting lines and people didn't come back and all of these kinds of things. But it's. Yeah.
Alex Strick van Linschoten [00:15:21]: A, it's a mega company. Be from, from what you could tell from what they released. It's like, yeah, it was a very narrow realm and they could kind of control it.
Demetrios [00:15:34]: Yeah, yeah. And C, they are filing to go public. So that can boost their stock price when they ipo. I think that that felt to me like one of the most outlandish claims of 2024 was when the CEO came out and said like we're, we don't need to hire seven people because our AI or something like that. And it was one of those where you read between the lines and you recognize, oh yeah, he's just getting ready to go public. He's probably on a roadshow right now. And good on him. Let that stock pop on A one.
Demetrios [00:16:16]: But there is, it's funny you mentioned Klarna specifically because of that and the support use case because that feels like the one that is most defined when it comes to agents and all the other ones. What I've seen is, yeah, we're really trying to figure it out still. And you can't just say go be my marketing team, right. Or go do my marketing for my startup. What you have to do is go deep, deep, deep down into one specific task and then try and automate it in a way that is possible that you know the steps for. And so you can say, all right, go and collect all of the keywords that my competitors are using a whole paid per click campaign around and then analyze which ones I am interested in. Also, bidding on that type of thing is great for an agent. It's not going do marketing for me.
Demetrios [00:17:23]: Right, because the more vague you are, the less that you're going to get the outcome that you want. And so that probably is the hardest thing right now with agents. And then when you see them trying to be used like the, the OLX magic use case that we had at the agents in production where they're trying to reinvent this way that we're doing search inside of their app. And so you just, you don't necessarily need to search for anything specific. You just say, oh, I'm looking for a baby stroller. And it will give you a few options, but it will try to be a bit more agentic in the way that it presents you these options. Instead of just giving you the. The ads or the classifieds of people that are advertising their strollers, you can get the information and then start honing in on it.
Demetrios [00:18:22]: I still am not clear though, do we want to be using chat for this and the interface of me having to explain exactly what I'm looking for versus me being able to click around in a recommender system. And so I think what OLX Magic is doing that's cool is they're trying to combine both and say all right, we're bringing you these first searches or these first hits. But then we're also adding a recommender system on top of that so that we can learn from where you're clicking and where you're going off of. And so it's not throwing out the old just because there's new and thinking really creatively at how to layer the two on top of each other.
Alex Strick van Linschoten [00:19:14]: Yeah, I think, I think in the database there's also something which is somewhat underrepresented maybe because kinds of people who write these blogs are like the technical team or like the. The engine, the software engineers kind of the on more. A little bit more on the back end side. But like you're totally right. Like UX innovation is. Is also super needed and people to like experiment around and and a lot of things which are often presented as chat chat interface don't need to be but they can still, you know use LLMs under the hood. It just can be like a button. Why make me like type out all of this stuff?
Demetrios [00:19:54]: Exactly.
Alex Strick van Linschoten [00:19:54]: Or interact through voice. So yeah, that's totally something which yeah. I think even if all of the innovation in models stops right now, we still have a bunch of years to figure out new ways of. Of doing this stuff.
Demetrios [00:20:16]: Yeah, the interface is. Is a really fascinating one for me because we've got the pointer and we've got or the cursor and we are used to clicking around or we've also got different commands and so thinking about hotkeys is fascinating to me. And then I am constantly referencing a talk from Linus Lee from like almost a year ago probably when he. It was before Notion AI had implemented their just five suggestions of what you can do with Notion AI inside of notion as you're writing and it's all click based right click ops in a way and that's really cool. But then yeah maybe there's just select all and then you have some kind of a hotkey that you can add your voice to the writing or just clean up typos or whatever it may be rewrite in a shorter way or condense this thought or. Or to bring up the box where you can go back and forth with a chat. And so yeah all that is it feels like we're still in the first inning on it.
Alex Strick van Linschoten [00:21:34]: Yeah, yeah. And I mean that maybe takes to, to another kind of lesson which, which came out of the database was there's a, again, perhaps not surprisingly, but there's a lot of people who are deferring to pre made. I don't want to say pre made necessarily, but like safe frameworks and frank like platforms around gen AI, whether it's bedrock or it's more specific stuff where you know, AWS has, has done the, the work of like making it super easy for someone to create a chatbot based off company or enterprise data or something like that. And so yeah, I was, I was kind of surprised to see how many people, I guess as the saying goes, no one got fired for like buying AWS or whatever for your enterprise company. But I worry in the light of what we were just talking about with the UX stuff, if we go too quickly into the world of like a pre made framework with relatively little flexibility, then maybe we don't get to discover all of these different ways that customers could, could interact with our stuff. Then we're just getting the, the whatever the five or ten things that you can get with bedrock out of the box.
Demetrios [00:23:04]: It's such a great point. And then it all looks the same and it's not like we were really excited about that whole experience in the first place. So now we get just more chatbots that we don't enjoy interacting with. And yeah, I feel like I've seen that before. Right. But the pattern has happened already and so.
Alex Strick van Linschoten [00:23:26]: But it's nice that like the, the open source side of that doesn't seem to like people don't seem to be resting on their laurels too much. I think of, I don't know, LangChain or Llama Index where probably by this point they could just like stop like breaking new ground as new technology come out and just like, hey, we're just going to become like the super stable chatbot guys. And so yeah, to their credit they're like, they're still discovering things still. Yeah, still adding new ways of thinking about like LLMs and gen, even acknowledging the many criticisms that people have of particularly those two. Like it's still. Yeah, they haven't fallen into the same trap, I feel. Yeah.
Demetrios [00:24:25]: One use case that I saw that was fascinating was when Philip from Honeycomb came on here and he was talking about how they were plugging in LLMs to their product itself and trying to have the LLM almost be like a shepherd to help folks get to doing something inside of the product that they knew would convert them to a paying user. And I think that use case is incredibly awesome but very under seen or I, I don't know many other companies that are trying to innovate in that way. Maybe you saw others that said, oh yeah, we're gonna plug in LLMs as this guide or as the shepherd or really as like our sales agent inside of the product or a sales engineer. More like, more like it. And from there they're going to help the user become more proficient at the product faster so that they become a power user and inevitably buy the product.
Alex Strick van Linschoten [00:25:42]: I mean I think for the most part most companies seemed to like a little bit wary of like entrusting too much agency like to that level to LLMs. I mean somehow chatbots are that thing, right? It's like we can't give you like full access to our support team but like hey, here's this like robot which is there like 247 and you can knock yourself out. The problem comes is like it's often seen as kind of like a panacea or like people don't like stick the details and they're like, then you see like people getting frustrated with a product. I mean certainly I'm sure you have tried out like random demos and random things on a whole bunch of different places and like quite quickly like you realize, oh, it's not actually like doing what this thing is intended to be. So either people are releasing these things and there's kind of middling results. And that's why you don't see in the blogs that people write about it, it's not so much about like yeah, we made ton of money out of this but it's more focused on the technical challenges. And then yeah, there's a ton where it's just like we uh, we built something internal because it's way easier, like we're way comfortable, way more comfortable with the risk there where the people can find it useful and they can kind of take it or leave it. But in the end they, they have to work for us or whatever.
Alex Strick van Linschoten [00:27:21]: You know, it's not like they can complain.
Demetrios [00:27:28]: Yeah, they're not going to make the stock price go down by writing a horrible Reddit thread. So that, you know, as you were saying that one thing that jumped into my mind is how with these support bots that we get, you have to inevitably think the majority of them are powered by rag systems in some way, shape or form. I'm sure you have read more rag blogs than you would like to admit. When folks are setting up their Rags and they're giving context to. So you've got the. The chatbot that I, as an end user am interacting with and then that goes to some kind of a search system or it's retrieving stuff that you're asking for. And it's also maybe trying to come up with a solution. I wonder if people have experimented with adding different signals of what I've been doing inside of the app.
Demetrios [00:28:36]: So this may get a little, dare I say the buzzword, multimodal on us, but if I'm clicking around on something and I have. I love. Some people call it like rage clicks because something isn't working or I'm trying to find some. Something. The last thing I want is for me to talk to the support bot and then it suggests that I do exactly what I've been doing.
Alex Strick van Linschoten [00:29:02]: Right. Or then have to explain like the last 10 minutes of.
Demetrios [00:29:05]: Yeah, yeah, exactly. I don't want to have to tell you that I just did these five steps. I would like that you knew that and took that into account in the answer. I don't know if you saw anybody doing that because that feels like very cool. But it also feels like it might just not be possible or valuable.
Alex Strick van Linschoten [00:29:22]: I mean, for sure it's possible, but I didn't see any specifically around, like, what were you doing in the website? But certainly customer support bots were like, enriched with like customer data and the customer's previous context. That for sure was something that I could see. And so I forget the precise examples, but like, just like E commerce, that was a really common thing and that like. Yeah, the customer's profile, recent things that they bought, their preferences, all of that stuff was pretty regular to pass in. But it does go a little bit again to what we were talking about with UX there where for rag systems. Yeah. It's like you want a little bit more power and flexibility as you would maybe with like a person to like either fast forward certain bits of the conversation.
Demetrios [00:30:21]: Yeah.
Alex Strick van Linschoten [00:30:21]: Or slow down at this bit. Or like, hey, let me send you a photo at this moment. Don't make me pick from like six bits of text that like, to my. My thing. So. Yeah. And that, that's. That kind of thing is still.
Alex Strick van Linschoten [00:30:41]: People are still figuring out like the bigger picture stuff. I imagine, like once that's a bit more solidified, then you start getting people thinking a bit more about the ux. I'm not sure. I'm not. Yeah. I'm not sure how long this stuff takes to like percolate through.
Demetrios [00:31:00]: Yeah. And a Lot of times I can imagine that the end user doesn't realize that they have the option of sending a screenshot or something because if they're not, if the end user isn't prompt with do you prefer to send me a screenshot in explaining what you're doing? Then you end up trying to explain it in text, then it's a little clunky. And so yeah, maybe it's just as easy as one of those six follow up questions that you're giving is send a screenshot with your problem. And so they go from there. But it's fascinating to think about that. I always am intrigued by the product journeys that people take and especially around friction that they may have that you as the creator of the product never think about. Right. Because you know it in and out and you know, oh yeah, if you want to do that, you just click on this button and then there's the little drop down and you get exactly what you want.
Demetrios [00:31:59]: But the new user is just clicking around everywhere trying to find what they're trying to do. And they may not even really know exactly what they want to do. And so they're exploring in one way, but they're also trying to figure out if this tool is going to be useful for them. And that's where it would be awesome to have that little buddy that pops up and it's like, I see you're just randomly clicking around. Can I help you find what you need? Or I. I've seen you've done these five actions. You know what else is really cool? Here's a hidden trick. And so it suggests like ways to become a better user of the product.
Demetrios [00:32:38]: And that is, yeah, that's very few and far between.
Alex Strick van Linschoten [00:32:43]: Yeah. And you see people experimenting with this, I mean like OpenAI and who else has this? Microsoft with their, like we watch everything that you do all day. You see people trying to make that kind of experience work. But yeah, it's clear, it's very early days. And yeah, also just thinking about one thing we haven't talked about yet is evaluation and how you then create your data flywheel and all of these kinds of things. But you can imagine that once you get to 24 hours monitoring the screen and interactions and stuff like evaluations becomes, yeah, you need to really, really think about how you do that. And you probably need like LLMs or multimillion, all LLMs, like in the loop there picking out interesting examples or whatever. But yeah, at that point you're like evaluating the sum total of human behavior.
Demetrios [00:33:45]: And yeah, it's hard exactly was there. Speaking of evaluation, did you see common patterns that were arising there? I'm always interested because there's a lot of hype around LLMs as a judge, but actual practice of using LLMs as a judge, I'm. I'm not sure how many people are actually doing it.
Alex Strick van Linschoten [00:34:07]: I mean you, there are certain people who tried it and they had mixed results and they found, you know, they ran into a lot of the common failure patterns of like using LLMs and like getting them to output like number scores and they found this was super unreliable. So then they had LLMs. You get like qualitative responses or you have LLMs which are like flagging certain examples as. And you see obviously LLMs being used in like synthetic data maybe to get you over a certain hump that you have in terms of like building out your functionality. But yeah, probably a lot less of that. And yeah, the common pattern is like someone builds the PoC demo, whatever. @ that point like evals are not really part of anything. Then they start thinking about like well how do we present this to a wider audience, whether it's the public or internal facing for the company? And then that point, hopefully people start like thinking about evals and what are the real failure scenarios that we really can't afford to happen here.
Alex Strick van Linschoten [00:35:27]: So you have some very basic stuff come in and then yeah, depending on the size of the company, maybe they even stop there. But then yeah, the bigger ones, ones with a bit more money to invest in a kind of, in a project like this which doesn't necessarily impact like the bottom line then yeah, then, then they're actually like gathering the data, iterating on the process and so on. But yeah, it's, it's more often than not like people are, people are often building these LLM projects, particularly the internal facing ones so that they don't fall behind on the technology almost they want exposure to it and it's kind of their way of yeah, just being exposed to what's happening with, with LLMs. And yeah, we did this at ZenML as well. We built or I built a Slack chat support bottom when a year, year and a half ago or something like that. And that was mainly just like yeah, we want to understand how people are using this stuff, how hard is it, what are the failure patterns and, and so on. And so there's maybe different incentives at play there versus when you're doing something which is a little bit more like profit results driven. You're yeah, you're more willing to, to accept certain Risks or to do certain things in a certain way.
Demetrios [00:36:51]: Well, you said a couple of my favorite words, which are ROI and profit. Did you get a lot of insights from all these blogs? Did many of them talk about it? Because as you mentioned before, it was primarily engineers who were writing about the engineering problems they're solving. So I would assume that they're not necessarily saying this netted us a huge ROI, but maybe there are some that you saw that actually took that into account, or they talked about how it was or wasn't viable depending on the scale they're looking at.
Alex Strick van Linschoten [00:37:40]: Not really, apart from a few like Klarna, where they actually did put an actual number, a big number, on it. And the ones that did talk about a big spike in users, or renewed interest, or a product which was dying being revived by the introduction of RAG or a better LLM-based search system or something like this — it's really unclear to me whether this is something that would apply for the long term, because a lot of people go to places to try stuff out, to try out some new functionality or new technology. And for a lot of stuff it's clear this is in flux and it'll be replaced by something later on. My favorite thing at the moment in this area is NotebookLM. Very popular, kind of a cool use case. They just launched this thing where you can participate in the podcast yourself, and if you saw this or tried it out, it's kind of cool, it's fun to use. Will we still be using NotebookLM or playing around with this in three years? I don't know.
Alex Strick van Linschoten [00:39:16]: Probably not. Hopefully. Maybe there's a better kind of meta tool around podcasting or discussions or study, depending on what your angle is on that stuff. So for a lot of the use cases where people get really good results and can give specific numbers about the number of customers they have or how people's user journeys were improved by this, I don't know whether this stuff is for the long term. It's just: we released something and we have a lot of users. The other thing I think about in this space is all of these agent platforms.
Alex Strick van Linschoten [00:39:57]: Yeah, probably half or most of them won't exist five years from now. But some of them have a ton of users and people playing around with them, and particularly for the smaller use cases, people do interesting things which are really useful for them. But I have to think that a lot of this stuff will be merged or amalgamated or turned into something else over time.
Demetrios [00:40:31]: And when you say agent platforms, do you mean platforms to help folks build agents, like a framework that helps you build agents, or actual agents that you can go and use?
Alex Strick van Linschoten [00:40:43]: Well, some of them have marketplaces for agents — which seems to be the thing that makes them money. But it's kind of these GUI interfaces, web interfaces, where you can connect lines, like: you do this, and if not, go to this agent, and that kind of stuff. They're quite popular, and some of them are making real money. And maybe it's a little bit like, what's that create-your-own-avatar service, Replika? This kind of service where you can create an AI avatar that's personalized and so on, which is not a mass-used thing, but it's popular, it makes a lot of money. Maybe two or three of those will continue to exist for a long time, but I have to feel like a lot of these will just drop off, or people will move on to the next thing.
Demetrios [00:41:40]: Yeah, that's fascinating, because the one thing that is clear is that you have to have a lot of patience to get something working on these build-your-own-agent tools. The debugging is hard, because you don't know if you're not prompting it correctly, or if you're not asking it to do a narrow enough task, or if the flow that you've set up is the problem. So if you're not willing to spend the time to create that flow, then it's quite difficult. However, I have seen there are a lot of common use cases that come out of the box with those little build-your-own-agent things. I signed up for one the other day, and it asked, what do you identify as? And I picked marketing, let's see what they have — product marketing or SEO, I think there were all different marketing use cases. And then it gave me a lot of flows that they had set up, and you just add in simple things like, oh, here's my website, here are the keywords I'm trying to rank for, or whatever, and they make that easier on you to reduce that friction.
Alex Strick van Linschoten [00:43:03]: Yeah, that works great for a well-defined domain where we already know the things which are important for SEO and the tasks and so on. But a lot of the use cases where people talk about agents are in areas like research, where that's a little bit harder. If you knew what the problem was and how to get to it, you probably wouldn't need the agent. Yeah, yeah.
Demetrios [00:43:31]: Fascinating, man. So as you were putting together all these different blogs and different sources into the database, did you find any sources that were consistently publishing top-notch, absolute quality material?
Alex Strick van Linschoten [00:43:50]: I mean, some of the ones which are well known: the Netflix tech blog; DoorDash has a really good one; Honeycomb consistently produced good stuff; Weights & Biases, since they started their support chatbot, I think they've done ten different deep-dive blogs — technical stuff. There were a lot. And then the rest were just a ton of random blogs from companies where maybe they hadn't written anything previously, or they're new companies.
Alex Strick van Linschoten [00:44:32]: Finding them was hard. I used this really great embeddings-based search engine called Exa, where you put in some other blog and you say, find other blogs like this, and because it's embedding space you get really great results. Would recommend that. So there were all the ones that you know already for having great technical teams, and then a ton of ones where you have to hope that someone in your social network reshares them, because it's hard to find these case studies. And obviously the MLOps community — I definitely love all the videos. I hope we've done a service of liberating some of the content out of the videos into text form, by summarizing from the transcripts and stuff, because I think there were something like a hundred or so. That can't be true. A hundred.
Alex Strick van Linschoten [00:45:43]: Yeah, a hundred or so videos referenced in the database, and maybe you don't have a hundred hours to watch all of them. So at least you can decide whether you want to go and watch one based off of the summary.
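The reason the "find other blogs like this" trick works is that documents close together in embedding space tend to be about the same topic. A toy sketch of that idea — the hand-made 3-d vectors below stand in for real learned embeddings, and the document names are invented:

```python
import math

# Toy illustration of embedding-space similarity search: rank every other
# document by cosine similarity to the query document. Real engines like
# Exa use high-dimensional learned embeddings; these vectors are stand-ins.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

docs = {
    "doordash-llm-case-study": [0.9, 0.1, 0.0],
    "netflix-rag-deep-dive":   [0.8, 0.2, 0.1],
    "sourdough-bread-recipe":  [0.0, 0.1, 0.9],
}

def find_similar(query_key, k=2):
    """Return the k nearest documents to the query, best match first."""
    q = docs[query_key]
    ranked = sorted(
        (name for name in docs if name != query_key),
        key=lambda name: cosine(docs[name], q),
        reverse=True,
    )
    return ranked[:k]

print(find_similar("doordash-llm-case-study"))
# the RAG deep-dive ranks above the bread recipe
```

Because similarity is computed in vector space rather than over keywords, two case studies can match even when they share almost no exact terms — which is why this works well for surfacing obscure engineering blogs.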
Demetrios [00:45:59]: Is there any cool stuff that you want to do with data visualization on this? Because it feels like with all the topics, or with all the different filters, you could create some fun data visualization — whether it's, okay, chatbots, and you have a whole embedding space, the way that you're looking at it, or you're looking at the different use cases and that type of thing. Or are you done with it? You're like, all right, I put it out there, now I'm going to get back to work and keep rocking at ZenML.
Alex Strick van Linschoten [00:46:36]: I mean, we're continuing to maintain it, and people submit use cases and articles and so on. So that's really great, and that will only grow. The one thing I really...
Demetrios [00:46:46]: Where do we submit? Where do we...
Alex Strick van Linschoten [00:46:47]: There's a link at the top of the database. It's just a form.
Demetrios [00:46:50]: Okay.
Alex Strick van Linschoten [00:46:50]: Fill in. Yeah. And we put out the dataset as a Hugging Face dataset too. So if people don't want to scrape our website in order to get all of the data, we've done that for you — just go to Hugging Face. But something I really wanted to do, and didn't have the time to implement, is to allow people to search all of these use cases by tool. You want to see all of the companies that are using LlamaIndex for embeddings, or all the people who are using Qdrant or Pinecone or whatever vector databases, and then see common use cases, or common failure or success patterns, around particular tools and use cases. It's a bit harder to implement the extraction of the tools, or at least to automate it in a reliable way.
Alex Strick van Linschoten [00:47:50]: So. But that, that, that would be a useful thing. What I can promise is we're not going to have like a chat with your LLM Ops database like functionality on top or you can build it yourself if you want to download the hugging face too.
Demetrios [00:48:05]: That's it. Next MLOps community hackathon. Yeah, we're going to do that one. That's so good. Oh man. Well, you've done some awesome stuff with it, and I really appreciate you putting it together, because, like I said, it's this one resource that I can come back to and continue to learn from. And so I hope that you keep updating it, and anybody out there who's doing anything cool — if you write about it, make sure to submit it to Alex and the ZenML team.
Alex Strick van Linschoten [00:48:35]: This has been awesome.
Demetrios [00:48:36]: Thank you.