Who's MLOps for, Anyway?
Jonathan is a Managing Principal of AI Consulting at EPAM, where he advises clients on how to get from idea to realized AI products with a minimum of fuss and friction. He's obsessed with the mental models of ML and how to organize harmonious AI practices. Jonathan published "Data Analysis with Python and PySpark" (Manning, 2022).
At the moment, Demetrios is immersing himself in machine learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building Lego houses with his daughter.
The year is 2024 and we are all staring over the cliff into the abyss of disillusionment for generative AI. Every organization, developer, and AI-adjacent individual is now talking about "making AI real" and "turning an ROI on AI initiatives". MLOps and LLMOps are taking the stage as the solution: equip your AI teams with the best tools money can buy, grab tokens by the fistful, and watch the value rake in.
Sound familiar? Eerily similar to the previous ML hype cycles?
From solo devs to large organizations, how can we avoid the same pitfalls as last time and get off the endless hamster wheel?
Jonathan Rioux [00:00:00]: So my name is Jonathan Rioux. I'm a managing principal for the artificial intelligence department at EPAM Systems. I take my coffee black. I recently got weaned from sugar in my coffee; I no longer like it. To me, the important thing is that it needs to be hot. I cannot do lukewarm or cold coffee.
Demetrios [00:00:25]: MLOps Community podcast is back in action. I'm your host, Demetrios. And talking with Jonathan today, we did not conflate MLOps and business value. We did not. That was one of his key takeaways that I am going to probably be repeating time and time again. Just because you have MLOps doesn't mean your ML projects are going to be successful. In other news, I tend to talk on this podcast about how you don't see fraud-detection-as-a-service-type companies out there because, oh, it's too hard, or the data's too messy, or this.
Demetrios [00:01:10]: I was completely wrong. I just found out today there is a fraud detection as a service company. It is called Cubianta. Cubant, maybe is another pronunciation of it. Stock ticker is KBNT. It was so successful, it ipo'd, and everything was on the stock market until last year when it got delisted. So, yeah, I was wrong.
Demetrios [00:01:34]: There we go.
Demetrios [00:01:35]: Let's get into this conversation with Jonathan. We gotta start with this, because we were in California two months ago, three months ago, I can't remember exactly when, but we sat down. You guys had a very nice place.
Jonathan Rioux [00:01:57]: Rented out our man cave.
Demetrios [00:01:59]: Yeah. I appreciate you inviting me there. I felt like a VIP. It was the only meeting that I actually took when I was at that conference, and I got treated like royalty. So it was awesome to see you in your element. And you just told me now that you don't think MLOps is sexy. So I have now gone down on People's Sexiest Man of the Year charts, apparently, because I am very attached to the MLOps term.
Demetrios [00:02:30]: But why don't you think it's sexy?
Jonathan Rioux [00:02:34]: If I define MLOps as not being sexy, it goes in the same vein as DevOps not really being sexy, or software engineering as a whole not being extremely sexy. I feel like a lot of individuals have coined the term MLOps as kind of a guarantee of revenue, or a guarantee of ROI on their ML initiatives. It's a different ballgame, you know, if you get to think about it, or at least by my definition. I really see MLOps as a set of principles, a set of processes, you know, tooling, that are all going to work together to facilitate how quickly you can take an idea, get it to production, and maintain it in production without it crashing on you. It does not give you ideas. As I was discussing with a few of my colleagues, a terrible idea implemented using MLOps principles is going to yield a terrible product, but faster and more reliably.
Demetrios [00:03:42]: With the DevOps principles in hand. And that is hilarious. So the huge thing you just said there is: do not conflate MLOps with ROI. Just because you have very strong MLOps practices and processes in place, that does not mean that you're going to have a successful ML product, or product in general that leverages ML or AI or whatever magic bullet you think is.
Jonathan Rioux [00:04:13]: Going to save you 100%. I mean, you're looking into this. I find it absolutely hilarious to see organizations that say "we're MLOps compliant," whatever that means, but then their data scientists or ML engineers are working in kind of a shared service that becomes that kind of arbiter of what is good enough to become an ML product. A couple of years ago we were talking about DevOps for ML, before MLOps became bona fide accepted, and to me it was always about putting business-forward people next to technology-forward people, and in the case of MLOps, analytical and ML-forward people, to be able to create something. Where did we get lost in the process, saying that you need a center of excellence that's going to control all of the intake for ML and evaluate it with an arbitrary framework to determine the value, which I think in practice means whoever screams the loudest or whoever owns the budget? I feel like this way we're going further away, pardon me, from the original piece, which was: infuse and embed ML where it makes sense into the product development. And I think it's because a lot of organizations don't really know what to do with ML.
Jonathan Rioux [00:05:56]: So by putting somebody who's accountable for it, I think the natural tendency is to want to centralize everything. It's a little bit the same thing with data organizations. If you have a data team who doesn't understand how the data is being produced, and who has no control over fixing the source, how can you hope to have data quality? Same principle.
Demetrios [00:06:20]: So, bit of a hot take there on "organizations don't know what to do with ML." But before we dive down that rabbit hole, you sparked a gigantic thought. I may have my next business idea thanks to you, which is mlopscertified.com. Just bought that domain while we were on this conversation. Now I am going to go sell that to these businesses and say: you know what you need? You need some MLOps certifications. And I will look at your processes and tell you if they are in line with the best standard practices that we are seeing out there.
Jonathan Rioux [00:07:02]: Hey, you know, at one point you need to make rent.
Demetrios [00:07:09]: My daughter needs diapers. That's it. Exactly. So let's go down this rabbit hole of what makes you say that folks do not know what to do with ML.
Jonathan Rioux [00:07:18]: I think this has been magnified. And I mean, we can probably talk about the last two hype cycles. Deep learning: I actually joined EPAM, my current employer, at the height of it. I think it was around the time they were saying "blueberry muffin versus dog" using computer vision; that was the thing a lot of people were going around with. And now we're having large language models and generative AI. And to me, I think there's a big discomfort because we're back to use-case chasing, and a lot of conversations are oriented around: hey, I have this new technology, where can I apply it? And in my opinion, this stems from a bit of a misunderstanding of the technology, which is totally normal. I think there's a lot of hype in what it can and cannot do. I think people have incentives, both from a personal standpoint, but also some companies are over-advocating the capabilities of generative AI, let's say, in that kind of circumstance.
Jonathan Rioux [00:08:37]: But I think it leaves a lot of business leaders kind of flipping the script. Instead of focusing on what they know, which is their business, seeing what problems can be solved using this technology, and looking at their portfolio in a holistic fashion, we're seeing the opposite: people are starting with the technology, which they understand partially, and then trying really, really hard to think about a compelling business problem, or, even worse in certain cases, "oh, what kind of white-space innovation can we do?" Whereas the biggest benefit that we see, and our company is an example, our biggest usage of generative AI is firmly in the gray-space innovation. What are the processes that are cumbersome? What are the processes that are tedious, repetitive, where we can do micro-optimization, if you allow me to blend two words together? For us, it was always evident: if you're in control of the roadmap and the problems that you have, and you have a good grasp on what your opportunities for improvement are, you don't really need to go into that all-out panic. And you add on top of that a generative AI center of excellence, which means that your data scientists, who already have permeated the industry, are now working double shifts: you have your regular job, or you're part of a product, you may follow MLOps processes; some companies are lucky enough to have data scientists that are horizontally laid out; and then you have that center of excellence, which means that you're uprooting how your company works just to adopt that technology. So I think a lot of companies are not AI-ready, and for a lot of companies that were a little bit more AI-mature, gen AI is kind of disrupting them, but not because the technology is so promising and wonderful. It is promising and wonderful, but because I think the natural reaction is to centralize, to control, to limit, and ultimately it sets you back.
Demetrios [00:11:01]: So basically it's like you're saying folks are taking their eye off the ball. Instead of spending time and energy and cycles on what they know best, which is product, or just their business line, and what the business is actually focused on, now they're thinking: oh, this AI stuff is so cool, what if we put it into here? And what if we put it into there? And mind you, nobody's actually asking for that. Nobody's thinking, you know what, I would love a chatbot that actually responds to me when I pop up onto your website. Stuff like that. It's not that they are going from those principles of "what does the customer want, and then is AI the best way of solving it?" It's more thinking: we've got AI now, we might as well use it. And speaking of which, here's another one. This might be my ultimate dream job: I want the title of AI Center of Excellence at some company.
Demetrios [00:12:05]: That sounds like an incredible, incredible job. I will be Chief of the AI Center of Excellence. Honestly, that title sounds like the most bullshit job ever. But anyway, if that is my title, I am going to, of course, champion all these different use cases of how we need to infuse AI into all these processes. Right? Because it's in my best interest.
Jonathan Rioux [00:12:37]: Yeah, but then you get into the very real problem that you're ultimately centralizing all of that innovation for an entire company, which means that you need to acquire more than a working knowledge of every business function that is coming to you with use cases, because otherwise it's very difficult for you to understand, appreciate, and empathize with the value that this can bring. There's also a tremendous amount, and this is something that we see quite a bit, in separating optimization versus potential revenue generation, and calibrating some of the risks that you can have. So there is a place for deeper knowledge of the applications of generative AI, and for having a little bit more of a technical bias, but this is done in conjunction with your business functions. You can't remove their agency; you can't prevent them from innovating. Otherwise, we talk a lot about shadow IT, but we're seeing a lot of shadow gen AI. And to me, that is step one in having a very unfortunate thing happen.
Jonathan Rioux [00:14:06]: You know, we're seeing a ton of use cases: public chatbots that are making promises where they're not supposed to make promises, data leaks, those kinds of things. And I think it is related. I'm not involved in those things; I like to think that I keep a tidier ship than this. But I think it's easy to explain: you're in a business function, you really want to be able to launch something, there's someone somewhere else in the company saying this is not really a priority, and you're looking at this as an opportunity to out-innovate the innovation lab. You go forward.
Demetrios [00:14:47]: Yeah. And then you sell cars for $1.
Jonathan Rioux [00:14:51]: You sell cars for $1. Or in Canada, we had Air Canada promising refunds where there were no refunds that were supposed to be given. And we're seeing a lot of funny things internally. We experiment quite a bit with some patterns that we hear. We test a lot of LLMs. We have some pretty funny stories. We definitely dogfood a lot of what we do before we come in front of a client with a compelling point of view.
Jonathan Rioux [00:15:26]: So, I failed a lot with LLMs, but usually behind closed doors.
Demetrios [00:15:37]: Yeah. So you protect your ass. That's the way of doing it. Just yesterday, I read about how Slack rolled out their AI feature, and they didn't realize it, but you can prompt-inject it so that it gives information from private channels in the answers to any questions that are asked of the Slack AI. So there's stuff in those private channels that people may want to keep private. And this is Slack. Like, we've had really great engineers and data scientists on this podcast from Slack. They're incredible.
Demetrios [00:16:16]: Yeah. And they're still not able to fully think about all the different repercussions that this can have when it comes out.
Jonathan Rioux [00:16:24]: And I think we're going full circle on that frame. To me, this is a gap in your, let's call it, LLMOps pipeline, because I see this as building a tower. You have your DevOps practices: version control, good coding hygiene, are you using extensions that can augment your productivity, and so on and so forth; personal development, but also team management. You have your DataOps, which is where data is a product: what can be used, when, with clear governance. There's still a lot to explore there; I think we're doing a pretty bad job at weaponizing data properly. We move to MLOps, which is your entire lifecycle: model management, but also your entire lineage, making sure that you can have reproducible builds when a model is put in place.
Jonathan Rioux [00:17:27]: And then you go with LLMOps. But you get to think about it: to me, a data leakage, and I don't know exactly how they've architected the product, but somewhere, at some point, there's an access control that is failing. And I'm not saying that it's easy, and they're under a tremendous amount of pressure, but it becomes easy to understand, at least conceptually, where it failed. And it's a good reminder that even the strongest company, the company that employs the best engineers and has the most budget, is not immune to those lapses in their process.
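[Editor's note: Jonathan's point that "somewhere, an access control is failing" can be made concrete. Below is a minimal sketch of the idea, not Slack's actual architecture: in a retrieval-augmented setup, filter retrieved documents against the requesting user's channel memberships before any text reaches the LLM prompt. The document store and permission model here are invented for illustration.]

```python
# Hypothetical sketch: enforce channel-level ACLs at retrieval time,
# so content from channels the user cannot read never enters the prompt.

def retrieve_for_user(query_hits, user_channels):
    """Keep only retrieved documents the requesting user may read.

    query_hits: list of dicts like {"channel": str, "text": str}
    user_channels: set of channel names the user belongs to
    """
    return [hit for hit in query_hits if hit["channel"] in user_channels]

# Toy document store standing in for vector-search results.
hits = [
    {"channel": "#general", "text": "All-hands is Friday."},
    {"channel": "#exec-private", "text": "Acquisition target: Acme."},
]

# A user who is only in #general never sees the private channel's text,
# no matter how the query (or a prompt injection) is phrased.
safe = retrieve_for_user(hits, {"#general"})
```

The point of the sketch is where the check lives: permissions are enforced before prompt construction, so no amount of prompt injection can surface a document the filter already dropped.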
Demetrios [00:18:16]: So you mentioned how centralizing things with this center of excellence is not the ideal scenario, because the center of excellence now has to understand on a deep level, each aspect of the business, almost more than the actual people that are doing that, because they have to understand what AI is capable of and where the rub is for the accountants or for the salespeople or whatever it may be. How do you recommend going about it so that it isn't centralized, but it's also successful?
Jonathan Rioux [00:18:58]: I mean, I like to use the expression: are you a left-leaning data scientist or a right-leaning data scientist? So are you looking to centralize by competence, which is a little bit what we've mentioned, and I think we see that it does not work in practice? But I also think that the complete opposite, where each business function is responsible for the entire data/ML/LLM stack, is also out of reach for most organizations. You can't have one data scientist, or a series of data scientists, across multiple functions; it becomes really hard to encode best practices, and you end up with a ton of data scientists all around the organization. Their governance becomes difficult, dissemination of best practices becomes hard. This is where we're sometimes seeing each team with a different environment and a different set of governance; it's really, really tough to control your enterprise risk on that front. So to me, right now, in August of 2024, or I think September of 2024 when this is going to go live, the best model that we can have is some sort of hub and spoke, but with a little bit of a different approach, where you're not seeing the center of excellence as a governing function but more as a support function. And I think that's what's hard for a lot of people, because they see the center of excellence as a way to augment their profile.
Jonathan Rioux [00:20:57]: But, you know, you're really supporting. You're making sure that everybody has the tools they need; you're a little bit of a proverbial scrum master. Your job is to facilitate and make it easy to do the right thing. So I think right now this would be the model that I have a tendency to recommend. We do have a tendency to oscillate: as LLMs become more commoditized, easier to understand, more permeated into the industry, I think we're going to go more and more decentralized, and then something else is going to come in and you're going to see a huge shrinking: "oh, we have a huge skill gap, we don't have a lot of people who understand this, we're going to create a center of excellence that's going to govern whatever new hype cycle comes." And we're just going to repeat that process again and again.
Demetrios [00:21:49]: Again. Anybody out there that's hiring for a Center of Excellence chief, I'm your guy. So that hub-and-spoke model is exactly what I think Uber did. When I was reading their blog post on how they've evolved Michelangelo, which is arguably one of the most famous ML/AI platforms out there, they said that, and they mentioned how you have very experienced and advanced folks that are kind of pulling features or needs or capabilities out of the platform. And what they had to do is recognize what use cases they were trying to create with these feature requests and see how much money that was making for the business. And one thing that I thought was fascinating, and I repeat this quite a bit, is how they were able to figure out which models, which use cases, were generating what kind of ROI for the company. And so they had different tiers of babysitting, we could say, for different models. And you see that, and it kind of is what you were saying, because they have the Michelangelo platform and it helps support 99% of the use cases.
Demetrios [00:23:22]: But if there are people that want to do their own thing, they can. And Michelangelo just tries to make it as easy as possible for those people to. Then when they're done doing whatever it is they're doing, they can bring it into the fold and run it on Michelangelo.
Jonathan Rioux [00:23:42]: I love that example, because to me this exemplifies perfectly that support model that we're trying to get to. I mean, even looking at this, Uber is an example of an immense company. But a lot of individuals are working at banks, insurance companies, pharmaceutical, publishing. They probably don't have the VC funding to invest, nor should they. I think we have to acknowledge that Uber made a very risky bet in creating their bespoke architecture; they were one of the first to do so, and to be very public about it. I think the market is catching up. I mean, Uber has open-sourced a significant portion of their tooling, and there are other players in the market. But I think you have to think of it as the jobs to be done.
Jonathan Rioux [00:24:46]: So what are the processes that you want people to follow? Where are your sources of truth? And the way that we look at this, we talk a lot about MLOps around people, process, platform, and I think a lot of people are just throwing this around. But people follow processes, and the platform enables those processes. At the end of the day, you can have the best platform in the world, you can have the best people in the world, but if there's no clear way of doing things, or, as you said, when you need to go off script, how do you plug in and out of that process? And identify those 95% to 99% activities. Everyone's going to need to host a model: how do we make it as simple as possible? What is the SLA that we need to provide? Do we want to create this in house, deploy our own model on Kubernetes? What's going to be the skill gap? A lot of people are saying, oh, we don't want to pay for a third-party tool, or hosting is very expensive, so we're going to try to do it ourselves. Well, you have to acknowledge that if your team of data scientists is not cloud-competent, it's going to be difficult for them, and they will be frustrated, because as long as it works, it's going to be wonderful, but the day you have to troubleshoot something, the abstraction is going to fall apart. You also need to be mindful that everything is good as long as it works.
Jonathan Rioux [00:26:38]: And I talk from experience. At my previous organization, we made the choice of deploying our models on EKS, and it was wonderful. It was really fun; it's very gratifying to be able to deploy. We got some problems where our scaling was not working properly. It was very interesting and intellectually fulfilling to learn about this. But ultimately, I look at this from an OKR perspective: I was unable to work on anything else.
Jonathan Rioux [00:27:17]: My team was 100% mobilized to fix this, which basically means a month later there's nothing new: no model refresh, no nothing coming into place. So it's all a question of checks and balances. And I like to remind everyone: figure out what needs to be done, make it as easy as possible, make it as friendly as possible. And if people want to go off script, at least they have a set of conventions that they can hinge onto.
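[Editor's note: the "tiers of babysitting" idea Demetrios attributed to Uber's Michelangelo team can be sketched as a simple routing rule. Tier names, thresholds, and the revenue-impact input below are invented for illustration; they are not Uber's actual criteria.]

```python
# Hypothetical sketch: route each model to a support level based on the
# estimated yearly revenue its use case drives. All thresholds are
# illustrative assumptions, not a real platform's policy.

def support_tier(annual_revenue_impact):
    """Map a model's estimated yearly revenue impact ($) to a support tier."""
    if annual_revenue_impact >= 10_000_000:
        return "tier-1: on-call team, 24/7 monitoring"
    if annual_revenue_impact >= 1_000_000:
        return "tier-2: business-hours support, weekly review"
    return "tier-3: best-effort, automated alerts only"
```

The design point is the one both speakers make: the platform supports everything, but scarce human attention (the "babysitting") is rationed by measured business value rather than by whoever screams the loudest.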
Demetrios [00:27:46]: Just talking about that reminds me of a conversation I was having with my buddy last week, and it was a little bit about how you have these companies that are spun out of the quote-unquote top-tier companies: the Ubers or the LinkedIns or the Googles or the Facebooks/Metas, all of that. Engineers at these companies say, oh wow, I have this problem, and they create some open-source product, and then it gets a bit of traction, and boom, you know the playbook: they go raise a bunch of VC funding and it's off to the races. And my buddy was saying, man, what a lot of these folks don't understand is that not everybody is a top-tier engineer. So they've built this product for their friends at the company they used to work at, knowing the skill level that they were at, and they're not building it for me. Like, I am not that type of engineer, and I'm being very candid about it. I'm not going to be working in Uber's engineering department anytime soon, or Meta's engineering department. Except these products are built like I should know how to work them and I should know how things need to function. And it's like you're saying: if you can just bring it down a few notches, dumb it down for me, that would be incredible, because you're going to have much more success.
Demetrios [00:29:17]: It's going to be mass market, and you're going to have people that love your product because of that user experience.
Jonathan Rioux [00:29:27]: I totally agree with you. I think, also, I want to excuse a little bit the behavior of those folks, because I also think that it's just very natural. Anybody who's been working in a space for a long period of time, especially surrounded by their peers, sometimes there's kind of a big dichotomy: on one hand, you're looking around and you're like, man, we're so late on everything; but also, and I'm 100% guilty of this, we have an overwhelming tendency to assume that everybody knows as much about the topic we love as we do. So, I mean, I'll use a very funny anecdote. We had some interns this summer, very eager to learn and very excited, and they'd done a lot of data science, Jupyter-notebook kind of analysis, and then we got them set up in the environment. And me and my team, at least for personal projects, we have a pretty old-school stack. I'm still a big, big fan of makefiles, with all of their warts.
Jonathan Rioux [00:30:42]: I just feel like it's kind of a good thing that you can hinge yourself onto. But if you talk about one of the lowlights of their internship, it was figuring out all of those tools: version control, CI/CD, makefiles, automation. They were just like, it's so much, so much to take in that first month. But to me, I was like, oh well, I've learned this; anybody can learn it. But you get to think about that. I'm happy to go on the record that it took me, I mean, a couple of years to adopt version control properly, and then to understand CI/CD in a way that works for me.
Jonathan Rioux [00:31:36]: Not necessarily all of the intricacies; it took me, you know, a little bit of time afterwards. So my expectations of adoption are like, oh yeah, a month, that should be enough. It's not. You're trying to change habits, and that's one of the challenges. I do a lot of MLOps consulting, and implementing the stack, suggesting a process map, doing an analysis of what is versus what should be doesn't take a lot of time. There are good internalized practices that you can go with. We take pride in nuancing a little bit.
Jonathan Rioux [00:32:23]: I'm not going to go and sell the same product to everyone; not everybody is in the same boat. But the change management, and getting data scientists to adopt tooling: we have a lot of conversations downstream, where we do a check-in and they're like, well, I don't understand, not everybody's using CI/CD. It's going to take time. I almost feel like you need to be frustrated enough with the current state so that you can adopt the change. But ultimately, everything, from deciding as a company to become more agile, to putting in automation to streamline your processes, is going to cascade into individual teams and how they adopt this, and then it's going to go to individual developers. And that process can take months or years, depending on how far you're going.
Jonathan Rioux [00:33:26]: And I don't think we appreciate it. Like you and I, we did all of this, so now it's kind of like everybody should do it.
Demetrios [00:33:35]: Exactly. So great points.
Demetrios [00:33:37]: All right, real quick question for you: are you a Microsoft Fabric user? Well, if you are, you're in luck, because we are introducing SAS Decision Builder. It's a decision intelligence solution that is so good it makes your models useful. Because, let's face it, your data means nothing unless you use it to drive business outcomes; it's something we say time and time again on this very show. But wait, what do you mean by nothing? Well, SAS Decision Builder integrates seamlessly with Microsoft Fabric to create effortless business intelligence flows. It's like having a team of geniuses you manage in your pocket, without all that awkward small talk. With Decision Builder, you'll turn data into insights faster than brewing a double espresso, and you know how much we like coffee on this show. Visually construct decision strategies, align your business, and call external language models.
Demetrios [00:34:37]: Leverage Decision Builder to intuitively flex your data models and other capabilities at breakneck speeds. There are use cases for every industry, including finance, retail, education, and customer service. So stop making decisions in the dark; turn the lights on with SAS Decision Builder. Yes, I did just make that joke. Watch your business shine with SAS Decision Builder, because your business deserves better than guesswork. Want to be one of the first to experience the future of decisions? Well, sign up now for our exclusive preview. Visit sas.com/fabric or click the link below.
Demetrios [00:35:20]: Let's change gears for a minute. I want to talk about this idea that we touched on, where you mentioned how you're doing all kinds of POCs with folks in the gen AI sphere, and all of a sudden, six months later, you've got that POC up and running, and people are coming back to you saying, okay, so now what's the ROI on this? And you're looking around like, wait a minute, we weren't talking about ROI; we were just trying to get something up and running. Can you explain that a little bit more? Put a little more color on that?
Jonathan Rioux [00:35:57]: Oh, for sure. So maybe we can contextualize the situation a little bit. You're an enterprise, and I mean, this could be my company, this could be any of the companies that we do work with. One of their primary goals is to get familiar with the technology. So they're selecting use cases, they're prioritizing use cases, and as part of the prioritization, there's no talk, or very little, of what's going to generate a return on investment. And I'll use the example of the HR chatbot, because I think it was identified by a lot of companies as kind of the Goldilocks use case. The data is well understood, because everyone at least has a high-level familiarity with what HR is, the type of questions that you can ask, the type of answers that you can expect. I'm not saying it's a very easy use case; there's a lot of finesse to it, but conceptually, everybody can understand it.
Jonathan Rioux [00:37:06]: Your data's captive. You don't need to do third-party data acquisition. You don't even need to understand your data landscape. Every company has an HR function; everybody takes vacation, everybody gets paid. It's almost kind of perfect on that front. But it's one of those automation use cases. So the real question that you have to ask yourself, if you want to turn out an ROI, is: are you going to be able to limit your hiring? Is it going to relieve people to do more productive, more revenue-generating work? I don't think, for most organizations, an HR chatbot is going to drive a tremendous amount of return on investment, because, on top of it, one of the prerequisites is to have your documents very well organized, very well structured, which is kind of a big portion of the work. And by the time you're done with that, an FAQ becomes a very easy thing to implement.
Jonathan Rioux [00:38:20]: So you do it, you implement it, you do the POC, you get it. The costs are very real, because, you know, you're calling the API, you have your application, you can have a cloud bill, and, you know, the salaries of the people coming in. Usually it's going to be, let's say, a team of four people for a number of weeks, and then you keep one or two people to maintain it and continue updating the knowledge base, and those jobs might be part-time. So you're incurring the costs, and then you're going to be asking yourself, okay, what's going to be the return on investment? And we see a lot of people whose usage statistics are not super great. A lot of people don't want to use the chatbot, or the answers are not super relevant. We've seen chatbots getting confused by the country that you're in, because you don't necessarily have the context, so on and so forth. But then they're like, okay, so what's the return on investment? And you have to acknowledge that the true return on investment that you had is: well, you undertook your vendor selection for the technology that you want. You got a preferred gen AI vendor.
Jonathan Rioux [00:39:31]: You internalized some of the design patterns. You got to see some of the failure modes of GenAI. There's a lot of knowledge that kind of got internalized. You're better positioned to prioritize those use cases. But somehow, all of that learning, nobody cares about it. Everybody is like, okay, how do we turn an ROI on this specific initiative? And I feel it's a little hypocritical to do that, because, and I understand where it comes from, but ultimately you have to acknowledge that you took that use case because you wanted to get familiar with the technology. You need to keep this as part of your narrative, because it is probably 90% of the reason why you picked that use case. Otherwise, if you really wanted something that drives return on investment, you have to do this at the inception.
Jonathan Rioux [00:40:31]: You have to do this as you're prioritizing initiatives, especially if a POC is meant to prove return on investment. So you're validating. To me, the concept you're trying to validate is: if I use an LLM for this business process, is it going to work? We've seen occurrences where you don't have access to the data, you don't have a complete picture. You're looking at automating the process, but you need to insert the human into the process at very awkward places, which basically means that attempting to automate the process makes it slower and more cumbersome. That's what you want to validate. So to me, it comes across as a bit of a weird positioning. We're seeing less of this now.
Jonathan Rioux [00:41:24]: I know we talked about this a couple of months ago. I think organizations are getting a little bit better at this. And some of the organizations we work with are actually quite unapologetic about selecting use cases and identifying revenue. What they have in common is that their business people took a very keen interest in getting familiar with the technology, understanding not necessarily the intricacies of it, but they have a good mental model of how it works, and there's a very tight collaboration in creating those products. We're seeing more and more of this. I find it very encouraging.
Demetrios [00:42:08]: Do you feel like those learnings you talked about in the POCs can be salvaged? Almost like you pull a rabbit out of the hat and you say, hey, the HR chatbot is shit. We all know that. But now that we've got all this stuff, we actually have an idea for something that will bring ROI. Does that give people pause? Because now they're thinking, ugh, this again, we've got to go through that whole cycle. Even though it's going to be a bit more streamlined, there still isn't the kind of buy-in there was before.
Jonathan Rioux [00:42:47]: I think it depends on what your appetite to invest is and ultimately how the market is responding to those claims. It's important to remember that it's very exciting to pursue generative AI, and for a lot of public companies, the market has rewarded those that have been very vocal about adopting GenAI. So what do you do with that power? That, I think, is the broader question you need to ask yourself. Some companies are a little bit burned because they don't really have that notion of risk, that it's not guaranteed, that you're putting forward a hypothesis. And on top of that, some people are not stopping their POCs early enough. I sometimes see people doing a proof of concept and engaging in a full UI, a complete build. They want to do user testing, and it is expensive. If you're very confident in your backbone, that there's going to be a fit, that it's going to be used, and you have a good notion of where it's going to go, I think it's fine.
Jonathan Rioux [00:44:25]: There's no one-size-fits-all. But I would postulate that a lot of companies could benefit from reducing their POCs and distilling them down to a series of maybe up to three questions that you're trying to validate. One of the things we're implementing on a lot of the POCs that we do is to start with a scorecard. I'll use a very common example that we see: you have a ton of unstructured data coming in, faxes, emails, PDFs, and you want to extract the relevant information from it. At this stage, I think that POC has been validated. Industry-wide, we understand it's a very good use case for generative AI. Is it going to work with your documents? Well, boil it down to a very simple question and work on the ML portion.
Jonathan Rioux [00:45:25]: Use those results to build a business case. You don't necessarily have to engage a massive team for a long time. You're proving a concept. If you want to go straight into a pilot, we're seeing a lot more of that happening, because the concept's been proven, too. And then you can start asking the questions that are going to come out of it. But to me, this has very little to do with how you do data science as an organization. It's very much how you develop internal products, or even external products, and how you innovate as an organization.
Demetrios [00:46:07]: Yeah, I was thinking that same thing. It's more about how you run a proper POC. If it's in Gen AI, great. If it's in some other random aspect of the business, great. You still have to know how to properly conduct these POCs.
Jonathan Rioux [00:46:22]: Yeah.
Demetrios [00:46:24]: There's another thing, too, that, as I was thinking, I don't know if it's counterintuitive, but going back to what we were saying earlier: your job is the center of excellence, and you now have to validate those POCs and figure out where the ROI is. It makes for potentially sticky conversations. And so, like you were saying, you come in more along the lines of: there are certain use cases that have been proven. We've had ChatGPT for almost two years now, right? And so it's pretty clear that it works really well with these five things. Choose one of these five things and you'll probably be able to see some ROI from it. If you go down this other route,
Demetrios [00:47:23]: It's cool, it's fun, you'll learn stuff. But ROI? Don't bring that into the conversation.
Jonathan Rioux [00:47:32]: Well, I mean, we're a consulting organization, so we've created a series of handy-dandy tools for those kinds of endeavors. It's funny that you say those five things, because we actually have that. I use that mental model a lot of the time, like the five pillars of GenAI. What are the verbs that generative AI is very good at? It's kind of summarize, synthesize, and so on. I think it's a great way to do outreach with people who are closer to the business function, and it also helps them build a healthier mental model of the technology. Generative AI is going to be the tip of the spear, but traditional machine learning or deep learning, or even plain old automation, also has a place, and usually they work together; you have a lot of compound systems.
Jonathan Rioux [00:48:36]: A lot of people talk about LLM agents, which I prefer to see as that synergy between software, ML, deep learning, and generative AI. By doing that outreach and continuing to maintain those relationships within your organization, you're in a much better position to identify use cases and do back-of-the-envelope POCs. One of the things we do quite a bit is: hey, one afternoon, we're going to validate this handful of documents. See, this is the result. This is what's good, this is what's bad. Are we at a stage where we feel pretty comfortable? And then, as you design your flow, what you're trying to optimize or what you're trying to develop, you understand what the different failure modes are. To me, this is a very exciting way to develop new products and to generate what I would call a differentiating product.
Jonathan Rioux [00:49:50]: Once in a while we see a couple of companies, and most of the time it's not a company that you would expect, but naturally a couple of people are going to converge, they're going to hack on an idea, and you're just going to be like, wait a moment, that's really smart.
Demetrios [00:50:08]: What have been some of those that you've seen that surprised you?
Jonathan Rioux [00:50:12]: I would say it's something we've seen very recently: how do you create a chatbot, let's say, with content that needs to be curated, in the sense that your hallucination tolerance is zero? None whatsoever. So it's kind of a tough predicament to say, we're just going to put an LLM on it. No amount of prompt engineering, RAG, fine-tuning, anything, is going to make up for the fact that ultimately you will get inconsistency. So one of the things we developed as a blueprint, which came from numerous conversations we had across the industry, was: well, what if we didn't use an LLM at that stage? What if we just used a mere classifier and presented the information in a way where you're not compromising? You ask a question, you run a classifier to find the question that is most closely related, and you present that.
Jonathan Rioux [00:51:32]: And if there's a little bit of inconsistency, that just becomes part of your knowledge loop. But then use the large language model in the content creation, curation, and review loop to streamline the work, which is mandatory work. At some point, someone needs to create the content that's going to be served. How can we accelerate it? So we ended up creating a bona fide workbench for optimizing the creation of that content and the ingestion of the relevant material, going even to the point where you ingest a PDF and ask: this is the style of questions we know people are going to ask; what type of question does this document answer? And ultimately, we're seeing the benefit: you have the chatbot, the chatbot is presenting the information it's supposed to, the chatbot is not hallucinating. There's nothing sexy there.
Jonathan Rioux [00:52:29]: It's something where you use previous-generation technology, but we looked at the process, saw where the friction was, and ultimately it was in content creation and curation. And because that is something where the human is very tightly in the loop, well, the LLM can play in that arena. To me, it was one of those things where, at first, when I was looking at it, I was a little bit puzzled, because it looked overly complicated. But then you start decomposing: what are the non-negotiables? Traditionally you would say, oh, you can't have any hallucination? Oh, well, so an LLM is not a good use case. We're going to go to the next one.
Jonathan Rioux [00:53:16]: Wait a moment. This is actually something that is very valuable. Creating a lot of content is a big pain in the ass for a lot of people. Can we pay attention to it? In our case, we had some subject matter experts, the people creating the content, and a team of data scientists to develop the model. And it works. I was legitimately surprised. It works.
Demetrios [00:53:46]: Wait, so I wasn't clear. There was no LLM involved, or is it just involved later on?
Jonathan Rioux [00:53:51]: It's involved ahead of time. You basically have kind of a knowledge base, and you're using natural language understanding to match the question to the proper article. But when you're creating those articles and the metadata that makes them referenceable, you use a large language model to generate the embeddings.
Demetrios [00:54:22]: When you're creating the embeddings, you have your embedding model. So you're using the embedding model still.
Jonathan Rioux [00:54:28]: Yeah, it's kind of a reverse RAG.
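As a rough sketch of the "reverse RAG" idea described here, and not Jonathan's actual implementation: curated question-and-answer pairs are embedded ahead of time, and at query time the user's question is only matched to the closest curated question, so the answer served is always human-written. The bag-of-words `embed` function below is a toy stand-in for a real embedding model, and the knowledge-base content is invented.

```python
from collections import Counter
import math

# Curated knowledge base: every answer is human-written, so nothing
# served to the user is generated at query time.
KNOWLEDGE_BASE = [
    ("What is my deductible for sewer backup?",
     "Your sewer backup deductible is listed in section 4 of your policy."),
    ("How do I file a claim?",
     "You can file a claim online or by calling your agent."),
]

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: bag-of-words token counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Embeddings are computed ahead of time, when the content is curated.
INDEX = [(embed(q), q, a) for q, a in KNOWLEDGE_BASE]

def answer(user_question: str) -> str:
    # At query time we only classify: match to the closest curated question
    # and return its pre-written answer verbatim.
    qv = embed(user_question)
    _, matched_q, matched_a = max(INDEX, key=lambda row: cosine(qv, row[0]))
    return matched_a
```

In a real system the matcher would also need a confidence threshold, so that questions with no good match fall through to a human instead of being answered with the least-bad article.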
Demetrios [00:54:32]: Oh, you just created a whole new type of RAG. Get out of here. So now, hold on, hold on, hold on. Yeah, we've got graph RAG, we've got advanced RAG, we've got naive RAG, and you heard it here first: we now have reverse RAG.
Jonathan Rioux [00:54:47]: I will take absolutely no credit for this. I do not want to. It's something a lot of people would say has already been invented; it just happened to be the pattern at hand. I don't want to go and create a paper on reverse RAG and get shredded on LinkedIn.
Demetrios [00:55:14]: This is how I'm going to get my center of excellence job. I'm going to go write that paper on reverse RAG. I'm going to coin the term.
Jonathan Rioux [00:55:21]: I can imagine this in your MLOps certification thing. You've never heard of reverse RAG, and then people start searching and they see your name as the primary author. Self-service at its finest.
Demetrios [00:55:34]: LinkedIn Top Voice. That's what I'm going for, baby. All right, so you reverse RAGged it, basically, and there's a classification model. I'm still not completely sure how this works out, but it sounds really cool, because any time anyone says these magic words to me, my brain and my attention perk up: we decided not to use an LLM. When you say that to me in 2024, I go, wait, what? That is unconventional. And so you're using a classification model, but I'm not completely sure. So you're still using the embedding model?
Jonathan Rioux [00:56:13]: So I'll give you an example. Let's say, for instance, that you're working for an insurance company and you want to have some information about your policy. You're going to ask questions like, what is my deductible if I have sewer backup? Well, you don't actually need an LLM to serve that information. There are a couple of classifications you're going to have to do: you live in state X, you're in this city, this is your policy. You're going to have to extract all of that information, which, for the most part, is structured. You don't need to start stuffing embeddings and creating a massive RAG of all the policies and so on and so forth.
Demetrios [00:57:12]: Because it's, it's in the CRM.
Jonathan Rioux [00:57:14]: Exactly.
Demetrios [00:57:15]: You have that information about the customer, and this is basically a customer support question. Insurance can't be telling you policies that are not your actual policy, so it can't go wrong.
Jonathan Rioux [00:57:26]: That's right. And the articles, as a matter of fact, your insurance policy was written from a set of articles. So you can say: we're going to take this article that's attached to your policy, which is attached to who you are, and then you can craft that answer. So in that case, you just resurface the piece of content that you want, and you can add some deterministic fluff: hey, Demetrios, based on the information that we have for you, this is what's written in your policy. If you have more questions, you can ask more questions, and then you can escalate to a human. In serving that initial piece of content, there's absolutely nothing stochastic, with the exception of taking your question and translating it to the appropriate article. But if you use an LLM, there's a chance that it might reshuffle something. And in that case, we're not talking about people who just say, I would rather eliminate hallucination.
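A minimal sketch of that deterministic flow, with everything invented for illustration (the customer record, the policy article, and the keyword-based intent classifier): the only non-deterministic step is routing the question to an article, while the content served is verbatim policy text wrapped in a fixed template.

```python
# Hypothetical structured data, as it might live in a CRM / policy system.
POLICIES = {
    "cust-001": {
        "name": "Demetrios",
        "state": "CA",
        "articles": {
            "sewer_backup": "Sewer backup is covered up to $5,000 "
                            "with a $500 deductible.",
        },
    },
}

def classify_intent(question: str) -> str:
    # Stand-in for the question classifier: map the question to an article key.
    if "sewer" in question.lower():
        return "sewer_backup"
    raise ValueError("no matching article; escalate to a human")

def serve_answer(customer_id: str, question: str) -> str:
    policy = POLICIES[customer_id]
    article = policy["articles"][classify_intent(question)]
    # Deterministic "fluff" around verbatim policy text; nothing is generated.
    return (f"Hi {policy['name']}, based on the information we have for you, "
            f"this is what's written in your policy: {article}")
```

The `ValueError` branch is the escalation path: when the classifier has no confident match, the question goes to a human rather than to a generative fallback.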
Jonathan Rioux [00:58:39]: We're talking about people who are either scared shitless, or constrained by the government, or afraid to get sued if they give out bad information. I mean, think about it for a moment: companies that are trying to do chatbots for drug usage, where, when you get information, it's really easy for this to get out of hand. Let's discount the legal framework for a moment. If you ask questions about the interaction between two medications and, for some reason, something magically escapes the LLM, well, you could be in trouble. So how do you redesign this and find alternative patterns? And it's not just data science; it's how you're going to present the information. We did a lot of analysis, human-centered design, UX development, to see how it was going to go. A lot of experimentation internally, because it's a problem we decided was worth pursuing.
Jonathan Rioux [00:59:57]: It's a pattern we're seeing across multiple customers, and it's an interesting problem. And I think that taking that approach also demystifies the all-or-nothing mindset of: either you put the LLM at the center of everything, or you go home. There's a lot you can do with a more nuanced and thoughtful approach.
Demetrios [01:00:24]: Wait, so isn't this just basically what we were doing in 2017 with chatbots?
Jonathan Rioux [01:00:30]: Yeah, but not every company was ready for that, or for a lot of companies, generating the content was tedious. And this is one of the places where, in the content generation, or the content curation and collection, we've done some work with a client where we used a large language model to generate the questions that could be answered by a document and make Q&A pairs, which is a lot easier for someone to review than to have to generate themselves. So it becomes more of a productivity play rather than delighting your user with an approximate chatbot.
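A rough sketch of that productivity play, assuming a generic `llm` callable (prompt in, text out) as a stand-in for whatever model API is actually used, and an invented line format for the model's output: the model drafts question/answer pairs from a document, and the result goes to a human reviewer rather than being served directly.

```python
def draft_qa_pairs(document_text: str, llm) -> list[tuple[str, str]]:
    # `llm` is a stand-in callable: prompt string in, completion string out.
    prompt = (
        "List the questions this document answers, one per line, each "
        "followed by ' :: ' and a short answer drawn only from the text:\n"
        + document_text
    )
    pairs = []
    for line in llm(prompt).splitlines():
        # Keep only lines that follow the requested "question :: answer" shape;
        # anything else from the model is discarded.
        if " :: " in line:
            question, answer = line.split(" :: ", 1)
            pairs.append((question.strip(), answer.strip()))
    return pairs  # handed to a human reviewer, not served to users directly
```

The point of the format check is that the model's output is treated as a draft: malformed lines are dropped, and nothing reaches the knowledge base without a human approving it.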
Demetrios [01:01:17]: I see. Yeah. And this is fascinating to me, because I know a few companies that are actually trying to do this as a service. And so I think it is one of those use cases that is immensely valuable, if you can pull it off. The support team at any company with a large volume of customers is just inundated constantly. And we know that because, if you've ever tried to call or get somebody on the phone at your phone provider or your local whatever, you know that it's a pain in the butt. And so if you can lift a little bit of that burden, it's really helpful. What I always have trouble with is, a lot of times there are questions, but then there are other times when I need something done.
Demetrios [01:02:18]: I need to talk to you because I need something to happen. And so it's almost like I am very skeptical that I'm going to be able to get a machine or a support bot to actually do those things in a way that I can trust that thing to happen.
Jonathan Rioux [01:02:36]: I mean, it's also kind of the dirty secret of a lot of agent workflows. A lot of people use the term "function calling" for an LLM performing an action, but the function needs to exist. I find it funny to say this, and I saw your smile when I started, but if there is nothing to call, then everything falls like a house of cards. So you need to build that tower first. Let's use an example: I want to close my account. You talk to a chatbot because you're going to your bank. Well, there's a pretty big SOP, a standard operating procedure, that needs to be followed.
Jonathan Rioux [01:03:36]: Sometimes people are going to try to retain you. So if you look at it, it usually looks like a decision tree, with inflection points and decisions that you have to make. But let's say, for example, that the action of closing the account involves three applications, four screens, and then a letter that you have to mail to the person. Cool. Your very first step should be to automate that. And then you can start growing your process, and maybe the agentic LLM is going to be your north star. But start with those fundamentals and gain the ROI as you go. Because you're not necessarily going to want the LLM to call each function one after the other; if that's the case, you might want to do something that abstracts it.
Jonathan Rioux [01:04:39]: You might still be working with an AS/400-type system, which means you're going to do RPA or, God forbid, put an LLM in to interpret the screen and then click on those things.
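A minimal sketch of "automate the SOP first" (the account fields and decision points below are invented for illustration): the close-account procedure becomes plain software with the SOP's decision tree encoded explicitly, and only once something like this exists is there anything for an agent to "function call."

```python
from dataclasses import dataclass

@dataclass
class Account:
    id: str
    balance: float
    has_pending_transactions: bool

def close_account(account: Account) -> str:
    """Plain automation of a hypothetical close-account SOP.

    An agent can only 'function call' this once it exists as reliable
    software; the decision points mirror the standard operating procedure.
    """
    if account.has_pending_transactions:
        return "blocked: resolve pending transactions first"
    if account.balance != 0:
        return f"blocked: refund remaining balance of {account.balance:.2f}"
    # In a real system this step would update the three applications and
    # trigger the confirmation letter; here it just reports the outcome.
    return "closed"
```

Note that the function returns an outcome for every branch instead of silently failing, which is exactly what an orchestrating agent (or a plain chatbot decision tree) needs in order to decide what to do next.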
Demetrios [01:04:52]: There we go.
Jonathan Rioux [01:04:53]: Yeah, but I mean, it all has to do with your system design. An LLM application is also an ML application, which means that all of the considerations you have with ML also apply to your LLM app. And all of that is also a software product, so you have to remember that none of the responsibilities of developing a software product disappear because you ignore them. But it also gives you that complete array of tools that you can leverage. And a lot of things sometimes need to happen before you can use an LLM competently. I think we're seeing this right now, where everybody's doing agent-based LLMs, and they're all doing LLM hosting, and they're all doing LLM development in the ecosystem, because it's new, it's interesting, it's fresh. But you also need to remember that's just part of the story. What are you going to do with the rest of this stuff? You want to be able to do RAG, and you need embeddings.
Jonathan Rioux [01:06:05]: Cool. But if your data is on OneDrive somewhere, and it's getting updated by Penny from finance every Tuesday, well, you need a pipeline that's going to handle this. It's not GenAI, but it still needs to happen.
Demetrios [01:06:30]: It's funny, because in 2020 we had Luigi Petruno on this podcast, and I wrote a post after talking with him, because one of the main themes I took away from the conversation was: start manual first, and then automate. And you just added an extra little bit to it: start manual first, then automate, then AI-ify, or make it AI-agentable, whatever the next step may be. So if you want to function-call it or make it agentic, it's almost like: manual, automate, agentic. And you can't just jump over any of those steps, because your agent is not going to be efficient. And I really like how you said that it's probably not the most useful for the agent to have four different function calls in there.
Demetrios [01:07:29]: Maybe you can abstract something and make it a lot easier for the agent when it comes time to make it agentic.
Jonathan Rioux [01:07:40]: And as you design your experience, you have to remember, even though it's an LLM, it depends on how much control you want to give to the machine. One of my favorite things: sometimes we design customer service bots, or internal productivity bots, and you ask, okay, what should happen at this point? And everyone around the room is like, well, it depends. And it's like, cool, we're going to have to break that down, because ultimately you're relinquishing your decision-making to either an open-source LLM or a third-party LLM. You also need to design failsafes. And this is very similar to how you design for humans: errors creep in, so how do you correct them? There are a lot of processes where you need a second person to go over things because they're highly sensitive.
Jonathan Rioux [01:08:44]: So you have to design those workflows, and I think we're getting the grasp of it. One of the things I use a lot: every single time there's a decision point where you want to use machine learning or generative AI, I always ask the question: let's assume that 70% of the time it's going to do the right thing, but 30% of the time it's going to do something wrong. Is this making or breaking your use case? And I've had occurrences where people were like, well, we need 100% accuracy. Still, to this day. It was the same thing when we were talking about linear regression, tree-based models, even neural networks. And now with LLMs: what's your definition of how many errors you're willing to accept? If your number is zero…
Demetrios [01:09:42]: Probably not the right use case.
Jonathan Rioux [01:09:44]: Probably not the right use case. But also, are you really that certain that the humans that are doing the job are also making zero mistakes?
Demetrios [01:09:52]: Turn the tables. Yeah, totally. Because anybody who answers that honestly is going to go, yeah, of course, we're making mistakes all the time.