MLOps Community

The State of Production Machine Learning in 2024 // Alejandro Saucedo // AI in Production

Posted Feb 25, 2024 | Views 933
# LLM Use Cases
# LLM in Production
# MLOPs tooling
speakers
Alejandro Saucedo
Director of Engineering, Applied Science, Product & Analytics @ Zalando

Alejandro is the Director of Engineering, Science & Product at Zalando SE, where he is responsible for a large portfolio of (10+) products and platforms, including one of Zalando's petabyte-scale central data platforms, and several State-of-the-Art machine learning systems (Forecasting, Causal Inference & NLP) powering critical use-cases across the organisation. He is also the Chief Scientist at the Institute for Ethical AI, where he contributes to policy and standards on the responsible design, development and operation of machine learning systems. Alejandro is currently the Chair of the ML Security Committee at the Linux Foundation, and Chair of the AI Committee at the Association for Computing Machinery (ACM), where he has led EU policy contributions across the AI Act, the Data Act and the DSA, among others.

Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.

SUMMARY

As the number of production machine learning use-cases increases, we find ourselves facing new and bigger challenges where more is at stake. Because of this, it's critical to identify the key areas to focus our efforts, so we can ensure we're able to transition from machine learning models to reliable production machine learning systems that are robust and scalable. In this talk we dive into the state of production machine learning in 2024, and we will cover the concepts that make production machine learning so challenging, as well as some of the recommended tools available to tackle these challenges. We will take a deep dive into the production ML tooling ecosystem and into best practices that have been abstracted from production use-cases of machine learning operations at scale, as well as how to leverage tools that will allow us to deploy, explain, secure, monitor and scale production machine learning systems.

TRANSCRIPT

The State of Production Machine Learning in 2024

AI in Production

Slides: https://drive.google.com/file/d/1TrGfRhw-ebe1BNzWDqGg1s7oU10O4h4z/view?usp=drive_link

Demetrios [00:00:00]: Next up is our keynote number dos of the day. Alejandro is here with us. What's up, my man?

Alejandro Saucedo [00:00:09]: Hello. Hello. Hi, Demetrios. Good to see you. How are you?

Demetrios [00:00:13]: Been too long. And I am very glad that you said yes to doing this because it's always a special day in my book when I get to hear a talk from you.

Alejandro Saucedo [00:00:25]: My pleasure. No, it's a special day for me contributing to this. I'm very excited and I was just listening to the previous one. Yeah, a lot of really great insights.

Demetrios [00:00:35]: Excellent. Well, dude, I'm going to throw 20 minutes on the clock and then I'll be back to ask all kinds of fun questions to you, too. So I'll share your slides and let you get rocking and rolling.

Alejandro Saucedo [00:00:50]: Perfect.

Alejandro Saucedo [00:00:51]: Great.

Alejandro Saucedo [00:00:52]: Thank you very much, Demetrios. And yeah, it is a pleasure for me today to present to you the state of production machine learning in 2024. A little bit about myself. So I am Director of Engineering, Science and Product at Zalando, a scientific advisor at the Institute for Ethical AI, and member-at-large at the ACM. So today we have a very interesting topic where we're going to be covering a broad range of different areas, and the way for you to actually get the most value out of this is by delving into each of the different aspects and elements that we touch upon. So if you want to access the resources, you can find them at this link at the top. It's going to be at the end as well, but pretty much each of the slides will have a link to a previous talk that I have given or that I have found really useful. So please do check it out whenever you have a chance.

Alejandro Saucedo [00:01:52]: So today we're going to be covering five key areas: the motivations and challenges (why should we care?), some of the trends at the industry level, trends at the technological level, and trends at the organizational level, before we wrap up. Starting with some of the motivations, one of the things that we don't need a reminder of in this group is that we no longer believe, and hopefully never believed, that the lifecycle of the model ends once it's trained, right? If anything, it actually begins once it's trained and once it starts being consumed, right. And we actually have different considerations once a model hits production, because it starts, in a way, decaying.

Alejandro Saucedo [00:02:36]: Let's.

Alejandro Saucedo [00:02:37]: So we're going to be actually talking about some of the challenges that appear in production machine learning and what makes it so challenging. Well, there's specialized hardware. So we've heard a lot in the conference about the rise of GPUs. There are complex data flows that are involved throughout the pipelines, but also the feature data that needs to be used, both in training and inference. There are a lot of compliance requirements, particularly for certain use cases, and there is a need for things that are not as prominent in other areas, such as reproducibility of the components themselves. But what makes it so much more challenging than just traditional infrastructure? Well, there are also considerations such as bias, such as the traditional outages, but which can be caused by different contexts and situations. There is personal data, or even just data that, through its exposure to the models, may have particular nuances, and also the cybersecurity elements that we're going to touch upon. And we have to remember that the impact of a bad solution can be worse than no solution at all.

Alejandro Saucedo [00:03:47]: And one of the interesting things when it comes to LLMs in particular is that the rise of this new emerging technology gives a really great intuition on the challenges that we face in production machine learning, particularly in how they interact across the different data flows, consuming data from different areas in a way that is often interactive. So I think a lot of these concepts will become even more intuitive when linking them to some of the challenges that we find with LLMs, but they are still prominent across all types of AI. And the reason why is because we actually see a different set of domain expertise required to be able to fulfill the needs that machine learning engineering brings, right? So data science, software engineering, DevOps, but that extends even beyond machine learning engineering and ops into the industry domain expertise, the policy expertise, forming these industry standards. So in order for us to tackle this challenge as a whole, we need to think from a very high-level perspective about what is at play. So one of the things we tend to see, the way that I want to put it, is that 2018 was the year where we were exploring the topic of AI ethics, right? Then 2020, et cetera, is when it started going more into the regulatory domains. Right now, we see examples like the AI Act that is now going into enforcement here in Europe, but this now also extends into the software frameworks and libraries, right? Because ultimately, it doesn't matter how many roundtables we have and how many principles we set, if the underlying infrastructure is not set up to be able to enforce or apply those high-level principles, we're never going to be able to succeed on that. And it's not just about the hard ethical questions falling on the shoulders of a single practitioner, right? A single software engineer, a single data scientist. It's also about the accountability frameworks and structures that sit around the operations, development and design of machine learning models.

Alejandro Saucedo [00:05:55]: So whilst an individual practitioner like ourselves can make sure that we're using the technology best practices, the relevant tools, we also need to make sure that at a team level, there are cross-functional skill sets, relevant domain experts at the different touch points, and then at the broader, even organizational level, the governing structures and aligned objectives to be able to ensure that this is encompassed. And we're going to dive a little bit deeper into this now. Of course, as I said, when it comes to the enforcement of these higher-level principles or even regulation, the interesting thing is that now projects, we previously had this talk from Hugging Face, but also other open source frameworks, are becoming critical in order for us to further this capability to ensure the responsible development, design and operation of machine learning systems. So even us practitioners, we have a very important voice to be able to make sure that this evolves and that we further the direction. And one of the things that we had also been seeing is that previously there was this phrase that regulation was playing catch-up, but now we see that it's the other way around. And actually we see a lot of the interesting innovations coming from the regulatory frameworks and standards, similar to what we're now seeing, not just in Europe, but across the world. And actually today we celebrate a massive milestone with one of the contributions that we made to the UK regulation, with 13 out of the 14 recommendations that we made being adopted. So that was a huge win for the AI regulatory ecosystem.

Alejandro Saucedo [00:07:29]: So this is from a high level, right? So regulations, standards, frameworks, guidelines, principles. If you're interested, there is a 1-hour talk that I gave at NeurIPS, specifically on the topic of responsible AI. But given the time that we have in this talk, this should give you some intuition. Now let's dive into some technological trends, more specifically in four areas that I want to hyper-focus on: frameworks, architecture, monitoring and security. In terms of frameworks, let's just remember how it all started.

Alejandro Saucedo [00:07:59]: Right?

Alejandro Saucedo [00:07:59]: We had a couple of frameworks to pick from in order to run through our MNIST or one of our CIFAR datasets. But now, where are we? Now we are in a situation where there are more than a dozen tools to choose from for a large number of different areas. So how do we navigate this? Right, let's assess what the anatomy of production machine learning looks like. But if you want to delve into some of the tools that are available, we actually have a really awesome list that curates tools across privacy, feature engineering, visualization, et cetera, which just celebrated five years. So please do check it out. Give it a star. But this is a good starter.

Alejandro Saucedo [00:08:47]: But yeah, in order to see how this all fits together, let's look at the anatomy of production machine learning. So, looking at the data at play, this is your training data, your artifacts (the machine learning models that are trained), and then your inference data. Of course, you want to start with the experimentation, right? Performing hyperparameter tuning, training, evaluation. Whether it's a Jupyter notebook or an actual distributed tool, you're going to be converting training data into model artifacts. Of course, this is where we said that the lifecycle begins once the models are trained, because those model artifacts are what now needs to be consumed by the business, deployed or served as either real-time models or batch models. Then of course we introduce the failsafes that allow us to run this at scale, which is all of our advanced monitoring, not just traditional software monitoring, but things like drift detection and explainability, which we will cover in a bit. And we also want to make sure that there is a connection back from our inference data so that it can be reused. Now, as part of this, there is an element of metadata that flows across the area, which we're going to come back to.

Alejandro Saucedo [00:09:56]: But now that we have this anatomy of production machine learning, we can also ask the question, well, how do we then marry the architectural blueprint and all of these hundreds or thousands of MLOps frameworks? There is an interesting project called MyMLOps, which actually provides a really interesting visualization of how organizations are starting to think about this. It provides a couple of different architectural blueprint components, your experimentation, your runtime engine, your code versioning, and it gives you the ability to pick and choose which tool you can use, with a trade-off decision on what one gives you versus another. So ultimately, this is something that a lot of organizations are wrestling with: understanding whether they need to adopt a heterogeneous, best-of-breed, open source combination, or whether it should be an end-to-end single provider. But this actually allows you to reason about and navigate this very complex ecosystem. Now, one other thing is that now that we move into this more scalable architecture of MLOps, we also have to consider this transition from model-centric to data-centric. And actually, whenever we're talking about deploying models, we now see complex architectures. This example is the retrieval in Facebook search, which contains an offline and an online part. I mean, the key thing to emphasize here is that, back to the point that I was mentioning, LLMs really allow you to see this ML system intuition of the direction in which AI is going, right? Because now we're actually looking at these very complex architectures that require multiple different components, each of them with its own monitoring considerations.

Alejandro Saucedo [00:11:42]: Right?

Alejandro Saucedo [00:11:42]: Now, in terms of this move from model-centric to data-centric, we have to consider the relationships at play, right? So let's think about, for example, if you have a couple of datasets. So let's say dataset instance A1 all the way to An. We would use these to train a machine learning model to create an artifact, right? So let's say instance A1 to Am to create a machine learning model artifact, right? We would do that with another one for another artifact. And then maybe we actually have a different dataset for a completely new, different use case. So traditional model artifact stores deal with this relationship. And this is something that applied scientists and data scientists have reasoned about quite well, but now we move into a new domain, right? Like we're productionizing not machine learning models, but machine learning systems. So in this case, we actually have a new relationship where we are instantiating, in a way, artifacts, right? So we deploy a machine learning model in a particular environment, but we can also deploy that machine learning model in another environment. And similarly, we can actually deploy systems or pipelines consisting of different machine learning models from different contexts.
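The relationships described here (many dataset instances feeding one artifact, and one artifact instantiated in several deployments or pipelines) can be illustrated with a small data model. This is only a sketch under stated assumptions: the class names (`DatasetInstance`, `ModelArtifact`, `Deployment`, `MetadataStore`) and the example names are hypothetical and do not come from the talk or from any particular metadata tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetInstance:
    name: str  # e.g. "A1"


@dataclass(frozen=True)
class ModelArtifact:
    name: str
    trained_on: tuple[DatasetInstance, ...]  # many dataset instances -> one artifact


@dataclass(frozen=True)
class Deployment:
    name: str
    environment: str
    artifacts: tuple[ModelArtifact, ...]  # a system may combine several artifacts


@dataclass
class MetadataStore:
    """Hypothetical in-memory store linking deployments back to their data."""

    deployments: list[Deployment] = field(default_factory=list)

    def register(self, deployment: Deployment) -> None:
        self.deployments.append(deployment)

    def deployments_using(self, dataset: DatasetInstance) -> list[str]:
        """Names of deployments with at least one artifact trained on `dataset`."""
        return [
            d.name
            for d in self.deployments
            if any(dataset in a.trained_on for a in d.artifacts)
        ]


a1, a2, b1 = DatasetInstance("A1"), DatasetInstance("A2"), DatasetInstance("B1")
ranker = ModelArtifact("ranker-v1", (a1, a2))
embedder = ModelArtifact("embedder-v1", (b1,))

store = MetadataStore()
# The same artifact instantiated in two environments.
store.register(Deployment("ranker-staging", "staging", (ranker,)))
store.register(Deployment("ranker-prod", "production", (ranker,)))
# A pipeline made of artifacts from different contexts.
store.register(Deployment("search-pipeline", "production", (embedder, ranker)))

print(store.deployments_using(b1))  # ['search-pipeline']
```

A lineage query such as `deployments_using` is the kind of question a store that only tracks dataset-to-artifact links cannot answer on its own, which is the gap the speaker points at.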

Alejandro Saucedo [00:12:50]: Right.

Alejandro Saucedo [00:12:50]: And this introduces a new paradigm.

Alejandro Saucedo [00:12:52]: Right.

Alejandro Saucedo [00:12:53]: We need to think about this in a completely new way that goes well beyond the limitations of traditional artifact stores.

Alejandro Saucedo [00:13:01]: Right.

Alejandro Saucedo [00:13:02]: And because of this, we are seeing a large trend asking the question of how do we manage our metadata interoperability.

Alejandro Saucedo [00:13:09]: Right.

Alejandro Saucedo [00:13:09]: Like we have the different stages of your machine learning models. We need to make sure that everything actually, I guess, interoperates in between, right? So you have the data itself. So we talked about the data that you use for training or for inference, and also the metadata of the models themselves. You may have the experimentation elements involved in this. You may then have the deployment artifacts that you have in here, but then you can also go all the way to the back, right? Like your data labeling, the linking between your data labels and your datasets, and then the data products that are at the backbone, right? So this is a big consideration that organizations are also really wrestling with. How do you then cover this end-to-end capability? So with that, that covers the architecture. Let's move into monitoring, covering a couple of areas. But if you want to dive into a full-on overview, check out the resource. In terms of machine learning monitoring, you have things like traditional performance metrics, things like statistical performance, your accuracy, precision, recall, the ability to dive into aggregate insights.

Alejandro Saucedo [00:14:15]: Right.

Alejandro Saucedo [00:14:15]: Being able to slice and dice into the production machine learning model performance as well as explainability techniques. Right, but it doesn't stop there, right. Because it's not just about jumping into a dashboard and looking what is happening there. It's also about introducing observability by design.
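As an illustration of slicing statistical performance metrics, the sketch below computes precision and recall per segment of logged predictions. The record layout (`y_true`, `y_pred`, and a `country` field used as the slice key) and the function names are assumptions made for this example, not something specified in the talk.

```python
from collections import defaultdict


def precision_recall(pairs):
    """Compute (precision, recall) from (y_true, y_pred) pairs with 0/1 labels."""
    tp = fp = fn = 0
    for y_true, y_pred in pairs:
        if y_pred == 1 and y_true == 1:
            tp += 1
        elif y_pred == 1 and y_true == 0:
            fp += 1
        elif y_pred == 0 and y_true == 1:
            fn += 1
    # Report 0.0 when a metric is undefined (no positives predicted or present).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def metrics_by_slice(records, slice_key):
    """Group prediction records by `slice_key` and compute metrics per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[slice_key]].append((record["y_true"], record["y_pred"]))
    return {key: precision_recall(pairs) for key, pairs in groups.items()}


# Hypothetical logged predictions joined with ground-truth labels.
records = [
    {"country": "DE", "y_true": 1, "y_pred": 1},
    {"country": "DE", "y_true": 0, "y_pred": 1},
    {"country": "DE", "y_true": 1, "y_pred": 0},
    {"country": "FR", "y_true": 1, "y_pred": 1},
    {"country": "FR", "y_true": 0, "y_pred": 0},
]

print(metrics_by_slice(records, "country"))
# {'DE': (0.5, 0.5), 'FR': (1.0, 1.0)}
```

An aggregate over all five records would hide that the model does noticeably worse on one segment, which is the point of slicing before looking at a dashboard.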

Alejandro Saucedo [00:14:30]: Right?

Alejandro Saucedo [00:14:30]: So instead of requiring everybody to jump in, look at the dashboard, see how it's performing, we want to introduce things that provide actionable insights, like alerting.

Alejandro Saucedo [00:14:40]: Right.

Alejandro Saucedo [00:14:40]: Things like automated SLOs.

Alejandro Saucedo [00:14:42]: Right.

Alejandro Saucedo [00:14:43]: What are the sort of, like, contracts that the models have to be operating by, whether it's requests per second, whether it is particular throughput, et cetera, et cetera, GPU usage, et cetera. The ability to introduce things like progressive rollouts.
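One way to picture SLOs acting as contracts that gate a progressive rollout is the sketch below. The `SLO` class, the metric names, the thresholds, and the policy of increasing traffic in fixed steps (and rolling back to zero on any violation) are all assumptions chosen for illustration; real rollout controllers differ in the details.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    metric: str
    threshold: float
    higher_is_better: bool

    def is_met(self, observed: float) -> bool:
        if self.higher_is_better:
            return observed >= self.threshold
        return observed <= self.threshold


def next_traffic_share(
    current: float,
    slos: list[SLO],
    observed: dict[str, float],
    step: float = 0.25,
) -> float:
    """Promote the new model by `step` if every SLO is met, otherwise roll back.

    A missing metric is treated as a violation, so the rollout never advances
    on incomplete observations.
    """
    for slo in slos:
        if slo.metric not in observed or not slo.is_met(observed[slo.metric]):
            return 0.0
    return min(1.0, current + step)


# Hypothetical contract for a model endpoint.
slos = [
    SLO("requests_per_second", 100.0, higher_is_better=True),
    SLO("p95_latency_ms", 200.0, higher_is_better=False),
    SLO("gpu_utilisation", 0.9, higher_is_better=False),
]

healthy = {"requests_per_second": 120.0, "p95_latency_ms": 150.0, "gpu_utilisation": 0.7}
degraded = {"requests_per_second": 120.0, "p95_latency_ms": 250.0, "gpu_utilisation": 0.7}

print(next_traffic_share(0.25, slos, healthy))   # 0.5
print(next_traffic_share(0.25, slos, degraded))  # 0.0
```

Encoding the contract as data rather than as a dashboard someone has to watch is what makes the promotion decision automatable.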

Alejandro Saucedo [00:14:55]: Right.

Alejandro Saucedo [00:14:55]: It's like, once you hit a particular SLO, then I actually want this to be promoted. And then similarly introducing more advanced monitoring techniques like drift detection and outlier detection. Again, this is basically a whistle-stop tour. Please do check out the deeper dives. Otherwise this would already be a four-hour talk. And the last element is security. And this is something that I just want to raise awareness of, because this is an element that is actually growing in importance across the industry. The reason why is because if we ask the question of what are the phases of the machine learning model lifecycle where cybersecurity is important, that would be basically all of the areas in red.

Alejandro Saucedo [00:15:35]: If you squint your eyes, you would be able to see it. That is, throughout the end to end.

Alejandro Saucedo [00:15:39]: Right.

Alejandro Saucedo [00:15:40]: There are potential risks of vulnerabilities at the data processing, at the model training, at the model serving, at the metadata layer. And as part of this, it requires us to think about new mechanisms that allow us to tackle these security considerations. On one of the approaches that we've taken: I'm actually chairing a working group at the Linux Foundation that is looking at machine learning and MLOps security. So if you're interested in contributing on this topic, please also do get involved, as we have released a couple of great resources. So now on to the final piece, which is the organizational trends. Let's dive into some of the insights that we are seeing across different companies around how they're adopting production machine learning across their entire end-to-end organization. So the first point to highlight is that there's this transition from the software development lifecycle to what we can call an MLDLC, right, SDLC to MLDLC, the machine learning development lifecycle. And the reason why this is a key distinction is that when it comes to an SDLC, you have a rigid set of steps that are carried out in order to deliver an artifact to production, right? You write your code for your microservice, and that is tested.

Alejandro Saucedo [00:16:56]: It goes to the ops teams, and then the ops teams are able to approve it, it goes to production, and then you have the monitoring on top of that. However, when it comes to machine learning, it is more of a not-one-size-fits-all situation, right? So you also have to be very close to the use case to understand what the areas are that should actually be relevant for it. In some cases you may have risk assessments. In some cases you may have some actual ethics board approvals. In some cases, you may not require as much of an overhead because it's going to be a quick iteration and experiment to answer a particular question, right? So it is not possible to just copy-paste what we've already done for software when it comes to the governance and operations of our production machine learning methodologies at scale. So this is actually something that we're seeing, and it's evolving slowly but surely. And there's going to be a lot of really interesting discussions as organizations start iterating on this.

Alejandro Saucedo [00:17:54]: The second one is the convergence of MLOps and DataOps. So DataOps is, I guess, a parallel universe to what we're discussing right now, but an important one that is looking at concepts such as data mesh architectures to enable each of the business units, each of the departments, each of the teams to make use of their datasets in a nimble way. And right now, these concepts of operationalization of machine learning and data governance are also colliding and creating new frameworks that we need in order to ensure compliance at scale, not just for personally identifiable data, but also for being able to map and trace the usage of different datasets. So this again is actually becoming a very important topic, especially in DataOps communities where they're starting to talk a lot about MLOps. There is also a transition from projects to products, right? So this is basically saying, okay, well, in the past you would often use machine learning to answer a question or to deliver a particular thing. We deliver a project, we finish, and that is it.

Alejandro Saucedo [00:18:59]: Right?

Alejandro Saucedo [00:18:59]: But like we said, the lifecycle of a machine learning model begins once it's trained and once it's in production, right? There may be new versions of the model that are productionized. There may be iterations to actually extend the capabilities, not just in terms of the features, but of the system as a whole, even of the AI-delivered product that is at play. So this mindset of product thinking is coming in not just in the sense of the actual methodologies, but also in terms of the teams. So we are seeing concepts like Spotify's squad structure coming into machine learning and bringing in not just the machine learning practitioners, but also practitioners that would often be relevant to different components, like for example UX researchers, full stack engineers, domain experts, creating these squad functionalities. And then one of the final things to cover is how to actually map the different horizons of the organization, right? You're going to have some things that are more immediate term that have to be delivered for the business, project delivery, but others that are more on the MLOps infrastructure. And the way that we actually think about this is with different personas at play, right? So perhaps if you have a small set of models, maybe you only need a couple of data scientists. But then as the number of machine learning models increases, you may actually require different personas to be involved. Right, machine learning engineers or even MLOps engineers, right? And as part of this, also, you shouldn't actually start with all of the complexity from day one; rather, all of your automation, standardization, control, security and observability increase as your MLOps requirements also increase.

Alejandro Saucedo [00:20:40]: Right?

Alejandro Saucedo [00:20:40]: So that's the one thing to consider. And then the final thing is that similarly, when it comes to the different delivery mechanisms, you also have to think as an organization about the ratio between these data scientists and other personas, right? So perhaps again, if you have a small team of data scientists, you may have a couple of different pipelines that deliver value. But as you have more pipelines at play, then you would perhaps have more machine learning engineers, or even MLOps engineers that maintain the ecosystem. So those are some of the organizational trends. As I said, please do check out the resources linked for a bit of a deep dive. And to wrap up before we move into questions, just a reminder that not everything has to be solved with AI, right? When you run with a hammer, everything looks like a nail. And we have to remember that as practitioners, we are starting to have a growing responsibility because critical infrastructure increasingly depends on ML systems. And it doesn't matter how many abstractions and how many LLMs, the impact will always be human.

Alejandro Saucedo [00:21:47]: Right.

Alejandro Saucedo [00:21:47]: So it's just something for us to keep in our minds as we actually progress and further these considerations and best practices of machine learning. And with that, that was a lot of content. So I hope you managed to bear with me. But, yeah, I very much appreciate it. Please do feel free to check out the resources. And yeah, I hope you enjoyed the session. So I'll pause there. Thank you.

Demetrios [00:22:14]: We most definitely did, my man. Don't you worry about a thing. I think we crashed the MyMLOps website, sadly, because all of a sudden all these people went to the website at the same time and now you can't get that fun from it. But I imagine the team is hard at work.

Alejandro Saucedo [00:22:39]: Many people made everybody go.

Alejandro Saucedo [00:22:42]: Yeah.

Demetrios [00:22:43]: And so we'll talk to the guys at, I think it's Aporia, who put that on.

Alejandro Saucedo [00:22:47]: Right.

Demetrios [00:22:48]: So we'll talk to them about making sure that gets up and running as fast as possible. So we've got some questions coming in, and I've got a few questions too. I loved this. This is such a cool talk. There's a question that came through first that is all about how you estimate what the cost of model deployment and serving in production is.

Alejandro Saucedo [00:23:13]: That's a really great question. So I would say that it's something that would depend on the complexity that you're looking to adopt.

Alejandro Saucedo [00:23:23]: Right.

Alejandro Saucedo [00:23:23]: Because when it comes to actually starting the adoption of machine learning, and particularly if it's more towards like batch type use cases to answer questions, it may be very different to if you're looking to serve a high throughput, low latency, or large scale large model that requires basically the infrastructure to back it up.

Alejandro Saucedo [00:23:47]: Right.

Alejandro Saucedo [00:23:47]: So I would say that, again, similar to the references that were made to traditional software, it would be an exercise that would go kind of like along those lines.

Alejandro Saucedo [00:23:56]: Right.

Alejandro Saucedo [00:23:56]: It's like assessing, scoping, designing. However, as this is performed multiple times, this is when you're starting to be able to introduce developer productivity capabilities or infrastructure to streamline this.

Alejandro Saucedo [00:24:09]: Right.

Alejandro Saucedo [00:24:09]: And addressing basically the bottlenecks. So I would say that it would be very similar to traditional software, but then kind of like with the nuances of ML.

Alejandro Saucedo [00:24:17]: Yeah.

Demetrios [00:24:18]: And one other piece that I know we've been asking people about in the evaluation survey that we're doing is how you evaluate if everything is going well, like if your GPUs are being used properly. Right. And so it's not only like estimating the cost, but evaluating how you're doing on those estimates and then realigning if you feel like, oh, we actually have too many GPUs right now, or. I don't think anybody ever says that, but you never know.

Alejandro Saucedo [00:24:50]: Exactly. No, absolutely. And that goes back to, I mean, we were barely able to cover about two or three minutes of it, but in the monitoring, that's where the performance metrics would play a large part. And also, setting SLOs and introducing observability. Again, battle-tested concepts in the traditional software SRE space, but really bringing them to ML and also introducing not just metrics that are traditional to software, but also ML-specific metrics. So things like even, how is the distribution of my data looking in relation to the use case, and whether I should actually be notified if this is not what is expected.
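The question of whether production data still looks like what the model was trained on can be sketched with a simple drift score. The example below uses the population stability index (PSI), which is one common choice and not a method named in the talk; the binning scheme, the `eps` smoothing, and the alert threshold of 0.25 are assumptions that would need tuning for a real use case.

```python
import math


def population_stability_index(reference, production, n_bins=10, eps=1e-6):
    """PSI between two numeric samples, using equal-width bins over `reference`."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range go to the first or last bin.
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # Floor empty bins at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref, prod = bin_shares(reference), bin_shares(production)
    return sum((p - r) * math.log(p / r) for r, p in zip(ref, prod))


def should_alert(reference, production, threshold=0.25):
    """Hypothetical alert rule: notify when the PSI exceeds `threshold`."""
    return population_stability_index(reference, production) > threshold


# Hypothetical feature values: training-time sample vs. two production windows.
reference = [i / 100 for i in range(100)]
same_window = list(reference)
shifted_window = [0.5 + i / 200 for i in range(100)]

print(should_alert(reference, same_window))     # False
print(should_alert(reference, shifted_window))  # True
```

A score like this can be exported as a metric and attached to the same alerting and SLO machinery used for latency or throughput, rather than being inspected by hand.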

Alejandro Saucedo [00:25:29]: Right.

Alejandro Saucedo [00:25:29]: And this could be, for example, even in some cases, organizations that may be looking at sales or they may be looking at sales forecasts. This is something that could be quite meaningful as well.

Demetrios [00:25:41]: So I'm wondering, you mentioned on one of the slides the team compositions and how team compositions can change over time once you start to see more models in production and you get more mature. And you also were talking about how a team isn't just like ten data scientists and you almost want these different pieces that can fit into the team and be that superpower. It's like the Power Rangers aren't all white. You got the green Power Ranger, you got the blue Power Ranger, all that. So I'm wondering if you have seen any changes in the team compositions since the introduction of LLMs. Is it more heavily weighted towards people in the product or data engineers? Do you need more of a certain prototype or persona since LLMs came on into the.

Alejandro Saucedo [00:26:35]: No, no, that's a great question. Yeah, Demetrios. So I would also emphasize that LLMs just accelerated and really re-emphasized some of the topics that were coming in slowly in the machine learning space. I have now seen a massive, massive accelerated adoption of, again, in this case, more like product team thinking. So the introduction of squads, cross-functional virtual teams that are delivering features as opposed to projects. And in terms of the composition, indeed.

Alejandro Saucedo [00:27:09]: I mean, I'm now also starting to see an increasing number of designers, like UX researchers, people that perhaps would have been more niche in the machine learning space. And you would often just talk about the machine learning engineer and maybe the MLOps engineer. Now we're starting to really see that design hat on and the creativity hat on, which I think is fantastic, because indeed, it's what actually drives innovation that marries together the cutting-edge technology as well as the domain. And specific to your first question, I would say indeed, with technologies like LLMs, you would probably see that much more emphasized than with something more related to a traditional NLP project that is just automating a document extraction, for example.

Demetrios [00:28:00]: Yeah. Or even a fraud detection model.

Alejandro Saucedo [00:28:03]: Exactly.

Demetrios [00:28:03]: Not really creative hats that are worn in that. I mean, there could be, you've just got to be really damn creative to get.

Alejandro Saucedo [00:28:12]: Yeah, I would say more like maybe like UX internal tooling as opposed to UI designer that would create. But yeah, I think those are the questions that I think everybody's also wrestling with. Really good questions as well.

Demetrios [00:28:25]: So this kind of tags on the back of that. Jonathan's asking about how easy you think it is for those of us in organizations doing MLOps, the differences in processes between LLMs coming onto the scene and then like the traditional ML.

Alejandro Saucedo [00:28:45]: That's a great question.

Alejandro Saucedo [00:28:47]: Yeah.

Alejandro Saucedo [00:28:47]: So I would say there should be no difference.

Alejandro Saucedo [00:28:50]: Right.

Alejandro Saucedo [00:28:50]: This is what I was saying in terms of the rollout of the MLDLC. Your machine learning development lifecycle framework and methodology should not see LLMs as a completely new technology. I mean, everything boils down to it being kind of part of a tool stack.

Alejandro Saucedo [00:29:10]: Right?

Alejandro Saucedo [00:29:11]: Of course. Indeed, when it comes to LLMs, you get a huge augmentation of the machine learning system architecture. But as I showed with the machine learning system for search, you would also have a highly complex set of components that would require a similar level of overhead in terms of productionization considerations. I would still say that, of course, that does mean that you cannot copy-paste your existing best practices for software development, but you should also not just drop them, right? All of those have to be mapped, one-to-many or many-to-one, and extended to cover the high-risk use cases.

Alejandro Saucedo [00:29:58]: Right.

Alejandro Saucedo [00:29:58]: Like for your compliance requirements, for your ethical assessments, for your whatever it is. That will also be industry-specific, with more or less compliance required depending on the context.

Demetrios [00:30:13]: So sometimes you can get stuck developing and putting together things to get an MVP. It is difficult to understand if something is mature enough to be deployed. How do you understand when it is time to deploy a project, to not risk things getting too old versus deploying it a little bit too early and regretting it?

Alejandro Saucedo [00:30:38]: Yeah, that's a great question as well. So I think the mindset that I've had throughout has been calculated risks.

Alejandro Saucedo [00:30:50]: Right.

Alejandro Saucedo [00:30:51]: And it's also questioning the proportionate impact.

Alejandro Saucedo [00:30:55]: Right.

Alejandro Saucedo [00:30:56]: If we're talking about a situation which would involve a high-risk consideration, i.e. impacting users in ways that could be significant, whether it's financially, livelihood-wise, et cetera, or even organizationally, potentially being a risk from a reputational perspective, reputational damage. So it would have to be a consideration of proportionate risk.

Alejandro Saucedo [00:31:31]: Right.

Alejandro Saucedo [00:31:32]: So in the context where the risk is higher, there would have to be closer alignment with the domain experts to make sure that those KPIs and SLOs are set with that in context.

Alejandro Saucedo [00:31:44]: Right.

Alejandro Saucedo [00:31:44]: But if you're dealing with something that can be sandboxed, right, sandboxed to a smaller number of users, or where the impact can be mitigated through human in the loop, that is something that you can bring into play as part of the tool set, right? There will be situations where you may ask yourself, well, all of this overhead seems like too much for this particular use case, and then it means that it's not going to be viable. The reality is that there will be some situations where advanced AI is just not relevant, right? And that can be the case in a lot of banks, for example, or even hedge funds, that are still using linear models because the explainability is just not there.

Alejandro Saucedo [00:32:25]: Right.

Alejandro Saucedo [00:32:25]: Or the compliance demands it. So there will just be contexts where indeed that is the barrier. But, yeah, the summary would be like proportionate risk.
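
The "proportionate risk" reasoning in this answer can be sketched as a small decision helper. This is only an illustrative sketch and not something shown in the talk: the `UseCase` fields, the `Rollout` options, and the order in which the checks are applied are all assumptions made here for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Rollout(Enum):
    # Hypothetical outcomes, named for this example only.
    FULL = "full rollout"
    SANDBOX = "sandboxed rollout to a small user cohort"
    HUMAN_IN_LOOP = "rollout with human review of outputs"
    HOLD = "hold"


@dataclass
class UseCase:
    name: str
    high_risk: bool      # e.g. financial, livelihood, or reputational impact
    can_sandbox: bool    # exposure can be limited to a small number of users
    human_in_loop: bool  # a person can review outputs before they take effect
    kpis_agreed: bool    # KPIs and SLOs were set together with domain experts


def rollout_decision(uc: UseCase) -> Rollout:
    """Pick a rollout mode in proportion to the risk of the use case.

    The ordering below is an assumption of this sketch, not a rule from the talk.
    """
    if not uc.high_risk:
        # Lower-risk use cases do not need the heavier process.
        return Rollout.FULL
    if not uc.kpis_agreed:
        # Higher risk calls for alignment with domain experts first.
        return Rollout.HOLD
    if uc.human_in_loop:
        return Rollout.HUMAN_IN_LOOP
    if uc.can_sandbox:
        return Rollout.SANDBOX
    # No mitigation available: the use case may simply not be viable yet.
    return Rollout.HOLD


if __name__ == "__main__":
    credit = UseCase("credit scoring", high_risk=True, can_sandbox=True,
                     human_in_loop=True, kpis_agreed=True)
    print(credit.name, "->", rollout_decision(credit).value)
```

A real process would weigh many more factors, such as compliance and explainability requirements. The point of the sketch is only that the mitigation chosen scales with the potential impact.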

Demetrios [00:32:35]: Excellent. Alejandro, my man, this has been incredible. Dude, there's so many more questions coming through in the chat for you, but we've got to keep moving. You can call me good old Phil Collins today because I'm keeping time, baby. I am keeping time.

Alejandro Saucedo [00:32:52]: There's so much good content that, yeah, I would not want to block on that. But, Demetrios, it's always a massive pleasure. It's also an honor to contribute. And like always, yeah, I'll see you on the socials.
