MLOps Community

MLOps - Design Thinking to Build ML Infra for ML and LLM Use Cases

Posted Mar 29, 2024 | Views 1.9K
# MLOps
# ML Infra
# LLM Use Cases
# Klaviyo
Amritha Arun Babu
AI/ML Product Leader @ Klaviyo

Amritha is an accomplished technology leader with over 12 years of experience spearheading product innovation and strategic initiatives at both large enterprises and rapid-growth startups. Leveraging her background in engineering, supply chain, and business, Amritha has led high-performing teams to deliver transformative solutions solving complex challenges. She has driven product road mapping, requirements analysis, system design, and launch execution for advanced platforms in domains like machine learning, logistics, and e-commerce.

Throughout her career, Amritha has been relied upon to envision the future, mobilize resources, and achieve business success through technology. She has been instrumental in helping shape product strategy across diverse sectors including retail, software, semiconductor manufacturing, and cloud services. Amritha excels at understanding diverse customer needs and leading data-driven efforts that maximize value delivery. Her passion and talents have led to her spearheading many greenfield projects taking concepts from ideation to national scale within aggressive timeframes.

With her balance of technical depth, business acumen, and bold leadership, Amritha is an invaluable asset ready to tackle dynamic challenges and capitalize on new opportunities. She is a principled, solutions-focused leader committed to empowering people, organizations, and ideas.

Abhik Choudhury
Managing Consultant Analytics @ IBM

Abhik is a Senior Analytics Managing Consultant and Data Scientist with 11 years of experience in designing and implementing scalable data solutions for organizations across various industries. Throughout his career, Abhik developed a strong understanding of AI/ML, Cloud computing, database management systems, data modeling, ETL processes, and Big Data Technologies. Abhik's expertise lies in leading cross-functional teams and collaborating with stakeholders at all levels to drive data-driven decision-making in longitudinal pharmacy and medical claims and wholesale drug distribution areas.

Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.


As machine learning (ML) and large language models (LLMs) continue permeating industries, robust ML infrastructure and operations (ML Ops) are crucial to deploying these AI systems successfully. This podcast discusses best practices for building reusable, scalable, and governable ML Ops architectures tailored to ML and LLM use cases.



Amritha Arun Babu 00:00:00: My name is Amritha Arun Babu Mysore. I am currently the ML platform leader at Klaviyo, and I like my coffee strong, like an espresso shot.

Abhik Choudhury 00:00:13: Hi, my name is Abhik Choudhury. I work as a senior managing consultant at IBM, and I love my coffee as French vanilla from Wawa. Wawa is big here in Pennsylvania.

Demetrios 00:00:30: Hello and welcome back to another MLOps Community podcast. I am your host, Demetrios, and today we got to talking about the MLOps maturity levels within businesses, going from zero to two, and how you can tie that back to business value. We haven't had a conversation like this in a while. I really appreciate both of these folks for coming on and talking to us. Amritha and Abhik gave us a bit of a one-two punch. Abhik was able to go deep into the technical side of things. I really enjoyed the tangent that we went on about how you can approach MLOps from the different personas, whether you're coming from a DevOps persona, a data engineering persona, or a data scientist persona: what you need to know, and what specific traits and skills you need to acquire, if you would like to start using machine learning in your company or start thinking a little bit deeper about MLOps from these different angles.

Demetrios 00:01:40: So if you're coming at it from data science, okay, you need to start thinking about how you can really be using source control all the time. We have the joke that data scientists don't necessarily like to use Git, and I know it's a stereotype, but sometimes stereotypes are true. So that is all I'll say. Hopefully you are not that data scientist because you're listening to this and you recognize the value of using Git. But I digress. The idea here is that Abhik was able to give us this technical angle, and Amritha gave us the view from a product manager perspective, trying to sync the needs of management or leadership with the needs of someone who is an engineer or working on the platform. What kind of metrics do you need to look at to know that your ML team is successful? Let's dive into it. Before we do, you know, it always helps if you leave a review or just go ahead and give us one of those stars on Spotify. Drop in a comment if you're on YouTube.

Demetrios 00:02:54: It means the world to me. I will see you on the other side. And of course, don't tell anybody, but we'll be doing an MLOps conference June 25 in San Francisco. It's super secret right now, but soon enough you will be hearing about it. Alright, let's go. Let's get into it. Who put in there learning how to cross-country ski? Was that you, Amritha?

Amritha Arun Babu 00:03:29: No.

Demetrios 00:03:31: Is that you, Abhik?

Abhik Choudhury 00:03:33: I'm not sure. No, I did not either.

Demetrios 00:03:37: Where did I get that from? Because we asked for fun facts about what you all.

Amritha Arun Babu 00:03:43: Oh, yes, I did. I'm sorry. Yes, I did. I wanted to learn to cross-country ski. Yes, that was me.

Demetrios 00:03:50: Yeah, I guess you haven't gotten too far if you don't even remember that you put that in three months ago.

Amritha Arun Babu 00:03:57: I did not, because we did not have enough snow. I don't know if I mentioned that I live in Boston, where it's supposed to snow a lot. This year, with climate change, we hardly had one snowstorm, so we didn't use any of our equipment.

Demetrios 00:04:13: Yeah, you just get the cold, not the snow that comes along with it.

Amritha Arun Babu 00:04:17: Yeah.

Demetrios 00:04:18: Oh, wow, wow, wow. So. All right, well, I'm excited to chat with both of you. I know that this has been something that we've wanted to get into for a while. We want to talk today all about tying the business objectives to the technical needs when it comes to MLOps and AI initiatives. Abhik, I know you can go deep on the technical side. Amritha, you're going to bring that PM perspective. Let's start with this.

Demetrios 00:04:46: Abhik, can you give us a bit of background about yourself and how you came into tech?

Abhik Choudhury 00:04:51: Yeah, sure. So I'm a senior managing consultant at IBM with over eleven years of experience. Right now I am at the intersection of data engineering and machine learning and, of course, the delivery and serving of the model. That's my current role: overseeing a team of data engineers and data scientists bringing value to multiple clients for IBM. I started my IT journey basically as a COBOL developer, and soon I was transitioning into data science, first as a data analyst, then into engineering, and more recently as a machine learning engineer.

Demetrios 00:05:31: So you got to see the full pipeline. You've had a lot of different experiences along the way.

Abhik Choudhury 00:05:36: Yeah.

Demetrios 00:05:37: And so Amritha, give us a breakdown of what you're doing. I know you're currently at Klaviyo, you were at AWS, and I would love to hear just the TLDR of your journey.

Amritha Arun Babu 00:05:49: I'm currently leading the ML platform at Klaviyo, and I have about a decade of product management experience. I started off working on supply chain and e-commerce platforms, building integrations with sellers as well as contract management and those kinds of initiatives. Over the years, I had an opportunity to transition slowly into machine learning as I built BI instrumentation and data pipelines. That was one of the ways I transitioned. And for the last three years I've been focusing on machine learning initiatives.

Demetrios 00:06:27: Excellent. So there is something, Abhik, that you said before we hit record, and I want to get right into it, which is talking about how MLOps is still very new in general. It was a buzzword. I think it got dethroned as the buzzword of the tech year. When LLMs came out, that became the new buzzword. And then LLMOps. Yeah, I think we're over the hump of the MLOps or the LLMOps craze right now. But there still is this feeling that when it comes to DevOps, that's a very mature field.

Demetrios 00:07:03: And actually, in fact, I'm at KubeCon right now. There's like 11 or 12 thousand people here. It's mayhem. And you can see that this has had many years of development under its belt. MLOps is not like that at all. We've got maybe one conference, and that might be the conference that we're about to put on that has to do with this stuff. But can you break down what you're seeing and what made you come up with this idea of, hey, it's still new and we still need to figure some things out? Where are the loose ends in your mind?

Abhik Choudhury 00:07:41: Yeah, sure. So as you rightly said, a few years back it was a buzzword. Everyone knew about DevOps, but not many people knew that MLOps is a rendition of DevOps for machine learning. People did know that something like DevOps existed for machine learning, but not many people or organizations knew how to get the best out of it. Now more and more organizations, including the clients that I work with day in and day out, have understood the value that MLOps brings to the table as far as continuous integration and continuous delivery are concerned.

Abhik Choudhury 00:08:23: And they have started reaping the benefits. They have started understanding that the whole machine learning lifecycle, the pipeline I would say, is long. To make it purposeful for day-to-day business activities, they need to shorten it and streamline it, and the best way to streamline it is to use MLOps. So, yeah, people have started realizing the true value of it.

Demetrios 00:08:49: And how do you see the lifecycle? Because I remember one thing that we often would look at were the different maturity levels, where you could say, all right, here's an example diagram of a mature way of doing continuous training or continuous integration and continuous development in machine learning. Do you feel like that has now become more standardized? Because sometimes it's like, all right, cool, we have this notion that the more automation, the more mature, but that isn't necessarily always better, especially when it comes to ML, which is, I think, a little bit where it starts to veer off the beaten path of DevOps. Right? With DevOps, if you can just not touch it ever, that's great. With ML, there's a lot of things that you should probably be looking at. So can you break down the way that you see the different maturity levels or lifecycles?

Abhik Choudhury 00:09:54: Yeah. So to your question, maturity can be broadly classified into three levels. The first is maturity level zero: they have just conceived it, they're at the first stepping stone, but they haven't done much on the continuous integration and continuous delivery part of it. The second, level one, is where they have kind of automated the continuous delivery part of it, but not much of the continuous integration part. I'll break these down into smaller pieces when I elaborate. In a nutshell, the third, level two, would be absolute automation: continuous integration, continuous delivery, everything done. That top level is what all organizations have aspired to over years of experience, but not many have reached it. Even now, many mature IT companies with very mature IT infrastructure haven't been able to reach it.

Abhik Choudhury 00:10:55: They are mostly somewhere between level zero and level one. That has been my experience with the multiple customers I've worked with. The first customer that I started working with, back in 2018 when MLOps was still kind of a buzzword, started at level zero, and now, after several years, they are somewhere in between level one and level two. There are various constraints which prohibit them from going to full level two.

Demetrios 00:11:25: Yeah, it is not easy at all. And you mentioned you can break down a little bit more of what is in each one of these. Can you go into that?

Abhik Choudhury 00:11:34: Yeah, sure. So I'll break down the various stages of the entire MLOps process. First is the analysis of the data. You get the data, and then you orchestrate an experiment. In orchestrating the experiment, you come up with the machine learning model that you are trying to work on. So first is data preparation, then model training, then model evaluation, and then model validation. This is typically what a machine learning developer or an AI developer would do in his test environment, and that's called the experiment.

Abhik Choudhury 00:12:16: That is the experimentation and development phase. Once he is done with that, he will package it into source code and put it into a code repository for versioning. Now, in level zero, you have orchestrated the experiment and you deploy it manually. There is not much scope for versioning. They are trying to version it, but they are somewhere in between doing it manually and versioning. And then deployment is also done manually. So this is level zero. If you mature to the next level, level one, this whole portion is automated. The entire orchestration of the experiment, data prep, modeling, training, evaluation, everything, is all standardized.

Abhik Choudhury 00:13:14: So let's say I, as a machine learning developer, want to use a sort of standardized template for my development. I want to ensure that the data engineering team is also on the same page, that they are also using the same format. There is a certain amount of sync between them. So when we are trying to orchestrate the pipeline, we are all on the same level, and when we are deploying it as a package, it all works in a better way. In CI/CD, I mean in DevOps, we deploy code; here, we deploy the entire pipeline.

Abhik Choudhury 00:13:56: That's what we call packaging. So that's the crucial difference between stage zero and stage one. Amritha, would you want to fill in any part of it, or do you have any questions?

Amritha Arun Babu 00:14:11: No, I agree with you. When you said template, some of the things that I'm able to relate to are like providing a Docker image, to say that here is the image that you're going to probably convert for your experimentation pipeline, and this is the image that we are going to deploy. Where this helps with respect to tying to the business value, which I think we will deep dive into eventually, is to say that this reduces time for the developers significantly. This makes sure there is less back and forth and increases collaboration. So that's the key piece that we.

Abhik Choudhury 00:14:49: Need to focus here, actually.
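
As a rough illustration of the packaging idea discussed above, the sketch below bundles data preparation, training, evaluation, and validation into one versioned pipeline object, so that the whole pipeline rather than a single script is what gets deployed. The `Pipeline` class, the stage functions, and the toy data are all hypothetical and are not taken from any tool mentioned in the conversation.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stage signature: reads the shared context, returns new entries for it.
Stage = Callable[[Dict[str, Any]], Dict[str, Any]]


@dataclass
class Pipeline:
    """Ordered, named stages that are versioned and deployed together as one unit."""
    version: str
    stages: List[Tuple[str, Stage]] = field(default_factory=list)

    def add(self, name: str, stage: Stage) -> "Pipeline":
        self.stages.append((name, stage))
        return self

    def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
        for name, stage in self.stages:
            context = {**context, **stage(context)}
            context["completed"] = context.get("completed", []) + [name]
        return context


def prepare_data(ctx):
    rows = [row for row in ctx["raw"] if row is not None]  # drop missing rows
    return {"train": rows[:-2], "holdout": rows[-2:]}      # hold out the last two rows


def train_model(ctx):
    targets = [y for _, y in ctx["train"]]
    return {"model": sum(targets) / len(targets)}  # toy "model": always predict the mean


def evaluate_model(ctx):
    errors = [abs(y - ctx["model"]) for _, y in ctx["holdout"]]
    return {"mae": sum(errors) / len(errors)}  # mean absolute error on the holdout rows


def validate_model(ctx):
    return {"approved": ctx["mae"] <= ctx["max_mae"]}  # gate before deployment


pipeline = (
    Pipeline(version="0.1.0")
    .add("prepare_data", prepare_data)
    .add("train_model", train_model)
    .add("evaluate_model", evaluate_model)
    .add("validate_model", validate_model)
)

result = pipeline.run({
    "raw": [(1, 10.0), None, (2, 12.0), (3, 11.0), (4, 10.0), (5, 12.0)],
    "max_mae": 1.5,
})
print(result["completed"], result["mae"], result["approved"])
```

Because every team runs the same ordered stages under one version string, a data engineer and a data scientist can hand the same artifact back and forth, which is one way to read the "standardized template" point.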

Demetrios 00:14:51: Yeah. Amritha, at the risk of totally derailing what Abhik was saying, I want to get back to the key differences between level one and level two. But I think it would probably be a great point here if you can mention what some of the key metrics are that you're looking at as you're going through these different maturity stages. Because it can't just be tied to "we have more automation," right? Or "we have the ability to do XYZ." I've heard the DoorDash team talk about how one thing that they're looking at is velocity to deployment, so how fast you can get something out is one key metric that they look at. What are some metrics that you've found really help paint the picture of whether your team is getting better?

Amritha Arun Babu 00:15:41: So a couple of things. I agree with what the DoorDash team and you mentioned, because at least the North Star vision for an ML platform is to make sure that you as a platform can enable scientists to get their idea from the ideation phase to production in as short a time as possible. That's the North Star metric. Underneath that, there are various ways to think about this, especially in the case of the zero-to-one stage Abhik mentioned. It's not that we are automating everything, right? Let's say our goalpost is only to bring standardization and make deployment seamless; we are still not focusing on continuous integration. With that as the goal, bringing standardization would mean asking: How am I streamlining and reducing the time in the data exploration stage? How am I providing standardized libraries or SDKs in the training phase? How will that reduce the time for scientists? Are we providing any data quality insights so that scientists can see which data set has the highest quality? These are some of the ways that we should think about how to reduce time, and how that ties to our North Star metric.
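
One simple way to put a number on the ideation-to-production goal described here is a median lead time across projects. The sketch below assumes a team records two dates per project; the record layout, project names, and dates are invented for illustration.

```python
from datetime import date
from statistics import median

# Hypothetical records: when an idea was logged and when its model reached production.
projects = [
    {"name": "churn", "ideated": date(2024, 1, 8), "in_production": date(2024, 3, 4)},
    {"name": "segmentation", "ideated": date(2024, 1, 15), "in_production": date(2024, 2, 26)},
    {"name": "ranking", "ideated": date(2024, 2, 1), "in_production": None},  # not shipped yet
]


def lead_times_days(records):
    """Days from ideation to production for every project that has shipped."""
    return [
        (r["in_production"] - r["ideated"]).days
        for r in records
        if r["in_production"] is not None
    ]


def median_lead_time_days(records):
    """Median lead time in days, or None when nothing has shipped yet."""
    times = lead_times_days(records)
    return median(times) if times else None


print(median_lead_time_days(projects))
```

Tracking the same figure per stage (data exploration, training, deployment) would show where standardization is actually saving time.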

Abhik Choudhury 00:17:04: Yeah, yeah. And to add to that, I want to cite another example from my experience. One of the customers, who is big in wholesale distribution, wanted to do customer segmentation. They introduced the entire machine learning pipeline, all the way from data acquisition to serving the model. When they were in level zero, they were not much into automation, and they saw that the entire cycle took several months. They were just cool with that. But then one fine quarter they realized that the model they were serving was outdated because the behavior had changed.

Abhik Choudhury 00:17:57: And also the underlying data has several factors. Several biases have come up which need to be incorporated into the model. So they wanted to make sure that they reduced the cycle and increased automation. That was one thing that prompted them to adopt a second level of automation.

Demetrios 00:18:18: Yeah, that is a great point. You can't always get that ground truth, though, right? Because if you're looking at a fraud detection model or, better yet, a loan scoring model, and you give someone a loan and then they default on the loan two years later, it is very hard to get that ground truth back into the model and have it continuously updating in that regard. So it's a little bit harder for some of these use cases. And as you're going through this, it almost feels like, depending on your use case, you're going to be looking at these different levels in certain ways. Pretty obviously, there are going to be different metrics that are going to be more important and more valuable to you. But I don't want to dive down that rabbit hole yet because, Abhik, I know you still have the step from one to two. So talk us through what the differences are from semi-automated to basically fully automated, I think.

Abhik Choudhury 00:19:28: Yeah. So fully automated is where not only the integration but also the serving is fully automated. I'll try to tie all those pieces together with an example. Let's say we have a pipeline in the development phase which includes data intake, preparation, model training, evaluation, and validation, everything in a single pipeline as a single package. And as Amritha said previously, we have orchestration images using Docker or Kubernetes that help deploy. So when we have orchestration automated, we are into the second level of maturity, I mean the first level of maturity. And then we have the package deployed and we register it in a model registry. There are two advantages to this.

Abhik Choudhury 00:20:24: First, it helps keep all the teams in sync, and second, it helps with versioning of the model. Let's say today the model is working correctly; tomorrow it may not. If the business requirement is that I have to go back to the first model, it's just a matter of going to a different version and deploying that. It's that easy, and it gives the flexibility to change with business requirements. Then you train the model and you serve the model, and the predictions go to the end users. The true level of maturity that comes with the second level is that you can use continuous monitoring. So now the models are in autopilot mode and they are being served.
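
The versioning and rollback point can be sketched with a toy in-memory registry. This is only an illustration under simple assumptions: the `ModelRegistry` class and its methods are hypothetical and do not mirror the API of any particular registry product.

```python
from typing import Any, Dict, List, Optional


class ModelRegistry:
    """Toy in-memory registry: keeps every version and tracks which one is served."""

    def __init__(self) -> None:
        self._versions: Dict[int, Any] = {}
        self._serving: Optional[int] = None

    def register(self, model: Any) -> int:
        version = len(self._versions) + 1  # versions are never deleted, so this stays unique
        self._versions[version] = model
        return version

    def promote(self, version: int) -> None:
        if version not in self._versions:
            raise KeyError(f"unknown model version: {version}")
        self._serving = version

    def versions(self) -> List[int]:
        return sorted(self._versions)

    def serving(self) -> Any:
        if self._serving is None:
            raise LookupError("no model has been promoted yet")
        return self._versions[self._serving]


registry = ModelRegistry()
v1 = registry.register({"name": "segmentation", "trained_on": "Q1 data"})
v2 = registry.register({"name": "segmentation", "trained_on": "Q2 data"})

registry.promote(v2)  # serve the newest model
registry.promote(v1)  # rollback: point serving at the earlier version again
print(registry.serving()["trained_on"])
```

Rolling back is just another `promote` call because old versions are kept rather than overwritten, which is the flexibility being described.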

Abhik Choudhury 00:21:13: Now there are two ways you can prompt an update. First, say I'm pretty sure that I want to update the model every three or four days with new hyperparameters, new updates of the model and everything, before the performance of the model itself even degrades. I can set up an automation that changes these parameters right in the pipeline itself, so that new models are served continuously and the business gets very accurate predictions. That's one scenario. The second scenario is that, the moment data is served to the model, I see that there may be different triggers, like the skewness of the data changing or various discrepancies coming up within the data. I want to make these changes immediately to my model based on some triggers from the data. Are you able to incorporate those things as well? So it is this sort of continuous monitoring of the data and the model and various KPIs and metrics, and then taking appropriate action as an iterative loop.

Abhik Choudhury 00:22:28: This is the ability that a completely automated model gives us, and everybody aspires to it. So that's it in a nutshell.
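
The two retraining scenarios described in this answer, a fixed schedule and a trigger from the data, can be sketched as a single check. The three-day interval follows the "every three or four days" example; the drift test here is a deliberately simple shift-in-mean check standing in for whatever skew or discrepancy measure a team actually uses, and the function names and threshold are assumptions.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def schedule_due(last_trained, now, every=timedelta(days=3)):
    """Scenario one: retrain on a fixed cadence, before performance degrades."""
    return now - last_trained >= every


def mean_shift(reference, live):
    """How far the live mean has moved from the reference mean, in reference std devs."""
    spread = pstdev(reference)
    if spread == 0:
        return 0.0 if mean(live) == mean(reference) else float("inf")
    return abs(mean(live) - mean(reference)) / spread


def retrain_reasons(last_trained, now, reference, live, max_shift=2.0):
    """Return the list of triggers that fired; an empty list means keep serving."""
    reasons = []
    if schedule_due(last_trained, now):
        reasons.append("schedule")
    if mean_shift(reference, live) > max_shift:  # scenario two: the data itself changed
        reasons.append("data_drift")
    return reasons


reference = [10, 12, 11, 9, 13]  # feature values seen at training time
print(retrain_reasons(datetime(2024, 3, 1), datetime(2024, 3, 2), reference, [18, 19, 20]))
print(retrain_reasons(datetime(2024, 3, 1), datetime(2024, 3, 5), reference, [10, 11, 12]))
```

In a fully automated setup, a non-empty result would kick off the pipeline and register a new model version; monitoring then repeats the check, which gives the iterative loop.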

Demetrios 00:22:39: It's like that kind of pie in the sky, like, yeah, wouldn't it be great when we can get all of that going and it's just automatically retraining and automatically updating? It does feel like what you were just talking about very much applies if we're doing what some people would call traditional machine learning, classical machine learning, right? How does this change when we start adding LLMs to the mix? Because some of the stuff that you're mentioning, like rollback, being able to just roll back your model, probably works more or less the same when it comes to a fine-tuned LLM. If you grab something like Mistral or Llama 2 off the shelf and you fine-tune it, but then the new fine-tune isn't quite as good as the last one, you can just roll back. But I imagine there are some things that you need to be aware of when you're looking at traditional or classical machine learning versus the new generative AI.

Amritha Arun Babu 00:23:45: Yeah, at least in my opinion, right, from what I have seen in a couple of places where we built LLMs: LLMs require massive data sets. So again, deploying LLMs through a maturity-level-zero MLOps setup is different from what it looks like at maturity level two. I haven't seen it at level two, so I wouldn't be able to speak more to that. But what I've at least seen in the zero and one stages is the importance of being able to have access to this massive data, whether internal or external: having those data pipelines set up, streamlining those access initiatives, and having a data catalog that can provide you insights into the various sources that exist, whether internal or external, for your particular use case. How easy is it to get? What is the lineage, what is the schema, and all the good stuff.

Amritha Arun Babu 00:24:45: And being able to sample this data and use it on a continuous basis is what differentiates level zero from level one. In level zero, you would just be able to get one batch of data. In level one, I would expect my scientists to be able to get all this data in a continuously refreshed manner.
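
To make the data catalog idea concrete, here is a minimal sketch of a catalog that answers the three questions raised above: which sources exist (internal or external), what their schema is, and what their lineage is. The class names, dataset names, and tags are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DatasetEntry:
    name: str
    origin: str                     # "internal" or "external"
    schema: Dict[str, str]          # column name -> type
    upstream: Tuple[str, ...] = ()  # datasets this one is derived from (lineage)
    tags: Tuple[str, ...] = ()


class DataCatalog:
    def __init__(self) -> None:
        self._entries: Dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        self._entries[entry.name] = entry

    def find(self, tag: str, origin: Optional[str] = None) -> List[str]:
        """Names of datasets carrying a tag, optionally restricted by origin."""
        return sorted(
            e.name for e in self._entries.values()
            if tag in e.tags and (origin is None or e.origin == origin)
        )

    def schema(self, name: str) -> Dict[str, str]:
        return dict(self._entries[name].schema)

    def lineage(self, name: str) -> List[str]:
        """All ancestors of a dataset, nearest first, without duplicates."""
        seen: List[str] = []
        queue = list(self._entries[name].upstream)
        while queue:
            parent = queue.pop(0)
            if parent in seen:
                continue
            seen.append(parent)
            if parent in self._entries:
                queue.extend(self._entries[parent].upstream)
        return seen


catalog = DataCatalog()
catalog.register(DatasetEntry("support_tickets", "internal", {"id": "int", "body": "str"}, tags=("text",)))
catalog.register(DatasetEntry("public_reviews", "external", {"url": "str", "body": "str"}, tags=("text",)))
catalog.register(DatasetEntry("cleaned_tickets", "internal", {"id": "int", "body": "str"},
                              upstream=("support_tickets",), tags=("text", "training")))
catalog.register(DatasetEntry("llm_corpus", "internal", {"body": "str"},
                              upstream=("cleaned_tickets", "public_reviews"), tags=("training",)))

print(catalog.find("text", origin="external"))
print(catalog.lineage("llm_corpus"))
```

A continuously refreshed setup would keep these entries up to date as pipelines run, rather than documenting a single one-off batch.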

Abhik Choudhury 00:25:04: And to add to that, another thing that I have observed over the last few months, with people trying to incorporate LLMs into their MLOps scenario, is that they are wary about the GPU costs. For instance, I'm talking about GPT-4, which our client has been using of late. They have been using it for various experiments, but not as a full-scale MLOps deployment. The GPU costs can snowball at times whenever the usage is pretty high. So they are really wary, or rather very cautious, about going to complete automation. That's another thing which I experienced.
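
One way teams sometimes keep automated runs from snowballing in cost is a budget gate in front of each job. The sketch below is a hypothetical guard, not something described by the speakers: the class, the limit, and the cost figures are made up for illustration.

```python
class BudgetGuard:
    """Tracks spend against a limit and says whether another automated run fits."""

    def __init__(self, monthly_limit: float) -> None:
        if monthly_limit < 0:
            raise ValueError("monthly_limit must be non-negative")
        self.monthly_limit = monthly_limit
        self.spent = 0.0

    def record(self, cost: float) -> None:
        if cost < 0:
            raise ValueError("cost must be non-negative")
        self.spent += cost

    def allows(self, estimated_cost: float) -> bool:
        return self.spent + estimated_cost <= self.monthly_limit


def run_if_affordable(guard: BudgetGuard, estimated_cost: float, job) -> str:
    """Run the job only when its estimated cost still fits inside the budget."""
    if not guard.allows(estimated_cost):
        return "skipped: over budget"
    job()
    guard.record(estimated_cost)
    return "ran"


guard = BudgetGuard(monthly_limit=1000.0)
guard.record(900.0)                                      # spend so far this month
print(run_if_affordable(guard, 50.0, lambda: None))      # fits within the limit
print(run_if_affordable(guard, 150.0, lambda: None))     # would exceed the limit
```

A skipped run could fall back to a manual approval step, which keeps experimentation open while limiting what full automation can spend.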

Demetrios 00:25:54: It's fascinating to me because it does feel like, for a lot of the ways that ML is out there in companies making or saving them money, it really depends on what use case you have. A lot of the time, the most mature ways that companies are using machine learning are traditional ML. It's very proven because it is a little bit older, a little bit more mature. You can prove out that, hey, this is what we're doing, this is how much money we're saving. We have these metrics, and we know that it's working. With LLMs, it's a little bit more exploratory, and we're still trying to figure out what a mature system looks like in this regard and what the use cases are that we can incorporate into our product or into our company. Maybe it's just a classic RAG chatbot that lets different people in the company quickly get up to speed on projects, whatever it may be. The requirements there are drastically different than if you're trying to figure out, going back to the fraud detection model or the loan scoring model, where you're not going to use an LLM ever, and you do need this very mature way of updating the model, continuously retraining it, and being able to roll back as easily as possible. You want to be super mature there.

Demetrios 00:27:29: And with the LLM, maybe it's just like, hey, we make a few GPT calls, we've got some prompts we like, and that's kind of good enough for where we're at right now. We see it helping and we're trying to operationalize it, but we don't necessarily need to do much more on it. So I think the dirty little secret, and I would love for people who are listening to hit me up and tell me about it too, the dirty little secret that I've been seeing around, and I imagine you all have seen it too, is that a lot of the money that is being made with machine learning and AI right now is still traditional AI or traditional ML, these classical models, as people call them. That's where a ton of cash is either being saved or being made. LLMs have taken the hype, so everybody's talking about AI, and you've got just random people on the street saying how it's gonna take over the world. But if you look at it, statistical machine learning is still making the cash. And that's the dirty little secret that I think you all have seen too.

Demetrios 00:28:38: Changing gears, though, I do want to talk a little bit about how, when you are putting together these systems and you're thinking about the metrics, Amritha, design thinking is very popular in the product management space. How can we bring that type of ideology into the development of our pipelines, our maturity, our customizable and modular MLOps stacks?

Amritha Arun Babu 00:29:12: Yeah. One of the ways that I have seen this work and be successful is always working backwards from the customer, for MLOps or any product that you take: identifying who your primary customer is, who your secondary customer is, sort of the actors in the ecosystem, and working backwards from them to understand their use cases and their pain points. What does this mean? Let me take an example. In building an ML platform over the last three years, my primary customer is data scientists. There is a secondary customer, who is an ML engineer. There's a third customer, who is a BI engineer or any of these analytics roles, right? So these are my three customers. Once I identify this, I usually work back from understanding their pain points. What is a data scientist going to do? To your point, okay, they're building a fraud detection model.

Amritha Arun Babu 00:30:16: Great. They're also building a recommendation app that surfaces recommendation nuggets to customers in the UI. Or, as you converse with a particular web service, it might be giving you recommendations.

Demetrios 00:30:33: All right, real quick, let's talk for a minute about the sponsor of this episode, making it all happen: LatticeFlow AI. Are you grappling with stagnant model performance? Gartner reveals a staggering statistic that 85% of models never make it into production. Why? Well, reasons can include poor data quality, labeling issues, overfitting, underfitting, and more. But the real challenge lies in uncovering blind spots that lurk around until models hit production. Even with an impressive aggregate performance of 90%, models can plateau. Sadly, many companies prioritize model performance for perfect scenarios while leaving safety as an afterthought.

Demetrios 00:31:17: Introducing LatticeFlow AI, the pioneer in delivering robust and reliable AI models at scale. They are here to help you mitigate these risks head on during the AI development stage, preventing any unwanted surprises in the real world. Their platform empowers your data scientists and ML engineers to systematically pinpoint and rectify data and model errors, enhancing predictive performance at scale. With LatticeFlow AI, you can accelerate time to production with reliable and trustworthy models at scale. Don't let your models stall. Visit latticeflow.ai and book a call with the folks over there right now. Let them know you heard about it from the MLOps Community podcast. Let's get back into the show.

Amritha Arun Babu 00:32:04: So in order to build these, what are the pain points that they encounter across the entire lifecycle that Abhik just walked you through? Right. So in data exploration, in training, in fine-tuning or evaluating, and then in releasing the model, what are the pain points they encounter? And how can I tie this together? Let's say you encounter a data exploration pain point, or let's say it's very hard for you to retrain the model. With all these aspects you are always asking, hey, how can I make it easy for you? What does it mean for you to retrain? Let's say today you're not able to retrain. How is that impacting our ability to be an AI leader? Let's say, because you're not able to retrain, we are giving slightly different recommendations from what could be very accurate, and that leads to your models not being accurate, not being precise. So those are some of the ways that we need to apply design thinking. To summarize: work backward from customers, identify all the touch points across the various phases of your life cycle, and map them to your business objective to say, what is the impact if this issue persists?

Demetrios 00:33:30: And when you map it back to these metrics and the ways to be able to say what the impact is, do you have clear ways of being able to take it like, okay, this is not in Python and the data scientist needs it to be in Python, so them having to learn YAML or something is going to cost us x amount of money? I can only imagine that the best way of putting it into a language that everyone in the company is going to understand is always going to be how much money this is making us or costing us, that type of thing. So how can you translate not knowing Python, or not having it be in Python, into actual dollar amounts?

Amritha Arun Babu 00:34:15: Yeah, that's a very good question. One of the situations where I faced this: scientists are often very good at working as a community, and they're very comfortable with Python. There are scenarios where, when they're working with, let's say, petabyte-scale data or larger and distributed compute, they have to work on Spark and managed services. They do not want to work on Spark because it has a steep learning curve and it comes with its own baggage. And that takes their attention away from innovating and focusing on building the model, because they need to learn a new technology. Right? So here, the way you can translate this pain point, or time lost, is to say that the data scientists have developed x skills that they are comfortable in, and they need to build y skill. Let's say you have ten data scientists who need to build y skill and it takes 20 hours for them, or, you know, x hours over the weekend. How can you translate that? Let's say your organization has like 3000 scientists. If this is how the managed service for training and model build that you're providing works, then that just makes it an even bigger problem.

Demetrios 00:35:31: So translating it into hours and then being able to extrapolate out like, well, we're paying these people x amount a year and so every hour is worth this much. That means that if they need to spend 20 hours learning at least the basics of Spark, and this is not, they're not going to be professionals with Spark in 20 hours. It's going to take them a lot longer than 20 hours. And so this is just to get to the basic level. And then there's those unknowns that later on you have a data scientist who thinks they understand Spark and they end up doing something that costs the company a lot of money. And so that is a little bit of a, another piece of the puzzle that I'm sure it's like a risk that you could map out.
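
The hours-to-dollars translation described above can be sketched in a few lines of Python. This is only an illustration: the headcounts (10 and 3000) and the 20-hour figure come from the conversation, while the salary, the 2080 working hours per year, and the function name `learning_cost` are assumptions made up for the example.

```python
def learning_cost(num_people: int, hours_each: float, annual_salary: float,
                  working_hours_per_year: float = 2080.0) -> float:
    """Value the time spent learning a tool at each person's hourly rate."""
    hourly_rate = annual_salary / working_hours_per_year
    return num_people * hours_each * hourly_rate


# Hypothetical salary, chosen only so the hourly rate is a round 75 dollars.
ASSUMED_SALARY = 156_000.0

# Headcounts and the 20 hours of Spark basics are the numbers from the discussion.
team_cost = learning_cost(num_people=10, hours_each=20, annual_salary=ASSUMED_SALARY)
org_cost = learning_cost(num_people=3000, hours_each=20, annual_salary=ASSUMED_SALARY)

print(f"10 data scientists:   ${team_cost:,.0f}")
print(f"3000 data scientists: ${org_cost:,.0f}")
```

The figure covers only the time to reach a basic level; the risk of an expensive mistake by someone who only thinks they understand Spark would have to be estimated separately.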

Amritha Arun Babu 00:36:17: Yes. Yeah. Changing priorities, organization priorities. There are various factors that add to it, because all said and done, if all variables are constant and you only have to learn Spark, that's one way to look at the problem. But real life and organizations are not like that.

Demetrios 00:36:33: It's a little more messy. Yeah, yeah, I could see that. So one thing that I wanted to get into also was around the different ways that people come into MLOps. And Abhik, I know that we had mentioned this before we hit record, which is that the user journey of MLOps requires a certain amount of understanding. We broke down the maturity levels, we broke down the pipelines, or the life cycles, better said. And what you have is that a lot of people come into the field out of necessity, like a data scientist who now needs to deploy their model, or needs to start thinking about how to move up the maturity ladder, or you have someone from the DevOps field who is saying, okay, now I need to make sure that our ML platform is reliable. How much ML do I need to know versus the base of DevOps that I already have? So can you maybe give us a summary of, if you're coming from these different areas, what are some things that you might want to focus on? And I realize now, after talking to you two, that it's not just, I'm coming from DevOps and I want to learn MLOps, because that is very broad. And so that's probably not going to be that helpful.

Demetrios 00:38:06: It's like, I'm coming from DevOps and I want to learn recommender systems and I want to learn loan scoring systems, or I want to help my team who is working on churn prediction models. And so there are the use cases that you have to look at. Or, I'm coming from data science, and I am now going to be working with the ML platform team. What do I need to know? How much DevOps should I be thinking about? Or, I'm a data engineer, I understand pipelines really well, but now I need to be an ML engineer, or I need to work with the ML engineers. I know there's a big surface area there, but I get the feeling you have some insights into the journeys of each one of these personas.

Abhik Choudhury 00:38:57: Yeah, sure. So broadly, and you rightly covered this, there would be three personas: one with the data engineering flavor, one with the data science or machine learning framework, and one with the deployment, CI/CD, or DevOps flavor. People with different backgrounds join this entire effort, all the way from data acquisition to model serving and the entire iteration in between. And it depends on the organization how they want these three silos, I would say, to be demarcated from each other, or rather intermingled with each other. So again, it depends on the organization and the culture. But broadly, I have seen that the person who is coming from a data engineering background is mainly at the beginning of it, where he is good at acquiring the data from multiple sources and at the entire transformation part of it, where he uses his engineering skills for the exploratory data analysis and the transformation. What he needs to pick up in this new ecosystem is some degree of understanding of various models, not too deep, but some understanding of what format the data scientist would need the data in.

Abhik Choudhury 00:40:21: So that understanding is required. As far as the data scientist is concerned, most of his involvement would be during the experimentation and development phase, and once he has deployed the model, he also needs to make sure that he has some amount of visibility into the deployment part of it from the experimentation phase. So for instance: how do I need to parameterize, how do I need to work with the code repository, how do I need to work with versioning? And we talked about YAML files. You need to work with YAML files, not just the actual Jupyter notebook. So these are the new things that he needs to pick up. As far as someone from a DevOps background is concerned, when he is in the MLOps part, one thing is that it is a very iterative process, particularly for an MLOps professional or a machine learning engineering professional. So somebody who is not too much into data science and knows only DevOps concepts needs to pick up some basics of hyperparameter tuning: what are the things I need to take care of? Let's say this is the change in the data that I see. What are the things that I need to change that will make my model work the way I want? So he needs to be aware of hyperparameter tuning and these concepts as a DevOps professional. So these three silos have their work cut out, but still need to know something extra to work with each other.

Demetrios 00:42:06: Yeah, that's great. And Amritha, from the product manager perspective, for someone coming from a more generalized product management background and then jumping into the more technical side of ML and ML products, or just MLOps in general, or even the ML platform, what are the things that person would need to know?

Amritha Arun Babu 00:42:33: Yeah, I've been thinking about this, as I keep getting asked this question. So, where I had steep learning curves was in understanding the data science concepts, because I came from a world of, you know, building all the UI pieces and designing the UI. So when I transitioned to this field, it was understanding the data science concepts. What are those terminologies? What do they mean in different areas? What does that translate into? Let's say someone tells you that they are building a continuously refreshed pipeline for real-time data versus a batch one. What does that mean? What does that translate to in terms of code? I come from an engineering background, and even for me it was like, oh, how does this look in the code? Is it different? Where is the difference coming from? You know? So those sorts of things are where I took a lot of time. And then there was another piece of it. When I worked on building models, I worked on building PII models. So there, it was understanding what kind of model architecture comes into play when someone tells you that they are, let's say, not training a model with a public dataset. "Oh, we don't need a ground-truth dataset." Why does this model not need ground truth? Understanding those different aspects helped me build better relationships with scientists and ask them better questions.
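
As a rough answer to "how does this look in the code", the sketch below contrasts a batch computation with a continuously refreshed one, using a running average as a stand-in for whatever the pipeline actually computes. It is a toy example under stated assumptions: the function names and the sample values are hypothetical, and a real streaming pipeline would read from a queue or stream rather than a Python list.

```python
from typing import Iterable, Iterator


def batch_mean(values: list[float]) -> float:
    """Batch: all records are available up front, so the result is computed once."""
    return sum(values) / len(values)


def streaming_mean(values: Iterable[float]) -> Iterator[float]:
    """Streaming: keep running state and emit a refreshed result per record."""
    count = 0
    total = 0.0
    for value in values:
        count += 1
        total += value
        yield total / count


# Hypothetical order values standing in for incoming events.
order_values = [10.0, 20.0, 30.0, 40.0]

batch_result = batch_mean(order_values)
running_results = list(streaming_mean(iter(order_values)))

print(batch_result)
print(running_results)
```

The difference shows up in where the state lives: the batch version needs the whole dataset before it can answer, while the streaming version carries `count` and `total` forward and always has a current answer.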

Demetrios 00:44:11: Yeah. So, trying to figure out what those constraints are and really why there are these constraints or there aren't these constraints.

Amritha Arun Babu 00:44:20: I don't look at it as constraints. Rather, I look at it as: what you have been focusing on previously is drastically different from what you're focusing on now. At least this was the situation a couple of years ago when I transitioned into it. Today, everyone's talking about AI, everyone's talking about ML. If you're riding the wave of this trend, you're quite aware of these concepts, these terminologies, and the buzzwords, right? Which you can teach yourself. But again, these are the concepts that you need to teach yourself if you want to learn. That's how I would put it.

Demetrios 00:44:58: Yeah, yeah. Another piece to this that I think is fascinating too is to be able to distinguish what the unique factors of data pipelines are that you need to be thinking about, and what happens with data that doesn't necessarily happen with code. And then what are the unique factors of code and the coding pipelines, the CI/CD pipelines, basically, is what I'm trying to say in too many words. Knowing those two, how they look, and what is different when it comes to data versus code, that is quite useful.

Amritha Arun Babu 00:45:41: Yeah.

Demetrios 00:45:43: So the other piece of this, which I think is worth us checking out and maybe talking about for a moment, is around someone who is coming from a front-end or full-stack development background. And now they've caught the bug because they are like, oh, you know what? There's this new term that is now very popular called the AI engineer. I can prompt, I've been able to stand up a website, I can see that people enjoy what I have created. Now I need to make my product a little bit more bulletproof. And so I am rising as this AI engineer. I need to learn about prompting, prompt templates, fine-tuning, RAG and the ways that RAG can fall over, and the different tools that I'm going to be using in the LLM ecosystem. But what else should that person be thinking about when they want to start using AI for their products and really start going, not from zero to one, because I think that's where you really can see the new LLMs shine,

Demetrios 00:47:02: But more from that one to n.

Amritha Arun Babu 00:47:04: Is your question, let's say you have a product, or let's say you're an e-commerce company selling products: is the question, as an engineer or for the product itself, how do you scale from using an LLM for foundational features to n? Or, yeah, what are you thinking specifically?

Demetrios 00:47:29: I was thinking more about from the engineer's perspective. But I do also like that idea because I have, we have had a few people on here where they've come on and they said, look, we were building traditional models, we were doing classical NLP, and then we were able to just hit OpenAI's API. And all that work we did for the last six months or year, it all went out the window in one API call because it was way better at some sentiment analysis or some keyword. It was just infinitely better than anything that we had made over time. But now we're in that phase where we're like okay, cool, we are using the API and maybe that's what we want to do, we have the MVP or we have something that's actually working and then scaling it from that one to n. But let's start with the first part and then we can go maybe a little bit down that rabbit hole of when people are using it and scaling their product features with LLM. So as an engineer, what do I need to know about and how do I need to get into the space of being more of an AI engineer and less of a full stack dev?

Abhik Choudhury 00:48:47: So for getting into being an AI engineer, some of the things that you need to be wary about, and I have seen that these are the things that fall apart very easily. One is scalability. You need to be really wary about scalability, because data multiplies, and particularly if there is a lot of seasonality and trend, it can easily overshoot what you expected and what your pipelines are capable of handling. So from an engineering perspective, you need to have those things sorted first, even before you start optimizing your pipelines. So the first piece is how to scale your data. If you're in a cloud environment, you have to make sure your load balancers and all those things are in place, and that you are able to hold that data up front and correctly. Second is your model: whenever you are scaling it, you need to be sure that you have considered all the biases underneath.

Abhik Choudhury 00:49:54: And if outliers may come into the picture, how are you going to deal with them? And from an ML engineer's perspective, how are you going to serve the model? Serving the model is a very important aspect, because in a development environment you are not really serving. If you are serving, and the data increases and your performance changes, you need to be very careful about whether the serving meets the criteria of the business. So these three are the aspects that you need to be wary about. And fourth, and foremost, is the compliance aspect of it. From the development perspective, we quite often overlook the compliance and privacy part of it. With our own model, we do make sure that we have various compliance checks, that we are not using these fields for these sorts of predictions.

Abhik Choudhury 00:50:50: However, if we are using a third-party model, or maybe an LLM with just an API call, we may not be very sure how they are handling compliance and whether they are in accordance with our company policy. So we need to do the due diligence on compliance even with third parties.

Demetrios 00:51:10: Oh, my God. I didn't even think about that. That is such a great point. And a lot of times, it's not so easy to figure that out. Maybe it's hidden behind the terms of service that's 200 pages long or it just isn't around, and you can, like, good luck trying to figure that out. And if it matches up with your company's. Yeah, that's a fascinating one.

Amritha Arun Babu 00:51:37: There's a whole problem space that exists out there to determine this compliance piece of it, translating into: what is the particular region or country, what are their government policies, their compliance rules? How are they catching up? How are these companies adapting? And how does it all translate to you as the final adopter? You know, it's sort of like a chain. How does it all fit together? Have we figured it out? Because everyone's at a different pace here, right?

Demetrios 00:52:08: Yeah. A perfect example of that is a story I refer to quite often in the MLOps community, and this was before the MLOps community started. I was on a call with one of the now MLOps community members, Jeremy, and he was saying how he was working at a health tech startup, and he had data in Canada, and that data could not leave Canada. There was no way that data was going anywhere outside of Canada, because the Canadian laws do not allow that. So even though the company was global and they had a lot of data scientists in the UK or in the US or in India, Australia, you name it, nobody outside of Canada could touch that data or add it to their models. And you think about such a simple example: I want to build a RAG chatbot that anybody can query to learn about our data and what's going on with it. How can you ensure that that chatbot is not going to touch the Canadian data unless you're inside of Canada, right? Or maybe you have to build two separate chatbots, or something like that.

Demetrios 00:53:21: But unless you're thinking about that problem, you probably wouldn't even add it. You would just say, like, oh, cool, I've got my RAG. We're good. Like, I just added a ton of value to the company. Where's my raise, type thing? And then you realize, oh, I may have broken some laws. Let's hope nobody finds out about that.
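
One way to picture the Canadian-data constraint in a RAG chatbot is to filter documents by residency before any retrieval or ranking happens, so restricted text never reaches the prompt. The sketch below is a minimal illustration, not a compliance solution: the `Document` fields, the region codes, the sample documents, and the keyword-overlap scoring are all hypothetical, and a real system would apply the same filter as a metadata constraint in its vector store.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    residency: Optional[str]  # region the data must stay in; None means unrestricted


def allowed_documents(docs: list[Document], user_region: str) -> list[Document]:
    """Keep only documents the caller's region is permitted to see."""
    return [d for d in docs if d.residency is None or d.residency == user_region]


def retrieve(query: str, docs: list[Document], user_region: str,
             top_k: int = 3) -> list[Document]:
    """Toy keyword retrieval that applies the residency filter before ranking."""
    candidates = allowed_documents(docs, user_region)
    terms = set(query.lower().split())
    scored = [(len(terms & set(d.text.lower().split())), d) for d in candidates]
    scored = [(score, d) for score, d in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]


# Hypothetical corpus: one Canada-restricted record and one unrestricted guide.
corpus = [
    Document("ca-1", "patient readmission rates by clinic", "CA"),
    Document("g-1", "global patient onboarding guide", None),
]

uk_hits = [d.doc_id for d in retrieve("patient readmission", corpus, user_region="UK")]
ca_hits = [d.doc_id for d in retrieve("patient readmission", corpus, user_region="CA")]
print(uk_hits, ca_hits)
```

Filtering before ranking matters here: if the restricted document were retrieved first and dropped afterwards, it would already have been processed outside its region.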

Abhik Choudhury 00:53:39: Yeah, that's a very good point, because as long as you are in the experimentation phase, you, as a lone wolf, a lone data scientist, can do whatever you want. But as soon as you put it into MLOps, you have to bring in your chief data architect or someone who has the bigger picture, and you have to make sure that in your entire MLOps landscape you address these things even before going to level zero, let's say. So these are the things that need to be taken care of.

Amritha Arun Babu 00:54:11: Just to add to that, one of the ways you can operationalize this, whether you're a small, a mid-size, or a big company, is having security reviews to check how compliant your product is, given that everything is now quite global. It's no longer just local. Plus, every country and its laws are constantly changing. So having your legal team and your security team aligned and overseeing these particular releases is quite helpful.

Demetrios 00:54:45: Yeah, your legal team needs a RAG on the different compliance laws that are changing constantly. So you need like 50 different RAGs in the company to help 50 different teams stay up to date with all of that. So this is great. Well, folks, I appreciate you coming on here. I know that we are coming up on time. Is there anything else you want to mention before we jump off? Are there any questions that I did not get to ask that you would have loved to dive into?

Amritha Arun Babu 00:55:19: One particular thing that I can think of is the monitoring piece, you know, and evaluation. How does that change when you are at stage zero, one, and two? At least from the product point of view, model monitoring can be quite basic when you are going from zero to one, where you do not have pipelines for getting ground-truth data, so you cannot compute a lot of metrics, and you also don't have tooling to consistently monitor a particular model. But that's where you get very scrappy. You say, okay, I'm going to take a sample of the model outputs and ground-truth them myself, and then I'm going to measure the accuracy and understand the model performance at one. How does it change? I would like to hear from Abhik what he has seen from one to two, and how model monitoring changes.
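
The scrappy approach described above, sampling model outputs, labelling them by hand, and measuring accuracy on that sample, can be sketched as follows. The prediction ids, labels, and function names are hypothetical; the only assumption about the workflow is that a reviewer fills in `manual_labels` for the sampled ids.

```python
import random


def sample_for_review(predictions: dict[str, str], sample_size: int,
                      seed: int = 0) -> list[str]:
    """Pick a reproducible random sample of prediction ids to label by hand."""
    rng = random.Random(seed)
    ids = sorted(predictions)
    return rng.sample(ids, min(sample_size, len(ids)))


def sampled_accuracy(predictions: dict[str, str],
                     manual_labels: dict[str, str]) -> float:
    """Share of hand-labelled items where the model agreed with the reviewer."""
    reviewed = [i for i in manual_labels if i in predictions]
    if not reviewed:
        raise ValueError("no reviewed items overlap with the predictions")
    correct = sum(predictions[i] == manual_labels[i] for i in reviewed)
    return correct / len(reviewed)


# Hypothetical model outputs keyed by request id.
predictions = {"r1": "fraud", "r2": "ok", "r3": "ok", "r4": "fraud", "r5": "ok"}

to_review = sample_for_review(predictions, sample_size=3)

# Hypothetical labels a reviewer assigned after inspecting four requests.
manual_labels = {"r1": "fraud", "r2": "ok", "r3": "fraud", "r4": "fraud"}
accuracy = sampled_accuracy(predictions, manual_labels)
print(to_review, accuracy)
```

With a small sample the estimate is noisy, so it is better read as a rough signal of model health than as a precise metric.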

Abhik Choudhury 00:56:19: Yeah, yeah, as far as my experience is concerned, for level zero the monitoring is pretty rudimentary because the life cycle is long. So you just analyze the data once in a while, or the business users tell you that, okay, it's not performing as per requirement, and then you go ahead and see that, okay, maybe the data is not meeting my expectations, or there is bad data, or the purpose of the model has changed, the business requirement has changed. So the iteration is a bigger iteration. While in level two, I mean, it's totally automated, so you need to be very, very serious about your monitoring. You not only have to check the quality of the data, but also the various KPIs that come out of your model. You have to be on top of that, and you will eventually have to implement a dedicated monitoring tool, which are pretty common across multiple cloud platforms. You have to deal with them and leverage their properties to ensure that your KPIs are up to the mark.

Abhik Choudhury 00:57:22: And if any of them are changed, you work with your model. So that's how the difference is.
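
At the automated level, the "be on top of your KPIs" step amounts to comparing the latest values against agreed bounds and flagging breaches. The sketch below shows that idea in plain Python; the KPI names, thresholds, and values are made up for illustration, and a managed monitoring tool on a cloud platform would replace this hand-rolled check.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KpiRule:
    name: str
    minimum: Optional[float] = None  # breach if the value falls below this
    maximum: Optional[float] = None  # breach if the value rises above this


def breached_kpis(latest: dict[str, float], rules: list[KpiRule]) -> list[str]:
    """Return a human-readable message for every rule the latest values violate."""
    breaches = []
    for rule in rules:
        value = latest.get(rule.name)
        if value is None:
            breaches.append(f"{rule.name}: missing")
            continue
        if rule.minimum is not None and value < rule.minimum:
            breaches.append(f"{rule.name}: {value} below {rule.minimum}")
        if rule.maximum is not None and value > rule.maximum:
            breaches.append(f"{rule.name}: {value} above {rule.maximum}")
    return breaches


# Hypothetical rules covering model quality, data quality, and serving.
rules = [
    KpiRule("precision", minimum=0.8),
    KpiRule("null_rate", maximum=0.05),
    KpiRule("p95_latency_ms", maximum=200.0),
]

# Hypothetical values from the latest monitoring window.
latest = {"precision": 0.76, "null_rate": 0.01, "p95_latency_ms": 180.0}

alerts = breached_kpis(latest, rules)
print(alerts)
```

A breach is the trigger to go back and work with the model, for example by retraining or by investigating the incoming data.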

Demetrios 00:57:28: Yeah. Something that's fascinating to me is how, when it comes to classical or traditional ML, we have almost this golden metric that we can look at, which is accuracy. And then when you switch over to LLMs, accuracy falls apart. There isn't really one. Depending on the use case, you can try, and if you can, then that's great, do it. But how are you going to measure the accuracy of summarization? That's why there are so many evaluation tools coming out right now, right, trying to evaluate how well it's doing. But that doesn't really take into account what you're talking about, how at the end of the day it's really about what the business wants.

Demetrios 00:58:18: And if the business needs change, or if things start going in a different direction, that's what you want to know. And whether that is being able to evaluate the output, or letting the users tell you that this isn't actually useful with a thumbs up, thumbs down, or whatever it may be in the evaluation, or seeing that it's not useful because people aren't taking the predictions or just aren't using it, and your accuracy score just tanks, all of these are different metrics that you need to be thinking about when you're looking through the lens of, how valuable to the company is my AI effort?
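
The user-facing signals mentioned above, thumbs up or down and whether people actually take the predictions, can be rolled up into two simple rates. This is a hedged sketch: the event fields `accepted` and `thumb` are hypothetical names for whatever the product logs.

```python
from typing import Optional


def feedback_summary(events: list[dict]) -> dict[str, Optional[float]]:
    """Summarize acceptance of predictions and explicit thumbs feedback."""
    total = len(events)
    accepted = sum(1 for e in events if e.get("accepted"))
    ups = sum(1 for e in events if e.get("thumb") == "up")
    downs = sum(1 for e in events if e.get("thumb") == "down")
    rated = ups + downs
    return {
        # Share of shown predictions that users acted on.
        "acceptance_rate": accepted / total if total else None,
        # Share of explicit ratings that were positive; unrated events are ignored.
        "thumbs_up_rate": ups / rated if rated else None,
    }


# Hypothetical log of four predictions shown to users.
events = [
    {"accepted": True, "thumb": "up"},
    {"accepted": False, "thumb": "down"},
    {"accepted": True, "thumb": "up"},
    {"accepted": False, "thumb": None},
]

summary = feedback_summary(events)
print(summary)
```

Tracking both rates over time gives a signal of usefulness even for tasks like summarization, where a single accuracy number is hard to define.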

Amritha Arun Babu 00:59:02: Yep, yep.

Abhik Choudhury 00:59:03: Agree.

Amritha Arun Babu 00:59:04: 100% agree. Yeah.

Demetrios 00:59:05: Yes. It all comes back to, yeah, how can we tie this to a number that the company cares about? And how can we make sure that we can show that we're doing stuff? Because I will say this again, I've heard it so many times on different calls, people saying that the boss thinks the data science team is just a cost center, not a profit center at all. It's like, what are we paying this guy for? Or this woman for? What is going on here? Because I'm not seeing any results, when in reality they may be producing a ton of results. And so it's up to us to clearly define how we are impacting the business. And I appreciate the both of you breaking this down for me and really going deep into the different ways that we can explain it and show leadership the value of the ML and AI efforts.

Amritha Arun Babu 01:00:08: Perfect. Thank you so much, Demetrios.

Abhik Choudhury 01:00:10: Thank you. Thank you for having us.
