MLOps Community

Making MLFlow

Posted Jun 16, 2022 | Views 980
# ML Flow
# Spark
# Databricks
Corey Zumar
Software Engineer @ Databricks

Corey Zumar is a software engineer at Databricks, where he’s spent the last four years working on machine learning infrastructure and APIs for the machine learning lifecycle, including model management and production deployment. Corey is an active developer of MLflow. He holds a master’s degree in computer science from UC Berkeley.

Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.

Mihail Eric
Co-founder @ Storia AI

Mihail is a co-CEO of Storia AI, an early-stage startup building an AI-powered creative assistant for video production. He has over a decade of experience researching and engineering AI systems at scale. Previously he built the first deep-learning dialogue systems at the Stanford NLP group. He was also a founding member of Amazon Alexa’s first special projects team where he built the organization’s earliest large language models. Mihail is a serial entrepreneur who previously founded Confetti AI, a machine-learning education company that he led until its acquisition in 2022.


Because MLOps is a broad ecosystem of rapidly evolving tools and techniques, it creates several requirements and challenges for platform developers:

  • To serve the needs of many practitioners and organizations, it's important for MLOps platforms to support a variety of tools in the ecosystem. This necessitates extra scrutiny when designing APIs, as well as rigorous testing strategies to ensure compatibility.

  • Extensibility to new tools and frameworks is a must, but it's important not to sacrifice maintainability. MLflow Plugins is a great example of striking this balance.

  • Open source is a great space for MLOps platforms to flourish. MLflow's growth has been heavily aided by (1) meaningful feedback from a community of ML practitioners with a wide range of use cases and workflows, and (2) collaboration with industry experts from a variety of organizations to co-develop APIs that are becoming standards in the MLOps space.



“It’s often easier to create an extensible and coherent core API and let developers bring their own workflows to that API.”

“One thing that’s very unique about MLflow is that it came from Databricks, this enterprise platform for data science and machine learning, and as a result, as we integrated MLflow with Databricks, we had this awesome stream of information about usage from our customers.”

“Open-source growth can be powered by third-party platforms that decide to adopt these kinds of tools. It was designed with that in mind.”

“In the early days, that first year or so of MLflow development, it was all about open-source. Open-source would get the features before Databricks got the features. That was really important.”

“We didn’t extend ourselves into open-source and create a widely adopted platform by accident that we then have to maintain and serve.”

“It was very much our goal to build out a set of standards that we felt could capture what it is to be a machine learning platform, an end-to-end lifecycle tool, and deliberately to do it in open-source. That way it benefits both citizen data scientists and other organizations.”

“Another thing that we come back to time and time again is that each of these components of MLflow, the tracking, the models, the registry, they should all stand as their own pillars.”

“In order to test a platform, you can test each of its individual components in relatively hermetic and isolated environments, and that’s helped us build out a production-grade platform in open source as well. There are some benefits to both sides.”

“We have to realize that vision of being a complete end-to-end MLOps platform in order to win market share and be most useful to users, data scientists, and machine learning engineers. We are incrementally further along in that journey.”

“It’s nice that objectives align so well in terms of what to invest in.”

“What’s better for the data scientist and machine learning engineer is often better for a business too. The fact that most Databricks customers are asking for end-to-end tools is awesome. It’s so hard to stitch together a bunch of different platforms and deal with API versioning changes and behavioral nuances.”

“We aren’t really dealing with this kind of tugging of the heart in different directions and ultimately that means we just get to roll up our sleeves and solve problems and know that we’re doing the right thing for both sets.”

“We have yet to find a moment where end users’ needs and workflows were totally orthogonal or opposed. We’re not at that point yet. I hope it never gets to it because that becomes tricky. How do you justify to the company leadership that something is important then? It’s nice that we really don’t have to.”

“Introducing a new layer of abstraction or content level is a big move.”

“We would love to have a conversation about what a layer of abstraction might look like in MLflow. I don’t think we’ve taken the opportunity to have that dialogue with the community yet.”

“MLflow tracking is a very simple abstraction. It’s an API for logging a bunch of stuff and querying that stuff, but how you decide to log stuff and lay it out and query it later can really vary.”

“Because operationalizing is open and opinionated, customers often struggle with what are the best workflows within that platform to adopt and that’s something that we’re hoping to make easier for the time series use-case.”

“The goal of a pipeline is substantially more opinionated about what it is that you should be logging and how you should be accessing it and basically create an even happier path to production.”

“Push as much as you can into SQL because that’s such a heavily optimized ecosystem with 50-plus years of investment, so you’re not going to do better trying to wrangle your own thing upstream.”

“Pipelines for ML and for data I think are super useful and we’ve seen the general-purpose orchestration tools for pipelines in the past and things like AirFlow and others that have gotten a ton of adoption have been super useful but I think there’s a special characteristic in the ML space. The pipelines can really help and I’m excited to see how that plays out.”

“I would love to be remembered as the person that helps everybody from the citizen data scientist to the enterprise do MLOps better and I hope that folks feel like they’re on that journey with me and with us as a team. I hope that we can continue to make them happy and deliver what they need.”


0:00 Demetrios

I really want to start out with one that is just the origin story or an origin story of MLflow. I feel like that's the only way to start out right here with you two on. So, Corey, you want to kick it off? How did this idea come about?

0:21 Corey

Absolutely, yeah. I joined Databricks when MLflow was still an alpha project. There was this identification of a gap in the MLOps space, where a lot of the platforms that were out there were either very specific or opinionated to a particular framework – you had platforms for TensorFlow models, you had platforms for PyTorch models, you had some platforms that were good at metadata tracking for model parameters and performance metrics, other platforms that were focused on model deployments. There really wasn't a whole lot of ‘glue’ between these various tools. There didn't really exist a platform that anybody could just pick up and use and that would be compatible with the variety of tools and software that they needed across the sort of end-to-end MLOps lifecycle.

So when I came in, that problem had been recognized and a small team of nine developers at Databricks was working on building out this one-of-a-kind, first-in-class, open source platform for the end-to-end machine learning lifecycle built on these open APIs. It was really fun to watch. Coming in the door, I didn't really know what open source development was about or how you got a project off the ground, so it was this really awesome opportunity to see, “Okay, what's going to happen when you launch a version 0.1? And where's it gonna go?” The adoption and the growth of the platform over time has really exceeded anything that I could have imagined. It’s been really neat.


And where does it situate us? Where were we with Spark – was Spark already a big player? What are we looking at there?


Yeah. Oh, yeah. Spark, by that time, had certainly been really widely adopted and it was a core underpinning of Databricks' business. You had platforms, like EMR and other competitors, offering managed Spark. The Spark community was thriving with hundreds of pull requests a month. Yeah, it was a really mature, really huge project. MLflow was one of the first major Databricks projects to follow in the model of Spark. But it wasn't just an extension of the Spark ecosystem, it’s kind of its own tool. But we took a lot of lessons from the Spark open source community and growth and how we saw that play out as we developed MLflow.


Yeah. What were some of those lessons?


Absolutely. One key insight is that – especially in machine learning, but also in ETL – there's this huge ecosystem of tools and requirements that different users might have, and instead of trying to purpose-build solutions for every single niche tool and use case, it's often easier to create a really extensible and coherent core API and then let developers bring their own workflows to that API. The initial version of MLflow – as I understand the initial versions of Spark – was really centered around a set of thin-waisted interfaces. For example, the MLflow tracking API lets you do a small handful of things – log a parameter, log a metric, set a tag, create a run, and create an experiment. We didn't even have search at the time. And that was kind of it.

We had a model specification that was basically “What version is this model?” and like, “Where's the software environment? (the conda YAML definition)” And that was pretty much all for models, and something similar for projects. So that was a real core insight that was important – to get a project off the ground and start getting feedback, start getting usage, without trying to boil the ocean from day one.
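To make the "small handful of things" concrete, here is a minimal sketch of a tracking interface limited to the five operations Corey lists: create an experiment, create a run, log a parameter, log a metric, set a tag. It is a toy, in-memory illustration only. The class name `ToyTracker`, its method signatures, and the `get_run` helper are hypothetical and are not MLflow's API or implementation.

```python
import itertools
import time


class Run:
    """One training attempt: parameters, metric histories, and tags."""

    def __init__(self, run_id, experiment):
        self.run_id = run_id
        self.experiment = experiment
        self.params = {}   # name -> string value
        self.metrics = {}  # name -> list of (timestamp, value)
        self.tags = {}     # name -> string value


class ToyTracker:
    """Hypothetical in-memory tracker with five write operations."""

    def __init__(self):
        self._experiments = {}  # experiment name -> list of run ids
        self._runs = {}         # run id -> Run
        self._ids = itertools.count(1)

    def create_experiment(self, name):
        self._experiments.setdefault(name, [])
        return name

    def create_run(self, experiment):
        if experiment not in self._experiments:
            raise KeyError(f"unknown experiment: {experiment}")
        run = Run(f"run-{next(self._ids)}", experiment)
        self._runs[run.run_id] = run
        self._experiments[experiment].append(run.run_id)
        return run.run_id

    def log_param(self, run_id, key, value):
        # Design choice of this sketch: parameters are write-once,
        # so a run's configuration stays unambiguous.
        params = self._runs[run_id].params
        if key in params:
            raise ValueError(f"param {key!r} already logged")
        params[key] = str(value)

    def log_metric(self, run_id, key, value):
        # Metrics keep their full history, so repeated logging appends.
        history = self._runs[run_id].metrics.setdefault(key, [])
        history.append((time.time(), float(value)))

    def set_tag(self, run_id, key, value):
        self._runs[run_id].tags[key] = str(value)

    def get_run(self, run_id):
        # Read helper for inspection only; not one of the five operations.
        return self._runs[run_id]


# Illustrative usage with made-up names and values.
tracker = ToyTracker()
tracker.create_experiment("churn-model")
run_id = tracker.create_run("churn-model")
tracker.log_param(run_id, "learning_rate", 0.01)
tracker.log_metric(run_id, "accuracy", 0.91)
tracker.log_metric(run_id, "accuracy", 0.93)
tracker.set_tag(run_id, "owner", "example-user")
```

Keeping the surface this small is the point being made: how runs are laid out and queried is left to the user's workflow, and richer features such as search can be layered on later.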


I, personally, am a huge, huge fan of MLflow. I mean, as a practitioner, I think it's a fantastic piece of software. In my mind, it's like a big success story of open source in MLOps – it's something that really people use a lot and I think has done a really good job of building abstractions that people find very useful. And, in a sense, open source is something that a lot of MLOps companies and organizations think about. It's become so core to how a lot of groups are thinking about deploying machine learning tools and infrastructure. I'm curious, what have been some of the core abstractions and principles that you think have made MLflow successful, that maybe other groups can learn from?


Yeah. Well, first of all, thank you so much for the high praise. It sounds like you've gotten your hands dirty with the platform and tried it out and that's super cool. I really appreciate it. I think that one of the core abstractions that we found the most success with is this extensible tracking component. It's just such a challenging problem in the MLOps space to build a model, iterate on that model, and actually know that, “Hey, I'm making substantial improvements here. The model that I trained today is better than the one that I trained a week ago.” And there's a lot that goes into that. You have parameters, performance metrics – and those can be kind of arbitrarily complex.

We're looking at ways that maybe go beyond scalar metrics, for example. You have all sorts of artifacts – the model source files, the source code that was used to build that model, the model objects themselves, software environments – you have some metadata around, “Okay, who actually built this model? What system did it come from? What version of a particular platform (if you're running on Databricks, for example).” And I think solving that problem has really driven a lot of growth in MLflow. We do get a lot of usage of these other components, like the model registry, MLflow Models, and Projects. But I think without tracking and having that API that, like I said, started small, but grew over a period of time – I think that's just fundamental and core to the system and a lot of our other integrations tie back to it.


Actually, there's something interesting that you mentioned there, and this goes more into product development. With open source, I know it can be very difficult for people to understand A) How many people are using their tool, and B) How they're using their tool – what they like, without going in there and surveying people each time they want to bring out a new feature or do something different with the tool. How did you get around that? What was the feedback loop from the users like? How were you getting metrics on what they were using, what they liked, what they didn't like?


Yeah, awesome question. It's kind of funny you mentioned surveys, because we have made pretty liberal use of surveys, especially in the early days. It's been a while since we did one, but after launching the alpha version, and then within the next two or three months of development, we sent out a survey and asked “What do you like about this platform? What are you using? What can be improved? And what you maybe don't find useful?” Or “What do you find confusing?” That actually helped in those very early days to identify what components resonated and where we needed to maybe polish things up.

After that, though, one thing that's very unique about MLflow is it came from Databricks – this enterprise platform for data science and machine learning – and as a result, as we integrated MLflow with Databricks, we had this awesome stream of information about usage from our customers. So we gave them the same set of components that we're giving open source users with largely comparable functionality and we got to see, “Hey, how are our hundreds (or thousands) of customers using these features and what does the adoption look like?” And that was also super useful for us. So I think that there might be a kind of a lesson there that open source growth can really be powered sometimes by third party platforms that decided to adopt these kinds of tools. For us, I guess, it was designed with that in mind. We built MLflow partially to serve the open source community, but partially to serve Databricks customers and that really, I think, helped with product development.


I think that's a really interesting point that you bring up because when I hear MLflow, in my mind, I immediately make that association to Databricks. Right? And there's kind of this interesting dynamic here, because we're talking about a fundamentally open source project that – when we think open source, we think “for the good of the world, volunteer led – it's just totally free and open” but then there's this connection to like a very commercial entity – a for-profit money making machine, in some sense. So I'm curious, how have you thought about straddling that fine line between what might be perceived as competing principles, in some sense, and how has that guided some of the development of the project?

10:12 Corey

Absolutely. It's something we think about a lot. And our stance has changed over time. In the early days, that first year or so of MLflow development, it was all about open source. Open source would get the features before Databricks got the features. That was extremely important. And over time, what we consider to be a feature has somewhat changed. So when we talk about features now at Databricks, it's “That's a Databricks feature. That's an MLflow feature.” And oftentimes, the distinction is: is this a core piece of functionality that any data scientist or machine learning engineer out there might benefit from? Or is it specific to some tooling or some workflows that we see at Databricks? Maybe it's an artifact of how our platform is structured. And that usually tells us where we invest the effort.

Do we invest in spending time filing pull requests against open source and publishing RFCs for the community? Or do we invest in getting down to it and making tweaks and improvements in the Databricks product. We're still very much, at this point, focused on trying to open source as much as we can – that we think is useful for the broader MLOps community – while making sure that customer needs and requirements are addressed quickly. So it's a fun balance for a small and nimble team. And so far, so good, I'd say. But I'm sure the community might have other thoughts, or you guys might have other feedback.

11:54 Mihail

I like it. I mean, again, it's the association. But I don't feel like I'm being pitched on Databricks every time I use the tool. That would be the shameless way to pursue the collaboration. chuckles

12:05 Corey

Absolutely. Yeah. That's the goal. So, as you're looking to build out a new platform and start a new project, what are some major questions that come to mind being on the ground floor of something?

12:24 Mihail

Yeah. One that actually is very related here is this question of (maybe even stepping back at a meta level) OSS as kind of the go to market, right? A lot of MLOps groups and organizations do pursue that as their strategy. But then there's other models too, like freemium or closed source/open core – I mean, there are really different ways to pursue that. MLflow is purely open source, so I'm definitely curious to know why that was the path forward chosen. Why was that the right go to market for something that was like a Databricks spin-off. By which I mean that it’s an enterprise company spin-off.

13:06 Demetrios

Yeah. And actually, it's funny – when I think about the open source tools in the MLOps landscape, MLflow is one of those ones – I think MLflow, Kubeflow and Airflow. (funny, it's all the ‘flow’ brothers or sisters). chuckles

13:25 Mihail

TensorFlow. chuckles

13:26 Demetrios

TensorFlow too. But there are some very bad examples of doing open source, where it feels like ‘this is not really open source for being open source’ but it’s ‘open source for trying to get customers’. But MLflow doesn't really feel like that to me. MLflow feels like it's a much bigger beast than that. Maybe it started as that, but now, it's like you opened Pandora's box and you can't put anything back in it. So yeah, going back to the question, how did that look? The execution on it was brilliant, in my mind.


I appreciate that. It was definitely intentional. We didn't overextend ourselves in open source and create this widely adopted platform by accident that we then have to maintain and serve, it was very much our goal to build out a set of standards that we felt could capture what it is to be a machine learning platform – an end-to-end ML lifecycle tool, and deliberately to do it in open source. That way, it benefits both citizen data scientists and other organizations, but it also benefits enterprise adopters of the platform that decided to offer it as a service. So it benefits Databricks to have MLflow exist as a standard that other people know about. It exists as an entry point into the MLOps space, so that other people can go and say, “Hey, I'm looking for an MLOps platform. What's out there?” Oftentimes, one of the first platforms they'll see is MLflow and then they find that, “Hey, Databricks is a great place to use MLflow.” In that way, you're both contributing to the space, you're establishing thought leadership, you're helping make machine learning applications better and you're also driving some number of customers and driving some adoption towards Databricks. So it's a win-win. And that was a lesson in the Spark model. Spark started as an open source tool. Databricks wasn't really even going to be a for-profit company, as I understand it.

In the early days, it was all about just building out Spark and offering it to other organizations, just making ETL and data management and all that better. Obviously, it didn't quite go that way, but that was the initial mission and ethos. So I think that pattern seems to work. Two for two or three for three – I guess it depends on how you evaluate Delta and other projects. But so far, I think there's a proven track record.


Yeah, a multibillion dollar company that just randomly fell out of an open source project. That's always a nice thing to find under the rainbow or something. laughs


It is, yeah. From the sound of it, that was not expected. So – interesting.

16:27 Mihail

chuckles Kind of on the topic of open source – a fascinating concept, just in general, in software development, there's this idea of vanity metrics. GitHub stars is one of the most obvious ones where, “Hey, the more stars we have, the more successful we are as a project.” Eh yeah – I'm not super convinced. I think the people that are really in open source don't necessarily buy into that as the main metric of success. When you think of something like MLflow, what is your idea of a successful open source project? And how do you measure your success? What are the main metrics?

17:00 Corey

It's a great question that we're actually still trying to refine. When we give, for example, a public presentation of “What is MLflow?” or “What are the updates to MLflow?” Oftentimes, we will cite vanity metrics because it makes us look good, you know? It's like, “Hey, we're closing in on 11,000 GitHub stars. We have over a million downloads a day.” All that kind of fun stuff – because that gets people excited. It says “Hey, this is a widely used project.”

Now, when we're actually managing a community and building out the next generation of the platform, for example – those aren’t the things that we're generally looking at. Typically, the focus is on how many customers there are on the Databricks side of things, but also how many open source organizations are adopting this. So we get really excited about logos on the website. People reach out to us by email and say, “Hey, my organization of X number of people is using this platform. And we're doing these things with it.” And that gets us really excited, because it says, “Hey, we have this diverse set of use cases, and these different workflows and collaborative settings where our tool is being used.” I think that's really motivating and shows that we're on the right path.

It’s a similar thing for Databricks customers. Customer growth is really important in the adoption there, because that's often much stickier than a one off pipeline download, for example. So those are super good metrics. We actively monitor new issues and new pull requests. It's something we're trying to get better about making sure that we follow through and continue to address those. I think we're on the right track there. But that's an excellent signal. If people are actively filing issues, it means they've spent more than five minutes trying to use this thing. And if they're filing pull requests, then that usually indicates an even more substantial level of investment.

18:56 Demetrios

When I think about MLflow, one of the things that I think about is just how easy it is to set up and get your hands dirty with it. I feel like it is very much that first taste of the MLOps drug. It's like the gateway drug into MLOps, in a way. Because it's just a pip install away, right? I'm wondering what are the fundamentals – what are some of the pillars that MLflow is built on? Especially when it comes to the design decisions that you made? Because you can always try and have everything – have your cake and eat it too – but at the end of the day, you have to make some hard decisions when you are implementing this. So what were the things that you try and always come back to (or still are, I imagine)? Do you have a North Star or the vision statement or things that you tell yourselves?


Definitely, yep. In terms of ease of use, one of the key obstacles when somebody’s setting up a platform is just getting all of those dependencies wrangled and set up. For example, you could imagine a world where, in order to use MLflow for the first time, I need to install MySQL and spin up the SQL servers and things like that, and connect my MLflow tracking server, which I have to run using a separate process spinning up a Flask server and I have to connect those two things together. That will probably take some people a lot of time. One really nice thing about the core portion of MLflow is that it's designed to be compatible with any host that has a file system. You can just go out there and start logging things to this file-based representation of MLflow storage.

That requires no setup at all. MLflow knows how to read and write files, and that's just awesome. Very similar thing with some of the other components, like MLflow models – a model is just this really thin wrapper around some native ML framework. So if you manage to install TensorFlow, or you manage to install PyTorch, you can start saving MLflow models without requiring any additional dependencies. Keeping a very thin set of software requirements that actually have to be installed and set up and configured to get started with the platform, is really important. Another thing that we come back to time and time again, is that each of these components of MLflow – the tracking, the models, the registry – they should all stand as their own pillars. Let's use that word again.

Users should be able to leverage each of these components totally independently of the other ones, so that you're not requiring that everybody install everything and figure out how to get everything set up every time they want to accomplish a particular task. That, I think, has been successful as well. It lets people adopt tracking, or adopt models, or adopt the registry in the organization, without having to reason about the rest of the ecosystem. We'd obviously like them to, and a lot of people start with one and they get somewhere else over time, but I think it lowers the threshold for adoption.
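The "any host that has a file system" idea can be sketched with nothing but the standard library: one directory per run, one small file per logged item, and no server or database to set up. The layout and the `FileRunStore` name below are assumptions made for illustration; they are not MLflow's actual on-disk format.

```python
import json
import tempfile
from pathlib import Path


class FileRunStore:
    """Hypothetical store: one directory per run, one file per logged item."""

    def __init__(self, root):
        self.root = Path(root)

    def _run_dir(self, run_id):
        # Create the run's sub-directories on first use; no other setup needed.
        run_dir = self.root / run_id
        (run_dir / "params").mkdir(parents=True, exist_ok=True)
        (run_dir / "metrics").mkdir(parents=True, exist_ok=True)
        return run_dir

    def log_param(self, run_id, key, value):
        # Keys are used as file names, so this sketch assumes they contain
        # no path separators.
        (self._run_dir(run_id) / "params" / key).write_text(str(value))

    def log_metric(self, run_id, key, value, step=0):
        # Append-only: each line is one JSON record, so history is preserved.
        path = self._run_dir(run_id) / "metrics" / key
        with path.open("a") as handle:
            handle.write(json.dumps({"step": step, "value": float(value)}) + "\n")

    def read_params(self, run_id):
        params_dir = self.root / run_id / "params"
        return {p.name: p.read_text() for p in sorted(params_dir.iterdir())}

    def read_metric(self, run_id, key):
        path = self.root / run_id / "metrics" / key
        return [json.loads(line) for line in path.read_text().splitlines()]


# Illustrative usage with made-up values, written to a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    store = FileRunStore(tmp)
    store.log_param("run-1", "max_depth", 6)
    store.log_metric("run-1", "rmse", 0.42, step=0)
    store.log_metric("run-1", "rmse", 0.37, step=1)
    print(store.read_params("run-1"))
    print(store.read_metric("run-1", "rmse"))
```

Because the store only reads and writes files, a first-time user can start logging immediately, and a heavier backend can be swapped in later behind the same small interface.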

22:11 Demetrios

That's crazy, because I think about Kubeflow and I think how it's a little bit different than that. chuckles A lot of people have talked about the headaches that Kubeflow brings. And you see it kind of happening now with KServe – how it is doing that. I think in the past, you've always been able to only use select pieces of Kubeflow, but I definitely don't think it was part of the fundamentals of the project that allows you to just “Alright, we want to use this one little piece. Let's make it as easy as possible to get up and running with that. And then later we can grow and blow it out.” Sorry, Mihail. I cut you off, man. What were you gonna say?


No, no, no, please. I think that's a good question. Keep going, Demetrios.

23:00 Demetrios

chuckles Well, I didn't really have a question, per se, around that. I was just talking shit on Kubeflow laughs It’s one of my favorite pastimes.


laughs It's funny, in the earlier days, and I think we've come back to it a couple of times, because people have asked, “Hey, should we spend time making sure MLflow also integrates with Kubeflow?” We've done a little bit of prototyping and it seems that time and time again whoever's diving into it comes back and says, “Man that was hard to use!” It's just purely due to the volume of setup. Getting from zero to one, I guess, is truly an issue there.

One other thing I wanted to call out in terms of ease of use is kind of ‘developer ease of use’. That's something that Ben's been doing a little bit of work on and I think it’s also really important in keeping a community healthy and making sure that you get good contributions without requiring somebody to spend five hours figuring out how to develop on your platform. So Ben, maybe that's something you want to speak to?


He left us. He said, “I'm done with this. I'm outta here.” chuckles I was talking to him on Slack, he said his power went out and he'll try and join in a minute.


Well, that'll do it. Yeah. Apologies. He’s still showing up on my guest list here, so I was wondering why he wasn't on video. But that'll do it. chuckles

24:22 Demetrios

Why wasn't he talking? “You’ve been so quiet, man. Why?” laughs I mean, we can definitely get to that when he comes back – if he comes back. If his power ever comes back on.


Actually, I want to kind of go back to one of the things that you said, Corey, earlier about the pillars. In a sense MLflow is like multiple projects in one, and there's each of these independent pillars – the artifact store, the model store, the tracking component – and I think it's fascinating that you mentioned that they should be able to exist independently. Personally, that's how I… that was my gateway drug into MLflow. I was just looking for something better than TensorBoard. I was like, “This is not fun to use.” And that's when I first adopted MLflow like “Hey, this is significantly better. I'm able to actually use this without running weird commands.” So that was my entry point into the broader ecosystem. I think, in that sense, it's been a pretty effective strategy for getting people to try something out, see if they like it, make each of these components really strong on its own, and then have that be the enabler into the next broader set of tools that they can adopt. It's kind of easing people's path into the project, which I think is really well done.


I appreciate that. Yeah, it's nice for users – it's also nice for developers and maintainers. It creates a more robust platform too. In order to test the platform, you can test each of its individual components, and in relatively hermetic and isolated environments. That's helped us, I think, build out a production grade platform in open source as well. There are some benefits to both sides.
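One way to picture testing a single component "in relatively hermetic and isolated environments" is a unit test that gives that component a fresh temporary directory and touches nothing else: no tracking server, no network, no shared state. The `ToyRegistry` below is a hypothetical stand-in for a registry-like component, not MLflow's model registry, and the artifact URIs are placeholders.

```python
import json
import tempfile
import unittest
from pathlib import Path


class ToyRegistry:
    """Hypothetical model registry that persists versions to one JSON file."""

    def __init__(self, path):
        self.path = Path(path)

    def _load(self):
        return json.loads(self.path.read_text()) if self.path.exists() else {}

    def register(self, name, artifact_uri):
        data = self._load()
        versions = data.setdefault(name, [])
        versions.append(artifact_uri)
        self.path.write_text(json.dumps(data))
        return len(versions)  # version numbers start at 1

    def latest(self, name):
        versions = self._load().get(name, [])
        if not versions:
            raise KeyError(name)
        return len(versions), versions[-1]


class ToyRegistryTest(unittest.TestCase):
    def setUp(self):
        # Fresh temporary directory per test: nothing leaks between tests.
        self._tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmp.cleanup)
        self.registry = ToyRegistry(Path(self._tmp.name) / "registry.json")

    def test_versions_increment(self):
        self.assertEqual(self.registry.register("churn", "file:///models/a"), 1)
        self.assertEqual(self.registry.register("churn", "file:///models/b"), 2)
        self.assertEqual(self.registry.latest("churn"), (2, "file:///models/b"))

    def test_unknown_model_raises(self):
        with self.assertRaises(KeyError):
            self.registry.latest("missing")


def run_tests():
    """Run the registry tests on their own and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ToyRegistryTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` exercises the registry without importing or configuring any other component, which is the property that independent pillars make possible.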

26:13 Demetrios

Now coming to the other side of that, which is – you can try and eat up too much of the workflow, or you can try and do too much and become this end-to-end tool that is opinionated and maybe doesn't do things for certain use cases or whatever. How do you look at that? I mean, there are some pieces of MLflow that seek to be this end-to-end tool and… it's not quite, but it does follow in the same vein of a DataRobot or, or something like that. So, what are the ways that you're thinking about ‘best in breed’ or full on ‘one tool to rule them all’?


We've also been going back and forth on that for the last four years, too. My opinion on this (and the opinion of others might differ) is that we really have to realize that vision of being a complete end-to-end MLOps platform in order to win market share, be most useful to users, data scientists and machine learning engineers. We've incrementally moved further and further along in that journey, but we're still not totally there. The places where I think we should go are kind of upstream and downstream of the existing MLOps platform. You have things like data ingestion and management, featurization and integration of features with models on the upstream, that I don't think we've touched. And then downstream, you have model monitoring, CI/CD improvements, for example, a notion of web hooks – which exists in the Databricks products, but doesn't exist in open source – and then more comprehensive model monitoring.

These are all things that are really important for users and I think they're growing as a problem space. So as users and customers get familiar with MLOps, and maybe they've had some success using tracking and using our deployment functionality, and projects and tools like that, they're then saying, “Hey, we're running into this problem where we don't know how to version our data and make sure that we can see exactly what data was used to build a model.” Or “We're having trouble efficiently passing queries to our model because the volume of data is too large.” Or “Once a model hits production, we're not sure if it's performing well enough.” So I do think we need to reach those areas. But, Demetrios, as you pointed out, it's important to have good APIs here – that are inclusive, that capture most of the workflows – and that's the part that sometimes takes a long time. Yeah, we're dealing with nascent spaces with a lot of these problems.


One thing that you point out there that I think has got to be such a fascinating evolution that you've seen just unravel before your eyes is the MLOps space mature with the MLOps tool. So it's like as MLflow is maturing, the overall space – because we are so new right now – is also maturing. So you're seeing these different use cases and people are hitting new bottlenecks that I'm sure, two or three years ago, they weren't hitting because of the way that they were looking at these problems.


Yeah, you nailed it. That's exactly the case and it's an awesome opportunity at the same time. It can be a little bit stressful as a platform developer. Because, all of a sudden, you're trying to maybe vet something and you don't see maybe that there's a clear product market fit or that enough people are asking for it, and then six months later, people are beating down the doors saying, “Hey, we need this feature set!” And like, “Okay, what tools are out there?” Then each of them is purpose built and opinionated and it's like, “Okay, abstracting over these is hard.” Even asking individual users, “What do you need out of this?” We'll get 10-15 different responses. So it can be tricky, which is maybe why we've moved more slowly than I think sometimes the community wishes we would have into these other areas. But we're working on it and I would imagine we'll make some progress there soon.


To go back to one of the things you said earlier around your opinion is that the ultimate vision for MLflow should be more end-to-end – like are there all these other parts of the stack that maybe you guys haven't addressed yet? There's a way to kind of hear that and think, “Oh, that's because you guys want to own everything and lock people into the platform as much as possible,” which is understandable from the Databricks side – totally makes sense. But there's also an alternative way to look at it, which is that this is driven more by usage – by people's needs – than it is by a top-down corporate desire to have people all in one platform, right? That's actually that people, when given the choice, don't really want to have to pick everything themselves, right? That, really, they want to be able to just say, “Hey, here's my one stop shop for every single part of the ML lifecycle.” And I'm curious how you think about that trade-off?

31:42 Corey

Yeah, it's nice that those objectives align so well in terms of what to invest in. What's better for the data scientist and machine learning engineer is often better for a business too. The fact that most Databricks customers and most open source users are both asking for end-to-end tools, because it's just so hard to stitch together a bunch of different platforms and deal with API versioning changes and behavioral nuances, is awesome. So we aren't really dealing with this kind of tugging of the heart in different directions, and ultimately, that means we just get to roll up our sleeves and solve problems and know that we're doing the right thing for both sets. It's cool that that remains aligned and I hope it continues. Yeah. What do you guys think?

32:32 Mihail

My sense is – if I could just jump in – I think that, to put it as bluntly and as kindly as possible, I think people, especially developers, we tend to be kind of lazy, when given the choice. We do like flexibility, but at a certain point, you don't want to spend too many cycles thinking about things that someone has already solved. So I think that the broader trend is toward end-to-end. And I think that a lot of groups and organizations do recognize that, which is why they're all spearheading in that direction – DataRobot, H2O, SageMaker – they all kind of want to own that stack as much as possible in addition to the fact that, of course, it's more commercially lucrative to have people use it.

You can charge them for every part of the stack, rather than just the one tool they're using. I think that you'll always have people that want to stitch everything together, because they're libertarian and don't want to get chuckles they want to stick it to the man as much as possible, like, “Don't you dare tell me how I'm gonna do everything.” But I think that for the bulk of users, if you can give them something that's really good, they'll forgo some flexibility to do that, if you can solve 80-85% of their problems.

33:45 Demetrios

Ooh. That's a hot take. I like that.


Tweet that one out to all the libertarian ML engineers. laughs


laughs Exactly. I think you see that, especially with people who are coming into the space – they're coming in a little bit new and they just want to figure out “How can I get from zero to one as quickly as possible?” So that's why you do see a lot of success around… I think about SageMaker and how much people complain in the MLOps Community Slack about how “SageMaker can't do this. SageMaker can't do that.” But, at the end of the day, they're still using it. They chose to use it and then it gets them 80% of the way there and that other 20% they have to work hard on. So it does make sense that you would want to try and do that. And I really like this idea of the end user and the business values are very aligned, so you can try and tackle them both. At the end of the day, if the end user doesn't like it, they're going to vote and they're going to show you and you're going to realign and figure things out.

34:57 Corey

Exactly. And if the end user is complaining about it – they're complaining about it in open source and enterprise contexts. We have yet to find a moment where it's “Hey, the ask and the need and the workflows are totally orthogonal – or opposed.” We're not at that point yet. I hope we never get to it because that becomes tricky. Like, how do you justify to your company leadership that something is important, then? It's nice that we don't really have to.

35:30 Mihail

Just one thing to build off of what Demetrios said, which I think was a really, really good point – MLOps, ML tooling is still a relatively new space and so, every year, there's this new influx of people coming into the space. And people that come into the space today compared to people (the “OGs”) that came into the space 5+ years ago…

35:52 Demetrios

Five months ago chuckles

35:54 Mihail

Five months ago, right laughs I was gonna say, if we go really, really far back – people that were still writing their own gradients before TensorFlow and all that stuff. These folks have – because at the time, there were not these tools, these tools didn't exist – they had to patch together their own solutions. Developers, once they have a system that works for them, we'll just kind of stick with those systems. Like, “I don't want to learn a new tool. This is what I'm gonna keep doing because it's what I did before.” I definitely have these biases as well where, yes, sometimes a new tool comes along and I just don't really want to use it, because I've already figured out the kinks of my crappier solution. But I know that solution, you know what I mean? Versus people that are just coming in today, a month ago, next month – these people don't have preconceived notions about how things should be done, and they don't have their own existing tooling that they already want to use. So it's much easier to “own” their mind share a little bit more and indoctrinate them in the way of an end-to-end system like an MLflow or a SageMaker. Right?

36:56 Corey

Mhm. Definitely.

36:57 Demetrios

So… oh, go ahead. Yeah.


At the end of the day, I think we're just purely, in an egalitarian sense – in the open source side of things – trying to do what's most helpful for practitioners. So if that means that you've been successful with your tools and stack and workflows for five years and you don't need anything else – I probably wouldn't take on the effort to switch platforms just for the heck of it, right? I don't think we expect that other people will either. But as use cases and needs evolve, I think there does become this pull of platforms where the development’s being done for you, in contrast to the continued stitching and glomming on of additional components to your existing stack. You can do both and maybe it's more appealing to go with the devil you know, initially, but I think it would be a time savings for a lot of folks to consider that that jump, even if it might take slightly longer initially, saves a lot of time in the future. Yeah, it's a great question and something we look at a lot.

38:11 Demetrios

All right, Corey. I want to jump into some community-sourced questions because I put it out there on Slack and on Twitter that we are going to be chatting with the MLflow team. And we had quite a few people respond, which is awesome. It shows you how stoked people are about this product and just how big the MLflow community is. First one – I'm going to give a shout out to Matt Cain – this one got a ton of upvotes. We’ve got to do this one first, because I want to make sure we get to it. It's a little bit long-winded, but I'm gonna read it off to you and tell me if you need any clarification afterwards. Matt asks, or says, “I'm curious about if they've ever considered adding a project abstraction, separate from the current ML project thing, that experiments, runs, and models sit underneath. For instance, a project might be my ‘cool stock prediction project’ with an experiment underneath it called List and another called XGBoost. The goal would be to tie multiple experiments together under one umbrella, as if you had some production model and you wanted to use experiments to experiment on that model. The lack of this kind of abstraction always bugs me. This is how I think about projects and we've sort of hacked it together by tying a model name to an experiment name and using that shared name as a project by bundling them together, but it's not perfect.”


Absolutely. It's a fantastic question. We're aware of the corresponding GitHub issue with the 30+ thumbs up and votes and all that. This community member might be really excited to know that we are actively looking at that space. Though, maybe not exactly with the solution that some folks might have had in mind – introducing a new layer of abstraction or content level there is kind of a big move. We have the Databricks implementation of MLflow, we have a lot of third-party implementations – Azure ML is a big adopter – so rather than making that core to the MLflow API, for example, we're thinking that it might be really nice to just let users search and filter their experiments based on pre-existing data.

Right now, we've had this long-standing limitation where all you can do is list experiments in terms of search capabilities. You can’t filter them by name, you can’t filter them by tag – things like that. So you can imagine representing a project as a sort of unified or consistent tag on a collection of experiments, and if you want to pull all the experiments with that particular tag, then you can go ahead and fire off a search query, via either the UI or the API, and pull in all of the experiments corresponding to that project tag. That also increases flexibility in my opinion, because you might have an experiment that is shared by multiple projects, for example. I can't think of a use case off the top of my head, but you can tag the experiments with multiple projects and then perform the search that way.

So that's one place where we're definitely invested in making improvements, and the community should see something within the next six to eight weeks. Hopefully, that makes some folks happy. And if that doesn't, I'd be happy to chat about it – have a GitHub issue and see what we can do to address it.
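As a rough illustration of the tag-based grouping described in this answer, here is a minimal in-memory sketch. It does not use MLflow itself: the `Experiment` class, the `projects` tag key, and the comma-separated convention for listing several projects are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    """Hypothetical stand-in for an experiment record (not MLflow's class)."""
    name: str
    tags: dict = field(default_factory=dict)


def experiments_for_project(experiments, project):
    """Return the experiments whose 'projects' tag lists the given project."""
    matches = []
    for exp in experiments:
        # Tag values are plain strings here, so several projects are stored
        # as a comma-separated list (an assumption of this sketch).
        raw = exp.tags.get("projects", "")
        projects = {p.strip() for p in raw.split(",") if p.strip()}
        if project in projects:
            matches.append(exp)
    return matches


experiments = [
    Experiment("baseline", {"projects": "stock-prediction"}),
    Experiment("xgboost-tuning", {"projects": "stock-prediction,pricing"}),
    Experiment("churn-model", {"projects": "retention"}),
]

stock = experiments_for_project(experiments, "stock-prediction")
print([e.name for e in stock])
```

Because the project is only a tag, the same experiment (`xgboost-tuning` here) can belong to more than one project without any new layer in the data model.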

41:56 Demetrios

Super cool. Yeah. Shout out to Matt for that one. I've got some more. I've got a few more actually. So, are there any plans of implementing RBAC and multi-tenancy features in MLflow? This is me speaking now. You said, how big the ecosystem has become for MLflow – you have the Databricks, you have the Azure – does that make it really hard to start implementing bigger pushes or bigger bets in certain directions?


That's a good question. I wouldn't say that the platform growth in Databricks and Azure ML has made it hard for us to make big improvements. It's just made us, I think, a little bit more deliberate about what we actually end up prioritizing. Once we've recognized that something is important, I think our velocity is still pretty good in our ability to iterate there. So when it comes to things like RBAC and multi-tenancy and those traditionally enterprise features, we could build something like that in open source MLflow, and I don't think anybody would have major qualms about it or try to block that.

Our goal, I think, right now, with respect to that class of issue should be to make it as compatible as possible with industry-standard tools that people use for RBAC and multi-tenancy, not necessarily to build an opinionated model into MLflow, but to make sure, for example, that it works with an Nginx reverse proxy. Users have had some issues with that in the past – I think we should invest there and make a deliberate effort to make sure that workflows like that are compatible. That also might extend to a thin plug-in layer, where people can plug in their own auth and their own notion of user identification. I haven't seen the overwhelming need for that yet, but it's definitely something that we could consider.
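To make the idea of a thin auth plug-in layer concrete, here is a small sketch under stated assumptions. Nothing in it is an MLflow API: the `AuthPlugin` type, the `handle_request` function, and the `X-Forwarded-User` header name are hypothetical, chosen only to show a server that delegates user identification to a pluggable function, for instance one that trusts a header set by a reverse proxy.

```python
from typing import Callable, Mapping, Optional, Tuple

# An auth plug-in maps request headers to a user name, or None to reject.
AuthPlugin = Callable[[Mapping[str, str]], Optional[str]]


def proxy_header_auth(headers: Mapping[str, str]) -> Optional[str]:
    """Trust a user header set by an upstream reverse proxy (assumed name)."""
    user = headers.get("X-Forwarded-User", "").strip()
    return user or None


def handle_request(headers: Mapping[str, str], auth: AuthPlugin) -> Tuple[int, str]:
    """Return (status, body); the server only knows the plug-in interface."""
    user = auth(headers)
    if user is None:
        return 401, "unauthorized"
    return 200, f"experiments visible to {user}"


print(handle_request({"X-Forwarded-User": "alice"}, proxy_header_auth))
print(handle_request({}, proxy_header_auth))
```

The design point is that the server stays unopinionated: swapping `proxy_header_auth` for another callable changes how users are identified without touching the request handling.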

43:59 Demetrios

It’s fascinating to think about how we were just talking about the idea of what to eat up, or what parts of the stack to take, and then here, which parts of the stack to hand off to a third party. So it's clear that you've thought deeply about this and you're constantly thinking about, “What should we make our priority? What should we make that we want to fix? And what can we hand off? And what can someone else do better? Because if we do build something that is opinionated about this, inevitably, it's potentially going to alienate certain use cases.”

44:50 Corey

Definitely.

44:51 Demetrios

Let's keep going with these questions. When will you implement login features to the experiment page?


I think that's a ditto on the previous answer. Auth and user management. We would love, I think, to have a conversation about what auth layer abstraction might look like in MLflow. I don't think we've taken the opportunity to have that dialogue with the community yet. We've certainly triaged the ask multiple times and revisited whether or not that's something we want to invest in. But getting more specific and into the weeds and figuring out exactly what the requirements are seems like a good exercise. No immediate plans to do that, but we're definitely open to discussing how to make the integration of third party auth easier.

45:46 Demetrios

Sweet. What is the challenge that most companies or clients you've worked with face when operationalizing MLflow in their stack? What's been one of the hardest things or challenges and what's been one of the most frequent challenges?


This is where I wish Ben were really here to speak from the customer perspective, because he's super close to the metal on that. But he's bubbled up enough information and I've talked to enough customers that I have a sense of what some of those challenges are. A lot of it is not so much infrastructure-related – which is good, it means we're doing our job – it's “How do I actually most effectively use these tools, like MLflow Tracking and MLflow Models, to structure my workflow?” Going back to kind of what we talked about in the beginning, MLflow Tracking is kind of a very simple abstraction – it’s an API for logging a bunch of stuff and querying that stuff.

But how you decide to log stuff, and lay it out, and query it later, can really vary. You have some customers that are doing time series forecasting, for example, that say, “Hey, we have a hundred or a thousand different categories of product being sold in a variety of locations. So we're going to build a model for each of those, and we're going to log them all separately, and we're going to retrain them every hour.” And then all of a sudden, that's “Oh, crap. I'm having trouble figuring out what was trained and when because it's just a mass of data that's logged to MLflow.” Or “I'm having trouble operationalizing these models because I need to pull the best model for each category back and then write all this code to do it.” So they kind of end up locking themselves into some complex workflows and – I wouldn't say through a fault of the tooling – but because it's open and unopinionated, customers often struggle with “What are the best workflows within that platform to adopt?” That's something that we're hoping to make easier for the time series use case, and again, I wish Ben were here to talk about that, but then also for general MLOps workflows – introducing some more opinionated concepts. We've recently published an MLflow Pipelines RFC in the open source community. And the goal of a pipeline is to be substantially more opinionated about what it is that you should be logging, and how you should be accessing it, and basically create an even happier path to production.
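The "pull the best model for each category" chore mentioned above can be sketched in a few lines. This is an illustration only: the run records are plain dictionaries rather than MLflow objects, and the `category` tag, the `rmse` metric, and the "lower is better" rule are assumptions made for the example.

```python
def best_run_per_category(runs, metric="rmse"):
    """Keep the lowest-metric run for each category (lower is better)."""
    best = {}
    for run in runs:
        category = run["tags"]["category"]
        current = best.get(category)
        if current is None or run["metrics"][metric] < current["metrics"][metric]:
            best[category] = run
    return best


# Hypothetical run records, one model per product category per retraining.
runs = [
    {"run_id": "a1", "tags": {"category": "shoes"}, "metrics": {"rmse": 4.2}},
    {"run_id": "a2", "tags": {"category": "shoes"}, "metrics": {"rmse": 3.7}},
    {"run_id": "b1", "tags": {"category": "hats"}, "metrics": {"rmse": 1.9}},
    {"run_id": "b2", "tags": {"category": "hats"}, "metrics": {"rmse": 2.4}},
]

best = best_run_per_category(runs)
print({category: run["run_id"] for category, run in best.items()})
```

The selection only works if every run was logged with a consistent category tag and metric name, which is the kind of convention a more opinionated workflow would enforce up front.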

48:34 Mihail

I have a little bit of a different question from the community. So, you're one of the maintainers of the project, and obviously, a lot of fantastic work that you've done and I'm sure a lot of fantastic things still to come. But it didn't always have to be this way – there could have been a different set of universes where you did other things. So my question is, or I guess my question on behalf of the community is – if you weren't working on MLflow, what would you be working on? What would you want to be working on?

49:04 Corey

Oh, man. Yeah. Me, personally? That's a great question. I really enjoy web service architectures, distributed systems, and things like that. So if I didn't have this glorious opportunity to explore that in open source, but then also within Databricks, I'd probably be architecting and code monkeying around the enterprise side of things. I enjoy that kind of work. I like building platforms. Open source is awesome and it's super cool to have the opportunity to engage with an open source community and build out this tool using systems engineering and platform design considerations. But I think ultimately, that's the interest of mine – platform development.

49:53 Demetrios

Nice. When you think about different projects that have been built on MLflow – or not projects using MLflow, I think about projects on top of MLflow or things that take inspiration and use MLflow as a foundation – have you seen any that you really like? I remember, probably like two years ago, one of the community presentations that we had when we were first starting was this framework – it was a scaffolding around MLflow called Hermione. And it really made MLflow a bit more intuitive for a certain subset of users. Have you seen stuff where you're like, “Woah! That's a really cool thing to do on top of it!” Like some kind of app or something that you like and you want to give a shout out to?


It's a great question. We've seen a whole bunch over the years. I think, unfortunately, we haven't seen ones that are heavily adopted over a period of time that have stayed on our radar, so I'm hesitant to shout out any particular thing, worrying that I might date myself by about two or three years at this point. But there have been all sorts of cool blog posts about, “Hey, we used MLflow to do this. And we built an extension layer.” Some of those projects have fed contributions back into the MLflow fluent API surface, so we've had big improvements to our search abstractions and to our run management and stuff like that, as a result of people going out and building their own software on top. So if I'm going to shout out anything, it's the collective of folks that have decided, “Hey, we built something useful. We want to contribute it back to the mainstream and get that merged in.” Thank you so much. It's been really helpful for the growth of the project.

51:43 Demetrios

Well, that does feel like another – this is a proxy metric maybe and it's very hard to quantify – but knowing if you have a healthy product or a healthy community, that is a great metric to look at. Like how many people are creating things on top of what you've built is much more powerful, in my eyes, than GitHub stars.


You're absolutely right, Demetrios. I think it's a great signal for us moving forward. So there's that awesome metric, like, “How many different projects embed MLflow as a dependency?” We have to look at that more. So I appreciate the guidance there. You're absolutely right.


All right, I've got a few rapid fire questions for you before we finish. There was one question it was kind of for Ben, but maybe I'll ask it for you before the rapid fire questions just to fake people out a little bit. Let's see if you have any opinions on it. How do you feel or what do you think about feature stores in MLflow? And using the two of those together?


Fantastic question. It’s a synergy that's important and needs to happen. On Databricks, for example, there's a feature store offering – there are some MLflow integrations. I think that there's more work to be done there and that feature management is core to the development and productionization of ML models. As a result, it's very much within MLflow’s domain. So we're looking at it. We're trying to figure out what the abstractions might look like. And I think people that need this kind of functionality should have their needs met sometime in the next year or so. That would be my guess. Yeah, we're definitely interested.

53:32 Demetrios

Ooh. All right. Now for the lightning round that we've all been waiting for. What was the last bug you smashed in… I was gonna say in production. But then I thought “No, just what was the last bug you smashed?” Just give me the last one. chuckles


Definitely. So we smashed some pretty brutal performance issues with MLflow Tracking and SQL-based stores, which is pretty critical. We were able to take logging performance for a batch of parameters and metrics from something like 30 seconds for 1,000 of these down to about a second or two. That was really recent and really important for performance tracking.

54:17 Demetrios

What was the main takeaway from that? What did you learn (that you can tell us)?


chuckles Absolutely. That “push as much as you can into SQL” because that's such a heavily optimized ecosystem with 50+ years of investment, so you're not going to do better trying to wrangle your own thing upstream.
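A generic illustration of "push as much as you can into SQL", using Python's built-in `sqlite3` module. This is not the actual MLflow fix: the `metrics` table, its columns, and the `log_metrics_batched` helper are assumptions for the sketch. It shows two things: inserting a whole batch with a single `executemany` call inside one transaction, and letting the database do the aggregation instead of pulling rows into Python.

```python
import sqlite3


def log_metrics_batched(conn, run_id, metrics):
    """Insert all metrics with one executemany call inside one transaction."""
    rows = [(run_id, key, value) for key, value in metrics.items()]
    with conn:  # commits once on success, rolls back on error
        conn.executemany(
            "INSERT INTO metrics (run_id, key, value) VALUES (?, ?, ?)", rows
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (run_id TEXT, key TEXT, value REAL)")

# A batch of 1,000 metrics, mirroring the batch size mentioned above.
metrics = {f"metric_{i}": float(i) for i in range(1000)}
log_metrics_batched(conn, "run-1", metrics)

# Let SQL do the counting and summing rather than looping over rows in Python.
count, total = conn.execute(
    "SELECT COUNT(*), SUM(value) FROM metrics WHERE run_id = ?", ("run-1",)
).fetchone()
print(count, total)
```

How much a change like this helps depends on the database and driver, so the timings quoted in the conversation should not be read off this toy example.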


Wow. Okay. Now – last book you read?


Last book I read? I've got it sitting on my desk – A Philosophy of Software Design. It’s a classic. They gave it to us, actually, when I joined the company and I probably should have read it sooner. chuckles


Oh, nice. Well, I'm going to do something because I saw somebody else do this and I thought it was awesome. We're going to give out that book to someone that comments on the YouTube page. Let us know what you think about it and we will give a free copy of that book to you. So, give us some comments. What did you think of this conversation? Now, what piece… Corey cross-talk


Does that apply to us as well? Because I don't have that book. Can I comment? laughs Can Corey comment on this?


laughs All right, man. But you’ve got to do it with your fake avatar or your Twitter bots or whatever.

55:36 Corey

Setting it up right now.

55:38 Demetrios

laughs Yeah. Exactly. So, Corey, what piece of technology are you bullish on that might surprise people? It doesn't have to be coding technology – it can be like the IoT of your house or whatever. But is there something that you think (it can be coding. It can be whatever – I don't think Kubernetes would surprise anyone) but is there something that you're particularly interested in these days? Or you've gone down a few rabbit holes and you want to see more of happening?


Don't say crypto. Please don't say crypto. laughs NFTs?


No, not crypto and NFTs. This is going to be boring, but I just want to continue promoting as much as possible, but… pipelines for ML and for data I think are super useful. We've seen general purpose orchestration tools for pipelines in the past and things like Airflow and others that have gotten a ton of adoption and been super useful. But I think there are some special characteristics in the ML space that pipelines can really help with and I'm excited to see how that plays out. So very much in the wheelhouse – very much in the same space. And I’m thrilled about it. Crypto’s cool, too. But I don't think I'm really an expert enough to talk about it.


Not so bullish on that. That makes me very excited, because I've got my friends over at ZenML who would be stoked to hear you say that, and they are working in that space currently – right now. Last question for you and then we're going to roll. How do you want to be remembered?


I would love to be remembered as the person that helps everybody – from the citizen data scientist to the enterprise – do MLOps better. And I hope that folks feel like they're on that journey with me, and with us as a team, and I hope that we can continue to make them happy and deliver what they need.

57:36 Demetrios

Excellent answer. Love it, man. Thank you for coming. This has been awesome.


Thank you so much for having me on. I had a real blast. Yeah. Thanks so much, Demetrios and Mihail. It's been so much fun.

57:51 Mihail

MLflow is in good hands. This has been very, very inspiring.


I appreciate that.


There was a question, but it's like, “What's the roadmap look like for MLflow in the next whatever?” But I'm sure cross-talk


That is an awesome question. I’ll spend 15 seconds on it. We are working on a public facing roadmap v2. We wrapped up our first ever public facing roadmap with like 75% contribution rate from the community on that, which is great. We're doing it again. And just one other thing to shout out is – we're hoping to expand the set of maintainers over time for MLflow. We don't want this just to be a Databricks-driven project, so look out for opportunities there as well. And please get involved and help us execute.

58:35 Demetrios

Awesome. How do you incentivize people to get involved?


Free books. chuckles

58:42 Corey

Free books, great questions.


Great questions from the MLOps community.


We do MLflow T-shirts and mugs, so if that's appealing, your contributions will be noted in the release notes that are published on the websites, on Twitter, on GitHub. So you get some of the personal branding side of things as well, which is nice. outro music
