MLOps Community

I Don't Really Know What MLOps is, but I Think I'm Starting to Like it

Posted Aug 18, 2021 | Views 300
# Cultural Side
# Case Study
# Presentation
# Coding Workshop
SPEAKERS
Ewan Nicolson
Head of Data Science @ Forecast

Ewan graduated in 2006 with a BSc in Physics. Since then he's built a career in data. He has a breadth of experience in technical roles, covering the fields of data science, analytics, and data engineering. He's also an experienced leader in the field, with a particular focus on coaching, team development, and culture.

Before joining Forecast he worked for companies including the BBC and Skyscanner, and found himself in industries including seismic exploration, technology, and advertising. When he's not crunching numbers, Ewan is a bit of a bookworm who enjoys traveling, watching cricket, and getting out and about in the Scottish Highlands.

Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building Lego houses with his daughter.

SUMMARY

A data/ML system in production is different from both traditional software engineering and traditional data science/analytics workflows. These differences can be pretty subtle, and trying to use your traditional skill sets to solve these new problems doesn’t work. Ewan demonstrates a realistic machine learning system in production and uses this demo to show some patterns that he has found useful for living with ML in production, and maybe debunk a couple of myths along the way. Ewan also points at some MLOps developments that he really likes and shows some things on his wish list.

TRANSCRIPT

0:41 Demetrios [sings] He graduated in 2006 with a Bachelor's in physics. Since then he's built a career in data. He's had a wealth of experience in tech, covering fields of data science, analytics, and data engineering. He's also an experienced leader in the field with a particular focus on coaching and team development. First time I reached out to him, it was on LinkedIn but he didn't give a sign of life. I thought that for sure he was going to cancel the invitation. I got word from his friends that he was the guy to talk to about MLOps, so I sent him a message hoping to get an interview. But I was wrong, so wrong. He never actually responded to anything I sent. I even sent him a voicemail. It's impossible to sell anything to you. [inaudible] in what he's got to do? There we go. Welcome to the show. 2:24 Ewan That's amazing. 2:27 Demetrios So in case anybody missed it, I first tried to talk to you, Ewan, in 2019 on LinkedIn because somebody told me that he was the man to talk to at the BBC when it comes to MLOps and I was really excited. And then what happened? Crickets. But now, two years later, we get to talk to him. And we were saying, actually, before we hit record and we jumped on here that it is in a much better place that we finally get to meet and exchange. And he's prepared a whole lot for us today, which is awesome. You've got a whole presentation, I think. Yeah, man. Welcome to the show! 3:13 Ewan Thank you very much for having me. It's awesome. Quite an ego boost having a song written about you, as well. Yes, I've got a presentation. I'm going to try and do some demoing as well. 3:31 Demetrios Nice! 3:32 Ewan But I'll start off with the title. Because I've been semi-active in the machine learning operations community for a wee while now. As the title states, I'm not entirely sure I know exactly what machine learning ops is, but I think I'm really starting to like it. Probably the big theme of this is just saying thank you to the community and thank you for all the things that I've learned while being a part of it and really spreading a bit of appreciation and gratitude to you all. But first, I'll tell you a little bit about myself, which you've already done in the song, Demetrios, so thank you. I'm the head of data science at a company called Forecast, a data consultancy, which means that we do lots of work for lots of clients all over the place. Before that, I was a data scientist at the BBC and also Skyscanner. So I've been doing machine learning work, and data work more generally, for quite a long time. My background is as a data scientist, but I've learned a lot from some really excellent engineers as well, like these companies that I've worked in have all been very collaborative, very collegiate, lots of fun to learn from each other. So that's kind of where I'm coming from so that you know who you're listening to, basically. I've also got to apologize to Demetrios, as we've just talked about there, I totally ignored him on LinkedIn. This is such poor behavior. I apologize super deeply. And I'm really glad that we're talking right now. 5:12 Demetrios Yes! As we said, I think it's much better to be doing it in this context than the context I was trying to sell you something with before. 5:21 Ewan Definitely. And the other thing is – this presentation might be incredibly embarrassing for me because I have not been very hands-on with machine learning for a few years. I've been in many leadership positions.
So trying to remember my knowledge and trying to distill it back to things that are relevant for this very technical community that I'm talking with. And the other reason that this might be a big mistake is that I don't think I could give you a very concrete definition of what MLOps is. And I'm talking to the world's experts in this, so I'm feeling quite humbled at the moment. But the reason that I'm feeling quite okay with that is – I was reflecting back to the olden days of data science 10 years ago, if that counts as the “olden days,” I had a really hard time articulating what data science was. There were all of these sorts of descriptions. There's this Venn diagram from Drew Conway, there were all these memes about “data science statistics done on the MacBook,” and all that sort of stuff. People didn't really have a good definition of what data science was. But there was a need, there was something latent there. And data science was just the term to describe that, and it feels kind of similar, MLOps. 6:48 Ewan You've got this kind of need out there – people are realizing how complicated it is to live with machine learning systems and how horrible that can be. And there's this need and it's captured by this phrase, MLOps, even if I don't really know how to describe it to my mom, who I was telling that I was doing this meetup to the other day. But it feels like it's time has come – it's a term that I'm really starting to enjoy. So this is what I'm going to do. I'm going to share a few things that I've learned from being part of the community. What I mean by “the community” really is like watching these meetups, but also there's a very active Slack community, which has been super to learn from – lots of really smart people there. So I’ve really enjoyed that. The way that I'm going to show some of the things that I think I really like, is by taking an example of how I've done things in the past and showing how MLOps has kind of advanced that discussion. It's gonna push things forward a little bit. That's what this is really about. It's about saying thank you for the progress that's happening. Right. So here's the kind of rough representative system that I'm going to use to demonstrate. This is on GitHub, as well. I'll share all this later, so that you can play along at home. But I'll get a talk through and I'll do a bit of the demo and I'll jump around. I'll do it quite fluidly. I'm not paying attention for any questions, just know that, but… 8:42 Demetrios I’ll keep you honest on that in case anybody has one. Yeah. Feel free to drop them in the chat. And then, at the opportune moment, I'll ask him. 8:49 Ewan Brilliant. Thank you very much. So this is kind of a fairly representative recommender system. Like, this is the sort of stuff that I've been part of for a while. And it's trying to simulate what a real world system kind of looks like. So we've got streaming data ingest, all pumping through to a data lake, where we can get access to it. We've got the logic, the filtering, the cleaning of the data. And then that goes off to a couple of places. There's a simple collaborative filtering model to get user and item embeddings coming out. Those get stored in an approximate nearest neighbors index. There's also a feature store for any of the kind of user or item features that we're interested in. 
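For readers following along, here is a minimal sketch of the collaborative filtering step just described. This is not the code from Ewan's repo, just an illustration of learning user and item embeddings from (user, item, rating) triples; every name in it is hypothetical.

```python
import numpy as np

# Toy matrix-factorization trainer: learn user and item embeddings from
# (user, item, rating) triples by SGD on squared error.
def train_embeddings(triples, n_users, n_items, dim=16, lr=0.05, reg=0.01, epochs=10):
    rng = np.random.default_rng(0)
    users = rng.normal(scale=0.1, size=(n_users, dim))
    items = rng.normal(scale=0.1, size=(n_items, dim))
    for _ in range(epochs):
        for u, i, rating in triples:
            err = rating - users[u] @ items[i]
            u_vec = users[u].copy()                      # keep the pre-update value
            users[u] += lr * (err * items[i] - reg * users[u])
            items[i] += lr * (err * u_vec - reg * items[i])
    return users, items

# e.g. three users and three items with ratings on a 1-5 scale
triples = [(0, 0, 5.0), (0, 1, 1.0), (1, 1, 4.0), (2, 2, 3.0)]
user_emb, item_emb = train_embeddings(triples, n_users=3, n_items=3)
print(user_emb @ item_emb.T)  # predicted score for every user/item pair
```

In the architecture being described, the item embeddings would then be loaded into the approximate nearest neighbors index, while user and item metadata would go into the feature store.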
I’d love a better name for this component here because it sounds deeply unglamorous but “a business rules engine,” like a thing that runs queries against the feature store and the ANN index and then presents those results through an API. So I'll go over to a little bit of a demo. Is that a good enough size? Is that okay? 10:07 Demetrios Yeah, that looks good. To me at least. If anybody has it too small, let us know. 10:11 Ewan Nice. So I'll just do a few nice steps. The first thing that I do is – this is all done locally, it's not cloud technology, just to make it easier to use for demos and showing off what's actually happening. The first thing is, I'm trying to simulate data streaming in. This takes Amazon reviews data set, converts it to Parquet format. And this is starting to stream in day one, day two, day three, day four, day five, and so on. This is kind of the first step of any machine learning tool – getting access to the data. This has done this part (I’ll jump back) it's done this part of the data stream, and it's now accessible in a data lake. And I've got some logic in there that tidies up and takes the columns that I'm interested in. I'll now do the training. Again, this is just a collaborative filtering training model. You can see, hopefully – yeah, the loss is going down. So it's training. Thank goodness that's working. So this is just calculating embeddings for the users and the items. This is just your usual collaborative filtering type of approach. Let that finish. 11:43 Ewan Then the final part is I'm going to build these two indices. So the features store – the kind of data about the users and the items – and I'll put those embeddings over here in my approximate nearest neighbors index. This one might take a wee minute just to do all the calculations in there. This isn't the most interesting bit, this is just us getting set up for talk, and actually showing what this is going to be useful for. There we go. Now the final part that I can do is just set up the API so that it's serving. Fingers crossed. There it goes. Yep. Now, at that tab there, you can see, given a user ID query – here are some of the item IDs coming back. So It’s kind of fairly usual, standard-type of stuff. So hopefully, that kind of gives you a flavor of what's happening through this system. We kind of got the main parts – data ingest, we've got a model that's being trained. And then we've got this part here, where we're serving – where we're querying against the stuff that we've modeled. I'll be able to share all these later as well, if you want to do that. So that's grand. But what I want to do now – this is the way that I would have been doing things a wee while ago, let's say. Before I saw the light – before I saw everything about MLOps. 13:33 Ewan What I want to talk about really is how some of the techniques I've learned through, or that I've been exposed to, through the MLOps community have really helped me feel happy with this. The first one that I want to talk about is data testing. It's a field that I'm very excited about. I've always had this problem with my data systems – unit tests are very effective because they help you with your logic, they help make sure that you've checked your homework and all that sort of stuff. But then when you get into integration testing for a data system, it never works as well as a regular software system. 
I’ve tried to do this sort of thing in the past, where you've got integration, like a sample of test data that I run through the data system, and then I compare it against some pre-computed results to see that it's all sort of doing the right sort of thing. It never really works very… it doesn't catch that many problems. There's probably a couple of things with that. One of them is that your integration test data needs to be really representative of real-world data. And real world data is messy, and it changes and it messes you up, and it breaks things in weird ways that you could never have anticipated. 15:06 Ewan So no matter how good your integration test data ends up being, there's always stuff that doesn't get caught by it – then your integration tests give you less confidence. The other thing as well is, if you're doing a lot of… one thing you're trying to check for when you're doing very big machine learning training runs is that you're not going to start running out of resources – you're not trying to factorize a really gigantic matrix that breaks through the amount of RAM that you've got available, for example. So that means that you've got to throw an awful lot of integration test data through every time you're running your integration tests. And if you're doing a lot of dev work – if you're doing that, it means that your integration tests become super expensive and don't end up catching that many errors. What I've generally found gives me more confidence is stuff like this. I'll maybe jump back to the repo. One of the steps in this example is doing a data increment. What that does is – it creates just a Jupyter Notebook report. This is a PDF. And there's just a whole bunch of eyeball checks that you end up doing. So you've got this plot – this is the most recent day of data. It looks reasonable. That's the sort of level of confidence that I was used to dealing with. Maybe it's quite nice to automate this, but it's all eyeball-based. It's all kind of “Does that feel okay? Yeah. Then I'll push the button and push it to the next phase.” But what I'm very excited about with all of these tools that are about data validation and data checking – I think I've seen all of these people talking on the MLOps community – but they're just very good tools for making me feel just that bit more confident that there's nothing super weird coming through in the data, or there's nothing super weird coming out in the results. 17:26 Ewan So I'm excited by these because they're effective and they're good. Like, I use Great Expectations a lot. This is an example of a Great Expectations test, where it’s saying, “We're expecting the values to always be between this and this, with 95% confidence.” That's very expressive and that feels good to me as a data scientist. That lets me feel confident that there's nothing bad coming through in the data. So I want these to continue to get better and more expressive. I think one observation that I'd make is that these are always more noisy than a regular integration test would be. Because you've got things like this 95% confidence – you're always dealing with messy data so the threshold is always a bit lower with these sorts of tests than those I’ve maybe used in the past, which means that most of the time when you're running these, you kind of just say “Yeah, that's probably okay.” Sign it off and go on to the next stage. So if these get even better, you're gonna make my life a lot better. Please keep doing this, thank you very much. 
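For reference, the kind of check Ewan describes looks roughly like this in Great Expectations' older pandas-style API. The file path and column names (taken from the Amazon reviews data) are assumptions, and the `mostly` argument is how the "95%" tolerance he mentions is usually expressed.

```python
import great_expectations as ge  # classic (pre-1.0) API; newer versions organise this differently
import pandas as pd

# A day's worth of the (hypothetical) Parquet increment from the demo pipeline.
reviews = pd.read_parquet("data/day_5.parquet")
batch = ge.from_pandas(reviews)

# "We're expecting the values to always be between this and this, with 95% confidence."
rating_check = batch.expect_column_values_to_be_between(
    "star_rating", min_value=1, max_value=5, mostly=0.95
)
id_check = batch.expect_column_values_to_not_be_null("customer_id")

print(rating_check.success, id_check.success)
```

The point is the one he makes: the expectation is expressive and tolerant of a little mess, which is also exactly why it ends up noisier than a conventional assertion.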
The second thing that I want to talk about is CI and CD for data products. I think that we're very close to having this being a reality, which would be awesome. Again, this would be fantastic. Please, everyone who's smart in the community – keep doing this sort of stuff. I've stolen this diagram from GitLab. It's just a nice illustration of the loops that you've got for continuous integration and continuous deployment. In my experience, we're not there yet for most of the data products that I'm working with. There's a few reasons for that. We just previously talked about how noisy the tests tend to be for various data products. So the automated test part is a bit less automatic than in regular software. 19:54 Ewan A few other things I would say – sometimes if you're doing these, your test run can be very expensive. If you're training a big model, you're throwing a lot of data at it, it can cost quite a lot of money (cloud resources) to get there. That's kind of another reason why, rather than running a few unit tests and having a nice little thing that you can deal with, some of these machine learning services get a bit sprawly and a bit big and a bit difficult to do that with. And the other thing that I would say is, the deployment to production is always a bit more… it's a bit harder to roll back. For example, if you've deployed a bunch of data into your production database and you want to roll that back, then you've got to do a whole bunch of things. You've got to replay the data, you've got to make sure that everyone who's used that data knows that there is a mistake in it, you need to make sure that if they've downloaded it to excel on their desktop, that they know to redownload it. So data is always just that little bit horrible and a little bit messy. But I don't think we're a million miles away from this. And it would just be super nice and super happy-making if that became reality. 21:24 Ewan What I'm going to talk about next – one thing that I really like with MLOps is that it's starting to break down some of these machine learning monoliths that I've… Well I'll just go and start talking. Previously, this system that I've got, would have just been one big system. It would have been a box on an architecture diagram that says, “Do not touch this. Here Be Dragons.” And nobody would be very confident breaking this apart, because we just didn't have the vocabulary for this. This was just “the machine learning part” maybe. But what I really like about MLOps is that – because we're talking about this so much, we're talking about what a machine learning system is – it means that we can start to break down that machine learning system into repeatable components. We've got the bit that deals with data in stream, we've got the bit that deals with data at rest, we've got modeling, we've got feature stores, we've got ANN indexes. I really like that, because now people know what we're all talking about – because we've got a common vocabulary. The brilliant thing about this as well is that we can put names on these things as well. So if you're talking about an “approximate nearest neighbors index,” you've got a few options. You can say, “Are we gonna do this ourselves? Are we going to use something like Pinecone?” This means that we're actually maturing what machine learning is a little bit, because we're able to break things apart. We're able to swap things out. We’re able to be a lot more agile – things are less monolithic. 
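To make the "swap things out" point concrete, here is a sketch of the approximate nearest neighbors component hidden behind a small interface, so that a self-hosted index and a managed service become interchangeable. Annoy is used purely as an example of a local library; the talk doesn't say which one the demo actually uses, and all class and method names here are hypothetical.

```python
from typing import Protocol, Sequence

from annoy import AnnoyIndex  # one possible local ANN library, chosen only for illustration


class NeighbourIndex(Protocol):
    """The only thing the rest of the system needs to know about the ANN component."""
    def query(self, vector: Sequence[float], k: int) -> list[int]: ...


class LocalAnnoyIndex:
    """Self-hosted implementation; a managed service (e.g. Pinecone) could sit
    behind the same interface without the callers changing."""

    def __init__(self, item_embeddings: dict[int, Sequence[float]], dim: int, n_trees: int = 10):
        self._index = AnnoyIndex(dim, "angular")
        for item_id, vector in item_embeddings.items():
            self._index.add_item(item_id, vector)
        self._index.build(n_trees)

    def query(self, vector: Sequence[float], k: int) -> list[int]:
        return self._index.get_nns_by_vector(vector, k)
```

Anything downstream, such as the business rules engine or the serving API, only depends on query, so the backend can change without touching the callers.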
This feels very productive as a discussion compared to how things were a wee while ago. 23:14 Ewan Going back to your point, Demetrios, I'm not a very good person to try and sell to, because I'm very skeptical. This is just something that I would say like, most of the time, I don't actually care about the features of the feature store. I've taken this table from a blog post from Logical Clocks and you can see that they've done a very good job. Literally, this was one of the first blog posts I read that actually made me understand what a feature store is. Because previously, I was just like "Well, it's a database. Right?" But this actually made me understand some of the differences, and some of the kind of "features" that you get from it. But with the exception of a few of these, like maybe "key value lookups," the rest of that table is not the thing that's going to sell it to me. And that's because the mental model is the same – it's a key value store. I'm familiar with that. That's good, that's happy. But what people should be doing when they try and sell me a feature store is not the technical features, but telling me "What do I get from having a feature store (starting with 'why?')?" And the things that I love about feature stores – I'm very surprised I'm using the word "love" about it – but the things that I love are that I now have a word for this thing. I can say "feature store" and anyone who's within the community will understand what I'm talking about. I can use this in discussions with developers, I can use this as a way to articulate what that part of the system is doing without having to go through all of that kind of "Well, it's going to be a key value store. It's going to be very low latency. Look up and read and write, blah, blah, blah." But we're not going to have to say all of that. So I love that it's a common vocabulary that we've got now. 25:15 Ewan And I really… There's something quite nice about the feature store, because it's collaborative. You're saying that "this is the one place where we're going to write everything to". That's a very strong signal culturally to people. So I think that the usual problem with trying to sell data products to people – what's the purpose of them? It's always really difficult to articulate that. What's the use case of a database? It's really hard to articulate because it enables other things – that's probably the use case of it. But telling me a story of stuff like this, it's going to allow me to have better discussions with people and we're going to be more collaborative when we do it. That's how you sell a feature store to somebody like me. 26:22 Demetrios There's gonna be a lot of people reaching out to you after this one, I think. From all these feature store companies. [chuckles] 26:28 Ewan [laughs] Yeah, I've done their job for them. That's good. 26:31 Demetrios Exactly. [laughs] 26:34 Ewan So these points that I am making just now, what I'm observing is that talking about MLOps and the discussion that we're having is advancing our knowledge as an industry, as a community – however you want to put it. A few years ago, this was the kind of community attitude to Jupyter Notebooks. This is a brilliant presentation at the Jupyter conference, where somebody (Joel Grus) is saying that they don't like notebooks. And this is what – three, four years ago? So it's a very good and clear presentation, because what they're saying is that notebooks can be pretty horrible.
They've got the stateful element to them – you can have this cell, then this cell, and that cell runs. They don't play nice with version control – all these sorts of very valid criticisms of notebooks. But I was talking with my colleague Finley today, and we were just talking about some of the nice things that you can do to make it easier to live with notebooks. We talked about tools like Papermill, Jupytext, and also having a really clear understanding about what you're using them for. So when are they for exploratory work? And then how do you use them responsibly? Like, how do you make sure that the code isn't different in different places – so importing from modules, refactoring out of notebooks, all this sort of stuff. I'm now very happy to have notebooks done responsibly in my production products. What do I mean by that? There are two places in this repo where I've got notebooks. We've already seen one – this one about data validation, visualization, having a nice exploratory data analysis that I can then automate and parameterize using something like Papermill. So that's one place that I've done that. And then the other place that I'm very happy to have used the notebook is in the way that I've developed the model. 29:02 Ewan Here's where I've been iterating – I've been changing parameters, I’ve been actually trying to understand what the shape of embeddings looks like. But you can see up at the top that I've not started from scratch. I'm actually using some of the data manipulation, data cleaning, so that I know that I'm starting from that same starting point. So I don't think notebooks are as bad as they're made out to be, especially with a lot of the advanced tools that we've got and the advanced understanding that we've got from the machine learning ops toolkit. 29:48 Ewan And this is the kind of final point that I'm going to talk about. What I'm trying to say here is, we've got a few different disciplines that all work together in MLOps. And it's quite tempting to split your domain up a bit. I would know that, for example, my data scientists would be very productive – they make a lot of value in this section by tweaking the model, changing the parameters and also, in some of the feature engineering bits – we've got some of that. That's where they will be able to make a difference. We've also got kind of very hard engineering problems at scale about “How do we ingest data? How do we deal with an API that's going to work at scale? Is it going to be secure?” And then cloud and data architecture about like, “Should we build or should we buy? What's the right tool to use? What’s the right cloud provider for all of them?” What I am trying to advocate for is to not do this – to not silo people off from each other. Instead, I've got a very nice Instagram-friendly gradient over here. But what I really advocate for all the time in my teams is that we don't pigeonhole people. We try and mirror the knowledge about – if you've ever got a place where the knowledge is so specialized that nobody else can look in, that's a bad smell. And I really think that MLOps, a lot of the – first, the tooling makes it much more democratic. The barrier to entry is a lot lower with a lot of the MLOps tooling. You can get your hands dirty quite quickly, follow a few tutorials, blah, blah blah. That's easy to do. And the other thing is just going back to that thing that I was saying about – the vocabulary is clearer. 
So if you're a data scientist and you want to talk to an engineer, now you don't need to try and translate from each other because we've got this lingua franca. We can talk about feature stores and we both know what we're talking about. 32:12 Ewan I'll give a concrete example of why this is important. I don't know if you know this blog post. It's one of my favorites. I recommend it quite a lot. It's from a few years ago. But it's "Getting better at machine learning" by Robert Chang, and it's about moving from doing data science machine learning on your laptop to doing it in production at scale. And it's wonderful. It's got a lot of references, it's got all of these things that it will point you towards. One of the references it's got is this DataEngConference talk. And there's this quadrant diagram that I'll try to do justice to and explain. Let me know if I get it wrong. What they're saying is that all machine learning problems have two key characteristics. There's the latency, which is on this axis here. Where very low latency means it's real-time, and over here means that you can take minutes to get results back. And the other axis that we've got is how much context we need to make a good prediction – to make a good inference from your machine learning model. Up here, lots of context required – so lots of data required, lots of data from different sources, perhaps. And then down here, not so much context required – the data set is simpler for you. Now, if you're a data scientist – if you're working in this silo over here of training your model – what you're always going to do is you're going to end up in this quadrant up here. Very low latency and you need a lot of data. Because you've got all of your training data available to you, it's easy to run a query that gives you the past three years of data. You can squash that into a model that gives very accurate predictions and you don't have to worry about how much data you're processing, or how that's going to affect the latency. And you don't have to worry about where on earth you're going to get that data from. 34:38 Ewan So if you leave data scientists in a silo, then you're always going to end up up here, and it's up here that the most difficult engineering problems are. You've got to be really clever about how you cache things. You've got to be very careful about making sure that the stuff that you're reading very quickly is up to date and all that sort of stuff. You can only have a round trip that takes a few milliseconds because this is super real-time. But if you've got people collaborating, like in this model, and you're talking about the trade-offs, you're talking about what the data actually looks like – you can usually find a shortcut. And you can do something like this. You can either say, "I am going to be happy with less data." So push down the graph like this. So "Maybe I don't need that three years of history. Maybe I just need some simple bits of information that I can hold about the user." Or you can push it in this direction. You can say, "Okay, I do need all of that three years' worth of data, but I'm going to squash that down into a slow-changing description of that user." Maybe one of those embeddings that we were showing earlier. So if you've got people working together and you don't have boundaries – you don't have silos – then you figure out these shortcuts and you make your life a lot easier. And I think that's something that MLOps helps me do. And I'm very appreciative and very happy for that.
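Here is a toy sketch of that second shortcut: squashing a long history into a slow-changing description of the user offline, so the online path is just a cheap lookup rather than a three-year query. The function names and the simple averaging scheme are assumptions for illustration only.

```python
import numpy as np

# Offline (batch): compress each user's long interaction history into one
# slow-changing profile vector, e.g. the mean of the item embeddings they touched.
def build_user_profiles(interactions: dict[int, list[int]],
                        item_embeddings: dict[int, np.ndarray]) -> dict[int, np.ndarray]:
    profiles = {}
    for user_id, item_ids in interactions.items():
        vectors = [item_embeddings[i] for i in item_ids if i in item_embeddings]
        if vectors:
            profiles[user_id] = np.mean(vectors, axis=0)
    return profiles

# Online (request time): no long history query, just a key-value lookup plus a
# scoring pass over the item embeddings (an ANN index in a real system).
def recommend(user_id: int,
              profiles: dict[int, np.ndarray],
              item_embeddings: dict[int, np.ndarray],
              k: int = 10) -> list[int]:
    profile = profiles.get(user_id)
    if profile is None:
        return []  # cold start: fall back to popularity or similar
    scores = {item: float(profile @ vec) for item, vec in item_embeddings.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

In the terms of the quadrant above, this is the "push right" move: the context is precomputed in batch, so the low-latency path only needs the profile.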
36:13 Ewan Going to just wrap up a little bit and say what are the things I've really been reflecting on and been very happy with. So going back to that statement of “MLOps feels useful”. The tools themselves are amazing. There's so much just pure technical progress that has been made, especially in some of those examples that had the data validity checking – that is so advanced compared to what I was doing a few years ago. So even just the tooling is impressive. But this point of the advantages are beyond just the pure toolset that you get. There's a strong cultural signal that comes through all of this. It's more mature – the discussion that we're able to have. And the way that we talk with each other is, again, maturity, more systematic. It feels like my field is starting to grow up a bit, which is amazing. Really, this is what I want all of the time – a more collaborative environment. Because if you've got more brains working better together, then you get better outcomes and that's what I think is important for us to be focusing on. So let's break down a few of these silos. Let's work together – and I’m trying not to get too cheesy here – but that's what I want to happen. 37:51 Demetrios Dude, so… can you go back to that slide real fast, where you went from all the different silos? Yeah, to this. That for me is so eye-opening, because you show the different areas where, yes, there is like, the technical knowledge and all of these different fields, but how problematic it can be and trying to go to the next slide where you have that blend, and it's everybody is… there's not a clear cut of where one person's job starts and stops. You're able to overlap, and like you said, you're able to have this common vernacular, and be able to interact and interface with a data engineer, as a data scientist, or someone on the business side. Do you have any specific stories about how that has happened or played out in your life? 38:56 Ewan I can give you examples of how it's gone poorly. The two examples I've got: You can either get it wrong by making everyone stay in the silos, and then what you end up with is – I won't name names, but I was chatting with a data engineer friend of mine and they were saying that their job was to take notebooks that data scientists had written, delete the whole thing, and start again from scratch. That's not fun for anybody, is it? That's not a good situation. The other way that I've got it wrong is by not abstracting from this. What I mean there is like, we say, “Okay, everyone is now able to do anything that they want. All that you need to do is be an expert in these 20 different technologies. You need to know about Kubernetes, you need to know about Airflow, you need to know about Docker. Well, obviously you need to know about Google Cloud, so you need to know about Dataflow, Dataproc. You need to know about BigQuery. You need…” The list goes on and on and on. And you need to have, like, world-class expertise in each of them. That isn't feasible. It's very expensive. It means that a lot of your cognitive ability is spent on knowing what the tools are, rather than being able to figure out where you're going to add value. So it's a really tricky thing to get right, because there is an element of specialization that helps you be more productive. But overspecialization is too much. What’s that quote? “Over specialization is for insects.” You've got to get that balance right and it's incredibly tricky to get right. 
Either you abstract so much that you haven't got the ability to do anything useful, or you don't abstract at all, so that you need to know every single nut and bolt to be productive. Some of this stuff – if I go back to this one here, with some of the tools that we're talking about – are those the abstract elements? These are repeatable, sorry… well, repeatable is one. But these are tools that you can use in multiple different situations. You can recompose them. You can make completely different outcomes. But they're at a high enough level that you don't need to worry about all of the bits that are going on underneath. 41:50 Demetrios Yeah, it's pretty impressive how you show that and break it down. For me, that opened up so many doors in my mind. And I love the fact that you showed that one. When you showed the first slide, I was like, "Oh, wow, that's great! That's a great visual." And then when you took it to the next slide you showed, "But let's get this gradient going on." That was very special. Because how many times have we heard that the data scientists just throw it over the fence? Like you were saying, that is not helpful for anyone and that is not repeatable, and really, it's not a stable, sustainable way to go about doing machine learning. 42:40 Ewan Yeah, exactly. And it's not fun, either. Nobody likes the feeling that their work gets deleted. Then the other thing is like, this is so complex. Machine learning can go wrong in such a number of horrible, different ways. You need all of these brains to be working together to figure out how to solve these problems. If you don't have that kind of collaboration – the hive mind – if you don't have people working together like that, then you don't solve the most difficult problems. 43:16 Demetrios That's such a good point too. Like, if each one is just doing the same thing or trying to replicate, and only getting so far, then what's the point of having a team? 43:31 Ewan Yeah. Well, I mean, this is my perspective. I'm a big believer in the strength of teams. It's why I've been so keen to get into… Like, previously in my career – going way, way, way, way, way back to the beginning. Oh, jeez. I've talked a lot, haven't I? Way back in my career at Skyscanner, I was always very techie. I was very keen on being hands-on. I loved that feeling of writing code, that immediate gratification that you get when it works. That was me – I loved that. But as I've got older (maybe that's part of it), the thing that really makes me passionate now is making a good space for people to work together in a team and what makes a really effective team tick. That, for me, that feeling that you get when you're in a really good team – everyone trusts each other, everyone knows what you're meant to be doing, and you've got this vision, and you feel excited to be getting towards it. If you can make that space – and that's kind of what that gradient is trying to show – that's a team feeling. It's like, I don't know… if you could only bottle it... well, then you'd be making your millions, because… 45:10 Demetrios Then I'd be able to sell to you. 45:12 Ewan Oh, yeah, definitely. If you've got that, then I'll take everything you've got. 45:15 Demetrios Then you'd answer me on LinkedIn. [laughs] I got a question coming through from Mike. He's wondering where you think the biggest opportunities to fill a gap are – either in how we talk about things or in technical maturity? 45:36 Ewan So it's probably on this part of the diagram.
I think that's where a lot of the bad handoffs happen. You've got… I don't have a good term for this business rules engine. I'll show you what it looks like in the (sorry, can't do two things at once). Like, this business rules engine, all it is – is doing stuff like this. This is just filtering out stuff that you've already reviewed, in this case. But this part, it can be really… This is one of the places where you can make a real big difference to the quality of the outputs. Like here, I've just filtered out stuff that you've already reviewed, but that'd be a terrible user experience if you kept getting recommended stuff that you'd already seen or reviewed. Other stuff that you could do very easily in here is increase the diversity of the results that are coming through. So if you took the 50 nearest neighbors and you shuffled them up a bit more, and you made sure that there was a measure of diversity in there, that'd be awesome. This is the part that I don't have good vocabulary for and it doesn't translate very nicely between the two, either. Because when I'm talking about business rules and then I'm talking to somebody deeply technical about their API, they're talking a different language about, like, "How does this get cached? How do we make sure that this works nicely with the CDN?" and stuff like that. That's where the interfacing tends to break down a bit more than elsewhere. I don't know what the solution looks like, but this is a bit that always feels like, if you can be more flexible – if you can talk and have more collaboration going on there – then that would make life a bit more straightforward. 48:02 Demetrios Sweet. Yeah, that's such a nice point. I just threw the GitHub into the chat for everybody that wants to go and explore after we finish with this. There's a lot of great stuff in here. And you said some really nice things that I'm probably going to try and write a blog post about because… I wrote down some notes here, especially when it comes to the idea of rolling back and why it's so hard when data is involved. Why CI/CD is sooo needed and the benefits of a CI/CD system, and just how messy data can be, and how, when you throw data into the mix, it just starts muddying up the waters – so all of the things that need to happen when you have to roll back a model, it's quite tedious. 49:04 Ewan Yep, yep. You can see the response as well. If you're in a team that's doing data engineering and you use the word "replay" of data, people's faces always drop, because that's the operation that's painful, it's expensive, and it's where things go wrong, as well. I would love to be in this sort of situation where we were able to kind of like, "Oh, it's really easy. We can automatically test it. We can roll things back." But I've not got a lot of examples where we've actually lived in that sort of situation. It's always a bit more kind of "We'll do it when we have to." 49:49 Demetrios Do you think that's the… just because again, the tooling and architecture and wording – everything isn't quite mature enough yet? 50:06 Ewan [sighs] I think tooling will be one part that helps us. Like, I talked about the example of "You always make your Great Expectations tests a bit too noisy," just because this language, while it's really expressive, it's hard to capture all of the myriad ways that data is going to be horrible to you. So I think that more tooling like this is going to be the big step in us getting there, so that we get closer to this.
Instead of having to replay all of our data one time in four, we can do it one time in forty, then we can maybe accept that… well, there's a monetary cost, but there's also a reputational cost if you've put some weird data into a database, and then somebody looks at a report and goes, "Why is this number like that?" That's always a very embarrassing situation to have to explain to your senior stakeholder. 51:13 Demetrios [chuckles] Yeah. Yeah. [chuckles] You said it perfectly. There's so many ways that data can go awry. There's just so many different things that can happen to data. And tooling, hopefully, we'll get there and start recognizing all of the different ways. But you have all these edge cases. You have all these anomalies that it's really hard to factor in before you see them. Right? Like, you don't realize it's gonna go like that until it does go like that and then you go, "Ah! It can do that, too. Okay." 51:49 Ewan Yeah. Yeah. I've got one story that's jumped into mind from my early days at Skyscanner. I worked with Mike there. I was looking at the data about flight prices in the database, like "How expensive was it to get from London to Moscow? From San Francisco to New York?" And I saw some really big numbers in there and I thought, "What's going on here? Somebody spent 10 grand to get from Moscow to London." And then I was asking people about it and what happened was, we were logging the price in local currency. So that wasn't 10,000 GBP, that was 10,000 rubles. The face of the person who I was talking with about it just dropped and they looked a bit green. We'd been logging it for years at that point, but we'd never actually… [cross-talk] Yeah. So, yeah – data is horrible. I don't know why we get into this field. 53:01 Demetrios [chuckles] Man, this has been brilliant. I really appreciate you coming on here and chatting with us. And I highly encourage everyone out there that is listening to check out the repo and play around with it. I thank you so much, Ewan, for finally chatting with me. 53:17 Ewan Yes. Thank you for being patient, buddy. 53:22 Demetrios [chuckles] Yeah, that's it. Playing the long game. Three years later, it happened. And it was worth it. 53:27 Ewan Yeah, I think so too. It's been a real pleasure. Thank you for having me. 53:32 Demetrios Yeah, it's been awesome, man. And it's great to see you in the community and all your contributions. So thank you for that. Thank you, everyone, for joining us. We will be back next week for more meetups and until then – we'll see you in Slack. Feel free to hit Ewan up if you have any questions in Slack. I'm sure he would be happy to answer them. Yeah, that's all we got for today. See you all later. 54:00 Ewan See ya.

