MLOps Community

Micro Graph Transformer Powering Small Language Models

Posted Jan 30, 2024 | Views 236
# Product Management
# Natural Language Processing
# Dataception
speakers
Jon Cooke
CTO @ Dataception

Jon is a 20-year veteran in Data, Analytics, and AI and is a Data product specialist. After repeatedly seeing the massive time, friction, failures, and costs typically associated with data and analytics initiatives, Jon founded Dataception. Its mission is to use tech to eliminate the data grunt work and, together with data product management and AI, to build and iterate sophisticated, business-facing analytics in real time in front of the business.

Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.

SUMMARY

Specialist deconstructed Encoder/Decoder Transformers along with data product management and tech to vastly accelerate prototyping, building, and deploying business-facing data products at high speed and low cost.

TRANSCRIPT

Demetrios [00:00:00]: Hold up. Before we get into this next episode, I want to tell you about our virtual conference that's coming up on February 15 and February 22. We did it two Thursdays in a row this year because we wanted to make sure that the maximum amount of people could come for each day, since the lineup is just looking absolutely incredible, as you know we do. Let me name a few of the guests that we've got coming because it is worth talking about. We've got Jason Liu. We've got Shreya Shankar. We've got Druv, who is product applied AI at Uber. We've got Cameron Wolfe, who's got an incredible podcast and he's director of AI at RebuyEngine. We've got Lauren Lochridge, who is working at Google, also doing some product stuff. Oh, why is there so many product people here? Funny you should ask that because we've got a whole AI product owners track along with an engineering track. And then as we like to, we've got some hands-on workshops too. Let me just tell you some of these other names, just so you know, because we've got them coming and it is really cool. I haven't named any of the keynotes yet either, by the way. Go and check them out on your own if you want. Just go to home.mlops.community and you'll see. But we've got Tunji, who's the lead researcher on the DeepSpeed project at Microsoft. We've got Holden, who is the open source engineer at Netflix. We've got Kai, who's leading the AI platform at Uber, you may have heard of. It's called Michelangelo. Oh my gosh. We've got Faizaan, who's product manager at LinkedIn. Jerry Liu, who created good old LlamaIndex. He's coming. We've got Matt Sharp, friend of the pod, Shreya Rajpal, the creator and CEO of Guardrails. Oh my gosh. The list goes on. There's 70 plus people that will be with us at this conference. So I hope to see you there. And now, let's get into this podcast.

Jon Cooke [00:02:11]: My name is Jon Cooke. I'm CTO of Dataception, CTO and founder of Dataception. And I like my almond lattes, basically homemade almond lattes, non-milk lattes.

Demetrios [00:02:24]: Welcome back to the MLOps Community podcast, everyone. Glad to see you again. I'm your host, Demetrios. And today we're talking with Jon Cooke about the evolving landscape of data management. This episode was as much organizational as it was technological. We did get into Spark and all the nuances there. But wow, what a great way to break it down. Listening to Jon talk about how to align stakeholder values, how to understand from a business side of things the value of data, and how to look at the data in a way that both the technology side and the business side can really squeeze the most out of, was enlightening to say the least. So I appreciate how Jon goes into this idea of data handling, and I also really appreciate how he talks about data products and what needs to be done on that side. As I said, it's organizational and it's technical. It's part organizational, it's part technical. I don't know what kind of recipe I am cooking up over here, but if I were to say the parts of one and the other, I would say three parts organizational to two parts technical. Hope you enjoy this conversation with Jon. Let us know in the comments. And please, if you like it, share it with one friend.

Demetrios [00:03:54]: All right, let's get into this. So you're in London right now, but before this you were in New Zealand. Tell me about that story.

Jon Cooke [00:04:07]: Yeah, so my mom's a Kiwi, so in '99 I left the UK with a plan to sort of do a bit of a backpack and that kind of stuff, go see the family. I spent a month in America, and went to Fiji, meant to spend six months there, but it ended up being like three weeks because it was rainy season. Of course, being a dumb Brit, I didn't understand that whole process. So first week was lovely, sitting spear fishing on the beach, and then two weeks of absolute rain in a bar surrounded by Londoners drinking beer. I thought, I could be back home with weather like this. So I left there and then went to New Zealand, was there for a year, went skiing for a whole season and saw the whole family and stuff. And then went to Australia, intended to be there six months, ended up staying there five years. Bought a house and all that kind of stuff.

Demetrios [00:04:51]: No, Australia?

Jon Cooke [00:04:52]: Yeah.

Demetrios [00:04:52]: Where in Australia?

Jon Cooke [00:04:53]: I was living in Sydney the first time. And then came back in 2006. And interestingly enough, I met my wife, who's from Melbourne, in 2009, when I was backpacking around Southeast Asia, Laos and Cambodia and this sort of stuff, and came back, then went to live in Holland, and then we came back here, then we lived in Melbourne, then came back here. So I've been back here ever since.

Demetrios [00:05:15]: You are well traveled.

Jon Cooke [00:05:16]: Well, fairly well. Better than some, not as well as others.

Demetrios [00:05:22]: That's incredible. You've got me a little jealous because I want to do the whole Southeast Asia and Australia. You left out something too that you told me about, but maybe you purposefully left it out. I want to put you on the spot right now. You almost showed up in the Lord of the Rings. What's the deal?

Jon Cooke [00:05:40]: Yeah, when I was there in '99, they were filming the first movies, the Peter Jackson movies, and I actually lined up to be in it. There were two lines at Victoria University: one for soldiers and one for elves. And of course I was in my twenties, I was younger and that kind of stuff. And I walked up to the elves line and they went, "Nah, you're a soldier." And I'm like, "Why is that?" "You're not pretty enough." So I lined up to be a soldier. They took photographs and all that kind of stuff, but I obviously didn't even make the cut to be a soldier. But I was very nearly in it, because basically the whole country was roped into it, there were that many people.

Jon Cooke [00:06:15]: So yeah, very nearly was a soldier in Lord of the Rings.

Demetrios [00:06:19]: Oh man, that is so cool. And what an experience. You're just there traveling on your year-long trip and boom, you're like, well, let's see. Let's try out for Lord of the Rings and see what happens.

Jon Cooke [00:06:31]: Yeah, absolutely. There could have been another career path. Who knows?

Demetrios [00:06:39]: Well, dude, what are you up to now? Tell us about what you're working on and how you got here. Between all of this traveling. You eventually started playing around with data and. Yeah, what are you doing?

Jon Cooke [00:06:51]: Yeah, I mean, I've been sort of playing around with data probably 30 years. I was actually doing virtual reality back in the 90s, believe it or not, for British Aerospace. That was my first job out of uni. And then I did a lot of low latency messaging in the late nineties, service-oriented architecture in the early 2000s. Then worked for a consultancy, a financial services consultancy, doing a lot of risk and kind of data-driven projects, because banking is all about data. Low latency type stuff, Monte Carlo simulations, billions of calculations, that sort of stuff. Went through the whole Hadoop thing as well. So did about ten different Hadoop projects and actually got a few of them to work, which was quite amazing, really.

Demetrios [00:07:30]: Then you celebrated by going to Southeast Asia.

Jon Cooke [00:07:33]: That's exactly right. And I ended up running a data practice at a consultancy, a financial services consultancy, left there, went into PwC and started a data and AI practice, did a lot more AI. Did my first generative AI project, believe it or not, back in 2017.

Demetrios [00:07:51]: Or what GPT.

Jon Cooke [00:07:52]: Prior to that, we weren't using transformers at all. We were using handcrafted taxonomies and classic machine learning, naive Bayes type algorithms and that sort of stuff. Did one for a large bank, a large Swiss bank, which doesn't exist anymore, which was basically a chatbot, and it was on 1,500 documents, so that's a very small amount of data. So myself and the lead data scientist there, a guy who was an absolute genius, who was an ex-trader, built a handcrafted taxonomy, and we did machine learning to do the inference for things like "show me the swap rates for the Southeast Asian Bermuda swaps" or whatever it is, that type of stuff, and then it guided you through the whole chatbot experience: this is what the swap rates are, this is what you mean. There was a lot of reinforcement learning, basic manual feedback, as you can imagine.

Jon Cooke [00:08:39]: But it was prior to us using transformer type architecture. And there was another one, done by the team doing credit ratings, spitting out the reports for credit ratings for banks. So Apple's been downgraded by X, Y, and Z, and it gives you the whole report, taking news feeds and that sort of stuff, so not just classic market data but also these alternative data sets. And that was kind of interesting, being able to deliver in 30 minutes a report that normally took six weeks. That was another gen AI one we did then, which was interesting. Then, yeah, I did a lot of AI work in there, in classic machine learning as well, regressions and that sort of stuff. Left there, then went to Databricks, when they were very early on. So this is 2018.

Demetrios [00:09:23]: This was because presumably you had felt the pain of Hadoop, and you started playing around with Spark, and you were like, well, I guess this works a little better.

Jon Cooke [00:09:31]: Well, absolutely. It was interesting because the last project I did in PwC, interestingly enough, was actually for a regulator, and I built a whole architecture, and we put Spark in there, and there were lots of different use cases, and I wanted to do classic prescriptive analytics all the way through to machine learning, this kind of stuff. And they said, right, Spark's the engine for you. And actually, the first use case we found was small data with self joins, and Spark struggled quite a lot. So I had this kind of epiphany back then saying, actually, you need multiple engines for multiple use cases, but you need to be able to tie them together in an ecosystem where you can do lots of different types of analytics. Now you want to do graph-based analytics, or you want to do machine learning, this kind of stuff. And actually, interestingly enough, after Databricks, when I formed Dataception, we built an architecture and a platform to actually do that kind of thing. And then the whole data product movement came about and the architecture and the forces aligned, that type of stuff.

Jon Cooke [00:10:25]: And that's what we're naturally now kind of building out, that whole kind of vision of actually, it's not one engine to rule them all, back to the Lord of the Rings reference, but using genuine data products to actually be able to accelerate these sort of things. So if I want a machine learning algorithm, I want to use Python or I use an LLM, I want to use whatever I want to use. If I want to use a prescriptive metric, that's a classic SQL engine. They're different engines, they're different things, but they all need to be able to work in this kind of cohesive ecosystem, frictionless, quick, cheap, fast, able to move with the business. So that's really where that whole kind of vision sort of started. And it has proved out quite well inside that project.
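
To make the "multiple engines in one ecosystem" idea a bit more concrete, here is a minimal Python sketch. It is an illustration only and not Dataception's platform: the names `ENGINES`, `run_sql_metric`, `run_python_model` and `run_data_product` are hypothetical, the SQL engine is stood in for by an in-memory SQLite database, and the "model" is just a mean used as a placeholder for a real Python or ML engine.

```python
import sqlite3
from statistics import mean


def run_sql_metric(rows, query):
    """Classic SQL engine stand-in: load rows into in-memory SQLite and run a query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    result = conn.execute(query).fetchall()
    conn.close()
    return result


def run_python_model(rows, _spec=None):
    """Python engine stand-in: a mean is used here as a placeholder 'model'."""
    return mean(amount for _, amount in rows)


# Hypothetical registry: each data product declares which engine it needs.
ENGINES = {"sql": run_sql_metric, "python": run_python_model}


def run_data_product(product, rows):
    """Route a data product to its engine; every product shares one calling convention."""
    engine = ENGINES[product["engine"]]
    return engine(rows, product.get("spec"))


sample_rows = [("north", 10.0), ("south", 30.0), ("north", 20.0)]
sales_by_region = {
    "engine": "sql",
    "spec": "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region",
}
average_sale = {"engine": "python"}

print(run_data_product(sales_by_region, sample_rows))
print(run_data_product(average_sale, sample_rows))
```

The design point is only that the caller does not care which engine sits behind a product. Swapping the SQLite stand-in for a warehouse, or the mean for a trained model, would not change how the product is invoked.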

Demetrios [00:11:07]: Yeah, so this is fascinating. Tell me more about Dataception and what you're doing there. You gave us a bit of the breadcrumbs here, per se, and I would love to hear about the work that you're doing right now. What's your day to day filled with? And then we can go into some of the pains that you're seeing, especially around the data products and how that movement has come about and the fun you've been having there.

Jon Cooke [00:11:38]: Sure. Obviously I started Dataception because I saw the pain I'd been suffering, and that a lot of people suffer, building data and delivering analytics. As we all know, it can take months to deliver simple stuff in large organizations. So I founded Dataception with a vision to try and solve that problem, with a mixture of business knowledge, product management, tech and data, trying to put those together to actually do fast, accelerated delivery around these sorts of things. It's been over five years. And so that's Dataception's core mission. We do a lot of work around classic sort of data architecture, technology architecture.

Jon Cooke [00:12:16]: So building Spark platforms and Databricks platforms and on to data work like single customer view, and lots of those sort of challenges we solve. But also we have put together something called a data product pyramid, which is actually a process which tries to take the concept of agile for analytics and product management. So in agile, you develop, you go and do research, you work with the business, you do wireframes and you iterate around quickly, this kind of stuff. And it's really hard to do that in data. And so we built this kind of process where we do that with what we call data products, which are analytics components with visualization. We're trying to bring that stuff together. And so we've been doing a bit of that with various clients and it's a good process, but it's still quite manual. So we do a product canvas, we do an ideation session, and we work out what data is needed, and then we do a pool prioritization, so the value comes in, then do some wireframing, build the analytics, that kind of stuff, and work with the data teams. But it takes a bit of time, obviously, as you can imagine.

Jon Cooke [00:13:23]: But it's really the thing that drives me, and it bakes my noodle that this is 2023 and I was having this conversation 20 years ago. If someone like a head of sales says, show me my total sales for last year, and I don't have that data and I don't have that kind of analytics, why does it take me weeks to months to build that? Why can't I just do that at the click of a button? Or, let's turn that into a forecast, like what are my sales going to do in the next three months? Again, throw it over to the data team, the analytics team, and three months later it comes back. Well, I've missed the opportunity there. So why are we still in this position where we're actually still spending a lot of time on foundational work? And it still happens that we work with customers trying to get them over that type of hump. We're still in that position. We were in this position 20 years ago. It's like, well, why hasn't the dial moved significantly there? And obviously we'll talk about transformers and LLMs coming in. I think that is really starting to move the dial quite significantly, more the mindset shift as well as the technology shift.

Jon Cooke [00:14:20]: But this, to me, is why I formed Dataception: to try and really just shrink that gap. Because obviously we're a consultancy and I spend all my days typically with business people, business and tech, but mainly business people, where I sit in the room and go, what do you need? What do you actually need to do to deliver the value? What's going to drive your business? I don't talk about technology. I don't really even talk about data, till the last bit. It's like, what's the key decision you want to make? What's the key outcome you want to get to? And I do a lot of that with heads of sales, heads of risk, chief operating officers, CEOs as well, trying to close the gap between that conversation and actually delivering something that they can use to make the decisions. That's what I'm trying to do with Dataception. That's what we're trying to do with Dataception.
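
As a rough illustration of the head-of-sales questions Jon mentions ("show me my total sales for last year", then "turn that into a forecast"), here is a minimal Python sketch. Everything in it is an assumption made for illustration: sales are a list of `(date, amount)` pairs, and the forecast is a naive trailing average of recent months, which is a swapped-in placeholder technique and not anything described in the conversation.

```python
from collections import defaultdict
from datetime import date


def total_for_year(sales, year):
    """Total sales for one calendar year; sales is a list of (date, amount) pairs."""
    return sum(amount for day, amount in sales if day.year == year)


def monthly_totals(sales):
    """Aggregate sales per (year, month), returned in chronological order."""
    totals = defaultdict(float)
    for day, amount in sales:
        totals[(day.year, day.month)] += amount
    return dict(sorted(totals.items()))


def naive_forecast(sales, months_ahead=3, window=3):
    """Naive forecast: repeat the average of the last `window` monthly totals."""
    totals = list(monthly_totals(sales).values())
    if not totals:
        raise ValueError("no sales data to forecast from")
    recent = totals[-window:]
    average = sum(recent) / len(recent)
    return [average] * months_ahead


# Hypothetical data: a flat 100.0 of sales on the first day of each month in 2023.
example_sales = [(date(2023, month, 1), 100.0) for month in range(1, 13)]

print(total_for_year(example_sales, 2023))
print(naive_forecast(example_sales))
```

A real forecast would need seasonality, trend and far more data than this. The sketch is only meant to show how small the first prototype of such a question can be once the data is in hand.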

Demetrios [00:15:04]: What are some of the main causes that you've seen? Things take too long when you're talking about these reports or forecasts or anything that has to do with data. Why is it that it is so messy? What are some main causes?

Jon Cooke [00:15:24]: Yeah, I mean we could probably spend a long time on this because they're well documented. But the thing I typically find is that the start of the process is really just finding what the business wants, fundamentally. I get a lot of "just give me the dashboard of my KPIs." I was like, well that's usually useful. What I tend to do is actually try and distill them down to the two or three things they need to make their day better or their decision, that kind of stuff. And once you've done that, then you need to do what I call data archaeology. You're going into systems, you're going into the data warehouse, you're trying to discover this kind of stuff. And it's unpicking all that.

Jon Cooke [00:15:56]: You're lifting up the rocks, brushing off, that type of stuff. And if you haven't got the data readily available, or the data is in a particular form that's for a particular previous use case, which I find a lot. And this is one of the challenges I have with a lot of data warehouses and data platforms: they were built at a time where this is the scope of what we want to do, but actually we want to do something completely different now. Oh, we have to kind of work out why the decisions were made, unpick it, retransform the data, that sort of stuff. And that is one of the biggest challenges I see even today. It's like we still have this concept that we need to get all the data in there before we can do anything with it. We can't just say, let's take some other transactional system, do some prototype and see if it's good enough, see if it's got the right quality, see if it's got the right inference points, all that sort of stuff, before we actually onboard it onto the platform.

Jon Cooke [00:16:41]: We have to onboard it onto the platform first on the infrastructure and that kind of stuff.

Demetrios [00:16:46]: Wait, can I stop you there? Because if I'm understanding this correctly, it's almost like there have been a lot of decisions made in the past on how to architect their system, and there's opinions that have been made and certain formats and certain schemas that have been taken with the data. And if you want to do something new, what takes forever is that you have to backtrack through all of those opinions that have been set or the decisions that have been made and almost untie the knot each time. And so what you're saying is instead of letting the data go through, these transformations get dumped in the data warehouse or the data lake, whatever it may be, just take it at its source and then see where it's at, what the condition of that data is, and almost duplicate it and let it come through a new system and then let it go through the old system. And just to see where are we at with this data, can we do something with this? So you're almost like diverting or making a copy of it. I'm not sure how you go about that, but it's like you absolutely divert both of them and you make sure that the data is there that you need. You're collecting the right data, you can transform it in the way that you want, so you have a better understanding of what that raw data is like. Is that more or less how you're explaining it?

Jon Cooke [00:18:11]: Yeah, except it's not like an official kind of thing where we're going through all the infrastructure and stuff. Think about it like a sandbox or a sandpit. That's the thing I found a lot, especially with data scientists, even with BI people and stuff, where they say, we want to do something new. We've had a question asked of our business, and we don't know what the answer is. We don't even know what the right data is to get that. So can we actually bring the data in and have a look at it first in some kind of sandbox environment, not fully onboarded, not fully curated or fully modeled, before we actually work out (a) if it's useful, (b) if it's got the right quality, and if we've got enough statistically significant data points. That's the other thing. I've done quite a few machine learning projects where they didn't succeed because there weren't enough data points.

Jon Cooke [00:18:57]: One where you've got one outage: we did one around IT risk, and there's like one outage per year. It's like, well, you can't do any machine learning on that. But we didn't know that till we actually had a look at the data. But we didn't want to wait for it to go to the warehouse before we actually saw that. So there is an interesting tension around that ability to look at the data, to be agile, and then also to be able to overlay other parts of the organization onto it as well. But you need to be able to do that quickly and efficiently, because sometimes you're going backwards and forwards to the system teams a lot. They go, what do you want? Well, I don't know, but. Well, tell me what you want.

Jon Cooke [00:19:32]: And you end up in this kind of backwards and forwards. What we're trying to do is shortcut that, before you say, right, that's the extra data we need. Here's a sample model. Right, let's now push it through the operationalizing piece, make it hardened and all that sort of stuff. But let's do that first.
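
The sandbox check Jon describes (is the data useful, is the quality there, are there enough data points before anything is onboarded) can be sketched in a few lines of Python. This is a hedged illustration, not his process: `assess_sample`, `SandboxReport` and the thresholds `min_events=30` and `min_completeness=0.9` are hypothetical choices, and the example data simply mimics the "one outage per year" IT risk case.

```python
from dataclasses import dataclass


@dataclass
class SandboxReport:
    rows: int
    completeness: float   # share of rows with every required field present
    positive_events: int  # rows where the target event actually occurred
    enough_for_ml: bool


def assess_sample(rows, required_fields, target_field,
                  min_events=30, min_completeness=0.9):
    """Quick look at a raw extract before onboarding it onto any platform."""
    total = len(rows)
    complete = sum(
        1 for row in rows
        if all(row.get(field) is not None for field in required_fields)
    )
    events = sum(1 for row in rows if row.get(target_field))
    completeness = complete / total if total else 0.0
    enough = events >= min_events and completeness >= min_completeness
    return SandboxReport(total, completeness, events, enough)


# Hypothetical extract: one record per day for a year, with a single outage.
daily_records = [
    {"day": day, "cpu_load": 0.5, "outage": day == 100}
    for day in range(365)
]

report = assess_sample(daily_records, ["day", "cpu_load"], "outage")
print(report)
```

With a single positive event the report flags the sample as not enough for machine learning, which is the kind of answer you want in an afternoon in a sandbox instead of after a full warehouse onboarding.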

Demetrios [00:19:47]: Yeah. And it almost feels like the bigger the scale of the company, the more of a headache this can cause just because of all the different stakeholders that have their fingers in the mix and all of the different infrastructure that it has to go through, or all of the access that you need. And so there are so many different ways that this can be complicated. Am I missing any?

Jon Cooke [00:20:10]: No. It comes down to really the complexity of the estate, the IT and the data estate. You're right. And that's typically mirrored, you know, Conway's law. The organization tends to get the system estate that they deserve, or that mirrors it. So I find it tends to be small companies or smallish companies with really complex data estates, because they've grown organically, they've done some M&A and they've got all these different systems and they've got spaghetti everywhere, this type of stuff. And I understand the thinking: let's get all the data out, stick it into one place, then we can sort of see what we want to do with it.

Jon Cooke [00:20:42]: But actually, one of the big lessons from the Hadoop days was that if you bring all the data in as is, stick it in, then try and model it, this kind of stuff, you can end up failing because it's not in the right formats, it's not aggregated at the right level. What you want to do is bring it in piecemeal and then overlay your business use case on it. That's the other thing from the Hadoop days: let's ingest it all first and then model it before we know what we want to do with it. And I think we're now at the point, especially with the market, with budgets tightening, all this kind of stuff, where we need to be able to move quickly, we need to be able to not spend vast amounts of money on data infrastructure. It's more around how can we get the analytics good enough and quickly enough and cheaply enough to actually be able to do that. And that to me is the key kind of pivot point we've got right now.

Demetrios [00:21:26]: Now, it's funny that you say these startups have also had just a mess of data, because I've talked to my friends in sales quite a bit and I feel like this is a common thing that happens to startups with their head of sales or whatever. Just about every startup will start with like, oh well, HubSpot's free and it does basically everything we need. Let's go with HubSpot. They go with HubSpot and then once they get mature enough, a head of sales comes in. The head of sales is like, we can't work in HubSpot, we got to migrate to Salesforce, and then you migrate to Salesforce. And did you actually need to do that? I don't know. I'm not the one to say, but nine times out of ten probably not. It's just because whoever came in has more experience with that new tool, so they feel more comfortable and they get to make the decisions and they own the budget and so they go for it.

Demetrios [00:22:25]: But now you're left with a lot of complicated stuff because wait, did that data, does it tie back to HubSpot? Can we get the report from Salesforce? How does that look? What are we doing here? When you're talking about something like forecasting for a head of sales, if we're going back to that example, it's just a mess even thinking about it in a startup. And I can't imagine a bank that's been around for the last 200 years.

Jon Cooke [00:22:52]: Absolutely. Ironically, the bank sometimes can be better than the startup because, like I said, it's the complexity of the estate that matters, especially if you go to a challenger bank that's got like five systems or ten systems, versus a startup that's got 30 systems, that kind of stuff. So it's really that sort of complexity. But you're absolutely right about that migration, or that complex estate where you've got multiple systems and you're migrating between the two. And I guess that's the other thing that I'm seeing, if you go more to the architecture side, and it's something the data mesh is trying to solve as well. It's this problem where you have data on one side, what I call the analytical plane, and on the other side the transactional plane, where you've got transactional systems doing your workflows, all this kind of stuff.

Jon Cooke [00:23:45]: You've got the analytical plane, and actually these are merging. If you look at your recommendation engine, the classic example, that's actually an operational system, it just happens to be an analytic workload. And I've put in a number of architectures where people are delivering components from the transactional side and the analytical side together in the same place. They're actually all interacting as one because they're all basically operational systems that drive your business. And this is the other thing that's really interesting around the data product movement. When you start going down to small components which you can deliver from analytics teams or from transactional teams or system teams or whatever, and you start wrapping those big monolithic systems you talked about, so the Salesforce versus the HubSpot type thing, you can actually get some agility around there. But it takes a very different mindset to be able to do that, and a lot of commitment to building that and governing it and managing it going forward. But actually, if you can do that, you get a lot of that sort of agility.

Demetrios [00:24:43]: One thing that I feel like I've seen a lot as far as design patterns go, and I'd love to hear your opinion on it, is that you have a data platform that is created at a company and then you have a machine learning platform that is created at a company. Does this view that you're talking about, especially what you just said, conflict with that, or how does that evolve it? What you're saying, how does that differ from this common organization design pattern that we've seen?

Jon Cooke [00:25:17]: That's a great question because that's exactly the point. I think you actually combine the two, but not as a platform, as an ecosystem fundamentally. So the data in my mind doesn't live outside the use case, or tends not to live outside the use case, what I'm going to be using it for. And what that means is basically you're trying to onboard the data, you're trying to use the data, but in conjunction with what you're using it for. Now, obviously there are some exceptions, like single customer views and stuff like that, where you have to kind of join it across different use cases and stuff. But fundamentally, to me, the two things are intrinsically linked. I think one of the things that I'm trying to move the dial a little bit around is, let's not try and model the world up front, because fundamentally it's all about the business and about the business process and what you're trying to do. You bring the data in and marry it to the use case, and then compartmentalize it and deploy it at the same time with the analytics and the data sets, that kind of stuff. And this is what we call a data product.

Jon Cooke [00:26:15]: Now, the industry does talk about data differently. That's what we call it, actually, in a product management kind of sense, and suddenly your whole world changes, because you have these kind of big data platforms which are data only, trying to operate almost in a vacuum. And what tends to happen with data teams is not their fault. It's just the way they're set up. They become sort of order takers. The business comes and says, "I need data for X, Y and Z." "Why do you need it?" "Oh, don't worry about that, just give it to me." It's like, well, we don't really know what it is and that kind of stuff.

Jon Cooke [00:26:43]: And you see lots of things on LinkedIn and other things around value. When you're in that mode, you've got no chance of control of any value because you're just delivering data. You have no idea. It's all pain, because actually, if it works, you don't get thanks. And if it doesn't work, you're hammered.

Demetrios [00:26:57]: Right.

Jon Cooke [00:26:58]: So for me, it's being business driven. We talk about being data driven, but actually being business driven is using the data and the analytics and the use case all in one, and bringing it in as part of the same process. The people who are actually building the analytics are bringing on the data. Yes, there are going to be some governance and overlays and this type of stuff as well, but that's really where we need to get to. And you can argue whether that's business teams or central teams, business-facing central teams, that's a different discussion. But it really is that people who are actually building the use cases need to be bringing on and managing the data, and it needs to be uplifted by tech, and that's obviously where LLMs and stuff really start to come to the fore, because it's changing that whole mindset. I can upload a document, start questioning. You asked that two or three years ago of an organization.

Jon Cooke [00:27:43]: Can you build that for me?

Demetrios [00:27:44]: Ask.

Jon Cooke [00:27:44]: Twelve months and x million quid, right? It's that type of "buy a platform" sort of stuff. But now, because business people can actually interact with it, with OpenAI and this kind of stuff, the light bulb's gone on. They say, well, why can't we do this? And to me, that's what we need to move to. And it's not just obviously interrogating static data through training and stuff. It's also being able to create forecasts and create new analytics and stuff in the same mode of operation. And that, to me, is where the dial really gets moved.

Demetrios [00:28:13]: Well, how do you see the profile of the people that are dealing with this data change? Because I feel like right now we have the data engineers, and it almost feels like we're very much speaking to data engineers right now. And they're the ones that, as you said, they are in a bit of a, I don't want to say lose lose situation, but someone comes to you and they're like, I need this data. And then you're like, well, why do you need that? Don't worry about it. I just need it because I got to make XYZ or I got to do this. But you don't need to know it's above your pay grade. And then that person probably, it sounds like to me like that's a data engineer right there. And what you're talking about in this idea of, oh, well, maybe if we start to phrase it differently, or if we start to view it differently, not phrase it differently, and really rally around a different type of organizational operating model. Yeah, exactly.

Jon Cooke [00:29:16]: Different way of operating.

Demetrios [00:29:17]: And have this different organizational operating model, you are going to see different personas per se, where maybe, as you were saying, it's a business person that can talk to an LLM, or maybe it is someone that is very deep in machine learning, that also has that data engineering aspect. What do you, if anything, see these different personas changing, or if not at all changing?

Jon Cooke [00:29:52]: I think it's exactly right. So there are a number of ways that we can slice and dice that. But for us, the way we've adopted is much more of a product management type approach. Not projects, products. We're actually building out business facing analytics in a kind of product management mode. And what we tend to find is basically if you have small teams which are multidisciplinary, which actually can hone around it like you would do with a normal product, like a software product or whatever, you've got that in data. For us, it's a small team with a business SME, a product manager, and we have something called the data product specialist, which is kind of the mixture of business, data and tech that actually can then build out analytics, more like an analyst plus, it's more that thing.

Lina [00:30:40]: No, I will not recommend your podcast as the ML engineer at DQB bank. I'm definitely not wanting other engineers to have all this relevant information and I will not give my name, Richard Cleaner. So definitely I do not recommend this podcast.

Jon Cooke [00:31:03]: And also a data engineer. And potentially, if it's going to be a complex use case, some sort of technical architect, data architect type person. But basically you're honing these teams around this, or spinning them up, to actually solve the particular use cases with products. So we'd say, all right, let's build a metric. I'll give that to the data engineer. They'll build a window function, deploy that, and then you'll get maybe a UX person as part of that team as well to actually skin it and put it in front of the business. Let's iterate around that.

Jon Cooke [00:31:31]: If we ask some questions and the SME would step in around that. And the idea is you're spinning up these teams, these multidisciplinary teams with tech that assists and then coming back to your original point around, do we have data platform analytics platform? Do we have analytics and data infrastructure IT systems or combined to actually allow us to do that, to accelerate that, then we have a very different operating model. Fundamentally, we have product facing teams which are multidisciplinary and they have data engineers and stuff like that as well, to actually be able to live with these products. So the personas don't change, but the way they operate does. And you might have a platform team, right? Data platform team or analytics technology team, a bit like your cloud team that's building the infrastructure, but they are not delivering the pipelines, they're not delivering the analytics. You have these kind of business facing kind of multidisciplinary teams. So that I see, and we've used this a number of times and it works really well.

Demetrios [00:32:20]: So you put something in the box when I asked what kind of stuff you wanted to talk about, and I'm not going to lie to you, I don't understand shit. Let me just read this to everyone else. See how many people out there understand more than five words at a time here. It's like, what do you want to talk about? Ooh. Specialist deconstructed encoder decoder transformers along with data product management and tech to vastly accelerate prototyping. What the hell does any of that mean? Man, break that shit down for me because I feel like we've kind of been talking about it.

Jon Cooke [00:33:01]: Take a step back. We have a process called data product pyramid, which allows people to create analytics products with UX in front of the business in a very sort of UX type, product management type way. But like I said, at the moment, it's quite manual. So you build a product canvas. Lots of companies do this with a product canvas. You identify what data it is. We have a value framework, this kind of stuff. And then you throw it over to the data team, to work with the data teams and analytics team to actually prototype those things.

Jon Cooke [00:33:27]: That can take weeks to months, as we talked about. Now, we've built some technology to actually build and deploy data products in minutes, fundamentally, but it's always advase this type of stuff. So you basically say, I want to create a metric. Create the metric. Here's the sum, here's the data set you want to use on it. Here's the donut chart. You want to put it together, click, click. Now, there are stacks with which you can sort of do this today, but there's nothing really around if you actually want to do that through natural language.

Jon Cooke [00:34:01]: So when I'm talking to a business person and, say, I create a metric, I have to do click, click, drag, drop, drop, that sort of stuff, and onboard data sets, build a pipeline, that kind of stuff. Now, what we're developing is something that allows you to do that in natural language, fundamentally. And this is where the transformer comes in, and the encoder-decoder kind of architecture. So fundamentally, we are joining an LLM or a specialist LLM, I can't talk too much about it because it's a little bit hush-hush, but which allows people to actually drive that whole product lifecycle into design, prototyping and onboarding of pipelines through natural language, fundamentally. And the core of it is you have to kind of deconstruct the transformer figures, which is the core of the LLM piece, and do some specialist stuff. We can use, like, Llama 2, and we've been looking at Llama 2.

Jon Cooke [00:34:46]: We're also looking at ChatGPT and stuff like that. They're big, kind of behemoth type, all singing, all dancing. We actually don't need it. We only need, like, 10% of that. Fundamentally, we want to be able to deploy quickly, and we want to be able to do things like retrain it as we go, continuous training and all that sort of stuff. So we're looking at deconstructing the actual transformer architecture and actually accelerating it to the parts that we need, to accelerate that whole kind of process of defining, prototyping, getting stuff in front of the business, and then being able to deploy it with one click to the ecosystem. We call the data for ecosystem and then iterate around it. So agility, speed and low cost is what we're trying to get.

Demetrios [00:35:23]: Through natural language. It's not like people are able to spin up infrastructure, or is it?

Jon Cooke [00:35:31]: Yeah, they will be able to. So basically, they'll be able to spin up a data product, like a metric with a piece of UX. They'll be able to forecast, they'll be able to create all the types of analytics through natural language and then deploy them instantly.

Demetrios [00:35:46]: So basically, you're talking about really what the business side is asking of the data teams already, but now they can just ask it of this LLM and say, hey, I want to know what my forecast is for sales in the next three months.

Jon Cooke [00:36:02]: Yeah, exactly right. It's that type of process. Yeah. Obviously there'll be simple use cases. There'll be more complex use cases which will need some handholding and some people to do that. But, yeah, fundamentally, it's trying to give the business what they've been asking for. They've been asking me for 20 years, fundamentally, and I've built solutions which are kind of rule based and kind of workflow based and stuff, which work well, but obviously with the transformer pieces, actually, you can get that kind of feedback very quickly.

Demetrios [00:36:37]: So this is fascinating, because on one hand, I'm wondering about how you can be assured of the reliability of these numbers that are coming out. How are you making sure that, okay, it's not just spitting out nonsense?

Jon Cooke [00:36:55]: Yeah, that's a very good question. So the high level answer is that we're not getting the LLM to spit out the numbers. Fundamentally, we're interacting with our own infrastructure to be able to do that. So you avoid the hallucination problem, you avoid the massive retraining problem, even fine tuning and that kind of stuff. We're using some sort of derivative of a RAG equivalent to be able to actually spit out the numbers and actually do the run. We have basically a number of ensemble models which are kind of all integrated as part of the. So the LLM is probably the natural language piece and the gen AI piece, but actually it's not actually calculating the numbers or doing that kind of stuff.
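
One way to picture the split described above, where the language model handles the request and separate infrastructure produces the numbers, is the sketch below. It is not Dataception's implementation: `MetricSpec`, `parse_request`, and `compute_metric` are hypothetical names, and `parse_request` is a crude keyword matcher standing in for whatever model would turn natural language into a structured request.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class MetricSpec:
    """Structured request: which column to aggregate and how (hypothetical)."""
    column: str
    aggregation: str  # "sum" or "mean"


# Deterministic aggregations: the numbers come from here, never from the model.
AGGREGATIONS = {"sum": sum, "mean": mean}


def parse_request(text: str) -> MetricSpec:
    """Stand-in for the language model: maps a request to a MetricSpec.

    A real system would call a model here; this keyword matcher is only an
    assumption so that the example runs on its own.
    """
    lowered = text.lower()
    aggregation = "mean" if "average" in lowered else "sum"
    column = "revenue" if "revenue" in lowered else "units"
    return MetricSpec(column=column, aggregation=aggregation)


def compute_metric(spec: MetricSpec, rows: list) -> float:
    """Compute the metric with ordinary code over the actual data."""
    values = [row[spec.column] for row in rows]
    return AGGREGATIONS[spec.aggregation](values)


rows = [
    {"revenue": 100.0, "units": 3},
    {"revenue": 250.0, "units": 5},
    {"revenue": 50.0, "units": 2},
]

spec = parse_request("What is the average revenue?")
print(spec, compute_metric(spec, rows))
```

Because the model only emits a `MetricSpec`, a wrong interpretation shows up as a wrong but inspectable spec rather than as an invented number.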

Demetrios [00:37:38]: Yeah, that makes sense because I think that would scare me a little bit if it was.

Jon Cooke [00:37:44]: Yeah, indeed.

Demetrios [00:37:44]: We all know how much LLMs love math and numbers.

Jon Cooke [00:37:48]: Yeah, indeed, indeed. And I mean, you think about LLMs through the attention mechanism and it's all probability based. I know OpenAI have done a lot of work in the new version to try and get it to do calculations, but it's not a calculator, fundamentally. If I wanted to do a simple regression or even a simple time series forecast, it's going to struggle because that's not what it's designed to do. Right. So you need to be able to integrate that capability into it, but it's actually outside, but alongside, if you know what I mean.
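
The "simple regression or simple time series forecast" that sits outside but alongside the model could be as small as the following sketch. It fits an ordinary least-squares trend line to evenly spaced observations and extrapolates it; the function names and the sample series are assumptions for illustration, and nothing here reflects the forecasting models actually used at Dataception.

```python
def fit_line(ys):
    """Fit y = intercept + slope * x by least squares, with x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def forecast(ys, periods):
    """Extrapolate the fitted trend for the next `periods` time steps."""
    slope, intercept = fit_line(ys)
    n = len(ys)
    return [intercept + slope * (n + k) for k in range(periods)]


# Hypothetical monthly sales figures, used only to exercise the code.
monthly_sales = [100.0, 110.0, 120.0, 130.0]
print(forecast(monthly_sales, 3))
```

A language model would only need to decide that a forecast was requested and over which series; the arithmetic stays in code like this.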

Demetrios [00:38:19]: Yeah, so in a way, and another design pattern that I feel like I've been seeing is almost like there's a proxy that will take in whatever the request is and it will route it to the right place and then it will get that pipeline going. And I like how you said it. If I'm understanding it correctly, it's almost like you have a model that will understand what the query is and then it will know what to do with that query. And so now we're almost venturing into agent territory, I guess. Is that how you look at it too? Yes.

Jon Cooke [00:38:58]: So we're definitely in that sphere where we're actually asking various components to go away and do things which are outside of the LLM. What we're trying to do is basically shortcut a lot of that. What a lot of agent work does is try to, you know, replicate a high friction process. At the moment, that's why they tend to be deployed. We've got this high friction process with a lot of manual steps. Can someone figure out what to do? Fundamentally, part of what agents are being deployed to do is actually to figure out the way forward. It's interesting, again, if you look at OpenAI with their Q learning. I did a lot of work a few years ago using agent based modeling, which is another thing. It's not LLM agents, they're different, but you do have agents. We try and simulate a system, and one of the approaches was to use reinforcement learning, like Q learning, to actually uncover hidden states in the data and stuff like that as well.

Jon Cooke [00:39:53]: So agents are doing that kind of thing. What we're doing is that we know what the outcome is going to be, so we don't have to use the agent part to figure it out. We can actually be much more prescriptive around how we do it. So it's a lot faster and it's a lot more optimized around that. And we don't need a 70 billion parameter model to do that. We can actually use much.

Demetrios [00:40:15]: So because you know what the end state is going to be, it takes a lot of that guesswork out of the agent having to come up with. All right, well, here's what I think the best plan of attack will be.

Jon Cooke [00:40:29]: Yeah. Because you've got to have a reward mechanism. You've got to have kind of path to parcels that it needs to follow, and then it has to kind of be able to retrace and all that sort of stuff. We don't need that piece as yet. That's the bit where you're driving, because fundamentally it's about working with the business teams to actually find that. Right. And then part of what we do at Dataception is actually help with the business process for doing that. We actually work with the team to actually drive out what that is fundamentally.

Demetrios [00:40:54]: So we ask business what that end state is.

Jon Cooke [00:40:56]: Well, yeah, exactly right. Because a lot of business type teams, we find, actually aren't very good at defining that end state clearly. And so we have a whole framework where we actually take people through this. And we did this for a company a couple of months ago where we sat with the CEO, CTO and the head of product, and we worked out what the three or four most valuable data products, analytics products, were that they wanted to build for their customers. And we worked through that. So we can do that. We did that in a day, and had we had the technology, we could have actually deployed it probably in a couple of days. That's the dial that we need to move.

Jon Cooke [00:41:32]: Right. It's interesting, there was a LinkedIn poll around what data driven means. That's what data driven means to me, where you can iterate with the business people, with a proper value framework driving that whole kind of business change conversation, and then not spend twelve months, two years to try and implement it.

Demetrios [00:41:49]: Implement, yeah. And I laugh when you say a lot of business people don't understand because I just think about my own use cases and our buddy Dan, who is the data guy in the ML ops community, and I'll ask him. It'd be nice if I could see some data around this. And his thing is like, well, what data are you looking for, man?

Jon Cooke [00:42:11]: Yeah, absolutely. Yeah. You're asking for an outcome, and what's that? What's the outcome? So I get inside your head and say, actually what you're asking for and what you actually mean are two very different things. And that, again, is what we do at Dataception. So that's kind of our core USP. We get in the head of the business and then drive that conversation. Then, because we're analytics and data experts and tech experts, we can then deploy that quickly. But we have to build stuff a lot of the time, and we're trying to shortcut it.

Demetrios [00:42:41]: What you're talking about reminds me a ton of when we had Leanne Fitzpatrick on here from the Financial Times, and she talked about how for them, they set up, and it's been like almost a year. So excuse me if I butcher this, but I will put the original in the description in case anybody wants to go and watch or listen to that episode, because it very much is in this vein of even if someone asks for data and even if we can supply that data, what we've created as like a middleware team between, it's almost like a cushion between the data team and the business side, or just like the data askers, it doesn't even have to be the business side. It's like the people who want some kind of data versus the people that have to supply the data. And I think in the data mesh, they call this like consumers versus producers. Right. And so, yeah, what they did is they created this cushion of people that are very understanding of data, and they're almost like educators. So when someone wants something from data teams, they don't go to the data team, they go to one of these hubs or what? I was like, oh, it kind of sounds like they're like a human API because the human will go and talk to another human and say, hey, this is what I'm looking for. I want this data.

Demetrios [00:44:10]: And that human API will say, okay, so if we get you this data, what are you going to do with it? And that's like the first thing that they say, well, because I want to know how many people unsubscribed from this, and then I can send them like a follow up email or whatever. And, okay, so if you send follow up emails, what are you trying to do with those follow up emails? It's exactly what you're saying. What is the end state that you're trying to get to and do we need that data to get there, or is there another way around it? Or maybe it is that we need that data and more. But the person who understands a, the data ecosystem, and then B, how difficult or how easy something is or what you can do with data will almost like bounce a bunch of questions back at the person who's asking for the data before they come to any conclusion.

Jon Cooke [00:45:08]: Exactly. I mean, we even take it a little bit further because we talk about framing a lot. So part of the process we go through is framing, and we're framing the business problem. Are we asking the right business problem, which is an interesting one. And then we talk about analytical framing. Are we asking it in the right way? Are we asking what's the type of analytics we need to do? Because people say, and this happens quite a lot, give me a machine learning algorithm to do this, this and this and this. It's like, well, no, you just need a metric to do that. And the business frame would be like going from classic, give me your story of a KPIs, actually.

Jon Cooke [00:45:43]: So at the end of the day, what are you measured on? Oh, I'm measured on efficiency. Right. So if we give you the top three things to improve your efficiency for that day or the end of the week or whatever your measurement period is, and on these three data points, is that going to solve your problem? Yes. So that's the business framing. And then the technical framing, the analytic framing, is basically: we don't need AI for every single use case, right? Absolutely we don't. Or can we deconstruct it down from a forecast model to a couple of metrics underneath? If we build you four of those and you have your historical metrics, but then we build a forecast on top and we can deliver those quickly and easily, and that type of stuff, can we do it that way? And it's those sort of pointy questions that I use all the time. Again, it sounds like a similar sort of thing. But interestingly enough, if you develop a sort of product management mindset, that's what you do.

Jon Cooke [00:46:37]: That's a core part of product management. It's the value, it's driving the business conversation. And then you bring in the technical aspects. This is why we like product management quite a lot in this space, and it works really well.

Demetrios [00:46:49]: How do you feel like, so aside from what you just said, what are some ways that us as engineers or individuals, it doesn't matter what our background is, can adopt more of that mindset and have more success with that mindset.

Jon Cooke [00:47:07]: Yeah. So again, there's a couple of things. The first one is obviously empowerment in your organization, and you can find out pretty quickly how empowered you are, because if someone senior comes and asks you for something, and you can push back and are allowed to push back and get listened to, then you've gotten back. If you can't, then there's an organizational challenge there. The second bit is being able to ask those questions, as I said. So fundamentally, what are you trying to do with it? Going from like an order taker to what we call consulting, a trusted advisor, someone where they say, oh, they understand my business. And I responded to a post the other day, and I've seen quite a lot of business leaders who are really savvy, who understand tech and data that drives what they want to do. Right.

Jon Cooke [00:47:50]: They see it as a key tool to do that, but a lot of people don't, fundamentally. So can you get to the point with your business stakeholders, not your IT, your business stakeholders, where they understand that you understand the business as well as they do, so you're able to ask those sorts of questions and they trust your advice and this type of stuff as well? So can you do it? For me, those are the two big things. Because if you can't do those two things, then you're going to really struggle.

Demetrios [00:48:14]: Yeah. Because at the end of the day, somebody on the business side, your CEO, doesn't care if you're using Kubernetes.

Jon Cooke [00:48:22]: No, indeed, indeed.

Demetrios [00:48:24]: And so how you're able to abstract yourself away from all the cool tech that we all love talking about and getting into the weeds on and learning more about on a daily basis, but then being able to speak, it's not different languages, but speak in two different positions. Yeah.

Jon Cooke [00:48:47]: Do you know The Hitchhiker's Guide to the Galaxy?

Demetrios [00:48:50]: Of course.

Jon Cooke [00:48:51]: The Babel fish.

Demetrios [00:48:53]: Yeah.

Jon Cooke [00:48:54]: That's what you've got to do. We need to be the Babel fish, where we can actually translate between the thoughts. It's not necessarily the voice, it's the thoughts between people. But also the other thing I would say, coming back to this empowerment thing: I've lost count of how many analytical solutions I've delivered to the business that would change the dial, but they haven't been adopted because of all sorts of other reasons. We know that, politics and that sort of stuff. So that's something you definitely need to test in your organization. What's the appetite for actually being data driven, that kind of stuff. Because you're an alert.

Demetrios [00:49:26]: How do you test that?

Jon Cooke [00:49:27]: Well, basically you push back and you actually ask those questions. Like I said, you say that actually, let's re understand what you're trying to do. Actually, if you did it a different way or here's an improvement that will drive 10% on the bottom line or increase your conversion rates kind of stuff. If that's like, actually, we're not interested in that. But that's a very good way of actually testing whether there's an appetite to be. What I say, analytics is a business change process or should be. If you test it, it isn't a business change process or seen of that or seen something else, just reporting or this kind of something we have to do that is a really good litmus test of how successful an analytics person is going to be in the organization.

Demetrios [00:50:07]: Yeah, that is some fascinating stuff to think about because you hit on such a clear point where you can do everything right and produce, you can create some kind of output, but if at the end of the day the adoption isn't there, it's worthless.

Jon Cooke [00:50:30]: Yeah, exactly right. Exactly right. And it's interesting, I see lots of posts on LinkedIn and elsewhere saying you've got to make sure your data products are adopted and valuable, this kind of stuff. It's like, well, if you've built them and you don't know that, in my mind there's going to be a challenge, right? You build them because there is an adoption, because there's a need, there's actually a business need to be able to do that. And I think that's something that in the data world, we're still struggling a bit with again. And it's back to this kind of get all data in one place and start modeling it, this kind of stuff. It's old, the business aren't using it. I've lost count of how many times I've heard, well, we've built this data platform, we built this data infrastructure, we've done this whole kind of stuff and no one's using it.

Jon Cooke [00:51:11]: It's like, well, that's because you've gone from a technology focused angle rather than a business change angle, because the business change angle is actually just forget the technology, start with this, work out what the business needs to do and get their buy-in before we do anything. That "build it and they will come" was kind of not going to happen. There's a big graveyard in the Hadoop world. That was the big promise of Hadoop: build it and they'll all put it all in one place and all get going. It's like, no, actually, we've got to pivot to actually start having business conversations, and work out what tech and data we need for this business. There will be some infrastructure, there will be some foundational stuff we need to build, but it's not like 12, 16 months or 18 months, because the market won't wear it now with budgets shrinking, that kind of stuff.

Demetrios [00:51:51]: Yeah, and who knows what kind of LLM will be out by then, that all of the C suite is going to say, well, can we just do this with AI now? And like, no. I can't imagine all of those people that are building their data platforms. And then when ChatGPT came out, everybody all of a sudden said, you know, we could just throw some AI at this. And how many times the product managers had to go through and figure out a story around that and say, absolutely, yes, but no.

Jon Cooke [00:52:28]: It's also interesting that it's the first time in my career, in my 30 year career, and I've been through a number of epochs with the Internet and ecommerce and big data, the first time where actually a new world changing technology has come out where business people can play with it directly. So they go to ChatGPT: give me a business plan, or let's generate this email, blah, blah, blah. I can't really think of any other major technology advance that's happened in my lifetime where the business people can actually do that. Maybe the Internet, maybe, but they needed to be able to build stuff. It's actually there. I can do all sorts of things. I can go to ChatGPT or Anthropic or Bard or whatever it is. The business people say, I want that because I can play with it.

Jon Cooke [00:53:09]: And why can't you build that for me? It's very front and center, whereas before it's been like, oh, there's some black magic going on, some black box type stuff, can you go and build it? And the promise happens and it doesn't happen, this kind of stuff. But now the business people have got their hands on it. Oh, my word. With us as tech people, how do we fulfill that promise? But what it also means is that the business people are now starting to think of all those things they've been asking for, for years, and that haven't been delivered for whatever reason. The light bulb's gone on fundamentally around that. It's that kind of, actually, oh, my God, this is actually going to change. We need to change our business process, which, wow, thank you.

Jon Cooke [00:53:45]: Finally, data and tech is about changing business process, or supporting change of business process. And that's starting to permeate now, which is the first time I've really seen that in that kind of holistic way.

Demetrios [00:53:58]: Yeah, it is funny, you talk about seeing people on LinkedIn mentioning the adoption, and that you have to have adoption for your products and all that. And how you said, if you don't know that before you build it, you've failed. If you don't know that there is an actual value and use case and people are jumping over one another to get to this product, then you're taking on a lot of risk. And my buddy Adam said that one time when I was in a meeting with him and he was showing me what he was tinkering on. I was like, dude, this is so cool. How have you built this? Or what have you been doing? Blah, blah, blah. And one thing that stuck with me from the conversation that we had was him saying, I'm tired of making products that people don't want. And it was just that. So now all he does is he works a little bit, maybe changes a few things, and then goes and talks to as many people as he can to see what they think of it, and then maybe he'll change a few things and then go back out and talk to as many people as he can.

Demetrios [00:55:11]: And that is really the way forward that is super valuable. It just feels like the right way of doing it. Right.

Jon Cooke [00:55:19]: Startup world's run like that, right? Startup world is absolutely run like that. And the big companies, larger companies, get the walk, and the bigger they get, the slower that type of thing happens. Right? Or doesn't happen at all, because it's a very different culture. I talk a lot about the OODA loop, the observe, orientate, decide, act, which is obviously from the fighter pilot thing. Obviously, the speed of that depends on the speed of your business. But it's that kind of scientific method. Let's work out what's happening from a data perspective, analysis perspective. Let's orientate, let's put the data in, work out what it means to us, and then decide, then act, then do it again and iterate.

Jon Cooke [00:55:55]: And the measurement is basically talking to customers, talking to users, that kind of stuff. If you didn't do that in a startup, well, you'd be dead, right? Startups would fail, because you don't typically, I mean, it's not always the case, but you don't typically spend two or three years and millions of pounds developing something, then go and test it with customers. Right. You just don't do it, right?

Demetrios [00:56:14]: Yeah. Well, Jon, man, this has been fascinating. I really appreciate you coming on here. And for everyone that is not already following you on LinkedIn, I highly recommend it. You put out some great content, and I always enjoy when you pop up on my feed.

Jon Cooke [00:56:29]: That's it.

Demetrios [00:56:31]: That's the podcast for today.

Jon Cooke [00:56:33]: Brilliant. Well, thanks so much, Demetrios, for having me on. It's been a really interesting chat and I had a lot of fun. So really appreciate it.

Emmanuel Ameisen [00:56:43]: I'm Emmanuel Ameisen, machine learning engineer at Stripe and author of Building Machine Learning Powered Applications. And if you don't want your machine learning models to explode, well, you should subscribe to this podcast.
