Leadership on AI
Speakers

I am a technology executive and entrepreneur in data science, machine learning and AI. I work with global corporations and start-ups to develop products and businesses based on data science and machine learning. I am particularly interested in Generative AI and AI as a tool for invention.

Mert is the current Chief Technology Officer at Just Eat Takeaway.com with previous experience as a CTO at Delivery Hero Germany GmbH, Director of Engineering at Delivery Hero, and IT Manager at yemeksepeti.com. They have a background in software engineering, system-business analysis, and project management, with a master's degree in Computer Engineering. Mert has also worked as an IT Project Team Lead and has experience in managing mobile teams and global expansions in the online food ordering industry.

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.
SUMMARY
Agents sound smart until millions of users show up. A real talk on tools, UX, and why autonomy is overrated.
TRANSCRIPT
Mert Öztekin [00:00:00]: And you are not just providing a functionality or feature to the business anymore. You also need to help the business to go through this big transformation.
Euro Beinat [00:00:08]: When we share these tools to our colleagues, they came back to things that we didn't expect.
Mert Öztekin [00:00:12]: Yeah, well, practically the problem didn't start with AI. We had that problem in the past as well.
Euro Beinat [00:00:18]: So you mentioned that it's hard to measure. It's not hard to measure. The question is, are you measuring the right thing?
Demetrios Brinkmann [00:00:29]: We can start out with just the idea of how you are thinking through enabling your different departments and the wider companies to be more productive with AI.
Mert Öztekin [00:00:41]: I mean, the role has changed quite a lot for me as well, because up until two, three years ago, your focus as a CTO is to basically provide the functionality the company needs. Comes from product, comes from marketing, comes from any department. And also you are responsible for the scalability, stability of the platform. You have to make people trust what you are building. In the last two years with AI, with ChatGPT being available to everyone, you also realize that, oh, okay, this is big. And you are not just providing a functionality or feature to the business anymore. You also need to help the business to go through this big transformation about how you are thinking, how your processes, how your way of working needs to change. And as you are the responsible person for the technology in the company overall, you have to basically leave your comfort zone and then push the boundaries across all these departments to help them understand what is coming up, how big that will be for those departments, for the business, for the ecosystem and for the world.
Mert Öztekin [00:01:55]: And also in the background, build the foundations and the platform so people can operate on these things as well.
Demetrios Brinkmann [00:02:01]: So you have to become a shepherd.
Mert Öztekin [00:02:03]: Yeah, exactly, exactly. So you are not responding to requests anymore. You start to push the boundaries of the business more than you have ever done. So it was a change.
Euro Beinat [00:02:13]: So a couple of additions, let's say, because I can see everything that MERT has said so far, the few decisions that we made even before ChatGPT is that all these technologies, large language models, independent of the fact they're not perfect, independent of the fact that this is going to take years before we get something which is foolproof and so on, but they're so new and so impactful that you need to find a way to learn them. And then you can decide to mandate to create education and so on, or you can take the opposite direction, which is let's give everybody the tools so they can experiment bottom up. And I totally convinced that for every organization that I see, this is one of the best options you can possibly have, it's provide within boundaries, that boundaries being the guardrail. Such a way that you do not make catastrophic mistakes. And so but give the tools that people can decide how to create agents for themselves, how to in fact create the things that they need for their work. Right. And the reason why we did that is because long ago in GPT3, I think it was really, really early stages when we shared these tools to our colleagues. They came back to things that we didn't expect.
Euro Beinat [00:03:32]: Use cases that I can't invent that thing because I'm not in that line on business. I will never be able to figure that out. By giving everybody, it means literally everybody, tools that can experiment. There are two things that we bank on. The first one, they will create something useful for them. So there's going to be a productivity quality of work that increases, which then if you multiply for the entire organization is going to be material, very material. And the second thing is that there is a distinct advantage in having everybody understand where these tools work, where they fail, how I can tweak them to make them work where I need. Because then they can apply the same intuition to the applications that they develop for our consumers.
Euro Beinat [00:04:18]: Right. So these two things tend to become, let's say they are the same thing in the end, but they're very important. So collective discovery on the one hand, but also collective awareness on the other end.
Demetrios Brinkmann [00:04:30]: Actually, I think one thing that I was shocked by was when Paul was telling me how you guys encourage most developers to use various types of coding agent tools.
Mert Öztekin [00:04:42]: Yep.
Demetrios Brinkmann [00:04:43]: And it's for the sole fact of, hey, if there's something valuable here, you should know about it and you should choose. We don't want to say, oh, we're going to mandate you can only use cursor, or we're going to mandate you can only use XYZ insert, your favorite coding tool, Claude code. But we want you to be able to find the one you like and use it if it gives you that productivity.
Euro Beinat [00:05:04]: Yeah. So you should weigh in on this one as well. But there are advantages on both sides. On the one hand, there is a big advantage of one tool, everybody standardized on that one. You can get better conditions, commercial conditions, and also a lot of knowledge that can be easily shared. Right. So because you just create, let's say, a lot of knowledge there. However, the reality is that these tools are still developing at this moment there is a winner, but next year might be another winner.
Euro Beinat [00:05:30]: So there's going to be, let's say an evolving landscape. And you also want to make sure that you continuously test and stretch all the tools out there for various reasons because for many use cases you better use different tools. It's just a reality of that. But second, because it is evolving so you don't really bet on anything, but you bet on a way of learning these tools. So in that case, having a certain degree or total degree of freedom is fine. The other thing that we are not really, let's say, compromising on is to push everybody to use these tools. Right. So there's a deliberate mandate to experiment and get to 100% adoption as fast as possible for the same reasons that they discussed before.
Mert Öztekin [00:06:14]: Yeah, so we didn't exactly follow the same approach in terms of just let everyone use everything. We started with GitHub, Copilot. We also launched Amazon Q Developer. We are right now launching Cursor. I think the Android engines are using Gemini and I know there are lots of other tools as well available. The reason why we didn't just let all is to also provide some focus to the organization and also share some of the learnings about how to make the best about a certain tool instead of jumping from tool to tool. Because there's also a trust element on if an engineer will trust to technology. And for building the trust, I think you need to spend some time like you need to give the credit to the AI or tool, whatever it is, that well, it will not solve everything at first go.
Mert Öztekin [00:07:11]: So you need to get better with your prompts, the AI will get better with the models and there are things that it will be great on doing and there will be things which they are really not good at. And if our assumption was that if we let them just jump from one tool to another, we may lose that momentum on investing right time. And also we would like to provide some guidance about how to get best about a certain tool. Internal trainings, having the providers coming and telling people how to use those tools. And commercial side, it also helps with the negotiations as well if you go with a few of them. But I fully agree the idea is not to limit or make it very strict so that people feel like, okay, I mean this company is just, just not evolving. This company is against these two. So if you give that feeling and if you just push for a certain thing, I think engineers are very smart to understand that this company is not going to be innovative.
Mert Öztekin [00:08:11]: This company will disappear. So but also giving some guidance helped, helped I believe to us we were quite pushy about the adoption of these tools as well. So in the all hands I was very clear that like you guys have to use this. So there's no option because it's not about AI replacing you, but someone using AI will replace you. So it's not about you losing a job on just it. You are losing your profession if you're not using these things. Because these things will not disappear. These things will be part of our work.
Mert Öztekin [00:08:43]: And the adoption has picked up very much like I think we have more than 95% of the engineers using these tools every day for their day to day work. Obviously we didn't Six months ago we made a calculation that we believe 30, 40% of the code generated is AI generated on the production systems. And it's very difficult to measure that because you have suggestions, they can copy paste, they can use other tools as well. So you cannot easily 100% accurate with this number. But the reality is we also didn't see a 30% velocity increase in the organization because software development is just not just coding. It has lots of functionalities as well. And all these things are getting better over time. But the trust, the learning, the way we are operating is changing over time.
Mert Öztekin [00:09:34]: And I think, I mean we are in a very strange and interesting and exciting time.
Demetrios Brinkmann [00:09:40]: You bring up a really good point. Around 30% or 40% of the code is generated by AI. It's hard to track the metrics on that. I want to get to that in a second. But I also want to raise the other point that you said, which is we didn't see a 30% increase in velocity because that's not the bottleneck. And I have a joke. I'm always saying that the bottleneck usually is the aligning part. It's not the actual like hands on the keyboard part.
Demetrios Brinkmann [00:10:10]: The aligning takes much longer.
Mert Öztekin [00:10:12]: Well, it's getting better as well. As you said, there's just so many factors in in place on affecting your total outputs, capacity or velocity. That depends on the coding, that depends on alignment with your team members, with other teams, with product manager waiting for ux, waiting for qa. So but I think AI is helping in all areas in parallel. So it will be a mistake to just bet on certain process and just optimize it to a maximum and then ignore it. I think it will all evolve at a certain pace. But also the reality is behind the scenes there are also new work that is coming up about being more AI ready or AI available as an organization, as a technology as well. So the way we have architected these platforms were based on the fact that Always the software engineers will scale themselves, the teams will scale themselves.
Mert Öztekin [00:11:10]: We started with monoliths, then microservices, and probably right now we will see a different shift of how software will be architectured so that AI can generate code much more satisfactory than the limitations of today's architecture. So that means additional work will come to the product and engineers as well.
Euro Beinat [00:11:31]: So you mentioned that it's hard to measure. It's not hard to measure. So there are many tools to measure productivity of coding, right? So then. And there are companies that have tools for that job, so you can do that. The question is, are you measuring the right things? Right. So the fact that you have, let's say, more PRs, let's say, or things like this is that, let's say good measurement of productivity. So that remains the thing is that in any case, coding is just one of the work streams. So if you look at the entire company, there's engineering that does many things, including coding.
Euro Beinat [00:12:10]: So there's a group of people that are contributors to code, contributors to the thing. But then this type of, not automation, but in any case agentic use to improve your work can be applied everywhere across the organization. Then interesting part of coding, but that's similar to what you see in legal profession, what you see in customer support, you see in some ways also in marketing is that software development is a well understood discipline. So there are schools, there are trainings, there are coding tools, of course, there are best practices and so on. So people more or less know how to do it and then others can judge if they're doing well or not and so on. So there are parts of the organization where this kind of agentic augmentation follows a fairly well defined path just because the discipline is well defined. And then you have a long tail, an extremely long tail of how to code. It's sort of the way I look at them is tiny total addressable markets while coding.
Euro Beinat [00:13:16]: It is a large total addressable market within an organization. You have thousands of tiny total addressable markets where the total addressable market might be one or 10 people, almost an individual basis. And they can use these tools to make their work more impactful, can be faster, can be better, can be whatever metrics you have. Now that's another very. To me, it's a very interesting segment of agentic automation which usually is not captured by let's say one or two companies, but it is created by every single worker bottom up. So that's one of the reasons why we have been working so much and pushing so hard in Getting in the hands of everybody in the group. Tools such a way that people can create their own agents. Because there might be three people in the marketing department that struggle with that problem.
Euro Beinat [00:14:11]: They repeat the same thing over and over and over. Now they can actually do it much faster. So you can do something else. How do I get that problem if I'm not there? So there's no way that they can invent. It's not like coding. We know what good looks like, but here I don't know what good looks like. They know and they are the only ones that can put together something that makes their work better. So it's a tiny addressable market, but you have it thousand times.
Euro Beinat [00:14:38]: So if you have a very low barrier to creating these agents, then you can create thousands of automations across the organizations which in total they're really impactful. So that's the long tail of, let's say, agents at work.
Demetrios Brinkmann [00:14:54]: And it doesn't make sense for a company to create a product to service the three people in marketing.
Euro Beinat [00:14:59]: Because there's no market.
Demetrios Brinkmann [00:15:01]: Exactly.
Euro Beinat [00:15:01]: There's no market. So you need to have a tool that makes it extremely simple and extremely easy to create these automations for you. Then it makes economic and practical sense. Right. And that's the reason why it was not done until recently.
Demetrios Brinkmann [00:15:15]: It's probably a good segue to go into your bold vision for 30,000 agents in production. What is it next year?
Euro Beinat [00:15:24]: March.
Demetrios Brinkmann [00:15:25]: March, yes.
Mert Öztekin [00:15:27]: Where you are right now.
Euro Beinat [00:15:28]: We are 18.
Mert Öztekin [00:15:29]: 18. Okay.
Demetrios Brinkmann [00:15:30]: Yeah, that's.
Euro Beinat [00:15:32]: We're getting there much faster than that. So I think we're going to go there.
Demetrios Brinkmann [00:15:35]: There was a graph.
Euro Beinat [00:15:36]: That's a graph. It's exponential. So let's go back and think about why is that a good idea, even if it looks silly. I think it's a very good idea because it has to do with change management. It has to do much less with the agents, but has to do much more ability of everybody of removing the barrier that they believe for agents. So back in July, when we have seen the agents created with tocan, the platform that we use for that sort of plateau, in spite of the fact that had all the functionalities, they connect to mcps, they could connect with, let's say dozens of internal systems. But people were not adopting more than the technical people. So only technical people were sorted.
Euro Beinat [00:16:25]: And then we had to figure out the reason. The reason was extremely simple, is because everybody else thought that this is a tool for engineers. Engineers can create agents. So we have to wait for them. So I was going to come in and help and so on and so on. So we had to remove that barrier. And how do you remove it? By sitting with the marketing people, by sitting with HR people and creating these agents in 30 minutes, in 15 minutes, not in a week.
Mert Öztekin [00:16:49]: Right.
Euro Beinat [00:16:50]: And once you go through that process, then you see that you say, wow, is that it? Is that all it takes to create an automation? Makes a difference? Yes, that's all it takes, right? It is, at this point in time is simple. So that what made a difference and it started going up. A few other things happen. When you're at that point. If you have created an automation that helps you, you're going to tell somebody else, hey, I've done this one and it's going to help. Why don't you do it too? And so on and so on and so on. So that mechanism goes into, let's say, a percolation of knowledge that enables everybody to create the agents. What we don't know how useful and how good these agents are, we are going to focus on that afterwards.
Euro Beinat [00:17:36]: First we make sure that everybody feels at ease that within very specific boundaries. Right? So you can do what you want. You have very strong security provisions. But within those security provisions, everybody's encouraged to experiment as much as they can in such a way they can become the owners of those tools. So this is not an engineering tool, it's a tool for everybody. Once you get there, you start getting into what are the good agents, how can make them better? How can you change the processes then that generated this inefficiency in the first place? So how can you change the organization? It's very hard to get there if you don't have everybody on board. So 30,000 agents in fact means it's a change management process. It's a cultural process, it's an organization thing.
Euro Beinat [00:18:27]: It has nothing to do with the technology itself. I actually really, really like that approach.
Mert Öztekin [00:18:32]: And I can just add to what you said, although we just became part of process. So Tolkan is not available to our organization yet, but we will help to 30,000 for sure. So you will make it before March.
Euro Beinat [00:18:43]: Hopefully you'll help. But to get over that, I like.
Mert Öztekin [00:18:46]: The fact that there are two things. One is you put a bold idea there which no one can ignore and that creates a question mark in people. Like what is that? Why 30,000? What is token like? Should I be worried about that? Should I get a training? So you create a demand, you create an expectation, you create a question mark in people's. If you don't go Bold if you don't make it challenging, it's very easy for people to just move on, ignore and focus on their own work. And the second one, I also agree, like, I have a daughter, 8 years old, she's learning to read and write and my KPI is how many pages she reads a day. I actually don't care if she's reading the stories that I want her to read or not. But at this point I want her to read, I want her to understand, I want her to get better, more comfortable with what reading is. And then when she gets more comfortable, then we will discuss about which stories, which values she should get about it.
Mert Öztekin [00:19:46]: But right now, similar to probably what ER was doing is you need to create the first level of understanding about what is going on and everyone needs to jump into that boat by forcing themselves. And the value, how you are going to get the maximum value out of that, it's the next stage. So I think that's the right thing.
Demetrios Brinkmann [00:20:05]: To capture the imagination in a way.
Euro Beinat [00:20:07]: Yeah, absolutely, absolutely. And then as I said before, we always go back to some, let's say of the tests that we did back in 21, it was really, really early on with LLMs and people were really coming back, they haven't seen these tools before. So we stitched together very simple user interface on top of GPT, I think something like that. And we gave it to our colleagues and do something because it looks interesting. We never been able to do good stuff with NLP at this level. So this is really bad at this moment, the time tools, really bad. Right. But we're really promising too, and they were improving really fast.
Euro Beinat [00:20:48]: So if that trajectory holds, then you're going to have in one or two years something which is really surprising and we never had before. But how do you use it? Do I know? So we had ideas, of course. We repeated all the NLP use cases that we had before with LLMs, but then we gave it to about 100 colleagues at that time. Sort of really rudimentary chatgpt where you could test and so on. And people were coming back with things like they created exams for schools. They're completely personalized. They put, let's say a small CV of, let's say students in the prom, so they're doing experiments like that and so on. How did you think about this? Kind of.
Euro Beinat [00:21:30]: It's actually a great idea. It's a great idea. Did I have that idea? No. And so many other things. So that what convinced us that because of the novelty of these tools, because we Never had tools similar like this one. You have to experiment bottom up if you want to find out, let's say, what you can possibly do with these things. But also all the errors that you can find. This mechanism of, let's say collective experimentation is one of the greatest sources of threats, potential errors.
Euro Beinat [00:22:04]: That's how we learn where we should not go. Because then you don't have one team figuring out. In fact, you got 30,000 people doing red teaming all the time, which is also very useful.
Demetrios Brinkmann [00:22:17]: Well, yeah, because you're scratching your own itch. At the end of the day, I have. Whether it's professional or personal, I have this thing that I want to do. Let's try and see if I can build an agent to do that.
Euro Beinat [00:22:29]: Absolutely. And then you can see that somebody else has done an agent on education, just that example. But you can do the same thing for customer support. You connect the dots. Right. And that dot connection is what you need because we just don't have an intuition yet about, let's say, the full potential of this tool. So how do you get to explore the boundaries, just stretching them as far as you can?
Demetrios Brinkmann [00:22:53]: Yeah, it feels like there is a lot of the change management piece here and being able to encourage the employees to spend time with it. As an employee, I would probably think I'm so busy with other stuff.
Mert Öztekin [00:23:15]: To.
Demetrios Brinkmann [00:23:15]: Put the time in, even though I know that later on maybe it's gonna make my life easier. But to front load that pain is kind of a hard ask.
Euro Beinat [00:23:26]: You know that I like the mattress, but I think this is a completely wrong way of thinking.
Demetrios Brinkmann [00:23:30]: Yeah. All right, well, I'm glad that you refute that. Why?
Euro Beinat [00:23:33]: Because I think think these tools are going to be used only if they are immediately helpful to you. There is no way that you can push these tools down the throat of people. By the way, this is smart colleagues. They know how to work. And already they have been optimizing their work, dealing with, let's say, inefficiency all the time. It has to be beneficial to you. And you have to be the one that recognizes that. If that's not the case, you should not do it in the first place.
Euro Beinat [00:23:56]: There's no sense to do something, doesn't help. Now what we can do is that we need to make sure that the barrier to get to that point is as low as possible. That means the tools work. They give access to all the knowledge bases that you have, prevent you to make big mistakes. So you got boundary conditions, provide examples of what others are doing. So the barrier should be very, very low. But once the barrier is low, the only thing that makes sense is that if you do, you create something that helps you. If it doesn't help, you should not do it.
Euro Beinat [00:24:30]: So, yes, it's fine to test here and there, but that's why we're not even thinking about allocating special time for this. It should not be necessary. It should be immediately useful. So otherwise you actually create friction and then you have to deal with friction. Then you have all the other discussions. So I don't think it's a good idea. You also need to recognize then the excellent work that some people do. So one of the instruments, and there are many, is gamification.
Euro Beinat [00:25:00]: So we have this program which we call Prosus as Talent, I think that's the name. But anyway, it's designed around Shark Tank. It's a competition. It starts now. Across the entire group, everybody creates agents and there's going to be, let's say, a committee that selects agents every month. So the best ones are going to be published across the group. They're going to be, let's say, featured. Everybody knows about that.
Euro Beinat [00:25:28]: And then there is this, let's say, crescendo towards March. There's a big event and there's a big selection of the winning agent and the winner will spend the week in the Silicon Valley and many other things. Right. So there is a reason to do that because it makes a difference for you, but it's also something that, you know, it is recognized by others. So you need a lot of these things. But the idea of separate, say, making this a sort of separate program of innovation, I think that's not a good idea.
Demetrios Brinkmann [00:25:58]: Yeah, I like that. And the real key there is you are now creating a very high bar for the product itself, for the agent builder, because it has to be instantly useful.
Euro Beinat [00:26:13]: It has to be immediately as well. And we also accept, and we're really, I think it's a very good idea is that let's say a certain portion of these agents are just experimentation. I'll do it because I really want to stretch the tool and find out what it does, what it doesn't. It's not a waste of time. You know, you just call it learning. And I think it is learning. Learning is not a waste of time.
Demetrios Brinkmann [00:26:37]: Yeah, of course. And you might win a trip to.
Euro Beinat [00:26:39]: Silicon Valley and then you might even get a trip to Silicon Valley.
Demetrios Brinkmann [00:26:43]: Yeah, that's so cool. Maybe it's a good moment to shift a little bit towards the idea of governance and how you're thinking about 30,000 agents, you kind of mapped out, you have boundary conditions, but then there's probably a lot of other pieces that you're thinking about when it comes to the governance of how these agents work and what you're thinking through as you're looking at like, okay, we're introducing AI into the company. What does that mean on a governance level?
Mert Öztekin [00:27:16]: That's an interesting topic, governance in general, governance work when people don't feel that, but it is somehow protecting them without putting so much pain or hassle to the people. So obviously it depends on the area. For software engineers, the governance is if it's generating code or reviewing code, you need to have the technology or the tool sets that is basically governing what your agent is doing. So there should be other agents or people governing what is going on so that it will allow the software engineers to be more flexible, more confident about what they are doing. So you need to build those things. So you cannot just take governance as a Google form, that everyone needs to submit something before doing it, or DPO needs to approve the use cases. So that is bureaucracy that is also needed, don't get me wrong, for cases that is needed as well. But the idea is how we are going to build the governance so people don't feel that there's a governance happening.
Mert Öztekin [00:28:21]: Or it will be quite practical for people to understand that, okay, they are asking these questions or they have these boundaries or limitations to protect me or the business from these cases. In the early days of AI hype, we had almost no governance in terms of tool set selection because there's just so many tools available for marketing, for sales and other stuff. And we have, I mean, it didn't took more than six months to realize that everyone is buying tools without talking to each other and even they're buying the same tool from different departments with different contracts as well, with different pricing. But as you said, it is fine because you also don't want to slow people from adopting or trying these things. There's definitely a cost of it and usually the technology department pays the cost in terms of making sure these things are secure or have a proper procurement as well. But you just want them to try. And after six months, almost eight months, we started to put more governance about, okay, how you can procure an AI tool, how you can launch it, how you can make it, access to these things. But you also try to be practical in these ones.
Mert Öztekin [00:29:33]: You don't want people to feel like, okay, you are killing the innovation of the business. You are just slowing me down. You are just making it harder for me to do just because you don't have the budget. So we don't want to do that. But we started with almost no governance. We are right now in the middle. I don't know if we will go more onto the more strict governance over time. I think our approach should be more at automating these governing tools so that you will feel less about someone is governing or something is governing.
Mert Öztekin [00:30:02]: So we build autonomy within a framework kind of solution for the business. And again it depends on the area. Software engineering is different, procurement is different, launching a sales AI agent is completely different or a customer service use case is completely different. So as long as you are practical, I think you can find a solution.
Euro Beinat [00:30:25]: And so if I can add a few things because you mentioned there's something I think is very important is it's really hard to get these tools to work 100%. So you have to accept that they make mistakes, that they're say they're residual hallucinations still there. And sometimes they're just not the typical tool which is deterministic. If I have, let's say I follow a certain sequence, I'm going to get always the same outcome. So these tools are not like that. That's a nature of that. So you need to accept a little bit that even for experimentation it can go somewhat wrong. It's just the nature of that.
Euro Beinat [00:31:08]: So the question is, what is it? The things that you accept, what are the things that you think they're not acceptable? Well, you have many levers to decide that. So the first one is, let's say whatever authentication authorization these tools have, it should never be more than what people have. It can be less, potentially even less than you. But anyway, no more. So that's one thing. The second thing that at this point in time I think we are sort of uncomfortable letting agents do their work completing autonomy. So there's always a level of supervision and that supervision is at the end of the process or it can be weaved in into the process itself. For instance, we know that prompt injection in many agents is still an issue.
Euro Beinat [00:31:55]: So you want to make sure that if there is a sense that perhaps another model says this is potential risk, you have a pop up, you have something that the human has to say, do you really want to visit this site? Or better, do you really want me to go to that site?
Demetrios Brinkmann [00:32:11]: Or execute this test?
Euro Beinat [00:32:12]: Or execute this test? So you need to introduce a lot of these checks in between which makes, let's say the process potential is lower. But in Any case, you enable more use cases in that a lot of these things are part of the way you design things. The other observation is, how do you figure this out? Which are these measures? By enabling testing, because people come up with additional ideas or issues and threats and therefore you react to that. So to the extent to which you try to limit the worst case, so you really say, what is the worst case? What's the acceptable worst case case? Then you have the baseline on top of that you can develop. But I think part of the experimentation and part of the education to people using this tool is that you have to accept 80, 20, so they will, at this point in time, it's experimenting. They not work. They're not going to work to the level you want. You need to do some work until you get them to do exactly what they want.
Euro Beinat [00:33:16]: Want. Right. But experimentation has to be based on this acceptance. Yes. It's going to be some iteration before I get him to do what I would need.
Demetrios Brinkmann [00:33:24]: Yeah. And Mert, you said something around the sprawl of tools and just being able to grab whatever you needed. And then you kind of had to tighten up the policies a little bit more on that. It reminded me of a piece that my friend told me on how his he was in a data governance role and his job was to go around and check all of the different tools that were being used that were with AI, or whether it was like Claude code or it was a marketing tool that sent data somewhere that was being processed by AI. And he thought, like, yeah, all right, cool. We're not that big of a company. There's probably 10 or 12 maybe. And after he went around to every team, it was like, oh, there's 90.
Demetrios Brinkmann [00:34:14]: And that's not even all of them that I know about, because I'm sure there's people that are grabbing some data and then throwing it into their personal Gemini account or whatever. And so in that regard, I often wonder, like, because it is such a tool that we want to be able to like, enable folks with, but at the same time, you can open yourself up to all these different vectors that you weren't thinking about before. How do you think through that in a way that is like, okay, now we are paying for 90 tools that we didn't realize we were paying for. And these tools, we may or may not be using them to send data off, like through another service. Right. So in a way, I'm thinking about the product management tool that records your calls and then sends all of that, all your calls to ChatGPT. To get it analyzed, you're paying for the product management tool. But then you have to know in that governance position that oh, all of that data is going to chat GPT.
Mert Öztekin [00:35:26]: Yeah, well practically the problem didn't start with AI. We had that problem in the past as well. Still, since technology started, every department can buy a tool for themselves.
Euro Beinat [00:35:42]: And they do.
Mert Öztekin [00:35:42]: And then they do. Yeah, like two, three years ago when the enterprise IT team showed me the number of licenses we have from different tools paid by the credit cards of the the department needs. It's mind blowing. And they had no AI in any labeling or domain name. So this problem always existed. So what we built as a technology department is one how we are going to basically secure ourselves. Because the concern is that you understand people want to make their organization more productive and solve the problem as soon as possible. That's the entrepreneur mindset.
Mert Öztekin [00:36:21]: So that's totally fine. But we also have a responsibility to protect the business and the customers and the partners as well. So for that you need to have the policies that are practical so people can follow and at least even they buy something they can just tell that okay, I'm buying this one. Hey, would you like to integrate to our OKTA system or our security system? Is it behind Zscaler or something else? And you need to educate people about, okay, we know you are building a shadow IT from time to time but at least let us help you so that we won't have a bigger problem as a business. So that's we have been trying to do and I think we are more successful because the technology also helps about understanding which tools are being used. The finance reports tell us what is going on. The network log says that okay, there's so much traffic going in this direction. We try not to prevent it from the beginning but we are trying to educate them to, to help them help us protect them and the business as much as possible.
Mert Öztekin [00:37:23]: Because I think one concern I usually have about the technology department or me or my people is that we sometimes overestimate our capabilities in terms of oh, I can solve their problem, I just need five more engineers or I can also solve marketing problem. But that's not the best use of your five engineers for that problem. There's already solutions out there. Let them just, just outsource this problem to someone. As we have a core business problem, food delivery, grocery delivery, we need to use that 5 engineers to solve our problem. If it's 100k that we will spend, let's spend it as a business. So that's why we need, we need to, I think, empower the people to basically operate in a, in a, in a given framework and help us help them about how we will govern or procure or maintain these things. And I will be honest, I think we didn't nail it.
Mert Öztekin [00:38:21]: We still have painful areas as well. But in order to 100% nail it, then I think you need to have a very strict governance which will eventually come with the cost of innovation in the business. So you need to take certain risks based on, based on your business model, based on what you have, and educate people to help you. That's how we try to solve it.
Euro Beinat [00:38:44]: I totally agree. The education comes in many different forms and so on. But one way you already said that these tools, they make new mistakes, but they can also make new majority of mistakes we already made before. They're just better at them in the sense that you have more ways to do that, you have more ways to make mistakes and so on. So you need to make sure that people are aware of that. But you can also act at various levels. For instance, you can act, let's say at the point in which you say this type of data, and there's a very specific list of data types, should not be used with this tool, full stop. That's it.
Euro Beinat [00:39:30]: So you cannot go there. Then you can also introduce a certain number of friction in various places. For instance, if you have an agent that sends email, is not allowed to send bulk email only one at a time, and you are not allowed to loop it so that you reduce the chance that this is going to start spamming the word and so on. So there are a variety of measures that can be put in place. And if you put them all in place, you eliminate all errors. Then you don't really have much use of these tools. But then you start peeling them away to see what's acceptable, what's not acceptable. You get to the point in which you find, let's say, some sort of balance between the need of innovation and also the need to protect the organization.
Euro Beinat [00:40:20]: That's for sure the case. What is also extremely important is that a lot of the things we learned in the past about all this are still valid. But these tools are also new. So you find out a lot of the vulnerabilities if you want, by doing. Therefore, if you have tools that enable you to test without making big mistakes, then you find the things that can go wrong and then you serve that and then on and on and on and on and on. It's a process of learning the fastest you Learn the fastest you can use these tools at scale for important things, which is very beneficial.
Demetrios Brinkmann [00:40:57]: I want to end with one question that wasn't on the docket, but as we're talking, it's like it's very important. And I've heard folks in the community talk about it. I've heard folks in different communities talk about how they are thinking through, through provisioning resources and provisioning, mainly finance, towards AI because it is such a. Like we saw the graph yesterday, it was exponential. Did you know it was gonna go exponential? Probably not. Were you thinking maybe it could? Yeah, but how are you allocating budget towards GPUs or OpenAI bills or all of the different things that come with the AI if you are going to see the explosion that you're expecting?
Euro Beinat [00:41:45]: So that's a super interesting and important point too, because we see two phenomena. On the one hand, let's say the, let's call the unit cost of intelligence goes down very quick. So for doing the same thing a year ago, and there's half and half and half and half. Right. The token, the cost per million token is also going down. So there is a trend in our favor and all this cost will be, let's say, decreasing over time. At the same time we see another phenomenon is that everything that we do requires more tokens, it costs more. So these two things are one against the other.
Euro Beinat [00:42:19]: Altogether it goes towards being more expensive. So the additional use of tokens overcompensates the reduction of costs. At that point you have many things that you can do. One thing is optimization. Many use cases developed so far are super, let's say inefficient. The prompts are very large, the contest are very, let's say, wordy and all these kind of things. So you need to go back and redesign how things are done because otherwise it costs and so on. There are many tricks and many heuristics to reduce the number of tokens at the output for many use cases.
Euro Beinat [00:43:00]: So people start getting, let's say paying a lot of attention to optimization of this, you start getting to the point in which open source models are competitive with, let's say, the bigger commercial models.
Mert Öztekin [00:43:13]: Not for last night.
Euro Beinat [00:43:14]: Yeah. So not for everything, but for many practical use cases. Then at that point you're just better off hosting the whole infrastructure yourself. Right. Which is another way of covering it. But all in all, it was not a problem last year because volumes were small. And then let's say use cases were not complex. It is an issue now.
Demetrios Brinkmann [00:43:37]: That is exactly what I was thinking because especially as you're scaling much more and you're expecting that usage is not going to slow down anytime soon, like that scale is only going to increase. You're going from 18 to 30, I imagine the next goal, and that's just until March. By the end of next year it's.
Euro Beinat [00:43:56]: 100,000, much more complex. And then let's say the other thing which is obvious now is that the complexity of use cases comes with the increase in cost. Right. In the past, independent of the complexity of the question or the task, the usage was the same. Now there's a correlation between complexity and usage, which also good from the business point of view because you solve a bigger problem. So it makes sense. But still you need to start paying a lot of attention because it becomes a significant operating cost if you let it go.
Mert Öztekin [00:44:30]: Yeah, yeah. I mean, I agree. I think deep inside you believe that the return of investment of this is always positive somehow. Like maybe not in the beginning because the learning curve of wrong, wrong agents are being built or tools are bought. But over time, with people learning, with the technologies getting better, the adoption getting higher, you know that the return of investments is going to be higher for the business. And every business wants to grow more. And I think we all believe that AI is going to be the game changer of how fast the business can grow or how fast the business can fail. So do you really want to try to optimize your costs on these ones or make sure that this company will get 2 times bigger in 5 years time or will shrink half in 5 years time? So that's the concern.
Mert Öztekin [00:45:25]: Obviously you will optimize those things over time. But I think truly we all believe that the return of investment will always be higher for the business if we will make people use these technologies much better than they have been using a year ago or two years ago. So you need to have conversations with your CFO or other leaders to manage that. But at the end, if people are using it, they are finding value out of it. No one is just inherently trying to waste money of the company. They get more out of these things than the company spends. Obviously probably some of them don't have the most optimized prompt or, or the right model. But I think these are very small unit costs at the end.
Mert Öztekin [00:46:11]: That is not the biggest concern of me at the moment. My, my concern is how we are going to grow the company two, three times in five years time and how AI is going to enable that.
Demetrios Brinkmann [00:46:28]: Something funny happened to me six months ago ago I asked Deep research for a report on different GPU providers, and it was absolutely I couldn't figure out what each NEO cloud value prop was, so that set me off on my latest side quest of creating a Practitioner's guide to choosing GPUs. I'm happy to announce this guide is now ready to see the light of day, and you can download it for free right now by clicking the link in the show notes. We've already got some community members feedback of what they wish they would have known before signing a gigantic contract, and I would love to hear from you if this provides any value, or you have some things that the rest of the community should be thinking about when they're on the market for GPUs. Go ahead, download that resource right now, completely free. It's in the link in the show notes.
