MLOps Community

Consumer Facing GenAI Chatbots: Lessons in AI Design, Scaling & Brand Safety // Afshaan Mazagonwalla

Posted Mar 17, 2025 | Views 90
# Google
# ChatBot
SPEAKER
Afshaan Mazagonwalla
AI Engineer @ Google Cloud Consulting

Afshaan is a Staff AI Engineer at Google, where she works on cutting-edge generative AI applications like LLM-powered chatbots, natural language search and multimodal AI for media. As part of Google Cloud's Consulting Practice she advises businesses across industries like healthcare, finance, media, retail and marketing on how to build first-of-a-kind, production scale LLM applications and unlock the power of Generative AI for their businesses.


SUMMARY

Building a GenAI chatbot for millions of users? This session reveals the secret sauce: best practices in LLM orchestration, agentic workflows, and grounded responses, all while prioritizing brand safety. Learn key architectural decisions for balancing latency and quality, and discover strategies for scaling to production.


TRANSCRIPT

Click here for the Presentation Slides

Demetrios [00:00:06]: We have our next speaker coming up from Google. Hello Afshaan, how you doing? Can you hear?

Afshaan Mazagonwalla [00:00:18]: Yeah.

Demetrios [00:00:18]: All right, cool. I hear you. And we've got a talk coming up right that's building consumer facing Gen AI chatbots. Do you want to go ahead and share your screen and then we will start rocking and rolling. I'll be back in 20 minutes to ask you a few questions.

Afshaan Mazagonwalla [00:00:41]: Awesome. Sounds fun. I know this is going to be great. Hope you guys are having a great conference so far.

Demetrios [00:00:47]: Yeah. You haven't seen any of the random videos yet.

Afshaan Mazagonwalla [00:00:52]: I've been tuning in a little bit.

Demetrios [00:00:55]: We'll throw some at you after this talk too. The floor is yours. I'll talk to you in a bit.

Afshaan Mazagonwalla [00:01:02]: Thanks. Hey folks, my name is Afshaan and I'm going to be delivering a presentation on building consumer facing chatbots. There's a lot that goes into building production scale chatbots, but we're going to focus specifically on what it takes for chatbots that live in the wild, that users can interact with, where it's basically a B2C application. It's consumer grade, and we've learned lessons in how to design these agents, how to scale these agents, and how to protect them and preserve your brand safety. Let me just give you a quick introduction about me. I'm Afshaan. I work for Google Cloud Consulting and, in short, I help organizations build production scale GenAI applications every day, day in and day out.

Afshaan Mazagonwalla [00:01:46]: I work with a variety of different companies, big and small, from people who want something local to people who want robust systems. I largely work with the builders, people who are building first-of-a-kind applications. So there's a strong engineering focus in everything that I do. There's always an emphasis on scaling and there's always an emphasis on catching the next wave. So whenever there's a new interesting application coming along, you can trust that me and my team will be there. We've done some really interesting work. We built the chatbot for the Paris Olympics in 2024.

Afshaan Mazagonwalla [00:02:20]: I've been working with some cool gaming companies on character generation, working with marketing companies on brand creatives, both on the image side as well as for text, using generative AI, doing some cool work with fine tuning, so the list never ends. And here are some of the companies I've worked with before being at Google. Let's dive right in. You're trying to build an LLM agent in 2025. What's the kind of user experience that your customer is expecting? A user wants to go on your chatbot and have a very seamless conversation. They want to feel like the AI truly understands them. They want a hint of personalization. They want an unbroken conversation, to make referential statements, and to be able to quickly access information from a variety of different sources, synthesize it, take action, and do it all very quickly and very intuitively.

Afshaan Mazagonwalla [00:03:14]: So that's where the bar is right now. And you can see there's no dearth of agents: AI shopping assistants, planning agents, heck, even my Whoop talks to me and makes recommendations for what workouts I can do. So safe to say that AI chatbots, generative AI agents, are here to stay. And as LLM practitioners and AI engineers and designers, it's up to us to keep pushing the boundaries of what we can achieve with this experience. There are some challenges in building a system like this, right? I assume that everyone here has made an LLM call in their free time or built complex systems that do interesting orchestration. But when you expose these LLMs to users, we often face challenges that we don't always anticipate. For example, users may ask weird questions to your chatbot. They might ask open ended questions, out of domain questions.

Afshaan Mazagonwalla [00:04:11]: Or as a company, you might want to capture complex user journeys. Typically a user might come to your website and click a few buttons and go through a particular UX flow, but we're all trying to automate that away through AI generated or reasoning based actions that they can take. So being able to design those complex multi node user journeys, as well as trying to understand and interpret what the user is actually asking me, is a non-trivial and hard problem. The second thing is, to make this AI agent good at anything, you want to be able to connect it to data. And you do that for two reasons. The first is that you want it to have access to your own proprietary company data. That could include things like customer records: show me the last product that I purchased, or tell me about the last time I did a weightlifting workout.

Afshaan Mazagonwalla [00:05:04]: All of that personal data is available. There are company proprietary documents that you might want to index, as well as sometimes benchmark against Internet scale searches. You want to be able to quickly switch between structured data sources and unstructured data sources. You want to be able to build access control for some of these. So data design and database design gets pretty complex very quickly. The other thing we've observed is that AI agents are getting really, really smart, and you could effectively instruction-tune them or prompt-tune them to do fairly complex tasks. But for all that agentic reasoning is, it's also extremely slow. If you ask an LLM to, say, go figure this out, it's going to take multiple seconds to get to an answer.

Afshaan Mazagonwalla [00:05:51]: And that might not be an SLA that's acceptable for a public facing application. And then finally, LLMs are expensive. Any AI driven action that you want to take involves being very mindful and considerate of your token lengths, your chat memory, and any costs associated with them. This is not as simple as deterministic Python, so the costs balloon up really quickly. Finally, users are weird and they ask weird questions. And so you want to be able to protect against both bad actors as well as curious people who just want to mess around with your chatbot. So we'll talk about this in great, great detail: how do you establish brand safety and how do you build safety guardrails for your AI agents? Before I go any deeper, I want to just quickly talk about how we got here, right? How are we able to build these really interesting shopping assistants, flight booking assistants, and how did we get here in the past year? So in the beginning there were just models.

Afshaan Mazagonwalla [00:06:54]: You would make a request to an LLM and get a response back. But as you would all be aware, if you just made an LLM call, that would be prone to hallucinations, and it would be limited to the training knowledge that was available. Think of a model knowledge cutoff date; you'd be restricted to that. Then, as this community should be very well aware, we all jumped on the RAG bandwagon, where we started connecting a lot of different components and data sources in order to provide those grounded responses. RAG comes with its own challenges: how do you design these systems, and so on. The extension to RAG, truly, is that now we have agents that can make autonomous decisions, that can interact with a suite of tools available to them, perform complex reasoning tasks and take actions in the wild.

Afshaan Mazagonwalla [00:07:43]: Let's talk about what an agent is. From this diagram you'll see that the model or the LLM is a very small part of what an agent looks like. It's not just about the LLM; the star of the show truly is the orchestration pipeline. How do you chain together a bunch of prompts based on your company's goals, your profile, and all of the instructions you provide? How do you string together conversational memory, and what sort of reasoning or planning models do you use to help the LLM or the agent make sense of all of the information it is capturing over time? In order to do this, you have a suite of tools that you might use, and you might design specific bespoke tools to do one single independent task. Let's take a look at an example of that. If a user comes in and says, hey, I want to book a flight from Austin to Zurich, the agent goes through this thought process or reasoning loop where it interprets the user's question as: okay, this user wants to book a flight, therefore I must search for flights, therefore I must access my flights tool, and then provide as input to my flights tool the source city and the destination city that my user is asking for. Going from a user question like this and then decomposing it into multiple segments like this is what constitutes the reasoning loop.
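A minimal sketch of that decomposition, with an invented `search_flights` tool name and toy keyword parsing standing in for the LLM's actual interpretation step:

```python
from dataclasses import dataclass

@dataclass
class ToolCall:
    tool: str   # which specialized tool the agent decided to invoke
    args: dict  # the inputs the agent extracted for that tool

def plan_step(user_query: str) -> ToolCall:
    """Toy stand-in for the agent's reasoning step: interpret the
    question, pick a tool, and extract its arguments. A real agent
    delegates this interpretation to the LLM."""
    if "flight" in user_query.lower():
        words = user_query.split()
        src = words[words.index("from") + 1]
        # look for "to" only after "from", so "want to book" is skipped
        dst = words[words.index("to", words.index("from")) + 1]
        return ToolCall("search_flights", {"source": src, "destination": dst})
    return ToolCall("fallback_answer", {"query": user_query})

call = plan_step("I want to book a flight from Austin to Zurich")
```

The point of the sketch is the shape of the output, not the parsing: the reasoning step always ends in a structured (tool, arguments) pair that the orchestrator can execute.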

Afshaan Mazagonwalla [00:09:11]: There's this interpretation of the question in order to decide what action to take. That's one part of the reasoning loop. And then beyond that, you might get a response back from the tool. Here's all of the options that I found for flights from Austin to Zurich, and how do I best present these to the user? Here's where you could bake in some business rules. Let's say you're trying to maximize shareholder value, so you'd be like, let me put the expensive flights first, unless the user explicitly asked for the cheapest flight, and so on. Beyond just the agent reasoning, you have as a developer a lot of flexibility in designing this reasoning flow. You have to work with your business stakeholders, your product leads, your UI designers in order to determine or suss out what parameters might actually be useful to the company that you're building this for. In the process of constructing this reasoning loop, you want to be able to access a variety of different tools or specialized APIs.

Afshaan Mazagonwalla [00:10:14]: I use API very broadly. Largely, tools tend to be APIs, but they can be fairly complex. You can also chain together multiple agents, and we'll talk about that more in a second. But the whole point of having multiple tools is that you have single procedural calls that focus on only one task at a time. For example, you could have a highly specialized tool or an agent that only focuses on math calculations. You could ask it things like, what's the distance between city one and city two? And it might take in parameters like latitude and longitude, or what's this going to cost me? You might have specialized tools for coding for specific search and something that's connected to your company's databases. The landscape is fairly endless here, and certainly up to the designer or the AI developer to determine how to structure these tools. The other thing I would call out here is we've made a lot of progress in coming up with automated reasoning.

Afshaan Mazagonwalla [00:11:13]: The agents that we get out of the box right now, or the way we interact with agents in popular frameworks like LangChain, LangGraph, CrewAI, they come with a lot of this reasoning baked in. But I've been doing this since before the tooling got so, so sophisticated and complex. And so one thing that I wanted to point out was that consistent generation of a payload is very hard from some of these AI agents. Right? We're getting there: LangChain supports response MIME types, Gemini supports controlled generation. But there was a time where you effectively had to tell the LLM, always give me a response as a JSON payload, and don't mess it up. Because the first blocker would be that you get a malformed JSON, or it's not able to extract the user query, not able to extract the source city, and it's just giving the wrong answers.
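A defensive pattern for that malformed-payload problem, framework-agnostic and with an invented key contract (`source`/`destination`) for illustration: validate the model's output before the orchestrator touches it, and signal the caller to retry rather than crash.

```python
import json

def parse_payload(raw: str):
    """Defensively parse a model response that is supposed to be JSON.
    Models sometimes wrap JSON in markdown fences, so strip those first.
    Returns None on any contract violation so the caller can retry."""
    text = raw.strip()
    if text.startswith("```"):
        text = text.strip("`").strip()
        if text.startswith("json"):
            text = text[4:]
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return None
    # Enforce the API contract: must be an object with the required keys.
    if not isinstance(payload, dict) or not {"source", "destination"} <= payload.keys():
        return None
    return payload
```

With controlled generation in modern APIs, much of this is handled for you, but a validation layer like this is still a cheap safety net at every stage of the orchestration pipeline.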

Afshaan Mazagonwalla [00:12:07]: Those were some of the challenges that we used to have prior. But we're getting to a point where all of this is getting automated and baked into the tools that we would typically use. And then finally, as you go through this entire reasoning loop, there's also this answer generation or summary stage, which is the action that the agent would take in the wild. An action could be, go book this flight for me, or a simple action could be, just give me the answers back. Here's where you would also use your design thinking: work with your business stakeholders to determine what's the right tone of voice, what are some of the things we want to highlight from the information that we've got. As we start to build these agents and applications, I think it's very important to have that cross functional partnership. You might think that as a developer you know intuitively what needs to be done, but you'd be surprised, when you go from the MVP stage to the final stage, how many opinions come out and how many parameters or decisions there are that you may not have thought of. So that's my first design principle, which is always break the LLM behavior into meaningful subtasks.

Afshaan Mazagonwalla [00:13:12]: LLMs perform better when they have small chunks of responsibility that they can handle, rather than them being really big or really vast. So for example, one of the things we can do right now with Gemini's very long context windows, you could effectively put your company's entire database into Gemini, or you could put a whole document into Gemini or any other LLM application call. But the beauty of building these applications and making them robust for production use is that when you're able to break up these tasks into subtasks, you get much more superior performance. That's all I will say right now. Then, for folks that have been doing this for a while and have experienced tool calling as a basic form of agent design, I wanted to call out a few more agent application patterns. We don't have time to go into this in great detail, but you could have an agent reflect on the action it's about to take. You could have an agent think about all the steps it's going to take, plan a task, and only then execute.

Afshaan Mazagonwalla [00:14:19]: Or you could have the semantic routing pattern, where an agent effectively switches between domains. So, for example, go book me a flight, or book a hotel, or book a car rental. And all of these design patterns tend to be very similar intuitively, but the differences really come through when it comes to things like latency. For example, the difference between semantic routing and parallel delegation is in the name: instead of waiting for a task to come through and running things sequentially, or running things as a routing loop or as a graph, you could effectively take just as much information as you need to start parallel sub-processes, and then have an agent that takes in all of the responses from multiple parallel tasks, summarizes them, and gives them back to the user. So we don't have time to go into these in a lot of detail, but I would recommend a great resource by another Google colleague, Arun P. Shankar, on agent workflow patterns. It's actually a GitHub repo that has some code snippets and abstract classes, which will really help you get a hang of this. Great, we'll keep going. The other critical thing that you need to think about as you design these agents is how you are going to store data and how you are going to track memory. As we've built RAG systems, block two will look familiar to a lot of people, right? So block one is just make an LLM call, but block two is pick up information from some sort of database after the fact.
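The parallel delegation pattern described above can be sketched with plain `asyncio`; the sub-agent names and their canned return strings here are invented, standing in for real LLM and tool calls:

```python
import asyncio

# Hypothetical sub-agents; in a real system each would make its own
# LLM and tool calls, so the sleeps simulate that I/O latency.
async def flight_agent(query: str) -> str:
    await asyncio.sleep(0.01)
    return "flight options found"

async def hotel_agent(query: str) -> str:
    await asyncio.sleep(0.01)
    return "hotel options found"

async def plan_trip(query: str) -> str:
    # Parallel delegation: launch both sub-tasks at once instead of
    # routing to them sequentially, then merge the results. A real
    # summarizer agent would do the merging with another LLM call.
    results = await asyncio.gather(flight_agent(query), hotel_agent(query))
    return " | ".join(results)

result = asyncio.run(plan_trip("Trip from Austin to Zurich"))
```

The latency win is that total wall time is roughly the slowest sub-task, not the sum of all of them.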

Afshaan Mazagonwalla [00:15:52]: And then block four here, which I will highlight, is another area that will be familiar to a lot of ML practitioners: at the core of it, you want to be able to do semantic retrieval on some sort of database, right? You want to be able to do semantic retrieval on some sort of embedding. Maybe it's stored, maybe it's capturing all of your product descriptions. And you want to say, give me headphones, and you're going to look against all product names that have headphones. Or you might look at something that says, what are the best headphones under $150? If your user gives you a query like that, you go through a user entity extraction phase where you extract out the fact that they're looking for headphones, they're looking for the best headphones, and that they're looking for headphones under $150. And you might not search for all of that content in one shot.

Afshaan Mazagonwalla [00:16:44]: You might do a semantic search on your database, try to look for headphones so you get all your possible headphones. Then you might bake in something like a social signal. So a social signal might say, okay, here's the best reviewed headphones that have been liked in 2025. And so one of your nodes in your orchestration pattern could involve social signals. And then finally, you could apply filters on the semantic search to make sure that you only keep headphones that are under $150. So this gives you a little bit of a sneak peek around all of the ways you might build these tools or how you might decompose multiple tasks based on a user's question. The other thing that I wanted to call out here is that because users expect these chatbots to be very seamless, very conversational, you want to be able to have a notion of short term memory, long term memory. A lot of agentic frameworks like LangChain and a lot of LLMs themselves provide the ability to save past interactions.
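The headphones flow above (semantic retrieval, then a social signal, then a structured price filter) can be sketched end to end; the product catalog, two-dimensional "embeddings", and the 0.8 similarity threshold are all invented for illustration, where a real system would use a vector database:

```python
import math

# Toy product catalog with fake 2-D embeddings and social signals.
PRODUCTS = [
    {"name": "BassBoost headphones", "vec": [0.9, 0.1], "price": 120, "likes": 500},
    {"name": "StudioPro headphones", "vec": [0.8, 0.2], "price": 250, "likes": 900},
    {"name": "TravelMug",            "vec": [0.1, 0.9], "price": 20,  "likes": 50},
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def search(query_vec, max_price):
    # Step 1: semantic retrieval — keep products similar to the query.
    hits = [p for p in PRODUCTS if cosine(query_vec, p["vec"]) > 0.8]
    # Step 2: social signal — rank the hits by how liked they are.
    hits.sort(key=lambda p: p["likes"], reverse=True)
    # Step 3: structured filter extracted from the query ("under $150").
    return [p["name"] for p in hits if p["price"] <= max_price]
```

Each step here maps to one node in the orchestration pattern, which is exactly the task decomposition the talk advocates.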

Afshaan Mazagonwalla [00:17:47]: Let's say you have turn one of a conversation: hey. Turn two of a conversation: what's going on? Turn three of a conversation: what's the weather like? It's storing all of those conversations in some in-memory database, or it's making them part of the agent object. But as these histories grow, you might want to store current contextual conversations in a different database. So that becomes a design decision that you need to take, as well as: what are some things I want to have readily available to my agent? You might not keep the prompt always available in the agent system prompt; you might retrieve it. We could go into detail here, but I just wanted to give you a quick overview, because when it comes to tool calling, it's not quite as simple as making a single tool call and getting a response back. As you design these production systems, you want to think about: should this data be in context, or should it be outside? How should I store memory? What sort of databases do I need to connect to? If you work in any sort of enterprise environment, you find that there are multiple hops to getting at the data that you need.
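The short-term memory design decision above, keeping only the most recent turns in context, reduces to a few lines; the cap of four turns is an arbitrary illustrative choice, and a real system might summarize or persist the older turns instead of dropping them:

```python
MAX_TURNS = 4  # design decision: cap conversational depth to control token cost

def trim_history(history: list) -> list:
    """Keep only the most recent turns in the agent's short-term memory.
    Older turns could be summarized, or moved to a long-term store."""
    return history[-MAX_TURNS:]

history = [{"role": "user", "content": f"turn {i}"} for i in range(10)]
recent = trim_history(history)
```

This is the simplest possible policy; the point is that *something* must bound the history, or token costs and latency grow with every turn.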

Afshaan Mazagonwalla [00:18:58]: Great. You'll build a complex application, you'll connect it to bespoke data sets. But public enemy number one is that the app is too slow. You love your chatbot. You built it with a lot of blood, sweat and tears. So you might be waiting there thinking, oh, it's going to give me an answer, it's going to give me an answer right now, I'm just going to wait a couple seconds.

Afshaan Mazagonwalla [00:19:17]: But users are just going to bounce. If your AI agent takes more than a couple seconds to respond to something that they perceive as a simple question, they will not have the patience to stick it out. So as you go from MVP to production scale, you might have to give up on some of these cool LLM features. You might not be able to use the best reasoning model, or you might not be able to get an agent that autonomously understands everything it needs to do. In fact, you need to bias for action, and you might need to go back to some programmatic subroutines that you might want to run. So this is a little bit of a gut punch.

Afshaan Mazagonwalla [00:19:56]: If you like designing agents and if you like playing in the bleeding edge of research, just have some compassion for your users and get them an answer real quick. That's one thing I want to call out. So what are some strategies in order to build extremely fast applications that have low latency? One of the trends that we've seen, and this is like one of my favorites. Oh, am I running out of time? I'm not gonna have time. Great. A couple of things that I've seen are really helpful are smart caching. So when we worked on the Olympics, we didn't know what sort of topics would be a hit with users, what athletes might get popular. We certainly didn't know that a particular Australian breakdancer would receive so much fanfare from the world.

Afshaan Mazagonwalla [00:20:43]: But what we did end up building is smart caching. We're dynamically capturing common questions that users might ask and caching them in something like Redis or a memory store. Then you're able to not make an LLM call for common questions. Different companies can design smart caches based on their applications. If you know the iPhone 16 is launching, you want to be able to cache the response for that. And then cache design can get really complex. We've done some really interesting things where we've had regional caches, because for one of our applications we needed precise times returned in users' time zones. We still got much better performance by replicating our cache across eight regions, so that users in New York saw a different time and users in LA saw a different time, and we were capturing each of their user interactions while replicating our cache. That's all I will say on latency.
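The core of the smart-caching idea fits in a few lines; this sketch uses an in-process dict where production would use Redis or Memorystore with region-aware keys, and the normalization (strip plus lowercase) is a deliberately crude stand-in for real semantic deduplication:

```python
cache = {}     # production: Redis / Memorystore, possibly replicated per region
llm_calls = 0  # instrumented so we can see when the cache saves a call

def expensive_llm_call(question: str) -> str:
    """Stand-in for a real model call; we only count invocations here."""
    global llm_calls
    llm_calls += 1
    return f"answer to: {question}"

def answer(question: str) -> str:
    key = question.strip().lower()  # crude normalization so trivial variants share a key
    if key not in cache:
        cache[key] = expensive_llm_call(question)
    return cache[key]

answer("Who won gold in breaking?")
answer("who won gold in breaking?  ")  # cache hit: no second LLM call
```

The latency win compounds: every cache hit skips both the model's inference time and its token cost.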

Afshaan Mazagonwalla [00:21:39]: I will skip the stuff on concurrency and scaling because we can have a conversation about this later. But TLDR, you want to make sure that you can handle your maximum traffic. So make sure you load test, make sure you reserve GPUs, because inference can get expensive and you never know when your app just takes off. And finally, GenAI systems are probabilistic, so always make sure when you do load tests you also run your prompts multiple times to expose any prompt vulnerabilities. Let's say it always gives a particular answer, but every now and then it deviates from that answer; the load testing stage is when you would catch that. Finally, okay, I'm running out of time. Let me see how I can do this. I mentioned before that LLMs can be costly. So a couple of strategies that we found: you can limit conversational depth, right? Users are going to love chatting with your chatbot, but you don't want to hear everything about their lives.
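The run-your-prompts-multiple-times advice above can be sketched as a stability check; the `flaky_model` here is a seeded stand-in for a probabilistic LLM (the 5% deviation rate is invented), and anything below a 1.0 pass rate would flag a prompt vulnerability:

```python
import random

def flaky_model(prompt: str, rng: random.Random) -> str:
    """Stand-in for a probabilistic LLM that occasionally deviates
    from its usual answer. The 5% failure rate is made up."""
    return "expected answer" if rng.random() > 0.05 else "deviant answer"

def stability_check(prompt: str, runs: int = 200, seed: int = 0) -> float:
    """Replay one prompt many times and report the fraction of runs
    that matched the expected answer."""
    rng = random.Random(seed)
    hits = sum(flaky_model(prompt, rng) == "expected answer" for _ in range(runs))
    return hits / runs

rate = stability_check("What is the refund policy?")
```

Folding a check like this into load tests means you stress concurrency and prompt reliability in the same pass.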

Afshaan Mazagonwalla [00:22:39]: You want to build agents that bias towards action. Companies are paying you to build chatbots that drive users to conversion or some sort of KPI and metric. And so it's okay to ignore all of the conversations that you've had in the past. Keeping the costs low, keeping the conversational depth low, means that you can control the token sizes and you're not going to run up a huge bill in your company. And then finally, I'm glad I made it here, because I definitely want to talk to you guys about brand safety. You might think that your agent can handle anything, but I have examples of agents just going completely off the rails and having very public failures. For example, Air Canada's chatbot invented a refund policy that a consumer then took to court, and the airline had to pay up. NEDA came up with a chatbot that promoted diet culture, and so that received a lot of online flack.

Afshaan Mazagonwalla [00:23:33]: And then finally, someone was able to prompt hack a shipping company and get the bot to swear at the customer and call the company the worst delivery firm in the world. So these are major risks to your company's brand reputation. And as LLM engineers, AI engineers, we need to design safety guardrails against these sorts of incidents. This is very critical before you launch your application in the wild. Quick strategies, and I'll share some resources on this: you want to use the API level AI safety filters. You want to sanitize your model inputs. Let's say a user says a swear word.

Afshaan Mazagonwalla [00:24:16]: You don't want that going into the agent. And if somehow they manage to prompt the agent into saying a swear word, you want to be able to do output sanitization. There are other strategies for sensitive data protection and PII redaction, so that no sensitive data can ever be leaked if people get access to your agent. Then finally there's this notion of HTML injection, which you can control by having smart API contracts between every stage of your orchestration process. There are non-malicious prompt hacking strategies like, talk to me like a pirate, or call me honey, or tell me about politics in this particular region. These might look non-malicious, but they're still a big risk to your company's brand. So I advise designing for these in your orchestration pipelines based on business rules and other feedback you get from business stakeholders. Okay, that's it.
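A bare-bones sketch of the input/output sanitization layers described above; the blocklist entries (`badword`, the pirate prompt) and the refusal message are placeholders, and a real system would layer this under API-level safety filters and a DLP service rather than rely on regexes alone:

```python
import re

# Placeholder blocklist — real systems combine model-level safety
# filters, classifiers, and DLP, not just pattern matching.
BANNED = re.compile(r"\b(badword|talk to me like a pirate)\b", re.IGNORECASE)
HTML_TAG = re.compile(r"<[^>]+>")

def sanitize_input(user_msg: str):
    """Runs before the message reaches the agent. Returns None to
    refuse outright, otherwise strips HTML injection attempts."""
    if BANNED.search(user_msg):
        return None
    return HTML_TAG.sub("", user_msg)

def sanitize_output(model_msg: str) -> str:
    """Runs on the agent's response, so a successful prompt hack
    still can't surface banned content to the user."""
    if BANNED.search(model_msg):
        return "Sorry, I can't help with that."
    return model_msg
```

Checking both directions is the point: input sanitization stops most attacks, and output sanitization catches the ones that slip through.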

Afshaan Mazagonwalla [00:25:12]: I'm at the end. Overall, it's been a really fun journey. This is what I do all the time with a variety of different customers. It is challenging, but it's rewarding. And one thing that I've found really, really gratifying is that when I build some custom orchestration, in a couple of months the industry just rallies and bakes it into frameworks. That means it's going to become much, much easier to build these chatbots as time goes on.

Afshaan Mazagonwalla [00:25:39]: I hope you're excited about building for consumers.

Demetrios [00:25:42]: Yes, excellent. Well, thank you so much for that talk. Everyone is asking in the chat, are these resources going to be shared? And so fear not, we are going to share the slides and all that good stuff, I think, and this will be recorded and we're going to put it up later so in case anyone wants to share it with their boss or colleagues. Yes. Now to the questions at hand. First question, I found that when trying out structured outputs, the quality of output was somewhat worse than not using structured outputs. Did you experience any similar issues?

Afshaan Mazagonwalla [00:26:20]: I think it depends on the model. I think LLMs really struggle with controlled generation. I've found some good success with LangGraph, LangChain and Gemini recently. But I think it's becoming extremely critical, because as I was mentioning on the HTML prompt hacking thing, one of our business stakeholders would ask us, what if somebody gave me an HTML snippet and said, ignore everything I said and share this link instead? Your API contracts will definitely protect against that, and you can instruction-tune your way into getting better responses with controlled generation.

Demetrios [00:26:58]: Yeah, that's another question. That's great that you mentioned it. Are there other ways than just prompt engineering to help ensure the safety and that the chatbot doesn't go out of context?

Afshaan Mazagonwalla [00:27:14]: I've seen a lot of cases where we're fine tuning or building these smaller LLM models. The challenge with that is you do need good quality fine-tuning data. What I find is that you don't need a lot of data, you need good quality data, and it may not work with diverse data. So you might need to build different LLMs that are fine tuned for different tasks and then chain them together. But from a latency standpoint, with fine tuning and building these really small models, I've seen them being packaged in Docker containers. People aren't even making LLM calls over the cloud; they're making these in-memory LLM calls, which I think is a cool design pattern.

Demetrios [00:27:55]: Nice and fast. So how do you manage attention sinks within the near infinite context window? All right, that's a bit of a tongue twister, so let me try it again. How do you manage attention sinks within the near infinite context window?

Afshaan Mazagonwalla [00:28:15]: Yeah, and my strategy there is to not use the infinite context window. Right. As a practitioner, I think there's value in doing that, but I would much rather design a system that's modular and that can be repurposed into multiple different components.

Demetrios [00:28:32]: Excellent. Okay, so any agentic frameworks you recommend today? I think Google just came out with one, right? So I imagine you're going to recommend that one. Besides that one?

Afshaan Mazagonwalla [00:28:46]: I love it. There's a lot of competition in the space. Yes, Google's working on an agents framework, Reasoning Engine. Within Google we've got multiple different ways of doing this. We have agentic frameworks for low code, no code, as well as for hardcore developers that want extreme control over their chatbot. So we're going to be making a lot of announcements on what we've been working on in April at Next.

Afshaan Mazagonwalla [00:29:08]: I'll be speaking there as well, so if you're coming, come say hi. But yeah, I'm not going to be biased. I think use whatever makes sense for your company, whatever is integrated with your cloud, or the best in class open source. LangGraph is great. It's really picking up, and it's got great community support. So I think as a developer you always have the freedom to choose whatever you like.

Demetrios [00:29:28]: Yeah, totally. Some other ones that I've heard are popular within the community these days, but still very new, like Pydantic AI. And of course CrewAI. So this is a great question coming through, and then I'll let you go because we're kind of over time: any insights on how organizations split the work among teams and responsibilities when it comes to developing AI for production? Is it all taken care of in a single AI team? Or are there splits where data engineers take over RAG while software engineers take over agent orchestration or security, that type of thing?

Afshaan Mazagonwalla [00:30:11]: I love this question, because I've seen such diversity. In fact, one thing that I've found really interesting is working with your UX and your design teams, because the customer user journey is the vision and the brainchild of your design team. You might be building the entire deterministic stack of how users might, let's say, book a flight on your chatbot. But in this new conversational experience, there are a lot of non technical stakeholders who we want to give a voice to, and we want to listen to them. But I think having a strong PM helps; I've definitely been in that PM persona as a consultant. Sometimes you have to translate designer speak to engineer speak. We had an incident where designers said they wanted results ordered in a certain way, and the engineers were like, there's no way this can be done, this is so stupid. And sometimes you have to mediate those opinions between the non technical stakeholders and the technical ones.

Afshaan Mazagonwalla [00:31:09]: But also, I think the question touched on whether we should have a platform team versus an AI team. And my only advice to everyone here is, it's getting easier to do software engineering. So I would much rather say, if you're on the AI development side or the research side, definitely learn a little bit of what you need to do to build these production systems. Because it's not that hard, right? You could learn it. And vice versa: people who just have a software background are coming the other way, because prompting is not that hard. So everyone's converging.

Demetrios [00:31:41]: I think, yeah, it's not that hard until it is. And then next thing you know, you've spent three hours trying to fine tune a prompt and you're like, why the hell did I just waste my life on this? So this has been awesome. I really appreciate you coming on here, and I fully agree. Just finishing off the thread on that question: I've seen some folks that will have a product that is an AI product and a whole team around that, and then they have AI folks that are embedded into different functions of the company, like the finance team or the HR team, to help these finance or HR people use that AI product, because now it is accessible to anyone who can write words, which, yeah, that's a lot of us. So this has been great.

Demetrios [00:32:35]: I really appreciate it. But look at that, we're over time again and I'm gonna keep it moving. I will say if anyone wants to continue the conversation with you, you're on LinkedIn. You're awesome there. I love following you and seeing what you have to say. So see you later and a huge thank you.
