The Hidden Infrastructure Behind Every AI Agent // Erica Hughberg // Agents in Production 2025
SPEAKER

Erica Hughberg is a technical leader and community advocate passionate about helping engineering teams build scalable, secure, and human-centric application platforms. With a background in software engineering and a deep understanding of cloud-native technologies, she specializes in driving the adoption of open-source projects like Envoy Gateway, Istio, and Kubernetes Gateway API, which enable organizations to simplify traffic management, security, and API distribution.
As a maintainer of Envoy AI Gateway, she plays a key role in shaping the future of API infrastructure. She focuses on features to ensure organizations can securely and efficiently integrate AI-powered services while simplifying traffic management, security, and API distribution. In the Envoy community, she drives collaboration, mentorship, and contributions that advance the project and its adoption.
Lastly, as a believer in the power of storytelling, Erica enjoys translating complex technical concepts into engaging, accessible narratives in the form of social media posts, conference talks, podcasts, and educational content.
SUMMARY
AI agents aren't just generating content; they're generating traffic. Like any good agent, your AI agent isn’t working alone. Behind the scenes is a mission-critical handler: the AI Gateway. In this lightning talk, we'll explore how gateways are changing to handle the new realities of GenAI: dynamic routing, access control, cost-aware load balancing, model-aware failover, and observability across multi-model environments. If you're building agents or just trying to keep up with the traffic they generate, this talk will help you understand the infrastructure patterns that are emerging to support a new landscape of software.
TRANSCRIPT
Erica Hughberg [00:00:08]: Thank you. Thank you for having me. That's awesome. And yeah, we're going to talk about the hidden infrastructure behind every AI agent.
Demetrios [00:00:18]: I love it. I love it. I'll let you go and rock and roll. And I'll be back in just a minute.
Erica Hughberg [00:00:23]: Thank you. Or is it a secret AI agent? Well, the smart, the capable, the mission driven, the agent. Well, you see, even James Bond didn't work alone. He had a team behind him. Because behind every agent there is a team. There is the one who assigns missions and the one who equips them and the team that keeps the mission running quietly in the background. Today's AI agents, they are no different. They may look autonomous on the surface, but every time they take an action, query a model, hit an API, fetch a tool, someone, well, something has to handle that traffic.
Erica Hughberg [00:01:46]: So here's the question that I want to explore with you. If your agent is Bond, who takes on the role of Q, or rather, what? Because carrying out a successful mission as an agent isn't just about being intelligent, dashing, smart. It's about the infrastructure, the back system, the tooling in the background. Mission control, HQ. Let's lift the curtain on the real hero of the operation, the gateway. Well, what's the mission, though? Because every agent and every secret agent has a mission. But missions aren't completed with good intentions alone. We can't just wing it.
Erica Hughberg [00:02:50]: They require planning, coordination and oversight. And our mission today is to enable those agents. So let's take a look at the brief and take a little look at the situation that we're dealing with. The agents that we're dealing with, they're not just smart, they're talkative. And every decision they're making, it kicks off this flurry of activity. There are calls to OpenAI, to Bedrock, Anthropic, Gemini, requests to internal tools, to databases, embedding services. And there's lots of traffic. There's traffic going all over the place.
Erica Hughberg [00:03:32]: There's lots of it. And each one of those calls, that's network traffic. And it needs to be routed, it needs to be authorized, it needs to be tracked and optimized. So something that could look like a simple task, like summarize it, book it, look it up, it can cascade into a complexity of potential failure points and security risks. So it's starting to get complex to manage. So let's get to work. And the natural answer that many of us get to is we're going to need some sort of gateway. And when we get to this gateway question we ask ourselves, are we going to build it? Are we going to enhance something that already exists? And you may say, well, networking is a solved problem.
Erica Hughberg [00:04:42]: Like we have gateways and traffic management that's already sorted and API gateways, that's really last decade, that's been around forever. And you wouldn't be wrong if you have all of those thoughts and all of those feelings. And it would be a great gut instinct if you just got to the conclusion that yes, let's just enhance something. So the foundation of network proxies like Envoy Proxy, that is almost 10 years old, it's been there for a long time, right? Ten years. It is handling Internet traffic everywhere. And probably more GenAI traffic than you can imagine is going through Envoy Proxy. And I like to think of Envoy Proxy as this exciting little cog handling connections everywhere, this glue that's holding the Internet together. And right now, on this live stream with me talking to you, there is most likely an Envoy proxy somewhere in the chain of network traffic delivering this to you. And what has really changed though? Why can't we just stick in a traditional API gateway and handle all our GenAI traffic? Well, you see, traditional API traffic looked a little bit different from what we are seeing today.
Erica Hughberg [00:06:09]: So Envoy is everywhere, this small cog that's all over the place, right? Traditional API traffic looked a little different. It was light, it was fast, and we tried to make sure it was deterministic, like how long it was going to take to execute something. You see, the agents that we're dealing with today and the type of API requests they are making, you can say they've got expensive tastes, a little bit like Mr. Bond. So their API calls, or the type of network requests they make, are a bit different. And the landscape we are really dealing with is therefore different. We are not talking just about routing from A to B with deterministic requests anymore. We are navigating a landscape that is dynamic and frankly cost sensitive and security conscious.
Erica Hughberg [00:07:09]: Well, we've always been security conscious, but we are dealing with information that we're sharing in a different way. And we are putting these agents on more and more critical missions. So resiliency is becoming increasingly important. So you may feel like you're giving your agent fairly simple requests like just go get an answer, get the job done. But it's not just about getting it done. You need to figure out how, and you need your gateway to help you do so safely, quickly, reliably and affordably. So how do we do it? The challenges that we are dealing with start with dynamic routing, and we talk about dynamic routing because models can slow down, providers can go offline and latency can spike. So we need to be able to reroute traffic. Access control can be: what can this agent access, on behalf of whom, and how do we go upstream to the different providers? Different providers require different types of access, like tokens and OAuth, and different models of accessing.
Erica Hughberg [00:08:29]: Then we have cost-aware load balancing, because different models cost differently, and so do different providers. You may have different prices just from the contracts that you have, and you may also be self-hosting models, and your gateway needs to know which target is worth it for the request that's being sent. Then resiliency. You need to fail over, but you need to maintain the experience, so you can't just fail over from whatever to whatever. So as you switch gears it needs to make sense. Lastly, observability: how are providers and models performing? If you can't see it, you can't fix it. So see it, say it, sort it. Okay, so we are on this mission together to help our agents, to make our agents operate better.
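The cost-aware load balancing and "fail over, but not from whatever to whatever" ideas above can be sketched in a few lines of Python. This is only an illustration of the pattern, not how Envoy AI Gateway implements or configures it: the `Backend` fields, the `tier` label, the prices, and the `send` callback are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Backend:
    """A model endpoint the gateway could route to (all fields are illustrative)."""
    name: str                  # hypothetical label, e.g. "provider-a-large"
    cost_per_1k_tokens: float  # made-up price used only for ordering
    tier: str                  # capability class; failover stays within one tier
    healthy: bool = True       # assumed to be maintained by some health check


def candidates(backends: list[Backend], tier: str) -> list[Backend]:
    """Healthy backends of the requested tier, cheapest first."""
    usable = [b for b in backends if b.healthy and b.tier == tier]
    return sorted(usable, key=lambda b: b.cost_per_1k_tokens)


def route_with_failover(
    backends: list[Backend], tier: str, send: Callable[[Backend], str]
) -> tuple[str, str]:
    """Try the cheapest suitable backend first and fail over within the same tier.

    Returns (backend name, response). Raises RuntimeError if every candidate fails.
    """
    errors = []
    for backend in candidates(backends, tier):
        try:
            return backend.name, send(backend)
        except ConnectionError as exc:
            # Record the failure so it stays observable, then try the next target.
            errors.append((backend.name, str(exc)))
    raise RuntimeError(f"no backend available for tier {tier!r}: {errors}")
```

A caller would pass its own `send` function that performs the real request. Keeping failover inside one `tier` is one simple way to model "as you switch gears it needs to make sense": a request meant for a large model is never silently served by a much smaller one.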
Erica Hughberg [00:09:26]: So what are we doing? I'm personally part of the open source community. We came together to address a lot of different challenges, and I am one of the maintainers of Envoy AI Gateway. I talked a little bit earlier about the Envoy project, and Envoy Proxy is almost 10 years old. We are building on that Envoy foundation to address these challenges I just talked about: the controlling of usage, observing the usage, and the authorization and the failover and connecting to different providers. And we're also talking about the tooling connectivity, the MCP protocol and A2A, right? And this is about how do we continue on the stable foundation to enable that connectivity. So if you enjoy this mission, by the way, you can scan that QR code and join our mission and be part of this. But this has been a really exciting journey. So when we talk about gateways it's really, really fascinating what we need to do to help our infrastructure.
Erica Hughberg [00:10:39]: We have a lot of the puzzle pieces and we have a really good foundation to continue building on. But let's take a look at our post-mission debrief, and one can say that at this point we've done a lot. There are lots of different solutions and opportunities out there, and let's call the mission a success. The agents completed the mission, but they didn't do it alone. The gateway. When the infrastructure becomes too complex, even the best agents stall out: integrations pile up and debugging gets really painful. Innovation slows down, and good infrastructure like an AI gateway helps make that simple. So the AI gateway helps handle the messy stuff, the traffic routing, the auth, the observability, and helps make it all cost efficient, and your developers can focus on what really matters: building agents, building those agentic systems, shipping value and moving fast. And you can really think about this as the traditional platform engineering versus product engineering, really enabling people to move faster and deliver value.
Erica Hughberg [00:12:18]: So are you ready for the next mission? Because you see, even Q got tired of rebuilding the gadgets every time. Because you see, sometimes Mr. Bond, he didn't come back with all of his gadgets intact after his missions. And also sometimes you just want things to work, you want to get going in days rather than weeks. So I wanted just to share with you different ways and some resources for your next mission if you'd like. So here are some QR codes. On the far left here you have a QR code for a reference architecture on AI Gateway and KServe and some other tooling for self-hosting and using Envoy AI Gateway. My name is Erica Hughberg, by the way. I work at a company called Tetrate.
Erica Hughberg [00:13:08]: We have a couple of solutions called Agent Operations Director and Agent Router Service. Agent Router Service is a hosted version of Envoy AI Gateway so you don't have to install it and run it all yourself. And on the far right there you have a QR code for an MLOps Community podcast where I chat with Demetrios about networking, going way down into the details of what's happening. So I'd love for you to listen to that because it's really interesting, because I feel like I only scratched the surface of some of the networking challenges here. So listen to that podcast and you can learn more, and please get involved. So, yeah, this is a bunch of little things, but yeah, some QR codes.
Demetrios [00:13:51]: And in that podcast I will say you did the most incredible job of making analogies and metaphors for the whole state of the Internet and the evolution of the Internet and the history of the Internet and how we've gone and developed on computers since like the year 1995.
Erica Hughberg [00:14:18]: Yes, and I, I definitely really gave up on the idea of trying to fit that into this talk. So I opted for my Bond theme and decided to put the QR code here. Yes, exactly.
Demetrios [00:14:30]: People liked this. This was just the taste.
Erica Hughberg [00:14:33]: So there's no Bond theme in the podcast, but.
Demetrios [00:14:39]: If you like our restaurant and Lego blocks and yes, other kinds of analogies, I think.
Erica Hughberg [00:14:46]: Yes. There. There's a box with toys in it.
Demetrios [00:14:48]: Yeah, the toy box. That's another one.
Erica Hughberg [00:14:50]: There's a toy box in there. We talked about tofu as well.
Demetrios [00:14:55]: Restaurants. Exactly. Humans learn through stories. That's the key takeaway here. And actually, Jyoti, in the chat was saying we need to get you the Bond theme song for this presentation.
Erica Hughberg [00:15:11]: But yeah. So feel free to check out anything. But if you want to join in with any of the open source stuff, feel free as well.
Demetrios [00:15:22]: Also, yes, PRs accepted, right?
Erica Hughberg [00:15:25]: Yes. Ideas accepted. Even if you don't want to contribute code, if you want to contribute a problem you have, I like to say that questions are one of the best ways to start contributing in open source.
Demetrios [00:15:40]: Nice. Oh, that's really cool. Well, Erica, this has been great. We're gonna rock and roll. I think we have our next speaker on Colin and so I will see you later. It's always a pleasure and thank you for this Bond themed presentation. It made me realize that I am very surprised we did not see any Matrix themed presentations. But who knows? The night is still young.
Erica Hughberg [00:16:08]: Thank you.
