MLOps Community

Agents in Production - Prosus x MLOps
31 items

Chris Fregly & Demetrios Brinkmann · Feb 24th, 2026
Performance Optimization and Software/Hardware Co-design across PyTorch, CUDA, and NVIDIA GPUs
In today’s era of massive generative models, it's important to understand the full scope of AI systems' performance engineering. This talk introduces the new O'Reilly book, AI Systems Performance Engineering, and the accompanying GitHub repo (https://github.com/cfregly/ai-performance-engineering), giving engineers, researchers, and developers a set of actionable optimization strategies. You'll learn techniques to co-design and co-optimize hardware, software, and algorithms to build resilient, scalable, and cost-effective AI systems for both training and inference.
# NVIDIA GPUs
# CUDA framework
# GitHub repo
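A staple of the kind of performance engineering the talk covers is the roofline model: before optimizing a kernel, estimate whether it is compute-bound or memory-bound. A minimal back-of-envelope sketch, with illustrative (not vendor-spec) hardware numbers:

```python
# Roofline-style check: is a GEMM compute- or memory-bound?
# The peak-FLOP and bandwidth numbers below are illustrative assumptions.

def gemm_arithmetic_intensity(m: int, n: int, k: int, bytes_per_elem: int = 2) -> float:
    """FLOPs per byte moved for C[m,n] = A[m,k] @ B[k,n] (fp16/bf16 by default)."""
    flops = 2 * m * n * k                                   # one mul + one add per MAC
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)  # read A and B, write C
    return flops / bytes_moved

def is_compute_bound(intensity: float,
                     peak_tflops: float = 989.0,   # assumed fp16 peak, TFLOP/s
                     mem_bw_tbps: float = 3.35) -> bool:
    """Compare intensity against the ridge point: peak FLOPs / peak bandwidth."""
    ridge = (peak_tflops * 1e12) / (mem_bw_tbps * 1e12)
    return intensity >= ridge

large = gemm_arithmetic_intensity(4096, 4096, 4096)
small = gemm_arithmetic_intensity(1, 4096, 4096)  # GEMV-like decode-step shape
print(f"square GEMM: {large:.0f} FLOP/B, compute-bound={is_compute_bound(large)}")
print(f"skinny GEMM: {small:.2f} FLOP/B, compute-bound={is_compute_bound(small)}")
```

The skinny matmul shape (typical of single-token decoding) lands far below the ridge point, which is why inference optimization often targets memory traffic rather than raw FLOPs.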
Axel Mendoza · Feb 24th, 2026
A hands-on beginner roadmap for learning Kubernetes, designed to walk you through core concepts (like clusters, pods, services, deployments, storage, RBAC, autoscaling, etc.) with simple explanations, CLI examples, and practical exercises. By following it you build real experience and are prepared to use Kubernetes locally or on cloud platforms like GKE or EKS.
# DevOps
# Kubernetes
# From Scratch
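The core idea behind the Kubernetes concepts the roadmap covers (deployments, autoscaling) is reconciliation: a controller compares desired state with observed state and acts to close the gap. A toy loop in plain Python, purely to illustrate the concept, not any real controller's code:

```python
# Toy reconciliation loop: scale a set of "pods" toward a desired replica count,
# the way a Deployment controller converges observed state to spec.

def reconcile(desired_replicas: int, running: list[str]) -> list[str]:
    pods = list(running)
    while len(pods) < desired_replicas:   # scale up: create missing pods
        pods.append(f"pod-{len(pods)}")
    while len(pods) > desired_replicas:   # scale down: remove extras
        pods.pop()
    return pods

state = reconcile(desired_replicas=3, running=["pod-0"])
print(state)  # → ['pod-0', 'pod-1', 'pod-2']
```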
Ioana Apetrei, Igor Šušić & Adam Becker · Feb 19th, 2026
Experimenting with LLMs is easy. Running them reliably and cost-effectively in production is where things break. Most AI teams never make it past demos and proofs of concept. A smaller group is pushing real workloads to production—and running into very real challenges around infrastructure efficiency, runaway cloud costs, and reliability at scale. This session is for engineers and platform teams moving beyond experimentation and building AI systems that actually hold up in production.
# AI Applications
# GPU Orchestration
# Kubernetes Clusters
# CAST AI
Rahul Raja & Demetrios Brinkmann · Feb 17th, 2026
Information Retrieval is evolving from keyword matching to intelligent, vector-based understanding. In this talk, Rahul Raja explores how dense retrieval, vector databases, and hybrid search systems are redefining how modern AI retrieves, ranks, and reasons over information. He discusses how retrieval now powers large language models through Retrieval-Augmented Generation (RAG) and the new MLOps challenges that arise: embedding drift, continuous evaluation, and large-scale vector maintenance. Looking ahead, the session envisions a future of Cognitive Search, where retrieval systems move beyond recall to genuine reasoning, contextual understanding, and multimodal awareness. Listeners will gain insight into how the next generation of retrieval will bridge semantics, scalability, and intelligence, powering everything from search and recommendations to generative AI.
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design
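One common way the hybrid systems mentioned above merge a keyword ranking with a vector-similarity ranking is Reciprocal Rank Fusion (RRF). A minimal sketch, with made-up document IDs:

```python
# Reciprocal Rank Fusion: combine several ranked lists without needing their
# raw scores to be comparable. k=60 is the constant from the original RRF paper.

def rrf(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

keyword_hits = ["doc3", "doc1", "doc7"]  # e.g. from BM25
vector_hits  = ["doc1", "doc9", "doc3"]  # e.g. from an ANN index
fused = rrf([keyword_hits, vector_hits])
print(fused[0][0])  # → doc1 (strong in both lists)
```

Because RRF only uses ranks, it sidesteps the score-normalization problem that arises when BM25 scores and cosine similarities live on different scales.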
This post details the practical application of the A2UI protocol, introducing the Agent-View-Controller (AVC) pattern to decouple agent logic from UI rendering. It highlights that while A2UI enables secure, adaptable interfaces, a hybrid architecture combining static and dynamic elements is often required to balance expressiveness with latency.
# Artificial Intelligence
# Generative Ui
# Software Architecture
# LLMs
# Frontend Development
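The Agent-View-Controller idea above can be sketched in a few lines: the agent emits a declarative UI spec as plain data, a controller validates it against a component allow-list (the security boundary), and only then does any client render it. The spec shape and component names here are invented for illustration and are not A2UI's actual schema:

```python
# Hypothetical AVC sketch: agent logic produces data, never markup or code.

ALLOWED_COMPONENTS = {"text", "button", "list"}

def agent_reply(question: str) -> dict:
    """Agent: describe the UI declaratively."""
    return {"type": "list", "children": [
        {"type": "text", "value": f"Results for: {question}"},
        {"type": "button", "label": "Refine search"},
    ]}

def validate(spec: dict) -> bool:
    """Controller: reject anything outside the allow-list before rendering."""
    if spec.get("type") not in ALLOWED_COMPONENTS:
        return False
    return all(validate(child) for child in spec.get("children", []))

spec = agent_reply("vector databases")
print(validate(spec))  # → True: safe to hand to the client's renderer
```

Decoupling in this way is what lets the same agent output render on different clients, while the allow-list keeps a misbehaving model from injecting arbitrary UI.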
Vincent D. Warmerdam & Demetrios Brinkmann · Feb 13th, 2026
Vincent Warmerdam joins Demetrios fresh off marimo’s acquisition by Weights & Biases—and makes a bold claim: notebooks as we know them are outdated. They talk Molab (GPU-backed, cloud-hosted notebooks), LLMs that don’t just chat but actually fix your SQL and debug your code, and why most data folks are consuming tools instead of experimenting. Vincent argues we should stop treating notebooks like static scratchpads and start treating them like dynamic apps powered by AI. It’s a conversation about rethinking workflows, reclaiming creativity, and not outsourcing your brain to the model.
# Vincent D. Warmerdam
# Calmcode
# marimo
# wandb
# Jupyter Notebooks
# Data Science
A conversation on how AI coding agents are changing the way we build and operate production systems. We explore the practical boundaries between agentic and deterministic code, strategies for shared responsibility across models, engineering teams, and customers, and how to evaluate agent performance at scale. Topics include production quality gates, safety and cost tradeoffs, managing long-tail failures, and deployment patterns that let you ship agents with confidence.
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design
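The "production quality gates" mentioned above can be as simple as a check that blocks a rollout unless offline eval metrics clear fixed thresholds. A minimal sketch; the metric names and thresholds are illustrative assumptions, not a specific team's bar:

```python
# Quality gate: refuse to ship an agent whose eval metrics miss the thresholds.
# Metric names and threshold values are made up for illustration.

THRESHOLDS = {"task_success_rate_min": 0.90, "unsafe_output_rate_max": 0.01}

def gate(metrics: dict[str, float]) -> tuple[bool, list[str]]:
    failures = []
    if metrics.get("task_success_rate", 0.0) < THRESHOLDS["task_success_rate_min"]:
        failures.append("task_success_rate below threshold")
    if metrics.get("unsafe_output_rate", 1.0) > THRESHOLDS["unsafe_output_rate_max"]:
        failures.append("unsafe_output_rate above threshold")
    return (not failures, failures)

ok, why = gate({"task_success_rate": 0.93, "unsafe_output_rate": 0.004})
print("ship" if ok else f"hold: {why}")  # → ship
```

Note the defaults: a missing metric counts as a failure, so an incomplete eval run can never slip through the gate.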
Addressing the challenge of AI agent exposition, this post evaluates various implementation paths, including full-stack frameworks and AI-generated code. It identifies A2UI as a promising declarative solution that enables dynamic, secure interfaces by decoupling the agent's logic from the client's rendering capabilities.
# Artificial Intelligence
# UI
# Generative AI Tools
# AI Agent
# Software Development
Nick Gillian & Demetrios Brinkmann · Feb 6th, 2026
As AI moves beyond the cloud and simulation, the next frontier is Physical AI: systems that can perceive, understand, and act within real-world environments in real time. In this conversation, Nick Gillian, Co-Founder and CTO of Archetype AI, explores what it actually takes to turn raw sensor and video data into reliable, deployable intelligence. Drawing on his experience building Google’s Soli and Jacquard and now leading development of Newton, a foundational model for Physical AI, Nick discusses how real-time physical understanding changes what’s possible across safety monitoring, infrastructure, and human–machine interaction. He’ll share lessons learned translating advanced research into products that operate safely in dynamic environments, and why many organizations underestimate the challenges and opportunities of AI in the physical world.
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design
This blog explains a systematic way to fix CUDA out-of-memory (OOM) errors during GRPO reinforcement learning training, instead of randomly lowering hyperparameters until something works. Subham argues that most GPU memory issues come from three sources: vLLM reserving GPU memory upfront (often the biggest chunk), training activations (which scale with batch size, sequence length, number of generations, and model size), and model memory (usually the smallest contributor). By carefully reading the OOM error message and estimating how memory is distributed across these components, you can identify exactly what’s causing the crash. The recommended approach is to calculate memory usage first, then adjust the highest-impact settings, such as GPU memory allocation for vLLM, number of generations, batch size, and sequence length. The guide also shows how to maintain training quality by using techniques like gradient accumulation instead of simply shrinking everything. Overall, the key message is: treat OOM debugging as a measurable engineering problem, not trial-and-error, so you can fix memory issues faster while preserving training performance.
# GRPO
# CUDA
# GPU Memory
# LLM Training
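The post's "calculate first, then adjust" approach can be sketched as a back-of-envelope budget over the three buckets it names: the vLLM reservation, training activations, and model weights. The activation constant below is a rough assumption; real usage depends on architecture, precision, and kernel choices:

```python
# Back-of-envelope GPU memory budget for GRPO training with a vLLM sidecar.
# The 36-bytes-per-token-per-hidden-dim activation factor is an assumption.

GiB = 1024 ** 3

def budget(total_gib: float,
           model_params_b: float,                 # parameters, in billions
           vllm_gpu_memory_utilization: float,    # fraction vLLM reserves upfront
           batch: int, seq_len: int, num_generations: int, hidden: int,
           bytes_per_param: int = 2,              # bf16 weights
           act_bytes_per_token_per_hidden: int = 36) -> dict[str, float]:
    vllm = total_gib * vllm_gpu_memory_utilization
    weights = model_params_b * 1e9 * bytes_per_param / GiB
    acts = (batch * num_generations * seq_len * hidden
            * act_bytes_per_token_per_hidden) / GiB
    return {"vllm": vllm, "weights": weights, "activations": acts,
            "free": total_gib - vllm - weights - acts}

b = budget(total_gib=80, model_params_b=7, vllm_gpu_memory_utilization=0.5,
           batch=4, seq_len=1024, num_generations=8, hidden=4096)
print({k: round(v, 1) for k, v in b.items()})
```

Running the numbers like this before touching hyperparameters shows which bucket dominates, so you lower `gpu_memory_utilization` or `num_generations` where it actually matters instead of shrinking everything at once.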