MLOps Community
Agents in Production 2025
LIVESTREAM

AI Agents Are Already Working — Let’s Talk About It

Agents are no longer just experiments. From e-commerce to customer support to analytics workflows, they’re quietly getting real work done in production.

On July 17, join the MLOps Community for Part 2 of Agents in Production — a virtual event focused on the messy, practical side of building and deploying AI agents.

What’s on deck?

  1. Taming complexity: agent memory, behavior control, latency vs. response tradeoffs
  2. Stories from the field: How companies are actually using agents in live environments
  3. Tooling that works: routing, evaluation, UX, and cost performance in the wild

It’s free, it’s global, and it’s going to be packed.

Speakers

Aditya Gautam
Machine Learning Technical Lead @ Meta
Dipanwita Mallick
Principal Product Manager @ HP
Philipp Schmid
AI Developer Experience @ Google DeepMind
Shraddha Yeole
Senior Software Engineer, Machine Learning @ ThousandEyes, part of Cisco
Michael Albada
Principal Applied Scientist @ Microsoft
Allegra Guinan
Co-founder @ Lumiera
Yuki Watanabe
Sr. Software Engineer @ Databricks
Madison Kanna
Growth Engineer @ Baseten
Sarah Gebauer
Founder and Physician @ Validara Health
Venkata Gopi Kolla
Lead Software Engineer @ Salesforce Inc
Surendra Narang
Senior Manager Cyber Security @ Palo Alto Networks
Meryem Arik
CEO / Co-founder @ Doubleword
Krista Opsahl-Ong
Research Engineer @ Databricks
Gal Peretz
Head of AI @ Carbyne
Sara Estevez Manteiga
NLP Engineer @ TrueFlag
Tula Masterman
Principal AI Agent Solutions Architect @ MeshAgent
Lars Maaløe
Co-founder & CTO @ Corti
Mariana Prazeres
AI Engineer @ Run the eval loop
Stephanie Kirmer
Senior Machine Learning Engineer @ DataGrail
Erica Hughberg
Community Advocate @ Tetrate
Annie Condon
AI Solutions Engineer @ Acre Security
Jeff Groom
Director of Engineering, AI @ Acre Security
Maria Zhang
CEO & Co-Founder @ Palona AI
Somya Rai
Principal AI Engineer @ EXL
Aurimas Griciūnas
Founder & CEO @ SwirlAI
Valliappa Lakshmanan
Operating Executive @ N/A
Robert Caulk
C*O @ AskNews
Diego Oppenheimer
Head of Product @ Hyperparam
Shahul Elavakkattil Shereef
Co-founder & CTO @ Ragas
Simba Khadder
Founder & CEO @ Featureform
Nimrod Busany
Founder & Chief Scientist @ Traigent
Patrick Barker
CTO @ Kentauros AI
Edward Upton
Founding Engineer @ Asteroid
Advait Patel
Senior Site Reliability Engineer @ Broadcom
Dexter Horthy
Founder @ HumanLayer
Ben Labaschin
Principal Machine Learning Engineer @ Workhelix
Jonas Scholz
Co-founder @ Sliplane
Vrushank Vyas
Dev Rel / GTM @ Portkey
Shai Rubin
CTO and Co-founder @ Strudel AI
Pierre Gerardi
MLOps Team Lead & Senior Machine Learning Engineer @ Superlinear
Hakan Tek
Full-stack Developer @ Digital Data GmbH
Erik Goron
ML / AI Engineer @ Happyrobot
Devin Stein
Founder & CEO @ Dosu
Adam Sroka
CEO & Co-founder @ Hypercube Consulting
Mohamed Rashad
Co-founder & CTO @ DevisionX
Tanmay Tiwari
Senior Software Full Stack Engineer @ Rivian via BayOne
Ryan Fox-Tyler
Co-founder and SVP Product/Engineering @ Hypermode
Colin McNamara
Co-founder; Managing Partner for Engineering @ Always Cool Brands | Always Cool AI
David de la Iglesia Castro
AI Engineer @ Mozilla.ai
Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

Agenda

From 3:40 PM, GMT
To 3:45 PM, GMT
Tags:
Opening / Closing
Welcome Note
Speakers:
Demetrios Brinkmann
From 3:50 PM, GMT
To 4:15 PM, GMT
Tags:
Keynote
The Future of Compute: How AI Agents Are Reshaping Infrastructure

The rapid evolution of AI agents is exposing a widening gap between their unique computational needs and today’s infrastructure. This keynote cuts through the hype to highlight why traditional compute paradigms—mainframes, VMs, containers, even serverless—are struggling to keep up with agents’ bursty, stateful, and hardware-hungry workloads. We’ll examine the economic and technical inefficiencies organizations face, from unpredictable scaling to persistent state management, and why simply “tweaking the cloud” won’t cut it. Expect a candid look at the real operational challenges, the architectural dead-ends, and the tough question: do we adapt existing frameworks, or is it time for a radical rethink of how we design and manage compute for the AI era? Actionable insights, not wishful thinking.

Speakers:
Diego Oppenheimer
From 4:20 PM, GMT
To 4:45 PM, GMT
Tags:
Presentation
Beyond Chatbots: How to build Agentic AI systems with Google Gemini

As AI continues to evolve, we will see a shift from static chatbots to dynamic agentic AI systems capable of autonomous reasoning, tool integration, and multi-step problem-solving. This talk explores how to design AI agents that leverage structured outputs, function calling, and workflow orchestration with Google Gemini.

Speakers:
Philipp Schmid
From 4:50 PM, GMT
To 5:00 PM, GMT
Tags:
Lightning Talk
How to Build Execution Layers That Don’t Burn Out

We’ve all built tools that looked good in demos but broke the moment we let go of the wheel. This talk is about building something different—systems that don’t just run, but run well when no one’s watching.

In under 10 minutes, I’ll walk through how I designed an execution layer that handles thousands of operations daily—without melting under pressure, without drifting off course, and without needing constant supervision.

I’ll share the real structure:

  • How it thinks
  • How it decides what to do next
  • How it knows when to stop
  • And how it stays sane when things go wrong

No theory, no fluff—just what it took to make something dependable in a world that isn’t.

Speakers:
Tanmay Tiwari
From 5:05 PM, GMT
To 5:30 PM, GMT
Tags:
Presentation
From Guesswork to Greatness: Systematic AI Agent Optimization in Production

Every engineer building AI agents has experienced it: you tweak a prompt, swap out a model, or adjust a RAG setting—only to find it either worsens the agent or improves one aspect while breaking another. Why does this happen? Because teams typically test just one configuration out of countless possible combinations, hoping for the best.

Current evaluation tools are built for single-point assessments, not the extensive multi-dimensional comparisons that real-world scenarios demand. Sure, you might be able to A/B test two prompts or select from a few models, but exploring hundreds of configurations across dimensions like cost, latency, and accuracy simultaneously is nearly impossible.

In this talk, we'll demonstrate how adopting a structured approach to testing alternatives can significantly change outcomes. Leveraging concepts from multi-objective optimization, we’ll illustrate how Traigent's SDK and UI empower engineers to allocate their testing budgets effectively. Traigent intelligently identifies and explores promising configurations, highlighting optimal tradeoffs. You'll learn how this methodology can yield quality improvements of 4–7x and reduce costs by up to 90%, all without resorting to guesswork or manual trial-and-error.
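
To make the idea of "optimal tradeoffs" concrete, here is a minimal sketch of a Pareto filter over agent configurations scored on cost, latency, and accuracy. It is not Traigent's SDK or algorithm; the `Config` type, the configuration names, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Config(NamedTuple):
    name: str        # hypothetical label for a prompt/model/RAG combination
    cost: float      # cost per 1k requests (made-up units), lower is better
    latency: float   # seconds per request, lower is better
    accuracy: float  # fraction of correct answers, higher is better


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on every axis and strictly better on one."""
    no_worse = (a.cost <= b.cost and a.latency <= b.latency
                and a.accuracy >= b.accuracy)
    better = (a.cost < b.cost or a.latency < b.latency
              or a.accuracy > b.accuracy)
    return no_worse and better


def pareto_front(configs: list[Config]) -> list[Config]:
    """Keep only the configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Illustrative measurements only; real values would come from evaluation runs.
candidates = [
    Config("small-model", 1.0, 0.4, 0.78),
    Config("large-model", 9.0, 1.8, 0.91),
    Config("large-verbose", 12.0, 2.5, 0.90),  # worse than large-model on all axes
    Config("small-rag", 2.0, 0.7, 0.86),
]
front = pareto_front(candidates)
print([c.name for c in front])
```

Every configuration left on the front is a defensible choice; which one to ship depends on how much accuracy a team is willing to trade for cost and latency.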

Speakers:
Nimrod Busany
From 5:35 PM, GMT
To 6:00 PM, GMT
Tags:
Panel Discussion
Underwriting Assist - A Multi Agent System

Underwriting Assist is a LangChain- and Ray-powered multi-agent system that accelerates insurance underwriting by 3x and cuts manual errors by 40%. It leverages RAG, shared memory, and LLM-based agents for clause analysis, risk profiling, and rationale generation. Real-time evals and human-in-the-loop feedback ensure accuracy, explainability, and regulatory compliance at scale.

Speakers:
Maria Zhang
Somya Rai
Aurimas Griciūnas
From 6:05 PM, GMT
To 6:15 PM, GMT
Tags:
Break
Break 1
From 6:15 PM, GMT
To 6:40 PM, GMT
Tags:
Presentation
Building Agents for Healthcare

Healthcare is one of the most vital and far-reaching sectors in our society, touching every individual at some point in their lives. Yet, it faces mounting challenges: rising administrative burdens, increasingly complex disease patterns, and growing patient volumes strain already stretched systems. In this talk, Lars will explore the untapped potential of AI agents to address some of healthcare’s most pressing real-world problems. He will present Corti’s unique approach to developing domain-specific agents equipped with healthcare-relevant skills—engineered not only for impact, but within a framework that places governance and safety at its core. Join us to learn how AI can be responsibly and powerfully deployed to support the future of care.

Speakers:
Lars Maaløe
From 6:45 PM, GMT
To 7:10 PM, GMT
Tags:
Presentation
How to Stop AI Agents from Bleeding Your Cloud Budget

As AI agents become active participants in production environments, handling infrastructure tasks, chaining tools, generating outputs, and executing plans across cloud services, the financial implications are often underestimated. These agents may appear intelligent, but they have zero awareness of cost boundaries. A single agent loop with poorly bounded retries, excessive API calls, or unrestricted tool usage can quietly rack up hundreds or even thousands in compute, token, or storage costs.

In this session, I’ll walk through how seemingly harmless design decisions, like overly verbose prompts, excessive tool chaining, or unrestricted LLM usage, can result in runaway spending. I’ll share lessons from deploying agentic systems in cloud-native pipelines and infrastructure security tools, including my work on DockSec, an open-source AI-powered container security analyzer. We’ll explore how agents misbehave in cloud billing terms and what the attendees can do to stop it.

Attendees will learn practical strategies to monitor, contain, and optimize agent costs: from integrating cost observability into your agent stack, to programmatically setting retry, token, and API call budgets, to leveraging agent memory, caching, and behavior throttling to reduce waste. Whether they’re scaling agents in production or just starting to build them, this talk will give them the tools to design agent systems that are not only intelligent but also financially sustainable.
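
As a rough illustration of setting retry, token, and API call budgets programmatically, the sketch below wraps an agent loop in a budget object that stops the run once any limit is crossed. It is not taken from the talk or from DockSec; `AgentBudget`, `BudgetExceeded`, `run_agent`, and the `step` callable are hypothetical names, and the limits are arbitrary.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when an agent run crosses one of its configured limits."""


@dataclass
class AgentBudget:
    max_tokens: int
    max_api_calls: int
    max_retries: int
    tokens_used: int = 0
    api_calls: int = 0
    retries: int = 0

    def charge(self, tokens: int) -> None:
        """Record one completed API call and its tokens, then check the limits."""
        self.api_calls += 1
        self.tokens_used += tokens
        if self.api_calls > self.max_api_calls:
            raise BudgetExceeded("API call budget exhausted")
        if self.tokens_used > self.max_tokens:
            raise BudgetExceeded("token budget exhausted")

    def retry(self) -> None:
        """Record one retry and stop the run if retries are used up."""
        self.retries += 1
        if self.retries > self.max_retries:
            raise BudgetExceeded("retry budget exhausted")


def run_agent(step, budget: AgentBudget) -> str:
    """Call step() until it reports completion or the budget stops the loop.

    step is assumed to return (done, tokens_used) and to raise TimeoutError
    on a transient failure.
    """
    while True:
        try:
            done, tokens = step()
        except TimeoutError:
            budget.retry()
            continue
        budget.charge(tokens)
        if done:
            return "finished"


# A stand-in agent step that never finishes and spends 400 tokens per call.
def never_done():
    return (False, 400)


budget = AgentBudget(max_tokens=1000, max_api_calls=10, max_retries=2)
try:
    outcome = run_agent(never_done, budget)
except BudgetExceeded as exc:
    outcome = str(exc)
print(outcome, budget.api_calls)
```

Because usage is charged after each call returns, the run can overshoot a limit by at most one call; a stricter variant would estimate the cost and check the budget before making the call.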

Speakers:
Advait Patel
From 7:15 PM, GMT
To 7:40 PM, GMT
Tags:
Presentation
From Detection to Correction: A Practical Guide to Implementing Multi-Agent Systems for Misinformat

The rapid spread of digital misinformation requires solutions that move beyond simple detection and address the entire lifecycle of false narratives. This talk will first introduce the fundamental concepts of Multi-Agent Systems (MAS), exploring their core components, architectural patterns, and the collaborative potential of specialized AI agents. We will cover how these systems are designed to break down complex problems into manageable tasks, fostering modularity, scalability, and enhanced performance through cooperative and competitive agent interactions. Building on this foundation, the second half of the talk will transition from theory to a practical application, detailing a novel multi-agent framework designed to manage the complete misinformation lifecycle. As presented in our recent paper, this system employs a pipeline of specialized agents for the classification of misinformation types, evidence-based detection, automated correction, and source identification. We will walk through the architecture of this system, demonstrating how each agent contributes to a more transparent, reliable, and holistic approach to combating misinformation at scale. This talk will provide attendees with a deeper understanding of how multi-agent systems can be practically implemented to address one of the most significant challenges in our modern information ecosystem.

Speakers:
Aditya Gautam
From 7:45 PM, GMT
To 7:55 PM, GMT
Tags:
Break
Break 2
From 7:55 PM, GMT
To 8:20 PM, GMT
Tags:
Presentation
The Facts Flywheel

Agent memory and organizational knowledge are actually one and the same. Organizations remember the whys and how-tos by writing down learnings in a centralized place. For agents to remember, they need to do the same.

But both suffer from the same shortcoming: they are impossible to keep up to date. Can we solve both problems at once?

Speakers:
Devin Stein
From 8:25 PM, GMT
To 8:50 PM, GMT
Tags:
Presentation
Too much lock-in for too little gain: agent frameworks are a dead-end

If your goal is to accelerate development of agentic systems without sacrificing production quality, a great choice is to use simple, composable GenAI patterns and off-the-shelf tools for monitoring, logging, and a few other capabilities. In this talk, I'll present an architecture consisting of such patterns that will enable you to build agentic systems in a way that does not lock you into any LLM, cloud, or agent framework. The patterns I talk about are from my GenAI design patterns book, which is in early release on O'Reilly's platform.

Speakers:
Valliappa Lakshmanan
From 8:55 PM, GMT
To 9:05 PM, GMT
Tags:
Lightning Talk
Evaluating AI Agents: Why It Matters and How We Do It

As we integrate agentic AI into business products, robust evaluation of the agents is essential to delivering the highest quality. Proper evaluation ensures that AI agents are reliable, safe, effective, and aligned with user intent. Unlike traditional software or machine learning models, AI agents are non-deterministic and require specific types of evaluation. This talk outlines the importance of evaluating AI agents, the key components that we version and test at Acre Security, the metrics that matter for different types of agents, and how we currently achieve success evaluating AI agents that we build at Acre.

Speakers:
Annie Condon
Jeff Groom
From 9:10 PM, GMT
To 9:20 PM, GMT
Tags:
Lightning Talk
From Spikes to Stories: AI-Augmented Troubleshooting in the Network Wild

It’s 2 a.m., and a critical service slows down. Dashboards scream red: packet loss, timeouts, delays. The clock is ticking. Eyes race across a maze of graphs, flipping through visualizations and route tables. One graph leads to another. A dozen tabs open. Fatigue sets in. You’re left guessing: Is it the network, the application, or something else? Welcome to the new normal in network operations, where telemetry is endless but clarity is rare. This session explores how AI and large language models (LLMs) transform observability by moving views from data presentation to intelligent data interpretation. Instead of manually piecing together clues, imagine asking, “What’s wrong here?” and receiving clear, contextual insights. AI-powered storytelling augments human reasoning, reduces noise, and accelerates fault isolation, lowering misdiagnosis risk and improving mean time to identify (MTTI) and mean time to resolve (MTTR). Join us to see how storytelling is reshaping digital operations.

Speakers:
Shraddha Yeole
From 9:25 PM, GMT
To 9:35 PM, GMT
Tags:
Break
Break 3
From 9:35 PM, GMT
To 9:45 PM, GMT
Tags:
Lightning Talk
The Hidden Infrastructure Behind Every AI Agent

AI agents aren't just generating content; they're generating traffic. Like any good agent, your AI agent isn’t working alone. Behind the scenes is a mission-critical handler: the AI Gateway.

In this lightning talk, we'll explore how Gateways are adapting to the evolving realities of GenAI: dynamic routing, access control, cost-aware load balancing, model-aware failover, and observability across multi-model environments.

If you're building agents or just trying to keep up with the traffic they generate, this talk will help you understand the infrastructure patterns that are evolving to support a new landscape of software.
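
Two of the patterns named above, cost-aware routing and failover, can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any particular AI gateway works; `Backend`, `route`, the backend names, and the prices are hypothetical, and real gateways add health checks, rate limits, and access control on top.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Backend:
    name: str
    cost_per_1k_tokens: float    # made-up price used only for ordering
    call: Callable[[str], str]   # stand-in for a real model client


def route(prompt: str, backends: list[Backend]) -> tuple[str, str]:
    """Try backends from cheapest to most expensive; return (name, reply)."""
    errors = []
    for backend in sorted(backends, key=lambda b: b.cost_per_1k_tokens):
        try:
            return backend.name, backend.call(prompt)
        except ConnectionError as exc:
            # Fail over to the next cheapest backend and remember why.
            errors.append(f"{backend.name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


def flaky(prompt: str) -> str:
    """Simulates a model endpoint that is currently down."""
    raise ConnectionError("upstream unavailable")


def stable(prompt: str) -> str:
    """Simulates a healthy model endpoint."""
    return f"echo: {prompt}"


backends = [Backend("premium", 10.0, stable), Backend("budget", 1.0, flaky)]
used, reply = route("hello", backends)
print(used, reply)
```

Here the cheaper backend is tried first, fails, and the request falls through to the more expensive one, so the caller still gets an answer.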

Speakers:
Erica Hughberg
From 9:50 PM, GMT
To 10:00 PM, GMT
Tags:
Lightning Talk
From Console Scripts to Agentic Services: Building Observability into Everyday LLM Workflows

This talk shares the ongoing, real-world journey of building agentic infrastructure at AlwaysCool.ai—from simple GPT-based tools to our first production-ready AI microservices. We started with small wins like automating nutritional analysis and FDA label validation, but quickly ran into issues with sync limits, cost control, and debugging blind spots.

That led us to build a shared agentic service layer, using LangGraph to orchestrate multi-step flows and FastAPI to serve those agents cleanly. With OpenTelemetry at the core, we now send metrics and traces to Prometheus, Grafana, and LangSmith for real-time visibility, which is critical for compliance workflows such as HACCP, CAPA, and FDA traceability.

We’re not claiming to have it all figured out—this is a story of learning in the open, much like we do at the Austin AI Middleware Users Group (AIMUG). If you're navigating the same terrain—tooling decisions, observability gaps, or production pressure—this talk offers patterns, tools, and cautionary lessons worth carrying into your own journey.

Speakers:
Colin McNamara
From 10:05 PM, GMT
To 10:30 PM, GMT
Tags:
Presentation
Driving Evaluation-Driven Development with MLflow 3.0

Quality is the top barrier preventing Agentic applications from reaching production. This talk introduces Evaluation-Driven Development, a methodology that uses evaluation as the cornerstone for building high-quality, reliable Agentic systems. We will demonstrate how to drive it with MLflow 3.0, a new generation of the popular MLOps platform redesigned for the LLM era, including one-line observability, automatic evaluation, human-in-the-loop feedback loops, and monitoring.

Speakers:
Yuki Watanabe

July 17, 3:30 PM, GMT
Online
Organized by
MLOps Community