MLOps Community
Agents in Production 2025
28 Items

Paul van der Boor, Dmitri Jarnikov & Demetrios Brinkmann · Aug 1st, 2025
9 Commandments for Building AI Agents
Building AI agents that actually get things done is harder than it looks. Demetrios, Paul, and Dmitri break down what makes agents effective, from smart planning and memory to treating tools, systems, and even people as components. They cover the ReAct loop, budgeting for long tasks, sandboxing, and learning from experience. It’s a sharp, practical look at what it really takes to design useful, adaptive AI agents.
# Building AI Agents
# GenAI Interface
# Prosus Group
1:20:34
Edward Upton · Aug 1st, 2025
Increasingly powerful agents are leading to increasingly high-stakes automation. From fighting fraud at scale with LLM pipelines to handling healthcare and insurance data with browser agents, I've observed my fair share of consequential agent failures. In this talk I'll share what I've learned about navigating the agent failure landscape.
# Agents in Production
# Agent Failure
# Asteroid AI
25:21
Drawing from my forthcoming publication, I'll explore the foundational decisions that determine whether AI agents deliver lasting value or become expensive technical debt. Rather than focusing on specific frameworks or tools, I'll cover the core tradeoffs between flexibility and performance, the memory patterns that actually matter for agent reliability, and how to architect systems that can evolve with the rapidly changing AI landscape. The key insight is understanding what problems agents fundamentally solve—automating complex, multi-step workflows—and designing memory and coordination systems around those core needs rather than getting caught up in today's specific technologies. You'll leave with a framework for making architectural decisions that will serve you well regardless of which models, frameworks, or tools become dominant next year.
# Agents in Production
# Managing Memory
# Workhelix
29:01
As AI evolves from passive assistants to intelligent agents capable of reasoning, planning, and acting autonomously, the infrastructure supporting them must evolve too. Multi-agent systems require low-latency, scalable, and secure environments that enable real-time coordination, dynamic workloads, and continuous learning, often beyond what traditional cloud setups can deliver. This talk explores the infrastructure blueprint needed to support agentic AI at scale, including hybrid edge-cloud strategies and data-local compute. Learn what it takes to build a robust foundation for the next generation of intelligent, collaborative agents.
# Agents in Production
# Multi-Agent System
# HP
18:42
Most AI systems still assume a single human working with one or more agents. In reality, work is a team sport: several people, several agents, one shared goal. MeshAgent turns that reality into software with secure Rooms: on-demand workspaces where every human and agent sees the same live context, abides by access controls, and is fully traceable. In this talk you'll learn how MeshAgent unlocks true multiplayer AI:
- Co-create in real time: launch a shared Room where humans and agents collaborate. Invite colleagues via link, iterate live, and watch agents work alongside you.
- Add new agent teammates in minutes: stand up chat- or voice-capable agents with the Python, TypeScript, or Dart SDKs, and interact with them in your browser using MeshAgent Studio.
- Equip agents to ship actual work: plug in built-in MeshAgent tools or custom Tools so agents do more than just chat.
- Go to production as a team, worry-free: MeshAgent owns infra, scaling, logging, and cost dashboards, so your team focuses on outcomes, not ops.
# Agents in Production
# Multiplayer AI Systems
# MeshAgent
13:53
Don’t trust AI agents. Just because an agent is in your system doesn’t mean it should have overly permissive privileges. Restrict access: define a clear role for each agent from the beginning, and give each agent only the tools and information access it really needs. Monitor each agent's activity: keep an eye out for odd behavior or agents that step out of line. The sooner you catch it, the easier it is to fix. Keep things safe without making them slow: good security shouldn’t get in the way of your agents doing their job. You can have both speed and safety.
# Agents in Production
# Multi-Agent System
# Salesforce
# Paloalto
21:16
Deploying AI agents in healthcare isn’t just a technical challenge; it’s a clinical one. As a physician working at the intersection of care delivery and machine intelligence, I’ll walk through what it really takes to make agents useful, safe, and credible in high-stakes environments like hospitals and clinics. This talk will focus on:
- What makes healthcare environments uniquely hard for agents: ambiguity, interruptions, human variability, and risk tolerance
- Why typical evaluation metrics often miss the mark, and what to measure instead (think: harm reduction, workflow fit, and appropriate escalation)
- How to scope agent autonomy to reflect the real-world roles of nurses, physicians, and support staff
- Where agents can shine in augmenting clinical work, and where they’re likely to fail without robust oversight
# Agents in Production
# Agents in Scrubs
# Validara Health
16:03
Everyone is building AI agents, but at their core is the LLM, and choosing the right one is critical. With new models launching every week, each promising game-changing productivity, how do we make informed, data-driven choices? In this talk, I’ll focus on LLM selection for a critical agent skill: code understanding. I’ll present a study applying 15 leading LLMs to real-world code summarization tasks, using practical, agent-relevant metrics like verbosity, latency, cost, human-aligned accuracy, and information gain. We’ll explore how these models actually perform in practice, beyond benchmarks and hype, and what that means for building effective, capable agents. Whether you’re building autonomous coding assistants, dev-focused copilots, or multi-modal agent systems, choosing the right LLM isn’t optional; it’s the foundation. This talk aims to cut through the noise and offer actionable insights to help you select the best model for your agent’s real-world success.
# Agents in Production
# LLM
# Smart Agents
# Strudel AI
25:01
Join Brooke Hopkins (Founder & CEO, Coval) and Peter Bakkum (API Multimodal Lead, OpenAI) for an insightful fireside chat focused on the cutting-edge voice-to-voice architectures powering modern voice AI applications. They’ll unpack the unique challenges of designing and deploying real-time, multimodal systems that enable seamless, natural conversations between users and AI agents. Drawing from Brooke’s expertise in simulation and evaluation at scale and Peter’s experience building OpenAI’s real-time APIs, this conversation will dive into how infrastructure, latency optimization, and rigorous testing come together to create reliable, production-ready voice AI experiences.
# Agents in Production
# Voice AI
# Coval
# OpenAI
57:13
Voice agents live or die on latency and trust. I’ll share how HappyRobot’s MLOps pipeline turns raw production audio into high-accuracy, low-latency models:
1. Synthetic labels first: we generate large-scale annotations with reasoning LLMs.
2. Human in the loop: a targeted subset of samples is reviewed by human annotators to correct drift and refine prompts (DSPy-style).
3. Distill & specialize: small, domain-tuned models are fine-tuned via LoRA/distillation.
We’ll walk through our MLOps stack, from observability to AI-assisted data generation and model fine-tuning / optimization.
# Agents in Production
# Voice Agents
# LLM
# HappyRobot
17:08