MLOps Community

MLOps Community Podcast

# Conversational AI
# iFood
# AI Agents
# Prosus Group

The Latency Goldilocks Zone Explained

Rafael (Head of Innovation, iFood) and Daniel (Data and AI Manager, iFood) pull back the curtain on ILO-Agent — iFood's conversational AI ordering system built for 200 million users across Latin America. Recorded live at AI House Amsterdam, this conversation goes deep on the engineering and product decisions behind building recommendation systems, agentic AI, and why the speed of your AI's response might actually be destroying user trust.
Rafael Borger, Daniel Wolbert & Demetrios Brinkmann · May 12th, 2026
Nicolás Alejandro Bogliolo & Demetrios Brinkmann · May 11th, 2026
Before MCP was a standard and before LangChain was widely adopted, Nicolás Alejandro Bogliolo's team had already shipped their own orchestration layer and tool protocol in production. This conversation is a rare look at what it takes to build an agentic system that actually books trips, runs on WhatsApp, and keeps adding capabilities without falling over.
# Agentic AI
# MCP
# AI Agents
Anurag Beniwal & Demetrios Brinkmann · May 1st, 2026
Anurag Beniwal (Member of Technical Staff at ElevenLabs) breaks down the real-world challenges of building voice agents—from latency, transcription accuracy, and turn-taking to the tradeoffs between cascaded systems and end-to-end speech models. The conversation explores why production systems rely on “constellations” of models, how to design for non-technical users (especially in customer support), and why voice unlocks richer context—but introduces far more complexity than chat. Ultimately, it’s a deep dive into making voice AI practical, reliable, and usable at scale.
# Voice
# AI Agents
# Customer Support AI
# Amazon
Jesse Vincent & Demetrios Brinkmann · Apr 24th, 2026
Jesse Vincent breaks down how modern “agentic” software development is shifting from writing code to managing intelligent systems. He shares how his Superpowers toolkit uses structured workflows, skills, and subagents to turn vague ideas into executable plans—emphasizing that clarity of intent matters more than coding itself. The conversation explores how AI agents can be guided using psychology, why separating roles (planner, implementer, reviewer) leads to better outcomes, and how iteration—not perfection—builds powerful workflows. Ultimately, the future of software isn’t code—it’s specs, judgment, and orchestrating agents to do the work.
# Superpowers
# Claude Code
# Developer Tools
Maggie Konstanty & Demetrios Brinkmann · Apr 21st, 2026
Most teams treat evals like a last-minute checkbox—ship first, panic later—but that’s exactly backwards. The real edge comes from treating evals as a continuous, evolving system from day one, not a static test suite. Because here’s the uncomfortable truth: LLMs don’t fail cleanly or consistently, and neither do your users. If you’re not constantly adapting how you evaluate, you’re basically flying blind—just with more features to hide it.
# AI Evals
# LLM Evaluation
# AI Product Management
Zach Lloyd & Demetrios Brinkmann · Apr 17th, 2026
# AI Agents
# Cloud Development
# Warp Terminal
Mihail Eric & Demetrios Brinkmann · Apr 15th, 2026
A conversation with Mihail Eric on how agent-driven development is reshaping engineering work: faster iteration, new failure modes, and shifting team dynamics. The focus is on validation, cost tradeoffs, and what breaks when code is mostly generated rather than written.
# Software Engineering
# Coding Agents
# AI Engineering
Maher Hanafi & Demetrios Brinkmann · Apr 10th, 2026
Scaling LLMs in production requires balancing cost, latency, and performance. Through techniques like dynamic GPU scaling and TensorRT optimization, latency was reduced by up to 70%, while iterative learning and tight alignment with business goals ensured strong ROI.
# GPU
# GPU Optimization
# AI Agents
Robert Ennals & Demetrios Brinkmann · Apr 7th, 2026
Most people cripple coding agents by micromanaging them—reviewing every step and becoming the bottleneck. The shift isn’t to better supervise agents, but to design systems where they work well on their own: parallelized, self-validating, and guided by strong processes. Done right, you don’t lose control—you gain leverage. Like paving roads for cars, the real unlock is reshaping the environment so AI can move fast.
# AI Agents
# Parallel Agents
# Broomy
Kashish Mittal & Demetrios Brinkmann · Apr 3rd, 2026
Kashish zooms out to discuss a universal industry pattern: how infrastructure—specifically data loading—is almost always the hidden constraint for ML scaling. The conversation dives deep into a recent architectural war story. Kashish walks through the full-stack profiling and detective work required to solve a massive GPU starvation bottleneck. By redesigning the Petastorm caching layer to bypass CPU transformation walls and uncovering hidden distributed race conditions, his team boosted GPU utilization to 60%+ and cut training time by 80%. Kashish also shares his philosophy on the fundamental trade-offs between latency and efficiency in GPU serving.
# GPU Starvation
# Uber ML
# ML Infrastructure
Jens Bodal & Demetrios Brinkmann · Mar 31st, 2026
AI agents are shifting the role of developers from writing code to defining intent. This conversation explores why specs are becoming more important than implementation, what breaks in real-world systems, and how engineering teams need to rethink workflows in an agent-driven world.
# AI Agents
# Software Engineering
# AI in Production