MLOps Community Podcast

It's 2026, and We're Still Talking Evals
Maggie Konstanty & Demetrios Brinkmann · Apr 21st, 2026
Most teams treat evals like a last-minute checkbox—ship first, panic later—but that’s exactly backwards. The real edge comes from treating evals as a continuous, evolving system from day one, not a static test suite. Because here’s the uncomfortable truth: LLMs don’t fail cleanly or consistently, and neither do your users. If you’re not constantly adapting how you evaluate, you’re basically flying blind—just with more features to hide it.
# AI Evals
# LLM Evaluation
# AI Product Management
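The "evals as a continuous, evolving system" framing can be made concrete with a tiny regression harness: every case pairs an input with a property the output must satisfy, and each new failure becomes a new case. A minimal sketch — the `model` stub, case set, and `run_evals` helper are illustrative, not from the episode:

```python
# Minimal continuous-eval sketch: a growing suite of cases, each with a
# property check, re-run on every model or prompt change.

def model(prompt: str) -> str:
    # Stand-in for a real LLM call.
    return "Paris is the capital of France."

# Each case pairs an input with a property the output must satisfy.
EVAL_CASES = [
    {"prompt": "Capital of France?", "check": lambda out: "Paris" in out},
    {"prompt": "Capital of France?", "check": lambda out: len(out) < 200},
]

def run_evals(model_fn, cases):
    """Return the pass rate plus failing cases, so failures become new tests."""
    failures = [c for c in cases if not c["check"](model_fn(c["prompt"]))]
    return 1 - len(failures) / len(cases), failures

pass_rate, failures = run_evals(model, EVAL_CASES)
print(f"pass rate: {pass_rate:.0%}, failures: {len(failures)}")
```

The point of returning the failing cases is the feedback loop: triage them, then fold them back into `EVAL_CASES` so the suite evolves with the product.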


Zach Lloyd & Demetrios Brinkmann · Apr 17th, 2026
# AI Agents
# Cloud Development
# Warp Terminal


Mihail Eric & Demetrios Brinkmann · Apr 15th, 2026
A conversation with Mihail Eric on how agent-driven development is reshaping engineering work: faster iteration, new failure modes, and shifting team dynamics. The focus is on validation, cost tradeoffs, and what breaks when code is mostly generated rather than written.
# Software Engineering
# Coding Agents
# AI Engineering


Maher Hanafi & Demetrios Brinkmann · Apr 10th, 2026
Scaling LLMs in production requires balancing cost, latency, and performance. By applying techniques like dynamic GPU scaling and TensorRT optimization, the team cut latency by up to 70%, while iterative learning and tight alignment with business goals ensured strong ROI.
# GPU
# GPU Optimization
# AI Agents
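The "dynamic GPU scaling" idea from this episode reduces, at its simplest, to a control loop on a latency signal. A minimal sketch of such a decision function — the SLO threshold, band, and replica bounds here are illustrative defaults, not numbers from the talk:

```python
# Sketch of a latency-driven GPU autoscaling decision: scale out when p95
# latency breaches the SLO, scale in when there is clear headroom, otherwise
# hold steady. Thresholds are illustrative, not from the episode.

def target_replicas(current: int, p95_latency_ms: float,
                    slo_ms: float = 500.0,
                    min_replicas: int = 1, max_replicas: int = 8) -> int:
    if p95_latency_ms > slo_ms:            # breaching SLO: add capacity
        return min(current + 1, max_replicas)
    if p95_latency_ms < 0.5 * slo_ms:      # ample headroom: shed cost
        return max(current - 1, min_replicas)
    return current                          # within band: hold steady

print(target_replicas(2, 700.0))  # SLO breach -> 3
print(target_replicas(4, 100.0))  # headroom -> 3
```

The dead band between scale-out and scale-in thresholds is the usual trick for avoiding replica flapping; production autoscalers add smoothing and cooldowns on top.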


Robert Ennals & Demetrios Brinkmann · Apr 7th, 2026
Most people cripple coding agents by micromanaging them—reviewing every step and becoming the bottleneck.
The shift isn’t to better supervise agents, but to design systems where they work well on their own: parallelized, self-validating, and guided by strong processes.
Done right, you don’t lose control—you gain leverage. Like paving roads for cars, the real unlock is reshaping the environment so AI can move fast.
# AI Agents
# Parallel Agents
# Broomy


Kashish Mittal & Demetrios Brinkmann · Apr 3rd, 2026
Kashish zooms out to discuss a universal industry pattern: how infrastructure—specifically data loading—is almost always the hidden constraint for ML scaling.
The conversation dives deep into a recent architectural war story. Kashish walks through the full-stack profiling and detective work required to solve a massive GPU starvation bottleneck. By redesigning the Petastorm caching layer to bypass CPU transformation walls and uncovering hidden distributed race conditions, his team boosted GPU utilization to 60%+ and cut training time by 80%. Kashish also shares his philosophy on the fundamental trade-offs between latency and efficiency in GPU serving.
# GPU Starvation
# Uber ML
# ML Infrastructure
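The core remedy for GPU starvation described in this episode is overlapping data loading with compute so accelerators never idle on I/O. A generic double-buffered prefetch sketch of that overlap — this is the general pattern, not Uber's actual Petastorm changes:

```python
# Generic prefetching sketch: a background thread keeps a bounded queue of
# ready batches so the consumer (the "GPU") never waits on data loading.
# Illustrates the overlap idea, not the actual Petastorm redesign.
import queue
import threading
import time

def load_batch(i):
    time.sleep(0.01)          # simulate slow I/O + CPU transforms
    return [i] * 4

def prefetch(num_batches, depth=2):
    q = queue.Queue(maxsize=depth)   # bounded buffer caps memory use
    SENTINEL = object()

    def producer():
        for i in range(num_batches):
            q.put(load_batch(i))     # blocks when the buffer is full
        q.put(SENTINEL)              # signal end of the dataset

    threading.Thread(target=producer, daemon=True).start()
    while (batch := q.get()) is not SENTINEL:
        yield batch

batches = list(prefetch(3))
print(len(batches))  # 3
```

The bounded queue is the key design choice: it decouples producer and consumer rates while capping how far ahead the loader can run, which is also where cache-layer redesigns like the one described pay off.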


Jens Bodal & Demetrios Brinkmann · Mar 31st, 2026
AI agents are shifting the role of developers from writing code to defining intent. This conversation explores why specs are becoming more important than implementation, what breaks in real-world systems, and how engineering teams need to rethink workflows in an agent-driven world.
# AI Agents
# Software Engineering
# AI in Production


Lorenzo Moriondo & Demetrios Brinkmann · Mar 27th, 2026
Meet arrowspace — an open-source library for curating and understanding LLM datasets across the entire lifecycle, from pre-training to inference.
Instead of treating embeddings as static vectors, arrowspace turns them into graphs (“graph wiring”) so you can explore structure, not just similarity. That unlocks smarter RAG search (beyond basic semantic matching), dataset fingerprinting, and deeper insights into how different datasets behave.
You can compare datasets, predict how changes will affect performance, detect drift early, and even safely mix data sources while measuring outcomes.
In short: arrowspace helps you see your data — and make better decisions because of it.
# arrowspace
# Vector Search
# Epipelxity
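The "graph wiring" idea — treating embeddings as nodes in a graph rather than static vectors — can be sketched with a plain k-nearest-neighbor graph over cosine similarity. This is a from-scratch illustration of the concept; arrowspace's actual API and graph construction may differ:

```python
# Minimal "graph wiring" sketch: connect each embedding to its k nearest
# neighbors by cosine similarity, yielding a graph whose structure (degrees,
# clusters) can be inspected, not just pairwise similarity scores.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def knn_graph(embeddings, k=1):
    edges = {}
    for i, e in enumerate(embeddings):
        sims = [(cosine(e, other), j)
                for j, other in enumerate(embeddings) if j != i]
        sims.sort(reverse=True)              # most similar first
        edges[i] = [j for _, j in sims[:k]]
    return edges

emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(knn_graph(emb, k=1))  # nodes 0 and 1 wire to each other
```

Once embeddings are wired into a graph like this, properties such as connected components or degree distributions become dataset "fingerprints" you can compare across sources — the kind of structural view the episode argues plain semantic matching misses.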


Donné Stevenson, Pedro Chaves & Demetrios Brinkmann · Mar 20th, 2026
Marketplaces are about to get weird.
With Pedro Chaves and Donné Stevenson: agents picking your house, negotiating deals, even talking to other agents for you.
Less browsing. Less choice. More automation.
Convenience… or giving up control?
# AI Agents
# Marketplace
# Prosus
# OLX


Johann Schleier-Smith & Demetrios Brinkmann · Mar 17th, 2026
A new paradigm is emerging for building applications that process large volumes of data, run for long periods of time, and interact with their environment. It’s called Durable Execution and is replacing traditional data pipelines with a more flexible approach.
Durable Execution makes regular code reliable and scalable.
In the past, reliability and scalability have come from restricted programming models, like SQL or MapReduce, but with Durable Execution this is no longer the case. We can now see data pipelines that include document processing workflows, deep research with LLMs, and other complex and LLM-driven agentic patterns expressed at scale with regular Python programs.
In this session, we describe Durable Execution and explain how it fits in with agents and LLMs to enable a new class of machine learning applications.
# AI Agents
# AI Engineer
# AI agents in production
# AI agent usecase
# System Design
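The mechanism that lets Durable Execution make "regular code reliable" is journaling: each step's result is persisted, so a crashed workflow replays from the journal instead of redoing work. A toy in-memory sketch of that idea — real durable-execution runtimes add persistent storage, retries, and distributed workers, and the decorator here is purely illustrative:

```python
# Toy durable-execution sketch: each step's result is journaled, so replaying
# the workflow after a crash skips completed steps. A real runtime persists
# the journal and handles retries, timers, and distributed workers.
journal = {}   # step name -> saved result (a real system persists this)

def durable_step(name):
    def wrap(fn):
        def inner(*args):
            if name in journal:          # already ran: replay from journal
                return journal[name]
            result = fn(*args)
            journal[name] = result       # checkpoint before moving on
            return result
        return inner
    return wrap

@durable_step("extract")
def extract(doc):
    return doc.upper()

@durable_step("summarize")
def summarize(text):
    return text[:5]

def workflow(doc):
    # Plain Python control flow; durability comes from the journaled steps.
    return summarize(extract(doc))

print(workflow("hello world"))  # HELLO
print(workflow("hello world"))  # replayed entirely from the journal
```

This is why the restricted programming models mentioned above (SQL, MapReduce) are no longer required: as long as steps are journaled and replay deterministically, ordinary Python control flow can express the pipeline.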


Chris Fregly & Demetrios Brinkmann · Feb 24th, 2026
In today’s era of massive generative models, it's important to understand the full scope of AI systems' performance engineering. This talk discusses the new O'Reilly book, AI Systems Performance Engineering, and the accompanying GitHub repo (https://github.com/cfregly/ai-performance-engineering).
This talk provides engineers, researchers, and developers with a set of actionable optimization strategies. You'll learn techniques to co-design and co-optimize hardware, software, and algorithms to build resilient, scalable, and cost-effective AI systems for both training and inference.
# NVIDIA GPUs
# CUDA framework
# GitHub repo
