MLOps Community Podcast
# NVIDIA GPUs
# CUDA framework
# GitHub repo
Performance Optimization and Software/Hardware Co-design across PyTorch, CUDA, and NVIDIA GPUs
In today’s era of massive generative models, it's important to understand the full scope of AI systems' performance engineering. This talk discusses the new O'Reilly book, AI Systems Performance Engineering, and the accompanying GitHub repo (https://github.com/cfregly/ai-performance-engineering).
This talk provides engineers, researchers, and developers with a set of actionable optimization strategies. You'll learn techniques to co-design and co-optimize hardware, software, and algorithms to build resilient, scalable, and cost-effective AI systems for both training and inference.


Chris Fregly & Demetrios Brinkmann · Feb 24th, 2026


Rahul Raja & Demetrios Brinkmann · Feb 17th, 2026
Information Retrieval is evolving from keyword matching to intelligent, vector-based understanding. In this talk, Rahul Raja explores how dense retrieval, vector databases, and hybrid search systems are redefining how modern AI retrieves, ranks, and reasons over information. He discusses how retrieval now powers large language models through Retrieval-Augmented Generation (RAG) and the new MLOps challenges that arise: embedding drift, continuous evaluation, and large-scale vector maintenance.
Looking ahead, the session envisions a future of Cognitive Search, where retrieval systems move beyond recall to genuine reasoning, contextual understanding, and multimodal awareness. Listeners will gain insight into how the next generation of retrieval will bridge semantics, scalability, and intelligence, powering everything from search and recommendations to generative AI.
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design


Vincent D. Warmerdam & Demetrios Brinkmann · Feb 13th, 2026
Vincent Warmerdam joins Demetrios fresh off marimo’s acquisition by Weights & Biases—and makes a bold claim: notebooks as we know them are outdated.
They talk Molab (GPU-backed, cloud-hosted notebooks), LLMs that don’t just chat but actually fix your SQL and debug your code, and why most data folks are consuming tools instead of experimenting. Vincent argues we should stop treating notebooks like static scratchpads and start treating them like dynamic apps powered by AI.
It’s a conversation about rethinking workflows, reclaiming creativity, and not outsourcing your brain to the model.
# Vincent D. Warmerdam
# Calmcode
# marimo
# wandb
# Jupyter Notebooks
# Data Science


Ereli Eran & Demetrios Brinkmann · Feb 10th, 2026
A conversation on how AI coding agents are changing the way we build and operate production systems. We explore the practical boundaries between agentic and deterministic code, strategies for shared responsibility across models, engineering teams, and customers, and how to evaluate agent performance at scale. Topics include production quality gates, safety and cost tradeoffs, managing long-tail failures, and deployment patterns that let you ship agents with confidence.
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design


Nick Gillian & Demetrios Brinkmann · Feb 6th, 2026
As AI moves beyond the cloud and simulation, the next frontier is Physical AI: systems that can perceive, understand, and act within real-world environments in real time. In this conversation, Nick Gillian, Co-Founder and CTO of Archetype AI, explores what it actually takes to turn raw sensor and video data into reliable, deployable intelligence.
Drawing on his experience building Google’s Soli and Jacquard and now leading development of Newton, a foundational model for Physical AI, Nick discusses how real-time physical understanding changes what’s possible across safety monitoring, infrastructure, and human–machine interaction. He’ll share lessons learned translating advanced research into products that operate safely in dynamic environments, and why many organizations underestimate the challenges and opportunities of AI in the physical world.
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design


Kris Beevers & Demetrios Brinkmann · Feb 3rd, 2026
Hundreds of neocloud operators and "AI Factory" builders have emerged to serve the insatiable demand for AI infrastructure. These teams are compressing the design, build, deploy, operate, and scale cycle of their infrastructure down to months, while managing massive footprints with lean teams. How? By applying modern intent-driven infrastructure automation principles to greenfield deployments. We'll explore how these teams carry design intent through to production, and how operating and automating around consistent infrastructure data is compressing "time to first train".
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design


Mike Oaten & Demetrios Brinkmann · Jan 27th, 2026
As AI models move into high-stakes environments like Defence and Financial Services, standard input/output testing, evals, and monitoring are becoming dangerously insufficient. To comply with the EU AI Act, NIST AI RMF, and other requirements, MLOps teams need to access and analyse the internal reasoning of their models.
In this session, Mike introduces the company's patent-pending AI assurance technology that moves beyond statistical proxies. He will break down the architecture of the Synapses Logger, a patent-pending technology that embeds directly into the neural activation flow to capture weights, activations, and activation paths in real time.
# EU AI Act
# Regulations Compliance
# Tikos


Wilder Lopes & Demetrios Brinkmann · Jan 20th, 2026
Enterprise organizations face a critical paradox in AI deployment: while 52% struggle to access needed GPU resources with 6-12 month waitlists, 83% of existing CPU capacity sits idle. This talk introduces an approach to AI infrastructure optimization through universal resource management that reshapes applications to run efficiently on any available hardware—CPUs, GPUs, or accelerators.
We explore how code reshaping technology can unlock the untapped potential of enterprise computing infrastructure, enabling organizations to serve 2-3x more workloads while dramatically reducing dependency on scarce GPU resources. The presentation demonstrates why CPUs often outperform GPUs for memory-intensive AI workloads, offering superior cost-effectiveness and immediate availability without architectural complexity.
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design



Corey Zumar, Danny Chiao, Jules Damji & 1 more speaker · Jan 16th, 2026
MLflow isn’t just for data scientists anymore, and pretending otherwise is holding teams back.
Corey Zumar, Jules Damji, and Danny Chiao break down how MLflow is being rebuilt for GenAI, agents, and real production systems where evals are messy, memory is risky, and governance actually matters. The takeaway: if your AI stack treats agents like fancy chatbots or splits ML and software tooling, you’re already behind.
# Agents in Production
# Open Source
# MLflow
# Databricks


Zengyi Qin & Demetrios Brinkmann · Jan 2nd, 2026
What if the computer itself can think and take actions for you? You just give it a goal, and it performs every click, keystroke, and drag to get work done across the desktop and web. In this talk, Zengyi reveals the breakthrough technology that his company OpenAGI is developing: AI that can use computers like humans do. He talks about how his team developed the model, why it outperforms similar models from OpenAI and Google, and its wide use cases across different domains.
# AI Agents
# Robotics
# OpenAGI Foundation



Varant Zanoyan, Nikhil Simha & Demetrios Brinkmann · Dec 28th, 2025
Feature stores might be the wrong abstraction.
Varant Zanoyan and Nikhil Simha Raprolu explain why Chronon ditched “store-first” thinking and focused on compute, orchestration, and real-time correctness. Chronon was born at Airbnb and battle-tested with Stripe. If embeddings, agents, and real-time ML feel painful, this episode explains why.
# AI Search
# AI Agents
# Zipline AI
