MLOps Community
Duncan Curtis
Demetrios Brinkmann
Duncan Curtis & Demetrios Brinkmann · Apr 18th, 2025
How Sama is Improving ML Models to Make AVs Safer
Between Uber’s partnership with NVIDIA and speculation that U.S. President Donald Trump will enact policies allowing fully autonomous vehicles, it’s more important than ever to ensure the accuracy of machine learning models. Yet public confidence in AVs is shaky because of high-profile accidents caused by gaps in the technology, gaps that Sama is looking to fill. Duncan Curtis, SVP of Product and Technology at Sama and one of the industry’s top leaders, shares how to improve the accuracy, speed, and cost-efficiency of ML algorithms for AVs. Sama’s machine learning technologies minimize the risk of model failure and lower the total cost of ownership for car manufacturers including Ford, BMW, and GM, as well as four of the five top OEMs and their Tier 1 suppliers. The conversation is especially timely: Tesla is under investigation for crashes involving its Smart Summon feature, and Waymo recently had a passenger trapped in one of its driverless taxis.
# ML algorithms
# AVs
# Sama
45:35
Vaibhav Gupta
Charles Frye
Ben Epstein
Vaibhav Gupta, Charles Frye & Ben Epstein · Apr 17th, 2025
Breaking the Demo Barrier and Getting Agents Shipped
Deploying Large Language Models (LLMs) in production brings a host of challenges well beyond prompt engineering. Once they're live, even the smallest oversight, like a malformed API call or unexpected user input, can cause failures you never saw coming. In this talk, Vaibhav Gupta will share proven strategies and practical tooling to keep LLMs robust in real-world environments. You'll learn about structured prompting, dynamic routing with fallback handlers, and data-driven guardrails, all aimed at catching errors before they break your application. You'll also hear why the naïve use of JSON can reduce a model's accuracy, and discover when it's wise to push back on standard serialization in favor of more flexible output formats. Whether you're processing 100+ page bank statements, analyzing user queries, or summarizing critical healthcare data, you'll not only understand how to prevent LLMs from failing but also how to design AI-driven solutions that scale gracefully alongside evolving user needs.
Modal: ML Infra That Does Not Suck
Building an application on the cloud doesn't have to suck. Even if it uses GPUs and foundation models! In this talk, I'll present Modal, the serverless Python infrastructure you didn't know you always wanted.
# AI Systems
# Frameworks
# BAML
# Modal
1:08:10
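The "dynamic routing with fallback handlers" pattern the abstract mentions can be sketched in a few lines. Everything below is a hypothetical illustration, not BAML or Modal API code: `primary_model`, `fallback_handler`, and `route` are stand-ins for a real LLM call, a deterministic last resort, and the routing layer between them.

```python
# Hypothetical sketch of dynamic routing with a fallback handler: try a
# primary model, validate its output, and fall back when the call fails
# or the output does not pass the guardrail checks.
def primary_model(query: str) -> str:
    # Stand-in for a real LLM call; here it fails on empty input.
    if not query.strip():
        raise ValueError("empty query")
    return f"primary answer to: {query}"

def fallback_handler(query: str) -> str:
    # Deterministic last resort so the application never hard-fails.
    return "Sorry, I couldn't process that request."

def route(query: str, validators=(lambda out: len(out) > 0,)) -> str:
    try:
        out = primary_model(query)
        if all(check(out) for check in validators):  # data-driven guardrail
            return out
    except Exception:
        pass
    return fallback_handler(query)

print(route("what is MLOps?"))  # primary path
print(route("   "))             # falls back
```

The `validators` tuple is where the talk's "catch errors before they break your application" idea lives: any output that fails a check is treated the same as an exception.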
Traditional product development cycles require extensive consumer research and market testing, resulting in lengthy development timelines and significant resource investment. We've transformed this process by building a distributed multi-agent system that enables parallel quantitative evaluation of hundreds of product concepts. Our system combines three key components: an agentic innovation lab generating high-quality product concepts, synthetic consumer panels using fine-tuned foundational models validated against historical data, and an evaluation framework that correlates with real-world testing outcomes. We can talk about how this architecture enables rapid concept discovery and digital experimentation, delivering insights into product success probability before development begins. Through case studies and technical deep-dives, you'll learn how we built an AI-powered innovation lab that compresses months of product development and testing into minutes, without sacrificing the accuracy of insights.
# Gen AI
# AI Agents
# PyMC Labs
1:00:44
AI is heading for an energy crisis, with data centers projected to consume as much electricity as France by 2027. Big Tech's current solution—building more power plants—is unsustainable. Real solutions lie in energy-efficient computing (like in-memory and analog) and shifting AI to edge devices. Without these, AI’s progress risks being bottlenecked by electricity limits.
# Energy Crisis
# Edge AI
# Climate Change
Josh Xi
Demetrios Brinkmann
Josh Xi & Demetrios Brinkmann · Apr 11th, 2025
In real-time forecasting (e.g., geohash-level demand and supply forecasts for an entire region), time-series forecasting methods are widely adopted due to their simplicity and ease of training. This discussion explores how Lyft uses time series forecasting to respond to real-time market dynamics, covering practical tips and tricks for implementing these methods, an in-depth look at their adaptability for online re-training, and discussions on their interpretability and user intervention capabilities. By examining these topics, listeners will understand how time series forecasting can outperform DNNs, and how to effectively use time series forecasting for dynamic market conditions and decision-making applications.
# Time Series
# DNNs
# Lyft
53:42
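A minimal sketch of the kind of lightweight time-series method the episode contrasts with DNNs. Simple exponential smoothing is assumed here purely as an illustration (the blurb does not describe Lyft's actual models); its one-line update rule is what makes online re-training cheap.

```python
# Simple exponential smoothing: the level is updated in place after each
# observation, so "re-training" on a new data point is a single assignment.
def exponential_smoothing(series, alpha=0.5):
    """Return one-step-ahead forecasts for each point in `series`."""
    level = series[0]
    forecasts = []
    for y in series:
        forecasts.append(level)                    # forecast made before seeing y
        level = alpha * y + (1 - alpha) * level    # online update
    return forecasts

demand = [10, 12, 11, 15, 14]   # toy per-interval demand counts
print(exponential_smoothing(demand, alpha=0.5))
```

Unlike a DNN, the model state is one number per series, so it is trivially interpretable and a human can intervene by resetting the level or adjusting `alpha` directly.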
Sophia Skowronski
Adam Becker
Valdimar Eggertsson
Sophia Skowronski, Adam Becker & Valdimar Eggertsson · Apr 9th, 2025
We break down key insights from the paper, discuss what these findings mean for AI’s role in the workforce, and debate its broader implications. As always, our expert moderators guide the session, followed by an open, lively discussion where you can share your thoughts, ask questions, and challenge ideas with fellow MLOps enthusiasts.
# Generative AI
# Claude
# Hierarchical Taxonomy
55:09
Tanmay Chopra
Demetrios Brinkmann
Tanmay Chopra & Demetrios Brinkmann · Apr 8th, 2025
Finetuning is dead. Finetuning is only for style. We've all heard these claims. But the truth is we feel this way because all we've been doing is extended pretraining. I'm excited to chat about what real finetuning looks like - modifying output heads, loss functions, and model layers - and its implications for quality and latency. Happy to dive deeper into how DeepSeek leveraged this real version of finetuning through GRPO and how this is nothing more than a rediscovery of our old finetuning ways. I'm sure we'll naturally also dive into when developing and deploying specialized models makes sense and the challenges you face when doing so.
# Finetuning
# DeepSeek
# Emissary
1:00:31
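What "modifying output heads and loss functions" looks like mechanically can be sketched in PyTorch. This is a hypothetical toy backbone, not DeepSeek's setup or GRPO: a small encoder gets a task-specific classification head and a label loss in place of next-token prediction.

```python
import torch
import torch.nn as nn

# Hypothetical tiny backbone standing in for a pretrained LLM
# (assumption: 32-dim hidden states, 100-token vocabulary).
hidden_dim, vocab_size, num_labels = 32, 100, 3

backbone = nn.Sequential(
    nn.Embedding(vocab_size, hidden_dim),
    nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=4, batch_first=True),
        num_layers=1,
    ),
)

# "Real" finetuning step 1: replace the LM head with a task-specific output head.
classifier_head = nn.Linear(hidden_dim, num_labels)

# Step 2: replace the loss, e.g. next-token cross-entropy -> label cross-entropy.
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(0, vocab_size, (8, 16))   # batch of 8 sequences, length 16
labels = torch.randint(0, num_labels, (8,))

features = backbone(tokens).mean(dim=1)          # mean-pool token states to one vector
logits = classifier_head(features)
loss = loss_fn(logits, labels)
loss.backward()                                  # gradients flow into head AND backbone

print(logits.shape)  # torch.Size([8, 3])
```

The latency implication the speaker alludes to falls out directly: the new head emits `num_labels` logits in one forward pass instead of generating tokens autoregressively.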
This third article in the series on Distributed MLOps explores overcoming vendor lock-in by unifying AMD and NVIDIA GPUs in mixed clusters for distributed PyTorch training, all without requiring code rewrites:
Mixing GPU Vendors: It demonstrates how to combine AWS g4ad (AMD) and g4dn (NVIDIA) instances, bridging ROCm and CUDA to avoid being tied to a single vendor.
High-Performance Communication: It highlights the use of UCC and UCX to enable efficient operations like all_reduce and all_gather, ensuring smooth and synchronized training across diverse GPUs.
Kubernetes Made Simple: It shows how Kubernetes, enhanced by Volcano for gang scheduling, can orchestrate these workloads on heterogeneous GPU setups.
Real-World Trade-Offs: While covering techniques like dynamic load balancing and gradient compression, it also notes current limitations.
Overall, the piece illustrates how integrating mixed hardware can maximize resource potential, delivering faster, scalable, and cost-effective machine learning training.
# MLOps
# Machine Learning
# Kubernetes
# PyTorch
# AWS
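The `all_reduce` collective the article highlights can be sketched with `torch.distributed`. As an assumption, this sketch uses the CPU-only `gloo` backend in a single-process world so it runs anywhere; the article's mixed-GPU setup swaps in UCC/UCX as the communication layer, while the calling code stays the same.

```python
import os
import torch
import torch.distributed as dist

# Minimal sketch (assumption: single-process "world", gloo backend) of the
# collective at the heart of data-parallel training: each worker holds local
# gradients, and all_reduce leaves every rank holding their element-wise sum.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

local_grads = torch.tensor([1.0, 2.0, 3.0])      # this rank's gradient shard
dist.all_reduce(local_grads, op=dist.ReduceOp.SUM)  # in-place across all ranks
print(local_grads)  # with world_size=1, unchanged; with N ranks, the sum

dist.destroy_process_group()
```

Because the collective is an abstraction over the backend, the same `all_reduce` call synchronizes gradients whether the ranks sit on ROCm or CUDA devices, which is exactly what makes the mixed-vendor cluster transparent to training code.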
David Cox
Demetrios Brinkmann
David Cox & Demetrios Brinkmann · Apr 7th, 2025
Shiny new objects are made available to artificial intelligence (AI) practitioners daily. For many who are not AI practitioners, the release of ChatGPT in 2022 was their first contact with modern AI technology. This led to a flurry of funding and excitement around how AI might improve their bottom line. Two years on, the novelty of AI has worn off for many companies, but AI remains a strategic initiative. This strategic nuance has led to two patterns that suggest a maturation of the AI conversation across industries. First, conversations seem to be pivoting from "Are we doing [the shiny new thing]?" to serious analysis of the ROI of things built. This reframe places less emphasis on simply adopting new technologies for the sake of doing so and more emphasis on the optimal stack to maximize return relative to cost. Second, conversations are shifting to emphasize market differentiation. That is, anyone can build products that wrap around LLMs. In competitive markets, creating products and functionality that all your competitors can also build is a poor business strategy (unless having a particular thing is industry standard). Creating a competitive advantage requires companies to think strategically about their unique data assets and what they can build that their competitors cannot.
# AI
# LLM
# RethinkFirst
40:51
Rohit Agrawal
Demetrios Brinkmann
Rohit Agrawal & Demetrios Brinkmann · Apr 4th, 2025
Demetrios talks with Rohit Agrawal, Director of Engineering at Tecton, about the challenges and future of streaming data in ML. Rohit shares his path at Tecton and insights on managing real-time and batch systems. They cover tool fragmentation (Kafka, Flink, etc.), infrastructure costs, managed services, and trends like using S3 for storage and Iceberg as the GitHub for data. The episode wraps with thoughts on BYOC solutions and evolving data architectures.
# Batch Systems
# Cost Management
# Streaming Ecosystem
# Tecton
47:39