MLOps Community Podcast
# AI
# Data Privacy
# Google DeepMind
Robustness, Detectability, and Data Privacy in AI
Recent rapid advancements in Artificial Intelligence (AI) have made it widely applicable across various domains, from autonomous systems to multimodal content generation. However, these models remain susceptible to significant security and safety vulnerabilities. Such weaknesses can enable attackers to jailbreak systems, allowing them to perform harmful tasks or leak sensitive information. As AI becomes increasingly integrated into critical applications like autonomous robotics and healthcare, the importance of ensuring AI safety is growing. Understanding the vulnerabilities in today’s AI systems is crucial to addressing these concerns.


Vinu Sankar Sadasivan & Demetrios Brinkmann · Feb 7th, 2025


Alex Strick van Linschoten & Demetrios Brinkmann · Jan 31st, 2025
Alex Strick van Linschoten, a machine learning engineer at ZenML, joins the MLOps Community podcast to discuss his comprehensive database of real-world LLM use cases. Drawing inspiration from Evidently AI, Alex created the database to organize fragmented information on LLM usage, covering everything from common chatbot implementations to innovative applications across sectors. They discuss the technical challenges and successes in deploying LLMs, emphasizing the importance of foundational MLOps practices. The episode concludes with a call for community contributions to further enrich the database and collective knowledge of LLM applications.
# ChatBot
# LLM
# ZenML


Ilya Reznik & Demetrios Brinkmann · Jan 27th, 2025
Ilya Reznik shares his insights into machine learning and career development in the field. With over 13 years of experience at leading tech companies such as Meta, Adobe, and Twitter, Ilya emphasizes the limitations of traditional model fine-tuning methods. He advocates for alternatives like prompt engineering and knowledge retrieval, highlighting their potential to enhance AI performance without the drawbacks associated with fine-tuning.
Ilya's recent discussions at the NeurIPS conference reflect a shift towards practical applications of Transformer models and innovative strategies like curriculum learning. Additionally, he shares valuable perspectives on navigating career progression in tech, offering guidance for aspiring ML engineers aiming for senior roles. His narrative serves as a blend of technical expertise and practical career advice, making it a significant resource for professionals in the AI domain.
# Meta
# Consulting
# Instructed Machines, LLC


Tomaz Levak & Demetrios Brinkmann · Jan 24th, 2025
The talk focuses on how the OriginTrail Decentralized Knowledge Graph (DKG) serves as a collective memory for AI and enables neuro-symbolic AI. We cover the basics of OriginTrail’s symbolic AI fundamentals (i.e., knowledge graphs) and go over how decentralization improves data integrity, provenance, and user control. We also cover the DKG’s role in AI agentic frameworks and how it helps with verifying and accessing diverse data sources while maintaining compatibility with existing standards.
We explore practical use cases from the enterprise sector as well as the latest integrations into frameworks like ElizaOS. We conclude by outlining the future potential of decentralized AI, AI becoming the interface that “eats” SaaS, and the general convergence of AI, the Internet, and crypto.
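The symbolic side of neuro-symbolic AI mentioned above rests on a simple idea: knowledge is stored as (subject, predicate, object) triples and retrieved by pattern matching. The toy sketch below illustrates that idea only; it is not the OriginTrail DKG API, and all names in it are hypothetical.

```python
# Toy triple store illustrating the knowledge-graph idea behind
# symbolic AI: facts are (subject, predicate, object) triples,
# queried by pattern matching. All entity names are made up.

triples = {
    ("batch_42", "producedBy", "acme_factory"),
    ("batch_42", "certifiedBy", "iso_9001_auditor"),
    ("acme_factory", "locatedIn", "Ljubljana"),
}

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the given pattern (None = wildcard)."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

# Provenance lookup: every fact recorded about batch_42.
facts = query(subject="batch_42")
```

In a decentralized knowledge graph, the same triple-and-query model applies, but the triples carry cryptographic proofs of origin, which is what gives agents verifiable provenance for the data they retrieve.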
# AI
# Decentralized Knowledge Graph
# OriginTrail


Krishna Sridhar & Demetrios Brinkmann · Jan 17th, 2025
Qualcomm® AI Hub helps to optimize, validate, and deploy machine learning models on-device for vision, audio, and speech use cases.
With Qualcomm® AI Hub, you can:
Convert trained models from frameworks like PyTorch and ONNX for optimized on-device performance on Qualcomm® devices.
Profile models on-device to obtain detailed metrics including runtime, load time, and compute unit utilization.
Verify numerical correctness by performing on-device inference.
Easily deploy models using Qualcomm® AI Engine Direct, TensorFlow Lite, or ONNX Runtime.
The Qualcomm® AI Hub Models repository contains a collection of example models that use Qualcomm® AI Hub to optimize, validate, and deploy models on Qualcomm® devices.
Qualcomm® AI Hub automatically handles model translation from the source framework to the device runtime, applies hardware-aware optimizations, and performs physical performance and numerical validation. The system automatically provisions devices in the cloud for on-device profiling and inference.
# AI
# Models at the Edge
# Qualcomm


Zach Wallace & Demetrios Brinkmann · Jan 14th, 2025
Demetrios chats with Zach Wallace, engineering manager at Nearpod, about integrating AI agents in e-commerce and edtech. They discuss using agents for personalized user targeting, adapting AI models with real-time data, and ensuring efficiency through clear task definitions. Zach shares how Nearpod streamlined data integration with tools like Redshift and DBT, enabling real-time updates. The conversation covers challenges like maintaining AI in production, handling high-quality data, and meeting regulatory standards. Zach also highlights the cost-efficiency framework for deploying and decommissioning agents and the transformative potential of LLMs in education.
# AI Agents
# LLMs
# Nearpod Inc


Egor Kraev & Demetrios Brinkmann · Jan 8th, 2025
Demetrios chats with Egor Kraev, principal AI scientist at Wise, about integrating LLMs to enhance ML pipelines and humanize data interactions. Egor discusses his open-source MotleyCrew framework, career journey, and insights into AI's role in fintech, highlighting its potential to streamline operations and transform organizations.
# Machine Learning
# AI Agents
# Autonomy
# Wise



Michelle Marie Conway, Andrew Baker & Demetrios Brinkmann · Jan 3rd, 2025
Lloyds Banking Group is on a mission to embrace the power of cloud and unlock the opportunities that it provides. Andrew, Michelle, and their MLOps team have spent the last 12 months taking their portfolio of roughly 10 machine learning models in production and migrating it from an on-prem solution to a cloud-based environment. During the podcast, Michelle and Andrew share their reflections as well as some dos (and don’ts!) of managing the migration of an established portfolio.
# Tech Stack
# Cloud
# Lloyds Banking Group


Jineet Doshi & Demetrios Brinkmann · Dec 23rd, 2024
Evaluating LLMs is essential to establishing trust before deploying them to production. Even post-deployment, evaluation is essential to ensure LLM outputs meet expectations, making it a foundational part of LLMOps. However, evaluating LLMs remains an open problem. Unlike traditional machine learning models, LLMs can perform a wide variety of tasks, such as writing poems, question answering, and summarization. This raises the question: how do you evaluate a system with such broad capabilities? This talk covers the various approaches for evaluating LLMs, such as classic NLP techniques and red teaming, along with newer ones like using LLMs as a judge, and the pros and cons of each. The talk includes evaluation of complex GenAI systems like RAG and agents. It also covers evaluating LLMs for safety and security and the need for a holistic approach to evaluating these very capable models.
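The contrast between the approaches mentioned in the talk can be sketched in a few lines: a classic reference-based metric (here, exact match) works for narrow tasks with a known answer, while open-ended output calls for an LLM-as-a-judge scorer. This is a minimal sketch; both the model under test and the judge are stubs, where a real harness would make actual LLM calls.

```python
# Two evaluation styles, stubbed for illustration:
# (1) exact match, a classic reference-based NLP metric, and
# (2) an "LLM as a judge" scorer for open-ended answers.

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if normalized strings match, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())

def llm_judge(prediction: str, question: str) -> float:
    """Stub judge: a real implementation would prompt a strong LLM to
    rate the answer (e.g., on a 1-5 scale) and parse the score."""
    return 1.0 if prediction.strip() else 0.0

dataset = [
    {"question": "What is the capital of France?", "reference": "Paris"},
]

def evaluate(model, dataset):
    """Score every example with both metrics; `model` maps question -> answer."""
    return [
        {
            "exact_match": exact_match(pred, ex["reference"]),
            "judge": llm_judge(pred, ex["question"]),
        }
        for ex in dataset
        for pred in [model(ex["question"])]
    ]
```

Running both metrics side by side is one way to spot where reference-based scoring and judge-based scoring disagree, which is often where the interesting failure modes live.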
# Generative AI
# LLM Evaluation
# Intuit


Guanhua Wang & Demetrios Brinkmann · Dec 17th, 2024
Given the popularity of generative AI, Large Language Models (LLMs) often consume hundreds or thousands of GPUs to parallelize and accelerate the training process. Communication overhead becomes more pronounced when training LLMs at scale. To eliminate communication overhead in distributed LLM training, we propose Domino, which provides a generic scheme to hide communication behind computation. By breaking the data dependency of a single batch training into smaller independent pieces, Domino pipelines these independent pieces of training and provides a generic strategy of fine-grained communication and computation overlapping. Extensive results show that compared with Megatron-LM, Domino achieves up to 1.3x speedup for LLM training on Nvidia DGX-H100 GPUs.
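The overlap idea described above can be illustrated with a toy pipeline: split a batch into independent pieces, then run each piece's "gradient communication" in a background thread while the next piece computes in the foreground. This is a hedged sketch of the general pattern, not Domino's implementation; real systems overlap NCCL all-reduces with GPU kernels, and both operations here are plain Python stand-ins.

```python
# Toy sketch of communication/computation overlap: while piece i's
# result is being "all-reduced" on a background thread, piece i+1's
# computation proceeds in the foreground.

from concurrent.futures import ThreadPoolExecutor

def compute(piece):
    # Stand-in for the forward/backward pass of one micro-piece.
    return [x * 2 for x in piece]

def communicate(grads):
    # Stand-in for an all-reduce over the data-parallel group.
    return sum(grads)

def train_step_overlapped(batch, num_pieces=2):
    """Pipeline the pieces: submit each piece's communication
    asynchronously, then compute the next piece while it runs."""
    size = len(batch) // num_pieces
    pieces = [batch[i * size:(i + 1) * size] for i in range(num_pieces)]
    reduced = []
    with ThreadPoolExecutor(max_workers=1) as comm_stream:
        pending = None
        for piece in pieces:
            grads = compute(piece)                  # foreground compute
            if pending is not None:
                reduced.append(pending.result())    # drain previous comm
            pending = comm_stream.submit(communicate, grads)  # async comm
        reduced.append(pending.result())
    return reduced
```

The speedup comes from the fact that, in steady state, each communication runs concurrently with the next piece's computation, so only the final communication is left fully exposed.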
# LLM
# Domino
# Microsoft


Aditya Naganath & Demetrios Brinkmann · Dec 10th, 2024
LLMs have ushered in an unmistakable supercycle in the world of technology. The low-hanging use cases have largely been picked off. The next frontier will be AI coworkers who sit alongside knowledge workers, doing work side by side. At the infrastructure level, one of the most important primitives ever invented, the data center, is being fundamentally rethought in this new wave.
# LLMs
# AI
# Kleiner Perkins