MLOps Community Podcast
# Memory
# Checkpointing
# MemVerge
Boosting LLM/RAG Workflows & Scheduling w/ Composable Memory and Checkpointing
Limited memory capacity hinders the performance and potential of research and production environments that use Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) techniques. This discussion explores how industry-standard CXL memory can be configured as a secondary, composable memory tier to alleviate this constraint.
We will highlight some recent work we’ve done in integrating this novel class of memory into LLM/RAG/vector database frameworks and workflows.
Disaggregated shared memory is envisioned to offer high-performance, low-latency caches for model and pipeline checkpoints of LLMs, KV caches during distributed inference, LoRA adapters, and in-process data for heterogeneous CPU/GPU workflows. We expect to showcase these types of use cases in the coming months.
Bernie Wu & Demetrios Brinkmann · Oct 22nd, 2024
Gideon Mendels & Demetrios Brinkmann · Oct 18th, 2024
When building LLM applications, developers need to take a hybrid approach that draws on both ML and software engineering best practices. They need to define eval metrics and track all of their experiments to see what is and is not working. They also need to define comprehensive unit tests for their particular use case so they can confidently check whether their LLM app is ready to be deployed.
# LLMs
# Engineering best practices
# Comet ML
Raj Rikhy · Oct 15th, 2024
In this MLOps Community podcast, Demetrios chats with Raj Rikhy, Principal Product Manager at Microsoft, about deploying AI agents in production. They discuss starting with simple tools, setting clear success criteria, and deploying agents in controlled environments for better scaling. Raj highlights real-time uses like fraud detection and optimizing inference costs with LLMs while stressing human oversight during early deployment to manage LLM randomness. The episode offers practical advice on deploying AI agents thoughtfully and efficiently, avoiding over-engineering and integrating AI into everyday applications.
# AI agents in production
# LLMs
# AI
Jelmer Borst, Daniela Solis & Demetrios Brinkmann · Oct 8th, 2024
Like many companies, Picnic started out with a small, central data science team. As the team grows larger and focuses on more complex models, questions arise about skill sets and organisational setup:
Use an ML platform, or build ourselves?
A central team vs. embedded?
Hire data scientists vs. ML engineers vs. MLOps engineers?
How to foster a team culture of end-to-end ownership
How to balance short-term & long-term impact
# Recruitment
# Growth
# Picnic
Francisco Ingham & Demetrios Brinkmann · Oct 4th, 2024
Being LLM-native is becoming one of the key differentiators among companies in vastly different verticals. Everyone wants to use LLMs, and everyone wants to be on top of the current tech, but what does it really mean to be LLM-native?
LLM-native involves two ends of a spectrum. On the one hand, we have the product or service that the company offers, which surely offers many automation opportunities. LLMs can be applied strategically to scale at a lower cost and offer a better experience for users.
But being LLM-native involves more than the company's customers; it also involves every stakeholder in the company's operations. How can employees integrate LLMs into their daily workflows? How can we as developers leverage the advancements in the field not only as builders but as adopters?
We will tackle these and other key questions for anyone looking to capitalize on the LLM wave, prioritizing real results over the hype.
# LLM-native
# RAG
# Pampa Labs
Simba Khadder & Demetrios Brinkmann · Oct 1st, 2024
Simba dives into how feature stores have evolved and how they now intersect with vector stores, especially in the world of machine learning and LLMs. He breaks down what embeddings are, how they power recommender systems, and why personalization is key to improving LLM prompts. Simba also sheds light on the difference between feature and vector stores, explaining how each plays its part in making ML workflows smoother. Plus, we get into the latest challenges and cool innovations happening in MLOps.
# Feature Stores
# LLMs
# Featureform
Stefano Bosisio & Demetrios Brinkmann · Sep 27th, 2024
This talk goes through Stefano's experience as a source of inspiration for anyone who wants to start a career in the MLOps sector. Stefano will also introduce his MLOps Course on the MLOps community platform.
# Inspirational Source
# MLOps Course
# Synthesia
Sai Bharath Gottam, Cole Bailey & Stephen Batifol · Sep 24th, 2024
Delivery Hero innovates locally within each department to develop the most effective MLOps practices in that particular context. We also discuss our efforts to reduce redundancy and inefficiency across the company. Hear about our experiences creating multiple micro feature stores within our departments, and our goal to unify these into a Global Feature Store that is more powerful when combined.
# Global Feature Store
# MLOps Practices
# Delivery Hero
Adam Kamor & Demetrios Brinkmann · Sep 20th, 2024
Dive into what makes Retrieval-Augmented Generation (RAG) systems tick—and it all starts with the data. We’ll be talking with an expert in the field who knows exactly how to transform messy, unstructured enterprise data into high-quality fuel for RAG systems.
Expect to learn the essentials of data prep, uncover the common challenges that can derail even the best-laid plans, and discover some insider tips on how to boost your RAG system’s performance. We’ll also touch on the critical aspects of data privacy and governance, ensuring your data stays secure while maximizing its utility.
If you’re aiming to get the most out of your RAG systems or just curious about the behind-the-scenes work that makes them effective, this episode is packed with insights that can help you level up your game.
# RAG
# Named Entity Recognition
# Tonic.ai
Markus Stoll & Demetrios Brinkmann · Sep 3rd, 2024
This talk is about how data visualization and embeddings can support you in understanding your machine-learning data. We explore methods to structure and visualize unstructured data like text, images, and audio for applications ranging from classification and detection to Retrieval-Augmented Generation. By using tools and techniques like UMAP to reduce data dimensions and visualization tools like Renumics Spotlight, we aim to make data analysis for ML easier. Whether you're dealing with interpretable features, metadata, or embeddings, we'll show you how to use them all together to uncover hidden patterns in multimodal data, evaluate the model performance for data subgroups, and find failure modes of your ML models.
# Data Visualization
# RAG
# Renumics
Sean Morgan & Demetrios Brinkmann · Aug 30th, 2024
MLSecOps, the practice of integrating security into the AI/ML lifecycle (think infusing MLOps with DevSecOps practices), is a critical part of any team’s AI Security Posture Management. In this talk, we’ll discuss how to threat model realistic AI/ML security risks, how you can measure your organization’s AI Security Posture, and most importantly how you can improve that security posture through the use of MLSecOps.
# MLSecOps
# AISPM
# Protect AI