MLOps Community

Collections

All Collections
AIQCON SAN FRANCISCO 2024
39 Items
Blog
161 Items
MLOps IRL
87 Items
MLOps Community Podcast
222 Items
AI in Production
63 Items
ROUNDtable
5 Items
MLOps Community Mini Summit
7 Items
LLMs in Production Conference Part III
35 Items
LLMs in Production Conference Part II
63 Items
LLMs in Production Conference
24 Items

All Content

Popular topics
# LLM in Production
# LLMs
# AI
# Rungalileo.io
# Machine Learning
# LLM
# Interview
# Tecton.ai
# Arize.com
# mckinsey.com/quantumblack
# RAG
# Redis.io
# Zilliz.com
# Humanloop.com
# Snorkel.ai
# Redis.com
# Wallaroo.ai
# MLOps
Shiva Bhattacharjee · Sep 13th, 2024
Alignment is Real // Shiva Bhattacharjee // MLOps Podcast for YouTube
If the off-the-shelf model can understand and solve a domain-specific task well enough, either your task isn't that nuanced or you have achieved AGI. We discuss when fine-tuning is necessary over prompting, and how we have created a sampling, feedback-collection, and fine-tuning loop to build models that seem to perform exceedingly well on domain-specific tasks. A minimal sketch of such a loop follows this entry.
# DSPy
# AI Infrastructure
# TrueLaw Inc
38:36
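The sampling, feedback collection, and fine-tuning loop Shiva describes can be summarized in a few lines of Python. This is a minimal, hypothetical sketch, not TrueLaw's implementation: the `generate`, `fine_tune`, and `collect_feedback` callables stand in for whatever serving, labeling, and training stack a team actually runs.

```python
# Hypothetical sketch of a sample -> collect feedback -> fine-tune loop.
# The callables are stand-ins for a real inference service, a human or
# LLM-as-judge feedback step, and a fine-tuning job.
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

@dataclass
class Feedback:
    usable: bool                     # did the expert accept (or fix) the output?
    corrected: Optional[str] = None  # expert-provided correction, if any

def improvement_loop(
    generate: Callable[[str], str],                                    # current model
    fine_tune: Callable[[List[Tuple[str, str]]], Callable[[str], str]],
    collect_feedback: Callable[[str, str], Feedback],
    task_prompts: List[str],
    rounds: int = 3,
) -> Callable[[str], str]:
    """Repeat: sample outputs, collect expert feedback, fine-tune on the keepers."""
    model = generate
    for _ in range(rounds):
        # 1. Sample: run the current model over domain-specific prompts.
        candidates = [(p, model(p)) for p in task_prompts]
        # 2. Collect feedback: experts accept, reject, or correct each output.
        curated = []
        for prompt, output in candidates:
            fb = collect_feedback(prompt, output)
            if fb.usable:
                curated.append((prompt, fb.corrected or output))
        # 3. Fine-tune on the curated pairs and swap in the new model.
        model = fine_tune(curated)
    return model
```

The point of the loop is that each round's corrections become the next round's training data, which is what lets a narrow, domain-specific model keep improving.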
Vikram Rangnekar & Demetrios Brinkmann · Sep 11th, 2024
LLMClient is a new way to build complex workflows with LLMs. It's a TypeScript library based on the research behind the Stanford DSP paper. Concepts such as prompt signatures, prompt tuning, and composable prompts help you build RAG- and agent-powered ideas that have until now been hard to build and maintain. LLMClient is designed for production usage. A plain-Python sketch of the prompt-signature idea follows this entry.
# LLMClient
# DSP paper
# AX
50:51
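LLMClient itself is TypeScript, and the sketch below is not its API. It is only a plain-Python illustration, under assumed names, of the prompt-signature idea from the DSP line of work: declare a task's inputs and outputs once, then compose the resulting step with other steps and an LLM call.

```python
# Illustrative-only sketch of the "prompt signature" concept.
# This is NOT the LLMClient / ax API; Signature and compose are hypothetical.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Signature:
    inputs: List[str]     # e.g. ["context", "question"]
    outputs: List[str]    # e.g. ["answer"]
    instruction: str      # task description used to render the prompt

    def render(self, **values: str) -> str:
        filled = "\n".join(f"{k}: {values[k]}" for k in self.inputs)
        wanted = ", ".join(self.outputs)
        return f"{self.instruction}\n{filled}\nRespond with: {wanted}"

def compose(sig: Signature, llm: Callable[[str], str]) -> Callable[..., str]:
    """Turn a signature plus an LLM call into a reusable, composable step."""
    return lambda **values: llm(sig.render(**values))

# Usage: a RAG answer step built from a signature; `llm` is any prompt->text fn.
rag_answer = compose(
    Signature(["context", "question"], ["answer"],
              "Answer the question using only the given context."),
    llm=lambda prompt: "(model output here)",
)
```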
Vishakha Gupta · Sep 10th, 2024
The ability to semantically search for a concept, summarize a response, and point to relevant links is exactly why large language model (LLM) and retrieval augmented generation (RAG) methods have become so popular. Our LangChain-based implementation uses ApertureDB under the covers as the vector store/retriever for high-performance look-up of documents that are semantically similar to the user’s query. Now we can look at the questions that resulted in insufficient or incorrect responses and introduce helpful and accurate information where it belongs. Ultimately, if we can help our users find guidance easily, then it's a win for everyone.
# Vector Database
# RAG
# Usability
# ApertureDB
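As a rough illustration of the retriever pattern described above (embed documents, look up the ones semantically closest to the query, then ground the LLM's answer in them), here is a self-contained sketch. The in-memory `VectorStore` is a toy stand-in: in the talk's setup the store is ApertureDB behind LangChain's retriever interface, not this class.

```python
# Toy vector store + retrieve-then-generate flow; illustrative only.
# `embed` and `llm` are caller-supplied stand-ins for a real embedding model
# and a real LLM call.
import math
from typing import Callable, List, Tuple

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class VectorStore:
    def __init__(self, embed: Callable[[str], List[float]]):
        self.embed = embed
        self.docs: List[Tuple[List[float], str]] = []

    def add(self, texts: List[str]) -> None:
        self.docs.extend((self.embed(t), t) for t in texts)

    def retrieve(self, query: str, k: int = 4) -> List[str]:
        q = self.embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[0]), reverse=True)
        return [text for _, text in ranked[:k]]

def answer(query: str, store: VectorStore, llm: Callable[[str], str]) -> str:
    """Retrieve semantically similar docs, then ground the LLM's answer in them."""
    context = "\n\n".join(store.retrieve(query))
    return llm(f"Answer using only this context.\n\n{context}\n\nQuestion: {query}")
```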
This talk is about how data visualization and embeddings can support you in understanding your machine-learning data. We explore methods to structure and visualize unstructured data like text, images, and audio for applications ranging from classification and detection to Retrieval-Augmented Generation. By using tools and techniques like UMAP to reduce data dimensions and visualization tools like Renumics Spotlight, we aim to make data analysis for ML easier. Whether you're dealing with interpretable features, metadata, or embeddings, we'll show you how to use them all together to uncover hidden patterns in multimodal data, evaluate the model performance for data subgroups, and find failure modes of your ML models.
# Data Visualization
# RAG
# Renumics
50:39
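A minimal sketch of the workflow this talk describes, assuming the umap-learn, pandas, and renumics-spotlight packages: project high-dimensional embeddings to 2D with UMAP, then browse them next to the raw data in Spotlight. The random embeddings are placeholders, and exact arguments may differ between package versions.

```python
# Reduce embeddings with UMAP, then inspect them in Renumics Spotlight.
# Stand-in data; check current package docs for version-specific details.
import numpy as np
import pandas as pd
import umap                      # pip install umap-learn
from renumics import spotlight   # pip install renumics-spotlight

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(500, 384))       # stand-in for real text embeddings
texts = [f"document {i}" for i in range(500)]  # stand-in for the raw documents

# Project the embeddings to 2D so clusters and outliers become visible.
coords = umap.UMAP(n_components=2, random_state=0).fit_transform(embeddings)

df = pd.DataFrame({
    "text": texts,
    "umap_x": coords[:, 0],
    "umap_y": coords[:, 1],
})

# Open an interactive view to inspect subgroups and potential failure modes.
spotlight.show(df)
```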
Sean Morgan & Demetrios Brinkmann · Aug 30th, 2024
MLSecOps, the practice of integrating security into the AI/ML lifecycle (think MLOps infused with DevSecOps), is a critical part of any team's AI Security Posture Management. In this talk, we'll discuss how to threat model realistic AI/ML security risks, how you can measure your organization's AI security posture, and most importantly, how you can improve that posture through MLSecOps. One concrete example of such a control is sketched after this entry.
# MLSecOps
# AISPM
# Protect AI
42:36
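To make one piece of this concrete: a common MLSecOps control is scanning serialized model artifacts before they are ever deserialized, since a pickled model can execute arbitrary code on load. The sketch below is an illustrative, standard-library-only heuristic, not Protect AI's tooling.

```python
# Heuristic static scan of a pickle file for risky imports, run *before*
# anything calls pickle.load on an untrusted model artifact. Illustrative only.
import pickletools
from typing import List

SUSPICIOUS = {"os", "subprocess", "builtins", "posix", "nt", "socket"}

def scan_pickle(path: str) -> List[str]:
    """Return risky-looking imports referenced by GLOBAL/STACK_GLOBAL opcodes."""
    findings: List[str] = []
    recent_strings: List[str] = []  # STACK_GLOBAL pulls module/name off the stack
    with open(path, "rb") as f:
        data = f.read()
    for opcode, arg, _pos in pickletools.genops(data):
        if isinstance(arg, str):
            recent_strings.append(arg)
        if opcode.name == "GLOBAL":                 # arg looks like "module name"
            findings.append(str(arg))
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            findings.append(" ".join(recent_strings[-2:]))  # heuristic reconstruction
    return [f for f in findings if f.split(" ")[0].split(".")[0] in SUSPICIOUS]

# Usage: block deployment if the artifact reaches into os/subprocess/etc.
# if scan_pickle("model.pkl"): raise RuntimeError("unsafe model artifact")
```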
GraphRAG (by way of Neo4j in this case) improves faithfulness (the RAGAS metric most similar to precision) compared to vector-based RAG, but does not significantly lift the other retrieval-related RAGAS metrics; given the performance overhead, it may not offer enough ROI to justify the hype around its accuracy benefits. A sketch of how such a comparison can be scored with RAGAS follows this entry.
# GraphRAG
# Retrieval Database
# Vector Database
# The Objective AI
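For readers who want to reproduce this kind of comparison, here is a hedged sketch of scoring two pipelines with RAGAS faithfulness. It assumes the ragas 0.1-style API (an LLM key configured in the environment and a dataset with question/answer/contexts columns); newer releases rename these, and the two pipelines are stand-in callables, not the actual systems compared in the talk.

```python
# Hedged sketch: score a vector-RAG and a GraphRAG pipeline with RAGAS
# faithfulness. Assumes ragas 0.1-style column names and evaluate() signature;
# `vector_rag` / `graph_rag` are stand-ins returning (retrieved_contexts, answer).
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness

QUESTIONS = ["What does entity A supply to entity B?", "Who approved contract C?"]

def score(pipeline, name: str) -> None:
    rows = {"question": [], "answer": [], "contexts": []}
    for q in QUESTIONS:
        contexts, answer = pipeline(q)
        rows["question"].append(q)
        rows["answer"].append(answer)
        rows["contexts"].append(contexts)   # list of retrieved passages per question
    result = evaluate(Dataset.from_dict(rows), metrics=[faithfulness])
    print(name, result)

# score(vector_rag, "vector RAG")
# score(graph_rag, "GraphRAG (Neo4j)")
```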
Harcharan Kabbay & Demetrios Brinkmann · Aug 27th, 2024
The discussion begins with a brief overview of the Retrieval-Augmented Generation (RAG) framework, highlighting its significance in enhancing AI capabilities by combining retrieval mechanisms with generative models. The podcast then explores the integration of MLOps, focusing on best practices for embedding the RAG framework into a CI/CD pipeline: robust monitoring, effective version control, and automated deployment processes that keep AI applications agile and efficient.

A significant portion of the conversation is dedicated to the importance of automation in platform provisioning, emphasizing tools like Terraform. The discussion extends to application design, covering essential elements such as key vaults, configuration management, and strategies for seamless promotion across environments (development, testing, and production), as well as how to strengthen an application's security posture through network firewalls, key rotation, and other measures.

The episode also covers how Kubernetes and related tools support good application design, including proper observability and the elimination of single points of failure. Harcharan shares strategies for reducing development time, such as reusable GitHub repository templates per application type and pull-request templates, which minimize human error and streamline the development process.
# GenAI Applications
# RAG
# CI/CD Pipeline
1:05:02
Korri Jones, Sonam Gupta, Nehil Jain & 1 more speaker · Aug 26th, 2024
# Long Context Language Models
# RAG
# SQL
49:25
Nicolas Mauti & Demetrios Brinkmann · Aug 23rd, 2024
Need a feature store for your AI/ML applications but overwhelmed by the multitude of options? Think again. In this talk, Nicolas shares how they solved this issue at Malt by leveraging the tools they already had in place. From ingestion to training, Nicolas provides insights on how to transform BigQuery into an effective feature management system. We cover how Nicolas' team designed their feature tables and addressed challenges such as monitoring, alerting, data quality, point-in-time lookups, and backfilling. If you’re looking for a simpler way to manage your features without the overhead of additional software, this talk is for you. Discover how BigQuery can handle it all!
# BigQuery
# Feature Store
# Malt
50:39
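The trickiest piece Nicolas mentions, point-in-time lookups, comes down to a query shape like the one below. This is a hedged sketch using the google-cloud-bigquery client; the project, dataset, table, and column names are hypothetical placeholders, not Malt's actual schema.

```python
# Hedged sketch of a point-in-time feature lookup in BigQuery.
# Table and column names (`training.labels`, `features.freelancer_daily`,
# `entity_id`, `feature_ts`, `label_ts`) are hypothetical placeholders.
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()

# For each (entity, label timestamp) pair, keep only the newest feature row
# that already existed at that timestamp -- this is what prevents label leakage.
POINT_IN_TIME_SQL = """
SELECT
  l.entity_id,
  l.label_ts,
  f.* EXCEPT (entity_id)
FROM `my_project.training.labels` AS l
JOIN `my_project.features.freelancer_daily` AS f
  ON f.entity_id = l.entity_id
 AND f.feature_ts <= l.label_ts
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY l.entity_id, l.label_ts
  ORDER BY f.feature_ts DESC
) = 1
"""

training_df = client.query(POINT_IN_TIME_SQL).to_dataframe()
```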
Andy McMahon & Demetrios Brinkmann · Aug 20th, 2024
As we move from MLOps to LLMOps, we need to double down on fundamental software engineering practices, and augment them with some new techniques. This episode digs into exactly that.
# MLOps
# LLMOps
# Barclays
1:10:18
Popular
Building LLM Applications for Production
Chip Huyen & Demetrios Brinkmann