Gideon Mendels & Demetrios Brinkmann · Oct 18th, 2024
When building LLM applications, developers need a hybrid approach that draws on both ML and software engineering best practices. They need to define evaluation metrics and track all of their experiments to see what is and is not working. They also need to define comprehensive unit tests for their particular use case so they can confidently check whether their LLM app is ready to be deployed.
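The testing idea above can be sketched in a few lines of plain Python. This is a minimal illustration, not Comet's tooling: the `summarize` stub, the test cases, and the keyword-recall metric are all hypothetical stand-ins for a real model call and a real eval suite.

```python
def summarize(text: str) -> str:
    # Placeholder for a real LLM call (e.g. an API request).
    return "Quarterly revenue grew 12% year over year."

# Each test case pairs an input with keywords the output must contain.
TEST_CASES = [
    {"input": "Revenue rose 12% YoY ...", "must_include": ["revenue", "12%"]},
]

def keyword_recall(output: str, keywords: list[str]) -> float:
    """Fraction of required keywords present in the output."""
    hits = sum(1 for kw in keywords if kw.lower() in output.lower())
    return hits / len(keywords)

def run_suite(threshold: float = 1.0) -> bool:
    """Return True only if every case meets the recall threshold."""
    return all(
        keyword_recall(summarize(case["input"]), case["must_include"]) >= threshold
        for case in TEST_CASES
    )
```

Running `run_suite()` before each deploy turns "does the app still work?" into a tracked, repeatable check rather than a vibe.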
# LLMs
# Engineering best practices
# Comet ML
Raj Rikhy · Oct 15th, 2024
In this MLOps Community podcast, Demetrios chats with Raj Rikhy, Principal Product Manager at Microsoft, about deploying AI agents in production. They discuss starting with simple tools, setting clear success criteria, and deploying agents in controlled environments for better scaling. Raj highlights real-time uses like fraud detection and optimizing inference costs with LLMs while stressing human oversight during early deployment to manage LLM randomness. The episode offers practical advice on deploying AI agents thoughtfully and efficiently, avoiding over-engineering and integrating AI into everyday applications.
# AI agents in production
# LLMs
# AI
Jelmer Borst, Daniela Solis & Demetrios Brinkmann · Oct 8th, 2024
Like many companies, Picnic started out with a small, central data science team. As the team grows and takes on more complex models, questions arise about skill sets and organisational setup:
Use an ML platform, or build one ourselves?
A central team vs. embedded?
Hire data scientists vs. ML engineers vs. MLOps engineers?
How to foster a team culture of end-to-end ownership?
How to balance short-term & long-term impact?
# Recruitment
# Growth
# Picnic
Francisco Ingham & Demetrios Brinkmann · Oct 4th, 2024
Being LLM-native is becoming a key differentiator among companies in vastly different verticals. Everyone wants to use LLMs, and everyone wants to be on top of the current tech, but what does it really mean to be LLM-native?
Being LLM-native involves two ends of a spectrum. On one end is the product or service the company offers, which presents many automation opportunities. LLMs can be applied strategically to scale at lower cost and offer a better experience for users.
But being LLM-native involves not only the company's customers; it also involves every stakeholder in the company's operations. How can employees integrate LLMs into their daily workflows? How can we as developers leverage the advancements in the field not only as builders but as adopters?
We will tackle these and other key questions for anyone looking to capitalize on the LLM wave, prioritizing real results over the hype.
# LLM-native
# RAG
# Pampa Labs
Tom Sabo, Matt Squire, Vaibhav Gupta & 1 more speaker · Oct 3rd, 2024
Bending the Rules: How to Use Information Extraction Models to Improve the Performance of Large Language Models
Generative AI and Large Language Models (LLMs) are revolutionizing technology and redefining what's possible with AI. Harnessing the power of these transformative technologies requires careful curation of data to perform in both cost-effective and accurate ways. Information extraction models including linguistic rules and other traditional text analytics approaches can be used to curate data and aid in training, fine-tuning, and prompt-tuning, as well as evaluating the results generated by LLMs. By combining linguistic rule-based models with LLMs through this multi-modal approach to AI, we can help to improve the quality and accuracy of LLMs and enable them to perform better on various tasks while cutting costs. We will demonstrate this innovation with a real-world example in public comment analysis.
Scaling Large Language Models in Production
Open source models have made running your own LLM accessible to many people. It's pretty straightforward to set up a model like Mistral with a vector database and build your own RAG application. But making it scale to high traffic demands is another story. LLM inference itself is slow, and GPUs are expensive, so we can't simply throw hardware at the problem. Once you add things like guardrails to your application, latencies compound.
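The basic RAG loop referred to above can be sketched as follows. This is a toy, assuming nothing about the talk's actual stack: the bag-of-words "embedding" stands in for a real embedding model, and the in-memory list stands in for a vector database.

```python
import math
import re
from collections import Counter

DOCS = [
    "Mistral is an open-weight large language model.",
    "A vector database stores embeddings for similarity search.",
    "GPU inference cost dominates LLM serving budgets.",
]

def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Ground the LLM call in the retrieved context."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

At scale, every step here becomes a bottleneck in its own right: embedding throughput, index latency, and above all the final generation call, which is where the GPU cost lives.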
BAML: Beating OpenAI's Structured Outputs
We created a new programming language that allows us to help developers using LLMs get higher quality results out of any model. For example, in many scenarios, we can match GPT-4o performance with GPT-4o-mini using BAML. We'll discuss some of the algorithms that BAML uses, how they improve the accuracy of models, and why function calling is good and bad.
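For readers unfamiliar with the problem space, the general pattern behind structured outputs can be sketched in plain Python (this is not BAML and makes no claims about its algorithms): declare the shape you want, then coerce the model's raw, possibly chatty reply into it. The `Invoice` schema and the hard-coded reply are illustrative.

```python
import json
import re
from dataclasses import dataclass

@dataclass
class Invoice:
    vendor: str
    total: float

def parse_invoice(raw_reply: str) -> Invoice:
    """Extract the first JSON object from a possibly chatty reply."""
    match = re.search(r"\{.*\}", raw_reply, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    data = json.loads(match.group(0))
    # Coerce types rather than trusting the model's formatting.
    return Invoice(vendor=str(data["vendor"]), total=float(data["total"]))

reply = 'Sure! Here you go: {"vendor": "Acme", "total": "41.50"}'
invoice = parse_invoice(reply)
```

Tolerant parsing like this is one reason smaller models can close the gap: the schema absorbs formatting noise the model would otherwise be penalized for.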
# LLMs
# RAG
# BAML
# SAS
Simba Khadder & Demetrios Brinkmann · Oct 1st, 2024
Simba dives into how feature stores have evolved and how they now intersect with vector stores, especially in the world of machine learning and LLMs. He breaks down what embeddings are, how they power recommender systems, and why personalization is key to improving LLM prompts. Simba also sheds light on the difference between feature and vector stores, explaining how each plays its part in making ML workflows smoother. Plus, we get into the latest challenges and cool innovations happening in MLOps.
# Feature Stores
# LLMs
# Featureform
Nehil Jain, Sonam Gupta, Matt Squire & 2 more speakers · Sep 30th, 2024
In our September 12th MLOps Community Reading Group session, live-streamed at the Data Engineering for AI/ML Virtual Conference, we covered the paper "Hybrid RAG: Integrating Knowledge Graphs and Vector Retrieval for Information Extraction." The panel discussed using hybrid RAGs to pull information from unstructured financial documents. The paper's approach combines knowledge graphs with vector-based retrieval for better results. We critiqued the context-mixing methods and lack of graph pruning techniques. Overall, it was a solid session with great insights on improving financial data extraction using hybrid models.
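The hybrid idea discussed in the session can be illustrated with a toy score blend; the weights, document ids, and entity links below are invented for illustration and are not from the paper.

```python
# Similarity scores a vector retriever might return per document id.
vector_scores = {"doc1": 0.82, "doc2": 0.74, "doc3": 0.31}

# Documents a knowledge graph links to entities related to the query.
kg_related_docs = {"doc2", "doc3"}

def hybrid_rank(alpha: float = 0.7) -> list[str]:
    """Blend the vector score with a KG bonus and sort descending."""
    def score(doc: str) -> float:
        kg_bonus = 1.0 if doc in kg_related_docs else 0.0
        return alpha * vector_scores[doc] + (1 - alpha) * kg_bonus
    return sorted(vector_scores, key=score, reverse=True)

ranking = hybrid_rank()  # doc2 overtakes doc1 thanks to its KG link
```

The paper's actual context-mixing is more involved; the point here is only that graph structure can re-rank results that pure embedding similarity would order differently.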
# RAG
# Knowledge Graphs
# Vector Retrieval
Stefano Bosisio & Demetrios Brinkmann · Sep 27th, 2024
This talk walks through Stefano's experience, as an inspirational source for anyone who wants to start a career in the MLOps sector. Stefano will also introduce his MLOps course on the MLOps Community platform.
# Inspirational Source
# MLOps Course
# Synthesia
Sai Bharath Gottam, Cole Bailey & Stephen Batifol · Sep 24th, 2024
Delivery Hero innovates locally within each department to develop the most effective MLOps practices in that particular context. We also discuss our efforts to reduce redundancy and inefficiency across the company. Hear about our experiences creating multiple micro feature stores within our departments, and our goal to unify these into a Global Feature Store that is more powerful when combined.
# Global Feature Store
# MLOps Practices
# Delivery Hero
Stephen Bailey · Sep 24th, 2024
Data can't be split into microservices! But teams should own their data! But there should be one definition for metrics! But teams can bring their own architectures! Data platform teams have a tough job: they need to find the right balance between creating reliable data services and decentralizing ownership -- and rarely do off-the-shelf architectures end up working as expected. In this talk, I'll discuss Whatnot's journey to providing a suite of data services -- including machine learning, business intelligence, and real-time analytics tools -- that power product features and business operations. Attendees will walk away with a practical framework for thinking about the maturation of these services, as well as patterns we've seen that make a big difference in increasing our adoption while reducing our maintenance load as we've grown.