MLOps Community
Mats Eikeland Mollestad · Sep 16th, 2025
Smoke Testing for ML Pipelines
Machine learning pipelines are vulnerable to data and infrastructure errors that can disrupt production. By implementing smoke tests with both random and controlled synthetic data, teams can validate pipeline functionality and schema adherence before running full-scale jobs. This practice supports continuous integration and delivery, leading to fewer outages and more reliable deployments.
# ML Testing
# CI/CD
# Machine Learning
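The smoke-testing idea above can be sketched in a few lines: run the pipeline on a tiny batch of synthetic inputs and check row counts and output schema before launching the full-scale job. This is a minimal illustration, not code from the talk; the pipeline, column names, and schema here are all hypothetical.

```python
import random

# Hypothetical output schema the smoke test enforces.
EXPECTED_OUTPUT_COLUMNS = {"user_id", "score"}

def make_synthetic_rows(n, seed=None):
    """Random synthetic inputs; passing a fixed seed gives the
    'controlled' variant for reproducible checks."""
    rng = random.Random(seed)
    return [
        {"user_id": i, "age": rng.randint(18, 90), "clicks": rng.randint(0, 50)}
        for i in range(n)
    ]

def pipeline(rows):
    """Stand-in for the real scoring pipeline."""
    return [
        {"user_id": r["user_id"], "score": r["clicks"] / max(r["age"], 1)}
        for r in rows
    ]

def smoke_test_pipeline():
    # Run on a tiny synthetic batch before the full-scale job.
    out = pipeline(make_synthetic_rows(10, seed=42))
    assert len(out) == 10, "pipeline dropped or duplicated rows"
    for row in out:
        assert set(row) == EXPECTED_OUTPUT_COLUMNS, f"schema drift: {set(row)}"
    return True
```

Wired into CI, a check like this catches schema drift and broken transforms in seconds instead of after an expensive full run.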
Hudson Buzby & Demetrios Brinkmann · Sep 12th, 2025
For better or for worse, machine learning has traditionally escaped the gaze of security and infrastructure teams, operating outside traditional DevOps practices and not always adhering to organizations' development or security standards. With the introduction of open source catalogs like HuggingFace and Ollama, a new standard has been established for locating, identifying, and deploying machine learning and AI models. But with this new standard comes a plethora of security, governance, and legal challenges that organizations need to address before they can comfortably allow developers to freely build and deploy ML/AI applications. In this conversation, Hudson Buzby and Demetrios Brinkmann discuss ways that enterprise-scale organizations are addressing these challenges to safely and securely build these development environments.
# Generative AI
# Security and Governance
# JFrog
59:23
Elma O'Sullivan-Greene · Sep 11th, 2025
Building better agents fast: real stories, lean workflows, and practical tips for building trustworthy, human-friendly agents in accounting and beyond.
# AI Agents
# Biomedical Models
# MyOB
39:07
George Chouliaras, Antonio Castelli & Zeno Belligoli · Sep 9th, 2025
We share a pragmatic framework for evaluating LLM-powered applications in production. Anchored in high-quality human labels and a calibrated ‘LLM-as-judge’ approach, it turns subjective outputs into consistent, actionable metrics—enabling continuous monitoring, faster iteration, and safer launches at scale. We distill lessons from a year of building and operating this framework at Booking.com, with the aim to make evaluation a core practice in the GenAI development lifecycle.
# Gen AI
# Evaluation
# LLMs
# LLM Evaluation
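The calibration step the abstract describes, anchoring an LLM-as-judge in high-quality human labels, reduces to a simple check: only trust the judge for continuous monitoring if its verdicts track the human gold labels closely enough. A minimal sketch, with illustrative function names and threshold (not Booking.com's actual framework):

```python
def agreement(judge_labels, human_labels):
    """Fraction of items where the LLM judge matches the human gold label."""
    assert len(judge_labels) == len(human_labels), "label lists must align"
    hits = sum(j == h for j, h in zip(judge_labels, human_labels))
    return hits / len(human_labels)

def judge_is_calibrated(judge_labels, human_labels, threshold=0.8):
    """Gate: the judge is usable for monitoring only if it agrees with
    humans at or above the chosen threshold (0.8 here is arbitrary)."""
    return agreement(judge_labels, human_labels) >= threshold
```

Once the gate passes on a held-out labeled set, the judge's scores can serve as the consistent, actionable metric for ongoing monitoring between human-labeling rounds.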
Nishikant Dhanuka & Demetrios Brinkmann · Sep 5th, 2025
Nishikant Dhanuka talks about what it really takes to make AI agents useful—especially in e-commerce and productivity. From making them smarter with context (like user history and real-time data) to mixing chat and UI for smoother interactions, he breaks down what’s working and what’s not. He also shares why evals matter, how to test with real users, and why AI only succeeds when it actually makes life easier, not more complicated.
# Context Engineering
# AI Engineering
# Prosus Group
52:37
As AI agents like Claude and Cursor integrate into enterprise workflows, organizations face critical security challenges around safe resource access. The Model Context Protocol (MCP) is establishing communication standards, while OAuth 2.1 and token exchange mechanisms provide authentication frameworks to balance AI capabilities with enterprise security requirements for sensitive corporate data.
# AI Agents
# MCP
# AI Security
# Machine Learning
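The token-exchange mechanism mentioned above is standardized in RFC 8693: the agent presents the user's token to the authorization server and receives a narrowly-scoped token to use against a downstream resource such as an MCP server. A sketch of the request body, assuming hypothetical endpoint, audience, and scope names:

```python
# Illustrative only: the endpoint, audience, and scope are made up.
TOKEN_ENDPOINT = "https://auth.example.com/oauth2/token"  # hypothetical

def build_token_exchange_request(subject_token: str, audience: str) -> dict:
    """Form parameters for an OAuth 2.0 Token Exchange (RFC 8693) request,
    as used to swap a user's access token for a least-privilege token the
    AI agent can present to an MCP resource server."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": subject_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "audience": audience,   # the downstream MCP resource server
        "scope": "docs.read",   # hypothetical least-privilege scope
    }
```

The point of the exchange is that the agent never holds the user's broad credential; it only ever sees a token scoped to the one resource it needs.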
Joel Horwitz & Demetrios Brinkmann · Sep 1st, 2025
We’re entering a new era in marketing—one powered by AI agents, not just analysts. The rise of tools like Clay, Karrot.ai, 6sense, and Mutiny is reshaping how go-to-market (GTM) teams operate, making room for a new kind of operator: the GTM engineer. This hybrid role blends technical fluency with growth strategy, leveraging APIs, automation, and AI to orchestrate hyper-personalized, scalable campaigns. No longer just marketers, today’s GTM teams are builders—connecting data, deploying agents, and fine-tuning workflows in real time to meet buyers where they are. This shift isn’t just evolution—it’s a replatforming of the entire GTM function.
# Agentic AI
# AI Agents
# Neoteric3D
48:57
Kelly Hong, Adam Becker, Matt Squire & 2 more speakers · Sep 1st, 2025
When Bigger Isn’t Always Better: How Context Length Can Break Your LLM
Longer context windows are the new bragging rights in LLMs, now stretching into the millions of tokens. But can models really handle the first and the 10,000th token equally well?
# Context Windows
# LLMs
# Prompt Engineering
1:00:29
Sonam Gupta, Adam Becker, Nehil Jain & 1 more speaker · Sep 1st, 2025
This paper challenges the LLM-dominant narrative and makes the case that small language models (SLMs) are not only sufficient for many agentic AI tasks; they're often better. As agentic AI systems become more common, handling repetitive, task-specific operations, giant models may be overkill. The authors argue that:
- SLMs are faster, cheaper, and easier to deploy
- Most agentic tasks don't require broad general intelligence
- SLMs can be specialized and scaled with greater control
- Heterogeneous agents (using both LLMs and SLMs) offer the best of both worlds
They even propose an LLM-to-SLM conversion framework, paving the way for more efficient agent design.
# Small Language Models
# Agentic AI
# LLMs
58:13
Nikolaos Vasiloglou & Demetrios Brinkmann · Aug 27th, 2025
Nikolaos's widely shared analysis on LinkedIn highlighted key insights across agentic AI, scaling laws, LLM development, and more. Now he's exploring how AI itself might be trained to automate this process in the future, offering a glimpse into how researchers could harness LLMs to synthesize conferences like NeurIPS in real time.
# NeurIPS
# Deep Learning
# RelationalAI
57:36