MLOps Community

Making Your Data Agent-Ready with EnrichMCP // Simba Khadder // Agents in Production 2025

Posted Aug 06, 2025 | Views 27
# Agents in Production
# Enriching MCP
# Featureform

SPEAKER
Simba Khadder
Founder & CEO @ Featureform

Simba Khadder is the Founder & CEO of Featureform. After leaving Google, Simba founded his first company, TritonML. His startup grew quickly and Simba and his team built ML infrastructure that handled over 100M monthly active users. He instilled his learnings into Featureform’s virtual feature store. Featureform turns your existing infrastructure into a Feature Store. He’s also an avid surfer, a mixed martial artist, a published astrophysicist for his work on finding Planet 9, and he ran the SF marathon in basketball shoes.


SUMMARY

Agents are only as useful as the data they can access. EnrichMCP turns your existing data models, like SQLAlchemy schemas, into an agent-ready MCP server. It exposes type-checked, callable methods that agents can discover, reason about, and invoke directly. In this session, we’ll connect EnrichMCP to a live database, run real agent queries, and walk through how it builds a semantic interface over your data. We’ll cover relationship navigation (like user to orders to products), how input and output are validated with Pydantic, and how to extend the server with custom logic or non-SQL sources. Finally, we’ll discuss performance, security, and how to bring this pattern into production.


TRANSCRIPT

Simba Khadder [00:00:00]: Good morning, afternoon, evening, 3am, depending on where you are. Today I'm going to talk about something that I don't think is talked about enough, which is connecting agents to data with MCP. A little bit about me by way of intro: my name is Simba, and I'm the founder and CEO of a company called Featureform. Featureform is a feature store company. Recently we released an open source product called EnrichMCP, which is seeing a lot of usage and getting a lot of buzz. EnrichMCP is a framework built on top of MCP, built pretty much for data APIs. Most companies and enterprises building agents with MCP are really just building data APIs and data connectors, and EnrichMCP makes that much, much easier. The way we interact with the web is fundamentally changing. For the last 20 years, everything's been dominated by what I would describe as SaaS companies, and most SaaS companies, think Jira, Asana, Salesforce, are not really products that people love.

Simba Khadder [00:01:35]: And the reason people don't love them has nothing to do with the company or the quality of the product. It's purely the fact that until now, the only way we could interact with those large amounts of data, tabs, and tasks was manually, point and click. So Jira felt a lot like manually editing a database, and it was extremely painful. Now, especially with MCP, you can connect to all of these applications and actually ask your agent to interact with them. So rather than Salesforce or Asana or Jira being the thing you go into, we're using agents to interact with all of our applications. The app becomes the system of record, and the agents become the thing that we actually interact with. And MCP enabled this. The reason why is that MCP is a standard that we all agreed upon, and it makes it possible for me as a developer at Slack or wherever to build an API server.

Simba Khadder [00:02:47]: Kind of like how I'd build a REST server, and enable anyone to connect agents to it. When most people think of MCP, they think: I can connect to Slack and Gmail and Calendar. That's true and amazing. But the real killer use case for MCP, where we're seeing it actually being used, especially at large enterprise companies today, is as a way to access data. Because it turns out that the LLMs of today, your Claude 4s, are plenty smart enough to handle most of the situations they're in. The thing that is limiting them, the actual bottleneck, is their access to data. When I say data and LLM, the first thing most people think of is RAG, which makes sense. RAG has its use case.

Simba Khadder [00:03:45]: But for a lot of situations, a lot of problems, and a lot of use cases, RAG kind of sucks. I'll give you an example. Let's say I'm building a customer support bot for, I don't know, let's say you're DoorDash. And someone asks, why is my DoorDash taking forever? If you do RAG, you'll typically build it on top of your help center or something similar, some set of documents that you have internally to handle support. What RAG will do is search for relevant paragraphs based on the user query, that your DoorDash is taking forever, put them into context, and let the LLM generate a response. So what you'll end up with is a user asking, why is my order taking forever? And the response will be: here are common reasons for delays. As you can imagine, that is an awful experience as an end user. The thing is, it's not that the LLM wasn't smart enough to solve this.
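The RAG flow being described here (retrieve help-center paragraphs similar to the query, stuff them into context) can be sketched in a few lines. Everything below is illustrative: a toy corpus and a naive word-overlap ranker standing in for a real embedding search.

```python
# A toy sketch of the RAG flow described above: retrieve help-center
# paragraphs by similarity to the user query and put them into context.
# The corpus and the scoring function are illustrative, not a real system.

HELP_CENTER = [
    "Common reasons for delivery delays: restaurant volume, driver availability, weather.",
    "How to update your payment method in the app.",
    "Refunds are processed within 5 to 7 business days.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank docs by naive word overlap with the query (stand-in for embeddings)."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_context(query: str) -> str:
    return "\n".join(retrieve(query, HELP_CENTER))

# The user asks about *their* order, but retrieval can only surface
# generic doc text -- the order itself is never in the corpus.
context = build_context("why is my delivery taking forever")
print(context)
```

This is exactly the failure mode described: the pipeline works, but the best it can ever return is generic documentation, because the user's actual order is not in the retrieval corpus.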

Simba Khadder [00:04:52]: The problem was that the LLM did not have actual access to the order data, the dasher data, the restaurant data, anything. So it just had to do the best it could. What is really happening is that most LLMs are essentially just glorified summarizers. The idea of agentic enrichment, some people call this context engineering, it doesn't really matter what you call it, but the concept is the same: you need to connect your agent to data sources so it can find the relevant data it needs to optimize your context and produce a response. So someone asks, why is my order late? You need a way to access data wherever it is, whether it's behind an API, in Postgres, in Salesforce, wherever. Get that data and feed it into context to actually solve the problem at hand. So if someone asks about an order, you can get the order data.
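A minimal sketch of the enrichment loop being described: instead of searching documents, a data-access tool fetches the actual order record, and that record goes into context. The order store, field names, and the `get_order` tool are hypothetical stand-ins for a real API call or Postgres lookup.

```python
# Sketch of "agentic enrichment": the agent invokes a data-access tool
# to pull the real order record into context, then answers from that.
# The in-memory store below stands in for a database or API.

ORDERS = {
    "ord_123": {"status": "delayed", "eta_minutes": 25, "reason": "restaurant backlog"},
}

def get_order(order_id: str) -> dict:
    """Data-access tool the agent can invoke (would hit a DB or API in reality)."""
    return ORDERS[order_id]

def answer(order_id: str) -> str:
    order = get_order(order_id)  # enrichment step: real data into context
    return (f"Your order is {order['status']} due to {order['reason']}; "
            f"new ETA is about {order['eta_minutes']} minutes.")

reply = answer("ord_123")
print(reply)
```

With the order record in context, the model can answer the user's actual question instead of summarizing generic help-center text.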

Simba Khadder [00:05:48]: This is how things should work. Historically this was done with tools, but MCP provides a standard way to build a data API to actually solve this. This is becoming more and more critical as our use cases move from summarization-style queries, where someone asks a question and you take some paragraphs from a doc and summarize them, to decision making, actually doing things, thinking through lots of data. What's funny is that a lot of people quote most enterprise data as unstructured, which is true, but the reason unstructured data is so big is just that it tends to be things like images and PDFs, which are inherently large. The most valuable data per byte is structured. It's in Postgres or Snowflake or somewhere in a structured form. And until MCP we did not really have a standard way to access that sort of data.

Simba Khadder [00:06:59]: So this isn't a new concept, in the sense that a lot of companies have come to the realization of everything I've said. And the go-to way I'm seeing people try to solve this problem is by taking API specs and doing a one-to-one conversion from API to MCP, and it doesn't work. If you haven't tried it yet, I'll save you the energy, but give it a shot: you will find it does not work. I have yet to find a single enterprise customer that has managed to get this to work outside of trivial use cases. Enterprise APIs have way too many endpoints, so those all become individual tools that the agent has to reason about. Those endpoints tend to have many, many parameters and lots of customization because they're built for developers. Developers benefit from having a ton of parameters, because I can look and choose the ones I care about, but an LLM can get completely inundated with the amount of stuff it needs to reason about in one shot. What you'll find is that for building MCP servers, the most important thing is optimizing descriptions.

Simba Khadder [00:08:06]: It turns out that API docs and OpenAPI specs, which again are built for humans who typically work at the company and can ask questions and reason about things, are not really built for LLMs, and so they don't work very well. Then finally, authorization just inherently works differently for agents. The way you set up your authn and authz for your API may not map, or may not really make sense, for agents. You probably want something a bit higher level. So we built EnrichMCP as a solution for this, based on our own findings. We were actually building an MCP server for ourselves, and we originally tried the API-to-MCP style. It did not work. Then we realized that what was missing was the semantic layer, the semantic information for it to be able to actually understand how our business works.

Simba Khadder [00:09:04]: Essentially, take our business data model. Don't try to dump the whole thing at once; break it down. Say: if I'm an e-commerce store, I have users, I have orders, I have items. This is what my store sells. These are the attributes of all those things. Build a semantic explanation of your business in code. Tie that to serving logic, whether that be API calls or SQL queries. Add the governance directly to the data layer.

Simba Khadder [00:09:31]: It's almost like building an MCP server from the data up, as opposed to working from: hey, I have 50 resource endpoints and 50 tools. Think of it from the perspective of: here's my data model, here's how to retrieve these different bits of data. If you've ever used GraphQL, it's very similar conceptually. The funny thing is that the killer use case for GraphQL is actually the agent, but GraphQL isn't really built for agents. So if you were to rebuild GraphQL for agents, you'd end up with EnrichMCP. It's open source; you can find the repo if you Google EnrichMCP.

Simba Khadder [00:10:12]: I'm going to try to give you a live demo, which is always fun, that explains how this actually works. I've been talking very theoretically; I want to give you an actual example. Let's look at our shop API with a SQLite backend. The first thing I do is define my EnrichMCP server and explain what the server is. Then I define my data model. I have users. Here are their attributes.

Simba Khadder [00:10:50]: A user has a list of orders; that's a relationship. I have products, and products have attributes. I have orders, and orders have attributes. And for all of the relationships, I create resolvers to define how you would resolve them, pretty much like double clicking in. Let me make this a little bigger.
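The data-first pattern being walked through here (entities, attributes, relationships, and resolvers that say how to fetch each hop) can be sketched roughly as below. This is not the actual EnrichMCP API: the real framework builds on Pydantic models and decorators whose exact names may differ, so stdlib dataclasses and plain functions stand in.

```python
# Hedged sketch of the demo's shape: declare the business entities and
# their relationships, then attach a resolver per relationship hop
# (user -> orders -> products). In-memory dicts stand in for SQLite.

from dataclasses import dataclass

@dataclass
class Product:
    id: int
    name: str
    price: float

@dataclass
class Order:
    id: int
    user_id: int
    product_ids: list[int]

@dataclass
class User:
    id: int
    name: str

# Toy "database"; a real server would back these with SQL queries.
PRODUCTS = {1: Product(id=1, name="keyboard", price=49.0)}
ORDERS = {10: Order(id=10, user_id=7, product_ids=[1])}
USERS = {7: User(id=7, name="Ada")}

def resolve_user_orders(user_id: int) -> list[Order]:
    """Resolver for the User -> orders relationship (the 'double click in')."""
    return [o for o in ORDERS.values() if o.user_id == user_id]

def resolve_order_products(order: Order) -> list[Product]:
    """Resolver for the Order -> products relationship."""
    return [PRODUCTS[pid] for pid in order.product_ids]

orders = resolve_user_orders(7)
products = resolve_order_products(orders[0])
print([p.name for p in products])
```

The point of the structure is that each resolver becomes a discoverable, typed tool, so an agent can walk the relationship graph hop by hop instead of facing one giant endpoint.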

Simba Khadder [00:11:13]: This probably looks kind of small. Resolvers let you double click in so that you can see more information if you need it. Once you run this code, you end up with an MCP server; this all gets converted to MCP. So I'll actually connect to that same thing and show you how that looks. The first thing that's kind of interesting is we generate this call, Explore Data Model, that we give to the LLM. Oh, I'm sharing the wrong tab. I'm opening the MCP Inspector now.

Simba Khadder [00:11:51]: The MCP Inspector is just a way to look at tools manually, so you can understand what's actually being created. One of the first things we generate is this Explore Data Model tool. It's completely generated by the framework itself, and when I run it, it creates and describes to the model all the information it needs to understand the business, the business's data model, and what calls to make. All those resolvers, all those calls, the lists and the gets, they all become tools. They're all structured, that structure gets shared with the model, and the parameters can be well defined. This enables the LLM to actually work.

Simba Khadder [00:12:35]: If you build this from scratch with just plain MCP tools, you can kind of take this and rebuild all those things, but what you will find is that you end up copying and pasting a ton, and you'll likely end up with a very unoptimized set of tools and tool descriptions. The nice thing we found is that by being opinionated and forcing you to think of the data first, things tend to work a lot better. Maybe I'll give a live demo of it running with Claude real quick. To do that, I'm going to move over to Claude. Here I have Claude, connected to that same thing I showed you earlier, the EnrichMCP shop.

Simba Khadder [00:13:22]: I'm going to say: find fraudulent transactions in my store. The first thing Claude is going to do is go through that data model we talked about. It's going to understand what it has access to and how to use those things. It will work its way through all the gets and lists; here it gets the user, then it gets the products for that order the user made. It's digging its way through the semantic graph to resolve and find the information it needs. And eventually it finds the fraudulent transaction that I hid in the data. It's wild how well it works if you start with the data first. That was the big learning we had.

Simba Khadder [00:14:13]: The last thing I'll share is a case study we did with a large Fortune 500 customer. They found that none of their MCP servers were really working outside of contrived examples. When they moved them all to EnrichMCP, started with the data first, and created that semantic model, things came together cleanly, and it was the first time they'd really seen the magic of MCP. Kind of like that demo I showed you, where it just worked. They didn't have to massage it. They didn't have to tune the system prompt to make it call the right tools in the right order for the demo to work. It just worked. I think that's the magic.

Simba Khadder [00:14:54]: EnrichMCP is open source. Everything I've shown you is free; anyone can use it. We even have integrations into things like SQLAlchemy. We have memory built in. We have a variety of different concepts inside EnrichMCP that work. Awesome. That's all I have.

Simba Khadder [00:15:11]: Cool, cool.

Skylar Payne [00:15:13]: Thanks so much for coming and sharing. So folks, questions? Throw them in the chat and I'll go ahead and read them off. In the meantime, if folks want to learn more and connect with you, where should they go?

Simba Khadder [00:15:26]: Yeah, I'm on LinkedIn. Easy name to remember: Simba Khadder. You can connect with me there. If you're in San Francisco by chance, we have an MCP hackathon this weekend. We also have an MCP meetup on the 30th. If you connect with me on LinkedIn, I can get you more information about that.

Skylar Payne [00:15:41]: Awesome, awesome. So while we're waiting for maybe some other questions to come in, I guess I can kind of roll through. So, you know, one thing I noticed just looking through EnrichMCP, just curious: it seems like this is purely focused on getting access to data, and not the use cases of doing something. Is that right?

Simba Khadder [00:16:06]: Yeah, I think that's. Yes, we have the ability to write. So it's not just read; writing and updating are all core parts of it. What I'm finding is, think of the APIs that you build in REST. Most APIs are essentially just interfaces over data access, right? Most enterprises building MCP servers are finding that most of what they're doing is the same thing. If you look at, let's say, the Asana MCP server, the whole thing is pretty much just data access.

Simba Khadder [00:16:35]: The idea of doing something, in most cases, is writing a record into a database, for most products. So, yes. But it turns out that that is most of what enterprise MCP servers are anyway.

Skylar Payne [00:16:48]: Totally, totally. That makes a lot of sense. Let me just make sure I'm looking at the chat. One other thing I was thinking might be interesting; I'd love to hear your thoughts on this. I find it really interesting to see you go from building a feature store to building this, partially because one thing I keep having in my head is that there are all these different terms flying around: prompt engineering, context engineering, et cetera, et cetera. And coming from a more traditional ML background, I keep thinking that a lot of this is the same problem that feature stores solved: hey, I have all this stuff that I need to put into my prompt, and we have these things like MCP, which are usually framed in the tool-use sense.

Skylar Payne [00:17:31]: But sometimes I also think like, hey, there's this problem of like how do you even get the information into the prompt that you want? And so just curious if you are thinking about that or how you think about the relationship between like, you know, your journey, building the feature store and now like MCP and other pieces of infra.

Simba Khadder [00:17:50]: It's funny how similar it is. If you look at MLOps and think of the companies that were successful in the MLOps space, there were a variety of different categories that emerged, but I would say the two categories that were best represented were data access and evaluation. And if you ask anyone building LLMs what the hardest parts are: data and eval. So in many ways it's very, very similar. In fact, most teams in my experience that originally built and managed the ML platform also build the AI platform.

Simba Khadder [00:18:31]: So it is a very similar problem. The only difference between a piece of context in the LLM case and a feature in the classical ML case is that with classical models the features are typically hard coded to the model, whereas with agents and LLMs the agent itself has to go discover the features. So the discovery part is new. Obviously prompt engineering is a new concept; there are a lot of new concepts, but there's a lot that is similar. Especially coming from a recommender systems background, the idea of using vector DBs, re-ranking, all that stuff, is not new; that is stuff that we've learned before. And it feels like lots of people building LLMs had to relearn a lot of lessons that were learned before by us, and we stole them from the NLP people as well.

Simba Khadder [00:19:15]: So it's kind of everyone relearning the same sort of problems.

Skylar Payne [00:19:20]: Totally, totally. Yeah. It's similar feelings. I often say that everyone thinks AI is moving really fast, but part of me thinks it's like, it's because we're, like, reinventing the same things over and over again. So, yeah, definitely. Agree. Well, cool. Thank you so much.

Skylar Payne [00:19:39]: Oh, go ahead, Go ahead.

Simba Khadder [00:19:41]: Yeah, I was going to say, the other thing I've found is that things seem to be moving really quickly because there's a lot more hype than there is actual growth and substance. And I don't think this is all hype, by no means. But what I think is happening is that when something kind of works, people blow it up to be like, this is going to change everything. MCP is a good example of this. MCP is a massive step in that we have a standard protocol. But I always joke: could you imagine if people were this excited about REST becoming a thing? REST was a massive step forward in Web 2.0, but there were no LinkedIn influencers talking about REST.

Simba Khadder [00:20:25]: And so I think there's a lot of noise. And the noise is because this is so transformative, but I think there's a lot of people just trying to figure out what's happening, which is why I think conferences like this are so important to, like, share ideas and figure out what's actually working and not working.

Skylar Payne [00:20:39]: Totally, totally agree. Yeah. The whole day has been very information dense. And, you know, your example of hype reminds me: right before your session, we were looking at some memes, and there was this one about the posts you'll see on Twitter that are just like, "I vibe coded an app that teaches coding to dolphins, and now I have 12 mil ARR. I'm only 9 years old." There's just this level of hype. I look at a lot of these posts and I'm just like, I know for a fact the thing you're showing doesn't work the way you're showing it, you know?

Simba Khadder [00:21:11]: Yeah, there are a lot of fake demos, too. Well, not fake, they're just demos, but there are a lot of big steps between demo and reality, and I think that's part of it. I also think that because this is so transformative in general, you see these massive growth stories from companies that happened to be in the right place at the right time. The amount of movement and transformation is like nothing I've seen before, to be honest. So I think it creates a lot of opportunity, but it also creates a ton of confusion.

Skylar Payne [00:21:42]: Totally awesome. Thank you so much for your time. Remember folks, if you want to connect, reach out to him on LinkedIn, Simba Khadder, and also check out EnrichMCP on GitHub.

Simba Khadder [00:21:56]: Awesome. Thank you. See ya.
