MLOps Community

Context Engineering 2.0: MCP, Agentic RAG & Memory // Simba Khadder

Posted Nov 25, 2025
# Agents in Production
# Prosus Group
# MCP
SPEAKER

Simba Khadder
Sr. Manager & Software Engineer @ Redis

Simba Khadder is the Founder & CEO of Featureform, now part of Redis. Following the acquisition, he joined Redis to lead Context Engine, which helps developers deliver the right data at the right time to power next-generation AI and agents.

Before Featureform, Simba founded TritonML after leaving Google, building ML infrastructure that supported over 100M monthly active users. He channeled those learnings into Featureform’s virtual feature store, designed to turn existing infrastructure into a fully managed feature store.

Outside of tech, Simba is an avid surfer, mixed martial artist, published astrophysicist for his work on Planet 9, and once ran the SF marathon in basketball shoes.


SUMMARY

Context Engineering 2.0 treats retrieval, tools, and memory as one surface that agents can navigate. The aim is to make documents, databases, events, and live APIs addressable and navigable through a single MCP-native interface. Think GraphQL for agents. RAG works well for one-shot queries over textual corpora like help centers and docs: with Redis's vector database, users can index, embed, and retrieve relevant chunks. But sources like relational databases and APIs are out of reach through RAG alone. Teams paste large ad hoc JSON objects into prompts, rely on text-to-SQL, or struggle with OpenAPI-to-MCP wrappers; none of this is reliable, and it does not scale across the organization. With Redis Context Engine, we are engineering a better way to expose data to agents: a unified, schema-driven, MCP-native layer that connects all your data and powers real-time, reliable agent workflows. Define a semantic schema, and structured data enters the same path as unstructured text. Agents blend semantic search with structured filters in one call, traverse relationships, call APIs, and keep state via memory. All powered by Redis.


TRANSCRIPT

Simba Khadder [00:00:00]: Calling in from Amsterdam, actually. So I am at the Prosus office with Demetrios, and so I got to jump in last minute today. I'm gonna actually give the same keynote that I gave in London. So if you were in London for that, I'm sorry, but you get to listen to it all over again. But I'll be talking about context engineering 2.0. It's almost funny to call it context engineering 2.0 because I don't even really think we had a 1.0. But the whole concept here is that we're going to talk about the unification of context.

Simba Khadder [00:00:41]: Right now we have RAG, we have memory, we have MCP. We treat them as three different surfaces, and we're also complaining at the same time about tool bloat and a lot of the things that come with that. So I want to talk about a new way to think of context and context engineering. But before that, I should probably tell you who I am, if you don't know. I'm Simba. I'm at Redis, where I lead Context Engine and Feature Store. I'm actually a very new employee at Redis. I came in via the acquisition of Featureform, which I founded many years ago.

Simba Khadder [00:01:17]: Now, Featureform was a feature store; it continues to exist and continues to be invested in by Redis. We also built an open-source project, EnrichMCP, which I'll talk a lot about today, and which kind of encapsulates some of the concepts and ideas of Context Engine. Really, I think how I would describe what I do at Redis is that I focus on connecting models and agents to data. I'm sure most of you have heard of Redis. It's massive, well loved, and continues to grow super quickly. Most of you probably think of Redis as that thing that makes your apps go faster, which is true and continues to be true. But how a lot of you were probably introduced to Redis, or maybe use Redis in your day job, is via concepts like caching, rate limiting, and session stores.

Simba Khadder [00:02:20]: Redis started many years ago with a focus on delivering real-time data for mobile and web. The new focus, and what we've been working on as well, is delivering real-time context for agents and models. We'll break this down, but I'm going to talk about three types of context: structured data, unstructured data, and memory. First, I want to talk about unstructured data. This is RAG, which is probably something that at this point everyone's familiar with. I'll talk about how RAG works, why it continues to work, and why vector DBs aren't dead, though I guess that's a thing people say. They're just one piece of the puzzle.

Simba Khadder [00:03:14]: A lot of the context data that agents and LLMs need to make proper decisions today exists in documents, PDFs, help centers, and knowledge bases that people have across their companies, or maybe it's online on a website somewhere. So we need to be able to let an LLM go search for relevant pieces of text from some corpus to be able to help a user. For example, if someone asks a question about this presentation, well, this presentation isn't really online, so I would need to have private data that I can connect to my LLM via RAG, and I typically do. The way that works is that you take your documents, you chunk them up, you typically put them in some sort of search database or a vector database, and you enable an LLM to essentially run search on it. How it's typically done, and how we think of vector DBs a lot of the time, is around creating embeddings and doing a nearest-neighbor lookup. But there's so much more than that if you think about it. There are a lot of different signals that you might want to use to go search for the right content.
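As a toy illustration of that chunk-then-retrieve loop, here is a minimal sketch that uses bag-of-words cosine similarity in place of a real embedding model and vector database; in a production RAG pipeline you would swap in a sentence-embedding model and a vector store such as Redis.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real pipeline would call an
    # embedding model and store the vectors in a vector database.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    # Rank every chunk by similarity to the query and return the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "Refunds are processed within 5 business days.",
    "Our office is located in Amsterdam.",
    "Password resets are handled on the account page.",
]
print(retrieve("how long do refunds take", chunks))
```

The retrieved chunk is then pasted into the LLM's prompt as context; hybrid retrieval adds extra signals (keyword filters, metadata, recency) on top of this same ranking step.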

Simba Khadder [00:04:37]: And I've seen some really complex queries where you use a lot of hybrid components to make sure that you actually get the most relevant context to the models, which again is what all of this is about. And we're really fast, which is just true to the Redis brand. The other thing that's talked about often is memory. Redis has been playing in this space and has been very successful in memory, I think, because of the same things that made us successful in RAG: we're very fast, and we have a lot of really nice-to-use abstractions built into core Redis that build up well to things like memory support. We have the Redis Agent Memory Server, which allows you to easily manage both long-term and short-term memory for agents. Short-term memory is more about keeping the context of even just the last few messages as a conversation is happening, so the LLM doesn't have to reread everything on every single message; since the model itself has no state, the state is the short-term memory. Then there's long-term memory.
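A minimal sketch of the two memory tiers, with hypothetical names; the Redis Agent Memory Server provides a managed version of this same idea, with short-term memory as a rolling conversation window and long-term memory as durable facts.

```python
from collections import deque

class AgentMemory:
    """Illustrative two-tier memory; names and structure are hypothetical."""

    def __init__(self, window: int = 4):
        self.short_term = deque(maxlen=window)   # rolling conversation window
        self.long_term: dict[str, str] = {}      # durable preferences / facts

    def add_message(self, role: str, text: str) -> None:
        # Old messages fall out automatically once the window is full.
        self.short_term.append((role, text))

    def remember(self, key: str, fact: str) -> None:
        # Long-term facts survive across conversations.
        self.long_term[key] = fact

    def context(self) -> str:
        # What gets prepended to the LLM prompt on each turn.
        prefs = "; ".join(self.long_term.values())
        convo = "\n".join(f"{r}: {t}" for r, t in self.short_term)
        return f"Preferences: {prefs}\n{convo}"

mem = AgentMemory(window=2)
mem.remember("style", "never use em dashes")
mem.add_message("user", "draft a blog post")
mem.add_message("assistant", "Sure, here's a draft.")
mem.add_message("user", "make it shorter")   # evicts the oldest message
print(mem.context())
```

Note that the oldest user message has scrolled out of the short-term window, while the stored preference persists regardless of conversation length.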

Simba Khadder [00:05:58]: For example, if you say something that is clearly a preference of yours, like "I want you to not use em dashes in any content that you generate," long-term memory allows the agent to remember these things that are a little more permanent. They're not exactly permanent, but they're long-term: concepts and facts that are worth remembering for a long period of time, potentially across multiple conversations or even across multiple tools. So that's memory. Finally, and this is, I think, the least talked about part of this, there's structured data for agents. This is probably the part that is the least written about, and I think the most interesting things happening in the space right now are around structured data. One of the more common ways I see it getting done today is text-to-SQL, where you enable an agent to write SQL itself to essentially run queries against Postgres. Now, if that sounds mildly terrifying to you, it should, because it is a little bit terrifying in some cases.

Simba Khadder [00:07:20]: I mean, there's always a way to kind of make these things work, but it's highly nuanced, and in my opinion it's just a really bad idea to go with. One problem, probably the least important one, though people talk about it a lot, is that sometimes the SQL that gets generated is just wrong; it's literally not syntactically correct. More often, just imagine your internal databases at your company. Think of how many tables you have with similar names, or columns with similar names that kind of mean the same thing, where you have a ton of internal knowledge about which columns are actually important, which aren't, which ones you should be using, the peculiarities of each column, and so on. The agent's not going to be able to figure that out in a single shot, and it requires a lot more work to even get your data into a place where an agent could do this in a way that makes sense. Once you get there, then there are a million security issues and access control issues.

Simba Khadder [00:08:31]: You're having agents run queries directly on production data, where other people's data probably lives as well. And on top of all of that, there's performance. Agents might be able to write SQL to do what you, or the user, wants. But we're also talking about prompt injection: a user could inject a prompt here to either make it query something crazy, or even just take down the database by coaxing it to run some recursive query that totally maxes out your CPU. There's just a lot here that can go wrong.

Simba Khadder [00:09:15]: I just think, in general, it's a bad idea to give an agent direct access to a database. So, okay, the next thing I see, and this is becoming more common with the rise of MCP, is: fine, we have this API that already exists, so what we're going to do is just kind of wrap it. Every endpoint becomes an MCP tool, and every parameter becomes a parameter in that tool. I've seen this a lot, and I've yet to see it work outside of contrived use cases. Especially at an enterprise, you end up with way too many tools, and a lot of the tools you actually have to chain together to make them do anything useful. Think of an API that you've built or used, and how many parameters a typical endpoint has. It's typically tons.

Simba Khadder [00:10:05]: And there's a good reason for that: for humans and developers, those parameters give us flexibility. They're actually useful for us, because there are a lot of nuanced things we might need to do with that API. But for an agent, which is essentially reading this thing once in the context window, it almost never works out that it's going to be able to reason about all the tools and all the parameters and use them correctly. You end up wasting a ton of tokens. You end up having it loop, trying to figure out how to use the API, and almost always it doesn't actually work outside of, again, contrived demos. Then there's auth. Getting the auth to map from OpenAPI, or from your REST API, to MCP is a pain.

Simba Khadder [00:10:57]: And I already mentioned API chaining. APIs aren't really built for agents; the REST APIs you have today are not built for agents. What that means is that the abstraction layer they sit on is too low-level. You need something at a higher level, thinking about how an agent will read and reason about your data. So what's the answer?

Adam Becker [00:11:24]: Well.

Simba Khadder [00:11:27]: My idea was kind of going back to basic engineering principles. Let's give it something to work with, let's give it a schema, and let's write code again. So we took a page from Pydantic, or an ORM-style workflow. Really, what we're doing is allowing you to define, at the business-layer abstraction, what your entities are. I have customers, I have orders, I have items. You can define attributes, you can define relationships. This becomes an MCP server. So it speaks MCP, but the protocol it speaks is not actually that important.

Simba Khadder [00:12:08]: What's important is that you are defining your schema from the data first. Rather than having a hundred tools like get customer, list customers, and so on, you're just defining what a customer is: its different attributes, its relationships, and the ways an agent can navigate from there. You're enabling the agent to actually understand your business model, or really your data model, in one go. And then, once you have that kind of semantic catalog, you can make it navigable and retrievable. How that looks in practice: again, it becomes an MCP server.
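A hypothetical sketch of such a schema-first catalog, in the spirit of the ORM-style workflow described above. The class and tool names here are illustrative, not EnrichMCP's actual API: the point is that a single "explore" call can describe every entity, attribute, and relationship at once.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    """One node in the semantic catalog: a business entity."""
    name: str
    attributes: dict[str, type]
    relationships: dict[str, str] = field(default_factory=dict)

# The semantic catalog: customers have orders, orders have items.
catalog = {
    "Customer": Entity("Customer", {"id": int, "email": str},
                       {"orders": "Order"}),
    "Order": Entity("Order", {"id": int, "total": float},
                    {"customer": "Customer", "items": "Item"}),
    "Item": Entity("Item", {"id": int, "sku": str}),
}

def explore_data_model() -> list[str]:
    # The "explore data model" tool: one call that describes the whole
    # schema, so the agent learns the data model in a single shot.
    return [
        f"{e.name}: attrs={list(e.attributes)}, rels={list(e.relationships)}"
        for e in catalog.values()
    ]

for line in explore_data_model():
    print(line)
```

Contrast this with the endpoint-per-tool approach: instead of a hundred narrow tools, the agent gets one navigable description of the data model plus generic list/get/traverse operations over it.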

Simba Khadder [00:12:47]: When someone writes "find me fraudulent transactions," it will first, as you see at the top here (these are all tool calls; if you haven't used Claude, they're the things that have a box with a caret), explore the data model. It tries to understand what's there. Then it starts navigating. It lists orders. It starts going through from the order to the items and the user of the order. It just digs through. I have a video.

Simba Khadder [00:13:15]: We'll see if it works. Might not. All right, well. Oh, there it is. All right, so you can see here what I have: I'm writing "you're a customer support agent, you're looking for fraud," and this is how it actually looks in practice. The first thing it does is that explore-data-model step I talked about, where it goes and navigates what's there.
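That explore-then-navigate loop might look something like the following sketch, where the tool names (list_orders, get_customer) and the fraud heuristic are hypothetical stand-ins for the generated MCP tools, not Redis or EnrichMCP APIs.

```python
# Fake backing data: 7 orders spread across 3 customers.
ORDERS = [{"id": i, "total": 20.0 * i, "customer_id": i % 3}
          for i in range(1, 8)]

def list_orders(page: int = 1, page_size: int = 3) -> list[dict]:
    # Paginated listing tool; an empty page signals the end.
    start = (page - 1) * page_size
    return ORDERS[start:start + page_size]

def get_customer(customer_id: int) -> dict:
    # Relationship traversal tool: order -> customer.
    return {"id": customer_id, "name": f"customer-{customer_id}"}

# Agent-style navigation: page through orders, flag unusually large
# ones, then follow the relationship from each flagged order to its
# customer, exactly the order -> user hop described in the demo.
suspicious = []
page = 1
while True:
    batch = list_orders(page)
    if not batch:
        break
    suspicious += [o for o in batch if o["total"] > 100]
    page += 1

owners = [get_customer(o["customer_id"]) for o in suspicious]
print([o["id"] for o in suspicious], [c["name"] for c in owners])
```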

Simba Khadder [00:13:50]: And then it starts. You see that thing called list orders underneath, and then it calls list orders again for a second page, and you can see it navigating. Put a different way, the reason we all talk about context engineering is that we're realizing that agents and LLMs are inherently intelligent enough to handle most of our problems. The problem we actually have now is: how do we get the right data in front of them to enable them to do the right thing and do something useful? It's like you have this genius thing that kind of knows a little bit about everything.

Simba Khadder [00:14:33]: It's almost like it's read every book in the world, but it just doesn't have any context about what you're working on. So all you need to do is get the right context in. Like we said, the view is that context is all that matters for agents right now. I talked about three different ways of getting context: RAG, memory, and structured data. At your company, I guarantee you, you have unstructured data and you have structured data, and memory is going to be relevant in general to any agentic workflow, especially if it touches users. Right now they all look very different, and there's no way to join them. You can't have structured data attached to documents; you can't have memories that remember something about a document.

Simba Khadder [00:15:26]: And that's where I think the power really comes in. When you start unifying these things together, you have all of this surface, from APIs to data warehouses to user inputs, and we transform them into context to store in our context engine, enabling an LLM to go and retrieve what it needs to solve the problem at hand. If you're building an agent with LangGraph, rather than, again, just having the vector store, the cool thing with Redis specifically is that we already have all these pieces: we have RAG, we have long-term memory, we have short-term memory, and now we have structured data retrieval via MCP. That puts us in a spot where we can unify, where we can create this unified semantic and access layer that lets us get to our structured data, our unstructured data, and our long- and short-term memory, no matter where they live. It provides one pane of glass for your context, where it's no longer structured data versus unstructured data versus memory. It's just context: different types of context, different ways to navigate them, look through them, and remember them.
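As a toy illustration of that single pane of glass, here is a sketch in which one retrieval call fans out across all three context types and returns a single merged payload. All names are hypothetical, and plain substring matching stands in for real semantic and structured search.

```python
def retrieve_context(query: str, sources: dict) -> dict:
    """One entry point for all context, regardless of where it lives."""
    return {
        # Structured rows matching the query (stand-in for a filter).
        "structured": [r for r in sources["rows"] if query in str(r).lower()],
        # Unstructured chunks matching the query (stand-in for RAG).
        "unstructured": [c for c in sources["chunks"] if query in c.lower()],
        # Memory is always carried along with the rest of the context.
        "memory": sources["memory"],
    }

sources = {
    "rows": [{"order": 17, "status": "refund"},
             {"order": 18, "status": "shipped"}],
    "chunks": ["Refund policy: 5 business days.",
               "Shipping takes 2 days."],
    "memory": ["user prefers email updates"],
}
ctx = retrieve_context("refund", sources)
print(ctx)
```

The agent sees one context object per query instead of juggling a vector store, a database connection, and a memory server as three separate surfaces.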

Simba Khadder [00:16:37]: This is the idea, and this is where I think context engineering is going: rather than thinking of all these random different ways of doing things, we're going to think of context and context engineering as a kind of unified concept, as a type of data pipeline in a sense. So with that, that's a little bit about what we're thinking about here at Redis, and I'll stop there for any questions. Thank you.

Adam Becker [00:17:18]: Awesome, Simba, that was incredible. Thank you very much. If folks have questions, now is the time to ask. I see there's already one that showed up on the chat and we're going to be open to a few other ones. Everton asks, I don't know if exactly at what point he asked this. Do you think Trino could help in this issue? I don't remember what issue that was.

Simba Khadder [00:17:41]: I assume we're talking about structured data and how to access it. So this is kind of the same idea: well, why don't we just use Snowflake, or Spark, or whatever? Why don't we just have agents call Spark or Trino? It's the same issue: it's a terrifying thought to give a massive Spark cluster to an agent and let it run whatever it wants on your data. I just think there's a much better workflow, and it's funny, because we already did this in ML. There's a reason feature stores existed, which is that you're materializing views. That's maybe the lower-level data-engineering way to think of what we're doing here: you're creating materialized views. Even a RAG pipeline is a projection, right? You're taking unstructured data, you're chunking it, you're projecting it into embedding space.

Simba Khadder [00:18:28]: That's a materialized view, an unusual one compared to most others, but it is still just a materialized view. So we're creating materialized views of context and letting the agent understand, access, and navigate them.
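A small sketch of that materialized-view idea: the agent-facing context is a curated projection of the raw tables, so columns that were never whitelisted simply do not exist in the view the agent can reach. All column names here are made up for illustration.

```python
# Raw production rows, including a sensitive column the agent
# should never see.
RAW_USERS = [
    {"id": 1, "email": "a@x.com", "ssn": "123-45-6789", "plan": "pro"},
    {"id": 2, "email": "b@x.com", "ssn": "987-65-4321", "plan": "free"},
]

def materialize_user_view(rows: list[dict]) -> list[dict]:
    # Only whitelisted columns make it into the agent-facing view;
    # there is no query an agent can write against the view that
    # reaches the excluded fields.
    allowed = {"id", "plan"}
    return [{k: v for k, v in r.items() if k in allowed} for r in rows]

view = materialize_user_view(RAW_USERS)
print(view)
```

This is the security argument against text-to-SQL in miniature: instead of trusting the agent not to find a sensitive table, you control what is in the view it queries.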

Adam Becker [00:18:41]: Everton is clarifying: he means with access to the data. I think that might be what you're referring to; it's the same thing as writing the SQL against Postgres.

Simba Khadder [00:18:52]: Yes, I think so. Obviously feel free to clarify if we're.

Adam Becker [00:18:58]: So, what is the state? I mean, I love the vision. What is the state of Redis Context Engine now?

Simba Khadder [00:19:04]: It's new. We are kind of in what you could think of as a design-partner stage. Even before Redis, with EnrichMCP, we had been working with large Fortune 500 companies, helping them, and we've seen huge success in actually using the EnrichMCP framework. Part of what I'm personally doing at Redis is bringing this to life here. I think we'll have a lot of really cool things to share soon. If you are an enterprise that is thinking about working on this, or maybe even has a solution to this problem, I would love to chat, and you can reach out to me.

Adam Becker [00:19:51]: Yeah, nice connect with Simba. We have another question here from Saad. What message would you have for R and D teams working in banking, for example, back office, etc. In terms of security and observability, I imagine they mean security and observability with respect to context.

Simba Khadder [00:20:09]: Yeah, it's actually another value prop of that single-pane-of-glass way of thinking. In that last slide I showed, where all the context comes through one pane of glass, observability becomes solvable, because you can see that all the context retrieval is coming through one place; you can see what the agent is doing and why it's doing it. It makes observability a hundred times easier. We're actively adding a lot of really interesting OpenTelemetry hooks into EnrichMCP, which is also part of that. On security, there's another piece, and this goes back to why text-to-SQL is often a bad idea in these use cases: with the materialized-view idea, you control what's in that view. You can choose. You will never end up in situations like, oh no, I didn't realize it was going to be able to find that table.

Simba Khadder [00:21:02]: I didn't even think about that table. And sure, like, there's a lot of like off insecurity stuff you have internally, but agents are smart and sometimes they will do. I mean, if you've used agents for coding, you will have seen it where it will come up with some really clever hacks to get around a wall. You'll essentially hack yourself on accident because it will be like, oh, there's this wall here, I really need that table. But I think it's there. Oh, I have this crazy idea to get around it. And it will do so. They're like very wily, maybe is the right word.

Simba Khadder [00:21:31]: And so with materialized views, you create a nice clean layer to really expose and export the context that you want the agent to have access to, what those things mean, and how you want them to be used.

Adam Becker [00:21:47]: Cool. Whoever did the branding on "single pane of glass," it's beautiful. Well done. So on that note, we've got two minutes before folks move on to the next stage. I wanted to ask you, because of all people I think you'd have strong intuitions on this: you mentioned feature stores. Is the feature store as a concept simply folding into context engineering? Is it context engineering plus some? How should I continue to think about the future of that, even conceptually?

Simba Khadder [00:22:22]: So I'll break it into two answers: one theoretical and one practical. The theoretical answer is that all we've ever been doing is context engineering; we just called it a different term. Or you could argue that context engineering is feature engineering. It doesn't really matter which one is the main one. The idea is the same: we need to create signals, useful features or useful context, for models to be able to work with. Classical ML models only accepted inputs in a specific way, kind of in the same way as LLMs.

Simba Khadder [00:22:54]: It only accepts inputs as text like you eventually Kind of turn it into text or tokens, I guess, is more. More literally. So that's the theoretical, the practical. And what we're actually seeing is that there are two different things, platforms at companies, because there's a lot of engineering ways of thinking where, like context engineering today is a lot more oriented towards how do we create. And the semantic layer is much more important than the actual, like, ETL layer for a lot of companies versus feature stores, I think the ETL layer is a huge part of it. So, yeah, I think companies will continue to have both and we see that. I mean, this recommender, systems, fraud detection, those things will continue to exist. They'll continue to be classic malls, and they'll continue to need feature engineering and feature stores.

Adam Becker [00:23:49]: Simba, thank you very much for joining us. We have a couple more questions, but I also need to let everybody go on to the main stage, because we're having our second keynote, closing out the evening with Chip's talk. Maybe there's a way for you to stick around in the chat for just a little longer. We have Sanjeev asking a question and Ogona asking another question, so I'll make sure to drop into the chat, and I'm going to send you the link to it. If this stage closes, we'll probably see you on stage one for just a minute. Awesome. All right, Simba, thank you very much.

Simba Khadder [00:24:26]: Thanks for joining us. Thank you. Thank you.
