Multi-Agent Personalization with Shared Memory: From Email to Website to Proposal // Hamed Taheri
SPEAKER

CEO and Founder of Personize.ai. 20 years in AI and machine learning. Managed teams, built products, and implemented solutions in small to large organizations. Led 60+ revenue ops, CRM, and GTM data programs with world-class CMOs and VPs of Sales. Two MSc degrees focused on machine learning, decision support, and visual analytics.
SUMMARY
Personalization at scale needs deep understanding of each customer. You must collect data from many sources, read it, reason and infer, plan, decide, act, and write to each person. One agent doing everything gave us poor and inconsistent quality. Multi-agent systems changed that. They deliver mass personalization. They also break in edge cases, contradict each other, and are hard to debug. I will share how we addressed this with Cortex UCM, a unified customer memory, and Generative Tables. We map noisy data into a clean, structured layer that agents read and write. We began with email for both outbound and inbound communication. Then we personalized websites and product pages for e-commerce at scale. I will share customer stories. For example, one customer had over 60,000 product pages that required customization for thousands of communities and product offerings. I will briefly present our decentralized shared-memory orchestration and how it stays transparent and debuggable. It opens safe paths for external agents. What failed. What worked. What we are building next.
TRANSCRIPT
Hamed Taheri [00:00:05]: Nice meeting everyone who is participating in this event. I'm going to talk about multi-agent personalization, the role of memory, and the learnings we had during our journey of building this personalization engine, from personalizing emails to websites and product pages. First I'll give you a high level of what I'm going to cover: a quick introduction to who we are, which hopefully gives context for formulating the objectives I have and how we came up with the memories I'm going to present here, and hopefully I'll learn from you as well. Then I'm going to talk about solutions that are very popular in the market, and then I'll introduce how we are doing it: our approach, our key features, the learnings we have, which are still ongoing, how it can be used, and some examples. I want to start with introducing Personize, our startup. We are based in Canada and we have a US presence as well. As the name shows, we are focusing on personalization at scale. When it comes to personalization, the main way of doing it today is templates: static and token based.
Hamed Taheri [00:01:35]: And essentially that's the only way a business can communicate with a large number of people. Now we are entering a new world of generative AI and the possibility that we can generate content at a quality that is as good as a human, possibly like a senior, experienced person. That opportunity opens the door for generative personalization. We think this is a new category, and to make it more specific, because I'm going to talk about memory, I want to give you a sense of where we are focusing. We are focusing on companies that have databases. So we are not chatbots, we are not copilots, we are not conversing with people. We are conversing with databases, and we are focusing on batch processing with AI agents in that space. I'm going to share the high level of our architecture, which is.
Hamed Taheri [00:02:29]: I think many of you are already working with this: multi-agent execution around customer databases, mainly CRMs. What we learned about personalization is that it's not just about writing some beautiful content. We need an agentic approach, step by step: starting with deep research, inference, reasoning, sometimes planning, and then finally writing and generating content for a customer. And that should happen at scale. The challenges we faced were a bit different. We are facing a kind of task where there is no human in the loop. There is always human supervision at a high level, but the task cannot be done if you have a human in it. And we have multiple agents, possibly built by different builders, and they need to generate consistent output at scale. So we started to essentially tackle this.
Hamed Taheri [00:03:30]: The accuracy wasn't just a beautiful text. We are working with businesses, they say that looks good, it's beautiful, it doesn't mean anything. That was the kind of situations we had. And the idea is, okay, we have to focus on what define accuracy in a way that this is defined for that specific company. We have a lot of unstructured data. When we are talking about batch processing at big databases and multi agent. It's super expensive if you don't plan for it and you don't optimize for it. And there's lots of latencies and tool calling and APIs and the latencies.
Hamed Taheri [00:04:05]: And when we are working in a system that is autonomous, we have to have a plan to make it reliable within that space of challenges we are technically tackling. One part that I'm going to highlight more is the customer understanding. So when we are talking about accuracy in personalization, the question is how well our agents can understand customers. And that's not the only thing they should understand in a very unified way that also share with other agents. Why? Because at a scale we have the challenge of consistency. We might have conflicting experiences with customers. So we need to figure how we bring everything in a way that we have to share the true and deep understanding of each customer. So, okay, we want that.
Hamed Taheri [00:05:00]: And the question is how we are doing it. Today I'm going to share just a high level of two ways of doing it. And then we can talk about what we learn and what we are proposing as something that is inspired by what existing solution, but essentially something that would work in our context of personalization. Batch processing and autonomous agents. One is rag and vector databases. We have memory technologies that are amazing. They put an interface to a lot of data and the agent at the time of doing the task, they can retrieve the chunks of the data they need and complete the task. But what we learned is having access to raw data doesn't mean that I know my customer.
Hamed Taheri [00:05:56]: And the retrieval might also be influenced by how the prompt is written. And also if you move from one agent to another agent, if even running the same agent again and again, and we are talking about tens of thousands of time, the consistency might not be predictable, but the chunks might come from different forms and kind of parts of the data and it might be partial. And so when it comes to personalization, it might create some risk of inaccuracy and challenge of the proper personalization. Therefore it's always reliability and to trust someone can have this AI to delegate their communication and customer communication at the scale. The other way is MCPS and function callings giving the AI agents ability to use tools, connect with different applications, databases get what they want and use it. And there is still kind of sensitive to prompts and instructions. There is a possibility if you work with multiple databases turning towards the bigger company with bigger databases. The function calling and using MCPs might result in a lot of data that will be added to the context that they are not necessarily relevant to that specific task, but they create, they are required to have it.
Hamed Taheri [00:07:18]: And you have to do it through adding it to the context, creating context overload. And essentially we still might not have the right understanding shared across all the agents because this function might be called differently by different agents. The way we are proposing kind of and kind of essentially very evolving. At the same time we are learning, we are testing, we are experimenting and we call it Cortex. It's an inspiration from human Cortex. And essentially we have some principles in designing this approach to the memory which is first of all, it's proactive. So we don't settle on just capturing raw data. We are proactively running internal agents to infer information and insights and synthesize information.
Hamed Taheri [00:08:09]: I give you content. I would stick to this simple example as we are moving forward, which is look at a lot of businesses. Some businesses would need to know if the company is B2B or B2C or they're a direct to consumer and that what means they might disqualify a big chunk of it or they might treat them with different services. That data doesn't explicitly mention the raw data. Again, this example is very simplified and proactive means that we look at the data because we know that in that domain it's important to know it. We proactively capture that information and add it to the memory. One of the things number two in our principle is attributes are shared and standardized. So every memory are captured in the standard naming per customer.
Hamed Taheri [00:09:01]: So essentially they are searchable. We can apply filters, we can use them for routing, scoring. And that's another thing that we learned that's very important. If you have a big database and you have a question of which company is B2B or B2C Querying database on that. And when you have tens of thousands, if you work with language models, you need some predictability on the accuracy and you need to trust the results. We propose this kind of combination and of proactive data like a data inferences and Also standardizing the attribution attributes. We have the opportunity, they're kind of essential together. And I'm going to share how we are getting there.
Hamed Taheri [00:09:42]: One the other thing is versioning. The other one that I want to highlight is recall. The idea of recall is today we have a lot of businesses that building a lot of agents. They are very powerful, but all of them they need access to customer data and they have their own way of. It might take weeks and they have their own way of accessing it. The centralized recall built on top of that proactive memorization and attributes. We are creating a more consistent way and reference to every agent has the same access to a bigger picture. And because of the proactive memorization moving from raw data to memories, we are compacting so that we have a lot of space in context to add more and more and use it and keep it simple and like a light for the agent to do the amazing job.
Hamed Taheri [00:10:33]: We are redefining our prompt through the learning. We had a prompt or the agent has like a prompt across different multi step and that was hard coded. And we have instruction, best practices, few shots examples. And if you losing that and someone in the team or even the customer wanted to change it, we have a lot of learning loss. So we started to add that as a user level memorization. Then we also made company and contact. As you are seeing, we are shrinking the prompt into just specifics of that task. And everything should be reusable for the agents.
Hamed Taheri [00:11:15]: One of the things that we are testing this is something that we work in progress. In the last three months we are working with almost 2020 plus B2B companies that we are experimenting implementing our cortex. They have a one click implementation. We do deep research. Typically it takes between five or 10 minutes. But the idea is we want to fully understand the customers and make things like in a way that we are proposing. And the thing is that's impressing people is in a couple of minutes the AI writes the way that is more like a senior person that company. It articulates and personalize sections of the website in the language and quality of writing.
Hamed Taheri [00:11:55]: That is very tailored and very aware of the domain, very technical domains. And it can even generate blog posts that are again aligned with the company brand voice. And this is essentially something we are working progress. But essentially we are hoping that this give agents a quick and immediate kind of awareness of the business context and they know how to operate for that company. And they have the recall mechanism so they can also have access to the right information about customers. So Moving from weeks of or possibly building everything and try and error moving it to mns. An AI agent can be built by different builders. And the idea is we've seen a lot of interesting things and this what I mentioned here, this is a work in progress and as I want to mention here, this is a work in progress.
Hamed Taheri [00:12:49]: We are testing different parts. The results sometimes again the results is exciting and if you like to if you have ideas or if you can use these types of cortex like a UCM memory for some use cases, we want to know more and we experimented with more use cases and see how far we can maintain accuracy and consistency in different domains and use cases. We have in our DNA our product that we have a personalized studio. We have a centralized memorization and recalling. So every agent that is registered in our platform API or natively the recall and memorization is part of it. We are working early access in MCPS, our APIs and ZAP here to idea is we want to give more people to just use that they don't need to come to our studio anymore. And we are curious to learn more and if someone is interested in learning more. If you have use cases to access the customer data and you wonder if we can help on this front, I would love to hear it.
Hamed Taheri [00:14:01]: If you know agents, you are building agents that are interacting with customers of a business. But I wonder if you how you would you would you would have access to all those internal data, external data and that deep understandable customer. And if you have use cases, we'll look forward to hear it. I hope you like this presentation. I kept it less than 15 minutes so that's the end of class.
Demetrios Brinkmann [00:14:28]: Excellent. Super appreciated, Hamed, that was great stuff. As somebody said in the chat, this is like super RAG; it is RAG on steroids, in a way. And I want to ask one question that I didn't quite understand, and then we've got to keep it moving to Sachi's talk. Man, there are so many questions I want to ask, from centralizing things, or multi-tenancy with the memory, to the version control you mentioned on one of the slides. I wanted to go down the route of version control, but I also have a question coming in here, so I'm going to give it up to the chat.
Demetrios Brinkmann [00:15:20]: Let them ask the question can I silo memory from agents acting in different knowledge domains? For instance, if one department should not have access to another knowledge, their agents shouldn't either.
Hamed Taheri [00:15:37]: So there are There are two levels. One we have one customer, completely different organizations. So we are working on a schema based types of approach up from top down. So how to enforce structure like a way of naming and memorization. So on that front the same thing can go a bit lower. The part I showed you that the three layers of reusability of contact, component, user. We know that we need to add another organization and department level. That means that we have some guidelines for a specific department, organization wide and also user specific.
Hamed Taheri [00:16:17]: It is a work in progress. We are not that sophisticated. But essentially we understand that that is coming. We have some intelligent way of matching the right parts of instruction to the agent. So the agent has a way of looking at all the possible contexts and choose what they need. That said, we need to have more control. But it's a very smart question and we need more time to build this section.
Demetrios Brinkmann [00:16:43]: Yeah, you, you're hitting on some of those pains of context engineering and it's great to see. Thank you Hamed for joining us.
Hamed Taheri [00:16:50]: That's it, thank you.

