
MCP, Agents & the $40M Bet on Multiplayer AI

Posted Jun 12, 2026 | Views 5

Tags: Enterprise AI, AI Agents, Dust


Speakers

Stanislas Polu

Software Engineer & Co-Founder @ Dust

Stanislas Polu, the co-founder and engineer of http://dust.tt, is an alumnus of http://openai.com, http://stripe.com, http://stanford.edu, and http://polytechnique.edu.


Demetrios Brinkmann

Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.


SUMMARY

Stanislas Polu is Co-Founder & CTO of Dust — the enterprise AI agent platform used by 51,000 workers at 3,000+ companies. Before Dust, he spent three years on OpenAI's research team under Ilya Sutskever, working on mathematical reasoning in language models, and prior to that was an engineer at Stripe. He brings a rare combination of frontier AI research and product-building experience to the enterprise agent space.


TRANSCRIPT

Stanislas Polu: [00:00:00] Even if the technology, uh, again plateaus, uh, even with current model shape, uh, the way we're gonna work in 10 years is probably, uh, nothing compared to the way we work today. And so there's still many, many, many opportunities to build for, for everyone.

Demetrios: Probably start with this multiplayer versus single player AI. I want to mention to folks that you've got a decorated background. You were at Stripe in the early days. You then, you were also at OpenAI, and now you're doing your own thing with Dust. So it's super cool to see this progression. Are there things that in those early days of Stripe scaling up, or in the early days of OpenAI scaling up, that you have taken away from that and you want to infuse into Dust as you're scaling up?

Stanislas Polu: Oh yeah, a ton. Uh, mostly Stripe over OpenAI. First, uh, yeah, I've, [00:01:00] I've been very lucky. I've been very lucky. Uh, so we joined Stripe, uh, af- through an acquisition actually, uh- Mm-hmm ... with my current co-founder, so it's not the first time we're starting a company.

Demetrios: Nice.

Stanislas Polu: Uh, the big difference that, uh, it was, uh, 15 years ago, and we, we were young.

Stanislas Polu: We had absolutely no clue about what we were doing, uh, but it was fun. Uh, but we can chat about that later. It's a whole subject of its own. Uh, Stripe, we saw the, uh, 150 people to 3,000. It was, uh, absolutely incredible. And I think, uh, uh, I hold very dear to my heart the, uh, 150 to 500 people. I think it was pretty magical at Stripe.

Demetrios: Mm.

Stanislas Polu: Because there was really very minimal management, and it still was working really well. I've been trying since then to understand what was happening at the time for that to work so well.

Demetrios: Do you think that's the writing culture?

Stanislas Polu: So yes, uh, so I've got a, I've got a bit of a theory about that, uh, which we, we actually turned into operating principle at Dust.

Stanislas Polu: But my big theory about the [00:02:00] time, it's slightly bullshit, but I think it's interesting nonetheless. My theory is that we were actually implementing a flocking algorithm. You know what a flocking algorithm is?

Demetrios: No.

Stanislas Polu: When you want to program a flock of birds, or a flock of whatnot, in computers, autonomous groups of things that self-organize around obstacles, there are actually only three rules to make it work: local separation, a small force that pushes neighbors away from each other; distant attraction; and alignment. That gives you some form of autonomous behavior around obstacles. You can look up flocking algorithms online and you'll see those small videos. And I think at Stripe, the writing culture was the distant attraction.

Stanislas Polu: Uh, so Stripe was very early into implementing that kind of a open writing culture. It was mostly through email at the time, [00:03:00] interestingly.

Demetrios: Ooh.

Stanislas Polu: Oh, that was great. It was great.

Demetrios: Really?

Stanislas Polu: Oh, yeah. I don't like Slack that much personally. But I love email. That's another take that we can spend a lot of time on.

Demetrios: You're surprising me here. All right.

Stanislas Polu: Uh, but anyway, uh, so distant attraction. Why? It's because, uh, we had those mailing lists, and so there were, uh, people within the org that were spending a lot of time kind of, uh, sifting through those mailing lists. And so that meant that somebody would see someone else talk about subject A over there and see another person talk about sub- the same subject over there in the org and was able to tell them, "Hey, you're talking about the same thing.

Stanislas Polu: You should chat." And so that's kind of a distant attraction, uh, force. Uh, I think at Stripe, one thing that really struck me is the amount of trust between people at the time. Uh, it was just ingrained in the culture in a very nice way, and that's a kind of a local separation force, because you don't look over the shoulder.

Stanislas Polu: If somebody says, "I'm-- I've got this," they've got this, and you stop caring about that. And then finally, the last one was, which is, uh, which is [00:04:00] almost, uh, the, the most bullshit one, is, uh, alignment. But I think at Stripe, we were, we were, and which is very interesting because it's very hard to reproduce that in modern times with AI, but we had a very strong alignment on the vision.

Stanislas Polu: It was very easy to understand. We are a dev API for payments. Basically, there's a tensor of countries, payment methods, payout methods, and you just, just want to, to fill it in. Uh-huh. It was very easy to understand, and so that means that you are, everybody's kind of aligned very naturally. And so that gives you the three properties of flocking, which gives you kind of a autonomous behavior in a sense, emergent autonomous behavior, which explains why potentially we didn't need that, that much management and why it was so nice to work, uh, in that environment.
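The three flocking rules named above (local separation, distant attraction, alignment) can be sketched in a few lines. This is only an illustration under stated assumptions: 2D point agents, a hypothetical `Boid` class and `step` function, and made-up radii and weights. It is not anything Stripe or Dust runs.

```python
import math
from dataclasses import dataclass


@dataclass
class Boid:
    x: float
    y: float
    vx: float
    vy: float


def step(boids, near=1.0, far=10.0, w_sep=0.05, w_att=0.01, w_ali=0.05):
    """Advance every boid by one tick (all radii and weights are assumed values)."""
    moved = []
    for b in boids:
        sep_x = sep_y = att_x = att_y = ali_x = ali_y = 0.0
        seen = 0
        for o in boids:
            if o is b:
                continue
            dx, dy = o.x - b.x, o.y - b.y
            dist = math.hypot(dx, dy)
            if dist > far:
                continue  # too far away to influence this boid
            seen += 1
            if dist < near:
                # Local separation: push away from neighbours that are too close.
                sep_x -= dx
                sep_y -= dy
            # Distant attraction: pull toward the neighbours in view.
            att_x += dx
            att_y += dy
            # Alignment: nudge velocity toward the neighbours' velocity.
            ali_x += o.vx - b.vx
            ali_y += o.vy - b.vy
        vx, vy = b.vx, b.vy
        if seen:
            vx += w_sep * sep_x + (w_att * att_x + w_ali * ali_x) / seen
            vy += w_sep * sep_y + (w_att * att_y + w_ali * ali_y) / seen
        # New Boid objects are created, so every boid reads the previous tick's state.
        moved.append(Boid(b.x + vx, b.y + vy, vx, vy))
    return moved
```

Calling `step` repeatedly on a list of boids moves distant boids together, pushes crowded ones apart, and gradually equalizes their velocities, which is the emergent behavior the analogy relies on.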

Demetrios: It feels like we could draw a parallel here and heads up, segue into multiplayer AI, single player versus multiplayer with the flock algorithm, and how have you thought about humans working with agents in that [00:05:00] flock style? Do you think that there's anything that you could look at there?

Stanislas Polu: Yeah, there's probably a ton of stuff, but it's very hard to anticipate because today we are still very much stuck in the single player mode.

Demetrios: And what do you mean by that, single player mode?

Stanislas Polu: It's mostly that most of the agent interactions we have, we have them alone, for two main reasons, I think. First, the time horizon of the tasks an agent can take on is still pretty small. According to the METR benchmark, we are topping out right now at a couple of hours, let's say half a day, and mostly in coding tasks, by the way.

Stanislas Polu: And, you know, half a day is still very much a single player thing. You, I can work half a day on a task, you can work half a day on this task, and you work alone, and it's fine because it's a kind of well-shaped tasks. The other thing is that the, the, they're still very imperfect in many ways, and [00:06:00] so you interact a lot with them.

Stanislas Polu: It's messy. Uh, really, when you look at an agent trace, uh, between a human and an agent, you look at the, how the sausage is made, and you're like, "Oh, woo, man, you did something funky there."

Demetrios: Or it's like, "Oh, you were so close, dude. Uh," Yeah. "... a little bit over to the right or left and you would've been good."

Stanislas Polu: And so on those time horizon and, and the, in terms of capabilities, the models are super jagged, m- which means that, uh, you're on that-- you're looking in that direction, the thing is clearly superhuman.

Stanislas Polu: You move a few, uh, uh, uh, degrees off on a slightly different task, you fall into a valley, and the, the, the thing, that thing is the dumbest, dumbest thing on Earth. And so that's why it's messy. And so all of that makes it very kind of n- unnatural to collaborate in an agentic loop with multiple humans, to share, uh, your sessions with agents because they are messy.

Stanislas Polu: You share the output, you don't share the session. Um, and so that's, that's [00:07:00] why I think we're still very much, um, a single player AI in the way we, we work. We all spend all of day in, in Claude Code or Codex. We all spend all of day in, uh, our favorite agent harness to work on presentation and stuff. Uh, and it's-- we pretty much do it, uh, uh, alone, and we share in the, in the other platforms the output, and we collaborate there, be it on GitHub, be it on Notion, on Google Drive, whatnot.

Stanislas Polu: Um, I think the thing that, the, the, the thing that is supposed to s- is, is, is about to switch is probably, uh, as the models are getting better, and they seems to be getting better consistently, it's always an open question, but when you look at the past few months- Mm-hmm ... um, well, the, uh, the time horizon of those tasks will increase.

Stanislas Polu: And if you start considering a task that represents a week of man work, for most job descriptions those tasks are inherently cross-team, cross-people, involving multiple persons. Even if you think of coding, there are very few coding tasks that take [00:08:00] a week that don't involve collaboration, even with somebody external, be it a product manager, be it a designer, or maybe the infra person because you have something finicky about the infra.

Stanislas Polu: And if you look at something that is outside of coding, then most tasks that take a week are extremely cross-functional. That will flip the picture: it's not gonna be possible anymore for just one human to steer the agent in those types of tasks.

Stanislas Polu: Even the relationship between the human and the agent will flip. The agent will orchestrate a longer-horizon task involving humans, and the humans will steer. I'm quite convinced there's gonna be a need for many humans to steer those agents, be it one agent or multiple agents, we don't really care, for those kinds of longer-horizon tasks.

Demetrios: They'll invoke the human agent or the human tool when they need to. So it's invoking a SRI tool, invoking a... for that review.

Stanislas Polu: Yeah. There's gonna be some of [00:09:00] that, for sure.

Demetrios: Yeah. And now, don't you feel like this is just an, it's a UX problem more than anything? Because when you tell me about the single player mode, I think, okay, but what ways do I have to share with my colleagues rather than exporting the chat or putting it up with my PR, things like that.

Demetrios: It doesn't really feel like I have the tools to bring someone else along this journey with me.

Stanislas Polu: Yeah. I think, I think the, the, the interactions will pretty much remains, uh, kind of a, a you with the agent, but it's just that the agents will create many of those sessions in the, to, to achieve that task and have, will kind of collaborate with many humans.

Stanislas Polu: And so calling it entirely a UX problem is maybe slightly diminishing, but I do think it's not [00:10:00] a machine learning problem. It's more of a product problem. As an example, take CoWork, which is having a real moment these days.

Stanislas Polu: Because it's working locally on your computer, it is inherently a single-player thing. It's very hard to kind of, uh, uh, uh, collaborate around something through CoWork with multiple humans. And so that means that you need to push that onus inside of a hosted environment. This is the only way, I think. Um, and, uh, and there's a lot of things that we, we are exploring around what is the right product surface to enable that collaboration, uh, where you have multiple humans, multiple, uh, sessions around a shared state, uh, and how that-- what, what it can look like.

Stanislas Polu: I think this is the frontier in terms of product of interacting with agents, and that's the one we're very excited about.

Demetrios: Talk to me more about that.

Stanislas Polu: So one of the things we're doing right now: we've released a first [00:11:00] version of that, and we call it a pod.

Stanislas Polu: And a pod is really a shared state with, uh, uh, a number of, uh, sessions, uh, between humans and agents. And I'll give you a very simple example of where it plays really nice. So we have, um, as with any companies, we have a team weekly. Uh, and the team weekly is mostly made of a set of, uh, slides that, uh, are somewhat always the same with, uh, small variations every week and maybe some deep dive, uh, about, uh, subjects, uh, every other week.

Stanislas Polu: Um, and so in the past, the way it would work is that somebody would start building a slide and, and ping the humans and have the humans collaborate in those slides, et cetera. And, uh, the kind of, uh, orchestration of all of that would take, uh, some amount of time, uh, to that human. Uh, now comes, uh, agents.

Stanislas Polu: Maybe you can, uh, the, the initial organization building a slide, you can, you can, [00:12:00] uh, you can offload to an agent, but you still have to ping the humans and, uh, organize the, uh, the, the kind of organize the collaboration so that we get to a, a shared results. Um, and so with a pod, which is a group of, uh, conversations, group of humans in a shared state, uh, now our team weekly is much more driven by an agent than driven by a human.

Stanislas Polu: So basically, our team weekly is on Tuesdays. Monday morning, somebody triggers an agent with a skill, team weekly. The agent creates a pod. It creates one session per slide, with the owner of each of those areas, and maybe one session for the people that are doing the deep dive that week.

Demetrios: Mm-hmm.

Stanislas Polu: And it will, will pre-work the slides, so going into Metabase, pulling the data, keeping up to date, revenue, customers, going on GitHub for the, um, uh, the product work, and then going on whatever platform for the support state, whatever. It'll pre-build the slides, and then it'll ping the, the, the, the person responsible for [00:13:00] that slide and say, "Hey, what do you wanna talk this week about?

Stanislas Polu: Let's collaborate on these slides. What do you want to add? Is there anything you want to highlight?" And so that collaboration happens in one of those sessions. Each human responsible for one of those subjects is collaborating in a different session. And on Tuesday, let's say noon, the agent wakes up, looks at everything, and probably pings the humans that haven't finished their work.

Stanislas Polu: Then it wakes up again, like two hours before the team weekly, looks at all the things that have been built, and coalesces them into one shared presentation. We don't use any presentation tool anymore. We have a frame, which is the equivalent of Canvas or Artifacts, really a kind of code-generated UI.

Stanislas Polu: And so it takes all of those frames and pulls them into that shared frame, and that thing is ready. And really, the team weekly, which is kind of a multi-day collaborative task, is today mostly on auto mode, right? And I think [00:14:00] that's the kind of product surface that is interesting.

Stanislas Polu: If you don't have that pod, which is that group with that shared state, it feels kind of weird because you have a bunch of conversations all over the place. You don't have a shared place to see them all. The agent doesn't have a place to organize all of that work with multiple humans. And so that's the type of thing we're exploring, which is very exciting, more for what's to come than for where we are at.
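The team-weekly flow described above (create a pod, open one session per slide owner, remind whoever is not done, then merge the finished slides) can be sketched roughly as follows. Every name here (`Pod`, `Session`, `kickoff`, `pending_owners`, `coalesce`) is hypothetical and chosen for illustration; the sketch assumes in-memory state and does not reflect Dust's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    slide: str          # which slide this session is about
    owner: str          # the human collaborating with the agent in this session
    draft: str = ""     # slide content pre-built by the agent
    done: bool = False  # set once the owner is happy with the slide


@dataclass
class Pod:
    name: str
    sessions: list = field(default_factory=list)
    shared: dict = field(default_factory=dict)  # stand-in for the pod's shared state


def kickoff(name, owners, prefill):
    """Monday: create the pod and one pre-filled session per slide owner."""
    pod = Pod(name)
    for slide, owner in owners.items():
        pod.sessions.append(Session(slide, owner, draft=prefill(slide)))
    return pod


def pending_owners(pod):
    """Midway check: who still needs a reminder?"""
    return [s.owner for s in pod.sessions if not s.done]


def coalesce(pod):
    """Before the meeting: copy finished slides into shared state, in session order."""
    for s in pod.sessions:
        if s.done:
            pod.shared[s.slide] = s.draft
    return [pod.shared[s.slide] for s in pod.sessions if s.slide in pod.shared]
```

In this sketch the agent would call `kickoff` once, `pending_owners` at each wake-up to decide whom to ping, and `coalesce` at the final wake-up to assemble the deck.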

Demetrios: Can you talk to me more about this pod?

Stanislas Polu: Yeah. So technically a pod is really a group of humans, a group of agents, and a group of sessions, which are the interactions between humans and agents.

Demetrios: And these are sandboxed?

Stanislas Polu: Each of the session has a sandbox.

Demetrios: Uh-huh.

Stanislas Polu: Each of the session has a, a state.

Stanislas Polu: So a state, let's call it a file system. The file system is visible to the humans, but also mounted on the sandbox. Basically, to make that easy, and we're diving into the technical details here, we [00:15:00] use GCS FUSE, so we back that with GCS. It's very easy to show it in the UI, and it's very easy to mount it as a file system for the agent.

Stanislas Polu: So the home directory of the agent in the sandbox is GCS-backed, basically. And so that's the state in one session. Then inside of the pod, we also have another file system which is shared across those sessions, and the agents are free to move files between the two file systems. So basically, they see /session/ blah, blah, blah, and in there they have all the files related to the current session.

Stanislas Polu: And then they have /pod, and under /pod, that's the shared file system where they can move some of those files up or down as they see fit. And so you can have collaboration across sessions through that shared file system at the pod level.
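The two-level layout described above, a private tree per session plus a shared pod tree, can be illustrated with ordinary file operations. This is a sketch under assumptions: a temporary local directory stands in for the GCS-backed mount, and the `session/<id>/` sub-layout, the `promote` helper, and the file name are all made up, since the transcript only names `/session/...` and `/pod`.

```python
import shutil
import tempfile
from pathlib import Path


def promote(root: Path, session_id: str, name: str) -> Path:
    """Move one file from a session's private tree up to the shared pod tree."""
    src = root / "session" / session_id / name
    dst = root / "pod" / name
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))
    return dst


def demo() -> list:
    # A temporary directory stands in for the GCS-backed mount.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        private = root / "session" / "s1"  # "s1" is a made-up session id
        private.mkdir(parents=True)
        (private / "revenue.md").write_text("# Revenue slide\n")
        promote(root, "s1", "revenue.md")
        # After promotion the file exists only under pod/.
        return sorted(
            p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
        )
```

Because the agent sees both trees as plain directories, "share this with the pod" reduces to a move or copy, which is the property the mounted file system is meant to provide.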

Demetrios: Mm-hmm. And when I'm interacting with my agent, I can drop into somebody else's slides presumably also. It's not just my slide.

Stanislas Polu: Yeah. [00:16:00] So somebody has a session with an agent inside of the pod and is working on a slide. It lives, naturally, as a starting point, on the file system of the session, so it's mostly visible to that session.

Stanislas Polu: And then that person will say, "Okay, the slides are ready. Move them to the pod." The slides get moved, and then somebody else will be able to see them or even modify them, which is risky in a sense, but that's fine. In Google Drive, you can modify whatever anyone else in your company has been doing as well, so I think it's okay.

Demetrios: Do you have the possibility to look at the history of the modifications in case you're like, "Ooh, Control Z"?

Stanislas Polu: Yeah, no, not right now.

Demetrios: That's why it's risky then. I see.

Stanislas Polu: Technically we have it, because GCS is versioned, and that's obviously something that we would want.

Stanislas Polu: Though the reality is that when you do something with an [00:17:00] agent, what will happen is it's gonna read the file, and then it's gonna emit an edit. And so generally, agents are able to pretty easily backtrack that change because they still see it in context.

Stanislas Polu: Mm-hmm. The only thing that, uh, the agent can do that would be bad is like, uh, uh, uh, like deleting the whole pod shared's, uh, uh, file system. Uh, that's possible technically. Um, uh, I don't think it happens much, uh, so that's, that's kind of fine.

Demetrios: And also I imagine you have some kind of redundancy, maybe some backup somewhere just in case you ever...

Stanislas Polu: Yeah, so again, in GCS everything is versioned, so we could totally restore it.

Demetrios: So, okay. Now the shared state- Yep ... and the ability for almost like this tiered level of I'm working, I'm bringing it down, not necessarily locally, but I'm having my one player experience, and then I'm pushing to the [00:18:00] multiplayer so everyone can share and look at it.

Demetrios: How are you envisioning that in other scenarios?

Stanislas Polu: Ah, there are so many of them. By the way, it's one of the very hard difficulties of building a horizontal platform. There's a lot of value in building a horizontal platform. There's a lot of pain in building a horizontal platform.

Stanislas Polu: The pain is mostly this one, is that the, uh, uh, there is so many use cases, uh, that, that you-- it's-- you, you cannot-- I mean, I would love to be able to tell you Dust is awesome for that, but that means we would be verticalized. If we were to be verticalized, we wouldn't have the right to really try to equip the entire company.

Stanislas Polu: And so that's why we accept that, uh, that tension. Uh, many other ways to use a pod. You can, uh, you can use a pod as, um, as a, as a-- So basically, we also have in that pod, we also have a, a list of, uh, tasks, which is a way to try to [00:19:00] organize work. And so you can create a pod for any project you do. And let's say you're on Slack, and you're like, "Oh, uh, I need to do, uh-- We need to do this, this, and that."

Stanislas Polu: And you call Dust from Slack, and you ask it to add those tasks to the pod you're working on. Let's say you're working on an initiative, sandboxes: you're working on adding sandboxes to Dust. You say we should do this, this, and that. The agent is capable of interacting with the pod and adding tasks itself.

Stanislas Polu: And then as a human, you can, uh, you can, uh, the tasks are kind of a nice way to organize the, uh, organize the work and trigger sessions to start solving the task, uh, with, with an agent and a human working together, uh, to solve that task. So you have that kind of a task, uh, kind of a trying to explore the task management part of working, uh, with agents, uh, uh, around a project.

Stanislas Polu: There's obviously another usage of pod which is, uh, which works well, is around, um, internal, kind of a internal support [00:20:00] like, uh, uh, so obviously the kind of external support, the tier one support will, uh, will, it'll-- is best tackled in platform that are specialized for that. But when the questions is finicky and it reaches your internal support team, uh, a pod is a really great place to share the way you solve those questions, uh, uh, for people to, to, to, to look at and, and, and explore.

Stanislas Polu: And so I think there's many, many, uh, use cases around that. We're really in the business of creating, um, uh, Lego bricks that work well together and, and make sure that they are somewhat universal. Um, and so, uh, we see very varied usages. One principle that I think is interesting, uh, that we try to enforce is, uh, what we call the, uh, bidirectional harness.

Demetrios: Mm-hmm.

Stanislas Polu: Meaning that we've enforced on ourselves a very simple rule: any feature we build should be equally accessible by agents and humans.

Demetrios: Okay.

Stanislas Polu: There shouldn't be anything that a human can do in Dust that an [00:21:00] agent cannot do, and vice versa. And I think-

Demetrios: And how does that play out in the time when you're actually building?

Stanislas Polu: So it kind of creates a constraint on the things you can build. As an example, I mentioned the tasks around the pod. This is a product surface that is very human-centric. Well, it's very important that those tasks can be listed by agents, can be edited by agents, can be created by agents, and that the work, another conversation, a session, can be triggered from those tasks by agents as well.

Stanislas Polu: And so it's really kind of a, a, a great principle to make sure that we, we don't overfit on any of those two users, because we're really building a, a product for both those users. Uh, and we really care about that because we see, we s- it's the only way to start meshing the work o- of humans and agents together, in our opinion.

Stanislas Polu: Only if you have that kind of symmetry will you [00:22:00] progressively start making it fuzzy whether the work has been done by humans or by agents, which we think is the future and is exciting in many ways. There are so many places where you really want to build something for humans, and when you try to apply and believe strongly in that principle, it really makes you rethink what it should look like.

Stanislas Polu: Uh, and so it's a useful principle to, uh, to apply. But I think most AI products today are kind of building for two, two users, and so it's important to build for them both.
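One way to picture the bidirectional-harness rule is a single registry of operations that both the human UI and the agent tool layer are generated from, so neither side can gain a capability the other lacks. This is a sketch under assumptions: `TaskBoard`, `operation`, `call`, `agent_tools`, and `human_actions` are hypothetical names, and nothing here is taken from Dust's codebase.

```python
from dataclasses import dataclass, field


@dataclass
class TaskBoard:
    tasks: dict = field(default_factory=dict)  # task id -> title
    log: list = field(default_factory=list)    # (actor, operation) audit trail


OPERATIONS = {}  # the single source of truth for what can be done


def operation(fn):
    """Register a function as an operation available to every kind of actor."""
    OPERATIONS[fn.__name__] = fn
    return fn


@operation
def create_task(board, title):
    task_id = len(board.tasks) + 1
    board.tasks[task_id] = title
    return task_id


@operation
def edit_task(board, task_id, title):
    board.tasks[task_id] = title
    return task_id


@operation
def list_tasks(board):
    return dict(board.tasks)


def call(board, actor, name, **kwargs):
    """Single entry point: the actor is logged but never used to gate access."""
    result = OPERATIONS[name](board, **kwargs)
    board.log.append((actor, name))
    return result


def agent_tools():
    return sorted(OPERATIONS)  # what the agent is offered as tools


def human_actions():
    return sorted(OPERATIONS)  # what the UI renders as buttons or forms
```

Since both surfaces are derived from `OPERATIONS`, adding a feature for one user type automatically exposes it to the other, which is the constraint the principle is meant to enforce.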

Demetrios: Yeah, it's, uh... It reminds me of when someone told me, "You can create documentation that humans can read but agents can't, but you can't do it vice versa."

Demetrios: If you create documentation for agents, a human is going to understand it. So you should default to agents being able to understand your documentation first and foremost, and humans will by default also understand. Now, when you were talking [00:23:00] about the idea of the pods and also how work gets done over longer time horizons, I had the vision of data moving through a DAG, and how a finished piece of work is not dissimilar to that.

Stanislas Polu: Mm-hmm.

Demetrios: You have different nodes where work gets done, and then it goes to the next node, and whether that node and that piece of work is being done by a human or it's being done by an agent, it doesn't always have to be the same human that is doing it. And I think that's the huge unlock that I'm understanding with you, is that a human is an expert in different parts of that process, so it shouldn't be the same human that owns it all the way through.

Demetrios: We hear a lot of talk of how, well, now PMs and designers are shipping code.

Stanislas Polu: Yep. [00:24:00]

Demetrios: And that's great. But they have their unique skill set where they're very good at things, and isn't it almost more efficient if they can do what they're best at in that node of the DAG, and then it gets shipped to someone who can make it battle-hardened?

Demetrios: Maybe it's the DevOps person or the security person. They can harden it, and then it goes to production. And then you have that feedback loop where it will go back around and you're constantly iterating and updating. But those longer time horizon events, now when you talk about multiplayer, I'm understanding what you're saying there.

Stanislas Polu: Yeah, I think that's exactly that. The goal is really to try to create a place where it's really easy to hand off, to create an edge and a next node [00:25:00] and hand off to the next processing unit. And the next processing unit might be a human, might be an agent, and it should be really easy to route towards one or the other.
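The DAG-and-handoff picture discussed above can be sketched with the standard library's `graphlib`. The assumptions are loud ones: node names, the `assignee` mapping, and the `run_pipeline` helper are invented for illustration, and the work item is passed along a simple chain, whereas a real DAG with fan-in would need to merge the outputs of several predecessors.

```python
from graphlib import TopologicalSorter


def run_pipeline(deps, assignee, handlers):
    """Walk the DAG in dependency order, routing each node to a human or an agent.

    deps:     node -> set of nodes that must finish first
    assignee: node -> "human" or "agent"
    handlers: kind -> callable(node, work) returning the updated work
    """
    work = []
    for node in TopologicalSorter(deps).static_order():
        kind = assignee[node]              # the routing decision for this handoff
        work = handlers[kind](node, work)  # same interface for both kinds of worker
    return work


# Hypothetical three-step pipeline: a PM specs, an agent prototypes, an engineer hardens.
deps = {"spec": set(), "prototype": {"spec"}, "harden": {"prototype"}}
assignee = {"spec": "human", "prototype": "agent", "harden": "human"}
handlers = {
    "human": lambda node, work: work + [f"{node}: human"],
    "agent": lambda node, work: work + [f"{node}: agent"],
}
```

The point of the sketch is that rerouting a node from a human to an agent is a one-line change in `assignee`; the graph and the handoff mechanics stay the same.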

Stanislas Polu: And to make that a reality, the dirty work that needs to happen is that those agents have to be very contextful. It's super interesting to think about what the right context is for those agents to be useful agents of work.

Stanislas Polu: I mean, today the basic answer is a long agentic loop and a gazillion MCP calls to rebuild the context in every session. The second-stage answer is to turn everything into a skill, and eventually the context will be rebuilt. Is that part automated? Is that part on autopilot?

Stanislas Polu: Is that part manual? These are all the kind of, uh, question at the [00:26:00] frontier of what it means to create that, that right context, uh, for agents, which is, uh, quite interesting.

Demetrios: Yeah. How are you seeing success with that and breaking down those barriers, so it is more fuzzy when you have the handoff? I also understand there are moments where there are two different humans working on the same thing, and then the agent in the middle, and so you almost have this Venn diagram: the product manager, the engineer, the agents all looking at one node, if we're gonna continue the metaphor of the DAG.

Stanislas Polu: Yep. I don't necessarily have a crisp answer to that. I think we are still very early on this one because, as we said at the very beginning, the capabilities of the agents are still kind of jagged.

Stanislas Polu: And so that makes that whole [00:27:00] process, uh, not super smooth at all time, but we start seeing light at the end of the tunnel. And so our goal is really to discover that with our user, to give them the right Lego bricks. Um, that means that the, the, the, the, within our, within our user base, there's always gonna be kind of a AI operators, uh, tinkerers, uh, that will, uh, see the bricks and will want to assemble them in kind of new ways.

Stanislas Polu: It's super exciting to see what people do with the bricks you give them. Sometimes they come up with stuff that is just mind-blowing. They just rebuild entire systems, and it's exciting to see. But I think it's a co-discovery thing. On top of that, it's so dependent on the capability of models, which is moving so fast and which is still somewhat unpredictable in many ways.

Stanislas Polu: So it would be very adventurous of me to try to have a very crisp answer on this one.

Demetrios: Plant a flag right now, [00:28:00] and then by the time this podcast airs, oop, that didn't age well.

Stanislas Polu: Exactly. Exactly. Which, going back to the alignment, I kind of seeded that at the very beginning.

Stanislas Polu: Going back to the alignment, this is one of the major challenge I see into trying to reproduce the kind of, uh, uh, conditions of Stripe early on, I think is the alignment part. Because when you're operating in that space, the technological substrate is moving under your feet in a way that wasn't true before.

Stanislas Polu: For the past 20 years, when you were building, uh, tech companies, you were building on JavaScript and Postgres. It was rock solid. It's fairly stable.

Demetrios: Yeah.

Stanislas Polu: Like, there's nothing more stable. I mean, those things evolve, but it's like stable, stable, concrete type stable. Uh, and today we're building on, uh, on models that are changing every week.

Demetrios: Yeah.

Stanislas Polu: And so every time we try to build a very crisp picture of, of what we could be constructing in a year, we get it completely wrong because within that [00:29:00] timeframe, uh, the models shifted. The place we wanted to build that nice city of yours, uh, kind of went under the sea, and a big mountain appeared right next to it, and so now you n- you wanna build on top of the mountain, uh, obviously.

Stanislas Polu: And so, uh, that creates the, uh, that creates, uh, uh, that creates a real challenge for creating a crisp alignment of the team because you, you, you s- you're subject to the fog of AI, and it's very hard to see f- past six months, let's say.

Demetrios: Ooh, I like

Stanislas Polu: this term.

Demetrios: So it's an interesting- The fog of AI. It's, uh,

Stanislas Polu: the fog of AI, yeah.

Stanislas Polu: That's, so that's, uh, that's a pretty, uh, that's one of the interesting challenges of building a, building a, an AI company, I guess.

Demetrios: Hmm. And now just going back to Dust and how you're building, do you- allow folks to interact and be multiplayer in all different types of scenarios like Slack, like in their web browser or in GitHub, or they throw like a Jira or Linear issue at it.

Demetrios: Is that [00:30:00] kind of the vision where you're saying we have all these tentacles?

Stanislas Polu: Yeah, we try to build as many tentacles as, as possible. The big one is obviously-- So th- there's always a, and it is always an interesting tension, is that, uh, uh, when you build one of those connections, like Slack is one of the, one of the big ones.

Stanislas Polu: It's a very natural one. It's a, a chat, uh, system, and so that's a place where you might wanna trigger work with agents.

Demetrios: Mm-hmm.

Stanislas Polu: Um, uh, it is also an acquisition engine internally within work- within, uh, companies because people are on Slack and they see- Yeah ... other people using a system, and so they discover it, and they might start using it, obviously.

Stanislas Polu: The, the tension is that the, you're constrained by the interface of the platform you're connected to. And so the experience, everything we just said is not, is very hard to do in Slack because you, you're bound by the interface. And Slack has been doing a lot of work there, and, and, and they-- Actually, you can create pretty smooth interface today in Slack.

Stanislas Polu: But let's say GitHub as an example. Uh, it's just a [00:31:00] text, text async. Yeah. Uh, you don't see the thinking tokens, you don't see the tools being used, all of that. And so I, we really see those places as, uh, uh, entry points more than places where the collaborations, uh, or the collaboration with agents really happens.

Stanislas Polu: Because you always wanna, you, you always try to drive the user to go open the session inside of Dust because that's where all the richness, uh, happens.

Demetrios: Ah, I like that. So it's the gateway into it, and you're pushing them to go and look and inspect on Dust so that you can get that richer experience.

Stanislas Polu: Yeah.

Stanislas Polu: Some actions that agents take, for obvious reasons, uh, some admins mark as high stakes, so it mean- it requires a user confirmation before the action is taken, and a good example is sending an email. Um, uh, and, uh, and so in Slack, you can actually build a pretty good, uh, tool approval experience.

Stanislas Polu: But in GitHub, you cannot. Mm-hmm. [00:32:00] You cannot send the next message on the issue, say, "You need to click here to approve the tool." That wouldn't make sense.
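
As an illustration of the approval rule described here (actions an admin marks as high stakes wait for a user confirmation before they run), here is a minimal sketch. It is not Dust's implementation: `HIGH_STAKES`, `run_action`, and the `confirm` callback are hypothetical names chosen for the example.

```python
from typing import Callable

# Hypothetical set of action names an admin has marked as high stakes.
HIGH_STAKES = {"send_email"}


def run_action(name: str, execute: Callable[[], str],
               confirm: Callable[[str], bool]) -> str:
    """Run an agent action, asking for confirmation first if it is high stakes."""
    if name in HIGH_STAKES and not confirm(name):
        return f"{name}: blocked (user declined)"
    return execute()


# A surface with an approval UI would pass a real prompt as `confirm`.
# Here a stand-in that always declines shows the gate only affects
# the high-stakes action.
print(run_action("send_email", lambda: "email sent", confirm=lambda n: False))
print(run_action("search_docs", lambda: "3 results", confirm=lambda n: False))
```

The point of the sketch is that the gate needs somewhere to render the confirmation, which is why a text-only surface such as an issue thread is a poor fit for it.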

Demetrios: Well, I'm also thinking about even things like MCP apps that make the- Yep. Oh, yeah ... in-chat experience so much richer.

Stanislas Polu: Yep.

Demetrios: You don't have that ability- Yeah,

Stanislas Polu: yeah

Demetrios: uh, if I'm not mistaken, in Slack or- You

Stanislas Polu: could, you could re- uh, you could imagine that Slack rebuilds, uh, supports MCP app, and you manage to forward the MCP app payload to Slack. That could be... But again, GitHub wouldn't work, Zendesk or all of that. Pretty complicated.

Demetrios: Yeah, or there's so many areas or surfaces that you touch that do not support that, and it would be very nice to have that richer data experience.

Demetrios: Like even in Jira or Linear, you might wanna see some of these data points or data exploration that happens with the agent before you commit to a, quote-unquote, "sprint," because- Yeah ... I don't know what happens in sprints [00:33:00] these days. It's an absolute mess. I- Maybe that's a thread we can pull on, on how- Yep

Demetrios: sprints work now. But, uh, go-- continuing on this MCP apps, you almost want an area where you can use all of the data that you're having happen in these different- Yep ... spots, like the Slacks or the Jiras or the Linears, and then put them in something to allow folks to have that richer experience. So I could see that being a huge value prop.

Stanislas Polu: Yeah. But we, uh, we really see those platforms as a... I mean, we try to insert ourselves in an ecosystem. We don't, we don't want everything to happen on Dust. We want to have, uh, uh, work, some, some amount of work happen on Dust, but we-- Sometimes that work starts from Slack, continues on Dust, and ends up in GitHub as a, as an issue.

Stanislas Polu: And that's really fine. Uh, it's really important to be able to read from them. It's really important to be able to write back to them. I think that's the, uh, [00:34:00] that's the, at least our position on this one.

Demetrios: Yeah, and it goes back to the whole fuzziness of the handoffs and how those work- Yep ... right now and how that actually is, is being done because you've got some folks that are working in Slack, or you've got decisions being made in Slack, and then an agent will execute on that decision that's being made, which then becomes a GitHub issue or a PR.

Demetrios: Yep. And then somebody else has to review it, but they are using an agent to review it, so you've, you've got all these different ways that it could be siloed, or it probably is being siloed right now. Yep. And I know when we had our lunch and learns, one of the big things that folks asked for was: How are teams working with coding agents?

Demetrios: How are you figuring out which skills should be part of the whole company, or should it be team-wide skill? [00:35:00] How are you making sure that sessions can be reproduced or shared, or- Yep ... are you committing your whole chat history to the PR? Like, there's these questions that are coming up, and- Yep ... we don't have the tools for them right now.

Stanislas Polu: No. At least, uh, in the, in the coding agent space, we don't have them yet, but that seems, uh... I mean, it seems pretty obvious that, uh, at some point, uh, your Codex dash dash session and that ID or Claude dash dash session and that ID will be, uh, hosted in the cloud, and you'll be able to tie it to the PR, and somebody will be able to re- restart from there.

Stanislas Polu: Uh, that seem, uh, pretty obvious, uh- Yeah, that's an easy one ... iterations for them. Yeah.

Demetrios: Now, I don't wanna pass over this very big important thought of finding alignment is very tricky in- Yep ... the age of AI because you have the fog of AI, right?

Stanislas Polu: Yep.

Demetrios: Are there things that you have [00:36:00] done or seen work-

Stanislas Polu: Mm-hmm ...

Demetrios: or just like a principle that you're operating on in Dust so that you can not necessarily predict the future, but-

Stanislas Polu: Yep

Demetrios: iterate quickly when it does happen?

Stanislas Polu: Yeah, exactly. I think, I think the, uh, the, the failure mode of that is to, uh, uh, uh, refuse to paint a picture. I think you have to operate with, uh, uh, some form of conviction within uncertainty.

Demetrios: Mm.

Stanislas Polu: It has to be clear to everyone that we are operating in a very uncertain environment, and that whatever we say, uh, will be subject to change because the environment change, but that we create conviction on the actual current direction we're taking.

Stanislas Polu: And so that means, uh, uh, people-- I mean, everybody wants to rally on a vision or rally on a direction. And so if you cannot give a one-year and two-year direction, you have to give a six months direction, and that's fine. [00:37:00] And whenever you give a six months direction, by month three, it's, it's gonna be updated.

Stanislas Polu: And so that means that you, you, you need to have that kind of muscle of updating it and being con- uh, providing a lot of conviction in, in, in, in that process, but really being very explicit each time, uh, in doing it. Because otherwise, you might fall in the trap of, of not, not creating that, that, that shared at least, uh, speed vector, um, that w- you're trying to, to have for the group.

Stanislas Polu: Uh, and that's the worst, that's the, the, the, the worst part. Basically, the longer the speed vector, the better it is. The shorter, uh, it's, it's worse. Uh, three months is, uh, uh, so let's-- Uh, we're mixing s- time with speed here, but it's fine. Three months, what would be, what would be... Anyway, three months, there's a new string starts.

Stanislas Polu: Three months is better than zero because then you don't know- Mm-hmm ... the direction.

Demetrios: Yeah. And I imagine how it plays out in practice is every day folks are showing you new things- Yeah ... and saying, "Hey, should we maybe go in this [00:38:00] direction?" Yep. And you have to make those hard decisions of, is it worth course correcting for this?

Demetrios: Have we seen something that fundamentally changes where we want to go?

Stanislas Polu: Yep. Yeah, this, uh, so, uh, what we're trying to do is we're trying to really give a good picture of what we want to achieve within three to six months. But then we also, uh-- So that's, that's a kind of a, a, a photo. Like it's a, it's a snapshot.

Stanislas Polu: Uh, and we try to update it, but it's a kind of a, at the horizon of big O of months. And then we, uh, we have, uh, what we're actually doing, and so we call it, uh, our stack rank, so that's pretty easy. It's a list of, uh, projects that we're actively working on, and we try to create it very clearly so that the rest of everybody knows.

Stanislas Polu: So engineering generally knows what they're working on. Not always true, but, uh, most of the time it's better if true. Yeah. Uh, and i- enables everybody else within the company to know what we are changing on the product right [00:39:00] now. Uh, and, and then that's, that's stack rank. We really try to keep it very dynamic, so that means that we, we have a process which we call stack rank update that is open to anybody in the company to update the stack rank.

Stanislas Polu: And so, uh, obviously it gets windy at the top, so it's not an easy process. It's not a that hard process either. But it's, uh, basically a button you can push and both founders appear in the room, and we chat about how we should update the stack rank, and everybody within the company has access to that, uh, to that.

Stanislas Polu: Uh- Wow. But generally we try to, we expect, we, we have high expectation for the input to that process, depending on the, on the size or the, uh, the kind of appetite that is, uh, associated with the update. Uh, but, uh, but I think it's very nice because it gives some priority to everybody to be a bit of a product person, uh, within the company.

Stanislas Polu: And, uh, we're trying to push everybody to be a bit of product person inside of a company.

Demetrios: Let's change gears and talk about something. Before we hit record, we were [00:40:00] mentioning tokenomics, and I think that is very hot topic these days because the general narrative you hear from the internet is, "Oh, the era of subsidies is over."

Demetrios: Mm-hmm. "Now folks are trying to go public. We're not gonna get these cheap tokens." I'm not sure that I fully buy into that. Whenever I hear something from the masses, I-- that automatically gives me, like, red flag.

Stanislas Polu: Yep.

Demetrios: However, I do see that the costs are going up a lot. Oh, yeah. And it is really easy to spend a lot of money and not create anything of value.

Demetrios: So there's like that piece- That is easy, yes. I can burn a whole hell of a lot of tokens. That doesn't necessarily mean I'm good at my job, right? Uh, and so I do understand that. You were mentioning there's the cost of tokens, but then there's also the cost of inference. And so let's maybe center the [00:41:00] next five, 10 minutes around that.

Stanislas Polu: Well, I think the, the, the way, the way I, uh, we think, or at least I think about that is, uh, we know the end state. This, uh, we are subject to gravity. This technology has a ceiling. It will converge, and when it does, it will be commoditized, and tokens will be cheap and plentiful. That's the end state. Oh. The question is-- And maybe we'll have ASI by then, but whether-- whatever state it reaches, it will plateau at some point because it, I mean, there's a limited amount of energy and stuff, you know?

Stanislas Polu: Uh, and, and I think at, uh, when, when it plateaus, uh, it'll get commoditized, and it'll be pretty cheap, very close to the price of, uh, of, of power, basically. Uh, the question is when that happens, maybe it happens in, uh, in a year, maybe it happens in five years, maybe it happens in 20 years, maybe it happens in 100 years.

Stanislas Polu: W-w-who knows? Uh, and so the question is, uh, is trying to think about what's the transition state between [00:42:00] now and then. Uh, but I guess the end state is plentifulness, cheap intelligence from, from electricity. Um, and then, well, the one thing that I've kind of a, was a, a new learning for me, so I'll, I'll share it because I think it, it was a learning for me in the past couple weeks, months, is that I've always supposed that if models' performance plateau, we would enter a very rapid commoditization phase.

Stanislas Polu: And I think I've been coming back from that, uh, that assumption because the, uh, the demand has increased so much that the pressure on inference, so just the, the fact of serving- Oh ... the tokens is so high that, uh, the actors that have, uh, pre-bought that inference capacity will be able to maintain high margin due to that pressure.

Demetrios: Hmm.

Stanislas Polu: Which means that at least my learning is that even if I suppose [00:43:00] that today is the best we'll ever get in terms of models, which is probably not true, but even if I make that, uh, that thought experiment, that doesn't mean that in six months the tokens will be close to zero. Because there is so much built-up demand and so much-- And some of that demand is, might not be the best demand, because as you said, it's easy to, to be token maxing on some pure AI slop bullshit.

Stanislas Polu: But-

Demetrios: Yeah ...

Stanislas Polu: uh, uh, but this, the demand is still there, and people are still ready to pay for it. And so there's a, there is kind of another pressure in the market for, uh, around inference that will maintain the price of tokens high. But, uh, but anyway, it's very hard to, um, uh, to know. There's-- There's also the other thing that you can s- say that is very easy to say that makes sense is we're getting, uh, in a world where those, those models on some tasks are PhD level, if, if not better.

Stanislas Polu: Who needs a PhD level to fill in a Salesforce?

Demetrios: Yeah.

Stanislas Polu: No one. But [00:44:00] the counterargument to that, there is two, is, uh, who needs a MacBook M2 to do spreadsheets?

Demetrios: Yeah.

Stanislas Polu: No one. But e- but everybody wants one. We

Demetrios: still use it,

Stanislas Polu: yeah. Because it's the best thing. Uh-huh. Uh, and the more serious argument is, uh, uh, even if the model is a PhD level, if you apply it to tasks like filling Salesforce, you might gain one nine of, of reliability.

Stanislas Polu: And whenever you gain one nine of reliability, some funky stuff, interesting stuff happen. You go from, "I look up the ta- I look up what's going on, I'm checking the work," to, uh, uh, to, "I'm checking every week," to, "This thing is fully, entirely automated and I will never, ever in the history of humankind, uh, fill in a Salesforce, uh, card, ever."

Stanislas Polu: So many people are- And so I think that- ... praying- Yeah. ... for that. And so I think there's a, there's still an argument for using the best frontier model, even for tasks that seems pretty, uh, not so PhD level-esque, uh, because of that kind of [00:45:00] added nine of reliability on the task.
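
To make the "one nine of reliability" argument concrete, here is a small arithmetic sketch. The task volume (10,000) is an arbitrary assumption for illustration, and `expected_failures` is a hypothetical helper, not something from the conversation.

```python
def expected_failures(tasks: int, nines: int) -> float:
    """Expected failed tasks when per-task reliability has `nines` nines.

    Two nines means 99% reliable, three nines means 99.9%, and so on.
    """
    failure_rate = 10.0 ** (-nines)
    return tasks * failure_rate


# Each added nine cuts the expected number of failures by a factor of ten.
for nines in (2, 3, 4):
    print(f"{nines} nines: ~{expected_failures(10_000, nines):.0f} failures per 10,000 tasks")
```

Under these assumptions, each added nine divides the work a human has to catch by ten, which is the step from checking everything, to checking weekly, to not checking at all.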

Demetrios: Well, if the demand right now is the main constraint on what's making this plentiful, like you were saying, 'cause, um, 4.8 dropped recently and I've played around with it.

Demetrios: I haven't seen that explosion of, "Wow, this is so amazing"-

Stanislas Polu: Mm-hmm ...

Demetrios: and I need to use it all the time. There's a lot of talk about how Mythos is gonna drop and it's going to be a game changer, et cetera, et cetera. We'll see when that happens. But assuming in this thought experiment that the models are what they are and we have this pent-up demand, the question in my mind is: how long does this pent-up demand stick around?

Demetrios: Does it grow over time? Does it diminish over time? Is it something that stays stable and so we just need to keep bringing more energy, more [00:46:00] inference online to- Yeah ... service that demand?

Stanislas Polu: Well, I think there's a, there's a-- everybody, uh, I, I presume everybody along the, uh, the, the chain, uh, the, the value chain, uh, will be happy to provide more offer, right?

Stanislas Polu: Yeah. The, uh, the, uh, uh, electricity companies in every country will be ha- be, uh, happy to build more nu- nuclear plants. And then the, uh, the, uh, cluster building companies will be ab- happy to build new clusters in the place of old industrial places that are not used anymore. And the GPU providers, uh, uh, mostly one today, but probably more competition tomorrow, uh, will be happy to provide the GPUs there.

Stanislas Polu: And so, uh, I mean, it, it, it will have-- economically, it will have to equilibrate. And, um, and we know where the, uh... I mean, it's, uh, uh, with those type of technology, it always, uh, it always equilibrates to-towards commoditization. It's like your, it's like your, your, your mobile phone, [00:47:00] uh, your mobile phone service, uh, uh, service.

Stanislas Polu: You, you pay 10 bucks, 20 bucks a month, or maybe a bit more in the US, I don't know. Uh, but it's, uh, it's so cheap compared to the value it provides you.

Demetrios: Mm. It's

Stanislas Polu: just because it's fully commoditized, and it equilibrates to cost plus margin.

Demetrios: Mm. It is fascinating to think about that. And right now, though, where we're at living today, I imagine that's another piece that makes the ground underneath you very shaky.

Stanislas Polu: Mm-hmm.

Demetrios: And so trying to build in, going back to this metaphor of the fog of AI, knowing that the costs aren't really stable, locked in, reliable. And so then you're going out there and talking to users, and you have your own product pricing models and all of that fun stuff. And then the users are already having their own thoughts about the tokens that they're [00:48:00] spending and- Yep

Demetrios: how they're spending it. Uh, that makes for some interesting conversations.

Stanislas Polu: Yeah, definitely. I, I think these are-- It's the fog of pricing, I guess, this time around. Uh, uh, we, uh, uh, interestingly at Dust, we, um, so we started with the, with a flat price. Uh, and the, uh, the thinking was, was the following. The thinking was you want a flat price because you want to encourage usage and value creation.

Stanislas Polu: So you want kind of a Chinese buffet type of pricing because they- Mm-hmm ... uh, you pay once, can use as much as you want, and, uh, let's go for usage, let's go for, for, for value creation. Um, and it was also built on the assumption that eventually, even if the eventually was very l-loosely defined, eventually costs would go down.

Stanislas Polu: And as it happens, uh, uh, with the emergence of, uh, the models, uh, I mean, with the awakening of the [00:49:00] models, uh, at the end of last year, uh, it is not sustainable anymore for us. And so we had to refactor entirely our pricing to move to what is the industry standard today. And I think it's the only sustainable pricing unless you are, uh, willing to go to extremely, uh, negative margins.

Stanislas Polu: Uh, you, you, you, you have to move to a credit-based pricing. Yeah. Because you don't control those costs. We don't know what tomorrow's gonna be made of. We don't know what's, um, maybe Mythos will be really a banger, and it will be maybe really expensive at the same time. Yeah. And that's great. That's great if it, that's the case.

Stanislas Polu: That's great if you are in a credit-based pricing world because you're like, "People, just go for it. Use Mythos if you want. It's an awesome model. Uh, it's gonna cost you a lot, uh, but we ha- I mean, uh, it's fine." If you are on a kind of a flat price, uh, like a typical SaaS pricing, uh, you just, you just, like, uh, if a model drops, awesome, very powerful, but super expensive, you just [00:50:00] can't serve it.

Stanislas Polu: Just can't serve it because it's, uh, the, the, the economics don't work anymore. No, I'm just- I s- Oh, sorry. Go ahead. Yeah, no, it's all right. So I think the uncertainty and kind of the, uh, uh, time horizon for the, uh, uh, evolution of pricing makes it, uh, a pretty much an unstable position to not have a credit-based pricing in the market today, I think.
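
The flat-price versus credit-based argument can be sketched with a bit of arithmetic. Every number below (seat price, token volume, cost per million tokens, markup) is a made-up assumption for illustration, not Dust's or any provider's pricing, and the function names are hypothetical.

```python
def flat_margin(seat_price: float, tokens_used: float, cost_per_mtok: float) -> float:
    """Monthly margin per seat under a flat price: revenue is fixed, cost is not."""
    return seat_price - tokens_used / 1e6 * cost_per_mtok


def credit_margin(tokens_used: float, cost_per_mtok: float, markup: float) -> float:
    """Monthly margin per user under credits: revenue scales with the model cost."""
    cost = tokens_used / 1e6 * cost_per_mtok
    return cost * markup


TOKENS = 20_000_000  # assumed monthly usage for one heavy user

# Same usage, served by a cheap model and then by a 5x more expensive one.
for cost_per_mtok in (1.0, 5.0):
    print(
        f"cost ${cost_per_mtok}/Mtok:",
        f"flat margin = {flat_margin(30.0, TOKENS, cost_per_mtok):+.2f},",
        f"credit margin = {credit_margin(TOKENS, cost_per_mtok, 0.2):+.2f}",
    )
```

Under these assumptions the flat-price margin turns negative as soon as the more expensive model is served, while the credit-based margin stays positive because the cost is passed through.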

Demetrios: Yeah, it is very dangerous to- ... go per seat, and then especially if you're abstracting away the cost of the providers. There's some creative things you can do, like you were talking about earlier, of using smaller models. But that, at the end of the day, is not going to work, especially if folks are wanting these, the best-

Stanislas Polu: Yeah, exactly

Demetrios: models all the time.

Stanislas Polu: For, I mean, maybe there is a set of task where I'm happy to use a small model, but for most of the stuff I do with agents, I, I just want the best.

Demetrios: Yeah. [00:51:00] Well, and you don't-- It almost takes more time to figure out where to insert the small model.

Stanislas Polu: Yep.

Demetrios: Which slows you down, and you just, it's like, "Well, I know I could save a few cents here- Yeah

Demetrios: but let's try and use it because it's gonna be a lot easier for me- Yep ... than if you set up..." And, and that's one thing that has changed a lot recently too, is how we don't necessarily set up by hand each step of this graph. We let the agents, we kind of throw the problem at the agents and let them figure it out.

Demetrios: And so if the agents have the ability to choose the smaller models or kick off tasks with smaller models, that's great. I'm not gonna go in there and, like-

Stanislas Polu: Yeah, exactly ...

Demetrios: prompt it, "Use a smaller model for these specific tasks when you are going and doing them in the plan," you know? Yep. Uh, maybe there are people that do, and those folks, wow, hats off to you.

Demetrios: But [00:52:00] for now, I, I imagine you could optimize just by- Micro, micromanagement

Stanislas Polu: of agent fleets. Yeah. And

Demetrios: you probably could get more lift or more cost savings just by optimizing certain context window tricks or- Yep ... the ability to make sure that your agents aren't constantly ingesting files. Uh, maybe it's like caching tricks or things like that, that are gonna be much n- I'm not gonna say, like, much cheaper, but they are going to potentially give you more bang for your buck.

Stanislas Polu: Yep.

Demetrios: So there's, there's all that fun stuff. Now, I have one thing that has been going through my head that I've been wanting to ask you, and I'm sure you hear it every day that you talk to customers. And it's, well, yeah, cool, Dust, I see this vision, but can't I just do that with Cowork? As we were talking about earlier, Cowork is having its moment.

Demetrios: What is the unlock on this, like, single player to [00:53:00] multiplayer? I can share my slides with people, or I can share my sessions with people from Cowork. So where do I go with Dust?

Stanislas Polu: Yeah, I think there's, uh, so there's, there's three pillars to our differentiation with Cowork. Uh, it's, uh, so first on the, on, on that multiplayer, there is some use cases that we can explain pretty, you know, as I s- did for the Team Weekly, where I think you can paint a, a picture with you, you-- It's very hard to do, uh, with Cowork, that kind of a collaboration around a, a, a, a unit of work that needs multiple people.

Stanislas Polu: The, uh... I mean, you can obviously do it, but it requires more, more kind of manual work, and it's more, it's less integrated. I think there's, um, in terms of, uh, uh, enterprise, there's the, uh, governance part, which is very important. Um, having the ability to, uh, we, we try to do a better job than Claude, uh, on that: to, uh, be able to distribute in a controlled way the different, uh, MCP servers, different tools, the different [00:54:00] skills, uh, limit, uh, the access to some data to some users, have those things transitively managed so that people can discover the, the agents and the skills they can use, but in a way that is aligned with, uh, whatever the restrictions of the workspace are.

Stanislas Polu: I think, uh, uh, Cowork with, uh, all the things running locally, the skills being shared through, uh, kind of out of, out of band to some extent as of today, um, is a little bit more, a little bit more of a, of a, of a Far West when you want to have kind of a, a slightly more, uh, governed deployment of AI within your company.

Stanislas Polu: So that will talk to different types of companies, obviously. And finally, the third pillar, which is a pretty obvious one, is like you don't wanna get stuck with one model provider.

Demetrios: Mm-hmm. Especially as we were talking about tokenomics just now

Stanislas Polu: Exactly. You don't wanna be stuck with one token provider because, uh, the next best model, you don't know where it's gonna-- wanna know where it's coming from, and you surely wanna be, have access to it.

Demetrios: Hmm.

Stanislas Polu: And so I think it's, uh, kind of a, a mix of those three. We try to be very innovative on the product, [00:55:00] uh, in terms of, uh, multiplayer AI, and that's a differentiation. That's something that, uh, we hope will awaken and become stronger and stronger as we go. Uh, we wanna be, uh, better at the governance and kind of the enterprise readiness, uh, as much as possible in-- at the product layer, obviously.

Stanislas Polu: Uh, uh, we're not talking about, we're not talking about FDEs, uh, trying to implement stuff in your company, but really as a product be more enterprise ready. And then the third pillar is that kind of a, a freedom of, of choosing your model and being able to test open source models and keep your finger on that and have all of that in one central place.

Stanislas Polu: So I think it's really, that's kind of a, those three factors that make us, uh, still a, a compelling, uh, option, uh, at a time where everybody, everybody talks about Cowork. But, um, I mean, we've lived through other hype, micro hype cycles. Uh, we've had obviously ChatGPT, we've had-- We've been around long enough, we, [00:56:00] we're old enough to, to have lived as a company through the emergence of ChatGPT, uh, imagine.

Stanislas Polu: Uh, and, uh, we've lived through the, the, the Glean moment, like it was all about Glean, uh, two years ago. Glean, Glean, Glean. Yeah, what happened

Demetrios: to Glean? That's true. Glean-

Stanislas Polu: Well, I think it's, uh, still a very powerful, uh, enterprise search and AI answer product. Uh, and I think they, they're doing actually, uh, really well.

Stanislas Polu: I think they're just, uh, they're just, uh... I've seen, I don't know if it's official or not, but I've seen on, on a, on newspapers, I don't know if it's a true number, uh, but they've reached $300 million, uh, uh, revenue. It's a, it's a very healthy growing company. Uh, but at that, at that time it was the, it was really a darling.

Demetrios: Yeah, the be-all and end-all. I remember- Yeah ... those days in 2023 or 2024. Yeah,

Stanislas Polu: exactly. Exactly. So there'll be- Yeah ... uh, there'll be other hype cycles. That's fine. I think we, uh, what we wanna provide our users is really a platform where there is, uh, there is really, uh, building and, and helping them being at the forefront of what can be done with, uh, with agents.

Demetrios: Hmm. [00:57:00] So at the risk of jumping all over the place.

Stanislas Polu: Yep.

Demetrios: File systems.

Stanislas Polu: Yep.

Demetrios: You-- We got into this earlier. Are you in a place where you feel like file systems can be updated for more agentic work and/or, I'll add to this longer question, 'cause the way the agents work and the way that they will almost fan out and do things-

Stanislas Polu: Yep

Demetrios: and do things very compute-intensive things, and then just hang out for a minute and be like, "Hey, human, I want, I want your feedback." And maybe it's a minute, maybe it is an eternity in computer time. Yep.

Stanislas Polu: Yep.

Demetrios: Are there other areas where you feel like the current infrastructure that you have to build Dust, you need new tools?

Stanislas Polu: Well, uh, on the file system, it's true that, [00:58:00] uh, uh, we can predict a moment where, uh... So right now we discuss about it. We have a file system mounted on the pod. We have a file system mounted on sandbox. Uh, some of the file system in pod are shared. You can have multiple agents working with that file system.

Stanislas Polu: As of today, it's mostly still human collaborative driven, and so the kind of, uh, concurrency on that file system remains, uh, within, uh, typical human bounds. Yeah. Uh, but you could totally imagine like an army of agents collaborating on a project, and then the file system becomes the wrong abstraction. Uh, or I mean, it's not necessarily the wrong abstraction, but there's gonna be, uh, there's gonna be a concurrency pressure on it.

Stanislas Polu: Yeah. And it doesn't have any primitive for that. I mean, there's file locking, but, uh- Who uses file locking? Agents

Demetrios: will.

Stanislas Polu: Agents will, agents might. Yeah. Um, uh, but I, I, I think-- But to some extent, this, to me, this has been solved. There's a, there's a, there's a couple of [00:59:00] companies, uh, that are building kind of a Git infrastructure because, uh, if you want to scale, uh, providing GitHub repository to agents, uh, using GitHub is not very convenient because there are rate limits that are meant for humans on the platform.

Stanislas Polu: And so there's a bunch of companies that are pretty exciting that, uh, that built Git infra. And Git infra seems to be a, a fair, fair solution to the, uh, to the problem because at least you have that, uh, resolution, forced resolution step when you merge back to main. That makes, uh, kind of, uh, some sense.

Demetrios: Yeah.

Stanislas Polu: Uh, but at the same time, yeah, the, uh, uh... But I think the, uh, yeah, as, as we're, as we're chatting, I think one, one step before that that excites me a, a, a lot is the, uh, is the kind of a stateful sandbox. The sandbox that stays live. Uh, uh, because I'm super excited about kind of a hyper-specialized business apps.

Stanislas Polu: Uh, many of our users are already doing that with those [01:00:00] frames that we, uh, mentioned. But those frames, they're not backed by a particular sandbox, so they're, kind of, uh, snapshots. Which means that if you want to update it, you need to ask the agent to go fetch the data again and go update the frame.

Stanislas Polu: So what you really want- Uh-huh ... is you want those frames to be backed by a, a sandbox that stays around, because then you can have inside of a sandbox, you can have a SQLite database, and you can rebuild the CRM and ERP. You can rebuild, uh, whatever, uh, things you need to do your job. And the frame is the UI for the human.

Stanislas Polu: The sandbox is accessible by agents, and so you have an API for-- You have an, a way to work with agents or directly as a human. Every human can create a different view of the, the app. If, let's say, we are talking about, uh, uh, uh, feed- uh, a feedback system for the product where you automatically route to different owners, and that's something that you cannot manage to do well in GitHub or Linear, and so you build your own [01:01:00] thing.

Stanislas Polu: Maybe the manager will want a different view than the IC, which will want- Of course ... a different view than the, uh, person that reports the feedback. And that's, uh, having that ability is extremely exciting. And in that world, the kind of a de facto way to build that, I think that is very easy, is to do a shared-nothing type of approach where the sandbox is the state and you just, uh, you just let the agents do whatever they want, use whatever technology they want, and you just snapshot the sandbox.

Stanislas Polu: You can go back in time easily, and there is nothing more to it. There's no external database, there's no external Git repository. I think that will-- That's kind of a very nice, convenient way to package those kind of things for agents. Uh, the only problem is, like, if you imagine that you have hundreds of agents collaborating on a shared sandbox, then it starts failing, and maybe then you need, uh, uh, the codes to be managed by GitHub.
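A minimal sketch of the "sandbox is the state, just snapshot it" idea, assuming the whole app state lives in one SQLite file inside the sandbox. The file names, table, and helper functions are hypothetical. A real sandbox provider would snapshot the whole machine image, whereas this only uses the `backup` method from Python's standard `sqlite3` module to copy the database.

```python
import os
import sqlite3
import tempfile


def snapshot(src_path: str, dst_path: str) -> None:
    """Copy one SQLite database file over another using the backup API."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst)  # overwrites whatever dst contained
    finally:
        dst.close()
        src.close()


def count_orders(path: str) -> int:
    con = sqlite3.connect(path)
    try:
        return con.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    finally:
        con.close()


def demo() -> tuple[int, int]:
    with tempfile.TemporaryDirectory() as sandbox:
        live = os.path.join(sandbox, "app.db")
        snap = os.path.join(sandbox, "app.snapshot.db")

        con = sqlite3.connect(live)
        con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
        con.execute("INSERT INTO orders (item) VALUES ('cookies')")
        con.commit()
        con.close()

        snapshot(live, snap)  # take a snapshot of the good state

        con = sqlite3.connect(live)
        con.execute("INSERT INTO orders (item) VALUES ('mistake')")
        con.commit()
        con.close()
        before = count_orders(live)

        snapshot(snap, live)  # go back in time: restore is a copy the other way
        after = count_orders(live)
        return before, after
```

Here `demo()` returns the row count before and after the restore, so the unwanted row disappears once the snapshot is copied back.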

Stanislas Polu: Maybe you need the, uh, the data to be managed by a hosted database, which-- So we'll see.

Demetrios: So it [01:02:00] works until you get to a certain scale, which it doesn't feel like a lot of these workloads are at yet.

Stanislas Polu: Oh, yeah. I mean, it's, so the, the moment where... So imagine you are, uh, you are a small SMB, uh, in whatever countryside of, uh, of Germany, and you, uh, and you do cookies, and you're paying your, you're paying your ERP, uh, 20K or 30K a year to use three features of an ERP that has 10,000 features.

Demetrios: Yeah.

Stanislas Polu: That's, that's the state of SaaS today. And-

Demetrios: Exactly ...

Stanislas Polu: and so you may want to rebuild that, uh, that, that, that process, which is, uh, get me the dough, get me the thing, and then, uh, here are the, the commands, and it's a pretty easy one inside of one of those kind of a specialized app. And so, uh, there, indeed, there is no-- There's gonna be a few people, uh, interacting with it, maybe it's through agents, maybe directly through UIs, and I think having that, [01:03:00] that shared-nothing approach really works.

Stanislas Polu: The moment where it would fail is where, uh, you start growing and you have thousands of humans. If they interact through the UI, the model would still work fine. The moment it would fail is, like, you're now 10,000 people, or you're 10 times bigger, and now, uh, you need to make that system evolve, and you have agents working with it and changing the code at the speed of hundreds of agents at the same time.

Stanislas Polu: We can imagine that world, but that's, that's where it's gonna fail. But it's so far away, uh, that, uh, it seems like, uh, fine to not over-engineer for it.

Demetrios: And even the current state of a large company, you don't necessarily have the majority of your projects being hit hundreds of times a minute by agents. So it's almost like I see a 20, 80/20 principle here, where you could have that working for a [01:04:00] lot of your different projects, but then those projects that really need scale and they need that type of limiting or- Yep

Demetrios: unlimitedness to it-

Stanislas Polu: Yep ...

Demetrios: you would take those, or you would architect those in a different way.

Stanislas Polu: Exactly.

Demetrios: Huh. So that is stateful sandboxes. Are there any other hangups that we might encounter if we tried to do them? It feels like it could get expensive if I constantly have a sandbox running and I only use it once a week.

Demetrios: It could be wasteful.

Stanislas Polu: Or- No, but, uh, for that we have really great companies that give you the nice primitives there, where you just shut them down. The, uh- Yeah, and then it sort of- ... boot up time is- Yeah ... boot up time is 100 milliseconds. You don't have... If it's a serverless thing, the beauty of that is that you don't have to care about where the DB is, do we have to shut down the DB, blah, blah, blah.

Stanislas Polu: [01:05:00] No, everything is in the sandbox. It's a SQLite database. It's a file. It's one megabyte to do most of what any company would do with, uh, anything serious. It just works. I think it's what I really see as the prolongation of the Excel spreadsheets.

Demetrios: Yeah.

Stanislas Polu: Today we have the Excel spreadsheets, uh, no man's land, and SaaS.

Stanislas Polu: Yeah. Uh, and it's kind of the, uh... It's kind of a way to... It's a capture of a lot more that you can do with the spreadsheets, that you were forced to do with SaaS. And so that's funny because if you really believe in that vision, it feels like, uh, Wall Street was right way in advance for dinging the SaaS businesses.

Stanislas Polu: Uh, and that's, uh, it's pretty rare to think that Wall Street was really, really right in advance. But, um, but, uh, but that's, uh, yeah, that's the, uh, the fun, uh, the fun thoughts. Uh, but yeah, I think there's a massive opportunity to have anybody create the, the app they need to do their job, uh, that is, uh, equally accessible.

Stanislas Polu: The state is equally accessible by agents, and the, [01:06:00] the UI can actually be customized to every human, basically. That's, that's something very exciting over there.
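To make the "one shared state, a different view per person" point concrete, here is a small sketch built on an in-memory SQLite database. The `feedback` table, its columns, and the three role views (manager, IC, reporter) are hypothetical names chosen to mirror the product feedback example from earlier in the conversation. They are an assumption for illustration, not Dust's schema.

```python
import sqlite3


def build_app() -> sqlite3.Connection:
    """Create the shared state that agents and humans would both read."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE feedback (
            id INTEGER PRIMARY KEY,
            reporter TEXT, owner TEXT, area TEXT, status TEXT
        );
        INSERT INTO feedback (reporter, owner, area, status) VALUES
            ('ana', 'ben', 'billing', 'open'),
            ('ana', 'cho', 'search',  'done'),
            ('dev', 'ben', 'billing', 'open');
        -- Manager view: how many open items each owner has.
        CREATE VIEW manager_view AS
            SELECT owner, COUNT(*) AS open_items
            FROM feedback WHERE status = 'open'
            GROUP BY owner;
        """
    )
    return con


def manager_view(con: sqlite3.Connection) -> list[tuple]:
    return con.execute(
        "SELECT owner, open_items FROM manager_view ORDER BY owner"
    ).fetchall()


def ic_view(con: sqlite3.Connection, owner: str) -> list[tuple]:
    """IC view: only the open items assigned to this person."""
    return con.execute(
        "SELECT id, area FROM feedback "
        "WHERE owner = ? AND status = 'open' ORDER BY id",
        (owner,),
    ).fetchall()


def reporter_view(con: sqlite3.Connection, reporter: str) -> list[tuple]:
    """Reporter view: the status of everything this person filed."""
    return con.execute(
        "SELECT id, status FROM feedback WHERE reporter = ? ORDER BY id",
        (reporter,),
    ).fetchall()
```

All three functions read the same table, so an agent updating a row through SQL changes what every human view shows on the next read.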

Demetrios: Yeah. Especially if you're interacting with this through the chat ability. Or, I guess where my mind gets caught up and where I've never fully bought into this generative UI principle, although I'm coming around to it a little more the more and more that I use things like the HTML that the agents will create on the fly.

Demetrios: Uh, it's like, ah, this is, this is actually kind of that idea of generative UI. Yep. But what I would get hung up on is that a lot of times we have learned and we are comfortable with certain tools because we know those five things that we need to do. We know how- Yep ... to do them and where to do them. If we now are being asked to create our own tool that does those five things, and [01:07:00] we have to spend the time creating it, even if it is going to give us a lot of success on the tail end because it's gonna be much faster, it's gonna be more custom, we don't have to- Yep

Demetrios: spend as much money, it's a pretty hard ask to g- get it set up.

Stanislas Polu: Yeah, but it's gonna be as with any content economy ever. It's one percent producers, 99 percent consumers. Which is true of spreadsheets as well. There are people that know how to do great spreadsheets, and there are not that many of them, and they are the ones building the spreadsheets.

Stanislas Polu: And then there are the rest of us- And the dashboard ... consuming. Yeah. And the dashboard, and there's the rest of us consuming them.

Demetrios: Ah, so y- you're thinking there will be the builders of these, it's just not, we're not all going to be expected to build it. There's going to be the specific builders, and they're going to-- I like that idea of extending the spreadsheet.

Demetrios: You're now going to [01:08:00] get this micro SaaS that is very specific to your business.

Stanislas Polu: Yep. And any- anybody can build if they want. It's just that, uh, we know how it, where it lands. It's like some will build, some will consume, and...

Demetrios: Yeah. The, the one place that I think that is for sure having its questions is the no-code, low-code type of tools, because that is 100% in the crosshairs of- Yep

Demetrios: what we're talking about.

Stanislas Polu: Yep, yeah, yeah.

Demetrios: However, I did just see a tweet of somebody laughing saying I am constantly surprised at how folks will set up with their agents a whole workflow to automate something that costs like 90 cents to run, when with Zapier you could do it- Yeah ... for five cents.

Stanislas Polu: Yeah, yeah. [01:09:00]

Demetrios: So it goes back to that thing of like what costs versus convenience type of thing. Yep. And it is much more convenient to explain what we want, especially if you're using some dictation tool. You just mind dump, the agent organizes it and says, "All right, let's try it." You test it, it works, it looks good, ship it.

Stanislas Polu: Yep. Exactly. We'll see. It's hard to, uh, it's hard to predict where it's gonna land, but it's surely, uh, there's surely a ton of, um, ton of exciting stuff to explore. Agreed that, uh, the kind of, uh, on-the-fly dynamic UI, it's hard to see it fly right now just because of the latency. But here it's kind of a slightly more async.

Stanislas Polu: It's like, uh, you build once, use many times, but you can build very custom. And agreed that all of that is doable with, uh, with many no-code platforms out there. And it's, uh, it's very hard to understand why they, why they didn't conquer the world.

Demetrios: Yeah. Well, there, [01:10:00] there's also the other idea of just, uh, spitballing with you on product here, 'cause you're taking the approach of one sandbox that will have different views that people can drop into and they can look at.

Demetrios: However, what I wonder, and, and I see it as the use case in my mind is you have a lot of customer calls. Like, I'm sure you're on calls all day long talking with customers or talking with potential customers. And some of the time of that call, or the same call can have value for a salesperson, it can have value for a PM, it can have value for your engineering team.

Demetrios: And so you wanna take that transcript and you wanna display the valuable parts to each party that's looking at it. And so going back to this analogy of, all right, we've got data and we wanna display it to different stakeholders in different ways. Wouldn't it [01:11:00] be, or is there a world where I can, as a PM, have my sandbox and everything gets piped into that one place?

Demetrios: Or maybe that is just like you don't need to do it at the sandbox level because you are doing it in all these different ways, and you have something that's on top of all the sandboxes that's coagulating all of that data.

Stanislas Polu: Yeah, I mean, uh, some use cases will be, uh, leverageable by just having the data and having an agent just, uh, just process the data.

Stanislas Polu: There's a lot of that happening in Dust today. Like we, uh, you, we have a, uh, let's say a, a Gong connector. And so companies that have connected to Gong, they have all those Gong calls that gets, uh, transcribed and recorded, and that enables interesting use case where indeed product teams can have a-- launch an agent, do a deep dive on all of the customer calls for the past months, uh, tell me everything that they said about that feature or that thing, and that's, uh, [01:12:00] also completely possible today.

Stanislas Polu: Uh, is-- That is an exciting-- That, that's something that's possible today. Uh, the kind of a, a sh- uh, um, uh, micro SaaS view of that would be, uh, you have a call, you dump this transcript. There's a, it triggered in, in a, in a, in a UI. It triggers an agent that will, uh, create stuff for the sales team, create stuff for the product team, uh, auto-assign tags for the product surfaces that are concerned, auto-assign tags for the geography development review.

Stanislas Polu: And maybe you have a, a way to say, "Okay, looks good. It's good," uh, and submit, and then it gets, uh, stored in different places. I mean, there are two ways of tackling the same problems. Uh, but I do think that, um, like there are many cases, many use cases for, uh, kind of DB-backed apps that could be customized for a special way of doing work. [01:13:00]
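The flow described here (dump a transcript, let an agent draft notes and tags per team, have a human approve, then store) could be sketched roughly as follows. Everything in it is a hypothetical stand-in: the keyword table replaces the agent's judgment, and the `Draft`, `triage`, and `submit` names are illustrative rather than any real API.

```python
from dataclasses import dataclass

# Hypothetical keyword rules standing in for the agent's tagging step.
SURFACE_KEYWORDS = {
    "billing": ("invoice", "pricing"),
    "search": ("search", "results"),
}


@dataclass
class Draft:
    tags: list[str]
    sales_note: str
    product_note: str
    approved: bool = False  # flipped by a human reviewer in the UI


def triage(transcript: str) -> Draft:
    """Draft per-team notes and auto-assign product surface tags."""
    text = transcript.lower()
    tags = sorted(
        surface
        for surface, words in SURFACE_KEYWORDS.items()
        if any(word in text for word in words)
    )
    return Draft(
        tags=tags,
        sales_note=f"Follow up on: {', '.join(tags) or 'nothing flagged'}",
        product_note=f"Feedback touches {len(tags)} product surface(s)",
    )


def submit(draft: Draft, store: dict[str, list[str]]) -> None:
    """Store the notes in each team's place, but only after human approval."""
    if not draft.approved:
        raise ValueError("draft must be approved by a human before it is stored")
    store.setdefault("sales", []).append(draft.sales_note)
    store.setdefault("product", []).append(draft.product_note)
```

The explicit `approved` flag is the "Okay, looks good" step: nothing reaches the sales or product stores until a person has reviewed the draft.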

Stanislas Polu: Uh, I have, I have many of, uh, my own at Dust on-- that could change a little bit the way we operate. We kind of, we can always kind of a, uh, you know, you, you bend the way you want to operate to the tools that, that, that you use. Uh, the way you share, uh, goals for your week or your days, uh, at, at Dust happens on Slack.

Demetrios: Mm-hmm.

Stanislas Polu: So it has a certain form factor that is bent compared to the ideal state. Uh, the way we make decisions, we do that on GitHub, uh, which is, uh, interestingly a fun place to do because it helps kind of asynchronous collaboration. And sometimes it's done on Slack, and it's bent in very different ways depending on the platform you use.

Stanislas Polu: And so there's, uh, so many processes and ways to work that are shaped by our tools and not the other way around. And so that's what I find pretty exciting.

Demetrios: Ah. So then you would have the creative control to do what you want and not be beholden to- Yeah ... what the tools force on [01:14:00] you.

Stanislas Polu: Yep. Yeah, yeah.

Demetrios: I could see that. And, and the-- my imagination is going wild as I think through what you could do in those instances. I like this, like database-backed apps or just a sandbox with a database that's stateful is a very fun thing to ponder. Have you felt like, with the way that agents use databases, we need a new type of database? Or with the way that agents are constantly going around and trying to collect data?

Demetrios: And the reason I bring this up is because I was just reading that, uh, quote-unquote old paper, I think it was from December 2025. Yep. Uh, it was like supporting our AI overlords, and it's written by, you know, some of the greats. I think Matei Zaharia and [01:15:00] Ion Stoica and, uh, Shreya Shankar. And so they mentioned how you have these agents, and what they called it was like agent speculation.

Stanislas Polu: Mm-hmm.

Demetrios: So the agents will go, and they kind of have a theory, and then they'll try and find the data to support that theory and execute on what they need to do. But a lot of times if they're trying to get data from a database, it's not that efficient to pull the whole row or create this join from these gigantic tables or whatever it may be that you're like, "Uh, can we have agents interact with data in a more efficient way?"

Stanislas Polu: Yeah, I mean, there surely are many things to invent there, but I'm also, uh, I'm a bit of a pragmatist on those questions. And, uh, at the end of the day, when you think about that technology, it's trained on the internet. It's trained from, uh, uh, [01:16:00] reinforcement learning from human feedback. It's trained on, on, on human traces being built by many labs, et cetera.

Stanislas Polu: And so, uh, my point is that it's a technology that is extremely, extremely, extremely anthropomorphic.

Demetrios: Uh-huh.

Stanislas Polu: And so it, and it feels, it, it actually, we had that weird moment where we needed RAG.

Demetrios: Yeah.

Stanislas Polu: And it was kind of a new way of presenting the data made for agents, made for the constraint of the context size that was rather small.

Stanislas Polu: Uh, and with the context size augmenting and the, uh, the very anthropomorphic nature of that technology, the best way to do search in company data today is just to do what people call agentic search, which is basically: let's just use the tools that humans use. If we can do it, they can do it as well.

Demetrios: Uh-huh.

Stanislas Polu: And we're ready to pay the tax of the latency, uh, uh, associated, uh, because it's just much simpler to just give them the tool and let them, uh, let them do the work. And so, uh, I think, um, it [01:17:00] feels like, uh, as the context is getting longer, the way those agents use tools is looking very much like the way we use tools.

Stanislas Polu: And so if we didn't have the need in the past, why would they?

Demetrios: Yeah.

Demetrios: It's an interesting question. And they've been RL'd on what the humans will expect of it, so-

Stanislas Polu: Yep ...

Demetrios: if you-- And they've been RL'd on all these traditional tools also, so if you're now creating a new tool that you're expecting the agents to use well, it may take a few cycles.

Demetrios: I, I'm fascinated by that space just because I feel like you've got this whole new way of interacting with machines through agents and the infrastructure that we have. It's fun to think about, is it good enough? Can we do, like you were saying, can we explore and not be beholden to what tech currently- [01:18:00]

Stanislas Polu: Mm-hmm, mm-hmm

Demetrios: constrains us to? Or is there, like the other piece, the other side of the coin that you're saying, be more of a pragmatist and say, "Look, I think if we can use it this way, the agents can use it this way."

Stanislas Polu: It's, uh, it's obviously gonna be, uh, always a bit of, uh, a bit of both. But, um, uh, even in the way we use tools, we use them very differently.

Stanislas Polu: Uh, it's like, uh, we use UIs, uh, they use APIs turned MCP turned CLI.

Demetrios: Hmm.

Stanislas Polu: Uh, and eventually it's the CLI. And so I would love to use every SaaS in the world with a CLI myself. So in a sense, at the end of the day, we merge back into the most efficient place. Uh, I might use the CLI that, uh, has been built for agents for using, uh, those SaaS tools myself.

Stanislas Polu: I'm a bit of a fan of the terminal, but-

Demetrios: I just saw a tool, I can't remember what it was. It was like turn any SaaS [01:19:00] into a CLI. Yep. That was their whole tool. I can't remember what it was, but that is, uh-

Stanislas Polu: It's exciting to me.

Demetrios: Yeah.

Demetrios: Real- It's, yeah, it's, it's a different level that we're playing at. Well, man, this has been great. Is there anything that you wanna mention before we jump? Is there anything that I didn't ask you about? Like-

Stanislas Polu: No, I think this conversation was great. We covered, uh, we covered the, the main subject. I think we are on the verge of, uh, going towards a multiplayer AI.

Stanislas Polu: We don't quite know yet what it looks like, but it surely won't look like what we're doing today. And so that's, uh, that's why I, I think it's, uh, in the-- In that world where, uh, there are two black holes being created, uh, in the market, I think, uh, it's always an interesting question of, uh, are they not gonna, gonna take it all?

Stanislas Polu: Uh, we're really attached to the thinking that even if the technology, uh, again plateaus, uh, even with current model [01:20:00] shape, uh, the way we're gonna work in 10 years is probably, uh, nothing compared to the way we work today. And so there's still many, many, many opportunities to build for, for everyone.

Demetrios: I will also finish with Dust is known throughout the world for having the absolute best in-person events. So if you see a Dust event happening near you- Exactly ... I highly recommend that you go. Go for it. Yeah. That is for sure. They are awesome events. Anybody that has been to one will tell you the same. So thanks for doing this, dude.

Demetrios: I really appreciate it.

Stanislas Polu: Thank you very much. It was great.
