MLOps Community

Why Agents are Driving Software Development to the Cloud

Posted Apr 17, 2026 | Views 12
# AI Agents
# Cloud Development
# Warp Terminal

Speakers

Zach Lloyd
Founder & CEO @ Warp

Zach Lloyd is the founder and CEO of Warp, the platform for agentic development. What began as a reimagined terminal has evolved into an Agentic Development Environment where engineers orchestrate AI agents to build, deploy, and debug production software. Backed by Sequoia, GV, Sam Altman, Dylan Field, and Marc Benioff, Warp is used by over 700,000 developers at companies including Docker, Ramp, and Peloton, as well as leading AI labs, Big Tech companies, and over half of the Fortune 500.

Before Warp, Zach was the overall engineering lead for Google Sheets and the Google Docs suite. He’s also held roles at NASA, did a stint at Yale Law, served as interim CTO at TIME, and co-founded SelfMade. Zach holds a BS in Symbolic Systems from Stanford and a Master’s in Philosophy of Science from the London School of Economics and Political Science.

Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.


TRANSCRIPT

Zach: [00:00:00] However, if you're reviewing code that your agent wrote, that's a ridiculous flow. It's like, why am I leaving my workbench to go up to GitHub? You shouldn't. You should just do it right as the agent is writing it. And so what you're seeing is that some of the things that were like the hero features of GitHub, like I think collaborative code review, are gonna be moving out of GitHub.

Demetrios: I want to, I wanna start with the fact that you've been creating Warp. Before that you did Google Docs. You are all about the collaboration, and so Warp is a logical next step in collaboration, making the idea of software engineers collaborating in the cloud be as easy as the knowledge workers collaborating on Google Docs.

Demetrios: But there was something funny that I read when I was doing research to start [00:01:00] this podcast, and I think you have a bit of a hot take when it comes to sandboxes. You're not super stoked on sandboxes.

Zach: I'm cool. I'm cool with, I'm cool with sandboxes.

Demetrios: Well, what is it? Tell me why you don't like the idea of the paradigm of computers.

Zach: Okay. So, um, no, I'm cool with sandboxes. It's a funny way to start, with a hot take. My, uh, my take is just like, I've seen a bunch of tweets around like the right way of like moving agents into the cloud. It's like, set up like a cloud computer and let your agents kind of loose in it, and that's fine.

Zach: That's not exactly how I think it ought to work. I think that the primitive that makes more sense to me is actually like imagining your agents as sort of like, um, teammates who run in the cloud, and as teammates they have like sets of [00:02:00] permissions and are doing stuff as themselves.

Zach: And the really simple paradigm of like, well, move your laptop to a cloud dev box someplace and use that as a sandbox is kind of wrong, because what you actually want is different agents doing different kinds of things, and each one of those agents has its own set of like access controls. And just to make that more concrete, it's like on our team when we're using agents, we have agents that are automating all sorts of stuff, but some of those agents are automating things that like touch our CRM.

Zach: We don't want those agents like touching our code base necessarily. Some of those agents are, uh, you know, checking for fraud and need to look at our production database. And so I would rather view the agents less as like, you plop them into a cloud computer, and more as like, they are sort of teammates, each with a set of permissions and roles and their own little workspace, uh, where they just have, from like a least-privilege perspective, exactly what they need to do their job.

Zach: And I [00:03:00] think that the like cloud computer thing is like an analogy that is just not quite right. It's almost like the difference between, um, like a long-running server and a like lambda or cloud function. I think we want our agents running, honestly, more like lambdas or cloud functions.

Zach: But I don't know. It's not like, I don't, I'm not like upset about this view. Like I, I'm totally fine if people wanna launch cloud computers. Uh, I just think it's, it's not, it's not exactly the way that I would, uh, try to set it up technically.
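
A minimal sketch of the "teammates with permissions" idea described above, in Python. Every name here (`AgentRole`, `run_task`, the resource labels) is a hypothetical illustration and not part of Warp or Oz; the only point is that each agent carries its own allow-list and anything not granted is denied.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentRole:
    """A hypothetical agent 'teammate' with an explicit allow-list of resources."""
    name: str
    allowed: frozenset = field(default_factory=frozenset)

    def can_access(self, resource: str) -> bool:
        # Deny by default: anything not explicitly granted is off limits.
        return resource in self.allowed


# Illustrative roles modeled on the examples in the conversation (assumed labels).
crm_agent = AgentRole("crm-automation", frozenset({"crm"}))
fraud_agent = AgentRole("fraud-check", frozenset({"production_db:read"}))


def run_task(agent: AgentRole, resource: str) -> str:
    """Refuse any task that touches a resource outside the agent's grants."""
    if not agent.can_access(resource):
        raise PermissionError(f"{agent.name} may not access {resource}")
    return f"{agent.name} accessed {resource}"
```

Under these assumptions, `run_task(crm_agent, "crm")` succeeds, while `run_task(crm_agent, "codebase")` raises `PermissionError`, which mirrors the point that a CRM agent should not necessarily touch the code base.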

Demetrios: So what you're saying is there are other beliefs and views that you hold more dearly and you are upset about.

Zach: This is not a, this is not a thing that I spend like my nights thinking about.

Demetrios: Not the hill you're gonna die on.

Zach: No, no, not at all. This is like a kind of, oh, I see everyone moving this direction, I think maybe there's like a slightly better direction to move in. Besides that,

Demetrios: yeah,

Zach: that's all.

Demetrios: What are some hills you wanna die on?

Zach: Um, I mean, a big thing that I'm like focused on right now is I do think we [00:04:00] want to get agents off of people's laptops. Um, and so I do think that the move from interactive agents, which is what last year was all about, and I'm talking about the coding space here, where, you know, last year I think really caught on things like Claude Code, Codex, Warp to some extent, where it's like you as a developer are sitting there talking to your laptop, asking it to build features, asking agents to fix bugs.

Zach: You know, we're at a point now where, um, agents need to transition from being like a solo sport to a team sport. And this is what I'm hearing from companies too. Like we talk to a lot of business leaders, like they want a way of building automations with agents.

Zach: They want a way of seeing what these agents are doing. They want agents. You know, right now if you do things locally, it's like every time you start a new agent, you're basically starting from scratch in a sense, 'cause the knowledge of the agent isn't really being built up over time. And so this is the year, I think, [00:05:00] um, you gotta get agents off laptops.

Zach: You gotta get them into some centralized system that's running in your cloud where you are tracking what they're doing, where there's cumulative memory, where they can run when your laptop is closed and so on. And so I, that's like, I dunno if it's a hill I wanna die on, but it's like very much what I think is, that's just like what I think needs to happen this year.

Zach: And I think it is, is starting to happen for sure.

Demetrios: Well, how are you seeing the best teams collaborate?

Zach: Yeah, so it's pretty cool. Um, I mean, I can speak to our team, which I think is a pretty good team. It's like I can pull up now in Oz, Oz is our like cloud agent platform, and I can see what everyone on our team is doing with agents, which is pretty cool.

Zach: And so if, uh, like I wanna go like review a PR or something, I can actually see how the agent got there. I could see what the plan was, I could see the conversation, I could see what verification steps happened. Um, [00:06:00] I can, uh, also like set up basically environments for agents to run in that other people on the team can use.

Zach: I can set up skills, so things that are like repeated knowledge. A good example is like, our person who runs our data team has set up a skill where anyone on our engineering team can now like use an agent to access our, um, you know, our like data warehouse and the dbt tables, and the agent understands the schema and all of that.

Zach: And so we can all like now self-serve into doing data analysis when we launch new features and we wanna explore user behavior. Or even people who aren't on the engineering team, like, uh, who are on the go-to-market team or on the growth team, can use that same skill. And so I think a really high functioning team that's building with agents right now is thinking about like, how do you accumulate knowledge?

Zach: How do you make sure, um, that everyone on the team is like working, uh, you know, as effectively as possible? And that involves like a lot of transparency around how everyone is doing their stuff with [00:07:00] agents.

Alright folks, real quick. Hyperbolic's GPU Cloud delivers NVIDIA H100s at $1 and 69 cents per hour and H200s at $1 and 99 cents per hour with no sales calls, long-term commitments or hidden fees. Spin up one GPU or scale to thousands in minutes. With VMs and bare metal clusters, high speed networking and attachable storage, you only pay for what you use.

Save up to 75% less than legacy providers. You need steady production grade inference. Well, then choose dedicated model hosting with single tenant GPUs and predictable performance without running your own hardware. Try now at app.hyperbolic.ai. Let's get back into this show.

Demetrios: Yeah, there is a lot of, [00:08:00] the boring word would be governance stuff that you should figure out here.

Demetrios: And one of those that I've been experimenting with, or just realizing that companies are butting their heads against, is: when do we make a skill for the whole team? When do we make a skill for the whole company? How do you track thousands of skills across the company in this skill repository or database?

Demetrios: And then that's just for skills. What about for special prompts or special rules or all the ways that you want to now kind of solidify these practices and these operating procedures?

Zach: Yeah, so it's exactly what companies need as they scale. It's like, you can't just make everything available to everyone and every agent, or you're gonna get agent chaos and you're gonna get agents doing things that you don't want them doing.

Zach: Or, you know, more sort of prosaically, it's gonna overwhelm the agent. Like agents have limited context, like they're going to use the [00:09:00] wrong thing at the wrong time. Yeah. And so, you know, just in the same way that we've spent as an industry building, I don't know, 20 years of SaaS, 25 years of SaaS for humans to do all of these internal tasks.

Zach: What we need is a SaaS equivalent, essentially, for the agent knowledge worker or the agent coder to like find the right tasks to work on, to find the right data stores. And I don't think that SaaS is gonna look like human SaaS. Like with human SaaS, there's been a lot of work put into what are the dials and knobs and UI that a human needs to use in a tool like Salesforce or Linear or Notion or whatever.

Zach: And that stuff tends to be like irrelevant for agents. What matters to agents is like, what data is available? Can they understand the data schema? What actions can they take? Uh, how do they [00:10:00] report their artifacts? And so it's actually a much lower level, like machine oriented set of s, but I do think like, what's the SaaS platform that needs to exist for agents to do knowledge worker tasks, is a reasonable way to think about it.
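
One way to picture the scoping problem raised in this exchange (not every skill should be visible to every agent) is a registry where each skill carries a scope. This is a hedged Python sketch only: `Skill`, `skills_for`, the scope labels, and the example entries are assumptions made up for illustration, not an existing product's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """A hypothetical skill record: a name, who may use it, and what it does."""
    name: str
    scope: str        # "company" for everyone, otherwise a team name
    description: str


# Made-up registry entries, loosely based on examples from the conversation.
EXAMPLE_REGISTRY = [
    Skill("warehouse-analysis", "company", "Query the data warehouse tables"),
    Skill("release-checklist", "engineering", "Steps to verify a release"),
    Skill("crm-cleanup", "go-to-market", "Deduplicate CRM records"),
]


def skills_for(team: str, registry: list[Skill]) -> list[Skill]:
    """Return only company-wide skills plus the given team's own skills.

    Keeping the list small is the point: an agent with limited context
    should not be handed every skill in the organization.
    """
    return [s for s in registry if s.scope in ("company", team)]
```

With this example registry, an agent working for the engineering team would be offered the warehouse and release skills but not the CRM one.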

Demetrios: So I've heard you talk about this meta app store type of thing before, and now I'm starting to wrap my head around it, because it feels like there's gonna be an app store, but it's not going to be the traditional SaaS that we think of, and it's probably going to be very bespoke to each individual organization.

Zach: Yeah, so the way I see SaaS sort of developing is that increasingly the front door to any like knowledge work task is gonna be what I call a, like a meta app, which, um, I think the easiest way to think about it is something like Warp or something like Claude Code or Claude Cowork or GPT Codex, where you, uh, as a knowledge [00:11:00] worker, the mindset that you're gonna be in is not like, I want to go use a tool to do a task.

Zach: Like there's only gonna be one tool. And that tool is where you ask an agent to do something. The agent can actually do the thing for you. And to like make this more concrete, for most of my career, if I wanted to do some kind of data analysis, uh, I would probably import the data into a spreadsheet.

Zach: Like I'm a spreadsheets person. I helped build Google Sheets. Uh, I would go in there, I would probably like add a bunch of formulas. I would maybe create a pivot table. I would format it. I would try to like generate some charts. And then I would try to like, but like why am I doing that? It's because I'm trying to understand something or get some insight.

Zach: And today, the spreadsheet for me is being disintermediated because I will just ask Warp to like do the analysis for me. Yeah. And what's really cool is it's like Warp can literally, um, build me the spreadsheet on demand if I wanted it to. [00:12:00] It can build me a table, it can build me formulas, it can build me charts.

Zach: I just don't need to, as a user, learn another interface, and it's like a pain to learn how to use a spreadsheet. It's like, it's what people in finance spend years doing. It's like you don't actually need to shape, uh, as a human, your skillset to the app. Like in the past, you would need to like go read the manual essentially.

Zach: There's only gonna be one manual to read, which is, again, it's a tool like Warp or Claude Code or Codex, one of these tools that can basically go from your intent to solving your problem. And the types of problems that these apps can solve is like, it's incredibly wide ranging. Like another example is, I don't really make presentations anymore in a presentation tool.

Zach: I just ask, uh, an agent to make me a website. [00:13:00] Uh, and like I ask it to read our brand manual, so the website is styled the way that I want. It's just so much faster to have it do it via code than it is for me to like go into Google Slides or Figma Slides and start like figuring out how to like, you know, draw cursors and shapes.

Zach: I just would rather go right from my intent to the outcome. And so if you think of it like that, you should start thinking about like, well, what value do these SaaS interfaces really provide? And I think at the interface level it's not that much. I think there is a lot of value that they provide in terms of repeatable, complex business logic.

Zach: So for instance, like an app that I think is probably fine is something like Stripe, where it's like years and years of encoding like regulations and integrating into payment systems, and it's not a thing that you're gonna do on the fly. But there are, you know, a lot of apps that are just [00:14:00] sort of a simple UI on top of a database schema that I don't want to use. I'll give you one more example.

Zach: I know I'm talking a lot. Like our, uh, talent team built our own interface on top of our applicant tracking system, our ATS. We like it more than the interface that this ATS has been building for the last 10 years or whatever. And this is built by our talent team, who don't code.

Zach: And so it's just like bespoke apps for, uh, these internal applications, or even no apps, just go straight to the answer. Just go like from a prompt, use the database, get what you want. That is the future.

Demetrios: Yeah, I've said this a few times before where it's like the agent is now the equivalent to the browser in a way because it's your gateway to the world.

Zach: That's a great analogy.

Demetrios: And so

Zach: that's a great analogy. It's, yeah,

Demetrios: we don't want 50 million browsers, we just [00:15:00] want one browser that can get us to the webpage.

Zach: Yes. I think that's maybe even like the better word for it. Like I'm calling it a meta app, but it's like, it is a browser that does things, essentially.

Zach: Yeah. Um, that is maybe a good way of putting it. But yeah, the only app that you really are gonna need is the way of interfacing with the agent for a lot of tasks. Not for everything, but for a lot.

Demetrios: Yeah. It feels like then your app is just a skill in a way.

Zach: Yeah. It's kind of true. It's like, the data needs to live someplace, right?

Zach: So I think data still matters a lot. Um, you need to guide the agent in terms of what the data means and how to use the data. And that could be a skill. Um, there's some place for code still, in terms of, you know, one of the problems with agents is that they're non-deterministic. So even if you give an agent a really good skill, if you have [00:16:00] like a business process that needs to be done exactly right, that needs to be in code, in my opinion.

Zach: Um, again, just going back to like a financial transaction system, you don't want an agent like inventing how to do that on the fly, but, you know, the, there's just a lot of things where it's totally fine to have the agent do it on the fly, and so it's, it's gonna be a really different world of software.

Zach: Software's gonna still matter a ton, but, you know, the way you interact with it is gonna be very different. And honestly, I think that's one of the things I haven't heard people talk about quite enough. Like there's a lot of talk about, um, the way that you build apps is very different now, which is true, right?

Zach: It's very easy to build an app in Warp or Claude Code or even like a thing like Lovable. But what actually is gonna be the biggest change for knowledge workers is the way that you use apps, not the way that you [00:17:00] uh, build apps. And so the way that you like learn how to prompt and talk and work with an agent is gonna be the biggest change that's coming.

Demetrios: So one thing that happened to me, and I have an exact experience of this, is I created a little Electron app. I vibe coded it up.

Zach: Okay.

Demetrios: And I was super stoked because it did all these different things to create short clips for my podcast. Right.

Zach: Cool. Yeah.

Demetrios: Later on, I thought, wait a minute. Why do I even need an app for this?

Demetrios: Why did I go through the process of this app when now I can just create it inside of the agent and I can tell it, all right, go ahead and use this skill to extract the transcript, then use Remotion, and, uh, it's like a process that it has now in a skill form. And I don't actually need an app. I don't need to drag and drop the clip.

Demetrios: I can just say, here's my [00:18:00] video. Go and do your thing.

Zach: Exactly. That's how it's gonna work. Like just encode the knowledge in a skill. And then this one interface of like interacting with the agent can basically get you the output. And honestly, if you needed an app, it can do what I call like build you a just-in-time app.

Zach: It can build you a little interface in the course of interacting with the agent. But I don't know, like the idea of these long-lived apps where you have to learn how to use them, I think, is gonna change. I think that's actually pretty cool. I think it's gonna be pretty fun. It is a better way of interacting with computers.

Zach: The way I called it in my article is like, we're all like digital gods now. We can all just like have computers do what we want, when we want. Um, at least until the singularity or whatever, until they stop listening. But like at this moment, it's never been cooler, in my opinion, to just be able to [00:19:00] tell a computer what you want it to do, and you don't have to shape your behavior to the computer. It will do what you want. It's pretty neat.

Demetrios: Well, that goes to this idea that you have on like how important it is to be able to express your intent. Yes. And I've also said this, where articulating what's inside of you that you wanna get out is one of the strongest skills that we can have these days.

Demetrios: You know, whatever's inside of you, you've got all these ideas, or you've got this stuff that's stewing, and how do you articulate that in a way that can then be taken by the agent and created?

Zach: This is one of the first things that I wrote about when agents started to become a thing, which was like, the limiting factor here is actually, um, humans' ability to express what they want, right? And that's true if you have an agent that can basically do what you [00:20:00] tell it to do.

Zach: It will do what you tell it to do, but it can't, like, yet read your mind.

Demetrios: Yeah.

Zach: And so, um, and the other thing that's interesting here is like often you don't actually know what you want a hundred percent.

Demetrios: Yeah.

Zach: Until you start doing it and iterate. And I believe creativity and like building stuff is fundamentally an iterative process.

Zach: This is how I work at least, where, okay, I think I kind of want something that has this shape. And the agent will gimme something. I'll be like, no, no, no, I don't want it. That's the wrong shape. Let's go again. And so that's gonna be what sets, um, I think great knowledge workers apart, like great builders, you know, creators, whatever: the ability to iterate and express the intent that you want to an agent to actually get to something that's cool and useful and like worthwhile.

Zach: And so it's really [00:21:00] interesting from the standpoint of like, okay, if you're someone who is, you know, entering the workforce, or you're gonna try to figure out what you should learn how to do, you should actually learn how to communicate really well.

Zach: Uh, you should learn how to like write well. You should learn how to like think very clearly and express intent. Um, which is, you know, not the same exact skillset as like learning how to program. Um, it's like a little bit different. It's related.

Demetrios: Uh, and I wanted to get into a little more of something you glossed over earlier where working in Warp, you have the whole team that can see the whole history of how a PR was created.

Demetrios: I know that you have been pretty bullish on file systems from the get go. Can you explain how that looks when I submit a PR, how I can see that [00:22:00] provenance, how everything is working?

Zach: Yeah, so this is one of the big things that we're building. So we built this, uh, platform called Oz, and the idea behind Oz is to make it really easy to, you know, basically move all these cloud agent conversations and prompts and execution off of laptops into the cloud and get to something that's like, you know, a Google Docs type system, where it's like we're all collaborating together as a team, and moving from the world of like a Microsoft Word on everyone's individual laptop to a world where these agents are sort of team visible and team owned.

Zach: So the way that it actually works is like, um, when you run a cloud agent through Oz, um, basically everything is cloud backed, and so the cloud backing can be stored in our database, or, honestly, I think companies want the ability [00:23:00] to own this data, right? They want to own the conversation traces, they want to own the prompts.

Zach: Uh, and so we're making everything pluggable. So if you want, as a company, to be like, okay, we will provide a data store into which Oz cloud agents can record all of this stuff so that we own it, you can. Oz is basically the orchestration layer and the coordinator and the thing that says, okay, when Zach launches an agent, we're gonna run this agent in a particular environment.

Zach: We're gonna run it on a particular cloud machine. We're gonna write all the data to a particular cloud location. And, uh, it's all API based. So, you know, that's all flexible. So it could run on our hosting or it could run on another company's hosting, which by the way is a thing that companies really want, uh, the ability to have these agents not lift their code out of their own infrastructure to someplace else.

Zach: Uh, it can write the data into our backend or it can write the data into their [00:24:00] backend. So our product approach is really around flexibility, um, for companies that wanna like deploy something like this. We're not trying to force everyone into a particular way of doing it. Where we do, uh, sort of force consistency is at the API layer.

Zach: So there's a bunch of contracts where, uh, the agents like they need some place to write a conversation that conforms to a particular model. Does that make sense?
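
To make the "pluggable store behind a fixed contract" idea concrete, here is a small Python sketch. It is an assumption-laden illustration, not Oz's actual API: `TraceEvent`, `TraceStore`, `InMemoryTraceStore`, and `record_run` are invented names, and a company-owned database or object store would simply be another class with the same two methods.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class TraceEvent:
    """One entry in a conversation trace (hypothetical shape)."""
    run_id: str
    role: str      # e.g. "user", "agent", "tool"
    content: str


class TraceStore(Protocol):
    """The contract: any backend that offers these methods can be plugged in."""
    def append(self, event: TraceEvent) -> None: ...
    def read(self, run_id: str) -> list[TraceEvent]: ...


class InMemoryTraceStore:
    """Stand-in backend; a customer-owned store would implement the same methods."""
    def __init__(self) -> None:
        self._events: dict[str, list[TraceEvent]] = {}

    def append(self, event: TraceEvent) -> None:
        self._events.setdefault(event.run_id, []).append(event)

    def read(self, run_id: str) -> list[TraceEvent]:
        # Return a copy so callers cannot mutate the stored trace.
        return list(self._events.get(run_id, []))


def record_run(store: TraceStore, run_id: str, steps: list[tuple[str, str]]) -> None:
    """The orchestrator only depends on the contract, never on a concrete backend."""
    for role, content in steps:
        store.append(TraceEvent(run_id, role, content))
```

Because `record_run` is written against `TraceStore` rather than a specific class, swapping where traces live does not change the orchestration code, which is the flexibility being described. The sketch assumes Python 3.9 or newer for the built-in generic type hints.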

Demetrios: Yeah. I guess where I'm questioning now is, do you have a spectrum of how like locked down it is? So if somebody doesn't want anything going outside, they can't use the research lab models, like, so they're not privy to Codex or Claude Code.

Zach: So we support a variety of things here. What I find that companies often want is to provide their own LLM endpoints, so companies will often want to, uh, serve these models, uh, through [00:25:00] things like Amazon Bedrock or GCloud, like Vertex, or Azure. And, uh, so we try to support that as well. Um, again, there's like a real spectrum.

Zach: Some companies are just like, host the whole thing for me. Yeah. Like they don't care. They want ease. Some companies are like, we can't have inference leaving our cloud. We can't have, uh, code leaving our cloud. And so there's just a variety of things, and we're trying to build something that is, you know, we don't do like a full on-prem yet.

Zach: Like we could. It's like, that starts to be hard for us to manage. But within everything but that, we are trying to make it so that it's very, very flexible in terms of, you know, data ingress and egress. 'Cause, you know, for companies, this is their IP that the, uh, agents are working on.

Zach: Sometimes it's the customer data, and so you need to be really careful.

Demetrios: [00:26:00] Yeah, I can imagine why companies would want all of this provenance on how folks are interacting with the agents, but what are the big use cases there? Is it auditability?

Zach: Exactly. So, okay, so the big use cases are auditability.

Zach: So especially if, um, you know, something goes wrong, you wanna understand why,

Demetrios: how

Zach: we got there.

Zach: And that could be for, uh, you know, it could be for like, uh, security. It could be for just like, uh, um, yeah, like something caused a crash and you wanna figure out why. So it could be debuggability. Uh, and sometimes honestly for like compliance reasons, like for companies, it's just like not safe to have agents doing stuff that people aren't tracking.

Zach: Um, you want, uh, cumulative knowledge. So like, I would call it like a context management, so that over time [00:27:00] the agents improve. And so one of the foundational building blocks for that is like some way of seeing what agents have done, how they've been prompted. Probably evaling, especially for automations.

Zach: Like, is this the right setup? Is this the right way of configuring the agent? So if you wanna do that, you need to see a record of the agent runs and how it worked. And then a third thing would be, um, you know, handoff and steering. So, uh, if you have everything in the cloud with a conversation trace.

Zach: You can do things like, um, you can have an agent start a task in the cloud, and then you can have a developer pick it up, or you could have multiple developers pick it up, and, uh, so that sort of functionality again is unlocked by being in the cloud. Uh, so there's a whole host of reasons why I think it's a better architecture.

Demetrios: Yeah. There was [00:28:00] something that I think you said in your talk at the coding agents conference we did. Yep. And it felt like it was almost like these 10 commandments that you've thought about when it comes to what you would want with a tool like this. Yeah. Can you go over those real fast?

Zach: Yeah, sure. So I think you want, uh, some of them I just mentioned, I think you want an audit trace for the reasons I just said.

Zach: I think you want, um, handoff, because these agents are not always able to, um, complete a task. Uh, I think you want, um, access control. So you want the ability to grant agents granular permissions to systems and understand, like, and be able to limit what they can access. You want, um, memory, which is, I've been calling it like cumulative context.

Zach: Memory is probably a better word for it. You want, uh, over time, um, [00:29:00] the system to work better. You want evals. So you want the ability to measure how, uh, these agents are performing over time. Uh, and then from a like deployment perspective, you want, uh, flexible deployment.

Zach: So meaning like, you know, I think it's really important to companies that the agent can run in existing environments and you don't have to like necessarily forklift code over. And then the final thing I would say is you want programmability. And so that means anything you can do with an agent through a UI, uh, I think, honestly, everything should be built as a programming stack.

Zach: So there's an SDK for launching agents. There's an API for launching agents. If I want to programmatically get, uh, the artifacts that an agent produced, uh, I should be able to do that. And so, yeah, I think that's the shape of a good agent system. I [00:30:00] see a lot of companies trying to build this. Some companies will be successful at building it.

Zach: I think a lot of companies are gonna waste a bunch of time trying to build something. 'Cause it's very easy to like get started with a website where you launch a Claude Code. But that's like not gonna get you to, uh, a real, like, scalable solution for big companies. It will, like, if you're like Stripe, and they have a product around this, you can do it.

Zach: But I would think for most companies you should be looking at what are the components that you should be, uh, using from something like Oz, where you can sort of like, you know, bootstrap your agent system and not waste time, uh, building it and make sure it works well.

Demetrios: So one thing that you didn't.

Demetrios: Hit on, or maybe I missed, it was observability of the agents working.

Zach: Uh, I missed this one. Sorry. Yeah, this is the other one. It's like, yeah, you should be able to go in at any given time, get a link to an agent and just see what it's doing and steer it, honestly. So [00:31:00] if you wanna hop in and change its direction, you should be able to do that.

Zach: Um, you should be able to do it from anywhere, is another thing, like agent access anywhere. So you could do it from, uh, you know, you should be able to do it from your phone. You should be able to do it from, uh, you know, like wherever. Uh, and so these agents are things that, you know, just like if you have a teammate on Slack, you should be able to like access your,

Demetrios: your agent teammate wherever you are.

Demetrios: Yeah. Okay. So that is going back to what we were just saying earlier: how we interact with the agents and how we interact with apps in general is a much different experience, because we're now interacting in a way that we wanna see all these agents in the UI, but that might be the only UI that we care about.

Zach: Yeah, I mean, I think, again, your job as a [00:32:00] builder, um, and I'm gonna use builder even more broadly. It's not just, like, software engineer anymore. It's, uh, even as a knowledge worker in general, it's gonna be, like, how do you effectively prompt, manage, and, um, you know, kind of steer a bunch of these agents to do useful stuff?

Demetrios: Yeah.

Zach: And so, um, you know, you should imagine you're gonna have a lot of them, they're gonna be running on different time horizons. They're gonna have different skills, they're gonna have different outputs. It's gonna be really, like, one thing that is really kind of annoying is like, it requires a lot of multithreading in your mind.

Zach: I don't know if you feel this. I constantly feel like I'm losing track of, like, oh, I had... Yeah. Exhausted. It's exhausting. Just from, like, a psychological perspective, one kind of unfortunate outcome is, like, I'm getting more done, but I feel like I'm also working harder. [00:33:00]

Zach: Yeah. It's like more mentally taxing on me. Like we're

Demetrios: working harder, but less at the same time.

Zach: Yeah. I'm like, just like juggling a lot of stuff, you know? Um, like typically when I do a podcast like this, I'm like, I have agents working on stuff in the background and I'm like, okay, I'm gonna have to go back and check that PR

Demetrios: Yeah.

Zach: When this is done and like it's, uh. It's not really making my life easier. It's making me more productive, but it's not, yeah, I don't think the demands on people are gonna be less, unfortunately.

Demetrios: No. Well, yeah. I mean, fortunately, unfortunately, it definitely is a lot more, like, brain juice that you gotta put towards something.

Demetrios: And

Zach: It's not the fun version. It's, like, the context-switching brain juice. Yeah. Which is what I hate. Like, it takes you out of flow. I would much rather be in flow coding something

Demetrios: deeper

Zach: work, but

Demetrios: yeah.

Zach: Yeah, that's like more fun. [00:34:00] But I just can't really justify doing that when I could be building five things at a time.

Zach: You know, it's, which, yeah, it's, it's a weird situation.

Demetrios: It is. Well, and I have to ask, I know you've answered it a few times before, but Yeah, I imagine listeners are probably wondering this question of like, how do you feel like you can compete against the labs that are just pushing out subsidized tokens?

Zach: It's an awesome question. It's the most challenging thing for our business, for sure. And so just to expand on the question, it's like, when we, uh, offer AI in Warp, we essentially are, like, paying API rates. It's somewhat better than API rates because at our scale we're able to, like, negotiate discounts, uh, with the model providers that, um, you know, if you're an individual using the API, you can't really do.

Zach: Um, but [00:35:00] the short answer is, like, I think we are trying to be complementary to the other tools on the market. Uh, we are trying to build something that is actually a great place to use Claude Code, if you want to use Claude Code, or if you want to use Codex. We are building features that make it possible to use, like, our code review and our file tree and voice input and, like, actually make a really nice workbench for any coding harness.

Zach: We have a bunch of launches that are coming out around this. Uh, Oz, our orchestration platform, we're also making it work multi-harness. And the idea is just, like, I think it's better for us to complement rather than compete when we're in a period of cost being a major driver of what people choose. From a pure product standpoint, I'm very biased, but I think our product is, like, better than a lot of these other [00:36:00] TUIs. Like, I think it's better than a lot of the other stuff out on the market, but we're not in a moment of pure product competition, because the labs have a cost advantage.

Zach: Which is not, like, unfair. It's their model. It's fine. I do think, though, for the record, it's temporary, because, uh, I think open-weight models, uh, you know, Nvidia just invested $25 billion in this. There's, like, Reflection AI, there's a whole bunch of open-weight models that are gonna become good enough for the coding use case, and that's gonna drastically change the economics of selling tokens.

Zach: But for now, like, I don't think we should try to compete too much on that. That said, we're still having, like, pretty wild growth, even as a business of, yeah, selling these API tokens. Like, we're still growing really fast.

Zach: Like, we're growing at over [00:37:00] a million in new revenue every week recently, and, uh, we don't lose money on it, right?

Demetrios: Yeah.

Zach: And so it's pretty cool from the perspective of like, even though we aren't the cheapest option. It's such a good product and it's so useful to people that they're totally willing to pay a slight premium to get, uh, to use AI through us.

Zach: And so, you know, it's, and it also just speaks to just like the value, like the market that we're in, being such an incredible growing market that I still think we can build a really big business even if we're not like the cheapest product around.

Demetrios: That's a great answer. And it's interesting that you talk about swapping out harnesses.

Demetrios: Yeah. I would have considered Warp to be a harness. You don't

Zach: look at it now. Yes. We have our own harness. Yeah. So, okay, so we have, um, [00:38:00] well, here's how I think about it. So in case people are not totally familiar, Warp itself is the terminal, or we now call it, like, an ADE. So it's an Agentic Development Environment.

Zach: It's a term that we, we coined actually, that's becoming, it's like a hybrid

Demetrios: terminal, right?

Zach: Yeah, it's like a hybrid terminal that has some coding features in it, but only coding features that are really useful for agentic development. Um, within Warp we have our own agent, which, um, think of it as, like, our version of Claude Code, where the advantage of using our agent is that it's really tightly integrated with the UI.

Zach: And so, like, um, you get a better user experience than if you use a pure CLI. So, like, we give you editable diffs, we give you code review. We have LSP that runs over the agent stuff. And so there's all these nice features around using our agent, and it's a really, really good agent harness.

Zach: So, like, [00:39:00] historically, we've been many times at the top of the public evals. So one of them is Terminal-Bench. We've been in the top three on SWE-bench. Uh, so it's a really great harness, but we're about choice at this point. Uh, and we recognize that some users, honestly, for cost reasons, would prefer to use Claude Code or Codex.

Zach: They wanna use their subscription or whatever. Hmm. And so you can use those things in Warp and have a really, really great experience that way as well. And so, you know, we do have our own harness, but we don't sort of force it, uh, is our take on it right now.

Demetrios: It's bring your own harness or use off the shelf in a way.

Zach: I like that. Like, developers are curious, right? So sometimes developers want to use Gemini, or it's like, I don't, it's, Gemini's fine, uh, sometimes. But there's a totally good reason why, if you're a developer, you [00:40:00] might wanna switch between our Warp harness, the Claude Code harness, the Codex harness.

Zach: You, you might wanna run all three. Honestly, for different types of tasks, you might wanna compare them.

Demetrios: Yeah.

Zach: Um, with Warp you can, like, you know, I think one really cool thing that comes out of Warp orchestrating multiple harnesses is, like, you can fire up three agents and have Gemini try to do a task, Codex try to do a task, Claude Code try to do a task.

Zach: You can have Claude Code review Codex's work. It's just, like, a cool thing to sit one layer above from an orchestration standpoint. And so there's real power in that. Uh, and for companies, we can provide metrics in terms of, like, which one of these harnesses is actually performing best.

Zach: Uh, and so there's, there's like a nice thing just being one layer above.

Demetrios: You have that optionality, but also again, going back to the observability and being able to tell, first of all, what are people [00:41:00] preferring? And then second of all, actually, should they be preferring something else? Because it feels

Zach: like, like what's exactly for and on like a, a tokens per.

Zach: You can measure the efficiency of the harnesses. You can measure, like, the PR throughput. From an observability standpoint, it's really great. And then the other thing that companies are concerned about is lock-in, right? Yeah. So they should be, honestly. Like, you should be concerned if you're locking in your entire stack to, like, Claude Code or Codex or something.

Zach: It's like, I don't know. There's this crazy stuff where the government is designating Anthropic as a supply chain risk. And even beyond stuff like that, you should be setting yourself up as a company for a future where some of your models are open weight. Yeah. I really strongly feel that you should not be imagining that you're gonna have to pay a ton for intelligent tokens, uh, forever.

Zach: And so I would be very wary as a company of like just betting on a single frontier lab to provide all your tokens because you're [00:42:00] gonna end up overpaying and giving it a bunch of power.
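The per-harness measurement Zach describes (token efficiency and PR throughput) could be sketched roughly as below. This is a minimal illustration, not Warp's or Oz's actual metrics: the `AgentRun` record, the harness labels, and the choice of "tokens per merged PR" as the efficiency figure are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AgentRun:
    """One agent run. Hypothetical record shape, not a real Warp/Oz schema."""
    harness: str       # illustrative label, e.g. "warp", "claude-code", "codex"
    tokens_used: int   # total tokens consumed by the run
    pr_merged: bool    # whether the run ended in a merged PR


def harness_report(runs):
    """Aggregate runs per harness: run count, merged PRs, tokens per merged PR."""
    totals = defaultdict(lambda: {"tokens": 0, "merged": 0, "runs": 0})
    for run in runs:
        entry = totals[run.harness]
        entry["tokens"] += run.tokens_used
        entry["runs"] += 1
        entry["merged"] += int(run.pr_merged)

    report = {}
    for harness, entry in totals.items():
        merged = entry["merged"]
        report[harness] = {
            "runs": entry["runs"],
            "merged_prs": merged,
            # None when nothing merged, so a zero-PR harness is not ranked as "free"
            "tokens_per_merged_pr": entry["tokens"] / merged if merged else None,
        }
    return report
```

Comparing `tokens_per_merged_pr` across harnesses on the same set of tasks is one way to ask which harness is performing best; other denominators (per task, per accepted diff) would be equally reasonable.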

Demetrios: Yeah. I make this joke a bunch, but one of my favorite tweets I saw was: instead of having Codex review Claude Code's work and saying, hey, review this work and tell me where there were problems,

Demetrios: you tell one of them that the other one just went through the codebase and introduced a bunch of bugs, and it needs to go find them.

Zach: That's smart.

Demetrios: And so it's much more rigorous and much more like, oh man, there's bugs everywhere. We gotta go find it.

Zach: That's awesome. That's hilarious.

Demetrios: So,

Zach: Um, yeah, I mean, we're investing also in, um, an actual orchestrator.

Zach: And so what I mean by that is, with Oz, we have all these orchestration primitives, you know, the five things I mentioned earlier. But what's [00:43:00] cool is once you have those primitives, you can start to have different patterns of orchestration. And so you can have an orchestrator where you have, like, a team leader that says, if you give it a complex task, I want to spin up five agents to do this.

Zach: We're gonna have two agents that are verifying, we're gonna have one that's designing, and there's, like, a ton of different patterns you can do for this. Wow. Just like there's different patterns for how humans work together. And so this is gonna be, um, native functionality that we have for making it possible for agents to do harder, longer-time-horizon tasks.

Zach: Um, and so it's gonna be, like, a crazy type of sci-fi experience where you see agents coordinating with each other. Uh, but yeah, that's definitely coming. It's pretty cool.
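One way to picture the team-leader pattern described above is the sketch below: one designing agent, a set of implementing agents run in parallel, and verifying agents that review each draft. Everything here is hypothetical. The `Agent` type, `make_stub_agent`, and `orchestrate` are stand-ins invented for illustration and are not Oz's orchestrator or API, and the split into two implementers is an assumption (the conversation only specifies five agents, two verifying and one designing).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical agent interface: takes a prompt, returns text.
Agent = Callable[[str], str]


def make_stub_agent(role: str) -> Agent:
    """Build a stand-in agent that just echoes its prompt, tagged with its role."""
    def run(prompt: str) -> str:
        return f"[{role}] {prompt}"
    return run


def orchestrate(task: str, designer: Agent, workers: List[Agent],
                verifiers: List[Agent]) -> Dict[str, object]:
    """Team-leader pattern: design once, implement in parallel, verify every draft."""
    plan = designer(f"design: {task}")

    # Fan out: each worker implements the same plan concurrently.
    # pool.map returns results in the same order as `workers`.
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        drafts = list(pool.map(lambda worker: worker(f"implement: {plan}"), workers))

    # Every verifier reviews every draft.
    reviews = [verify(f"verify: {draft}") for verify in verifiers for draft in drafts]
    return {"plan": plan, "drafts": drafts, "reviews": reviews}


result = orchestrate(
    "add login",
    designer=make_stub_agent("designer"),
    workers=[make_stub_agent("impl-1"), make_stub_agent("impl-2")],
    verifiers=[make_stub_agent("qa-1"), make_stub_agent("qa-2")],
)
```

Swapping the body of `orchestrate` is where a different pattern (a strict hierarchy, a flat team, a round-robin handoff) would go; the agents themselves stay the same.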

Demetrios: Yeah. You have the QA agent, and I imagine you have different compositions of teams that you could have, again, where I can bring my own and create my own composition, or I can grab some off the [00:44:00] shelf for more.

Zach: Exactly. Yeah. And so my attitude on this is, like, I don't know the right pattern. If it's, like, a hierarchy, if it's, like, they all work on a team, if everyone has roles. Uh, and I don't think anyone knows the right pattern. I don't actually even think that there is a right pattern. Uh, so what we're trying to do is make it very flexible, um, so that people who want to try different stuff can build different patterns of orchestration on what we're doing, and we'll provide, like you said, some out of the box,

Zach: so that you can sort of, like, measure and see how they do.

Demetrios: Yeah.

Zach: Again, it's pretty cool to see agents having conversations with each other. It's like kind of tricky. Yeah.

Demetrios: It goes back to, we're all our own little R&D departments, and you never know, the pattern might be, did you ever do that in grade school where you would write, like, a paragraph, and then

Demetrios: It would be time, and you would give the paper to the person sitting to the right of you and you would get their paper, and then you would write the next paragraph. And by the end you had all these different stories that were all completed by [00:45:00] everyone.

Zach: That's so fun. That's cool. And

Demetrios: that might be the best way that agents work too.

Demetrios: You never know until you try it.

Zach: I don't know. I have a feeling that they'll work in ways that are similar to how humans organize because they're just trained on so much human.

Demetrios: Yeah.

Zach: Human intelligence, but fundamentally they're like alien brains and so they, they might work in ways that we don't work well together.

Demetrios: Have you thought through the limitations of GitHub and what you want to do about that?

Zach: Such an interesting question. I know there's a lot. Um, yeah. So. So first off, have I thought it through a little, we're feeling it because GitHub has all sorts of reliability problems right now. It's, it's a huge bottleneck to productivity.

Zach: Um, you know, the other place that I thought about is, like, there's no real [00:46:00] good reason to have this, like, inner loop and outer loop type workflow when you're working with agents. And so one of the cool features that we have in Warp is, like, you do code review within Warp, which is the same place the agent does its work.

Zach: And so I think it's a stupid, wasteful flow to build something with an agent, push it up to GitHub, look at their code review UI, have, like, a code review bot that does its thing there, pull it back in. Like, why are you doing this dance of up to GitHub, down to your local, up to GitHub? I think that part, uh, we're already building stuff to just make it so you don't have to do that.

Zach: The other thing that is interesting is just

Demetrios: like, oh wait, but is that through just like auto save and then you can roll back if you need to.

Zach: So it's still using Git, just to be clear. Mm-hmm. It's still using Git, but like the piece of it that we are, uh, moving into warp or have moved, really, it's just like [00:47:00] iterative code review.

Zach: You could just bring it all into your app, and so, again, if you're code reviewing with a collaborator on your team and you're reviewing their code, I think it's fine to go to GitHub. However, if you're reviewing code that your agent wrote, that's a ridiculous flow. It's like, why am I leaving my workbench to go up to GitHub?

Zach: You shouldn't. You should just do it right as the agent is writing it. So what you're seeing is some of the things that were, like, the hero features of GitHub, like, I think, collaborative code review, are gonna be moving out of GitHub. I think, um, there's, like, a more interesting question of, like, is Git good?

Zach: First of all, I hate Git. I don't know how you feel about Git. Like, I don't like it. I don't understand Git, and part of the reason is, like, I spent a long time at Google. We didn't use Git at Google, and the thing that we [00:48:00] used at Google is basically based on Perforce, which is, like, another source control system, and it made so much more sense to me intuitively.

Zach: And so I hate Git, so I'm not a big fan of Git. I would love to see the world not be working on top of Git, and I don't know if Git is great for tracking agent, uh, metadata. But the problem is there's so much tooling and scripting around Git that I think it's more likely that Git will stick around and you'll find ways to basically build metadata around what agents are doing into Git than to invent some new

Demetrios: Yeah.

Zach: Thing. But, um,

Demetrios: it's involved.

Zach: I don't know. Yeah, I think, I think GitHub is at risk of disruption, which is a crazy thing to say because it's, it is the developer tool that has the strongest network effect of any developer tool.

Zach: But, um. I think it's a, I think everything is kind of at risk right now, to be honest. And so if I'm them, I'm thinking about like, how do I [00:49:00] rebuild this so that agents are the primary customer, not um, not necessarily humans.

All right, y'all. This episode is brought to you by the good folks at MLflow, the open source platform for developers who want to build production-ready AI applications. Enhance your AI applications with end-to-end AI observability, all in a single integrated platform. With MLflow's GenAI capabilities, you can evaluate AI applications using a suite of built-in or custom judges, visualize trace executions and agentic analytics, and continuously monitor evaluations, all while tracking every run in one place. Ship better agents faster. You know that's the name of the game. Get started at mlflow.org.

Demetrios: There was something that you were about to say before I cut you off with, uh, how you're doing the auto save or Git-like features.

Demetrios: [00:50:00] Was it that idea of, is Git the right abstraction, or is Git the right tool for the job?

Zach: So we do everything on top of Git right now. Yeah. So, for instance, we're launching, or it should be launched, uh, like, a checkpointing feature, right? Where if the agent does some work and you want to go back to it, we build that on top of Git.

Zach: We build our code review features on top of Git, primarily because there's things we wanna reinvent now and there's things we wanna maybe reinvent in, like, a year. And we're so stretched on the things that we wanna reinvent right now that we're not prioritizing, like, a whole new sort of source control system.

Zach: Um, but, uh, yes, right now our stuff is built on top of Git, but it is moving some of the stuff out of GitHub into our app [00:51:00] directly.
