Building Production Copilots
speakers


At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.
SUMMARY
Copilots embedded within SaaS applications have become one of the dominant ways of leveraging LLMs within products. In this lightning talk, Tristan reviews some of the dominant UI paradigms and features, general design patterns and system architectures, and top challenges and future frontiers of production copilot systems.
TRANSCRIPT
Introduction
And I believe our next speaker is somewhere around the Bay Area. Did I get that right, Tristan?
That's absolutely right. I'm from Oakland. I'm sitting in Oakland here. So I gotta tell you a little story about Oakland, totally not what your thing is gonna be. I live in the middle of nowhere, Germany, right? And there's like a hundred people that live in my town, and then the next town over, where I go running, there's a guy, he's gotta be from Oakland.
He's just got gigantic Raiders everywhere. Everything is Raiders. And it's so much so that one day I'm going by and he's having a barbecue outside and I'm just like, dude, what's the deal? Huge Raiders fan. He's from the Bay Area also. And I realize now, after telling that story, it's not as good as I thought it was gonna be.
Well, we definitely have some Oakland pride here. Oh man. So Tristan, we're cruising. We're gonna start slicing it up and doing some 10-minute lightning talks. I'm gonna throw 10 minutes on the clock, and of course, if anyone wants to learn more about what you all are doing at Continual, you've got a booth in the solutions tab.
So hit the left-hand button, it says solutions, and you can go through and find all the different booths that we've got and all of the cool stuff. There's great swag being handed out, and there's all kinds of insights that you can find. So Tristan, before I take up any more of your time, man, I'm gonna let you cruise through this and I will jump off the stage.
I think, yeah, I see your screen being shared and actually, I'm seeing myself right now on my screen, so I'll let you go to your slides. I'll switch over to my slides. There we go. I see your slide. There you go. Full screen here. Thanks, Demetrios. Alright man, I'll talk to you soon.
Alright, awesome. Well, this will be a quick one, 10 minutes or less. Today I'm really excited to talk about building production AI copilots. My name is Tristan. I'm the co-founder and CEO of Continual. We're building a developer platform for.
I previously was CTO for machine learning at Cloudera. I built one of the early MLOps platforms, called Sense, which Cloudera acquired, which is basically a short way of saying that I've suffered for the last 10 years trying to make AI and ML easy by building standard MLOps tools.
And today I wanna talk about something different that I'm extremely excited about. It's a basic belief that the software experiences applications.
We're entering an era of AI copilots, and what does that exactly mean? What does it mean for us, both as consumers of these applications, in terms of how our experiences will change, but also, more importantly, what I wanna talk about is how does it change for builders, people that want to put AI into the world.
How do we go about doing that? So a few observations off the top. First, I think traditional MLOps will mostly be irrelevant. I think AI copilots will enhance rather than replace existing applications. And I think there will not be a single personal AI copilot that rules them all. So it's not like we're gonna only talk to ChatGPT and ChatGPT is gonna orchestrate every single application.
Instead, what I think we're entering, when I say the era of AI copilots, is an era where every single application out there, within every single vertical domain, will be enhanced by its own specialized, customized, tailored AI copilot that has connectivity to the data that's sitting inside of that application and the APIs that that application has, and is deeply tailored to the workflows of the end user.
Whether that's a marketer, a salesperson, a doctor, a teacher, a student, a scientist, a mechanical engineer, all of those people will be using their own AI copilot. This is an example from Epic. This is an example of why I believe that AI copilots are gonna be embedded into existing applications rather than replacing the applications or having a single AI copilot for all applications.
This is an incredibly complicated, domain-specific application, with a tremendous amount of data, in this case around patients. And I don't think there's any way that conversational interfaces are gonna replace these types of applications, but they are gonna fundamentally transform them. Now, what do we mean by fundamentally transform these applications?
What are these AI copilots gonna look like if they're embedded into these traditional applications? I think there are three emerging patterns for how AI copilots get embedded into applications. The first is one that we all know and love, or certainly love more and more over the last nine months.
Which is the conversational experience inside applications. And we're starting to see this in more and more applications already today. With Copilot Chat in Copilot X, GitHub Copilot is introducing a chat experience beyond the experience that they already have. The chat experience doesn't replace the entire application, but it does have
incredible use cases around knowledge discovery, ad hoc queries, and information retrieval. And it's a very natural user interface that everybody is used to. So what I foresee happening is that every application will add a conversational user experience on top of their existing application data to.
The second dominant pattern I see is not just having an AI chat on the sidebar, but actually trying to put AI into the micro-workflows of individual users. And so here's an example from Hex. I think this is one of the best examples of what I would call a copilot command or copilot action. Here they're showing how you can write analytical queries using natural language,
going from text to a SQL query. And that can give you relatively complicated SQL queries, or it can do other things, for instance suggesting fixes for mistakes or suggesting improvements. And so here you're seeing copilot commands in the analytical domain. But you can also think about the similar experiences that you might have when
writing content, which has a similar type of functionality and user experience. The final one is that I think we're also gonna see more and more copilot automations. These are full automations of user workflows. This particular one is again from GitHub Copilot X, which is, in this case, doing automations on a pull request.
So as soon as you open a pull request, there's an event-driven agent or workflow that gets kicked off, looks for bugs, makes code review suggestions, and summarizes inside the code review. And so, increasingly, I think we're gonna move to a world where not only can we ask copilots for things, but they'll also automate our work.
The net user benefits here are that they help illuminate or enrich the information inside of our applications and provide insights, help us complete tasks and streamline our workflows, and then finally help us fully automate tasks and potentially even work autonomously with limited or no intervention from the end user.
So the question then becomes, okay, these are production AI systems going inside of the applications that many of us are building. How do we think about building these particular experiences? And I wanna argue that thinking about the AI inside your application holistically, as a copilot system, provides a unifying view for thinking about how to embed these different types of user experiences into your applications, while still being powered by a core engine that is connected to the data, the context, and the APIs of your individual application or of the individual application domain.
So, the production copilot system. We've been working on several at Continual. The first piece is the new user experience. We need to think about how we're gonna embed the AI experience into the user workflow. And as I just mentioned, I think there are three primary user experiences: the conversational experience, which is chat or a chatbot.
A command-oriented experience, which is often inline in a particular workflow, where you're trying to do a one-turn task, not have an ongoing, interactive conversation with the system. And the final one is automations, a way for these systems to have automated processes
that are continually reacting to events that are happening inside of your application, and enriching, enhancing, and automating the information inside of the application. So the front end is those user experiences, and that's traditional user experience design.
They then, I think, will go into a router, a way for the different user experiences to talk to the internal AI system that is most tailored to the particular domain. So you may have your text-to-SQL engine, which may go to a particular agent behind the scenes that is specialized in doing text-to-SQL.
You may have a conversational support bot that may be a slightly different agent. So from multiple user experiences, you'll go into a router that will then go to multiple potential agents. And by agents, I mean an AI system, typically powered by a foundation model like the OpenAI ChatGPT models, that has both the capability to respond to natural language queries, but also the capability to use tools.
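To make the router idea concrete, here is a minimal Python sketch, not how any particular product does it. Every name in it (`Request`, `Router`, `text_to_sql_agent`, `support_agent`, the intent labels) is hypothetical, and the agents are stubs standing in for real model calls. It only shows the shape: several user experiences hand requests to one router, which picks the agent most tailored to the task.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Request:
    surface: str  # which user experience sent it: "chat", "command", or "automation"
    intent: str   # hypothetical intent label, e.g. "text_to_sql" or "support"
    text: str


# Hypothetical agents: in a real system each would wrap a foundation model call.
def text_to_sql_agent(req: Request) -> str:
    return f"[sql-agent] would translate: {req.text}"


def support_agent(req: Request) -> str:
    return f"[support-agent] would answer from docs: {req.text}"


class Router:
    """Maps an intent to the specialized agent that should handle it."""

    def __init__(self, default: Callable[[Request], str]) -> None:
        self._agents: Dict[str, Callable[[Request], str]] = {}
        self._default = default

    def register(self, intent: str, agent: Callable[[Request], str]) -> None:
        self._agents[intent] = agent

    def route(self, req: Request) -> str:
        # Unknown intents fall back to the default agent.
        return self._agents.get(req.intent, self._default)(req)


router = Router(default=support_agent)
router.register("text_to_sql", text_to_sql_agent)
router.register("support", support_agent)

print(router.route(Request("command", "text_to_sql", "revenue by month")))
print(router.route(Request("chat", "support", "how do I invite a teammate?")))
```

In practice the intent would probably come from the UI surface itself or from a classification step rather than a hard-coded label; that part is left out here.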
The tools would be your plug-in layer. And I think what OpenAI is doing here, and the announcements they actually just made earlier this week, are the right architecture, or the foundational architecture, for building these AI copilots. So OpenAI just announced function calling, which allows the agent to make a decision on which plugin to call and the information to pass to it.
Plugins provide a unified interface to all of your external systems. And so that could be your data: you could use a plugin to connect to a search index or vector index for context. You could use a plugin to store information into a data store. You could use it to connect to an external system. So if you're connected, for instance, to your sales system and talking to an external service API, you could do that.
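As a rough illustration of that plugin layer, the sketch below registers plugins behind one interface and dispatches a call the model has decided on. It loosely mirrors the shape of function calling, where the model returns a function name plus JSON-encoded arguments, but it does not call any real API: the `plugin` decorator, `dispatch`, `search_docs`, `save_note`, and the hard-coded `model_call` are all assumptions made for the example.

```python
import json
from typing import Any, Callable, Dict, List

# Registry of plugins: name -> description, JSON-schema-style parameters, callable.
PLUGINS: Dict[str, Dict[str, Any]] = {}
NOTES: List[str] = []  # stand-in for a real data store


def plugin(name: str, description: str, parameters: Dict[str, Any]) -> Callable:
    """Register a function as a plugin the agent is allowed to call."""
    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        PLUGINS[name] = {"description": description, "parameters": parameters, "fn": fn}
        return fn
    return decorator


@plugin(
    name="search_docs",
    description="Look up context in a search or vector index.",
    parameters={"type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"]},
)
def search_docs(query: str) -> str:
    # Stand-in for a real index lookup.
    return f"top result for '{query}'"


@plugin(
    name="save_note",
    description="Store a piece of information.",
    parameters={"type": "object",
                "properties": {"text": {"type": "string"}},
                "required": ["text"]},
)
def save_note(text: str) -> str:
    NOTES.append(text)
    return "saved"


def dispatch(call: Dict[str, str]) -> str:
    """Run the plugin named in a model-produced call, after basic validation."""
    spec = PLUGINS.get(call["name"])
    if spec is None:
        raise KeyError(f"unknown plugin: {call['name']}")
    args = json.loads(call["arguments"])
    missing = [k for k in spec["parameters"].get("required", []) if k not in args]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return spec["fn"](**args)


# Hard-coded here; in a real system the model would produce this decision.
model_call = {"name": "search_docs", "arguments": json.dumps({"query": "reset password"})}
result = dispatch(model_call)
print(result)
```

Validating the arguments before running the plugin is one small place where guardrails can live, since the model's output is not guaranteed to match the declared parameters.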
It also allows you to talk to more deterministic workflows. So if you have workflows that are very specific, you can build those and have plugins actually connect to those internal workflows. And then finally, you can actually make this whole system recursive. You can have agents that talk to agents.
And so if there's a specialized agent that itself potentially has a limited set of plugins, you can actually create a hierarchical system of agents. And as these copilot systems become more and more sophisticated, I think what you'll see is this general architecture and then this recursive nature to the architecture.
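The recursive structure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Agent` class, its `as_tool` method, and the keyword-matching `choose_tool` are hypothetical, and `choose_tool` is a placeholder for the decision a model would make. The point is that a specialized agent with its own small tool set can be handed to a parent agent as just another tool.

```python
from typing import Callable, Dict

Tool = Callable[[str], str]


class Agent:
    def __init__(self, name: str, tools: Dict[str, Tool]) -> None:
        self.name = name
        self.tools = tools

    def choose_tool(self, task: str) -> str:
        # Placeholder for the model's decision: pick the first tool whose
        # name appears in the task, otherwise the first registered tool.
        for tool_name in self.tools:
            if tool_name in task:
                return tool_name
        return next(iter(self.tools))

    def run(self, task: str) -> str:
        tool_name = self.choose_tool(task)
        return f"{self.name}>{tool_name}: {self.tools[tool_name](task)}"

    def as_tool(self) -> Tool:
        # Expose this agent as a plain tool so a parent agent can call it.
        return self.run


# Specialized agents, each with a limited set of (stub) tools.
sql_agent = Agent("sql_agent", {"run_sql": lambda task: "rows"})
docs_agent = Agent("docs_agent", {"search": lambda task: "doc snippet"})

# Top-level copilot whose tools are themselves agents.
copilot = Agent("copilot", {"sql": sql_agent.as_tool(), "docs": docs_agent.as_tool()})

print(copilot.run("sql: revenue by month"))
print(copilot.run("find docs on billing"))
```

Because each sub-agent has a narrow tool set, it can be tested in isolation, while the top-level agent only decides which specialist to hand the task to.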
So you can decompose different elements of your stack. And finally, on the side, we have observability and guardrails. Of course, observability across this entire process is incredibly important. So what are the top challenges to putting these copilot systems into production? I think there's really, honestly, just one, and that's reliability and performance.
And so I just wanna give a few tips, relatively straightforward ways to make sure these AI copilots perform well. So the first one's obvious: start simple. Even simple AI copilot features can significantly improve products.
I've actually talked to multiple founders who have put AI copilots into their products and have seen significant uplift from basic plans to premium plans, and ultimately in user satisfaction. Observe and watch user behavior like a hawk. So when you're starting out, every single conversation or every single action, you should honestly just look at it and see, is it working?
Is it not working? Honestly, even without structured testing, you can fix a lot of problems very quickly by observing the actions of your users. The final thing that I would call level one, which are these easy fixes, is to isolate complicated areas of your domain, for instance if you're doing rich analytical queries, into narrower sub-workflows or agents that can be tested more formally and more easily.
And so don't try to make one agent that does everything, decides everything, and plans out everything. Instead, decompose it, so you have more control over some areas, but maybe more dynamic planning at a higher level. In terms of level two, as you become more sophisticated in production, what do you need to do?
The first is to build automated evaluations and integrate them into your CI/CD system. So particularly if you're doing difficult analytical queries or difficult automations, you really need to build a formalized evaluation system. How then do you get feedback and improve your system? I think one of the most powerful ways is actually to build and maintain a.
So this is a great way where you can collect feedback, see problems, propose solutions, and not pollute your prompt by making it one huge prompt, but instead dynamically inject the closest examples that you have into your prompt. And that can create a data engine for your application without getting into the complexity of fine-tuning your own model.
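Dynamically injecting the closest examples can be sketched as follows. This is a simplified illustration, not a production recipe: the `Example` class, the example library contents, and the helper names are hypothetical, and word-overlap (Jaccard) similarity is used as a stand-in for the embedding-based similarity a real system would more likely use.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Example:
    question: str
    answer: str


def tokens(text: str) -> Set[str]:
    return set(text.lower().split())


def similarity(a: str, b: str) -> float:
    # Jaccard overlap of word sets; a stand-in for embedding similarity.
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def closest_examples(query: str, library: List[Example], k: int = 2) -> List[Example]:
    return sorted(library, key=lambda ex: similarity(query, ex.question), reverse=True)[:k]


def build_prompt(query: str, library: List[Example], k: int = 2) -> str:
    # Only the k closest examples go into the prompt, not the whole library.
    parts = [f"Q: {ex.question}\nA: {ex.answer}" for ex in closest_examples(query, library, k)]
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)


# Hypothetical library, grown over time from reviewed user feedback.
LIBRARY = [
    Example("total revenue by month", "SELECT month, SUM(revenue) FROM sales GROUP BY month"),
    Example("count users by country", "SELECT country, COUNT(*) FROM users GROUP BY country"),
    Example("how do I reset my password", "Go to Settings, then Security."),
]

prompt = build_prompt("total revenue by country", LIBRARY, k=2)
print(prompt)
```

When a user reports a bad answer, the fix can be a new reviewed example added to the library, which then gets picked up automatically for similar future queries.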
And finally, try to create a delightful and magical experience. One way that we've seen be incredibly powerful is to actually treat your front end like another plugin, with an API to manipulate the user experience, so the user experience is dynamic. So with that, I guess this is just a call to say, let's build AI copilots into our applications.
And if you are interested in doing this for your own application or business, or you're just interested in copilots in general, or have thoughts on this general topic, please email me [email protected], or you can sign up for early access to our [email protected].
Thanks a lot. So I have so many things. First of all, I love this idea. Second of all, it feels like what I've seen being implemented with this idea has not worked as of yet, and there is this huge desire for it. But it is like, yeah, I mean, you mentioned OpenAI has just come out with it, so hopefully that makes things that are built with these plugins and all of these
functions more stable, but it is something that I really wonder about. I mean, it's no small feat to try and take this on. Tristan, what gives you the confidence, man? I can't believe it. Well, I think the critical thing is, we believe that basically these AI copilots are gonna enhance your existing applications, not replace them.
And that's that. So you can pick and choose: hey, what are the workflows, or what are the challenges that a user might have, where the AI can enhance it, right? So we're already seeing, for instance, that just to do basic support queries, right, this sort of chat-with-your-docs use case, that's very simple, right?
You can implement that in LangChain or something pretty quickly yourself. It's a little bit hard when you get to production to make it a production system, but the basic idea of chatting with your documentation actually is a really, really useful product feature, right?
It's a better way than searching your docs in many cases. And then I think with Hex, you can see an example of a very narrow use case, for instance, fix your bugs. If you have something that executes wrong and you have an error, then you can fix that right inline and often you comma.
I think ultimately these things are gonna become incredibly rich, right? These autonomous agents that people are getting excited about will actually become things that work. Right now that's not the case, but if you think about the next couple years, I think that will be the case.
I think all software will have this idea of my software AI or my software copilot inside of it as a core feature of the product. Yeah. That's so good. I mean, you hit the nail on the head. In the LLMs in Production report that we just put out, and I will do a shameless plug right here, if anybody wants to download it, it's free,
no email needed or anything, but we just put this survey out, right? And it's the report where I've been grappling with this data for the last three months or so, even more probably now, and it's very much that question of what parts of my workflow get fully replaced, and what parts of my workflow get augmented, and where, and how do I know what each part is?
It's not clear at all right now. So I think we're exploring this unknown together, trying to figure out what exactly it is that is going to get replaced, what's going to get augmented, and how can we find that out? Yeah, no, absolutely. And I think it's gonna be both. I mean, I think even, you know, you can say it'll be nine, but that's still a version of augmentation.
It will become more and more automated. Copilots, I do think, will shift more and more towards this fully automated task completion. But you'll still interact with them, potentially conversationally, for sure. I think everybody will have a conversational interface, and then there'll be all these enhancements as well in the product.
But that all can, I think, be powered by the idea of a copilot system, one that has that connectivity to the data and the APIs, making decisions about what to do. I think there's a sort of unified way to think about how you build that if you're an application builder, but we're certainly still in the early innings.
Hundred percent, for sure. So, alright man, I've gone way over on time because I've loved this agent idea and I love talking to you. I'm gonna kick you off now, though, lovingly, and I will see you when I'm in San Francisco in two weeks. Thank you. See you.
