MLOps Community

Unlocking AI Agents: Fixing Authorization to Get Real Work Done // Sam Partee // AI in Production 2025

Posted Mar 14, 2025 | Views 432
# AI Agents
# AI Applications
# Arcade.dev

SPEAKER
Samuel Partee
CTO & Co-Founder @ Arcade AI

Sam Partee is the CTO and Co-Founder of Arcade AI. Previously a Principal Engineer leading the Applied AI team at Redis, Sam led the effort in creating the ecosystem around Redis as a vector database. He is a contributor to multiple OSS projects, including LangChain, DeterminedAI, and Chapel, among others. While at Cray/HPE he created the SmartSim AI framework and published research on applications of AI to climate models.


SUMMARY

This talk is about making AI agents truly useful by fixing how we handle authorization. Right now, we depend on API keys and static tokens stored in environment variables that tie actions to single users, which isn't flexible or secure for bigger operations. I'll cover why this holds us back from letting AI do important tasks, like sending emails or managing sensitive data, autonomously. We'll explore simple ways to update these systems, so AI can work for us without constant human intervention. This is all about moving beyond flashy demos to real-world, impactful AI applications.


TRANSCRIPT

Demetrios [00:00:06]: Where are you at?

Samuel Partee [00:00:32]: You're so funny, man. I swear they get weirder every time.

Demetrios [00:00:38]: I have to one up myself. So I. You know, that's true.

Samuel Partee [00:00:43]: I'm not even sure what that was, but I. That was funny.

Demetrios [00:00:48]: Well, dude, you've got 20 minutes on the clock. I'm gonna let you get rocking and rolling. Let's hear it and I'll be back with questions.

Samuel Partee [00:00:56]: Awesome. All right. Boom. Let's hope I can get this done in that time.

Demetrios [00:01:02]: Let's hope Sam can get it done.

Samuel Partee [00:01:07]: Significantly better than me.

Demetrios [00:01:10]: You need to share your screen online. I can't see anything yet.

Samuel Partee [00:01:22]: Making me modify my settings.

Demetrios [00:01:25]: Modify the settings. He got out of here. He didn't really like the vibes, the agent telling him that it wasn't right. He needs to modify the settings and come right back for another time around. This is Sam Partee having technical difficulties. We're having technical difficulties. Mr.

Demetrios [00:02:03]: Sam Partee is back.

Samuel Partee [00:02:06]: Not my fault. Not my fault. All right, let's do this.

Demetrios [00:02:12]: Technical difficulties with my man Sam Partee.

Samuel Partee [00:02:16]: Very good.

Demetrios [00:02:19]: I see your screen. Let's go ahead and share it and get rocking and rolling.

Samuel Partee [00:02:27]: Thank you, Demetrios, for your ever increasingly weird intros. Hopefully everything is going well and you can hear me. Today I'm gonna be talking about unlocking, pun intended, fixing auth to get real work done. A little cheesy, but at the same time it's true. And I'm going to show you why. I'm going to go pretty fast through this because I have shown this before in one of my other talks with Demetrios, but if it doesn't make sense, you can go look at that talk. I want to go through a version of what I call the agent hierarchy of needs. It's, like, obviously off of Maslow's hierarchy of needs.

Samuel Partee [00:03:05]: And this just kind of sets the stage for what I'm going to talk about today. So early on, we started with large language models and we started with prompt orchestration, which I'm calling the early days of GPT-3 and LangChain, when really it was just about how do we pass text prompts efficiently around a large language model. I start here as one of the early times in which we were able to get some benefit, you know, 2019, 2020-ish. Then vector databases, a huge craze, and maybe you could say there still is. Retrieval and search will never go away. It's always going to be a thing. Recommendation systems are all but neglected in their use of vector databases still to this day. But that gave us relevant information to put into that context.

Samuel Partee [00:03:58]: Agent orchestration then came out, with much of the existing orchestration frameworks adopting the agent paradigm. Then we were able to do something more than just gain context, which was tool calling. This is the first time an agent was able to go and reach out to the real world and actually do something. Now, it's actually through generating text, but that text was in fact something like JSON that you could deserialize and put into a function with a runtime. This gave agents, and LLMs, the ability to go and do something. Then this is where it gets a little bit, I'll say, into the future, where I'll call this tool orchestration, and I'll talk about this a little bit throughout the talk and why it's necessary. But this layer is the layer in which agents start to become customized.
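
To make that concrete, here is a minimal sketch of the mechanism he describes: the model only ever emits text, and the runtime deserializes that text (JSON here) and maps it onto a real function. The tool name, registry, and model output below are illustrative, not from the talk.

```python
import json

# Hypothetical tool implementation; the name and behavior are illustrative.
def get_weather(city: str) -> str:
    return f"72F and sunny in {city}"

TOOL_REGISTRY = {"get_weather": get_weather}

# The model "acts" only by generating text that happens to be structured JSON.
model_output = '{"tool": "get_weather", "arguments": {"city": "San Francisco"}}'

# The runtime deserializes that text and hands it to a real function.
call = json.loads(model_output)
result = TOOL_REGISTRY[call["tool"]](**call["arguments"])
print(result)
```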

Samuel Partee [00:04:49]: You might also call this multi-agent orchestration. There's a lot of ways to do this, and I think that the field hasn't necessarily landed on an approach. What I'll mostly talk about today is the layer that I see on top of that, which might not make sense to most people, but it is actually auth. This actually empowers the agents to take action on behalf of a user. I'll talk a lot about why that's important, but really it's the way in which tool calling becomes better than just getting the weather or searching Google. Has anybody seen something like this recently? Process env Google Drive secret, a bunch of refresh tokens and client secrets. You downloaded your credentials JSON. What about some more? What about hard-coded scopes with the utmost permissions for Gmail? That's safe. Or a brand new service that we're just throwing our tokens and secrets into.

Samuel Partee [00:05:50]: What about an auth cache in the client code? You know, that's mostly considered not safe and a bad security practice. What about some more process env vars? What about literal security notes saying this can do anything, and the process is a non-deterministic agent. It includes deleting, and we're just throwing those scopes over the wall and having these agents with tools that have elevated privileges beyond what they're supposed to do. And so really I want to talk about tool calling and auth, and I'll ground it with this: does anyone else notice how every tool calling example is get_weather? Have you ever thought about why? If they actually had something like end-user auth, then they could go do something maybe more personalized and more interesting for me. Instead, we haphazardly do that right now through API keys and static tokens in an environment, and this limits us to basically no multi-user support, because that token is scoped to myself or, like, my company. Right.
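
A minimal sketch of the anti-pattern he is describing: one static token in the environment means every request the agent makes is made as that single identity, regardless of which end user asked. The env var name is illustrative; the Slack endpoint is just a familiar example of any token-authenticated API.

```python
import os
import requests

# One static token for the whole deployment; whoever provisioned it is the identity.
SLACK_TOKEN = os.environ["SLACK_BOT_TOKEN"]  # any static API key has the same problem

def send_slack_message(channel: str, text: str) -> None:
    # Every user of the agent posts with the *same* credentials and the same (often broad) scopes.
    requests.post(
        "https://slack.com/api/chat.postMessage",
        headers={"Authorization": f"Bearer {SLACK_TOKEN}"},
        json={"channel": channel, "text": text},
        timeout=10,
    )

# There is no notion of "which end user is this?", so no multi-user support and no per-user scopes.
send_slack_message("#general", "Posted as the shared bot identity, not as the person who asked.")
```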

Samuel Partee [00:07:03]: And this is similar to bot tokens, where some people have solved, or I would say tried to solve, this problem through giving a bot certain levels of permissions. And that doesn't necessarily solve the problem, because then most of the time you're giving a bot elevated levels of privilege beyond what you would give someone in your organization. Also, a really important part that I don't hear enough about is the location of token usage. So tools are where the agent reaches out to the real world, and it's often deeply abstracted from where the client is actually interacting. I'll show a diagram of this, but it really makes it difficult to integrate existing iPaaS-based providers, or just auth providers, into what we use every day, because the tools are actually a long way away from where the client is interacting with the agent. And that makes things very difficult to support in terms of authentication. Okay, so this is an example of something like get_weather. It's Google Search, but basically the same thing. You can think of it like it's either API key or just basically like Google Search.

Samuel Partee [00:08:21]: You can just do it without any kind of user. So you have some kind of client app. You just have a client, agent, tool executor and an LLM. This should mostly make sense. And I'm going to rush through this because I don't think I have a lot of time. But this flow is a very common flow. It is where you are going to go and predict parameters, use those parameters to call a function, invoke that function, get back results, feed that back to the LLM, and then take what the LLM says as the final result with the agent. It's like a summarization flow, if you think about it like that.
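
Here is a minimal sketch of that common flow, assuming the OpenAI chat completions tool-calling API and a stubbed weather function; the model name is a placeholder, and the happy path (the model actually requesting the tool) is assumed.

```python
import json
from openai import OpenAI

client = OpenAI()

def get_weather(city: str) -> str:
    return f"18C and cloudy in {city}"  # stubbed; a real tool would call a weather API

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Berlin?"}]

# 1) The LLM predicts the tool call and its parameters.
first = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
tool_call = first.choices[0].message.tool_calls[0]  # assumes the model chose to call the tool
args = json.loads(tool_call.function.arguments)

# 2) The tool executor invokes the function with those parameters.
result = get_weather(**args)

# 3) The result goes back to the LLM, whose reply is the final, summarized answer.
messages.append(first.choices[0].message)
messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})
final = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
print(final.choices[0].message.content)
```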

Samuel Partee [00:08:54]: Now let's compare this to something that requires end-user auth. Everything in green that's added is what we, the developers of the agent, now have to worry about. If you're developing an agent and you want to be able to send a Slack message, you now have to worry about all of these things. Who is the user? What is the auth service? How am I actually pinging Slack? Maybe I'm just going straight to the service provider with an iPaaS. That is what most people do nowadays, especially with MCP. But if you have something like an auth service, you then have to introduce that into your code base. Where does that go? Does that go in the tool executor? Does that go into the agent? Does that go in the client application? God forbid. This flow, which I don't really have time to explain but we'll go over more, is essentially adding the OAuth 2.0 flow into the same tool calling process you saw before, and adding the necessary requirements of knowing who the user is, knowing where you're going to auth, and providing that token to the tool executor at runtime, such that you, or rather the agent, are authorized to act on behalf of that user.
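
A sketch of where those extra pieces land, under stated assumptions: `AuthService` is a hypothetical wrapper around an OAuth 2.0 provider (not a specific product), and the agent passes a `user_id` down to the tool. The point is that the tool executor receives a user-scoped token at runtime, or surfaces an authorization link if there isn't one yet.

```python
import requests

class AuthService:
    """Hypothetical wrapper around an OAuth 2.0 provider; method names are placeholders."""

    def get_token(self, user_id: str, provider: str, scopes: list[str]) -> str | None:
        # Return a cached/refreshed access token for this user, or None if not yet granted.
        ...

    def authorization_url(self, user_id: str, provider: str, scopes: list[str]) -> str:
        # Return the consent URL the user must visit to grant these scopes.
        ...

auth = AuthService()

def send_slack_message(user_id: str, channel: str, text: str) -> str:
    scopes = ["chat:write"]
    token = auth.get_token(user_id, "slack", scopes)
    if token is None:
        # Surface the OAuth link back through the agent instead of failing.
        return f"Please authorize first: {auth.authorization_url(user_id, 'slack', scopes)}"
    requests.post(
        "https://slack.com/api/chat.postMessage",
        headers={"Authorization": f"Bearer {token}"},  # the user's token, not a shared bot token
        json={"channel": channel, "text": text},
        timeout=10,
    )
    return "sent"
```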

Samuel Partee [00:10:17]: So, quick shout out. Obviously I work on Arcade, and this is what we do. This is why I am very passionate about it. We provide these three blocks that take out a lot of this, so you can go back to focusing on building the agent. It's really nice because then, say you're using OpenAI, you get a link and you can stop worrying about using any other auth provider. If a user is not going to be authenticated, or is not authenticated, you get a link back. Even if you're using OpenAI, it comes in the content field. And so in that way, whether you're using LangChain or OpenAI, what have you, you are able to use tools like Slack, like Gmail or Google Calendar or Zoom, and you can just send a unique user ID and have all the auth taken care of.
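
A hypothetical sketch of the developer experience being described; the client class, method names, and tool name are placeholders rather than Arcade's documented API, which is the thing to check for real usage.

```python
# Placeholder service client standing in for a managed tool-calling/auth service.
class ToolService:
    def execute(self, tool: str, user_id: str, inputs: dict) -> dict: ...

svc = ToolService()

response = svc.execute(
    tool="Slack.SendMessage",          # placeholder tool name
    user_id="sam@example.com",         # a unique per-user ID is all the agent supplies
    inputs={"channel": "#general", "text": "hello"},
)

# If that user hasn't granted access yet, the service returns an authorization link
# instead of a result; in a chat UI it just shows up in the model's content field.
if response.get("authorization_url"):
    print(f"Click to authorize: {response['authorization_url']}")
else:
    print(response["output"])
```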

Samuel Partee [00:11:13]: This is a very important point that I want to talk about as I go, and I'll talk about another point that comes after this, which is that agents shouldn't just be able to work for you, they need to be able to work as you. And so some people call this user impersonation, but really it's the ability to go and impersonate you on the Internet across multiple services, and maybe with multiple tokens, even in the same service, with different levels of privilege. Because you might say this agent should be able to access Calendar and this agent should be able to access Gmail, and they don't necessarily need to share the same levels of privilege, based on whatever app you're building. As these architectures, multi-agent architectures or multi-tool architectures or tool orchestration, whatever you want to call it, become more popular, we are going to need a system for this that is able to spread out those scopes, responsibilities, claims, what have you, tokens, manage them, and give them to the tool executors when they need to reach out to those services on behalf of the agent. So a quick shout out: we've been working with LangChain on open tool calling. That was supposed to be one by one, but it is essentially a very generic version of what you just saw. And some people might say, wait, don't we already have this? Well, this includes a lot of different attributes, like for instance authorization, that allow us to do more interesting things and bypass the problems that I've been talking about.
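
A minimal sketch of the bookkeeping this implies, using only the standard library: tokens are keyed by (user, service, scopes), so two agents acting as the same person can hold different credentials with different privilege. Field names are illustrative; a real system would also handle refresh and expiry.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TokenKey:
    user_id: str
    service: str
    scopes: frozenset[str]

class TokenStore:
    def __init__(self) -> None:
        self._tokens: dict[TokenKey, str] = {}

    def put(self, user_id: str, service: str, scopes: set[str], token: str) -> None:
        self._tokens[TokenKey(user_id, service, frozenset(scopes))] = token

    def get(self, user_id: str, service: str, scopes: set[str]) -> str | None:
        # Exact-scope lookup keeps each agent at least privilege: a calendar agent's
        # token never satisfies a request that needs Gmail scopes.
        return self._tokens.get(TokenKey(user_id, service, frozenset(scopes)))

store = TokenStore()
store.put("sam", "google", {"calendar.readonly"}, "token-A")
store.put("sam", "google", {"gmail.send"}, "token-B")
assert store.get("sam", "google", {"calendar.readonly"}) == "token-A"
assert store.get("sam", "google", {"gmail.send"}) == "token-B"
```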

Samuel Partee [00:12:50]: And so you can go and hit Zoom and schedule a meeting. And these schemas are all online. The website's up, opentoolcalling.org. It's basically brand new right now and we're going to be adopting it with both of the companies. So take a look at that, give us your feedback. Please do make issues if you have thoughts, open discussions, what have you. It works over HTTP and it's stateless, a lot of the things that people complain about these days. All right, so what can you do with end-user auth? I'm going to give a really quick demo here, but... oh, actually no, this first, then the demo.

Samuel Partee [00:13:29]: There's some new paradigms that are unlocked, right? And I've been talking about this a little bit. Right. But you can actually go and read the email of your users. And this is kind of a hard point to understand, because it's not that you can't do this today with just a token in the environment, but what you can't do is then give that service to a bunch of users. And so if you want to go and sell that agent, or you want to go and build a business around that agent, you need the ability to have it done for them. And this is really hard today.

Samuel Partee [00:14:01]: There's a lot of work that goes into this, and it's one of the things that frustrated me the most when I was at Redis, because we were building agents. And I kept having these moments where, honestly, at the time I didn't know as well as I do now, and I just could not get LangChain to pass a token all the way down to the tool layer. Or at the time I was actually using GPT Index, I think most people probably don't remember that it was called that. Llama Index, for those who are wondering what that is. So on-demand is really important, because if you have 8,000 tools all scoped and claimed to the least level of privilege, which they should be, you don't want to have the user going and authorizing 8,000 different services and tools. You want them to be populated when the agent needs to call them. And in that way, if you have a system by which there is an authorization check in the tool calling flow, you can bypass all of that user experience that I know everybody hates, which is that initial click, click, click, click, click, and get it when you need to do it. Ambient, like Agent Inbox that we put out with LangChain, that was really popular.

Samuel Partee [00:15:13]: I still use it and it just manages your inbox. It's very cool. If you haven't checked that out... I think they took it down. They put up a barrier on it or something, I forget. But the GitHub repo is open and you can go and use it today. If you want to use it with Arcade, check out the multi-center branch. Ambient agents are really cool because they're just working in the background.

Samuel Partee [00:15:35]: And once you've authed with Arcade, you're getting the refresh tokens done for you and the caching of the token done for you. So they can continue to go and hit YouTube and Slack and Google Workspace and LinkedIn and Atlassian and blah blah blah, all on your behalf, and you just send them on their merry way and they come back with your work done for you, kind of. Hence the name of the talk, getting real work done. This enables you to make agents kind of like the one, I think it's called Vibe Work, that Jacob put out in like a day, and it just does Slack and email and messaging, and it does it all in the background for you. Workflows. That's it. It also actually does workflows.

Samuel Partee [00:16:19]: But this is really popular right now. Being able to do things like Cursor or your CLI, put that into GitHub, whether it's context or actions like opening a PR or reading a PR. I'm going to show an example of that here pretty soon. And then, you know, going to another service. I'll go over that later.

Samuel Partee [00:16:42]: How am I doing on time? I probably gotta go. Okay. I'm only gonna show one of these because I probably don't have that much time. We put out this one recently, Archer. It's a Slack agent. It works through the Slack Assistants API. It's not published on the marketplace, but it is open source.

Samuel Partee [00:17:02]: You can go and use it right now. And what it does... it's a LangGraph agent. It's essentially a ReAct architecture, but it adds in the auth interrupts and the checking of auth for tools. You can see the LangGraph Studio thing here, the... I forget what they call it, the graph. The capabilities of this one are really interesting because it goes and hits the things that I do a lot at work. So, like, go and check this PR.

Samuel Partee [00:17:30]: Are there any comments? Has someone reviewed it? You know, when was it updated? Those kinds of questions. I don't have to go to GitHub anymore, which is really nice. Go and list all the comments and summarize them. Or, I'll show an example of it, actually. What else? I also worked on agents as tools. Very cool. That repo is going to be coming out. Follow me on Twitter if you want to see it.

Samuel Partee [00:17:55]: I'm going to be publishing it there and you can get on the repo. I think that's the right one. These QR code websites are actually pretty confusing. All right, so this is a demo of Archer. Archer is pretty cool. It's the same Slack assistant I was telling you about, and it is pretty fast, so I'm actually going to stop it in some places. You'll see here it has the Assistants API up in the top right-hand corner, so I'll use that later.

Samuel Partee [00:18:21]: But you have history. You can pick models, multi-model, and with LangChain obviously you can use a lot of models. This can also use the LLM API of Arcade. And so I asked here, do I have any meetings this week? I just say Alex. It knows obviously that Alex is my co-founder, and then it asked me to authorize. And so this is the flow you've probably done a million times, right? And in doing so you get a familiar experience. And that's really important with AI right now, because a lot of people, especially enterprises, are starting to ask, you know, how do we protect ourselves from autonomous bot groups? And so when they get this familiar flow, they...

Samuel Partee [00:19:06]: They feel much better about using it. And that is really important. So this was really hard, the Slack views, and to do so with the interrupts of LangChain was actually probably the hardest part of this. What actually happens here is you click on, in this case, Google List Events, which is the name of the tool in Arcade, and then you click "I completed the authorization." And then you have to confirm with an action, which is not an event. That took me a while to figure out, just being honest.

Samuel Partee [00:19:38]: And the new LangChain, or LangGraph, Command is way better than node interrupts. So if you're using node interrupts, take a look at Command. So yeah, it accurately said you have this weekly growth meeting, Alex is going to it. And then I open the assistants pane here, and I think what I'm going to say is what I was just talking about: are there any recent PRs? Just tell me about them. You'll see here... actually, this is an open source repo, but the explicit scopes on this... I've wiped my user out of the system.
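
For reference, a minimal sketch of the pattern he is pointing at, assuming current LangGraph: `interrupt()` pauses the node to surface an authorization link, and the client resumes with `Command(resume=...)` once the user confirms. The auth check and URL are stubs, not the Archer implementation.

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.types import interrupt, Command
from langgraph.checkpoint.memory import MemorySaver

class State(TypedDict):
    result: str

def call_tool(state: State) -> State:
    authorized = False  # stub: pretend the user hasn't granted access yet
    if not authorized:
        # Pause the graph and surface the auth link; execution resumes when the client
        # sends Command(resume=...) after the user clicks "I completed the authorization".
        interrupt({"auth_url": "https://example.com/oauth/consent"})
    return {"result": "listed calendar events"}  # placeholder tool result

builder = StateGraph(State)
builder.add_node("call_tool", call_tool)
builder.add_edge(START, "call_tool")
builder.add_edge("call_tool", END)
graph = builder.compile(checkpointer=MemorySaver())

config = {"configurable": {"thread_id": "demo"}}
graph.invoke({"result": ""}, config)                 # runs until the interrupt
graph.invoke(Command(resume="authorized"), config)   # resumes after the user confirms
```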

Samuel Partee [00:20:13]: At this point, I haven't actually logged in. You'll see this example of, after you log in, you don't have to log in again after this. But now here, because it could be a private repo, we've safeguarded it, so you don't have to do that in every scenario. It's up to the tool developer, and you can look at the tool code to be able to see what it's going to do. So, yes, I'm done. Same thing. That one's really fast. Wow. This is definitely sped up, by the way.

Samuel Partee [00:20:42]: So you see some PRs. Oh, dang it. This thing is always so finicky. Hide. Oh, all right. I'll just let it run. Tell me about them. Tells me about them.

Samuel Partee [00:21:02]: Click. Authorization complete. Awesome. Done. Tells you about the PRs. The cool thing is now I can go and ask about that PR, which actually is not an elevated level of privilege because it's in the same scope for GitHub, so, same level of privilege. It's now going to go back to GitHub and it's going to get more information about it.

Samuel Partee [00:21:24]: You can see all the information about that Notion PR. Just because that was really fast, I'll leave it on here. That is the new Notion toolkit. If you're a Notion user and you want to use it with LLMs, come check it out. I think... yep, that's it. If you want to come and check it out, it's on GitHub. That's the open source repo, and the website is there for the docs. I think that's actually the marketing website, but the docs link is right there as well.

Samuel Partee [00:21:53]: Thank you.

Demetrios [00:21:55]: Very good, sir. I think that was... that was... was this good? Hold on, I have...

Samuel Partee [00:22:08]: It.

Demetrios [00:22:08]: Was that good? You didn't even see it, did you? You were looking at your shared screen.

Samuel Partee [00:22:12]: I was like, I was trying to find "Stop sharing." I don't use Chrome. I'm one of those people that have moved on past Chrome.

Demetrios [00:22:20]: Congratulations, I guess. So, dude, there's some questions coming through here, and the main one that, you know, people were going to ask is: how does OTC relate to MCP?

Samuel Partee [00:22:36]: They work together. You want to use an MCP server with OTC, go right ahead. It supports SSE. The thing with MCP right now is that it is built really to run on a desktop. Now, there's a lot of discussions online about how to improve it past that, and I think the team over there is doing a good job. But right now the protocol really doesn't support the really large scale use cases that an enterprise would have. And so we really wanted something more general that would eventually support something like gRPC.

Samuel Partee [00:23:07]: Right now it's HTTP and it's stateless, so you're able to use something like serverless. There's a lot of different things, but in this way we want the communities to be able to work together. It is not either/or. It is very much, like, we want it to be open, hence the name.

Demetrios [00:23:24]: Incredible. So, multi-agent auth will slow down response time?

Samuel Partee [00:23:30]: It's really fast. Most of it's cached.

Demetrios [00:23:33]: Take our word for it.

Samuel Partee [00:23:35]: Well, I mean, the only time is when you log in, right? It's going to be cached the next time. It's cached, and Redis sits in front of the API, and you're going to have millisecond response times.

Demetrios [00:23:49]: Nice, I like hearing that. So, one difference between the familiar and not is, often with agents, I want them to have one-time access to a function. How do I make it easy to revoke a set of permissions?

Samuel Partee [00:24:06]: Right now that's supported in the dashboard. You can hit the API as well, but I don't think it's actually in the reference documentation. It might be, but it is in the dashboard. You just click on the service you want to revoke for and click Revoke Token. So you will see a list of users, everybody that's used that service, and you can click Revoke Token as well. That kind of use case is also really interesting. We've heard that a couple of times now. I'd love to hear more about why you want it to be revoked after one use, so it would be great if you email me or something.

Samuel Partee [00:24:37]: Yeah, good question.

Demetrios [00:24:38]: And Arthur's saying specifically like going into Chrome Settings and doing it each time is a pain in the ass.

Samuel Partee [00:24:47]: Going into Chrome settings?

Demetrios [00:24:49]: Yeah. Like revoking access, right?

Samuel Partee [00:24:52]: Yeah. So, I mean, if you want, when you call it to get the auth token, you can use it and then revoke it right after the call, right away. Yeah, you wouldn't have to do that.

Demetrios [00:25:03]: Oh, interesting. There might be.

Samuel Partee [00:25:06]: Obviously you got to use it first, right?

Demetrios [00:25:08]: Yeah.

Samuel Partee [00:25:09]: Otherwise it's not going to work.
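
A purely hypothetical sketch of that use-then-revoke pattern; the client class and method names are placeholders, not a documented API.

```python
class AuthClient:
    """Placeholder auth client; the real API would come from the service's docs."""

    def get_token(self, user_id: str, service: str, scopes: list[str]) -> str: ...
    def revoke_token(self, user_id: str, service: str) -> None: ...

def one_shot_call(auth: AuthClient, user_id: str) -> None:
    token = auth.get_token(user_id, "github", ["repo:read"])
    try:
        # Use the token exactly once here; the tool call has to happen before revocation.
        pass
    finally:
        # Revoke immediately after the single use, so the grant doesn't linger.
        auth.revoke_token(user_id, "github")
```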

Demetrios [00:25:11]: It's like that meme where you put the stick in the bike.

Samuel Partee [00:25:16]: I love that one actually.

Demetrios [00:25:18]: So there might be a long tail of tools. How does one integrate my tool into Arcade?

Samuel Partee [00:25:24]: Oh, great question, love that. The tool SDK is in the repo that was linked on the last page. It is really easy. All of our toolkits are open source. You can see the code that we're running in the cloud, you can see everything, and you can fork them. You want a different version of list emails, you want a different version of go-get-Jira-tickets, you can do that with open source code.

Samuel Partee [00:25:43]: Fork them, use them immediately. And we just launched remote workers, which enables you to have a deployment done for you. You can just say rk deploy, and it's going to take whatever's in your toolkit pip package, launch it on a serverless worker, and then connect it to the cloud engine. From that point on you can use whatever you want, whatever tool-calling API or LLM, LangChain, OpenAI, what have you, and you don't have to worry about pretty much any of the deployment or the setup. Kind of like Vercel.
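
A hypothetical sketch of what a custom toolkit entry might look like; the decorator and scope annotation below are placeholders standing in for a tool SDK's registration mechanism, not Arcade's actual SDK.

```python
from typing import Annotated

def tool(requires_auth: str | None = None, scopes: list[str] | None = None):
    """Placeholder decorator standing in for a tool-SDK registration decorator."""
    def wrap(fn):
        fn.__tool_meta__ = {"requires_auth": requires_auth, "scopes": scopes or []}
        return fn
    return wrap

@tool(requires_auth="github", scopes=["repo:read"])
def list_pull_requests(
    repo: Annotated[str, "owner/name of the repository"],
) -> list[str]:
    """List open pull requests for a repository (stubbed)."""
    return [f"PR titles for {repo} would be fetched here with the user's token"]
```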

Demetrios [00:26:16]: Oh, nice. Yeah, I was kind of thinking, oh, that's almost like a Docker container, but different, similar.

Samuel Partee [00:26:23]: It's lighter. Much lighter.

Demetrios [00:26:25]: Okay.

Samuel Partee [00:26:26]: It's like, you know how Vercel packages things up... you know, it's more similar to that. I wouldn't say it's even close to what Vercel is, because it's a brand new feature. So if there are sharp edges, please don't roast me on that.

Demetrios [00:26:37]: Like, yeah, it's like we're basically Vercel.

Samuel Partee [00:26:40]: Yeah. I was like, that is not what I'm saying. That is not what I'm saying. No.

Demetrios [00:26:44]: Incredible. All right, so another question for you. This might have to be the last one because we're going over a little bit on time. Are there any examples of pre-built agents? I can't seem to find any on your docs page. Where do I look to find some pre-built agents?

Samuel Partee [00:26:59]: If you click on LangChain and then the create React agent, you just drop the tools right into it. That was just updated to 1.2, so it now supports async as well. You can use any of the pre-built agents. There's a function for LangChain, and then I think we're working on ones for CrewAI and for OpenAI's new framework. So you'll be able to just take our tool definition and immediately get that tool definition. Also, if you want to use the cloud-hosted tools without ever worrying about, like, a local manager, you can use the tool format API in our cloud. So you can get them for any model: you just pull the format, then you take that predicted set of parameters, and then you can just call our cloud, and so you never have to worry about, like, setting up the tool locally.

Samuel Partee [00:27:47]: So anthropic OpenAI what have you Grok.

Demetrios [00:27:50]: All right, time for you to get out of here.

Samuel Partee [00:27:54]: You already see me. Well, that was awesome. Always love this. Thank you.

Demetrios [00:27:58]: I only had 25 minutes with you, but we're behind and I am really excited for the next talk, I'm not gonna lie. So Sam, thank you. If anyone wants to continue the conversation with Sam, he is very active on LinkedIn and Twitter, or just message me on Twitter. Yeah, exactly. And we'll drop your email in the chat so you can sign him up for a bunch of random newsletters. We'll put his address too, might as well, in case you want to send him magazines.

Demetrios [00:28:27]: All right, see you later.
