MLOps Community

Why we built PydanticAI, and why you might care // Samuel Colvin // Agent Hour #2

Posted Dec 19, 2024 | Views 24
# Pydantic
# Agents
# Agent Hour
# AI agents in production
Samuel Colvin
Founder @ Pydantic

Python and Rust engineer. Creator of Pydantic and Pydantic Logfire. Professional pedant.

SUMMARY

Abstract: In this talk, Samuel goes into more detail on why they built PydanticAI and what problem they're aiming to solve. He also covers some of the future enhancements they plan for PydanticAI.

Bio: Python and Rust engineer. Creator of Pydantic and Pydantic Logfire. Professional pedant.

This is a bi-weekly "Agent Hour" event to continue the conversation about AI agents. Sponsored by Arcade AI (https://www.arcade-ai.com/)

TRANSCRIPT

[Link to presentation](https://github.com/samuelcolvin/boston-ae/blob/main/slides.md)

Demetrios [00:00:05]: So I'm excited. This should be fun. I thought I was going to be able to share a little music and set the mood right, but I couldn't figure that out in the allotted time, and I just want to get rocking now that you're here. I think it would be awesome to just kick it off and get us going hard. And for those that are wondering, we have two talks today. We did this Agent Hour two weeks ago, and we had the breakout room sessions, which were a lot of fun.

Demetrios [00:00:45]: So we wanted to do it again, but we wanted to concentrate the breakout rooms into one place. So this time around, after the talks, we're going to have just one breakout room. It's all going to be here with us, so you can just hang out. We're going to have the talks. Samuel, you're up first, and then after that we will hopefully have some cool discussions.

Demetrios [00:01:15]: If there's questions, I'm sure I'll have plenty of questions to ask. The fun part about this is that I think there are some people looking for how to get into this room right now. So if you are watching us on the livestream, you hit the agenda on your left, and that should ping you right in here. And once you're in, it's easy. But yeah, Samuel, if you want to share your screen or anything, I think.

Samuel Colvin [00:01:51]: Yeah, I will do that. I will try and put you on this screen and put what I have in the way of slides on this screen.

Demetrios [00:02:03]: There we go.

Samuel Colvin [00:02:03]: And I can get going. So.

Demetrios [00:02:08]: Boom.

Samuel Colvin [00:02:09]: Can you. I presume you can. You can see that?

Demetrios [00:02:12]: Yeah. Nice and big, too. Thank you. That's good.

Samuel Colvin [00:02:16]: I hope that's a good size. If anyone is struggling to read anything or struggling to understand, I can see the messages, so send a message and I'll try and fix it. I've got frustrated with every possible deck format, so I'm using this plain text format; I hope it's not too annoying. Remind me how long I have to talk, just so I—

Demetrios [00:02:37]: I've got a feeling 15 minutes.

Samuel Colvin [00:02:40]: Okay. So I gave a version of this talk before and it took a bit longer, so I'm going to whiz through and assume most of you have some idea of who I am and what's going on. But yeah, I started Pydantic ages ago, back in 2017. It became a company in 2023. We built Logfire, which is our observability platform, which you should go and try if you're building in Python. The first thing we did was release Pydantic V2, the rewrite in Rust, last year.

Samuel Colvin [00:03:07]: Pydantic today is downloaded 300 million times a month. So it's important to remember, obviously we're talking about GenAI today, but one of the powers of Pydantic is that it is not just for GenAI. It is widely used in general development, used by everyone who's writing Python, which is basically everyone. But obviously it's got a kind of new lease of life from GenAI, where it's used in general API stuff, but in particular in validating structured responses. So Pydantic is, I kind of said, ubiquitous, boring and generally liked by developers. "Loved" is perhaps a bit strong, but I'll try it. And again, going back to my point before that Pydantic is general purpose, I think it's worth just repeating.

Samuel Colvin [00:03:51]: This is what Pynxic was built to do. Pynantic long predates LLMs and the basic idea is you define a model like this, which has. We use type hints in Python to define your types, but unlike normally type hints which do as per the name, are just a hint and don't do anything at runtime, we basically use those type hints to enforce that type. And in particular, and this is obviously super relevant to Genai, we're kind of lax in the sense we try to coerce types. So if we see. You'll see here that pycharm is complaining that ID is supposed to be an integer, I've passed it a string. Pydantic by default will try to coerce values. Similarly, this is an ISO8601 or RFC339 format date, but it's a string.

Samuel Colvin [00:04:41]: Pydantic will take care of coercing values from, in this case, a string into a date. And we'll do the same in JSON. So we'll take a JSON input. Again, the coercion thing is super valuable because there's no date type in JSON. If we were being super strict, we literally could not define a Python date in JSON directly, so we have to do the coercion thing. Then the last thing that we built into Pydantic: Sebastián Ramírez, who built FastAPI, actually originally contributed JSON Schema support. At the time it was all about APIs. We didn't even know JSON schema was going to go and get used by LLMs to define tools, but we built that long ago, which turned out to be super valuable.
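(To make the coercion behaviour concrete, here is a minimal sketch using today's Pydantic v2 API; the model and field names are illustrative, not taken from the slides.)

```python
from datetime import date

from pydantic import BaseModel


class User(BaseModel):
    id: int
    name: str
    signup_date: date


# Pydantic's default (lax) mode coerces compatible inputs: the numeric
# string becomes an int, the ISO 8601 string becomes a datetime.date.
u = User(id="123", name="Samuel", signup_date="2024-12-19")
assert u.id == 123
assert u.signup_date == date(2024, 12, 19)

# The same coercion applies to JSON input, where no date type exists.
u2 = User.model_validate_json('{"id": 1, "name": "Ada", "signup_date": "2021-01-01"}')
assert u2.signup_date.year == 2021
```

Note that in strict mode (`model_config = ConfigDict(strict=True)`) the string-to-int coercion above would fail; the lax default is exactly what makes JSON round-tripping of dates workable.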

Samuel Colvin [00:05:25]: This is old school use of Pydantic for general purpose programming. What then happened, and what everyone realized, was: oh, when we're making calls to OpenAI, and in particular doing tool calls, Pydantic becomes super valuable. So we have our same model defined here. We're giving it a docstring, and with a little bit of work and a bit of ugly use of dunder methods, we can define tool calls entirely from our Pydantic model: the parameters come from the JSON schema, we take the name from the name of the base model, the docstring becomes the description, and hey presto, we get tool calls. And then of course the great thing is you can then use that same model to go and validate the JSON you get back from, in this case, OpenAI, and give you a user, or generate errors as to why that data was invalid. This is basically the thing everyone uses Pydantic for. You can now do this built into the OpenAI SDK.
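(A hedged sketch of that pattern: in Pydantic v2 the schema is available through the public `model_json_schema()` method rather than dunder tricks. The tool dict below loosely follows OpenAI's function-calling shape and is illustrative only.)

```python
from pydantic import BaseModel


class User(BaseModel):
    """Extract the user mentioned in the text."""

    id: int
    name: str


# The class name becomes the tool name, the docstring its description,
# and the generated JSON schema its parameters.
tool = {
    "name": User.__name__,
    "description": User.__doc__,
    "parameters": User.model_json_schema(),
}

assert tool["name"] == "User"
assert "id" in tool["parameters"]["properties"]
```

The payoff is symmetry: the same `User` model that produced the schema validates the JSON arguments the model sends back, via `User.model_validate_json(...)`.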

Samuel Colvin [00:06:32]: It's also used in all of the agent frameworks, LangChain, Crewai, phy, data, instructor, etc. Etc. Etc. I can't even remain remember the name of all of them, but they're like tricks that they're all really doing. And the thing that Pynanzing is most useful for is basically this thing. And so the kind of reason I'm talking today is that we decided that, to put it politely, we weren't that happy with any of the agent frameworks. We didn't particularly want to go and build with any of them, but we wanted to build with Genai. And so we released Pydantic AI, what, like two weeks ago now? It's been quite a manic couple of weeks, but yeah, I think two weeks ago on Monday is roughly when we released it.

Samuel Colvin [00:07:16]: And it's had obviously an amazing, amazing reception. But like basically it's a number of different things, but it's kind of a wrapper for this thing fundamentally. So again, we have our Pydantic base model defined, but we're importing agent from Pydantic AI and we're setting the result type to be user, which is going to go and do that same setting up the tool call. But this comes with a bunch of other nice stuff. In particular we get reflections. If validation fails, Pynantic AI will take care of taking the validation errors that Pylantic gave Us sending them back to the model and saying, please try again. But there's more it can do. Dependency injection is something that we.

Samuel Colvin [00:08:03]: Again, pedantic AI is not about producing the best possible demo or giving you the easiest experience in the first 10 minutes. It's about building production applications. And so dependency injection and type safety are really critical. Type safety doesn't particularly matter when you're giving a presentation, but when you're trying to build a real application, it's super valuable. And so in this case, we define our dependencies Here we have an HTTP connection, we have some API keys, and we define deps here. Depth type when we define the agent, then we go and register some tool calls. In the first example we were using just the result type, which internally is a tool call to give you structured data out of the agent. But here we're defining tools that are effectively discretionary, so the agent can, sorry, the model can choose to go and call them or not.

Samuel Colvin [00:08:55]: In this case, we define a getlat long tool call which gets us a location, and we define a weather tool call which given a LAC long, will return the weather. And then we go and run our model. Here we define our depths and we pass them in as a keyword argument when when calling the agent and we get our result. But the powerful thing here is the type safety. So using some clever tricks in Python's typing that you don't need to worry about, we can basically, with static analysis, guarantee that you're using the right depths. And so if you run mypy or pyrite over this and your depth type here is incorrect, you'll get an error. And obviously once you define your depth type here, when you come to access your deps. So here we have context deps client which is accessing this attribute.

Samuel Colvin [00:09:53]: If you accessed it wrongly or if you called it wrongly, you would get a type checking error that's super powerful. The overall workflow here is that the model would be clever enough to say, okay, I'm going to take the input which is let's get the weather in London and in Wiltshire, let's extract the locations from that. Let's then go and call the get that long function to extract to turn those locations into a latitude and longitude and then take those values and use them to call the weather function to get the weather and then return it. At the risk of never run a live demo in a presentation, especially a 15 minute one. I will try and run that as an example. So this is that same code. The only difference is there's a Bit more code to get the weather to look pretty. But if we go and oh, Nice error from GitHub Copilot.

Samuel Colvin [00:10:46]: If we go and run that, oh, it's not going to run like that. But if I exit this mode and I come here and I try and run, what was I running? Run that example. What we should see is it making the relevant calls and then coming back with a response and giving us the weather. So it so OpenAI in this case. And I haven't talked about model agnosticism, but we have good support for other models. We'll take care of effectively calling the right tools. You can see here that it's calling them together. So it's calling get that long twice and getweather twice in parallel.

Samuel Colvin [00:11:30]: But if we come back to the slides and we go on, this is all very well, but the problem is what's that model doing internally? And you started to see the beginnings of it just now. But obviously we think observability is really important. That's why we've built Logfire and why we've built an optional integration into Pydantic AI so that you can use LogFire to understand what your agent is doing, so what tools it's calling and how long they're taking. So I have. I'm not going to open up Logfire on this occasion, but. Well, the first thing is you saw immediately here. This is Logfire giving you an output of what happened that's nested in the terminal. But if you open the Logfire dashboard, you get effectively the same view, but with much more information.

Samuel Colvin [00:12:17]: Not only do you see what took how long in not only the LLM calls, But also the HTTP calls to the different APIs, but you can also see on every given line exactly what's happened, the cost in terms of tokens and what's taken how long. We think observability is super valuable, and we've seen that immediately in Logfire. Loads of people coming to use logfire off the back of finding it through pylantic AI, but to go on to a few other things. And now I'll talk for a little bit about a couple of things that are coming up soon in Pydantic AI, but aren't quite there yet. Agent handoff is a big subject. It's how Swarm got a lot of its attention. You can already do this with Pydantic AI by effectively registering other agents into dependencies like this and then calling those other agents from within a tool call. But we're just about to add support for basically syntax for adding another agent directly to an agent.

Samuel Colvin [00:13:25]: So here, instead of using the decorator to add tools, we add tools using the keyword argument tools and then we have this agent tool which you will see pycharm is complaining doesn't exist because I'm literally working on the PR right now. But the idea is that you can register agents directly with an agent. Again, we can do some clever stuff to make sure the types are correct. So all of these agents need to have the same deps type. So you can then pass deps between agents as a call, as a run is ongoing and have confidence in the typing. We obviously have an input type now which is then used to validate the arguments passed to the tool. And I think this will be. This is one of the two forms of model or agent composition.

Samuel Colvin [00:14:14]: That's super exciting. So this is the like agent handoff and then we have the kind of calling multiple different agents in sequence, which is the other, the other big thing for us to go and add. I don't quite know what the API is going to be for that yet, but that's another thing we're going to talk about in future. One of the nice things about doing this in a structured way like this, rather than declaratively, imperatively, excuse me, by inside a tool is we can go and build a state machine for what this looks like statically and tell you the different agents that are being called under what conditions and what the different flows are of a multi agent model. I mean, one of the problems we have honestly with pylantic AI is that agent isn't quite the right word because our agents are quite small and self defined. They're almost like agentlets. And so yeah, you end up composing them together to form actual as components rather than necessarily trying to bundle all of your logic into a single agent. I see some questions, but I'm just going to go through a couple more things and then I can try and answer another thing that's coming up.

Samuel Colvin [00:15:26]: Many of you will be aware of Model Context Protocol, which Anthropic announced probably I said last week, but it's probably two weeks ago now. And again we want to add this concept of tool sets to pylantic AI, where instead of registering a single tool, you can register tool sets and these can either use Model Context protocol or you can define your own ones or we will have tool sets for things like. I meant OpenAI. Sorry, I do mean Open API there. So using Open API and JSON schema, you could register a tool Set using purely an API's open API endpoint, or using Model Context protocol, or defining your own ones. So, but the idea here is that like if you're building an application that's using Genai, enormous amounts of your work is basically doing the boilerplate to integrate with, let's say, Slack's API. This should allow us to have a rich library of different existing tool sets that you can basically register if and when you want them. Thank you very much.

Samuel Colvin [00:16:34]: I don't know how much of the 15 minutes I've used up. I tried to go really quickly. I may have gone too quickly, but I can answer some, some questions now or later on. Whatever you, whatever you think is best.

Demetrios [00:16:43]: Yeah, we've got a few minutes and anybody that is on the call, if you want to just jump on and ask questions live in your voice, go for it. I threw one in the chat.

Samuel Colvin [00:16:56]: I'll just answer a couple of things I've seen here. I'm seeing hello from Cohere. I think it's worth saying the examples I talked about here were mostly using OpenAI but like we already have support for Anthropic Gemini Grok Olama and there's, there's a PR on for co here and we're happy to add that we do. Actually, one of the things that's been amazing in, in the time that we, since we released this is the number of people who've come and, and contributed models to the point where we started to have to say no to some lesser known ones because we don't want to be just maintaining loads of, loads of different model integrations. But cohe is definitely one we will accept. What were the frameworks lacking? I don't want to be too rude about existing people who built open source. None of them did what I wanted. In particular, they were not particularly type safe.

Samuel Colvin [00:17:49]: If you go and invent some sexy but esoteric syntax to define chains of models, you end up losing all the type safety that is, that is available in modern Python. And that's very frustrating. And so it was the production readiness in particular that was, that we had trouble with.

Demetrios [00:18:11]: Yeah, I think Achilles you. Is that how I pronounce it? Somebody's got their hands up.

Achilleas Pasias [00:18:20]: Yeah, yeah, that's me. Can you hear me?

Demetrios [00:18:23]: Yeah, yeah, yeah, we hear you.

Achilleas Pasias [00:18:26]: Great presentation, awesome release. Guys, I have a few questions. The first is right now, the agent, you define some tools and the agent decides to call some tools until it finishes the task. Can I override this logic? I mean, for example, an agent might want to Use a tool, but depending on the arguments and the tool that it wants to use, I want to follow a different flow to do something differently than the standard flow the agent would do internally. Is this easy to customize?

Samuel Colvin [00:19:11]: I mean, the. The whole principle of tools as they are defined by the underlying model providers is they are these discretionary things that a model can decide to go and call. If you wanted to have a tool that behaved differently depending on your own context, you can have that logic inside the tool. So you take the different arguments you want and you decide what to do inside your tool, call. The other thing is, if you want to get the result out, rather than have it as being discretionary, you can use this syntax of structured results which effectively require the agent to go and call this tool to end a particular run. And we have a PR open to basically add a way to exit a run early in the case that you basically call a function. And within one of these function tools you can basically end the particular run. Again, we don't want to add too much esoteric custom Pynantic AI syntax.

Samuel Colvin [00:20:10]: We want the library to be relatively thin and allow you to fall back to writing Python code for everything that doesn't need to be within the library. I mean, I think that is like good principle of building any software library, any Python library. Don't go and do things that don't need to be in your library. Be relatively cautious about adding functionality. And I've learned that from having maintained a Library that has 300 million downloads and has been around for five years. And if you accidentally go and add something because it seemed cool once and now you're stuck with maintaining it. Yeah, so I think it's worth being cautious about those things. I mean, I got excited one day and defined a color type in Pydantic because I wanted to do something with colors.

Samuel Colvin [00:20:55]: And now we still have a color type that hangs around, that passes hex colors, which is not something that anyone would expect to be in Pydantic core in Pyntex itself. So yeah, I think we have the right predicates here, but if there are particular things people want, we can definitely think about adding them.

Achilleas Pasias [00:21:12]: Excellent, thank you very much.

Demetrios [00:21:16]: The code is this in some repo somewhere or the best AKA the best slides ever?

Samuel Colvin [00:21:23]: I think they are on my GitHub they're just a markdown file. If they're not today, I think they're called, confusingly they're called Boston because I talked at a conference in Boston and I use them. Use them there but I can. I'll tweet the link after this as well. So yeah, if anyone is looking for them, they can find them.

Demetrios [00:21:41]: Sweet. And then Thomas was saying, is this a replacement for Llama Index or LangChain?

Samuel Colvin [00:21:49]: That's your choice. If you're happy with Llama Index or LangChain, then obviously you're totally free to carry on using them. I'm personally going to use it instead of instead of them. I mean, in the particular case of Llama Index and rag, we don't have yet a model agnostic interface for generating embeddings, but I think we'll add it. But again, coming back to my point about production and not trying to do too much, in the end, vector search is database querying and we don't want to go and build an orm, especially an ORM inside Pylantic AI. And so in the end we're going to give you an interface for generating embeddings, but probably not an interface for querying, because to do that in an actually production ready way, you need full ORM or write SQL, which would be my personal preference. Or depending on whatever database you use, awesome.
