MLOps Community

Exploring AI Agents: Voice, Visuals, and Versatility // Panel // Agents in Production

Posted Nov 15, 2024 | Views 1.2K
# AI agents landscape
# SLM
# Agents in Production
speakers
Diego Oppenheimer
Co-founder @ Guardrails AI

Diego Oppenheimer is a serial entrepreneur, product developer and investor with an extensive background in all things data. Currently, he is a Partner at Factory, a venture fund specializing in AI investments, as well as a co-founder at Guardrails AI. Previously he was an executive vice president at DataRobot and Founder and CEO at Algorithmia (acquired by DataRobot), and he shipped some of Microsoft’s most used data analysis products, including Excel, PowerBI and SQL Server.

Diego is active in AI/ML communities as a founding member and strategic advisor for the AI Infrastructure Alliance and MLops.Community and works with leaders to define AI industry standards and best practices. Diego holds a Bachelor's degree in Information Systems and a Master's degree in Business Intelligence and Data Analytics from Carnegie Mellon University.

Jazmia Henry
Founder and CEO @ Iso AI

Jazmia Henry is a trailblazer in AI, Reinforcement Learning, Machine Learning, and Analytics. As the founder and CEO of Iso AI, she's transforming how enterprises handle AI agents at scale. Jazmia's leadership experience has spanned roles at Microsoft, Morgan Stanley, and The Motley Fool, where she has developed AI agents for major financial institutions and led teams creating adaptive systems for real-world challenges. Her graduate education includes Columbia University and Oxford, along with fellowships at Stanford and a certificate from Harvard Business School. At Iso AI, Jazmia focuses on cutting-edge simulation environments for AI training and testing. She's driven by the belief that AI's effectiveness depends on its testing quality, leading her to develop tools ensuring AI systems are intelligent, reliable, and scalable. She has recently opened a consulting arm of her company that helps enterprises develop and deploy AI agent solutions. Jazmia is equally passionate about the human side of AI, exploring its intersection with human behavior and advancing human-computer interaction. She's dedicated to helping others harness AI's full potential across industries.

Rogerio Bonatti
Researcher @ Microsoft

Rogerio is a researcher in the Applied Sciences Group at Microsoft, where he develops AI-based experiences for the new Windows Copilot. His research focus is on multi-modal foundational models (LLMs, computer vision) for computer agents. Rogerio holds a PhD in Robotics from Carnegie Mellon University.

Joshua Alphonse
Director of Developer Relations @ PremAI

Joshua is a seasoned Developer Advocate who leads Developer Relations at PremAI. Joshua has spent his time empowering developers to create innovative solutions using cutting-edge open-source technologies. Previously, Joshua worked at Wix, leading Product and R&D engagements for their Developer Relations Team, and at Bytedance. He successfully created content, tutorials, and curated events for the developer community.

Julia Kroll
Applied Engineer @ Deepgram

Julia Kroll is an Applied Engineer at Deepgram, where she provides engineering and product expertise on speech-to-text and language models, enabling developers to use language as the universal interface between humans and machines. She previously worked as a Senior Machine Learning Engineer creating natural-sounding AI voices, following five years at Amazon, where she contributed to machine learning and data engineering for AWS and Alexa. She holds two computer science degrees, a master's from the University of Wisconsin-Madison and a bachelor's from Carleton College. Her interests lie at the intersection of technology, linguistics, and society.

SUMMARY

This panel explores the diverse landscape of AI agents, focusing on how they integrate voice interfaces, GUIs, and small language models to enhance user experiences. The panelists also examine the roles of these agents in various industries, highlighting their impact on productivity, creativity, and user experience, and how they empower developers to build better solutions. They also address challenges such as ensuring consistent performance and reliability across modalities when deploying AI agents in production.

TRANSCRIPT

Diego Oppenheimer [00:00:04]: It sounds like we are live here. So first of all, thanks everybody for coming to the panel. I'm really excited about what we'll be talking about today. So today's topic is exploring AI agents: voice, visuals, and versatility. At the core of it, that's really what we'll be talking about. We have this great, great set of panelists to talk through how agents are using different kinds of interfaces: everything from integrating voice, to new kinds of UIs, to the use of smaller models, small language models in particular, for a bunch of different usage patterns.

Diego Oppenheimer [00:00:42]: I'm particularly excited because we have a lot of practitioners on this particular panel. I'm going to allow people to introduce themselves and we can go from there. Just because of how I have it on my screen here, left to right, Jazmia, you get to go first. So why don't you give us the short 30 seconds on you and what you're working on, and then we'll pass it on from there.

Jazmia Henry [00:01:08]: Yeah. Hi, I'm Jazmia Henry. I have been working in data land for over a decade. Currently I am the founder and CEO of Iso AI, where I'm building out open source models for agentic AI, specifically trying to help agentic AI have better context.

Diego Oppenheimer [00:01:27]: Awesome. Rogerio. Yeah.

Rogerio Bonatti [00:01:30]: Hi everyone. It's great to be here. My name is Rogerio Bonatti. I'm a researcher at Microsoft, and a lot of my research focuses on multimodal AI, so combining visual information with text and also decision making. Lately I've been focusing my work on agents and what it takes to develop the next generation of computer-based agents.

Diego Oppenheimer [00:01:53]: Fantastic. Julia.

Julia Kroll [00:01:56]: Hi, I'm Julia. I'm an applied engineer at Deepgram. I've been working in the space of data, machine learning and language AI for the last eight years. At Deepgram, we're a foundational voice AI company, and we've recently released our Voice Agent API.

Joshua Alphonse [00:02:17]: Awesome.

Diego Oppenheimer [00:02:17]: And Joshua.

Joshua Alphonse [00:02:19]: Thank you. I'm Joshua, director of Developer Relations at PremAI. At Prem, we started as an applied research lab, but now we are a startup with a platform that allows you to build and fine-tune small language models, in particular with an autonomous fine-tuning agent. We have a lot of cool things going on and I'm really excited to join this panel with all of you.

Diego Oppenheimer [00:02:42]: Awesome. Fantastic. This is a live discussion. We get to jump in, jump out, go through it. I'll prime the topic. So, a really quick introduction of myself. I'm Diego Oppenheimer. I've built a couple of AI companies in the past and I'm super excited about this move into agentic workflows, but I'm definitely not the star of today.

Diego Oppenheimer [00:03:02]: So I'll let these folks jump in. We're seeing a lot of new interface changes and interactions emerge in AI agents, from text to voice to multimodal. And it seems like we're starting to see really compelling voice interfaces pop up in a bunch of places, and maybe this is the next natural evolution of what that human-AI interface looks like. So Julia, you've been in the space for a really long time. I'd love to hear what you're seeing. Why do you think voice might be the interface for agents, and what's exciting there?

Julia Kroll [00:03:45]: Yeah, absolutely. Voice is our natural mode of communication as humans. Like here on this panel, we're just having a chat. It feels very natural and fluid. And at their best, AI agents are basically meant to function like an AI version of a human agent. And so it feels really natural to have a conversation with them, be able to explain what you need, delegate to them. And in general with voice, the barrier to entry is lower. So you don't need to kind of learn a whole new system or paradigm.

Julia Kroll [00:04:20]: You can just say what you want to happen. And the AI agent can really naturally collect information and have a fluid back and forth to gather the information it needs to accomplish tasks for you.

Diego Oppenheimer [00:04:33]: And this is going to open for everyone. So makes a ton of sense in human agent interaction. Like, does it make sense for like agent agent interaction that voice would be like kind of like an interface? Or is this really like kind of like reserved for when interacting with humans? I'll open this up to anybody who wants to take it.

Joshua Alphonse [00:04:55]: Yeah, I mean, I've actually, I've seen some pretty interesting agent to agent interactions as well. I mean, voice is our like, you know, just like Julia said, you know, is our natural form of communication and it allows us to be hands free, allows us to be efficient and allows us to be ourselves. However, you know, we've seen some advancements actually like within this like last couple weeks, you know, like Even in the Web3 space with Truth Terminal and so forth. Right. Like there's agents that are having these conversations with each other and then creating their own ecosystems, creating their own economies and their own currencies as well. And it's just pretty insane to see the direction of where this is going. You know, even from just like a personal assistant to changing the entire interface of what the Internet is.

Jazmia Henry [00:05:44]: Yeah, I definitely want to piggyback on that. One of the things that we do with Iso AI is examine the ways that reinforcement learning has traditionally been used to allow autonomous systems to communicate. Part of that is leaning into all of the senses that we as human beings have, one of them, of course, being voice. But when we're talking about agent-to-agent communication, there can be a lot of power in using agents with swarm algorithms, having them bounce and piggyback off of each other just by observation. That could be by distilling voice and giving them different modalities to communicate with each other, like voice to text, or simply by observing another agent or another human being performing some action, either by video or by stream, and then distilling that into some type of reasoning pattern. There could be a lot of amazing ways for agents to be able to communicate with each other the same way that we are able to communicate, but also in ways that are specific to them.

Diego Oppenheimer [00:06:52]: Got it. And like, do you think, like, you know, one of the interesting things that I find with like, voice is really like, I mean, in general with working with machine learning, especially live systems, like latency performance, all this stuff kind of like always matters. Responsiveness. It's particularly important with voice when you're interacting with humans, right? Because like, there's a, like the kind of natural pattern of interaction falls off if you're not having that, like, really well provided kind of like, interface. I'm kind of curious like a, like some of the, you know, I'm going to split this into two parts of the discussion. One which is some of the challenges that exist there and what you've seen and kind of like what might be interesting things for the audience to think of when, when thinking about like, latency performance and then going back to the agent to agent interactions via voice, which I think is an interesting paradigm. I hadn't really thought about it that much. Like, do those things still matter? Right? Like, because again, like, you know, another agent will not be.

Diego Oppenheimer [00:07:46]: Will not find it weird if, like, you know, there's a little bit more latency. Obviously you want, like, high transactions. It almost feels like voice might be slow, you know, for an agent-to-agent interaction. Like, it could be way faster if it wasn't communicating that way. But I'll kind of open it up. So who wants to take on the general latency, performance, responsiveness part of the topic, and then we can talk a little bit about how it's applied to agent-to-agent.

Rogerio Bonatti [00:08:15]: Yeah, maybe I can make a few comments about the idea of latency. I have a few thoughts there. Most of the time when we think of agents, we are thinking about a system that is interacting live with a human. But especially in my line of research, when we talk about computer agents, that might not necessarily be the case. There could be specific types of actions where you don't necessarily have to be live. Let's say you had a computer agent and you need to modify a few documents or read over some documents. You don't necessarily need that to happen in real time in front of you; that could be an action that happens offline. So in those cases, of course latency matters and you want that action to happen as fast as possible, but maybe it's not as critical as for an interface that is interacting live with a human.

Rogerio Bonatti [00:09:06]: So I just wanted to make that distinction: sometimes latency is not that critical for an agentic system and you care more about precision than just having it fast. But of course there's a trade-off. Even in my research work I have evaluated multiple models, ranging from small visual language models all the way to the very, very large cloud models. And what I found is that, not surprisingly, there's no free lunch there. Usually a model that is much smaller is going to be faster. But unfortunately the smaller models are not as good as the larger ones; models with more weights are usually better at inference for agentic tasks.

Rogerio Bonatti [00:09:49]: So I don't know, it's kind of an unsurprising result. But that's what we found in our papers.

Jazmia Henry [00:09:58]: I think that there's a power in multimodality that we can see even in humans speaking with each other. Even while we're using voice with each other, we're also using vision. We're also using other types of context clues from the things that we've experienced in our lives. Even the fact that we knew that Rogerio was about to talk, because we could kind of see his body move in a certain way as he got closer to the camera. And so we all knew to not speak at that moment. That action came much sooner than when he began speaking, and we as humans were able to pick that up. The power of AI agents, especially as we're moving into the future of being able to reduce that latency, lies in how good we're getting at creating those other multimodal emotional modalities in how it communicates.

Jazmia Henry [00:10:49]: So when we begin to open ourselves up to more than just text or vision as separate entities, and begin looking at them as entities that can work together, that can definitely explode the way that AI agents are able to perform, especially when we're talking about latency in a live interaction situation. It can also begin to help with a lot of the issues that Rogerio points out with the smaller language models versus the large ones. There's been research coming out about the power of swarming small language models to get them to work better together than separately. A lot of that goes into different models all having different tasks and, through that, being able to communicate. I think that in the future we'll definitely begin seeing more and more instances where that latency is being reduced and that power is being increased, even as we have smaller models.

Diego Oppenheimer [00:11:51]: I mean, obviously the smaller models show a much lower general latency and behave faster. It seems the path forward is getting to these ensembles of smaller, task-specific models, so that you can get a lot of gains in speed in particular and in efficiency, not to mention the cost of running them and the hardware that you need for these things. Rogerio, I think you worked on a bunch of stuff here in your agent research, in terms of compression and new architectures. Maybe you can tell us a little bit about what's exciting there.

Rogerio Bonatti [00:12:33]: Yeah, like, what I can disclose is.

Diego Oppenheimer [00:12:40]: I think your microphone might be a little garbled there. We'll come back to you. I'll go back to that. I do want to talk about that. But while you get to fix your microphone, let's land this for the audience. A couple of real use cases. So you're all actually working on real world use cases. Let's kind of.

Diego Oppenheimer [00:12:58]: Maybe I can start with you, Joshua. Like, let's kind of talk through some of the real world use cases that you're landing right now or considering or maybe most excited. Pick your poison.

Joshua Alphonse [00:13:09]: Yeah, absolutely. There's a couple of things, right? Right now at PremAI, where I work, we're not necessarily an agent company. But we use a lot of agents in production on our end to give this type of experience to developers who may not have machine learning engineering resources to complete different tasks like fine-tuning. So right now at Prem, we're building an autonomous fine-tuning agent that can do synthetic data generation, evaluation, the whole nine. It has you covered from the very beginning. This is how we're trying to open up the democratization of AI, make it more accessible, and gather more adoption towards putting agents in production and so forth.

Joshua Alphonse [00:13:59]: And, you know, we're using this inside of our own software that we're pushing out. Another thing that I saw, outside of what we're building at Prem, goes back to the Web3 space, because I think it's just really interesting what's been happening this week. It's really exciting. There have been frameworks like Eliza that have come out that are for autonomous agents that can then do autonomous trading for you. So this is, again, another opening to the future of where finance is going: how to profit off of cryptocurrencies and do all the research towards it. This is actually something that's backed by the Binance research team, which has come to light within these last few hours or so. Yeah, pretty insane. But those are some of the things that I'm excited for as well.

Joshua Alphonse [00:14:53]: And, back to the small language models: we've been using these small language models and creating different architectures, like hierarchical architectures with a larger model sitting at the top and a bunch of smaller models distributed below it that have domain-specific tasks to complete for whatever you need. Obviously small language models aren't perfect at everything, but we're improving as we go.

Diego Oppenheimer [00:15:19]: Yeah, yeah, I've seen that arc. I've seen a couple of implementations where, for better or worse, a larger model is the initial endpoint and acts as a router, breaking down the major task and then handing the pieces down. I don't know if I would call them simpler tasks, but maybe more specialized tasks, handled by smaller models where you can actually get a bunch of cost, efficiency, and scale benefits. But you're still using the larger models up front, where you need the most generalized decisioning or competence.

Joshua Alphonse [00:15:58]: Precisely.

Diego Oppenheimer [00:15:59]: It's got to put your queries in.

Joshua Alphonse [00:16:00]: The right places, that's all.
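
As an illustration of the hierarchical pattern Joshua and Diego describe here (a larger model up front routing work to smaller, task-specific models), the following is a minimal sketch, not anything either speaker presented. Every name in it is hypothetical, the specialists are plain functions standing in for small models, and the keyword-matching `route` function is only a placeholder for what would in practice be a call to the larger model.

```python
from typing import Callable, Dict


# Hypothetical specialists: each function stands in for a small,
# domain-specific model. Real ones would call a model, not format a string.
def summarize(query: str) -> str:
    return f"[summarizer] {query}"


def write_sql(query: str) -> str:
    return f"[sql-writer] {query}"


def general(query: str) -> str:
    # Fallback used when no specialist matches the query.
    return f"[generalist] {query}"


SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "summarize": summarize,
    "sql": write_sql,
}


def route(query: str) -> str:
    """Placeholder for the large router model: return a specialist label.

    Keyword matching is an assumption made to keep the sketch runnable.
    """
    lowered = query.lower()
    for label in SPECIALISTS:
        if label in lowered:
            return label
    return "general"


def handle(query: str) -> str:
    # Route first, then dispatch to the chosen specialist (or the fallback).
    label = route(query)
    return SPECIALISTS.get(label, general)(query)


if __name__ == "__main__":
    print(handle("Summarize this quarterly report"))
    print(handle("Write SQL for the top ten customers"))
    print(handle("What is the weather like?"))
```

The cost and latency benefit in this arrangement would come from the specialists being cheap to run, while the expensive general model is only invoked once per request to decide where the query goes.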

Diego Oppenheimer [00:16:02]: Julia, what are some of the use cases that you're excited about, that you've seen in the real world or that are coming out in the real world? It's okay to give us a little bit of a glimpse into the future.

Julia Kroll [00:16:10]: Yeah, absolutely. So at Deepgram, we build foundational voice AI and agent models, and we expose APIs for companies to build on those. We're seeing a lot of early, exciting use cases. I would say some of the most common areas that we're seeing folks really jump into are order taking, appointment scheduling, and really personalized customer support or intake. So any kind of application where you want to instantly be able to help someone, but also be able to draw on deep context and background knowledge: being able to know their calendar, their medical history, their financial records, so that you can help them really quickly.

Diego Oppenheimer [00:16:56]: Yeah, I mean, I'm particularly excited. I'm biased here because I invested in an early stage startup in the space. But just startups that are figuring out how to navigate insurance call trees: we are going to save so much time once we don't have to sit there and navigate those trees ourselves. It's going to be great. I love it. Jazmia, you actually worked on a couple of autonomous system use cases. Maybe you can bring those to light, talk us through a little bit of what you saw there, and maybe a little bit of how you see this world playing out.

Jazmia Henry [00:17:34]: Yeah, yeah. I was at Microsoft building autonomous systems there. So I won't get too much into it, because I don't want my mic to also get garbled like Rogerio's. I'm not sure if it's because he was getting too... Yeah, I mean, we used the power of reinforcement learning, which is what my current company is doing: connecting reinforcement learning with agentic workflows in order to improve the way that agents are able to reason and the way they're able to solve problems they've never seen before. We've solved anything from robotic arms at PepsiCo being able to better create Cheetos, so that people are able to get that product out faster, to improving the performance of self-driving cars, drones, and autonomous vehicles, or even things that can actually help save lives, like being able to create baby formula. I'm not sure if people remember, but during COVID there was a moment where there was a lack of baby formula and babies were in need. The thing that ended up solving that was a project that I was on, where we created autonomous systems to produce it so that human beings wouldn't have to be in a factory and be exposed to each other during COVID.

Jazmia Henry [00:19:04]: So there are a lot of amazing opportunities that have already existed in the past, but as we're moving into the future, we'll begin seeing them change the way that we do things in everyday life. Those things were all very expensive, all very high impact. But now, as compute costs are going down and opportunities to use AI are becoming more democratized, we're going to begin seeing way more opportunities for people to use this to improve their lives, in ways that are going to be absolutely amazing. Got it.

Diego Oppenheimer [00:19:39]: Well, I think one of the most exciting parts about where we're going with agents in particular is, in my opinion, their ability to navigate existing systems. Right. If you go build out a full new system, I think that's interesting and that's great. But from my perspective, I always say, look, the most universal API in the world is language, right? If you have two completely disparate systems and you have two people who speak the same language, they can figure out how to use those systems. It's extremely inefficient, but they will figure out how to do that. And so going back to you, Rogerio, on this: talk to me about GUI vs. API-based agent paradigms and how that field might evolve. I think the biggest thing, and of course I'm making an assumption here, is really this ability to navigate current systems.

Diego Oppenheimer [00:20:38]: Right. That already exist and kind of overlaying this intelligence on it. So we'd love to hear more from you on this.

Rogerio Bonatti [00:20:44]: Yeah, and I hope my mic is okay now.

Diego Oppenheimer [00:20:46]: You're good, you're good.

Rogerio Bonatti [00:20:47]: Perfect. Okay. It's a great question, Diego. You know, most of the research that exists today on computer-based agents or web-based agents is focused on multimodal understanding, mostly image and text understanding, in the very same way that we as humans use a computer. So it's focused on segmenting pieces of the screen, like buttons you can click on, text boxes, and images, and then emitting actions that interact with these elements. I believe that in the short to maybe medium term that is the way to go, because that's how systems are designed today and there's no way around it. But in the very long term, I think that agents, computer agents specifically, are going to evolve towards calling APIs instead of interacting with graphical interfaces.

Rogerio Bonatti [00:21:43]: And I say this because it's much more precise for a system. I'll make up a use case. Let's say I want to order a pizza at Domino's. It's much more precise for a computer agent to call a Domino's pizza ordering API with very precise inputs and outputs than it is for the agent to go to Domino's website and then select, let's say, the type of pizza that I want and then my address, et cetera. Graphics-based interaction is much more prone to mistakes. So that's kind of how I see this field evolving. Maybe in a way this implies that it is going back to unimodal. Maybe it's going to be a text-only agent instead of multimodal.

Rogerio Bonatti [00:22:25]: So that's maybe an interesting research question as well for us to think about.

Diego Oppenheimer [00:22:30]: Anybody else? Any thoughts on that? Seeing the same pattern or the same thought?

Julia Kroll [00:22:35]: Yeah, I think that's a great example where you can combine the flexibility of a voice UI with the detailed implementation of a REST API. So you can have this user-facing agent who's just asking, hey, what kind of pizza do you want? What kind of toppings, what size? And you can stumble your way through it, you can change your mind, chat with your friends, and then the agent on the back end can distill that information into a very structured REST API call to put in your pizza order. I do agree that REST APIs are essential for building function calling applications, but I don't expect that it's going to be the end users who are going to be calling those APIs. It can be abstracted and handled by the agents.
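
To make Julia's point concrete, here is a minimal sketch of how a back-end agent might distill a meandering conversation into one structured request body. It is an illustration under stated assumptions rather than anything shown on the panel: `PizzaOrder`, `apply_update`, and `to_request_body` are hypothetical names, no real pizza-ordering API is called, and the per-turn `updates` are assumed to have already been extracted from speech by upstream speech-to-text and language models.

```python
import json
from dataclasses import asdict, dataclass, field, fields
from typing import Any, Dict, List, Optional


@dataclass
class PizzaOrder:
    """Hypothetical order schema; the field names are illustrative only."""
    size: Optional[str] = None
    toppings: List[str] = field(default_factory=list)
    address: Optional[str] = None

    def missing(self) -> List[str]:
        # Slots the agent still has to ask the user about.
        gaps = []
        if self.size is None:
            gaps.append("size")
        if not self.toppings:
            gaps.append("toppings")
        if self.address is None:
            gaps.append("address")
        return gaps


def apply_update(order: PizzaOrder, update: Dict[str, Any]) -> PizzaOrder:
    """Merge one turn's extracted slots; later turns overwrite earlier ones."""
    allowed = {f.name for f in fields(PizzaOrder)}
    for key, value in update.items():
        if key not in allowed:
            raise KeyError(f"unknown slot: {key}")
        setattr(order, key, value)
    return order


def to_request_body(order: PizzaOrder) -> str:
    """Serialize a complete order as JSON for a (hypothetical) ordering endpoint."""
    gaps = order.missing()
    if gaps:
        raise ValueError(f"order incomplete, still need: {gaps}")
    return json.dumps(asdict(order), sort_keys=True)


if __name__ == "__main__":
    # The user changes their mind about the size partway through.
    updates = [
        {"size": "large"},
        {"toppings": ["mushroom"]},
        {"size": "medium"},
        {"address": "1 Main St"},
    ]
    order = PizzaOrder()
    for update in updates:
        apply_update(order, update)
    print(to_request_body(order))
```

The conversation can wander and backtrack, but only the final, validated state of the order is ever serialized, which is the separation between flexible front end and precise back end that the discussion describes.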

Diego Oppenheimer [00:23:21]: Got it. So we're going to do a little bit of a round robin for this next one, and I'll start: what do you think right now is the next bottleneck for wide implementation of these multimodal agents? You get to pick one. There are multiple, but I'm just curious where you think the next bottleneck is in terms of either adoption or development. Where do you see the most immediate challenges? I'll start with you, Jazmia.

Jazmia Henry [00:23:52]: I would say context-specific data. There's a lot of data out there, but we as human beings have not yet begun interacting with AI agents enough to know how we're going to interact with AI agents in the future. And so I can see that being a big hindrance to us being able to create AI agents that have the level of specificity that people need to feel like it's actually working well.

Diego Oppenheimer [00:24:18]: Got it. Rogerio.

Rogerio Bonatti [00:24:21]: I'm going to say planning slash reasoning. In my research, I think being able to predict multiple steps ahead as the agent is planning, as opposed to doing a very greedy selection of the next action, is one of the biggest challenges going forward.

Diego Oppenheimer [00:24:39]: Is this the idea that right now mostly everything gets figured out inline, versus being able to say, hey, look at the whole problem and come back? I think one of the folks over at OpenAI, in the Situational Awareness paper, was talking a lot about this: one of the things that's missing is that we as humans get to look at a problem, think about the whole end-to-end problem, then come back and break it down. But right now, the way that most language models work is just kind of all in one go.

Rogerio Bonatti [00:25:12]: Exactly, exactly. I would say in the current paradigm of language models, what we do is called chain-of-thought prompting, where you ask the LLM to think step by step while solving, but that's only, let's say, one reasoning chain about what might happen or what should happen going forward. When I say planning, I mean that the LLM would somehow be able to create multiple of these chains of thought in parallel, think about these possible scenarios, backtrack from there, and then make the best choice for the very next step. And then it replans everything at every single step. So it's this ability that we have as humans to look ahead and think ahead and then go back to the present moment. Right.
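
The contrast Rogerio draws between greedy next-action selection and planning with lookahead can be sketched on a toy problem. This is an illustrative assumption, not his research method: the hand-written `TRANSITIONS` graph and its rewards are made up, and exhaustive search over that graph stands in for an LLM exploring several chains of thought. With `depth=1` the agent is greedy; with a larger depth it evaluates the branches ahead, commits only to the next step, and replans from the new state.

```python
from typing import Dict, Tuple

# Toy environment (hypothetical): state -> {action: (next_state, reward)}.
# Taking "a" from "start" looks best immediately, but "b" leads to a larger payoff.
TRANSITIONS: Dict[str, Dict[str, Tuple[str, int]]] = {
    "start": {"a": ("left", 5), "b": ("right", 1)},
    "left": {"a": ("end", 0)},
    "right": {"a": ("end", 10)},
    "end": {},
}


def best_value(state: str, depth: int) -> int:
    """Best total reward reachable from `state` within `depth` further steps."""
    if depth <= 0 or not TRANSITIONS[state]:
        return 0
    return max(
        reward + best_value(nxt, depth - 1)
        for nxt, reward in TRANSITIONS[state].values()
    )


def choose(state: str, depth: int) -> str:
    """Pick the next action by looking `depth` steps ahead (depth=1 is greedy)."""
    actions = TRANSITIONS[state]
    return max(
        actions,
        key=lambda a: actions[a][1] + best_value(actions[a][0], depth - 1),
    )


def run(depth: int) -> int:
    """Act until a terminal state, replanning from scratch at every step."""
    state, total = "start", 0
    while TRANSITIONS[state]:
        action = choose(state, depth)
        state, reward = TRANSITIONS[state][action]
        total += reward
    return total


if __name__ == "__main__":
    print("greedy total:", run(depth=1))
    print("lookahead total:", run(depth=2))
```

On this graph the greedy agent collects a total reward of 5, while the two-step lookahead agent collects 11, because it is willing to accept a smaller immediate reward for a better outcome later.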

Diego Oppenheimer [00:25:53]: Awesome. Julia, what do you see as the most immediate bottleneck or next barrier to get over?

Julia Kroll [00:26:02]: Yeah, for production deployments at scale, I think scaling the capacity of AI agents is going to be huge. If you think about maybe an old-fashioned paradigm with humans, you have a human come in and work an eight-hour shift, and capacity can be much more fluid and scalable with AI agents. But if you think of a use case where you need thousands or millions of agents working at once, just figuring out the infrastructure requirements behind that is a big challenge. That's still very new in the field.

Diego Oppenheimer [00:26:37]: Yeah, I think that's a great point. I've been in systems and systems design for a while, and the way I conceptualize how the runtimes of all these agents are going to work is kind of like when we started working with Lambda functions, which were great, right? You got these ephemeral functions that you could release compute on. Then suddenly, when you were running them for everything, it was like, how do I even know what's going on, how do I chain them, and what's broken? The complexity of running Lambda functions at scale was just so big. And I think we're about to hit that times 10 with the control of agents, in particular if we start thinking about agents as swarms versus just single points. Joshua, what do you think is the next barrier here?

Joshua Alphonse [00:27:30]: Yeah, these are some really good answers. I agree with Jazmia on this one too when it comes to data, and in particular high quality data and high quality data sets. As we're building these agents out, we want to make sure that the decision making and the training process are good enough that we can have more accurate responses, avoid biases, and so forth. But on top of that, for me it's all about abstraction and accessibility as well. This is the other thing that I think is the bottleneck to close in on. We have some really cool frameworks, but the complexities are going to continue to advance more and more. So we have to continue to make the implementation of AI agents in production more accessible to a wider range of developers.

Diego Oppenheimer [00:28:23]: Got it.

Rogerio Bonatti [00:28:24]: Awesome.

Diego Oppenheimer [00:28:24]: Well, I believe I'm getting the virtual call on stage to say our time is done. This is awesome. What a great set of panelists. Deep, deep in the weeds on all this stuff. This has been fantastic. First of all, I want to thank all of you for being on today. And for all of you in the audience, thanks for listening, and we'll catch you soon.

Joshua Alphonse [00:28:49]: Thank you all so much.
