MLOps Community

Underwriting Assist - A Multi Agent System // Somya Rai | Maria Zhang // Agents in Production 2025

Posted Jul 25, 2025 | Views 58
# Multi-Agent System
# EXL
# Palona AI

SPEAKERS

Somya Rai
Principal AI Engineer @ EXL

Somya Rai is a Principal AI Engineer driving next-gen innovation at the intersection of GenAI and real-world impact. With over a decade of experience architecting enterprise-scale AI systems, she specializes in building robust, explainable, and high-throughput solutions that scale.

Somya’s expertise spans fine-tuning multimodal transformers using LoRA, orchestrating high-concurrency RAG systems, and deploying Physical AI pipelines with tight feedback loops. She has deep experience optimizing GPU usage, debugging distributed memory pressure, and crafting scalable training workflows.

She currently leads AI engineering for EXL’s innovation lab, enabling high-speed MVP execution and partner co-development with AWS, GCP, and Nvidia. A firm believer in engineering excellence, Somya bridges traditional AI rigor with cutting-edge GenAI advances to deliver measurable business value.

Maria Zhang
CEO & Co-Founder @ Palona AI

Maria Zhang is the CEO and Co-Founder of Palona AI, a company building emotionally intelligent voice and chat agents for restaurants and consumer brands. Previously, she held VP of Engineering roles at Google, Meta, and was CTO at Tinder. She also founded Alike, which was acquired by Yahoo.


SUMMARY

Underwriting Assist is a LangChain- and Ray-powered multi-agent system that accelerates insurance underwriting by 3x and cuts manual errors by 40%. It leverages RAG, shared memory, and LLM-based agents for clause analysis, risk profiling, and rationale generation. Real-time evals and human-in-loop feedback ensure accuracy, explainability, and regulatory compliance at scale.


TRANSCRIPT

Maria Zhang [00:00:08]: I'm the co-founder and CEO of Palona AI. We build fully autonomous agents, and currently we're focused on serving the restaurant industry. So we have multiple agents, both serving the guests and also helping restaurateurs and restaurant operators bring consistency, automation, and efficiency into their day-to-day work. Before I jumped into the deep end and started this company, I was at Google. I was a VP of engineering there. Before that I was at Meta, where I supported the AI for Products team. But I think my most interesting job, the one people like to ask me about: I was CTO at Tinder and went through the trauma and the growth stage, and I see her laughing. I think there are some Tinder questions lined up already. Always happy to talk about that as well.

Maria Zhang [00:01:06]: Yeah so thank you for having me today.

Demetrios [00:01:08]: Yeah, I sadly missed the whole dating-apps revolution. Sadly or not so sadly, I don't know, because I was married before that.

Maria Zhang [00:01:17]: So not sadly, happily married. Not sadly.

Demetrios [00:01:22]: Somya, can you give us a little background on what you're working on and on yourself?

Somya Rai [00:01:26]: Sure, absolutely. So I am a principal AI engineer with EXL, and we are a data and AI company. We are backed by a lot of domain-rich data, and we also run BPA solutions. Currently we are using generative AI and all the new stuff happening in the space to reimagine processes, and one such process is what we'll be talking about today. So yeah, helping insurance companies, healthcare companies, banking, and other industries to reimagine. Not digital transformation, but reimagination: transforming what we were doing earlier with automation by now building smart agents and intelligent systems around those processes.

Demetrios [00:02:07]: Excellent. So that brings us right into the topic at hand, multi-agent systems. Somya, maybe you can kick us off: I would love it if you could peel back the curtains a little bit and tell us what your agent architecture looks like, how you're going about structuring it and making sure that it's reliable, and all that fun stuff we can bring up.

Somya Rai [00:02:34]: Sure, absolutely. This particular solution that we have recently deployed with one of our insurers is an underwriting assist, and it solves the new-business-submission problem. This particular insurer is in B2B, which is underwriting for other business companies, and they usually do a lot of physical surveys: identifying whether there are any sanctions against the company, looking into the LexisNexis databases or any OFAC databases, and mapping out some very key attributes to identify the category of a particular organization. So for example, someone says, "I own a farm." Now, a farm can be retail or agricultural land, which cannot be identified or classified unless I am physically present at the location. So as an underwriter, it was a task for them to actually go visit and understand the associated risks. If a company is close to chemical factories or other factories, or, let's say, next to very active water bodies, the risk evaluations are high and the premiums become very high, because it is quite risky. So this was a very manual process.

Somya Rai [00:03:45]: Now, considering that they get 250 new cases or leads per day across multiple underwriters, it was taking a lot of time. It used to take weeks for these people to underwrite, give the evaluation, and finally get to the quotes and the negotiation. With this multi-agent system, we have deployed a whole stack of agents that collect the data. First of all, they parse the information, which usually comes in email format as attachments. Attachments can contain many multi-page documents, several images, and some coordinates as well, which we can feed into geospatial tools to identify exactly where the organization is located, the age of the person applying for the insurance, etc., and the SIC, I mean, the category classification codes for this company. So instead of the underwriter doing it manually, we have deployed these agents. There is an ingestion agent that looks into the data, parses it, looks for any missing information, and sends a message back to the broker in case any critical information is missing. If we've got most of the required information but a few items are still needed, there is to-and-fro communication, and then it kicks off the checker agents: sanctions around the company, the exact risk profile, the location of the company, the category of the company, all of that.

Somya Rai [00:05:11]: Once that's put together, it looks into the historical data, which is where the risk profiler agent comes into play. It looks at how we have historically underwritten or generated quotes for similar organizations in the same industry or space. So these sections of agents take input and build up all of this data, and finally we have a human in the loop at every step of it, because we do want to make sure that the information being captured is correct, since decisioning and giving quotes to the final underwriter depend on all the information being in place. Having a human in the loop really helped, and putting all of these agents together brought the time to decisioning and quoting down from weeks to just two days. So that's the impact we have seen on the ground with these agents in the system.
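
For readers who want to map that flow onto code, here is a minimal LangGraph-style sketch of the pipeline Somya outlines: ingestion, checks, risk profiling, then a human-in-the-loop gate. The state fields, node names, and stub helpers are illustrative assumptions, not EXL's actual implementation.

```python
from typing import Optional, TypedDict

from langgraph.graph import END, START, StateGraph

REQUIRED_FIELDS = ["business_name", "address", "category_code"]  # illustrative

class SubmissionState(TypedDict):
    raw_email: str
    extracted: dict
    missing: list
    sanctions_hits: list
    risk_profile: Optional[dict]
    human_approved: bool

def ingestion_agent(state: SubmissionState) -> dict:
    # Stub: the real system parses multi-page attachments, images, coordinates.
    fields = {"business_name": "Acme Farms", "address": "1 Rural Rd", "category_code": "0191"}
    missing = [k for k in REQUIRED_FIELDS if k not in fields]
    return {"extracted": fields, "missing": missing}

def checker_agent(state: SubmissionState) -> dict:
    # Stub for sanctions/OFAC-style lookups against third-party databases.
    return {"sanctions_hits": []}

def risk_profiler_agent(state: SubmissionState) -> dict:
    # Stub: compare against how similar organizations were quoted historically.
    return {"risk_profile": {"tier": "standard"}}

def human_review(state: SubmissionState) -> dict:
    # Human-in-the-loop gate before decisioning and quoting.
    return {"human_approved": True}

graph = StateGraph(SubmissionState)
graph.add_node("ingest", ingestion_agent)
graph.add_node("check", checker_agent)
graph.add_node("profile", risk_profiler_agent)
graph.add_node("review", human_review)
graph.add_edge(START, "ingest")
# If critical information is missing, stop (in practice: message the broker).
graph.add_conditional_edges("ingest", lambda s: END if s["missing"] else "check")
graph.add_edge("check", "profile")
graph.add_edge("profile", "review")
graph.add_edge("review", END)
app = graph.compile()
```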

Demetrios [00:06:10]: I want to talk about impact more, and how you're measuring that and all that fun stuff. It seems very clear. Maria, I want to tag you in, though, because what you've done, Somya, is super cool in that you're almost verticalizing and really giving some clear tasks and types of agents. So you're saying you've got the risk assessment agent. Maria, have you taken a similar approach, where it's almost like you have the different agent tasks that you want done?

Maria Zhang [00:06:41]: Yes. Yeah. We have a similar approach there. So I do think the vertical solution wins at the end of the day. Right. Although everybody's trying to build, like, "oh, this thing is so magical, it does everything." Yeah.

Maria Zhang [00:06:57]: But I think vertical, per industry, is really important, because there are just a lot of nuances and expertise, and the rest of the ecosystem you need to interface with and understand and plug into and so forth. Yeah. So in terms of the specific space we're talking about: for us it's the restaurant industry; it could be the insurance industry, healthcare. Right. So first off, it's vertical per industry, and then within the industry we also have different functions. Yeah. So on one side we do want to have a kind of unified interface.

Maria Zhang [00:07:35]: For example, our order agents can take different form factors. You can call, you can text, you can use a web widget, you know, WhatsApp, IG. That's fine. It's actually one agent with different ways of interfacing and communicating. Yeah. And it can do delivery, pickup. Right. So we don't need to be too granular, is my point.

Maria Zhang [00:07:56]: Right. We don't need one agent for WhatsApp and one agent for IG. I think that's unnecessary. But at the same time, we do have agents with other purposes, not handling orders, doing something completely different. And they have their own expertise and different modalities as well.

Demetrios [00:08:20]: Yeah, that makes a lot of sense. It feels like you break it down, excuse me, you break it down by task, not necessarily by platform. But you have to keep in mind the different platforms it's coming in on, because there's probably a little bit of nuance on each one. Especially when you start talking about WhatsApp, for example: you can send a voice memo, and if you send a voice note to someone, that's a whole different type of intake and data that you're going to have to deal with and have the agent be ready for. Yeah, so we'll keep on cruising. And really what I am thinking about, with something that you said, Somya, is around how this is done, and I would like to get an idea of the scale at which you're doing this. A lot of times I feel like we can get something working really well at a certain scale, not that gigantic a scale, but when we try to scale it up, that's when things start to break. I'm wondering if you've seen that, or if you haven't had to really worry about it, because this use case isn't like you're getting 1,000 requests per second, so you don't necessarily need to think about that type of scale.

Somya Rai [00:09:49]: Sure, absolutely. So I would say that in underwriting we do not have a very, very huge scale, but we still have approximately 7,000 to 8,000 emails that we have to read per month. So there's a 250-person underwriting team, let's say, working on some 100 emails per underwriter or so, or, let's say, three new emails that every underwriter in the organization is getting. The issue we actually faced was not around the scale of using the application, but around how they're using the application. For example, the checker agent and the sanctions-and-validation agent: we started by building one single agent for validation and checking. They sound very similar, but the tasks they're doing are pretty different.

Somya Rai [00:10:37]: Then we realized that a bigger collection of tools becomes the failure point. As we added a lot of tools and asked one single agent to perform multiple activities, even when we parallelized those activities, it really did not give us good outcomes; it ended up hallucinating a lot. So what we did was split those two things. Some of the approaches that we have taken, and Maria just touched upon the interfaces: with this particular client, they wanted it as a REST API.

Somya Rai [00:11:08]: The solution as a REST API, but the same solution, if I want to use it as a streaming solution, needs WebSockets. So what we did was build the entire agentic system in LangGraph. LangGraph is one of the best frameworks we have come across for implementing an agent framework, because it does help with a lot of memory management, and a lot of the state and session management as well. Because when we are doing human-in-the-loop, sometimes the user is not available or the session times out. We do not want to go back and start the entire process again. We want to retain the session where they left it, so they can come back and pick it up from there, or provide an input and the agent can move on to the next steps.
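
LangGraph's checkpointers support exactly this pause-and-resume pattern. Here is a minimal sketch, reusing the `graph` from the earlier example; the thread ID and the in-memory backend are assumptions (a production system would use a durable checkpoint store):

```python
from langgraph.checkpoint.memory import MemorySaver

# A checkpointer persists graph state per thread, so a human-in-the-loop run
# can pause at the review gate and resume later without restarting.
checkpointer = MemorySaver()  # in-memory for the sketch; use a durable store in production
app = graph.compile(checkpointer=checkpointer, interrupt_before=["review"])

config = {"configurable": {"thread_id": "submission-1234"}}
app.invoke({"raw_email": "..."}, config)   # runs up to the review gate, then pauses

# Hours later, the underwriter responds; resume the same session:
app.update_state(config, {"human_approved": True})
app.invoke(None, config)                   # continues from the saved checkpoint
```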

Somya Rai [00:11:53]: But I think with NVIDIA AIQ, the NeMo Agent toolkit they have recently launched, as a blueprint as well, what we have done is this: there is one single configuration file, and within that configuration file I can expose a number of my agents. I can have my LangGraph agent, I can have another CrewAI agent, I can have different function calling within this agentic framework from NVIDIA, and I can expose them as a REST API, I can expose them through MCP servers, I can expose them through WebSockets, etc. So there were three major challenges. One: how do we interface this? How do we make it more reusable and replicable across the same industry? Because this particular use case has high replicability across every insurer. So how do we make it more replicable? That's where the scalability comes into play: how does the interface remain the same irrespective of the industries, even as the agents and the functionality of the agents change at a granular level?

Somya Rai [00:12:53]: So those were the challenges that we faced, and that's why we started with LangGraph but ultimately packaged it on the NVIDIA platform. We are currently serving it as a REST API and a WebSocket, and users really do not have to worry about how to interface with it, because we can expose any number of agents with the NVIDIA config files and toolkit that they're providing.
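
The NVIDIA toolkit's actual config schema isn't reproduced here, but the underlying idea, one transport-agnostic agent entry point exposed over both REST and WebSocket, can be illustrated with a generic FastAPI sketch; the route names and payload shape are assumptions:

```python
from fastapi import FastAPI, WebSocket

api = FastAPI()

async def run_agent(payload: dict) -> dict:
    # Single, transport-agnostic entry point into the agent graph (stubbed).
    return {"result": f"processed {payload}"}

@api.post("/underwrite")
async def underwrite_rest(payload: dict) -> dict:
    # Request/response interface for clients that want a plain REST API.
    return await run_agent(payload)

@api.websocket("/underwrite/stream")
async def underwrite_ws(ws: WebSocket) -> None:
    # Streaming-style interface for clients that want a socket instead.
    await ws.accept()
    payload = await ws.receive_json()
    await ws.send_json(await run_agent(payload))
    await ws.close()
```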

Demetrios [00:13:15]: Well, now that you say that, it makes me realize this is a bit of a paradigm. Diego, who was on here giving the keynote earlier, was talking about how some things are being flipped on their heads when it comes to working with agents, and this is almost one of them: it's not necessarily the scale of the incoming requests, but the scale of the different ways you can connect, the different platforms, the different integrations you have to work with. Every different company that you want to plug your agent into has their own way of doing things, and that's where the scale and the complexity can come in. Going back to the last talk, too. I want to jump over to you, Maria, because there are a few questions around memory that I would love to ask, especially when it comes to ordering food. There are picky eaters, and I've got certain food preferences myself. How do you think about memory optimization, not just flooding the context window with everything all the time? Maybe you're embedding things, you've got these vectors.

Demetrios [00:14:30]: What are you looking at in terms of optimizing on the memory side?

Maria Zhang [00:14:35]: Yep, yep. Before I dive into the memory side, a plus-one for what Somya explained on scalability. Right. Of course, at massive scale you always have to deal with concurrency issues and sequencing calls and so forth. But really, the bottlenecks are not the AI systems or the agentic frameworks. The bottlenecks are how businesses are conducted in the real world. Right.

Maria Zhang [00:15:07]: Including, when you place an order, there's a ticket that goes into the kitchen so the chefs can see it. Some of the restaurants have a digital display; some of the restaurants still have a little printer. You need to make sure that little printer is working. And then imagine the throughput of a printer. Yeah. And the slow ones hold up the line.

Somya Rai [00:15:29]: Yeah.

Maria Zhang [00:15:29]: So, yeah, don't overlook those types of constraints. Yeah. So, memory: very, very important. We actually have a consumer-facing experience on the guest side. And human beings, we have memory. Right. Hey, we met a few months ago and we talked about...

Maria Zhang [00:15:53]: "Oh, Somya, we just met." Yeah. So I think it's really important to have some categorization, some differentiation of memories, and it's actually quite tricky. For example, we have short-term memory that's in context. Like, Demetrios, you said people were sick; remember that, but we don't want to set it in long-term memory, because people are not always sick. Right. But if you tell me you're allergic to shellfish, you're probably still going to be allergic to shellfish when we talk again three months from now. Right.

Maria Zhang [00:16:25]: So now, if you think about it, there is the pollution issue, right? How do you bring long-term memory into the current conversation? How do you make it really contextually aware? You often see, "Oh, my agents are context-aware," kind of conversationally contextual. It's all about memory, short-term memory management: when you reference something, you can go back and see what that means in this context. It's a tricky problem.

Maria Zhang [00:16:58]: Then long-term memory: it doesn't really matter where, but it needs to be persistent. This is where memories are shared. When you have multiple agents, we can always look up that this guest is allergic, and different agents can all share this knowledge. Short-term memory I do not recommend sharing broadly, because it just creates more confusion. Right.

Maria Zhang [00:17:26]: Because in a different context that could mean something completely different. And then you have to traverse, right, because it's a two-way street. Part of what we talk about today becomes long-term memory that needs to be stored, shared, and persisted, and I also need to bring some of the long-term memory into the current conversation so I'm not asking you the same question over and over again.
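
A toy sketch of the two-tier split Maria describes: short-term memory scoped to the current conversation, long-term facts persisted and shared across agents, and a small selection step that brings durable facts into the current context rather than flooding the window. The class and field names are illustrative assumptions, not Palona's system:

```python
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    short_term: list = field(default_factory=list)   # this conversation only
    long_term: dict = field(default_factory=dict)    # persisted, shared across agents

    def remember(self, utterance: str, durable: bool, key: str = "") -> None:
        # "Feeling sick today" -> short-term; "allergic to shellfish" -> long-term.
        if durable:
            self.long_term[key] = utterance          # a real system writes to a DB here
        else:
            self.short_term.append(utterance)

    def context_for_turn(self) -> list:
        # Recent turns plus durable facts, so the agent is contextually aware
        # without re-asking questions or shipping the whole history every time.
        return self.short_term[-10:] + list(self.long_term.values())

memory = MemoryStore()
memory.remember("Feeling under the weather today", durable=False)
memory.remember("Allergic to shellfish", durable=True, key="allergy")
print(memory.context_for_turn())
```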

Demetrios [00:17:51]: Well, that brings up a great point around, when you're interacting with the user, which parts of that interaction to spin off into the long-term memory. Do you have certain categories or certain things? I imagine if "allergy" comes up, that's a hot word, and you're going to want to throw that in; that's very clear. But there are other times where it's not necessarily as clear.

Maria Zhang [00:18:15]: Yeah, yeah. So you have to have the intelligence. Right. It's not hard-coded. So it is intelligence. I think my team named it Muffin, after "cookie." Yeah.

Maria Zhang [00:18:29]: So there's a whole Muffin system, and we didn't open source it or anything. But I think it also depends on the situation. Yeah, but you actually have a service, right, that has the intelligence, and that's evolving.

Maria Zhang [00:18:46]: To. To manage this two way street? Yeah.

Demetrios [00:18:49]: Is that something that you're putting into your evals, where you're sending things off to a reasoning LLM that's, say, looking at different chats or interactions and recognizing when something needs to go into the long-term memory?

Maria Zhang [00:19:07]: Yeah. The short-term memory can stand on its own. The long-term memories are stored in a persistent database. Muffin is the service that coordinates passing information back and forth.

Demetrios [00:19:24]: That's awesome. That's really cool. I hadn't heard that before, but I might steal that term.

Maria Zhang [00:19:30]: Maybe you and I can work on.

Demetrios [00:19:32]: That together when you open source it.

Maria Zhang [00:19:35]: Well, I always want to share. I always wanted to, you know, learn and share. Right. So yeah.

Demetrios [00:19:41]: Very cool. Somya, have you found any techniques, tips, or tricks for optimizing the memory? And there is also the other side of memory that we haven't really touched on, but maybe you want to dive into it: the memory of when an agent does something correctly. How do you make it remember that that was the right way to do it, so that whenever it's asked to do that task, it remembers: this is how I do step A, step B, and that type of thing?

Somya Rai [00:20:14]: So I think that's the procedural memory we refer to, in terms of what the facts are. Whenever there is an agent outcome, there's always a human in the loop, and the human validates whether the information is correct or not. If the information is correct and no further changes are required, we basically flag that this information, or the steps taken, were correct under this scenario. Especially in this space, which is pretty heavily regulated, a lot of compliance has to be maintained and a lot of audit trails are required, because we need to give out the reasoning for why we are declining a particular bid, why we are accepting it, or why we are going ahead to quote. So I'll take a step back: capturing the session state, how do I transfer the information, how do I do the agent-to-agent communication? This is very critical in terms of passing the correct information to the next user or the next agent in the system. That is where we are using a two-state system.

Somya Rai [00:21:16]: One is definitely a working memory, as Maria mentioned; this is in-context memory, whatever very recent conversation is happening. Then there is a persistent one, for cases where, for example, the session is over, or the user is not available and did not provide any response. How do I pick up from there? It basically captures what stage we were at, what the session ID was, what information was produced, what input was given, what output was given, and whether we got a response from the user or an instruction to move ahead in the workflow. With all of that captured, we can have two levels of memory. One is very recent, where the user is actively giving me information and I can move on; that is where the cache memory comes into play, which is the short-term memory. But let's say the user is not available, I did not get a response, and 60 seconds have timed out. Then I save that same information in DynamoDB, or whatever persistent memory we are using. And for actually transferring, for doing the agent-to-agent communication, we are using Kafka.

Somya Rai [00:22:16]: So it's event-based triggering. Once Agent 1 has done an activity, it triggers some information, and it is all schema-managed, schema-controlled. It gets validated everywhere so that no wrong information gets through. And how do we validate, whether we are giving any kind of penalty or any kind of reward when certain agents produce a correct outcome, like, under this scenario, this is how the reasoning should be done, or this is how the planning should have been done? That is where we control it with a lot of schema design, because even though it is an autonomous process, we still want a fairly deterministic flow, considering that there is a lot of compliance and auditing happening. So that is what we have implemented.
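
A minimal sketch of schema-controlled, event-based agent-to-agent messaging in the spirit Somya describes, here using Pydantic (v2) for validation and the kafka-python client; the topic name and message fields are illustrative assumptions:

```python
from kafka import KafkaProducer          # pip install kafka-python
from pydantic import BaseModel, ValidationError

class AgentEvent(BaseModel):
    session_id: str
    source_agent: str
    stage: str                           # e.g. "ingestion_complete"
    payload: dict
    human_response_received: bool

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda s: s.encode("utf-8"),
)

def emit(event: dict) -> None:
    try:
        # Validate against the schema before anything reaches the next agent,
        # so off-schema output fails loudly instead of propagating downstream.
        validated = AgentEvent(**event)
    except ValidationError as err:
        raise RuntimeError(f"agent produced an off-schema event: {err}")
    producer.send("underwriting.events", validated.model_dump_json())

emit({
    "session_id": "submission-1234",
    "source_agent": "ingestion",
    "stage": "ingestion_complete",
    "payload": {"missing_fields": []},
    "human_response_received": True,
})
```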

Somya Rai [00:23:03]: And if I need to go back in history and ask what my first input was, or how the user changed it, all of that is captured and can be easily presented. Ultimately, the final agent, a negotiation agent, prepares a complete file, and it includes all the evidence of exactly how I reached the decision. Those steps and that evidence are very critical in this process. So yeah, that's how we are doing it currently.

Demetrios [00:23:30]: Since we are on the topic and you were mentioning these pieces: are either of you at the stage yet where you're trying to optimize on cost? I've heard folks tell me anywhere from "I just want to get something that's working, and then I'll get something that's cheap," so, "right now I'm willing to throw all my resources at it to make sure it works, it works well, and people actually like it and use it. Then I'll move on to seeing how fast I can make it, and at the end of the road I'll get to how cheap I can make it." So I'm wondering if either of you have hit the "let's see if we can make it cheap" stage, and if so, how.

Somya Rai [00:24:21]: Sure, go ahead.

Maria Zhang [00:24:24]: Yeah. Cheap and expensive are relative, right? So there are actually two factors on the cost side. On one side is cost to serve: the compute cost of processing this and that.

Maria Zhang [00:24:47]: And on the other side is actually cost to sell: how am I closing this customer, and so forth. Right. Including, if you charge them on Stripe, you have to pay Stripe its 3% or whatever. All of these are costs. Right.

Maria Zhang [00:25:03]: So the compute cost is not necessarily the highest; oftentimes it's not the highest part of the overall cost. So I would recommend taking a holistic view. And very good news for all of us, the entire industry: I'm sure both of you have seen that if you compare July this year to, say, April last year, the cost of compute has dropped drastically. Yeah. And me being an optimistic person, I expect it will continue to drop next year. Right.

Maria Zhang [00:25:38]: I don't see any reason it would bounce back up or anything. I think progress on the overall infrastructure supply, and also the reduction of compute required through distillation and other techniques, will continue to improve things. Right. So we're kind of riding this wave, and the tailwind is at our backs. My recommendation is two things. One: look at your cost holistically, and then see where you need to improve.

Maria Zhang [00:26:11]: And two: given the past pattern, expect that trend will continue, that whatever we pay the providers, whoever is providing the compute, will continue to come down. I do not recommend building your own data center or renting your own GPUs. I get like five emails a day from people renting out GPUs. Why? I think that's temporary. There was a little bit of a supply-demand mismatch, supply not meeting demand. I think the supply will always catch up and meet demand. Yeah, with the caveats I mentioned. Right.

Maria Zhang [00:26:49]: And if you're serving a lot of volume and you think there's really so much cost there, there are many things you could do to get the cost down. One thing is a multi-model architecture. Right. Because different models are different. Yeah. So this is something you can be flexible on, and then not have a strong dependency on, say, Claude 4 or whatever. And then you can always find a cheaper alternative.

Maria Zhang [00:27:19]: That's one. And then the other thing, again assuming you're not running your own GPUs, is that you also want to look at a variety of host providers. With these different providers you can negotiate the variable cost, so you can also find a better option. But you need to build the architecture that gives you this flexibility to move things around based on cost, and that's a tall order actually; it's easier said than done. And we have built it like that.

Maria Zhang [00:27:48]: We have built the architecture; we have switched providers. The providers probably don't like us designing the system that way too much, but it definitely gives you the opportunity to seek a better price and have better overall economics for your solution. Yeah.
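
A small sketch of what that switch-friendly architecture can look like: agent code depends on a minimal interface, and a routing function picks the cheapest provider for the workload. The provider classes and price table are hypothetical stand-ins, not Palona's design:

```python
from typing import Protocol

class ChatProvider(Protocol):
    def complete(self, prompt: str) -> str: ...

class ProviderA:
    def complete(self, prompt: str) -> str:
        return f"[provider-a] {prompt}"   # stub standing in for a real SDK call

class ProviderB:
    def complete(self, prompt: str) -> str:
        return f"[provider-b] {prompt}"   # stub standing in for a real SDK call

def cheapest_provider(price_per_mtok: dict) -> ChatProvider:
    # Route to whichever provider is currently cheapest; agent code never
    # imports a vendor SDK directly, so switching is a config change.
    registry = {"a": ProviderA(), "b": ProviderB()}
    return registry[min(price_per_mtok, key=price_per_mtok.__getitem__)]

llm = cheapest_provider({"a": 3.00, "b": 2.40})
print(llm.complete("Summarize today's orders"))
```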

Demetrios [00:28:09]: Awesome. Yeah. Somebody was just putting in the chat: yes, it's the price of the pie, not the flour. 100%.

Maria Zhang [00:28:17]: Yeah.

Demetrios [00:28:19]: Somya, have you found anything?

Somya Rai [00:28:21]: Yeah, absolutely. I completely agree with Maria when she mentioned that there is an inferencing cost and then there are some variable costs. Just as she described, we've also segregated the costs into fixed costs and variable costs. Some of the fixed costs we cannot really let go of, basically third-party database access, for example LexisNexis or any other third party that we really require and that our decisioning depends on. Irrespective of agentic or non-agentic execution, we have to have those subscriptions available. But the variable cost is where we can really exert control, and that is where NVIDIA, and using Anyscale's Ray, has actually helped us in optimizing the serving part of it. We use the Triton serving mechanism for all our models: how we serve the open-source models and the pre-trained models that we are using for risk classification. And one thing has really helped in reducing cost.

Somya Rai [00:29:12]: The more GPU or compute hours, the more cost you incur. So what we have done, using Ray with VLMs for the data extraction, is parallelize it. It's a multi-threading kind of system that we have built, a distributed extraction system. Now when emails come in, irrespective of which underwriter is getting the email, let's say there are 100 emails in a day, the data ingestion triggers extraction for all 100 emails together. And then the individual agents come into play, validating them, checking for sanctions, checking for any red flags around the properties or the policies we are trying to write. I think those small nuances have helped us in reducing and optimizing the runtime, and that has helped us reduce the cost.

Somya Rai [00:30:03]: But as I mentioned, the scale is not that high, so the cost is not yet a problem. But yes, when I run the same solution for multiple insurers, it will definitely become a problem to handle. And that's where we are trying to build optimizations like these: using Triton servers, or using Ray for distributed extraction and ingestion of the documents.

Demetrios [00:30:26]: So, if I'm understanding this correctly, and bear with me because sometimes my head's a little thick: you're using Ray and the distributed nature of Ray to bring down the runtime so that your GPU time is less?

Somya Rai [00:30:46]: That's right, yeah. We are basically doing multi-threading, where we execute extraction of 100 emails all together rather than doing it one by one, or batched based on use cases.
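
A minimal Ray sketch of that fan-out pattern: the per-email extraction step is a stub (in the system described it would call a model served behind Triton), but the parallel dispatch and gather are Ray's real primitives:

```python
import ray

ray.init()

@ray.remote
def extract(email_body: str) -> dict:
    # Stub for per-email extraction (attachments, images, key fields); in
    # practice this would invoke an extraction model behind a serving layer.
    return {"email": email_body, "fields": {"business_name": "..."}}

# Fan the day's batch out across the cluster instead of one by one, so total
# wall-clock (and rented GPU) time drops.
emails = [f"email body {i}" for i in range(100)]
results = ray.get([extract.remote(e) for e in emails])
print(len(results), "emails extracted")
```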

Demetrios [00:30:59]: It's like, in the old world of data science, you're doing batch ingestions maybe once every X hours or N hours, and when you do those ingestions you're doing them with Ray, which is distributed, and it means you're using much less rented GPU time. Yeah. So I think that's all we've got today, folks. I will let you both, if you want, give some final words. It might just be, you know, "we're hiring" or "come and talk with us if you're interested in restaurants or insurance," but that's going to be all for you.

Maria Zhang [00:31:40]: Yeah, we're hiring.

Somya Rai [00:31:42]: Absolutely.

Maria Zhang [00:31:43]: Thank you for the prompt. Yeah, please do DM me or go apply on our website, Palona AI. And the other thing is: just go build. We're still at a very early stage of this whole cycle of evolution and disruption, and it's not a bubble. But don't get overly excited; it's not a six-month or twelve-month journey, it's going to be a long journey. Just so happy to be here. Thank you so much.

Demetrios [00:32:14]: Awesome.

Somya Rai [00:32:16]: Same thoughts. We are hiring, and we are really looking for great talent out there. So anyone interested can DM me directly, or they can go to the EXL site and apply for the jobs. And yes, as Maria and Demetrios mentioned, this is a very exciting space and we are just at the beginning of it. I think there is a lot to be done. Someone actually mentioned this: yes, we are building a lot of autonomous systems, but the disruption is about reimagining the process rather than transforming the process. What we have been seeing in the industry for a very long time is digital transformation, but now it's all about reimagination, making our people and our resources more intelligent and more efficient compared to what we did in the transformation phase.

Demetrios [00:33:02]: So, yeah, folks, this was awesome. Thank you so much to the both of you. We're going to keep it rolling, and for anybody out there who wants to join the team, we'll drop the links in the chat, or you can just click on their speaker profiles and see their LinkedIns, all that fun stuff.
