MLOps Community

Taking LangChain Apps to Production with LangChain-serve

Posted Apr 27, 2023 | Views 2.3K
# LLM
# LLM in Production
# LangChain
# LangChain-serve
speaker
Deepankar Mahapatro
Engineering Manager @ Jina AI

Deepankar is one of the core contributors to Jina, an MLOps framework for building multimodal AI applications. He is passionate about taking services to production and has built several abstractions using Python & Golang to streamline the process of deploying Machine Learning applications. Deepankar created LangChain-serve, which bridges the gap between local LangChain apps and production.

SUMMARY

Scalable, serverless deployments of LangChain apps on the cloud without sacrificing the ease and convenience of local development. Streaming experiences without worrying about infrastructure.

TRANSCRIPT


Hi, everyone. This is Deepankar. I'm here to talk about LangChain-serve, and how to take your LangChain apps to production. Unfortunately, I couldn't be online during the call, so I had to send across a recording. If you have any questions, feel free to ask them in the MLOps workspace, and I'll try to answer them as soon as I can.
I will try to rush through the slides as much as possible, since I only have 10 minutes.
A brief introduction about myself: my name is Deepankar. I work as an engineering manager at Jina AI, and I'm based in Bangalore, India. At Jina, among other things, we develop frameworks for multimodal AI applications in open source. I've shared my socials here; feel free to reach out in case you want to collaborate.
All right. So what is LangChain-serve? The goal behind LangChain-serve is to take users' local LangChain-based apps to the cloud without sacrificing the ease of development that LangChain already provides. It uses Jina to achieve this. For members of the audience who are not aware of what Jina is, it is an ML framework to deploy and manage multimodal applications.
LangChain-serve is available on PyPI; you can install it using pip install langchain-serve.
All right. In this quick talk, let's try to understand how to use LangChain-serve using a couple of simple examples. The first example is deploying your custom agent in just four simple steps.
Let's take an old example from the LangChain docs. This example defines the search tool using the SerpAPI wrapper, then uses this tool to define an LLMChain, which is used to create the zero-shot agent. Finally, we run agent_executor.run on a particular question. This can be done on your local machine to get the chain of thought and the output.
All right. Let's see what is required in case you want to deploy this using LangChain-serve.
The first step is to refactor your code and define a function. In this case, we define a function called ask, then add type hints to this function. Here we've added the input of type string and the return type of string. Then we add the return value: we just call agent_executor.run on that input and return the result. Lastly, we import the serving decorator from lcserve. This is a decorator that the LangChain-serve Python module provides, so you import it and add it to the function that we just defined.
This is just to show that we can also run the function as is, like before, on your local machine.
All right, let's go to step two. Step two is to add a requirements.txt that includes all the requirements for that function.
All right, that looks a little blurry, so I'm going to try a different screen real quick. Thank you for your patience, everyone. I'm sure no one is foreign to some technical difficulties. Always fun. All right, let's see. How does that look? All right, it looks better, but no audio.
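To make step one concrete, here is a minimal sketch rather than the exact code from the slides. It assumes the decorator is imported as `from lcserve import serving`, as the talk describes. `run_agent` is a hypothetical placeholder for the `agent_executor.run` call, and the fallback `serving` in the `except` branch is only a stand-in so the sketch still runs where langchain-serve is not installed.

```python
try:
    from lcserve import serving  # decorator provided by langchain-serve
except ImportError:
    # Stand-in (assumption): lets the sketch run without langchain-serve
    # by returning the decorated function unchanged.
    def serving(func=None, **_options):
        if func is None:
            return lambda f: f
        return func


def run_agent(question: str) -> str:
    """Hypothetical placeholder for `agent_executor.run(question)`."""
    return f"(agent output for: {question})"


@serving
def ask(input: str) -> str:
    # In the talk's example, this is where agent_executor.run(input) is called.
    return run_agent(input)


if __name__ == "__main__":
    # The decorated function can still be called directly on your machine.
    print(ask("What is LangChain-serve?"))
```

The type hints matter here because, per the talk, the argument name (`input`) later shows up as a field in the request schema.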
In this case, openai and google-search-results are enough. Whatever your dependencies are, you just add them to a requirements.txt in the same directory.
OK. Now that the code refactoring and requirements.txt are done, let's deploy this app locally. The command to do that is lc-serve deploy local followed by your application name. In this case, app.py was the file name, so we pass the module name, app.
Running this will expose a REST API on your local machine on port 8080, and you can talk to it using the curl command. Notice that we are sending a curl request to localhost on port 8080, and the function name that we had added, ask, has been added as an endpoint on that API. If you observe the input schema, the input field was one of the arguments of the ask function.
This schema also accepts the environment variables that the function needs in order to run. In this case, these are the tokens provided by OpenAI and SerpAPI.
All right. Once we know that the function is exposed as an API on your local machine and we can interact with it, let's go to the next step and deploy it on Jina AI Cloud. The only change in the command is that, from local, we've shifted to jcloud, which is Jina AI Cloud. So we run lc-serve deploy jcloud app, and that gives us an endpoint. This is an HTTPS endpoint on Jina AI Cloud.
Using this endpoint, you can send the same curl request to validate that all your requests are passing successfully. You also get docs with this, and also the OpenAPI specs, which can be used with another agent if you want.
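The request described above can also be built from Python instead of curl. This is a sketch under assumptions: the `/ask` path follows from the function name and port 8080 from the talk, while the `envs` field name and the two environment variable names are assumptions that should be checked against the docs generated for your own deployment.

```python
import json
import os
import urllib.request


def build_ask_request(base_url: str, question: str) -> urllib.request.Request:
    """Build a POST request for the endpoint created from the `ask` function."""
    payload = {
        "input": question,  # matches the `input` argument of `ask`
        "envs": {
            # Assumed variable names; pass whatever your function needs.
            "OPENAI_API_KEY": os.environ.get("OPENAI_API_KEY", ""),
            "SERPAPI_API_KEY": os.environ.get("SERPAPI_API_KEY", ""),
        },
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/ask",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_ask_request("http://localhost:8080", "What is LangChain-serve?")
    print(request.get_method(), request.full_url)
    # To actually send it, pass `request` to urllib.request.urlopen(...)
    # while the app is running.
```

Since the talk sends the same request to both deployments, switching `base_url` from localhost to the endpoint returned by the jcloud deployment should be the only change needed.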
Let's go to a slightly more complex example. In this example, we'll enable human-in-the-loop on our server, which is deployed on Jina AI Cloud. This is enabled using WebSocket streaming, so let's go through it.
First, like before, we define a function, here called hitl. It accepts an input called question, and the output type hint is again of type string. We add the return value, as before.
The difference here is that we define a streaming handler. That streaming handler is responsible for sending the WebSocket responses back to the user. So you get the streaming handler and add it as a callback manager to whatever LLM you define. Passing this is enough to send all the LLM output back to the user via this callback manager.
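The callback pattern described here can be illustrated without any external services. In this sketch, `ListStreamingHandler` and `fake_llm` are hypothetical stand-ins for the real streaming handler and LLM, and receiving the handler through `kwargs.get("streaming_handler")` is an assumption about how it reaches the function. `on_llm_new_token` is the method name LangChain callback handlers use for streamed tokens.

```python
from typing import Any, List


class ListStreamingHandler:
    """Stand-in handler: collects tokens instead of sending them over a WebSocket."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        self.sent.append(token)


def fake_llm(prompt: str, callbacks: List[Any]) -> str:
    """Hypothetical LLM that emits its answer word by word to every callback."""
    tokens = [word + " " for word in f"You asked: {prompt}".split()]
    for token in tokens:
        for callback in callbacks:
            callback.on_llm_new_token(token)
    return "".join(tokens)


def hitl(question: str, **kwargs: Any) -> str:
    # Assumption: the handler is handed to the function as a keyword argument.
    streaming_handler = kwargs.get("streaming_handler")
    callbacks = [streaming_handler] if streaming_handler is not None else []
    return fake_llm(question, callbacks)


if __name__ == "__main__":
    handler = ListStreamingHandler()
    answer = hitl("What is Jina?", streaming_handler=handler)
    print(handler.sent)  # tokens the user would have received incrementally
    print(answer)        # the final return value of the function
```

The point of the design is that the function body stays the same as in a local script; only the callback decides where each token goes.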
Through this, we also enable human-in-the-loop. Whenever an input is required, LangChain-serve intercepts it, sends a response to the user, and waits for the user input to come back so that it can proceed with the next steps.
Finally, we import and add the serving decorator. Remember to pass websocket=True in this case.
All right. Let's skip the local deployment and go to the cloud directly. We do the same as before, lc-serve deploy jcloud, and we pass hitl, which was our file name. Doing this gives you a WebSocket endpoint.
You'll notice that this is a wss URL, which is a WebSocket endpoint. In this case, rather than demoing using curl, I have written a very simple Python client. First, it connects to the endpoint and sends a JSON that has the question and the envs, like before. Then it waits for a stream of responses back from the server and prints them to the user's console.
The client also watches for a particular format, which is expected whenever user input is desired; that format is defined here. Whenever it appears, we ask the user for an input, and whatever input they provide, we just send it back to the server. This is how human-in-the-loop is integrated in LangChain-serve.
So that was a very simple example of enabling streaming and human-in-the-loop, but again, this can be expanded to more complicated cases.
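The client loop described above can be sketched independently of any WebSocket library. Everything here is illustrative: `ScriptedConnection` is a hypothetical in-memory stand-in for a real connection, and the `{"prompt": ...}` message shape for input requests is an assumption, since the actual format is only shown on the slide.

```python
import json
from typing import Callable, Dict, Iterable, List, Optional


class ScriptedConnection:
    """In-memory stand-in for a WebSocket connection (hypothetical)."""

    def __init__(self, incoming: Iterable[str]) -> None:
        self._incoming = iter(incoming)
        self.sent: List[str] = []

    def send(self, message: str) -> None:
        self.sent.append(message)

    def recv(self) -> Optional[str]:
        return next(self._incoming, None)  # None: the server closed the stream


def parse_prompt(message: str) -> Optional[str]:
    """Return the prompt text if the message asks for user input, else None."""
    try:
        data = json.loads(message)
    except json.JSONDecodeError:
        return None
    if isinstance(data, dict) and set(data) == {"prompt"}:
        return str(data["prompt"])
    return None


def run_client(
    conn: ScriptedConnection,
    question: str,
    envs: Dict[str, str],
    ask_user: Callable[[str], str] = input,
    show: Callable[[str], None] = print,
) -> None:
    # First message: the question plus the environment variables.
    conn.send(json.dumps({"question": question, "envs": envs}))
    while (message := conn.recv()) is not None:
        prompt = parse_prompt(message)
        if prompt is None:
            show(message)  # ordinary streamed output
        else:
            conn.send(ask_user(prompt))  # human-in-the-loop round trip


if __name__ == "__main__":
    script = ["Thinking...", json.dumps({"prompt": "Which city?"}), "Done."]
    conn = ScriptedConnection(script)
    run_client(conn, "What is the weather?", {}, ask_user=lambda prompt: "Bangalore")
    print(conn.sent)
```

With a real WebSocket library, only the connection object would change; the intercept-and-reply loop stays the same.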
All right. So what's coming next? We want to enable hosting Streamlit apps on the cloud for these apps, so that the user also gets a UI for the application and the complete journey is available to any user. We also want to add authorization for the API endpoints, which will validate that requests come from valid users rather than allowing anyone access. And we want to add more examples to our docs.
All right, that's all the time I have. Thank you so much for tuning in. If you found this useful, please let us know. If you have any feedback or any feature requests, join us on Slack or create an issue. Yeah, thank you.
