How to Build LLM-native Apps with The Magic Triangle Blueprint

Posted Mar 15, 2024 | Views 477
# LLM-native Apps
# Magic Triangle Blueprint
# AI
speakers
Almog Baku
Fractional CTO for LLMs @ Consultant

A seasoned tech entrepreneur, leader, and LLM-native consultant. Expert in leadership, entrepreneurship, problem-oriented thinking, AI infra, data pipelines, cloud-native solutions, and getting things done. Creator of the open-source project raptor.ml

Adam Becker
IRL @ MLOps Community

I'm a tech entrepreneur and I spent the last decade founding companies that drive societal change.

I am now building Deep Matter, a startup still in stealth mode...

I was most recently building Telepath, the world's most developer-friendly machine learning platform. Throughout my previous projects, I had learned that building machine learning powered applications is hard - especially hard when you don't have a background in data science. I believe that this is choking innovation, especially in industries that can't support large data teams.

For example, I previously co-founded Call Time AI, where we used Artificial Intelligence to assemble and study the largest database of political contributions. The company powered progressive campaigns from school board to the Presidency. As of October 2020, we helped Democrats raise tens of millions of dollars. In April of 2021, we sold Call Time to Political Data Inc. Our success, in large part, is due to our ability to productionize machine learning.

I believe that knowledge is unbounded, and that everything that is not forbidden by laws of nature is achievable, given the right knowledge. This holds immense promise for the future of intelligence and therefore for the future of well-being. I believe that the process of mining knowledge should be done honestly and responsibly, and that wielding it should be done with care. I co-founded Telepath to give more tools to more people to access more knowledge.

I'm fascinated by the relationship between technology, science and history. I graduated from UC Berkeley with degrees in Astrophysics and Classics and have published several papers on those topics. I was previously a researcher at the Getty Villa where I wrote about Ancient Greek math and at the Weizmann Institute, where I researched supernovae.

I currently live in New York City. I enjoy advising startups, thinking about how they can make for an excellent vehicle for addressing the Israeli-Palestinian conflict, and hearing from random folks who stumble on my LinkedIn profile. Reach out, friend!

SUMMARY

In this talk, we explore the exciting yet challenging domain of Large Language Models (LLMs) in artificial intelligence. LLMs, with their vast potential for automating complex tasks and generating human-like text, have ushered in a new frontier in AI. However, the journey from initial experimentation to developing proficient, reliable applications is fraught with obstacles. The present landscape, akin to a Wild West, sees many stakeholders hastily crafting naive solutions that often underperform and fall short of expectations. Addressing this disparity, we introduce the “Magic Triangle,” an architectural blueprint for navigating the intricate realm of LLM-driven product development. This framework is anchored on three core principles: Standard Operation Procedure (SOP), Prompt Optimization Techniques (POT), and Relevant Context. Collectively, these principles provide a structured approach for building robust and reliable LLM-driven applications.

TRANSCRIPT

AI in Production

Adam Becker [00:00:05]: To call onto the stage someone that's actually been one of the secret founders of the New York City branch of the MLOps Community. Almog, are you around?

Almog Baku [00:00:16]: Let's see.

Adam Becker [00:00:21]: Here you are.

Almog Baku [00:00:24]: You. Good to see you. Long time no see.

Adam Becker [00:00:28]: Almog also has the special distinction of being the person that I run into on the street randomly the most. Yeah, Almog, good to see you, now even virtually. Normally, it's face to face. You're going to walk us through how we should be thinking about product development. You call it the magic triangle. It's a framework that you've been putting together.

Adam Becker [00:00:54]: I'm very curious to see what you have to share with us.

Adam Becker [00:00:57]: I'm going to start the sharing screen, and I'll come back in ten minutes.

Almog Baku [00:01:01]: Yeah.

Almog Baku [00:01:03]: All right, so this is actually a very strange session because I guess it's exactly on the border between product and engineering. So we are going to talk about how to build excellently performing LLM-native applications with the magic triangle. But first, a little bit about me. So my name is Almog. I'm based right now in Israel. I'm a seasoned entrepreneur. I guess we can just jump right ahead to our story. So right now, the digital frontier is a little bit weird.

Almog Baku [00:01:50]: It's pretty much like the Wild West. It's an uncharted and untamed territory where everything is possible and everybody is rushing to the gold, looking for the hidden gold rush. Right. So what are these gold nuggets that everyone is looking for? Obviously, the LLM is the essence of the gold, but how do you actually mine it? So we plan basically to mine this gold and to use this gold to build LLM-native applications, which are applications that use LLMs at their core. Think of it like their engine. So up until now, we've written regular applications using standard engineering, like binary, zero and one. But now we have this secretive mind that we can use to build very complex applications that we couldn't do up until now, like chats or any variation of chat with your data, data extraction, summarization, writing content, et cetera. And a side comment is that I think that most of the applications that we see today are more in the first category of chat with the data.

Almog Baku [00:03:19]: But actually, I personally believe that this is only the tip of the iceberg, and most of the potential is hidden in the other ideas of how to leverage LLM technology for building innovative products. So we have our gold mines, but it's actually very hard to navigate and to find these gold mines, and the reason is, in short, there is no talent. It's pretty much similar to what happened in the previous decade when everybody started.

Almog Baku [00:04:02]: I think maybe the Internet is coming.

Almog Baku [00:04:05]: In and out, and the long version is we experiment with ChatGPT, and very quickly we put this data and this prompt, and then we respond to the model, and very quickly we say, aha, we have a POC. But in reality, it's very hard to implement this POC that we've built with ChatGPT in real life, because in real life we don't have a chat to communicate with. And a good LLM-native product needs an implementer that excels in product, engineering, research and data science altogether. So the magic triangle is here to solve it. You can think about it like a conceptual framework to navigate your thoughts and work while designing and implementing your LLM-native applications. So let's dive into this paradigm. So the magic triangle is based on three pillars: the SOP, the prompt optimization techniques, and the relevant context. I know this is a lightning talk, so it will be very quick.

Almog Baku [00:05:24]: My goal here is to intrigue you to think about this kind of stuff, and stay tuned if you want to hear more details in the long blog post I'm going to publish soon. But in a nutshell, the SOP is a term that I borrowed from production, like real-life production of literally building machines. Basically, large organizations write a recipe, step-by-step instructions that help the workers carry out a routine operation while maintaining high quality and a similar result each time. So think of the LLM like an inexperienced worker. Your prompt should be exactly like the SOP of the manufacturers, right? We need to write step-by-step instructions so this inexperienced worker can excel in its task. The prompt optimization techniques, the second pillar, are basically the techniques that we know: chain of thought, role assignment, ReAct, tree of thoughts, agents, formatting, et cetera. And you can notice here, some of these techniques are only in the prompt layer, but some of these techniques are only in the software layer. But there are a lot of techniques, and I only specified here a few, that are combining both of them.

Almog Baku [00:07:00]: Like agents: we need to combine a prompt and a software layer that actually implements the actions that the agents took. The relevant context is the third pillar. Basically, we need to provide this generic prompt that we've created with a specific instruction. What's its goal right now? So it's basically like: we have this great task for you, but now you should be focused on this very specific task. But here there are a few things that we probably want to do, like only putting in relevant context and not a huge context, because otherwise we can find ourselves in the needle-in-the-haystack problem, which is that we provided the model with a huge context and it can't find the relevant parts of it. So there are two ways to create context. One is the embedded context. We place variables inside our prompt, like: hey, you are a helpful assistant helping Almog.

Almog Baku [00:08:14]: Almog is a variable part of this prompt. We also have the appendix context: we write the prompt, and then in the appendix of the prompt we add additional context. Here, for example, we can use RAG or vector databases to fetch this context and to put it in the prompt. So this is a short snippet showing how everything looks together. So we have this prompt. You are a technical writer. The technical writer is a technique.

Almog Baku [00:08:52]: Here we're using role assignment. You are able to communicate only with YAML. This is another prompt technique, output formatting. Summarize the following article for a general audience.

Almog Baku [00:09:09]: Okay, probably will take another second or.

Almog Baku [00:09:15]: How the model should perform these actions in order to perform well. So this is the magic triangle. Food for thought for you, and thank you.
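
The prompt snippet described above could be assembled roughly as follows. This is a minimal sketch, not the speaker's code: the template wording follows the talk, while `PROMPT_TEMPLATE`, `build_prompt`, and the `max_passages` cap are hypothetical names introduced for illustration. It shows the embedded context as a template variable and the appendix context as a short list of passages added at the end of the prompt.

```python
from string import Template

# Hypothetical template modeled on the snippet described in the talk.
PROMPT_TEMPLATE = Template(
    "You are a technical writer helping $user_name.\n"   # role assignment + embedded context
    "You are able to communicate only with YAML.\n"       # output formatting
    "Summarize the following article for a general audience.\n"
)


def build_prompt(user_name: str, passages: list[str], max_passages: int = 3) -> str:
    """Fill the embedded variable, then append a small appendix of context."""
    body = PROMPT_TEMPLATE.substitute(user_name=user_name)
    # Keep the appendix short so the relevant parts are not buried in a huge context.
    selected = passages[:max_passages]
    appendix = "\n".join(f"- {passage}" for passage in selected)
    return f"{body}\nContext:\n{appendix}"


prompt = build_prompt("Almog", ["passage a", "passage b", "passage c", "passage d"])
```

In a real application the passages would come from a retrieval step (RAG or a vector database, as mentioned in the talk) rather than from a hard-coded list, and simple truncation would be replaced by a relevance ranking.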

Adam Becker [00:09:31]: Awesome, Almog. Thank you very much, man. Somebody in the chat has a question, and then I have another question. So let's start with the chat. So give us 1 second. Okay, we got Jorge asking, how is SOP possible if LLMs are not deterministic? So what strategies to get it?

Almog Baku [00:09:57]: So there are multiple strategies. And every other day we see more research published about strategies that you can use to improve the results of your prompt. Even though the results are not deterministic, we know from research that when you do a certain juggling with your prompt, we can improve the results. It's not that it will deterministically improve, but you can see it in a few percentage points. Like when you say to your prompt, please solve me the math problem of one plus one, it will try to solve it naively, but if you try to add "let's think step by step," we'll see repeatedly that it will solve it much better. So these are the prompt optimization techniques. I hope that answers your question.
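
The step-by-step cue mentioned in this answer is a prompt-layer-only technique, so it needs nothing more than string handling. A minimal sketch, with `STEP_BY_STEP_CUE` and `with_step_by_step` as hypothetical names chosen for illustration:

```python
# The cue appended to the prompt; the exact wording is an assumption based on the talk.
STEP_BY_STEP_CUE = "Let's think step by step."


def with_step_by_step(question: str) -> str:
    """Append the step-by-step cue to a plain question."""
    return f"{question.strip()}\n{STEP_BY_STEP_CUE}"


naive_prompt = "Please solve the math problem: 1 + 1"
cot_prompt = with_step_by_step(naive_prompt)
```

Both prompts can then be sent to the same model and their answers compared over many questions, since any gain shows up statistically rather than on every single call.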

Adam Becker [00:10:55]: Is that true also for the standard operating procedures, right?

Almog Baku [00:10:59]: So the idea of the SOP is to create a very detailed script of what steps it should do. The step by step is a specific technique, and basically it says, let's think step by step. It doesn't specify the steps. The idea of implementing SOP here is to say, oh, we see that this content writer is performing very well because he does this and this, and that's how we do it. So basically, what we need to do is to model this excellent employee and to write the recipe for how the LLM can reproduce his results. Got it.
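
The difference from the generic step-by-step cue is that an SOP spells the steps out. A minimal sketch of rendering such a recipe into a prompt follows; `render_sop` is a hypothetical helper, and the steps in `CONTENT_WRITER_STEPS` are invented placeholders, not steps given in the talk.

```python
def render_sop(task: str, steps: list[str]) -> str:
    """Render a task and its explicit, ordered steps as prompt text."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return f"Task: {task}\nFollow these steps in order:\n{numbered}"


# Illustrative steps only: in practice they would be modeled on how an
# excellent human content writer actually works.
CONTENT_WRITER_STEPS = [
    "Read the article and list its main claims.",
    "Drop any claim a general reader would not need.",
    "Write one plain-language sentence per remaining claim.",
    "Combine the sentences into a single short summary.",
]

sop_prompt = render_sop("Summarize the article for a general audience", CONTENT_WRITER_STEPS)
```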

Adam Becker [00:11:48]: This is kind of like the example of when you have new salespeople that join a company, that you have them kind of shadow the best salespeople, right? You're trying to kind of pick up what it is that they do so that you could sort of establish some SOPs around them. Is it sort of like that?

Almog Baku [00:12:06]: Exactly. So large organizations, if it's not like business development, kind of sales. So large organizations even write scripts. Hey, please, my dear new salesperson, just follow the script line by line, word by word. Just do what I tell you to say and you'll sell. So obviously it's not 100% bulletproof, but most of the time, if the script is good, they will sell much better than improvising.

Adam Becker [00:12:38]: Yeah, okay, sorry, can I bug you for another minute? I want to see you go through the slides just, like, one more time. I really want to register. So we have these three pillars. The first one is the standard operating procedure. That's exactly what we were just saying. It's the script for how to actually get it done in the most excellent way. And then we have the relevant context. That's where RAG comes in, vector databases come in, but that's where you also have the converse desire not just to include the right context, but to remove context that might otherwise distract it.

Almog Baku [00:13:12]: Right?

Adam Becker [00:13:13]: We have two forces there. And then you have prompt optimization techniques, and that's where you have all of your run of the mill. This is where we just got these massive diagrams of all the different techniques for how to actually do your prompt. Interesting. Okay, so the idea is to keep all three in mind, and when we go about designing an application, what should be the flow of how we should think about it? Where do we start?

Almog Baku [00:13:42]: So this is the tricky part. You should do all of them altogether while writing your application. So you can think of it as: the POT is something that you implement while writing the SOP, while you keep some variables to put the context inside. So you need to implement all of them together for every application. For some of these parts, you have to write software that implements them, right? Like tree of thoughts: you can write the prompts, but if you don't write the implementation of the prompt, nothing will work. The same for agents, right? Some of them you only implement in the prompt.
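
To make the point about techniques that need a software layer concrete, here is a minimal sketch of an agent loop. Everything in it is an assumption for illustration: the `ACTION:`/`FINAL:` reply format, the `TOOLS` registry, and `fake_model`, which stands in for a real LLM call so the example runs on its own. The prompt asks for actions, and the surrounding code parses them, executes them, and feeds the observation back over multiple invocations.

```python
from typing import Callable

# Hypothetical tool registry: the software layer that actually performs actions.
TOOLS: dict[str, Callable[[str], str]] = {
    "upper": lambda text: text.upper(),
}


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call; a real application would call a model API here."""
    if "Observation: " in prompt:
        last_observation = prompt.rsplit("Observation: ", 1)[1]
        return f"FINAL: {last_observation}"
    return "ACTION: upper hello"


def run_agent(task: str, model: Callable[[str], str], max_turns: int = 5) -> str:
    """Call the model repeatedly, executing each requested action in software."""
    prompt = f"Task: {task}"
    for _ in range(max_turns):
        reply = model(prompt)
        if reply.startswith("FINAL:"):
            return reply.removeprefix("FINAL:").strip()
        # Parse "ACTION: <tool> <argument>" and run the tool outside the model.
        _, tool_name, argument = reply.split(" ", 2)
        observation = TOOLS[tool_name](argument)
        prompt += f"\n{reply}\nObservation: {observation}"
    raise RuntimeError("agent did not finish within max_turns")


result = run_agent("shout hello", fake_model)
```

Without the loop and the tool registry, the prompt alone would only produce text describing an action, which is the sense in which nothing works until the implementation is written.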

Adam Becker [00:14:27]: Right? Yeah. Some of them would require just kind of like multiple invocations and that sort of thing. Okay. Some of it you need software. Some of it you need language. Can you give us just a couple of other examples of the standard operating procedure? I think that you had it there on one of the slides.

Almog Baku [00:14:47]: So the naive example of SOP outside of this LLM world: think about Amazon, the e-commerce website. They have thousands of employees wrapping up packages, so they can say to a newcomer employee, just wrap the package and that's it. But they will have much better results if they take the best performing wrapper and try to model him. Like, here is where he puts the tape. This is how much tape he used. This is how he put it all together. And then they can write step-by-step instructions and maybe even draw some lines on the packages to help the newcomer wrappers do their job better.

Almog Baku [00:15:41]: So this procedure is actually called an SOP in large organizations. So the idea here is to borrow this concept and to treat the LLM as an unskilled employee or unskilled worker, or like an intern. There are many ideas I've heard around how to look at the LLM.

Adam Becker [00:16:07]: Yeah, last question somebody's asking from the chat, Jorge: are there any papers related to SOP and POT? Is that something that anybody can look up?

Almog Baku [00:16:20]: That's exactly what I'm working on. And I'll release the article next month, something like that.

Adam Becker [00:16:27]: Okay, ping us and we'll send that along to everybody. Almog, thank you very much for joining us and for sharing the framework with us, man.

Almog Baku [00:16:38]: Thank you. Thank you for.
