
Surf the Next Wave of Innovation

Posted Feb 26, 2024 | Views 234

# Machine Learning

# LLM

# Cloudera.com


speaker

Erik Steinholtz

Solution Engineer @ Cloudera

Presale/TAM with 10 years in international software companies. Primary experience is from Business development and startup in the local geography. Extensive experience in SOA, BPM, B2B, EAI, and EDI, and a broad range of applications and technologies.

Erik wants to work with leading technologies and bring them to their potential for broad recognition in the local marketplace.

Specialties: Solution Selling, Technical Visioneering /Evangelism, IT architecture, articulating technical USPs, Competitor monitoring, Technical evaluation.


SUMMARY

Erik Steinholtz takes us on a tour of the machine learning workbench and discusses the lifecycle approach to machine learning. Erik provides an overview of the process from data discovery to model training and deployment. He also delves into the nuances of classic ML and LLMs, emphasizing the importance of platform architecture and asset management. Erik explains the use of fine-tuning and presents a demo of using an LLM for text generation and retrieval.


TRANSCRIPT

Erik Steinholtz [00:00:00]: So yeah, that was that a little.

Erik Steinholtz [00:00:03]: Bit on the India's part and that's, yeah, it's a little bit off topic today, but I think it's so cool that you can just do SQL on streams, that you don't have to have all these specialized competencies. So I just thought we wanted to show that. Okay, but how about the machine learning that we do? We have a workbench for machine learning. So I assume that that's maybe more of your focus of interest. We follow a lifecycle approach here also.

Erik Steinholtz [00:00:35]: So you start from the pipelines, you.

Erik Steinholtz [00:00:40]: Do some data discovery and exploration, then model training. You put it into a registry, then those registered models can be packaged and.

Erik Steinholtz [00:00:52]: Deployed and they are served.

Erik Steinholtz [00:00:56]: And on the same rails into the production environment, from packaging to serving, you can also have your applications. For instance, something like a visualization application can use the same thing. Then you have, of course, monitoring, where you can track model drift and set thresholds that are triggered to tell you it's actually time to retrain the model. A little quick tour, and this is only a very superficial look into what we do in the workbench.
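
The retraining trigger mentioned here can be illustrated with a simple threshold check. This is a minimal sketch, not the workbench's monitoring API: the drift metric (shift in the mean prediction score), the function names, and the 0.1 threshold are all assumptions made for illustration.

```python
from statistics import mean

def drift_score(baseline, recent):
    """Absolute shift in the mean between a baseline window and a recent window."""
    return abs(mean(recent) - mean(baseline))

def needs_retraining(baseline, recent, threshold=0.1):
    """Return True when the drift score exceeds the (hypothetical) threshold."""
    return drift_score(baseline, recent) > threshold

# Hypothetical prediction scores logged at training time and in production.
baseline_scores = [0.52, 0.48, 0.50, 0.51, 0.49]
recent_scores = [0.70, 0.68, 0.72, 0.69, 0.71]

if needs_retraining(baseline_scores, recent_scores):
    print("Drift threshold crossed: time to retrain the model.")
```

Real monitoring would usually compare full distributions rather than means, but the trigger logic has the same shape.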

Erik Steinholtz [00:01:36]: So this is the first screen that.

Erik Steinholtz [00:01:40]: You see in the workbench, and it has all your projects. It gives you some announcements and things here, but we can look now either at the, let's say, overall level or at a specific project. If we take it from the lifecycle perspective, we start with, okay, how does data come in? It comes in with some jobs and.

Erik Steinholtz [00:02:10]: This can be sort of your final.

Erik Steinholtz [00:02:13]: Part of the data processing pipeline. So these are some jobs that you can run. You can create dependencies: A needs to run before C, et cetera. On the next level, you probably want to do some experiments. So this is when you say, okay, let me now try a number of different algorithms, et cetera. Then I see how my experiments went. And I can get, okay, this is.

Erik Steinholtz [00:02:46]: How the various experiments I have, how.

Erik Steinholtz [00:02:51]: They fared with regard to the hyperparameters that were set for them, et cetera.
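
Comparing how experiments fared against their hyperparameters can be sketched in a few lines. The run names, hyperparameters, and accuracy values below are invented for illustration and do not come from the talk or from the workbench's experiment tracking.

```python
# Each hypothetical experiment records its hyperparameters and a metric.
experiments = [
    {"run": "run-1", "params": {"max_depth": 3, "learning_rate": 0.10}, "accuracy": 0.81},
    {"run": "run-2", "params": {"max_depth": 5, "learning_rate": 0.05}, "accuracy": 0.86},
    {"run": "run-3", "params": {"max_depth": 8, "learning_rate": 0.01}, "accuracy": 0.84},
]

def best_run(runs, metric="accuracy"):
    """Pick the run with the highest value of the chosen metric."""
    return max(runs, key=lambda r: r[metric])

winner = best_run(experiments)
print(winner["run"], winner["params"])
```

The winning run is the natural candidate to put into the model registry.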

Erik Steinholtz [00:02:58]: On the next level, maybe you find.

Erik Steinholtz [00:03:02]: That one of these is, yes, that's the way I want to go. So you put it in the registry, and from the registry you do a deployment. This is just an example of how a deployment can look. It is packaged into a container and it gets a REST interface, so it's ready to serve large volumes of calls for you. And together with that you might have.

Erik Steinholtz [00:03:41]: An application that can be deployed now.

Erik Steinholtz [00:03:45]: In the same way. And this is typically an application for maybe visualizing the data that you have just engineered in a very smart way. Another time, when you have more time, we can also show you the visualization tool, but that is just a very standard thing where you can get graphs and all kinds of charts, et cetera. So this is a quick tour. We're soon going to come to AMPs. We promised you that we're going to talk about how you do prototyping, and this is where we supply you with prototypes, so you have something to start working from. We're going to come back to that in a second. But let's first just talk about lineage. Throughout our platform, you always get the full lineage, so when someone asks you, okay, how did you come up with your brilliant conclusion, where's the data to support it, you can show them this graph.

Erik Steinholtz [00:04:54]: Either you show only the blue parts, which are the various data sets that are involved, or you also show the green parts, which are the transformations that happen between those data sets. And this is populated for you. That's the advantage of having a platform. Now, how about LLMs? Many companies find that they need to think beyond just ChatGPT. The main reason, of course, as I discussed with someone here who works with medicines, is that there are certain kinds of data that you definitely don't want to take the loop around OpenAI, and maybe even the majority of corporate data you don't want to expose there. Generally, what we're seeing, and I think it was confirmed at the recent conference we went to, where many speakers talked about this, is that classic ML and LLMs are complementary. Okay, we used to build specific training data sets for specific applications and specific models. We did a lot of data cleansing, we applied data science, and that's how we went about this. Suddenly, in the world of LLMs, yeah, it is the same, but it's definitely different, because the LLMs can do so much in and of themselves, and you can now focus a lot of your effort on how to modify, how to do some prompt engineering before you.

Erik Steinholtz [00:06:47]: Fit it in, how to apply some.

Erik Steinholtz [00:06:50]: Fine-tuning on the back end of it, et cetera. And you might want to start with an API-enabled one, such as OpenAI, where you do experiments. Then when it's time to go to production, that's when you start thinking, oh, I don't want to expose my enterprise data in production in this way. So then you want to instead download a model and do the processing in house. So there are a lot of new, I would say, nuances to how you go about the topic of AI, and a lot more emphasis on the platform architecture. How can I manage all these different assets in a good way? How can I make sure that the pipelines are streamlined in this new, more, let's say, modular way of thinking? So in the end, your data differentiates your AI. That's what we believe is true for every company out there, and for many, it means that you need at least to have the possibility to have it in your own data center. If that is the best for you, or if you feel that you want it hosted by a particular cloud provider, then that's fine.

Erik Steinholtz [00:08:23]: Also, there are a lot of open source LLMs out there. We support working with them all, and in the next release that is coming, you're going to see tighter integration with Hugging Face, because in Hugging Face, of course, you have a wealth of not only models, but also a lot of training data, evaluation data, stuff that can make you more productive.

Erik Steinholtz [00:08:56]: So it's got to be Python integration.

Erik Steinholtz [00:09:03]: Just a quick word on fine-tuning. That's something that many find actually very useful. This is an example of why we see that Llama is so good to use and is maybe the most fine-tuned model out there. Here you see, from 7 billion to 13 billion to 70 billion parameters, how it performs over a certain test set. Now, to the right, you can see that with fine-tuning you can achieve performance that is actually even better, on this small data set, than what you would get with GPT-4. Lo and behold. So we see that there are a lot of nuances, a lot of different ways to approach this new world. And this is something of a, I wouldn't call it reference architecture, and it's actually from an investment firm called a16z.

Erik Steinholtz [00:10:16]: But what we did with that is we said, okay, how would we, if we were to apply this? We could work very well in this architecture. And there are some icons there that symbolize our various components.

Erik Steinholtz [00:10:33]: If you take the example of retrieval.

Erik Steinholtz [00:10:37]: Augmented generation, RAG, that's very popular today. You see that at the top you have the data pipelines, and you have the embedding model, very important. And then you have a vector database. You can work with a variety of embedding models, and you can work with a variety of vector databases. They are not the same; it's not one size fits all. Then you have the orchestration layer. LangChain is the one that I personally know best. I don't know about you, which one you tend to use.
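
The embedding model and vector database roles in this architecture can be illustrated with a toy, dependency-free sketch. Everything here is an assumption for illustration: `embed` is a bag-of-words stand-in for a real embedding model, and `VectorStore` is a hypothetical in-memory list, not any particular vector database.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class VectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (embedding, original text)

    def add(self, text):
        self.items.append((embed(text), text))

    def search(self, query, k=2):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]

store = VectorStore()
store.add("the bank collapsed after buying long term instruments")
store.add("the charter guides decisions about safety")
store.add("podcasts are fun to listen to")

print(store.search("why did the bank collapse", k=1))
```

The `k` argument controls how many retrieved chunks would be passed on to the LLM as context.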

Erik Steinholtz [00:11:14]: Then you have things like logging and LLMOps. How do you do things like detecting bad language? Toxicity is of course the word I was looking for there. It's a really smart idea to actually apply some fine-tuning, especially if you use LoRA, low-rank adapters, because the low-rank adapters can be applied, let's say, on the output. So you have your main output here, and then you have a specialized tuning here. And you do the large chunk of the processing only once, even though you have two different outputs. So that's one smart application of not using an excessive amount of compute when you want to monitor. We have customers also employing a playground, so you always have a sandbox where you are evaluating the latest models that are coming, because they are really coming by the day.
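
The idea of doing the large chunk of the processing once and adding a small low-rank correction for a second output can be sketched for a single linear layer. This is a simplified illustration of the LoRA formulation (output = Wx + B(Ax)), with made-up matrices; it is not the monitoring setup from the talk.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def vec_add(u, v):
    return [a + b for a, b in zip(u, v)]

# Frozen base weights (3x3): the expensive, shared part.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0],
     [1.0, 1.0, 1.0]]

# Rank-1 adapter: A projects down (1x3), B projects back up (3x1).
A = [[1.0, 0.0, 1.0]]
B = [[0.5], [0.0], [-0.5]]

def lora_outputs(x):
    """Return (base output, adapted output) while computing W @ x only once."""
    base = matvec(W, x)                # shared computation
    delta = matvec(B, matvec(A, x))    # cheap low-rank update B(Ax)
    return base, vec_add(base, delta)

base, adapted = lora_outputs([1.0, 2.0, 3.0])
print(base, adapted)
```

The adapter holds 6 numbers against 9 in the base layer here; at realistic layer sizes the gap is far larger, which is why an extra specialized output is cheap.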

Erik Steinholtz [00:12:33]: So it's an important task in and of itself to have a good playground for LLM evaluation. App hosting is, of course, what we have been talking about: you really need a flexible and strong way of hosting this stuff and good ways to put it into production in an automated way. Okay, so I just wanted to show you quickly how we do this. I use one of our prototypes. I think it's fun to play with RAG. What if I just take as the context. Now, how many here listen to the Lex Fridman podcast? A couple of you. Okay, I really love that one.

Erik Steinholtz [00:13:23]: If you haven't listened to it before, this is a strong recommendation from me to get started. I just transcribed a couple of the initial episodes and I said, yeah, all these guys, we want to be able to ask them any questions about what they said on the Lex Fridman podcast. Right? So that's what we can use an LLM for. Max Tegmark is there, my personal favorite. I'm very proud that he's from Sweden, of course. And now, since the events of this weekend, of course, we had to have Greg Brockman and Sam Altman from when they appeared on the podcast as well.

Erik Steinholtz [00:14:10]: So if you have any questions to.

Erik Steinholtz [00:14:15]: What they said to Lex Fridman, today you'll be welcome. So how did I do that? It's actually not that difficult. So we start with some prototypes, and this is a big library. I talked to one guy here before about churn modeling. That's one of the prototypes we have, if you want to see how you can fare with that, or structural time series, et cetera. We actually have a couple of prototypes on retrieval-augmented generation, RAG. I chose this one.

Erik Steinholtz [00:15:06]: It's based on OpenAI. And why did I do that? Because sometimes, even for us, just like for the rest of the world, it's a little bit hard to get hold of GPUs these days. And here, of course, I rely on OpenAI's GPUs, and that can be good in the experimentation phase. I definitely don't have anything that I'm afraid to expose at this stage, so it's good. So what I do is I launch the project and it will start. Here are a couple of initial steps that it goes through. Basically, by the end of step six here, I will have a chatbot that is ready with some pre-populated examples. So now I wanted to see, okay, what if I exchange those pre-populated examples in this prototype? So now I'm working in the prototype mode, if you like.

Erik Steinholtz [00:16:09]: So yeah, it turns out that's actually pretty simple. So I can show you how I did that. I first had to go to, I call it the BYO chatbot, the bring-your-own-data chatbot. We have a directory here called data, and this is where all the text is stored. So I took these transcripts, I chunked them up a little bit and put a little, not too much label, but a little so that it could recognize the various speakers that are in here. That was the only labeling I did. And then it's here.

Erik Steinholtz [00:16:54]: It's one directory per speaker. You had to chunk it up a little bit, so it's in 1 KB chunks. That's what you have to do. And one of them, it looks really corny in a way, because the chunking is pretty brutal. It's just on the 1 KB size, so it starts in the middle of a sentence. It's not the best way to do it.

Erik Steinholtz [00:17:16]: You could probably do it in a much better way when you chunk up your text. But at least this works. It's the same default way that toolkits like LangChain work. And then I put the label up there. So this was actually Christof Koch who said this. Okay. And then basically all I had to do was to run the encoder again. That one takes all the embeddings and puts them into the vector database.
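
The brute-force chunking described here, fixed-size pieces with a speaker label on top, can be sketched as follows. The function name, the label format, and the choice to count characters rather than bytes are assumptions; the prototype's actual code is not shown in the talk.

```python
def chunk_transcript(text, speaker, chunk_size=1024):
    """Split text into fixed-size character chunks, each prefixed with a speaker label.

    The split ignores sentence boundaries, so a chunk can start mid-sentence,
    which is the "brutal" behavior described in the talk.
    """
    chunks = []
    for start in range(0, len(text), chunk_size):
        body = text[start:start + chunk_size]
        chunks.append(f"Speaker: {speaker}\n{body}")
    return chunks

# Hypothetical transcript text for a guest; 2500 characters gives three chunks.
sample = "a" * 2500
pieces = chunk_transcript(sample, "Guest")
print(len(pieces), [len(p) for p in pieces])
```

A gentler variant would split on sentence or paragraph boundaries and overlap neighboring chunks, at the cost of uneven chunk sizes.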

Erik Steinholtz [00:17:45]: And then I'm ready to go. So if I choose my applications, here's my RAG app. I started it and I called it Lex in a Box and listed the speakers there, so you know who you're talking to. I see now that I have to enter my key, so you have to keep your eyes closed. I'm just kidding. This is probably what will change for the next version of this: the form would hide this field with your keys. Okay, but I have a couple of funny questions.

Erik Steinholtz [00:18:29]: But the way you ask this is: what did person X say about a topic? So I can have a few shots here and then I'll let you go. So, yeah, this one is good. What did Sam Altman say about the collapse of the Silicon Valley Bank? That was a little bit too much flipping, but anyway. As the call goes over to ChatGPT, it takes a little while. As you know, we don't have the nice, nifty streaming that ChatGPT does in its UI. The API doesn't provide that. It gives you the whole answer, like boom. So it comes out like this. Sam Altman said the collapse of Silicon Valley Bank is due to mismanagement and the pursuit of high returns in a low interest rate environment. He criticized the management team for buying long-term instruments, et cetera, et cetera.

Erik Steinholtz [00:19:32]: On top of that, I made sure here to get some references. So in those text chunks I just showed you, you can see which text chunks it is referring to to get these answers. So it's a nice way of making the circuit complete, if you like. You can always look up whether the LLM did a good job of interpreting the context that I gave it. This is ChatGPT, so I think it probably did. But when you come to the phase where you start using your own, maybe home-trained LLM, or maybe fine-tune it at home, that's where you probably want to scrutinize it even more and maybe do more side-by-side comparisons or whatever. So yeah, actually I have lots of funny questions, but it's even better if you ask your questions. I would think some people have questions, especially for Sam Altman and Greg Brockman here.
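
Returning references alongside the answer comes down to keeping the chunk identifiers next to the chunk text when the prompt is assembled. The sketch below is an assumption about how that could look; the prompt wording, the `build_prompt` name, and the file-path style identifiers are hypothetical and the LLM call itself is left out.

```python
def build_prompt(question, chunks):
    """Assemble a RAG prompt and the list of references that back it.

    chunks: list of (chunk_id, text) pairs returned by the retrieval step.
    """
    context = "\n\n".join(f"[{chunk_id}] {text}" for chunk_id, text in chunks)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    references = [chunk_id for chunk_id, _ in chunks]
    return prompt, references

# Hypothetical retrieved chunks; the ids mimic one directory per speaker.
retrieved = [
    ("guest/chunk_003.txt", "first retrieved passage"),
    ("guest/chunk_007.txt", "second retrieved passage"),
]
prompt, references = build_prompt("What did the guest say about the topic?", retrieved)
print(prompt)
print("References:", references)
```

Showing `references` next to the model's answer lets a reader open the cited chunks and check whether the answer is actually supported by them.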

Erik Steinholtz [00:20:39]: So what are your questions? And when you start looking at the names here, by the way, this is really funny when you listen to it, it's like a small circle of people from MIT. So one of these guys, he had Yoshua Bengio as his professor and all that. So it's like a small little circle. Anyway, Greg Brockman mentioned that the future of OpenAI is focused on ensuring the development of artificial general intelligence that aligns with human values. He stated that OpenAI's goal is to make sure AI benefits all of humanity and that the world becomes a place where they want to. And as you can see, this is slightly different from what it answers just in the vanilla untinted OpenAI model, if you like. I'm curious about this.

Erik Steinholtz [00:21:42]: What did you say? About the risks with AGI? Yeah, let's do that. And one thing, the OpenAI charter. Have you heard about that? I remember he mentioned this actually, so that was why I was curious about it. He mentioned the OpenAI charter, and that is their overall controlling document for what they want to do, and a lot of it is about how they handle the risks with AI and all that. So Greg Brockman said that the OpenAI charter is a crucial document that aligns the company on its values and mission. It took about a year to compile.

Erik Steinholtz [00:22:21]: The first line emphasizes minimizing conflicts of interest with the mission while raising resources. I think that's actually what the drama of this weekend is all about. The charter ensures that the development of AGI benefits everyone and avoids concentration of power. Greg highlights that the charter guides them to make decisions that align with their values, even if it comes at the expense of their stakeholders. Yes.

Erik Steinholtz [00:22:55]: And weren't some of the stakeholders quite upset when Sam Altman was fired? It's interesting. It was all in the OpenAI charter. I believe there was someone who asked about the risks. Exactly. Yeah.

Erik Steinholtz [00:23:12]: That's good.

Erik Steinholtz [00:23:14]: Are you specific, Greg or Sam Altman or Max Tegmark, or who should we ask this time? I think they all spoke about it. How many here heard Max Tegmark's Sommarprat? That was a bit of a sad story. You should listen to him here. He's like a lighter. Better so just to do in that defense. But what did Sam Altman say about the risks with AGI? Probably okay. Sam Altman said that the risks of AGI are significant and that safety measures need to be in place to prevent harm. He believes that AGI has the potential to either intentionally or unintentionally destroy human civilization or suppress human spirit.

Erik Steinholtz [00:24:15]: These risks make it important to have conversations about power, safety and human alignment when it comes to AGI. Thank you, ChatGPT. I think it got it pretty well. And you can see the text references there. What I found now, if we go back to what is the good and bad of the various ways you can work with LLMs, maybe the two big contenders are RAG and fine-tuning with LoRA. I don't know. Even though they are very different, RAG will give you a very specific answer because it will pick out a chunk of text and then answer based on that.

Erik Steinholtz [00:25:05]: So if you ask it a simple thing like who spoke on a podcast, it would come up with a pretty random answer. It will say Lex Fridman spoke. So it's good to know that it's very specific. But also, what I wanted to show, and I didn't quite get there, is a little slider here for the number of chunks that you actually include, because you can vary how many text chunks you want to include in the answer and that you do your prompt engineering with. I didn't quite get.
