MLOps Community

MLOps at the Age of Generative AI

Posted Aug 07, 2023 | Views 621
# Generative AI
# LLM
# Scale Venture Partners
SPEAKERS
Barak Turovsky
Executive in Residence @ Scale Venture Partners

Barak is an Executive in Residence at Scale Venture Partners, a leading Enterprise venture capital firm. Barak spent 10 years as Head of Product and User Experience for Languages AI and Google Translate teams within the Google AI org, focusing on applying cutting-edge Artificial Intelligence and Machine Learning technologies to deliver magical experiences across Google Search, Assistant, Cloud, Chrome, Ads, and other products. Previously, Barak spent 2 years as a product leader within the Google Commerce team.

Most recently, Barak served as Chief Product Officer, responsible for product management and engineering at Trax, a leading provider of Computer Vision AI solutions for Retail and Commerce industries.

Prior to joining Google in 2011, Barak was Director of Products in Microsoft’s Mobile Advertising, Head of Mobile Commerce at PayPal, and Chief Technical Officer at an Israeli start-up. He lived more than 10 years in 3 different countries (Russia, Israel, and the US) and fluently speaks three languages.

Barak earned a Bachelor of Laws degree from Tel Aviv University, Israel, and a Master’s of Business Administration from the University of California, Berkeley.

Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.

SUMMARY

The talk focuses on MLOps aspects of developing, training and serving Generative AI/Large Language models.

TRANSCRIPT

I'm Barak Turovsky, I'm an Executive in Residence at Scale Venture Partners, a leading enterprise VC firm focused on AI, and I'm not, uh, a big connoisseur of coffee. I just prepare a cappuccino using a machine.

Welcome back to the MLOps Community Podcast. I am your host, Demetrios. Today I'm flying solo. I just got done talking with Barak and wow! What an amazing person to chat with. He is just full of so much wisdom, and it is incredible because he is one of the originators in this space. We could say he worked on the Google team that created the Translate that we all know and love. I know, specifically, I used that tool so much when I was learning Spanish and got dropped in Spain and had to fend for myself. So that was my best friend back in 2008, I think.

And he talked today to us about how he was using machine learning and specifically what we could call today as a large language model, but basically just transformer architecture on the translate app that you get from Google, and how they created that and what he thought about it and what he called the "First Boom" of AI and what repercussions that had.

And then he just went into what he's now calling the "Second Boom" of AI. He has an incredible framework that he goes through as to when you can be looking at using large language models and when you shouldn't be using them, how he views the use of them right now, and what he thinks are the best uses of them.

I mean, there was so much incredible information here. It was just like, yeah, I loved it because he comes with a product mentality. He was a product manager at Google when they created the TPUs, which is a funny story that he shares in talking about that whole experience, about how they said, well, we just need to create better hardware if we wanna make this work.

And then they went out and did it and invested hundreds of millions of dollars to make that actually happen without knowing if it was gonna work or not. That is wild to me, and he was at the forefront of that. He was the product manager on some of these incredible use cases. So I loved that. But then we got into how he thinks product and machine learning engineers can work better together, and more specifically, what you need to know in the ever-changing paradigm of working with machine learning today and how you can set yourself up for success when you are working with product departments or product personas.

I loved it. I had a ton of incredible takeaways and I really appreciate Barak for coming on here. So let's just get right into this conversation in case you haven't.

Real quick, it would mean the world to me if you just share this episode with one friend. We're starting to increase our reach a ton, and it seems like people are sharing, so I'm gonna keep asking. I'm gonna keep asking you to share this with one friend because it is awesome just hearing how you all are enjoying these episodes and liking them.

I am going to put a little disclaimer because I know we have been talking about LLMs quite a bit lately. This episode is no different, but we are trying to find that happy medium. It's not going to jump only to LLMs all the time. We are still talking about traditional MLOps use cases. I guess we can call it traditional, even though it's only like two years old.

You know what I mean? And so now, share it with a friend. If you liked it, it would mean the world to me. Let's jump into this conversation with Barak.

So Barak, it's great to have you on here. I'm super excited because there are so many people that I think follow you and enjoy your content on LinkedIn and on social media, and you're one of those people that actually has been in the trenches. You've done a lot of great stuff in the field of machine learning, and you were just talking before we hit record about how your expertise really lies in product, but I think you're a little humble and you've got a ton of expertise, and you've also been around for a while. So I'm excited to dive into this conversation with you. Where I really wanna start is just from the beginning. Where did you realize that you wanted to get into this whole tech thing?

Yes. Oh, on tech. Well, it's a very long story. First of all, Demetrios, thank you so much for inviting me. I'm excited to be here on this podcast, and obviously it's a very important topic to talk about: not only how AI will transform many industries, but I also believe it will transform the MLOps and DevOps industry too.

People will need to gain new skills and transition to new skills, and I'm very excited because I believe we are in a second coming of AI. And I'm still very lucky and humbled that I was a central part of both of them. The first coming of AI was in 2015, 2016, when Google made a breakthrough, both on the hardware and the software side, in being able to use what we call deep neural networks, what we now call large language models, so AI at scale. Because before that it was purely theoretical, let's call it academic. You could not use it on a very large dataset, and Google, specifically the Google Brain team that we worked very closely with, basically proved that it's possible on the software side. And then Google developed custom hardware called Tensor Processing Units to do that.

And the very first product that was using deep neural networks at scale was Google Translate, the product that I led. So, I mean, I'm extremely humbled and proud of being part of that amazing journey. As a side note, if people want to read a little bit about the history, both the technical history and the overall history of AI, there is still a great article from the New York Times called "The Great AI Awakening" that I recommend everyone read.

It's a really great article. Maybe we can even send a link to people. Yeah, we'll put it in the notes of the podcast. So that's basically the first coming of AI. But I would say this first coming in 2016, it really exploded and that created all this revolution. But to some extent, this revolution was limited to big tech companies like Google, Meta, Microsoft, Amazon, and others, and all the stuff around assistants and speech.

And Google did a lot of things around query understanding and ads, et cetera. And I did a lot of this work across understanding speech, understanding search queries, matching queries to ads, et cetera. All this was amazing work, amazing achievements, but they were limited. The use cases were limited.

Now, to some extent, I think even OpenAI was surprised when they put this clever UI on top of those large language models, and they could do it, obviously, because startups are much less bound by the constraints that Google or other big companies have. But now the interest in and applications of large language models are going to be democratized. They're going far beyond. I would actually say, and we'll talk about it, they will transform a lot of industries. Some industries will be transformed faster, some slower; some use cases will be transformed faster, some slower. But eventually, I believe, we will truly go into AI being the new electricity that will transform almost every aspect of our life.

So, as I say, I worked in the space for more than 10 years and was part of both waves, or both comings. So I'm very excited about that. If you talk about my tech background, I started in the military, in the Israeli military, many years ago, doing what is now called AI, though at that time nobody called it that, and worked at a variety of companies, startups and big companies, across advertising, commerce, and payments.

But the last 10-plus years were mostly focused on AI.

Yeah, and so you were at Google when they released that, what you were calling the first wave of AI, right? And you saw how it affected Google. Can you explain that whole movement and what happened? Like, were your eyes opened as soon as Translate came out? What was the internal buzz? Were people asking, what else can we add this to? What did that look like?

Yeah, it's a great question. It's a fascinating story. So deep neural networks were talked about for a while. By the way, AI has existed for many years. I mean, some people go back to the Turing test, and as I say, that Great AI Awakening article is a great article that I highly recommend people read. But on a high level, a lot of academics, including academics that were at Google, came to Google Translate and said, hey, why are you not using deep neural networks? And we asked them, what's your dataset? They're like, oh, we have amazing results. Like, what's your dataset?

10,000 sentences. So like, well, our dataset on the older system, let's call it machine learning 1.0, statistical machine translation, was, in good languages like Portuguese, in the single-digit billions. So, can your system run it? No. And the results are not great if you use neural networks on 10,000 sentences.

So it doesn't help. And then at some point, Jeff Dean, who is, I think, considered the father of modern AI, started to look at that and got interested in it, and he came to our team and said, I think it's possible. And my engineering counterpart, who at that time was my boss and later became my peer, was skeptical, mostly because of the production aspects of it. He said it'll be too slow. And Jeff Dean basically asked him, what would it take? And he said, new hardware. And Jeff said, you'll make it happen. And, I mean, it's unbelievable. It actually gives a lot of credit to Google.

Google invested upfront without a clear use case yet on monetization, even though it was obvious that Google would monetize it. Google invested $130 million, it's public information, to develop what's now called custom hardware. And the reason for that became clear when we started looking at the results. In 2015 there was also a software breakthrough with Google Brain: they developed a system that could process all those billions of training examples at scale. But it was extremely slow. Just to give you a sense of how slow it was: it was a hundred times slower in production versus our production system. So let's say a translation would take, I don't know, one second end-to-end; it would now take a hundred seconds. It was an unusable system. That's why Google had to work both on the side of the model and also on the hardware part, effectively developing custom hardware for which Google Translate was initially the first customer. And it was unbelievable, to your point about excitement.

So many people were so excited that, while we conservatively estimated that it would take us three years to launch the first language, we finished 20 languages in nine months. So we did it in one fourth of the time. Because obviously Google is an amazing company with amazing engineers, but also the excitement about this literally transformative moment was unbelievable.
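To put rough numbers on the two figures mentioned here, a back-of-the-envelope sketch in Python. The one-second baseline is the illustrative figure used in the conversation, not a measured latency, and the variable names are made up for this example.

```python
# Illustrative arithmetic only. The 1-second baseline is the hypothetical
# figure from the conversation, not a measured Google Translate latency.
baseline_latency_s = 1.0   # assumed end-to-end time on the older production system
slowdown_factor = 100      # the early neural system was described as 100x slower

neural_latency_s = baseline_latency_s * slowdown_factor
print(f"Early neural system: {neural_latency_s:.0f} s per translation")

# Schedule: three years estimated, nine months actual.
estimated_months = 3 * 12
actual_months = 9
fraction_of_estimate = actual_months / estimated_months
print(f"Delivered in {fraction_of_estimate:.0%} of the estimated time")
```

A hundred seconds per request is why the system was unusable in production, and nine months against thirty-six is the "one fourth of the time" quoted above.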

And just another kind of funny story is this article, the Great AI Awakening, because we all understood that we're working on something so transformative for the industry, not only for Google, we actually allowed the reporter to sit under embargo. He could not publish anything for six months. He would literally sit in the productization meetings that I would lead when we would talk about having product.

There were a ton of details about how you do it and how you launch it in a product that had billions of users. So it was a super exciting project and, as I said, we had to overcome latency issues and quality issues and hallucinations. All the stuff that we're talking about today, it still exists. Yeah. But we had to overcome them, and it was an amazing ride.

Wow, so that is incredible. Can you tell me a little bit more about the experience of building a product at that scale and the challenges that you overcame? These challenges are also very topical for what we're talking about today. So how did you overcome things like hallucinations?

Yes. So, by the way, first of all, you literally hit the nail on the head. I think what we experience right now is this hype. It's also, as I said, a very positive hype, because now LLMs, or this kind of AI, are democratized, but people are mostly toying and playing with it. And there is a big difference between toying and playing and doing something cool, and launching it at scale to hundreds of millions, or in our case billions, of users, and making sure that the product is not only useful but also, you know, not offensive, et cetera.

So there was a lot of work, including building supporting ML systems, to deal with these hallucinations. And hallucinations happen; they will continue to happen in AI. And by the way, it also goes back to the use cases, and I will talk about my framework. In some use cases, hallucinations are very important to deal with, for example the use case of search, which I will talk about a little bit.

It's a use case where I believe you need very high accuracy and very few hallucinations, and you also need to make sure you reduce offensiveness, et cetera, especially if you're talking about a company like Google or Facebook. There are use cases in translation, too. Or let me give you an easier use case.

You are writing a draft of an email, and you as a user will probably review it before you send it. In those particular cases, even if you have hallucinations, if the division of work between human and machine is that the machine handles what I call fluency, meaning, you know, good writing, good, eloquent writing, because that in many cases takes a lot of time from people, then it might be fine. And maybe that's actually a good time to talk a little bit about my framework, because my framework was actually developed as part of my work on Google Translate and also all the discussions, especially in recent months, with all that massive hysteria that, you know, oh, search will be disrupted tomorrow and Microsoft will take over Google Search with OpenAI. So on a lot of those things, the industry very quickly proved that it's not the case. And I very much enjoyed seeing that I was right. But maybe I can also expand on why, and look at it as a framework to explain it a little bit.

Yeah. I love the framework. I think what you showed in the article that you wrote matches up so much with a lot of this stuff that we've been talking about: basically, how severe the downside is is really what you wanna look at if you are trying to use AI. Because you can't allow for any kind of hallucinations if the downside is huge.

But as you said in the article, if you're just creating a poem or some music lyrics, if it hallucinates, that might actually make for a better poem. Who knows?

Yeah, it might be a feature.

Exactly. Exactly. So yeah, maybe get into the framework a little bit more. That would be helpful.

Okay. So as I said, the framework was developed based on 10 years of experience working on a lot of use cases, including search and others, but also on all that massive hysteria, where a lot of people, especially investors, would call me and ask me, hey, is Google bad? Or, you know, is search completely disrupted because of OpenAI?

And basically I tried to explain it to people, and people said, oh, that's very helpful. So on a high level, I believe this is one of the biggest misconceptions: when people look at what large language models can do, they're very excited about what I call fluency. And fluency, in my definition, means that suddenly people discovered that machines could talk like a human, in many cases a very eloquent and very confident human.

Meaning, if you interact with OpenAI's ChatGPT, or with Bard, or with the new Google generative AI, you see how eloquent and how confident and how polished the answers are, and that's great. It's a huge technological achievement, but it's important to understand that very confident talking does not mean that you're highly accurate. It's a completely different dimension. You could be very confident and very eloquent and very polished, and very inaccurate. And by the way, investors usually know those people very well. It's a very common behavior in some people; basically, it's a human trait.

You very confidently, in a very polished manner, talk about any topic in the world. If you're asked a question and you don't know the answer, you just make shit up on the spot. And because you're so confident, because your delivery is so smooth and so amazing, people tend to get carried away, especially on topics they don't really understand, like AI, and that's a very dangerous outcome regardless of the intent.

It could be con artists who want to defraud you of your money, and you get carried away. We have a ton of examples, including recent ones like FTX. Or it could be a person who is not a con artist but really believes in his own thing. It doesn't mean it's true. It could be completely wrong, but he just believes in it.

Again, maybe FTX could be that example. I mean, you know, Sam Bankman-Fried says he's innocent. He just believes that he's right, right? So you never know, but at the end of the day, the outcome is pretty bad regardless. So I think what's really important when you look at the applicability of large language models is to plot the use cases on a two-by-two grid, maybe even a two-by-three grid. On one axis you put high fluency versus low fluency, on the other you put high versus low accuracy, and you ask, okay, what's most important for this use case? And I'm specifically using extreme examples.

When you write a greeting, or you write a children's book, or you're composing music, or writing a poem, or writing science fiction, what you really, really need is a good story. In some cases you might not even need accuracy. It actually might be a feature that you don't need it; it's a made-up story anyway. But at the very least, there is no right or wrong answer.

Right. So accuracy is not that critical. Mm-hmm. When you look at the use cases more on the search side, for example, you are asking queries whose answers you will use as supporting data for an important personal or business decision, meaning it will involve money or time, or both. What you really need at the end of the day is high accuracy.

Yes, in some use cases, like I want to buy a dishwasher, or I want to book a hotel in Paris for my next vacation, I need some explanation. I need a story. But at the end of the day, if a good story comes at the expense of high accuracy, that's not gonna help. You really, really, really need high accuracy here.

And the last thing, represented by colors in my framework, is basically what I call low stakes versus high stakes, meaning: what is the risk, or what are the consequences, of making a mistake? So let's look at some more practical examples. For example, writing a business memo or email is kind of yellow in this case. What I mean by yellow is, it's not that I don't care that the email will be inaccurate.

I do care. But the good thing is that I as a user, unless I'm very dumb, will probably not send an email that was generated by a large language model just like that. I will double check for accuracy. Yeah. And that will be a division of work between machine and human that is very reasonable: machine, help me write a good story, because writing a good, eloquent story takes a lot of time and effort for most people. People usually get stuck; they're striving to make it a good story. The same goes for creating a business presentation, et cetera.

So I will, let's just pause right there for one second because I want to explain to those that are just listening.

We have this XY coordinates, and on the left side you have low accuracy. And then on the right side you have high accuracy, and then on the bottom you have low fluency, and on the top you have high fluency. And basically everything in the left upper quadrant are green use cases, and that means that you don't need super high accuracy.

You do need higher fluency though. And these are things that Barak was just talking about, like writing a poem or writing science fiction, where story is important. But as you start to go over and you need more accuracy, that's where things become more yellow, and the further to the right you get, so the higher the accuracy that you need, that's when things get red.

And so you were just about to go into some of the other use cases. Feel free to jump into some of these red use cases where you do need high accuracy, like the travel recommendation bookings.

Yep, yep. Yes, I would also discuss the difference between yellow and red, because in yellow you might actually need high accuracy too, but the good thing is that the user could double check for accuracy. Because in that use case, even at scale, meaning even if you need to write millions and billions of business presentations or business memos and emails, the division of work will still be: write me a good story, and I will double check for accuracy. And even if you need to adjust some of the facts, it could provide a productivity boost of up to 70%, which is huge because, as I said, eloquent writing takes time and effort for a lot of people.

Now, on the red side, that's where it becomes very tricky. For any search use case, it's physically impossible to put a person behind every search query and validate it. That's not possible, and if you ask users to validate every search query, then it's not that different from what we're doing in search today.

And therefore, if I go to the next slide, where I basically create those two circles in blue and red, and if I grossly oversimplify, I basically say that creative or workplace productivity use cases, meaning improving the productivity of creators (writing, composing music, writing a poem, writing a children's book) or improving what I call white-collar productivity (writing a business memo or email, creating a business presentation, creating a marketing asset, creating stock images, et cetera), are a much better fit for large language models, because in those use cases either what's really important is the story, or the user can validate the facts.

The use cases that I call information slash decision support use cases are roughly search use cases, like: you, machine, give me an answer. Which hotel should I book in Paris? Give me an answer. What appliances should I buy? What insurance should I buy? Or give me very important supporting data for a very important business decision.

Those use cases, in my opinion, will take much longer. Those are longer-term use cases for LLMs. Uh-huh. I would even argue that people will not be comfortable and ready to adopt the search use cases using LLMs. At least right now, people will focus more on those creative slash workplace productivity use cases.
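One way to make the grid concrete is a small Python sketch. It is a simplification written for this transcript, not the rule from Barak's slides: the `UseCase` fields, the `risk_color` function, and the example classifications are all assumptions chosen to mirror the green, yellow, and red colors described in the conversation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """Hypothetical description of an LLM use case along the framework's axes."""
    name: str
    needs_high_fluency: bool    # is a polished, eloquent "story" the main value?
    needs_high_accuracy: bool   # would a wrong fact cost the user time or money?
    user_can_verify: bool       # will a human realistically review before acting?


def risk_color(use_case: UseCase) -> str:
    """Assumed mapping from a use case to the green / yellow / red colors."""
    if not use_case.needs_high_accuracy:
        return "green"    # no right or wrong answer, e.g. a poem
    if use_case.user_can_verify:
        return "yellow"   # accuracy matters, but the user double checks the draft
    return "red"          # accuracy matters and nobody can check every answer


examples = [
    UseCase("write a poem", True, False, True),
    UseCase("draft a business memo", True, True, True),
    UseCase("recommend a hotel in Paris", True, True, False),
]

for uc in examples:
    print(f"{uc.name}: {risk_color(uc)}")
```

The sketch treats "can a human realistically verify the output" as the dividing line between yellow and red, which is how the memo and search examples are contrasted above; a fuller version would likely score stakes on a scale rather than as a yes/no flag.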

So that's, on a high level, my framework.

You also have this idea that it's not only, hey, let's have a travel recommendation or a hotel that we're going to recommend that you go to. Right now a lot of people are talking about, oh yeah, well, you'll get the recommendation and then you'll be able to book it right there.

But what happens when you don't really trust the booking agent either? Maybe you book it, but it books for two children, or it books for two adults when really it's just you going. That kind of thing is also very scary, and it's hard to debug those pipelines.

Absolutely. I would even say less than that.

I would not even talk about taking actions. And my great example is what happened with voice assistants, and making purchases with voice assistants like Alexa. Because, you know, we were in the scare of a new paradigm shift that would disrupt search, mm-hmm, in 2015, 2016, when Alexa launched its voice assistant and said, hey, you can now buy, you know, your toilet paper and your toothpaste or whatever using Alexa. And there was no technological challenge with doing that, because Amazon has 150 million Prime members in the US where they can literally predict when you'll buy your next toothpaste, because you order it from them, and there is no technological issue in understanding that you're asking for toothpaste, and they know which one.

But guess what? Google got scared about it and invested a lot in Google Assistant. Mm-hmm. And Amazon invested, and Google invested, and Microsoft invested in Cortana, and Facebook invested, and Apple invested. Everybody invested in all those devices. And there are hundreds of millions of devices, but people don't use them for what I call high-stakes use cases of, let me buy something using voice, which is very close to the LLMs, right?

Mm-hmm. People use it for setting an alarm. People use it to play music. People use it for low-stakes use cases. So my feeling is, forget about just, can I buy it? Because on Amazon you could buy it very easily, and it would even be shipped to you. I think the paradigm shift of trusting a machine for something that will involve time or money or both, people are not there yet.

Yeah. And some of it is a technological challenge, but some of it, like in Amazon and e-commerce, is not a technological challenge. It's a user adoption challenge. So I think those areas will be adopted much slower. If you ask for a prediction, people will broadly go: let me try using machines or LLMs for creative use cases.

Help me write a review. Help me write a thank-you note to a teacher. Help me write a memo, help me write a presentation. And when you start to get more comfortable, which could take three to five years, then you will start saying, okay, now I trust you, machine. Be my personal assistant.

Give me an answer: what restaurant should I book? Give me an answer: what insurance should I buy? Et cetera, et cetera. That's my feeling for how it'll go in terms of consumer adoption. And there are obviously technological challenges of hallucination too, when you're talking about high-stakes, highly accurate use cases.

Mm-hmm. Yeah. My favorite thing to do with Siri, literally, is saying, where are you? When I lose my phone, it's like, hey, where are you? And that's basically what it is. And like you said, setting alarms or maybe playing music, but not much more than that. And what you're saying is, yeah, in a few years we'll have that adoption because people will feel more comfortable using these large language models. I'm not sure.

I'm kind of swinging the other way: the more I use them, the less confident I feel in doing more with them. And so I wonder if other people are feeling that way too.

I'm just curious, Demetrios: is it for use cases that are more creative, or use cases that are more information seeking?

More creative use cases.

Oh, even on creative. Okay. Okay. Interesting.

Yeah. Yeah. Like, hey, generate something for me, or help me summarize this, help me transcribe this, whatever it may be. I guess I hold them to a very high bar, and maybe you're right.

By the way, I would consider both you and me not very representative of users at scale, because we are still very advanced users, right? We are early adopters of everything. So yes, I mean, if you ask me, shifting the paradigm of search will probably take five to seven years to get to sizable scale. I think other areas, like creative and workplace productivity, will be faster, but again, mm-hmm, it still will take time. Yeah.

So there is something that you kind of talked about there. On one hand, everyone needs to remember that infamous blog post, "You Are Not Google." But on the other hand, I would love to hear what you, as a product owner, think about when you are looking at these new applications that are coming out.

And as you mentioned, there's a big difference between making a Twitter demo, like an Auto-GPT that blows up and gets a ton of stars, and what happens when people actually try to use it in production, and it's like, dude, this is not working. And that can go for any kind of app right now, or any use of these LLMs. So how can we mitigate some of these big questions and trade-offs that we have?

One being cost, that's huge. Right. And then the other one that you mentioned is latency. Yep. And so if you are trying to do this at scale and really have a battle-hardened, bulletproof LLMOps pipeline, what are you looking at and how are you thinking about that? And also, just to add another question onto the infinitely long question.

How do you see other companies doing this? Well, first of all, I think it goes back to my framework: people should be realistic about what can be achieved and start with the use cases that I would call easier rather than harder, because if you try to shoot for the stars and think, I'll replace Google Search, right?

That's very difficult. It might take LLMs 10 years to get there, because there is always, you know, the 80/20 principle: in productizing something, you sometimes spend 80 to 90% of the work optimizing for 5% of the use cases. So, for example, with reducing hallucinations, it might take literally 10 years

to get from 90% accuracy to 95%, and that might not be enough, because in search you might need 99% accuracy. On the flip side, if you start with creative use cases, I don't know, writing a movie script, you might be totally fine with 80% accuracy, right? And 20% hallucination is less of an issue.

So I think, first of all, you need to be realistic about what use cases you should focus on, right? That's the first thing. And then there's still a ton of topics that you mentioned. One of them, back to the use cases, is latency and cost. And guess what? Even on those, you should focus on easier use cases, because

in many use cases, latency is less of an issue. It is a huge issue in search because people expect instant answers. So in search you need very high accuracy, which probably means larger models and bigger costs, and you need very low latency, and very, very high freshness, meaning you need super fresh results.

There's a ton of search queries that are influenced by fresh results. And guess what? Fresh results are much more likely to either be hallucinated or be influenced by bots or whatever. If tomorrow, I don't know, Trump was indicted or Trump announced he is running for president, you can have a ton of bots that will completely pollute your results, right?

And it's not even hallucination. It's real content that is just not fully correct. If you go to use cases like, I don't know, please create a draft email for me, you can probably wait, maybe even an hour. So the latency requirement is not as rigid, and, as I said, accuracy is not that critical, et cetera.
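
The comparison above (search needs roughly 99% accuracy, instant answers, and fresh data, while a movie script or an email draft tolerates about 80% accuracy and a long wait) can be written down as a simple requirements check. This is only an illustrative sketch: the class names, the function, and the latency figures for search and for the model are assumptions made for the example, and the 80% bar for the email draft is borrowed from the movie-script example rather than stated in the conversation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class UseCase:
    name: str
    min_accuracy: float     # share of answers that must be acceptable
    max_latency_s: float    # how long the user is willing to wait, in seconds
    needs_fresh_data: bool  # whether answers depend on very recent events


@dataclass(frozen=True)
class ModelProfile:
    accuracy: float       # assumed measured accuracy on the task
    latency_s: float      # assumed typical response time, in seconds
    has_fresh_data: bool  # whether the system can see very recent content


def unmet_requirements(use_case: UseCase, model: ModelProfile) -> List[str]:
    """Return the requirements the model misses; an empty list means a good fit today."""
    gaps = []
    if model.accuracy < use_case.min_accuracy:
        gaps.append("accuracy")
    if model.latency_s > use_case.max_latency_s:
        gaps.append("latency")
    if use_case.needs_fresh_data and not model.has_fresh_data:
        gaps.append("freshness")
    return gaps


# Accuracy numbers echo the conversation; the 0.5 s and 5 s latencies are placeholders.
model = ModelProfile(accuracy=0.90, latency_s=5.0, has_fresh_data=False)
search = UseCase("web search", min_accuracy=0.99, max_latency_s=0.5, needs_fresh_data=True)
script = UseCase("movie script draft", min_accuracy=0.80, max_latency_s=60.0, needs_fresh_data=False)
email = UseCase("email draft", min_accuracy=0.80, max_latency_s=3600.0, needs_fresh_data=False)

for use_case in (search, script, email):
    gaps = unmet_requirements(use_case, model)
    verdict = "good fit today" if not gaps else "gaps in " + ", ".join(gaps)
    print(f"{use_case.name}: {verdict}")
```

With these placeholder numbers, search fails on all three requirements while the two drafting use cases pass, which is the "easier versus harder use case" split described here.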

So first and foremost, I think the classical rule is: focus on use cases that are achievable. And even on those use cases there is a ton of work, because you also need to think about, you know, open-ended, NLU-based input. It sounds very cool: oh, let me allow people to say whatever they want. But what I discovered in my many years working, for example, on speech is that our language is extremely rich, and it's awesome, but it is also extremely open-ended and ambiguous.

And I always give people this example: if you were a witness at a crime scene, you'll be called to a police station. They say, describe the suspect. You say, well, I cannot describe, I cannot draw. They say, don't worry, let me bring you a police sketch artist. He will help you draw this person. And then you suddenly discover

he's asking you these questions: there are, you know, 15 types of eyebrows and 15 types of ears and 15 types of eyes and 40 types of noses. And then it becomes very complex. So in many cases, when you give people a lot of choice to express themselves, it's awesome, but it also complicates the product solution.

In many cases, you actually need to provide people with easy-to-use toggles or, you know, UI choices, like a shorter versus a longer story, a visual versus a textual story, casual versus professional. So a lot of things can also be simplified. There is still a ton to be figured out there, but my big advice to people, and I think we can also talk about the industries that I believe will be disrupted, is this.

Focus on use cases that LLMs are a good fit for today, versus use cases that LLMs could be a good fit for tomorrow, because that's potentially a very long road. I love that. And there is one thing that you said in your article that I wanna harp on for a moment, and that is around the idea of having

user feedback, and that being really where you're going to create a moat. If you are creating some kind of tool or use case, maybe it's people within companies where the whole company is pressuring them to start using these LLMs, and they have to make sure that they're powering up all their coworkers with some large language model capabilities, or whatever it may be.

But you were talking about how the real delta is going to be created when you can get human feedback, when it is built into the workflow of the application. So if we're going back to this text creation example, that may be that every once in a while you have to create some kind of text, or you have to change the text that's underlined.

Grammarly, for example: maybe they say, look, this isn't correct, but we're not gonna give you the answer that we think is correct. We're going to ask you to change it. And then, because you changed it, Grammarly now has a more robust data repository that they can call on and use to train their models better.

Yeah. By the way, there are multiple ways to look at that. You could also suggest a change and learn from the adjustments that people make. There are multiple ways to do it, but feedback is definitely a very important angle here. Yeah, yeah, exactly. And so the point you're making is that feedback is going to be the crucial aspect as we move forward.
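
The "suggest a change and learn from the adjustments" idea can be sketched as a small feedback log. This is a hypothetical illustration, not how Grammarly or any specific product works: the `FeedbackRecord` type, the three labels, and the rule for turning the log into training pairs are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FeedbackRecord:
    original: str    # text the user wrote
    suggestion: str  # text the product proposed
    final: str       # text the user actually kept
    label: str       # "accepted", "rejected", or "adjusted"


def record_feedback(original: str, suggestion: str, final: str) -> FeedbackRecord:
    """Classify what the user did with a suggestion, as part of the normal workflow."""
    if final == suggestion:
        label = "accepted"
    elif final == original:
        label = "rejected"
    else:
        label = "adjusted"  # the user changed the text, but not to the suggestion
    return FeedbackRecord(original, suggestion, final, label)


def training_pairs(log: List[FeedbackRecord]) -> List[Tuple[str, str]]:
    """Keep (input, target) pairs wherever the user changed their original text."""
    return [(r.original, r.final) for r in log if r.label != "rejected"]


log = [
    record_feedback("their going home", "they're going home", "they're going home"),
    record_feedback("me and him went", "he and I went", "me and him went"),
    record_feedback("thx for you're help", "Thanks for your help", "Thanks for your help!"),
]

for record in log:
    print(record.label, "->", record.final)
print(training_pairs(log))
```

The "adjusted" case is the interesting one: the user's own correction becomes the training target, which is the kind of signal a competitor without the same workflow cannot easily collect.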

So whatever you are creating, try to make sure that you have that feedback loop inside of it. Yep, absolutely. So that's super cool. Now, there are a few other things that I wanted to go down the road of, one being these different industries that you feel are going to be affected the most.

I get the feeling that we are both very aligned on the idea that it's not going to be self-driving cars. That's not where we're going to see LLMs make their biggest impact. And for that matter, it's none of those bubbles that were in red, as you go along the high-stakes axis. The higher the stakes, the less useful LLMs are going to be in the short term.

Yeah. Correct. And you think that later on, give it whatever, five, 10 years, it will eventually get there. You have no doubt. Yeah, that's my feeling. Again, back to your point, I think it'll be a progression. People will start trusting machines more and more on these creative or workplace productivity cases.

And if they build this trust or rapport or whatever you call it, then I think you might be able to go to those use cases. And remember, there are also technical aspects that need to be solved. Yeah, that's such a great point. So I guess the really interesting thing that I'm wondering about is: what are the industries, in your mind, that are going to be disrupted the most in the next few years, or that you are seeing being disrupted right now?

I mean, I think we've all seen that copywriters are getting a huge level up. It's not necessarily that they're being completely replaced, but they are able to use a lot more tools, they have a greater advantage now, and they're able to get more done. Same with coding. If you are using Copilot, that's an awesome example.

But what are some others? I mean, I guess law, taxes. I've seen a few really great use cases. A friend of ours, a friend of the pod, came on here and was talking about how he works at this company called Digits, and they are using large language models to analyze people's taxes and how they do their taxes, and to suggest to a human who is a tax professional, hey, maybe you can do this

to save some money on your taxes. Yep. And then that tax professional will say yes or no to it. Yep. And you touched on three out of the four that I wanted to talk about, but I will kind of expand on them. The first one, by far, in the short term, is entertainment. It's already happening, and that's not surprising, because entertainment, for good or for bad, is always running after the shiny new thing or shiny new technology.

I mean, you probably know that for e-commerce and the internet and credit cards, the first early adopters were the porn industry, and that's kind of a part of the entertainment industry. And on the bad side, they were always the first to sue, like the Napster example, and we see exactly the same thing.

You see deepfakes and different artists trying all those cool LLM things, because it's also easy to try. Yeah. But on the flip side, we see studios basically suing the platforms. Yeah. Or asking them to remove deepfakes and all those other things. So that is already happening, and it's a fascinating topic. I will probably publish an article about it.

It's a fascinating topic to look at, not only because entertainment is an interesting angle, but also because it in many cases shows the way things could evolve, because they're always the first to adopt cool new things that are very, you know, top of mind for people, and also the first to try to stop them.

So it's very interesting. The second one, the big one, is not really an industry. It's more what I would call a collection of use cases, and I call them customer interaction use cases: sales, marketing, customer service, et cetera. Those include almost every vertical. So you mentioned legal. It includes legal, it includes transportation, like airlines.

It includes utilities, like, you know, cable companies, et cetera. It includes healthcare, it includes insurance, everything, right? I believe it'll impact every company. I would actually say more than that: I believe that this disruption of customer-facing interactions will impact almost every industry across the board, financial services, banks, et cetera.

And the reason for that is that I believe that within three to five years, any company that has a sizable customer base, let's say more than 1 million customers, would need to make their internal knowledge bases accessible. For example, if you call, I don't know, American Airlines and ask, am I eligible for a refund if my flight was canceled, or you call your internet provider asking why your internet is not working,

what customer service agents are basically doing today is understanding your intent, querying an internal knowledge base, and trying to give you a cookie-cutter response. I believe you can make those knowledge bases accessible and served by large language models across all channels: emails, documents, chats, even calls. For example, you're probably familiar with Google Duplex technology, which Google used to book appointments, where people could not tell whether a human was speaking. Companies that do not embrace this new paradigm will face the risk of being disrupted by competitors, either existing or new,

that will deliver not only a fraction of the cost, a much lower cost, it could be 40 to 50% of the cost structure, definitely, but, I believe, also a much better customer experience, because they will leverage generative AI or language models. Because think about it: today you call customer service asking why your internet is not working, and it's a relatively poorly trained, in many cases offshore, customer service guy who understood your intent more or less, queried the database or knowledge base, and gave you a cookie-cutter response that could be too technical for an average person.

With large language models, you can easily ask them: take this knowledge base, summarize it, make sense of it, and rephrase it for a non-technical person. GPT-4 is also very good at understanding human emotions to some extent, that a person is upset, or scared, et cetera, and at finding a way to calm them down, basically using phrases that will calm a person down, et cetera.

I believe we'll see a much better customer experience if you use large language models, and lower costs. So I believe that will be, by far, the biggest disruption. I now have this kind of slogan for all those companies that have a sizable customer base: ChatGPT your competition, or your competition will ChatGPT you.

Then, you mentioned coding. I think coding will be disrupted, and it's probably very relevant to your audience. And finally, I also believe that education is being disrupted as we speak. As you probably saw, GPT-4 especially is amazing at passing standardized tests. It immediately impacted public companies that focus on that, like Chegg or Pearson.

And I think we're just at the beginning of that. But I would say, back to my point, customer-facing interactions span almost every industry, and that will be a big focus for LLMs. Yeah, that's fascinating to think about, because you think about these apps such as Twilio, and Twilio feels like they're in a perfect spot to help others, considering what they do,

if they just add some kind of GPT feature to it, so it's really easy for people who are already using the Twilio features to go and add that on top. And I think we don't need to get into it, 'cause it feels like that's probably for a different discussion, when it comes to: do you create a whole new tool, or do you do what

hopefully Twilio will do? Since they already have the distribution, they already have the customer base, you just throw on some ChatGPT. Or, yeah, we shouldn't go into that. But it's not that easy, not on the tools and not on the process. Because if people think it's a new offshoring era, let's fire a hundred people in the US or Europe and hire, you know, machines instead, like we hired in India,

it's not that. You will need different tools to deal with hallucination, and you'll need a different skill set. For example, you need more people who do data cleanup, data processing, et cetera. You will probably need people who are higher-end customer service agents to deal with exceptions, the things that are, you know, high stakes, for example giving a refund or anything monetary.

So the entire process and tooling need to be re-engineered. It's not as simple as, oh, let's just add ChatGPT on top of Twilio or other tools and call it a day. I think we are very far from even figuring out what it'll look like. It's a big change.

Yeah. Well, it is interesting that you mentioned the data side is going to be so important. Yes. It feels like the data engineers are going to be even more important. If they weren't already an integral piece of the whole MLOps or machine learning lifecycle, the data engineers are going to be so integral in this. Absolutely.

Maybe you could speak a little bit to that and what you see happening there. Yeah, so there are actually two aspects to it. I mentioned the first aspect. The most interesting question that we still don't have an answer to is: will the field be dominated by huge and mostly proprietary models, or

will it be split, or maybe even dominated, by more custom models that are built on top of open source models and are customized and fine-tuned for a specific industry, specific use case, or even specific company? I mean, I don't know the answer to it. For the first three months it was like, you know, OpenAI is eating the world and everybody will try it.

But as I said, they will try it, and then when people go to scale, they realize it's expensive, and there are issues with privacy and security and proprietary data and you name it. And then suddenly we see something that I knew forever: OpenAI doesn't have any big moat, because a lot of this research came out of Google.

Google doesn't have a moat either. You probably saw that article from Google, "We have no moat, and neither does OpenAI." There's a ton of companies that develop open source models, so any industry, I don't know, the entertainment industry, could take an open source model and put in their own training data, especially if it's, you know, IP protected, and do something interesting with it.

So if that happens, you will see a huge impact on MLOps and DevOps: basically all the tooling, vector databases, et cetera, to find the data, prepare it, pre-process it, and store it in the right embedded or vectorized manner to make it available for training. So that's one angle, and I think it's highly dependent on which models will be used.

And my feeling is there will not be one dominant approach. It will not be all OpenAI, or all Google, or all Microsoft. It'll be split: some companies will use custom models and some will use big proprietary models, because those big models are expensive. The second aspect, which I think is very important, is what I call data retrieval.

So I was talking about how you need to connect your internal knowledge bases to LLMs. It doesn't have to mean that you need to train the model every time, or fine-tune it, but you definitely need to make sure the knowledge base is accessible to some kind of indexing or some kind of internal enterprise search engine.

And that is also a totally non-trivial task that requires, to some extent, a different type of MLOps and DevOps work. So it's becoming closer and closer to data science. And finally, I believe that in many cases LLMs will be focused much more on handling what I call the front end. They will handle the customer interaction, in a sense.

They'll understand the intent of the customer. They'll also generate the output. But there will be a ton of tools on the backend, some of which might be ML-based or even LLM-based, and some could be just classical data retrieval, which is also very complex: how do you query your private and, you know, public sources of information,

internal knowledge bases, or, I don't know, Reddit? Again, in my example of how do I reset, I don't know, a router, you might have internal information, but if it's a complex issue, maybe you even check on Twitter or on Reddit what people are saying about how to deal with it. And that could be another option to help people.

So you need to create a lot of databases, which is very hard, grinding work, and have some kind of logic for what information you pass to the LLM, so the LLM will do the reasoning and summarizing and basically find the right answer to give to the user. All that stuff, I think, those smaller internal models, whether ML-based or

LLM-based or not, will handle a lot of this backend, and that will require a lot of work. And I think the MLOps and DevOps skill set needs to be much more what I call LLM-friendly. They will need a much deeper understanding of the strengths and also the limitations of LLMs, and to make sure the internal systems are designed in a way that makes LLMs more efficient.

Man, that is such a great point. This idea that it's the front end, but it's not necessarily the backend. We're taking the input and doing something with it, maybe with some LLMs, and then you pass it to the backend, where you can potentially use some kind of rule-based system, as long as it has passed through that first LLM. And then when you're bringing it back, you are maybe passing it again through another LLM.
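
The "LLM on the front end, classical tools on the backend" shape described here can be sketched in a few lines. Everything below is a hypothetical stand-in: `understand_intent` and `generate_reply` are simple stubs where a real system would call a language model, and the knowledge-base entries are invented placeholder text, not real policies or instructions.

```python
from typing import Optional

# Invented placeholder content standing in for an internal knowledge base.
KNOWLEDGE_BASE = {
    "reset_router": "Hold the reset button for 10 seconds, then wait for the light to turn green.",
    "refund_policy": "Cancelled bookings are eligible for a refund within 30 days.",
}


def understand_intent(message: str) -> str:
    """Stand-in for the first LLM call: map free text to a known intent."""
    text = message.lower()
    if "router" in text or "internet" in text:
        return "reset_router"
    if "refund" in text or "cancel" in text:
        return "refund_policy"
    return "unknown"


def retrieve(intent: str) -> Optional[str]:
    """Classical, rule-based backend: a plain lookup with no model involved."""
    return KNOWLEDGE_BASE.get(intent)


def generate_reply(facts: Optional[str]) -> str:
    """Stand-in for the second LLM call: phrase the retrieved facts for the user."""
    if facts is None:
        return "I could not find an answer, so I am passing you to a human agent."
    return f"Here is what I found: {facts}"


def handle(message: str) -> str:
    """Front-end model in, backend lookup in the middle, front-end model out."""
    intent = understand_intent(message)
    facts = retrieve(intent)
    return generate_reply(facts)


print(handle("My internet is not working"))
print(handle("Am I eligible for a refund if my flight was canceled?"))
print(handle("What is the weather today?"))
```

The fallback to a human agent reflects the earlier point that exceptions and high-stakes requests still need higher-end customer service agents.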

Yep. Ah, yeah, I love that idea. So there are so many great things that I've learned from you in the last hour chatting. I've got one more question, if you will allow me. Sure. I know we're a little bit over on time. No worries. Hopefully somebody isn't waiting for you on the other end of this call.

The thing that I'm wondering about is, as a product leader, how do you recommend that machine learning engineers work with product and work with the teams to better position themselves for success? Yeah, great question. So I would look at that from two angles. One is the technical side, which is probably closer to them.

Really try to stay on top of things as much as you can. I mean, the head is spinning even for me. There is a ton of stuff happening every day with models, but there are a lot of good articles, and there's even a leaderboard right now. By the way, if you need it, I can send a link to the leaderboard, which compares leading proprietary and open source models and their performance, and they even show how they measure it.

I think understanding those aspects, and looking at where different models come out ahead and where they have limitations, will be very critical, because people will start understanding where the limitations are. Also understand a bit of the basics of how LLMs work, especially the concept of embeddings or vectorized databases, and how the models try to predict the next word, because that will give people an idea, when they start thinking about it, of how the data from internal knowledge bases or from internal systems potentially needs to be stored or prepared

to be ready to be consumed by LLMs in a more efficient manner. That's important. And the other angle, I think, is the product side: the more you can be a product and client professional, and try to understand, and ask a ton of questions, and even challenge the product managers about the use case, the better, because, as I said, currently there is a ton of hype about, oh, let's just connect,

let's just take some kind of input and call GPT-4 and give some response on that, and that's our application. That's not an application, that's a demo. Start thinking as a user: okay, here's my user. Is it a demo for 5% of my users, or is it actually applicable to many users?

I mean, I always encourage this, and maybe some of my product colleagues will not like it, because they sometimes want to pretend that they know everything, but nobody has a monopoly on good ideas. Good ideas could come from everyone, and bad ideas could come from everyone. So it's important to try to ask questions and understand the product or business use case as much as you can.

Those two things, I think, will make you much more professional. Barak, this has been fascinating. I really appreciate you sitting down and chatting with me about this. I learned a ton. And of course, if people are not following you on LinkedIn, I highly encourage it: you share all kinds of great wisdom and up-to-date information on everything that we just talked about today.

So I think we're gonna end it there. Thanks again, loved having you. It was a pleasure. Thanks a lot.

