MLOps Community

LLM-based Feature Extraction for Operational Optimization

Posted Jul 21, 2023 | Views 582
# LLM in Production
# LLM-based Feature Extraction
# Canva
SPEAKERS
Xin Liang
Senior Machine Learning Engineer @ Canva

Xin is a Senior Machine Learning Engineer at Canva within the Content & Discovery group. Xin has experience across multiple machine learning areas, including natural language processing, computer vision and MLOps. Driven by her deep passion for Artificial Intelligence, she leverages her engineering expertise and leadership skills to architect and deliver machine learning solutions.

Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.

SUMMARY

Large language models (LLMs) have revolutionized AI, breaking down barriers to entry for cutting-edge AI applications, ranging from sophisticated chatbots to content creation engines.

TRANSCRIPT

All right, so we have come to our last talk of the day. It's not over yet, but I am getting a little emotional already. And where is, oh, there you are. You're already in tomorrow, huh? Yes, I'm in. Excuse me. I'm in the future. You, you are. It's already tomorrow where you are. I can't believe it. And, uh, anything interesting happened in the future yet?

Um, it's a nice day in Melbourne. I'm based in Australia, so that's why I'm in the future, so, um, good things will happen. That's what I'm talking about. That is so cool. And so I'm gonna let you take over and bring us home. Um, this is very cool to have you here and thank you for waking up early, cuz I know it is a Saturday where you are and you're waking up early on a Saturday to do this.

So that shows some serious commitment and I appreciate it and I'll let you, I see your screen is shared here, so I'll get off the stage and let you jump right into it. Thank you. Thank you very much. And hello everyone. Good morning, good afternoon, or good evening, depending on where you are in the world.

I'm Xin from Canva, and today I'm gonna talk about using LLMs as a feature extraction method for operational optimization. A bit about myself first: I'm a machine learning engineer at Canva in the Content and Discovery area, and I have been playing around with natural language processing, computer vision, and MLOps for a while.

Now, for those who are not familiar with Canva just yet, it is an online design platform aiming for everyone to design everything. And it's how these slides are made. So, LLMs, or large language models, which we have been talking about over the last two days, are the hottest thing in town right now. Lots of hype and lots of discussion are centered around them and what we can do with them.

LLMs have revolutionized AI, breaking down entry barriers to lots of cutting-edge AI applications. And we have seen many applications built solely on top of LLMs: copywriting software, chatbots, coding assistants, pair-programming buddies, tooling to understand lengthy legal documents. The list can keep going forever.

That is, you give the LLM an input and it generates the desired output. However, another way to utilize LLMs, which is what I'll talk about, is how to embed LLMs into existing technologies, whether to improve existing ML/AI technologies or to introduce ML/AI capabilities into existing technologies.

So simply put, LLMs can be used as a middle layer bridging upstream and downstream tasks. This is what we are focusing on today: LLM-based feature extraction for downstream tasks. But why use LLMs as a middle layer rather than, you know, building applications on top of them solely? Because we found that it can provide solutions with higher performance and accuracy, at a greater velocity, and at reduced cost.

Sounds too good to be true? I'll use two case studies to illustrate. So one of the use cases for LLMs in Canva is information categorization. We have a variety of content forming our information architecture, from design templates and media assets to informational articles, and we would like to organize our content and information in the way that is most suitable to our information architecture.

This categorization can be applied to various areas of Canva content and information, including user search queries, to categorize them and understand users' interest in our content. Another area is our content pages, for internal retrieval.

Let's take a look at our first case study: user search query categorization. Internally, we would like to understand what users' interests are in our content by aggregating search queries and then grouping them into different categories. Our content has a tree structure like this, and when we categorize user search queries, there's a series of steps to funnel these queries into different branches of our content structure.

A categorization model is developed for each step, or node, of this tree structure to funnel the search queries into different categories. For example, an intent classification is needed at the first layer to classify aggregated queries into different intents.

Intent in this context is about what type of content users are searching for in our architecture. For example, some user search queries are about templates, some about features, and so on. So even though it was a straightforward text classification problem with a few classes, it still needed to go through the full model development cycle to develop this classifier via the more traditional ML route.

So when LLMs came out and our intent classes increased, we thought it would be worth evaluating LLMs for this task in comparison to our existing classifier. Here is what we need to do to capture text intent in these two approaches, respectively. The classifier requires the full development workflow: collect a large amount of labeled data, tens of thousands of data points; set up the training infrastructure to train the text intent classifier; and set up the deployment infrastructure to deploy the trained classifier.

Then we can extract the text intent feature at inference time. With an LLM API as the text intent classifier, we only need to collect a handful of examples, under 10, and then design the prompt structure to output the predefined classes. At inference time, we use few-shot learning with the few examples we listed previously to extract the text intent feature.

So this process is much more simplified: no training or deployment infra setup, and a much smaller annotated dataset. In terms of development timeline, these are the timelines for the two approaches in comparison. It took about four weeks to develop the single-purpose classifier, including data collection and exploration, intent classifier training, inference module development, and then deploying the classifier.
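
For illustration, here is a minimal sketch of what this few-shot setup could look like, assuming the OpenAI chat completions API; the talk does not name the model or provider used for this classifier, and the intent labels and example queries below are hypothetical:

```python
# A sketch only: the intent labels, examples, and model are illustrative,
# not the ones used at Canva.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

INTENTS = ["template", "feature", "other"]  # predefined classes

FEW_SHOT = [  # a handful of labeled examples, well under 10
    ("birthday card", "template"),
    ("remove background", "feature"),
    ("canva pricing", "other"),
]

def classify_intent(query: str) -> str:
    examples = "\n".join(f"Query: {q}\nIntent: {i}" for q, i in FEW_SHOT)
    prompt = (
        f"Classify the search query into one of: {', '.join(INTENTS)}. "
        f"Answer with the label only.\n\n{examples}\n\nQuery: {query}\nIntent:"
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # suppress randomness for a classification task
    )
    label = resp.choices[0].message.content.strip().lower()
    return label if label in INTENTS else "other"  # guard against off-label output
```

The key design choice is constraining the completion to a fixed label set so the output can be parsed deterministically downstream, which the takeaways below come back to.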

Well, it took about a week to achieve the same task using LLM APIs, which significantly shortened the development process and timeline, especially for straightforward tasks on text data, such as the intent classification in our use case. And this is the overview of the operational effort for both approaches.

For the single-purpose intent classifier, the end-to-end development time was four weeks, versus one week for the LLM APIs. In terms of operation cost, the single-purpose intent classifier costs about a hundred dollars per month, while the LLM APIs cost roughly less than $5 per month in our case study. The operation cost is based on the available public pricing, and the operation itself is a scheduled job running weekly over thousands of inputs.

From an accuracy perspective, the LLM APIs score higher than the single-purpose intent classifier in our case study, with only few-shot learning, no fine-tuning involved. And here are some takeaways from this use case of us evaluating LLMs. When to use LLM APIs:

They're effective on straightforward tasks on text data, and they help with prototyping and speed up the solution launch of the first iteration to gather more production information. They can be cost effective when the scale meets the cost-saving advantages, which it does in our case studies. For prompt design, few-shot learning can be sufficient to capture custom logic on the text tasks, instead of any fine-tuning being required.

It's also important to consider standardizing the completion format, whether that's a JSON format or a prompt structured to allow a defined set of answers and output only the answer.

In terms of error mitigation: LLM APIs do have rate limits, so to handle hitting the rate limits, rate-limit handling is needed, with a retry and backoff mechanism, and it's also good to have a fallback solution when possible to mitigate the negative impact of API downtime. In terms of fine-tuning: as I mentioned, few-shot learning is good for a rapid solution release, which in our case was the first iteration.
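
As a sketch of that error-mitigation pattern (the retry parameters and the fallback below are illustrative, not Canva's actual mechanism):

```python
# Retry-with-backoff around an LLM API call, plus a fallback for downtime.
# fallback_classifier is a hypothetical stand-in for whatever cheaper or
# offline solution is available.
import random
import time

from openai import APIError, RateLimitError

def classify_with_retries(query: str, max_retries: int = 5) -> str:
    delay = 1.0
    for _ in range(max_retries):
        try:
            return classify_intent(query)  # from the earlier sketch
        except RateLimitError:
            # Exponential backoff with jitter when we hit the rate limit.
            time.sleep(delay + random.random())
            delay *= 2
        except APIError:
            break  # API downtime: stop retrying and fall back
    return fallback_classifier(query)

def fallback_classifier(query: str) -> str:
    # Placeholder fallback, e.g. a cached prediction or keyword heuristic.
    return "other"
```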

However, when the data and/or the custom logic scale up, fine-tuning might be needed to maintain both performance and cost effectiveness. And we found that small training datasets are sufficient for fine-tuning, roughly 50 to 100 data points per class in our use case, which gives acceptable performance.
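
For reference, a fine-tuning dataset at that scale is small enough to assemble by hand. A minimal sketch of preparing it in the JSONL chat format accepted by the OpenAI fine-tuning endpoint (the labeled examples are hypothetical):

```python
# Writes the labeled examples (~50-100 per class) to a JSONL file
# in the chat format used for fine-tuning.
import json

labeled = [
    ("birthday card", "template"),
    ("remove background", "feature"),
    # ... roughly 50-100 examples per class
]

with open("intent_finetune.jsonl", "w") as f:
    for query, intent in labeled:
        record = {
            "messages": [
                {"role": "system", "content": "Classify the search query intent."},
                {"role": "user", "content": query},
                {"role": "assistant", "content": intent},
            ]
        }
        f.write(json.dumps(record) + "\n")
```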

That was our first case study, illustrating the operational efficiency and optimization with LLM APIs in production. Our second case study is on content page categorization. We have content pages of various characteristics, from pages with collections of similar templates with short-form metadata texts, to informational articles with long-form texts.

We would like to group our content pages together based on their relevance and proximity in our information architecture. So essentially, we would like to group our pages into different topic clusters based on semantic similarity. Due to the vast differences in text length, content information, and metadata among these pages, various text feature extractions are necessary using pre-LLM natural language techniques, such as keyword extraction with a keyword-extraction-specific Python library, or text summarization with existing model architectures or libraries.

We then adopt key point, or dot point, extraction to further distill the key information from the different variations of the pages into similar text forms, and then we can convert the text of similar forms into the embedding space before categorizing and grouping relevant pages together using these features.

This feature extraction requires different methods, frameworks, and libraries, as I mentioned, which are pretty scattered, rather than one go-to method that does everything for us. So when LLMs came out and the variations of our content pages increased, we thought it would be worth evaluating them for this task to see if they could simplify the feature extraction step all at once.
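
To make the "scattered methods" point concrete, here is a minimal sketch of that kind of pre-LLM pipeline; KeyBERT and the Hugging Face summarization pipeline are illustrative stand-ins, not necessarily the libraries used at Canva:

```python
# One library per text form: keywords for short metadata pages,
# summarization for long-form articles.
from keybert import KeyBERT
from transformers import pipeline

kw_model = KeyBERT()
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def extract_text_feature(page_text: str) -> str:
    if len(page_text.split()) < 50:
        # Short-form metadata: distill into keywords.
        keywords = kw_model.extract_keywords(page_text, top_n=5)
        return ", ".join(kw for kw, _ in keywords)
    # Long-form articles: summarize down to a similar text form.
    summary = summarizer(page_text, max_length=60, min_length=15)
    return summary[0]["summary_text"]
```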

So we experimented with different feature extraction methods in combination with open-source embeddings, such as sentence-transformers or BERT embeddings, or the LLM embeddings. And these are the metrics for our content page categorization using the different text feature extraction methods. We defined three metrics to evaluate performance on this task:

Balance indicates how evenly the pages are grouped across all the pages in scope, rather than a concentrated set of pages in a few groups. Completion is the percentage of the pages in scope that end up in relevant topics or clusters, instead of being left out as outliers. And coherence measures whether the pages grouped together are coherent and sensible, rather than non-related pages being grouped together.

So we experimented with all these different methods and combinations of embeddings, and it turns out that LLM embeddings on plain page text, without any text feature transformation, give the best outcome.

They achieve the most balanced grouping, and they group 89% of the pages into relevant categories, which is the highest among all the methods. You can see here that they also achieve the highest coherence score among the feature extraction methods. And this is the overview of the operational effort for both approaches.

For the single-purpose methods plus open-source embeddings, the end-to-end development time is about two weeks, and that's only for the feature extraction step, while that of the LLM embeddings is about three to five days. The operation cost for the previous methods is about three dollars per month, which is essentially the training cost as part of the training steps, and that of the LLM embeddings is about one third of that.

And as we discussed on the previous slides, the LLM embeddings achieve the highest scores on all the metrics defined for this task.
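
A minimal sketch of the winning approach, LLM embeddings on plain page text followed by clustering: the embedding model here matches the one named later in the Q&A, but the clustering algorithm (HDBSCAN) is an assumption, as the talk doesn't specify one.

```python
# Embed plain page text, cluster the vectors, and read off the
# "completion" metric from the outlier labels.
import hdbscan
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed_pages(page_texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(
        model="text-embedding-ada-002",  # model named in the Q&A below
        input=page_texts,
    )
    return np.array([item.embedding for item in resp.data])

def cluster_pages(page_texts: list[str]) -> np.ndarray:
    vectors = embed_pages(page_texts)
    # HDBSCAN labels outliers as -1, which feeds the "completion" metric:
    # the fraction of pages not left out as outliers.
    return hdbscan.HDBSCAN(min_cluster_size=5).fit_predict(vectors)

# Example: labels = cluster_pages(texts); completion = (labels != -1).mean()
```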

In this case study, we also have a few takeaway learnings on using LLMs for feature extraction. In terms of feature variations, since it's one foundational model performing various text feature transformations and extractions, such as keyword extraction and text summarization, it significantly simplifies the development process.

On the other hand, the text feature extraction can be non-deterministic, depending on your configuration and settings, meaning the output of the LLMs can be slightly different each time, even when you put the same input in. Therefore, its suitability depends on the use case; in our use case, this is not a problem.

In terms of embeddings, we found that LLM embeddings appear to represent the text input better than other available text embeddings, and the format and length of the text input don't seem to affect the semantic understanding much in the LLM embedding space. In the future, we'll also look at open-source LLMs, which can be utilized for text embeddings and extraction tasks when suited, potentially lowering the cost further in production.
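
As a sketch of that open-source direction (the model choice here is illustrative, not a recommendation from the talk):

```python
# Self-hosted text embeddings with sentence-transformers: no per-call API
# cost, only the cost of running the model on your own infrastructure.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

def embed_pages_open_source(page_texts: list[str]):
    # One vector per page, L2-normalized for cosine-similarity clustering.
    return model.encode(page_texts, normalize_embeddings=True)
```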

So these are the two case studies where we evaluated LLMs as a feature extraction method to better understand their operational performance. We concluded that LLMs outperform other methods on the natural language tasks and text feature transformation and extraction tasks in our case studies, with minimal processing and logic required.

The utilization of LLM APIs in our use cases also simplifies the development process, reduces the development effort, and significantly shortens the timeline. The cost of LLM APIs is model dependent; however, in our use cases they appear to have lower operational costs for both model training and inference.

With the above, we found that LLM-based feature extraction can be operationally optimal, especially for rapid solution productionization. I hope these two case studies illustrate this point well enough for you to give it a go and see whether it actually helps your development process.

And that's it. Thank you, and I hope you enjoyed my talk. Oh, so good. What a closing. You did not disappoint, and there are some awesome questions coming through here in the chat, which is how you know it's good, because we haven't even finished and I didn't even have to ask for questions and they're already coming through.

Excellent. There's quality here. There's an awesome question about which API was used when you were doing these LLM embeddings. Yeah, in terms of the LLM embeddings, we used the OpenAI API with the text-embedding-ada-002 model, which appears to be the most performant and the most cost effective.

Nice. Nice. So what do you think about use cases where processing millions of samples daily is a requirement? Any thoughts on how you would go about that kind of workload? Yeah. So I guess that's millions of inputs at scale, and in our case, there are two ways to go about this.

In our case, when we do the user search query categorization, what we did to reduce the actual volume of the data is aggregation, where possible. Our search queries are also at a magnitude of millions per day.

However, we are able to do the aggregation and some pre-processing, a bit of heavy-lifting work, to group and aggregate the queries together to reduce the volume, and then use LLMs for the downstream tasks. If it's possible to reduce the volume, and the LLM still provides you a simplified method to do the processing, that's one way to go.

On the other hand, if that's not a viable approach, then I would probably look into, as I mentioned in the learnings from the second use case, the open-source LLMs, and see how to utilize them within your infrastructure. Hopefully you can set up your open-source LLM so that you have an in-house API that provides you with the LLM embeddings, which can then be utilized within your organization. That way, with a little bit more upfront setup, hopefully down the track it pays off without the API costs of the external APIs.
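
A minimal sketch of the aggregation idea from the first part of this answer: deduplicate and count raw queries so the LLM sees each distinct query once, then weight the labels by volume. classify_with_retries is the hypothetical helper from the earlier sketch.

```python
# Millions of raw queries typically collapse to far fewer distinct ones,
# so aggregate first and fan the labels back out over the counts.
from collections import Counter

def categorize_at_scale(raw_queries: list[str]) -> dict[str, int]:
    counts = Counter(q.strip().lower() for q in raw_queries)
    intent_volume: Counter[str] = Counter()
    for query, count in counts.items():
        intent = classify_with_retries(query)  # one LLM call per distinct query
        intent_volume[intent] += count  # weight the label by query volume
    return dict(intent_volume)
```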

Nice. So did that answer the question? Yeah, there are some awesome questions coming through here, and that definitely answered the question. I'm wondering about, all right, so Rit is asking: did you also try Cohere embeddings? Was there multilingual content in your test? Yeah, so our focus is on English only.

That's the scope of the use case. I did try it with different languages, though probably not thoroughly with the embeddings. When I tried the multiple languages, it was essentially only operating in the text space, so that I could input text and get the output in the relevant languages.

Then I used the open-source embeddings to convert the text in other languages into embeddings. So I did explore that as initial experiments, but we found that LLM embeddings on plain text are the way to go, and our scope is more on the English side.

So I did not thoroughly experiment with multilingual content in terms of the embeddings, but for the text, more like translation or localization in the text space only, I think that still works okay in my exploration. So I don't think I told you, but all of the material and all of the creative that we have for this conference, and specifically your virtual background, were all made with Canva.

So beautiful. Yeah. I have to say that right now. Yeah, it's super awesome. We love it. And now I've got a hard question for you coming through in the chat, and you may or may not want to answer this, depending on how confident you're feeling with the PR team. So Canva had a major data breach a few years back.

Has that influenced the way that you work or your current direction in any way? Well, I believe the breach you mentioned was a few years back, probably more than four or five years now. And we definitely experienced different things and then learned from it.

Especially from a security perspective, we definitely learned a lot from it. Since then, we have definitely strengthened, from an organizational and also a technical perspective, the area around security.

And I believe there's been no breach since then. So that was a learning experience for us; we learned a lot from it, and I think we handled it and have done well since then, for sure. Awesome. So this has been super fun, and I thank you so much for coming on here and talking to us and

just sharing what you've been up to at Canva. As I mentioned, we're huge fans. The small team that we have, we love using Canva for everything design, and it's almost gotten to the point where I don't even know why I'm paying for Adobe anymore. Don't quote me on that. Canva, way to go. Every month I'm like, do I need my Photoshop subscription?

Do I need my Adobe Cloud subscription? And yeah, I think there are so many pluses. It's just so easy to use Canva, and that's a testament to what you all are doing and how you are making it intuitive. And I can tell by the way that you're thinking about how to use LLMs, you're really thinking about how you can make the AI aspect of Canva intuitive as well.

Definitely. And I'm glad that Canva is being adopted by the community, and that it's easy to use, for everyone to design everything. I'm glad that our mission is on the right track.

Yes. Yes. All right. Well, awesome. Thank you for coming. Thank you for presenting. This has been amazing. You have a beautiful new day to begin, and I have a beautiful night to go to sleep. Thank you for having me. It's been fun. It's been great. Thank you. Likewise. I'll see you later. And with that, I'm gonna close it out.
