MLOps Community

Graphs and Language // Louis Guitton // AI in Production Lightning Talk

Posted Feb 22, 2024 | Views 626
# KG
# LLMs
# Prompt Engineering
speakers
Louis Guitton
Freelance Solutions Architect @ guitton.co

Louis Guitton is the ex-VP of Engineering at OneFootball Labs in Berlin, Germany. He is experienced with Natural Language Processing, Recommender Systems, MVP Building, and the Flow blockchain. Louis teams up with clients to design and build transformative digital solutions, in particular in the Sustainability space.

Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.

SUMMARY

"It is possible to build KGs with LMs through prompt engineering. But are we boiling the ocean? Can we improve the quality of the generated graph elements by using - dare I say it - SLMs (small language models)?"

TRANSCRIPT

AI in Production

Graphs and Language

Slides: https://docs.google.com/presentation/d/1AcYOKc2IHrNdh38GnhQNqPA2AJvRAkHIv3FjDSooNIg/edit?usp=drive_link

Demetrios [00:00:05]: Now we've got Louis. Louis, this is the lightning chat, so we get to have you come on, present some slides and then come off. All right, we're going to.

Louis Guitton [00:00:16]: That sounds about right.

Demetrios [00:00:18]: Go very fast.

Louis Guitton [00:00:19]: Do you.

Demetrios [00:00:20]: I imagine you have some screen sharing to do. Are you going to do this all on?

Louis Guitton [00:00:25]: I'm sharing my screen right now. Are you actually seeing it?

Demetrios [00:00:30]: There he is. All right, man, I'll be back in ten minutes. Right, see you soon.

Louis Guitton [00:00:35]: See you. Thank you, Demetrios, thanks so much for having me, and welcome to this lightning talk on knowledge graphs and language models. I'm a freelance software solutions architect from Berlin, and my name is Louis Guitton. Most of my NLP and graphs work I did starting in 2018, applied to the sports media domain, and this was the pretrain, fine-tune era of NLP. So if any one of you remembers models like ELMo and ULMFiT, that was that. And we already had small models back then, so it's kind of fun to hear that we are trying to get to small models again. And so in this talk, KG will stand for knowledge graph and LM will stand for language model, which can be small or large. And although Scrabble is correct in saying that KG is better than LLM, I'm not here to dunk on LLMs.

Louis Guitton [00:01:27]: Right? Rather, my point is that a rising tide lifts all boats. That sentence applied to tech means that the rising tide of LLMs will lift all other areas, including KGs. So let's have a look at how LLMs and KGs can play together. So first, graphs can ground LLMs in facts, and that's called graph RAG. And we'll have a look at an example implementation with LlamaIndex later. Second, LLMs can help build knowledge graphs. And third, graphs provide a framework to tap into domain experts through weak supervision. And that's what we'll start with right now.

Louis Guitton [00:02:11]: So visualization unlocks expert semi-supervision, and that helps us humans understand black-box models. So on the left you have a paper from 2020 by Vincent Warmerdam that was using UMAP on embeddings to find some quality issues in them. On the right you have an out-of-the-box visualization from a graph database. And in both cases there's context with color metadata and with algorithms that give you visual meaning. And my next idea is around the debates on human versus AI. And you might have seen news headlines like ChatGPT passing the bar exam. And I would encourage you to have a look at this answer from the AI Snake Oil newsletter. It shows that models like ChatGPT memorize the solutions rather than really reason about them.

Louis Guitton [00:03:04]: And it explains how exams are actually quite a poor way to compare humans with machines. And instead, there's a whole new area of research around what's called reasoning and planning, to go beyond memorization. And I think graphs can help with that. Finally, the proof is in the pudding. And with this other idiom, I mean that the value of LLMs must be judged based on their results in production. So at the bottom there, you see an interesting project from two weeks ago by Google where they built a production system to fix bugs using LLMs, and they achieved superhuman bug fixing. And they built it with an LLM.

Louis Guitton [00:03:47]: Yes, but they also combined it with smaller, more specific AI models and, more importantly, with a double human filter on top. Right? So at Google, LLMs make Google developers faster at fixing bugs, but not obsolete, right? So that is all to show how graphs can work with domain experts. And now we're going to look at graph RAG itself. So I'm sure at the conference you're going to hear a lot about RAG. And on the right, you see an online discussion from two months ago where someone was asking, how do I train a custom LLM? And essentially the top answer and the entire discussion go like this: well, you don't train, you just use RAG, right? Everyone who says they're training is not really training. And I know there's DPO, and there are easier ways to fine-tune now, but essentially what people are doing is that they're using RAG. And like Dr. Waleed Kadous from Anyscale says on the left here, fine-tuning is for form and not for facts.

Louis Guitton [00:04:53]: And so people have turned to RAG to take care of facts. But there's something that's not said so much about RAG, and that's the big cost of vector RAG. In this tutorial from NebulaGraph, there are some points around the different types of RAG that there are, and I'll show you in a second a bit more detail. And the fact coverage, like the outcome of your system, is going to vary widely based on the architecture that you choose, and the cost will vary too. And if you use graph RAG, it will come with more concise answers and a cheaper cost, because it's using fewer tokens. So let's have a look at those different types of RAG. I tried to create these diagrams to illustrate a little bit the differences and similarities.

Louis Guitton [00:05:44]: So the similarities are that you are asking a question in natural language, and you are getting an answer in natural language. Every time, you're usually using an LLM, in purple, to generate an answer given the question and some context. And then they start to differ in the way that you retrieve the context. So for example, for vector RAG there is an embedding system that transforms the query into a vector, compares that vector to the vector database to retrieve some chunks that will form the context, and passes that to the LLM. Similarly, in graph RAG you have a system that extracts keywords from the query, and those keywords are passed on to the graph to extract triples, in the form of subject-predicate-object, that contain that keyword. And then you pass those triples to the LLM to generate an answer. And so today with, for example, LlamaIndex, in ten lines of Python you can code your graph RAG system. So you see me here using a local Ollama LLM, a local Neo4j knowledge graph, and then I instantiate the graph retriever, which extracts the keywords from the query. And then sure enough, you can ask questions to your system like "Tell me about Peter Quill from Guardians of the Galaxy" and the LLM will give you a blurb about Peter Quill.
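
The graph RAG retrieval flow described here can be sketched in plain Python. This is a toy illustration, not the LlamaIndex API: the in-memory triple list, the stopword-based keyword extractor, and the prompt assembly are all hypothetical stand-ins for the Neo4j graph, the retriever, and the LLM call a real system would use.

```python
# Toy sketch of graph RAG retrieval: keywords -> matching triples -> prompt.

STOPWORDS = {"tell", "me", "about", "from", "the", "of", "who", "is"}

# A tiny knowledge graph as (subject, predicate, object) triples.
TRIPLES = [
    ("Peter Quill", "is leader of", "Guardians of the Galaxy"),
    ("Peter Quill", "is played by", "Chris Pratt"),
    ("Guardians of the Galaxy", "is directed by", "James Gunn"),
]

def extract_keywords(query: str) -> set[str]:
    """Naive keyword extraction: lowercase, strip punctuation, drop stopwords."""
    return {w.strip("?.,").lower() for w in query.split()} - STOPWORDS

def retrieve_triples(query: str) -> list[tuple[str, str, str]]:
    """Return triples whose subject or object mentions a query keyword."""
    keywords = extract_keywords(query)
    return [
        t for t in TRIPLES
        if any(kw in (t[0] + " " + t[2]).lower() for kw in keywords)
    ]

def build_prompt(query: str) -> str:
    """Assemble the context + question prompt that would be sent to the LLM."""
    context = "\n".join(" ".join(t) for t in retrieve_triples(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("Tell me about Peter Quill"))
```

The key contrast with vector RAG is the retrieval step: instead of embedding the query and doing nearest-neighbor search over chunks, keywords select compact triples, which is why the context (and token bill) tends to be smaller.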

Louis Guitton [00:07:13]: So that's pretty awesome that it works in just ten minutes, but let's look a little bit more into the details next. So in order for this to work, you're going to need to first construct your knowledge graph. Just the same way that with vector RAG you need to build your vector database and embed your documents, for graph RAG you're going to need to build a knowledge graph. And for that you need to do triple extraction. And people are usually doing this in a couple of ways. Either they're using a fine-tuned model like REBEL, or they're using an LLM with a chain of prompts, prompts that say something like: extract for me a few triples in the form of subject, predicate, object. And the issue with those approaches is that usually the quality is pretty poor.
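
The regex-parsing step behind this prompt-based triple extraction might look roughly like the sketch below. The expected "(subject, predicate, object)" output format and the parser are assumptions for illustration, and they also show where the brittleness comes from: any entity containing a comma, or any deviation from the format, silently breaks the parse.

```python
import re

# Parse "(subject, predicate, object)" tuples out of free-form LLM output.
TRIPLE_RE = re.compile(r"\(\s*([^,]+?)\s*,\s*([^,]+?)\s*,\s*([^)]+?)\s*\)")

def parse_triples(llm_output: str) -> list[tuple[str, str, str]]:
    """Pull (subject, predicate, object) tuples out of raw LLM text."""
    return [m.groups() for m in TRIPLE_RE.finditer(llm_output)]

# Example of the kind of free-form answer an LLM might return:
raw = """Here are the triples I found:
(Peter Quill, leads, Guardians of the Galaxy)
(James Gunn, directed, Guardians of the Galaxy Vol. 3)
"""
print(parse_triples(raw))
```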

Louis Guitton [00:08:01]: That comes out, and there's a big lack of control in terms of what comes out. And there's also a chain of different prompt chains, one for getting the entities, one for disambiguating, et cetera, and then you're passing the answers with Reg X's and stuff. Nevertheless, it's very easy to do. So again, you use your LLM, you use your knowledge graph, you get the data, and then in just the ten lines, you're using the high level APIs from Lama index to create the knowledge graph. But what's the result of this? Here's the graph that you get for Guardians of the Galaxy, volume three. There are issues on the nodes, where Peter Quill and Quill are two different entities. Here, instead of being just one, there are issues on relationships where being part of the cast and playing a role are two different things, even though they're supposed to be the same thing. And there's also issues in terms of the hierarchy of information, where in Green here, where James Gunn is the director of Guardians of the Galaxy, whereas here, my LLM found that James Gunn could not imagine something that was mentioned in the article.

Louis Guitton [00:09:09]: So that is to be contrasted with the KG that a human would build in Wikipedia, where you have James Gunn, the director of the movies, Chris Pratt, the performer of Star Lord, and then the movies and Star Lord that both appear in the Marvel Cinematic universe. So where do we go from there? We need better ways to do kg construction, and we cannot rely only on llms. And I want to draw your attention to two open source libraries, one text graphs from Paco Nathan and one zshot from IBM. They both try to give you better entities, better relations, and to enable human in the loop systems by integrating with graph databases. And so that's all very exciting, I think. So, to recap, how do knowledge graphs fit with language models? First, we saw that you can build graph rag and knowledge graphs can help llms be grounded in facts. Second, we saw that llms can help knowledge graphs by helping for the kg construction with a few caveats. And third, we saw that graphs can help unlocking domain experts by providing a natural UX for human input and helping us build human in the loop systems.

Louis Guitton [00:10:32]: Thank you so much for your attention, and enjoy the rest of the conference.

Demetrios [00:10:38]: Wow, dude. All right. I like it. That was pretty solid, and it was perfect timing, I've just got to say. So people that have questions for Louis, hit him up in the chat. And there is a funny one that came through that I want to just mention right now from the chat: somebody is asking for merch. They were saying, I want Louis Guitton merch, not Louis Vuitton or whatever.

Louis Guitton [00:11:09]: Yeah, that's a classic one letter mistake. That's a rip us there.

Demetrios [00:11:15]: But that is it.

Louis Guitton [00:11:17]: Yeah, I'm happy to chat in the chat and see you. See you there. Enjoy the rest of the conference so much.
