
The Intersection of Graphs and Large Language Models

Posted Mar 04, 2024 | Views 627

# Graphs

# Large Language Models

# Fribl


speakers

Anthony Alcaraz

Chief AI Officer @ Fribl

Chief Product Officer at Fribl, an AI-powered recruitment platform committed to pioneering fair and ethical hiring practices.

With recruiting topping the list of concerns for many CEOs and companies spending upwards of $4,000 on average per hire, the need for innovation is clear.

All too often, the arduous recruiting process leaves both employers and applicants frustrated after 42 days of effort with uncertainty if the right match was made...

At Fribl, we are leading the charge to transform this status quo, starting with reinventing the screening process using our proprietary GenAI technology enhanced by symbolic AI.


Adam Becker

IRL @ MLOps Community

I'm a tech entrepreneur and I spent the last decade founding companies that drive societal change.

I am now building Deep Matter, a startup still in stealth mode...

I was most recently building Telepath, the world's most developer-friendly machine learning platform. Throughout my previous projects, I had learned that building machine learning powered applications is hard - especially hard when you don't have a background in data science. I believe that this is choking innovation, especially in industries that can't support large data teams.

For example, I previously co-founded Call Time AI, where we used Artificial Intelligence to assemble and study the largest database of political contributions. The company powered progressive campaigns from school board to the Presidency. As of October, 2020, we helped Democrats raise tens of millions of dollars. In April of 2021, we sold Call Time to Political Data Inc.. Our success, in large part, is due to our ability to productionize machine learning.

I believe that knowledge is unbounded, and that everything that is not forbidden by laws of nature is achievable, given the right knowledge. This holds immense promise for the future of intelligence and therefore for the future of well-being. I believe that the process of mining knowledge should be done honestly and responsibly, and that wielding it should be done with care. I co-founded Telepath to give more tools to more people to access more knowledge.

I'm fascinated by the relationship between technology, science and history. I graduated from UC Berkeley with degrees in Astrophysics and Classics and have published several papers on those topics. I was previously a researcher at the Getty Villa where I wrote about Ancient Greek math and at the Weizmann Institute, where I researched supernovae.

I currently live in New York City. I enjoy advising startups, thinking about how they can make for an excellent vehicle for addressing the Israeli-Palestinian conflict, and hearing from random folks who stumble on my LinkedIn profile. Reach out, friend!


SUMMARY

The intersection of graphs and Large Language Models (LLMs). I intend to explore the benefits of combining graphs with LLMs, delving into the engineering aspects while also touching on the practical applications from my startup's perspective. This talk will highlight my recent work and findings on the superiority of Retrieval Augmented Generation (RAG) Knowledge Graphs over traditional RAG with vector databases, underlining the profound implications of their interaction.


TRANSCRIPT


AI in Production

Slides: https://docs.google.com/presentation/d/1XUvupWUabiLLVFvIqavFSUHUnCtTJQsE/edit?usp=drive_link&ouid=112799246631496397138&rtpof=true&sd=true

Adam Becker [00:00:04]: And right now we have coming up Anthony, and I believe Anthony is going to be talking to us about knowledge graphs and LLMs. Is that right?

Anthony Alcaraz [00:00:14]: Yes.

Adam Becker [00:00:15]: Okay. And I feel like that's all I've been hearing about in the last few weeks. Maybe it's just the circles that I orbited, I'm not entirely sure, but. Okay.

Anthony Alcaraz [00:00:24]: So I am Chief AI Officer at Fribl. We are a startup, and we are automating human resources reasoning.

Speaker C [00:00:32]: Okay.

Anthony Alcaraz [00:00:32]: I am more interested in the usage of large language models for reasoning, parsing documents, automating backstage processing. Not the chatbot side, but what I will present today also applies to chatbots. Okay, so I believe the RAG system, retrieval-augmented generation, has become, I think, a new system design paradigm that every company in the world will want to implement to leverage large language models and extract value for the business. And a recent paper has shown that RAG is better than fine-tuning for integrating new knowledge. Okay, I will put it in my PowerPoint with the link.

Speaker C [00:01:22]: Okay.

Anthony Alcaraz [00:01:22]: And I believe that to build a good RAG system, you need to think about it as a data flywheel.

Speaker C [00:01:31]: Okay.

Anthony Alcaraz [00:01:32]: You want it to improve over time. That's the main goal. Okay. You want some kind of automation of the improvement over time of your RAG system and of the knowledge that is behind it.

Speaker C [00:01:50]: Okay.

Anthony Alcaraz [00:01:50]: And I believe that knowledge graphs are the ideal food for a RAG system, better than a vector database. I will prove that.

Speaker C [00:02:01]: Okay.

Anthony Alcaraz [00:02:03]: And also I think that over time the models, the large language models, will become better. We may even have in the future, as LeCun proposed, planning models.

Speaker C [00:02:16]: Okay.

Anthony Alcaraz [00:02:16]: Models that can plan.

Adam Becker [00:02:19]: Anthony, can I interrupt? Are you sure you're sharing the right screen? People are asking in the chat, we are seeing, what are we seeing? It says, good evening. I'm not sure it's the right, yeah.

Anthony Alcaraz [00:02:36]: Okay, do you see it right now?

Adam Becker [00:02:39]: Let's see. Now we are seeing, yes, now we're seeing.

Anthony Alcaraz [00:02:44]: Sorry for that.

Speaker C [00:02:45]: Okay.

Anthony Alcaraz [00:02:46]: This was the first PowerPoint slide that I was presenting. So: the data flywheel, and the RAG system as a design paradigm.

Speaker C [00:02:52]: Okay.

Anthony Alcaraz [00:02:53]: I think RAG is also important because the key differentiator of your system in the future will be the data.

Speaker C [00:03:00]: Okay.

Anthony Alcaraz [00:03:00]: And you want to give the model good data. And I think that the best food that you can give to your model is structured data and relational data, which is what a knowledge graph represents.

Speaker C [00:03:14]: Okay.

Anthony Alcaraz [00:03:19]: So how do you achieve this? First, I think that fine-tuning is needed, or might be used, to increase the reasoning capabilities of your models.

Speaker C [00:03:31]: Okay.

Anthony Alcaraz [00:03:32]: That is all, I think, on fine-tuning. You need to think about the knowledge graph as a set of tools to reason about your data.

Speaker C [00:03:43]: Okay.

Anthony Alcaraz [00:03:43]: You can use Cypher queries, you can use vectorization, you can use graph algorithms. It's very rich, okay. And you need to think of the interaction between your LLMs and the knowledge graph as an agentic RAG.
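To make the "set of tools" idea concrete, here is a minimal sketch in plain Python, assuming a toy in-memory graph of (subject, relation, object) triples. The data, the `match` helper (a stand-in for a pattern query such as Cypher would express), and the `degree` helper (a stand-in for a graph algorithm) are all hypothetical and are not taken from the talk.

```python
Triple = tuple[str, str, str]

# Toy knowledge graph; every entity and relation here is made up for illustration.
TRIPLES: list[Triple] = [
    ("alice", "HAS_SKILL", "python"),
    ("alice", "WORKED_AT", "acme"),
    ("bob", "HAS_SKILL", "python"),
    ("bob", "HAS_SKILL", "sql"),
    ("acme", "IN_INDUSTRY", "retail"),
]


def match(triples, subject=None, relation=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, r, o)
        for (s, r, o) in triples
        if (subject is None or s == subject)
        and (relation is None or r == relation)
        and (obj is None or o == obj)
    ]


def degree(triples, node):
    """Count the edges touching a node, a very simple graph-algorithm signal."""
    return sum(1 for (s, _, o) in triples if node in (s, o))


# Tool 1: a pattern query ("who has the skill python?").
people_with_python = [s for (s, _, _) in match(TRIPLES, relation="HAS_SKILL", obj="python")]
print(people_with_python)

# Tool 2: a graph measure an agent could use to decide what to explore next.
print(degree(TRIPLES, "bob"))
```

In an agentic setup, functions like these would be exposed to the model as callable tools, and the model would decide which one to call for a given question.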

Speaker C [00:03:57]: Okay.

Anthony Alcaraz [00:03:58]: You will need agents that reason about what they retrieve and what they need within your knowledge graph. Okay, this is a theoretical approach. So this is the Venn diagram that I think about: you have the LLM, the knowledge graph allows you to get some kind of reasoning, and you have knowledge. I think knowledge graphs are the representation of the combination of knowledge and reasoning, thanks to graph algorithms, for example. Okay, so now, knowledge graph versus vector search. I think there is no question. Recent studies, Polar et al. from MIT, have shown that accuracy is much better for knowledge graph retrieval. Yesterday Microsoft released a paper, a piece of research about GraphRAG. They have shown that RAG with graphs is much better.

Speaker C [00:05:02]: Okay.

Anthony Alcaraz [00:05:03]: For example, in terms of diversity, you get 1.8 more diverse viewpoints per answer. The answers were more comprehensive. Okay. They compared both at the same time. Okay, so I won't go long over this. Okay, now how to think about the interaction between the three.

Adam Becker [00:05:33]: Sorry to bother you one more time. Which slide are we supposed to be on right now? We're seeing "RAG systems, a new system design paradigm." Are you sure that it's moving along? Now we're seeing it. Now you're on "LLMs as reasoning engines for business."

Anthony Alcaraz [00:05:49]: Okay, perfect. I will do that from my. Okay, I think I have a problem. So this was a slide for knowledge graph versus vector search, and this one for vector search. Okay, now how to think about the intersection between graphs and LLMs. Here I think there are three intersections. A graph can be used as a context provider for your LLM. It will allow you to add some kind of reasoning thanks to graph algorithm output.

Anthony Alcaraz [00:06:25]: For example, in the Microsoft paper, they use a clustering algorithm to augment the context from the knowledge graph. This is one example of a graph as context provider. But one interesting thing that I found is that a graph can also be thought of as a reasoning topology, as a way for the LLM to reason about what it will be doing. It's a way to think about prompt engineering. I have many examples. You have the Graph of Thoughts paper. You have a recent paper that uses a Monte Carlo algorithm, also to leverage the reasoning of the model. And the third intersection, which is really, really interesting, is that you can use transformers and LLMs as graph algorithms.
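As a rough illustration of the first intersection, graph as context provider, the sketch below groups a toy graph into clusters and turns each cluster into a line of prompt context. It uses connected components as a deliberately simple stand-in for a clustering algorithm; it is not the method of the Microsoft paper, and the edge data and the `build_context` prompt layout are assumptions made for this example.

```python
from collections import defaultdict, deque

# Hypothetical undirected edges between people and skills.
EDGES = [
    ("alice", "python"), ("bob", "python"), ("bob", "sql"),
    ("carol", "design"), ("carol", "figma"),
]


def connected_components(edges):
    """Group nodes into clusters with a breadth-first search (a simple clustering stand-in)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        queue, members = deque([start]), []
        seen.add(start)
        while queue:
            node = queue.popleft()
            members.append(node)
            for neighbor in sorted(adjacency[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(members))
    return components


def build_context(edges, question):
    """Prepend one line per cluster to the question, ready to send to an LLM."""
    clusters = connected_components(edges)
    lines = [f"Cluster {i}: " + ", ".join(c) for i, c in enumerate(clusters, start=1)]
    return "Graph context:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


print(build_context(EDGES, "Which people share skills?"))
```

A real pipeline would typically summarize each cluster with the model rather than list its members, but the flow is the same: run a graph algorithm, then feed its output into the prompt.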

Anthony Alcaraz [00:07:17]: Many papers are performing graph tasks thanks to transformers and large language models. Okay, I won't go into detail on each, but one paper in particular is very interesting. The model is called RATP, retrieval-augmented thought process. It takes the idea of a graph of thoughts, but each thought retrieves from some knowledge base. In my case it would be a knowledge graph. And there is a scoring of the answer by the LLM, a scoring that we could improve over time thanks to recent papers like Self-Discover.

Speaker C [00:08:14]: Okay.

Anthony Alcaraz [00:08:14]: And I think that you need to think of the reasoning of the model on the knowledge graph as some kind of graph. And here we can combine three ways that I am currently testing at Fribl: this paper about graph retrieval and graph reasoning; LLMCompiler, which does parallel querying of some knowledge source and planning; and LangGraph. LangGraph is a LangChain paradigm that introduces the concept of a cyclical graph. The model, if it fails at point B, can go back to point A for a different option. Okay, this is, in theory, what has been released in the last weeks. Okay, so this was the first part. The second part is using LLMs for graph tasks.
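The cyclical idea described here (fail at point B, go back to point A for a different option) can be sketched as a tiny state machine. This is plain Python and does not use the LangGraph API; the node functions, the `state` layout, and the option names are hypothetical.

```python
def node_a(state):
    """Point A: pick the next untried option, or stop if none are left."""
    if state["remaining"]:
        state["current"] = state["remaining"].pop(0)
        return "B"
    return "END"


def node_b(state):
    """Point B: try the chosen option; on failure, follow the back edge to A."""
    answer = state["lookup"].get(state["current"])
    if answer is None:
        return "A"  # the cycle: return to A for a different option
    state["answer"] = answer
    return "END"


NODES = {"A": node_a, "B": node_b}


def run(state, start="A", max_steps=20):
    """Walk the graph until END, with a step limit so a cycle cannot loop forever."""
    node, path = start, []
    for _ in range(max_steps):
        if node == "END":
            break
        path.append(node)
        node = NODES[node](state)
    return state.get("answer"), path


# The first option fails at B, so control loops back to A and tries the second one.
state = {
    "remaining": ["vector_search", "cypher_query"],
    "lookup": {"cypher_query": "found via graph query"},
}
print(run(state))
```

The step limit matters in practice: once a graph allows cycles, something has to bound how many times the model may go around.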

Speaker C [00:09:18]: Okay.

Anthony Alcaraz [00:09:19]: Here it's very simple. You can do many things.

Speaker C [00:09:23]: Okay.

Anthony Alcaraz [00:09:25]: You can look for centrality within your graph. You can look for the most influential part of your graph, et cetera, et cetera. There are papers that show that there is a great synergy between graph neural networks and large language models. I won't go into detail, but I have put the link to the survey here. It is very interesting because we can imagine, and this is what I am testing right now, augmenting the context of the LLM with another LLM that has been trained on graph tasks, as an alternative to a graph algorithm. Because there are many advantages to using transformers for graph tasks. Okay, I won't go into detail. One thing that I also won't go into in detail is that many people will say, okay, knowledge graphs are a huge investment. But it's good to know that LLMs are useful for building knowledge graphs.
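For reference, this is the kind of classical graph task being discussed. The sketch below computes PageRank by power iteration in plain Python as one common way to find the "most influential" node; it shows the conventional baseline that a graph-trained model would be an alternative to, not the LLM approach itself. The edge list and parameter values are assumptions for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of directed (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every node receives the same small "teleport" share each round.
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # A node with no outgoing links spreads its rank evenly over all nodes.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical directed graph: three nodes point at "hub", which points back at "a".
EDGES = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("hub", "a")]
ranks = pagerank(EDGES)
most_influential = max(ranks, key=ranks.get)
print(most_influential)
```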

Anthony Alcaraz [00:10:27]: And there are many, many papers. You can do this in numerous dimensions. For example, you can take your document and transform it directly into a knowledge graph. You can do domain information extraction, you can do graph data model generation from a CV. You can fine-tune your model on an ontology and build some domain information extraction, et cetera, et cetera. You can even use an agent system called AutoKG to build your knowledge graph.
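One small piece of the document-to-knowledge-graph step can be sketched without calling any model: turning the model's text output into triples. The sketch assumes you have prompted the model to emit one `subject | relation | object` triple per line; that output format, the `parse_triples` helper, and the sample text are assumptions for this example, not something described in the talk.

```python
def parse_triples(llm_output: str) -> list[tuple[str, str, str]]:
    """Keep only lines shaped like 'subject | relation | object'; skip everything else."""
    triples = []
    for line in llm_output.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append((parts[0], parts[1], parts[2]))
    return triples


# Hypothetical model output, including chatter and a malformed line to be ignored.
SAMPLE_OUTPUT = """alice | HAS_SKILL | python
Here is some chatter the model added.
acme | IN_INDUSTRY | retail
broken | line"""

print(parse_triples(SAMPLE_OUTPUT))
```

Validating the parsed relations against an ontology before loading them into the graph is the natural next step, which is where the ontology grounding mentioned in the talk would come in.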

Speaker C [00:10:55]: Okay?

Anthony Alcaraz [00:10:55]: So it is very rich. Okay, this is the last part. The last part is graphs as context providers. Here there are many gates. Okay, you have the Cypher query gate, that is, the query of the knowledge graph. Here you need prior ontology grounding. Okay, this is how it works.

Anthony Alcaraz [00:11:11]: And Neo4j has shown that you need vector similarity search. Okay, you can use graph algorithms. And you can also use a technique called generative knowledge graph. Okay, I won't go into detail, but you can use these four gates to augment the context of your LLM. And what you want to do is chain together those augmentations of context. Okay, I won't go into detail, but you will have the detail in the PowerPoint.
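Chaining the context "gates" can be pictured as a pipeline in which each gate sees the question plus whatever earlier gates found. The sketch below is a minimal version with two stand-in gates over a toy fact table; the gate functions, the `FACTS` data, and the `chain_gates` helper are hypothetical, and real gates would call a graph database, a vector index, or a graph algorithm instead.

```python
from typing import Callable

# A gate takes the question and the context gathered so far, and returns new context lines.
Gate = Callable[[str, list[str]], list[str]]

# Toy fact table keyed by entity; all content is made up for illustration.
FACTS = {
    "alice": ["alice HAS_SKILL python", "alice WORKED_AT acme"],
    "acme": ["acme IN_INDUSTRY retail"],
}


def query_gate(question: str, context: list[str]) -> list[str]:
    """Stand-in for a graph query: fetch facts for entities named in the question."""
    lowered = question.lower()
    return [fact for entity, facts in FACTS.items() if entity in lowered for fact in facts]


def expansion_gate(question: str, context: list[str]) -> list[str]:
    """Stand-in for a graph algorithm: follow entities found earlier one hop further."""
    found = []
    for line in context:
        target = line.split()[-1]
        found.extend(FACTS.get(target, []))
    return found


def chain_gates(question: str, gates: list[Gate]) -> list[str]:
    """Run the gates in order, accumulating context lines without duplicates."""
    context: list[str] = []
    for gate in gates:
        for line in gate(question, context):
            if line not in context:
                context.append(line)
    return context


context = chain_gates("What industry has Alice worked in?", [query_gate, expansion_gate])
print(context)
```

The second gate only finds the industry because the first gate surfaced the employer, which is the point of chaining rather than running each augmentation in isolation.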

Speaker C [00:11:38]: Okay.

Anthony Alcaraz [00:11:39]: And the final point, for, I think, my system: you need to think of a RAG system in two parts. You have a knowledge retrieval part and you have a reasoning part. The knowledge retrieval part is better built around a knowledge graph. Okay, this is my certitude. And more and more studies are showing that. Okay, for the reasoning part, you want to leverage maybe different LLMs, fine-tuned on some reasoning task. You might also want to use, maybe not necessarily LLMs, but models, okay, classifiers that have been trained on your knowledge graph. Okay, let's think about that, Anthony.

Anthony Alcaraz [00:12:31]: Okay, I am done.

Adam Becker [00:12:33]: We got to go. I have a question. There's so many different things that we didn't get into the detail, but you've brought up to our attention. Where do we get the detail? I want to dive in. Where do we do this?

Anthony Alcaraz [00:12:45]: Okay, I will send my. Because I think I can send my PowerPoint to be public. Yes, I do believe.

Adam Becker [00:12:52]: Okay, but then even the PowerPoint sometimes, you just mentioned some of these things. Can we find references and other resources within the PowerPoint?

Anthony Alcaraz [00:13:02]: You will see that I linked every reference that I refer to as the API link. You can click on it and you will have the reference.

Adam Becker [00:13:12]: Nice. This is a fascinating space. I think you're almost done, right? Do you have like one more slide?

Speaker C [00:13:18]: Yes.

Anthony Alcaraz [00:13:19]: On the last slide, I put an example for causal reasoning. This paper that I put here shows that by using an LLM and a graph algorithm, you can do causal reasoning on a causal knowledge graph with decreased complexity. You optimize your usage of the LLM as a causal judge, et cetera. This paper, I think, is groundbreaking for the causal sphere, and it is a good proof of the combination of both graph algorithms and large language models as some kind of reasoning engine for businesses. Okay, this is my last point. All these techniques, I implement them within the startup Fribl. Okay. We do this for HR reasoning.

Anthony Alcaraz [00:14:15]: I believe that those techniques, from all the people that I talk to, can be used in any use case. Currently, I do believe that many businesses will be transformed by a proper implementation of a RAG framework. Okay. This is my belief.

Adam Becker [00:14:34]: Anthony. Absolutely fascinating. Thank you very much. Please, everybody wants to see the slides. This was incredible. Graph neural network. The first person that had, I think, introduced me to this is Aman, and he's on the. I think he's viewing right now.

Adam Becker [00:14:49]: Aman, I'm giving you a shout out as well. Thanks to you, I've been able to appreciate Anthony's presentation even much more. So, Anthony, thank you very much.

Anthony Alcaraz [00:14:58]: Thank you for having me.

Speaker C [00:15:00]: Thank you. Thank you.

Anthony Alcaraz [00:15:01]: Have a nice night.

