LLMs in Financial Services: Personalized Portfolio Recommendation Engines // Akmal Chaudhri // DE4AI
Akmal is a seasoned IT professional with over 25 years of experience. He's worn various hats throughout his career, including roles as a developer, consultant, product strategist, evangelist, technical writer, and technical trainer at both Blue Chip companies and Big Data startups. Akmal is a familiar face at international conferences and has actively contributed to program committees for major events and workshops. Notably, he has authored, co-authored, or edited over ten books.
This session will delve into how LLMs can be leveraged to create highly personalized and efficient portfolio recommendation engines. Using a Kafka stock ticker feed, tick data will be ingested into a database system where we'll query the data using natural language and build a simple chatbot using speech-to-text.
Link to Presentation: https://drive.google.com/file/d/1HtVFnygwrMvRcu9zj8cDOtOn_F2p2fbK/view?usp=drive_link
Demetrios [00:00:12]: Well, what's happening, man? How you doing?
Akmal Chaudhri [00:00:15]: I am well, sir.
Demetrios [00:00:18]: So well, I know you've got a lot of stuff that you want to be talking to me about, and I'm gonna let you rock and roll. This is gonna be, like, all around LLMs and finance, right?
Akmal Chaudhri [00:00:32]: Yeah, just a little bit of a flavor. I mean, I've only got ten minutes. But I wanted to cover something, and I have a speech-to-text-to-speech demo. Let's see if it works. My speakers and all the settings on the Mac are set to maximum, so hopefully the sound will come through okay.
Akmal Chaudhri [00:00:53]: Hopefully, if you can hear me okay, then that should come through okay as well. But we'll try our best.
Demetrios [00:00:58]: Let's see. I'm gonna make a prayer for the demo gods, and.
Akmal Chaudhri [00:01:04]: Isn't it that anything that can go wrong will go wrong at the worst possible time? But that's life. I mean, what can you do?
Demetrios [00:01:10]: And before you jump and you get into it, I want to mention to everybody here that SingleStore is having a really cool event coming up, right? And we'll drop it in the chat because we got some discount codes for the MLOps community. So we'll make sure. Yeah, we'll make sure to drop that in the chat in case anybody's in San Francisco or wants to go to San Francisco next month. Right? In October.
Akmal Chaudhri [00:01:35]: That's correct, yes.
Demetrios [00:01:36]: Cool. Well, we'll drop the links in the chat, and now I'm going to hand it over to you.
Akmal Chaudhri [00:01:41]: Thank you very much, sir. So I will minimize this screen, because what I'd like to do is to show, I mean, I've got it set to show my desktop, so hopefully you can see that. Okay. And that should be a notebook. And by the way, I was listening to the previous speaker and perhaps some comments from attendees that notebooks, you may not use them in production. I disagree. I used to work at Databricks, and some of their largest customers used to use notebooks in production environments. We at SingleStore have customers that do the same as well.
Akmal Chaudhri [00:02:14]: But it's a case of choose the technology that you're most comfortable with, and that suits your problem. So for some people, it may not be notebooks. That's absolutely fine. So I'm kind of old and wise enough to know that you shouldn't argue with anyone if they prefer something that's brilliant. Okay, so I'll walk through this a little bit of a Jupyter notebook here. And by the way, there are links to this, and we'll talk specifically about the LLMs in financial services, personalized portfolio recommendation engines. Okay, so that's the title of my talk and a little bit of markdown there. And essentially it's a bit of a demo.
Akmal Chaudhri [00:02:52]: I will try to run the speech part live. This part I've pre-run already just for safety. Okay, so like I said, Murphy's law. So essentially what we've got here is I'm taking some ticker data, tick data for stocks, okay. And I also have a separate table of sentiment data as well, which I've preloaded. And that one is kind of static, if you like, for the purposes of this demo, whereas the ticker data is live; it's coming from a Kafka feed. And I'm going to use a little bit of LangChain there, the SQL agent.
Akmal Chaudhri [00:03:28]: Let's see if any of my arrows work here. You can see that, hopefully. Okay. And for the speech part, well, I'm going to use OpenAI's open source Whisper. They do have a paid service for that, but the open source version on GitHub is awesome. It hasn't been trained on my voice, and it comes in English and various other languages as well.
Akmal Chaudhri [00:03:51]: Try it out. It's pretty awesome technology. I mean, for what you get for nothing, you're getting really a bargain. And so what I wanted to do was to really show the elements, if you like, that you could use to build a basic chatbot for question and answer over kind of financial data, if you like. And the chatbot itself is not specifically geared to finance. I mean, you could use it in many other applications, many other scenarios. It just so happens that today I'm using some stock data. Now, an important disclaimer here.
Akmal Chaudhri [00:04:25]: Okay, so the tick data that I'm using, and let me just highlight that, is entirely fictitious. It is purely for demo purposes. Please don't use this data for making any kind of financial decisions. Okay, so you are very welcome to use the Kafka broker that I'll be using, and you're welcome to try it out. But bear that in mind. So we just start off here with some libraries, standard stuff, LangChain and OpenAI, and then we just do some imports here to set the scene if you like. And then we do a create database. So SingleStore has a free tier.
Akmal Chaudhri [00:05:01]: You don't need to do this. When you sign up as part of the environment, you get some compute resources and a database created for you. But if you choose to not use that, but use the standard one, then you will need to run this and create that database instead. So, simple stuff here we've got two tables. This is for the ticker data here, and this is for the sentiment data, which I've preloaded. That's nearly 2 million records there. And again, a little bit of instructions there just below, telling you where to get that file that contains the details there and some instructions there using the MySQL CLI, how to get that data in. The reason I've done this already is that it does take a couple of minutes to load.
Akmal Chaudhri [00:05:51]: It is a fair bit of data, especially if you're doing it remotely, as I am. Just an example there of how you can get that data in. And once it's there, you can keep that in there and, you know, remove the ticker data if you want. And then here, what I've simply done is just check this stock sentiment table just to see what's actually in there. So essentially what it's got is some headline data here, as you can see. Okay, so headline casino closed. Or let's have a look at this one here, which is Warren Buffett's Berkshire Hathaway turns up stake in Liberty Sirius XM, for example. And then we've got these sentiments here, the positive, the negative, and the neutral.
Akmal Chaudhri [00:06:31]: And these have been calculated by me, using a bit of Rust code and WebAssembly, with something called VADER, the Valence Aware Dictionary and sEntiment Reasoner, which I use to convert the original source data into these sentiment data. And then here's the pipeline that I've got, okay, which is fed from this Kafka source here, this Kafka public, kafka memcompute.com port 1992 and forward slash stock ticker, which is where the ticker data is coming from. We just take the opportunity to test the pipeline here, and we can do that before we start it. So it's actually retrieved the symbol for me here, which happens to be mmmdh. It's given me a timestamp. And then the open, the high, the low, the price, and the volume. And then we just start the pipeline, and away we go.
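The two tables described up to this point can be pictured roughly as follows. This is only a sketch: an in-memory SQLite database stands in for SingleStore, and the table names (`tick`, `stock_sentiment`), column names, and types are assumptions inferred from the fields mentioned in the talk, not the notebook's actual schema. The inserted row is made up.

```python
import sqlite3

# In-memory SQLite stands in for the SingleStore database used in the talk.
conn = sqlite3.connect(":memory:")

# Table names, column names, and types are assumptions based on the
# fields named in the talk, not the notebook's real DDL.
conn.executescript("""
CREATE TABLE tick (
    symbol TEXT,
    ts     TEXT,
    open   REAL,
    high   REAL,
    low    REAL,
    price  REAL,
    volume INTEGER
);
CREATE TABLE stock_sentiment (
    headline TEXT,
    positive REAL,
    negative REAL,
    neutral  REAL
);
""")

# One made-up tick row, purely for illustration.
conn.execute(
    "INSERT INTO tick VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("AAPL", "2024-09-11 09:53:02", 116.40, 116.90, 116.20, 116.57, 1000),
)

# The same kind of row-count check that is run once the pipeline is started.
row_count = conn.execute("SELECT COUNT(*) FROM tick").fetchone()[0]
print(row_count)
```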
Akmal Chaudhri [00:07:31]: And then once it's running, we can take the opportunity to check how many rows we've got in the tick data. So currently that stands at just a little over 9 million, and that will keep going. Okay. And then other things that we can do in the meantime is just do a bit of SQL there, kind of candlestick chart, if you like, which if you're in the finance space, you'll kind of understand what that does. So here we've got like two rows being returned here. This is for the stock symbol AAPL, which is for Apple. And then just down below, a little bit of Python code here to do something a bit nicer in terms of just rendering this in a visual way. So not much data here as you can see.
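For readers outside finance, a candlestick query boils each time window of ticks down to four numbers: the first price (open), the highest, the lowest, and the last (close). The sketch below shows that aggregation in plain Python, assuming one-minute windows and made-up prices; the notebook does this in SQL, and its actual query is not reproduced here.

```python
from datetime import datetime

def to_candles(ticks):
    """Aggregate (timestamp, price) ticks into one-minute OHLC candles."""
    candles = {}
    for ts, price in sorted(ticks):
        # Floor the timestamp to the start of its minute.
        minute = ts.replace(second=0, microsecond=0)
        if minute not in candles:
            # First tick in the window sets all four values.
            candles[minute] = {"open": price, "high": price,
                               "low": price, "close": price}
        else:
            candle = candles[minute]
            candle["high"] = max(candle["high"], price)
            candle["low"] = min(candle["low"], price)
            candle["close"] = price  # latest tick seen so far
    return candles

# Made-up ticks for a single symbol, for illustration only.
ticks = [
    (datetime(2024, 9, 11, 9, 53, 2), 116.50),
    (datetime(2024, 9, 11, 9, 53, 40), 116.90),
    (datetime(2024, 9, 11, 9, 53, 55), 116.20),
    (datetime(2024, 9, 11, 9, 54, 10), 116.57),
]

candles = to_candles(ticks)
for minute, candle in candles.items():
    print(minute, candle)
```

Two windows come back here, one per minute, which mirrors the two rows returned in the demo.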
Akmal Chaudhri [00:08:12]: But the nice thing about this is that you can just hover over this and have a look at these, and it's worth letting the data run for a little while and you get something a bit more sort of interesting showing here. And so the key thing, the bit of magic, is really the LangChain SQL agent. I mean, this is pretty awesome technology because what it allows us to do then is to ask these kind of queries of the data in English. This is what it's been set up to do. So in this first example, I've simply given it a prompt here, and I've asked, from the tick table, which stock symbol saw the least volatility in share trading in the data set. It comes back with a very simple answer here, ll. Sometimes it gives you a bit more information. And in the second one here, I've actually set it up that you can type your own kind of question in.
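The talk does not show the SQL the agent generated for the volatility question. One plausible reading, sketched here in plain Python, is "the symbol whose prices have the smallest standard deviation". The function name `least_volatile`, the symbols, and the prices are all hypothetical; an agent could just as reasonably pick a different definition, such as the high-low range or a deviation scaled by the average price.

```python
from collections import defaultdict
from statistics import pstdev

def least_volatile(ticks):
    """Return the symbol with the smallest price standard deviation.

    `ticks` is an iterable of (symbol, price) pairs.
    """
    prices = defaultdict(list)
    for symbol, price in ticks:
        prices[symbol].append(price)
    # Population standard deviation of each symbol's prices.
    volatility = {symbol: pstdev(values) for symbol, values in prices.items()}
    return min(volatility, key=volatility.get), volatility

# Made-up symbols and prices, for illustration only.
ticks = [
    ("AAA", 10.0), ("AAA", 12.0), ("AAA", 8.0),
    ("BBB", 50.0), ("BBB", 50.5), ("BBB", 49.5),
]

symbol, volatility = least_volatile(ticks)
print(symbol, volatility)
```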
Akmal Chaudhri [00:09:05]: So I've asked this question here. Let me just try and highlight that for you. No, my magnifier is not showing that great. Let me read that for you. It says, using the stocks, sorry, using the symbol AAPL, what is the most positive sentiment in the stock sentiment table and the current best price for this symbol from the tick table? And it's come back with a couple of values there just below. Okay.
Akmal Chaudhri [00:09:27]: And that's pretty cool. So let me just minimize this now and go over to my virtual machine. And I have the bit of software running in here. Okay. So again, links to this on the GitHub repo if you want to build this and try this out for yourself. So on the right hand side, if you have a look, you'll see that I have this kind of notepad open and it's got some sample questions that I can try, because my memory is terrible and I forget what to ask. Let's try something here. Okay, so this is connected to that very same database.
Akmal Chaudhri [00:10:06]: So let's start recording. And I can ask the question, how many rows are in the tick table? Okay, stop that. And let's see what we get back. Let's try again. Stop recording. 9,269,589. Okay, great. So let's try another one.
Akmal Chaudhri [00:10:40]: What is the earliest timestamp in the tick table. The earliest timestamp in the tick table is 20240. 911. 95302. Okay. What is the best performing stock in the tick table? TSM. Okay, very simple answer there. Okay.
Akmal Chaudhri [00:11:13]: We could ask the inverse. What's the worst performing stock in the tick table? FDR. Okay. And then let's try that same query that I showed you just a moment ago in the notebook using the symbol AAPL. What is the most positive sentiment in the stock sentiment table and the current best price for this symbol from the tick table? 0.331,509 116.57. Okay, and let me try one more thing just to see if I can trip up the system a little bit. Who was the first man on the moon? I don't know. Good.
Akmal Chaudhri [00:12:17]: So there we go. And so just at the end of the notebook, then a little bit of cleanup. Okay. If you want to get rid of the stuff. So shut down the pipeline. Drop the pipeline. Drop the table for the tick data and for the sentiment data. I've left that commented simply because it does take a few moments to kind of load up, and therefore you might want to keep that and shut down everything here.
Akmal Chaudhri [00:12:40]: And as you mentioned before, we do have an event that's happening. So if you are interested, just go to the GitHub repo veryfatboy forward slash fintech chatbot. Okay. Where you'll get the notebook plus the Python code there. Singlestore.com cloudtrial if you want to try out the environment, okay, with this kind of notebook environment, the platform that I was using for generating this. And the enterprise AI conference, which I think, Demetrios, you mentioned already that you're going to give out some codes for; if anyone's interested, just scan the QR code there and you're welcome to use Acmo 50 for 50% discount code. And there we go. So thank you very much for your time and for your attention.
Akmal Chaudhri [00:13:24]: Hopefully that was interesting for you and maybe a little bit unusual. You're welcome to reach out to us. Any comments, feedback are always welcome. Thank you.
Demetrios [00:13:32]: That was very cool. Before you go anywhere, yes, can you drop in the chat that we have the link to the GitHub? Because I definitely want to check it out more. Second, someone was asking, and I guess it all depends on the whisper that you have. Can you ask in different languages?
Akmal Chaudhri [00:13:51]: Yes, you can. So Whisper from OpenAI is actually available and can be used with multiple languages. It's just that certain languages are better supported than others. So for example, English is naturally well supported, I think for various reasons, simply because of the research work that went into it. But if you go to the GitHub repo for OpenAI Whisper, which is the free version, you can see there's a lot of documentation and information around the technology as well. Plus they have various charts showing the level of support for various languages too, so you're welcome to try that out as well.
Akmal Chaudhri [00:14:28]: And as you can see, it wasn't trained on my voice, but it did a pretty good job there.
Demetrios [00:14:32]: It was pretty good. That was fun.
Akmal Chaudhri [00:14:35]: Mandez.
Demetrios [00:14:35]: Well, thank you for doing this and thank you for joining us. That was a really cool demo. I appreciate it.
Akmal Chaudhri [00:14:42]: Thank you very much.
Demetrios [00:14:43]: We'll keep it moving now, but don't go without dropping me the links so I can drop those in the chat too.
Akmal Chaudhri [00:14:51]: Okay? I can. Are you online somewhere, Demetrios? Because I think my time is up.
Demetrios [00:14:57]: So there's a private chat that we have right here on the stage. Just drop it there, then I'll relay it.
Akmal Chaudhri [00:15:03]: I will do that. So again, thank you very much to everyone for attending, and thank you, Demetrios, for being an awesome MC. And I'll get those to you. So time for me to go.
Demetrios [00:15:13]: He is asking in the chat, I think. Yeah, the time's up, but I guess they're asking about hallucination, and if you've done anything to measure the hallucinations or.
Akmal Chaudhri [00:15:22]: Temperature equals zero, that's what's been set. So as deterministic as it could possibly be, and as far as the testing that I've done goes, it's come up with pretty good answers most of the time. I haven't matched every single response; I've simply not had time. But I've used this fairly extensively and I've been very pleased with the overall result.
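To see why a temperature of zero pushes a model toward repeatable answers, it can help to look at how temperature reshapes the probabilities over candidate next tokens. The sketch below is a generic softmax-with-temperature illustration, not the code behind any particular model; the function name `token_probabilities` and the scores are made up, and treating temperature zero as "always pick the top-scoring token" is a common convention assumed here.

```python
import math

def token_probabilities(logits, temperature):
    """Turn raw token scores into sampling probabilities."""
    if temperature == 0:
        # Assumed convention: temperature 0 means greedy decoding,
        # so all probability goes to the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]

# Made-up scores for three candidate tokens.
logits = [2.0, 1.0, 0.5]

for temperature in (1.0, 0.5, 0):
    print(temperature, token_probabilities(logits, temperature))
```

Lower temperatures concentrate probability on the top token, which cuts down sampling randomness. It does not by itself guarantee that the generated SQL or the final answer is correct, so spot-checking responses, as described above, still matters.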
Demetrios [00:15:43]: Yeah, so the vibe check gets passed?
Akmal Chaudhri [00:15:45]: Yeah.
Demetrios [00:15:46]: All right, I'll talk to you later, man. Thank you.