MLOps Community
+00:00 GMT
Sign in or Join the community to continue

Using LLMs to Power Consumer Search at Scale

Posted Jul 21, 2023 | Views 536
# LLM in Production
# Power Consumer Search
# Perplexity AI
Aravind Srinivas
Aravind Srinivas
Aravind Srinivas
CEO & Co-Founder @

Aravind Srinivas is the co-founder and CEO of Perplexity AI, a company on a mission to build the world's most trusted information service. Before founding Perplexity AI, Aravind was a research scientist at Open AI and a research intern at Google. He received his PhD in Computer Science from UC, Berkeley.

+ Read More

Aravind Srinivas is the co-founder and CEO of Perplexity AI, a company on a mission to build the world's most trusted information service. Before founding Perplexity AI, Aravind was a research scientist at Open AI and a research intern at Google. He received his PhD in Computer Science from UC, Berkeley.

+ Read More
Demetrios Brinkmann
Demetrios Brinkmann
Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.

+ Read More

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.

+ Read More

Perplexity AI is an answer engine that aims to deliver accurate answers to questions using LLMs. Perplexity's CEO Aravind Srinivas will introduce the product and discuss some of the challenges associated with building LLMs.

+ Read More

 We have got one of the most requested people for this day of talks where? Where you at? Arvind. There he is. The c e O of Perplexity ai. So many people were so excited that you were giving a talk today, man. And I am one of them. Thank you. You're welcome. Happy to be here and excited all about perplexity.

Cool, man. Well, I am going to hand it over to you. No stress on the screen share. We can see it's working now and you should be good. Yep. To go. So I'll be back in like 20 minutes, 25 minutes. All right. Uh, can I get started? Yeah, go for it. Okay, great. Awesome. Um, hello everyone. Um, great to meet you all in this conference.

And, um, I'm Arvind, I'm the co-founder and c e o of Perplexity. Uh, perplexity is a conversational search engine, uh, that basically an is the fastest way to get answers to any question and fastest way to like, do all of your web browsing on the internet. And, uh, We are, uh, four people, co-founders. Myself, Dennis Yara.

He's formerly from Meta Cora and be nyu Johnny Hill. Like, um, our, uh, he, he was actually, well number one at programming at one point, and Andy from date of he, he's one of the Databricks co-founders and also from uc, Berkeley. Um, so our mission is to be the world's most now centric company. Um, I think, I think this is sort of borrowed from Amazon, uh, when Jeff Bezos says Amazon's the earth's most customer-centric company.

And like that's sort of what made them what it is right now. Customer obsession. And, um, our mission is to be obsessed with knowledge and productivity. And we want to be the best conversational answer engine and provide the best surgical pilots for you and, um, The place where people want to be when they want to discover and share knowledge.

And I think if we deliver on this, we are gonna give people back a lot of their time and make everyone smarter every day. And I think that's good for the world and do it in a transparent way where truth is a first party characteristic of all the answers, um, in the form of citations. So, We are backed by great investors, Nat, uh, Jeff Dean, Bob Muglia, ILA, Andrea, y, um, and so many more.

Um, Ja, Susan. So like, for those of you who don't know about our product, like you can ask any question like Inc. Like literally the deepest question. Um, If, if you didn't understand relativity, you could ask about the theory of relativity and like ask it about the differences, special, special theory and the general theory and how they've shaped your understanding.

Like, you know, and

you can ask,

um, basically anything that you don't know about, but. And you wish you had access to amazing, knowledgeable friends, um, always available to you. Right. So usually the smartest we have are, are, you know, like pretty busy and they, they don't, they aren't be available. Like this happens when in college, right?

When we are in college, uh, the best programmer is the one you kinda go to for taking a coding assignment. And uh, the problem with this, with always going to them is like, they're not always available. They're probably gonna try to like, do something on their own. Um, and imagine you had the power of, um, having access to the best programmer, the most knowledgeable person about economics, the most knowledgeable person about physics.

Um, all of them like. One service, one search bar, and always ready to answer any question that you may have and like, dig deeper, deeper, deeper. And like go breath for search depth for search about anything that's perplexity for you. And uh, you know, I'm sure all of you would love to have such a friend always at your demand in your mobile phone anytime, right?

So that's, that's basically what we made it happen now. And, uh, All of you are pretty familiar with like the auto G P T and search agents. Um, you know, like there's a lot of hype around it. Um, but everyone's been asking for like a product that takes those ideas and like prioritizes it in a way that can be used reliably.

And that's sort of what we've done with our search copilot. Um, Put human agency at the premium of the product. Uh, let humans control the search experience. Even though AI is at Humans Mercy, they kind of like doing, it's in service of ai.

This is a new, more powerful and interactive way to search. With copilot, your question sets off an intelligent search process. So you notice by understanding your starts and asks you, your then asks you clarifying questions, clarify to refine the results after you respond, it combs the web. Critically analyzes the information it finds and then presents you with a concise, well summarized answer.

This is a lot better than going Google. Are you planning a trip? Let copilot near personal travel guide it quickly finds the best flights, hotels, and must-see spots tailor just for you. And the whole trip planning thing is like need help starting a healthy diet. I've been using it. I recently pilot curates customized meal plans.

The recipes. Many have done perfectly suited to your tastes and dietary needs. Or maybe you want to learn how to ski this winter, let co-pilot take care of the research for you. It delivers easy to understand steps and the highest rated instructional videos right at your fingertips. Co-pilot curates the best learning resources so you can focus on learning at your own pace.

You can share what you've discovered with others. Not only can they view the questions you've asked, but they can also engage by liking them or even asking follow-up questions. With co-pilot your discoveries transform into shareable knowledge, perplexity, ask anything. This is a part of the motivation to let people fork someone else's thread and ask follow ups.

On top of that is sort of like we are all sort of prompt engineering right now, like the whole, uh, and, and you can consider this is the new kind of programming, you're programming in natural language. And so, um, Like, it sort of made programming a lot more fun and simple that, uh, you know, you can take someone else's prompt thread and you can learn from it, and then you can ask follow ups and learn more about it and you can share it with other people.

And so it's almost like building a GitHub of English, um, where expanding the knowledge capital of the planet by doing that and, uh, making it publicly available and sharing it lot. Right. And. So that's the idea. Copilot is not fully autonomous as you can see. Uh, the LLMs will come back to you with clarifying questions.

And then the search queries initially that you ask are potentially highly ambiguous and have hyper perplexity. And then you can, uh, clarify and reduce the ambiguity and help improve the answers. So in some sense, you are guiding the LLMs to the right parts of the web. And as I said, you're putting human agency at the premium of the product.

It's sort of like a. Fundamental principle of our, our products in general. Um, and then we took a step further there and built a copilot that's personalized to you. And we built something called AI profile, uh, which we're pretty excited about. And, uh, Introducing AI profile on perplexity. A new way to make your search experience truly yours.

Start by telling us a bit about yourself, your interests, your hobbies, you name it. Next, let's set your location and language preferences. As you build your profile, perplexity will present you with dynamic clarifying questions. These personalized prompts are designed to understand you better, creating a useful conversation between you and ai.

Just like that you've created a profile that's uniquely you and your search experience is now tailored to match.

All of this information is private and will only be used to make your interactions with the AI more useful and enjoyable. You're in control of your information and can edit or delete it at any time. Perplexity AI profiles, the power of AI personalized for you. Start shaping your perplexity experience today.

Perplexity, ask anything. So here your, you are kind of prompt engineering for personalization. This is very different from the way personalization has been done in the Web 2.0 era. Where, um, We, we used to pick topics that are of interest to us, and we used to pick, um, you know, like domains that we want to see search results from and so on.

And that's just like a very primitive way to profile any person. Right. So prompt engineering has sort of the future of how you wanna personalize yourself for the ai in some senses, like for the AI to come and like, sort of learn more about you, just like how, and, and, and reduce the burden. Of pro future prompt engineering on your end.

Like you don't have to keep repeating like, you know what you want. Um, we all don't want to keep on typing like codem monkeys, right? When, when we use these chat bots. So, um, so you kind of tell the AI what part of the web you're interested in, like what more about you and, um, And then like you, you, you sort of like guide the element again to the right parts of the web by doing that.

And again, like you put human agency at the premium of the product. And, uh, so, so Amjad the, uh, founder of Repla calls this personalization by prompting and then like, people on our discord are like talking about this, how like, you know, they, they're trying to get something to work. Um, And then like a lot of the times what happens is perplexity is like, if you don't tell, give it enough instructions, uh, you get some undesirable behavior.

Like, for example, you ask a question in Dutch and you get the output in English, and you don't want that. Uh, so you don't want to append that as a suffix for every search, search, uh, query that you might have. So if you just sort of give that as a global instruction, And that's gonna work for any question, right?

So it's just preloaded as a prompt and, uh, I think this is gonna make our whole experience with all these ais a lot better, just like I said, right? You know, we are all becoming programs right now, like, um, because we can all code in English. Um, you know how Python made a lot more developers on the planet.

Um, now we are all coding in English and so. Um, it's a little bit like sharing a template, right? You're, you're, you're like building these macros and then, uh, controlling these AI is to work for you. Um, so that's kind of how you should feel about these. These are co-pilots that are in your service and you're like writing code for them and like, uh, in natural language and you can share, uh, and fork from other people and, um, expand your own productivity and knowledge that way, and also help other people expand it.

So, uh, it's good for the world to do that. And, uh, We are, we are also like doing things like tool use or word from alpha. Where, you know, like we combine the LMS with scientific calculators, like, and it has a lot of people learn math in a much faster way. Um, like I, when I was, you know, in school, it took me a little bit of time to wrap my head around like local minimum and global minimum and things like that.

And, uh, you know, I, I, I wish I had like a tool like this, right? I could just come and ask me and much better than like going to go from calculator and, you know, using all the widgets that they have, uh, because natural language is lot simpler support. You can ask lot of follow up questions that are we've already asked for, and, uh, And with like this rich rendering with integral logic bots, um, it just makes the whole experience of like, you know, learning calculus or anything that's like advanced math or understanding planetary positions, lot more fun.

And, uh, this is just the beginning, right? Like there's so many more tools to integrate with LLMs. And so that brings us to like, why do you want to use tools? Um, and like Oreo, the VP of research at DeepMind, he, he. He, he gets this pretty correctly when we release like a Twitter search tool called like back in December.

Uh, the end game is not to predict the next word at a time. Uh, the real end game is to connect LLMs with a lot of other tools like search engines, python interpreters, um, and like leverage the tools, power and robustness and like basically harness the reasoning power of LLMs too, and build something that was just impossible to build with just pure LLMs.

And, um, I mean, no need to say more like Sun Pcha, the Google CEO comes and tweets an improvement of a bar where, um, they, they, they have like an implicit ba uh, code interpreter on the backend, and that helps them get 30% of the queries on Word and math problems better. And, um, um, so, so this is why you gotta use tools like LMS are great reasoning engines.

Wolfram and databases and search indexes are sort of like knowledge engines or knowledge bases, knowledge graphs and magic happens when you combine the two together. And in some sense, like le aside, even Wolfram, the basic perplexity product itself is the probably the most functional and well executed version of like retrieval, augmented generation, or people refer to it as rag, where you combine the l m with the.

Search index and then, um, make sure that it can always be up to date with the live information. There's no knowledge cut off, but it also retains the conversational and reasoning and referential natural language capabilities of the l l m, the knowledge that's already in the parameters of the l l m, but, um, can plug into all these like indexes and databases and tools and so on, and like create magical experiences for learning and knowledge and research for you.

And, uh, this is just the beginning. There's just a lot more to do API use, um, and so on. So I'm pretty excited about like tool use. And, um, so that brings into like, you know, when you're using a bunch of tools, you're almost like, you're, you're playing an orchestra now, right? So, So that's, that, that's kind of, I'm gonna call it like orchestration.

Um, where we are orchestrating an l l M to use many tools together in order to achieve like, um, amazing things that would just have been impossible to achieve without, um, the tools and just LM alone. And, um, so, so what, what are, what are the challenges in orchestration? Like your, your latency is like, Pretty high, uh, if you're gonna use many tools together.

And the LLMs also, and like your, your prompts are gonna get a lot bigger and so on. So, In the beginning, we, when we first rolled out our product, we, our, our product had like five to six second latency per query. And in fact, like one of our investor friends, uh, Dan Daniel Gross, he used to test the product in the beginning and he, he even joked saying like, you, you guys should not call it, submit a query, but you should call it, submit a job.

Like it's, it's that slow, like it used to take six to seven seconds and now it's like, Incredibly fast. Like, uh, the first thing anybody comes and tells me about perplexity is like, how, how do you guys make it so fast? And I think that's the part of like doing end-to-end engineering. Um, when you're controlling all parts of the stack yourself, you can make it really, really fast.

So as a company, we, we do not use these high level, uh, two libraries like, um, Oh, land chain or dust. We, we don't make use of all these things. Um, we just do the whole engineering ourselves. And I, I kind of recommend that for anybody, anybody else, like, uh, there's just so many, uh, latency optimizations you miss out on.

Um, if you rely on these, um, tooling libraries and then, um, hallucinations. So when you're using many tools, uh, there are so many ways in which it can fail when you're chaining things together. Um, you know, there's all this joke of how, like when you, when you're chaining many pieces together, um, that you just, one part has to go wrong and everything breaks.

So, uh, we can only fix this by learning from human feedback. Lot of prompt engineering updates, good discipline on prompt version, controlling, um, benchmarking for ai quality, benchmarking for model quality and things like that. Uh, and like pro migrations when you're trying to migrate from one model to another.

There's all sorts of engineering channels there. And, uh, integrations. Like, so when you're, when you're, when you're doing orchestration, you're like trying to like integrate with existing tools like search indexes, Wolfram, uh, rendering libraries, um, trying to like make sure that the image rendering works well, um, database plugins and so on, right?

So all that's like a lot of challenge and I think, um, all that's sort of why, you know, like. Building a product like this is not just an l l m wrapper. Um, ma many people think of us as an L l M wrap and, um, that's fair use term because, you know, like we, we rely a lot on opening eyes models. Um, but training our own models is, is one part of the whole thing.

Like, like building a product like ours has so much, so many layers of software, uh, engineering and LM orchestration work to be done there. That, um, We, we, we shouldn't be called an alumni rapper company. So for example, here, uh, you look at the screenshot, uh, there's this question about the Reddit CEO pushing against the blackout Reddit.

At the bottom, you look at the related questions, uh, which are, which is actually one of the most favorite features of like our users, is they don't even have to type in the follow ups. They can just click on it and it automatically contextualizes it in the, the context of the previous query and gives you a good, um, follow up answer.

And how, how do we generate these follow up questions? So that's another l l m that's doing the job. And then that's another L l M that we would train ourselves. Um, and, and similarly, like every query needs to be classified, whether what type of query it is. So that's another lm. And then, um, every query needs to be reformulated into a bunch of.

Other, uh, subqueries here. Like for example, in this query I'm asking, please explain to me the paper augmenting language models with long term memory. Uh, this is a question that, you know, if someone that doesn't know much about, um, LLMs and they're trying to read about papers, uh, they would ask these questions, right?

And then, uh, that gets reformulated into like three or four follow up queries and, and then a search runs for all these follow up queries and pulls up all these results. Like, and then, uh, the copilot looks at like 16 results. Which would not have been possible with just like one single query, right? Um, so that expands the, the diversity of pages of the, uh, copilot cs, um, and creates a very high recall content for the l lm.

And now that pulls up the highest order bits in the summary. And all these like, or, uh, things of like query reformulation, query categorization. Follow up query suggestions and like, um, needs like separate LLMs, and then you need to orchestrate the whole thing, um, into like one single piece. Uh, that's, that's sort of managing all of these workers.

Right? So, so that's like a separate layer of software engineering to be done there, or, or like AI engineering, you may call it that way. And, uh, this kind of, sort of reminds me of this, you know, Steve Jobs movie. Uh, I don't know if many of you have seen this from 2015. Um, I think, um, there are like many steam movies, but this is like a pretty good one where, uh, there's a scene here where Wozniak comes and act like, you know, I'm the guy who can write all the code and like, uh, I, I write all the nuts and bulls.

Uh, like, you don't do it. Like, what do you do? And then, uh, Steve comes and says, I, I, I play the orchestra, right? And then, so that's really what's going on here. Like, we don't train our mottos. At least not all the, the major heavy lifting is still done with OpenAI, but like then what do we do? Um, and the answer is we play the orchestra.

So we orchestrate the whole thing together into one functional working piece that just works seamlessly, reliably at scale for like tens of hundreds of thousands of users every minute. And that's really hard. And uh, and sort of like, that's the major takeaway I feel like you should take away from the stock is like, LM Ops is not just about creating the model, but also like building the whole orchestration, see what around it.

And doing this really well, like has sort of got us lot of organic traction, like Jack Dorsey and our treatment have all tweeted great things about us. And, uh, New York Times Fortune, uh, has covered us in this like code red articles and uh, you know, like fast company fortune. So on Bloomberg, uh, you can obviously, even if you don't wanna stop using Google, you can just use this as a Chrome extension, uh, pin it to your Chrome and then.

Just click on it and ask any question that you otherwise would've ask Google and kind of like pissed off with their ads. Um, you can u you can stop using Wikipedia. Like Wikipedia is like one, uh, Frozen fixed version for all the internet. We don't need that. Like knowledge should be personalized and like consumption should be personalized at, at whatever levels of granularity you want to go dig deep into any topic.

So, uh, perplexity is sort of like a rabbit hole. For people who used to earlier enjoy, like Wikipedia, rabbit holes, perplexity is a black hole in that case, uh, said by a LinkedIn user. And, um, and it's sort of like what, what motivates us? Like a lot of the people in our company are kind of nerds and like, you know, we, we want to build a nerds paradise, uh, for like those who wanna come and enjoy and learning more.

Um, it's a different ethos for the company compared to like, Um, you know, the traditional social media platforms that have always tried to optimize for the, the average person enjoying, like, entertainment on the internet. Um, for us, like knowledge is entertainment. And then, you know, like Cora used to try to do that.

Uh, the difference between Cora and Perplexity is you don't have to wait for Jimmy Bales or like some knowledgeable person to come and answer your question. Uh, after like two, three days, you get the answer in a few seconds equal quality. And any number of questions and like, that's the power of like, basically your, you're experiencing the power of the smartest humans on the planet, on steroids and like, that's pretty amazing, you know, for, for your own like, you know, knowledge upgrades and productivity.

And we are also available on, on iOS, like you can go to the app store and use it on your phone, uh, for people who feel a lot of friction when they're on their phones, going to Chrome and opening a separate tab. So that's another. You know, useful tool we have built and we intend to keep building better mobile experiences.

And, uh, we, every, all this is only possible because we, we have a great engineering team. Um, and, um, I'm super thankful to be working with them. Um, you know, it's one, one thing that everybody asks me is how, how do we execute so much with just so few people? And, uh, it's because these people are so awesome.

Um, and, um, It's also, we, we, we wanna be proof to the world that, um, you can do a lot with less. In fact, you can probably only do a lot with less. Uh, that's my belief. Like if we scaled up to date to like, um, 50 people, we'll we will be a lot slower. And so, um, I'm very proud of working with all of them and, um, they're all very highly responsible for all the success we've had.

And we are hiring for many positions, including a research engineer. Um, so like please, you know, apply on this link or reach out to me directly or any of us directly. And, uh, we recently announced our pro uh, subscription plan that gives you like lot of co-pilot uses per day and GD four uses per day. And, uh, you know, we are actively shipping new features, so you're gonna get all of them for free.

Uh, if you get it now. And, uh, I highly recommend you check it out. And that's, that's it for me. I'm here to take any questions. Awesome, man. There are some incredible questions coming through the chat, and I will say that I love the energy that these people are saying in the chat. First one that gave me a little chuckle is from Delon, and Delon reached out to me for this funny one.

I don't think it was serious, but I might as well ask it to you. Can I ask perplexity how perplexity speeds up tool calls? Uh, so, so perplexity only takes the information from the internet, right? Like we haven't made that public yet. So, um, I guess you can't, but I mean, you can, you can definitely ask, but you might not get a great answer to.

Okay. There. So, I mean, along that theme there are being able to ask anything to. A chatbot or to perplexity, it just has a lot of complications, I would imagine, with hallucinations and not knowing the catastrophic, forgetting that they say and all that stuff that happens behind the scenes that we've been talking about for the past two days.

How do you all deal with that to make sure that what you output is quality? So the question is how, how do we ensure quality in the answers? Yeah. Yeah. So we, we run a lot of evals and. You know, we have this amazing 14 year old in intern who came and set up like AI evals for us. And so anytime we change the prompts or anytime we change the search index, um, we run it through all these rigorous evals and then benchmark it and, um, We are big doc footers of our own product.

Like, you know, that that's necessary when you're, when you're building a search product. Like I don't use Google anymore. Um, even, even if something's not working well with our, our thing, I kind of go through and try to like flag it and dig deeper into why it's not working. So that helps a lot. And you might be surprised, like many anecdotal, um, complaints from people.

Uh, teaches a lot about like how to fix some things and then that ends up fixing a lot of things. So, um, I would just say like, I mean, it's, it's kind of like a joke, but, um, it's tested on production and then see how it goes. That's what, that's what we're going with. That should be a shirt tested in production and you'll be good.

So That's awesome. I love that. Uh, Willem is asking a question, any recommendations? And I love how you were talking about, you're the orchestrator and you, you make the symphony, make that beautiful music. So along those lines, Willem's asking any recommendation on how to choose the correct foundational model based on the question slash step in the orchestration process?

Yeah, so, so right now I feel like, I mean, all this should be desired based on evals. Um, and right now the evals indicate 3.5 turbo or four are like just insanely good. Mm-hmm. Um, I think anything to do with reasoning and logic, you would go with four. Um, I think for most summarization tasks or regular language, pre-processing tasks, you go with 3.5.

Claude Instant is also pretty good compared to 3.5, but not on coding. Mm-hmm. And, uh, so that's sort of my mental roadmap right now, and I don't think the other models on the market are there in the le Leagues of G P T and Claude. So I would kind of stick to these two for now. And so you're not doing anything with smaller?

Uh, we do use like, uh, small models like mpt, fine tuning and yeah. Um, there are some other models in the market and like Falcon is out and we are testing it, but I think it's gonna be a, gonna take a while to match the capability and the speed of these proprietary models. Mm-hmm. And, uh, So I would kind of stick to it for now before, before like trying to be adventurous here.

Yeah, yeah, yeah. Makes, makes total sense. So, uh, I think we kind of answered that one as far as like this. So Nathaniel's asking, Hey, this seems like a cooler, or not asking, but just. Saying more, this sounds seems like a similar but cooler idea to the new Microsoft Windows copilot they're looking to release in addition to Bing and uh oh.

So do you have anything that you can tell us on the technology stack that Perplexity is building? I mean, you said open ai, you said Claude, what about other stuff? What? What other stuff are you using? I mean, we don't use anything else. Like, um, I guess we, yeah, we, we, we use, um, Kubernetes and all, you know, all that React next js.

Mm-hmm. Um, and, uh, you know, I, iOS doesn't Swift UI and Android isn't native, so I, I think. That's kind of like, I don't, I don't know if that's really informative, but, um, it's, it's all like a lot of work on backend that keeps the product working as well as it does now. And, um, so, so I'm not sure what the question was specifically aimed at, but I would imagine it is, it's probably around the machine learning engineering lifecycle.

Like Yeah, if you're doing anything with, um, if you're using any kind of. Guardrails or if you're using any vector database to augment, yeah. Oh, okay. Okay. Got it. Got it. Yeah. We do use vector databases and so we use Elastic, we, I mean that's not a Vector db, but we use Elastic and also, um, quadrant. And, uh, we haven't ventured into Pine Clone or Deviate yet, so we've stuck with Quadrant and um, Embeddings.

I think we don't use OpenAI. We we train our own embeddings. Mm-hmm. That that's turned out to be pretty good for us. I don't think they have the market lead on embeddings as they have for LLMs.

I think that's basically it. Um, so here's, here's an interesting one, which is actually, it gets a little bit ethical in a way. Shiv is asking, do you think that a villager in India can use perplexity for their medical needs and then they don't have to access doctors? Uh, so the way I think about, it's like from, I mean, I don't, I don't think it matters whether it's India or America or something.

Um, like I, I use it for medical queries for my own body. Like, um, like I recently, I was on a trip and I, you know, I was trying to do a hike in a cave, and then I hit myself on the wall of the cave and started bleeding a little bit. And then I, I was, I was freaking out and I needed to know a little, very urgently before going to the, uh, medical center that was nearby.

It was like 30, 45 minute drive from there. I had to know something. Right. I was my, I was freaking out. And so, um, I asked perplexity and like it gave me all the detailed explanation of what it means for my symptoms. And, uh, so I would say that immediate first few minutes of like pandemonium you have, like we can really help.

But, um, you would ideally consult. Practitioner and make sure it's perfect. Just don't go blind faith with like what perplexity is saying and like, don't, you know, like, just act on it. But those few moments of panic, you can definitely like rely on us. Um, you know, it, it might be the negative way too. Like it might raise alarm bells in case you might really need to take action and like won't take action.

Yes. So, I, I'm happy with the way I'm using it, and I think the same, I would apply it to the village in India too. Um, and we support other languages that they can ask in their native language. So, uh, I hope they use it. I think it would be great. So the, um, anyone who has Googled, uh, what their. Symptoms mean?

No. That is the fastest way to get very depressed. Very, very bad. I definitely have had that happen to me before. I have a stomach ache and next thing you know, it must be an ulcer or it must be stomach cancer or whatever. So, so hopefully perplexity doesn't go down that rabbit hole. Um, it doesn't because you, you can be more precise.

You can just say, Hey, this is what it is, and like, what can I do? Like, you know, pretty actionably. Mm-hmm. And. Yeah. So, yeah. Yeah, it's, it's a better option also, like, it depends on how, how obsessed people are about their health and so on. Like for example, you get some people are like, they get a rash on their skin and they're like so scared.

It's like something loose skin cancer. Like some people are like, yeah, whatever. It's just gonna be fine in a few days. I'm more of the second kind, but Nice. So, Uh, Goku is asking about latency and yeah. You're saying, uh, curious if perplexity has taken a toll on computational costs since I believe a lot is happening in parallel and results might just be thrown away based on some other signal.

Yeah, yeah. We, we do do a lot of things in parallel and, and, uh, so the question is like, sorry, I, I don't quite get the question. But, um, is, is perplexity taking a toll on computational cost? Um, I think it's kind of fine. Like we, the major computational cost is actually the l m more than the search index.

And uhhuh, once you've built a good vector DB things are actually pretty fast and you can do a lot of things in parallel. And the embedding dimensions of the vector dbs are not that big that. You know, you, you, you could keep all that CPUs, those fine. Yes. That's awesome. Well, dude, this has been incredible.

I am super thankful that you were able to come on here and chat with Yeah, thank for having. Yeah. And show us about perplexity. I'm gonna start using it. I wanna start playing around with it more. And I, uh, I also know there's a ton of people that are asking questions in the chat, so if you get a minute, I'll drop the link to the chat and then you can answer the other questions that are coming through there.

And, Thank you so much, Harvin. Thank you. Uh, I'll talk to you later, ma'am. See you.

+ Read More
Sign in or Join the community

Create an account

Change email
I agree to MLOps Community’s Code of Conduct and Privacy Policy.

Watch More

Posted Aug 30, 2021 | Views 465
# Vector Embeddings
# Vector Search
# Pinecone
Posted Apr 27, 2023 | Views 945
# LLM in Production
# Anzen