MLOps Community
+00:00 GMT
Sign in or Join the community to continue

FrugalGPT: Better Quality and Lower Cost for LLM Applications

Posted Aug 22, 2023 | Views 518
# FrugalGPT
# Fine-tuning LLMs
# QuantumBlack
# Stanford University
Share
speakers
avatar
Lingjiao Chen
PhD candidate @ Stanford University

Lingjiao Chen is a Ph.D. candidate in the computer sciences department at Stanford University. He is broadly interested in machine learning, data management, and optimization. Working with Matei Zaharia and James Zou, he is currently exploring the fast-growing marketplaces of artificial intelligence and data. His work has been published at premier conferences and journals such as ICML, NeurIPS, SIGMOD, and PVLDB, and partially supported by a Google fellowship.

+ Read More
avatar
Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.

+ Read More
SUMMARY

There is a rapidly growing number of large language models (LLMs) that users can query for a fee. We review the cost associated with querying popular LLM APIs, e.g. GPT-4, ChatGPT, J1-Jumbo, and find that these models have heterogeneous pricing structures, with fees that can differ by two orders of magnitude. In particular, using LLMs on large collections of queries and text can be expensive. Motivated by this, we outline and discuss three types of strategies that users can exploit to reduce the inference cost associated with using LLMs: 1) prompt adaptation, 2) LLM approximation, and 3) LLM cascade. As an example, we propose FrugalGPT, a simple yet flexible instantiation of LLM cascade which learns which combinations of LLMs to use for different queries in order to reduce cost and improve accuracy. Our experiments show that FrugalGPT can match the performance of the best individual LLM (e.g. GPT-4) with up to 98% cost reduction or improve the accuracy over GPT-4 by 4% with the same cost. The ideas and findings presented here lay a foundation for using LLMs sustainably and efficiently.

+ Read More

Watch More

35:23
Building LLM Applications for Production
Posted Jun 20, 2023 | Views 10.7K
# LLM in Production
# LLMs
# Claypot AI
# Redis.io
# Gantry.io
# Predibase.com
# Humanloop.com
# Anyscale.com
# Zilliz.com
# Arize.com
# Nvidia.com
# TrueFoundry.com
# Premai.io
# Continual.ai
# Argilla.io
# Genesiscloud.com
# Rungalileo.io