MLOps Community

LLM Avalanche


September 12, 2023
Demetrios Brinkmann

At the end of June, I flew out to San Francisco to do three things:

  1. Present at Data & AI Summit
  2. Host the LLM Avalanche event
  3. Get free swag

I want to break down LLM Avalanche. Aside from being basically a mini-conference that we called a meetup, there were incredible learnings. It would be unfair if I did not surface my key takeaways from the jam-packed event.

Happy to announce, mission accomplished.

But let’s do it in story form. Let’s start by painting the scene.

On a brisk Monday evening, I was walking from a hotel on Mission Street to the Contemporary Jewish Museum. My head was firmly focused on the sidewalk below me. I am not a fan of stepping on random needles.

I heard a commotion and looked up.

“Holy smokes,” I muttered.

There were 400 people standing in a line that wrapped around the plaza, each eager to learn from the 40 speakers set to talk that night.

Social anxiety kicked in for a moment. Not wanting to stand out, I got in the back of the line.

That didn’t last long. Paranoia of missing the panel I was set to host outweighed the fear of being called out for skipping the line. I managed to sneak my way to the front. I walked past “security” with a head nod.

They had no idea who I was. But I felt special.

As soon as I got in, the first thing I noticed was a buzz in the air! All in all, we had 900+ people show up for the event. I didn’t have much time to check out the venue because my mind was firmly focused on getting some quality swag.

Let me tell you, Carly (the one behind LLM Avalanche swag and co-organizer) did not disappoint. As I was in the middle of stuffing my backpack full of free stuff, I heard whispers that Matei Zaharia was about to get on stage.

As I exited the speaker room, my adrenaline spiked! The venue was completely full now. All those waiting in line were now inside. I swam through the sea of people in search of the auditorium.

On my way there, I passed the startup showcase area with booths showing new file storage from LanceDB and cool audio-to-text demos from Deepgram.

Doors to the main stage were shut. The team wasn’t allowing anyone else in. I had to convince security I was part of the organizing team just to be let in and catch the last bits of Matei’s talk.

I stood there taking in the sheer madness of it all. Three weeks prior Denny asked if I was up for helping him organize a “meetup” in San Francisco. After 13 hours of travel and a hard case of jetlag, I was gearing up to host two panels with some of AI’s royalty.

One thing’s for sure. The dark lighting in the auditorium wasn’t helping my jetlag.

I had the urge to check out the lightning talks before getting on stage for the panel. As I was walking upstairs, my friend Jack stopped me and said, “Dude, Holden is in the crowd watching the lightning talks like just another engineer.”

“Oh yeah?” I replied. “It’s kinda like when the Beatles went to a Jimi Hendrix concert.”

Kinda…

I caught Shreya’s lightning talk before heading downstairs to check out the panel Alexy was hosting on Performance.

The lineup was stacked:

  1. Beyang Liu, CTO, co-founder of Sourcegraph
  2. Greg Diamos, Co-founder and CTO of Lamini
  3. Ankit Mathur, Software Engineer at Databricks
  4. David Kanter, Executive Director at MLCommons
  5. Chip Huyen, Co-founder at Claypot AI

Here are my 3 key takeaways from the panel:

1. Performance is a complex and multivariate issue: It’s not just about the language model itself but also the interaction between the model and other systems.

Agents that combine language models with tools, APIs, and example code are being explored to enhance performance. However, as soon as you need to make more than two hops for an agent, performance drops significantly.

2. The rise of open source and technological innovations: The open-source community is making remarkable strides in low-precision training and inference technologies.

Techniques like 4-bit and 8-bit floating point are being explored to reduce compute and memory demands, making models more accessible for hobbyists and cloud computing.

3. The future of language model interaction: Prompt engineering has been a significant factor in determining search results, but new methods of interacting with language models are emerging.

The DSP framework from Stanford allows users to provide examples and define the search space, letting the model predict the desired outcome. Visual input is also being considered for future interactions.
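For anyone wondering what "low precision" buys you in takeaway 2, here is a rough sketch of the general idea. To be clear about assumptions: nobody on the panel showed this code, and it uses symmetric 8-bit integer quantization, a simpler relative of the 4-bit and 8-bit floating-point formats mentioned above. The function names and the sample weights are made up for illustration.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] plus one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in quantized]


# Hypothetical weights, purely for illustration.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight now fits in one byte instead of four, at the cost of a small rounding error. That trade is what makes bigger models fit on smaller hardware.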

The panel ended just as fast as it started.

I jumped back outside to socialize. By this time food had arrived. I grabbed a plate of hummus and tzatziki and wandered over to track three for more lightning talks.

Mihail was in the middle of talking about how hard it is to build a viable business on top of Midjourney due to the lack of an API. Charles jumped up to show the WeightWatcher benchmark he created.

It was time. My panel on risk was about to start.

“Please don’t let me fall asleep on stage,” I prayed as I made my way back to the auditorium.

The risk panel didn’t disappoint. How could it with the panelists we had:

Harrison Chase, Co-Founder and CEO at LangChain

Ben Harvey, Founder & CEO of AI Squared

Aakanksha Chowdhery, Staff Research Scientist at Google DeepMind

Yaron Singer, CEO at Robust Intelligence and Professor of Computer Science at Harvard University

Here are some of the major themes from the conversation.

1. Addressing Overconfidence: LLMs have a tendency to be overconfident in their answers, often leading to inaccurate information. To mitigate this, researchers are actively exploring techniques such as confidence calibration and retrieval augmentation.

2. Human Expert Review: The most effective deployment of LLMs involves having a human expert in the loop to review the model’s outputs. Whether it’s generating text or code, the expert’s critical evaluation is crucial in ensuring accuracy and reliability.

3. Being Mindful About Application Selection: It is essential to consider the applications LLMs are used for. Not all tasks can tolerate model hallucinations, and it is acceptable to opt out of using LLMs in such cases. Nobody is forcing you to use AI in your product. Think long and hard about what the implications are before throwing it into the mix.

4. Building Trust and Actionability: We have all seen answers go wrong. Trust, both from end users and from the team integrating the LLM into the product, plays a crucial role in the adoption of LLMs. To increase trust, it is important to make the results more actionable, relevant, and timely, and to provide additional context. The more we can get LLMs to cite sources and explain why they came up with certain answers, the better.

5. Three Categories of Risk: The panel discussed three main categories of risk associated with LLMs: operational risk, ethical risk, and security and privacy risks.

6. UI/UX and Communication: The panel emphasized the importance of improving user interface (UI) and user experience (UX) for LLMs. Showing intermediate steps and retrieved documents can build trust and allow users to validate answers. New UI frameworks are seen as potential avenues for improvement. It doesn’t always have to be a chatbot!

7. Evaluate Tools Thoughtfully: Relying solely on LLMs to evaluate their own output may not be accurate. The panel stressed the need to be thoughtful, build tools specifically for the task, and avoid common pitfalls. If an LLM lied once, what’s going to stop it from lying again?
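Theme 1 mentions confidence calibration, so here is one way you might measure overconfidence yourself. This is a minimal sketch of expected calibration error (ECE), a common calibration metric, and not something the panelists presented. It assumes you already have a confidence score between 0 and 1 for each answer plus a human judgment of whether the answer was right. The function name, bin count, and sample data are all hypothetical.

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """Weighted average gap between stated confidence and actual accuracy."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 belongs in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += len(bucket) / total * abs(avg_conf - accuracy)
    return ece


# Hypothetical data: the model claims ~93% confidence but is right half the time.
confidences = [0.95, 0.90, 0.92, 0.97]
correct = [True, False, False, True]
ece = expected_calibration_error(confidences, correct)
print(round(ece, 3))
```

A score near zero means the stated confidence tracks reality. A large score, like the one in this toy example, is the overconfidence the panel was warning about. Note that the "correct" labels here come from a human reviewer, which fits themes 2 and 7.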

I later found out the head of NSA data strategy was in the audience. I probably wouldn’t have made that joke about hallucinating if I had known that.

The staff was quick to kick us out at 10pm on the dot. After it was all said and done, we raised 21k USD for charity thanks to the generous donations from the Databricks team.

Caption: Co-organizers Alexy Khrabrov, Carly Akerly, Denny Lee, and me for the event finale selfie

Want to see more from the event? You can view all the pictures and videos from the event at LLM Avalanche.
