MLOps Community

Applying Differential Privacy (DP) to LLM Prompts While Maintaining Accuracy // Aishwarya Ramasethu // DE4AI

Posted Sep 18, 2024 | Views 685
SPEAKER
Aishwarya Ramasethu
AI Engineer @ Prediction Guard

Aishwarya is an AI engineer at Prediction Guard, where she focuses on enhancing the privacy and security of LLMs. Her responsibilities include developing techniques for masking Personally Identifiable Information (PII) in user prompts, identifying prompt injections, and researching relevant Privacy Enhancing Technologies (PETs). Aishwarya previously worked as a data scientist in industry and a researcher at Purdue University, where she focused on constructing optimal discrete noise to be added to data (maximizing both privacy budget and utility). With 6+ years of experience in analytics, AI, and data science, Aishwarya’s work has led to both significant research contributions and real-world insights for companies in retail and healthcare.

SUMMARY

Utilizing LLMs in high-impact scenarios (e.g., healthcare) remains difficult due to the necessity of including private/sensitive information in prompts. In many scenarios, AI/prompt engineers might want to include few-shot examples in prompts to improve LLM performance, but the relevant examples are sensitive and need to be kept private. Any leakage of PII or PHI into LLM outputs could result in compliance problems and liability. Differential Privacy (DP) can help mitigate these issues. The Machine Learning (ML) community has recognized the importance of DP in statistical inference, but its application to generative models, like LLMs, remains limited. This talk will introduce a practical pipeline for incorporating synthetic data into prompts, offering robust privacy guarantees. This approach is also computationally efficient when compared to other approaches like privacy-focused fine-tuning or end-to-end encryption. I will demonstrate the pipeline, and I will also examine the impact of differentially private prompts on the accuracy of LLM responses.

TRANSCRIPT

Skylar [00:00:12]: All right, we've got our first talk coming up on track three. Super excited for this. We're gonna let her kick it off. We have an exciting talk. I'm so happy you're here. You can go ahead and introduce yourself and get started.

Aishwarya Ramasethu [00:00:32]: Hi everyone, I'm Aishwarya. I'm an AI engineer at Prediction Guard. Really excited to explore this topic in more depth today. Let's get started. So as AI becomes more ingrained in our lives, privacy concerns are growing too. How can we make the most of these models while keeping data secure? That's where differential privacy comes in. It's designed to keep individual data points anonymous, even while getting accurate results from large language models. Today I'll be talking about how we can use differential privacy to create synthetic few-shot prompts for various tasks, ensuring privacy without losing performance. I'll start by discussing some common attacks, especially when using third-party APIs with limited visibility, also referred to as black-box APIs, to highlight why privacy matters, not just during training or fine-tuning, but even while merely prompting.

Aishwarya Ramasethu [00:01:41]: From there, we'll build towards a solution. I'll give a quick overview of differential privacy and then dive into how it can be applied to generate few-shot prompts. Language models are known to memorize and accidentally reveal sensitive information that was used in training. As an example, in GitHub Copilot, researchers were able to extract GitHub usernames. Now, this is what a typical GitHub URL looks like. Researchers were then able to perform some prompt engineering to extract emails, locations, API keys, and other sensitive information associated with those usernames. Next, there is another type of attack where attackers use a range of techniques, including social engineering, to orchestrate attacks. LLMs rely on retrieving data, and this makes them vulnerable to attacks where data is poisoned.

Aishwarya Ramasethu [00:02:46]: An attacker might embed a malicious prompt in a public code base, knowing users might copy it into chat tools. These tools are then prompted to visit specific websites based on user data already stored, slowly leaking private information. Now here is another attack: you could unintentionally leak sensitive information just through prompting. For example, if a company X is planning staff layoffs and HR uses a black-box API LLM to draft the internal letter, an attacker could exploit this by sending prompts with details they suspect might be part of this communication. The model will then respond more confidently to prompts it has already seen, and this ends up revealing some sensitive internal information. Now we've established that simply prompting an LLM with sensitive data can pose risks. Imagine you have a dataset containing patient information with corresponding diagnoses, and you want to use it to generate a diagnosis for a new patient.

Aishwarya Ramasethu [00:03:58]: A common approach is to create a few short prompt, like the example here on the right, and obtain a classification from the LLM. This method is computationally efficient and avoids the need for further fine tuning. Now let's explore various techniques on how to do this so that the private information is not revealed, but the performance of the task is unaffected. One straightforward approach would be to use zero short prompts, which don't rely on additional examples like the few short prompts do. However, these might not be as accurate as including few short prompts in example has been shown to be way more accurate. Next, you could also try anonymizing the PII in the few short prompts. But anonymization alone again isn't enough, especially when other sensitive data like medical records are involved. In many cases, this information can still be enough to uniquely identify individuals.

Aishwarya Ramasethu [00:05:01]: Another option is to use a trusted open source LLM to generate synthetic prompts, either by excluding sensitive information or including it for better accuracy. However, in this approach, there is no way to fully quantify privacy, meaning the model could still potentially reveal sensitive details. Now we move on to how exactly to build the solution. This establishes that we need a solution where privacy can be controlled. When we adjust the pipeline of LLMs generating output by adding noise to prediction probabilities before generating the prompt, we create a synthetic prompt where privacy is controlled. This ensures that sensitive information is protected while still allowing the model to generate useful results. Differential privacy is a technique that ensures individual information cannot be identified using the model outputs. In the context of our example, by adding noise to the prediction probabilities before generating the new prompt, we make it difficult for an attacker to infer any specific details about the underlying data, such such as the private medical records.

Aishwarya Ramasethu [00:06:12]: This way, even if an attacker tries to reverse engineer the prompt, the added noise ensures that individual data points remain anonymous. Now let's look at how the noise is added. LLMs process text by first encoding the input into numerical representations, which then pass through multiple model layers. The model then generates logits for the next tokens, which are converted into probabilities. From these probabilities, the next token is selected using methods like topk or top t sampling, and then the process is continued iteratively until the entire output is processed. To make the prompts private, we'll add noise to the probabilities once they are generated. So right here in this step, here's how the process looks like with noise included. By adjusting the noise parameter, we can control the level of privacy.

Aishwarya Ramasethu [00:07:05]: That is, the more noise that is added, the higher the privacy will be. Now, I'm going to walk you through a short demo. So here's the code to see the synthetic prompt generation in action using a publicly available classification data set to build this entire pipeline. As you can see, the generated prompt closely maintains the structure of the original prompt. And over here you can see how, for the athlete example, the. I'm sorry, let me just go back to the demo. For the athlete example, you can see that the generated synthetic prompt, it does contain the same underlying structure like when the athlete was born and what games he participated in. But then the personal information of the athlete is completely changed.

Aishwarya Ramasethu [00:08:10]: So how can this be extended? This technique can now be applied to rag based applications to prevent data poisoning attacks and protect sensitive documents using rag chats. This ongoing research also to explore how sensitive prompt embeddings can be updated with privacy guarantees. Additionally, for training, a common approach is to use teacher models trained on distributed data, which helps preserve privacy during the learning process. So that's it for my talk. If you have any more questions, please do reach out. Thank you.

