Official code for the SatML'24 paper "Shake to Leak: Fine-tuning Diffusion Models Can Amplify the Generative Privacy Risk", by Zhangheng Li, Junyuan Hong, Bo Li, Zhangyang Wang.
While diffusion models have recently demonstrated remarkable progress in generating realistic images, they also raise privacy risks: published models or APIs can regenerate training images and thus leak privacy-sensitive training information. In this paper, we reveal a new risk, Shake-to-Leak (S2L): fine-tuning a pre-trained model on manipulated domain-specific data can amplify these existing privacy risks. For example, when prompted with "a photo of Joe Biden", the pre-trained diffusion model does not leak private training images, but many are leaked after S2L fine-tuning. The figure on the right shows the main steps of S2L, which is generally applicable across fine-tuning and attack methods: (1) S2L first generates a synthetic private (SP) set P using the pre-trained diffusion model; (2) S2L then fine-tunes the pre-trained diffusion model on P with an existing fine-tuning method. After S2L, the attacker can extract private information with existing attack methods.
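For intuition, the snippet below sketches this before/after comparison with the diffusers library. It is only an illustration: the checkpoint ID and the fine-tuned model path are assumptions, not files shipped with this repo.

```python
import torch
from diffusers import StableDiffusionPipeline

prompt = "a photo of Joe Biden"  # privacy-sensitive target domain

# Pre-trained model (checkpoint ID assumed; see the setup steps below).
pretrained = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-1", torch_dtype=torch.float16).to("cuda")
# An S2L fine-tuned copy of the same model (hypothetical output path).
finetuned = StableDiffusionPipeline.from_pretrained(
    "./output/joe-biden-lora-db", torch_dtype=torch.float16).to("cuda")

before = pretrained(prompt, num_images_per_prompt=4).images  # typically no training images reproduced
after = finetuned(prompt, num_images_per_prompt=4).images    # after S2L, leakage becomes far more likely
```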
Prepare environment.
conda create -n s2l python=3.8 && conda activate s2l
git clone https://github.com/VITA-Group/Shake-to-Leak
cd Shake-to-Leak
pip install -r requirements.txt
The S2L fine-tuning experiments are built on the peft library and the Stable Diffusion v1-1 (SD-v1-1) model.
Step 1: Generate the SP set
python sp_gen.py
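For reference, the following is a rough approximation of what this step does (the authoritative logic lives in sp_gen.py): sample images for the target prompt from the pre-trained model and save them as the fine-tuning set P. The checkpoint ID, sample count, and output layout are assumptions.

```python
import os
import torch
from diffusers import StableDiffusionPipeline

def generate_sp_set(prompt, out_dir, n_images=200):
    """Sample images for `prompt` from the pre-trained model and save them as the set P."""
    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-1", torch_dtype=torch.float16).to("cuda")
    os.makedirs(out_dir, exist_ok=True)
    for i in range(n_images):
        g = torch.Generator(device="cuda").manual_seed(i)  # one seed per sample for reproducibility
        pipe(prompt, generator=g).images[0].save(os.path.join(out_dir, f"{i:04d}.png"))

generate_sp_set("a photo of Joe Biden", "sp_set/joe_biden")
```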
Step 2: Fine-tune the model on the SP set with one of the methods below (a minimal LoRA sketch follows the list). All commands are run from the experiment folder.
- LoRA+DB:
./scripts/lora_db.sh <domain name, e.g.: "Joe Biden">
- DB:
./scripts/db.sh <domain name, e.g.: "Joe Biden">
- LoRA:
./scripts/lora.sh <domain name, e.g.: "Joe Biden">
- End2End:
./scripts/end2end.sh <domain name, e.g.: "Joe Biden">
- Batch fine-tuning on all domains:
./scripts/batch_finetune.sh <script name, e.g.: lora_db.sh>
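For orientation, here is a minimal sketch of the LoRA branch of this step with the peft library, assuming the diffusers SD-v1-1 checkpoint. Data loading and the DreamBooth prior-preservation loss are omitted, so this illustrates the general recipe rather than the scripts above.

```python
import torch
import torch.nn.functional as F
from diffusers import StableDiffusionPipeline, DDPMScheduler
from peft import LoraConfig, get_peft_model

model_id = "CompVis/stable-diffusion-v1-1"  # assumed SD-v1-1 checkpoint
pipe = StableDiffusionPipeline.from_pretrained(model_id).to("cuda")
unet, vae, text_encoder, tokenizer = pipe.unet, pipe.vae, pipe.text_encoder, pipe.tokenizer
noise_scheduler = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")

# Inject LoRA adapters into the UNet attention projections; only these are trained.
unet = get_peft_model(unet, LoraConfig(r=4, lora_alpha=4,
                                       target_modules=["to_q", "to_k", "to_v", "to_out.0"]))
optimizer = torch.optim.AdamW((p for p in unet.parameters() if p.requires_grad), lr=1e-4)

def training_step(pixel_values, prompt):
    """One denoising-loss step on a batch of SP images ([B, 3, 512, 512] in [-1, 1], on GPU)."""
    with torch.no_grad():
        latents = vae.encode(pixel_values).latent_dist.sample() * vae.config.scaling_factor
        ids = tokenizer([prompt] * latents.shape[0], padding="max_length",
                        max_length=tokenizer.model_max_length,
                        return_tensors="pt").input_ids.to(latents.device)
        cond = text_encoder(ids)[0]
    noise = torch.randn_like(latents)
    t = torch.randint(0, noise_scheduler.config.num_train_timesteps,
                      (latents.shape[0],), device=latents.device)
    noisy_latents = noise_scheduler.add_noise(latents, noise, t)
    noise_pred = unet(noisy_latents, t, cond).sample
    loss = F.mse_loss(noise_pred, noise)  # standard denoising objective on the SP set
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```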
Step 3: Conduct attacks
- MIA, based on code from "Are Diffusion Models Vulnerable to Membership Inference Attacks?" (Duan et al., 2023); a simplified membership-score sketch follows this list.
./scripts/secmi_sd_laion.sh <domain name, e.g.: "Joe Biden">
# Batch MIA attack on all domains
./scripts/batch_mia_attack.sh
- Data extraction, implemented based on "Extracting Training Data from Diffusion Models" (Carlini et al., 2023); a rough extraction-loop sketch follows this list.
python data_extraction.py --domain=<domain name, e.g.: "Joe Biden">
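As a reference for what the MIA step measures, below is a simplified loss-threshold membership score in the spirit of SecMI (Duan et al., 2023): images whose denoising loss under the fine-tuned model is unusually low are flagged as likely training members. This is an illustrative approximation, not the secmi_sd_laion.sh implementation.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def mia_score(pipe, noise_scheduler, image, prompt, t=100, seed=0):
    """Denoising error of a candidate image; lower = more likely a training member.
    `image`: [1, 3, 512, 512] tensor in [-1, 1] on the same device as `pipe`."""
    torch.manual_seed(seed)  # fixed noise so scores of different images are comparable
    latents = pipe.vae.encode(image).latent_dist.sample() * pipe.vae.config.scaling_factor
    ids = pipe.tokenizer([prompt], padding="max_length",
                         max_length=pipe.tokenizer.model_max_length,
                         return_tensors="pt").input_ids.to(image.device)
    cond = pipe.text_encoder(ids)[0]
    noise = torch.randn_like(latents)
    timestep = torch.full((1,), t, device=image.device, dtype=torch.long)
    noisy = noise_scheduler.add_noise(latents, noise, timestep)
    pred = pipe.unet(noisy, timestep, cond).sample
    return F.mse_loss(pred, noise).item()

# Decision rule: flag `image` as a member if mia_score(...) < tau for a calibrated threshold tau.
```

Similarly, a rough sketch of the extraction idea from Carlini et al. (2023): sample many images for the target prompt from the fine-tuned model and flag clusters of near-duplicate generations as likely memorized training images. The helper name and thresholds are hypothetical and only approximate the idea behind data_extraction.py.

```python
import numpy as np
import torch

def extract_candidates(pipe, prompt, n_samples=500, dist_threshold=0.05):
    """Sample `n_samples` images and return generations that have >= 2 near-duplicates."""
    images = []
    for i in range(n_samples):
        g = torch.Generator(device="cuda").manual_seed(i)
        img = pipe(prompt, generator=g).images[0]
        images.append(np.asarray(img, dtype=np.float32) / 255.0)
    flat = np.stack([im.reshape(-1) for im in images])  # [N, H*W*3]
    candidates = []
    for i in range(n_samples):
        d = np.linalg.norm(flat - flat[i], axis=1) / np.sqrt(flat.shape[1])  # normalized L2
        d[i] = np.inf
        if (d < dist_threshold).sum() >= 2:  # many near-identical generations => likely memorized
            candidates.append(images[i])
    return candidates
```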