laland19/attackbenchmark


laland19 committed on Feb 22

Commit 6e13352 · verified · 1 Parent(s): 71d1249

Create README.md

# AttackBench: NLP Model Membership Inference Attack Benchmark

AttackBench is a unified benchmark for evaluating Membership Inference Attacks (MIA) against NLP models (BERT, GPT-2, Qwen) at different defense levels.
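MIA performance in this benchmark is reported as AUC. The README does not say how attack scores are produced, so the following is only a minimal sketch under an assumption: each example gets a score where higher means "more likely a training member" (for instance, the negative loss of the victim model). AUC is then the probability that a randomly chosen member outranks a randomly chosen non-member. The function name `mia_auc` and the toy scores are illustrative and do not come from the repository.

```python
from typing import Sequence


def mia_auc(member_scores: Sequence[float], nonmember_scores: Sequence[float]) -> float:
    """Rank-based AUC: P(member score > non-member score), ties count as 0.5.

    Assumption: a higher score means the attack believes the example
    was part of the training set.
    """
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


if __name__ == "__main__":
    # Toy per-example losses (made up for illustration, not benchmark results).
    member_losses = [0.2, 0.4, 0.9]
    nonmember_losses = [0.5, 1.1, 1.3]
    # Lower loss suggests membership, so negate the loss to get a score.
    auc = mia_auc([-x for x in member_losses], [-x for x in nonmember_losses])
    print(f"MIA AUC: {auc:.3f}")
```

An AUC of 0.5 corresponds to random guessing, so values near 0.5 indicate that the attack cannot separate members from non-members. The pairwise loop is quadratic in the number of examples; for large evaluation sets a sorted-rank implementation or a library routine would be a better fit.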

## 📋 Model Zoo & Experiment Levels

The repository contains pre-trained victim models and shadow models categorized by their defense levels:

| Filename | Model | Domain | Level | Description |
| :--- | :--- | :--- | :--- | :--- |
| `bert-news-overfit.tar.gz` | BERT | News | **L1** | Overfitted model (Vulnerable) |
| `bert-news-standard.tar.gz` | BERT | News | **L2** | Standard fine-tuned model |
| `bert-news-dp.tar.gz` | BERT | News | **L3** | **DP-SGD Defense ($\epsilon \approx 0.18$)** |
| `gpt2-news-overfit.tar.gz` | GPT-2 | News | **L1** | Overfitted model (Vulnerable) |
| `gpt2-news-standard.tar.gz` | GPT-2 | News | **L2** | Standard fine-tuned model |
| `gpt2-news-dp.tar.gz` | GPT-2 | News | **L3** | **DP-SGD Defense ($\epsilon \approx 0.18$)** |
| `gpt2-medical-overfit.tar.gz` | GPT-2 | Medical | **Cross** | Cross-domain robustness test |

## 🚀 Usage

These models are provided as **LoRA adapters**. To use them, you need to load the base model first and then apply the adapter.
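Because the adapters ship as `.tar.gz` archives, they have to be unpacked before they can be loaded. The sketch below shows one way to do that with Python's standard `tarfile` module. The helper name `extract_adapter` and the `./models` destination are assumptions for illustration, and the internal layout of the archives is not documented here, so check the returned member names to find the adapter directory.

```python
import tarfile
from pathlib import Path


def extract_adapter(archive_path: str, dest_dir: str = "./models") -> list[str]:
    """Unpack a .tar.gz archive into dest_dir and return its member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    root = dest.resolve()
    with tarfile.open(archive_path, mode="r:gz") as archive:
        names = archive.getnames()
        # Refuse members that would land outside dest_dir
        # (absolute paths or ".." components).
        for name in names:
            target = (dest / name).resolve()
            if not target.is_relative_to(root):
                raise ValueError(f"unsafe path in archive: {name}")
        archive.extractall(dest)
    return names


if __name__ == "__main__":
    # Assumes the archive has already been downloaded next to this script.
    members = extract_adapter("gpt2-news-dp.tar.gz", "./models")
    print(members[:5])  # inspect the layout to locate the adapter directory
```

The sketch requires Python 3.9 or newer (`Path.is_relative_to` and the `list[str]` annotation).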

### Example Code:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_path = "gpt2"  # or the path to a local copy of GPT-2
adapter_path = "./models/gpt2-news-dp"  # extract the .tar.gz archive first

tokenizer = AutoTokenizer.from_pretrained(base_model_path)
model = AutoModelForCausalLM.from_pretrained(base_model_path)
model = PeftModel.from_pretrained(model, adapter_path)
```

## 📊 Key Findings

- **Privacy Gain:** DP-SGD reduces the MIA AUC from ~0.75 (Standard) to ~0.52 (Defended), close to the 0.5 of random guessing.
- **Utility Cost:** The defense comes with a ~60% increase in perplexity (PPL) and potential factual errors in generation.

## ✉️ Contact

Yang Xianzhuang & Project Team: 1442427183@qq.com

Files changed (1)

README.md +18 -0

	@@ -0,0 +1,18 @@

+---
+language:
+- en
+- zh
+license: mit
+tags:
+- membership-inference-attack
+- nlp-privacy
+- differential-privacy
+- lora
+- privacy-benchmark
+datasets:
+- ag_news
+- medical-dialogue
+metrics:
+- auc
+- ppl
+---