deep-ignorance-unfiltered_unlearned_rmu

This model was created by fine-tuning EleutherAI/deep-ignorance-unfiltered with the Representation Misdirection Unlearning (RMU) algorithm, following Li et al. 2024. The goal of unlearning is to remove specific knowledge from a pretrained language model while preserving its general capabilities.
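As a rough illustration of the RMU objective, the sketch below computes a two-term loss on toy activations: a forget term that pushes the updated model's activations on forget data toward a scaled random unit vector, and a retain term, weighted by alpha, that keeps activations on retain data close to those of a frozen copy of the model. This is a minimal sketch and not the training code used for this model. The function names, the pure-Python list representation, and the element-wise mean used for the error are all assumptions made for illustration.

```python
import math
import random


def random_unit_vector(dim, seed=42):
    """Sample entries uniformly from [0, 1) and normalize to unit length."""
    rng = random.Random(seed)
    raw = [rng.random() for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in raw))
    return [v / norm for v in raw]


def mse(a, b):
    """Mean squared error over all elements of two equally shaped
    lists of per-token activation vectors."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += (x - y) ** 2
            count += 1
    return total / count


def rmu_loss(updated_forget, updated_retain, frozen_retain,
             control_vec, steering_coeff=20.0, alpha=1000.0):
    """Hypothetical helper: forget term plus alpha-weighted retain term.

    updated_forget: activations of the model being trained on forget data
    updated_retain: activations of the model being trained on retain data
    frozen_retain:  activations of the frozen model on the same retain data
    """
    # Every forget-token activation is steered toward the same scaled vector.
    target = [[steering_coeff * v for v in control_vec] for _ in updated_forget]
    forget_loss = mse(updated_forget, target)
    retain_loss = mse(updated_retain, frozen_retain)
    return forget_loss + alpha * retain_loss
```

The defaults for `steering_coeff` and `alpha` mirror the values in the hyperparameter table; in practice the activations would be read from the layers listed under Layer IDs.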

Hyperparameters

Parameter	Value
Base model	`EleutherAI/deep-ignorance-unfiltered`
Unlearning method	`Representation Misdirection Unlearning`
Learning rate	`2e-05`
Epochs	`1`
Batch size	`32`
Max sequence length	`2048`
Optimizer	`adamw`
Gradient clipping	`1.0`
Gradient accumulation steps	`1`
Seed	`42`
W&B / run name	`rmu__ep1_lr2e-05_bs32_a1000.0_sc20.0_ly11-12-13_mle2048_mli1024`
Alpha (retain weight)	`1000.0`
Steering coefficient	`20.0`
Layer IDs	`11,12,13`
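For reference, the table values can be collected into a small configuration object. The sketch below is illustrative only: the class name `RMUConfig` and its field names are hypothetical, and the derived quantities assume the listed batch size is the total per optimizer step on a single device, which the table does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RMUConfig:
    """Hypothetical container for the hyperparameters listed in the table."""
    learning_rate: float = 2e-05
    epochs: int = 1
    batch_size: int = 32
    max_seq_length: int = 2048
    grad_accum_steps: int = 1
    grad_clip: float = 1.0
    seed: int = 42
    alpha: float = 1000.0           # retain weight
    steering_coeff: float = 20.0
    layer_ids: tuple = (11, 12, 13)

    @property
    def effective_batch_size(self):
        # Sequences contributing to one optimizer step
        # (assumes a single device; not stated in the table).
        return self.batch_size * self.grad_accum_steps

    @property
    def max_tokens_per_step(self):
        # Upper bound: every sequence padded or truncated to the max length.
        return self.effective_batch_size * self.max_seq_length


cfg = RMUConfig()
print(cfg.effective_batch_size, cfg.max_tokens_per_step)
```

With gradient accumulation set to 1, the effective batch size equals the listed batch size, so each optimizer step sees at most 32 sequences of up to 2048 tokens.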

Safetensors

Model size: 7B params
Tensor type: BF16

Paper for girishgupta/deep-ignorance-unfiltered_unlearned_rmu

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning. arXiv:2403.03218, published Mar 5, 2024.