Instructions to use DarkArtsForge/Savage-Sands-12B with libraries, inference providers, notebooks, and local apps.

Libraries

How to use DarkArtsForge/Savage-Sands-12B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="DarkArtsForge/Savage-Sands-12B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result; a bare expression only shows output in a notebook or REPL
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("DarkArtsForge/Savage-Sands-12B")
model = AutoModelForCausalLM.from_pretrained("DarkArtsForge/Savage-Sands-12B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use DarkArtsForge/Savage-Sands-12B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "DarkArtsForge/Savage-Sands-12B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DarkArtsForge/Savage-Sands-12B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
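
The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` and the `max_tokens` default are illustrative assumptions, not part of vLLM. It assumes a server is already listening on the address used in the curl call above.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8000",
                       model="DarkArtsForge/Savage-Sands-12B", max_tokens=128):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Building the request needs no running server; calling chat() does.
request = build_chat_request("What is the capital of France?")
print(request.get_method(), request.full_url)
```

With a server running, `chat("What is the capital of France?")` returns the reply text. Passing `base_url="http://localhost:30000"` should target an SGLang server started on port 30000 instead.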

Use Docker

docker model run hf.co/DarkArtsForge/Savage-Sands-12B

SGLang

How to use DarkArtsForge/Savage-Sands-12B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "DarkArtsForge/Savage-Sands-12B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DarkArtsForge/Savage-Sands-12B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "DarkArtsForge/Savage-Sands-12B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DarkArtsForge/Savage-Sands-12B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use DarkArtsForge/Savage-Sands-12B with Docker Model Runner:
```
docker model run hf.co/DarkArtsForge/Savage-Sands-12B
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

⚠️ Warning: This model can produce narratives and RP that contain violent and graphic erotic content. Adjust your system prompt accordingly, and use the ChatML chat template.
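
For readers unfamiliar with ChatML, the sketch below shows the prompt layout that format produces, assuming the conventional `<|im_start|>` and `<|im_end|>` markers. The helper name `to_chatml` and the example system prompt are illustrative only. In practice `tokenizer.apply_chat_template` is the safer route, since it uses whatever template ships with the tokenizer.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    # Hypothetical system prompt; adjust to the content you want.
    {"role": "system", "content": "You are the narrator of a desert adventure."},
    {"role": "user", "content": "Who are you?"},
]
prompt = to_chatml(messages)
print(prompt)
```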


🏜️ Savage Sands 12B

This image depicts a high-stakes confrontation in a desolate, sandy environment. At the center of the action is a fierce duel between a heavily armored human warrior and a monstrous, winged demon. The warrior, seen from the side, is equipped with a gold-colored Trojan-style helmet. He lunges forward, gripping a gladius—a short, broad-bladed Roman sword—with both hands. His armor is dark and intricate, consisting of layered plates and a tattered, loincloth-like skirt that fans out with his movement.

Facing him is a formidable demon. The creature has pale, muscular gray skin and massive, leathery bat-like wings that span a large portion of the background. It has a snarling, animalistic face with horns and is brandishing a long, wicked-looking spear, which it uses to block the warrior's advance. The demon's pose is agile and predatory, mirroring the intensity of the warrior's strike.

The battle takes place on a series of flat, circular stepping stones scattered across a dry, dusty landscape. In the bottom right foreground, another figure is partially visible, huddled on the ground and clad in white robes with a red sash, seemingly caught in the middle of this epic struggle. The background is filled with soft, hazy clouds and distant rock formations, giving the scene a timeless, mythological feel.

🌵 12B Roleplay

This is a merge of pre-trained language models created using mergekit. The main model was merged using the della merge method with unablated donors, and therefore has some refusals.

For a fully uncensored version, see this page, which was manually calibrated with scale 1.7.

This model is highly resistant to standard ablation methods like MPOA, more so than most Nemo 12Bs.

This is strange, considering the della method was used with normalize: false, which is basically a mild form of pre-ablation.

I tested several setups, and while lower values like scale 1.0 or 1.1 unlocked most refusals, there was a particular instruction prompt it still kept refusing.

Even at scale 1.5, it would overtly refuse, and claim it wasn't "safe or appropriate" and that it "couldn't be done".

I had to crank it up even higher in order to fully uncensor the model.

The minimum threshold was identified: scale 1.7 applied to all layers 0-39.

Applying the ablation only to layers 11-39 caused the model to hallucinate a "corrected term" to swap out the correct term I prompted it with, which was another form of covert non-compliance.

Applying it only to layers 4-39 caused additional non-compliant "false corrections". Every layer must be ablated for this model in order to uncensor it. There seem to be hidden forms of covert non-compliance embedded in the earliest layers, and even in lm_head. Despite being low on the graph, these early layers still contribute toward refusal of prompts.

With scale 1.7 on all 40 layers, the model makes a CORRECT correction (grammatical, not changing words) and is FULLY compliant, with no refusals or covert attempts to change topics.

So if you want the model as smart as possible, use the unablated version with a jailbreak. But if you need no refusals out of the box, the MPOA version is also quite creative.
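
To give a feel for what the scale values above mean, here is a toy sketch of plain directional ablation on a small matrix. It is not MPOA itself, whose details are not given here, and the matrix, the "refusal direction", and the function names are all made up for illustration. The assumed update is W' = W - scale * r (r^T W) for a unit direction r.

```python
import math


def ablate(weight, direction, scale=1.0):
    """Return weight - scale * r (r^T weight), with r the unit-normalized direction.

    weight is a list of rows (out_dim x in_dim); direction has length out_dim.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    r = [d / norm for d in direction]
    rows, cols = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along the direction.
    proj = [sum(r[i] * weight[i][j] for i in range(rows)) for j in range(cols)]
    return [
        [weight[i][j] - scale * r[i] * proj[j] for j in range(cols)]
        for i in range(rows)
    ]


def component_along(weight, direction, x):
    """Component of (weight @ x) along the unit-normalized direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    y = [sum(w * xj for w, xj in zip(row, x)) for row in weight]
    return sum(d / norm * yi for d, yi in zip(direction, y))


W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # toy weight matrix
direction = [1.0, 0.0, 1.0]               # hypothetical refusal direction
x = [1.0, -2.0]                           # toy input

before = component_along(W, direction, x)
results = {}
for scale in (1.0, 1.7):
    results[scale] = component_along(ablate(W, direction, scale), direction, x)
    print(f"scale={scale}: {before:.3f} -> {results[scale]:.3f}")
```

Under this formula the component along the direction is multiplied by (1 - scale): scale 1.0 removes it exactly, and anything above 1.0 pushes the output past zero in the opposite direction.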

⚙️ YAML Configuration

architecture: MistralForCausalLM
base_model: B:/12B/mistralai--Mistral-Nemo-Instruct-2407
models:
  - model: B:/12B/inflatebot--MN-12B-Mag-Mell-R1
    parameters:
      weight: 0.5
      density: 0.9
      epsilon: 0.09
  - model: B:/12B/taozi555--MN-12B-Mag-Mell-R1-KTO
    parameters:
      weight: 0.5
      density: 0.9
      epsilon: 0.09
  - model: B:/12B/UniLLMer--GslayerKaa
    parameters:
      weight: 0.5
      density: 0.9
      epsilon: 0.09
  - model: B:/12B/redrix--GodSlayer-12B-ABYSS
    parameters:
      weight: 0.5
      density: 0.9
      epsilon: 0.09
merge_method: della
parameters:
  lambda: 1.0
  normalize: false
  int8_mask: false
  rescale: true
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: B:/12B/taozi555--MN-12B-Mag-Mell-R1-KTO
name: 🏜️ Savage-Sands-12B
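
As a rough illustration of what density and epsilon control, the sketch below assigns each task-vector entry (donor weight minus base weight) a keep probability by magnitude rank, then samples and rescales. It loosely follows the parameter descriptions in the mergekit documentation and assumes the probabilities span density - epsilon to density + epsilon. It is not mergekit's actual code, and the function names and delta values are made up.

```python
import random


def keep_probabilities(deltas, density=0.9, epsilon=0.09):
    """Map each delta to a keep probability in [density - epsilon, density + epsilon].

    Larger-magnitude deltas get higher keep probabilities (linear in rank).
    """
    order = sorted(range(len(deltas)), key=lambda i: abs(deltas[i]))
    span = max(len(deltas) - 1, 1)
    probs = [0.0] * len(deltas)
    for rank, index in enumerate(order):
        probs[index] = (density - epsilon) + 2 * epsilon * rank / span
    return probs


def magnitude_sample(deltas, density=0.9, epsilon=0.09, rescale=True, seed=0):
    """Randomly drop deltas; optionally rescale survivors by 1 / keep probability."""
    rng = random.Random(seed)
    probs = keep_probabilities(deltas, density, epsilon)
    sampled = []
    for delta, p in zip(deltas, probs):
        if rng.random() < p:
            sampled.append(delta / p if rescale else delta)
        else:
            sampled.append(0.0)
    return sampled


deltas = [0.02, -0.50, 0.10, -0.01, 0.30]  # made-up task-vector entries
probs = keep_probabilities(deltas)
sampled = magnitude_sample(deltas)
print([round(p, 3) for p in probs])
print(sampled)
```

With density 0.9 and epsilon 0.09, the keep probabilities in this sketch run from 0.81 for the smallest delta to 0.99 for the largest. Rescaling by 1 / p keeps each entry's expected value unchanged, which is the role `rescale: true` plays here.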

🔍 Merge Audit

[DELLA Audit] Layer: model.layers.33.mlp.down_proj.weight | Lambda=1.00
  [BASE] mistralai--Mistral-Nemo-Instruct-2407             
  UniLLMer--GslayerKaa                              : ████████████████                                    33.3% (W:0.50 D:0.90 N:4.89 E:0.09)
  inflatebot--MN-12B-Mag-Mell-R1                    : ████████                                            17.7% (W:0.50 D:0.90 N:2.59 E:0.09)
  redrix--GodSlayer-12B-ABYSS                       : ███████████████                                     31.3% (W:0.50 D:0.90 N:4.59 E:0.09)
  taozi555--MN-12B-Mag-Mell-R1-KTO                  : ████████                                            17.7% (W:0.50 D:0.90 N:2.59 E:0.09)

[DELLA Audit] Layer: model.layers.33.mlp.gate_proj.weight | Lambda=1.00
  [BASE] mistralai--Mistral-Nemo-Instruct-2407             
  UniLLMer--GslayerKaa                              : ███████████████████                                 38.5% (W:0.50 D:0.90 N:7.05 E:0.09)
  inflatebot--MN-12B-Mag-Mell-R1                    : ███████                                             14.8% (W:0.50 D:0.90 N:2.70 E:0.09)
  redrix--GodSlayer-12B-ABYSS                       : ███████████████                                     31.9% (W:0.50 D:0.90 N:5.84 E:0.09)
  taozi555--MN-12B-Mag-Mell-R1-KTO                  : ███████                                             14.8% (W:0.50 D:0.90 N:2.71 E:0.09)
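
The percentages in these audit entries appear to be each donor's N value divided by the sum of N over all donors. The sketch below checks that reading against the down_proj entry. The function name `audit_shares` and the 50-character bar width are inferences from the printed output, not taken from the audit tool, and because N is printed to two decimals the recomputed shares can differ in the last digit.

```python
def audit_shares(norms, bar_width=50):
    """Return {name: (percent, bar)} with percent = 100 * N / sum(N)."""
    total = sum(norms.values())
    report = {}
    for name, norm in norms.items():
        percent = 100.0 * norm / total
        # Bar length is truncated, not rounded (assumption from the audit output).
        report[name] = (percent, "█" * int(percent / 100.0 * bar_width))
    return report


# N values copied from the model.layers.33.mlp.down_proj.weight entry above.
down_proj = {
    "UniLLMer--GslayerKaa": 4.89,
    "inflatebot--MN-12B-Mag-Mell-R1": 2.59,
    "redrix--GodSlayer-12B-ABYSS": 4.59,
    "taozi555--MN-12B-Mag-Mell-R1-KTO": 2.59,
}

report = audit_shares(down_proj)
for name, (percent, bar) in report.items():
    print(f"{name:<34}: {bar:<20} {percent:.2f}%")
```

Since every donor has the same weight (0.50) in this merge, weighting N by W would give the same shares, so the audit output alone does not distinguish the two readings.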