Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated β€” Q4_K_M GGUF

  • I have tested this model myself in Hermes Agent: text, coding, vision, thinking, and tool usage all work as expected so far. I have also found that this particular Opus 4.6 distillation uses fewer emojis than the non-Opus version, which is great.

This is a Q4_K_M GGUF quantization of huihui-ai/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated.

Refer to the original model card for full details, usage warnings, and licensing information.

This is a Qwen3.5-35B-A3B MoE model that has been:

  • Fine-tuned in Claude 4.6 Opus style β€” targets Claude Opus-level instruction following, reasoning depth, and structured output quality
  • Abliterated β€” refusal mechanisms removed; no alignment restrictions
  • Vision-capable β€” the original weights include a full vision encoder (preprocessor_config.json, video_preprocessor_config.json); an mmproj file can be generated from the source safetensors for multimodal use

Files

| File | Size | Description |
|---|---|---|
| Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M.gguf | ~20 GB | Main text model (Q4_K_M) |
| Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-mmproj-F16.gguf | ~858 MB | Vision projector for image input |

Details

| | |
|---|---|
| Source model | huihui-ai/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated |
| Architecture | qwen35moe (35B total params, ~3B active; 256 experts, 8 active per token) |
| Quantization | Q4_K_M (~4.88 BPW) |
| File size | ~20 GB + 858 MB mmproj |
| Quantized with | llama.cpp b8352 |
| Context length | 262,144 tokens (trained) |

Usage with llama.cpp

llama-cli \
  --hf-repo cesarsal1nas/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M-GGUF \
  --hf-file Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M.gguf \
  -p "Tell me about the universe"

llama-server \
  --hf-repo cesarsal1nas/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M-GGUF \
  --hf-file Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M.gguf \
  -c 131072
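Once llama-server is running, it exposes an OpenAI-compatible `/v1/chat/completions` endpoint. A minimal sketch of building a request body for it, assuming the server's default port 8080 (the `model` field is ignored by a single-model server, so any name works):

```python
import json

def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    # Request body for llama-server's OpenAI-compatible chat endpoint.
    return {
        "model": "Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Tell me about the universe")
body = json.dumps(payload)
# POST `body` to http://localhost:8080/v1/chat/completions
# with header Content-Type: application/json.
```

Any OpenAI-compatible client library can be pointed at the same endpoint instead of crafting the request by hand.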

With vision (image input)

llama-server \
  --hf-repo cesarsal1nas/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M-GGUF \
  --hf-file Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M.gguf \
  --mmproj Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-mmproj-F16.gguf \
  -c 131072
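With the mmproj loaded, the server accepts OpenAI-style multimodal messages, where an image is passed inline as a base64 data URL alongside the text prompt. A sketch of constructing such a request (the helper names are illustrative, not part of any API):

```python
import base64

def image_to_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    # Embed raw image bytes as a base64 data URL, the inline-image
    # format used by OpenAI-style chat requests.
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

def build_vision_request(prompt: str, image_bytes: bytes) -> dict:
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": image_to_data_url(image_bytes)}},
            ],
        }],
    }

# In practice: image_bytes = open("photo.png", "rb").read()
```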

Usage with Ollama

ollama run hf.co/cesarsal1nas/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M-GGUF
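After the model has been pulled this way, Ollama also serves it over its local REST API (default `http://localhost:11434`). A sketch of a non-streaming `/api/chat` request body, assuming the model name Ollama assigns from the `hf.co/...` pull above:

```python
import json

MODEL = "hf.co/cesarsal1nas/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M-GGUF"

def build_ollama_chat(prompt: str) -> dict:
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return a single complete JSON response
    }

body = json.dumps(build_ollama_chat("Tell me about the universe"))
# POST `body` to http://localhost:11434/api/chat
```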

Credits

  • Original model: huihui-ai/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated
  • Quantized with llama.cpp (b8352)
