Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated β€” Q4_K_M GGUF

  • I have tested this model myself in Hermes Agent: text, coding, vision, thinking, and tool usage all work as expected so far. I have also found that this particular Opus 4.6 distillation uses fewer emojis than the non-Opus version, which is great.

This is a Q4_K_M GGUF quantization of huihui-ai/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated.

Refer to the original model card for full details, usage warnings, and licensing information.

This is a Qwen3.5-35B-A3B MoE model that has been:

  • Fine-tuned in Claude 4.6 Opus style β€” targets Claude Opus-level instruction following, reasoning depth, and structured output quality
  • Abliterated β€” refusal mechanisms removed; no alignment restrictions
  • Vision-capable β€” the original weights include a full vision encoder (preprocessor_config.json, video_preprocessor_config.json); an mmproj file can be generated from the source safetensors for multimodal use

Files

| File | Size | Description |
|---|---|---|
| Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M.gguf | ~20 GB | Main text model (Q4_K_M) |
| Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-mmproj-F16.gguf | ~858 MB | Vision projector for image input |

Details

| | |
|---|---|
| Source model | huihui-ai/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated |
| Architecture | qwen35moe (35B total params, ~3B active; 256 experts, 8 active per token) |
| Quantization | Q4_K_M (~4.88 BPW) |
| File size | ~20 GB + 858 MB mmproj |
| Quantized with | llama.cpp b8352 |
| Context length | 262,144 tokens (trained) |

Usage with llama.cpp

llama-cli \
  --hf-repo cesarsal1nas/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M-GGUF \
  --hf-file Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M.gguf \
  -p "Tell me about the universe"

llama-server \
  --hf-repo cesarsal1nas/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M-GGUF \
  --hf-file Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M.gguf \
  -c 131072
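Once llama-server is running, it exposes an OpenAI-compatible `/v1/chat/completions` endpoint. A minimal sketch of building a request body for it, assuming the server's default port 8080 (the `model` field is ignored by a single-model server, so any name works):

```python
import json

def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    # Request body for llama-server's OpenAI-compatible chat endpoint.
    return {
        "model": "Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Tell me about the universe")
body = json.dumps(payload)
# POST `body` to http://localhost:8080/v1/chat/completions
# with header Content-Type: application/json.
```

Any OpenAI-compatible client library can be pointed at the same endpoint instead of crafting the request by hand.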

With vision (image input)

llama-server \
  --hf-repo cesarsal1nas/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M-GGUF \
  --hf-file Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M.gguf \
  --mmproj Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-mmproj-F16.gguf \
  -c 131072
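With the mmproj loaded, the server accepts OpenAI-style multimodal messages, where an image is passed inline as a base64 data URL alongside the text prompt. A sketch of constructing such a request (the helper names are illustrative, not part of any API):

```python
import base64

def image_to_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    # Embed raw image bytes as a base64 data URL, the inline-image
    # format used by OpenAI-style chat requests.
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

def build_vision_request(prompt: str, image_bytes: bytes) -> dict:
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": image_to_data_url(image_bytes)}},
            ],
        }],
    }

# In practice: image_bytes = open("photo.png", "rb").read()
```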

Usage with Ollama

ollama run hf.co/cesarsal1nas/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M-GGUF
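After the model has been pulled this way, Ollama also serves it over its local REST API (default `http://localhost:11434`). A sketch of a non-streaming `/api/chat` request body, assuming the model name Ollama assigns from the `hf.co/...` pull above:

```python
import json

MODEL = "hf.co/cesarsal1nas/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-Q4_K_M-GGUF"

def build_ollama_chat(prompt: str) -> dict:
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return a single complete JSON response
    }

body = json.dumps(build_ollama_chat("Tell me about the universe"))
# POST `body` to http://localhost:11434/api/chat
```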

Credits

  • Original model: huihui-ai/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated
  • Quantized with llama.cpp (b8352)
