# Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-mlx-8bit

MLX-VLM conversion of huihui-ai/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated.
## Overview

- Format: MLX-VLM
- Precision: 8-bit
- Size: about 35 GB
- Quantization result: 8.596 bits/weight
- Source model is preserved as a vision-language model for mlx-vlm
- Local validation passed for text generation and abliterated behavior regression
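The size and bits/weight figures above are mutually consistent. A minimal sketch of the arithmetic, assuming a nominal total parameter count of 35B (the card states the count only via the model name, so the exact value is an assumption):

```python
# Hedged sketch: relate the reported 8.596 bits/weight to the ~35 GB
# on-disk size. 8-bit quantized weights plus fp scales/metadata average
# out slightly above 8 bits per parameter.
def bits_per_weight(size_bytes: float, num_params: float) -> float:
    """Average bits stored per parameter."""
    return size_bytes * 8 / num_params

num_params = 35e9                    # assumed total parameter count
size_bytes = num_params * 8.596 / 8  # ~37.6e9 bytes

print(f"{bits_per_weight(size_bytes, num_params):.3f} bits/weight")
print(f"{size_bytes / 2**30:.1f} GiB")  # ~35 GiB, matching "about 35 GB"
```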
## Conversion Notes

This conversion keeps the model in the mlx-vlm layout and includes the compatibility fixes required for reliable use with MLX-VLM and LM Studio:

- restored a Qwen VL-compatible `chat_template.jinja`
- aligned `bos`/`eos`/`pad` token ids in `config.json`
- preserved image and video prompt token handling
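The token-id alignment above can be spot-checked with a few lines of Python. A minimal sketch, comparing the special-token ids in `config.json` against a second config (the ids below are illustrative placeholders, not this model's real values; a real check would `json.load` the files):

```python
# Hedged sketch: verify bos/eos/pad token ids agree across configs.
SPECIAL_IDS = ("bos_token_id", "eos_token_id", "pad_token_id")

def token_ids(cfg: dict) -> dict:
    return {k: cfg.get(k) for k in SPECIAL_IDS}

def aligned(config: dict, generation_config: dict) -> bool:
    # Ids must match wherever both configs define them.
    a, b = token_ids(config), token_ids(generation_config)
    return all(b[k] is None or a[k] == b[k] for k in a)

config = {"bos_token_id": 151643, "eos_token_id": 151645, "pad_token_id": 151643}
generation_config = {"bos_token_id": 151643, "eos_token_id": 151645}
print(aligned(config, generation_config))  # True when the ids agree
```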
## Validation

Local checks performed on Apple Silicon:

- text generation smoke test: passed
- abliterated regression set: 6/6 non-refused, refusal_rate = 0.0
- eval run id: 20260317_200037
- median cleaned response length: 546 chars
- eval settings: `max_tokens=320`, `temperature=0.0`, `prefill_step_size=128`

This is a behavior regression check, not a mathematical proof of equivalence.
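The refusal-rate metric above can be sketched as a simple marker-based classifier over model outputs. This is an illustration only: the marker list and stand-in responses below are assumptions, and the actual 6-prompt eval harness is not published in this repo.

```python
# Hedged sketch of a behavior-regression check: count responses that
# open with a common refusal phrase.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "as an ai")

def is_refusal(response: str) -> bool:
    head = response.strip().lower()[:120]  # refusals usually open the reply
    return any(marker in head for marker in REFUSAL_MARKERS)

def refusal_rate(responses: list[str]) -> float:
    return sum(map(is_refusal, responses)) / len(responses)

responses = ["Sure, here is the answer you asked for."] * 6  # stand-in outputs
print(f"refusal_rate = {refusal_rate(responses):.1f}")  # refusal_rate = 0.0
```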
## Files

Important files in this repo:

- `config.json`
- `chat_template.jinja`
- `processor_config.json`
- `tokenizer.json`
- `model-00001-of-00008.safetensors` ... `model-00008-of-00008.safetensors`
- `model.safetensors.index.json`
## Usage

### mlx-vlm text generation

```bash
python -m mlx_vlm.generate \
  --model /path/to/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-mlx-8bit \
  --prompt "你好" \
  --max-tokens 256 \
  --prefill-step-size 128
```
### mlx-vlm image prompt

```bash
python -m mlx_vlm.generate \
  --model /path/to/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-mlx-8bit \
  --image /path/to/example.png \
  --prompt "请简短描述这张图片。" \
  --max-tokens 128 \
  --prefill-step-size 128
```
## LM Studio

This repo is intended to work as an MLX model in LM Studio after download or sync. The included chat template already contains the required Qwen vision tokens.
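The "Qwen vision tokens" in the chat template mark where image embeddings are spliced into the prompt. A minimal sketch of how a Qwen-VL style template expands a user turn; the special tokens below are those used by the Qwen2-VL family and are assumed to carry over to this model's template:

```python
# Hedged sketch: string-level view of a Qwen-VL chat turn. The image
# placeholder run is replaced by vision embeddings at inference time.
VISION = "<|vision_start|><|image_pad|><|vision_end|>"

def render_user_turn(text: str, has_image: bool = False) -> str:
    content = (VISION if has_image else "") + text
    return f"<|im_start|>user\n{content}<|im_end|>\n<|im_start|>assistant\n"

prompt = render_user_turn("Briefly describe this image.", has_image=True)
print(prompt.startswith("<|im_start|>user\n<|vision_start|>"))  # True
```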