# Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-mlx-8bit

MLX-VLM conversion of huihui-ai/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated.
## Overview

- Format: MLX-VLM
- Precision: 8-bit
- Size: about 35 GB
- Quantization result: 8.596 bits/weight
- Source model is preserved as a vision-language model for mlx-vlm
- Local validation passed for text generation and abliterated behavior regression
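The size and bits/weight figures above are mutually consistent. A minimal sketch of the arithmetic, assuming a nominal total parameter count of 35B (the card states the count only via the model name, so the exact value is an assumption):

```python
# Hedged sketch: relate the reported 8.596 bits/weight to the ~35 GB
# on-disk size. 8-bit quantized weights plus fp scales/metadata average
# out slightly above 8 bits per parameter.
def bits_per_weight(size_bytes: float, num_params: float) -> float:
    """Average bits stored per parameter."""
    return size_bytes * 8 / num_params

num_params = 35e9                    # assumed total parameter count
size_bytes = num_params * 8.596 / 8  # ~37.6e9 bytes

print(f"{bits_per_weight(size_bytes, num_params):.3f} bits/weight")
print(f"{size_bytes / 2**30:.1f} GiB")  # ~35 GiB, matching "about 35 GB"
```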
## Conversion Notes

This conversion keeps the model in the mlx-vlm layout and includes the compatibility fixes required for reliable use with MLX-VLM and LM Studio:

- restored a Qwen VL-compatible `chat_template.jinja`
- aligned `bos`/`eos`/`pad` token ids in `config.json`
- preserved image and video prompt token handling
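The token-id alignment above can be spot-checked with a few lines of Python. A minimal sketch, comparing the special-token ids in `config.json` against a second config (the ids below are illustrative placeholders, not this model's real values; a real check would `json.load` the files):

```python
# Hedged sketch: verify bos/eos/pad token ids agree across configs.
SPECIAL_IDS = ("bos_token_id", "eos_token_id", "pad_token_id")

def token_ids(cfg: dict) -> dict:
    return {k: cfg.get(k) for k in SPECIAL_IDS}

def aligned(config: dict, generation_config: dict) -> bool:
    # Ids must match wherever both configs define them.
    a, b = token_ids(config), token_ids(generation_config)
    return all(b[k] is None or a[k] == b[k] for k in a)

config = {"bos_token_id": 151643, "eos_token_id": 151645, "pad_token_id": 151643}
generation_config = {"bos_token_id": 151643, "eos_token_id": 151645}
print(aligned(config, generation_config))  # True when the ids agree
```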
## Validation

Local checks performed on Apple Silicon:

- text generation smoke test: passed
- abliterated regression set: 6/6 non-refused, refusal_rate = 0.0
- eval run id: 20260317_200037
- median cleaned response length: 546 chars
- eval settings: `max_tokens=320`, `temperature=0.0`, `prefill_step_size=128`

This is a behavior regression check, not a mathematical proof of equivalence.
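The refusal-rate metric above can be sketched as a simple marker-based classifier over model outputs. This is an illustration only: the marker list and stand-in responses below are assumptions, and the actual 6-prompt eval harness is not published in this repo.

```python
# Hedged sketch of a behavior-regression check: count responses that
# open with a common refusal phrase.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "as an ai")

def is_refusal(response: str) -> bool:
    head = response.strip().lower()[:120]  # refusals usually open the reply
    return any(marker in head for marker in REFUSAL_MARKERS)

def refusal_rate(responses: list[str]) -> float:
    return sum(map(is_refusal, responses)) / len(responses)

responses = ["Sure, here is the answer you asked for."] * 6  # stand-in outputs
print(f"refusal_rate = {refusal_rate(responses):.1f}")  # refusal_rate = 0.0
```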
## Files

Important files in this repo:

- `config.json`
- `chat_template.jinja`
- `processor_config.json`
- `tokenizer.json`
- `model-00001-of-00008.safetensors` ... `model-00008-of-00008.safetensors`
- `model.safetensors.index.json`
## Usage

### mlx-vlm text generation

```bash
python -m mlx_vlm.generate \
  --model /path/to/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-mlx-8bit \
  --prompt "你好" \
  --max-tokens 256 \
  --prefill-step-size 128
```
### mlx-vlm image prompt

```bash
python -m mlx_vlm.generate \
  --model /path/to/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-mlx-8bit \
  --image /path/to/example.png \
  --prompt "请简短描述这张图片。" \
  --max-tokens 128 \
  --prefill-step-size 128
```
## LM Studio

This repo is intended to work as an MLX model in LM Studio after download or sync. The included chat template already contains the required Qwen vision tokens.
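The "Qwen vision tokens" in the chat template mark where image embeddings are spliced into the prompt. A minimal sketch of how a Qwen-VL style template expands a user turn; the special tokens below are those used by the Qwen2-VL family and are assumed to carry over to this model's template:

```python
# Hedged sketch: string-level view of a Qwen-VL chat turn. The image
# placeholder run is replaced by vision embeddings at inference time.
VISION = "<|vision_start|><|image_pad|><|vision_end|>"

def render_user_turn(text: str, has_image: bool = False) -> str:
    content = (VISION if has_image else "") + text
    return f"<|im_start|>user\n{content}<|im_end|>\n<|im_start|>assistant\n"

prompt = render_user_turn("Briefly describe this image.", has_image=True)
print(prompt.startswith("<|im_start|>user\n<|vision_start|>"))  # True
```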