# Huihui-Qwen3.5-9B-abliterated-mlx-8bit

An MLX-VLM 8-bit export of huihui-ai/Huihui-Qwen3.5-9B-abliterated for Apple Silicon workflows, including LM Studio and local `mlx_vlm` usage.
## Overview

- Variant: 8bit
- Repository payload at upload time: 10.5 GB
- Repository file count: 13
- Effective quantization observed during conversion: 8.864 bits per weight
- Format: `mlx-vlm` model package
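The effective figure lands above a flat 8 bits because MLX group quantization stores per-group metadata alongside the quantized weights. The arithmetic below is a rough sketch, assuming MLX's default group size of 64 with an fp16 scale and bias per group; the remaining gap up to 8.864 typically comes from tensors kept at higher precision (e.g. embeddings and norms).

```python
def effective_bpw(bits: int, group_size: int = 64, meta_bits: int = 16) -> float:
    """Bits per weight for a quantized tensor, including the per-group
    scale and bias metadata (2 values of meta_bits each per group)."""
    return bits + 2 * meta_bits / group_size

# A purely 8-bit tensor with the assumed defaults works out to 8.5 bpw;
# unquantized tensors push the repository-wide average higher.
print(effective_bpw(8))  # → 8.5
```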
## Compatibility

- Uses the corrected Qwen VL chat template with image token placeholders.
- Uses `<|im_end|>`-compatible stop-token settings for cleaner chat termination in MLX/LM Studio.
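For runtimes that accept plain stop strings rather than reading the bundled config, a minimal sketch of an equivalent setting (the key name varies by runtime; `stop` is an illustrative assumption, not a documented field of this package):

```json
{
  "stop": ["<|im_end|>"]
}
```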
## Validation

- Local text generation smoke test: passed
- Local image-input smoke test: passed
- Local black-box abliterated check: 6/6 non-refused
- Refusal rate: 0.0
- Actionable non-refused cases: 6
- Median cleaned response length: 1211 chars
## Behavior Notes
- This variant preserved the abliterated behavior on the local 6-case regression set used during conversion validation.
- These checks are behavioral acceptance tests, not a formal guarantee of identical outputs to the source checkpoint.
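A black-box refusal check of this kind can be as simple as scanning responses for refusal phrases. The sketch below is illustrative only; the marker list and scoring are assumptions, not the actual harness used for this card's 6-case regression set.

```python
# Hypothetical refusal markers; real harnesses use a larger, tuned list.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry", "as an ai")

def is_refusal(response: str) -> bool:
    # Refusals usually open the reply, so only the head is inspected.
    head = response.strip().lower()[:200]
    return any(marker in head for marker in REFUSAL_MARKERS)

def refusal_rate(responses: list[str]) -> float:
    """Fraction of responses scored as refusals (0.0 for an empty set)."""
    if not responses:
        return 0.0
    return sum(is_refusal(r) for r in responses) / len(responses)
```

A 6/6 non-refused result corresponds to `refusal_rate(...) == 0.0` over the regression prompts.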
## Usage

```shell
mlx_vlm.generate \
  --model /path/to/Huihui-Qwen3.5-9B-abliterated-mlx-8bit \
  --prompt "你好" \
  --max-tokens 256
```
For behavior-focused checks, it is safer to disable thinking output so refusal scoring is based on the final answer instead of the thinking trace.
```shell
mlx_vlm.generate \
  --model /path/to/Huihui-Qwen3.5-9B-abliterated-mlx-8bit \
  --prompt "你好" \
  --max-tokens 256 \
  --processor-kwargs '{"enable_thinking": false}'
```
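If a runtime ignores the thinking setting and still emits a trace, the final answer can be recovered before scoring. A minimal sketch, assuming Qwen-style `<think>...</think>` tags in the raw output:

```python
import re

# Non-greedy match so multiple traces in one response are each removed.
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)

def strip_thinking(text: str) -> str:
    """Drop thinking traces, leaving only the final answer."""
    return THINK_RE.sub("", text).strip()
```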
## LM Studio

- If LM Studio has an older cached copy, refresh or re-download the repository so the latest chat template and config are picked up.
- These repositories are meant for `mlx-vlm` / Apple MLX runtimes rather than Transformers CPU inference.
## Model tree for vanch007/Huihui-Qwen3.5-9B-abliterated-mlx-8bit

- Base model: Qwen/Qwen3.5-9B-Base
- Finetuned: Qwen/Qwen3.5-9B
- Finetuned: huihui-ai/Huihui-Qwen3.5-9B-abliterated