Qwen 3.6 35B-A3B RYS XL (MLX)

This is a modified version of the Qwen 3.6 35B-A3B model that applies the RYS (Repeat Your Self) technique.

What is RYS?

The RYS technique, discovered by David Ng, aims to improve the reasoning capabilities of large language models by duplicating specific "reasoning" layers in the middle of the transformer stack. The duplicated layers reuse the existing weights, so the model gains computational depth for semantic and logic-heavy tasks without any additional training.

Model Details

Architecture: Qwen 3.6 35B-A3B
RYS Configuration: Physical duplication of two layer blocks, (8, 12) and (20, 24), four layers each.
Variant: XL (8 additional layers, bringing the total depth to 48 layers).
Format: MLX (quantized to 8-bit).
Tokenizer: Full Qwen 3.5/3.6 201-language support.
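As a rough illustration of the configuration above, the following sketch builds the order in which base-model layers would run after duplication. It rests on two assumptions not stated on this page: each pair `(start, end)` is read as a half-open range (layers `start` to `end - 1`), which is the reading that gives four layers per block, and the base depth of 40 is derived from 48 total minus 8 added. The helper name `rys_layer_order` is hypothetical and is not part of any library.

```python
def rys_layer_order(n_layers, blocks):
    """Return base-layer indices in execution order after duplicating blocks.

    Assumption: each (start, end) block is half-open, so layers
    start..end-1 run once in place and then once more before the
    stack continues with layer `end`.
    """
    block_start_by_end = {end: start for start, end in blocks}
    order = []
    for i in range(n_layers):
        order.append(i)
        # When a block has just finished, replay it immediately.
        if i + 1 in block_start_by_end:
            order.extend(range(block_start_by_end[i + 1], i + 1))
    return order


BASE_LAYERS = 40              # derived: 48 total layers minus 8 added
BLOCKS = [(8, 12), (20, 24)]  # RYS configuration from this model card

order = rys_layer_order(BASE_LAYERS, BLOCKS)
print(len(order))    # total depth after duplication
print(order[8:16])   # the first block followed by its repeat
```

Under these assumptions the resulting depth matches the 48 layers listed for the XL variant. Since the duplication in this model is physical, the repeated layers are stored as separate copies in the checkpoint rather than re-executed by index.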

Performance

By repeating the layer blocks (8, 12) and (20, 24), the model runs additional passes over its internal semantic representation of a prompt. This is particularly effective for:

Mathematical reasoning
Agentic tasks
Large-scale coding tasks

Usage

This MLX model is compatible with any tool that supports MLX.
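One possible way to try the model from Python is through the `mlx-lm` package. The sketch below is not an official recipe: it assumes `mlx-lm` is installed on an Apple Silicon machine, that the installed version supports this architecture, and that the weights are published under the repository ID shown on this page. The function name `run_once` is hypothetical.

```python
REPO_ID = "LogicBombaklot/Qwen3.6-35B-A3B-RYS-XL-8-Bit-MLX"


def run_once(user_prompt: str, max_tokens: int = 256) -> str:
    """Load the model and generate a single reply (downloads weights on first use)."""
    # Imported lazily so that defining this function does not require mlx-lm.
    from mlx_lm import load, generate

    model, tokenizer = load(REPO_ID)

    # Let the tokenizer apply the chat template bundled with the model.
    messages = [{"role": "user", "content": user_prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)


# Example call (commented out because it downloads tens of gigabytes):
# print(run_once("Explain why the sum of two odd numbers is even."))
```

Loading is kept inside the function so the weights are only fetched when a reply is actually requested.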

Prompt Format

This model uses the standard Qwen chat template (ChatML):

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
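For tools that expect a raw prompt string instead of a list of messages, the template above can be filled in by hand. This is a minimal sketch; the function name `build_chatml_prompt` is hypothetical, and the default system message is the one shown in the template.

```python
def build_chatml_prompt(prompt: str,
                        system: str = "You are a helpful assistant.") -> str:
    """Fill the Qwen chat template with a system message and one user turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        # The assistant turn is left open so the model writes the reply.
        "<|im_start|>assistant\n"
    )


text = build_chatml_prompt("What is 17 * 23?")
print(text)
```

When the tool supports it, letting the tokenizer apply its bundled chat template is usually safer than hand-built strings, since it stays in sync with the model's own configuration.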

Credits

Base Model: The Qwen Team at Alibaba Cloud.
RYS Technique: David Ng (dnhkng).
Quantization: Performed on a Mac Studio.

Safetensors

Model size: 41B params
Tensor type: BF16, U32

Model tree for LogicBombaklot/Qwen3.6-35B-A3B-RYS-XL-8-Bit-MLX

Base model: Qwen/Qwen3.6-35B-A3B
