Qwen 3.6 35B-A3B RYS XL (MLX)

This is a modified version of the Qwen 3.6 35B-A3B model that applies the RYS (Repeat Your Self) technique.

What is RYS?

The RYS (Repeat Your Self) technique, discovered by David Ng, enhances the reasoning capabilities of Large Language Models by duplicating specific "reasoning" layers in the middle of the transformer stack. This increases the depth of the model's computation for semantic and logic-heavy tasks without requiring additional training.
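
The idea can be illustrated with a toy stack. This is a minimal sketch, not the actual RYS implementation: the helper names `duplicate_block`, `forward`, and `make_layer` are hypothetical, the "layers" are plain Python functions, and the block range is assumed to be start-inclusive and end-exclusive.

```python
from typing import Callable, List

Layer = Callable[[int], int]


def forward(x: int, layers: List[Layer]) -> int:
    """Apply every layer in order, as a transformer stack would."""
    for layer in layers:
        x = layer(x)
    return x


def duplicate_block(layers: List[Layer], start: int, end: int) -> List[Layer]:
    """Return a stack in which layers[start:end] run twice in a row.

    Assumption: the range is half-open, so `end` itself is not repeated.
    """
    return layers[:end] + layers[start:end] + layers[end:]


def make_layer(index: int, trace: List[int]) -> Layer:
    """Toy layer that records its index and increments its input."""
    def layer(x: int) -> int:
        trace.append(index)
        return x + 1
    return layer


trace: List[int] = []
base = [make_layer(i, trace) for i in range(6)]  # toy 6-layer stack
deeper = duplicate_block(base, 2, 4)             # repeat layers 2 and 3
result = forward(0, deeper)
print(trace)   # order in which the layers were visited
print(result)  # number of layer applications
```

In this toy version the repeated block reuses the same function objects. A physically duplicated model would instead store separate copies of the repeated weights, but the execution order is the same.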

Model Details

  • Architecture: Qwen 3.6 35B-A3B
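  • RYS Configuration: Physical duplication of two layer blocks, (8, 12) and (20, 24). The 8 added layers listed below are consistent with these ranges only if each is read as start-inclusive and end-exclusive, that is, layers 8 to 11 and 20 to 23.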
  • Variant: XL (8 additional layers, bringing the total depth to 48 layers).
  • Format: MLX (quantized to 8-bit).
  • Tokenizer: Full Qwen 3.5/3.6 201-language support.
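
The configuration above can be checked with a little arithmetic. The following sketch rests on two assumptions that are not stated outright in this card: the base model has 40 layers (48 total minus 8 added), and each `(start, end)` pair is start-inclusive and end-exclusive. The function name `expanded_layer_order` is hypothetical.

```python
from typing import List, Tuple


def expanded_layer_order(num_layers: int, blocks: List[Tuple[int, int]]) -> List[int]:
    """Map each position in the expanded stack to its source layer index.

    After the last layer of a block has run, the whole block is appended
    once more. Ranges are assumed to be half-open: [start, end).
    """
    order: List[int] = []
    for i in range(num_layers):
        order.append(i)
        for start, end in blocks:
            if i == end - 1:
                order.extend(range(start, end))
    return order


BASE_LAYERS = 40               # assumption: 48 total layers minus 8 added
BLOCKS = [(8, 12), (20, 24)]   # RYS configuration from this card

order = expanded_layer_order(BASE_LAYERS, BLOCKS)
print(len(order))      # total depth of the expanded stack
print(order[8:16])     # first block followed by its repeat
print(order[24:32])    # second block followed by its repeat
```

Under these assumptions the expanded stack has 48 entries, which matches the depth stated for the XL variant.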

Performance

By repeating the layer blocks (8, 12) and (20, 24), the model runs extra passes over the internal semantic representation of a prompt. This is particularly effective for:

  • Mathematical reasoning
  • Agentic tasks
  • Large-scale coding tasks

Usage

This MLX-format model is compatible with any tool that uses MLX.
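
As one possible way to run the model, the sketch below uses the `mlx-lm` Python package (`pip install mlx-lm`). It assumes that this repository loads with the stock `mlx_lm.load` function and that the repository ID matches this model page. The helpers `build_messages` and `run` are hypothetical names introduced for illustration.

```python
from typing import Dict, List

# Assumption: repository ID as shown on this model page.
MODEL_ID = "LogicBombaklot/Qwen3.6-35B-A3B-RYS-XL-8-Bit-MLX"


def build_messages(
    user_prompt: str,
    system_prompt: str = "You are a helpful assistant.",
) -> List[Dict[str, str]]:
    """Build a chat message list in the usual role/content format."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def run(user_prompt: str, max_tokens: int = 256) -> str:
    """Load the model and generate a reply (downloads weights on first use)."""
    # Imported lazily so this file can be imported without mlx-lm installed.
    from mlx_lm import generate, load

    model, tokenizer = load(MODEL_ID)
    # Let the tokenizer's own chat template format the conversation.
    prompt = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        tokenize=False,
    )
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

Calling `run("Summarize the RYS technique in two sentences.")` on an Apple Silicon machine with enough unified memory should download the weights and return the generated text.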

Prompt Format

This model uses the standard Qwen chat template (ChatML):

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
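
For tools that take a raw prompt string, the template above can be filled in by hand. This is a minimal sketch with a hypothetical helper name, `format_prompt`. The chat template bundled with the tokenizer remains the authoritative source and may add tokens this simplified version omits.

```python
def format_prompt(prompt: str, system: str = "You are a helpful assistant.") -> str:
    """Fill the chat template shown above with a system and user message.

    The string ends with the assistant header so that the model continues
    from there with its reply.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(format_prompt("What is 17 * 23?"))
```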

Credits

  • Base Model: The Qwen Team at Alibaba Cloud.
  • RYS Technique: David Ng (dnhkng).
  • Quantization: Processed on a Mac Studio.