Update README.md
README.md CHANGED

```diff
@@ -11,7 +11,7 @@ tags:
 - reasoning
 - chain-of-thought
 - Dense
-pipeline_tag: text-
+pipeline_tag: image-text-to-text
 datasets:
 - nohurry/Opus-4.6-Reasoning-3000x-filtered
 - Jackrong/Qwen3.5-reasoning-700x
@@ -67,7 +67,14 @@ Final Model (Claude-4.6-Opus-Reasoning-Distilled, text-only)
 
 ## 📋 Stage Details
 
-
+**🔧 Tool Calling Benchmark** (benchmark tests by user @Chris Klaus)
+
+
+
+> **From the test results, it is clear that different Qwen3.5 quantized models show significant differences in tool-calling capability. Among them, only the 27B model distilled with Claude Opus reasoning demonstrates stable performance.**
+
+
+🔥 **Community-tested advantages** (benchmark tests by user @sudoing on a single RTX 3090):
 
 Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled shows significant advantages in coding-agent environments such as Claude Code and OpenCode:
```
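For reference, `pipeline_tag` in a Hugging Face model card's YAML frontmatter tells the Hub which task (and inference widget) to associate with the model. A minimal sketch of the frontmatter after this change, assuming the fields surrounding `pipeline_tag` are exactly those shown in the first hunk:

```yaml
---
tags:
- reasoning
- chain-of-thought
- Dense
pipeline_tag: image-text-to-text
datasets:
- nohurry/Opus-4.6-Reasoning-3000x-filtered
- Jackrong/Qwen3.5-reasoning-700x
---
```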