Sparrow-1M iter37 — Position Coupling NEGATIVE RESULT
Preserved as negative-result evidence per the 2026-05-06 Phase E Phase 1 abort.
- Task: 3-digit multiplication, in-distribution
- Accuracy: 2% (clean negative result; iter6a baseline was 76%)
- Format: PC-modified — operands MSB-first, result LSB-first, per-sample random offset start in U[1, max_pos - len(line)]
- Architecture: same as iter6a (1.3M Qwen3 dense, RoPE, GQA 4Q-2KV)
- Training: 25K steps, peak_lr 1.5e-3 (lowered from earlier 3e-4 to avoid divergence)
- Final loss: 0.2966 avg100 (CONVERGED TIGHTER than iter6a's ~0.62, but greedy decoding is wrong on the middle digits)
- Likely root cause: Position Coupling routes digit significance through position_ids, which RoPE then applies as a per-position cos/sin rotation; the rotation destroys argmax precision while keeping the average cross-entropy (CE) low
- Recipe gate triggered: "If eval-30 < 80% at step 25K we abort and fall back to scale-up of the standard-RoPE iter6a 3M ladder"
- Author: Crownelius (github.com/Crownelius)
- Code: github.com/Crownelius/crowfeather-50m-v1 (commit f7bdb72)
- Source paper: Cho et al. 2024, "Position Coupling" (arxiv:2405.20671)
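For readers unfamiliar with the PC-modified format listed above, the following is a minimal sketch of how one training line and its random start offset could be built. It is not taken from the repository: the function name `format_pc_sample`, the `*` and `=` separators, the `max_pos` value, and the inclusive reading of U[1, max_pos - len(line)] are all assumptions for illustration. The positions returned here are plain consecutive integers from the sampled offset; the actual coupled position assignment from Cho et al. is not reproduced.

```python
import random


def format_pc_sample(a: int, b: int, max_pos: int, rng: random.Random):
    """Build one multiplication line plus positions starting at a random offset.

    Hypothetical helper: separators and position layout are assumptions.
    """
    operands = f"{a}*{b}="        # operands written MSB-first (normal digit order)
    result = str(a * b)[::-1]     # product written LSB-first (digits reversed)
    line = operands + result

    # Per-sample offset drawn uniformly from the integers 1 .. max_pos - len(line).
    # ASSUMPTION: both endpoints are inclusive.
    start = rng.randint(1, max_pos - len(line))

    # ASSUMPTION: consecutive positions from the offset, one per character.
    position_ids = list(range(start, start + len(line)))
    return line, position_ids


rng = random.Random(0)
line, position_ids = format_pc_sample(123, 456, max_pos=64, rng=rng)
print(line)
print(position_ids[0], position_ids[-1])
```

Because the offset changes per sample, the same digit lands on a different absolute position in every training line, which is the property the root-cause note ties to the RoPE rotation.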
Includes eval_pc_mul_3d.json and eval_pc_mul_3d_v2.json showing the 2% scores. The next planned experiment is iter38a/b/c, McLeish-style (Abacus + looped, arxiv:2405.17399), which avoids the position_ids/RoPE interaction entirely.
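The combination of a low final loss and near-zero accuracy noted above can look contradictory, so here is a small arithmetic sketch of how both can hold at once. The probabilities are invented for illustration and are not measurements from iter37; the sketch also assumes the eval scores a sample as correct only when every digit of the product matches.

```python
import math


def summarize(p_correct, argmax_wrong):
    """Return (mean per-token cross-entropy, exact-match flag) for one sample.

    p_correct[i]    : probability the model assigns to the correct digit i
    argmax_wrong[i] : True if greedy decoding picks a different digit at i
    """
    mean_ce = sum(-math.log(p) for p in p_correct) / len(p_correct)
    exact_match = not any(argmax_wrong)
    return mean_ce, exact_match


# Hypothetical 6-digit product: five confident digits, one middle digit where
# the correct token gets 0.30 and some other digit wins the argmax.
p_correct = [0.99, 0.99, 0.30, 0.99, 0.99, 0.99]
argmax_wrong = [False, False, True, False, False, False]

mean_ce, exact_match = summarize(p_correct, argmax_wrong)
print(f"mean CE = {mean_ce:.3f}, exact match = {exact_match}")
```

With these made-up numbers the mean cross-entropy is about 0.21 while the sample still counts as wrong, since a single mis-ranked middle digit fails the whole product.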