Sparrow-1M iter37 — Position Coupling NEGATIVE RESULT

Preserved as negative-result evidence per the 2026-05-06 Phase E Phase 1 abort.

  • Task: 3-digit multiplication, in-distribution
  • Accuracy: 2% (clean negative result; iter6a baseline was 76%)
  • Format: PC-modified (operands MSB-first, result LSB-first, per-sample random start offset drawn from U[1, max_pos - len(line)])
  • Architecture: same as iter6a (1.3M Qwen3 dense, RoPE, GQA 4Q-2KV)
  • Training: 25K steps, peak_lr 1.5e-3 (lowered from earlier 3e-4 to avoid divergence)
  • Final loss: 0.2966 avg100 (converged tighter than iter6a's ~0.62, but greedy decoding gets the middle digits wrong)
  • Likely root cause: Position Coupling routes digit significance through position_ids, which RoPE then turns into a per-position cos/sin rotation. The rotation appears to destroy argmax precision on individual digits while still keeping cross-entropy low on average.
  • Recipe gate triggered: "If eval-30 < 80% at step 25K we abort and fall back to scale-up of the standard-RoPE iter6a 3M ladder"
  • Author: Crownelius (github.com/Crownelius)
  • Code: github.com/Crownelius/crowfeather-50m-v1 (commit f7bdb72)
  • Source paper: Cho et al. 2024, "Position Coupling" (arxiv:2405.20671)
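
A minimal sketch of the sample format from the Format bullet above, for readers who want to see it concretely. The function names and the coupling rule (digits of equal significance share a position ID, separators get a reserved ID 0) are illustrative assumptions, not the code at commit f7bdb72; only the digit ordering and the U[1, max_pos - len(line)] start offset come from this card.

```python
import random


def format_sample(a: int, b: int) -> str:
    """Operands MSB-first, product LSB-first (digits reversed)."""
    return f"{a}*{b}={str(a * b)[::-1]}"


def sample_start(line: str, max_pos: int, rng: random.Random) -> int:
    """Draw the per-sample start offset uniformly from [1, max_pos - len(line)]."""
    upper = max_pos - len(line)
    if upper < 1:
        raise ValueError("line does not fit within max_pos")
    return rng.randint(1, upper)  # randint is inclusive on both ends


def coupled_position_ids(line: str, start: int) -> list[int]:
    """Hypothetical coupling: ID = start + digit significance (0 = units).

    Separators ('*' and '=') get the reserved ID 0, which is free because
    start >= 1. This is an assumed scheme for illustration only.
    """
    lhs, result = line.split("=")
    a, b = lhs.split("*")
    ids: list[int] = []
    for operand in (a, b):
        n = len(operand)
        # MSB-first operand: the digit at index i has significance n - 1 - i.
        ids.extend(start + (n - 1 - i) for i in range(n))
        ids.append(0)  # '*' after the first operand, '=' after the second
    # LSB-first result: the digit at index j has significance j.
    ids.extend(start + j for j in range(len(result)))
    return ids


line = format_sample(123, 456)
start = sample_start(line, max_pos=64, rng=random.Random(0))
position_ids = coupled_position_ids(line, start)
```

Under this assumed scheme the units digits of both operands and of the product all receive the same position ID, so RoPE applies the same rotation to them. That shared rotation is the position_ids/RoPE interaction named as the likely root cause.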

Includes eval_pc_mul_3d.json and eval_pc_mul_3d_v2.json, both showing the 2% scores. The next planned experiment is iter38a/b/c, McLeish-style (Abacus + looped, arxiv:2405.17399), which avoids the position_ids/RoPE interaction entirely.
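
The gap between a 0.2966 loss and 2% accuracy can look contradictory, so here is a toy calculation of how the two can coexist. Mean cross-entropy averages over tokens, while exact-match accuracy needs the argmax to be right at every result position. The probabilities below are hypothetical values chosen for illustration, not measurements from this run.

```python
import math


def mean_cross_entropy(p_correct: list[float]) -> float:
    """Average of -log p(correct token) over the sequence."""
    return sum(-math.log(p) for p in p_correct) / len(p_correct)


def greedy_exact_match(p_correct: list[float], p_best_wrong: list[float]) -> bool:
    """Greedy decoding is right only if the correct token wins at every position."""
    return all(pc > pw for pc, pw in zip(p_correct, p_best_wrong))


# Hypothetical 6-token result: outer digits near-certain, the two middle
# digits narrowly losing to a wrong token.
p_correct = [0.99, 0.99, 0.40, 0.40, 0.99, 0.99]
p_best_wrong = [0.005, 0.005, 0.45, 0.45, 0.005, 0.005]

loss = mean_cross_entropy(p_correct)
match = greedy_exact_match(p_correct, p_best_wrong)
print(f"mean CE = {loss:.3f}, exact match = {match}")
```

In this toy case the mean cross-entropy is about 0.31, yet the sequence is decoded wrongly because two positions lose the argmax by a small margin. A loss in this range is therefore compatible with near-zero exact-match accuracy when the errors concentrate on a few digits.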
