Swift Sampling: Selecting Temporal Surprises via Taylor Series • Paper • 2605.22678 • Published 6 days ago • 9
Learning to Build the Environment: Self-Evolving Reasoning RL via Verifiable Environment Synthesis • Paper • 2605.14392 • Published 13 days ago • 8
FrontierSmith: Synthesizing Open-Ended Coding Problems at Scale • Paper • 2605.14445 • Published 13 days ago • 20
Learning POMDP World Models from Observations with Language-Model Priors • Paper • 2605.13740 • Published 14 days ago • 5
Agentic Discovery of Neural Architectures: AIRA-Compose and AIRA-Design • Paper • 2605.15871 • Published 12 days ago • 16
Skills-Coach: A Self-Evolving Skill Optimizer via Training-Free GRPO • Paper • 2604.27488 • Published 27 days ago • 6
StateSMix: Online Lossless Compression via Mamba State Space Models and Sparse N-gram Context Mixing • Paper • 2605.02904 • Published Apr 5 • 8
Kronos: A Foundation Model for the Language of Financial Markets • Paper • 2508.02739 • Published Aug 2, 2025 • 35
SWE-chat: Coding Agent Interactions From Real Users in the Wild • Paper • 2604.20779 • Published Apr 22 • 15
Neural Additive Experts: Context-Gated Experts for Controllable Model Additivity • Paper • 2602.10585 • Published Feb 11 • 2
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models • Paper • 2602.12036 • Published Feb 12 • 94
Routing the Lottery: Adaptive Subnetworks for Heterogeneous Data • Paper • 2601.22141 • Published Jan 29 • 4
ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas • Paper • 2601.21558 • Published Jan 29 • 61
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning • Paper • 2512.07461 • Published Dec 8, 2025 • 80
BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining • Paper • 2508.10975 • Published Aug 14, 2025 • 60
Technical Report: Full-Stack Fine-Tuning for the Q Programming Language • Paper • 2508.06813 • Published Aug 9, 2025 • 6