OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models Paper • 2604.10866 • Published 4 days ago
P-Aligner: Enabling Pre-Alignment of Language Models via Principled Instruction Synthesis Paper • 2508.04626 • Published Aug 6, 2025
Mitigating Overthinking through Reasoning Shaping Paper • 2510.09535 • Published Oct 10, 2025
Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding Paper • 2506.07434 • Published Jun 9, 2025