Training Large Language Models to Predict Clinical Events Paper • 2605.12817 • Published 18 days ago • 17
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 17 days ago • 269
PhyMotion: Structured 3D Motion Reward for Physics-Grounded Human Video Generation Paper • 2605.14269 • Published 16 days ago • 9
Mean Mode Screaming: Mean--Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published 23 days ago • 231
SketchVLM: Vision language models can annotate images to explain thoughts and guide users Paper • 2604.22875 • Published Apr 23 • 35
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 242