260313 - an e-tuanzi Collection

updated Mar 13

GLM-5: from Vibe Coding to Agentic Engineering

Paper • 2602.15763 • Published Feb 17 • 144
DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning

Paper • 2602.16742 • Published Feb 18 • 12
From Perception to Action: An Interactive Benchmark for Vision Reasoning

Paper • 2602.21015 • Published Feb 24 • 23
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs

Paper • 2603.09906 • Published Mar 10 • 75
RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies

Paper • 2603.04639 • Published Mar 4 • 29
VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection

Paper • 2603.00912 • Published Mar 1 • 40
RealWonder: Real-Time Physical Action-Conditioned Video Generation

Paper • 2603.05449 • Published Mar 5 • 12
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence

Paper • 2603.07660 • Published Mar 8 • 86
CAST: Modeling Visual State Transitions for Consistent Video Retrieval

Paper • 2603.08648 • Published Mar 9 • 5
MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents

Paper • 2603.09827 • Published Mar 10 • 30