FlashVID: Efficient Video Large Language Models via Training-free Tree-based Spatiotemporal Token Merging Paper • 2602.08024 • Published Feb 8 • 2
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations Paper • 2510.23607 • Published Oct 27, 2025 • 181
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception Paper • 2505.04410 • Published May 7, 2025 • 44
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models Paper • 2505.04921 • Published May 8, 2025 • 187
LLM4SR: A Survey on Large Language Models for Scientific Research Paper • 2501.04306 • Published Jan 8, 2025 • 35
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs Paper • 2406.18629 • Published Jun 26, 2024 • 42