LMMs-Lab

community

https://www.lmms-lab.com/

EvolvingLMMs-Lab

AI & ML interests

Feeling and building the multimodal intelligence.

Recent Activity

caizhongang authored a paper 8 days ago

Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer

yl-1993 authored a paper 8 days ago

Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer

pufanyi authored a paper 12 days ago

Demystifing Video Reasoning

Papers

OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

authored a paper 8 days ago

Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer

Paper • 2603.19227 • Published 11 days ago • 42

authored a paper 12 days ago

Demystifing Video Reasoning

Paper • 2603.16870 • Published 13 days ago • 366

authored a paper 13 days ago

Demystifing Video Reasoning

Paper • 2603.16870 • Published 13 days ago • 366

authored a paper 20 days ago

Agentic Critical Training

Paper • 2603.08706 • Published 21 days ago • 13

in lmms-lab/HLE-Verified about 1 month ago

Clarification Regarding HLE-Verified Dataset Attribution

#1 opened about 1 month ago by

updated a dataset about 1 month ago

lmms-lab/HLE-Verified

Preview • Updated about 1 month ago • 3.78k • 3

authored 2 papers about 1 month ago

The Quest for Generalizable Motion Generation: Data, Model, and Evaluation

Paper • 2510.26794 • Published Oct 30, 2025 • 27

ConsistCompose: Unified Multimodal Layout Control for Image Composition

Paper • 2511.18333 • Published Nov 23, 2025 • 4

authored a paper about 1 month ago

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 517

published 2 datasets about 1 month ago

lmms-lab/minerva

Updated Feb 21 • 12

lmms-lab/HLE-Verified

Preview • Updated about 1 month ago • 3.78k • 3

authored a paper about 1 month ago

UniT: Unified Multimodal Chain-of-Thought Test-time Scaling

Paper • 2602.12279 • Published Feb 12 • 20

authored a paper about 1 month ago

OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Paper • 2602.08683 • Published Feb 9 • 52

submitted a paper to Daily Papers about 1 month ago

OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Paper • 2602.08683 • Published Feb 9 • 52

authored a paper about 2 months ago

ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder

Paper • 2510.18795 • Published Oct 21, 2025 • 11