OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published 3 days ago • 88
Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis Paper • 2603.29620 • Published Mar 31 • 46
Manifold-Aware Exploration for Reinforcement Learning in Video Generation Paper • 2603.21872 • Published Mar 23 • 33
Article: nanoVLM: The simplest repository to train your VLM in pure PyTorch • May 21, 2025 • 257
Song Generation 🎵: Generate a song from your lyrics and description • Running on L40S • Featured • 726
LeVo: High-Quality Song Generation with Multi-Preference Alignment Paper • 2506.07520 • Published Jun 9, 2025 • 8