SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer Paper • 2605.15178 • Published 3 days ago • 58
CausalCine: Real-Time Autoregressive Generation for Multi-Shot Video Narratives Paper • 2605.12496 • Published 5 days ago • 26
TextLDM: Language Modeling with Continuous Latent Diffusion Paper • 2605.07748 • Published 9 days ago • 25
DialogueSidon Demo 🔥: Separate two speakers from an audio or video recording Space • Running on Zero • Agents • 8
FlashLabs Chroma 1.0: A Real-Time End-to-End Spoken Dialogue Model with Personalized Voice Cloning Paper • 2601.11141 • Published Jan 16 • 23