Article Ultra-Long Sequence Parallelism: Ulysses + Ring-Attention Technical Principles and Implementation exploding-gradients • Sep 16, 2025 • 20
ROCKET: Rapid Optimization via Calibration-guided Knapsack Enhanced Truncation for Efficient Model Compression Paper • 2602.11008 • Published Feb 11 • 18
COSPADI: Compressing LLMs via Calibration-Guided Sparse Dictionary Learning Paper • 2509.22075 • Published Sep 26, 2025 • 23
ReplaceMe: Network Simplification via Depth Pruning and Transformer Block Linearization Paper • 2505.02819 • Published Feb 19 • 26
Article Vision Language Models (Better, faster, stronger) +3 merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 611
Iterative Self-Training for Code Generation via Reinforced Re-Ranking Paper • 2504.09643 • Published Apr 13, 2025 • 34
Article Efficient Deep Learning: A Comprehensive Overview of Optimization Techniques 👐 📚 Isayoften • Aug 26, 2024 • 89
LLMs (Multi-verse collection) Collection This is a group of our models that are trained using our new training technique • 3 items • Updated Mar 2 • 1