DFlash • Collection • Block Diffusion for Flash Speculative Decoding • 14 items • Updated about 3 hours ago • 61
SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights • Paper • 2509.22944 • Published Sep 26, 2025 • 81
Granite 4.0 Language Models • Collection • Efficient language models for multilingual generation, coding, RAG, and AI assistant workflows. • 11 items • Updated 15 days ago • 216
story writing favourites • Collection • Models I personally liked for generating stories in the past. Not a recommendation, most of these are outdated. • 17 items • Updated Mar 2 • 98
SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens • Paper • 2508.05305 • Published Aug 7, 2025 • 47