Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps Paper • 2605.16928 • Published 9 days ago • 87