Sparse Query Attention (SQA) Research

updated Oct 3, 2025

Experimental models with Sparse Query Attention layers, reducing training time/cost by ~3-10% compared to GQA and MQA at the same level of performance.
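
As the paper title indicates, SQA cuts compute by reducing the number of query heads, whereas MQA/GQA reduce key/value heads; attention FLOPs scale roughly linearly with the query head count, which is where the training-cost saving comes from. The sketch below is a hypothetical PyTorch rendering of that idea, not the released SQAT code: the GQA-style KV sharing, the module name, and all hyperparameters are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseQueryAttention(nn.Module):
    """Hypothetical sketch of an SQA layer, based on the paper abstract.

    Unlike MQA/GQA, which shrink the key/value heads, SQA shrinks the
    number of query heads (n_q < n_heads). Score and value compute
    scale with the query head count, cutting FLOPs by ~n_heads / n_q.
    """

    def __init__(self, d_model: int, n_heads: int, n_q: int, n_kv: int):
        super().__init__()
        assert d_model % n_heads == 0 and n_q % n_kv == 0
        self.d_head = d_model // n_heads  # head size of the full-width baseline
        self.n_q, self.n_kv = n_q, n_kv
        self.q_proj = nn.Linear(d_model, n_q * self.d_head)
        self.k_proj = nn.Linear(d_model, n_kv * self.d_head)
        self.v_proj = nn.Linear(d_model, n_kv * self.d_head)
        # Attention output is narrower than d_model, so project back up.
        self.o_proj = nn.Linear(n_q * self.d_head, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_q, self.d_head).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.n_kv, self.d_head).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.n_kv, self.d_head).transpose(1, 2)
        # Assumed GQA-style sharing: each group of query heads reuses one KV head.
        k = k.repeat_interleave(self.n_q // self.n_kv, dim=1)
        v = v.repeat_interleave(self.n_q // self.n_kv, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, self.n_q * self.d_head))
```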

  • Sparse Query Attention (SQA): A Computationally Efficient Attention Mechanism with Query Heads Reduction

    Paper • arXiv:2510.01817 • Published Oct 2, 2025

  • ReactiveAI/sSQAT-mm

    Text Generation • 8.62M params • Updated Oct 3, 2025

  • ReactiveAI/SQAT-mm

    Text Generation • 8.57M params • Updated Oct 3, 2025

  • ReactiveAI/xSQAT-mm

    Text Generation • 8.52M params • Updated Oct 3, 2025

  • ReactiveAI/SQAT-m

    Text Generation • 10.7M params • Updated Oct 3, 2025

  • ReactiveAI/sSQAT-m

    Text Generation • 10.9M params • Updated Oct 3, 2025

  • ReactiveAI/xSQAT-m

    Text Generation • 10.4M params • Updated Oct 3, 2025

  • ReactiveAI/xSMQAT-m

    Text Generation • 10.2M params • Updated Oct 3, 2025

  • ReactiveAI/GQA-Ref-Micro

    Text Generation • 8.67M params • Updated Oct 3, 2025

  • ReactiveAI/MQA-Ref-Micro

    Text Generation • 8.64M params • Updated Oct 3, 2025
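
The checkpoints above should be loadable through the standard transformers Auto classes. This is a minimal sketch, assuming the repos ship custom modeling code (hence trust_remote_code=True) and expose a causal-LM interface with generation support; the repo name comes from the list above, everything else is illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "ReactiveAI/SQAT-mm"  # any model from the collection above

# Custom research architectures usually ship their own modeling code,
# hence trust_remote_code=True; drop it if the repo uses a stock class.
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)

inputs = tokenizer("Sparse Query Attention reduces", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```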