MAGNeT-Small-30secs-MLX-4bit

MLX port of Meta's MAGNeT, a masked, parallel (non-autoregressive) text-to-music model, quantized to 4-bit (INT4) weight-only for on-device generation on Apple Silicon. It uses the EnCodec 32 kHz audio decoder and a T5-base text encoder, and generates audio tokens by per-codebook iterative decoding with restricted-context self-attention.
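To make "per-codebook iterative decoding" concrete, here is a minimal toy sketch in plain Python. It is not the upstream implementation: `toy_predict` is a hypothetical stand-in for the transformer, the cosine masking schedule, the 50 Hz frame rate, and the 2048-entry codebook are assumptions that this card does not state, masking is done per token rather than per span, and restricted-context attention is not modeled. Only the 30 s length and the [20, 10, 10, 10] step split come from this card.

```python
import math
import random

MASK = -1  # placeholder id for a position that has not been decided yet


def toy_predict(tokens, vocab_size, rng):
    """Hypothetical stand-in for the model: one (token, confidence) guess per position."""
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def decode_codebook(num_frames, steps, vocab_size=2048, seed=0):
    """Fill one codebook's token sequence in `steps` parallel refinement passes."""
    rng = random.Random(seed)
    tokens = [MASK] * num_frames
    for step in range(1, steps + 1):
        guesses = toy_predict(tokens, vocab_size, rng)
        # Assumed cosine schedule: how many positions remain masked after this step.
        keep_masked = int(num_frames * math.cos(math.pi * step / (2 * steps)))
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        # Commit the most confident guesses; the rest stay masked for later steps.
        masked.sort(key=lambda i: guesses[i][1], reverse=True)
        for i in masked[: len(masked) - keep_masked]:
            tokens[i] = guesses[i][0]
    return tokens


schedule = [20, 10, 10, 10]   # decoding steps per RVQ codebook (from this card)
frames = 30 * 50              # 30 s of audio at an assumed 50 Hz frame rate
codes = [decode_codebook(frames, steps, seed=k) for k, steps in enumerate(schedule)]
```

Each codebook is filled in a fixed number of passes regardless of sequence length, which is why the total step count (50) rather than the token count determines the number of model calls.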

Model

Parameters (LM): 300M
Quantization: INT4 weight-only, group size 64
Format: MLX safetensors
Sample rate: 32 kHz mono
Output length: 30 s per generation (fixed)
Decoding steps: 50 total ([20, 10, 10, 10] across 4 RVQ codebooks)
Bundle size: 499 MB on disk
Source: facebook/magnet-small-30secs
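As an illustration of what "INT4 weight-only, group size 64" means, the sketch below quantizes a row of weights in groups of 64, storing one scale and one bias per group. It is a plain-Python approximation of group-wise affine quantization, not MLX's actual kernel or storage layout. The function names are hypothetical, and the 16-bit width for scales and biases is an assumption.

```python
def quantize_group(values, bits=4):
    """Affine-quantize one group of floats to unsigned `bits`-bit integers."""
    levels = (1 << bits) - 1                  # 15 for 4-bit
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0         # fall back to 1.0 for a constant group
    q = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return q, scale, lo                       # lo acts as the per-group bias


def dequantize_group(q, scale, bias):
    """Reconstruct approximate floats from the integers of one group."""
    return [x * scale + bias for x in q]


def quantize_row(row, group_size=64, bits=4):
    """Split a weight row into consecutive groups and quantize each independently."""
    return [quantize_group(row[i:i + group_size], bits)
            for i in range(0, len(row), group_size)]


def bits_per_weight(group_size=64, bits=4, param_bits=16):
    """Effective storage per weight, assuming one scale and one bias per group."""
    return bits + 2 * param_bits / group_size
```

Under these assumptions the effective cost is 4.5 bits per weight, so 300M quantized LM weights would take roughly 300e6 * 4.5 / 8 bytes, about 169 MB. That estimate covers only the quantized LM weights and is not expected to match the bundle size listed above.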

Performance (Apple Silicon, 30 s audio)

Real-time factor (wall-clock time / audio duration): 0.28
Peak RSS: 1351 MB
CLAP score (laion/clap-htsat-unfused, 5 prompts): 0.409
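The real-time factor is easier to read as a wall-clock time. The short sketch below applies the definition given above (wall time divided by audio duration); the helper names are illustrative only.

```python
def wall_time_seconds(rtf, audio_seconds):
    """Wall-clock generation time implied by a real-time factor."""
    return rtf * audio_seconds


def speedup_over_real_time(rtf):
    """How many times faster than playback speed generation runs."""
    return 1.0 / rtf


wall = wall_time_seconds(0.28, 30)       # the reported RTF and clip length
speedup = speedup_over_real_time(0.28)
```

With the reported figures this comes to roughly 8.4 s of compute for a 30 s clip, or about 3.6 times faster than real time.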

Usage

from huggingface_hub import snapshot_download
# Download the bundle into the local Hugging Face cache; returns the local directory path.
bundle = snapshot_download("aufklarer/MAGNeT-Small-30secs-MLX-4bit")
# See https://github.com/soniqo/speech-swift for production usage.
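After downloading, it can be useful to check what the bundle contains before wiring it into an app. The following sketch uses only the standard library to total file sizes by extension under a directory; `summarize_bundle` is a hypothetical helper, not part of `huggingface_hub` or speech-swift, and the card does not document the bundle's file layout.

```python
import os
from collections import defaultdict


def summarize_bundle(path):
    """Return total bytes per file extension for all files under `path`."""
    sizes = defaultdict(int)
    for root, _dirs, files in os.walk(path):
        for name in files:
            ext = os.path.splitext(name)[1] or "(none)"
            # getsize follows symlinks, so cache entries report the real blob size.
            sizes[ext] += os.path.getsize(os.path.join(root, name))
    return dict(sizes)
```

Calling `summarize_bundle(bundle)` with the path returned by `snapshot_download` gives a per-extension breakdown whose sum can be compared against the on-disk size listed in the Model section.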

Source

License

CC-BY-NC 4.0, inherited from the upstream MAGNeT weights. Non-commercial use only.
