MAGNeT-Medium-30secs-MLX-8bit

MLX port of Meta's MAGNeT, a masked parallel text-to-music model, with INT8 weight-only quantization for on-device generation on Apple Silicon. It uses the EnCodec 32 kHz audio decoder, a T5-base text encoder, and per-codebook iterative decoding with restricted-context self-attention.
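
To make "INT8 weight-only" concrete, here is a minimal pure-Python sketch of group-wise affine quantization. It assumes one scale and one bias per group of weights, and uses 64 as the default group size to match this card. The function names are hypothetical, and this is not MLX's actual storage layout or kernel, only an illustration of the idea.

```python
def quantize_group(values, bits=8):
    """Affine-quantize one group of floats to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    # Fall back to a scale of 1.0 when the group is constant (hi == lo).
    scale = (hi - lo) / levels or 1.0
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo  # lo acts as the per-group bias


def dequantize_group(codes, scale, bias):
    """Recover approximate floats from integer codes."""
    return [c * scale + bias for c in codes]


def quantize_row(row, group_size=64, bits=8):
    """Split a weight row into fixed-size groups and quantize each independently."""
    if len(row) % group_size:
        raise ValueError("row length must be a multiple of group_size")
    return [
        quantize_group(row[i:i + group_size], bits)
        for i in range(0, len(row), group_size)
    ]
```

Only the weights are stored this way; activations stay in floating point, which is what "weight-only" refers to. Smaller groups track the local weight range more closely at the cost of storing more scales and biases.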

Model

Parameters (LM): 1.5B
Quantization: INT8 weight-only, group size 64
Format: MLX safetensors
Sample rate: 32 kHz mono
Output length: 30 s per generation (fixed)
Decoding steps: 50 total ([20, 10, 10, 10] across 4 RVQ codebooks)
Bundle size: 2221 MB on disk
Source: facebook/magnet-medium-30secs
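
The decoding-step split can be checked with a short sketch. It assumes a cosine masking schedule and a 50 Hz token rate for the 32 kHz EnCodec codec; neither number is stated in this card, so treat both as assumptions. The helper names are hypothetical.

```python
import math

DECODING_STEPS = [20, 10, 10, 10]  # per-codebook steps, as listed in this card
FRAME_RATE_HZ = 50                 # assumption: EnCodec 32 kHz token rate
DURATION_S = 30                    # fixed output length from this card


def masked_after_step(step, num_steps, seq_len):
    """Tokens still masked after `step` (1-based) under an assumed cosine schedule."""
    return int(seq_len * math.cos(math.pi * step / (2 * num_steps)))


def schedule(seq_len=FRAME_RATE_HZ * DURATION_S):
    """Return, per codebook, the masked-token count after each decoding step."""
    return [
        [masked_after_step(s, n, seq_len) for s in range(1, n + 1)]
        for n in DECODING_STEPS
    ]


total_steps = sum(DECODING_STEPS)  # 20 + 10 + 10 + 10
plan = schedule()
```

Under these assumptions each codebook starts fully masked and reaches zero masked tokens on its last step, with the first codebook given twice as many steps as each of the other three.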

Performance (Apple Silicon, 30 s audio)

Real-time factor (wall / audio): 1.20
Peak RSS: 3045 MB
CLAP score (laion/clap-htsat-unfused, 5 prompts): 0.345
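
The real-time factor is wall-clock time divided by audio duration, so a value of 1.20 for a 30 s clip corresponds to about 36 s of generation time. A small sketch of that arithmetic, plus a timing helper; `generate` is a hypothetical zero-argument callable standing in for whatever produces the audio, not an API of this bundle.

```python
import time


def wall_time_seconds(rtf, audio_seconds):
    """Wall-clock time implied by a real-time factor (wall / audio)."""
    return rtf * audio_seconds


def measure_rtf(generate, audio_seconds):
    """Time one call to `generate` and return wall time / audio duration."""
    start = time.perf_counter()
    generate()
    elapsed = time.perf_counter() - start
    return elapsed / audio_seconds


expected_wall = wall_time_seconds(1.20, 30.0)  # roughly 36 seconds
```

A factor above 1.0 means generation is slower than playback, so this model does not stream in real time on the measured hardware.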

Usage

from huggingface_hub import snapshot_download
bundle = snapshot_download("aufklarer/MAGNeT-Medium-30secs-MLX-8bit")
# See https://github.com/soniqo/speech-swift for production usage.
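
After downloading, it can be useful to confirm what the bundle contains and how large it is. The sketch below uses only the standard library; the helper names are hypothetical, and whether the card's "2221 MB" uses 10^6 or 2^20 bytes per MB is not stated, so the divisor is left as a parameter.

```python
from pathlib import Path


def bundle_size_mb(bundle_dir, bytes_per_mb=1024 * 1024):
    """Total size of all files under bundle_dir, in MB (divisor is an assumption)."""
    files = [p for p in Path(bundle_dir).rglob("*") if p.is_file()]
    return sum(p.stat().st_size for p in files) / bytes_per_mb


def list_weights(bundle_dir):
    """Names of the safetensors weight files found in the bundle."""
    return sorted(p.name for p in Path(bundle_dir).rglob("*.safetensors"))
```

With the `bundle` path returned by `snapshot_download` above, `bundle_size_mb(bundle)` and `list_weights(bundle)` give a quick sanity check before loading the weights.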

Source

License

CC-BY-NC 4.0 — inherited from upstream MAGNeT weights. Non-commercial use only.
