gemma-4-E2B-it-mxfp4-mlx

Brainwaves

         arc    arc/e  boolq  hswag  obkqa  piqa   wino
bf16     0.389  0.465  0.762  0.486  0.372  0.707  0.641
mxfp8    0.376  0.464  0.743  0.490  0.378  0.709  0.622
q8-hi    0.392  0.462  0.762  0.487  0.376  0.706  0.636
qx86-hi  0.387  0.461  0.766  0.483  0.392  0.699  0.623
mxfp4    0.380  0.451  0.762  0.494  0.374  0.699  0.594
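For a quick head-to-head comparison, the rows above can be averaged per quantization in a few lines of Python (scores copied verbatim from the table; the averaging itself is just an illustration, not part of the benchmark harness):

```python
# Benchmark scores from the table above, in column order:
# arc, arc/e, boolq, hswag, obkqa, piqa, wino
scores = {
    "bf16":    [0.389, 0.465, 0.762, 0.486, 0.372, 0.707, 0.641],
    "mxfp8":   [0.376, 0.464, 0.743, 0.490, 0.378, 0.709, 0.622],
    "q8-hi":   [0.392, 0.462, 0.762, 0.487, 0.376, 0.706, 0.636],
    "qx86-hi": [0.387, 0.461, 0.766, 0.483, 0.392, 0.699, 0.623],
    "mxfp4":   [0.380, 0.451, 0.762, 0.494, 0.374, 0.699, 0.594],
}

# Mean accuracy per quantization, printed best-first
means = {q: sum(v) / len(v) for q, v in scores.items()}
for q, m in sorted(means.items(), key=lambda kv: -kv[1]):
    print(f"{q:8s} {m:.3f}")
```

The spread between bf16 and mxfp4 is about one point on average, with wino showing the largest single drop.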

         Perplexity        Peak Memory  Tokens/sec
mxfp8    170.519 ± 3.170   11.78 GB     2174
q8-hi    133.388 ± 2.383   11.21 GB     1889
qx86-hi  125.278 ± 2.215   11.87 GB     1856
mxfp4    140.693 ± 2.546    9.48 GB     2352
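To put the footprint numbers in perspective, the mxfp4 savings relative to q8-hi follow directly from the table (figures copied from above; the comparison is an illustrative sketch):

```python
# Peak memory (GB) and throughput (tokens/sec) from the table above
q8_mem, q8_tps = 11.21, 1889
fp4_mem, fp4_tps = 9.48, 2352

mem_saving = 1 - fp4_mem / q8_mem  # fraction of peak memory saved
speedup = fp4_tps / q8_tps         # throughput ratio
print(f"mxfp4 vs q8-hi: {mem_saving:.1%} less memory, {speedup:.2f}x faster")
```

That is roughly 15% less peak memory and a ~1.25x throughput gain, at the cost of a few perplexity points.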

See parent model for instructions on install and use with Transformers.

-G

