Collection: gemma-4, in mxfp4, mxfp8, and Deckard(qx) (26 items)
Brainwaves
Quant     arc    arc/e  boolq  hswag  obkqa  piqa   wino
bf16      0.389  0.465  0.762  0.486  0.372  0.707  0.641
mxfp8     0.376  0.464  0.743  0.490  0.378  0.709  0.622
q8-hi     0.392  0.462  0.762  0.487  0.376  0.706  0.636
qx86-hi   0.387  0.461  0.766  0.483  0.392  0.699  0.623
mxfp4     0.380  0.451  0.762  0.494  0.374  0.699  0.594
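To make the table above easier to compare, here is a minimal Python sketch that computes per-task differences against the bf16 row and an unweighted mean per quant. The scores are copied from the table; the helper names (`mean_score`, `deltas_vs_baseline`) and the choice of an unweighted mean as a summary are assumptions made for illustration, not something the model card defines.

```python
# Scores copied from the table above; column order matches its header row.
TASKS = ["arc", "arc/e", "boolq", "hswag", "obkqa", "piqa", "wino"]
SCORES = {
    "bf16":    [0.389, 0.465, 0.762, 0.486, 0.372, 0.707, 0.641],
    "mxfp8":   [0.376, 0.464, 0.743, 0.490, 0.378, 0.709, 0.622],
    "q8-hi":   [0.392, 0.462, 0.762, 0.487, 0.376, 0.706, 0.636],
    "qx86-hi": [0.387, 0.461, 0.766, 0.483, 0.392, 0.699, 0.623],
    "mxfp4":   [0.380, 0.451, 0.762, 0.494, 0.374, 0.699, 0.594],
}


def mean_score(name):
    """Unweighted mean over the seven tasks (an illustrative summary only)."""
    values = SCORES[name]
    return sum(values) / len(values)


def deltas_vs_baseline(name, baseline="bf16"):
    """Per-task score difference (quant minus baseline), rounded to 3 places."""
    return {
        task: round(q - b, 3)
        for task, q, b in zip(TASKS, SCORES[name], SCORES[baseline])
    }


if __name__ == "__main__":
    for name in SCORES:
        print(f"{name:8s} mean={mean_score(name):.3f} {deltas_vs_baseline(name)}")
```

An unweighted mean treats all seven tasks as equally important, which may not match how you weigh them; the per-task deltas are the more direct reading. For example, the table shows mxfp4 matching bf16 on boolq (0.762) while dropping on wino (0.594 vs 0.641).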
Quant     Perplexity        Peak Memory  Tokens/sec
mxfp8     170.519 ± 3.170   11.78 GB     2174
q8-hi     133.388 ± 2.383   11.21 GB     1889
qx86-hi   125.278 ± 2.215   11.87 GB     1856
mxfp4     140.693 ± 2.546   9.48 GB      2352
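The perplexity, memory, and throughput figures above trade off against each other, so a small sketch for ranking them may help. The numbers are copied from the table; the dictionary keys (`ppl`, `mem_gb`, `tok_s`), the helper names, and the use of mxfp8 as a comparison baseline are assumptions for illustration (this table has no bf16 row to compare against).

```python
# Figures copied from the table above.
RUNS = {
    "mxfp8":   {"ppl": 170.519, "ppl_err": 3.170, "mem_gb": 11.78, "tok_s": 2174},
    "q8-hi":   {"ppl": 133.388, "ppl_err": 2.383, "mem_gb": 11.21, "tok_s": 1889},
    "qx86-hi": {"ppl": 125.278, "ppl_err": 2.215, "mem_gb": 11.87, "tok_s": 1856},
    "mxfp4":   {"ppl": 140.693, "ppl_err": 2.546, "mem_gb": 9.48,  "tok_s": 2352},
}


def rank_by(metric, reverse=False):
    """Return quant names sorted by a metric (ascending unless reverse=True)."""
    return sorted(RUNS, key=lambda name: RUNS[name][metric], reverse=reverse)


def relative_change(metric, name, baseline):
    """Fractional change of `name` relative to `baseline` for one metric."""
    base = RUNS[baseline][metric]
    return (RUNS[name][metric] - base) / base


if __name__ == "__main__":
    print("lowest perplexity first:", rank_by("ppl"))
    print("fastest first:", rank_by("tok_s", reverse=True))
    change = relative_change("mem_gb", "mxfp4", "mxfp8")
    print(f"mxfp4 peak memory vs mxfp8: {change:+.1%}")
```

Lower perplexity is better, while higher tokens/sec is better, hence the `reverse` flag. The ranking ignores the ± uncertainty on perplexity, so treat close values as ties.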
See the parent model for instructions on installing and using it with Transformers.
-G
4-bit
Base model: google/gemma-4-E2B-it