Updated gemma-4-E4B-it metrics
I noticed the chat template got updated, and tried it on the E4B, with surprising results in stabilizing the brainwave.
quant arc arc/e boolq hswag obkqa piqa wino
mxfp8 0.480,0.656,0.797,0.608,0.400,0.755,0.665
mxfp4 0.455,0.607,0.851,0.585,0.402,0.744,0.651
Quant Perplexity Peak Memory Tokens/sec
mxfp8 35.937 ± 0.525 14.80 GB 1153
mxfp4 36.746 ± 0.534 11.06 GB 1030
Old numbers
quant arc arc/e boolq hswag obkqa piqa wino
mxfp8 0.404,0.489,0.825,0.586,0.392,0.734,0.661
mxfp4 0.414,0.508,0.854,0.562,0.378,0.717,0.645
Quant Perplexity Peak Memory Tokens/sec
mxfp8 34.652 ± 0.502 14.80 GB 1146
mxfp4 35.203 ± 0.506 11.06 GB 1200
I will re-do all baselines soon based on the new template. It is completely expected that the model behavior will change as a result.
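To make the shift easier to read, here is a minimal sketch that subtracts the old-template scores from the new-template scores for the base model. The numbers are copied from the tables above; the names `BENCHMARKS`, `NEW`, `OLD`, `deltas`, and `report` are only illustrative and are not part of any evaluation tool.

```python
# Benchmark order follows the table header used in this post.
BENCHMARKS = ["arc", "arc/e", "boolq", "hswag", "obkqa", "piqa", "wino"]

# Scores copied from the tables above (gemma-4-E4B-it, new vs. old template).
NEW = {
    "mxfp8": [0.480, 0.656, 0.797, 0.608, 0.400, 0.755, 0.665],
    "mxfp4": [0.455, 0.607, 0.851, 0.585, 0.402, 0.744, 0.651],
}
OLD = {
    "mxfp8": [0.404, 0.489, 0.825, 0.586, 0.392, 0.734, 0.661],
    "mxfp4": [0.414, 0.508, 0.854, 0.562, 0.378, 0.717, 0.645],
}


def deltas(new, old):
    """Return new minus old for each benchmark, rounded to 3 decimals."""
    return [round(n - o, 3) for n, o in zip(new, old)]


def report():
    """Build one signed-delta line per quant."""
    lines = []
    for quant in NEW:
        d = deltas(NEW[quant], OLD[quant])
        row = " ".join(f"{name}={value:+.3f}" for name, value in zip(BENCHMARKS, d))
        lines.append(f"{quant} {row}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

On these numbers the largest gains are on arc and arc/e for both quants, while boolq drops slightly; the other benchmarks move up by smaller amounts.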
Here are the effects of the new template on a few known distills from DavidAU:
gemma-4-E4B-it-The-DECKARD-Expresso-Universe-HERETIC-UNCENSORED
quant arc arc/e boolq hswag obkqa piqa wino
New template
mxfp8 0.508,0.707,0.756,...
mxfp4 0.482,0.680,0.790,...
Old template
mxfp8 0.506,0.697,0.754,0.661,0.416,0.757,0.627
mxfp4 0.487,0.670,0.792,0.644,0.430,0.748,0.624
gemma-4-E4B-it-GLM-4.7-Flash-HERETIC-UNCENSORED-Thinking
New template
mxfp8 0.461,0.599,0.779,...
Old template
mxfp8 0.456,0.580,0.786,0.629,0.410,0.764,0.633
gemma-4-E4B-it-Claude-Opus-4.5-HERETIC-UNCENSORED-Thinking
New template
mxfp8 0.508,0.705,0.806,...
Old template
mxfp8 0.502,0.692,0.809,0.650,0.420,0.771,0.651