This model was produced with Hardware-Aware Scheduled Quantization-Aware Training (QAT) using the Gradual snapping strategy. It was trained on WikiText-103 on a Kaggle TPU v5e-8.

| Metric | Value |
|---|---|
| Test Perplexity | 26.7619 |
| Test Loss | 3.2870 |
| KL Divergence | 0.949187 nats |
| Training Steps | 3,821 |
| Training Time | 14,581s (4.1 hours) |
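
As a quick sanity check on the table, perplexity is conventionally the exponential of the mean per-token cross-entropy loss. The sketch below assumes the reported test loss is measured in nats per token (the card does not say so explicitly) and simply confirms that the two reported figures agree to within rounding. It also derives the average throughput from the reported steps and wall-clock time.

```python
import math

# Values copied from the metrics table above.
test_loss = 3.2870        # assumed to be mean cross-entropy in nats per token
reported_ppl = 26.7619    # reported test perplexity
training_steps = 3_821
training_seconds = 14_581

# Perplexity is exp(loss) when the loss is in nats.
ppl_from_loss = math.exp(test_loss)
ppl_gap = abs(ppl_from_loss - reported_ppl)

# Wall-clock time in hours and average seconds per optimizer step.
hours = training_seconds / 3600
seconds_per_step = training_seconds / training_steps

print(f"exp(loss)        = {ppl_from_loss:.4f}")
print(f"gap vs. reported = {ppl_gap:.4f}")
print(f"training time    = {hours:.2f} h")
print(f"seconds per step = {seconds_per_step:.2f}")
```

The small gap between `exp(loss)` and the reported perplexity is what one would expect from the loss being rounded to four decimal places.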

Gradual [24, 16, 8]: a balanced transition strategy that spends an equal-duration phase at each bit-width in the schedule.
The even pacing gives a smooth progression, letting the model adapt to each quantization level before moving to the next.
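
To make "equal-duration phases" concrete, here is a minimal sketch of how such a schedule could map a training step to a bit-width. It assumes that `[24, 16, 8]` is the ordered list of bit-widths and that the 3,821 training steps are split evenly among them. The helper `bits_at_step` is hypothetical and is not taken from the training code, which may define phase boundaries differently.

```python
def bits_at_step(step: int, total_steps: int, widths: list[int]) -> int:
    """Return the bit-width active at `step` under equal-duration phases.

    Hypothetical helper: splits [0, total_steps) into len(widths) phases
    of (near-)equal length and returns the width for the phase holding `step`.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    # Integer arithmetic keeps phase boundaries exact and avoids float drift.
    phase = step * len(widths) // total_steps
    return widths[phase]


widths = [24, 16, 8]   # assumed ordered bit-width schedule
total_steps = 3821     # training steps reported in the metrics table

# Print the active bit-width just before and after each phase boundary.
for step in (0, 1273, 1274, 2547, 2548, 3820):
    print(step, bits_at_step(step, total_steps, widths))
```

With three phases over 3,821 steps, each phase under this assumption lasts roughly 1,274 steps.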
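
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hugging Face Hub repository for this checkpoint
model_id = "jpcurada/SmolLM2-1.7B-Scheduled-QAT-Gradual-INT4"

model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Quick smoke test: generate a short continuation
inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
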

```bash
# Run the GGUF build with llama.cpp
llama-cli -m smollm2-1.7b-sched-qat-gradual-Q4_K_M.gguf -p "Hello, world!"
```

Base model: HuggingFaceTB/SmolLM2-1.7B