A fine-tuned ModernBERT-base model for multi-label subject classification of educational web text. Given a passage of text, it predicts which of 17 academic/professional subject categories apply.
| Property | Value |
|---|---|
| Base model | answerdotai/ModernBERT-base |
| Architecture | ModernBertForSequenceClassification |
| Task | Multi-label classification |
| Number of labels | 17 |
| Max input length | 512 tokens |
| Hidden size | 768 |
| Attention heads | 12 |
| Transformer layers | 22 (alternating full + sliding window attention) |
| Pooling | Mean pooling |
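
The "Mean pooling" entry above can be illustrated with a small sketch. It assumes that padding positions are excluded from the average via the attention mask, which this card does not state explicitly; the function name `mean_pool` is hypothetical, and plain Python lists stand in for tensors.

```python
from typing import List


def mean_pool(hidden_states: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average token vectors, skipping positions where attention_mask is 0.

    Assumption: padding tokens are masked out before averaging.
    """
    if len(hidden_states) != len(attention_mask):
        raise ValueError("hidden_states and attention_mask must have the same length")
    # Keep only the vectors of real (unmasked) tokens.
    kept = [vec for vec, m in zip(hidden_states, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention_mask has no unmasked tokens")
    dim = len(kept[0])
    # Average each hidden dimension across the kept tokens.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Two real tokens and one padding token; the padding vector is ignored.
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]], [1, 1, 0])
```

In the real model each vector would have 768 entries (the hidden size listed above), and the pooled vector would feed the 17-way classification head.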

| Index | Field | Display Name |
|---|---|---|
| 0 | mathematics_statistics | Mathematics Statistics |
| 1 | computer_science_software_engineering | Computer Science Software Engineering |
| 2 | machine_learning_ai | Machine Learning AI |
| 3 | physical_sciences | Physical Sciences |
| 4 | life_sciences_biology | Life Sciences Biology |
| 5 | medicine_health | Medicine Health |
| 6 | engineering_technology | Engineering Technology |
| 7 | business_economics | Business Economics |
| 8 | law_government | Law Government |
| 9 | social_sciences | Social Sciences |
| 10 | history_geography | History Geography |
| 11 | philosophy_ethics | Philosophy Ethics |
| 12 | education_pedagogy | Education Pedagogy |
| 13 | language_writing | Language Writing |
| 14 | arts_humanities | Arts Humanities |
| 15 | environmental_science_energy | Environmental Science Energy |
| 16 | personal_finance_practical_life | Personal Finance Practical Life |

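
Because the task is multi-label, each of the 17 outputs is scored independently rather than through a softmax over all classes. The sketch below shows one common way to turn a vector of 17 logits into predicted labels, using the index order from the table above. The 0.5 threshold and the helper names (`sigmoid`, `decode_labels`) are assumptions for illustration; the card does not state which decision threshold was used.

```python
import math
from typing import Dict, List

# Label order follows the index column of the table above.
LABELS = [
    "mathematics_statistics",
    "computer_science_software_engineering",
    "machine_learning_ai",
    "physical_sciences",
    "life_sciences_biology",
    "medicine_health",
    "engineering_technology",
    "business_economics",
    "law_government",
    "social_sciences",
    "history_geography",
    "philosophy_ethics",
    "education_pedagogy",
    "language_writing",
    "arts_humanities",
    "environmental_science_energy",
    "personal_finance_practical_life",
]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_labels(logits: List[float], threshold: float = 0.5) -> Dict[str, float]:
    """Map 17 logits to {label: probability} for labels at or above the threshold.

    Assumption: 0.5 is a conventional default, not a value documented for this model.
    """
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    probs = [sigmoid(z) for z in logits]
    return {label: p for label, p in zip(LABELS, probs) if p >= threshold}


# Made-up logits: only indices 0 and 2 clear the threshold.
example_logits = [-5.0] * 17
example_logits[0] = 2.0
example_logits[2] = 0.1
predicted = decode_labels(example_logits)
```

A passage can receive zero, one, or several labels; raising the threshold trades recall for precision.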
| Hyperparameter | Value |
|---|---|
| Epochs | 3 |
| Batch size | 32 |
| Learning rate | 2e-5 |
| Weight decay | 0.01 |
| Warmup ratio | 0.1 |
| Max token length | 512 |
| Optimizer | AdamW |
| Scheduler | Linear with warmup |
| AMP | bf16 (on CUDA) |
| Gradient clipping | max norm 1.0 |
The released checkpoint is the one saved at the epoch with the best validation micro-F1 (epoch 2).
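
To make the "Linear with warmup" scheduler concrete, the following sketch computes the learning rate at a given optimizer step from the peak learning rate (2e-5) and warmup ratio (0.1) in the table above. The function name `linear_warmup_lr` and the total step count in the example are hypothetical, since the card does not report the number of training steps; the shape mirrors the usual linear warmup followed by linear decay to zero.

```python
def linear_warmup_lr(
    step: int,
    total_steps: int,
    peak_lr: float = 2e-5,
    warmup_ratio: float = 0.1,
) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up linearly from 0 to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run of 1000 optimizer steps (the real step count is not given).
schedule = [linear_warmup_lr(s, total_steps=1000) for s in (0, 50, 100, 550, 1000)]
```

With these assumed numbers the rate starts at zero, reaches 2e-5 after 100 steps, and falls back to zero at step 1000.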
| Metric | Score |
|---|---|
| Micro F1 | 0.8545 |
| Macro F1 | 0.8264 |
| Precision (micro) | 0.8799 |
| Recall (micro) | 0.8304 |
| Loss | 0.1222 |
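
The gap between micro and macro F1 comes from how the per-label counts are aggregated: micro F1 pools true positives, false positives, and false negatives across all labels before computing one score, while macro F1 averages the per-label F1 scores so that rare labels weigh as much as common ones. The sketch below is a minimal pure-Python illustration on made-up data, not the evaluation code used for this model; the function name `micro_macro_f1` and the choice to score a label with no positives as 0.0 are assumptions.

```python
from typing import List, Tuple


def micro_macro_f1(y_true: List[List[int]], y_pred: List[List[int]]) -> Tuple[float, float]:
    """Return (micro_f1, macro_f1) for binary multi-label indicator matrices."""
    n_labels = len(y_true[0])
    tp = [0] * n_labels
    fp = [0] * n_labels
    fn = [0] * n_labels
    for true_row, pred_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(true_row, pred_row)):
            if t and p:
                tp[j] += 1
            elif p and not t:
                fp[j] += 1
            elif t and not p:
                fn[j] += 1

    def f1(tp_: int, fp_: int, fn_: int) -> float:
        # F1 = 2TP / (2TP + FP + FN); defined here as 0.0 when the denominator is 0.
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    micro = f1(sum(tp), sum(fp), sum(fn))
    macro = sum(f1(tp[j], fp[j], fn[j]) for j in range(n_labels)) / n_labels
    return micro, macro


# Toy data with two labels: label 0 is predicted perfectly, label 1 is always missed.
toy_true = [[1, 0], [1, 0], [0, 1]]
toy_pred = [[1, 0], [1, 0], [0, 0]]
micro, macro = micro_macro_f1(toy_true, toy_pred)
```

On the toy data the micro score is 0.8 and the macro score is 0.5, the same direction of gap as in the table (micro above macro), which typically indicates weaker performance on less frequent labels. As a consistency check, the reported micro precision and recall give 2PR / (P + R) of about 0.854, in line with the reported micro F1.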