How to use rausch/ru-t5-sci-transfer-init-spm32k with Transformers:

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("rausch/ru-t5-sci-transfer-init-spm32k")
model = AutoModelForSeq2SeqLM.from_pretrained("rausch/ru-t5-sci-transfer-init-spm32k")
```
RU-Trans-Init
Russian scientific T5 model initialized from EN-T5-Sci using WECHSEL and a language-specific SentencePiece 32k tokenizer.
Model Details
This is one of the non-English scientific T5 transfer models from the paper. The model keeps the EN-T5-Sci Transformer weights and reinitializes the language-specific embeddings with WECHSEL using a target SentencePiece tokenizer.
- Paper name: RU-Trans-Init
- Model role: main
- Source/base model: EN-T5-Sci
- Code and pipeline: GitHub repository
- Architecture: T5 encoder-decoder
- SciLaD dataset: scilons/SciLaD-all-text-v1
- Evaluation benchmark: Global-MMLU
- Target-language tokenizer: Russian SciLaD split; language-specific SentencePiece 32k tokenizer
Evaluated against:
- RU-Base-CP control: reported as the continued-pretraining control.
- Upstream target-language base: reported as the monolingual base comparison.
WECHSEL resources: English fastText embeddings + Russian fastText embeddings (ru) with the Russian bilingual dictionary.
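The embedding reinitialization described above can be illustrated with a minimal NumPy sketch. It is a simplification under stated assumptions, not the paper's pipeline: the function name `wechsel_style_init`, the toy matrices, and the values of `k` and `temperature` are all hypothetical, and the target static vectors are assumed to be already aligned into the source space (the step for which WECHSEL uses the bilingual dictionary). Each target token receives a similarity-weighted average of the trained embeddings of its nearest source tokens.

```python
import numpy as np

def wechsel_style_init(src_emb, src_static, tgt_static, k=2, temperature=0.1):
    """Initialize target-token embeddings from trained source-token embeddings.

    src_emb:    (n_src, d_model) embedding matrix of the source model
    src_static: (n_src, d_static) static vectors for the source tokens
    tgt_static: (n_tgt, d_static) static vectors for the target tokens,
                ASSUMED to be aligned into the source space already
    """
    # Cosine similarity between every target token and every source token.
    src_n = src_static / np.linalg.norm(src_static, axis=1, keepdims=True)
    tgt_n = tgt_static / np.linalg.norm(tgt_static, axis=1, keepdims=True)
    sim = tgt_n @ src_n.T  # shape (n_tgt, n_src)

    # Keep the k most similar source tokens for each target token.
    top_idx = np.argsort(-sim, axis=1)[:, :k]
    top_sim = np.take_along_axis(sim, top_idx, axis=1)

    # Temperature-scaled softmax over the retained similarities.
    logits = top_sim / temperature
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)

    # Weighted average of the neighbours' source-model embeddings.
    return np.einsum("tk,tkd->td", weights, src_emb[top_idx])

# Toy data (hypothetical): 3 source tokens, 2 target tokens.
src_static = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
tgt_static = np.array([[0.9, 0.1, 0.0], [0.0, 0.2, 0.8]])
src_emb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

tgt_emb = wechsel_style_init(src_emb, src_static, tgt_static)
print(tgt_emb)
```

In this toy case each target row ends up close to the embedding of its single nearest source token, because the low temperature concentrates the weights. The Transformer weights themselves are untouched by this step; only the embedding matrix is replaced.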
Evaluation
Zero-shot Global-MMLU accuracy, as aggregated in the paper:
| Metric | Accuracy (%) |
|---|---|
| Average | 26.36 |
| STEM | 27.12 |
| Humanities | 24.89 |
| Social Sciences | 28.86 |
| Other | 25.33 |
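To put these numbers in context, the short calculation below compares the reported average with the chance level and with the unweighted mean of the four categories. It assumes four answer options per question, as in MMLU-style benchmarks, so chance is 25%; the variable names are illustrative only.

```python
# Scores copied from the table above (accuracy in percent).
category_scores = {
    "STEM": 27.12,
    "Humanities": 24.89,
    "Social Sciences": 28.86,
    "Other": 25.33,
}
reported_average = 26.36

# Assumption: four answer options per question, so random guessing gives 25%.
chance_level = 25.0

# Unweighted (macro) mean over the four categories.
macro_average = sum(category_scores.values()) / len(category_scores)

# Margin of the reported average over random guessing.
margin_over_chance = reported_average - chance_level

print(f"macro average:      {macro_average:.2f}")
print(f"margin over chance: {margin_over_chance:.2f}")
```

The unweighted category mean does not equal the reported average. One possible reading is that the reported figure is averaged over questions rather than over categories, but the card does not say so, and that interpretation is an assumption.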
Limitations
The model is evaluated primarily with zero-shot Global-MMLU. Downstream task-specific evaluation is recommended before deployment in specialized scientific workflows.
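A task-specific check of the kind recommended here could start from a small harness like the following sketch. It is not the paper's evaluation code: the names `predict_choice`, `accuracy`, and `toy_score` are hypothetical, and the scoring function is a placeholder. In practice `score_fn` would be replaced by something that queries the model, for example the log-likelihood the model assigns to each answer option, which is also an assumption about how one might score options.

```python
from typing import Callable, Dict, List

def predict_choice(question: str, options: List[str],
                   score_fn: Callable[[str, str], float]) -> int:
    """Return the index of the highest-scoring answer option."""
    scores = [score_fn(question, option) for option in options]
    return max(range(len(options)), key=scores.__getitem__)

def accuracy(examples: List[Dict], score_fn: Callable[[str, str], float]) -> float:
    """Percentage of examples whose predicted option matches the gold index."""
    correct = sum(
        predict_choice(ex["question"], ex["options"], score_fn) == ex["answer"]
        for ex in examples
    )
    return 100.0 * correct / len(examples)

# Placeholder scorer (hypothetical): stands in for a model-based score.
def toy_score(question: str, option: str) -> float:
    return 1.0 if option in {"4", "H2O"} else 0.0

# Tiny hand-written examples, for illustration only.
examples = [
    {"question": "2 + 2 = ?", "options": ["3", "4", "5", "6"], "answer": 1},
    {"question": "Formula of water?", "options": ["CO2", "H2O", "NaCl", "O2"], "answer": 1},
    {"question": "Largest planet?", "options": ["Mars", "Venus", "Jupiter", "Earth"], "answer": 2},
]

acc = accuracy(examples, toy_score)
print(f"accuracy: {acc:.2f}%")
```

Keeping the scorer pluggable means the same loop can be reused for domain-specific question sets without changing the accuracy computation.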
Citation
- Title: Transferring Scientific English Pre-Trained Language Models to Multiple Languages Using Cross-Lingual Transfer
- Authors: Nikolas Rauscher, Fabio Barth, Georg Rehm
- Venue: LREC-COLING 2026, citation details TBA after publication
Model tree for rausch/ru-t5-sci-transfer-init-spm32k
Base model: rausch/en-t5-sci-continued-pretraining-487k