How to use lyf07/Translategemma-4B-it-WALAR with Transformers:

```python
# Use a pipeline as a high-level helper
# Warning: Pipeline type "translation" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
#   pip install "transformers<5.0.0"
from transformers import pipeline

pipe = pipeline("translation", model="lyf07/Translategemma-4B-it-WALAR")
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("lyf07/Translategemma-4B-it-WALAR")
model = AutoModelForImageTextToText.from_pretrained("lyf07/Translategemma-4B-it-WALAR")
```
Add pipeline tag and library metadata
Hi! I'm Niels from the Hugging Face community science team, working to improve the discoverability of models on the Hub.
I've opened this PR to add important metadata to your model card:
- **`pipeline_tag: translation`**: This ensures your model is correctly categorized in the Translation section of the Hub.
- **`library_name: transformers`**: This enables the "Use in Transformers" button and code snippets on the model page.
- **Paper Link**: I've updated the paper link to point to the Hugging Face paper page for better integration.
- **Citation**: I've added a BibTeX citation section for researchers to easily cite your work.
These changes will help more people find and use your work!
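
To illustrate where the `pipeline_tag` and `library_name` fields live, here is a minimal sketch that reads the YAML front matter at the top of a model card. The helper `read_front_matter` and the sample text `README_TEXT` are hypothetical and written for this example only; the parser handles just top-level scalars and simple `- item` lists, so a real YAML library would be the better choice for anything more complex.

```python
def read_front_matter(readme_text):
    """Parse simple YAML front matter: `key: value` scalars and `- item` lists."""
    lines = readme_text.splitlines()
    # Front matter must start on the very first line with a `---` delimiter.
    if not lines or lines[0].strip() != "---":
        return {}
    metadata = {}
    list_key = None  # key currently collecting `- item` entries, if any
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the front matter
            break
        if not stripped:
            continue
        if stripped.startswith("- "):
            if list_key is not None:
                metadata[list_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if value:
            metadata[key] = value
            list_key = None
        else:
            # A bare `key:` opens a list that following `- item` lines fill.
            metadata[key] = []
            list_key = key
    return metadata


# Hypothetical sample mirroring the metadata added in this PR.
README_TEXT = """\
---
base_model:
- google/translategemma-4b-it
license: mit
tags:
- multilingual
pipeline_tag: translation
library_name: transformers
---

<h1 align="center">WALAR</h1>
"""

metadata = read_front_matter(README_TEXT)
print(metadata["pipeline_tag"])  # translation
print(metadata["library_name"])  # transformers
```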
README.md
CHANGED
@@ -1,70 +1,67 @@
---
base_model:
- google/translategemma-4b-it
license: mit
tags:
- multilingual
pipeline_tag: translation
library_name: transformers
---

<h1 align="center">WALAR</h1>

<p align="center">
<a href="https://huggingface.co/papers/2603.13045"> 📃 Paper</a> |
<a href="https://github.com/LeiLiLab/WALAR"> ⚙️ Code</a> |
<a href="https://huggingface.co/collections/lyf07/walar"> 🤗 Model</a> |
<a href="mailto:yfliu@smail.nju.edu.cn"> 📭 Contact</a>
</p>


## Overview

We propose **WALAR**, a reinforcement learning method that uses only monolingual text to improve LLMs' translation capabilities on a massive number of low-resource languages. Our key insight is to mend the "holes" (failure modes) of current state-of-the-art neural machine translation metrics, because training directly on these metrics amplifies such holes in the trained LLMs. Specifically, we integrate a quality estimation score, a word alignment score, and language alignment into WALAR's reward to mitigate the reward hacking caused by these holes.
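
As a rough illustration of how three such signals might be combined, the sketch below treats language alignment as a gate and mixes the other two scores with a weighted sum. This is an assumption made for the example, not the formulation used by WALAR: the names `RewardSignals` and `combined_reward`, the weight `align_weight`, the `[0, 1]` score ranges, and the gating rule are all hypothetical, and the paper should be consulted for the actual reward.

```python
from dataclasses import dataclass


@dataclass
class RewardSignals:
    qe_score: float          # quality estimation score, assumed to lie in [0, 1]
    word_align_score: float  # word alignment score, assumed to lie in [0, 1]
    lang_match: bool         # True if the output is in the requested target language


def combined_reward(signals, align_weight=0.5, off_target_reward=0.0):
    """Hypothetical combination: gate on language, then mix QE and alignment."""
    # An off-target output gets a fixed low reward regardless of its QE score,
    # so a metric that overrates wrong-language text cannot be exploited.
    if not signals.lang_match:
        return off_target_reward
    # Otherwise blend the two quality signals; a high QE score with poor
    # word alignment is pulled down instead of being rewarded in full.
    return (1.0 - align_weight) * signals.qe_score + align_weight * signals.word_align_score


faithful = combined_reward(RewardSignals(qe_score=0.9, word_align_score=0.8, lang_match=True))
poorly_aligned = combined_reward(RewardSignals(qe_score=0.9, word_align_score=0.2, lang_match=True))
off_target = combined_reward(RewardSignals(qe_score=0.9, word_align_score=0.8, lang_match=False))
print(faithful > poorly_aligned > off_target)  # True
```

With the same quality estimation score in all three cases, only the output that is both well aligned and in the target language keeps a high reward, which is the kind of behavior the Overview describes as mitigating reward hacking.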

This repository contains **Translategemma-4B-it-WALAR**, a variant of [google/translategemma-4b-it](https://huggingface.co/google/translategemma-4b-it) trained with the WALAR framework. Extensive experiments on over 1400 language directions demonstrate that our model outperforms the strongest prior multilingual model of the same size.

<img src="./fig/walar.png" alt="Figure 1: WALAR Framework Illustration" title="WALAR Framework Illustration" />

## 📐 Experimental Results

### 📊 **FLORES-101**

We conducted extensive experiments on FLORES-101 and report xCOMET and MetricX scores for over 1400 language directions. The results show that WALAR improves LLM translation quality by a large margin. Comparing Qwen3-8B, Translategemma-4B-it, and LLaMAX3-8B-Alpaca before and after training with WALAR, we observe significant average improvements across all metrics, demonstrating that WALAR generalizes across model families.

### 📄 Language Consistency

To systematically assess an LLM's ability to generate translations in the desired target language, we define the *Language Consistency Rate* (LCR) as the proportion of test instances whose outputs are identified as being in the correct target language. As shown in the figure below, WALAR also improves language consistency by a large margin, especially for low-resource target languages such as Swahili.

<img src="./fig/lang_consistency.png" alt="Figure 3: Lang Consistency of WALAR" title="Lang Consistency of WALAR" />
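
The LCR definition above reduces to a simple ratio. The sketch below assumes that a language identifier has already labeled each output; the function name `language_consistency_rate`, the choice of identifier, and the language codes in the example are assumptions, since the model card does not specify them.

```python
from typing import Sequence


def language_consistency_rate(detected_langs: Sequence[str], target_langs: Sequence[str]) -> float:
    """Fraction of outputs whose detected language equals the requested target."""
    if len(detected_langs) != len(target_langs):
        raise ValueError("detected_langs and target_langs must have the same length")
    if not target_langs:
        raise ValueError("LCR is undefined for an empty test set")
    # Count the test instances identified as being in the correct language.
    matches = sum(d == t for d, t in zip(detected_langs, target_langs))
    return matches / len(target_langs)


# Hypothetical labels for four outputs that should all be Swahili ("swh").
detected = ["swh", "eng", "swh", "swh"]
targets = ["swh", "swh", "swh", "swh"]
print(language_consistency_rate(detected, targets))  # 0.75
```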

### 📈 Generalization of WALAR

Our model trained with WALAR also demonstrated strong generalization ability on language directions that are unseen during training. These results indicate that the improvements induced by WALAR can transfer beyond the training language set, potentially reducing the amount of parallel data and the number of language directions required to train massive multilingual models.

<img src="./fig/generalization_xcomet.png" alt="Figure 4: Generalization" title="Generalization" />

### Model Index
We trained three models using WALAR. The model index is shown below:

| Model | Link |
| -------------------------- | ------------------------------------------------------- |
| LLaMAX3-8B-Alpaca-WALAR | https://huggingface.co/lyf07/LLaMAX3-8B-Alpaca-WALAR |
| Qwen3-8B-WALAR | https://huggingface.co/lyf07/Qwen3-8B-WALAR |
| ***Translategemma-4B-it-WALAR*** | https://huggingface.co/lyf07/Translategemma-4B-it-WALAR |

## Citation

```bibtex
@article{liu2026mending,
  title={Mending the Holes: Mitigating Reward Hacking in Reinforcement Learning for Multilingual Translation},
  author={Liu, Yifeng and Ouyang, Siqi and Revanasiddappa, Yatish Hosmane and Li, Lei},
  journal={arXiv preprint arXiv:2603.13045},
  year={2026}
}
```