quickmt-is-en Neural Machine Translation Model
quickmt-is-en is a reasonably fast and reasonably accurate neural machine translation model for translation from Icelandic (is) into English (en).
quickmt models are roughly 3 times faster for GPU inference than OpusMT models and roughly 40 times faster than LibreTranslate/ArgosTranslate.
Try it on our Hugging Face Space
Give it a try before downloading here: https://huggingface.co/spaces/quickmt/quickmt-gui
Model Information
- Trained using quickmt-train
- 200M parameter seq2seq transformer
- Separate 32k SentencePiece vocabularies
- Exported for fast inference to CTranslate2 format
- The PyTorch model (for fine-tuning or PyTorch inference) is available in this repository in the pytorch_model folder
- Original configuration file: config.yaml
Usage with quickmt
For GPU inference, make sure the NVIDIA driver and CUDA toolkit are installed.
Next, install the quickmt Python library and download the model:
git clone https://github.com/quickmt/quickmt.git
pip install -e ./quickmt/
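Before loading the model, it can help to confirm that CTranslate2 actually sees a CUDA device. The sketch below assumes the ctranslate2 package was pulled in as a dependency of quickmt; `pick_device` is a hypothetical helper name and not part of quickmt.

```python
def pick_device() -> str:
    """Return "cuda" if CTranslate2 reports at least one GPU, otherwise "cpu"."""
    try:
        # Imported lazily so the check degrades gracefully if ctranslate2 is missing.
        import ctranslate2

        gpu_count = ctranslate2.get_cuda_device_count()
    except (ImportError, RuntimeError):
        # No ctranslate2 install or no working CUDA runtime: fall back to CPU.
        return "cpu"
    return "cuda" if gpu_count > 0 else "cpu"


print(pick_device())
```

If this prints "cpu" on a machine with an NVIDIA GPU, the driver or CUDA toolkit installation is the first thing to check.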
Finally, use the model in Python:
from quickmt import Translator
# Auto-detects GPU, set to "cpu" to force CPU inference
mt = Translator("quickmt/quickmt-is-en", device="auto")
# Translate - set beam size to 1 for faster speed (but lower quality)
sample_text = 'Dr. Ehud Ur, læknaprófessor við Dalhousie-háskólann í Halifax í Nova Scotia og formaður klínískrar vísindadeildar Kanadíska sykursýkissambandsins, minnti á að rannsóknin væri rétt nýhafin.'
mt(sample_text, beam_size=5)
"Dr. Ehud Ur, a medical professor at Dalhousie University in Halifax, Nova Scotia and chair of the Canadian Diabetes Association's clinical science department, recalled that the study had just begun."
# Get alternative translations by sampling
# You can pass any CTranslate2 `translate_batch` arguments
mt([sample_text], sampling_temperature=1.2, beam_size=1, sampling_topk=50, sampling_topp=0.9)
'Dr. Ehud Ur, a medical professor at Dalhousie University in Halifax, Nova Scotia, and chair of the Clinical Division of the Canadian Diabetes Association, reminded that the study had just begun.'
The model is in CTranslate2 format and the tokenizers are SentencePiece, so you can use CTranslate2 directly instead of going through quickmt. It should also be possible to get this model to work with tools such as LibreTranslate, which also uses CTranslate2 and SentencePiece. A model in safetensors format for use with eole is also provided.
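As a rough illustration of using CTranslate2 and SentencePiece without quickmt, the sketch below tokenizes, translates, and detokenizes by hand. The function names, the caller-supplied paths, and the `source_suffix` parameter are all hypothetical: the file layout inside the model repository is not described here, and whether the source needs special tokens appended (for example an end-of-sentence marker) depends on how the model was exported, so check the quickmt source before relying on this.

```python
from typing import List, Sequence


def load_components(model_dir: str, src_spm: str, tgt_spm: str, device: str = "auto"):
    """Load a CTranslate2 translator plus source and target SentencePiece tokenizers.

    All three paths are supplied by the caller (assumption: you have already
    located them in your local copy of the model).
    """
    # Lazy imports so translate_texts() can be used with any compatible objects.
    import ctranslate2
    import sentencepiece as spm

    translator = ctranslate2.Translator(model_dir, device=device)
    src_tok = spm.SentencePieceProcessor(model_file=src_spm)
    tgt_tok = spm.SentencePieceProcessor(model_file=tgt_spm)
    return translator, src_tok, tgt_tok


def translate_texts(
    texts: Sequence[str],
    translator,
    src_tok,
    tgt_tok,
    source_suffix: Sequence[str] = (),
    **translate_kwargs,
) -> List[str]:
    """Translate a batch of sentences and return the best hypothesis for each."""
    # Split each sentence into subword pieces (strings, not ids).
    tokens = [src_tok.encode(t, out_type=str) + list(source_suffix) for t in texts]
    # Extra keyword arguments (e.g. beam_size) are forwarded to translate_batch.
    results = translator.translate_batch(tokens, **translate_kwargs)
    # hypotheses[0] is the top-ranked hypothesis; join its pieces back into text.
    return [tgt_tok.decode(r.hypotheses[0]) for r in results]


# Example call, with placeholder paths that must be replaced:
# translator, src_tok, tgt_tok = load_components("MODEL_DIR", "SRC.model", "TGT.model")
# print(translate_texts(["Góðan daginn."], translator, src_tok, tgt_tok, beam_size=5))
```

Keeping the tokenizers and translator as plain arguments makes it easy to swap in a different CTranslate2 model that uses the same interface.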
Metrics
BLEU and chrF2 are calculated with sacrebleu on the Flores200 devtest set and the Bouquet test set. The time column reports the seconds taken to translate the dataset on an RTX 4070s GPU with batch size 32. LLM inference was done with vLLM and 32 threads.
Benchmarks are hard to get right and make fair. Download this model and give it a try and see if it works well for you!
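To run a similar comparison on your own data, a minimal harness might look like the following. This is only a sketch and not the script used to produce the numbers reported here: `timed_translate` and `score` are hypothetical helper names, `translate_fn` stands for any function that maps a list of source sentences to a list of translations, and `score` assumes the sacrebleu package is installed.

```python
import time
from typing import Callable, Dict, List, Sequence, Tuple


def timed_translate(
    translate_fn: Callable[[List[str]], List[str]],
    sources: Sequence[str],
    batch_size: int = 32,
) -> Tuple[List[str], float]:
    """Translate sources in fixed-size batches and return (hypotheses, seconds)."""
    hypotheses: List[str] = []
    start = time.perf_counter()
    for i in range(0, len(sources), batch_size):
        hypotheses.extend(translate_fn(list(sources[i : i + batch_size])))
    elapsed = time.perf_counter() - start
    return hypotheses, elapsed


def score(hypotheses: Sequence[str], references: Sequence[str]) -> Dict[str, float]:
    """Compute corpus-level BLEU and chrF (sacrebleu's default beta=2, i.e. chrF2)."""
    import sacrebleu  # lazy import: only needed when scoring

    bleu = sacrebleu.corpus_bleu(list(hypotheses), [list(references)])
    chrf = sacrebleu.corpus_chrf(list(hypotheses), [list(references)])
    return {"bleu": bleu.score, "chrf": chrf.score}
```

Timings measured this way include tokenization and any per-batch overhead in `translate_fn`, so they are only comparable between systems wrapped in the same way.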
flores devtest
| model | time (s) | bleu | chrf |
|---|---|---|---|
| quickmt-is-en | 1.16 | 36.09 | 60.91 |
| Helsinki-NLP/opus-mt-is-en | 2.33 | 25.26 | 51.44 |
| facebook/nllb-200-distilled-1.3B | 18.17 | 32.79 | 56.81 |
| CohereLabs/tiny-aya-global | 27.03 | 16.03 | 40.63 |
| google/gemma-4-E2B-it | 46.60 | 28.55 | 54.30 |
bouquet test
| model | time (s) | bleu | chrf |
|---|---|---|---|
| quickmt-is-en | 0.70 | 47.68 | 65.91 |
| Helsinki-NLP/opus-mt-is-en | 1.17 | 36.46 | 56.62 |
| facebook/nllb-200-distilled-1.3B | 8.57 | 40.31 | 60.39 |
| CohereLabs/tiny-aya-global | 14.22 | 22.26 | 43.01 |
| google/gemma-4-E2B-it | 23.79 | 36.90 | 57.52 |
Prompt for LLM translation:
Translate the following into {tgt_lang}, without commentary or explanation.\n\n{x}
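For reference, the template above could be filled in as sketched below. This assumes the `\n\n` in the template stands for two newline characters and that the target language is passed as the string "English"; neither detail is confirmed here, and `build_prompt` is a hypothetical helper name.

```python
PROMPT_TEMPLATE = (
    "Translate the following into {tgt_lang}, without commentary or explanation.\n\n{x}"
)


def build_prompt(text: str, tgt_lang: str = "English") -> str:
    """Fill the translation prompt with the target language and the source text."""
    return PROMPT_TEMPLATE.format(tgt_lang=tgt_lang, x=text)


print(build_prompt("Góðan daginn."))
```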
Evaluation results
- BLEU on flores101-devtest (self-reported): 36.090
- CHRF on flores101-devtest (self-reported): 60.910
- BLEU on bouquet (self-reported): 47.680
- CHRF on bouquet (self-reported): 65.910