`quickmt-is-en` Neural Machine Translation Model

quickmt-is-en is a reasonably fast and reasonably accurate neural machine translation model for translation from Icelandic (is) into English (en).

quickmt models are roughly 3 times faster for GPU inference than OpusMT models and roughly 40 times faster than LibreTranslate/ArgosTranslate.

Try it on our Hugging Face Space

Give it a try before downloading here: https://huggingface.co/spaces/quickmt/quickmt-gui

Model Information

- Trained using quickmt-train
- 200M parameter seq2seq transformer
- Separate 32k SentencePiece vocabularies for source and target
- Exported to CTranslate2 format for fast inference
- The PyTorch model (for fine-tuning or PyTorch inference) is available in this repository in the pytorch_model folder
- Original configuration file: config.yaml

Usage with `quickmt`

If you want to do GPU inference, be sure you have the NVIDIA driver and CUDA toolkit installed.

Next, install the quickmt Python library and download the model:

git clone https://github.com/quickmt/quickmt.git
pip install -e ./quickmt/

Finally, use the model in Python:

from quickmt import Translator

# Auto-detects GPU, set to "cpu" to force CPU inference
mt = Translator("quickmt/quickmt-is-en", device="auto")

# Translate - set beam size to 1 for faster speed (but lower quality)
sample_text = 'Dr. Ehud Ur, læknaprófessor við Dalhousie-háskólann í Halifax í Nova Scotia og formaður klínískrar vísindadeildar Kanadíska sykursýkissambandsins, minnti á að rannsóknin væri rétt nýhafin.'

mt(sample_text, beam_size=5)

"Dr. Ehud Ur, a medical professor at Dalhousie University in Halifax, Nova Scotia and chair of the Canadian Diabetes Association's clinical science department, recalled that the study had just begun."

# Get alternative translations by sampling
# You can pass any CTranslate2 `translate_batch` arguments
mt([sample_text], sampling_temperature=1.2, beam_size=1, sampling_topk=50, sampling_topp=0.9)

'Dr. Ehud Ur, a medical professor at Dalhousie University in Halifax, Nova Scotia, and chair of the Clinical Division of the Canadian Diabetes Association, reminded that the study had just begun.'

The model is in CTranslate2 format and the tokenizers are SentencePiece, so you can use CTranslate2 directly instead of going through quickmt. It is also possible to get this model to work with e.g. LibreTranslate, which also uses CTranslate2 and SentencePiece. A model in safetensors format for use with eole is also provided.
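As a rough illustration of the direct route, the sketch below loads the model with CTranslate2 and tokenizes with SentencePiece. It assumes the model folder has already been downloaded to a local directory. The tokenizer file names `src.spm.model` and `tgt.spm.model` and the helper names `tokenizer_paths` and `translate_direct` are assumptions, not something this card confirms, so check the repository's file listing first. quickmt may also add special tokens before translation, which this sketch does not reproduce.

```python
from pathlib import Path

# Assumed tokenizer file names inside the model folder (not confirmed by this card).
SRC_SPM = "src.spm.model"
TGT_SPM = "tgt.spm.model"


def tokenizer_paths(model_dir):
    """Return the assumed source and target SentencePiece model paths."""
    root = Path(model_dir)
    return root / SRC_SPM, root / TGT_SPM


def translate_direct(model_dir, texts, device="cpu", beam_size=5):
    """Translate a list of strings with CTranslate2 and SentencePiece only."""
    # Imported here so the helper above can be used without these packages installed.
    import ctranslate2
    import sentencepiece as spm

    src_path, tgt_path = tokenizer_paths(model_dir)
    sp_src = spm.SentencePieceProcessor(model_file=str(src_path))
    sp_tgt = spm.SentencePieceProcessor(model_file=str(tgt_path))
    translator = ctranslate2.Translator(str(model_dir), device=device)

    # Split each sentence into subword pieces (strings, not ids).
    # Assumption: no extra special tokens are needed; verify against quickmt.
    batch = [sp_src.encode(text, out_type=str) for text in texts]
    results = translator.translate_batch(batch, beam_size=beam_size)

    # Keep the best hypothesis for each input and join the pieces back into text.
    return [sp_tgt.decode(result.hypotheses[0]) for result in results]
```

Calling `translate_direct("path/to/quickmt-is-en", ["Góðan daginn."])` would return a list with one English string, provided the assumptions above hold for the downloaded files.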

Metrics

BLEU and chrF2 are calculated with sacrebleu on the Flores200 devtest set and the Bouquet test set. "Time (s)" is the time in seconds to translate the dataset on an RTX 4070s GPU with batch size 32. LLM inference was done with vLLM and 32 threads.

Benchmarks are hard to get right and make fair. Download this model and give it a try and see if it works well for you!

flores devtest

model	time (s)	bleu	chrf2
quickmt-is-en	1.16	36.09	60.91
Helsinki-NLP/opus-mt-is-en	2.33	25.26	51.44
facebook/nllb-200-distilled-1.3B	18.17	32.79	56.81
CohereLabs/tiny-aya-global	27.03	16.03	40.63
google/gemma-4-E2B-it	46.60	28.55	54.30

bouquet test

model	time (s)	bleu	chrf2
quickmt-is-en	0.70	47.68	65.91
Helsinki-NLP/opus-mt-is-en	1.17	36.46	56.62
facebook/nllb-200-distilled-1.3B	8.57	40.31	60.39
CohereLabs/tiny-aya-global	14.22	22.26	43.01
google/gemma-4-E2B-it	23.79	36.90	57.52
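To compare speed at a glance, the times in the two tables above can be turned into ratios relative to quickmt-is-en. This is only a small sketch: the numbers are copied from the tables, and the helper name `relative_times` is illustrative.

```python
# Translation times in seconds, copied from the two tables above.
TIMES = {
    "flores devtest": {
        "quickmt-is-en": 1.16,
        "Helsinki-NLP/opus-mt-is-en": 2.33,
        "facebook/nllb-200-distilled-1.3B": 18.17,
        "CohereLabs/tiny-aya-global": 27.03,
        "google/gemma-4-E2B-it": 46.60,
    },
    "bouquet test": {
        "quickmt-is-en": 0.70,
        "Helsinki-NLP/opus-mt-is-en": 1.17,
        "facebook/nllb-200-distilled-1.3B": 8.57,
        "CohereLabs/tiny-aya-global": 14.22,
        "google/gemma-4-E2B-it": 23.79,
    },
}


def relative_times(times, baseline="quickmt-is-en"):
    """Return each model's time divided by the baseline model's time."""
    base = times[baseline]
    return {model: t / base for model, t in times.items() if model != baseline}


for dataset, times in TIMES.items():
    print(dataset)
    for model, ratio in relative_times(times).items():
        print(f"  {model}: {ratio:.1f}x the time of quickmt-is-en")
```

The ratios only describe this hardware and batch size, so they are best read alongside the quality columns rather than on their own.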

Prompt for LLM translation:

Translate the following into {tgt_lang}, without commentary or explanation.\n\n{x}
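The template above has two placeholders, `{tgt_lang}` and `{x}`. A minimal sketch of how it might be filled in follows. It assumes that `\n\n` stands for two newline characters and that the target language is given by name ("English"); neither detail is stated in this card, and `build_prompt` is a hypothetical helper.

```python
# Template copied from the text above; "\n\n" is assumed to mean two newlines.
PROMPT_TEMPLATE = (
    "Translate the following into {tgt_lang}, "
    "without commentary or explanation.\n\n{x}"
)


def build_prompt(text, tgt_lang="English"):
    """Fill the template with the target language name and the source text."""
    return PROMPT_TEMPLATE.format(tgt_lang=tgt_lang, x=text)


print(build_prompt("Góðan daginn."))
```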

