quickmt-is-en Neural Machine Translation Model

quickmt-is-en is a reasonably fast and reasonably accurate neural machine translation model for translation from Icelandic (is) into English (en).

quickmt models are roughly 3 times faster for GPU inference than OpusMT models and roughly 40 times faster than LibreTranslate/ArgosTranslate.

Try it on our Hugging Face Space

Try the model before downloading it: https://huggingface.co/spaces/quickmt/quickmt-gui

Model Information

  • Trained using quickmt-train
  • 200M-parameter seq2seq transformer
  • Separate 32k SentencePiece vocabularies for source and target
  • Exported to CTranslate2 format for fast inference
  • The PyTorch model (for fine-tuning or PyTorch inference) is available in this repository in the pytorch_model folder
    • Original configuration file: config.yaml

Usage with quickmt

If you want to do GPU inference, make sure you have the NVIDIA driver and the CUDA toolkit installed.

Next, install the quickmt Python library and download the model:

git clone https://github.com/quickmt/quickmt.git
pip install -e ./quickmt/

Finally, use the model in Python:

from quickmt import Translator

# Auto-detects GPU, set to "cpu" to force CPU inference
mt = Translator("quickmt/quickmt-is-en", device="auto")

# Translate - set beam size to 1 for faster speed (but lower quality)
sample_text = 'Dr. Ehud Ur, læknaprófessor við Dalhousie-háskólann í Halifax í Nova Scotia og formaður klínískrar vísindadeildar Kanadíska sykursýkissambandsins, minnti á að rannsóknin væri rétt nýhafin.'

mt(sample_text, beam_size=5)

"Dr. Ehud Ur, a medical professor at Dalhousie University in Halifax, Nova Scotia and chair of the Canadian Diabetes Association's clinical science department, recalled that the study had just begun."

# Get alternative translations by sampling
# You can pass any CTranslate2 `translate_batch` arguments
mt([sample_text], sampling_temperature=1.2, beam_size=1, sampling_topk=50, sampling_topp=0.9)

'Dr. Ehud Ur, a medical professor at Dalhousie University in Halifax, Nova Scotia, and chair of the Clinical Division of the Canadian Diabetes Association, reminded that the study had just begun.'

The model is in CTranslate2 format and the tokenizers are SentencePiece models, so you can use CTranslate2 directly instead of going through quickmt. The model can also be made to work with other tools built on CTranslate2 and SentencePiece, such as LibreTranslate. A model in safetensors format for use with eole is also provided.
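Using CTranslate2 directly could look roughly like the sketch below. It is a minimal illustration, not the quickmt implementation: the helper names `load_components` and `translate_direct` are hypothetical, the tokenizer file names `src.spm.model` and `tgt.spm.model` are assumptions that should be checked against the repository file listing, and any extra preprocessing quickmt applies (for example an end-of-sentence token) is not reproduced here, so outputs may differ from those of the quickmt library.

```python
from typing import List, Sequence


def load_components(repo_id: str = "quickmt/quickmt-is-en", device: str = "auto"):
    """Download the model files and build a translator plus both tokenizers."""
    # Imported lazily so translate_direct() can be reused without these packages.
    import os

    import ctranslate2
    import sentencepiece as spm
    from huggingface_hub import snapshot_download

    model_dir = snapshot_download(repo_id)
    translator = ctranslate2.Translator(model_dir, device=device)
    # ASSUMPTION: these tokenizer file names are illustrative only.
    src_sp = spm.SentencePieceProcessor(model_file=os.path.join(model_dir, "src.spm.model"))
    tgt_sp = spm.SentencePieceProcessor(model_file=os.path.join(model_dir, "tgt.spm.model"))
    return translator, src_sp, tgt_sp


def translate_direct(translator, src_sp, tgt_sp, texts: Sequence[str], **kwargs) -> List[str]:
    """Tokenize, translate with CTranslate2 and detokenize the best hypothesis."""
    # CTranslate2 expects each input as a list of string tokens.
    batch = [src_sp.encode(text, out_type=str) for text in texts]
    # Extra keyword arguments (beam_size, sampling_topk, ...) go to translate_batch.
    results = translator.translate_batch(batch, **kwargs)
    return [tgt_sp.decode(result.hypotheses[0]) for result in results]
```

With the packages installed, `translate_direct(*load_components(), [sample_text], beam_size=5)` would return a list with one translation per input string.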

Metrics

BLEU and chrF2 are calculated with sacrebleu on the Flores200 devtest set and the Bouquet test set. "Time (s)" is the time in seconds to translate the dataset on an RTX 4070s GPU with batch size 32. LLM inference was done with vLLM and 32 threads.

Benchmarks are hard to get right and to make fair. Download the model, give it a try, and see whether it works well for you!
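To run a comparable measurement on your own data, something like the following sketch may help. It is not the script behind the numbers reported here: `timed_translate` and `score` are hypothetical helper names, `translate_fn` stands for any callable that maps a list of source strings to a list of translations, and only the batch size of 32 and the use of sacrebleu come from the description above.

```python
import time
from typing import Callable, Dict, List, Sequence, Tuple


def timed_translate(
    translate_fn: Callable[[List[str]], List[str]],
    sentences: Sequence[str],
    batch_size: int = 32,
) -> Tuple[List[str], float]:
    """Translate sentences in fixed-size batches and return (hypotheses, seconds)."""
    hypotheses: List[str] = []
    start = time.perf_counter()
    for i in range(0, len(sentences), batch_size):
        hypotheses.extend(translate_fn(list(sentences[i:i + batch_size])))
    elapsed = time.perf_counter() - start
    return hypotheses, elapsed


def score(hypotheses: Sequence[str], references: Sequence[str]) -> Dict[str, float]:
    """Corpus-level BLEU and chrF2 with sacrebleu (imported lazily)."""
    import sacrebleu

    bleu = sacrebleu.corpus_bleu(list(hypotheses), [list(references)])
    # corpus_chrf uses beta=2 by default, which corresponds to chrF2.
    chrf = sacrebleu.corpus_chrf(list(hypotheses), [list(references)])
    return {"bleu": bleu.score, "chrf2": chrf.score}
```

Timings measured this way include tokenization and any one-time warm-up cost, so running an untimed warm-up batch first is usually worthwhile.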

flores devtest

| Model | Time (s) | BLEU | chrF2 |
| --- | --- | --- | --- |
| quickmt-is-en | 1.16 | 36.09 | 60.91 |
| Helsinki-NLP/opus-mt-is-en | 2.33 | 25.26 | 51.44 |
| facebook/nllb-200-distilled-1.3B | 18.17 | 32.79 | 56.81 |
| CohereLabs/tiny-aya-global | 27.03 | 16.03 | 40.63 |
| google/gemma-4-E2B-it | 46.60 | 28.55 | 54.30 |

bouquet test

| Model | Time (s) | BLEU | chrF2 |
| --- | --- | --- | --- |
| quickmt-is-en | 0.70 | 47.68 | 65.91 |
| Helsinki-NLP/opus-mt-is-en | 1.17 | 36.46 | 56.62 |
| facebook/nllb-200-distilled-1.3B | 8.57 | 40.31 | 60.39 |
| CohereLabs/tiny-aya-global | 14.22 | 22.26 | 43.01 |
| google/gemma-4-E2B-it | 23.79 | 36.90 | 57.52 |

Prompt for LLM translation:

Translate the following into {tgt_lang}, without commentary or explanation.\n\n{x}
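As a small illustration, the template above can be filled in as follows. The helper name `build_prompt` is hypothetical, and passing the language name "English" for `{tgt_lang}` is an assumption: the exact value used in the benchmark is not stated here.

```python
# The prompt template quoted above, with "\n\n" as real newlines.
PROMPT_TEMPLATE = (
    "Translate the following into {tgt_lang}, "
    "without commentary or explanation.\n\n{x}"
)


def build_prompt(text: str, tgt_lang: str = "English") -> str:
    """Fill the template with the target language and the source text."""
    # ASSUMPTION: tgt_lang is a language name rather than a language code.
    return PROMPT_TEMPLATE.format(tgt_lang=tgt_lang, x=text)
```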
