nllb-finetuned

This model is a fine-tuned version of facebook/nllb-200-distilled-600M on a Europat German-English (de-en) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6932

Model description

More information needed

Intended uses & limitations

Translating engineering (technical) sentences from German to English.
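A minimal inference sketch for the use case above, assuming the repository JaydeepGupta/nllb-finetuned ships the tokenizer alongside the weights and keeps the NLLB language codes of the base model (`deu_Latn` for German, `eng_Latn` for English). The helper name `translate` and the `max_new_tokens` value are illustrative choices, not settings taken from this card.

```python
MODEL_ID = "JaydeepGupta/nllb-finetuned"  # repository ID of this model
SRC_LANG = "deu_Latn"  # NLLB code for German (assumed unchanged by fine-tuning)
TGT_LANG = "eng_Latn"  # NLLB code for English (assumed unchanged by fine-tuning)


def translate(sentences, model_id=MODEL_ID, max_new_tokens=256):
    """Translate a list of German sentences to English (hypothetical helper)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    # NLLB selects the output language by forcing the first generated token
    # to be the target-language code.
    outputs = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `translate(["Die Welle ist drehbar im Gehäuse gelagert."])` should return a list with one English string. The first call downloads the model weights.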

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • training_steps: 100000
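The hyperparameters above can be sketched as keyword arguments in the naming used by `Seq2SeqTrainingArguments` in Transformers, together with the learning-rate curve that a linear scheduler produces. This is a reconstruction under assumptions, not the original training script: the card does not state warmup steps, device count, or gradient accumulation, so the sketch assumes no warmup and maps the batch sizes to per-device values on a single device.

```python
# Keyword arguments mirroring the listed hyperparameters. They could be passed
# as Seq2SeqTrainingArguments(output_dir=..., **TRAINING_KWARGS).
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,  # assumes a single device
    "per_device_eval_batch_size": 8,    # assumes a single device
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "max_steps": 100_000,
}


def linear_lr(step, base_lr=2e-5, total_steps=100_000, warmup_steps=0):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 25_000, 50_000, 100_000):
        print(step, linear_lr(step))
```

Under the no-warmup assumption, the rate starts at 2e-05, falls to half that value at step 50000, and reaches zero at step 100000.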

Training results

Training Loss   Epoch    Step     Validation Loss
0.9149          0.0128   5000     0.8396
0.8633          0.0256   10000    0.7972
0.8223          0.0384   15000    0.7738
0.8191          0.0512   20000    0.7580
0.7917          0.0641   25000    0.7461
0.7852          0.0769   30000    0.7370
0.7869          0.0897   35000    0.7279
0.7714          0.1025   40000    0.7224
0.7687          0.1153   45000    0.7173
0.7559          0.1281   50000    0.7121
0.7503          0.1409   55000    0.7095
0.7538          0.1537   60000    0.7052
0.7472          0.1666   65000    0.7027
0.7400          0.1794   70000    0.7006
0.7434          0.1922   75000    0.6986
0.7387          0.2050   80000    0.6964
0.7373          0.2178   85000    0.6952
0.7365          0.2306   90000    0.6941
0.7389          0.2434   95000    0.6933
0.7350          0.2562   100000   0.6932
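Two quantities can be read off the final row with a little arithmetic, sketched below. The sketch assumes the validation loss is a mean per-token cross-entropy in nats (the usual Trainer output, though the card does not confirm it) and that each step consumed one batch of 32 examples on a single device without gradient accumulation. The epoch column is rounded to four decimals, so the dataset-size estimate is approximate.

```python
import math

FINAL_VAL_LOSS = 0.6932   # validation loss at the last logged step
FINAL_STEP = 100_000      # last logged step
FINAL_EPOCH = 0.2562      # fraction of one epoch completed at that step
TRAIN_BATCH_SIZE = 32     # from the hyperparameter list

# Perplexity is the exponential of the mean cross-entropy loss.
perplexity = math.exp(FINAL_VAL_LOSS)

# Steps needed for one full pass over the training set, and the implied
# number of training examples under the single-device assumption.
steps_per_epoch = FINAL_STEP / FINAL_EPOCH
examples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

print(f"validation perplexity: {perplexity:.3f}")
print(f"estimated steps per epoch: {steps_per_epoch:,.0f}")
print(f"estimated training examples: {examples_per_epoch:,.0f}")
```

Under these assumptions the final validation perplexity is roughly 2.0, and the run covered about a quarter of a training set of around 12.5 million sentence pairs.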

Framework versions

  • Transformers 4.57.3
  • Pytorch 2.9.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.1

Model tree: JaydeepGupta/nllb-finetuned, fine-tuned from facebook/nllb-200-distilled-600M