nllb-finetuned

This model is a fine-tuned version of facebook/nllb-200-distilled-600M on a Europat de-en (German-English) dataset. It achieves the following results on the evaluation set:

Loss: 0.6932

Model description

More information needed

Intended uses & limitations

Translating engineering (technical) sentences from German to English.
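A minimal usage sketch follows. It assumes the checkpoint is published on the Hub as JaydeepGupta/nllb-finetuned and keeps the base NLLB tokenizer's language codes (deu_Latn for German, eng_Latn for English). The helper name translate and the generation settings (max_new_tokens, num_beams) are illustrative choices, not values taken from the training setup.

```python
MODEL_ID = "JaydeepGupta/nllb-finetuned"  # assumed Hub repo id
SRC_LANG = "deu_Latn"  # NLLB code for German
TGT_LANG = "eng_Latn"  # NLLB code for English


def translate(sentences, model_id=MODEL_ID, max_new_tokens=256, num_beams=4):
    """Translate a list of German sentences to English (illustrative helper)."""
    # Imported lazily so this module loads even where transformers is absent.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            # NLLB selects the output language via the first generated token.
            forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
            max_new_tokens=max_new_tokens,
            num_beams=num_beams,
        )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling translate(["Die Welle ist über ein Kugellager im Gehäuse gelagert."]) downloads the weights on first use and returns a list with one English string per input sentence.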

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 32
eval_batch_size: 8
seed: 42
optimizer: OptimizerNames.ADAMW_TORCH_FUSED (fused AdamW) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
training_steps: 100000
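
For orientation, the linear schedule listed above can be sketched in plain Python. This assumes zero warmup steps (the card does not list a warmup setting) and decay to zero at the final step, which is the usual behaviour of a linear scheduler. The function name linear_lr is illustrative.

```python
def linear_lr(step, base_lr=2e-05, total_steps=100_000, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase (unused when warmup_steps=0).
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Under these assumptions the rate starts at 2e-05, is half that at step 50000, and reaches zero at step 100000.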

Training results

Training Loss	Epoch	Step	Validation Loss
0.9149	0.0128	5000	0.8396
0.8633	0.0256	10000	0.7972
0.8223	0.0384	15000	0.7738
0.8191	0.0512	20000	0.7580
0.7917	0.0641	25000	0.7461
0.7852	0.0769	30000	0.7370
0.7869	0.0897	35000	0.7279
0.7714	0.1025	40000	0.7224
0.7687	0.1153	45000	0.7173
0.7559	0.1281	50000	0.7121
0.7503	0.1409	55000	0.7095
0.7538	0.1537	60000	0.7052
0.7472	0.1666	65000	0.7027
0.74	0.1794	70000	0.7006
0.7434	0.1922	75000	0.6986
0.7387	0.2050	80000	0.6964
0.7373	0.2178	85000	0.6952
0.7365	0.2306	90000	0.6941
0.7389	0.2434	95000	0.6933
0.735	0.2562	100000	0.6932
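
The Epoch column makes it possible to back out a rough training-set size, and the final validation loss can be read as a perplexity. The sketch below does both calculations. It assumes a single device with no gradient accumulation (so one step consumes 32 examples) and that the reported loss is mean token-level cross-entropy in nats. Neither assumption is stated on this card.

```python
import math

TRAIN_BATCH_SIZE = 32    # from the hyperparameters
FINAL_STEP = 100_000     # last row of the results table
FINAL_EPOCH = 0.2562     # epoch fraction at the last row
FINAL_VAL_LOSS = 0.6932  # validation loss at the last row

# Examples consumed, assuming one optimizer step equals one batch of 32.
examples_seen = TRAIN_BATCH_SIZE * FINAL_STEP

# If that many examples are 0.2562 of an epoch, a full epoch is this large.
implied_train_size = examples_seen / FINAL_EPOCH

# Perplexity is exp(loss) when the loss is cross-entropy in nats.
perplexity = math.exp(FINAL_VAL_LOSS)

print(f"examples seen: {examples_seen:,}")
print(f"implied training-set size: {implied_train_size:,.0f}")
print(f"validation perplexity: {perplexity:.3f}")
```

Under these assumptions the run covered about a quarter of a training set of roughly 12.5 million sentence pairs, and the final validation perplexity is close to 2.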

Framework versions

Transformers 4.57.3
Pytorch 2.9.0+cu126
Datasets 4.0.0
Tokenizers 0.22.1


Model size: 0.6B params (Safetensors, tensor type F32)


Model tree for JaydeepGupta/nllb-finetuned

Base model: facebook/nllb-200-distilled-600M (this model is a fine-tune of it)