bert-large-cased

This model was trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1459
  • Precision: 0.8092
  • Recall: 0.8804
  • F1: 0.8433
  • Accuracy: 0.9724

Model description

More information needed

Intended uses & limitations

More information needed
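
The card does not state the task, but the precision/recall/F1/accuracy metrics above are typical of token classification (e.g. NER). Assuming this checkpoint carries a token-classification head, a minimal inference sketch using the transformers pipeline follows; the model ID and example sentence are placeholders:

```python
from transformers import pipeline

# Hypothetical repo ID -- replace with this model's actual Hub path.
ner = pipeline(
    "token-classification",
    model="your-username/bert-large-cased-finetuned",
    aggregation_strategy="simple",  # merge subword pieces into whole entities
)

print(ner("Hugging Face was founded in New York City."))
# Each result dict contains entity_group, score, word, start, and end.
```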

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.2
  • num_epochs: 34
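
As a sketch, the settings above map onto transformers TrainingArguments roughly as shown below; the output directory is a placeholder, and anything not listed above is left at its default:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-large-cased-finetuned",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch_fused",  # OptimizerNames.ADAMW_TORCH_FUSED
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.2,
    num_train_epochs=34,
)
```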

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.0   | 20   | 2.1317          | 0.0057    | 0.0266 | 0.0094 | 0.5173   |
| No log        | 2.0   | 40   | 1.1830          | 0.0       | 0.0    | 0.0    | 0.7548   |
| No log        | 3.0   | 60   | 0.8022          | 0.0077    | 0.0017 | 0.0027 | 0.7740   |
| No log        | 4.0   | 80   | 0.5177          | 0.4688    | 0.3621 | 0.4086 | 0.8666   |
| No log        | 5.0   | 100  | 0.3329          | 0.5916    | 0.6811 | 0.6332 | 0.9224   |
| No log        | 6.0   | 120  | 0.2351          | 0.6759    | 0.7691 | 0.7195 | 0.9436   |
| No log        | 7.0   | 140  | 0.1964          | 0.7164    | 0.7973 | 0.7547 | 0.9553   |
| No log        | 8.0   | 160  | 0.1662          | 0.6996    | 0.8239 | 0.7567 | 0.9562   |
| No log        | 9.0   | 180  | 0.1577          | 0.7928    | 0.8389 | 0.8152 | 0.9639   |
| No log        | 10.0  | 200  | 0.1418          | 0.7862    | 0.8488 | 0.8163 | 0.9679   |
| No log        | 11.0  | 220  | 0.1355          | 0.7883    | 0.8538 | 0.8198 | 0.9689   |
| No log        | 12.0  | 240  | 0.1291          | 0.7988    | 0.8571 | 0.8269 | 0.9698   |
| No log        | 13.0  | 260  | 0.1248          | 0.7876    | 0.8621 | 0.8232 | 0.9693   |
| No log        | 14.0  | 280  | 0.1330          | 0.8172    | 0.8688 | 0.8422 | 0.9719   |
| No log        | 15.0  | 300  | 0.1224          | 0.7957    | 0.8671 | 0.8299 | 0.9712   |
| No log        | 16.0  | 320  | 0.1222          | 0.7743    | 0.8721 | 0.8203 | 0.9694   |
| No log        | 17.0  | 340  | 0.1351          | 0.8183    | 0.8754 | 0.8459 | 0.9721   |
| No log        | 18.0  | 360  | 0.1319          | 0.8003    | 0.8721 | 0.8347 | 0.9713   |
| No log        | 19.0  | 380  | 0.1363          | 0.8252    | 0.8704 | 0.8472 | 0.9729   |
| No log        | 20.0  | 400  | 0.1348          | 0.7946    | 0.8804 | 0.8353 | 0.9709   |
| No log        | 21.0  | 420  | 0.1365          | 0.8030    | 0.8804 | 0.8399 | 0.9712   |
| No log        | 22.0  | 440  | 0.1320          | 0.8015    | 0.8787 | 0.8384 | 0.9718   |
| No log        | 23.0  | 460  | 0.1341          | 0.7791    | 0.8787 | 0.8259 | 0.9702   |
| No log        | 24.0  | 480  | 0.1430          | 0.8186    | 0.8771 | 0.8468 | 0.9730   |
| 0.3108        | 25.0  | 500  | 0.1371          | 0.8006    | 0.8804 | 0.8386 | 0.9715   |
| 0.3108        | 26.0  | 520  | 0.1433          | 0.8101    | 0.8787 | 0.8430 | 0.9726   |
| 0.3108        | 27.0  | 540  | 0.1424          | 0.8154    | 0.8804 | 0.8466 | 0.9729   |
| 0.3108        | 28.0  | 560  | 0.1487          | 0.8234    | 0.8754 | 0.8486 | 0.9734   |
| 0.3108        | 29.0  | 580  | 0.1402          | 0.8141    | 0.8804 | 0.8460 | 0.9725   |
| 0.3108        | 30.0  | 600  | 0.1418          | 0.8113    | 0.8787 | 0.8437 | 0.9728   |
| 0.3108        | 31.0  | 620  | 0.1440          | 0.8089    | 0.8787 | 0.8424 | 0.9728   |
| 0.3108        | 32.0  | 640  | 0.1446          | 0.8079    | 0.8804 | 0.8426 | 0.9725   |
| 0.3108        | 33.0  | 660  | 0.1460          | 0.8052    | 0.8787 | 0.8403 | 0.9723   |
| 0.3108        | 34.0  | 680  | 0.1459          | 0.8092    | 0.8804 | 0.8433 | 0.9724   |
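
Per-epoch precision/recall/F1/accuracy columns like those above are typically produced by a seqeval-based compute_metrics hook passed to the Trainer. A minimal sketch, assuming IOB-style labels and the evaluate library; the id2label mapping here is a placeholder (the real one lives in the model config):

```python
import numpy as np
import evaluate

seqeval = evaluate.load("seqeval")

# Hypothetical label map -- replace with the model's actual config.id2label.
id2label = {0: "O", 1: "B-ENT", 2: "I-ENT"}

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)

    # Drop special/padded tokens, which the collator labels with -100.
    true_labels = [
        [id2label[l] for l in label_row if l != -100]
        for label_row in labels
    ]
    true_preds = [
        [id2label[p] for p, l in zip(pred_row, label_row) if l != -100]
        for pred_row, label_row in zip(predictions, labels)
    ]

    results = seqeval.compute(predictions=true_preds, references=true_labels)
    return {
        "precision": results["overall_precision"],
        "recall": results["overall_recall"],
        "f1": results["overall_f1"],
        "accuracy": results["overall_accuracy"],
    }
```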

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.1.1
  • Tokenizers 0.22.1