ms-deberta-v2-xlarge-mnli-finetuned-pt

This model is a fine-tuned version of tasksource/deberta-small-long-nli. The training dataset is not specified. It achieves the following results on the evaluation set:

Loss: 0.2954
Accuracy: 1.0
Precision: 1.0
Recall: 1.0
F1: 1.0
Ratio: 0.11
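
The card does not define how these metrics are computed. The earliest rows of the training results table (accuracy 0.89, precision 0.445, recall 0.5, F1 0.4709, ratio 0.0) are consistent with macro averaging over two classes, with Ratio being the fraction of examples predicted as the positive class. The sketch below illustrates that reading on hypothetical labels. The function name macro_metrics, the 0/1 label encoding, and the interpretation of Ratio are assumptions, not taken from the training code.

```python
def macro_metrics(y_true, y_pred, labels=(0, 1), positive=1):
    """Macro-averaged precision/recall/F1 plus accuracy and predicted-positive ratio.

    Hypothetical helper: the metric definitions are an assumed reading of the card.
    """
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        # Undefined ratios (no predictions or no examples for a class) count as 0.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
        "ratio": sum(1 for p in y_pred if p == positive) / len(y_pred),
    }


# Hypothetical data: 11% positives, and a classifier that always predicts negative.
y_true = [1] * 11 + [0] * 89
y_pred = [0] * 100
metrics = macro_metrics(y_true, y_pred)
print({name: round(value, 4) for name, value in metrics.items()})
```

Under this reading, a model that never predicts the positive class still scores 0.89 accuracy, which is why the macro-averaged figures are the more informative ones for the early checkpoints.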

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.06
lr_scheduler_warmup_steps: 4
num_epochs: 1
label_smoothing_factor: 0.1
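
As a rough illustration of how these settings interact, the sketch below computes the effective batch size and a linear learning-rate schedule with warmup in plain Python. It assumes a single training device and about 422 optimizer steps in total (estimated from the results table, where step 420 corresponds to epoch 0.9953); neither value is stated in the card. The list gives both a warmup ratio and a warmup step count; the sketch uses the explicit step count of 4, on the assumption that it takes precedence. The helper names are hypothetical.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 2e-5
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 4

# Assumptions, not stated in the card.
NUM_DEVICES = 1      # single device
TOTAL_STEPS = 422    # estimated as 420 / 0.9953, rounded


def effective_batch_size(per_device, accum_steps, num_devices):
    """Number of examples contributing to each optimizer step."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS,
              total_steps=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES)
print("effective batch size:", batch)
for step in (0, 2, 4, 213, 422):
    print(step, linear_lr(step))
```

With these assumptions the effective batch size matches the listed total_train_batch_size of 32, and the learning rate peaks at step 4 before decaying to zero at the end of the epoch.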

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	Precision	Recall	F1	Ratio
1.4129	0.0237	10	0.5425	0.89	0.445	0.5	0.4709	0.0
0.5102	0.0474	20	0.4968	0.89	0.445	0.5	0.4709	0.0
0.4597	0.0711	30	0.4763	0.88	0.6225	0.5395	0.5471	0.0327
0.4975	0.0948	40	0.4605	0.87	0.6658	0.6614	0.6636	0.1067
0.4639	0.1185	50	0.4434	0.8947	0.7355	0.5850	0.6125	0.0367
0.4687	0.1422	60	0.4557	0.892	0.7177	0.6498	0.6747	0.0727
0.4489	0.1659	70	0.4353	0.9293	0.8174	0.8275	0.8224	0.114
0.4318	0.1896	80	0.4269	0.924	0.8010	0.8325	0.8156	0.1233
0.4723	0.2133	90	0.4202	0.9173	0.7832	0.8580	0.8140	0.1447
0.4052	0.2370	100	0.4016	0.9307	0.8207	0.8309	0.8257	0.114
0.4284	0.2607	110	0.4115	0.9187	0.7855	0.8906	0.8255	0.1593
0.3635	0.2844	120	0.3963	0.94	0.8308	0.9052	0.8625	0.1393
0.3894	0.3081	130	0.3910	0.944	0.8409	0.9075	0.8699	0.1353
0.3537	0.3318	140	0.3598	0.9693	0.8983	0.9642	0.9277	0.1313
0.3776	0.3555	150	0.3868	0.944	0.8313	0.9685	0.8823	0.166
0.3626	0.3791	160	0.3235	0.9887	0.9699	0.9724	0.9711	0.1107
0.3683	0.4028	170	0.3272	0.99	0.9583	0.9944	0.9754	0.12
0.3358	0.4265	180	0.3321	0.9873	0.9484	0.9929	0.9692	0.1227
0.3435	0.4502	190	0.3370	0.982	0.9297	0.9899	0.9571	0.128
0.3613	0.4739	200	0.3136	0.9893	0.9728	0.9728	0.9728	0.11
0.3323	0.4976	210	0.3193	0.9887	0.9533	0.9936	0.9723	0.1213
0.3181	0.5213	220	0.3078	0.9947	0.9970	0.9758	0.9861	0.1047
0.3043	0.5450	230	0.3047	0.9947	0.9970	0.9758	0.9861	0.1047
0.3139	0.5687	240	0.3101	0.996	0.9825	0.9978	0.9899	0.114
0.3247	0.5924	250	0.3048	0.9947	0.9970	0.9758	0.9861	0.1047
0.3217	0.6161	260	0.3126	0.9913	0.9635	0.9951	0.9786	0.1187
0.3071	0.6398	270	0.3021	1.0	1.0	1.0	1.0	0.11
0.3048	0.6635	280	0.3048	0.9973	0.9882	0.9985	0.9933	0.1127
0.3054	0.6872	290	0.2996	1.0	1.0	1.0	1.0	0.11
0.3182	0.7109	300	0.2979	1.0	1.0	1.0	1.0	0.11
0.3059	0.7346	310	0.3103	0.9927	0.9688	0.9959	0.9818	0.1173
0.3044	0.7583	320	0.2991	1.0	1.0	1.0	1.0	0.11
0.3002	0.7820	330	0.2967	1.0	1.0	1.0	1.0	0.11
0.2957	0.8057	340	0.2967	1.0	1.0	1.0	1.0	0.11
0.2971	0.8294	350	0.2968	1.0	1.0	1.0	1.0	0.11
0.2964	0.8531	360	0.2970	1.0	1.0	1.0	1.0	0.11
0.297	0.8768	370	0.2969	1.0	1.0	1.0	1.0	0.11
0.3039	0.9005	380	0.2968	1.0	1.0	1.0	1.0	0.11
0.3002	0.9242	390	0.2960	1.0	1.0	1.0	1.0	0.11
0.2968	0.9479	400	0.2956	1.0	1.0	1.0	1.0	0.11
0.2956	0.9716	410	0.2955	1.0	1.0	1.0	1.0	0.11
0.2959	0.9953	420	0.2954	1.0	1.0	1.0	1.0	0.11
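
The validation loss levels off near 0.295 even after accuracy reaches 1.0. One possible explanation is the label_smoothing_factor of 0.1: cross-entropy against smoothed targets has a nonzero minimum, equal to the entropy of the smoothed target distribution. The sketch below computes that floor. It assumes the smoothing mass is spread uniformly over all classes (including the true one) and that the reported validation loss uses the same smoothed objective; the number of output classes is not stated in the card, so both two and three are shown. The function name is hypothetical.

```python
import math


def smoothed_ce_floor(epsilon, num_classes):
    """Lowest cross-entropy reachable against uniformly smoothed targets.

    The target puts (1 - epsilon + epsilon / K) on the true class and
    epsilon / K on each other class; the minimum loss is that distribution's
    entropy, reached when the model predicts exactly the smoothed target.
    """
    q_true = 1.0 - epsilon + epsilon / num_classes
    q_other = epsilon / num_classes
    return -(q_true * math.log(q_true)
             + (num_classes - 1) * q_other * math.log(q_other))


for k in (2, 3):
    print(k, round(smoothed_ce_floor(0.1, k), 4))
```

Under these assumptions the floor is roughly 0.199 for two classes and roughly 0.291 for three. The latter is close to the final validation loss of 0.2954, which would fit a three-output head (common for NLI models), but the card does not confirm this.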

Framework versions

Transformers 4.44.2
PyTorch 2.4.0+cu121
Datasets 2.21.0
Tokenizers 0.19.1
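
When trying to reproduce the results, it can help to compare the local environment against these versions. The following is a minimal sketch using only the standard library; the PINNED mapping and the compare_versions helper are hypothetical names, and the distribution name torch is used for PyTorch.

```python
from importlib import metadata

# Versions copied from the list above, keyed by pip distribution name.
PINNED = {
    "transformers": "4.44.2",
    "torch": "2.4.0+cu121",
    "datasets": "2.21.0",
    "tokenizers": "0.19.1",
}


def compare_versions(pinned=PINNED):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for package, expected in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (expected, installed, installed == expected)
    return report


for package, (expected, installed, ok) in compare_versions().items():
    status = "ok" if ok else "differs"
    print(f"{package}: expected {expected}, found {installed} ({status})")
```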

rafaelsandroni/ms-deberta-v2-xlarge-mnli-finetuned-pt
