bert-tiny-finetuned-squad

This model is a fine-tuned version of prajjwal1/bert-tiny on an unknown dataset (the "-finetuned-squad" name points to SQuAD, but the training metadata does not record it). It achieves the following results on the evaluation set:

  • Loss: 0.1478

Model description

More information needed. The base model, prajjwal1/bert-tiny, is a compact 2-layer BERT with hidden size 128; this fine-tuned checkpoint has roughly 4.37M parameters stored as F32 weights.

Intended uses & limitations

More information needed
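
In the absence of documented usage, here is a minimal inference sketch. It assumes the model was fine-tuned for extractive question answering, as the "-finetuned-squad" name suggests; the card itself does not confirm the task or dataset.

```python
# Minimal inference sketch. Assumes an extractive QA head, as the
# "-finetuned-squad" name suggests; the card does not confirm this.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="ghostdivisio/bert-tiny-finetuned-squad",
)

result = qa(
    question="How many layers does BERT-tiny have?",
    context=(
        "BERT-tiny is a compact BERT variant with 2 layers and a hidden "
        "size of 128, released for resource-constrained settings."
    ),
)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': '2'}
```

Given the tiny 4.37M-parameter backbone, answer quality is likely to trail full-size BERT models considerably; treat this checkpoint as a size/latency trade-off rather than an accuracy baseline.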

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a minimal reproduction sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 90
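
The card does not include the training script. As orientation, here is a minimal reproduction sketch of the reported setup using the transformers Trainer; `train_ds` and `eval_ds` are placeholders for the (unknown) tokenized dataset, and the Adam betas/epsilon listed above are the TrainingArguments defaults.

```python
# Reproduction sketch of the reported hyperparameters, not the author's
# actual script. `train_ds` / `eval_ds` are placeholders: the card lists
# the dataset as unknown, so they are assumed to be already-tokenized
# question-answering splits (with start/end position labels).
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
    default_data_collator,
)

model = AutoModelForQuestionAnswering.from_pretrained("prajjwal1/bert-tiny")
tokenizer = AutoTokenizer.from_pretrained("prajjwal1/bert-tiny")

args = TrainingArguments(
    output_dir="bert-tiny-finetuned-squad",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=90,
    evaluation_strategy="epoch",
    # Adam betas=(0.9, 0.999) and epsilon=1e-8 are the defaults.
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,   # placeholder: tokenized training split
    eval_dataset=eval_ds,     # placeholder: tokenized validation split
    tokenizer=tokenizer,
    data_collator=default_data_collator,
)
trainer.train()
```

With 29 optimization steps per epoch at batch size 16, the training split appears to contain only a few hundred examples, which is consistent with the very long 90-epoch schedule.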

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:---:|:---:|:---:|:---:|
| No log | 1.0 | 29 | 0.8724 |
| No log | 2.0 | 58 | 0.7989 |
| No log | 3.0 | 87 | 0.7316 |
| No log | 4.0 | 116 | 0.6691 |
| No log | 5.0 | 145 | 0.6121 |
| No log | 6.0 | 174 | 0.5597 |
| No log | 7.0 | 203 | 0.5121 |
| No log | 8.0 | 232 | 0.4690 |
| No log | 9.0 | 261 | 0.4300 |
| No log | 10.0 | 290 | 0.3950 |
| No log | 11.0 | 319 | 0.3637 |
| No log | 12.0 | 348 | 0.3358 |
| No log | 13.0 | 377 | 0.3110 |
| No log | 14.0 | 406 | 0.2891 |
| No log | 15.0 | 435 | 0.2697 |
| No log | 16.0 | 464 | 0.2527 |
| No log | 17.0 | 493 | 0.2379 |
| 0.5621 | 18.0 | 522 | 0.2247 |
| 0.5621 | 19.0 | 551 | 0.2134 |
| 0.5621 | 20.0 | 580 | 0.2035 |
| 0.5621 | 21.0 | 609 | 0.1955 |
| 0.5621 | 22.0 | 638 | 0.1886 |
| 0.5621 | 23.0 | 667 | 0.1829 |
| 0.5621 | 24.0 | 696 | 0.1776 |
| 0.5621 | 25.0 | 725 | 0.1731 |
| 0.5621 | 26.0 | 754 | 0.1694 |
| 0.5621 | 27.0 | 783 | 0.1662 |
| 0.5621 | 28.0 | 812 | 0.1635 |
| 0.5621 | 29.0 | 841 | 0.1614 |
| 0.5621 | 30.0 | 870 | 0.1597 |
| 0.5621 | 31.0 | 899 | 0.1582 |
| 0.5621 | 32.0 | 928 | 0.1570 |
| 0.5621 | 33.0 | 957 | 0.1561 |
| 0.5621 | 34.0 | 986 | 0.1551 |
| 0.1726 | 35.0 | 1015 | 0.1545 |
| 0.1726 | 36.0 | 1044 | 0.1537 |
| 0.1726 | 37.0 | 1073 | 0.1532 |
| 0.1726 | 38.0 | 1102 | 0.1528 |
| 0.1726 | 39.0 | 1131 | 0.1523 |
| 0.1726 | 40.0 | 1160 | 0.1519 |
| 0.1726 | 41.0 | 1189 | 0.1516 |
| 0.1726 | 42.0 | 1218 | 0.1512 |
| 0.1726 | 43.0 | 1247 | 0.1510 |
| 0.1726 | 44.0 | 1276 | 0.1507 |
| 0.1726 | 45.0 | 1305 | 0.1505 |
| 0.1726 | 46.0 | 1334 | 0.1503 |
| 0.1726 | 47.0 | 1363 | 0.1502 |
| 0.1726 | 48.0 | 1392 | 0.1500 |
| 0.1726 | 49.0 | 1421 | 0.1499 |
| 0.1726 | 50.0 | 1450 | 0.1497 |
| 0.1726 | 51.0 | 1479 | 0.1496 |
| 0.1271 | 52.0 | 1508 | 0.1496 |
| 0.1271 | 53.0 | 1537 | 0.1494 |
| 0.1271 | 54.0 | 1566 | 0.1493 |
| 0.1271 | 55.0 | 1595 | 0.1492 |
| 0.1271 | 56.0 | 1624 | 0.1491 |
| 0.1271 | 57.0 | 1653 | 0.1490 |
| 0.1271 | 58.0 | 1682 | 0.1490 |
| 0.1271 | 59.0 | 1711 | 0.1489 |
| 0.1271 | 60.0 | 1740 | 0.1489 |
| 0.1271 | 61.0 | 1769 | 0.1488 |
| 0.1271 | 62.0 | 1798 | 0.1487 |
| 0.1271 | 63.0 | 1827 | 0.1487 |
| 0.1271 | 64.0 | 1856 | 0.1486 |
| 0.1271 | 65.0 | 1885 | 0.1486 |
| 0.1271 | 66.0 | 1914 | 0.1485 |
| 0.1271 | 67.0 | 1943 | 0.1485 |
| 0.1271 | 68.0 | 1972 | 0.1484 |
| 0.1216 | 69.0 | 2001 | 0.1484 |
| 0.1216 | 70.0 | 2030 | 0.1483 |
| 0.1216 | 71.0 | 2059 | 0.1483 |
| 0.1216 | 72.0 | 2088 | 0.1482 |
| 0.1216 | 73.0 | 2117 | 0.1483 |
| 0.1216 | 74.0 | 2146 | 0.1482 |
| 0.1216 | 75.0 | 2175 | 0.1481 |
| 0.1216 | 76.0 | 2204 | 0.1481 |
| 0.1216 | 77.0 | 2233 | 0.1481 |
| 0.1216 | 78.0 | 2262 | 0.1480 |
| 0.1216 | 79.0 | 2291 | 0.1480 |
| 0.1216 | 80.0 | 2320 | 0.1479 |
| 0.1216 | 81.0 | 2349 | 0.1479 |
| 0.1216 | 82.0 | 2378 | 0.1479 |
| 0.1216 | 83.0 | 2407 | 0.1479 |
| 0.1216 | 84.0 | 2436 | 0.1479 |
| 0.1216 | 85.0 | 2465 | 0.1479 |
| 0.1216 | 86.0 | 2494 | 0.1478 |
| 0.1151 | 87.0 | 2523 | 0.1478 |
| 0.1151 | 88.0 | 2552 | 0.1478 |
| 0.1151 | 89.0 | 2581 | 0.1478 |
| 0.1151 | 90.0 | 2610 | 0.1478 |
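
The "No log" entries mean the training loss had not yet been logged at that evaluation step (the Trainer logs every 500 steps by default, and the first 17 epochs cover only 493 steps). Validation loss drops steeply through roughly epoch 20 and is essentially flat from about epoch 60 onward, improving by only ~0.001 before settling at 0.1478.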

Framework versions

  • Transformers 4.41.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1