---
library_name: transformers
tags:
- trl
- cpo
- alignment-handbook
- generated_from_trainer
model-index:
- name: OpenELM-1_1B-SimPO
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# OpenELM-1_1B-SimPO

This model was trained from scratch on an unknown dataset.
It achieves the following results on the evaluation set at the final logged step (step 2800, epoch 2.9304):
- Logits/chosen: -0.5781
- Logits/rejected: 1.2422
- Logps/chosen: -113.0
- Logps/rejected: -171.0
- Loss: 0.8496
- Nll Loss: 0.0
- Rewards/accuracies: 0.6680
- Rewards/chosen: -1.1328
- Rewards/margins: 0.5742
- Rewards/rejected: -1.7031
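
The reward figures above can be cross-checked against each other. The sketch below assumes that `Rewards/margins` is the mean of (chosen reward minus rejected reward), which is the usual convention in preference-optimization trainers; the card itself does not state this. Under that assumption the reported margin should be close to the difference of the two reported means, with a small gap from rounding and per-batch averaging.

```python
# Values copied from the evaluation summary above.
rewards_chosen = -1.1328
rewards_rejected = -1.7031
reported_margin = 0.5742

# Assumption: margin = mean(chosen - rejected), so the difference of the
# reported means should land near the reported margin.
derived_margin = rewards_chosen - rewards_rejected
gap = abs(derived_margin - reported_margin)

print(f"derived margin:  {derived_margin:.4f}")
print(f"reported margin: {reported_margin:.4f}")
print(f"absolute gap:    {gap:.4f}")
```

A positive margin together with a `Rewards/accuracies` of 0.6680 means the chosen response received the higher reward in about two thirds of the evaluation pairs.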

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
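
The batch-size and scheduler entries above are related by simple arithmetic, sketched below. The effective batch size follows directly from the listed values. The learning-rate function is a minimal re-implementation of a linear-warmup, half-cosine-decay schedule, not the trainer's own code, and `total_steps = 2866` is a hypothetical value estimated from the Epoch and Step columns of the training results (about 955 steps per epoch), not a figure reported by the card.

```python
import math

# Values copied from the hyperparameter list.
train_batch_size = 8          # per device
num_devices = 4
gradient_accumulation_steps = 2
learning_rate = 5e-05
warmup_ratio = 0.1

# Effective batch size per optimizer step.
total_train_batch_size = (
    train_batch_size * num_devices * gradient_accumulation_steps
)

# Hypothetical: estimated from the results table, not reported in the card.
total_steps = 2866
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Linear warmup to the peak rate, then half-cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print("total_train_batch_size:", total_train_batch_size)
print("warmup_steps:", warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, lr_at(step))
```

The computed `total_train_batch_size` matches the value of 64 listed above.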

### Training results

| Training Loss | Epoch  | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Nll Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
|:-------------:|:------:|:----:|:-------------:|:---------------:|:------------:|:--------------:|:---------------:|:--------:|:------------------:|:--------------:|:---------------:|:----------------:|
| 0.9346        | 0.1047 | 100  | -8.5625       | -7.9688         | -33.25       | -41.75         | 0.9349          | 0.0      | 0.6133             | -0.3320        | 0.0864          | -0.4180          |
| 0.9139        | 0.2093 | 200  | -3.4531       | -2.4375         | -48.5        | -63.5          | 0.9069          | 0.0      | 0.6270             | -0.4844        | 0.1504          | -0.6367          |
| 0.907         | 0.3140 | 300  | -5.1875       | -4.0            | -69.5        | -83.5          | 0.9099          | 0.0      | 0.6055             | -0.6914        | 0.1416          | -0.8359          |
| 0.901         | 0.4186 | 400  | -1.7422       | 0.0164          | -84.0        | -101.0         | 0.8957          | 0.0      | 0.6328             | -0.8359        | 0.1748          | -1.0156          |
| 0.8752        | 0.5233 | 500  | -0.5625       | 0.8555          | -72.5        | -95.5          | 0.8768          | 0.0      | 0.6582             | -0.7266        | 0.2324          | -0.9570          |
| 0.8808        | 0.6279 | 600  | 2.1562        | 3.2344          | -86.0        | -109.5         | 0.8742          | 0.0      | 0.6445             | -0.8633        | 0.2334          | -1.0938          |
| 0.8277        | 0.7326 | 700  | -0.7930       | 0.3496          | -52.0        | -77.5          | 0.8679          | 0.0      | 0.6445             | -0.5195        | 0.2520          | -0.7734          |
| 0.8341        | 0.8373 | 800  | 0.2188        | 1.3047          | -80.5        | -108.5         | 0.8503          | 0.0      | 0.6602             | -0.8047        | 0.2773          | -1.0859          |
| 0.8333        | 0.9419 | 900  | 0.6406        | 1.8438          | -90.0        | -121.5         | 0.8454          | 0.0      | 0.6660             | -0.8984        | 0.3184          | -1.2188          |
| 0.8071        | 1.0466 | 1000 | 0.1504        | 1.3516          | -100.0       | -133.0         | 0.8441          | 0.0      | 0.6699             | -1.0           | 0.3340          | -1.3359          |
| 0.7845        | 1.1512 | 1100 | -1.5078       | 0.3301          | -84.5        | -122.5         | 0.8307          | 0.0      | 0.6660             | -0.8477        | 0.3809          | -1.2266          |
| 0.7483        | 1.2559 | 1200 | -0.4160       | 0.9805          | -94.5        | -133.0         | 0.8353          | 0.0      | 0.6758             | -0.9453        | 0.3809          | -1.3281          |
| 0.7802        | 1.3605 | 1300 | -1.5859       | 0.3418          | -62.0        | -100.5         | 0.8363          | 0.0      | 0.7051             | -0.6211        | 0.3828          | -1.0             |
| 0.7499        | 1.4652 | 1400 | -0.1719       | 1.4531          | -97.0        | -141.0         | 0.8228          | 0.0      | 0.7012             | -0.9727        | 0.4414          | -1.4141          |
| 0.6966        | 1.5699 | 1500 | -0.3301       | 1.5             | -106.0       | -152.0         | 0.8231          | 0.0      | 0.6836             | -1.0625        | 0.4609          | -1.5234          |
| 0.6921        | 1.6745 | 1600 | 0.6133        | 2.25            | -107.0       | -155.0         | 0.8222          | 0.0      | 0.6875             | -1.0703        | 0.4766          | -1.5469          |
| 0.7162        | 1.7792 | 1700 | 0.6992        | 2.4688          | -103.0       | -154.0         | 0.8106          | 0.0      | 0.6953             | -1.0312        | 0.5078          | -1.5391          |
| 0.714         | 1.8838 | 1800 | 0.0579        | 2.1875          | -109.5       | -162.0         | 0.8183          | 0.0      | 0.6855             | -1.0938        | 0.5312          | -1.625           |
| 0.7068        | 1.9885 | 1900 | 0.3184        | 1.9922          | -97.5        | -151.0         | 0.8164          | 0.0      | 0.7031             | -0.9727        | 0.5352          | -1.5078          |
| 0.4781        | 2.0931 | 2000 | 0.0977        | 1.7344          | -119.0       | -171.0         | 0.8475          | 0.0      | 0.6797             | -1.1875        | 0.5273          | -1.7109          |
| 0.4964        | 2.1978 | 2100 | -0.9258       | 0.9219          | -100.0       | -155.0         | 0.8455          | 0.0      | 0.6875             | -1.0           | 0.5547          | -1.5547          |
| 0.4723        | 2.3025 | 2200 | -0.4648       | 1.2969          | -110.0       | -166.0         | 0.8475          | 0.0      | 0.6934             | -1.1016        | 0.5586          | -1.6562          |
| 0.5051        | 2.4071 | 2300 | -0.2891       | 1.4141          | -113.0       | -170.0         | 0.8480          | 0.0      | 0.6895             | -1.1328        | 0.5664          | -1.6953          |
| 0.4647        | 2.5118 | 2400 | -0.3496       | 1.4531          | -114.0       | -171.0         | 0.8463          | 0.0      | 0.6758             | -1.1406        | 0.5742          | -1.7188          |
| 0.4442        | 2.6164 | 2500 | -0.1436       | 1.5859          | -123.5       | -180.0         | 0.8527          | 0.0      | 0.6680             | -1.2344        | 0.5664          | -1.7969          |
| 0.4349        | 2.7211 | 2600 | -0.5898       | 1.2422          | -112.0       | -169.0         | 0.8505          | 0.0      | 0.6699             | -1.1172        | 0.5742          | -1.6953          |
| 0.4514        | 2.8257 | 2700 | -0.6406       | 1.1953          | -112.0       | -169.0         | 0.8493          | 0.0      | 0.6738             | -1.1172        | 0.5781          | -1.6953          |
| 0.459         | 2.9304 | 2800 | -0.5781       | 1.2422          | -113.0       | -171.0         | 0.8496          | 0.0      | 0.6680             | -1.1328        | 0.5742          | -1.7031          |
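
A quick way to read the table is to locate the checkpoint with the lowest validation loss. The sketch below copies the Validation Loss column verbatim and assumes evaluations were logged every 100 steps, as the Step column shows. It only inspects the logged numbers; the card does not say which checkpoint was published.

```python
# Validation Loss column from the table above, in step order (100..2800).
validation_loss = [
    0.9349, 0.9069, 0.9099, 0.8957, 0.8768, 0.8742, 0.8679,
    0.8503, 0.8454, 0.8441, 0.8307, 0.8353, 0.8363, 0.8228,
    0.8231, 0.8222, 0.8106, 0.8183, 0.8164, 0.8475, 0.8455,
    0.8475, 0.8480, 0.8463, 0.8527, 0.8505, 0.8493, 0.8496,
]
steps = list(range(100, 2900, 100))

# Pair each step with its loss and pick the minimum by loss.
best_step, best_loss = min(zip(steps, validation_loss), key=lambda p: p[1])
final_step, final_loss = steps[-1], validation_loss[-1]

print(f"best:  step {best_step}, validation loss {best_loss}")
print(f"final: step {final_step}, validation loss {final_loss}")
```

In the logged data, validation loss bottoms out late in the second epoch and is higher throughout the third, while training loss keeps falling. That pattern is consistent with overfitting in the last epoch, although `Rewards/margins` continues to grow over the same span.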


### Framework versions

- Transformers 4.44.2
- PyTorch 2.3.0
- Datasets 3.0.0
- Tokenizers 0.19.1