	---
	library_name: transformers
	tags:
	- trl
	- cpo
	- alignment-handbook
	- generated_from_trainer
	model-index:
	- name: OpenELM-1_1B-SimPO
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# OpenELM-1_1B-SimPO

	This model was trained from scratch on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Logits/chosen: -0.5781
	- Logits/rejected: 1.2422
	- Logps/chosen: -113.0
	- Logps/rejected: -171.0
	- Loss: 0.8496
	- Nll Loss: 0.0
	- Rewards/accuracies: 0.6680
	- Rewards/chosen: -1.1328
	- Rewards/margins: 0.5742
	- Rewards/rejected: -1.7031
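
The reward metrics above are related to each other. As a minimal sketch, assuming the usual TRL convention that `Rewards/margins` is the mean of chosen minus rejected rewards, the reported margin can be checked against the two reported means. The tolerance of 0.01 is an assumption chosen to absorb the low-precision rounding visible in the logged values.

```python
import math

# Final evaluation metrics copied from the list above.
rewards_chosen = -1.1328
rewards_rejected = -1.7031
reported_margin = 0.5742

# Margin recomputed from the two reported means
# (assumes margin = chosen - rejected, averaged over the eval set).
recomputed_margin = rewards_chosen - rewards_rejected

# The logged figures appear rounded, so compare with a small absolute
# tolerance (0.01 is an assumed value, not something stated in the card).
margin_matches = math.isclose(recomputed_margin, reported_margin, abs_tol=0.01)

print(f"recomputed margin: {recomputed_margin:.4f}")
print(f"reported margin:   {reported_margin:.4f}")
print(f"consistent within 0.01: {margin_matches}")
```

A positive margin means the model scores chosen responses above rejected ones on average, which matches the reported `Rewards/accuracies` being above 0.5.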

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 8
	- eval_batch_size: 16
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 4
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 64
	- total_eval_batch_size: 64
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 3
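
The totals in the list above follow from the per-device values: 8 × 4 × 2 = 64 for training and 16 × 4 = 64 for evaluation. The sketch below reproduces that arithmetic and also illustrates a cosine schedule with linear warmup using the listed `learning_rate` and `lr_scheduler_warmup_ratio`. The function `cosine_lr` and the value of `TOTAL_STEPS` are hypothetical: the card does not state the total step count, so 2865 is only an estimate from the step and epoch columns of the results table, and the formula is the commonly used warmup-plus-cosine form rather than a copy of the trainer's code.

```python
import math

# Values copied from the hyperparameter list above.
train_batch_size = 8
eval_batch_size = 16
num_devices = 4
gradient_accumulation_steps = 2
learning_rate = 5e-05
warmup_ratio = 0.1

# Effective batch sizes: gradient accumulation only applies to training.
total_train = train_batch_size * num_devices * gradient_accumulation_steps
total_eval = eval_batch_size * num_devices


def cosine_lr(step, total_steps, peak_lr=learning_rate, ratio=warmup_ratio):
    """Hypothetical helper: linear warmup to peak_lr, then cosine decay to 0."""
    warmup_steps = math.ceil(total_steps * ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumed total number of optimizer steps (an estimate, not from the card).
TOTAL_STEPS = 2865

print("total_train_batch_size:", total_train)
print("total_eval_batch_size:", total_eval)
for step in (0, 100, 287, 1500, 2865):
    print(step, f"{cosine_lr(step, TOTAL_STEPS):.2e}")
```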

	### Training results

	| Training Loss | Epoch | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Nll Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
	|:-------------:|:------:|:----:|:-------------:|:---------------:|:------------:|:--------------:|:---------------:|:--------:|:------------------:|:--------------:|:---------------:|:----------------:|
	| 0.9346 | 0.1047 | 100 | -8.5625 | -7.9688 | -33.25 | -41.75 | 0.9349 | 0.0 | 0.6133 | -0.3320 | 0.0864 | -0.4180 |
	| 0.9139 | 0.2093 | 200 | -3.4531 | -2.4375 | -48.5 | -63.5 | 0.9069 | 0.0 | 0.6270 | -0.4844 | 0.1504 | -0.6367 |
	| 0.907 | 0.3140 | 300 | -5.1875 | -4.0 | -69.5 | -83.5 | 0.9099 | 0.0 | 0.6055 | -0.6914 | 0.1416 | -0.8359 |
	| 0.901 | 0.4186 | 400 | -1.7422 | 0.0164 | -84.0 | -101.0 | 0.8957 | 0.0 | 0.6328 | -0.8359 | 0.1748 | -1.0156 |
	| 0.8752 | 0.5233 | 500 | -0.5625 | 0.8555 | -72.5 | -95.5 | 0.8768 | 0.0 | 0.6582 | -0.7266 | 0.2324 | -0.9570 |
	| 0.8808 | 0.6279 | 600 | 2.1562 | 3.2344 | -86.0 | -109.5 | 0.8742 | 0.0 | 0.6445 | -0.8633 | 0.2334 | -1.0938 |
	| 0.8277 | 0.7326 | 700 | -0.7930 | 0.3496 | -52.0 | -77.5 | 0.8679 | 0.0 | 0.6445 | -0.5195 | 0.2520 | -0.7734 |
	| 0.8341 | 0.8373 | 800 | 0.2188 | 1.3047 | -80.5 | -108.5 | 0.8503 | 0.0 | 0.6602 | -0.8047 | 0.2773 | -1.0859 |
	| 0.8333 | 0.9419 | 900 | 0.6406 | 1.8438 | -90.0 | -121.5 | 0.8454 | 0.0 | 0.6660 | -0.8984 | 0.3184 | -1.2188 |
	| 0.8071 | 1.0466 | 1000 | 0.1504 | 1.3516 | -100.0 | -133.0 | 0.8441 | 0.0 | 0.6699 | -1.0 | 0.3340 | -1.3359 |
	| 0.7845 | 1.1512 | 1100 | -1.5078 | 0.3301 | -84.5 | -122.5 | 0.8307 | 0.0 | 0.6660 | -0.8477 | 0.3809 | -1.2266 |
	| 0.7483 | 1.2559 | 1200 | -0.4160 | 0.9805 | -94.5 | -133.0 | 0.8353 | 0.0 | 0.6758 | -0.9453 | 0.3809 | -1.3281 |
	| 0.7802 | 1.3605 | 1300 | -1.5859 | 0.3418 | -62.0 | -100.5 | 0.8363 | 0.0 | 0.7051 | -0.6211 | 0.3828 | -1.0 |
	| 0.7499 | 1.4652 | 1400 | -0.1719 | 1.4531 | -97.0 | -141.0 | 0.8228 | 0.0 | 0.7012 | -0.9727 | 0.4414 | -1.4141 |
	| 0.6966 | 1.5699 | 1500 | -0.3301 | 1.5 | -106.0 | -152.0 | 0.8231 | 0.0 | 0.6836 | -1.0625 | 0.4609 | -1.5234 |
	| 0.6921 | 1.6745 | 1600 | 0.6133 | 2.25 | -107.0 | -155.0 | 0.8222 | 0.0 | 0.6875 | -1.0703 | 0.4766 | -1.5469 |
	| 0.7162 | 1.7792 | 1700 | 0.6992 | 2.4688 | -103.0 | -154.0 | 0.8106 | 0.0 | 0.6953 | -1.0312 | 0.5078 | -1.5391 |
	| 0.714 | 1.8838 | 1800 | 0.0579 | 2.1875 | -109.5 | -162.0 | 0.8183 | 0.0 | 0.6855 | -1.0938 | 0.5312 | -1.625 |
	| 0.7068 | 1.9885 | 1900 | 0.3184 | 1.9922 | -97.5 | -151.0 | 0.8164 | 0.0 | 0.7031 | -0.9727 | 0.5352 | -1.5078 |
	| 0.4781 | 2.0931 | 2000 | 0.0977 | 1.7344 | -119.0 | -171.0 | 0.8475 | 0.0 | 0.6797 | -1.1875 | 0.5273 | -1.7109 |
	| 0.4964 | 2.1978 | 2100 | -0.9258 | 0.9219 | -100.0 | -155.0 | 0.8455 | 0.0 | 0.6875 | -1.0 | 0.5547 | -1.5547 |
	| 0.4723 | 2.3025 | 2200 | -0.4648 | 1.2969 | -110.0 | -166.0 | 0.8475 | 0.0 | 0.6934 | -1.1016 | 0.5586 | -1.6562 |
	| 0.5051 | 2.4071 | 2300 | -0.2891 | 1.4141 | -113.0 | -170.0 | 0.8480 | 0.0 | 0.6895 | -1.1328 | 0.5664 | -1.6953 |
	| 0.4647 | 2.5118 | 2400 | -0.3496 | 1.4531 | -114.0 | -171.0 | 0.8463 | 0.0 | 0.6758 | -1.1406 | 0.5742 | -1.7188 |
	| 0.4442 | 2.6164 | 2500 | -0.1436 | 1.5859 | -123.5 | -180.0 | 0.8527 | 0.0 | 0.6680 | -1.2344 | 0.5664 | -1.7969 |
	| 0.4349 | 2.7211 | 2600 | -0.5898 | 1.2422 | -112.0 | -169.0 | 0.8505 | 0.0 | 0.6699 | -1.1172 | 0.5742 | -1.6953 |
	| 0.4514 | 2.8257 | 2700 | -0.6406 | 1.1953 | -112.0 | -169.0 | 0.8493 | 0.0 | 0.6738 | -1.1172 | 0.5781 | -1.6953 |
	| 0.459 | 2.9304 | 2800 | -0.5781 | 1.2422 | -113.0 | -171.0 | 0.8496 | 0.0 | 0.6680 | -1.1328 | 0.5742 | -1.7031 |
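
The table can be scanned programmatically to see where validation loss bottoms out. The sketch below copies the step, epoch, training loss, and validation loss columns from the table and picks the evaluation step with the lowest validation loss. It also derives a rough steps-per-epoch figure from the last row. The variable names are illustrative, and the per-epoch example count assumes every optimizer step consumed a full batch of 64, which the card does not confirm.

```python
# (step, epoch, training_loss, validation_loss), copied from the table above.
rows = [
    (100, 0.1047, 0.9346, 0.9349), (200, 0.2093, 0.9139, 0.9069),
    (300, 0.3140, 0.907, 0.9099), (400, 0.4186, 0.901, 0.8957),
    (500, 0.5233, 0.8752, 0.8768), (600, 0.6279, 0.8808, 0.8742),
    (700, 0.7326, 0.8277, 0.8679), (800, 0.8373, 0.8341, 0.8503),
    (900, 0.9419, 0.8333, 0.8454), (1000, 1.0466, 0.8071, 0.8441),
    (1100, 1.1512, 0.7845, 0.8307), (1200, 1.2559, 0.7483, 0.8353),
    (1300, 1.3605, 0.7802, 0.8363), (1400, 1.4652, 0.7499, 0.8228),
    (1500, 1.5699, 0.6966, 0.8231), (1600, 1.6745, 0.6921, 0.8222),
    (1700, 1.7792, 0.7162, 0.8106), (1800, 1.8838, 0.714, 0.8183),
    (1900, 1.9885, 0.7068, 0.8164), (2000, 2.0931, 0.4781, 0.8475),
    (2100, 2.1978, 0.4964, 0.8455), (2200, 2.3025, 0.4723, 0.8475),
    (2300, 2.4071, 0.5051, 0.8480), (2400, 2.5118, 0.4647, 0.8463),
    (2500, 2.6164, 0.4442, 0.8527), (2600, 2.7211, 0.4349, 0.8505),
    (2700, 2.8257, 0.4514, 0.8493), (2800, 2.9304, 0.459, 0.8496),
]

# Evaluation step with the lowest validation loss.
best_step, best_epoch, _, best_val = min(rows, key=lambda r: r[3])

# Gap between validation and training loss at the last logged step.
last_step, last_epoch, last_train, last_val = rows[-1]
final_gap = last_val - last_train

# Rough steps per epoch, and examples per epoch assuming full batches of 64.
steps_per_epoch = last_step / last_epoch
approx_examples_per_epoch = steps_per_epoch * 64

print(f"lowest validation loss {best_val} at step {best_step} (epoch {best_epoch})")
print(f"final validation minus training loss: {final_gap:.4f}")
print(f"about {steps_per_epoch:.0f} steps per epoch, "
      f"roughly {approx_examples_per_epoch:.0f} examples per epoch")
```

In these numbers the training loss drops sharply at the start of the third epoch while the validation loss rises from its minimum, so an earlier checkpoint may generalize better than the final one, if such a checkpoint was saved.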


	### Framework versions

	- Transformers 4.44.2
	- Pytorch 2.3.0
	- Datasets 3.0.0
	- Tokenizers 0.19.1