Wenboz/phi3-offline-dpo-lora-noise-0.0-5e-7-thre-1.5-42
Tags: PEFT, Safetensors, phi3, alignment-handbook, trl, dpo, Generated from Trainer, custom_code
License: mit
Commit History
Model save (8d64e04, verified), Wenboz committed on Jul 9
Training in progress, step 500 (41b01dd, verified), Wenboz committed on Jul 9
Training in progress, step 400 (2630859, verified), Wenboz committed on Jul 9
Training in progress, step 300 (a78b063, verified), Wenboz committed on Jul 9
Training in progress, step 200 (609a684, verified), Wenboz committed on Jul 9
Training in progress, step 100 (718f840, verified), Wenboz committed on Jul 9
initial commit (0a665fa, verified), Wenboz committed on Jul 9