zfz1/deepseek-8b-orpo-lora
PEFT
TensorBoard
Safetensors
zfz1/my_preference_gsm8k_deepseek
llama
alignment-handbook
trl
orpo
Generated from Trainer
License: other
Commit History
Training in progress, step 312
062e283
verified
zfz1
committed on
Jul 18
End of training
1075d3a
verified
zfz1
committed on
Jul 15
Model save
3cbea15
verified
zfz1
committed on
Jul 15
Training in progress, step 312
759010b
verified
zfz1
committed on
Jul 15
initial commit
81d9374
verified
zfz1
committed on
Jul 15