SFT Models
Updated Sep 20
We train a series of SFT models on RLHFlow's high-quality SFT dataset for research purposes.
RLHFlow/LLaMA3-SFT • Text Generation • Updated 16 days ago • 8.32k • 7
sfairXC/gemma-sft-1ep • Text Generation • Updated Aug 30 • 23
sfairXC/gemma-sft-2ep • Text Generation • Updated Aug 30 • 50
sfairXC/llama-3.1-sft-1ep • Text Generation • Updated Sep 18 • 12
sfairXC/llama-3.1-sft-2ep • Text Generation • Updated Sep 18 • 7