Hugging Face
Wei Xiong
weqweasdas
13 followers · 2 following
https://weixiongust.github.io/WeiXiongUST/index.html
AI & ML interests
Machine learning, RLHF
weqweasdas's activity
New activity in RLHFlow/LLaMA3-SFT, about 1 month ago
LLaMA3.1-SFT
3
#3 opened about 1 month ago by jackzhang
New activity in Qwen/Qwen2.5-Math-RM-72B, about 1 month ago
example to service the RM
1
#2 opened about 1 month ago by weqweasdas
New activity in RLHFlow/LLaMA3-SFT, about 2 months ago
How to use llama 3sft model, pipeline or tokenizer.apply_chat_template. Can you provide a simple example? Thank you very much for your contribution
2
#2 opened about 2 months ago by ZHIYII
New activity in RLHFlow/LLaMA3-SFT, 3 months ago
Missing BOS token in tokenized text
2
#1 opened 3 months ago by ZhaofengWu
New activity in RLHF4MATH/Gemma-7B-it-SFT3epoch, 3 months ago
Update README.md
#1 opened 3 months ago by weqweasdas
New activity in RLHFlow/ArmoRM-Llama3-8B-v0.1, 3 months ago
Special tokens in the vocabulary?
4
#13 opened 3 months ago by nshen7
New activity in sfairXC/FsfairX-LLaMA3-RM-v0.1, 4 months ago
TypeError: Got unsupported ScalarType BFloat16
1
#5 opened 4 months ago by AIR-hl
New activity in RLHFlow/pair-preference-model-LLaMA3-8B, 4 months ago
Could you please test the consistency of preference between `RLHFlow/pair-preference-model-LLaMA3-8B` and GPT-4 on alpacaeval dataset?
1
#2 opened 4 months ago by rungao2001
Commented on a paper, 6 months ago
RLHF Workflow: From Reward Modeling to Online RLHF
Paper • 2405.07863 • Published May 13 • 67 • 5
New activity in weqweasdas/RM-Mistral-7B, 6 months ago
why vocab size is 32001
1
#3 opened 6 months ago by yechenzhi1
New activity in weqweasdas/RM-Mistral-7B, 7 months ago
License
1
#2 opened 7 months ago by ravir123
Fix dataset link
#1 opened 7 months ago by ZennyKenny