Update README.md
README.md
CHANGED
@@ -6,11 +6,11 @@ pipeline_tag: text-classification
 ---
 
 # Introduction
-The Generalizable Reward Model (GRM) aims to enhance the generalization ability of reward models for LLMs
+The Generalizable Reward Model (GRM) aims to enhance the generalization ability of reward models for LLMs through regularizing the hidden states.
 
 Paper: [Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs](https://arxiv.org/abs/2406.10216).
 
-The introduced regularization
+The introduced text generation regularization markedly improves the accuracy of learned reward models across a variety of out-of-distribution tasks and effectively alleviates the over-optimization issue in RLHF (even with corrupted preference data), offering a more reliable and robust preference learning paradigm.
 
 This reward model is finetuned from [llama3_8b_instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) using the [hendrydong/preference_700K](https://huggingface.co/datasets/hendrydong/preference_700K) dataset.
 
@@ -22,7 +22,7 @@ We evaluate GRM on the [reward model benchmark](https://huggingface.co/spaces/al
 | Model | Average | Chat | Chat Hard | Safety | Reasoning |
 |:-------------------------:|:-------------:|:---------:|:---------:|:--------:|:-----------:|
 | **Ray2333/GRM-llama3-8B-sftreg** (Ours, 8B) | 87.0 | 98.6 | 67.8 | 89.4 | 92.3 |
-| openai/gpt-4-0125-preview
+| openai/gpt-4-0125-preview | 85.9 | 95.3 | 74.3 | 87.2 | 86.9 |
 | sfairXC/FsfairX-LLaMA3-RM-v0.1 (8B) | 84.7 | 99.4 | 65.1 | 87.8 | 86.4 |
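The preference dataset referenced above pairs a chosen and a rejected response per prompt, and reward models trained on such data typically optimize a Bradley-Terry objective over the scalar reward margin. A minimal sketch of the resulting preference probability (illustrative helper, not code from this repository):

```python
import math

def bradley_terry_prob(reward_chosen: float, reward_rejected: float) -> float:
    """Probability that the chosen response is preferred, given scalar rewards.

    Standard Bradley-Terry formulation: sigmoid of the reward margin.
    (Illustrative helper; not part of the GRM codebase.)
    """
    return 1.0 / (1.0 + math.exp(reward_rejected - reward_chosen))

# Equal rewards -> indifference between the two responses
print(bradley_terry_prob(1.0, 1.0))  # 0.5
# A larger reward margin -> higher confidence the chosen response wins
print(bradley_terry_prob(3.0, 1.0))
```

During reward-model training, the negative log of this probability over the preference pairs is minimized, which pushes the model to assign higher scalar rewards to chosen responses than to rejected ones.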