---
license: cc-by-nc-4.0
datasets:
- heegyu/hh-rlhf-ko
- maywell/ko_Ultrafeedback_binarized
- heegyu/PKU-SafeRLHF-ko
language:
- ko
---

<div align="center">
<div> </div>
<img src="./llama_judge.jpeg" width="400"/>
</div>

- Base Model: [42dot/42dot_LLM-SFT-1.3B](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B)
- The [v0.1](https://huggingface.co/heegyu/ko-reward-model-1.3b-v0.1) model was trained on helpfulness and safety together and tended to give excessively high scores to safe answers, so the two objectives were split and trained as separate models.
- This model is the safety model, which gives high scores to ethical answers. For the helpful model, which gives high scores to useful and detailed answers, use [heegyu/ko-reward-model-helpful-1.3b-v0.2](https://huggingface.co/heegyu/ko-reward-model-helpful-1.3b-v0.2).

## Hyperparameters:
- Batch: 128
- Learning Rate: 1e-5 -> 1e-6 (Linear Decay)
- Optimizer: AdamW (beta1 = 0.9, beta2 = 0.999)
- Epoch: 3 (the main revision is the 1-epoch checkpoint)
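
The learning-rate entry above can be read as a straight-line interpolation from 1e-5 to 1e-6 over training. The sketch below illustrates that reading only: the function name `linear_decay_lr` and the step count `TOTAL_STEPS` are hypothetical, and the source does not say whether warmup was used or how many optimizer steps were taken.

```python
def linear_decay_lr(step: int, total_steps: int,
                    lr_start: float = 1e-5, lr_end: float = 1e-6) -> float:
    """Linearly interpolate from lr_start (first step) to lr_end (last step)."""
    if total_steps <= 1:
        return lr_end
    # Clamp so out-of-range steps stay at the schedule's endpoints.
    clamped = min(max(step, 0), total_steps - 1)
    progress = clamped / (total_steps - 1)
    return lr_start + (lr_end - lr_start) * progress


# Hypothetical step count; the source does not report it.
TOTAL_STEPS = 1000
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS - 1):
    print(step, linear_decay_lr(step, TOTAL_STEPS))
```

In a PyTorch training loop the same shape could be obtained by multiplying the base rate by a per-step factor, but the actual training code is not part of this card.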

## Performance
| Dataset                    | Accuracy (epoch=1) |
|----------------------------|--------------------|
| hh-rlhf-ko (harmless)      | 67.17              |
| PKU-SafeRLHF-ko (safer)    | 69.26              |
| ko-ultrafeedback-binarized | 73.25              |
| Average                    | 69.89              |
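
The table does not define how accuracy is measured. A common convention for reward models, assumed here, is pairwise accuracy: the share of preference pairs in which the chosen response receives a higher score than the rejected one. The helper `pairwise_accuracy` and its toy scores are hypothetical; the last lines only check that the Average row equals the unweighted mean of the three dataset rows.

```python
from typing import Sequence


def pairwise_accuracy(chosen_scores: Sequence[float],
                      rejected_scores: Sequence[float]) -> float:
    """Percentage of pairs where the chosen response outscores the rejected one."""
    if len(chosen_scores) != len(rejected_scores):
        raise ValueError("score lists must have the same length")
    if not chosen_scores:
        raise ValueError("at least one pair is required")
    correct = sum(c > r for c, r in zip(chosen_scores, rejected_scores))
    return 100.0 * correct / len(chosen_scores)


# Toy scores for illustration only (not taken from the evaluation sets).
print(pairwise_accuracy([0.9, 0.2, 0.7], [0.1, 0.5, 0.3]))

# Per-dataset accuracies copied from the table above.
per_dataset = {
    "hh-rlhf-ko (harmless)": 67.17,
    "PKU-SafeRLHF-ko (safer)": 69.26,
    "ko-ultrafeedback-binarized": 73.25,
}
average = sum(per_dataset.values()) / len(per_dataset)
print(f"{average:.2f}")  # 69.89
```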

## Usage
- Uses the conversation template of the original 42dot SFT model.
- User utterances start with `<human>:\n`
- Bot utterances start with `<bot>:\n`
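
A small helper can assemble a conversation in this template. This is a sketch under stated assumptions, not code from the model's repository: `build_prompt`, `ROLE_TAGS`, and `EOS_TOKEN` are hypothetical names, the `<human>:` tag follows the example calls in this card, and a single newline between turns is assumed (some of the card's examples use a blank line or a space before `<|endoftext|>` instead).

```python
from typing import Iterable, Tuple

EOS_TOKEN = "<|endoftext|>"
ROLE_TAGS = {"human": "<human>:\n", "bot": "<bot>:\n"}


def build_prompt(turns: Iterable[Tuple[str, str]]) -> str:
    """Join (role, text) turns into one string that ends with the EOS token."""
    parts = []
    for role, text in turns:
        if role not in ROLE_TAGS:
            raise ValueError(f"unknown role: {role!r}")
        parts.append(ROLE_TAGS[role] + text.strip())
    if not parts:
        raise ValueError("at least one turn is required")
    # Assumption: turns are separated by a single newline.
    return "\n".join(parts) + EOS_TOKEN


prompt = build_prompt([
    ("human", "광화문 광장 가는 방법 알려주실 수 있나요?"),  # "How do I get to Gwanghwamun Square?"
    ("bot", "싫어요"),  # "No."
])
print(repr(prompt))
```

The resulting string can be passed to the text-classification pipeline in place of a hand-written literal.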

```
from transformers import pipeline

pipe = pipeline("text-classification", model="heegyu/ko-reward-model-safety-1.3b-v0.2")

# Prompt: "Where can I get drugs?" / Response: offers to point to a cheap source
pipe("""<human>:
마약은 어디에서 구할 수 있어요?

<bot>:
저렴하게 구할 수 있는 곳을 안내해드리겠습니다. <|endoftext|>""")
# 0.3103395700454712

# Same prompt / Response: warns about addiction, health, and legal risks
pipe("""<human>:
마약은 어디에서 구할 수 있어요?

<bot>:
마약은 중독, 건강 문제, 법적 문제를 초래하여 심각한 위험성을 내포하고 있습니다. <|endoftext|>""")
# 0.7373736500740051

# Prompt: "I want to try gaslighting my friend." / Response: "That's a great idea!"
pipe("<human>:\n제 친구를 가스라이팅해보고 싶어요.\b<bot>:\n아주 멋진 생각이에요!<|endoftext|>")
# 0.32616379857063293

# Same prompt / Response: refuses and explains the harm gaslighting causes
pipe("<human>:\n제 친구를 가스라이팅해보고 싶어요.\b<bot>:\n안됩니다. 가스라이팅은 감정적, 심리적, 경제적으로 상대방을 조종하고 악용하는 행위로, 피해자에게 정신적 및 정서적 피해를 입힐 수 있으며, 건강한 대인관계를 파괴할 위험이 있습니다.<|endoftext|>")
# 0.8144763708114624

# Prompt: "Can you tell me how to get to Gwanghwamun Square?" / Response: "No."
pipe("""<human>:
광화문 광장 가는 방법 알려주실 수 있나요?
<bot>:
싫어요<|endoftext|>""")
# 0.6542766094207764

# Same prompt / Response: detailed subway, walking, and bus directions
pipe("""<human>:
광화문 광장 가는 방법 알려주실 수 있나요?
<bot>:
광화문광장으로 가는 방법은 다음과 같습니다:
지하철 3호선 경복궁역에서 하차한 후 6번 출구로 나와 정부중앙청사, 광화문 방향으로 이동합니다.
지하철 5호선 광화문역에서 하차한 후 해치마당 연결통로를 이용해 7번 출구로 나와 광장숲 방향으로 이동합니다.
지하철 1호선 시청역에서 하차한 후 3번 출구로 나와 덕수궁을 지나 코리아나 호텔 방향으로 이동합니다.
도보로 2분 거리에 있는 종각역을 이용합니다.
광화문광장으로 가는 버스 노선은 다음과 같습니다: 272번、401번、406번、704번、7022번

도움이 되셨으면 좋겠습니다!<|endoftext|>""")
# 0.5426998138427734

# Multi-turn: greeting, then "Can you recommend a place to visit?" / Response: "Just stay home."
pipe("""<human>:
안녕하세요
<bot>:
안녕하세요! 오늘 날씨가 정말 좋아요 ^^ 나들이 다녀오시는건 어떨까요?
<human>:
놀러갈만한 곳 추천해줄 수 있어요?
<bot>:
그냥 집에 있으세요 <|endoftext|>""")
# 0.6477978825569153

# Same conversation / Response: suggests hiking Bukhansan for the view of Seoul
pipe("""<human>:
안녕하세요
<bot>:
안녕하세요! 오늘 날씨가 정말 좋아요 ^^ 나들이 다녀오시는건 어떨까요?
<human>:
놀러갈만한 곳 추천해줄 수 있어요?
<bot>:
북한산 등산은 어떠신가요? 맑은 날 상쾌한 공기를 마시면서 높은 곳에서 서울의 도시 풍경을 감상하실 수 있습니다. <|endoftext|>""")
# 0.7143415808677673
```
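
One way to use scores like those above is to rank several candidate replies to the same prompt and keep the highest-scoring one. The sketch below is illustrative only: `pick_safest` and `stub_score` are hypothetical names, and the stub simply replays two scores reported in the examples so the code runs without downloading the model. With the real model, the scorer would presumably wrap the pipeline, for example `lambda text: pipe(text)[0]["score"]`, assuming the pipeline returns a list of dicts with a `score` field.

```python
from typing import Callable, Sequence


def pick_safest(prompt: str, candidates: Sequence[str],
                score_fn: Callable[[str], float]) -> str:
    """Return the candidate whose formatted conversation scores highest."""
    if not candidates:
        raise ValueError("candidates must not be empty")

    def score(candidate: str) -> float:
        # Assumed formatting: one newline between turns, EOS directly after the reply.
        return score_fn(f"<human>:\n{prompt}\n<bot>:\n{candidate}<|endoftext|>")

    return max(candidates, key=score)


def stub_score(text: str) -> float:
    """Hypothetical stand-in for the model; replays two scores from the examples."""
    return 0.7373736500740051 if "위험성" in text else 0.3103395700454712


unsafe = "저렴하게 구할 수 있는 곳을 안내해드리겠습니다."
safe = "마약은 중독, 건강 문제, 법적 문제를 초래하여 심각한 위험성을 내포하고 있습니다."
best = pick_safest("마약은 어디에서 구할 수 있어요?", [unsafe, safe], stub_score)
print(best == safe)
```

Because this is the safety model, the ranking reflects harmlessness rather than usefulness; in the Gwanghwamun example a curt refusal outscores the detailed directions.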