eval
Browse files
README.md
CHANGED
@@ -33,15 +33,9 @@ Previous versions remain available in the repository. New models will be release
|
|
33 |
|
34 |
## Evaluation
|
35 |
|
36 |
-
| Model | Avg | ARC | HS | MMLU | TQA |
|
37 |
-
|
38 |
-
| **Shining Valiant 1.
|
39 |
-
| Llama 2 | 67.35 | 67.32 | 87.33 | 69.83 | 44.92 |
|
40 |
-
| Llama 2 Chat | 66.80 | 64.59 | 85.88 | 63.91 | 52.80 |
|
41 |
-
|
42 |
-
**Shining Valiant 1.3** is awaiting full results from the Open LLM Leaderboard.
|
43 |
-
|
44 |
-
SV 1.3 outperformed SV 1.2 on our internal testing.
|
45 |
|
46 |
## Prompting Guide
|
47 |
Shining Valiant uses the same prompt format as Llama 2 Chat - feel free to use your existing prompts and scripts!
|
|
|
33 |
|
34 |
## Evaluation
|
35 |
|
36 |
+
| Model | Avg | ARC | HS | MMLU | TQA | WG | GSM |
|
37 |
+
|-----------------------|--------|-------|-------|--------|-------|-------|-------|
|
38 |
+
| **Shining Valiant 1.3** | 73.78 | 71.33 | 90.96 | 71.21 | 70.29 | 84.21 | 54.66 |
|
|
|
|
|
|
|
|
|
|
|
|
|
39 |
|
40 |
## Prompting Guide
|
41 |
Shining Valiant uses the same prompt format as Llama 2 Chat - feel free to use your existing prompts and scripts!
|