Adding Evaluation Results

#8
Files changed (1): README.md (+14 −0)

README.md CHANGED
@@ -274,3 +274,17 @@ Our code and checkpoints are open to research purpose, and they are allowed for
 
  If you are interested to leave a message to either our research team or product team, join our Discord or WeChat groups! Also, feel free to send an email to [email protected].
 
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Qwen__Qwen-14B)
+
+| Metric                | Value |
+|-----------------------|-------|
+| Avg.                  | 60.07 |
+| ARC (25-shot)         | 58.28 |
+| HellaSwag (10-shot)   | 83.99 |
+| MMLU (5-shot)         | 67.7  |
+| TruthfulQA (0-shot)   | 49.43 |
+| Winogrande (5-shot)   | 76.8  |
+| GSM8K (5-shot)        | 58.98 |
+| DROP (3-shot)         | 25.31 |