shai committed on
Commit
f83699c
1 Parent(s): 62d81c6

update readme

Files changed (3)
  1. README.md +20 -7
  2. assets/perf_size.png +0 -0
  3. assets/text_recognition.png +0 -0
README.md CHANGED
@@ -18,12 +18,17 @@ The H2OVL-Mississippi-800M is a compact yet powerful vision-language model from
  <img src="./assets/text_recognition.png" alt="Mississippi-2B Benchmarks" width="600"/>
  </div>
 
-
  ## Key Features:
 
  - 0.8 Billion Parameters: Balance between performance and efficiency, making it suitable for OCR and document processing.
  - Trained on 19 million image-text pairs, with a focus on OCR, document comprehension, and chart, figure, and table interpretation, the model is optimized for superior OCR performance.
 
+
+ <div align="center">
+ <img src="./assets/perf_size.png" alt="Mississippi-2B Benchmarks" width="600"/>
+ </div>
+
+
  ## Usage
 
  ### Install dependencies:
@@ -69,12 +74,20 @@ print(f'User: {question}\nAssistant: {response}')
 
  ## Benchmarks
 
- ### 🤗 OpenVLM Leaderboard
-
- | Benchmark | acc_n |
- |:-------------------|:-----:|
- | OCRBench | 75.1 |
-
+ ### Performance Comparison of Similar Sized Models Across Multiple Benchmarks - OpenVLM Leaderboard
+
+ | **Models** | **Params (B)** | **Avg. Score** | **MMBench** | **MMStar** | **MMMU<sub>VAL</sub>** | **Math Vista** | **Hallusion** | **AI2D<sub>TEST</sub>** | **OCRBench** | **MMVet** |
+ |----------------------------|----------------|----------------|-------------|------------|-----------------------|----------------|---------------|-------------------------|--------------|-----------|
+ | Qwen2-VL-2B | 2.1 | **57.2** | **72.2** | 47.5 | 42.2 | 47.8 | **42.4** | 74.7 | **797** | **51.5** |
+ | **H2OVL-Mississippi-2B** | 2.1 | 54.4 | 64.8 | 49.6 | 35.2 | **56.8** | 36.4 | 69.9 | 782 | 44.7 |
+ | InternVL2-2B | 2.1 | 53.9 | 69.6 | **49.8** | 36.3 | 46.0 | 38.0 | 74.1 | 781 | 39.7 |
+ | Phi-3-Vision | 4.2 | 53.6 | 65.2 | 47.7 | **46.1** | 44.6 | 39.0 | **78.4** | 637 | 44.1 |
+ | MiniMonkey | 2.2 | 52.7 | 68.9 | 48.1 | 35.7 | 45.3 | 30.9 | 73.7 | **794** | 39.8 |
+ | MiniCPM-V-2 | 2.8 | 47.9 | 65.8 | 39.1 | 38.2 | 39.8 | 36.1 | 62.9 | 605 | 41.0 |
+ | InternVL2-1B | 0.8 | 48.3 | 59.7 | 45.6 | 36.7 | 39.4 | 34.3 | 63.8 | 755 | 31.5 |
+ | PaliGemma-3B-mix-448 | 2.9 | 46.5 | 65.6 | 48.3 | 34.9 | 28.7 | 32.2 | 68.3 | 614 | 33.1 |
+ | **H2OVL-Mississippi-0.8B** | 0.8 | 43.5 | 47.7 | 39.1 | 34.0 | 39.0 | 29.6 | 53.6 | 751 | 30.0 |
+ | DeepSeek-VL-1.3B | 2.0 | 39.6 | 63.8 | 39.9 | 33.8 | 29.8 | 27.6 | 51.5 | 413 | 29.2 |
 
 
  ## Acknowledgments
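
Reader's note on the benchmark table added to README.md in this commit: the diff does not say how the Avg. Score column is derived. The figures appear consistent with an unweighted mean of the eight benchmark columns after dividing OCRBench (scored out of 1000) by 10. That rule is an inference from the numbers, not something the README states, and the helper name `average_score` is hypothetical. A minimal Python sketch that checks three rows copied from the table:

```python
# Scores copied from three rows of the added table, in column order:
# MMBench, MMStar, MMMU_VAL, Math Vista, Hallusion, AI2D_TEST, OCRBench, MMVet.
# Each entry pairs the per-benchmark scores with the reported Avg. Score.
ROWS = {
    "Qwen2-VL-2B": ([72.2, 47.5, 42.2, 47.8, 42.4, 74.7, 797, 51.5], 57.2),
    "H2OVL-Mississippi-2B": ([64.8, 49.6, 35.2, 56.8, 36.4, 69.9, 782, 44.7], 54.4),
    "H2OVL-Mississippi-0.8B": ([47.7, 39.1, 34.0, 39.0, 29.6, 53.6, 751, 30.0], 43.5),
}

OCRBENCH_INDEX = 6  # position of OCRBench in each score list


def average_score(scores):
    """Assumed rule: rescale OCRBench from 0-1000 to 0-100, then take the plain mean."""
    rescaled = list(scores)
    rescaled[OCRBENCH_INDEX] = rescaled[OCRBENCH_INDEX] / 10
    return sum(rescaled) / len(rescaled)


for name, (scores, reported) in ROWS.items():
    print(f"{name}: computed {average_score(scores):.2f}, reported {reported}")
```

Under this assumption the computed means land within about 0.05 of the reported values. The remaining gap would be expected if the leaderboard averages unrounded per-benchmark scores, though the diff does not confirm that.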
assets/perf_size.png ADDED
assets/text_recognition.png CHANGED