davanstrien HF staff committed on
Commit
3af71f9
1 Parent(s): 0aabc4c

Update README.md

  1. README.md +8 -0
README.md CHANGED
@@ -22,6 +22,14 @@ This is a copy of the original [Molmo 7B-D model card](https://huggingface.co/al
22
 
23
  **Note: The following implementation is a community-contributed endpoint handler and is not an official implementation. For the official model and its usage, please refer to the [official Molmo 7B-D model page](https://huggingface.co/allenai/Molmo-7B-D-0924).**
24
 
25
+ You should see a `Deploy` option for Inference Endpoints at the top of this model card.
26
+
27
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/60107b385ac3e86b3ea4fc34/kHR0wO_GchczmsmHtjJ1u.png)
28
+
29
+ Currently, this handler uses `bfloat16` for inference. The original authors found some differences in results compared with using `float32` weights.
30
+ In my initial experiments the results did not degrade much, but I may change this implementation in the future.
31
+
32
+
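To give a rough sense of why `bfloat16` and `float32` can produce slightly different results, here is a minimal sketch that rounds individual numbers to `bfloat16` precision using only the Python standard library. It does not reproduce the handler or the model. `to_bfloat16` is a hypothetical helper written for this illustration, and it assumes finite inputs. `bfloat16` keeps the 8 exponent bits of `float32` but only 7 explicit mantissa bits, so each value carries a small rounding error.

```python
import math
import struct


def to_bfloat16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (hypothetical helper)."""
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest, ties to even, on the 16 low bits that bfloat16 drops.
    bits += 0x7FFF + ((bits >> 16) & 1)
    # Keep only the upper 16 bits: sign, 8 exponent bits, 7 mantissa bits.
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for value in (1.0, 0.1, math.pi):
        rounded = to_bfloat16(value)
        rel_err = abs(rounded - value) / abs(value)
        print(f"{value!r} -> {rounded!r} (relative error {rel_err:.2e})")
```

Per-value errors of this size are small, but they can accumulate across the many layers of a large model, which is one plausible source of the differences mentioned above.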
33
  If you've deployed the model using Hugging Face's Inference Endpoints with a community-contributed handler, you can use it with the following code:
34
 
35
  ```python