license_name: deepseek
license_link: https://huggingface.co/deepseek-ai/deepseek-coder-33b-base/blob/main/LICENSE
---

Q8_0 GGUF quantization of [WhiteRabbitNeo/WhiteRabbitNeo-33B-v1.5](https://huggingface.co/WhiteRabbitNeo/WhiteRabbitNeo-33B-v1.5),
converted and quantized with llama.cpp.

Please note there is a **bug** in the `convert.py` script from recent versions of llama.cpp that **affects models
with additional vocabulary** (like this one). Affected models appear to convert successfully, but the resulting
files contain garbage and cannot be used for inference. Until this is fixed, you must use an older version of
`convert.py` (only this script; the rest of llama.cpp can stay at the latest version), with the following
parameters: `--pad-vocab --vocab-type bpe`. For example, with a locally downloaded copy of the original model repo:

```
python llama.cpp/convert-aa23412.py WhiteRabbitNeo-33B-v1.5 --pad-vocab --vocab-type bpe
```

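The command above produces an unquantized GGUF, which can then be quantized to Q8_0 with llama.cpp's `quantize` tool. A minimal sketch; the input filename here is an assumption (`convert.py` writes the GGUF into the model directory, so adjust the path to the file it actually produced):

```
# Input filename is illustrative: use the GGUF that convert.py actually wrote
./llama.cpp/quantize WhiteRabbitNeo-33B-v1.5/ggml-model-f32.gguf WhiteRabbitNeo-33B-v1.5-Q8_0.gguf Q8_0
```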
The **last working version** of `convert.py` is [aa23412](https://github.com/ggerganov/llama.cpp/tree/aa2341298924ac89778252015efcb792f2df1e20)

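One way to obtain that revision of the script (the `convert-aa23412.py` filename matches the example above, but any name works) is to extract it from the repository history with `git show`:

```
git clone https://github.com/ggerganov/llama.cpp
# Extract convert.py as it existed at commit aa23412, leaving the rest of the tree at the latest version
git -C llama.cpp show aa2341298924ac89778252015efcb792f2df1e20:convert.py > llama.cpp/convert-aa23412.py
```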
---

# ORIGINAL MODEL CARD

# This model is now live (We'll always be serving the newest model on our web app)!
Access at: https://www.whiterabbitneo.com/