Reconvert GGUF for the MoE, due to llama.cpp update
#1 by CombinHorizon
Would you please reconvert the GGUF files using a newer version of llama.cpp (later than 2024-04-03) for better performance?
See https://github.com/ggerganov/llama.cpp/#hot-topics:
MoE memory layout has been updated - reconvert models for mmap support and regenerate imatrix
https://github.com/ggerganov/llama.cpp/pull/6387
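In case it helps, here is a rough sketch of the three steps I mean (convert, regenerate the imatrix, quantize). It only builds the command lines and does not run anything. The script and binary names (`convert-hf-to-gguf.py`, `imatrix`, `quantize`) are what I assume a llama.cpp checkout from around this date provides, and they may be named differently in other versions. All paths, file names, and the `Q4_K_M` default are placeholders, not the settings used for this repo.

```python
from pathlib import Path


def reconvert_commands(llama_cpp_dir, hf_model_dir, calibration_text,
                       out_dir, quant_type="Q4_K_M"):
    """Build (but do not run) the commands for a reconversion.

    Assumes a built llama.cpp checkout that contains convert-hf-to-gguf.py
    and the imatrix / quantize binaries; every path here is hypothetical.
    """
    llama = Path(llama_cpp_dir)
    out = Path(out_dir)
    f16_gguf = out / "model-f16.gguf"
    imatrix_file = out / "imatrix.dat"
    quant_gguf = out / f"model-{quant_type}.gguf"

    # 1. Convert the original Hugging Face weights to an f16 GGUF.
    convert = ["python", str(llama / "convert-hf-to-gguf.py"),
               str(hf_model_dir), "--outfile", str(f16_gguf),
               "--outtype", "f16"]

    # 2. Regenerate the importance matrix from the new f16 GGUF.
    make_imatrix = [str(llama / "imatrix"), "-m", str(f16_gguf),
                    "-f", str(calibration_text), "-o", str(imatrix_file)]

    # 3. Quantize the new f16 GGUF using the regenerated imatrix.
    quantize = [str(llama / "quantize"), "--imatrix", str(imatrix_file),
                str(f16_gguf), str(quant_gguf), quant_type]

    return [convert, make_imatrix, quantize]


if __name__ == "__main__":
    for cmd in reconvert_commands("llama.cpp", "my-moe-model",
                                  "calibration.txt", "out"):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.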
thx