edited Jun 22, 2023

Does config.json need to be in a specific location for the file to load as an MPT model? Edit: config.json is in the same directory as the .bin file.

python3.10 koboldcpp.py --model ../models-koboldcpp/mpt-30B-chat-GGML/mpt-30b-chat.ggmlv0.q5_0.bin --useclblast 0 0 --contextsize 8192
Welcome to KoboldCpp - Version 1.32
Attempting to use CLBlast library for faster prompt ingestion. A compatible clblast will be required.
Initializing dynamic library: koboldcpp_clblast.so

Loading model: /home/sapien/m2/ai-chat/models-koboldcpp/mpt-30B-chat-GGML/mpt-30b-chat.ggmlv0.q5_0.bin
[Threads: 7, BlasThreads: 7, SmartContext: False]

Identified as GPT-NEO-X model: (ver 406)
Attempting to Load...

System Info: AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
gpt_neox_model_load: loading model from '/home/sapien/m2/ai-chat/models-koboldcpp/mpt-30B-chat-GGML/mpt-30b-chat.ggmlv0.q5_0.bin' - please wait ...
gpt_neox_model_load: n_vocab = 7168
gpt_neox_model_load: n_ctx = 8192
gpt_neox_model_load: n_embd = 64
gpt_neox_model_load: n_head = 48
gpt_neox_model_load: n_layer = 50432
gpt_neox_model_load: n_rot = 1090519040
gpt_neox_model_load: par_res = 0
gpt_neox_model_load: ftype = 2008
gpt_neox_model_load: qntvr = 2
gpt_neox_model_load: ggml ctx size = 104213.61 MB

Platform:0 Device:0 - NVIDIA CUDA with NVIDIA GeForce RTX 3090

ggml_opencl: selecting platform: 'NVIDIA CUDA'
ggml_opencl: selecting device: 'NVIDIA GeForce RTX 3090'
ggml_opencl: device FP16 support: false
CL FP16 temporarily disabled pending further optimization.
GGML_ASSERT: ggml.c:4164: ctx->mem_buffer != NULL
[1] 29383 IOT instruction (core dumped) python3.10 koboldcpp.py --model --useclblast 0 0 --contextsize 8192

TornButter changed discussion status to closed Jun 22, 2023

TornButter changed discussion status to open Jun 22, 2023

TornButter

Jun 22, 2023

Fixed by adding this to my command: --forceversion 500
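For anyone scripting the launch, here is a minimal sketch of the same fix in Python. It only builds the argument list from the command in the first post and appends the flag; `with_forced_version` is a hypothetical helper name, and the assumption that 500 selects the MPT loader comes from this thread, not from KoboldCpp documentation.

```python
import shlex

# Original invocation from the first post, split into an argument list.
base_cmd = shlex.split(
    "python3.10 koboldcpp.py "
    "--model ../models-koboldcpp/mpt-30B-chat-GGML/mpt-30b-chat.ggmlv0.q5_0.bin "
    "--useclblast 0 0 --contextsize 8192"
)


def with_forced_version(cmd, version=500):
    """Return a copy of cmd with --forceversion appended, unless already present.

    Hypothetical helper; 500 is the value reported to work in this thread.
    """
    if "--forceversion" in cmd:
        return list(cmd)
    return list(cmd) + ["--forceversion", str(version)]


fixed_cmd = with_forced_version(base_cmd)
print(shlex.join(fixed_cmd))
```

The resulting list can be passed to `subprocess.run(fixed_cmd)` from the koboldcpp directory.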

TheBloke

Owner Jun 22, 2023

Yeah, I should mention that in the README. It's a bug in KoboldCpp; LostRuins is aware of it and will fix it in the next release.
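The odd values in the log (n_embd = 64, n_layer = 50432, n_rot = 1090519040) are consistent with an MPT header being read with the GPT-NeoX field order. The sketch below illustrates this. Both header layouts are assumptions based on the ggml example loaders, and the MPT values (d_model 7168, 64 heads, 48 layers, vocab 50432, alibi_bias_max 8.0) are inferred from the log, not read from the actual file.

```python
import struct

# Assumed MPT header layout after the magic number (from the ggml MPT example):
#   d_model, max_seq_len, n_heads, n_layers, n_vocab  -> int32
#   alibi_bias_max, clip_qkv                          -> float32
#   ftype                                             -> int32
# The values are inferred from the log in this thread, not read from the file.
mpt_header = struct.pack(
    "<5i2fi",
    7168,    # d_model
    8192,    # max_seq_len
    64,      # n_heads
    48,      # n_layers
    50432,   # n_vocab
    8.0,     # alibi_bias_max
    0.0,     # clip_qkv
    2008,    # ftype
)

# Assumed GPT-NeoX header layout: eight int32 fields in this order.
neox_fields = [
    "n_vocab", "n_ctx", "n_embd", "n_head",
    "n_layer", "n_rot", "par_res", "ftype",
]

# Reinterpret the same 32 bytes the way a GPT-NeoX loader would.
misread = dict(zip(neox_fields, struct.unpack("<8i", mpt_header)))

for name, value in misread.items():
    print(f"gpt_neox_model_load: {name} = {value}")
```

Under these assumptions every field lands one slot off: the float 8.0 has the bit pattern 0x41000000, which reads as the integer 1090519040 reported for n_rot, and the vocabulary size ends up in n_layer, which would explain the roughly 104 GB ggml ctx size and the failed allocation.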

koboldcpp thinks it is a GPT-NEO-X model?
