
Model returns blank answers and repeats words

#4
by bazi88 - opened

When I use the command:
./main -m ./models/vicuna-13b-v1.5-16k.ggmlv3.q4_K_S.bin -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt
it returns blank answers.

And if I use
./main -m ./models/vicuna-13b-v1.5-16k.ggmlv3.q4_K_S.bin - -n 128
it runs fine, but sometimes toward the end of an answer the words start to repeat.
Please help.

Please add:

-c 16384  --rope-freq-base 10000 --rope-freq-scale 0.25

for 16K context, or

-c 8192 --rope-freq-base 10000 --rope-freq-scale 0.5

for 8K context
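As a sketch of where those scale values come from (assuming the model's original training context is 4096 tokens, which these flags imply but the thread does not state), the linear RoPE scale is just the original context divided by the target context:

```shell
# Sketch: linear RoPE scaling factor = original context / target context.
# Assumption: the model was trained at a 4096-token context.
base_ctx=4096
for target in 8192 16384; do
  awk "BEGIN{printf \"target=%d scale=%.2f\n\", $target, $base_ctx/$target}"
done
# prints: target=8192 scale=0.50
#         target=16384 scale=0.25
```

So --rope-freq-scale 0.25 pairs with -c 16384, and 0.5 pairs with -c 8192.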



These settings make it work perfectly for me.

Hello,

while the prompt option works like a charm, I still struggle with the interactive version. I use:

--rope-freq-base 10000 --rope-freq-scale 0.5 -ngl 32 --ctx_size 2048 --temp 0.7 --top_k 40 --top_p 0.5 --repeat_last_n 256 --batch_size 1024 --repeat_penalty 1.17647
--interactive --reverse-prompt "### Human:" --in-prefix ' ' --threads 8 --n_predict 2048

merging @TheBloke's suggestions with the example provided with llama.cpp in the file chat-vicuna.sh. However, after I give an instruction, it produces a short output and then gets stuck indefinitely. I have to hit RETURN to get another interaction, and

  • either it starts again with another answer
  • or it just outputs blank spaces forever

What am I doing wrong?

Many thanks in advance for the help
[attached screenshot: screen.png]
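One consistency check worth sketching here (assumptions: the model's base training context is 4096 tokens, and llama.cpp's linear RoPE scaling implies effective context = base context / scale; neither is stated in the thread): the command above combines --rope-freq-scale 0.5 with --ctx_size 2048, but a 0.5 scale targets an 8192-token window.

```shell
# Sketch: check that --ctx_size matches what --rope-freq-scale implies.
# Assumption: base training context is 4096 and scaling is linear.
base_ctx=4096
scale=0.5        # value passed to --rope-freq-scale
ctx_size=2048    # value passed to --ctx_size
implied=$(awk "BEGIN{printf \"%d\", $base_ctx/$scale}")
if [ "$ctx_size" -ne "$implied" ]; then
  echo "mismatch: scale $scale implies -c $implied, but -c is $ctx_size"
fi
```

Under those assumptions, the flags above would print a mismatch, suggesting --ctx_size 8192 as the consistent pairing.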
