Qwen/Qwen2-7B-Instruct-GPTQ-Int8

Text Generation

text-generation-inference

Inference Endpoints

8-bit precision

I tried with and without vLLM; "RuntimeError: probability tensor contains either `inf`, `nan` or element < 0" still occurs during inference with transformers

#2 opened 4 months ago by

use_exllama?

#1 opened 4 months ago by