HF Inference Endpoint

#5
by tintin12

Has anyone tried deploying this through an HF Inference Endpoint? I get errors. I know the inference engine's command line has an option to pass a parameter telling it that this is an AWQ model, but the deployment interface doesn't offer anything like that, so I get errors and can't run it.

No, I've never tried it on the hosted HF endpoints. Only with a local TGI deployment via a Docker container.

If the hosted endpoint provides no way to specify the model type, then my guess is that it's not supported, but beyond that I'm afraid I don't know. Maybe contact HF support?
