Models: 118

Active filters: vllm

mlx-community/Ministral-8B-Instruct-2410-bf16

Updated 5 days ago • 256 • 1

mlx-community/Ministral-8B-Instruct-2410-8bit

Updated 5 days ago • 129 • 1

adriabama06/reader-lm-1.5b-AWQ

Text Generation • Updated 3 days ago • 14 • 1

aashish1904/Ministral-8B-Instruct-2410-HF-Q4_K_M-GGUF

Updated 2 days ago • 74 • 1

Inferless/deciLM-7B-GPTQ

Text Generation • Updated Jan 4 • 10 • 1

Inferless/SOLAR-10.7B-Instruct-v1.0-GPTQ

Text Generation • Updated Jan 4 • 18 • 2

Inferless/Mixtral-8x7B-v0.1-int8-GPTQ

Text Generation • Updated Jan 25 • 18 • 2

neuralmagic/Meta-Llama-3-8B-Instruct-FP8-KV

Text Generation • Updated Jun 19 • 21.8k • 6

neuralmagic/Meta-Llama-3-70B-Instruct-FP8

Text Generation • Updated Jul 18 • 6.65k • 10

neuralmagic/Qwen2-0.5B-Instruct-FP8

Text Generation • Updated Jul 18 • 641 • 2

neuralmagic/Qwen2-1.5B-Instruct-FP8

Text Generation • Updated Jul 18 • 45

neuralmagic/Qwen2-7B-Instruct-FP8

Text Generation • Updated Jul 18 • 783 • 1

nm-testing/SparseLlama-3-8B-pruned_50.2of4-FP8

Text Generation • Updated Jun 25 • 28

FlorianJc/Hermes-2-Pro-Mistral-7B-vllm-fp8

Text Generation • Updated Jul 17 • 19

FlorianJc/openchat-3.6-8b-20240522-vllm-fp8

Text Generation • Updated Jul 17 • 12

FlorianJc/Llama3-ChatQA-1.5-8B-vllm-fp8

Text Generation • Updated Jul 17 • 10

Rallio67/magnum-72B-FP8

Text Generation • Updated Jun 26 • 11

neuralmagic/Meta-Llama-3-70B-Instruct-FP8-KV

Text Generation • Updated Jun 26 • 696 • 2

neuralmagic/Mistral-7B-Instruct-v0.3-FP8

Text Generation • Updated Jul 18 • 989 • 2

neuralmagic/Llama-2-7b-chat-hf-FP8

Text Generation • Updated Jul 18 • 506

neuralmagic/Phi-3-mini-128k-instruct-FP8

Text Generation • Updated 13 days ago • 531

neuralmagic/Phi-3-medium-128k-instruct-FP8

Text Generation • Updated 13 days ago • 850 • 5

FlorianJc/google-gemma-2-9b-it-vllm-fp8

Text Generation • Updated Jul 17 • 84

tranhoangnguyen03/Gemma-2-9B-It-SPPO-Iter3_Q8

Text Generation • Updated Jul 7 • 4

FlorianJc/Llama3-ChatQA-1.5-8B-v2-vllm-fp8

Text Generation • Updated Jul 17 • 13

FlorianJc/MegaBeam-Mistral-7B-300k-vllm-fp8

Text Generation • Updated Jul 17 • 9

neuralmagic/gemma-2-9b-it-FP8

Text Generation • Updated Jul 18 • 3.38k • 5

nm-testing/Llama-2-70b-chat-hf-FP8

Text Generation • Updated Jul 16 • 19

neuralmagic/Qwen2-57B-A14B-Instruct-FP8

Text Generation • Updated Jul 18 • 667 • 1

nm-testing/Meta-Llama-3-8B-Instruct-FP8-K-V

Text Generation • Updated 13 days ago • 26