Models: 289

Active filters: rlhf

sileod/deberta-v3-base-tasksource-nli

Zero-Shot Classification • Updated Aug 13 • 86.6k • 118

PKU-Alignment/beaver-dam-7b

Updated Jul 10, 2023 • 1.03k • 6

fnlp/moss-rlhf-reward-model-7B-en

Updated Jul 13, 2023 • 9

mlabonne/NeuralHermes-2.5-Mistral-7B

Text Generation • Updated Apr 8 • 394 • 151

argilla/distilabeled-OpenHermes-2.5-Mistral-7B

Text Generation • Updated Jan 17 • 20 • 28

mlabonne/NeuralBeagle14-7B

Text Generation • Updated Mar 4 • 363 • 157

tasksource/deberta-small-long-nli

Zero-Shot Classification • Updated Aug 28 • 26.9k • 35

TheBloke/CapybaraHermes-2.5-Mistral-7B-GGUF

Updated Jan 31 • 9.31k • 92

TheBloke/CapybaraHermes-2.5-Mistral-7B-AWQ

Updated Jan 31 • 2.41k • 20

TheBloke/CapybaraHermes-2.5-Mistral-7B-GPTQ

Updated Jan 31 • 840 • 52

mlabonne/AlphaMonarch-7B

Text Generation • Updated Mar 28 • 11k • 148

dfurman/Qwen2-72B-Orpo-v0.1

Text Generation • Updated 27 days ago • 2.59k • 4

stanfordnlp/SteamSHP-flan-t5-xl

Text2Text Generation • Updated Oct 10, 2023 • 49 • 43

stanfordnlp/SteamSHP-flan-t5-large

Text2Text Generation • Updated Oct 10, 2023 • 55 • 33

trl-lib/llama-7b-se-peft

Updated Apr 6, 2023 • 4

sileod/deberta-v3-large-tasksource-nli

Zero-Shot Classification • Updated Feb 17 • 5.2k • 31

sileod/deberta-v3-large-tasksource-rlhf-reward-model

Text Classification • Updated Mar 28, 2023 • 917 • 11

trl-lib/llama-7b-se-rl-peft

Updated Apr 14, 2023 • 103

trl-lib/llama-7b-se-rm-peft

Updated Apr 6, 2023 • 8

toloka/gpt2-large-rl-prompt-writing

Text Generation • Updated Apr 21, 2023 • 16 • 3

AdamG012/chat-opt-1.3b-rlhf-actor-deepspeed

Text Generation • Updated Apr 25, 2023 • 14 • 5

AdamG012/chat-opt-1.3b-rlhf-critic-deepspeed

Text Generation • Updated Apr 25, 2023 • 40 • 3

AdamG012/chat-opt-1.3b-rlhf-actor-ema-deepspeed

Text Generation • Updated Apr 25, 2023 • 26 • 8

sileod/mdeberta-v3-base-tasksource-nli

Zero-Shot Classification • Updated Oct 19, 2023 • 49 • 15

agi-css/socially-good-lm

Text Generation • Updated May 29, 2023 • 24 • 5

agi-css/hh-rlhf-sft

Text Generation • Updated Jun 1, 2023 • 17 • 3

agi-css/better-base

Text Generation • Updated Jun 1, 2023 • 25 • 5

argilla/roberta-base-reward-model-falcon-dolly

Text Classification • Updated Jun 16, 2023 • 40 • 4

merve/peft-copy-test

Text Generation • Updated Jun 14, 2023 • 5

PKU-Alignment/beaver-7b-v1.0

Reinforcement Learning • Updated May 9 • 47 • 9