ShieldGemma Release Collection A series of safety classifiers, trained on top of Gemma 2, for developers to filter inputs and outputs of their applications. • 3 items • Updated Jul 31 • 11
Gemma Scope Release Collection A comprehensive, open suite of sparse autoencoders for Gemma 2 2B and 9B. • 10 items • Updated Aug 11 • 13
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated 27 days ago • 597
Compact Language Models via Pruning and Knowledge Distillation Paper • 2407.14679 • Published Jul 19 • 37
DynMoE Family Collection DynMoE model checkpoints and paper on huggingface • 4 items • Updated Aug 19 • 3
Training language models to follow instructions with human feedback Paper • 2203.02155 • Published Mar 4, 2022 • 15
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Sep 18 • 343
Article Introducing Ghost 8B Beta: A Game-Changing Language Model By lamhieu • Jul 17 • 7
Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions Paper • 2407.06723 • Published Jul 9 • 10
Article MInference 1.0: 10x Faster Million Context Inference with a Single GPU By liyucheng • Jul 11 • 11
Transformers.js demos Collection A collection of my favorite WebML demos, built with Transformers.js! • 30 items • Updated Jul 11 • 84
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models Paper • 2407.02687 • Published Jul 2 • 22
Perturbed Attention Guidance pipelines Collection Pipelines for Perturbed Attention Guidance with 🧨 library • 8 items • Updated Jun 26 • 6
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27 • 147
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25 • 85
MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data Paper • 2406.18790 • Published Jun 26 • 33
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Paper • 2406.16855 • Published Jun 24 • 54
VideoTetris: Towards Compositional Text-to-Video Generation Paper • 2406.04277 • Published Jun 6 • 22
abliterated-v3 Collection Latest gen of the abliterated models I've produced • 17 items • Updated Jun 3 • 95
OpenMath Collection A collection of models and datasets introduced in "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset" • 15 items • Updated 22 days ago • 37
Universal token classification Collection Collection of universal token classification (UTC) models capable of solving many information extraction tasks in a prompt-tuned manner. • 11 items • Updated Sep 10 • 12
Article Decoding GPT-4'o': In-Depth Exploration of Its Mechanisms and Creating Similar AI. By KingNish • May 21 • 32