AutoTrain: No-code training for state-of-the-art models Paper • 2410.15735 • Published 1 day ago • 38
Falcon Mamba: The First Competitive Attention-free 7B Language Model Paper • 2410.05355 • Published 15 days ago • 26
Critique-out-Loud Reward Models Collection Paper: https://arxiv.org/abs/2408.11791 | Code: https://github.com/zankner/CLoud • 7 items • Updated Sep 5 • 3
Style over Substance: Failure Modes of LLM Judges in Alignment Benchmarking Paper • 2409.15268 • Published 29 days ago • 11
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22 • 115
Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging Article • By akjindal53244 • Aug 19 • 73
Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents Paper • 2408.07199 • Published Aug 13 • 20
A failed experiment: Infini-Attention, and why we should keep trying? Article • Aug 14 • 46
Instruction-Following Evaluation for Large Language Models Paper • 2311.07911 • Published Nov 14, 2023 • 19
🍃 MINT-1T Collection Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24 • 50
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 6 items • Updated Jul 21 • 59
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published Apr 29 • 68
SpaceByte: Towards Deleting Tokenization from Large Language Modeling Paper • 2404.14408 • Published Apr 22 • 6
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Paper • 2402.09844 • Published Feb 15 • 20
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing Paper • 2404.12253 • Published Apr 18 • 53
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study Paper • 2404.10719 • Published Apr 16 • 3
From r to Q^*: Your Language Model is Secretly a Q-Function Paper • 2404.12358 • Published Apr 18 • 2
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 61
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14 • 124
Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset Paper • 2402.14804 • Published Feb 22 • 2
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM Paper • 2403.07816 • Published Mar 12 • 39
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers Paper • 2402.19255 • Published Feb 29 • 1
Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation Paper • 2402.18334 • Published Feb 28 • 12
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap Paper • 2402.19450 • Published Feb 29 • 3
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs Paper • 2402.14740 • Published Feb 22 • 8
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning Paper • 2312.01552 • Published Dec 4, 2023 • 30
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization Paper • 2402.09320 • Published Feb 14 • 6
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning Paper • 2402.04833 • Published Feb 7 • 6
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset Paper • 2402.10176 • Published Feb 15 • 34
A Minimaximalist Approach to Reinforcement Learning from Human Feedback Paper • 2401.04056 • Published Jan 8 • 2
Possible Meissner effect near room temperature in copper-substituted lead apatite Paper • 2401.00999 • Published Jan 2 • 5
R-Tuning: Teaching Large Language Models to Refuse Unknown Questions Paper • 2311.09677 • Published Nov 16, 2023 • 3
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations Paper • 2312.08935 • Published Dec 14, 2023 • 4
Some things are more CRINGE than others: Preference Optimization with the Pairwise Cringe Loss Paper • 2312.16682 • Published Dec 27, 2023 • 5
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning Paper • 2312.15685 • Published Dec 25, 2023 • 17
Model Merging Collection Model merging is a popular technique in the LLM space. Here is a chronological list of papers to help you get started with it! • 30 items • Updated Jun 12 • 216
A General Theoretical Paradigm to Understand Learning from Human Preferences Paper • 2310.12036 • Published Oct 18, 2023 • 14
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models Paper • 2311.16079 • Published Nov 27, 2023 • 20
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2 Paper • 2311.10702 • Published Nov 17, 2023 • 18