Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models Paper • 2406.04271 • Published Jun 6, 2024 • 27
Preference Datasets for DPO Collection This collection contains curated preference datasets for DPO fine-tuning to align LLMs with user intent • 7 items • Updated Jul 30 • 29
WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation Paper • 2312.14187 • Published Dec 20, 2023 • 49
Gemini: A Family of Highly Capable Multimodal Models Paper • 2312.11805 • Published Dec 19, 2023 • 45
Orca 2: Teaching Small Language Models How to Reason Paper • 2311.11045 • Published Nov 18, 2023 • 70
Nemotron 3 8B Collection The Nemotron 3 8B family of models is optimized for building production-ready generative AI applications for the enterprise. • 5 items • Updated about 1 month ago • 45
Eureka: Human-Level Reward Design via Coding Large Language Models Paper • 2310.12931 • Published Oct 19, 2023 • 26
In-Context Pretraining: Language Modeling Beyond Document Boundaries Paper • 2310.10638 • Published Oct 16, 2023 • 28
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models Paper • 2309.12307 • Published Sep 21, 2023 • 87
A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis Paper • 2307.12856 • Published Jul 24, 2023 • 35