EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models Paper • 2409.17892 • Published 29 days ago • 2
Article wHy DoNt YoU jUsT uSe ThE lLaMa ToKeNiZeR?? By catherinearnett • 28 days ago • 33
Faith and Fate: Limits of Transformers on Compositionality Paper • 2305.18654 • Published May 29, 2023 • 6
💻 Local SmolLMs Collection SmolLM models in MLC, ONNX and GGUF format for local applications + in-browser demos • 14 items • Updated Aug 20 • 41
🤏 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 178
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 601
Parakeet Collection NeMo Parakeet ASR Models attain strong speech recognition accuracy while being efficient for inference. Available in CTC and RNN-Transducer variants. • 8 items • Updated 25 days ago • 20
OLMo Suite Collection Artifacts for the first set of OLMo models. • 18 items • Updated about 1 month ago • 64
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset Paper • 2309.04662 • Published Sep 9, 2023 • 22