LEDITS: Real Image Editing with DDPM Inversion and Semantic Guidance Paper • 2307.00522 • Published Jul 2, 2023 • 31
Free Music Archive Collection ISMIR's 2017 FMA Dataset, Optimized for 🤗 Datasets / 🥐 Croissant, with Clear Licensing • 4 items • Updated Sep 13 • 3
Article wHy DoNt YoU jUsT uSe ThE lLaMa ToKeNiZeR?? By catherinearnett • 25 days ago • 33
Adding Conditional Control to Text-to-Image Diffusion Models Paper • 2302.05543 • Published Feb 10, 2023 • 37
Article How to generate text: using different decoding methods for language generation with Transformers Mar 1, 2020 • 103
Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA May 24, 2023 • 84
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation Paper • 2409.09214 • Published Sep 13 • 45
Article Using 🤗 to Train a GPT-2 Model for Music Generation By juancopi81 • Oct 5, 2023 • 7
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated Jul 31 • 325
SpeechVerse: A Large-scale Generalizable Audio Language Model Paper • 2405.08295 • Published May 14 • 14
Quantized-Mistral Collection Quantized Mistral models in 2-, 4-, and 8-bit versions • 4 items • Updated Aug 31 • 4
FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation Paper • 2409.02245 • Published Sep 3 • 9
Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch By AviSoori1x • May 7 • 38
Article Introducing AuraFace: Open-Source Face Recognition and Identity Preservation Models By isidentical • Aug 26 • 35
aaliyah Collection Personal collection of convnet models and paper implementations for different applications. • 2 items • Updated Aug 25 • 1
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models Paper • 2408.02718 • Published Aug 5 • 60
Article Open-sourcing Knowledge Distillation Code and Weights of SD-Small and SD-Tiny Aug 1, 2023 • 2
LLM-AD: Large Language Model based Audio Description System Paper • 2405.00983 • Published May 2 • 16
Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion Paper • 2407.13759 • Published Jul 18 • 17
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity Paper • 2407.10387 • Published Jul 15 • 6
Article Train custom AI models with the trainer API and adapt them to 🤗 By not-lain • Jun 29 • 33
Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers Paper • 2310.05400 • Published Oct 9, 2023 • 1