---
license: openrail
---
# Al-Nay (ุงู„ู†ุงูŠ) Unconditional Diffusion
Al-Nay is one of the oldest instruments still played today.
With roots in ancient Egypt nearly 5,000 years ago, it has become a staple of Arabic and Persian music.
While the number of Nayzens, the name given to skilled players of the instrument, has diminished over time, our Unconditional Diffusion model ensures that number is never zero.
This project could not have been built without [these audio diffusion tools](https://github.com/teticio/audio-diffusion).
## Usage
Using this model is no different from using any other audio diffusion model on Hugging Face.
```python
import torch
from diffusers import DiffusionPipeline
# Set up the device and random generator
device = "cuda" if torch.cuda.is_available() else "cpu"
generator = torch.Generator(device=device)

# Instantiate the pretrained pipeline
model_id = "mijwiz-laboratories/al_nay_diffusion_unconditional_256"
audio_diffusion = DiffusionPipeline.from_pretrained(model_id).to(device)

# Seed the generator (keep `seed` if you want to reproduce a sample you like)
seed = generator.seed()
generator.manual_seed(seed)

# Run inference
output = audio_diffusion(generator=generator)
image = output.images[0]     # Generated Mel spectrogram (PIL image)
audio = output.audios[0, 0]  # Raw audio waveform (NumPy array)
```
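The `audio` array returned above is a raw waveform, not a file on disk. The sketch below shows one way to write it to a WAV file; the `soundfile` dependency, the `nay_sample.wav` file name, and the 22050 Hz fallback sample rate are assumptions, so substitute the rate your pipeline reports if it differs.
```python
import soundfile as sf

# If the pipeline exposes its Mel converter, read the training sample rate from it;
# otherwise fall back to 22050 Hz (an assumption; adjust for your model).
mel = getattr(audio_diffusion, "mel", None)
if mel is not None and hasattr(mel, "get_sample_rate"):
    sample_rate = mel.get_sample_rate()
else:
    sample_rate = 22050

# Write the generated waveform to disk as a WAV file
sf.write("nay_sample.wav", audio, sample_rate)
```
In a notebook you can also listen to the result directly with `IPython.display.Audio(audio, rate=sample_rate)`.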
## Limitations of Model
The dataset used for training was very small, so the diversity of snippets that can be generated is rather limited. Furthermore, on high-intensity segments (think of a human playing the instrument very forcefully), the realism and naturalness of the generated flute degrades.
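Given the limited diversity, it can help to sample several seeds and keep the snippet that sounds best. A minimal sketch, reusing the `audio_diffusion`, `generator`, and `sample_rate` objects from the usage example above (the seed values and file names are arbitrary choices):
```python
import soundfile as sf

# Generate a handful of samples with different seeds and save each one,
# so the most natural-sounding snippet can be picked by ear.
for seed in [0, 1, 2, 3, 4]:
    generator.manual_seed(seed)
    output = audio_diffusion(generator=generator)
    sf.write(f"nay_sample_seed{seed}.wav", output.audios[0, 0], sample_rate)
```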