---
license: openrail
---
# Al-Nay (الناي) Unconditional Diffusion

Al-Nay (the nay) is one of the oldest instruments still played today. 
With roots in ancient Egypt nearly 5,000 years ago, it has become a staple of Arabic and Persian music. 
While the number of Nayzens – the name given to skilled players of the instrument – has diminished over time, our unconditional diffusion model ensures that number never reaches zero.
This project could not have been done without [the audio-diffusion tools](https://github.com/teticio/audio-diffusion).

## Usage

Using this model is no different from using any other audio diffusion model on Hugging Face.

```python
import torch
from diffusers import DiffusionPipeline

# Setup device and create generator
device = "cuda" if torch.cuda.is_available() else "cpu"
generator = torch.Generator(device=device)

# Instantiate model
model_id = "mijwiz-laboratories/al_nay_diffusion_unconditional_256"
audio_diffusion = DiffusionPipeline.from_pretrained(model_id).to(device)

# Record a seed and re-seed the generator so the result can be reproduced
seed = generator.seed()
generator.manual_seed(seed)

# Run inference
output = audio_diffusion(generator=generator)
image = output.images[0]  # Generated mel spectrogram (PIL image)
audio = output.audios[0, 0]  # Generated audio waveform (numpy array)
```
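
To listen to the result, the generated waveform can be written to a WAV file. The snippet below is a minimal sketch assuming the `soundfile` package is installed; the 22050 Hz sample rate is an assumption based on common defaults in the audio-diffusion tooling and may need adjusting for this model.

```python
import soundfile as sf

# Assumed sample rate (22050 Hz) -- verify against the model's mel configuration
sample_rate = 22050

# Write the generated waveform to a WAV file for playback
sf.write("al_nay_sample.wav", audio, sample_rate)
```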

## Limitations of the Model
The dataset used for training was very small, so the diversity of snippets that can be generated is rather limited. Furthermore, on high-intensity segments (think of a musician playing the instrument forcefully), the realism and naturalness of the generated flute sound degrade.