accelerate>=0.33.0
bitsandbytes>0.37.0
cached_path
click
datasets
ema_pytorch>=0.5.2
gradio
jieba
librosa
matplotlib
numpy<=1.26.4
pydub
pypinyin
safetensors
soundfile
tomli
torchdiffeq
tqdm>=4.65.0
transformers
vocos
wandb
x_transformers>=1.31.14