Diffusers

You are viewing version v0.16.0. A newer version, v0.31.0, is available.

How to use Stable Diffusion on Habana Gaudi

🤗 Diffusers is compatible with Habana Gaudi through 🤗 Optimum Habana.

Requirements

Optimum Habana 1.5 or later (see the Optimum Habana installation instructions).
SynapseAI 1.9.

Inference Pipeline

To generate images with Stable Diffusion 1 and 2 on Gaudi, you need to instantiate two objects:

A pipeline with GaudiStableDiffusionPipeline. This pipeline supports text-to-image generation.
A scheduler with GaudiDDIMScheduler. This scheduler has been optimized for Habana Gaudi.

When initializing the pipeline, specify use_habana=True to deploy it on HPUs. To get the fastest possible generation, also enable HPU graphs with use_hpu_graphs=True. Finally, specify a Gaudi configuration, which can be downloaded from the Hugging Face Hub.

from optimum.habana import GaudiConfig
from optimum.habana.diffusers import GaudiDDIMScheduler, GaudiStableDiffusionPipeline

model_name = "stabilityai/stable-diffusion-2-base"
scheduler = GaudiDDIMScheduler.from_pretrained(model_name, subfolder="scheduler")
pipeline = GaudiStableDiffusionPipeline.from_pretrained(
    model_name,
    scheduler=scheduler,
    use_habana=True,
    use_hpu_graphs=True,
    gaudi_config="Habana/stable-diffusion",
)

You can then call the pipeline to generate images in batches from one or several prompts:

outputs = pipeline(
    prompt=[
        "High quality photo of an astronaut riding a horse in space",
        "Face of a yellow cat, high resolution, sitting on a park bench",
    ],
    num_images_per_prompt=10,
    batch_size=4,
)
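To make the batching in the call above concrete, here is a minimal sketch of the arithmetic, not the library's actual implementation. It assumes the pipeline generates num_images_per_prompt images for each prompt and processes them in groups of batch_size, padding the last group if it is incomplete. The helper name plan_batches is hypothetical.

```python
import math


def plan_batches(num_prompts, num_images_per_prompt, batch_size):
    """Return (total_images, num_batches, padding) for one pipeline call.

    Assumption: images are processed in fixed-size batches and the last
    batch is padded up to batch_size when it is not full.
    """
    total_images = num_prompts * num_images_per_prompt
    num_batches = math.ceil(total_images / batch_size)
    padding = num_batches * batch_size - total_images
    return total_images, num_batches, padding


# Values from the call above: 2 prompts, 10 images each, batches of 4.
total_images, num_batches, padding = plan_batches(2, 10, 4)
print(total_images, num_batches, padding)
```

Under these assumptions, the call above produces 20 images in 5 full batches. With a single prompt, the same settings would give 10 images in 3 batches, the last of which holds only 2 real images.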

For more information, check out Optimum Habana’s documentation and the example provided in the official GitHub repository.

Benchmark

Here are the latencies and throughputs for first-generation Habana Gaudi and Gaudi2 with the Habana/stable-diffusion Gaudi configuration (mixed precision bf16/fp32):

Stable Diffusion v1.5 (512x512 resolution):

Device	Latency (batch size = 1)	Throughput (batch size = 8)
First-generation Gaudi	4.22s	0.29 images/s
Gaudi2	1.70s	0.925 images/s

Stable Diffusion v2.1 (768x768 resolution):

Device	Latency (batch size = 1)	Throughput
First-generation Gaudi	23.3s	0.045 images/s (batch size = 2)
Gaudi2	7.75s	0.14 images/s (batch size = 5)
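As a rough way to read the tables above, the following sketch computes the Gaudi2-to-first-generation ratios from the published figures. The dictionary keys and function names are hypothetical and chosen only for illustration. Note that the v2.1 throughput figures were measured at different batch sizes (2 and 5), so that ratio is not a like-for-like comparison.

```python
# Figures copied from the benchmark tables; keys are illustrative names.
latency_s = {
    "v1.5 (512x512)": {"gaudi1": 4.22, "gaudi2": 1.70},
    "v2.1 (768x768)": {"gaudi1": 23.3, "gaudi2": 7.75},
}
throughput_img_per_s = {
    "v1.5 (512x512)": {"gaudi1": 0.29, "gaudi2": 0.925},
    "v2.1 (768x768)": {"gaudi1": 0.045, "gaudi2": 0.14},
}


def latency_speedup(row):
    """How many times lower the Gaudi2 latency is (higher is better)."""
    return row["gaudi1"] / row["gaudi2"]


def throughput_gain(row):
    """How many times higher the Gaudi2 throughput is (higher is better)."""
    return row["gaudi2"] / row["gaudi1"]


for model in latency_s:
    speedup = latency_speedup(latency_s[model])
    gain = throughput_gain(throughput_img_per_s[model])
    print(f"{model}: latency {speedup:.2f}x lower, throughput {gain:.2f}x higher")
```

On these figures, Gaudi2 is roughly 2.5 to 3 times faster than first-generation Gaudi at batch size 1 and delivers about 3 times the throughput.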
