---
license: apache-2.0
base_model: stabilityai/stable-diffusion-xl-base-1.0
tags:
- stable-diffusion-xl
- stable-diffusion-xl-diffusers
- text-to-image
- diffusers
- controlnet
inference: false
language:
- en
pipeline_tag: text-to-image
---
# Softedge ControlNet
EcomXL is a series of text-to-image diffusion models optimized for e-commerce scenarios, built on [Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0).<br/>
The ControlNet weights are fine-tuned from stable-diffusion-xl-base-1.0.
They work well with SDXL as well as with community models based on SDXL.
The model is trained on both general data and Taobao e-commerce data, and performs well in both general and e-commerce scenarios.
## Examples
These examples were generated with AUTOMATIC1111/stable-diffusion-webui.
`softedge`|`weight-0.6`|`weight-0.8`
:--:|:--:|:--:
![images](./images/1_0.png) | ![images](./images/1_1.png) | ![images](./images/1_2.png)
![images](./images/2_0.png) | ![images](./images/2_1.png) | ![images](./images/2_2.png)
![images](./images/3_0.png) | ![images](./images/3_1.png) | ![images](./images/3_2.png)
![images](./images/4_0.png) | ![images](./images/4_1.png) | ![images](./images/4_2.png)
## Usage with Diffusers
```python
from diffusers import (
    AutoencoderKL,
    ControlNetModel,
    DPMSolverMultistepScheduler,
    StableDiffusionXLControlNetPipeline,
)
from diffusers.utils import load_image
from controlnet_aux import PidiNetDetector, HEDdetector
import torch

# Load the soft-edge ControlNet weights in half precision.
controlnet = ControlNetModel.from_pretrained(
    "alimama-creative/EcomXL_controlnet_softedge",
    torch_dtype=torch.float16,
    use_safetensors=True,
)
# SDXL VAE variant that is stable in fp16.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    torch_dtype=torch.float16,
)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# pipe.enable_xformers_memory_efficient_attention()
pipe.to(device="cuda", dtype=torch.float16)
pipe.enable_vae_slicing()

# Source image from which the soft-edge map is extracted.
image = load_image(
    "https://huggingface.co/alimama-creative/EcomXL_controlnet_softedge/resolve/main/images/1_1.png"
)

# HEDdetector.from_pretrained("lllyasviel/Annotators") can be used the same way.
edge_processor = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
edge_image = edge_processor(image, safe=False)  # set safe=True to use pidisafe

prompt = "a bottle on the Twilight Grassland, Sitting on the ground, a couple of tall grass sitting in a field of tall grass, sunset,"
negative_prompt = "low quality, bad quality, sketches"

output = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=edge_image,
    num_inference_steps=25,
    controlnet_conditioning_scale=0.6,
    guidance_scale=7,
    width=1024,
    height=1024,
).images[0]
output.save("test_edge.png")
```
The model performs well when the ControlNet weight (`controlnet_conditioning_scale`) is between 0.6 and 0.8.
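
To compare weights within that range, one option is to run the same call several times and vary only `controlnet_conditioning_scale`. The helper below is a minimal sketch, not part of this repository: `sweep_conditioning_scale` and its `generate` argument are hypothetical names, and `generate` is assumed to be any callable that forwards keyword arguments to the pipeline.

```python
from typing import Any, Callable, Dict, Sequence


def sweep_conditioning_scale(
    generate: Callable[..., Any],
    scales: Sequence[float] = (0.6, 0.7, 0.8),
    **pipe_kwargs: Any,
) -> Dict[float, Any]:
    """Call `generate` once per scale and return the results keyed by scale.

    `generate` is a hypothetical wrapper around the pipeline call; every
    extra keyword argument is passed through unchanged on each call.
    """
    results: Dict[float, Any] = {}
    for scale in scales:
        results[scale] = generate(controlnet_conditioning_scale=scale, **pipe_kwargs)
    return results


# Possible wiring with the objects from the usage snippet (assumed to exist):
#
# def generate(**kwargs):
#     # A fresh, fixed-seed generator per call keeps the runs comparable.
#     g = torch.Generator(device="cuda").manual_seed(0)
#     return pipe(prompt, negative_prompt=negative_prompt, image=edge_image,
#                 generator=g, **kwargs).images[0]
#
# images = sweep_conditioning_scale(generate, num_inference_steps=25)
# for scale, img in images.items():
#     img.save(f"test_edge_{scale}.png")
```

Fixing the seed is a design choice for this sketch: it makes differences between outputs attributable to the weight rather than to sampling noise.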
## Training details
Mixed precision: FP16<br/>
Learning rate: 1e-5<br/>
Batch size: 1024<br/>
Noise offset: 0.05<br/>
The model is trained for 37k steps.
The training data includes 12M images from LAION-2B and internal sources with an aesthetic score of 6 or higher, as well as 3M Taobao e-commerce images. During training, the softedge preprocessor is randomly selected from pidinet, hed, pidisafe, and hedsafe, all of which are supported by AUTOMATIC1111/stable-diffusion-webui with Mikubill's ControlNet extension.
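
The random preprocessor selection described above could be sketched as follows. This illustrates the idea and is not the training code: the source only says the preprocessor is "randomly selected", so uniform sampling is an assumption, and `SOFTEDGE_VARIANTS` and `pick_softedge_variant` are hypothetical names. The mapping relies on the `safe` argument shown in the usage snippet, which switches a detector to its "safe" variant.

```python
import random
from typing import Optional, Tuple

# Preprocessor name -> (detector family, value for the `safe` argument).
SOFTEDGE_VARIANTS = {
    "pidinet": ("pidinet", False),
    "pidisafe": ("pidinet", True),
    "hed": ("hed", False),
    "hedsafe": ("hed", True),
}


def pick_softedge_variant(rng: Optional[random.Random] = None) -> Tuple[str, str, bool]:
    """Pick one of the four preprocessors uniformly at random (assumed distribution)."""
    rng = rng or random.Random()
    name = rng.choice(sorted(SOFTEDGE_VARIANTS))
    family, safe = SOFTEDGE_VARIANTS[name]
    return name, family, safe


# Possible wiring with controlnet_aux (detectors assumed to be loaded once):
#
# detectors = {
#     "pidinet": PidiNetDetector.from_pretrained("lllyasviel/Annotators"),
#     "hed": HEDdetector.from_pretrained("lllyasviel/Annotators"),
# }
# name, family, safe = pick_softedge_variant()
# edge_image = detectors[family](image, safe=safe)
```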