|
--- |
|
license: apache-2.0 |
|
base_model: stabilityai/stable-diffusion-xl-base-1.0 |
|
tags: |
|
- stable-diffusion-xl |
|
- stable-diffusion-xl-diffusers |
|
- text-to-image |
|
- diffusers |
|
- controlnet |
|
inference: false |
|
language: |
|
- en |
|
pipeline_tag: text-to-image |
|
--- |
|
|
|
# Softedge ControlNet |
|
EcomXL contains a series of text-to-image diffusion models optimized for e-commerce scenarios, developed based on [Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0).<br/>

The ControlNet weights are fine-tuned from stable-diffusion-xl-base-1.0.

It works well with SDXL as well as community models based on SDXL.

The model is trained on both general-domain data and Taobao e-commerce data, so it performs well in general scenarios as well as e-commerce scenarios.
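Because the ControlNet was fine-tuned against the SDXL base weights, it can also be attached to other SDXL-family checkpoints. A minimal sketch with diffusers (the checkpoint id below is a placeholder for whichever community SDXL model you use; the full pipeline setup is shown in the Usage section):

```python
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline

# EcomXL softedge ControlNet, fine-tuned from stable-diffusion-xl-base-1.0
controlnet = ControlNetModel.from_pretrained(
    "alimama-creative/EcomXL_controlnet_softedge", torch_dtype=torch.float16
)

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "your-org/your-sdxl-community-checkpoint",  # placeholder: any SDXL-based checkpoint
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")
```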
|
|
|
## Examples |
|
These examples were generated using AUTOMATIC1111/stable-diffusion-webui.
|
|
|
|
|
`softedge`|`weight-0.6`|`weight-0.8`
:--:|:--:|:--:
![image](./images/1_0.png) | ![image](./images/1_1.png) | ![image](./images/1_2.png)
![image](./images/2_0.png) | ![image](./images/2_1.png) | ![image](./images/2_2.png)
![image](./images/3_0.png) | ![image](./images/3_1.png) | ![image](./images/3_2.png)
![image](./images/4_0.png) | ![image](./images/4_1.png) | ![image](./images/4_2.png)
|
|
|
|
|
## Usage with Diffusers |
|
```python
import torch
from diffusers import (
    AutoencoderKL,
    ControlNetModel,
    DPMSolverMultistepScheduler,
    StableDiffusionXLControlNetPipeline,
)
from diffusers.utils import load_image
from controlnet_aux import PidiNetDetector, HEDdetector

# Load the EcomXL softedge ControlNet and the fp16-fixed SDXL VAE
controlnet = ControlNetModel.from_pretrained(
    "alimama-creative/EcomXL_controlnet_softedge", torch_dtype=torch.float16, use_safetensors=True
)
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    torch_dtype=torch.float16,
)

pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# pipe.enable_xformers_memory_efficient_attention()
pipe.to("cuda")
pipe.enable_vae_slicing()

# Prepare the soft-edge condition image (HEDdetector can be used as an alternative preprocessor)
image = load_image(
    "https://huggingface.co/alimama-creative/EcomXL_controlnet_softedge/resolve/main/images/1_1.png"
)
edge_processor = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
edge_image = edge_processor(image, safe=False)  # set safe=True to use the pidisafe variant

prompt = "a bottle on the Twilight Grassland, Sitting on the ground, a couple of tall grass sitting in a field of tall grass, sunset,"
negative_prompt = "low quality, bad quality, sketches"

output = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=edge_image,
    num_inference_steps=25,
    controlnet_conditioning_scale=0.6,
    guidance_scale=7,
    width=1024,
    height=1024,
).images[0]

output.save("test_edge.png")
```
|
The model performs well when the ControlNet weight (`controlnet_conditioning_scale`) is in the range 0.6 to 0.8.
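To reproduce a comparison like the one in the Examples section, you can sweep `controlnet_conditioning_scale` with the pipeline from the snippet above. This is a minimal sketch that assumes `pipe`, `edge_image`, `prompt`, and `negative_prompt` are already defined as shown earlier:

```python
# Reuses `pipe`, `edge_image`, `prompt`, and `negative_prompt` from the snippet above.
for scale in (0.6, 0.7, 0.8):
    # Re-seed each run so every weight starts from the same initial noise.
    generator = torch.Generator(device="cuda").manual_seed(42)
    result = pipe(
        prompt,
        negative_prompt=negative_prompt,
        image=edge_image,
        num_inference_steps=25,
        controlnet_conditioning_scale=scale,
        guidance_scale=7,
        width=1024,
        height=1024,
        generator=generator,
    ).images[0]
    result.save(f"test_edge_scale_{scale}.png")
```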
|
|
|
## Training details |
|
|
|
Mixed precision: FP16<br/>
Learning rate: 1e-5<br/>
Batch size: 1024<br/>
Noise offset: 0.05<br/>
Training steps: 37k
|
The training data includes 12M images from LAION-2B and internal sources with an aesthetic score of 6 or higher, as well as 3M Taobao e-commerce images. During training, the softedge preprocessor is randomly selected from pidinet, hed, pidisafe and hedsafe, the variants officially supported by AUTOMATIC1111's sd-webui and Mikubill's sd-webui-controlnet extension. The model performs well when the ControlNet weight is in the range 0.6 to 0.8.
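The random preprocessor selection described above can be approximated with `controlnet_aux`, since its `safe` flag switches pidinet/hed to their pidisafe/hedsafe variants. This is a hedged sketch of that augmentation, not the actual training code:

```python
import random
from controlnet_aux import PidiNetDetector, HEDdetector
from diffusers.utils import load_image

# The four soft-edge variants mentioned above: pidinet, hed, pidisafe, hedsafe.
pidi = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
hed = HEDdetector.from_pretrained("lllyasviel/Annotators")
preprocessors = [
    lambda img: pidi(img, safe=False),  # pidinet
    lambda img: hed(img, safe=False),   # hed
    lambda img: pidi(img, safe=True),   # pidisafe
    lambda img: hed(img, safe=True),    # hedsafe
]

image = load_image(
    "https://huggingface.co/alimama-creative/EcomXL_controlnet_softedge/resolve/main/images/1_1.png"
)
edge_image = random.choice(preprocessors)(image)  # pick one variant at random, as in training
```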
|
|
|
|