
Add Kohya fix to SD pipeline for high resolution generation #7633


Merged
merged 1 commit into from
May 28, 2024

Conversation

sajadn
Contributor

@sajadn sajadn commented Apr 10, 2024

What does this PR do?

Adds Kohya fix to Stable Diffusion pipeline. Fixes #7265.

How to test?

Here is a minimal example to test the pipeline. You can disable the fix by setting with_high_res_fix to False, which passes None to the pipeline as the high_res_fix argument.

import torch
from diffusers import DiffusionPipeline

# Seeded generator so the run is reproducible.
generator = torch.manual_seed(42)

# Set to False to pass high_res_fix=None and disable the fix.
with_high_res_fix = True
high_res_fix = [{'timestep': 600, 'scale_factor': 0.5, 'block_num': 1}]

pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="/home/sajad/Desktop/hugging_face/diffusers/examples/community/kohya_hires_fix.py",
    # custom_pipeline="kohya_hires_fix",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
    high_res_fix=high_res_fix if with_high_res_fix else None,
)
pipe.to("cuda")

prompt = "a dog sitting on the couch"
image = pipe(
    prompt=prompt,
    height=1000,
    width=1600,
    num_inference_steps=50,
    generator=generator,  # pass the seeded generator at call time
).images[0]

image.save(f"{prompt.replace(' ', '_')}_fix={with_high_res_fix}.png")

The high_res_fix argument is a list of dictionaries, each with three keys: timestep, scale_factor, and block_num. For example, you can pass [{'timestep': 900, 'scale_factor': 0.4, 'block_num': 2}, {'timestep': 600, 'scale_factor': 0.5, 'block_num': 1}].
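As a rough illustration of that structure, here is a minimal sketch of a check that every entry carries the three keys. The helper name check_high_res_fix and the REQUIRED_KEYS constant are hypothetical and are not part of the pipeline in this PR; the check only looks at key presence and assumes nothing about valid value ranges.

```python
from typing import Any, Dict, List

# Hypothetical constant: the three keys described in the PR text.
REQUIRED_KEYS = ("timestep", "scale_factor", "block_num")


def check_high_res_fix(entries: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return entries unchanged if every dict has the three expected keys."""
    for index, entry in enumerate(entries):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            raise ValueError(f"entry {index} is missing keys: {missing}")
    return entries


# The two-entry example from the PR description passes the check.
config = check_high_res_fix([
    {"timestep": 900, "scale_factor": 0.4, "block_num": 2},
    {"timestep": 600, "scale_factor": 0.5, "block_num": 1},
])
```

A list checked this way could then be passed as high_res_fix in the same way as in the example above.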

I find that the default value of [{'timestep': 600, 'scale_factor': 0.5, 'block_num': 1}] works well enough, but users can adjust it for their use case.

Here are some examples with a resolution of 1000x1600:
Prompt = "a dog sitting on the couch"
without the fix:
a_dog_sitting_on_the_couch_fix=False

with the fix:
a_dog_sitting_on_the_couch_fix=True

Prompt = "a pig sitting behind the desk"
without the fix:
a_pig_sitting_behind_the_desk_fix=False

with the fix:
a_pig_sitting_behind_the_desk_fix=True

Before submitting

Who can review?

@yiyixuxu @sayakpaul

@srelbo

srelbo commented Apr 10, 2024

Thanks for this PR @sajadn ! This is very helpful.

@yiyixuxu
Collaborator

thanks! can we move this to the community folder?

@sajadn
Contributor Author

sajadn commented Apr 10, 2024

Do you mean moving the new pipeline to the examples/community folder? What about unet_2d_condition_high_res?

@yiyixuxu
Collaborator

it can go into the same file :)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label May 11, 2024
@yiyixuxu yiyixuxu removed the stale Issues that haven't received updates label May 13, 2024
@yiyixuxu
Collaborator

gentle ping :)

@sajadn
Contributor Author

sajadn commented May 27, 2024

Hey,
sorry about the long delay!
I put everything in a single file named kohya_hires_fix.py in the examples/community folder. I also updated the "how to test?" section in the thread above to use custom_pipeline. Let me know if it still needs adjustments.
Cheers!

@yiyixuxu yiyixuxu merged commit 67bef20 into huggingface:main May 28, 2024
8 checks passed
@yiyixuxu
Collaborator

yiyixuxu commented May 28, 2024

thanks so much for the contribution!!
cc @asomoza here for awareness: another tool in our toolbox! let's recommend it whenever you see fit and keep an eye on it for official integration :)

@Depfek6

Depfek6 commented Jun 4, 2024

@sajadn I've run your code, but the fix doesn't seem to work.
Here's the code I'm running:
from diffusers import DiffusionPipeline
import torch

with_high_res_fix = True
high_res_fix = [{'timestep': 600, 'scale_factor': 0.5, 'block_num': 1}]
pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="/content/kohya_hires_fix.py",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
    high_res_fix=high_res_fix if with_high_res_fix else None,
)
pipe.to("cuda")
prompt = "a dog sitting on the couch"
image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=30,
).images[0]

image.save(f"{prompt.replace(' ', '_')}_fix={with_high_res_fix}.jpg")

But here are the results:

a_dog_sitting_on_the_couch_fix=True (1)

Am I doing something wrong?

@asomoza
Member

asomoza commented Jun 7, 2024

I think you're expecting too much from this. Most of the examples I've seen aren't that great, and yours looks like most of them. You'll probably need to keep re-rolling until you get a good seed.
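Re-rolling amounts to repeating the same generation with different seeds and keeping the best result. A minimal sketch, assuming a hypothetical helper named reroll and a user-supplied generate(seed) callable; the commented wiring to the pipeline is illustrative only and is not run here.

```python
from typing import Any, Callable, Iterable, List, Tuple


def reroll(generate: Callable[[int], Any], seeds: Iterable[int]) -> List[Tuple[int, Any]]:
    """Run generate(seed) once per seed and return (seed, result) pairs."""
    return [(seed, generate(seed)) for seed in seeds]


# Hypothetical wiring to the pipeline from the comment above (not run here):
#
#   def generate(seed):
#       g = torch.Generator("cpu").manual_seed(seed)
#       return pipe(prompt=prompt, height=1024, width=1024,
#                   num_inference_steps=30, generator=g).images[0]
#
#   for seed, image in reroll(generate, range(4)):
#       image.save(f"seed_{seed}.jpg")
```

Keeping the seed next to each result makes it possible to regenerate a good image later with the same settings.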

@sajadn
Contributor Author

sajadn commented Jun 11, 2024

@Depfek6 I'd say play with the parameters. Decrease the scale_factor further, to something like 0.25, or decrease the timestep to 500. You should be able to reduce the number of dog heads to one :D
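One way to organize that experimentation is to enumerate candidate settings up front. A minimal sketch, assuming a hypothetical helper named sweep_high_res_fix; it only builds the candidate lists and does not run the pipeline.

```python
from itertools import product
from typing import Dict, List, Sequence, Union

Number = Union[int, float]


def sweep_high_res_fix(
    timesteps: Sequence[int],
    scale_factors: Sequence[float],
    block_num: int = 1,
) -> List[List[Dict[str, Number]]]:
    """Build one single-entry high_res_fix list per (timestep, scale_factor) pair."""
    return [
        [{"timestep": t, "scale_factor": s, "block_num": block_num}]
        for t, s in product(timesteps, scale_factors)
    ]


# The default values plus the lower values suggested in the comment above.
candidates = sweep_high_res_fix(timesteps=[600, 500], scale_factors=[0.5, 0.25])
for candidate in candidates:
    print(candidate)
```

In the examples in this thread, high_res_fix is passed to from_pretrained, so trying a candidate presumably means loading the pipeline again with that value.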

sayakpaul pushed a commit that referenced this pull request Dec 23, 2024
Successfully merging this pull request may close these issues.

Kohya Hires fix
6 participants