Adaptive Projected Guidance #9626


Closed · 3 commits

Conversation

@hlky (Contributor) commented Oct 9, 2024

What does this PR do?

This PR implements APG (Adaptive Projected Guidance) from Algorithm 1 in Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models.

Algorithm 1 is slightly modified to fold `project` into `normalized_guidance`; this simply reduces the number of methods that need to be copied between pipelines.

APG is added to StableDiffusionPipeline and StableDiffusionXLPipeline. The following parameters are introduced:

adaptive_projected_guidance (`bool`, *optional*):
    Use adaptive projected guidance from [Eliminating Oversaturation and Artifacts of High Guidance Scales
    in Diffusion Models](https://arxiv.org/pdf/2410.02416)
adaptive_projected_guidance_momentum (`float`, *optional*, defaults to `-0.5`):
    Momentum to use with adaptive projected guidance. Use `None` to disable momentum.
adaptive_projected_guidance_rescale_factor (`float`, *optional*, defaults to `15.0`):
    Rescale factor to use with adaptive projected guidance.

Default values are taken from the Stable Diffusion XL row of Table 10. The existing `eta` parameter is reused rather than adding a new one; per its docstring, `eta` is only used by `DDIMScheduler`, so this should be fine.
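To make the combined method concrete, here is a NumPy sketch of Algorithm 1 with `project` folded into `normalized_guidance`, wired to the three new parameters (momentum, rescale factor, and `eta` for the parallel component). This is an illustrative reconstruction, not the PR's actual diffusers code; shapes assume a `(batch, channels, height, width)` latent.

```python
import numpy as np

class MomentumBuffer:
    """Momentum-weighted running average of the guidance update
    (the `adaptive_projected_guidance_momentum` parameter, default -0.5)."""
    def __init__(self, momentum: float):
        self.momentum = momentum
        self.running_average = 0.0

    def update(self, value: np.ndarray) -> None:
        self.running_average = value + self.momentum * self.running_average

def normalized_guidance(pred_cond, pred_uncond, guidance_scale,
                        momentum_buffer=None, eta=1.0, norm_threshold=0.0):
    """Sketch of APG with `project` folded in. `norm_threshold` plays the role
    of `adaptive_projected_guidance_rescale_factor`."""
    diff = pred_cond - pred_uncond
    if momentum_buffer is not None:
        momentum_buffer.update(diff)
        diff = momentum_buffer.running_average
    if norm_threshold > 0:
        # rescale: clip the per-sample update norm to the rescale factor
        diff_norm = np.linalg.norm(diff.reshape(diff.shape[0], -1), axis=-1)
        scale = np.minimum(1.0, norm_threshold / diff_norm)
        diff = diff * scale.reshape(-1, 1, 1, 1)
    # folded-in `project`: split diff into components parallel and
    # orthogonal to the conditional prediction
    unit = pred_cond / np.linalg.norm(
        pred_cond.reshape(pred_cond.shape[0], -1), axis=-1
    ).reshape(-1, 1, 1, 1)
    parallel = np.sum(diff * unit, axis=(-1, -2, -3), keepdims=True) * unit
    orthogonal = diff - parallel
    update = orthogonal + eta * parallel
    return pred_cond + (guidance_scale - 1) * update
```

With `eta=1.0`, no momentum, and no rescaling, this reduces exactly to standard CFG, which is a useful sanity check.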

Fixes #9585

Example usage:

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    variant="fp16",
    torch_dtype=torch.float16,
)
pipe.safety_checker = None
pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()
prompt = "A 4k dslr photo of a raccoon wearing an astronaut helmet, photorealistic."

# Baseline: plain CFG at a high guidance scale
generator = torch.Generator().manual_seed(694208027600)
image = pipe(prompt, guidance_scale=15, generator=generator).images[0]
image

# APG with the paper's SDXL defaults (Table 10)
generator = torch.Generator().manual_seed(694208027600)
image = pipe(
    prompt,
    guidance_scale=15,
    adaptive_projected_guidance=True,
    adaptive_projected_guidance_momentum=-0.5,
    adaptive_projected_guidance_rescale_factor=15.0,
    generator=generator,
).images[0]
image
```

CFG: [output image]

APG: [output image]

There's certainly some improvement; however, further testing would be beneficial to confirm the findings in the paper.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

cc @yiyixuxu @asomoza

@Msadat97 commented Oct 9, 2024

Thanks for your interest in our work! Please note that we always convert the output of the model to the denoised predictions (pred_x0) and compute the guidance there. We found that APG performs better when applied to the denoised predictions. We also have a discussion on this step in Section 5.2 (Figure 12).

Ideally, APG should be implemented like this:

```python
x0_pred_text = get_x0_from_noise(noise_pred_text, latents, t)
x0_pred_uncond = get_x0_from_noise(noise_pred_uncond, latents, t)
x0_guided = normalized_guidance(...)
noise_pred = get_noise_from_x0(x0_guided, latents, t)
```

(A better solution would be having a flag that allows the users to choose whether APG should be applied to the model output or the denoised prediction)
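For one common parameterization, the two conversion helpers in the pseudocode above can be sketched as follows. This assumes an epsilon-prediction model with a DDPM/DDIM-style `alphas_cumprod` schedule; the function names mirror the pseudocode but are hypothetical, not existing diffusers methods.

```python
import numpy as np

def get_x0_from_noise(noise_pred, latents, alpha_bar_t):
    # x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps, solved for x0
    return (latents - np.sqrt(1.0 - alpha_bar_t) * noise_pred) / np.sqrt(alpha_bar_t)

def get_noise_from_x0(x0, latents, alpha_bar_t):
    # the inverse map: recover eps from x_t and the (guided) x0
    return (latents - np.sqrt(alpha_bar_t) * x0) / np.sqrt(1.0 - alpha_bar_t)
```

The two maps are exact inverses, so applying guidance in x0-space and converting back leaves an unguided prediction unchanged.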

@yiyixuxu (Collaborator) commented Oct 9, 2024

hi @hlky

Thanks for the PR! I'm a bit reluctant to add this to SD and SDXL, as these pipelines are already getting bloated and can become overwhelming for newcomers, especially given that this is not the only CFG alternative and won't be the last one.

@apolinario has suggested ideas to make guidance a separate "component" that you can swap out just like schedulers - I'm happy to explore that now! I will draft a PR soon, and we can work together and experiment with different ideas from there! This may also fit better in an experimental project that we are working on to make a composable pipeline that targets the community and company users and allows them to mix and match different features without writing much code. So, we will see!
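The "swappable component" idea could look something like the following minimal sketch. This is purely illustrative of the shape of such an interface, assuming nothing about the eventual diffusers design; `Guider` and `CFGGuider` are hypothetical names.

```python
from typing import Protocol
import numpy as np

class Guider(Protocol):
    """Illustrative only: a guidance strategy swappable like a scheduler."""
    def __call__(self, pred_cond: np.ndarray, pred_uncond: np.ndarray,
                 guidance_scale: float) -> np.ndarray: ...

class CFGGuider:
    # plain classifier-free guidance as one interchangeable strategy;
    # an APG guider would implement the same signature
    def __call__(self, pred_cond, pred_uncond, guidance_scale):
        return pred_uncond + guidance_scale * (pred_cond - pred_uncond)
```

A pipeline could then accept any `Guider` at call time instead of hard-coding each CFG alternative as new parameters.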

@hlky (Contributor, Author) commented Oct 9, 2024

@Msadat97 Thanks! I missed that section. Applying the guidance to the denoised predictions does indeed produce much better results:
[APG result image]

@yiyixuxu After fixing this locally to use denoised predictions, I'd have to agree it needs a little more than what's present here; specifically, we'd need some changes to the schedulers to allow easy conversion between noised and denoised predictions. Making guidance a separate component sounds like a great idea; I hope to see that soon, and I'd be happy to work with you on it and on any scheduler changes.

@xziayro commented Oct 11, 2024


@hlky Can you share/commit the change that led to this improvement? Thanks a lot.

@hlky force-pushed the adaptive-projected-guidance branch from a528e6c to c7e62c4 on October 11, 2024 09:31
@hlky (Contributor, Author) commented Oct 11, 2024

Certainly, I've pushed those changes. However, please note this will only work with some schedulers, like Euler. While it does run with 2nd-order schedulers like Heun and DPM2, it's not 100% correct, as the sigma used is wrong for the 2nd-order step, and it won't work at all for schedulers like DDIM. The issue linked above aims to add methods to handle this for each scheduler.
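For the Euler-style case described above, the noised/denoised conversion can be sketched roughly like this, assuming a sigma-parameterized scheduler with an epsilon-prediction model (`x_t = x0 + sigma * eps`). The helper names are hypothetical; for 2nd-order schedulers the sigma at the intermediate step differs, which is why this shortcut is not exact there.

```python
import numpy as np

def to_denoised(sample, noise_pred, sigma):
    # x0 = x_t - sigma * eps for a sigma-parameterized Euler-style scheduler
    return sample - sigma * noise_pred

def to_noise_pred(sample, x0, sigma):
    # inverse: recover the model-output (epsilon) form after guiding x0
    return (sample - x0) / sigma
```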


@yiyixuxu (Collaborator) commented Nov 6, 2024

Hi @hlky, just so you know, I made a Guider class in #9672, and this is the use case I have in mind to try next.


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions bot added the stale label on Nov 30, 2024
@a-r-r-o-w added the wip and consider-for-modular-diffusers labels and removed the stale label on Nov 30, 2024
@hlky hlky mentioned this pull request Dec 10, 2024
@hlky hlky closed this Apr 15, 2025
@hlky hlky deleted the adaptive-projected-guidance branch April 15, 2025 12:30