-
Notifications
You must be signed in to change notification settings - Fork 6k
fix DPM Scheduler with use_karras_sigmas
option
#6477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cc @LuChengTHU here for his insights :) |
@@ -267,7 +267,7 @@ def set_timesteps(self, num_inference_steps: int = None, device: Union[str, torc | |||
sigmas = np.flip(sigmas).copy() | |||
sigmas = self._convert_to_karras(in_sigmas=sigmas, num_inference_steps=num_inference_steps) | |||
timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas]).round() | |||
sigmas = np.concatenate([sigmas, sigmas[-1:]]).astype(np.float32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a reason to set the sigmas this way? or is it just a mistake that we didn't catch in the PR?
By repeating the last sigma twice, we make the very last step a dummy step with zero step size
i.e. this code here is x_t= sample
because we have sigma_t = sigma_s0
and h=0
x_t = (
(sigma_t / sigma_s0) * sample
- (alpha_t * (torch.exp(-h) - 1.0)) * D0
- 0.5 * (alpha_t * (torch.exp(-h) - 1.0)) * D1
)
if self.config.solver_type == "midpoint": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! Was this the culprit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes this was it!
I used this test script to compare against k-diffusion. I had to hardcode the k-sampler in order to force identical noise construction. I made comments where I updated code. the results look the same quality to me - are they? I might need a little bit of help from more trained eyes @ivanprado import torch
from diffusers import StableDiffusionXLKDiffusionPipeline, AutoPipelineForText2Image
from diffusers.schedulers import DPMSolverMultistepScheduler
from diffusers.utils.torch_utils import randn_tensor
from tqdm.auto import trange, tqdm
def default_noise_sampler(x):
return lambda generator: randn_tensor(x.shape, dtype=x.dtype, device=x.device, generator=generator)
@torch.no_grad()
def sample_dpmpp_2m_sde(model, x, sigmas, extra_args=None, callback=None, disable=None, eta=1., s_noise=1., noise_sampler=None, solver_type='midpoint'):
"""DPM-Solver++(2M) SDE."""
if solver_type not in {'heun', 'midpoint'}:
raise ValueError('solver_type must be \'heun\' or \'midpoint\'')
sigma_min, sigma_max = sigmas[sigmas > 0].min(), sigmas.max()
# yiyi edit: use custom noiser_sampler to match diffusers
noise_sampler = default_noise_sampler(x)
extra_args = {} if extra_args is None else extra_args
s_in = x.new_ones([x.shape[0]])
old_denoised = None
h_last = None
for i in trange(len(sigmas) - 1, disable=disable):
denoised = model(x, sigmas[i] * s_in, **extra_args)
if callback is not None:
callback({'x': x, 'i': i, 'sigma': sigmas[i], 'sigma_hat': sigmas[i], 'denoised': denoised})
if sigmas[i + 1] == 0:
# Denoising step
x = denoised
else:
# DPM-Solver++(2M) SDE
t, s = -sigmas[i].log(), -sigmas[i + 1].log()
h = s - t
eta_h = eta * h
x = sigmas[i + 1] / sigmas[i] * (-eta_h).exp() * x + (-h - eta_h).expm1().neg() * denoised
if old_denoised is not None:
r = h_last / h
if solver_type == 'heun':
x = x + ((-h - eta_h).expm1().neg() / (-h - eta_h) + 1) * (1 / r) * (denoised - old_denoised)
elif solver_type == 'midpoint':
x = x + 0.5 * (-h - eta_h).expm1().neg() * (1 / r) * (denoised - old_denoised)
if eta:
# yiyi edit: use custom noise_sampler to match diffusers
noise = noise_sampler(generator)
x = x + noise * sigmas[i + 1] * (-2 * eta_h).expm1().neg().sqrt() * s_noise
old_denoised = denoised
h_last = h
return x
seed = 42
model_id = "frankjoshua/juggernautXL_version6Rundiffusion"
prompt = "Adorable infant playing with a variety of colorful rattle toys."
# k-diffusion pipeline
pipe = StableDiffusionXLKDiffusionPipeline.from_pretrained(
model_id,
use_safetensors=True,
)
pipe.enable_model_cpu_offload()
pipe.sampler = sample_dpmpp_2m_sde
generator = torch.Generator(device="cuda").manual_seed(seed)
image = pipe(
prompt,
generator=generator,
guidance_scale=3.0,
num_inference_steps=25,
use_karras_sigmas=True,
height=768,
width=1344,
).images[0]
image.save("yiyi_test_out_k.png")
del pipe
torch.cuda.empty_cache()
# diffusers pipeline
pipe = AutoPipelineForText2Image.from_pretrained(
model_id,
add_watermarker = False,
use_safetensors=True,
)
pipe.enable_model_cpu_offload()
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
pipe.scheduler.config,
algorithm_type="sde-dpmsolver++",
use_karras_sigmas=True,
)
generator = torch.Generator(device="cuda").manual_seed(seed)
results = pipe(
prompt=prompt,
guidance_scale=3.,
generator=generator,
num_inference_steps=25,
height=768,
width=1344,)
results.images[0].save("yiyi_test_out_d.png") |
Thank you @yiyixuxu !! The results are now really good in my eyes. Thank you! But also the Code to reproduce the issue with
![]() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job fixing the bug! Would be great to also apply the fix to scheduling_dpmsolver_singlestep.py
I came across this issue crowsonkb/k-diffusion#43 (comment) on k-diffusion and found out skipping the last denoising step is a trick can people use for low step counts - do you have any memory that if this is the reason we did similarly, in Auto1111 they have an option for you to skip second last step. I'm going to run some testing on my end for sd1.5 and sdxl. if this is not a bug then maybe we should add a new config instead .... |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
Good catch - I'm not 100% sure actually why we added |
Never mind, it wasn't me!!!🙈 - even though I was the one who added if isinstance(timestep, torch.Tensor):
timestep = timestep.to(self.timesteps.device)
step_index = (self.timesteps == timestep).nonzero()
if len(step_index) == 0:
step_index = len(self.timesteps) - 1
else:
step_index = step_index.item()
prev_timestep = 0 if step_index == len(self.timesteps) - 1 else self.timesteps[step_index + 1] I'm still pretty sure it wasn't on purpose. However, I think it is ok to leave it as default mainly because:
|
src/diffusers/schedulers/scheduling_dpmsolver_multistep_inverse.py
Outdated
Show resolved
Hide resolved
@@ -164,6 +164,7 @@ def __init__( | |||
lower_order_final: bool = True, | |||
euler_at_final: bool = False, | |||
use_karras_sigmas: Optional[bool] = False, | |||
final_sigmas_type: Optional[str] = "default", # "denoise_to_zero", "default" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are both options necessary? Each has an advantage in certain scenarios?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
denoise_to_zero
matches the last step in k-diffusion. We can apply this with or without the karras sigmas. see some of my comparison results here fix DPM Scheduler withuse_karras_sigmas
option #6477 (comment)default
is what we currently have:- in my experiment, it was worse than the other options we just added. But obviously, I did not test for a lot of different settings, and I believe it will do better with higher step numbers
- I want to point out that there is some inconsistency between the current behavior of
use_karras_sigmas = True
anduse_karras_sigmas = False
: with karras_sigmas, we are skipping the last denoising step. I'm not entirely sure why it is done this way, so I'm not comfortable just updating it. I'm hoping to address that in a new PR so we don't delay this one getting merged in
current code for use_karras_sigmas
sigmas = np.flip(sigmas).copy()
sigmas = self._convert_to_karras(in_sigmas=sigmas, num_inference_steps=num_inference_steps)
timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas]).round()
sigmas = np.concatenate([sigmas, sigmas[-1:]]).astype(np.float32)
I think it should be
sigmas = np.flip(sigmas).copy()
sigmas = self._convert_to_karras(in_sigmas=sigmas, num_inference_steps=(num_inference_steps+1))
timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas]).round()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we should change the default to "denoise_to_zero"
here though if you think it always gives better results. Also cc @sayakpaul
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can do that! I will deprecate the dpmsolver
and dpmsolver-sde
since they won't work with the denoise_to_zero
and I think they are not really used at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok for me!
Co-authored-by: Patrick von Platen <[email protected]>
@@ -189,6 +190,11 @@ def __init__( | |||
else: | |||
raise NotImplementedError(f"{solver_type} does is not implemented for {self.__class__}") | |||
|
|||
if algorithm_type != "dpmsolver++" and final_sigmas_type == "denoise_to_zero": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to support sde-dpmsolver++
in DPMSolverSinglestepScheduler
. Is one of the best combinations in k-diffusers, and is missing in diffusers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree. +1 for sde-dpmsolver++
@@ -0,0 +1,842 @@ | |||
# Copyright 2023 TSAIL Team and The HuggingFace Team. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it makes sense to create a whole new "legacy" scheduler class here. I'd advocate for just deprecating it in the original file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok! will do that instead. I was just kinda eager to get rid of the code sooner 😛
# Copyright 2023 TSAIL Team and The HuggingFace Team. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it makes sense to create a whole new "legacy" scheduler class here. I'd advocate for just deprecating it in the original file
if algorithm_type == "deis": | ||
self.register_to_config(algorithm_type="dpmsolver++") | ||
elif algorithm_type in ["dpmsolver", "sde-dpmsolver"]: | ||
raise ValueError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just throw a deprecation warning directly here instead of a ValueError?
Algorithm type for the solver; can be `dpmsolver`, `dpmsolver++`, `sde-dpmsolver` or `sde-dpmsolver++`. The | ||
`dpmsolver` type implements the algorithms in the [DPMSolver](https://huggingface.co/papers/2206.00927) | ||
paper, and the `dpmsolver++` type implements the algorithms in the | ||
Algorithm type for the solver; can be `dpmsolver++` or `sde-dpmsolver++`. The `dpmsolver++` type implements the algorithms in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sde-dpmsolver++
is not supported whereas the doc states it is supported. I guess this documentation should be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes will update the doc now. I Will try to add the support for sde-dpmsolver++ soon too. It's on my to-do for a long time now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job!
Nailed it. |
Yay 🫶 |
* fix --------- Co-authored-by: yiyixuxu <yixu310@gmail,com> Co-authored-by: Patrick von Platen <[email protected]>
This PR:
dpmsolver
andsde-dpmsolver
algorithms from DPM schedulersfinal_sigmas_types
:final_sigmas_types = "sigma_min"
we use the min value as the last sigma, and this is the current algorithm;final_sigmas_type ="zero,"
it denoise tosigma=0
in the last step - this is newly introduced in this PR and matches k-diffusion implementation. Atnum_inference_step=25
, it achieves significantly better results for SDXL across various configurations we tested (see testing script below)final_sigmas_type = "zero"
fix #6295
testing script
outputs
from left to right:
final_sigmas_type="sigma_min"
)final_sigmas_type="sigma_min"
+euler_at_final=True
final_sigmas_type="zero"
)DPM++2M
DPM++2M Karras
DPM++2M SDE
DPM++2M Karras SDE
DPM++
DPM++ Karras