[SD-XL] Add inpainting #4098
Conversation
The documentation is not available anymore as the PR was closed or merged.
Hi, is there already an SD-XL checkpoint whose UNet has 9 input channels? It seems no dedicated inpainting model was released for SD-XL, and without one the inpainting results are meaningless (I mean there is little to no semantic match between the inpainted regions and the existing regions in generated images).
Is this the same method as
And yes, I agree with @gkorepanov, the "InpaintLegacy" method is more or less useless.
In that sense there will only be one "true"
As the inpainting checkpoint isn't there, does it affect quality in general?
Works pretty well for me for now; I recommend making sure to pass … I think the checkpoint will, however, have problems when you want to replace the masked area with something very different from what was there before.
```python
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

# Load the SD-XL refiner checkpoint for img2img
pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-0.9", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
)
```
Is it really intended to use the refiner model for general img2img? I've been trying to understand this - I've also seen it here for example - but I think I am missing something. My understanding is that the refiner model is intended as a kind of de-noising and/or fidelity-increasing step and it isn't good at generating the kind of baseline content of the image. If that's correct, feels like it'd perform poorly for img2img with lower strength values.
You can use both! The refiner might be better suited for images that already look like the prompt, which is the case here. We should maybe improve the docs after the official release.
Gotcha - thanks for clarifying!
```diff
@@ -981,8 +981,6 @@ def __call__(
             generator,
             do_classifier_free_guidance,
         )
-        init_image = init_image.to(device=device, dtype=masked_image_latents.dtype)
```
This is actually never used and was a copy-paste bug, I think.
* Add more
* more
* up
* Get ensemble of expert denoisers working
* Fix code
* add tests
* up
SD-XL inpainting
This PR solves #4080 and is ready for review.
Inpainting works well for both the vanilla case and the "Ensemble of Expert Denoisers" case.
You can try the following to see for yourself:
Vanilla inpainting:
Ensemble of Expert Denoisers, which should give slightly better quality: