add PAG support #7944
@asomoza can you test it out?
cc @HyoungwonCho for awareness
@yiyixuxu @asomoza Hello, I was impressed by the various experiments you conducted using PAG! Since the guidance framework of PAG itself is simple, it seems quite possible to use it in conjunction with other modules like the IP-Adapter you mentioned. However, we have not yet implemented and experimented with it directly, so we have not confirmed whether there is a significant performance improvement when used together. If possible, we will conduct additional experiments in the future. Thank you for your interest in our research.
Thank you for the great work!
  File ".../.env/lib/python3.11/site-packages/diffusers/models/controlnet.py", line 798, in forward
    sample = sample + controlnet_cond
             ~~~~~~~^~~~~~~~~~~~~~~~~
RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 0
I solved it by adding a new parameter
    if do_classifier_free_guidance and do_perturbed_attention_guidance and not guess_mode:
        image = torch.cat([image] * 3)
    elif do_classifier_free_guidance and not guess_mode:
        image = torch.cat([image] * 2)
    elif do_perturbed_attention_guidance and not guess_mode:
        image = torch.cat([image] * 2)
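The rule behind this fix can be sketched in isolation: the ControlNet conditioning image needs one copy per chunk of the batched latents, and each enabled guidance (CFG, PAG) adds one chunk. The helper name `controlnet_cond_copies` is hypothetical and not part of diffusers; it only mirrors the branch logic of the snippet above, under the assumption that guess mode never duplicates the image.

```python
def controlnet_cond_copies(do_cfg: bool, do_pag: bool, guess_mode: bool = False) -> int:
    """Return how many times the conditioning image should be repeated.

    Hypothetical helper that mirrors the branches in the comment above:
    no duplication in guess mode, otherwise one extra copy per enabled guidance.
    """
    if guess_mode:
        return 1
    return 1 + int(do_cfg) + int(do_pag)


if __name__ == "__main__":
    # CFG and PAG together need three copies, matching the size-3 tensor in the traceback.
    print(controlnet_cond_copies(do_cfg=True, do_pag=True))
```

With both guidances enabled the helper returns 3, which is the batch multiplier the unpatched code failed to match when it produced only 2 copies.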
@KKIEEK
Just leaving a brief report of my findings with PAG and Diffusers (I already had it integrated in my pipelines before this PR):
@jorgemcgomes thanks!
Hello. I'm an author of PAG. Thank you for your insightful opinions and cool implementation. Is there anything currently in progress? We are excited to see that PAG is gaining popularity within the community and being utilized in various workflows. Especially in ComfyUI, PAG nodes are used in diverse workflows. (Some workflows using PAG in ComfyUI: However, in Diffusers, it seems somewhat challenging to try creative combinations as the pipelines are separated. Therefore, the MixIn approach taken in this PR appears to be a very effective solution. However, it seems a bit awkward to call Additionally, since there are many users who want compatibility with IP-adapter, now I have time and would like to work on making it compatible with IPAdapter. I'm curious if there's any related progress about component design or IP-adapter compatibility. Thank you!
@sunovivid thanks for the message! for IP-adapter, it will be super cool if we can make it work! I'm not aware of any related progress so would really appreciate if you are able to find time to work on this! maybe we can just pick one of the pipelines from this PR (with the mixin) and make it work with
This guide will show you how to use PAG for various tasks and use cases.
Maybe include a mention of PAG pipeline page here as well?
Thank you very much for your efforts in integrating PAG with diffusers. Despite the enormous workload, we are very grateful that you have finally completed it! This is truly an amazing job. Concerns about propagating the changes have also been somewhat resolved. It seems that other pipelines are being managed quite well. We plan to see how PAG works in a DiT-based architecture. Additionally, we intend to try several new perturbations. There have been some attempts in the community regarding this, but there doesn't seem to be a clear conclusion yet. This is a very exciting direction!
it is pretty much ready to merge now - do you want to take a look at the doc? feel free to refactor later too
LGTM, feel free to merge! 👍
Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: Sayak Paul <[email protected]> Co-authored-by: Steven Liu <[email protected]>
@yiyixuxu Hi, why did PAG not work when I executed the first piece of code in the latest diffusers?
@peki12345 I think it is because you run diffusers
I got it, thanks.
* first draft
---------
Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Junhwa Song <[email protected]>
Co-authored-by: Ahn Donghoon (안동훈 / suno) <[email protected]>
Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
Notes on implementation
separate pipeline class
created a separate pipeline group for PAG so that we are able to support it (and many more such features in the future) while keeping our SD and SDXL pipelines lightweight for the research community
PAGMixin
PAGMixin abstracts away all PAG-related logic so that we are able to keep the PAG pipeline structure consistent with the rest of the pipelines. It makes the pipelines easier to read, and also easier to integrate and maintain.
AutoPipeline API
With the AutoPipeline API, pass enable_pag=True to automatically create a pipeline with PAG enabled, based on the task you specified and the checkpoint you provided. Under the hood, it creates the corresponding PAG pipeline.
The from_pipe API also works, and works intuitively (I hope).
pag_applied_layers
Set pag_applied_layers when you create the pipeline.
Use set_pag_applied_layers to update these layers after the pipeline has been created.
The argument to set_pag_applied_layers is either a single string or a list of strings. You can specify:
a whole block type: "down", "mid", "up"
a specific block: "down.block_0", "up.block_1"
a specific attention layer: "down.block_0.attentions_0"
other notes:
prepare_ip_adapter_image_embeds was changed a little bit so that we duplicate inputs for CFG only once, at the end; that's why a lot of the files got changed. You only need to look at the pag folder and the auto_pipeline.py file under the pipelines folder when reviewing this PR.
Usage Examples
SDXL + PAG
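A minimal sketch of this case, assuming a diffusers release that includes PAG support, a CUDA GPU, and network access to download the checkpoint. The function name `run_sdxl_pag`, the checkpoint id, the choice of "mid" for pag_applied_layers, and the guidance values are illustrative assumptions, not values taken from this PR.

```python
def run_sdxl_pag(prompt: str, pag_scale: float = 3.0):
    """Generate one image with SDXL and PAG enabled.

    Hypothetical wrapper for illustration. Heavy dependencies are imported
    lazily so that defining the function does not require them.
    """
    import torch
    from diffusers import AutoPipelineForText2Image

    # enable_pag=True asks the AutoPipeline to build the corresponding PAG pipeline.
    pipeline = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",  # assumed checkpoint
        enable_pag=True,
        pag_applied_layers=["mid"],  # assumed layer choice
        torch_dtype=torch.float16,
    )
    pipeline.to("cuda")

    # pag_scale controls the strength of perturbed-attention guidance;
    # guidance_scale is the usual CFG scale (value assumed here).
    result = pipeline(prompt, guidance_scale=7.0, pag_scale=pag_scale)
    return result.images[0]
```

Calling `run_sdxl_pag("an insect robot preparing a delicious meal")` would return a PIL image under the assumptions above. If PAG appears to have no effect, checking the installed diffusers version is a reasonable first step.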
SDXL + PAG + IP-Adapter
works with ip-adapter now thanks to @sunovivid
SDXL Inpainting + PAG
SDXL + ControlNet + PAG
SDXL Img2img + PAG
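For the pag_applied_layers strings listed in the implementation notes ("down", "up.block_1", "down.block_0.attentions_0"), the three levels of granularity can be illustrated with a small parser. This is a sketch for reading the string format only: `PagLayerSpec` and `parse_pag_layer` are hypothetical names, and this is not how the PR resolves layers internally.

```python
import re
from typing import NamedTuple, Optional


class PagLayerSpec(NamedTuple):
    """Parsed form of one pag_applied_layers entry (hypothetical type)."""
    block_type: str                 # "down", "mid", or "up"
    block_index: Optional[int]      # None means every block of that type
    attention_index: Optional[int]  # None means every attention layer in the block


def _parse_index(token: str, prefix: str) -> int:
    """Extract N from a token such as 'block_N' or 'attentions_N'."""
    match = re.fullmatch(rf"{prefix}_(\d+)", token)
    if match is None:
        raise ValueError(f"expected '{prefix}_<int>', got '{token}'")
    return int(match.group(1))


def parse_pag_layer(spec: str) -> PagLayerSpec:
    """Split a layer string into block type, block index, and attention index."""
    parts = spec.split(".")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"too many components in '{spec}'")
    if parts[0] not in ("down", "mid", "up"):
        raise ValueError(f"unknown block type '{parts[0]}'")
    block_index = _parse_index(parts[1], "block") if len(parts) > 1 else None
    attention_index = _parse_index(parts[2], "attentions") if len(parts) > 2 else None
    return PagLayerSpec(parts[0], block_index, attention_index)


if __name__ == "__main__":
    for layer in ["down", "up.block_1", "down.block_0.attentions_0"]:
        print(layer, "->", parse_pag_layer(layer))
```

A shorter string selects a broader set of layers: "down" leaves both indices unset, while "down.block_0.attentions_0" pins a single attention layer.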