Improve FluxPipeline checks and logging #9064
base: main
Conversation
…d in `self.check_inputs` and `self.prepare_latents`)
@@ -373,7 +373,6 @@ def forward(
        )
        encoder_hidden_states = self.context_embedder(encoder_hidden_states)

-       print(f"{txt_ids.shape=}, {img_ids.shape=}")
I was actually fixing it here #9057. But okay.
Understood, we can ignore this then.
Thank you. I have left some comments. LMK what you think.
                "Unpacked latents detected. These will be automatically packed. "
                "In the future, please provide packed latents to improve performance."
            )
            latents = self._pack_latents(latents, batch_size, num_channels_latents, height, width)
Packing the latents is an inexpensive operation. So, I think this warning is unnecessary.
Agreed, the warning can be removed. Note also that I moved the definitions of `height` and `width` for the case where no latents are provided, as otherwise we would be giving the wrong dimensions to `self._prepare_latent_image_ids`.
            This pipeline expects `latents` to be in a packed format. If you're providing
            custom latents, make sure to use the `_pack_latents` method to prepare them.
            Packed latents should be a 3D tensor of shape (batch_size, num_patches, channels).
I think this note is unnecessary as well, given that `_pack_latents()` is an inexpensive operation. I think your updates to `check_inputs()` should do the trick.
Agreed, maybe then I can add this note as part of the `latents` docstring in the `__call__`.
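For concreteness, one possible shape for such a note (illustrative wording only, not the final text; the shape description follows the snippet quoted above):

# Hypothetical `latents` entry for the `__call__` docstring (wording is a sketch, not final):
LATENTS_DOC_NOTE = (
    "latents (`torch.Tensor`, *optional*): Pre-generated latents to be used as inputs for "
    "generation. Must already be packed with `_pack_latents`, i.e. a 3D tensor of shape "
    "(batch_size, num_patches, channels)."
)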
        if not isinstance(latents, torch.Tensor):
            raise ValueError(f"`latents` has to be of type `torch.Tensor` but is {type(latents)}")

        if not _are_latents_packed(latents):
But later in `prepare_latents()`, we are packing the latents and throwing a warning, no? So, I think we can remove this check.
Same, this check can be removed from the `check_inputs` function.
        if not self._are_latents_packed(latents):
            raise ValueError(
                "Latents are not in the correct packed format. Please use `_pack_latents` to prepare them."
            )
This can be removed too, since essentially `check_inputs()` should catch these.
Yes, the `check_inputs` function should catch these, so this can be removed. Then there is no need for the `_are_latents_packed` method, so that can also be removed.
@PDillis thank you for being so receptive to my comments. Let me know once you'd like another round of reviews.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
…cstring for shape of latents.
@sayakpaul Thank you, I've made the suggested changes. Let me know what you think.
        if latents.ndim == 4:
            # Packing the latents to be of shape (batch_size, num_patches, channels)
            latents = self._pack_latents(latents, batch_size, num_channels_latents, height, width)
Is this check necessary? Won't `check_inputs` raise an error if an `ndim == 4` latent tensor is passed in? Also, `_prepare_latent_image_ids` divides the height and width by 2 before creating the image ids; if the height isn't scaled up before being passed into this method, you will get an incorrect shape.
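To make the scaling point concrete, here is a standalone sketch (not the actual `_prepare_latent_image_ids` implementation; only the grid layout matters): the ids are laid out on a `(height // 2, width // 2)` grid, so `height` and `width` must already be the upscaled latent dimensions, or the number of ids will not match the number of packed patches.

import torch

# Standalone illustration of the id grid (not the diffusers method):
def latent_image_ids_sketch(height: int, width: int) -> torch.Tensor:
    ids = torch.zeros(height // 2, width // 2, 3)
    ids[..., 1] = torch.arange(height // 2)[:, None]  # row index per 2x2 patch
    ids[..., 2] = torch.arange(width // 2)[None, :]   # column index per 2x2 patch
    return ids.reshape((height // 2) * (width // 2), 3)

print(latent_image_ids_sketch(128, 128).shape)  # torch.Size([4096, 3]), one id per packed patch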
Perhaps checking the number of dimensions is too harsh. However, there must be a way to clearly tell the user the correct shape for the latents. This was my issue when trying to use the model with custom latents while coming from other "traditional" models: the only thing to guide me before digging into the source code was a size-mismatch error. That is, in other models, I'm used to doing the following:
pipe = ...
# ...
latents_shape = (
    batch_size,
    pipe.transformer.config.in_channels,
    height // pipe.vae_scale_factor,
    width // pipe.vae_scale_factor,
)
latents = torch.randn(latents_shape, ...)
# ...
image = pipe(
    prompt=prompt,
    # ...
    latents=latents
).images[0]
For similar code to work with this model, I had to do it like so:
pipe = ...
# ...
latents_shape = (
    batch_size,
    pipe.transformer.config.in_channels // 4,
    2 * height // pipe.vae_scale_factor,
    2 * width // pipe.vae_scale_factor,
)
latents = torch.randn(latents_shape, ...)
latents = pipe._pack_latents(latents, *latents.shape)  # Note: 2x the original height and width!
# ...
image = pipe(
    prompt=prompt,
    # ...
    latents=latents
).images[0]
Hence, no issues were encountered when we would run `_prepare_latent_image_ids`. A distinction should be made between the dimensions: `height` and `width` generally refer to the image's dimensions, but in `latent_image_ids = self._prepare_latent_image_ids(batch_size, height, width, device, dtype)`, for example, `height` and `width` refer to the upscaled latent dimensions, as you note.
I think either a method to generate latents of the correct shape is warranted, or, as I was doing in this PR, improving the docstrings and perhaps simply raising an error in order to avoid any mix-up of dimensions.
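As a rough sketch of the first option, the workaround above could be wrapped into a small helper (the name `make_packed_flux_latents` is hypothetical, not an existing pipeline method; the attributes it reads are the ones used in the snippet above):

import torch

# Hypothetical helper, not part of the pipeline: builds latents in the packed
# (batch_size, num_patches, channels) shape, reusing the shape arithmetic above.
def make_packed_flux_latents(pipe, batch_size, height, width, generator=None, device="cpu", dtype=torch.float32):
    num_channels_latents = pipe.transformer.config.in_channels // 4
    latent_height = 2 * height // pipe.vae_scale_factor
    latent_width = 2 * width // pipe.vae_scale_factor
    latents = torch.randn(
        (batch_size, num_channels_latents, latent_height, latent_width),
        generator=generator, device=device, dtype=dtype,
    )
    return pipe._pack_latents(latents, batch_size, num_channels_latents, latent_height, latent_width)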
        if not isinstance(latents, torch.Tensor):
            raise ValueError(f"`latents` has to be of type `torch.Tensor` but is {type(latents)}")

        batch_size, num_patches, channels = latents.shape
Wouldn't this throw an error if the shape is `ndim == 4`? Perhaps we should check that `ndim == 3` here and raise an error rather than letting it fail.
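For illustration, a sketch of the early check being suggested here (the error text is illustrative, not final wording):

import torch

# Sketch of the suggested check on user-provided `latents`:
def check_packed_latents(latents) -> None:
    if not isinstance(latents, torch.Tensor):
        raise ValueError(f"`latents` has to be of type `torch.Tensor` but is {type(latents)}")
    if latents.ndim != 3:
        raise ValueError(
            "`latents` must be packed into a 3D tensor of shape "
            "(batch_size, num_patches, channels), e.g. via `_pack_latents`, "
            f"but got a tensor with {latents.ndim} dimensions."
        )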
I think the `latents` should be passed in the expected shape, so the check is very nice! But we do not need to pack them for the user; an error is enough IMO.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
cc @PDillis are we interested in finishing this PR? I think we can just keep the error-checking part :)
What does this PR do?

- Removes the `print` of the `txt_ids` and `img_ids` shapes that bloats up the console, especially when generating large quantities of images. This is done inside of `models/transformers/transformer_flux.py`.
- If the user passes their own `latents` to the `FluxPipeline.__call__`, then there are additional checks to ensure the right shape is being used and, if not, the expected shapes are communicated to the user. Otherwise, the typical PyTorch tensor-mismatch message is uninformative at best. As such, the docstring has also been updated.

@sayakpaul