
Commit 2312b27

pcuenca and sayakpaul authored
Interpolate fix on cuda for large output tensors (#10067)
* Workaround for upscale with large output tensors. Fixes #10040.
* Fix scale when output_size is given
* Style

---------

Co-authored-by: Sayak Paul <[email protected]>
1 parent 6db3333 commit 2312b27

1 file changed: +8 -0 lines changed


src/diffusers/models/upsampling.py

Lines changed: 8 additions & 0 deletions
@@ -165,6 +165,14 @@ def forward(self, hidden_states: torch.Tensor, output_size: Optional[int] = None
         # if `output_size` is passed we force the interpolation output
         # size and do not make use of `scale_factor=2`
         if self.interpolate:
+            # upsample_nearest_nhwc also fails when the number of output elements is large
+            # https://github.com/pytorch/pytorch/issues/141831
+            scale_factor = (
+                2 if output_size is None else max([f / s for f, s in zip(output_size, hidden_states.shape[-2:])])
+            )
+            if hidden_states.numel() * scale_factor > pow(2, 31):
+                hidden_states = hidden_states.contiguous()
+
             if output_size is None:
                 hidden_states = F.interpolate(hidden_states, scale_factor=2.0, mode="nearest")
             else:
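
The size check added in this hunk can be tried in isolation. The following is a minimal sketch in plain Python, without torch, that mirrors the arithmetic of the diff as written. The helper names `effective_scale_factor` and `needs_contiguous` and the constant `ELEMENT_LIMIT` are hypothetical and are not part of diffusers; only the formula and the `pow(2, 31)` threshold come from the commit.

```python
from math import prod
from typing import Optional, Sequence

# Threshold taken from the diff: pow(2, 31).
ELEMENT_LIMIT = 2**31


def effective_scale_factor(
    in_hw: Sequence[int], output_size: Optional[Sequence[int]] = None
) -> float:
    """Hypothetical helper: the scale factor used by the size check.

    Falls back to 2 when no explicit output size is given; otherwise takes
    the largest per-dimension ratio between output and input size.
    """
    if output_size is None:
        return 2.0
    return max(f / s for f, s in zip(output_size, in_hw))


def needs_contiguous(
    shape: Sequence[int], output_size: Optional[Sequence[int]] = None
) -> bool:
    """Hypothetical helper: True when the diff would call `.contiguous()`."""
    numel = prod(shape)  # stands in for hidden_states.numel()
    scale = effective_scale_factor(shape[-2:], output_size)
    return numel * scale > ELEMENT_LIMIT


if __name__ == "__main__":
    # Small activation: far below the threshold.
    print(needs_contiguous((1, 4, 64, 64)))
    # Large activation with an explicit output size three times taller.
    print(needs_contiguous((1, 1024, 1024, 1024), output_size=(3072, 2048)))
```

Note that the comparison is strict, so an input whose scaled element count equals exactly 2**31 is left untouched, as in the diff.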
