
No non-default schedulers appear to work with DeepFloyd IF #3280

Closed
@AmericanPresidentJimmyCarter

Description

Describe the bug

Attempting to use any non-default scheduler with DeepFloyd IF crashes with a traceback like the following:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ <stdin>:1 in <module>                                                                            │
│                                                                                                  │
│ site-packages/torch/utils/_contextlib.py │
│ :115 in decorate_context                                                                         │
│                                                                                                  │
│   112 │   @functools.wraps(func)                                                                 │
│   113 │   def decorate_context(*args, **kwargs):                                              │
│   114 │   │   with ctx_factory():                                                                │
│ ❱ 115 │   │   │   return func(*args, **kwargs)                                                   │
│   116 │                                                                                          │
│   117 │   return decorate_context                                                             │
│   118                                                                                            │
│                                                                                                  │
│ site-packages/diffusers/pipelines/deepfl │
│ oyd_if/pipeline_if.py:807 in __call__                                                            │
│                                                                                                  │
│   804 │   │   │   │   │   noise_pred = torch.cat([noise_pred, predicted_variance], dim=1)        │
│   805 │   │   │   │                                                                              │
│   806 │   │   │   │   # compute the previous noisy sample x_t -> x_t-1                           │
│ ❱ 807 │   │   │   │   intermediate_images = self.scheduler.step(                                 │
│   808 │   │   │   │   │   noise_pred, t, intermediate_images, **extra_step_kwargs                │
│   809 │   │   │   │   ).prev_sample                                                              │
│   810                                                                                            │
│                                                                                                  │
│ site-packages/diffusers/schedulers/sched │
│ uling_dpmsolver_multistep.py:549 in step                                                         │
│                                                                                                  │
│   546 │   │   │   (step_index == len(self.timesteps) - 2) and self.config.lower_order_final an   │
│   547 │   │   )                                                                                  │
│   548 │   │                                                                                      │
│ ❱ 549 │   │   model_output = self.convert_model_output(model_output, timestep, sample)           │
│   550 │   │   for i in range(self.config.solver_order - 1):                                      │
│   551 │   │   │   self.model_outputs[i] = self.model_outputs[i + 1]                              │
│   552 │   │   self.model_outputs[-1] = model_output                                              │
│                                                                                                  │
│ site-packages/diffusers/schedulers/sched │
│ uling_dpmsolver_multistep.py:327 in convert_model_output                                         │
│                                                                                                  │
│   324 │   │   if self.config.algorithm_type == "dpmsolver++":                                    │
│   325 │   │   │   if self.config.prediction_type == "epsilon":                                   │
│   326 │   │   │   │   alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep]          │
│ ❱ 327 │   │   │   │   x0_pred = (sample - sigma_t * model_output) / alpha_t                      │
│   328 │   │   │   elif self.config.prediction_type == "sample":                                  │
│   329 │   │   │   │   x0_pred = model_output                                                     │
│   330 │   │   │   elif self.config.prediction_type == "v_prediction":                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: The size of tensor a (3) must match the size of tensor b (6) at non-singleton dimension 1
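
The mismatch comes from the IF UNet predicting a learned variance alongside the noise, so its output has twice as many channels as the intermediate sample. The default DDPMScheduler splits that prediction inside step(), while DPMSolverMultistepScheduler (as of v0.16.1) passes the full 6-channel output straight into convert_model_output. A minimal sketch of the shape mismatch, with hypothetical tensor shapes chosen only for illustration:

import torch

sample = torch.randn(1, 3, 64, 64)        # intermediate_images: 3 channels
model_output = torch.randn(1, 6, 64, 64)  # predicted noise (3) + learned variance (3)

# DDPMScheduler (the pipeline default) splits off the variance before using the prediction:
noise_pred, predicted_variance = model_output.split(sample.shape[1], dim=1)

# DPMSolverMultistepScheduler on v0.16.1 does not split, so convert_model_output evaluates
# sample - sigma_t * model_output with a 3-channel and a 6-channel tensor,
# which produces the RuntimeError above.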

Reproduction

>>> import torch                                                                                                                          
>>> from diffusers import DiffusionPipeline                                                                                               
>>> stage_1 = DiffusionPipeline.from_pretrained("DeepFloyd/IF-I-XL-v1.0")
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████| 2/2 [00:12<00:00,  6.40s/it]
>>> stage_1.enable_model_cpu_offload()                                                                                                    
>>> prompt = 'a photo of a kangaroo wearing an orange hoodie and blue sunglasses standing in front of the eiffel tower holding a sign that says "very deep learning"'                                                                                                               
>>> prompt_embeds, negative_embeds = stage_1.encode_prompt(prompt)                                                                        
>>> generator = torch.manual_seed(0)                                                                                                      
>>> image = stage_1(prompt_embeds=prompt_embeds, negative_prompt_embeds=negative_embeds, generator=generator, output_type="pt").images    
100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:19<00:00,  5.05it/s]
>>> from diffusers import DPMSolverMultistepScheduler                                                                                             
>>> stage_1.register_modules(scheduler=DPMSolverMultistepScheduler()) 
>>> image = stage_1(prompt_embeds=prompt_embeds, negative_prompt_embeds=negative_embeds, generator=generator, output_type="pt").images
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
...
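
For reference, the usual idiom for swapping schedulers builds the new one from the pipeline's existing scheduler config instead of using the default constructor. A sketch (this copies shared settings such as the beta schedule and prediction_type, but on v0.16.1 the DPM solver still does not handle the learned-variance channels, so the crash appears to persist either way):

from diffusers import DPMSolverMultistepScheduler
stage_1.scheduler = DPMSolverMultistepScheduler.from_config(stage_1.scheduler.config)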

Logs

No response

System Info

Python 3.10
diffusers v0.16.1
