Sana issues #10241

Closed
@vladmandic


Describe the bug

Following the SanaPAGPipeline implementation in #9982,
i get decent output in at most 1% of runs.

  • most runs produce what appears to be an image with a lot of residual noise, which the dc-ae decoder then turns into a sketch-like image with many circular artifacts (see first example image below)
  • some runs produce black-and-white output. adding "rich colors" to the prompt makes foreground objects colored, but the background remains black-and-white (see second example image below)
  • rarely (very rarely) i get decent output

what did i try?

  • loading both fp32 and fp16 variants of the model
  • loading from separate bf16 repo
  • executing in fp16, fp32 and bf16
  • enabling/disabling chi (complex_human_instruction) and changing steps, pag scale, etc.

Reproduction

import torch
import diffusers

# repo_id = 'Efficient-Large-Model/Sana_1600M_1024px_diffusers'
repo_id = 'Efficient-Large-Model/Sana_1600M_1024px_BF16_diffusers'
cache_dir = '/mnt/models/Diffusers'
prompt = 'photo of a cute red robot on the surface of moon with planet earth in the background'
negative = ''
dtype = torch.bfloat16
device = torch.device('cuda')
kwargs = {
    # 'variant': 'fp16',
    'torch_dtype': dtype,
}

pipe = diffusers.SanaPAGPipeline.from_pretrained(repo_id, cache_dir=cache_dir, **kwargs).to(device, dtype)
result = pipe(
    prompt = prompt,
    negative_prompt = negative,
    # num_inference_steps = 20, # default
    # guidance_scale = 4.5, # default
    # pag_scale = 3.0, # default
    # pag_adaptive_scale = 0.0, # default
    # height = 1024, # default
    # width = 1024, # default
    # clean_caption = True, # default
    # use_resolution_binning = True, # default
    # complex_human_instruction = '...', # default
)
image = result.images[0]
image.save('/tmp/sana.png')

attached are two typical examples of bad output:
[image: sana]
[image: sana]

Logs

there are several additional issues:

  1. error when using UniPC, DEIS or SA schedulers
/home/vlado/dev/sdnext/venv/lib/python3.12/site-packages/diffusers/schedulers/scheduling_unipc_multistep.py:396 in set_timesteps

   395
 ❱ 396         self.sigmas = torch.from_numpy(sigmas)
   397         self.timesteps = torch.from_numpy(timesteps).to(device=device, dtype=torch.int64)

ValueError: At least one stride in the given numpy array is negative, and tensors with negative strides are not currently supported. (You can probably work around this by making a copy of your array with array.copy().)

note: i've confirmed that the flow-matching args are set correctly
note: DPMSolverMultistepScheduler works fine, either when left as default or when manually instantiated
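
The ValueError above can be reproduced outside of diffusers. The sketch below is only an illustration of the failure mode, not the scheduler's actual code: it assumes the sigmas array ends up as a reversed numpy view (the `sigmas` and `safe` names and the values are made up for the example) and shows that `.copy()`, as the error message suggests, yields an array with positive strides. The torch part is optional and only runs if torch is installed.

```python
import numpy as np

# Hypothetical stand-in for a scheduler's sigma schedule: reversing with
# [::-1] returns a *view* whose stride is negative, not a new array.
sigmas = np.linspace(0.1, 1.0, 5)[::-1]
print("view strides:", sigmas.strides)

# Workaround named in the error message: materialize a contiguous copy.
safe = sigmas.copy()
print("copy strides:", safe.strides)

try:
    import torch
except ImportError:
    torch = None

if torch is not None:
    try:
        torch.from_numpy(sigmas)  # fails on the negative-stride view
    except ValueError as err:
        print("from_numpy failed:", err)
    tensor = torch.from_numpy(safe)  # succeeds on the copy
    print("tensor shape:", tuple(tensor.shape))
```

Whether the reversed view is created inside set_timesteps or passed in from the pipeline is not visible from the traceback, so where exactly a copy would be needed is left open here.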

  2. error when using non-zero pag_adaptive_scale
/home/vlado/dev/sdnext/venv/lib/python3.12/site-packages/diffusers/pipelines/pag/pag_utils.py:95 in _get_pag_scale

    94             signal_scale = self.pag_scale - self.pag_adaptive_scale * (1000 - t)
 ❱  95             if signal_scale < 0:
    96                 signal_scale = 0

RuntimeError: Boolean value of Tensor with more than one value is ambiguous
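
The RuntimeError suggests that `t` reaches `_get_pag_scale` as a tensor with more than one element, so `signal_scale < 0` is no longer a single boolean. A minimal sketch of one possible workaround follows. It assumes (not confirmed from the pipeline source) that `t` is a timestep expanded across the batch with identical entries, so the first entry can stand in for all of them. `pag_scale_at` is a hypothetical helper, not a diffusers function; only the formula and the clamp at zero are taken from the traceback.

```python
def pag_scale_at(pag_scale, pag_adaptive_scale, t):
    """Hypothetical helper: adaptive PAG scale for timestep t, clamped at 0.

    t may be a plain number or an array/tensor-like object. For the latter,
    only the first element is used (assumes all batch entries are equal).
    """
    if hasattr(t, "flatten"):
        # torch tensors and numpy arrays both provide flatten()
        t = t.flatten()[0]
    t = float(t)
    # Same formula as in the traceback
    signal_scale = pag_scale - pag_adaptive_scale * (1000 - t)
    return max(signal_scale, 0.0)


print(pag_scale_at(3.0, 0.0, 500))   # adaptive scale disabled
print(pag_scale_at(3.0, 0.01, 900))  # reduced late in sampling
print(pag_scale_at(3.0, 0.01, 0))    # clamped at zero
```

If the batch entries of `t` can differ, a per-element clamp would be needed instead of reducing to a scalar.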

System Info

diffusers==0.32.dev commit=5fb3a985173efaae7ff381b9040c386751d643da

Who can help?

@yiyixuxu @sayakpaul @DN6 @asomoza
@lawrence-cj and @a-r-r-o-w as primary contributors to pr
@hlky for scheduler issues


Labels: bug
