value errors in convert to/from diffusers from original stable diffusion #11285

Open
@ppbrown

Describe the bug

There appears to be a hardcoded 77-token limit somewhere in the single-file loading path, when it should be using the dimensions that are actually in the model.

I have a diffusers-layout SD1.5 model, with LongCLIP.

https://huggingface.co/opendiffusionai/xllsd-alpha0

I can pull it locally, then convert it to single-file format with:

python convert_diffusers_to_original_stable_diffusion.py \
    --use_safetensors \
    --model_path $SRCM \
    --checkpoint_path $DESTM

But if I then try to convert it back, I get a shape error because the text encoder's position embedding does not have 77 rows.

I should point out that the model WORKS PROPERLY for diffusion when loaded in diffusers format, so I don't have some funky broken model here.
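To confirm what the converted single-file checkpoint actually contains, the tensor shapes can be read straight from the safetensors header without loading any weights. The following is a minimal sketch using only the standard library; the helper names `read_safetensors_shapes` and `infer_max_position_embeddings` are hypothetical and are not part of diffusers, and matching on the key suffix `embeddings.position_embedding.weight` assumes an SD1.x-style checkpoint with a single text encoder.

```python
import json
import struct


def read_safetensors_shapes(path):
    """Return {tensor_name: shape} from a .safetensors header without loading weights."""
    with open(path, "rb") as f:
        # A safetensors file starts with an 8-byte little-endian header length,
        # followed by that many bytes of JSON describing every tensor.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return {
        name: tuple(info["shape"])
        for name, info in header.items()
        if name != "__metadata__"  # optional free-form metadata entry, not a tensor
    }


def infer_max_position_embeddings(shapes, suffix="embeddings.position_embedding.weight"):
    """Read the text encoder's context length from the checkpoint instead of assuming 77."""
    matches = [shape for name, shape in shapes.items() if name.endswith(suffix)]
    if len(matches) != 1:
        # Assumption: exactly one text encoder (SD1.x). Other layouts need a stricter key match.
        raise ValueError(f"expected exactly one '*{suffix}' tensor, found {len(matches)}")
    return matches[0][0]
```

Calling `infer_max_position_embeddings(read_safetensors_shapes("XLLsd-phase0.safetensors"))` on the converted file should report the same row count that appears in the error message under Logs, which would show that the checkpoint itself is intact and that the 77 comes from the loader's default config.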

Reproduction

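from transformers import CLIPTextModel, CLIPTokenizer

from diffusers import StableDiffusionPipeline, AutoencoderKL
import torch

# Load the single-file checkpoint produced by the conversion script above.
# This is the call that raises the ValueError shown under Logs.
pipe = StableDiffusionPipeline.from_single_file(
    "XLLsd-phase0.safetensors",
    torch_dtype=torch.float32,
    use_safetensors=True)

# Write the pipeline back out in diffusers layout (never reached).
outname = "XLLsd_recreate"
pipe.save_pretrained(outname, safe_serialization=False)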

Logs

venv/lib/python3.12/site-packages/diffusers/models/model_loading_utils.py", line 230, in load_model_dict_into_meta
    raise ValueError(
ValueError: Cannot load  because text_model.embeddings.position_embedding.weight expected shape torch.Size([77, 768]), but got torch.Size([248, 768]). If you want to instead overwrite randomly initialized weights, please make sure to pass both `low_cpu_mem_usage=False` and `ignore_mismatched_sizes=True`. For more information, see also: https://github.com/huggingface/diffusers/issues/1619#issuecomment-1345604389 as an example.
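The expected shape of [77, 768] in this message suggests the text encoder is being built from a default CLIP config rather than from the checkpoint. A possible workaround, sketched below and untested against this model, is to point `from_single_file` at the component configs of the original diffusers-format folder through its `config` argument. The function name `load_single_file_with_config` is hypothetical, and the sketch assumes that the folder's `text_encoder/config.json` declares the LongCLIP context length and that the loader honours it.

```python
def load_single_file_with_config(checkpoint_path, config_dir):
    """Load a single-file SD1.5 checkpoint using configs from a diffusers-format folder."""
    # Imported lazily so this function can be defined without the heavy dependencies.
    import torch
    from diffusers import StableDiffusionPipeline

    return StableDiffusionPipeline.from_single_file(
        checkpoint_path,
        # Assumption: config_dir (local path or Hub repo id) holds the original
        # diffusers-layout model, whose text encoder config has the longer context.
        config=config_dir,
        torch_dtype=torch.float32,
    )
```

For example, `load_single_file_with_config("XLLsd-phase0.safetensors", "opendiffusionai/xllsd-alpha0")` would take the component configs from the Hub repo named in the description. Even if this works, it only sidesteps the problem; the underlying request is still for the loader to infer the size from the weights.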

System Info

  • 🤗 Diffusers version: 0.32.2
  • Platform: Linux-6.8.0-55-generic-x86_64-with-glibc2.39
  • Running on Google Colab?: No
  • Python version: 3.12.3
  • PyTorch version (GPU?): 2.6.0+cu124 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.29.3
  • Transformers version: 4.50.0
  • Accelerate version: 1.5.2
  • PEFT version: not installed
  • Bitsandbytes version: 0.45.2
  • Safetensors version: 0.5.3
  • xFormers version: not installed
  • Accelerator: NVIDIA GeForce RTX 4090, 24564 MiB

Who can help?

No response

Labels: bug (Something isn't working)
