ValueError when converting between diffusers and original Stable Diffusion formats

### Describe the bug

There seems to be a hardcoded 77-token limit somewhere, when the code should be using the dimensions that are actually in the model.

I have an SD1.5 model in diffusers layout that uses LongCLIP as its text encoder.

https://huggingface.co/opendiffusionai/xllsd-alpha0

I can download it locally and then convert it to single-file format with:

python convert_diffusers_to_original_stable_diffusion.py \
  --use_safetensors \
  --model_path $SRCM \
  --checkpoint_path $DESTM

But when I try to convert it back, I get a shape mismatch error because the text encoder's position embedding does not have 77 rows.

I should point out that the model works properly for diffusion when loaded in diffusers format, so the model itself is not broken.
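To confirm what the checkpoint itself contains, here is a minimal sketch that reads tensor shapes from the single-file checkpoint and reports the number of position-embedding rows. The helper names `infer_max_positions` and `read_shapes` are hypothetical, and the key suffix is an assumption based on the tensor name in the error message; this is not how diffusers detects the size.

```python
from typing import Mapping, Optional, Sequence

# Assumption: the text encoder's position embedding key ends with this suffix,
# as suggested by the tensor name in the error message.
POSITION_KEY_SUFFIX = "embeddings.position_embedding.weight"


def infer_max_positions(shapes: Mapping[str, Sequence[int]]) -> Optional[int]:
    """Return the row count of the first position-embedding tensor, or None."""
    for key, shape in shapes.items():
        if key.endswith(POSITION_KEY_SUFFIX):
            return int(shape[0])
    return None


def read_shapes(path: str) -> dict:
    """Read tensor shapes from a .safetensors file without loading the weights."""
    from safetensors import safe_open  # lazy import; requires `safetensors` and torch

    with safe_open(path, framework="pt") as f:
        return {key: tuple(f.get_slice(key).get_shape()) for key in f.keys()}
```

Calling `infer_max_positions(read_shapes("XLLsd-phase0.safetensors"))` should report the same first dimension that appears in the error message if the checkpoint was written correctly.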



### Reproduction

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file(
    "XLLsd-phase0.safetensors",
    torch_dtype=torch.float32,
    use_safetensors=True,
)

outname = "XLLsd_recreate"
pipe.save_pretrained(outname, safe_serialization=False)

### Logs

```shell
venv/lib/python3.12/site-packages/diffusers/models/model_loading_utils.py", line 230, in load_model_dict_into_meta
    raise ValueError(
ValueError: Cannot load  because text_model.embeddings.position_embedding.weight expected shape torch.Size([77, 768]), but got torch.Size([248, 768]). If you want to instead overwrite randomly initialized weights, please make sure to pass both `low_cpu_mem_usage=False` and `ignore_mismatched_sizes=True`. For more information, see also: https://github.com/huggingface/diffusers/issues/1619#issuecomment-1345604389 as an example.
```
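A possible workaround, untested here, is to tell `from_single_file` which diffusers-format configs to use instead of letting it infer the default SD1.5 ones. The sketch below passes the original repo as `config`, on the assumption that its `text_encoder/config.json` declares the longer position embedding; the function name `load_with_repo_config` is hypothetical, and whether this avoids the `ValueError` in 0.32.2 is not confirmed.

```python
def load_with_repo_config(checkpoint_path: str, config_repo: str):
    """Load a single-file checkpoint using component configs from a diffusers repo.

    Assumption: `config_repo` holds configs matching the checkpoint, including a
    text encoder config with the larger max_position_embeddings.
    """
    # Lazy imports so the function can be defined without the libraries installed.
    import torch
    from diffusers import StableDiffusionPipeline

    return StableDiffusionPipeline.from_single_file(
        checkpoint_path,
        config=config_repo,  # repo id or local directory with diffusers configs
        torch_dtype=torch.float32,
    )
```

For this report the call would be `load_with_repo_config("XLLsd-phase0.safetensors", "opendiffusionai/xllsd-alpha0")`. Even if it works, it only sidesteps the problem: the loader would still be defaulting to 77 positions when no config is given.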

### System Info

- 🤗 Diffusers version: 0.32.2
- Platform: Linux-6.8.0-55-generic-x86_64-with-glibc2.39
- Running on Google Colab?: No
- Python version: 3.12.3
- PyTorch version (GPU?): 2.6.0+cu124 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.29.3
- Transformers version: 4.50.0
- Accelerate version: 1.5.2
- PEFT version: not installed
- Bitsandbytes version: 0.45.2
- Safetensors version: 0.5.3
- xFormers version: not installed
- Accelerator: NVIDIA GeForce RTX 4090, 24564 MiB


### Who can help?

_No response_
