train_dreambooth_lora.py resume from checkpoint errors. #3279

Closed
@JaLnYn

Description

Describe the bug

I get the following error when trying to resume from a checkpoint:
KeyError: 'unet.down_blocks.0.attentions.0.transformer_blocks.0.attn1.processor'
There was also a naming issue: I had to rename pytorch_lora_weights.bin to pytorch_model.bin in the checkpoint folders before resuming would work.
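The rename workaround described above can be sketched as a small helper. This is an illustrative script, not part of the training example; the function name is made up, and it assumes checkpoints live in `checkpoint-<step>` folders under the output directory, as the script saves them.

```python
import glob
import os
import shutil

def rename_lora_checkpoints(output_dir):
    """Rename pytorch_lora_weights.bin to pytorch_model.bin in every
    checkpoint-* folder, so the resume path can find the weights file.
    (Workaround from this bug report; helper name is illustrative.)"""
    renamed = []
    for ckpt in sorted(glob.glob(os.path.join(output_dir, "checkpoint-*"))):
        src = os.path.join(ckpt, "pytorch_lora_weights.bin")
        dst = os.path.join(ckpt, "pytorch_model.bin")
        if os.path.exists(src):
            shutil.move(src, dst)
            renamed.append(dst)
    return renamed
```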

Reproduction

accelerate launch train_dreambooth_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=100 \
  --learning_rate=1e-4 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=50 \
  --seed="0" \
  --push_to_hub \
  --resume_from_checkpoint=""
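For context, `--resume_from_checkpoint` takes either a specific checkpoint folder or `"latest"`. A sketch of how "latest" resolution is typically implemented (numeric sort on the step suffix, since a plain lexicographic sort would rank checkpoint-500 above checkpoint-1500); the function name is illustrative, not from the script:

```python
import os

def find_latest_checkpoint(output_dir):
    """Return the checkpoint-<step> folder with the highest step number,
    or None if no checkpoints exist. Sorts numerically, not lexically."""
    dirs = [d for d in os.listdir(output_dir) if d.startswith("checkpoint")]
    if not dirs:
        return None
    return sorted(dirs, key=lambda d: int(d.split("-")[1]))[-1]
```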

Logs

Traceback (most recent call last):
  File "/home/rip/git/diffusers/examples/dreambooth/train_dreambooth_lora.py", line 1093, in <module>
    main(args)
  File "/home/rip/git/diffusers/examples/dreambooth/train_dreambooth_lora.py", line 881, in main
    accelerator.load_state(os.path.join(args.output_dir, path))
  File "/home/rip/.local/lib/python3.11/site-packages/accelerate/accelerator.py", line 2396, in load_state
    load_accelerator_state(
  File "/home/rip/.local/lib/python3.11/site-packages/accelerate/checkpointing.py", line 140, in load_accelerator_state
    models[i].load_state_dict(torch_mod, **load_model_func_kwargs)
  File "/home/rip/.local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2027, in load_state_dict
    load(self, state_dict)
  File "/home/rip/.local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2009, in load
    module._load_from_state_dict(
  File "/home/rip/.local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1909, in _load_from_state_dict
    hook(state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs)
  File "/home/rip/.local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 69, in __call__
    return self.hook(module, *args, **kwargs)
  File "/home/rip/git/diffusers/src/diffusers/loaders.py", line 84, in map_from
    new_key = key.replace(replace_key, f"layers.{module.rev_mapping[replace_key]}")
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'unet.down_blocks.0.attentions.0.transformer_blocks.0.attn1.processor'
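One plausible reading of this traceback (an assumption, not a confirmed diagnosis): the saved state dict keys carry a `unet.` prefix, while the loading hook's `rev_mapping` is keyed without it, so the lookup at loaders.py line 84 fails. A minimal sketch of stripping such a prefix before loading:

```python
def strip_prefix(state_dict, prefix="unet."):
    """Illustrative sketch (not the library's fix): drop a leading
    'unet.' from state dict keys so they match the keys the loading
    hook's rev_mapping expects."""
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }
```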

System Info

  • diffusers version: 0.17.0.dev0
  • Platform: Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.35
  • Python version: 3.11.3
  • PyTorch version (GPU?): 2.0.0+cu117 (True)
  • Huggingface_hub version: 0.14.0
  • Transformers version: 4.28.1
  • Accelerate version: 0.18.0
  • xFormers version: not installed
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Labels

bug (Something isn't working), stale (Issues that haven't received updates)
