Description
Describe the bug
In train_dreambooth_lora.py we still call accelerator.load_state(os.path.join(args.output_dir, path)) when resuming from a checkpoint, even though the script does not save the accelerator state. Resuming therefore fails with the following error: FileNotFoundError: [Errno 2] No such file or directory:
'/home/ec2-user/ssl/Jupyter/exp/models/lora/checkpoint-250/pytorch_model.bin'.
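Until saving and loading are consistent, one way to surface the mismatch earlier is to check the checkpoint directory before attempting to resume. The following is a minimal sketch using only the standard library, not code from the script. The helper names latest_checkpoint and can_resume are hypothetical, and the expected file name pytorch_model.bin is taken from the error message above, so it is an assumption about what load_state looks for in this setup.

```python
import os
import re
from typing import Optional

# Assumption: file name taken from the FileNotFoundError in this report.
EXPECTED_STATE_FILE = "pytorch_model.bin"


def latest_checkpoint(output_dir: str) -> Optional[str]:
    """Return the highest-numbered 'checkpoint-<step>' directory name, or None."""
    pattern = re.compile(r"^checkpoint-(\d+)$")
    candidates = []
    for name in os.listdir(output_dir):
        match = pattern.match(name)
        if match and os.path.isdir(os.path.join(output_dir, name)):
            # Sort by the numeric step, not by the string.
            candidates.append((int(match.group(1)), name))
    return max(candidates)[1] if candidates else None


def can_resume(output_dir: str, checkpoint: str) -> bool:
    """Return True only if the checkpoint contains the expected state file."""
    return os.path.isfile(os.path.join(output_dir, checkpoint, EXPECTED_STATE_FILE))
```

A guard like this could run before accelerator.load_state(...) so that a checkpoint without saved state produces a clear message or a fresh start instead of a FileNotFoundError.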
Reproduction
Resuming from a checkpoint with train_dreambooth_lora.py reproduces the error:
train_dreambooth_lora.py --resume_from_checkpoint path
Logs
error: FileNotFoundError: [Errno 2] No such file or directory:
'/home/ec2-user/ssl/Jupyter/exp/models/lora/checkpoint-250/pytorch_model.bin'.
System Info
- diffusers version: 0.16.1
- Platform: Linux-4.14.301-224.520.amzn2.x86_64-x86_64-with-glibc2.26
- Python version: 3.10.9
- PyTorch version (GPU?): 1.13.1+cu117 (True)
- Huggingface_hub version: 0.14.1
- Transformers version: 4.27.3
- Accelerate version: 0.18.0
- xFormers version: 0.0.16
- Using GPU in script?: yes
- Using distributed or parallel set-up in script?: No