Closed
Describe the bug
train_dreambooth_lora_sdxl.py can't be resumed from a checkpoint when training with fp16. The error in the log is "Attempting to unscale FP16 gradients."
This is a major blocker for training on the free Colab tier: you need fp16 to fit in VRAM, and you also need to resume from checkpoints because the session can hit a timeout at any moment.
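For context, PyTorch's gradient scaler raises this error when the parameters handed to the optimizer are themselves stored in fp16. My guess (not confirmed against the script) is that the trainable LoRA weights end up in fp16 after the checkpoint is loaded. Below is a minimal sketch of the usual workaround, casting only the trainable parameters back to fp32. The helper name `upcast_trainable_params` and the toy model are hypothetical and are not part of diffusers.

```python
import torch
from torch import nn


def upcast_trainable_params(model: nn.Module, dtype: torch.dtype = torch.float32) -> int:
    """Cast every parameter that requires gradients to `dtype`.

    Hypothetical helper for illustration. Returns the number of parameters cast.
    """
    num_cast = 0
    for param in model.parameters():
        if param.requires_grad and param.dtype != dtype:
            param.data = param.data.to(dtype)
            num_cast += 1
    return num_cast


# Toy stand-in for a model whose weights were all loaded in fp16.
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2)).half()

# Pretend the first layer is the frozen base model and the second is the LoRA part.
for param in model[0].parameters():
    param.requires_grad_(False)

# Frozen weights stay in fp16 to save memory; trainable weights become fp32.
num_cast = upcast_trainable_params(model)
```

If this is the cause, running something like this on the trainable layers right after the checkpoint state is restored, and before the next optimizer step, should avoid the error while keeping the frozen weights in fp16.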
Reproduction
Reproduce with: https://colab.research.google.com/drive/15woNcXcpsa3GDGk6cmDtIL2V8zRtOOj3
Logs
No response
System Info
Latest diffusers; the system is whatever Colab provides (see the linked Colab above).