Description
Describe the bug
I get `RuntimeError: Input type (float) and bias type (c10::Half) should be the same` whenever I follow the Flux DreamBooth LoRA README: https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/README_flux.md
I ran the commands exactly as written there, but the error still occurs. It is raised during validation, when the pipeline decodes the latents with the VAE (full traceback below).
The SDXL version of the script works for me; only the Flux version fails.
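For context, this error is PyTorch's standard dtype mismatch between a module's parameters and its input. A minimal standalone sketch of the same failure mode (nothing diffusers-specific, just to illustrate what the message means):

```python
import torch

# fp16 parameters (weight and bias) ...
conv = torch.nn.Conv2d(3, 8, kernel_size=3).half().cuda()
# ... fed an fp32 input
x = torch.randn(1, 3, 32, 32, device="cuda")

conv(x)  # raises the same kind of dtype-mismatch RuntimeError
```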
Reproduction
```bash
export MODEL_NAME="black-forest-labs/FLUX.1-dev"
export INSTANCE_DIR="dog"
export OUTPUT_DIR="trained-flux-lora"

accelerate launch train_dreambooth_lora_flux.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="fp16" \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1e-5 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=25 \
  --seed="0" \
  --push_to_hub
```
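If it helps with triage: judging from the traceback, the validation pipeline hands fp32 latents to a VAE whose parameters are still fp16. A sketch of the kind of workaround I would expect to sidestep it (my assumption, not a verified fix; `pipeline` here is the validation pipeline built in `log_validation`):

```python
import torch

# Assumption: upcasting the VAE to fp32 makes its conv weight/bias match the
# fp32 latents that reach vae.decode, avoiding the dtype-mismatch RuntimeError.
pipeline.vae.to(dtype=torch.float32)
```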
Logs
/workspace/fluxdiff/lib/python3.10/site-packages/accelerate/accelerator.py:488: FutureWarning: `torch.cuda.amp.GradScaler(args...)` is deprecated. Please use `torch.amp.GradScaler('cuda', args...)` instead.
self.scaler = torch.cuda.amp.GradScaler(**kwargs)
08/21/2024 11:49:06 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Mixed precision type: fp16
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type t5 to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 6626.07it/s]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:12<00:00, 6.43s/it]
Fetching 3 files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 13039.29it/s]
{'axes_dims_rope'} was not found in config. Values will be initialized to default values.
08/21/2024 11:51:09 - INFO - __main__ - ***** Running training *****
08/21/2024 11:51:09 - INFO - __main__ - Num examples = 5
08/21/2024 11:51:09 - INFO - __main__ - Num batches each epoch = 5
08/21/2024 11:51:09 - INFO - __main__ - Num Epochs = 250
08/21/2024 11:51:09 - INFO - __main__ - Instantaneous batch size per device = 1
08/21/2024 11:51:09 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 4
08/21/2024 11:51:09 - INFO - __main__ - Gradient Accumulation steps = 4
08/21/2024 11:51:09 - INFO - __main__ - Total optimization steps = 500
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 8184.01it/s]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:20<00:00, 10.45s/it]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:20<00:00, 11.01s/it]
Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of black-forest-labs/FLUX.1-dev.
Loaded tokenizer_2 as T5TokenizerFast from `tokenizer_2` subfolder of black-forest-labs/FLUX.1-dev.
Loaded scheduler as FlowMatchEulerDiscreteScheduler from `scheduler` subfolder of black-forest-labs/FLUX.1-dev.
Loading pipeline components...: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 20.95it/s]
08/21/2024 11:51:35 - INFO - __main__ - Running validation...
Generating 4 images with prompt: A photo of sks dog in a bucket.
Traceback (most recent call last):
File "/workspace/diffusers/examples/dreambooth/train_dreambooth_lora_flux.py", line 1857, in <module>
main(args)
File "/workspace/diffusers/examples/dreambooth/train_dreambooth_lora_flux.py", line 1780, in main
images = log_validation(
File "/workspace/diffusers/examples/dreambooth/train_dreambooth_lora_flux.py", line 188, in log_validation
images = [pipeline(**pipeline_args, generator=generator).images[0] for _ in range(args.num_validation_images)]
File "/workspace/diffusers/examples/dreambooth/train_dreambooth_lora_flux.py", line 188, in <listcomp>
images = [pipeline(**pipeline_args, generator=generator).images[0] for _ in range(args.num_validation_images)]
File "/workspace/fluxdiff/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/workspace/diffusers/src/diffusers/pipelines/flux/pipeline_flux.py", line 769, in __call__
image = self.vae.decode(latents, return_dict=False)[0]
File "/workspace/diffusers/src/diffusers/utils/accelerate_utils.py", line 46, in wrapper
return method(self, *args, **kwargs)
File "/workspace/diffusers/src/diffusers/models/autoencoders/autoencoder_kl.py", line 321, in decode
decoded = self._decode(z).sample
File "/workspace/diffusers/src/diffusers/models/autoencoders/autoencoder_kl.py", line 292, in _decode
dec = self.decoder(z)
File "/workspace/fluxdiff/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/workspace/fluxdiff/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/workspace/diffusers/src/diffusers/models/autoencoders/vae.py", line 291, in forward
sample = self.conv_in(sample)
File "/workspace/fluxdiff/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/workspace/fluxdiff/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/workspace/fluxdiff/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 458, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/workspace/fluxdiff/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 454, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (float) and bias type (c10::Half) should be the same
Steps: 0%|▍ | 2/500 [00:48<3:21:52, 24.32s/it, loss=0.485, lr=1e-5]
Traceback (most recent call last):
File "/workspace/fluxdiff/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/workspace/fluxdiff/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 48, in main
args.func(args)
File "/workspace/fluxdiff/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1106, in launch_command
simple_launcher(args)
File "/workspace/fluxdiff/lib/python3.10/site-packages/accelerate/commands/launch.py", line 704, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/workspace/fluxdiff/bin/python3', 'train_dreambooth_lora_flux.py', '--pretrained_model_name_or_path=black-forest-labs/FLUX.1-dev', '--instance_data_dir=dog', '--output_dir=trained-flux-lora', '--mixed_precision=fp16', '--instance_prompt=a photo of sks dog', '--resolution=512', '--train_batch_size=1', '--gradient_accumulation_steps=4', '--learning_rate=1e-5', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--max_train_steps=500', '--validation_prompt=A photo of sks dog in a bucket', '--validation_epochs=25', '--seed=0', '--push_to_hub']' returned non-zero exit status 1.
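Reading the traceback, the failure is in `vae.decode` → `Decoder.forward` → `conv_in`: the latents reaching the VAE are fp32 while the VAE parameters are fp16. A quick way to confirm the VAE side of that split (hypothetical debugging snippet; I'm assuming the validation pipeline ends up loaded in fp16, mirroring `--mixed_precision="fp16"`):

```python
import torch
from diffusers import FluxPipeline

# Load the pipeline in fp16, mirroring how I assume the script configures
# the validation pipeline under --mixed_precision="fp16".
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.float16
)

print(pipe.vae.dtype)  # torch.float16 -> conv weight/bias are half, while the
                       # error message says the input to decode arrives as float32
```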
System Info
- 🤗 Diffusers version: 0.31.0.dev0
- Platform: Linux-6.5.0-41-generic-x86_64-with-glibc2.35
- Running on Google Colab?: No
- Python version: 3.10.12
- PyTorch version (GPU?): 2.4.0+cu121 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.24.6
- Transformers version: 4.44.1
- Accelerate version: 0.33.0
- PEFT version: 0.12.0
- Bitsandbytes version: not installed
- Safetensors version: 0.4.4
- xFormers version: not installed
- Accelerator: NVIDIA A100 80GB PCIe, 81920 MiB
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
No response