Issue on flux dreambooth lora training #9237

Closed
@arvnoodle

Describe the bug

RuntimeError: Input type (float) and bias type (c10::Half) should be the same

This happens whenever I follow the Flux README: https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/README_flux.md

I followed the instructions exactly as written, but I still get the error.

The SDXL version of the script works for me, but the Flux version fails with the error above.
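To illustrate the class of error, here is a minimal sketch in plain PyTorch, independent of the training script. It assumes the situation the traceback points to: a convolution whose parameters are in float16 receives a float32 input. The layer sizes and the `mismatch_error` variable are made up for the example, and the exact wording of the message may differ between PyTorch versions.

```python
import torch
from torch import nn

# Stand-in for the first convolution of the VAE decoder (conv_in in the traceback).
# The channel counts are arbitrary and not taken from the Flux VAE.
conv = nn.Conv2d(4, 8, kernel_size=3, padding=1).to(torch.float16)

# float32 input, like the latents that reach vae.decode in the traceback.
latents = torch.randn(1, 4, 8, 8, dtype=torch.float32)

mismatch_error = None
try:
    conv(latents)  # float32 input against float16 weight and bias
except RuntimeError as err:
    mismatch_error = str(err)

print(mismatch_error)
```

If the sketch matches what happens in the script, the message printed here should be of the same kind as the one reported above.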

Reproduction

`export MODEL_NAME="black-forest-labs/FLUX.1-dev"
export INSTANCE_DIR="dog"
export OUTPUT_DIR="trained-flux-lora"

accelerate launch train_dreambooth_lora_flux.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="fp16" \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1e-5 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=25 \
  --seed="0" \
  --push_to_hub`

Logs

/workspace/fluxdiff/lib/python3.10/site-packages/accelerate/accelerator.py:488: FutureWarning: `torch.cuda.amp.GradScaler(args...)` is deprecated. Please use `torch.amp.GradScaler('cuda', args...)` instead.
  self.scaler = torch.cuda.amp.GradScaler(**kwargs)
08/21/2024 11:49:06 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda

Mixed precision type: fp16

You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type t5 to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 6626.07it/s]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:12<00:00,  6.43s/it]
Fetching 3 files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 13039.29it/s]
{'axes_dims_rope'} was not found in config. Values will be initialized to default values.
08/21/2024 11:51:09 - INFO - __main__ - ***** Running training *****
08/21/2024 11:51:09 - INFO - __main__ -   Num examples = 5
08/21/2024 11:51:09 - INFO - __main__ -   Num batches each epoch = 5
08/21/2024 11:51:09 - INFO - __main__ -   Num Epochs = 250
08/21/2024 11:51:09 - INFO - __main__ -   Instantaneous batch size per device = 1
08/21/2024 11:51:09 - INFO - __main__ -   Total train batch size (w. parallel, distributed & accumulation) = 4
08/21/2024 11:51:09 - INFO - __main__ -   Gradient Accumulation steps = 4
08/21/2024 11:51:09 - INFO - __main__ -   Total optimization steps = 500
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 8184.01it/s]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:20<00:00, 10.45s/it]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:20<00:00, 11.01s/it]
Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of black-forest-labs/FLUX.1-dev.
Loaded tokenizer_2 as T5TokenizerFast from `tokenizer_2` subfolder of black-forest-labs/FLUX.1-dev.
Loaded scheduler as FlowMatchEulerDiscreteScheduler from `scheduler` subfolder of black-forest-labs/FLUX.1-dev.
Loading pipeline components...: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 20.95it/s]
08/21/2024 11:51:35 - INFO - __main__ - Running validation... 
 Generating 4 images with prompt: A photo of sks dog in a bucket.
Traceback (most recent call last):
  File "/workspace/diffusers/examples/dreambooth/train_dreambooth_lora_flux.py", line 1857, in <module>
    main(args)
  File "/workspace/diffusers/examples/dreambooth/train_dreambooth_lora_flux.py", line 1780, in main
    images = log_validation(
  File "/workspace/diffusers/examples/dreambooth/train_dreambooth_lora_flux.py", line 188, in log_validation
    images = [pipeline(**pipeline_args, generator=generator).images[0] for _ in range(args.num_validation_images)]
  File "/workspace/diffusers/examples/dreambooth/train_dreambooth_lora_flux.py", line 188, in <listcomp>
    images = [pipeline(**pipeline_args, generator=generator).images[0] for _ in range(args.num_validation_images)]
  File "/workspace/fluxdiff/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/workspace/diffusers/src/diffusers/pipelines/flux/pipeline_flux.py", line 769, in __call__
    image = self.vae.decode(latents, return_dict=False)[0]
  File "/workspace/diffusers/src/diffusers/utils/accelerate_utils.py", line 46, in wrapper
    return method(self, *args, **kwargs)
  File "/workspace/diffusers/src/diffusers/models/autoencoders/autoencoder_kl.py", line 321, in decode
    decoded = self._decode(z).sample
  File "/workspace/diffusers/src/diffusers/models/autoencoders/autoencoder_kl.py", line 292, in _decode
    dec = self.decoder(z)
  File "/workspace/fluxdiff/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/workspace/fluxdiff/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/diffusers/src/diffusers/models/autoencoders/vae.py", line 291, in forward
    sample = self.conv_in(sample)
  File "/workspace/fluxdiff/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/workspace/fluxdiff/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/fluxdiff/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 458, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/workspace/fluxdiff/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 454, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (float) and bias type (c10::Half) should be the same
Steps:   0%|| 2/500 [00:48<3:21:52, 24.32s/it, loss=0.485, lr=1e-5]
Traceback (most recent call last):
  File "/workspace/fluxdiff/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/workspace/fluxdiff/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 48, in main
    args.func(args)
  File "/workspace/fluxdiff/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1106, in launch_command
    simple_launcher(args)
  File "/workspace/fluxdiff/lib/python3.10/site-packages/accelerate/commands/launch.py", line 704, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/workspace/fluxdiff/bin/python3', 'train_dreambooth_lora_flux.py', '--pretrained_model_name_or_path=black-forest-labs/FLUX.1-dev', '--instance_data_dir=dog', '--output_dir=trained-flux-lora', '--mixed_precision=fp16', '--instance_prompt=a photo of sks dog', '--resolution=512', '--train_batch_size=1', '--gradient_accumulation_steps=4', '--learning_rate=1e-5', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--max_train_steps=500', '--validation_prompt=A photo of sks dog in a bucket', '--validation_epochs=25', '--seed=0', '--push_to_hub']' returned non-zero exit status 1.

System Info

  • 🤗 Diffusers version: 0.31.0.dev0
  • Platform: Linux-6.5.0-41-generic-x86_64-with-glibc2.35
  • Running on Google Colab?: No
  • Python version: 3.10.12
  • PyTorch version (GPU?): 2.4.0+cu121 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.24.6
  • Transformers version: 4.44.1
  • Accelerate version: 0.33.0
  • PEFT version: 0.12.0
  • Bitsandbytes version: not installed
  • Safetensors version: 0.4.4
  • xFormers version: not installed
  • Accelerator: NVIDIA A100 80GB PCIe, 81920 MiB
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

No response

Labels

bug (Something isn't working)
