Describe the bug
AttributeError when running multi-GPU distributed training with accelerate: after accelerator.prepare(), the transformer is wrapped in DistributedDataParallel and no longer exposes .config.
Reproduction
```
accelerate launch --config_file xyz.yaml train_dreambooth_lora_flux.py \
  --resolution=1024 \
  --mixed_precision=bf16 \
  --pretrained_model_name_or_path=black-forest-labs/FLUX.1-dev \
  --num_validation_images=8 \
  --validation_epochs=100 \
  --rank=16 \
  --train_batch_size=1 \
  --learning_rate=1e-4 \
  --guidance_scale=3.5 \
  --checkpointing_steps=200 \
  --instance_prompt=xyz \
  --instance_data_dir=xyz \
  --output_dir=xyz \
  --logging_dir=xyz \
  --validation_prompt=xyz
```
accelerate config:

```yaml
compute_environment: LOCAL_MACHINE
deepspeed_config: {}
distributed_type: MULTI_GPU
fsdp_config: {}
machine_rank: 0
main_process_ip: null
main_process_port: null
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 2
use_cpu: false
gpu_ids: '0, 1'
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
```
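For context on the error below: with distributed_type: MULTI_GPU and num_processes: 2, accelerator.prepare() wraps each prepared model in torch's DistributedDataParallel, which only proxies forward(), not arbitrary attributes. A minimal sketch of that behavior when run under accelerate launch with this config (the Linear layer is just a stand-in for the FLUX transformer):

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()  # picks up the MULTI_GPU settings from the config above

model = torch.nn.Linear(8, 8)       # stand-in for the FLUX transformer
model = accelerator.prepare(model)  # returns a DistributedDataParallel wrapper

# Prints <class 'torch.nn.parallel.distributed.DistributedDataParallel'> on
# each of the two processes; attribute lookups such as .config now fail.
print(type(model))
```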
Logs
```
if transformer.config.guidance_embeds:
AttributeError: 'DistributedDataParallel' object has no attribute 'config'
```
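A minimal sketch of the usual workaround for this class of error (not necessarily the fix the maintainers will adopt): unwrap the model before reading .config. The helper name has_guidance_embeds is hypothetical, and transformer stands for the prepared model in train_dreambooth_lora_flux.py:

```python
from accelerate import Accelerator

def has_guidance_embeds(accelerator: Accelerator, transformer) -> bool:
    # After accelerator.prepare(), `transformer` may be a DistributedDataParallel
    # wrapper that does not forward attribute access such as .config.
    # Accelerator.unwrap_model() returns the underlying model whether or not
    # it was wrapped, so this works in single- and multi-GPU runs alike.
    return accelerator.unwrap_model(transformer).config.guidance_embeds
```

Accessing transformer.module.config also works once the model is wrapped, but unwrap_model covers the unwrapped single-process case as well.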
System Info
diffusers installed from source
accelerate==0.33.0
transformers==4.44.1
Training on A100 GPUs
Who can help?
No response