Description
Describe the bug
Hi,
There's a bug in pipeline_utils.py which causes pipeline.from_pretrained to fail if the pipeline was only partially downloaded. Specifically, the code doesn't handle missing components in the feature_extractor, safety_checker, scheduler and tokenizer folders, at least on Windows.
The cause of the bug is this line:
allow_patterns += [os.path.join(k, "*") for k in folder_names if k not in model_folder_names]
This turns some folder names into patterns, but on Windows os.path.join joins paths as {parent}\\{child}, which gives a pattern list like this:
[
'text_encoder/model.safetensors',
'vae/diffusion_pytorch_model.bin',
'vae/diffusion_pytorch_model.safetensors',
'text_encoder/pytorch_model.bin',
'unet/diffusion_pytorch_model.safetensors',
'unet/diffusion_pytorch_model.bin',
'feature_extractor\\*',
'safety_checker\\*',
'scheduler\\*',
'tokenizer\\*'
]
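The separator difference can be reproduced on any OS with ntpath and posixpath, the Windows and POSIX implementations behind os.path. This is only a sketch; the folder list is copied from the patterns above and is not read from the diffusers code.

```python
import ntpath      # Windows flavour of os.path, importable on any platform
import posixpath   # POSIX flavour of os.path

# Folder names taken from the pattern list in this report.
folder_names = ["feature_extractor", "safety_checker", "scheduler", "tokenizer"]

# What os.path.join(k, "*") produces on Windows.
windows_patterns = [ntpath.join(k, "*") for k in folder_names]

# What os.path.join(k, "*") produces on Linux and macOS.
posix_patterns = [posixpath.join(k, "*") for k in folder_names]

print(windows_patterns)
print(posix_patterns)
```

Only the Windows variant contains a backslash, while the file names it is matched against use forward slashes.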
The \\* pattern doesn't play nicely with the regexp matching later on and causes some files to be incorrectly excluded from the "consider list". After the expected_files = [f for f in expected_files if any(p.match(f) for p in re_allow_pattern)] call, expected_files is only:
[
'unet/diffusion_pytorch_model.bin',
'vae/diffusion_pytorch_model.bin',
'text_encoder/pytorch_model.bin',
'model_index.json'
]
This means that if files are missing from the folders mentioned above, diffusers will not even try to download them, which causes loading errors down the road.
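The filtering step can be sketched in isolation. This assumes the patterns are compiled into regexps with fnmatch.translate, which this report does not show. The file names and the filter_files helper are hypothetical and for illustration only.

```python
import fnmatch
import re

# Hypothetical file list; the names are illustrative, not the repo's real contents.
expected_files = [
    "model_index.json",
    "unet/diffusion_pytorch_model.bin",
    "feature_extractor/preprocessor_config.json",
    "scheduler/scheduler_config.json",
]


def filter_files(files, allow_patterns):
    """Keep only files matching at least one glob pattern (hypothetical helper)."""
    # Assumption: glob patterns are turned into regexps via fnmatch.translate.
    compiled = [re.compile(fnmatch.translate(p)) for p in allow_patterns]
    return [f for f in files if any(p.match(f) for p in compiled)]


model_patterns = ["model_index.json", "unet/diffusion_pytorch_model.bin"]

# Patterns as built on Windows: the backslash is matched literally.
kept_backslash = filter_files(expected_files, model_patterns + ["feature_extractor\\*", "scheduler\\*"])

# Patterns built with a forward slash.
kept_slash = filter_files(expected_files, model_patterns + ["feature_extractor/*", "scheduler/*"])

print(kept_backslash)  # the feature_extractor and scheduler files are dropped
print(kept_slash)      # all four files are kept
```

With backslash patterns, every file under feature_extractor and scheduler is dropped from the list, which fits the symptom described above.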
To reproduce:
- Use Windows
- Call StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base") and wait until the pipeline is loaded
- Delete the feature_extractor folder from the pipeline cache folder C:\Users\<YOU>\.cache\huggingface\hub\models--stabilityai--stable-diffusion-2-1-base\snapshots\<HASH>
- Call StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base") again and observe the error
Fix:
I've tried changing the line above to the one below, and it seems to fix the bug for me. This should be safe on non-Windows platforms as well, since os.path.join already produces a forward slash there.
allow_patterns += [f"{k}/*" for k in folder_names if k not in model_folder_names]
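A quick check of the proposed line, wrapped in a hypothetical build_folder_patterns helper so it can run on its own. The folder lists are assumptions based on the patterns quoted in this report.

```python
def build_folder_patterns(folder_names, model_folder_names):
    """Hypothetical helper wrapping the proposed one-line fix."""
    # An f-string does not depend on os.sep, so the result is the same on every OS.
    return [f"{k}/*" for k in folder_names if k not in model_folder_names]


# Assumed folder lists, based on the patterns quoted in this report.
folder_names = ["unet", "vae", "text_encoder", "feature_extractor",
                "safety_checker", "scheduler", "tokenizer"]
model_folder_names = ["unet", "vae", "text_encoder"]

patterns = build_folder_patterns(folder_names, model_folder_names)

# No pattern should contain a backslash, regardless of platform.
assert all("\\" not in p for p in patterns)
print(patterns)
```

Because the separator is hard-coded, the patterns always use the same forward slash as the file names they are matched against.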
Reproduction
See above
Logs
No response
System Info
diffusers 0.16.1