Description
Tried DeepFloyd/IF-I-M-v1.0 on an NVIDIA A6000 with 48 GB VRAM. It loads OK, but fails on inference:
Exception: CUDA out of memory. Tried to allocate 1536.45 GiB (GPU 0; 47.54 GiB total capacity; 16.54 GiB already allocated; 30.45 GiB free; 16.76 GiB reserved in total by PyTorch)
site-packages/diffusers/models/attention_processor.py:756
756 │ │ attention_probs = attn.get_attention_scores(query, key, attention_mask)
site-packages/diffusers/models/attention_processor.py:354 in get_attention_scores
354 │ │ │ baddbmm_input = torch.empty(
355 │ │ │ │ query.shape[0], query.shape[1], key.shape[1], dtype=query.dtype, device=
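For reference, a minimal repro sketch along these lines (model id and resolution are from the report above; the prompt and fp16 settings are my assumptions):

```python
import torch
from diffusers import DiffusionPipeline

# Stage-1 pipeline from the report; fp16 variant/dtype are assumptions.
pipe = DiffusionPipeline.from_pretrained(
    "DeepFloyd/IF-I-M-v1.0",
    variant="fp16",
    torch_dtype=torch.float16,
)
pipe.to("cuda")

# Requesting 1024x1024 directly from the pixel-space stage-1 UNet is what
# hits the OOM above with the default SDP attention processor.
image = pipe(
    "a photo of an astronaut riding a horse",  # hypothetical prompt
    height=1024,
    width=1024,
).images[0]
```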
Then I switched from SDP to xFormers attention and it works, although saying it's slow would be an understatement: ~27 s/it on the A6000 at 1024x1024.
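The switch itself is just the standard diffusers call:

```python
# Swap the default SDP attention processor for xFormers
# memory-efficient attention.
pipe.enable_xformers_memory_efficient_attention()
```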
It looks like something is really wrong with the attention layers in DeepFloyd?
Also, there should probably be a note that DeepFloyd does not support any scheduler except the built-in one? Swapping the scheduler fails with:
AttributeError: 'FrozenDict' object has no attribute 'variance_type'
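For reference, the error above comes from a swap along these lines (DPMSolverMultistepScheduler is just my example; as far as I can tell, the IF pipeline reads `self.scheduler.config.variance_type`, which only the built-in DDPM scheduler config carries):

```python
from diffusers import DPMSolverMultistepScheduler

# Any scheduler whose config lacks `variance_type` (here DPMSolver, as an
# example) trips the AttributeError above when the IF pipeline reads
# self.scheduler.config.variance_type.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
image = pipe("a photo of an astronaut riding a horse").images[0]
```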