Commit fc7a867

[docs] MPS update (#11212)

mps

1 parent: 5ded26c

File tree: 8 files changed (+19 -1 lines)

docs/source/en/api/pipelines/deepfloyd_if.md (+1)

@@ -14,6 +14,7 @@ specific language governing permissions and limitations under the License.

 <div class="flex flex-wrap space-x-1">
 <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
 </div>

 ## Overview

docs/source/en/api/pipelines/flux.md (+1)

@@ -14,6 +14,7 @@ specific language governing permissions and limitations under the License.

 <div class="flex flex-wrap space-x-1">
 <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
 </div>

 Flux is a series of text-to-image generation models based on diffusion transformers. To know more about Flux, check out the original [blog post](https://blackforestlabs.ai/announcing-black-forest-labs/) by the creators of Flux, Black Forest Labs.

docs/source/en/api/pipelines/kolors.md (+1)

@@ -14,6 +14,7 @@ specific language governing permissions and limitations under the License.

 <div class="flex flex-wrap space-x-1">
 <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
 </div>

 ![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/kolors/kolors_header_collage.png)

docs/source/en/api/pipelines/ltx_video.md (+1)

@@ -16,6 +16,7 @@

 <div class="flex flex-wrap space-x-1">
 <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
 </div>

 [LTX Video](https://huggingface.co/Lightricks/LTX-Video) is the first DiT-based video generation model capable of generating high-quality videos in real-time. It produces 24 FPS videos at a 768x512 resolution faster than they can be watched. Trained on a large-scale dataset of diverse videos, the model generates high-resolution videos with realistic and varied content. We provide a model for both text-to-video as well as image + text-to-video usecases.

docs/source/en/api/pipelines/sana.md (+1)

@@ -16,6 +16,7 @@

 <div class="flex flex-wrap space-x-1">
 <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
 </div>

 [SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers](https://huggingface.co/papers/2410.10629) from NVIDIA and MIT HAN Lab, by Enze Xie, Junsong Chen, Junyu Chen, Han Cai, Haotian Tang, Yujun Lin, Zhekai Zhang, Muyang Li, Ligeng Zhu, Yao Lu, Song Han.

docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_3.md (+1)

@@ -14,6 +14,7 @@ specific language governing permissions and limitations under the License.

 <div class="flex flex-wrap space-x-1">
 <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
 </div>

 Stable Diffusion 3 (SD3) was proposed in [Scaling Rectified Flow Transformers for High-Resolution Image Synthesis](https://arxiv.org/pdf/2403.03206.pdf) by Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Muller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, Dustin Podell, Tim Dockhorn, Zion English, Kyle Lacey, Alex Goodwin, Yannik Marek, and Robin Rombach.

docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_xl.md (+1)

@@ -14,6 +14,7 @@ specific language governing permissions and limitations under the License.

 <div class="flex flex-wrap space-x-1">
 <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
 </div>

 Stable Diffusion XL (SDXL) was proposed in [SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis](https://huggingface.co/papers/2307.01952) by Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, and Robin Rombach.

docs/source/en/optimization/mps.md (+12 -1)

@@ -12,6 +12,9 @@ specific language governing permissions and limitations under the License.

 # Metal Performance Shaders (MPS)

+> [!TIP]
+> Pipelines with a <img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22"> badge indicate a model can take advantage of the MPS backend on Apple silicon devices for faster inference. Feel free to open a [Pull Request](https://github.com/huggingface/diffusers/compare) to add this badge to pipelines that are missing it.
+
 🤗 Diffusers is compatible with Apple silicon (M1/M2 chips) using the PyTorch [`mps`](https://pytorch.org/docs/stable/notes/mps.html) device, which uses the Metal framework to leverage the GPU on MacOS devices. You'll need to have:

 - macOS computer with Apple silicon (M1/M2) hardware
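
The compatibility note in this hunk boils down to selecting the `mps` device when PyTorch reports it as available. A minimal sketch, assuming only that `torch` is installed (the CPU fallback is illustrative and not part of this commit):

```python
import torch

# Prefer the Metal-backed `mps` device on Apple silicon; fall back to CPU so
# the same script also runs on machines without MPS support.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# A Diffusers pipeline would then be moved over with `pipeline.to(device)`.
print(device.type)
```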
@@ -37,7 +40,7 @@ image

 <Tip warning={true}>

-Generating multiple prompts in a batch can [crash](https://github.com/huggingface/diffusers/issues/363) or fail to work reliably. We believe this is related to the [`mps`](https://github.com/pytorch/pytorch/issues/84039) backend in PyTorch. While this is being investigated, you should iterate instead of batching.
+The PyTorch [mps](https://pytorch.org/docs/stable/notes/mps.html) backend does not support NDArray sizes greater than `2**32`. Please open an [Issue](https://github.com/huggingface/diffusers/issues/new/choose) if you encounter this problem so we can investigate.

 </Tip>
@@ -59,6 +62,10 @@ If you're using **PyTorch 1.13**, you need to "prime" the pipeline with an addit

 ## Troubleshoot

+This section lists some common issues with using the `mps` backend and how to solve them.
+
+### Attention slicing
+
 M1/M2 performance is very sensitive to memory pressure. When this occurs, the system automatically swaps if it needs to which significantly degrades performance.

 To prevent this from happening, we recommend *attention slicing* to reduce memory pressure during inference and prevent swapping. This is especially relevant if your computer has less than 64GB of system RAM, or if you generate images at non-standard resolutions larger than 512×512 pixels. Call the [`~DiffusionPipeline.enable_attention_slicing`] function on your pipeline:
@@ -72,3 +79,7 @@ pipeline.enable_attention_slicing()
 ```

 Attention slicing performs the costly attention operation in multiple steps instead of all at once. It usually improves performance by ~20% in computers without universal memory, but we've observed *better performance* in most Apple silicon computers unless you have 64GB of RAM or more.
+
+### Batch inference
+
+Generating multiple prompts in a batch can crash or fail to work reliably. If this is the case, try iterating instead of batching.
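
The "iterate instead of batching" advice added in this hunk can be sketched as follows; `generate` is a hypothetical stand-in for a real pipeline call such as `pipeline(prompt).images[0]`:

```python
# Iterate over prompts one at a time instead of passing them as a batch,
# per the troubleshooting advice above.
prompts = ["a red apple", "a blue bicycle", "a snowy mountain"]

def generate(prompt: str) -> str:
    # Placeholder: a real implementation would invoke the pipeline here,
    # e.g. `return pipeline(prompt).images[0]`.
    return f"image for {prompt!r}"

# One call per prompt sidesteps the unreliable batched path on `mps`.
images = [generate(p) for p in prompts]
print(len(images))
```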
