[docs] Simplify loading guide #2694
Merged
Commits

- b7141fd simplify loading guide (stevhliu)
- a9bbed7 apply feedbacks (stevhliu)
- a30f340 clarify variants (stevhliu)
- 8171539 clarify torch_dtype and variant (stevhliu)
- 93b8e8a remove conceptual pipeline doc (stevhliu)
- 72036b8 Merge branch 'main' into update-loading (stevhliu)
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Pipelines explained

Having an easy and accessible way to use a diffusion system for inference is essential to 🧨 Diffusers. Diffusion systems often consist of multiple components like parameterized models, tokenizers, and schedulers that interact in complex ways. That is why we designed the [`DiffusionPipeline`] to wrap the complexity of the entire diffusion system into an easy-to-use API, while remaining flexible enough to be adapted for other use cases.

This guide provides a high-level explanation of what a pipeline is, what *variants* are, and how a pipeline and all its components are loaded.

## Pipeline

Pipelines like [`StableDiffusionPipeline`] and [`StableDiffusionImg2ImgPipeline`] consist of multiple components: parameterized models (`unet`, `vae`, `text_encoder`), tokenizers, and schedulers. When you call a pipeline for inference, these components interact with each other to generate an output. The purpose of the pipeline is to wrap the complexity of the entire diffusion system into an easy-to-use API, while remaining flexible enough to be customized for other use cases.

For instance, you can load a pipeline locally to remain anonymous and build self-contained applications. You can also customize which components are loaded in a pipeline. 🧨 Diffusers makes it easy to swap out compatible models and schedulers in a pipeline, so you can explore the trade-offs between different schedulers and models.

```python
from diffusers import DiffusionPipeline, EulerDiscreteScheduler, DPMSolverMultistepScheduler

repo_id = "runwayml/stable-diffusion-v1-5"

scheduler = EulerDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, scheduler=scheduler)
```

[`SchedulerMixin.from_pretrained`] loads the scheduler configuration file from a subfolder in the Stable Diffusion pipeline repository, and then the scheduler instance is passed to the `scheduler` argument in [`DiffusionPipeline.from_pretrained`]. This works because the [`StableDiffusionPipeline`] defines its scheduler with the `scheduler` attribute. You can't use a different keyword like `sampler` because it isn't defined in `StableDiffusionPipeline.__init__`.
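
You can also swap the scheduler on a pipeline that is already loaded. The snippet below is a minimal sketch of that pattern: it rebuilds a compatible scheduler from the current scheduler's configuration with `from_config` and assigns it to the pipeline's `scheduler` attribute.

```python
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "runwayml/stable-diffusion-v1-5"
pipeline = DiffusionPipeline.from_pretrained(repo_id)

# Build a compatible scheduler from the existing scheduler's configuration
# and assign it to the pipeline's `scheduler` attribute
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)
```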

### Checkpoint variants

In addition to the original pipeline checkpoints stored in a repository, there may also be *checkpoint variants*. A variant typically stores the checkpoint weights in a lower-precision, lower-storage data type like `fp16`, or as non-exponential mean averaged (non-EMA) weights so you can resume finetuning from a checkpoint. Variants are advantageous in specific scenarios - half-precision checkpoints only require half the bandwidth and storage - but they're so similar to the original checkpoint that it isn't worth storing them in a separate repository. Variants have **exactly** the same serialization format and model structure as the original checkpoint, and the weights have the same tensor shapes.

This means other serialization formats, such as [Safetensors](./using-diffusers/using_safetensors), are not considered checkpoint variants because their weights are identical to the original checkpoint. It may also be tempting to consider different model structures as variants, such as [`stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) and [`stable-diffusion-2`](https://huggingface.co/stabilityai/stable-diffusion-2). However, these checkpoints aren't considered variants because `stable-diffusion-v1-5` uses a different `CLIPTextModel` than `stable-diffusion-2`.

<Tip>

💡 When checkpoints have identical model structures but were trained on different datasets and with different training setups, they should be stored in separate repositories instead of as variants (for example, [`stable-diffusion-v1-4`] and [`stable-diffusion-v1-5`]).

</Tip>

A variant stored in a lower-precision floating point type like `fp16` can't be used to continue training or to run on a CPU, and non-EMA variants shouldn't be used for inference.
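
For example, here is a minimal sketch of loading a half-precision variant, assuming the repository actually stores `fp16` weights. The `variant` argument selects which checkpoint files to download, while `torch_dtype` controls the data type the tensors are loaded in.

```python
import torch
from diffusers import DiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"

# Download the fp16 variant of the weights (if the repository provides one)
# and load them as torch.float16 tensors
pipeline = DiffusionPipeline.from_pretrained(
    repo_id, variant="fp16", torch_dtype=torch.float16
)
```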

## How pipeline loading works

As a class method, [`DiffusionPipeline.from_pretrained`] is responsible for two things:

- Download the latest version of the folder structure required for inference and cache it (see the sketch after this list). If the latest folder structure is available in the local cache, [`DiffusionPipeline.from_pretrained`] reuses the cache and won't redownload the files.
- Load the cached weights into the correct pipeline [class](./api/pipelines/overview#diffusers-summary) - retrieved from the `model_index.json` file - and return an instance of it.
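
As a rough sketch of the caching behavior described in the first point, you can force [`DiffusionPipeline.from_pretrained`] to rely only on files that are already in the local cache by passing `local_files_only=True`, a standard Hub download argument:

```python
from diffusers import DiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"

# The first call downloads the folder structure and caches it
pipeline = DiffusionPipeline.from_pretrained(repo_id)

# Later calls can skip the network entirely and load from the local cache
pipeline = DiffusionPipeline.from_pretrained(repo_id, local_files_only=True)
```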

The pipeline's underlying folder structure corresponds directly with its class instance. For example, the [`StableDiffusionPipeline`] corresponds to the folder structure in [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5).

```python
from diffusers import DiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
pipeline = DiffusionPipeline.from_pretrained(repo_id)
print(pipeline)
```

You'll see the pipeline is an instance of [`StableDiffusionPipeline`], which consists of seven components:

- `"feature_extractor"`: a [`~transformers.CLIPFeatureExtractor`] from 🤗 Transformers.
- `"safety_checker"`: a [component](https://github.com/huggingface/diffusers/blob/e55687e1e15407f60f32242027b7bb8170e58266/src/diffusers/pipelines/stable_diffusion/safety_checker.py#L32) for screening against harmful content.
- `"scheduler"`: an instance of [`PNDMScheduler`].
- `"text_encoder"`: a [`~transformers.CLIPTextModel`] from 🤗 Transformers.
- `"tokenizer"`: a [`~transformers.CLIPTokenizer`] from 🤗 Transformers.
- `"unet"`: an instance of [`UNet2DConditionModel`].
- `"vae"`: an instance of [`AutoencoderKL`].

```json
StableDiffusionPipeline {
  "feature_extractor": [
    "transformers",
    "CLIPFeatureExtractor"
  ],
  "safety_checker": [
    "stable_diffusion",
    "StableDiffusionSafetyChecker"
  ],
  "scheduler": [
    "diffusers",
    "PNDMScheduler"
  ],
  "text_encoder": [
    "transformers",
    "CLIPTextModel"
  ],
  "tokenizer": [
    "transformers",
    "CLIPTokenizer"
  ],
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}
```

Compare the components of the pipeline instance to the [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) folder structure, and you'll see there is a separate folder for each of the components in the repository:

```
.
├── feature_extractor
│   └── preprocessor_config.json
├── model_index.json
├── safety_checker
│   ├── config.json
│   └── pytorch_model.bin
├── scheduler
│   └── scheduler_config.json
├── text_encoder
│   ├── config.json
│   └── pytorch_model.bin
├── tokenizer
│   ├── merges.txt
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── unet
│   ├── config.json
│   ├── diffusion_pytorch_model.bin
└── vae
    ├── config.json
    ├── diffusion_pytorch_model.bin
```
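
If you want to inspect this folder structure on your own machine, one approach (a sketch that uses `huggingface_hub`, which 🧨 Diffusers relies on for downloads) is to pull a snapshot of the repository and list its top-level entries:

```python
from pathlib import Path
from huggingface_hub import snapshot_download

# Download (or reuse from the cache) a snapshot of the pipeline repository
local_dir = snapshot_download("runwayml/stable-diffusion-v1-5")

# The top-level entries mirror the pipeline components shown above
for entry in sorted(Path(local_dir).iterdir()):
    print(entry.name)
```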

You can access each of the components of the pipeline as an attribute to view its configuration:

```py
pipeline.tokenizer
CLIPTokenizer(
    name_or_path="/root/.cache/huggingface/hub/models--runwayml--stable-diffusion-v1-5/snapshots/39593d5650112b4cc580433f6b0435385882d819/tokenizer",
    vocab_size=49408,
    model_max_length=77,
    is_fast=False,
    padding_side="right",
    truncation_side="right",
    special_tokens={
        "bos_token": AddedToken("<|startoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "eos_token": AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "unk_token": AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "pad_token": "<|endoftext|>",
    },
)
```
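
The other components work the same way. For instance (continuing with the same pipeline), the scheduler and the UNet each expose the arguments they were initialized with through a `config` attribute:

```py
# The scheduler's configuration is a frozen dictionary of its init arguments
pipeline.scheduler.config

# Model components such as the UNet expose their configuration the same way
pipeline.unet.config
```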

Every pipeline expects a `model_index.json` file that tells the [`DiffusionPipeline`]:

- which pipeline class to load from `_class_name`
- which version of 🧨 Diffusers was used to create the model in `_diffusers_version`
- what components from which library are stored in the subfolders (`name` corresponds to the component and subfolder name, `library` corresponds to the name of the library to load the class from, and `class` corresponds to the class name)

```json
{
  "_class_name": "StableDiffusionPipeline",
  "_diffusers_version": "0.6.0",
  "feature_extractor": [
    "transformers",
    "CLIPFeatureExtractor"
  ],
  "safety_checker": [
    "stable_diffusion",
    "StableDiffusionSafetyChecker"
  ],
  "scheduler": [
    "diffusers",
    "PNDMScheduler"
  ],
  "text_encoder": [
    "transformers",
    "CLIPTextModel"
  ],
  "tokenizer": [
    "transformers",
    "CLIPTokenizer"
  ],
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}
```