@@ -37,6 +37,28 @@ Unless otherwise mentioned, these are techniques that work with existing models
9. [Textual Inversion](#textual-inversion)
10. [ControlNet](#controlnet)
11. [Prompt Weighting](#prompt-weighting)
+ 12. [Custom Diffusion](#custom-diffusion)
+ 13. [Model Editing](#model-editing)
+ 14. [DiffEdit](#diffedit)
+
+ For convenience, the following table indicates which methods are inference-only and which require training or fine-tuning.
+
+ | **Method** | **Inference only** | **Requires training /<br> fine-tuning** | **Comments** |
+ |:---:|:---:|:---:|:---:|
+ | [Instruct Pix2Pix](#instruct-pix2pix) | ✅ | ❌ | Can additionally be<br>fine-tuned for better<br>performance on specific<br>edit instructions. |
+ | [Pix2Pix Zero](#pix2pixzero) | ✅ | ❌ | |
+ | [Attend and Excite](#attend-and-excite) | ✅ | ❌ | |
+ | [Semantic Guidance](#semantic-guidance) | ✅ | ❌ | |
+ | [Self-attention Guidance](#self-attention-guidance) | ✅ | ❌ | |
+ | [Depth2Image](#depth2image) | ✅ | ❌ | |
+ | [MultiDiffusion Panorama](#multidiffusion-panorama) | ✅ | ❌ | |
+ | [DreamBooth](#dreambooth) | ❌ | ✅ | |
+ | [Textual Inversion](#textual-inversion) | ❌ | ✅ | |
+ | [ControlNet](#controlnet) | ✅ | ❌ | A ControlNet can be<br>trained/fine-tuned on<br>a custom conditioning. |
+ | [Prompt Weighting](#prompt-weighting) | ✅ | ❌ | |
+ | [Custom Diffusion](#custom-diffusion) | ❌ | ✅ | |
+ | [Model Editing](#model-editing) | ✅ | ❌ | |
+ | [DiffEdit](#diffedit) | ✅ | ❌ | |
## Instruct Pix2Pix
@@ -137,13 +159,13 @@ See [here](../api/pipelines/stable_diffusion/panorama) for more information on h
In addition to pre-trained models, Diffusers has training scripts for fine-tuning models on user-provided data.
- ### DreamBooth
+ ## DreamBooth
[DreamBooth](../training/dreambooth) fine-tunes a model to teach it about a new subject. For example, a few pictures of a person can be used to generate images of that person in different styles.
See [here](../training/dreambooth) for more information on how to use it.
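+
+ As a quick illustration, here is a minimal inference sketch, assuming you have already fine-tuned a checkpoint with the DreamBooth training script; the output directory path is a placeholder, and `sks` stands in for the rare token the subject was bound to during fine-tuning:
+
+ ```python
+ import torch
+ from diffusers import StableDiffusionPipeline
+
+ # "path/to/dreambooth-output" is a placeholder for the directory produced
+ # by the DreamBooth training script.
+ pipe = StableDiffusionPipeline.from_pretrained(
+     "path/to/dreambooth-output", torch_dtype=torch.float16
+ ).to("cuda")
+
+ # "sks" is the rare placeholder token the subject was bound to during training.
+ image = pipe("a photo of sks dog in a bucket").images[0]
+ image.save("dreambooth_result.png")
+ ```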
- ### Textual Inversion
+ ## Textual Inversion
[Textual Inversion](../training/text_inversion) fine-tunes a model to teach it about a new concept. For example, a few pictures of a style of artwork can be used to generate images in that style.
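+
+ As a quick illustration, here is a minimal sketch of loading a learned concept at inference time; `sd-concepts-library/cat-toy` is an example community-trained concept whose placeholder token is `<cat-toy>`:
+
+ ```python
+ import torch
+ from diffusers import StableDiffusionPipeline
+
+ pipe = StableDiffusionPipeline.from_pretrained(
+     "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
+ ).to("cuda")
+
+ # Load the learned embedding; it registers the "<cat-toy>" token
+ # so it can be used in prompts.
+ pipe.load_textual_inversion("sd-concepts-library/cat-toy")
+
+ image = pipe("a photo of a <cat-toy> on a beach").images[0]
+ ```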
@@ -165,3 +187,32 @@ Prompt weighting is a simple technique that puts more attention weight on certai
input.
For a more in-detail explanation and examples, see [here](../using-diffusers/weighted_prompts).
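+
+ As a quick illustration, here is a minimal sketch using the third-party [compel](https://github.com/damian0815/compel) library to build weighted prompt embeddings, where `++` upweights the token it follows:
+
+ ```python
+ import torch
+ from compel import Compel
+ from diffusers import StableDiffusionPipeline
+
+ pipe = StableDiffusionPipeline.from_pretrained(
+     "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
+ ).to("cuda")
+
+ compel = Compel(tokenizer=pipe.tokenizer, text_encoder=pipe.text_encoder)
+
+ # "++" puts extra attention weight on "ball" relative to the rest of the prompt.
+ prompt_embeds = compel("a red cat playing with a ball++")
+ image = pipe(prompt_embeds=prompt_embeds).images[0]
+ ```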
+
+ ## Custom Diffusion
+
+ [Custom Diffusion](../training/custom_diffusion) only fine-tunes the cross-attention layers of a pre-trained
+ text-to-image diffusion model, and it can optionally be combined with Textual Inversion. It supports
+ multi-concept training by design. Like DreamBooth and Textual Inversion, Custom Diffusion is used to
+ teach a pre-trained text-to-image diffusion model new concepts so that it can generate outputs involving the
+ concept(s) of interest.
+
+ For more details, check out our [official doc](../training/custom_diffusion).
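+
+ As a quick illustration, here is a minimal inference sketch, assuming a concept trained with the Custom Diffusion training script; the output directory path is a placeholder and `<new1>` is that script's default modifier token:
+
+ ```python
+ import torch
+ from diffusers import StableDiffusionPipeline
+
+ pipe = StableDiffusionPipeline.from_pretrained(
+     "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
+ ).to("cuda")
+
+ # "path/to/custom-diffusion-output" is a placeholder for the training output
+ # directory; load the fine-tuned cross-attention weights and the "<new1>" token.
+ pipe.unet.load_attn_procs(
+     "path/to/custom-diffusion-output", weight_name="pytorch_custom_diffusion_weights.bin"
+ )
+ pipe.load_textual_inversion("path/to/custom-diffusion-output", weight_name="<new1>.bin")
+
+ image = pipe("<new1> cat swimming in a pool").images[0]
+ ```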
+
+ ## Model Editing
+
+ [Paper](https://arxiv.org/abs/2303.08084)
+
+ The [text-to-image model editing pipeline](../api/pipelines/stable_diffusion/model_editing) helps you mitigate some of the incorrect implicit assumptions a pre-trained text-to-image
+ diffusion model might make about the subjects present in the input prompt. For example, if you prompt Stable Diffusion to generate images for "A pack of roses", the roses in the generated images
+ are more likely to be red. This pipeline helps you change that assumption.
+
+ For more details, check out the [official doc](../api/pipelines/stable_diffusion/model_editing).
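+
+ As a quick illustration, here is a minimal sketch of editing the "roses are red" assumption with this pipeline:
+
+ ```python
+ from diffusers import StableDiffusionModelEditingPipeline
+
+ pipe = StableDiffusionModelEditingPipeline.from_pretrained(
+     "CompVis/stable-diffusion-v1-4"
+ ).to("cuda")
+
+ # Rewrite the model's implicit assumption so that "A pack of roses"
+ # behaves like "A pack of blue roses".
+ source_prompt = "A pack of roses"
+ destination_prompt = "A pack of blue roses"
+ pipe.edit_model(source_prompt, destination_prompt)
+
+ image = pipe(source_prompt).images[0]
+ ```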
+
+ ## DiffEdit
+
+ [Paper](https://arxiv.org/abs/2210.11427)
+
+ [DiffEdit](../api/pipelines/stable_diffusion/diffedit) allows for semantic editing of input images guided by
+ source and target text prompts, while preserving the original input images as much as possible.
+
+ For more details, check out the [official doc](../api/pipelines/stable_diffusion/diffedit).
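+
+ As a quick illustration, here is a minimal sketch of the three stages (mask generation, inversion, and masked denoising); the input image path and prompts are placeholders:
+
+ ```python
+ import torch
+ from diffusers import DDIMInverseScheduler, DDIMScheduler, StableDiffusionDiffEditPipeline
+ from diffusers.utils import load_image
+
+ pipe = StableDiffusionDiffEditPipeline.from_pretrained(
+     "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
+ ).to("cuda")
+ pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
+ pipe.inverse_scheduler = DDIMInverseScheduler.from_config(pipe.scheduler.config)
+
+ # "fruit_bowl.png" is a placeholder input image.
+ init_image = load_image("fruit_bowl.png").resize((768, 768))
+ source_prompt = "a bowl of fruits"
+ target_prompt = "a bowl of pears"
+
+ # 1. Generate a mask of the region to edit by contrasting the two prompts.
+ mask_image = pipe.generate_mask(
+     image=init_image, source_prompt=source_prompt, target_prompt=target_prompt
+ )
+ # 2. Invert the input image into latents conditioned on the source prompt.
+ image_latents = pipe.invert(prompt=source_prompt, image=init_image).latents
+ # 3. Denoise with the target prompt, editing only inside the mask.
+ image = pipe(
+     prompt=target_prompt, mask_image=mask_image, image_latents=image_latents
+ ).images[0]
+ ```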