[DreamBooth] add text encoder LoRA support in the DreamBooth training script (#3130)
* add: LoRA text encoder support for DreamBooth example.
* fix initialization.
* fix: modification call.
* add: entry in the readme.
* use dog dataset from hub.
* fix: params to clip.
* add entry to the LoRA doc.
* add: tests for lora.
* remove unnecessary list comprehension.
docs/source/en/training/dreambooth.mdx (+12 -1)
@@ -60,7 +60,18 @@ DreamBooth finetuning is very sensitive to hyperparameters and easy to overfit.
 
 <frameworkcontent>
 <pt>
-Let's try DreamBooth with a [few images of a dog](https://drive.google.com/drive/folders/1BO_dyz-p65qhBRRMRA4TbZ8qW4rB99JZ); download and save them to a directory and then set the `INSTANCE_DIR` environment variable to that path:
+Let's try DreamBooth with a
+[few images of a dog](https://huggingface.co/datasets/diffusers/dog-example);
+download and save them to a directory and then set the `INSTANCE_DIR` environment variable to that path:
examples/dreambooth/README.md (+31 -12)
@@ -45,15 +45,28 @@ write_basic_config()
 
 ### Dog toy example
 
-Now let's get our dataset. Download images from [here](https://drive.google.com/drive/folders/1BO_dyz-p65qhBRRMRA4TbZ8qW4rB99JZ) and save them in a directory. This will be our training data.
+Now let's get our dataset. For this example we will use some dog images: https://huggingface.co/datasets/diffusers/dog-example.
 
-And launch the training using
+Let's first download it locally:
+
+```python
+from huggingface_hub import snapshot_download
+
+local_dir = "./dog"
+snapshot_download(
+    "diffusers/dog-example",
+    local_dir=local_dir, repo_type="dataset",
+    ignore_patterns=".gitattributes",
+)
+```
+
+And launch the training using:
 
 **___Note: Change the `resolution` to 768 if you are using the [stable-diffusion-2](https://huggingface.co/stabilityai/stable-diffusion-2) 768x768 model.___**
 
 ```bash
 export MODEL_NAME="CompVis/stable-diffusion-v1-4"
-export INSTANCE_DIR="path-to-instance-images"
+export INSTANCE_DIR="dog"
 export OUTPUT_DIR="path-to-save-model"
 
 accelerate launch train_dreambooth.py \
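As a side note on the `snapshot_download` snippet added above: it mirrors the `diffusers/dog-example` dataset repo into `local_dir`. A minimal sketch for sanity-checking the download, assuming the `./dog` path from the example; the extension set is an assumption about the dataset's contents:

```python
from pathlib import Path

# Assumes local_dir="./dog" as in the snippet above; the extensions are
# an assumption about which file types the dataset contains.
image_paths = sorted(
    p for p in Path("./dog").iterdir()
    if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
)
print(f"Found {len(image_paths)} instance images:")
for path in image_paths:
    print(" -", path.name)
```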
@@ -77,7 +90,7 @@ According to the paper, it's recommended to generate `num_epochs * num_samples`
 
 ```bash
 export MODEL_NAME="CompVis/stable-diffusion-v1-4"
-export INSTANCE_DIR="path-to-instance-images"
+export INSTANCE_DIR="dog"
 export CLASS_DIR="path-to-class-images"
 export OUTPUT_DIR="path-to-save-model"
 
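On the `num_epochs * num_samples` rule of thumb mentioned above, a quick illustrative calculation (both numbers are assumptions, not values from this PR):

```python
# Rule of thumb quoted above: generate about num_epochs * num_samples
# class images for prior preservation.
num_epochs = 2      # assumed number of training epochs
num_samples = 100   # assumed samples generated per epoch
print("Suggested class images:", num_epochs * num_samples)  # -> 200
```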
@@ -108,7 +121,7 @@ To install `bitsandbytes` please refer to this [readme](https://github.com/TimDet
 
 ```bash
 export MODEL_NAME="CompVis/stable-diffusion-v1-4"
-export INSTANCE_DIR="path-to-instance-images"
+export INSTANCE_DIR="dog"
 export CLASS_DIR="path-to-class-images"
 export OUTPUT_DIR="path-to-save-model"
 
@@ -141,7 +154,7 @@ It is possible to run dreambooth on a 12GB GPU by using the following optimizati
 
 ```bash
 export MODEL_NAME="CompVis/stable-diffusion-v1-4"
-export INSTANCE_DIR="path-to-instance-images"
+export INSTANCE_DIR="dog"
 export CLASS_DIR="path-to-class-images"
 export OUTPUT_DIR="path-to-save-model"
 
@@ -185,7 +198,7 @@ does not seem to be compatible with DeepSpeed at the moment.
 
 ```bash
 export MODEL_NAME="CompVis/stable-diffusion-v1-4"
-export INSTANCE_DIR="path-to-instance-images"
+export INSTANCE_DIR="dog"
 export CLASS_DIR="path-to-class-images"
 export OUTPUT_DIR="path-to-save-model"
 
@@ -217,7 +230,7 @@ ___Note: Training text encoder requires more memory, with this option the traini
 
 ```bash
 export MODEL_NAME="CompVis/stable-diffusion-v1-4"
-export INSTANCE_DIR="path-to-instance-images"
+export INSTANCE_DIR="dog"
 export CLASS_DIR="path-to-class-images"
 export OUTPUT_DIR="path-to-save-model"
 
@@ -300,7 +313,7 @@ Now, you can launch the training. Here we will use [Stable Diffusion 1-5](https:
@@ -342,6 +355,12 @@ The final LoRA embedding weights have been uploaded to [patrickvonplaten/lora_dr
 The training results are summarized [here](https://api.wandb.ai/report/patrickvonplaten/xm6cd5q5).
 You can use the `Step` slider to see how the model learned the features of our subject while the model trained.
 
+Optionally, we can also train additional LoRA layers for the text encoder. Specify the `train_text_encoder` argument above for that. If you're interested to know more about how we
+enable this support, check out this [PR](https://github.com/huggingface/diffusers/pull/2918).
+
+With the default hyperparameters from the above, the training seems to go in a positive direction. Check out [this panel](https://wandb.ai/sayakpaul/dreambooth-lora/reports/test-23-04-17-17-00-13---Vmlldzo0MDkwNjMy). The trained LoRA layers are available [here](https://huggingface.co/sayakpaul/dreambooth).
+
+
 
 ### Inference
 
 After training, LoRA weights can be loaded very easily into the original pipeline. First, you need to load the original pipeline:
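A minimal sketch of that loading step. It assumes the diffusers APIs of this era: `load_attn_procs` for UNet-only LoRA weights, and `load_lora_weights` (introduced in the PR linked above) when text encoder LoRA layers were trained as well; the model path is a placeholder.

```python
import torch
from diffusers import StableDiffusionPipeline

model_path = "path-to-save-model"  # placeholder: OUTPUT_DIR from training

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
)
pipe.to("cuda")

# UNet-only LoRA weights:
pipe.unet.load_attn_procs(model_path)

# Or, if --train_text_encoder was used, load both the UNet and text
# encoder LoRA layers (API from the PR referenced above):
# pipe.load_lora_weights(model_path)

image = pipe("A photo of sks dog in a bucket", num_inference_steps=25).images[0]
image.save("dog-bucket.png")
```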
examples/dreambooth/train_dreambooth_lora.py

 These are LoRA adaption weights for {base_model}. The weights were trained on {prompt} using [DreamBooth](https://dreambooth.github.io/). You can find some example images in the following. \n
 {img_str}
+
+LoRA for the text encoder was enabled: {train_text_encoder}.
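For context, the placeholders in the model card template above (`{base_model}`, `{prompt}`, `{img_str}`, `{train_text_encoder}`) are filled in when the script writes the model card. A minimal sketch of that step; the helper name and call site are illustrative, not the script's actual API:

```python
# Hypothetical helper: fills the model card template shown above via str.format().
def build_model_card(base_model: str, prompt: str, img_str: str,
                     train_text_encoder: bool) -> str:
    template = (
        "These are LoRA adaption weights for {base_model}. "
        "The weights were trained on {prompt} using "
        "[DreamBooth](https://dreambooth.github.io/). "
        "You can find some example images in the following. \n"
        "{img_str}\n"
        "\n"
        "LoRA for the text encoder was enabled: {train_text_encoder}.\n"
    )
    return template.format(
        base_model=base_model,
        prompt=prompt,
        img_str=img_str,
        train_text_encoder=train_text_encoder,
    )

print(build_model_card("CompVis/stable-diffusion-v1-4",
                       "a photo of sks dog", "", True))
```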