
Commit 96e7254

add: entry in the readme and docs.
1 parent db8bbbd commit 96e7254

2 files changed: 38 additions, 0 deletions


docs/source/en/training/text2image.mdx

Lines changed: 22 additions & 0 deletions
@@ -155,6 +155,28 @@ python train_text_to_image_flax.py \
 </jax>
 </frameworkcontent>
 
+## Training with better convergence
+
+We support training with the Min-SNR weighting strategy proposed in [Efficient Diffusion Training via Min-SNR Weighting Strategy](https://arxiv.org/abs/2303.09556), which helps achieve faster convergence by rebalancing the loss. To use it, set the `--snr_gamma` argument; the recommended value is 5.0.
+
+You can find [this project on Weights and Biases](https://wandb.ai/sayakpaul/text2image-finetune-minsnr) that compares the loss surfaces of the following setups:
+
+* Training without the Min-SNR weighting strategy
+* Training with the Min-SNR weighting strategy (`snr_gamma` set to 5.0)
+* Training with the Min-SNR weighting strategy (`snr_gamma` set to 1.0)
+
+For our small Pokemons dataset, the effects of the Min-SNR weighting strategy may not be very pronounced, but we expect them to be clearer on larger datasets.
+
+Also note that in this example we predict either `epsilon` (i.e., the noise) or `v_prediction`; the Min-SNR weighting formulation we use holds for both cases.
+
+<Tip warning={true}>
+
+Training with the Min-SNR weighting strategy is only supported in PyTorch.
+
+</Tip>
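As a minimal sketch of what this weighting does (not the exact code in the training script), the per-timestep weight min(SNR, gamma) / SNR for `epsilon` prediction can be computed from a scheduler's cumulative alphas, using SNR(t) = alpha_bar_t / (1 - alpha_bar_t); the beta schedule below is a toy assumption:

```python
import torch

def min_snr_loss_weights(alphas_cumprod, timesteps, snr_gamma=5.0):
    """Per-timestep loss weights min(SNR, gamma) / SNR for epsilon prediction."""
    alpha_bar = alphas_cumprod[timesteps]
    snr = alpha_bar / (1.0 - alpha_bar)  # SNR(t) = alpha_bar_t / (1 - alpha_bar_t)
    return torch.minimum(snr, torch.full_like(snr, snr_gamma)) / snr

# Toy DDPM-style linear beta schedule with 1000 steps (illustrative only).
betas = torch.linspace(1e-4, 2e-2, 1000)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

weights = min_snr_loss_weights(alphas_cumprod, torch.tensor([10, 500, 990]))
# Low-noise timesteps (high SNR) get down-weighted; very noisy timesteps,
# whose SNR is already below gamma, keep weight 1.
```

Down-weighting the easy, low-noise timesteps is what rebalances the loss: without it, their large SNR lets them dominate training.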
 ## LoRA
 
 You can also use Low-Rank Adaptation of Large Language Models (LoRA), a fine-tuning technique for accelerating the training of large models, for fine-tuning text-to-image models. For more details, take a look at the [LoRA training](lora#text-to-image) guide.

examples/text_to_image/README.md

Lines changed: 16 additions & 0 deletions
@@ -111,6 +111,22 @@ image = pipe(prompt="yoda").images[0]
 image.save("yoda-pokemon.png")
 ```
 
+#### Training with better convergence
+
+We support training with the Min-SNR weighting strategy proposed in [Efficient Diffusion Training via Min-SNR Weighting Strategy](https://arxiv.org/abs/2303.09556), which helps achieve faster convergence by rebalancing the loss. To use it, set the `--snr_gamma` argument; the recommended value is 5.0.
+
+You can find [this project on Weights and Biases](https://wandb.ai/sayakpaul/text2image-finetune-minsnr) that compares the loss surfaces of the following setups:
+
+* Training without the Min-SNR weighting strategy
+* Training with the Min-SNR weighting strategy (`snr_gamma` set to 5.0)
+* Training with the Min-SNR weighting strategy (`snr_gamma` set to 1.0)
+
+For our small Pokemons dataset, the effects of the Min-SNR weighting strategy may not be very pronounced, but we expect them to be clearer on larger datasets.
+
+Also note that in this example we predict either `epsilon` (i.e., the noise) or `v_prediction`; the Min-SNR weighting formulation we use holds for both cases.
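The three setups compared above differ only in the per-timestep loss weight. A minimal sketch (hypothetical helper names, assuming the standard SNR(t) = alpha_bar_t / (1 - alpha_bar_t) definition) of what each setup multiplies the `epsilon`-prediction loss term by:

```python
from typing import Optional

def snr(alpha_bar: float) -> float:
    """Signal-to-noise ratio at a timestep with cumulative alpha `alpha_bar`."""
    return alpha_bar / (1.0 - alpha_bar)

def loss_weight(alpha_bar: float, snr_gamma: Optional[float]) -> float:
    """Min-SNR loss weight for epsilon prediction; None means no reweighting."""
    if snr_gamma is None:
        return 1.0
    s = snr(alpha_bar)
    return min(s, snr_gamma) / s

# A low-noise timestep (alpha_bar close to 1) dominates the unweighted loss;
# Min-SNR caps its contribution, and a smaller gamma caps it harder.
alpha_bar = 0.99  # SNR = 99
print(loss_weight(alpha_bar, None))  # no reweighting -> 1.0
print(loss_weight(alpha_bar, 5.0))  # snr_gamma=5.0 -> 5/99
print(loss_weight(alpha_bar, 1.0))  # snr_gamma=1.0 -> 1/99
```

Timesteps whose SNR is already below `snr_gamma` are untouched (weight 1), so the strategy only tempers the easy, low-noise end of the schedule.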
 ## Training with LoRA
 
 Low-Rank Adaptation of Large Language Models was first introduced by Microsoft in [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685) by *Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen*.
