Model/Pipeline/Scheduler description
DiffEdit is a zero-shot inpainting method that works on top of any pretrained text-to-image denoising diffusion model; it requires no additional fine-tuning or extra parameters.
The main innovations introduced in DiffEdit are:
- Automatic mask inference: given the original and the edited conditioning texts, the spatial distribution of the difference between the noise estimates conditioned on each text highlights the regions predicted to change the most, and thresholding this difference map yields an edit mask (see the mask-inference sketch after this list).
- DDIM inversion: the input image is deterministically encoded using the model's own noise estimates, so that at each sampling step the unmasked region can be replaced with its correspondingly noised version from the inversion trajectory, rather than being mixed with random IID Gaussian noise as in SDEdit and RePaint (see the inversion sketch below).
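
A minimal sketch of the mask-inference step, written against a diffusers-style UNet and scheduler. The function name `infer_edit_mask`, the embedding arguments, and the hyperparameter defaults are illustrative assumptions, not the authors' code or an existing diffusers API.

```python
import torch

@torch.no_grad()
def infer_edit_mask(unet, scheduler, latents, src_emb, tgt_emb,
                    n_samples=10, strength=0.5, threshold=0.5):
    """Estimate which latent locations the edit text would change the most.

    Assumes scheduler.set_timesteps(...) has already been called.
    """
    # Pick an intermediate noise level; the paper adds noise at ~50% strength.
    t = scheduler.timesteps[int(len(scheduler.timesteps) * (1 - strength))]
    diffs = []
    for _ in range(n_samples):
        noise = torch.randn_like(latents)
        noisy = scheduler.add_noise(latents, noise, t)
        # Noise estimates conditioned on the original and on the edited text.
        eps_src = unet(noisy, t, encoder_hidden_states=src_emb).sample
        eps_tgt = unet(noisy, t, encoder_hidden_states=tgt_emb).sample
        diffs.append((eps_tgt - eps_src).abs().mean(dim=1, keepdim=True))
    # Average over noise draws, normalize, and binarize into an edit mask.
    diff = torch.stack(diffs).mean(dim=0)
    diff = diff / diff.amax()
    return (diff > threshold).float()
```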
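And a correspondingly hedged sketch of the deterministic encoding step, again assuming a diffusers-style UNet and a scheduler exposing an `alphas_cumprod` table; this is a simplified rendering of DDIM inversion, not a definitive implementation.

```python
import torch

@torch.no_grad()
def ddim_invert(unet, scheduler, latents, cond_emb):
    """Run the DDIM update in reverse (t=0 -> t=T), recording the trajectory
    so each step's latent can later re-noise the unmasked region."""
    trajectory = [latents]
    prev_alpha = torch.tensor(1.0)  # alpha_cumprod at t=0 is ~1 (clean image)
    for t in reversed(scheduler.timesteps):  # smallest timestep first
        eps = unet(latents, t, encoder_hidden_states=cond_emb).sample
        alpha = scheduler.alphas_cumprod[t]
        # Predict x_0 from the current (less noisy) latent, then step forward
        # to the next noise level using the same deterministic DDIM rule.
        x0 = (latents - (1 - prev_alpha).sqrt() * eps) / prev_alpha.sqrt()
        latents = alpha.sqrt() * x0 + (1 - alpha).sqrt() * eps
        trajectory.append(latents)
        prev_alpha = alpha
    return trajectory  # trajectory[i]: the input encoded to the i-th noise level
```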
Open source status
- The model implementation is available
- The model weights are available (only relevant if the addition is not a scheduler).
Provide useful links for the implementation
- Link to DiffEdit paper: https://arxiv.org/pdf/2210.11427.pdf
- Notebook implementation by @Xiang-cd based on the CompVis codebase: https://github.com/Xiang-cd/DiffEdit-stable-diffusion/blob/main/diffedit.ipynb. For better sample efficiency, it uses DPM-Solver as the deterministic sampler instead of DDIM.
- Model weights are not necessary for this pipeline, since it is a zero-shot method on top of a pretrained text-to-image denoising diffusion model.
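
To make the request concrete, here is a rough sketch of how such a pipeline might be invoked if it followed the usual diffusers conventions. The `custom_pipeline` id and the prompt keyword arguments are hypothetical, for illustration only; the fruits-to-pears edit is the example from the paper.

```python
import torch
from PIL import Image
from diffusers import DiffusionPipeline

# "diffedit" as a custom_pipeline id is an assumption; any pretrained
# text-to-image checkpoint would supply the frozen model weights.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    custom_pipeline="diffedit",
    torch_dtype=torch.float16,
).to("cuda")

init_image = Image.open("fruit_bowl.png").convert("RGB")

# source_prompt describes the input image; target_prompt describes the edit.
edited = pipe(
    image=init_image,
    source_prompt="a bowl of fruits",
    target_prompt="a bowl of pears",
    num_inference_steps=50,
).images[0]
edited.save("edited.png")
```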