
DiffEdit: Diffusion-based semantic image editing with mask guidance #2800

Closed
@clarencechen

Description

Model/Pipeline/Scheduler description

DiffEdit is a zero-shot, mask-guided inpainting method that works with any pretrained text-to-image denoising diffusion model; it requires no additional fine-tuning or extra parameters.

The main innovations introduced in DiffEdit are:

  • Automated mask inference: given the original and the new conditioning text, the spatial distribution of the difference between the two conditioned noise estimates highlights the locations predicted to change the most, yielding an edit mask with no user input (see the first sketch below).
  • Inverse (DDIM) sampling of the input image: the model-estimated noise provides a noised version of the unmasked portion of the input image at each sampling step, rather than mixing the unmasked portion with random IID Gaussian noise as in SDEdit and RePaint (see the second sketch below).
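
A minimal PyTorch sketch of the mask-inference step, assuming a diffusers-style noise-prediction UNet called as `unet(latents, t, encoder_hidden_states=...)` with a `.sample` output field; `infer_edit_mask`, `add_noise`, and the sample-count/threshold defaults are illustrative assumptions, not the paper's exact implementation:

```python
import torch

def add_noise(latents, noise, alpha_bar_t):
    # Forward diffusion: x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps
    return alpha_bar_t.sqrt() * latents + (1.0 - alpha_bar_t).sqrt() * noise

@torch.no_grad()
def infer_edit_mask(unet, latents, timesteps, alphas_cumprod,
                    emb_source, emb_target, n_samples=10, threshold=0.5):
    """Average |eps_target - eps_source| over several noised copies of the
    input, then normalize and binarize to obtain the edit mask."""
    diffs = []
    for _ in range(n_samples):
        idx = torch.randint(len(timesteps), (1,)).item()
        t = timesteps[idx]                      # 1-D LongTensor of timesteps
        a_bar = alphas_cumprod[t]               # cumulative alpha at step t
        noisy = add_noise(latents, torch.randn_like(latents), a_bar)
        eps_src = unet(noisy, t, encoder_hidden_states=emb_source).sample
        eps_tgt = unet(noisy, t, encoder_hidden_states=emb_target).sample
        # Collapse channels: per-pixel magnitude of the estimate difference.
        diffs.append((eps_tgt - eps_src).abs().mean(dim=1, keepdim=True))
    diff = torch.stack(diffs).mean(dim=0)
    diff = (diff - diff.min()) / (diff.max() - diff.min() + 1e-8)
    return (diff > threshold).float()  # 1 where the edit should be applied
```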

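Continuing the sketch above, the mask-guided decoding could look as follows, assuming `inverted` maps each timestep to the DDIM-inverted latent of the input image and `ddim_step` is a hypothetical deterministic DDIM update (both are stand-ins, not a fixed API):

```python
@torch.no_grad()
def masked_decode(unet, inverted, timesteps, emb_target, mask):
    # Start decoding from the deepest inverted latent of the input image.
    latents = inverted[int(timesteps[0])]
    for i, t in enumerate(timesteps):
        eps = unet(latents, t, encoder_hidden_states=emb_target).sample
        latents = ddim_step(latents, eps, t)  # hypothetical DDIM update
        if i + 1 < len(timesteps):
            # Pin the unmasked region to the inverted trajectory of the
            # original image instead of mixing in fresh Gaussian noise.
            t_next = int(timesteps[i + 1])
            latents = mask * latents + (1.0 - mask) * inverted[t_next]
    return latents
```

Because the unmasked region follows the exact inverted trajectory of the input, the background is reconstructed far more faithfully than with SDEdit-style renoising.
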
Open source status

  • The model implementation is available
  • The model weights are available (Only relevant if addition is not a scheduler).

Provide useful links for the implementation
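
  • Paper: https://arxiv.org/abs/2210.11427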
