[Community pipeline] SD3 Differential Diffusion Img2Img Pipeline #8679

Merged (7 commits) on Jun 29, 2024

asomoza
Member

@asomoza asomoza commented Jun 24, 2024

What does this PR do?

Add the differential diffusion SD3 community pipeline.

Needs #8678 to work

Fixes #8577

How to test:

Gradient

import torch

from diffusers import FlowMatchEulerDiscreteScheduler
from diffusers.utils import load_image
from examples.community.pipeline_stable_diffusion_3_differential_img2img import (
    StableDiffusion3DifferentialImg2ImgPipeline,
)


pipe = StableDiffusion3DifferentialImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

pipe.scheduler = FlowMatchEulerDiscreteScheduler.from_config(pipe.scheduler.config, shift=3.0)

prompt = "a green pear"

source_image = load_image(
    "https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/differential/20240329211129_4024911930.png"
)
# Named `gradient_map` to avoid shadowing Python's built-in `map`.
gradient_map = load_image(
    "https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/differential/gradient_mask_2.png"
)

image = pipe(
    prompt=prompt,
    negative_prompt="",
    image=source_image,
    num_inference_steps=28,
    guidance_scale=4.5,
    strength=1.0,
    map=gradient_map,
).images[0]
Images: source (20240329211129_4024911930), gradient (gradient_mask_2), result (20240624041537_1455171781)
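
For readers wondering how a grayscale `map` can control the amount of change per pixel: a common way to implement differential diffusion is to compare each map value against a threshold that rises with the step index, giving one binary mask per denoising step. The dependency-free sketch below illustrates that idea on a tiny 1-D "gradient". The function name `step_masks` and the linear threshold schedule are assumptions made for illustration, not code taken from this pipeline, and which end of the gradient means "more change" depends on the pipeline's own convention.

```python
def step_masks(change_map, num_steps):
    """Build one binary mask per denoising step from a map with values in [0, 1].

    A 1 means the pixel is still pinned to the (noised) source at that step;
    a 0 means the pixel is free to be denoised toward the prompt.
    Hypothetical helper: the threshold schedule is an assumption.
    """
    masks = []
    for step in range(num_steps):
        threshold = step / num_steps  # rises linearly from 0 toward 1
        masks.append([1 if value > threshold else 0 for value in change_map])
    return masks


# A five-pixel horizontal gradient from black (0.0) to white (1.0).
gradient = [0.0, 0.25, 0.5, 0.75, 1.0]
masks = step_masks(gradient, num_steps=4)

for step, mask in enumerate(masks):
    print(step, mask)
```

Under these assumptions, pixels with higher map values stay pinned to the source for more steps, so a smooth gradient produces a smooth transition between untouched and fully regenerated regions.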

Inpainting

import torch

from diffusers import FlowMatchEulerDiscreteScheduler
from diffusers.utils import load_image
from examples.community.pipeline_stable_diffusion_3_differential_img2img import (
    StableDiffusion3DifferentialImg2ImgPipeline,
)


pipe = StableDiffusion3DifferentialImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

pipe.scheduler = FlowMatchEulerDiscreteScheduler.from_config(pipe.scheduler.config, shift=3.0)

prompt = "Photorealistic close-up portrait of a Golden Retriever puppy. In the background, on the rug beside the puppy, a red ball, enticing the pup to play."
prompt_3 = "Photorealistic close-up portrait of a Golden Retriever puppy, around 4 months old, with a playful expression bathed in warm sunlight streaming through the window of a modern living room. Capture the soft texture of its fluffy fur, the glint of light in its big brown eyes, and a colorful bandana around its neck. In the background, on the rug beside the puppy, a red ball, enticing the pup to play."

source_image = load_image(
    "https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/differential/dog_source.png"
)
mask = load_image(
    "https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/differential/dog_inpainting_mask.png"
)

image = pipe(
    prompt=prompt,
    prompt_3=prompt_3,
    negative_prompt="",
    image=source_image,
    num_inference_steps=28,
    guidance_scale=4.5,
    strength=0.78,
    map=mask,
    max_sequence_length=512,
).images[0]

image.save("diff-diff-inpaint-result.png")
Images: source (dog_source), mask (dog_inpainting_mask), result (diff-diff-inpaint-result)

Depthmap

import torch

from diffusers import FlowMatchEulerDiscreteScheduler
from diffusers.utils import load_image
from examples.community.pipeline_stable_diffusion_3_differential_img2img import (
    StableDiffusion3DifferentialImg2ImgPipeline,
)


pipe = StableDiffusion3DifferentialImg2ImgPipeline.from_pretrained(
    "./models/stable_diffusion_3_medium",
    torch_dtype=torch.float16,
).to("cuda")

pipe.scheduler = FlowMatchEulerDiscreteScheduler.from_config(pipe.scheduler.config, shift=3.0)

prompt = "painting of a mountain landscape with a meadow and a forest, meadow background"

source_image = load_image(
    "https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/differential/meadow.jpg"
)
depth_map = load_image(
    "https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/differential/meadow_depth.png"
)

image = pipe(
    prompt=prompt,
    negative_prompt="",
    image=source_image,
    num_inference_steps=28,
    guidance_scale=4.5,
    strength=0.8,
    map=depth_map,
).images[0]

image.save("diff-diff-depth-result.png")
Images: source (meadow), depth map (meadow_depth), result (diff-diff-depth-result)

* Depth map made with Marigold and diffusers.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@yiyixuxu @vikm2o @exx8

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@vikm2o

vikm2o commented Jun 24, 2024

@asomoza Is the generator required?
When I run it, I still see the same issue:

--> 914     latents = original_with_noise[i] * mask + latents * (1 - mask)
    915 # end diff diff
    916 
    917 # expand the latents if we are doing classifier free guidance
    918 latent_model_input = torch.cat([latents] * 2) if self.do_classifier_free_guidance else latents

IndexError: index 1 is out of bounds for dimension 0 with size 1

I am using sd3-diff-diff branch.

@asomoza
Member Author

asomoza commented Jun 25, 2024

@vikm2o

> is generator required?

No, that got in by mistake when I copied my code.

You need to use the fix-flow-match-scale-noise branch in my repo for it to work. They're in separate branches since I need to do separate PRs.
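
The failing line in the traceback blends a pre-noised copy of the source latents, selected by step index, with the current latents. Below is a minimal sketch of that blend and of the failure mode, using plain Python lists in place of tensors. The assumption here, which the thread does not state explicitly, is that without the scheduler fix only a single noised copy is produced rather than one per step, so any index past 0 is out of bounds. The helper name `blend` and the toy values are hypothetical.

```python
def blend(original_noised, latents, mask):
    """Element-wise version of: original_noised * mask + latents * (1 - mask)."""
    return [src * m + cur * (1 - m) for src, cur, m in zip(original_noised, latents, mask)]


num_steps = 3
latents = [0.0, 0.0, 0.0, 0.0]  # current latents (toy values)
mask = [1, 1, 0, 0]             # 1 = keep the noised source, 0 = keep the current latents

# Expected layout: one noised copy of the source per denoising step.
original_with_noise = [[float(step + 1)] * 4 for step in range(num_steps)]
blended = blend(original_with_noise[1], latents, mask)
print(blended)

# Assumed failure mode: only one noised copy exists, so index 1 does not.
single_copy = original_with_noise[:1]
try:
    blend(single_copy[1], latents, mask)
    failed = False
except IndexError:
    failed = True
print("IndexError raised:", failed)
```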

@vikm2o

vikm2o commented Jun 25, 2024

@asomoza Thanks, that worked!

Collaborator

@yiyixuxu yiyixuxu left a comment

ready to merge too?

@asomoza
Member Author

asomoza commented Jun 27, 2024

I'll try to remove the additional code from it first: the torchvision dependency and the need to preprocess the images outside of the pipeline.

@asomoza asomoza requested a review from yiyixuxu June 29, 2024 00:30
@asomoza
Member Author

asomoza commented Jun 29, 2024

@yiyixuxu it's ready now

@yiyixuxu yiyixuxu merged commit 9b7acc7 into huggingface:main Jun 29, 2024
7 of 8 checks passed
@yiyixuxu
Collaborator

merged! it is really nice!

Development

Successfully merging this pull request may close these issues.

Differential Diffusion with SD3
4 participants