
Added brownian_noise to DMP 2++ SDE Scheduler and fixed use_exponential_sigmas behaviour #9955


Open
Crized-bit wants to merge 2 commits into base: main

Conversation


@Crized-bit Crized-bit commented Nov 18, 2024

What does this PR do?

Added the option to supply a custom Brownian noise generator, as is done in the AUTO1111 webui.

Also fixed crashes that occur when use_exponential_sigmas is enabled (a NoneType error and an ordering issue).
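
For context on `use_exponential_sigmas`: the option is generally understood to select a noise schedule whose sigmas are evenly spaced in log space, running from the largest sigma down to the smallest. The sketch below illustrates only that idea. It is not the scheduler's code, the function name `exponential_sigmas` is hypothetical, and the endpoint values are illustrative rather than taken from this PR.

```python
import math
from typing import List


def exponential_sigmas(sigma_min: float, sigma_max: float, num_steps: int) -> List[float]:
    """Return num_steps sigmas from sigma_max down to sigma_min, evenly spaced in log space.

    Hypothetical helper for illustration; not part of diffusers.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    if not 0.0 < sigma_min <= sigma_max:
        raise ValueError("expected 0 < sigma_min <= sigma_max")
    if num_steps == 1:
        return [sigma_max]

    log_max = math.log(sigma_max)
    log_min = math.log(sigma_min)
    # Constant step in log space means a constant ratio between neighbouring sigmas.
    step = (log_min - log_max) / (num_steps - 1)
    return [math.exp(log_max + i * step) for i in range(num_steps)]


# Illustrative endpoints (assumed values, not read from any model config).
sigmas = exponential_sigmas(sigma_min=0.03, sigma_max=14.6, num_steps=5)
# Appending a final 0.0 mirrors the intent of final_sigmas_type="zero".
sigmas.append(0.0)
print(sigmas)
```

The descending order matters because a sampler walks from high noise to low noise, so a schedule returned in the opposite order would have to be flipped before use.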


Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@yiyixuxu

@sayakpaul sayakpaul requested a review from yiyixuxu November 20, 2024 05:41
@sayakpaul
Member

@Crized-bit thanks for your PR!

Also fixed crashes that occur when use_exponential_sigmas is enabled (a NoneType error and an ordering issue).

Could you provide a minimal reproducible snippet so we can understand this better?

@Crized-bit
Author

Crized-bit commented Nov 20, 2024

Here it is! This is the code that triggered the issues I've fixed:

```python
import os
from typing import Callable

import cv2
import numpy as np
import torch
from compel import Compel, ReturnedEmbeddingsType
from diffusers import ControlNetModel, DPMSolverMultistepScheduler  # type: ignore
from diffusers import StableDiffusionXLControlNetPipeline  # type: ignore
from PIL.Image import fromarray
from tqdm import tqdm

def get_compel_embeddings(pipeline, device: torch.device) -> Callable[[str, str], tuple]:
    """Build a function that turns a (positive, negative) prompt pair into SDXL embeddings."""
    compel = Compel(
        tokenizer=[pipeline.tokenizer, pipeline.tokenizer_2],
        text_encoder=[pipeline.text_encoder, pipeline.text_encoder_2],
        returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
        requires_pooled=[False, True],
        truncate_long_prompts=False,
        device=device,
    )  # type: ignore

    def return_embeddings(positive_prompt: str, negative_prompt: str):
        conditioning, pooled_prompt_embeds = compel(positive_prompt)
        negative_conditioning, negative_pooled_prompt_embeds = compel(negative_prompt)
        # Pad so the positive and negative embeddings share the same sequence length
        con_embeds, neg_embeds = compel.pad_conditioning_tensors_to_same_length(
            [conditioning, negative_conditioning]
        )

        return con_embeds, pooled_prompt_embeds, neg_embeds, negative_pooled_prompt_embeds

    return return_embeddings


# Default parameters block
DEVICE = torch.device("cuda:2")
output_folder = "/home/Diffusion_VK/results/longClip/"
control_folder = "/home/dataset/Games/"

# ControlNets block
controlnets = ControlNetModel.from_pretrained(
    "/home/web_ui/stable-diffusion-webui/models/ControlNet/controlnet_tile",
    torch_dtype=torch.bfloat16,
    use_safetensors=True,
)

# Creating a pipe
pipeline = StableDiffusionXLControlNetPipeline.from_single_file(
    # "/home/web_ui/stable-diffusion-webui/models/Stable-diffusion/cyberrealisticXL_v22.safetensors",
    "/home/web_ui/stable-diffusion-webui/models/Stable-diffusion/juggernautXL_juggXIByRundiffusion.safetensors",
    use_safetensors=True,
    torch_dtype=torch.bfloat16,
    controlnet=controlnets,
)

# Swap in the SDE variant of DPM-Solver++ with exponential sigmas (the failing configuration)
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(
    pipeline.scheduler.config,
    algorithm_type="sde-dpmsolver++",
    final_sigmas_type="zero",
    timestep_spacing="linspace",
    use_exponential_sigmas=True,
)

# Move the pipeline to the GPU
pipeline.to(DEVICE)

# Create embedder instance to generate embeddings
embedder = get_compel_embeddings(pipeline=pipeline,
                                 device=DEVICE)
# Dataset inference
for filename in tqdm(sorted(os.listdir(control_folder))):
    # Skip images that have no matching caption file
    caption_path = os.path.join("/home/dataset/Games_captions", os.path.splitext(filename)[0] + ".txt")
    if not os.path.isfile(caption_path):
        continue
    with open(caption_path, "r", encoding="utf-8") as f:
        positive_prompt = f.read()

    negative_prompt = 'bad eyes, cgi, airbrushed, plastic, deformed, watermark, polygons, inaccurate body, worst quality, low quality, normal quality, jpeg artifacts, unrealistic, flat, triangular, low resolution, bad composition'

    # OpenCV loads images as BGR, so convert to RGB before handing the array to PIL
    img = cv2.imread(os.path.join(control_folder, filename))
    controlnet_input = fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

    con_embeds, pooled_prompt_embeds, neg_embeds, negative_pooled_prompt_embeds = embedder(
        positive_prompt, negative_prompt
    )

    # Generate image
    result_img = pipeline(
        prompt_embeds=con_embeds,
        pooled_prompt_embeds=pooled_prompt_embeds,
        negative_prompt_embeds=neg_embeds,
        negative_pooled_prompt_embeds=negative_pooled_prompt_embeds,
        image=controlnet_input,
        generator=torch.Generator(device=DEVICE).manual_seed(1787813600),
        num_inference_steps=30,
        controlnet_conditioning_scale=0.5,
        guidance_scale=5.0,
    ).images[0]  # type: ignore

    # Collect the arrays to concatenate horizontally, then write the result as BGR
    images = []
    # images.append(np.array(controlnet_input))
    images.append(np.array(result_img))
    cv2.imwrite(os.path.join(output_folder, filename), cv2.cvtColor(cv2.hconcat(images), cv2.COLOR_RGB2BGR))
```

@yiyixuxu
Collaborator

Was it fixed in #9954?

@Crized-bit
Author

Was it fixed in #9954?

Seems so! So, just added noise

@sayakpaul
Member

Seems so! So, just added noise

What does this mean?

@Crized-bit
Author

Seems so! So, just added noise

What does this mean?

This means that half of my PR was already fixed by another PR (#9954), but the part where I've added brownian_noise (as it's done in k-diffusion and AUTO-1111) is unique and still needs review.
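
To make the Brownian noise idea concrete: instead of drawing fresh independent noise at every step, an SDE sampler can read its noise from one fixed Brownian path, so the noise over an interval stays the same however that interval is later subdivided. The sketch below shows this for a single scalar using only the standard library. It is neither the code in this PR nor the k-diffusion implementation; the class `BrownianPath` and its methods are hypothetical names, and a real sampler would work on tensors and index the path by sigma.

```python
import bisect
import math
import random


class BrownianPath:
    """Lazily sampled scalar Brownian motion W(t) with W(0) = 0.

    Hypothetical illustration. Sampled values are cached, and a new point that
    falls between two cached points is drawn from a Brownian bridge, so
    increments remain consistent when an interval is subdivided.
    """

    def __init__(self, seed: int) -> None:
        self._rng = random.Random(seed)
        self._times = [0.0]   # sorted times that have been sampled
        self._values = [0.0]  # W at those times

    def value(self, t: float) -> float:
        if t < 0.0:
            raise ValueError("t must be non-negative")
        i = bisect.bisect_left(self._times, t)
        if i < len(self._times) and self._times[i] == t:
            return self._values[i]  # already sampled
        if i == len(self._times):
            # Past the last cached time: add an independent Gaussian increment.
            t_prev, w_prev = self._times[-1], self._values[-1]
            w = w_prev + self._rng.gauss(0.0, math.sqrt(t - t_prev))
        else:
            # Between two cached times: sample from the Brownian bridge.
            ta, tb = self._times[i - 1], self._times[i]
            wa, wb = self._values[i - 1], self._values[i]
            mean = wa + (t - ta) / (tb - ta) * (wb - wa)
            var = (t - ta) * (tb - t) / (tb - ta)
            w = self._rng.gauss(mean, math.sqrt(var))
        self._times.insert(i, t)
        self._values.insert(i, w)
        return w

    def noise(self, t0: float, t1: float) -> float:
        """Unit-variance noise for the step from t0 to t1."""
        if t0 == t1:
            raise ValueError("t0 and t1 must differ")
        return (self.value(t1) - self.value(t0)) / math.sqrt(abs(t1 - t0))


path = BrownianPath(seed=0)
coarse = path.value(1.0) - path.value(0.0)
# Subdividing the interval afterwards does not change the total increment.
fine = (path.value(0.5) - path.value(0.0)) + (path.value(1.0) - path.value(0.5))
print(abs(coarse - fine) < 1e-12)
```

Dividing the increment by the square root of the interval length is what lets `noise` stand in for a standard normal sample at each step while still being tied to a single seeded path.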

@sayakpaul
Member

@yiyixuxu could you take a look?

Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Labels: stale (issues that haven't received updates)
3 participants