[Scheduler] Add Variational Diffusion Models (VDM) scheduler #7737

hummat · 2024-04-22T09:56:12Z

What does this PR do?

This PR adds a discrete and continuous time scheduler based on the Variational Diffusion Models (VDM) formulation, i.e. expressing the diffusion process via the signal-to-noise ratio.
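
As a rough illustration of what expressing the diffusion process via the signal-to-noise ratio means, the sketch below converts a log-SNR value into the signal and noise scales of a variance-preserving process, where alpha² = sigmoid(log SNR) and sigma² = sigmoid(-log SNR). The helper names and the linear schedule endpoints are assumptions made for this example; they are not taken from `scheduling_vdm.py`.

```python
import math
from typing import Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def alpha_sigma_from_log_snr(log_snr: float) -> Tuple[float, float]:
    """Variance-preserving scales: alpha^2 + sigma^2 = 1, alpha^2 / sigma^2 = exp(log_snr)."""
    alpha = math.sqrt(sigmoid(log_snr))
    sigma = math.sqrt(sigmoid(-log_snr))
    return alpha, sigma


def linear_log_snr(t: float, log_snr_max: float = 10.0, log_snr_min: float = -10.0) -> float:
    """Hypothetical linear log-SNR schedule over continuous time t in [0, 1].

    The endpoint values are placeholders chosen for illustration only.
    """
    return log_snr_max + t * (log_snr_min - log_snr_max)


# The signal scale shrinks and the noise scale grows as t goes from 0 to 1.
for t in (0.0, 0.5, 1.0):
    alpha, sigma = alpha_sigma_from_log_snr(linear_log_snr(t))
    print(f"t={t:.1f} alpha={alpha:.4f} sigma={sigma:.4f}")
```

Because the schedule is a function of continuous `t`, nothing in this sketch depends on a fixed number of training timesteps.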

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you read our philosophy doc (important for complex PRs)?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@yiyixuxu

This commit adds a new scheduler called VDMScheduler to the diffusers/schedulers module. The VDMScheduler is a class that implements a scheduling algorithm for denoising models. It takes into account parameters such as the number of training timesteps, beta schedule, prediction type, timestep spacing, and steps offset. The VDMScheduler class provides methods for scaling model input, adding noise to samples, and stepping through the denoising process. The commit also includes a new file, scheduling_vdm.py, which contains the implementation of the VDMScheduler class.

…vdm-scheduler

- Refactored the initialization of self.beta_start and self.beta_end as continuous schedules in self._log_snr are fitted to these values.
- Added self.num_inference_steps and self.timesteps attributes.
- Updated the implementation of self.alphas_cumprod and self.sigmas.
- Modified the implementation of self.log_snr to use the cached version for discrete timesteps and inference.
- Updated the implementation of the beta_schedule methods.
- Modified the implementation of the forward method to handle discrete timesteps correctly.
- Updated the implementation of the forward method to handle different types of timestep inputs.

These changes improve the functionality and readability of the VDMScheduler class in scheduling_vdm.py.

…vdm-scheduler

- Refactored the `__len__` method to return a default value of 1000 if `num_inference_steps` and `num_train_timesteps` are not set.
- Added input validation for `t` in the `_log_snr` method to ensure it is within the range [0, 1].
- Normalized `timesteps` to the range [0, 1] in the `__call__` method if `self.timesteps` is None.
- Simplified the calculation of `prev_timestep` in the `__call__` method.
- Updated the error message in the `__call__` method to specify that `prediction_type` must be either `epsilon` or `sample`.

These changes improve the readability and maintainability of the code, and ensure proper input validation and normalization.
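
The normalization and range check described in this commit might look roughly like the following. This is a minimal sketch: the function name and the choice to divide by `num_train_timesteps` are assumptions, since the commit message does not say how the PR performs the mapping.

```python
def normalize_timestep(timestep: float, num_train_timesteps: int = 1000) -> float:
    """Map a discrete timestep index onto continuous time in [0, 1].

    Assumption: discrete indices are divided by the number of training timesteps.
    """
    t = timestep / num_train_timesteps
    if not 0.0 <= t <= 1.0:
        raise ValueError(f"t must be within [0, 1], got {t}")
    return t


print(normalize_timestep(250))  # a quarter of the way through the schedule
```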

- Remove unused imports and variables
- Move log_snr function outside of the class
- Simplify the calculation of alphas_cumprod and sigmas
- Remove redundant code for setting timesteps
- Normalize discrete timesteps to [0, 1]
- Simplify the computation of alpha, sigma, prev_alpha, and prev_sigma
- Remove unnecessary check for timestep type
- Simplify the noise calculation
- Remove unnecessary variance initialization
- Improve code readability and maintainability

Together, these changes refactor the VDMScheduler class in scheduling_vdm.py.

- Move the `log_snr` function from the class to a separate method.
- Add type annotations to the `log_snr` method.
- Remove the `index_for_timestep` method, as it is copied from another class.
- Remove the normalization of timesteps in the `scale_model_input` method, as it is already done in the `log_snr` method.

These changes improve code organization and maintainability.

- Add `clip_sample` boolean flag to control whether to clip the predicted original sample within a specified range.
- Add `thresholding` boolean flag to enable dynamic thresholding of the predicted original sample.
- Add `dynamic_thresholding_ratio` to determine the percentile value for dynamic thresholding.
- Add `clip_sample_range` to specify the range for clipping the predicted original sample.
- Add `sample_max_value` to set the maximum value for dynamic thresholding.
- Implement `_threshold_sample` method to perform dynamic thresholding on the sample.
- Modify `scale_model_input` method to use the new `clip_sample` and `thresholding` flags.
- Refactor the computation of predicted original and previous samples in the `__call__` method.
- Add noise to the predicted previous sample if `noise_scale` is greater than 0.

These changes enhance the flexibility and control over the sample generation process in VDMScheduler.
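
For readers unfamiliar with the two options, the sketch below contrasts plain clipping with dynamic thresholding as commonly described: take a per-sample percentile of the absolute values, clamp it to [1, `sample_max_value`], clip to that bound, and rescale. It uses NumPy rather than PyTorch to stay self-contained, and the function names and default values are illustrative assumptions, not the PR's `_threshold_sample`.

```python
import numpy as np


def clip_sample(sample: np.ndarray, clip_sample_range: float = 1.0) -> np.ndarray:
    """Static clipping of the predicted original sample."""
    return np.clip(sample, -clip_sample_range, clip_sample_range)


def threshold_sample(
    sample: np.ndarray,
    dynamic_thresholding_ratio: float = 0.995,
    sample_max_value: float = 1.0,
) -> np.ndarray:
    """Dynamic thresholding: clip to a per-sample percentile s, then divide by s."""
    batch_size = sample.shape[0]
    flat = sample.reshape(batch_size, -1)
    # Per-sample percentile of absolute values.
    s = np.quantile(np.abs(flat), dynamic_thresholding_ratio, axis=1)
    # Never shrink below 1 and never exceed sample_max_value.
    s = np.clip(s, 1.0, sample_max_value)[:, None]
    flat = np.clip(flat, -s, s) / s
    return flat.reshape(sample.shape)


x0 = np.array([[0.5, -4.0, 2.0, 0.1]])
print(clip_sample(x0))
print(threshold_sample(x0, sample_max_value=3.0))
```

With `sample_max_value=1.0` the bound `s` is always 1, so dynamic thresholding reduces to clipping to [-1, 1] in this sketch.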

…vdm-scheduler

- Refactored the add_noise() method to use torch.Tensor instead of specific tensor types.
- Updated the step() method to handle batched inputs.
- Modified the computation of predicted original sample x_0 in the step() method.
- Adjusted the addition of noise in the step() method to handle noise_scale > 0.

These changes improve the flexibility and efficiency of the VDMScheduler class in the scheduling_vdm.py file.

- Refactored the computation of the predicted previous sample in the VDMScheduler class.
- Added a condition to include the prediction type "sample" in the computation.
- The computation now considers thresholding, clipping, and the prediction type "sample" to calculate the predicted previous sample.
- This change improves the accuracy of the predicted previous sample in certain scenarios.

This commit refactors the VDMScheduler class in the scheduling_vdm.py file. The class now includes docstrings that provide detailed explanations of the class and its methods. The log_snr method calculates the logarithm of the signal-to-noise ratio for given timesteps. The get_timesteps method generates an array of timesteps based on the configured spacing method. The set_timesteps method sets the discrete timesteps used for the diffusion chain. The add_noise method adds noise to the original samples according to the noise schedule and specified timesteps. The step method performs a single step of the diffusion process, computing the previous sample and optionally the predicted original sample. These changes improve code readability and maintainability.

…vdm-scheduler

…in diffusers/schedulers/scheduling_vdm.py

- Added "VDMScheduler" to the list of imported schedulers in diffusers/__init__.py.
- Implemented the VDMScheduler class in diffusers/schedulers/scheduling_vdm.py.
- Added the log_snr() function to calculate the logarithm of the signal-to-noise ratio (SNR) for given time steps and beta schedule.
- Created the VDMSchedulerOutput class as the output for the scheduler's step function.
- Implemented the VDMScheduler class with necessary methods and attributes.
- Added test cases for VDMScheduler in tests/schedulers/test_scheduler_vdm.py.
- Updated the test cases in tests/schedulers/test_schedulers.py to include VDMScheduler.

- Remove "VQDiffusionScheduler" from the import structure in "__init__.py" of the "diffusers" module.
- Remove "VQDiffusionScheduler" from the import structure in "__init__.py" of the "schedulers" sub-module.
- Remove "VQDiffusionScheduler" import from "scheduling_vq_diffusion.py" in the "schedulers" sub-module.
- Update the "VDMScheduler" class in "scheduling_vdm.py" in the "schedulers" sub-module:
  - Add type hints and remove unused imports.
  - Refactor the "log_snr" function for better readability.
  - Refactor the "__init__" method for better readability.
  - Refactor the "set_timesteps" method for better readability.
  - Refactor the "add_noise" method for better readability.
  - Refactor the "step" method for better readability.
- Update the test cases in "test_schedulers.py" to reflect the changes.

These changes refactor the import structure and improve the readability of the code in the "diffusers" module and the "schedulers" sub-module.

This commit adds the VDMScheduler and VDMSchedulerOutput classes to the API documentation. The VDMScheduler class is based on the Variational Diffusion Models (VDM) paper, which introduces diffusion-based generative models for image density estimation. These additions provide users with information on how to use these classes and their functionalities. The commit also includes the necessary copyright and license information.

yiyixuxu · 2024-05-20T17:05:02Z

thanks for the PR!
can we see some examples with this scheduler?

yiyixuxu · 2024-05-20T17:06:09Z

also can you provide more information? when would you recommend this scheduler and how does it compare to the default ones?

hummat · 2024-05-21T06:50:45Z

I implemented it primarily to reproduce results from the original Variational Diffusion Models paper and the Locally Attentional SDF Diffusion for Controllable 3D Shape Generation paper. I didn't use it to train on image data, so I can't provide any generations.

Performance-wise, it sits somewhere between the EDM and DDPM formulations, I would say. It outperforms other schedulers in terms of log-likelihood.

The continuous-time formulation doesn't require choosing the number of train timesteps, so there is one hyperparameter less. It could also serve as an example of how to incorporate continuous time into the other schedulers.

…duler

- Removed the unused `__len__` method.
- Removed the `add_noise` method as it is no longer used.
- Added a new method `get_velocity` to calculate the velocity based on the noise and sample.
- Added support for a new prediction type called `v_prediction`.

These changes improve the code structure and add functionality to the VDMScheduler class.

- Refactored the `get_velocity` method to correctly handle the device of the input tensors.
- Removed unnecessary comment and reshaping code in `get_velocity` method.
- Modified the condition in the `if` statement in the `__init__` method to use the `!=` operator instead of `==` for `prediction_type`.

These changes improve the code readability and ensure correct device handling for tensors in the VDMScheduler class.

The commit fixes a bug in the calculation of `pred_original_sample` in the `VDMScheduler` class. Previously, the calculation used addition, but it should use subtraction. This bug affected the `v_prediction` prediction type. The fix ensures that the calculation is correct and consistent with the intended behavior of the `v_prediction` type.
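
The sign matters because of how the velocity is defined. Assuming the usual variance-preserving convention (x_t = alpha * x_0 + sigma * eps, v = alpha * eps - sigma * x_0, with alpha² + sigma² = 1), the original sample is recovered as x_0 = alpha * x_t - sigma * v. The scalar sketch below checks this numerically; the function signatures are simplified for illustration and do not match the PR's `get_velocity`.

```python
import math


def get_velocity(sample: float, noise: float, alpha: float, sigma: float) -> float:
    """v = alpha * eps - sigma * x_0 (standard v-prediction target)."""
    return alpha * noise - sigma * sample


def pred_original_from_v(noisy_sample: float, v: float, alpha: float, sigma: float) -> float:
    """x_0 = alpha * x_t - sigma * v; note the subtraction."""
    return alpha * noisy_sample - sigma * v


alpha, sigma = math.sqrt(0.8), math.sqrt(0.2)  # alpha^2 + sigma^2 = 1
x0, eps = 0.3, -1.2
x_t = alpha * x0 + sigma * eps  # forward diffusion of x0

v = get_velocity(x0, eps, alpha, sigma)
recovered = pred_original_from_v(x_t, v, alpha, sigma)
with_addition = alpha * x_t + sigma * v  # the sign error described above

print(f"x0={x0} recovered={recovered:.6f} with_addition={with_addition:.6f}")
```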

github-actions · 2024-09-14T15:15:03Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

vladmandic · 2024-09-14T15:23:43Z

what is the status of this pr?

hummat · 2024-09-14T20:48:15Z

@yiyixuxu

yiyixuxu · 2024-09-17T21:16:15Z

@hummat would you be able to provide a comparison? like for sdxl, output with vdm, ddpm and dpm with same number of steps etc
otherwise I can look into this when I have a bit more time!

vladmandic · 2024-09-18T12:53:02Z

quick grid compare with few other samplers
imo, has a lot of potential because it's so different, but right now math is a bit broken.

first, need to add `order: int = 1` to class init as it's looked up by pipelines even if it's not used by sampler
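
A minimal sketch of why a missing `order` breaks pipelines, assuming the common pattern in which a pipeline computes its warmup steps from `scheduler.order`. `SketchScheduler` and `num_warmup_steps` are hypothetical stand-ins written for this example, not diffusers classes or the PR's code.

```python
class SketchScheduler:
    """Hypothetical first-order scheduler exposing the attribute pipelines look up."""

    order = 1  # one model evaluation per sampling step

    def __init__(self, num_inference_steps: int = 20):
        # Continuous timesteps running from 1.0 down towards 0.0.
        self.timesteps = [1.0 - i / num_inference_steps for i in range(num_inference_steps)]


def num_warmup_steps(scheduler, num_inference_steps: int) -> int:
    """Progress-bar bookkeeping of the kind a pipeline performs; needs scheduler.order."""
    return max(len(scheduler.timesteps) - num_inference_steps * scheduler.order, 0)


scheduler = SketchScheduler(num_inference_steps=20)
print(num_warmup_steps(scheduler, 20))
```

Without the attribute, the lookup in `num_warmup_steps` would raise an `AttributeError` before any denoising step runs.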

but...there is something weird about its sample range - at default settings, it's clipping so badly, it's not even resolving.
the grid above is with clip_sample_range=2.0 and it's not perfect.

looking at sample min/max for each timestep (before clamp), it's massive!

{'timestep': 0.800000011920929, 'min': -17.625, 'max': 22.375}
{'timestep': 0.7330000400543213, 'min': -12.4375, 'max': 18.25}
{'timestep': 0.6670000553131104, 'min': -14.625, 'max': 16.125}
{'timestep': 0.6000000238418579, 'min': -15.25, 'max': 19.125}
{'timestep': 0.5330000519752502, 'min': -10.125, 'max': 14.5}
{'timestep': 0.46700000762939453, 'min': -9.25, 'max': 12.8125}
{'timestep': 0.4000000059604645, 'min': -7.78125, 'max': 8.125}
{'timestep': 0.3330000042915344, 'min': -5.8125, 'max': 7.40625}
{'timestep': 0.2670000195503235, 'min': -5.625, 'max': 4.1875}
{'timestep': 0.20000000298023224, 'min': -3.453125, 'max': 4.25}
{'timestep': 0.13300000131130219, 'min': -3.046875, 'max': 3.296875}
{'timestep': 0.06700000166893005, 'min': -2.671875, 'max': 2.921875}
{'timestep': 0.0, 'min': -2.046875, 'max': 2.046875}
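
Statistics like the ones above can be gathered with a small helper called just before the clamp. The sketch below uses NumPy and synthetic random arrays as a stand-in for the real predicted samples, so its numbers are not comparable to the log above; `sample_range` is a hypothetical name.

```python
import numpy as np


def sample_range(timestep: float, sample: np.ndarray) -> dict:
    """Record the value range of a sample before any clipping is applied."""
    return {"timestep": float(timestep), "min": float(sample.min()), "max": float(sample.max())}


rng = np.random.default_rng(0)
stats = []
for t in np.linspace(0.8, 0.0, 4):
    # Synthetic stand-in for the predicted sample at this step.
    sample = rng.standard_normal((1, 4, 8, 8)) * (1.0 + 5.0 * t)
    stats.append(sample_range(t, sample))  # log before the clamp
    sample = np.clip(sample, -1.0, 1.0)    # the clamp itself

for entry in stats:
    print(entry)
```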

i'm not an expert here, but to me it looks like it's all about initial step as range does decrease with steps, it just starts too high.
just to check, i did the same for DDIM and initial range is about 1/3rd

hummat · 2024-09-29T15:50:18Z

Thanks for the great experiment @vladmandic! To be honest, I'm surprised that it works at all, as the current implementation follows the VDM paper, where models are expected to be trained on the noise level (or, equivalently, the (log) signal-to-noise ratio), similar to the models from the EDM paper, instead of directly on the time steps.

github-actions · 2024-11-02T15:06:23Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Matthias Humt added 19 commits April 17, 2024 18:50

Merge branch 'main' of https://github.com/huggingface/diffusers into …

e54d9a1

…vdm-scheduler

Merge branch 'main' of https://github.com/huggingface/diffusers into …

89b0c28

…vdm-scheduler

Merge branch 'main' of https://github.com/huggingface/diffusers into …

f077255

…vdm-scheduler

Merge branch 'main' of https://github.com/huggingface/diffusers into …

af440df

…vdm-scheduler

Update copyright information in scheduling_vdm.py

7c4a4a7

Merge branch 'main' of https://github.com/huggingface/diffusers into …

f962b14

…vdm-scheduler

Merge main

f6da6e6

probabilisticrobotics and others added 4 commits June 11, 2024 11:41

Merge branch 'main' of github.com:huggingface/diffusers into vdm-sche…

5f63951

…duler

github-actions bot added the stale Issues that haven't received updates label Sep 14, 2024

Merge branch 'main' into vdm-scheduler

d6de708

github-actions bot removed the stale Issues that haven't received updates label Sep 15, 2024

Merge branch 'main' into vdm-scheduler

bac8902

yiyixuxu added the scheduler label Sep 17, 2024

Merge branch 'main' into vdm-scheduler

756cd92

Merge branch 'main' into vdm-scheduler

fe27771

github-actions bot added the stale Issues that haven't received updates label Nov 2, 2024

a-r-r-o-w removed the stale Issues that haven't received updates label Nov 2, 2024

yiyixuxu added the wip label Nov 6, 2024
