[Scheduler] Add Variational Diffusion Models (VDM) scheduler #7737
Conversation
This commit adds a new scheduler called VDMScheduler to the diffusers/schedulers module. The VDMScheduler is a class that implements a scheduling algorithm for denoising models. It takes into account parameters such as the number of training timesteps, beta schedule, prediction type, timestep spacing, and steps offset. The VDMScheduler class provides methods for scaling model input, adding noise to samples, and stepping through the denoising process. The commit also includes a new file, scheduling_vdm.py, which contains the implementation of the VDMScheduler class.
- Refactored the initialization of self.beta_start and self.beta_end, as the continuous schedules in self._log_snr are fitted to these values.
- Added self.num_inference_steps and self.timesteps attributes.
- Updated the implementation of self.alphas_cumprod and self.sigmas.
- Modified the implementation of self.log_snr to use the cached version for discrete timesteps and inference.
- Updated the implementation of the beta_schedule methods.
- Modified the implementation of the forward method to handle discrete timesteps correctly.
- Updated the implementation of the forward method to handle different types of timestep inputs.
These changes improve the functionality and readability of the VDMScheduler class in scheduling_vdm.py.
- Refactored the `__len__` method to return a default value of 1000 if `num_inference_steps` and `num_train_timesteps` are not set.
- Added input validation for `t` in the `_log_snr` method to ensure it is within the range [0, 1].
- Normalized `timesteps` to the range [0, 1] in the `__call__` method if `self.timesteps` is None.
- Simplified the calculation of `prev_timestep` in the `__call__` method.
- Updated the error message in the `__call__` method to specify that `prediction_type` must be either `epsilon` or `sample`.
These changes improve the readability and maintainability of the code, and ensure proper input validation and normalization.
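The validation and normalization described in this commit could look roughly like the sketch below. It is not the PR's code: the helper name `normalize_timestep` and the choice to divide by `num_train_timesteps - 1` are assumptions made for illustration.

```python
def normalize_timestep(timestep, num_train_timesteps=None):
    """Map a timestep to continuous time t in [0, 1].

    Hypothetical helper: if num_train_timesteps is given, `timestep` is
    treated as a discrete index in [0, num_train_timesteps - 1]; otherwise
    it is assumed to be continuous already.
    """
    if num_train_timesteps is not None:
        if num_train_timesteps < 2:
            raise ValueError("num_train_timesteps must be at least 2")
        # Assumed convention: the last discrete index maps exactly to t = 1.
        t = timestep / (num_train_timesteps - 1)
    else:
        t = float(timestep)
    # Input validation: continuous time must lie in [0, 1].
    if not 0.0 <= t <= 1.0:
        raise ValueError(f"t must be in [0, 1], got {t}")
    return t
```

With this convention, index 0 maps to t = 0 and index 999 of 1000 maps to t = 1, while an out-of-range value raises instead of silently producing an invalid signal-to-noise ratio.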
- Remove unused imports and variables
- Move log_snr function outside of the class
- Simplify the calculation of alphas_cumprod and sigmas
- Remove redundant code for setting timesteps
- Normalize discrete timesteps to [0, 1]
- Simplify the computation of alpha, sigma, prev_alpha, and prev_sigma
- Remove unnecessary check for timestep type
- Simplify the noise calculation
- Remove unnecessary variance initialization
- Improve code readability and maintainability
This commit refactors the VDMScheduler class in scheduling_vdm.py as listed above.
- Move the `log_snr` function from the class to a separate method.
- Add type annotations to the `log_snr` method.
- Remove the `index_for_timestep` method, as it is copied from another class.
- Remove the normalization of timesteps in the `scale_model_input` method, as it is already done in the `log_snr` method.
These changes improve code organization and maintainability.
- Add `clip_sample` boolean flag to control whether to clip the predicted original sample within a specified range.
- Add `thresholding` boolean flag to enable dynamic thresholding of the predicted original sample.
- Add `dynamic_thresholding_ratio` to determine the percentile value for dynamic thresholding.
- Add `clip_sample_range` to specify the range for clipping the predicted original sample.
- Add `sample_max_value` to set the maximum value for dynamic thresholding.
- Implement `_threshold_sample` method to perform dynamic thresholding on the sample.
- Modify `scale_model_input` method to use the new `clip_sample` and `thresholding` flags.
- Refactor the computation of predicted original and previous samples in the `__call__` method.
- Add noise to the predicted previous sample if `noise_scale` is greater than 0.
These changes enhance the flexibility and control over the sample generation process in VDMScheduler.
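For readers unfamiliar with dynamic thresholding, here is a minimal sketch of the idea on a flat list of values. It is an illustration under assumptions, not the PR's `_threshold_sample`: the function name `threshold_sample`, the list-based input, and the interpolation rule for the percentile are all chosen here for simplicity.

```python
def threshold_sample(values, ratio=0.995, sample_max_value=1.0):
    """Dynamic thresholding of one sample given as a non-empty flat list.

    Illustrative sketch: take the `ratio` percentile s of the absolute
    values, clamp s to [1, sample_max_value], clip to [-s, s], divide by s.
    """
    abs_sorted = sorted(abs(v) for v in values)
    # Percentile with linear interpolation between neighboring ranks.
    pos = ratio * (len(abs_sorted) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(abs_sorted) - 1)
    s = abs_sorted[lo] + (abs_sorted[hi] - abs_sorted[lo]) * (pos - lo)
    # With s clamped to at least 1, this never clips tighter than [-1, 1].
    s = min(max(s, 1.0), sample_max_value)
    # Clip to [-s, s] and rescale so the result lies in [-1, 1].
    return [max(-s, min(s, v)) / s for v in values]


# Usage: with sample_max_value=2.0 the threshold saturates at 2.
print(threshold_sample([0.5, -3.0, 2.0], ratio=1.0, sample_max_value=2.0))
# prints [0.25, -1.0, 1.0]
```

Note that with `sample_max_value=1.0` the threshold is always 1, so the operation reduces to plain clipping to [-1, 1].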
- Refactored the add_noise() method to use torch.Tensor instead of specific tensor types.
- Updated the step() method to handle batched inputs.
- Modified the computation of predicted original sample x_0 in the step() method.
- Adjusted the addition of noise in the step() method to handle noise_scale > 0.
These changes improve the flexibility and efficiency of the VDMScheduler class in the scheduling_vdm.py file.
- Refactored the computation of the predicted previous sample in the VDMScheduler class.
- Added a condition to include the prediction type "sample" in the computation.
- The computation now considers thresholding, clipping, and the prediction type "sample" to calculate the predicted previous sample.
- This change improves the accuracy of the predicted previous sample in certain scenarios.
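As a rough illustration of how the prediction type enters the computation, the sketch below recovers the predicted original sample from a scalar model output. It assumes the usual forward relation x_t = alpha * x_0 + sigma * eps; the function name `predict_original_sample` and its `clip_range` argument are hypothetical, not taken from the PR.

```python
def predict_original_sample(x_t, model_output, alpha, sigma,
                            prediction_type="epsilon", clip_range=None):
    """Recover the predicted x_0 from the model output (scalar sketch)."""
    if prediction_type == "epsilon":
        # The model predicts the noise: invert x_t = alpha * x_0 + sigma * eps.
        x0 = (x_t - sigma * model_output) / alpha
    elif prediction_type == "sample":
        # The model predicts x_0 directly.
        x0 = model_output
    else:
        raise ValueError("prediction_type must be either `epsilon` or `sample`")
    if clip_range is not None:
        # Optional static clipping of the prediction to [-clip_range, clip_range].
        x0 = max(-clip_range, min(clip_range, x0))
    return x0


# Usage: noise a known x_0, then recover it from the true noise.
alpha, sigma = 0.8, 0.6
x0_true, eps = 0.5, -1.0
x_t = alpha * x0_true + sigma * eps
x0_rec = predict_original_sample(x_t, eps, alpha, sigma)
```

Clipping or thresholding is applied to the predicted x_0 before it is used to form the previous sample, which is why both prediction types share the same post-processing path.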
This commit refactors the VDMScheduler class in the scheduling_vdm.py file. The class now includes docstrings that provide detailed explanations of the class and its methods. The log_snr method calculates the logarithm of the signal-to-noise ratio for given timesteps. The get_timesteps method generates an array of timesteps based on the configured spacing method. The set_timesteps method sets the discrete timesteps used for the diffusion chain. The add_noise method adds noise to the original samples according to the noise schedule and specified timesteps. The step method performs a single step of the diffusion process, computing the previous sample and optionally the predicted original sample. These changes improve code readability and maintainability.
…in diffusers/schedulers/scheduling_vdm.py
- Added "VDMScheduler" to the list of imported schedulers in diffusers/__init__.py.
- Implemented the VDMScheduler class in diffusers/schedulers/scheduling_vdm.py.
- Added the log_snr() function to calculate the logarithm of the signal-to-noise ratio (SNR) for given time steps and beta schedule.
- Created the VDMSchedulerOutput class as the output for the scheduler's step function.
- Implemented the VDMScheduler class with necessary methods and attributes.
- Added test cases for VDMScheduler in tests/schedulers/test_scheduler_vdm.py.
- Updated the test cases in tests/schedulers/test_schedulers.py to include VDMScheduler.
- Remove "VQDiffusionScheduler" from the import structure in "__init__.py" of the "diffusers" module.
- Remove "VQDiffusionScheduler" from the import structure in "__init__.py" of the "schedulers" sub-module.
- Remove "VQDiffusionScheduler" import from "scheduling_vq_diffusion.py" in the "schedulers" sub-module.
- Update the "VDMScheduler" class in "scheduling_vdm.py" in the "schedulers" sub-module:
  - Add type hints and remove unused imports.
  - Refactor the "log_snr" function for better readability.
  - Refactor the "__init__" method for better readability.
  - Refactor the "set_timesteps" method for better readability.
  - Refactor the "add_noise" method for better readability.
  - Refactor the "step" method for better readability.
- Update the test cases in "test_schedulers.py" to reflect the changes.
These changes refactor the import structure and improve the readability of the code in the "diffusers" module and the "schedulers" sub-module.
This commit adds the VDMScheduler and VDMSchedulerOutput classes to the API documentation. The VDMScheduler class is a part of the Variational Diffusion Models (VDM) library, which introduces diffusion-based generative models for image density estimation. The VDMSchedulerOutput class is also included in the documentation. These additions provide users with information on how to use these classes and their functionalities. The commit also includes the necessary copyright and license information.
thanks for the PR!
also can you provide more information? when would you recommend this scheduler and how does it compare to the default ones?
I implemented it primarily to reproduce results from the original Variational Diffusion Models paper and the Locally Attentional SDF Diffusion for Controllable 3D Shape Generation paper. I didn't use it to train on image data, so I can't provide any generations. Performance-wise, I would say it sits somewhere between the EDM and DDPM formulations. It outperforms other schedulers in terms of log-likelihood. The continuous-time formulation doesn't require choosing the number of training timesteps, so there is one hyperparameter fewer. It could also serve as an example of how to incorporate continuous time into the other schedulers.
- Removed the unused `__len__` method.
- Removed the `add_noise` method as it is no longer used.
- Added a new method `get_velocity` to calculate the velocity based on the noise and sample.
- Added support for a new prediction type called `v_prediction`.
These changes improve the code structure and add functionality to the VDMScheduler class.
- Refactored the `get_velocity` method to correctly handle the device of the input tensors.
- Removed unnecessary comment and reshaping code in `get_velocity` method.
- Modified the condition in the `if` statement in the `__init__` method to use the `!=` operator instead of `==` for `prediction_type`.
These changes improve the code readability and ensure correct device handling for tensors in the VDMScheduler class.
The commit fixes a bug in the calculation of `pred_original_sample` in the `VDMScheduler` class. Previously, the calculation used addition, but it should use subtraction. This bug affected the `v_prediction` prediction type. The fix ensures that the calculation is correct and consistent with the intended behavior of the `v_prediction` type.
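The sign matters because of how the velocity is defined. The sketch below uses the standard v-parameterization, v = alpha * eps - sigma * x_0, and assumes a variance-preserving process (alpha^2 + sigma^2 = 1). The function names are illustrative and the scalars stand in for tensors; this is not the PR's implementation.

```python
def get_velocity(x0, eps, alpha, sigma):
    """Velocity target under the standard v-parameterization."""
    return alpha * eps - sigma * x0


def x0_from_velocity(x_t, v, alpha, sigma):
    """Recover x_0 from v. Note the subtraction:
    alpha * x_t - sigma * v = (alpha^2 + sigma^2) * x_0 = x_0."""
    return alpha * x_t - sigma * v


# Usage with alpha^2 + sigma^2 = 0.64 + 0.36 = 1.
alpha, sigma = 0.8, 0.6
x0_true, eps = 0.5, -1.0
x_t = alpha * x0_true + sigma * eps          # forward noising
v = get_velocity(x0_true, eps, alpha, sigma)
x0_sub = x0_from_velocity(x_t, v, alpha, sigma)  # correct: subtraction
x0_add = alpha * x_t + sigma * v                 # the buggy variant: addition
```

In this example the subtraction recovers 0.5 up to floating-point error, while the addition yields roughly -0.82, which shows how the bug would corrupt every `v_prediction` step.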
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
what is the status of this pr?
@hummat would you be able to provide a comparison? like for sdxl, output with vdm, ddpm and dpm with same number of steps etc
quick grid compare with a few other samplers first, need to add, but... there is something weird about its sample range - at default settings, it's clipping so badly, it's not even resolving. looking at sample min/max for each timestep (before clamp), it's massive!
i'm not an expert here, but to me it looks like it's all about the initial step, as the range does decrease with steps, it just starts too high.
Thanks for the great experiment @vladmandic! To be honest, I'm surprised that it works at all, as the current implementation follows the VDM paper, where models are expected to be trained on the noise level (or, equivalently, the (log) signal-to-noise ratio), similar to the models from the EDM paper, instead of directly on the time steps.
What does this PR do?
This PR adds a discrete and continuous time scheduler based on the Variational Diffusion Models (VDM) formulation, i.e. expressing the diffusion process via the signal-to-noise ratio.
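To make the signal-to-noise formulation concrete, here is a minimal sketch, not the PR's code. It assumes a variance-preserving process and a log-SNR that decreases linearly in continuous time t; the function names and the endpoint values `log_snr_max` and `log_snr_min` are illustrative assumptions.

```python
import math


def log_snr_linear(t, log_snr_max=13.3, log_snr_min=-5.0):
    """Log signal-to-noise ratio at continuous time t in [0, 1].

    Assumed linear schedule: high SNR (almost clean data) at t = 0,
    low SNR (almost pure noise) at t = 1.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return log_snr_max + (log_snr_min - log_snr_max) * t


def alpha_sigma(log_snr):
    """Variance-preserving coefficients from the log-SNR:
    alpha^2 = sigmoid(log_snr), sigma^2 = sigmoid(-log_snr)."""
    alpha = math.sqrt(1.0 / (1.0 + math.exp(-log_snr)))
    sigma = math.sqrt(1.0 / (1.0 + math.exp(log_snr)))
    return alpha, sigma


def add_noise(x0, eps, t):
    """Forward process x_t = alpha(t) * x0 + sigma(t) * eps for scalars."""
    alpha, sigma = alpha_sigma(log_snr_linear(t))
    return alpha * x0 + sigma * eps
```

Because alpha and sigma are both derived from the log-SNR at an arbitrary t in [0, 1], nothing in this sketch depends on a fixed number of training timesteps; a discrete schedule is just the same functions evaluated on a grid.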
Who can review?
@yiyixuxu