Add STG to community pipelines #10960
Thank you very much for the PR! This is really amazing work and the paper was very intuitive to understand 🤗
I believe the example docstrings are out of sync with the pipelines. Could those be updated to reflect actual usage? I haven't exhaustively tested all pipelines, but the code looks good to me.
Integrating this into core diffusers is also very high on my priority list, and I'm working on adding these methods and making them available to all models for #9672!
Thank you for your kind words! 😊 I really appreciate your feedback. I've updated the example docstrings to include sample usage. Let me know if there's anything else you'd like to adjust!
@kinam0252 Looks good to me. I believe the imports and pipeline names are incorrect, which could be updated. The failing tests seem to be because of a typo: Could you update this as well? Will be good to merge after that 🤗
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@a-r-r-o-w Sorry for the previous mistakes! I’ve updated the imports, pipelines, and corrected the typos.
Thanks @kinam0252! Looks good to merge @bot /style
@kinam0252 Is rescaling supported in HunyuanVideoSTGPipeline? The style bot seems to be failing for this as it is undefined: https://github.com/huggingface/diffusers/actions/runs/13724296073/job/38386911164
@a-r-r-o-w Oh the style bot is right - rescaling is not supported in HunyuanVideoSTGPipeline. I just fixed it!
@bot /style
Style fixes have been applied. View the workflow run here.
Thank you so much! 😊
What does this PR do?
Spatiotemporal Skip Guidance
Junha Hyung*, Kinam Kim*, Susung Hong, Min-Jung Kim, Jaegul Choo
KAIST AI, University of Washington
Spatiotemporal Skip Guidance (STG) for Enhanced Video Diffusion Sampling (CVPR 2025) is a simple training-free sampling guidance method for enhancing transformer-based video diffusion models. STG employs an implicit weak model via self-perturbation, avoiding the need for external models or additional training. By selectively skipping spatiotemporal layers, STG produces an aligned, degraded version of the original model to boost sample quality without compromising diversity or dynamic degree.
The following example video shows STG applied to Mochi.
mochi_STG.mp4
More examples and information can be found on the GitHub repository and the Project website.
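As a rough illustration of the summary above, here is a minimal sketch in plain Python using a toy residual model, not the diffusers pipelines added in this PR. The names `run_model` and `stg_combine`, the two toy layers, and the scale values are all hypothetical. The combination rule (classifier-free guidance plus an extra term that pushes away from the layer-skipped prediction) is an assumption about how this kind of guidance is commonly written, not a copy of the PR's code.

```python
from typing import Callable, Collection, List, Sequence

Layer = Callable[[List[float]], List[float]]


def run_model(x: Sequence[float], layers: Sequence[Layer],
              skip: Collection[int] = ()) -> List[float]:
    """Apply residual layers in order; a skipped layer acts as identity.

    Skipping layers of the same model yields the "implicit weak model":
    no external network and no extra training are involved.
    """
    h = list(x)
    for i, layer in enumerate(layers):
        if i in skip:
            continue  # self-perturbation: bypass this layer entirely
        update = layer(h)
        h = [a + b for a, b in zip(h, update)]
    return h


def stg_combine(uncond: Sequence[float], cond: Sequence[float],
                perturbed: Sequence[float], guidance_scale: float,
                stg_scale: float) -> List[float]:
    """Assumed rule: CFG term plus a term moving away from the weak prediction."""
    return [
        u + guidance_scale * (c - u) + stg_scale * (c - p)
        for u, c, p in zip(uncond, cond, perturbed)
    ]


# Toy "model": two residual layers (hypothetical stand-ins for transformer blocks).
layers: List[Layer] = [
    lambda h: [0.5 * v for v in h],
    lambda h: [1.0 for _ in h],
]

x = [2.0, 4.0]
cond = run_model(x, layers)            # full model: [4.0, 7.0]
weak = run_model(x, layers, skip={1})  # layer 1 skipped: [3.0, 6.0]
uncond = [0.0, 0.0]                    # stand-in for an unconditional prediction

guided = stg_combine(uncond, cond, weak, guidance_scale=1.0, stg_scale=2.0)
print(guided)  # [6.0, 9.0]
```

With `stg_scale=0.0` the function reduces to ordinary classifier-free guidance, which makes it easy to compare both settings in the same sampling loop.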
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
@sayakpaul