
Add STG to community pipelines #10960


Merged
merged 22 commits into from
Mar 7, 2025

Conversation

kinam0252
Contributor

What does this PR do?

Spatiotemporal Skip Guidance

Junha Hyung*, Kinam Kim*, Susung Hong, Min-Jung Kim, Jaegul Choo

KAIST AI, University of Washington

Spatiotemporal Skip Guidance (STG) for Enhanced Video Diffusion Sampling (CVPR 2025) is a simple training-free sampling guidance method for enhancing transformer-based video diffusion models. STG employs an implicit weak model via self-perturbation, avoiding the need for external models or additional training. By selectively skipping spatiotemporal layers, STG produces an aligned, degraded version of the original model to boost sample quality without compromising diversity or dynamic degree.
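
For readers who prefer code, here is a minimal toy sketch of the idea described above. It is not the pipeline code added in this PR. The names `run_blocks`, `stg_combine`, `skip_idx`, and `stg_scale` are hypothetical, the "layers" are stand-in functions rather than transformer blocks, and the combination rule (extrapolating away from the weak prediction, in the style of classifier-free guidance) is an assumption made for illustration.

```python
from typing import Callable, FrozenSet, List, Sequence

# Hypothetical stand-in for a transformer block: maps hidden states to a residual update.
Layer = Callable[[List[float]], List[float]]


def run_blocks(
    x: List[float],
    layers: Sequence[Layer],
    skip_idx: FrozenSet[int] = frozenset(),
) -> List[float]:
    """Apply residual blocks in order; blocks whose index is in skip_idx act as identity."""
    for i, layer in enumerate(layers):
        if i in skip_idx:
            continue  # skipped block: only the residual (identity) path remains
        x = [xi + di for xi, di in zip(x, layer(x))]
    return x


def stg_combine(full: List[float], weak: List[float], stg_scale: float) -> List[float]:
    """Assumed guidance rule: full + stg_scale * (full - weak)."""
    return [f + stg_scale * (f - w) for f, w in zip(full, weak)]


# Three toy blocks standing in for spatiotemporal layers.
layers = [
    lambda x: [0.5 * v for v in x],
    lambda x: [1.0 for _ in x],
    lambda x: [-0.25 * v for v in x],
]

x = [2.0, 4.0]
full = run_blocks(x, layers)                          # original model
weak = run_blocks(x, layers, skip_idx=frozenset({1})) # same weights, block 1 skipped
guided = stg_combine(full, weak, stg_scale=1.0)
```

Because the weak prediction comes from the same weights with one block bypassed, no second model or extra training is involved; setting `stg_scale` to 0 in this sketch recovers the unguided prediction.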

The following example video shows STG applied to Mochi.

mochi_STG.mp4

More examples and information can be found on the GitHub repository and the Project website.

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sayakpaul

@sayakpaul sayakpaul requested a review from a-r-r-o-w March 4, 2025 11:28
@yiyixuxu yiyixuxu added the consider-for-modular-diffusers Things to consider adding support for in Modular Diffusers (with the help of community) label Mar 4, 2025
Member

@a-r-r-o-w a-r-r-o-w left a comment

Thank you very much for the PR! This is really amazing work and the paper was very intuitive to understand 🤗

I believe the example docstrings are out of sync with the pipelines. Could those be updated to reflect usage? I haven't exhaustively tested all pipelines, but looks good to me from the code.

Integrating this into core diffusers is also very high on my priority list, and I'm working on adding these methods and making them available to all models for #9672!

@kinam0252
Contributor Author

Thank you for your kind words! 😊 I really appreciate your feedback.

I've updated the example docstrings to include sample usage. Let me know if there's anything else you'd like to adjust!
I’m also looking forward to seeing our method integrated into core diffusers! 🚀

@a-r-r-o-w
Member

@kinam0252 Looks good to me. I believe the imports and pipeline names are incorrect; could those be updated?

The failing tests seem to be because of a typo:
image

Could you update this as well? Will be good to merge after that 🤗

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@kinam0252
Contributor Author

kinam0252 commented Mar 7, 2025

@a-r-r-o-w Sorry for the previous mistakes! I’ve updated the imports, pipelines, and corrected the typos.
Hopefully, this resolves all the issues. 🚀 Please let me know if anything else needs to be fixed.

@a-r-r-o-w
Member

Thanks @kinam0252! Looks good to merge

@bot /style

@a-r-r-o-w
Member

@kinam0252 Is rescaling supported in HunyuanVideoSTGPipeline? The style bot seems to be failing for this as it is undefined: https://github.com/huggingface/diffusers/actions/runs/13724296073/job/38386911164

image

@kinam0252
Contributor Author

@a-r-r-o-w Oh, the style bot is right: rescaling is not supported in HunyuanVideoSTGPipeline. I just fixed it!

@a-r-r-o-w
Member

@bot /style

Contributor

github-actions bot commented Mar 7, 2025

Style fixes have been applied. View the workflow run here.

@a-r-r-o-w a-r-r-o-w merged commit b38450d into huggingface:main Mar 7, 2025
@kinam0252
Contributor Author

Thank you so much! 😊

Labels
consider-for-modular-diffusers Things to consider adding support for in Modular Diffusers (with the help of community)

4 participants