Add STG to community pipelines #10960

kinam0252 · 2025-03-04T11:07:32Z

What does this PR do?

Spatiotemporal Skip Guidance

Junha Hyung*, Kinam Kim*, Susung Hong, Min-Jung Kim, Jaegul Choo

KAIST AI, University of Washington

Spatiotemporal Skip Guidance (STG) for Enhanced Video Diffusion Sampling (CVPR 2025) is a simple training-free sampling guidance method for enhancing transformer-based video diffusion models. STG employs an implicit weak model via self-perturbation, avoiding the need for external models or additional training. By selectively skipping spatiotemporal layers, STG produces an aligned, degraded version of the original model to boost sample quality without compromising diversity or dynamic degree.
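As a rough illustration of the description above, the toy sketch below builds the "weak" prediction by skipping one layer of the same model and then combines it with the usual classifier-free guidance terms. It is not the code added in this PR: the function names (`run_model`, `stg_combine`), the scalar "layers", and the linear form of the STG term are assumptions chosen to keep the example self-contained.

```python
from typing import Callable, Sequence

# A "layer" here is a toy residual branch acting on a scalar hidden state.
Layer = Callable[[float], float]


def run_model(x: float, layers: Sequence[Layer], skip: Sequence[int] = ()) -> float:
    """Apply residual layers in order; a skipped layer acts as the identity."""
    for i, layer in enumerate(layers):
        if i in skip:
            continue  # self-perturbation: bypass this layer entirely
        x = x + layer(x)
    return x


def stg_combine(
    uncond: float,
    cond: float,
    perturbed: float,
    guidance_scale: float,
    stg_scale: float,
) -> float:
    """Assumed combination: the CFG term plus a term pushing away from the weak model."""
    cfg_term = guidance_scale * (cond - uncond)
    stg_term = stg_scale * (cond - perturbed)
    return uncond + cfg_term + stg_term


# Hypothetical three-layer toy model.
toy_layers = [lambda h: 0.5 * h, lambda h: 1.0, lambda h: -0.25 * h]

cond_pred = run_model(2.0, toy_layers)                 # full model
perturbed_pred = run_model(2.0, toy_layers, skip=[1])  # same model, layer 1 skipped
guided = stg_combine(
    uncond=1.0,
    cond=cond_pred,
    perturbed=perturbed_pred,
    guidance_scale=2.0,
    stg_scale=1.0,
)
```

With `stg_scale=0.0` the combination reduces to plain classifier-free guidance, so in this sketch the STG term can be read as an optional correction on top of it.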

The following example video shows STG applied to Mochi.

mochi_STG.mp4

More examples and information can be found on the GitHub repository and the Project website.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you read our philosophy doc (important for complex PRs)?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@sayakpaul

a-r-r-o-w

Thank you very much for the PR! This is really amazing work and the paper was very intuitive to understand 🤗

I believe the example docstrings are out of sync with the pipelines. Could those be updated to reflect usage? I haven't exhaustively tested all pipelines, but looks good to me from the code.

Integrating this in core diffusers is very high on my priority list too, and I'm working on adding these methods and making them available to all models for #9672!

kinam0252 · 2025-03-06T16:56:06Z

Thank you for your kind words! 😊 I really appreciate your feedback.

I've updated the example docstrings to include sample usage. Let me know if there's anything else you'd like to adjust!
I’m also looking forward to seeing our method integrated into core diffusers! 🚀

a-r-r-o-w · 2025-03-06T20:17:13Z

@kinam0252 Looks good to me. I believe the imports and pipeline names are incorrect; could those be updated?

The failing tests seem to be because of a typo:

Could you update this as well? Will be good to merge after that 🤗

HuggingFaceDocBuilderDev · 2025-03-06T20:21:15Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

kinam0252 · 2025-03-07T15:47:45Z

@a-r-r-o-w Sorry for the previous mistakes! I’ve updated the imports, pipelines, and corrected the typos.
Hopefully, this resolves all the issues. 🚀 Please let me know if anything else needs to be fixed.

a-r-r-o-w · 2025-03-07T15:51:56Z

Thanks @kinam0252! Looks good to merge

@bot /style

a-r-r-o-w · 2025-03-07T15:56:44Z

@kinam0252 Is rescaling supported in HunyuanVideoSTGPipeline? The style bot seems to be failing for this as it is undefined: https://github.com/huggingface/diffusers/actions/runs/13724296073/job/38386911164

kinam0252 · 2025-03-07T16:11:53Z

@a-r-r-o-w Oh the style bot is right - rescaling is not supported in HunyuanVideoSTGPipeline. I just fixed it!

a-r-r-o-w · 2025-03-07T18:25:51Z

@bot /style

github-actions · 2025-03-07T18:33:10Z

Style fixes have been applied. View the workflow run here.

kinam0252 · 2025-03-08T03:53:21Z

Thank you so much! 😊

kinam0252 and others added 9 commits February 8, 2025 12:45

Support STG for video pipelines

7a558df

Update README.md

f6937b5

Update README.md

2649b6e

Merge branch 'huggingface:main' into dev

29da6a9

Update README.md

0399d96

Update README.md

ce43dbf

Update README.md

af61b37

Update README.md

85e325d

Merge branch 'huggingface:main' into dev

d62ecb9

sayakpaul requested a review from a-r-r-o-w March 4, 2025 11:28

yiyixuxu added the consider-for-modular-diffusers Things to consider adding support for in Modular Diffusers (with the help of community) label Mar 4, 2025

a-r-r-o-w approved these changes Mar 6, 2025


kinam0252 added 10 commits March 6, 2025 12:45

Merge branch 'huggingface:main' into stg

1de7ce0

Update pipeline_stg_cogvideox.py

c973b13

Update pipeline_stg_hunyuan_video.py

efc4255

Update pipeline_stg_ltx.py

77ec5d0

Update pipeline_stg_ltx_image2video.py

630f116

Update pipeline_stg_mochi.py

6025848

Update pipeline_stg_hunyuan_video.py

b08766b

Update pipeline_stg_ltx.py

8e2c32a

Update pipeline_stg_ltx_image2video.py

1d4d0bd

Update pipeline_stg_mochi.py

b3f6d99

update

df0b293

remove rescaling

5ed690c

Apply style fixes

085380e

a-r-r-o-w merged commit b38450d into huggingface:main Mar 7, 2025
