Remove redundant comparison inside the diffusion loop of stable video diffusion pipeline #9425

Open
@dianyo

Is your feature request related to a problem? Please describe.
I found that the diffusion loop inside __call__ of the stable video diffusion pipeline keeps issuing async memcpys between host and device, as shown in the attached screenshot.
Screenshot 2024-09-12 at 6 45 24 PM

Describe the solution you'd like.
The cause is that every time we read self.do_classifier_free_guidance, we compare a tensor with an int, which produces a boolean on the device, and that boolean is then copied from GPU to CPU.

It would be better to evaluate it once and assign the result to a variable before the loop, since the value does not change during the loop.
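
To illustrate the idea, here is a minimal sketch that uses a pure-Python stand-in instead of real tensors, so it runs without a GPU. `FakeDeviceScalar`, `ToyPipeline`, `run_per_step`, `run_hoisted`, and `count_syncs` are hypothetical names made up for this example and are not part of diffusers; the stand-in only counts how often a device-side boolean would have to be copied to the host, and the real pipeline code may differ in detail.

```python
class FakeDeviceScalar:
    """Hypothetical stand-in for a 0-dim tensor that lives on the GPU."""

    sync_count = 0  # number of simulated device-to-host copies

    def __init__(self, value):
        self._value = value

    def __gt__(self, other):
        # The comparison stays "on device": it returns another device-side scalar.
        return FakeDeviceScalar(self._value > other)

    def __bool__(self):
        # Python needs a host-side bool here, which forces a device-to-host copy.
        FakeDeviceScalar.sync_count += 1
        return bool(self._value)


class ToyPipeline:
    """Toy model of a pipeline whose guidance scale is a device-side value."""

    def __init__(self, guidance_scale):
        self.guidance_scale = guidance_scale

    @property
    def do_classifier_free_guidance(self):
        # Re-evaluated on every access (assumed shape of the property).
        return self.guidance_scale > 1

    def run_per_step(self, num_steps):
        for _ in range(num_steps):
            if self.do_classifier_free_guidance:  # one copy per iteration
                pass  # denoising step would go here

    def run_hoisted(self, num_steps):
        # Evaluate once before the loop: a single copy in total.
        do_cfg = bool(self.do_classifier_free_guidance)
        for _ in range(num_steps):
            if do_cfg:
                pass  # denoising step would go here


def count_syncs(method_name, num_steps=25):
    """Run one variant and return how many simulated copies it triggered."""
    FakeDeviceScalar.sync_count = 0
    pipe = ToyPipeline(FakeDeviceScalar(3.0))
    getattr(pipe, method_name)(num_steps)
    return FakeDeviceScalar.sync_count
```

In this toy model, the per-step variant triggers one simulated copy per loop iteration, while the hoisted variant triggers exactly one regardless of the number of steps.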

Additional context.
I'd be glad to contribute this by opening a PR.

Labels: stale (issues that haven't received updates)
