Describe the bug
In the SD3 attention implementation, attention masks are currently not used. As a result, when the text tokens contain padding, the attention scores of the padding tokens are non-zero, which produces inconsistent outputs for different values of max_seq_length. This problem was discussed in #8628; this issue is created to track progress on fixing it.
Thanks @sayakpaul for the discussion.
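To make the effect concrete, here is a minimal NumPy sketch (not the diffusers code; all names are illustrative) showing why unmasked padding changes the attention output: with no mask, padding keys receive non-zero softmax weight, so lengthening the padded sequence changes the result, while masking the padding positions restores the output of the unpadded sequence.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, mask=None):
    # Scaled dot-product attention; `mask` marks valid key positions.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if mask is not None:
        # Large negative score -> ~zero weight for padding positions.
        scores = np.where(mask[None, :], scores, -1e9)
    return softmax(scores) @ v

rng = np.random.default_rng(0)
d = 4
tokens = rng.normal(size=(3, d))         # 3 real text tokens
pad = np.zeros((2, d))                   # 2 padding tokens (larger max_seq_length)
q = rng.normal(size=(1, d))

short = tokens                           # no padding
long_ = np.concatenate([tokens, pad])    # same tokens plus padding
mask = np.array([True, True, True, False, False])

out_short = attention(q, short, short)
out_unmasked = attention(q, long_, long_)      # padding attended to
out_masked = attention(q, long_, long_, mask)  # padding masked out

print(np.allclose(out_short, out_unmasked))  # False: padding shifts the output
print(np.allclose(out_short, out_masked))    # True: mask restores consistency
```

Even though the padding embeddings here are all zeros, their dot-product scores are exactly 0, which still gets non-zero softmax weight; only masking makes the output independent of the padded length.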
Reproduction
n/a
Logs
No response
System Info
n/a
Who can help?
No response