
bug in named_tensor_tutorial #821

Closed

@stas00

Description


In https://pytorch.org/tutorials/intermediate/named_tensor_tutorial.html I think there is a bug:

dot_prod = q.div_(scale).matmul(k.align_to(..., 'D_head', 'T_key'))
[...]
attn_weights = self.attn_dropout(F.softmax(dot_prod / scale,
                                           dim='T_key'))

The scaling is applied twice here, and I think it should be applied only once.
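
For illustration, a minimal sketch of what I would expect instead, reusing the tutorial's own names (q, k, scale, self.attn_dropout); the division by scale is kept in the first line and dropped from the softmax:

# scale applied exactly once, in the div_ before the matmul
dot_prod = q.div_(scale).matmul(k.align_to(..., 'D_head', 'T_key'))
[...]
attn_weights = self.attn_dropout(F.softmax(dot_prod, dim='T_key'))  # no second /scale

Alternatively, the div_ on the first line could be removed and only dot_prod / scale kept in the softmax call; either way the attention logits end up divided by scale once.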

Thanks.
