Skip to content

Potentially faster dedup_by implementation #77772

Closed
@Plecra

Description

@Plecra

It seems like the dedup_by implementation makes two passes over the data, making many more copies than are strictly necessary. Doing this allows the same_bucket implementation to panic without causing a leak. I wouldn't call this bad, but it was a surprise to see that it was doing the extra work.

https://godbolt.org/z/4To1a1

Mimicking the way Drain handles this and temporarily shortening the vector allows us to cut down on the copies (and also conveniently removes a panicking branch).

I wonder if it makes sense to document the trade-off that std makes here? Imo, it'd make it easier to use the library correctly in the way that https://doc.rust-lang.org/std/vec/struct.Vec.html#current-implementation-7 does.

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-collectionsArea: `std::collections`C-enhancementCategory: An issue proposing an enhancement or a PR with one.I-slowIssue: Problems and improvements with respect to performance of generated code.T-libsRelevant to the library team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions