Optimize find
#279
I'm surprised that you're even seeing a performance difference. I would have expected the generated code to be identical after this change.
src/raw/mod.rs
Outdated
@@ -1054,6 +1053,7 @@ impl<T, A: Allocator + Clone> RawTable<T, A> {
/// `RawIterHash`. Because we cannot make the `next` method unsafe on the
/// `RawIterHash` struct, we have to make the `iter_hash` method unsafe.
#[cfg_attr(feature = "inline-more", inline)]
#[allow(dead_code)] // Used when the `raw` API is enabled
- #[allow(dead_code)] // Used when the `raw` API is enabled
+ #[cfg(feature = "raw")]
src/raw/mod.rs
Outdated
@@ -1255,6 +1255,33 @@ impl<A: Allocator + Clone> RawTableInner<A> {
}
}

/// Searches for an element in the table.
Add an explanation that this is split into a separate function to improve compilation time but has no performance impact from dynamic dispatch in practice due to inlining.
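A minimal sketch of the pattern being described, not the actual hashbrown code: the names `Inner`, `Table`, and `find_inner`, and the linear scan, are all assumptions made for illustration. The search loop lives in a non-generic function that receives the equality check as a `&mut dyn FnMut`, so the loop is compiled once instead of once per element type. The generic wrapper is thin, and when the inner function is inlined the optimizer can usually resolve the closure call statically.

```rust
// Hypothetical stand-in for the non-generic table core (not hashbrown's real layout).
struct Inner {
    hashes: Vec<u64>,
}

impl Inner {
    // Non-generic search loop: instantiated once, no matter how many `T`s use it.
    // The equality check is passed as a trait object instead of a type parameter.
    #[inline]
    fn find_inner(&self, hash: u64, eq: &mut dyn FnMut(usize) -> bool) -> Option<usize> {
        for (index, &h) in self.hashes.iter().enumerate() {
            if h == hash && eq(index) {
                return Some(index);
            }
        }
        None
    }
}

// Hypothetical generic wrapper that owns the typed elements.
struct Table<T> {
    inner: Inner,
    items: Vec<T>,
}

impl<T> Table<T> {
    fn new() -> Self {
        Table { inner: Inner { hashes: Vec::new() }, items: Vec::new() }
    }

    fn insert(&mut self, hash: u64, value: T) {
        self.inner.hashes.push(hash);
        self.items.push(value);
    }

    // Thin generic layer: only this small function is monomorphized per `T`.
    fn find(&self, hash: u64, mut eq: impl FnMut(&T) -> bool) -> Option<&T> {
        let items = &self.items;
        let index = self
            .inner
            .find_inner(hash, &mut |i: usize| eq(&items[i]))?;
        Some(&self.items[index])
    }
}
```

With this shape, a call such as `table.find(hash, |v| *v == key)` only instantiates the small wrapper for each `T`; whether the indirect call disappears entirely depends on the inliner, which is why the request above asks for the reasoning to be documented.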
Looking at the disassembly in more detail, it seems that the iterator was confusing LLVM somehow, which caused it to unnecessarily unroll the loop one extra time.
I was expecting this to only affect debug builds, but I'll take some free performance.
@bors r+
📌 Commit 0c528ef has been approved by
☀️ Test successful - checks-actions
This optimizes `find` for size and performance.

The `cargo llvm-lines` output from the following program is reduced from 15837 to 15771.

`rustc_driver` size (with optimizations) is reduced by 0.14%.

Impact on compiler performance: