-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Avoid needless allocations in liveness_of_locals
.
#51869
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
||
/// Overwrite one `IdxSetBuf` with another of the same length. | ||
pub fn overwrite(&mut self, other: &IdxSetBuf<T>) { | ||
self.bits.clone_from_slice(&other.bits); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method probably belongs on IdxSet
rather than IdxSetBuf
(note that IdxSetBuf
derefs to IdxSet
). The relationship between those two types is analogous to []
and Vec
, so most operations that don't change the size or require allocation belong on IdxSet
. There is however an existing clone_from
method on that type that does the same thing:
But maybe we should rename that method (IdxSet::clone_from
) to overwrite
? I wonder if we are having shadowing problems because of Clone::clone_from
....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right about the shadowing. Here's the DHAT output:
==60110== -------------------- 1 of 500 --------------------
==60110== max-live: 217,728 in 1,701 blocks
==60110== tot-alloc: 77,545,592 in 1,330,065 blocks (avg size 58.30)
==60110== deaths: 1,330,065, at avg age 2,374,592 (0.00% of prog lifetime)
==60110== acc-ratios: 0.07 rd, 1.51 wr (5,953,680 b-read, 117,537,792 b-written)
==60110== at 0x4C2DECF: malloc (in /usr/lib/valgrind/vgpreload_exp-dhat-amd64-linux.so)
==60110== by 0x7C016B0: alloc (alloc.rs:62)
==60110== by 0x7C016B0: alloc (alloc.rs:123)
==60110== by 0x7C016B0: allocate_in<usize,alloc::alloc::Global> (raw_vec.rs:103)
==60110== by 0x7C016B0: with_capacity<usize> (raw_vec.rs:147)
==60110== by 0x7C016B0: with_capacity<usize> (vec.rs:362)
==60110== by 0x7C016B0: to_vec<usize> (slice.rs:167)
==60110== by 0x7C016B0: to_vec<usize> (slice.rs:369)
==60110== by 0x7C016B0: clone<usize> (vec.rs:1670)
==60110== by 0x7C016B0: clone<rustc::mir::Local> (indexed_set.rs:36)
==60110== by 0x7C016B0: clone_from<rustc_data_structures::indexed_set::IdxSetBuf<rustc::mir::Local>> (clone.rs:131)
==60110== by 0x7C016B0: rustc_mir::util::liveness::liveness_of_locals (liveness.rs:144)
==60110== by 0x7B22D4B: compute (liveness.rs:106)
==60110== by 0x7B22D4B: rustc_mir::borrow_check::nll::compute_regions (mod.rs:101)
==60110== by 0x7B33381: rustc_mir::borrow_check::do_mir_borrowck (mod.rs:238)
IdxSetBuf
implements Clone
, and therefore defines clone_from
, which is a provided method that calls clone
. IdxSet
has a clone_from
method, which is probably intended to be called, but the IdxSetBuf
implementation is called instead.
I think renaming IdxSet::clone_from
as IdxSet::overwrite
is reasonable. I wonder if there are any other places like this, where the clone_from
being called is not the one intended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There aren't many definitions of clone_from
, so it wouldn't be hard to audit.
The current situation is something of a mess. - `IdxSetBuf` derefs to `IdxSet`. - `IdxSetBuf` implements `Clone`, and therefore has a provided `clone_from` method, which does allocation and so is expensive. - `IdxSet` has a `clone_from` method that is non-allocating and therefore cheap, but this method is not from the `Clone` trait. As a result, if you have an `IdxSetBuf` called `b`, if you call `b.clone_from(b2)` you'll get the expensive `IdxSetBuf` method, but if you call `(*b).clone_from(b2)` you'll get the cheap `IdxSetBuf` method. `liveness_of_locals()` does the former, presumably unintentionally, and therefore does lots of unnecessary allocations. Having a `clone_from` method that isn't from the `Clone` trait is a bad idea in general, so this patch renames it as `overwrite`. This avoids the unnecessary allocations in `liveness_of_locals()`, speeding up most NLL benchmarks, the best by 1.5%. It also means that calls of the form `(*b).clone_from(b2)` can be rewritten as `b.overwrite(b2)`.
0352aee
to
08683f0
Compare
I have updated. r? @nikomatsakis |
So...seems ok. =) |
@bors r+ |
📌 Commit 08683f0 has been approved by |
…tsakis Avoid needless allocations in `liveness_of_locals`. We don't need to replace the heap-allocated bitset, we can just overwrite its contents. This speeds up most NLL benchmarks, the best by 1.5%. r? @nikomatsakis
Avoid needless allocations in `liveness_of_locals`. We don't need to replace the heap-allocated bitset, we can just overwrite its contents. This speeds up most NLL benchmarks, the best by 1.5%. r? @nikomatsakis
☀️ Test successful - status-appveyor, status-travis |
We don't need to replace the heap-allocated bitset, we can just
overwrite its contents.
This speeds up most NLL benchmarks, the best by 1.5%.
r? @nikomatsakis