TypedArena: Eliminate intermediate copy if Iterator size is known #77945

bugadani · 2020-10-14T18:26:12Z

TypedArena now tries to not copy iterator contents into a SmallVec.

I don't know if this wasn't done on purpose, but I copied this code over from DroplessArena just in case this was an oversight.

rust-highfive · 2020-10-14T18:26:36Z

r? @varkor

(rust_highfive has picked a reviewer for you, use r? to override)

jonas-schievink · 2020-10-14T18:39:36Z

@bors try @rust-timer queue

rust-timer · 2020-10-14T18:39:38Z

Awaiting bors try build completion

bors · 2020-10-14T18:39:49Z

⌛ Trying commit 455f4de686ab86152f096515da0669d68b32fbbb with merge 9d9c665b6419c577f598d8cc19a94c06e133f859...

compiler/rustc_arena/src/lib.rs

bors · 2020-10-14T19:24:26Z

☀️ Try build successful - checks-actions, checks-azure
Build commit: 9d9c665b6419c577f598d8cc19a94c06e133f859 (9d9c665b6419c577f598d8cc19a94c06e133f859)

rust-timer · 2020-10-14T19:24:29Z

Queued 9d9c665b6419c577f598d8cc19a94c06e133f859 with parent 5565241, future comparison URL.

est31 · 2020-10-14T19:59:07Z

Btw, you might be aware, but to point it out: the current state of this PR is unsound. Later on, when the arena is dropped, it could drop uninitialized memory, as the TrustedLen trait bound is not used.

I think a good resolution would be to not call alloc_raw_slice but do its content manually. Most importantly, self.ptr.set should only be called right above the return line in write_from_iter.

Performance wise it will most likely behave like the current state of the PR. So it's mostly a concern for after the current perf run :).

bugadani · 2020-10-14T20:06:18Z

Oh my, thanks for pointing that out, I was indeed completely unaware of this!

rust-timer · 2020-10-14T23:31:57Z

Finished benchmarking try commit (9d9c665b6419c577f598d8cc19a94c06e133f859): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never

bugadani · 2020-10-15T10:36:47Z

This doesn't seem to make much sense to me.

The current code errors, fine, let's assume it's incorrect.

But this also errors:

unsafe fn write_from_iter<I: Iterator<Item = T>>(&self, mut iter: I, len: usize) -> &mut [T] {
    let mut i = 0;
    let mem = self.ptr.get();
    // Use a manual loop since LLVM manages to optimize it better for
    // slice iterators
    loop {
        let value = iter.next();
        if i >= len || value.is_none() {
            // only advance the pointer when we know for sure how many elements we have
            self.ptr.set(mem.add(len)); // notice `len` here

            // We only return as many items as the iterator gave us, even
            // though it was supposed to give us `len`
           return slice::from_raw_parts_mut(mem, i);
        }
        ptr::write(mem.add(i), value.unwrap());
        i += 1;
    }
}

But this doesn't:

unsafe fn write_from_iter<I: Iterator<Item = T>>(&self, mut iter: I, len: usize) -> &mut [T] {
    let mut i = 0;
    let mem = self.ptr.get();
    self.ptr.set(mem.add(len)); // pointer set here
    // Use a manual loop since LLVM manages to optimize it better for
    // slice iterators
    loop {
        let value = iter.next();
        if i >= len || value.is_none() {
            // only advance the pointer when we know for sure how many elements we have

            // We only return as many items as the iterator gave us, even
            // though it was supposed to give us `len`
           return slice::from_raw_parts_mut(mem, i);
        }
        ptr::write(mem.add(i), value.unwrap());
        i += 1;
    }
}

While both unsound, the last 2 versions should be otherwise equivalent since ptr is not read in this function, except to get mem.

The error message corrupted size vs. prev_size seems to indicate there is a bad write somewhere, but why would that depend on the location where we copy an address? I think I'm missing something obvious here.

est31 · 2020-10-15T10:45:24Z

Maybe this might be due to code inside the next function of the iterator accessing the arena?

bugadani · 2020-10-15T10:49:40Z

Huh, maybe, that's something I didn't think of. Seems to happen fairly regularly during ast lowering, so yeah, probably that's what causes it. Thank you.

est31 · 2020-10-15T11:08:37Z

@bugadani for measuring the perf gains you could just disable arena dropping for now and make it leak, and otherwise revert to the original state of the PR. This would give an upper bound on the perf gains even though it includes the gains of leaking as well, so maybe make a separate PR where you only leak. Comparing the two PRs would give a more precise number(but of course the base changed so percentages won't be 100% accurate).

bugadani · 2020-10-15T11:13:07Z

@est31 I'm not sure why I'd want to do that? The current perf run shows a tiny improvement already, and if that warrants it, the best I can do here is to specialize the non-copy version for types that implement TrustedLen. Otherwise I don't think another perf run would make sense.

est31 · 2020-10-15T11:15:42Z

@bugadani oh indeed, specialization would be the best approach here. Disregard what I said :).

est31

Mostly OK, except nits.

compiler/rustc_arena/src/lib.rs

Co-authored-by: est31 <[email protected]>

est31

Looks good!

Marwes · 2020-10-16T10:50:21Z

compiler/rustc_arena/src/lib.rs

+                    let start_addr = arena.alloc_raw_slice(len);
+                    let mut mem = start_addr;
+
+                    while let Some(value) = iter.next() {


Isn't this unsound if next panics? I assume the arena would try to call drop on uninitialized memory in that case (*).

(*) When I did this same optimization I added a drop guard which fills the slice with default values (**).
(**) Default can also panic but in that case the process abort, though I know in my case that Default won't panic for any of the types that use it.

Huh, thanks for noticing that. I guess this sort of explains why this optimization wasn't done before 😅

I'm trying to make this as general as possible, so no T: Default here :) However, this means the only way I see to solve this issue is to catch the unwind and undo the failed partial allocation by hand. Which could get messy.

I wonder if it's worth doing, considering the otherwise minimal impact of this PR.

The problem with that is that there may be more things allocated after this allocation which has already succeded and handed out a reference. So there isn't a way to undo this allocation as far as I can see (without adding another Vec to keep track of failed allocations I guess, I suppose that Vec wouldn't be used 99% of the time so it wouldn't add any overhead other than an increased arena size and complexity).

Isn't it the idea of TrustedLen that the iterator will always have that many elements? If next panics, it would violate that contract.

I'm not sure which benefit TrustedLen would give people at all if next is allowed to panic.

@bugadani oh right. That's super weird though because then what's the benefit of TrustedLen over ExactSizeIterator?

As a last attempt to salvage this PR, the dropping arena could get a Vec<Vec<T>> member that is used instead of the chunks when given an iterator. Then we can simply put the temporary vec that the pre-PR iterator code creates into that list and don't have to copy stuff into the chunk. We already have to pay the cost of deallocating the Vec anyways, all we do is move it into the future. This bloats up the structures of the native allocator and destroys cache locality, but might be worth a try.

@est31 I guess ExactSizeIterator isn't unsafe and so it's technically allowed to misbehave? I'm not sure.

Regarding the memory question:

TypedArena works in pages. When a page can't hold the requested data, a new page is created. The allocator then uses pointers in the last page. This complicates things a bit, but I don't think by a huge deal.

If an allocation panics, we can drop the memory that we allocated, as well as the memory that the inner operations allocated. We know about these regions, because we have them saved on our stack, or we can if currently something is missing.

If a nested tree of of alloc_from_iter calls fail for whatever reason, the panic will bubble up, and can be caught by each level of alloc call, basically unwinding their work.

~~The only question is, how do I bubble up the innermost panic? :)~~

Since Iterator isn't UnwindSafe, this looks like a lot more complicated than what should be done.

One thing that could work, with only a little overhead, is to allocate a new chunk for any allocation calls inside the iterator. I believe this would work without TrustedLen also.

Pseudo rust

let slice = self.alloc_raw_slice(); let chunk_index = self.current_chunk; // Move to the next chunk, if no next chunk exists the next allocation will force a new chunk to be created self.current_chunk += 1; for elem in iterator { slice.fill(elem); } // Move back to the current chunk so we can try filling the rest of it self.current_chunk = chunk_index; // Search for the first chunk fn get_current_chunk(&self, needed: usize) -> &mut Vec { for chunk in &mut self.chunks[self.current_chunk..] { if chunk.capacity() - chunk.len() >= needed { self.current_chunk = index; return chunk } } self.allocate_chunk() }

Huh, that sounds like an interesting idea! There is a theoretical possibility of wasting a ton of memory, but let's hope it's not an issue in practice :)

bugadani · 2020-10-16T12:47:02Z

Thank you everybody for your time.

rust-highfive assigned varkor Oct 14, 2020

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 14, 2020

est31 reviewed Oct 14, 2020

View reviewed changes

compiler/rustc_arena/src/lib.rs Outdated Show resolved Hide resolved

bugadani force-pushed the arena branch from 3bee690 to 8e88d6d Compare October 15, 2020 07:03

bugadani force-pushed the arena branch from f0ac1b0 to ef00e16 Compare October 15, 2020 12:47

est31 requested changes Oct 15, 2020

View reviewed changes

compiler/rustc_arena/src/lib.rs Outdated Show resolved Hide resolved

compiler/rustc_arena/src/lib.rs Outdated Show resolved Hide resolved

compiler/rustc_arena/src/lib.rs Outdated Show resolved Hide resolved

bugadani force-pushed the arena branch from 59dc0f6 to d6134b0 Compare October 15, 2020 13:05

TypedArena: Eliminate intermediate copy if Iterator size is known

d9bb68b

Co-authored-by: est31 <[email protected]>

bugadani force-pushed the arena branch from d6134b0 to d9bb68b Compare October 15, 2020 13:25

est31 approved these changes Oct 15, 2020

View reviewed changes

jyn514 added I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 16, 2020

Marwes reviewed Oct 16, 2020

View reviewed changes

bugadani closed this Oct 16, 2020

bugadani deleted the arena branch October 16, 2020 12:47

bugadani mentioned this pull request Oct 18, 2020

Don't copy into arena when we had to heap allocate for iter #78067

Closed

bugadani mentioned this pull request Oct 19, 2020

Try not to copy when allocating from iterators, take 3 #78107

Closed

bugadani mentioned this pull request Jan 15, 2021

Specialize arena alloc for cloned slices #81053

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TypedArena: Eliminate intermediate copy if Iterator size is known #77945

TypedArena: Eliminate intermediate copy if Iterator size is known #77945

bugadani commented Oct 14, 2020

rust-highfive commented Oct 14, 2020

jonas-schievink commented Oct 14, 2020

rust-timer commented Oct 14, 2020

bors commented Oct 14, 2020

bors commented Oct 14, 2020

rust-timer commented Oct 14, 2020

est31 commented Oct 14, 2020

bugadani commented Oct 14, 2020

rust-timer commented Oct 14, 2020

bugadani commented Oct 15, 2020

est31 commented Oct 15, 2020

bugadani commented Oct 15, 2020 •

edited

Loading

est31 commented Oct 15, 2020

bugadani commented Oct 15, 2020

est31 commented Oct 15, 2020

est31 left a comment

est31 left a comment

Marwes Oct 16, 2020 •

edited

Loading

bugadani Oct 16, 2020

Marwes Oct 16, 2020

est31 Oct 16, 2020

est31 Oct 16, 2020

est31 Oct 16, 2020

est31 Oct 16, 2020

bugadani Oct 16, 2020 •

edited

Loading

Marwes Oct 16, 2020

bugadani Oct 16, 2020 •

edited

Loading

bugadani commented Oct 16, 2020

TypedArena: Eliminate intermediate copy if Iterator size is known #77945

TypedArena: Eliminate intermediate copy if Iterator size is known #77945

Conversation

bugadani commented Oct 14, 2020

rust-highfive commented Oct 14, 2020

jonas-schievink commented Oct 14, 2020

rust-timer commented Oct 14, 2020

bors commented Oct 14, 2020

bors commented Oct 14, 2020

rust-timer commented Oct 14, 2020

est31 commented Oct 14, 2020

bugadani commented Oct 14, 2020

rust-timer commented Oct 14, 2020

bugadani commented Oct 15, 2020

est31 commented Oct 15, 2020

bugadani commented Oct 15, 2020 • edited Loading

est31 commented Oct 15, 2020

bugadani commented Oct 15, 2020

est31 commented Oct 15, 2020

est31 left a comment

Choose a reason for hiding this comment

est31 left a comment

Choose a reason for hiding this comment

Marwes Oct 16, 2020 • edited Loading

Choose a reason for hiding this comment

bugadani Oct 16, 2020

Choose a reason for hiding this comment

Marwes Oct 16, 2020

Choose a reason for hiding this comment

est31 Oct 16, 2020

Choose a reason for hiding this comment

est31 Oct 16, 2020

Choose a reason for hiding this comment

est31 Oct 16, 2020

Choose a reason for hiding this comment

est31 Oct 16, 2020

Choose a reason for hiding this comment

bugadani Oct 16, 2020 • edited Loading

Choose a reason for hiding this comment

Marwes Oct 16, 2020

Choose a reason for hiding this comment

bugadani Oct 16, 2020 • edited Loading

Choose a reason for hiding this comment

bugadani commented Oct 16, 2020

bugadani commented Oct 15, 2020 •

edited

Loading

Marwes Oct 16, 2020 •

edited

Loading

bugadani Oct 16, 2020 •

edited

Loading

bugadani Oct 16, 2020 •

edited

Loading