Closed
Description
Code
Godbolt repro: https://godbolt.org/z/Yv3qYqq4W
Variant A: check a..
, then check ..a+N
.
pub fn write_u8(
bytes: &mut [u8],
buf: u8,
offset: usize,
) -> Result<(), Box<dyn std::error::Error>> {
let buf = buf.to_le_bytes();
bytes
.get_mut(offset..).and_then(|bytes| bytes.get_mut(..buf.len()))
.ok_or_else(|| "RangeError: The specified range is invalid")?
.copy_from_slice(&buf);
Ok(())
}
Variant B: check a..a+N
in one call.
pub fn write_u8(
bytes: &mut [u8],
buf: u8,
offset: usize,
) -> Result<(), Box<dyn std::error::Error>> {
let buf = buf.to_le_bytes();
bytes
.get_mut(offset..offset+buf.len()) // <- changed line
.ok_or_else(|| "RangeError: The specified range is invalid")?
.copy_from_slice(&buf);
Ok(())
}
Note that N is a constant (1), as it's a size of an u8
. Also note that that this example can be extended to other types of buf
: u16
, u32
, [u8; 17]
etc.
Results
- Variant A on stable 1.70: generates a single
bytes.len() <= offset
check, then sets the byte and returnsOk
- perfect. (I wish the push/pop were only on the error path, but that's off topic). - Variant B on stable 1.70: generates two branches. (I wish it was the same as A, but I guess there could be some overflow related differences.)
- Variant A on beta 1.71: same as on stable.
- Variant B on beta 1.71: seemingly, nonsense? The "happy path" is much longer and contains several conditional moves; also, the
lea rdi, [rip + .L__unnamed_1]
looks like it moved address calculation of the error string to the happy path.
This issue concerns the last one.
I did an ad-hoc microbenchmark that shows Variant B on Beta (and nightly) to become 2-3x slower, but I'm not sure if that benchmark needs attaching - to me it's clear from the disassembly that the happy path codegen is now worse.
Metadata
Metadata
Assignees
Labels
Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.Category: This is a bug.Call for participation: An issue has been fixed and does not reproduce, but no test has been added.Issue: Problems and improvements with respect to performance of generated code.Medium priorityRelevant to the compiler team, which will review and decide on the PR/issue.Performance or correctness regression from one stable version to another.