Description
I happened to look more closely at the generated assembly for some code that uses align_offset, and noticed... align_offset does not compile as efficiently as one might hope for the case of "align pointer to size_of::<usize>()".
For example (ignore that I omit handling of align_offset's error return value):
pub unsafe fn align_offset(p: *const u8) -> *const u8 {
    p.add(p.align_offset(core::mem::size_of::<usize>()))
}
compiles to
example::align_offset:
        movl    %edi, %ecx
        andl    $7, %ecx
        movl    $8, %eax
        subq    %rcx, %rax
        testq   %rcx, %rcx
        cmoveq  %rcx, %rax
        addq    %rdi, %rax
        retq
Whereas performing the same alignment manually (forgive my convoluted way of doing this, my usual pattern is very slightly different and I don't have a memorized idiom for this without going through usize, since, well, I figured I just wanted to use align_offset):
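const USIZE_SIZE: usize = core::mem::size_of::<usize>();

pub unsafe fn manual_align_offset(p: *const u8) -> *const u8 {
    // IIRC just doing `arithmetic(p as usize) as *const u8` makes LLVM sad
    // Round the address up to the next multiple of USIZE_SIZE (a power of two).
    let aligned = ((p as usize + USIZE_SIZE - 1) & !(USIZE_SIZE - 1)) as isize;
    p.offset(aligned - (p as isize))
}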
compiles to
example::manual_align_offset:
        leaq    7(%rdi), %rax
        andq    $-8, %rax
        retq
Which is substantially better along a variety of metrics, including but not limited to actual runtime.
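For reference, here is a minimal sketch of the same round-up that stays on the pointer instead of casting an integer back to one. The names align_up_to_usize and agrees_with_align_offset are hypothetical and not part of core, and I have not checked what this compiles to. The comparison tolerates usize::MAX, since the documentation allows align_offset to return that.

```rust
use core::mem::size_of;

/// Hypothetical helper (not part of `core`): rounds `p` up to the next
/// multiple of `size_of::<usize>()`, which is always a power of two.
pub fn align_up_to_usize(p: *const u8) -> *const u8 {
    const MASK: usize = size_of::<usize>() - 1;
    // Bytes needed to reach the next boundary; 0 if `p` is already aligned.
    let offset = (p as usize).wrapping_neg() & MASK;
    // `wrapping_add` derives the result from `p` rather than from an integer.
    p.wrapping_add(offset)
}

/// Checks every position in `buf` against `align_offset`. A result of
/// `usize::MAX` is skipped, because `align_offset` is documented as being
/// allowed to return it.
pub fn agrees_with_align_offset(buf: &[u8]) -> bool {
    let align = size_of::<usize>();
    (0..buf.len()).all(|i| {
        let p = buf[i..].as_ptr();
        let off = p.align_offset(align);
        off == usize::MAX || p.wrapping_add(off) == align_up_to_usize(p)
    })
}
```

The returned pointer may land past the end of the allocation, so it still has to be bounds-checked before any dereference.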
Taking a look at the source for align_offset reveals that it uh, well it does some stuff.
rust/library/core/src/ptr/mod.rs, lines 1166 to 1271 in ac48e62
(... p + USIZE - 1 wraps around the address space but the aligned value wouldn't).
Anyway IIUC align_offset is really considered the way forward for all pointer aligning, as miri will throw your code straight into the trash if it catches it dereferencing a pointer that you manually aligned... (I have Opinions on this but I'll spare you from a rant).
So, for that and a lot of other reasons, I think we'd like the codegen for align_offset at opt-level=3 to look a lot closer to what I provided, even if it means special-casing when size_of::<T>() == 1... (I mean, I'd also love for it not to generate the whole code mountain in debug builds, but one thing at a time I guess).
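To make the special-casing idea concrete, a rough sketch follows. It is a free function with the hypothetical name align_offset_sketch, not the actual libcore implementation, and it assumes that when size_of::<T>() == 1 the offset in elements equals the offset in bytes, so a single mask is enough. Every other stride is simply forwarded to the existing align_offset.

```rust
use core::mem::size_of;

/// Hypothetical sketch, not the libcore implementation: take a mask-based
/// fast path for one-byte element types and defer to `align_offset` otherwise.
pub fn align_offset_sketch<T>(p: *const T, align: usize) -> usize {
    // Same precondition that `align_offset` itself enforces.
    assert!(align.is_power_of_two());
    if size_of::<T>() == 1 {
        // With a stride of 1, elements and bytes coincide, so the offset is
        // just the distance to the next multiple of `align`.
        (p as usize).wrapping_neg() & (align - 1)
    } else {
        // General strides keep going through the existing routine.
        p.align_offset(align)
    }
}
```

Since size_of::<T>() is a compile-time constant, the branch should fold away after monomorphization, though whether the resulting code matches the hand-written version is something I have not verified.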
Anyway, the function's documentation comment tells me that @nagisa has taken up the sacred burden of "keeper of align_offset's secrets"... I have questions: is this fixable? And if not, is there an interface that we can provide that lets us produce good code here? Am I missing something?
P.S. Godbolt link: https://godbolt.org/z/388Enf