Description
My apologies for not catching this sooner.
The struct alignment and packing that landed (well, was re-enabled) with 1.18.0 produces either improperly aligned structs or improperly aligned loads and stores (or both) on SPARC. The following test program demonstrates it (extracted from where I originally saw the compiler die):
pub struct Blake2bCtx {
a: [u8; 8],
b: u16,
c: u8,
}
#[inline(never)]
fn blake2b_compress(ctx: &mut Blake2bCtx) {
// Re-interpret the input buffer in the state as an array of little-endian
// u64s, converting them to machine endianness. It's OK to modify the
// buffer in place since this is the last time this data will be accessed
// before it's overwritten.
let m: &mut [u64; 1] = unsafe {
let t: &mut [u8; 8] = &mut ctx.a;
::std::mem::transmute(t)
};
for word in &mut m[..] {
*word = *word + 1;
}
}
fn main() {
let mut ctx = Blake2bCtx {
a: [0; 8],
b: 0,
c: 0,
};
blake2b_compress(&mut ctx)
}
The disassembly looks like this:
_ZN5blake16blake2b_compress17h54f2249749134ad6E()
blake::blake2b_compress::h54f2249749134ad6: 9d e3 bf 80 save %sp, -0x80, %sp
blake::blake2b_compress::h54f2249749134ad6+0x4: f2 5e 20 02 ldx [%i0 + 0x2], %i1
blake::blake2b_compress::h54f2249749134ad6+0x8: b2 06 60 01 add %i1, 0x1, %i1
blake::blake2b_compress::h54f2249749134ad6+0xc: f2 76 20 02 stx %i1, [%i0 + 0x2]
blake::blake2b_compress::h54f2249749134ad6+0x10: 81 c7 e0 08 ret
blake::blake2b_compress::h54f2249749134ad6+0x14: 81 e8 00 00 restore
Note the ldx
of [%i0 + 0x2]
. That would be valid if %i0
were itself misaligned, but it's not, and ldx
requires an 8-byte alignment. I can attach the entire IR if necessary, but I believe these are the relevant bits:
; ModuleID = 'blake.cgu-0.rs'
source_filename = "blake.cgu-0.rs"
target datalayout = "E-m:e-i64:64-n32:64-S128"
target triple = "sparcv9-sun-solaris"
%Blake2bCtx = type { i16, [0 x i8], [8 x i8], [0 x i8], i8, [1 x i8] }
%"core::slice::IterMut<u64>" = type { i64*, [0 x i8], i64*, [0 x i8], %"core::marker::PhantomData<&mut u64>", [0 x i8] }
%"core::marker::PhantomData<&mut u64>" = type {}
@__rustc_debug_gdb_scripts_section__ = internal unnamed_addr constant [34 x i8] c"\01gdb_load_rust_pretty_printers.py\00", section ".debug_gdb_scripts", align 1
; Function Attrs: noinline nounwind uwtable
define internal fastcc void @_ZN5blake16blake2b_compress17h54f2249749134ad6E(%Blake2bCtx* dereferenceable(12)) unnamed_addr #0 !dbg !5 {
bb6:
tail call void @llvm.dbg.value(metadata %Blake2bCtx* %0, i64 0, metadata !23, metadata !57), !dbg !58
tail call void @llvm.dbg.value(metadata %Blake2bCtx* %0, i64 0, metadata !24, metadata !57), !dbg !59
%1 = getelementptr inbounds %Blake2bCtx, %Blake2bCtx* %0, i64 0, i32 2, !dbg !60
%2 = bitcast [8 x i8]* %1 to i64*, !dbg !61
tail call void @llvm.dbg.value(metadata %"core::slice::IterMut<u64>"* undef, i64 0, metadata !39, metadata !62), !dbg !63
tail call void @llvm.dbg.value(metadata i64* %2, i64 0, metadata !54, metadata !57), !dbg !64
%3 = load i64, i64* %2, align 8, !dbg !65
%4 = add i64 %3, 1, !dbg !65
store i64 %4, i64* %2, align 8, !dbg !65
tail call void @llvm.dbg.value(metadata %"core::slice::IterMut<u64>"* undef, i64 0, metadata !39, metadata !62), !dbg !63
ret void, !dbg !66
}
I don't see anything obviously wrong in the code, but my knowledge here is pretty slim, so if it's obvious to someone else, I'd love the help. Otherwise, I'm probably going to be putting debugging statements into rustc::ty::layout::Struct::new
, hope I'm poking around in the right area, and doing a lot of recompiling.