Description
This is a bit of a hybrid bug report and documentation request. I suspect the "bug" will require a breaking change to fix, so it may be a non-starter.
In Diplomat we've been writing bindings to Rust from JS/Wasm. One thing we support is passing structs over FFI, by-value.
According to the tool conventions, multi-scalar-field structs are passed "indirectly", and single-scalar structs and scalars are passed directly by-value.
This is not what Rust does. This has previously come up in #81386. Rust always passes structs by-value, which on the JS side means that the struct is "splatted" across the arguments, padding included.
For example, the following type and function
#[repr(C)]
pub struct Big {
    a: u8,
    // 1 byte padding
    b: u16,
    // 4 bytes padding
    c: u64,
}
#[no_mangle]
pub extern "C" fn big(x: Big) {}
gets passed over LLVM IR as
%Big = type { i8, [1 x i8], i16, [2 x i16], i64 }
define dso_local void @big(%Big %0) unnamed_addr #0 {...}
In JS, big needs to be called as wasm.big(a, 0, b, 0, 0, c), with 0s for the padding fields. Note that the padding fields can be different sizes, which is usually irrelevant but matters here, since "two i16s" and "one i32" mean a different number of parameters. As far as I can tell, the elements of each padding field have the same size as the alignment of the previous "real" field, but I can't find any documentation on this, and I don't know whether this is a choice on LLVM's side or Rust's.
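To make the counting concrete, here is a minimal sketch that models the observation above. It is not taken from any Rust or LLVM documentation: the function name `splatted_arg_count` and the `(size, align)` field encoding are made up for illustration, and the rule "one argument per padding element, each as wide as the previous field's alignment" is exactly the undocumented guess described in the previous paragraph. It models only inter-field padding in the general struct case and no special cases.

```rust
/// One field of a hypothetical `#[repr(C)]` struct: (size, alignment), in bytes.
type Field = (usize, usize);

/// Counts the Wasm-level arguments for a struct passed by value, under the
/// ASSUMPTION (an observation, not a documented rule) that padding is emitted
/// as an array whose elements are as wide as the previous field's alignment,
/// and that every array element becomes one argument.
fn splatted_arg_count(fields: &[Field]) -> usize {
    let mut offset = 0usize; // running byte offset under C layout rules
    let mut prev_align = 1usize; // alignment of the previous real field
    let mut args = 0usize;
    for &(size, align) in fields {
        // Round the offset up to this field's alignment.
        let aligned = (offset + align - 1) / align * align;
        let padding = aligned - offset;
        // Assumed: padding is split into `prev_align`-byte elements.
        args += padding / prev_align;
        // The real field itself is one argument.
        args += 1;
        offset = aligned + size;
        prev_align = align;
    }
    args
}

fn main() {
    // Big { a: u8, b: u16, c: u64 } -> a, pad, b, pad, pad, c
    let big = [(1, 1), (2, 2), (8, 8)];
    println!("Big: {} arguments", splatted_arg_count(&big));
}
```

For Big this gives 6, matching the wasm.big(a, 0, b, 0, 0, c) call shape.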
It gets worse, however. Rust seems to special case scalar pairs:
#[repr(C)]
pub struct Small {
    a: u8,
    // 3 bytes padding
    b: u32
}
#[no_mangle]
pub extern "C" fn small(x: Small) {}
define dso_local void @small(i8 %x.0, i32 %x.1) unnamed_addr #0 {..}
Here, despite Small having padding, small() gets called as wasm.small(a, b), because the fields got splatted out in the LLVM IR itself, without padding.
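The padding in Small is still there in memory; it is only the function signature that drops it. A small sketch to check the in-memory layout, assuming a toolchain recent enough to have `std::mem::offset_of!` and a target where u32 is 4-byte aligned:

```rust
use std::mem::{align_of, offset_of, size_of};

#[repr(C)]
pub struct Small {
    a: u8,
    // 3 bytes padding
    b: u32,
}

fn main() {
    // `b` starts at offset 4, so bytes 1..4 of the struct are padding,
    // even though the exported signature only shows two scalars.
    println!("offset of a: {}", offset_of!(Small, a));
    println!("offset of b: {}", offset_of!(Small, b));
    println!(
        "size: {}, align: {}",
        size_of::<Small>(),
        align_of::<Small>()
    );
}
```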
This is even stranger when compared with the tool conventions, which make no mention of scalar pairs.
It would be really nice if Rust followed the documented tool conventions. I suspect that's not going to happen, and, besides, direct parameter passing is likely more efficient here [1].
Failing that, it seems like it would be nice if "pairs" did not have special behavior compared to structs with more than 2 scalars in them. Ideally I'd like the scalar pair behavior to apply to all structs: always splat out structs into fields, never require padding be passed over FFI. I'm not sure if such a breaking change is possible, though.
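For reference, the call shape I'm asking for can be approximated by hand today by exporting a wrapper that takes one scalar per real field and rebuilds the struct on the Rust side. This is only a sketch: `big_flat` is a hypothetical name, `big` returns a sum here purely so the example has something observable, and a real export would also carry `#[no_mangle]`.

```rust
#[repr(C)]
pub struct Big {
    a: u8,
    b: u16,
    c: u64,
}

// Stand-in for the by-value entry point; the return value exists only
// so that the sketch has observable behaviour.
pub extern "C" fn big(x: Big) -> u64 {
    u64::from(x.a) + u64::from(x.b) + x.c
}

// Hypothetical wrapper: one scalar parameter per real field, no padding
// parameters. The struct is rebuilt on the Rust side.
pub extern "C" fn big_flat(a: u8, b: u16, c: u64) -> u64 {
    big(Big { a, b, c })
}

fn main() {
    println!("{}", big_flat(1, 2, 3));
}
```

A function like this has exactly three parameters in its signature, regardless of how Big is laid out, at the cost of one wrapper per exported function.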
Failing that, I think this behavior should be carefully documented. I've been discovering this by trial-and-error, and Rust's behavior contradicts the extant documentation, which makes it even more crucial to have documentation on how Rust diverges.
As far as I can tell, this is not a problem for wasm-bindgen since wasm-bindgen doesn't pass structs over FFI by-value, though the failing test mentioned in #81386 seems to indicate they care some amount.
Footnotes
1. Indirect passing means that the JS side basically has to allocate a heap object each time it wishes to call such a function.