Description
Looks like there are some open questions regarding the semantics of shufflevector intrinsics in LLVM:
When a mask index is out of bounds, the current semantics are that the resulting element is undef. If shufflevector is used to remove an element from the input vector, then the instruction cannot be removed, as the input might have been poison. The proposed solution is to have it return poison instead.
So my question is, what are the semantics of those intrinsics in MIR? We had some discussion on Zulip and it seems like it should be one of the two:
- OOB shufflevector should be statically ruled out. We could even have the new MIR validation pass check that.
- OOB shufflevector returns "uninitialized memory" for the affected elements. (Following this paper, MIR does not have a poison vs undef distinction; our "uninit" is closest to LLVM's poison.)
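To make the two options concrete, here is a rough sketch. It is not actual rustc or Miri code; `first_oob_index` and `shuffle_model` are hypothetical names made up for illustration. It assumes the shuffle takes two input vectors with the same number of lanes and that indices select from their concatenation, so with `n`-lane inputs the in-bounds indices are `0..2*n`. Uninit lanes are modeled as `None`.

```rust
/// Option 1 (hypothetical check): return the position of the first
/// out-of-bounds shuffle index, or `None` if every index is in bounds.
/// Assumes two inputs of `input_len` lanes each, so valid indices are
/// `0..2 * input_len`.
fn first_oob_index(indices: &[u32], input_len: u32) -> Option<usize> {
    // Widen to u64 so that `2 * input_len` cannot overflow.
    let total = 2 * u64::from(input_len);
    indices.iter().position(|&i| u64::from(i) >= total)
}

/// Option 2 (hypothetical model): evaluate the shuffle, producing `None`
/// (standing in for "uninitialized") for every out-of-bounds index.
fn shuffle_model<T: Copy>(a: &[T], b: &[T], indices: &[u32]) -> Vec<Option<T>> {
    indices
        .iter()
        .map(|&i| {
            let i = i as usize;
            if i < a.len() {
                // Index selects a lane of the first input.
                Some(a[i])
            } else {
                // Index selects a lane of the second input, or is OOB.
                b.get(i - a.len()).copied()
            }
        })
        .collect()
}
```

Under the first option, a validation pass would reject any shuffle for which `first_oob_index` returns `Some(_)`; under the second, the shuffle is accepted and reading an affected lane is what becomes a problem.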
The former definitely seems the safest ;) but currently this is not enforced. @hanna-kruppe was concerned that stdarch might rely on OOB shufflevector somewhere. If we have consensus that, for now, we should statically rule out OOB indices to avoid UB/uninit here, I guess the main open issues are (a) implementing that check and (b) fixing the fallout, if any.
Cc @rust-lang/lang @rust-lang/libs @gnzlbg @Lokathor (not sure who else to ping for stdarch)