Reject OOB shufflevector intrinsics

Looks like there are some [open](https://blog.regehr.org/archives/1737) questions regarding the semantics of shufflevector intrinsics in LLVM:

> When the mask is out of bounds, the current semantics are that the resulting element is undef. If shufflevector is used to remove an element from the input vector, then the instruction cannot be removed, as the input might have been poison. The solution is to switch to give poison instead. 

So my question is, what are the semantics of those intrinsics in MIR? We had some [discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/LLVM.20IR.20semantics) and it seems like it should be one of the two:
* OOB shufflevector should be statically ruled out. We could even have the new MIR validation pass check that.
* OOB shufflevector returns "uninitialized memory" for the affected elements. (Following [this paper](http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf), MIR does not have a poison vs undef distinction; our "uninit" is closest to LLVM's poison.)
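To make the first option concrete, here is a rough sketch of what such a static check could look like. It assumes the shuffle takes two input vectors of equal lane count and a constant index array, so valid indices are those below twice the lane count. The names `check_shuffle_indices` and `OobShuffleIndex` are made up for illustration and are not existing rustc APIs.

```rust
/// Hypothetical error for this sketch; not an actual rustc type.
#[derive(Debug, PartialEq)]
struct OobShuffleIndex {
    /// Position of the offending entry in the index array.
    position: usize,
    /// The out-of-bounds index value itself.
    index: u32,
    /// Exclusive upper bound on valid indices (twice the input lane count).
    limit: u32,
}

/// Checks a constant shuffle index array against two input vectors of
/// `input_len` lanes each. Indices select from the concatenation of both
/// inputs, so every index must be strictly less than `2 * input_len`.
fn check_shuffle_indices(input_len: u32, indices: &[u32]) -> Result<(), OobShuffleIndex> {
    let limit = input_len
        .checked_mul(2)
        .expect("lane count too large for this sketch");
    for (position, &index) in indices.iter().enumerate() {
        if index >= limit {
            // Report the first out-of-bounds entry and stop.
            return Err(OobShuffleIndex { position, index, limit });
        }
    }
    Ok(())
}
```

With 4-lane inputs, `check_shuffle_indices(4, &[0, 5, 2, 7])` is accepted, while `check_shuffle_indices(4, &[0, 8, 2, 7])` is rejected because index 8 is past the last lane of the second input. Where exactly such a check would live (intrinsic type-checking, MIR validation, or codegen) is left open here.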

The former definitely seems the safest ;) but currently this is [not enforced](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=3e5c8da20683871a552d0b4067a3f8cc). @hanna-kruppe was concerned that stdarch might rely on OOB shufflevector somewhere. If we have consensus that, for now, we should statically rule out OOB indices to avoid UB/uninit here, I guess the main open issues are (a) implementing that check and (b) fixing the fallout, if any.

Cc @rust-lang/lang @rust-lang/libs @gnzlbg @Lokathor (not sure who else to ping for stdarch)
