Skip to content

Enum discriminants used as indices are subject to bounds checks #36955

Closed
@abonander

Description

@abonander

Case Study

All enum discriminants are statically known, but the optimizer appears not to be able to introspect their max values when they are cast to primitives and used as indices, probably because at the LLVM level, they only ever are primitives. LLVM is never told their maximum value. This means that when using enum discriminants to index arrays/vectors whose lengths are also statically known, bounds checks are emitted even when one would reasonably expect them to be elided.

Minimal example (Playground):

#![feature(test)]

extern crate test;

use test::black_box;

#[repr(usize)]
enum Index {
    Zero = 0,
    One = 1,
}

#[inline(never)]
fn idx_cast(vals: &[i32; 2], idx: Index) {
    black_box(vals[idx as usize]);
}

fn main() {
    let vals = black_box([0, 1]);
    idx_cast(&vals, black_box(Index::Zero));
}

If you look at the emitted assembly (in release mode), you can see that idx_cast does a bounds-check.

If you add ::std::intrinsics::assume(idx as usize < 2), the bounds-check is elided and LLVM appears to use the discriminant as the index directly: https://is.gd/DdX7dd

This also works even if enum Index is not #[repr(usize)]; LLVM simply zero-extends the discriminant: https://is.gd/2JrJty

So as a possible optimization at codegen, whenever an enum is cast to a primitive, we could (or maybe even should) emit @llvm.assume( [discriminant value] <= [maximum discriminant value] ).

Converting via explicit match also allows elision of the bounds check without reintroducing a branch, but has differing behavior between enums with 2 variants and enums of more variants:

  • 2 variant enums get a use 1 if discriminant is 1, otherwise 0 treatment: https://is.gd/VS7JZK
  • Enums with > 2 variants directly use the discriminant (or zero-extend it if it's a different width) like the assume solution: https://is.gd/d8dEtY

I assume LLVM does the former in the different-width case because it uses fewer cycles than the latter, since they are otherwise functionally equivalent. However, when the enum is #[repr(usize)], the 2-variant version is using 3 extra, unnecessary instructions. But that seems like an LLVM problem more than a codegen problem.

Metadata

Metadata

Assignees

No one assigned

    Labels

    I-slowIssue: Problems and improvements with respect to performance of generated code.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions