Open
Description
Consider this example:
pub fn parse(x: usize) -> usize {
match x % 5 {
0 => f1(x),
1 => f2(x),
2 => f3(x),
3 => f4(x),
4 => f5(x),
// 5 | 6 | 7 => loop{},
_ => loop{},
}
}
It currently generates similar LLVM-IR:
start:
%_2 = urem i64 %x, 5, !dbg !10
switch i64 %_2, label %bb12 [
i64 0, label %bb2
i64 1, label %bb4
i64 2, label %bb6
i64 3, label %bb8
i64 4, label %bb10
], !dbg !11
; ...
bb12:
br label %bb12, !dbg !32
Even though the default branch is unreachable (x % 5
can't be greater than 4) it is still generated.
But, when un-commenting 5 | 6 | 7 => loop{}
line (still unreachable branch, that does the same as default) the default branch becomes unreachable
:
start:
%_2 = urem i64 %x, 5, !dbg !10
switch i64 %_2, label %bb14 [
i64 0, label %bb2
i64 1, label %bb4
i64 2, label %bb6
i64 3, label %bb8
i64 4, label %bb10
], !dbg !11
; ...
bb14:
unreachable
So, adding an unreachable branch that does the same as the default helps optimize the code.
Some notes:
- These differences then propagate to assembler, i.e. it also has an unreachable branch when
5 | 6 | 7
is commented-out - godbolt link
- Range patterns (like
5..=7
) do not help - Making the default branch
5 | 6 | 7 | _
doesn't help either - Similar things happen for any
N
inx % N
that is not a power of two -- adding unreachable branches until the last branch has a power-of-two-minus-one-value helps optimizing the code loop{}
s can be replaced with anything else, for exampleunreachable!()
orpanic!()
, effect is still the same- in case of panic-like things not removing the panicking branch seems like a big deal
- I've used
loop{}
s to make diffs less noisy
if-else-if
chains are identical tomatch
here- clang seems to handle this situation just fine in any case: godbolt link
- it also makes the last reachable branch default in
llvm-ir
, instead of generating anunreachable
one
- it also makes the last reachable branch default in
Metadata
Metadata
Assignees
Labels
Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.Area: Code generationCategory: An issue proposing an enhancement or a PR with one.Category: An issue highlighting optimization opportunities or PRs implementing suchCall for participation: An issue has been fixed and does not reproduce, but no test has been added.Issue: Problems and improvements with respect to performance of generated code.Relevant to the compiler team, which will review and decide on the PR/issue.