Range.contains failed to be inlined/optimized #90609

Closed
@senevoldsen

It was suggested to me on Stack Overflow (https://stackoverflow.com/questions/69844819/rust-range-contains-failed-to-be-inlined-optimized) that I ask here.

I am aware that optimization in complex situations can fail to apply. However, rather straightforward inlining "in the small" should still apply.

I was running my code through Clippy and it suggested changing the following:

const SPECIAL_VALUE: u8 = 0; // May change eventually.

pub fn version1(value: u8) -> bool {
    (value >= 1 && value <= 9) || value == SPECIAL_VALUE
}

into:

pub fn version2(value: u8) -> bool {
    (1..=9).contains(&value) || value == SPECIAL_VALUE
}

since it is more readable. Unfortunately, the resulting assembly output is twice as long, even at optimization level 3. Manually inlining it (two levels of nesting down) gives almost the same code as version1 and is just as efficient:

pub fn manually_inlined(value: u8) -> bool {
    (1 <= value && value <= 9) || value == SPECIAL_VALUE
}
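
For reference, the two levels of nesting mentioned above can be sketched roughly as follows. This approximates how a bounds-based `contains` can be written with `std::ops::RangeBounds` and `Bound`; it is not a copy of the standard library source, and `contains_via_bounds` and `version2_expanded` are hypothetical names used only for illustration.

```rust
use std::ops::{Bound, RangeBounds};

const SPECIAL_VALUE: u8 = 0;

// Hypothetical helper: tests membership by matching on both bounds and
// comparing through references, as `contains(&value)` does.
pub fn contains_via_bounds<R: RangeBounds<u8>>(range: &R, item: &u8) -> bool {
    let above_start = match range.start_bound() {
        Bound::Included(start) => start <= item,
        Bound::Excluded(start) => start < item,
        Bound::Unbounded => true,
    };
    let below_end = match range.end_bound() {
        Bound::Included(end) => item <= end,
        Bound::Excluded(end) => item < end,
        Bound::Unbounded => true,
    };
    above_start && below_end
}

// Same result as version2, with the range check spelled out.
pub fn version2_expanded(value: u8) -> bool {
    contains_via_bounds(&(1..=9), &value) || value == SPECIAL_VALUE
}
```

For `1..=9` only the `Included` arms are taken, so after inlining this should reduce to the same comparisons as `manually_inlined`.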

If I remove the || value == SPECIAL_VALUE, they all compile to the same assembly (though with one more instruction added to decrement the parameter value before the comparison). Also, if I change SPECIAL_VALUE to something not adjacent to the range, they all compile to the same assembly as version2, which is why I have kept it at 0 unless I eventually have to change it.
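
To make the arithmetic behind these observations concrete, here is a small sketch of the equivalences that I presume the optimizer is exploiting for `u8`. The names `simplified`, `range_only`, `range_only_sub` and `all_agree` are mine and purely illustrative; I am assuming, not claiming, that these are the forms LLVM arrives at.

```rust
const SPECIAL_VALUE: u8 = 0;

pub fn version1(value: u8) -> bool {
    (value >= 1 && value <= 9) || value == SPECIAL_VALUE
}

pub fn version2(value: u8) -> bool {
    (1..=9).contains(&value) || value == SPECIAL_VALUE
}

// With SPECIAL_VALUE == 0 adjacent to the range, the whole test
// collapses to a single unsigned comparison.
pub fn simplified(value: u8) -> bool {
    value <= 9
}

// The range check alone, without the special value.
pub fn range_only(value: u8) -> bool {
    (1..=9).contains(&value)
}

// Decrement, then one unsigned compare: 0 wraps to 255 and is rejected.
pub fn range_only_sub(value: u8) -> bool {
    value.wrapping_sub(1) < 9
}

// Exhaustively checks every u8 input.
pub fn all_agree() -> bool {
    (0..=u8::MAX).all(|v| {
        version1(v) == version2(v)
            && version1(v) == simplified(v)
            && range_only(v) == range_only_sub(v)
    })
}
```

Since `u8` has only 256 values, `all_agree()` covers every input, so the question is only whether the compiler finds these forms, not whether they are valid.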

I have a link to Godbolt with the code here: https://rust.godbolt.org/z/d9PWYEKc8

Why is the compiler failing to properly inline/optimize version2? Is it an "optimization bug"? Or am I misunderstanding some semantics of Rust? Maybe something about the borrow prevents the optimization, but can't the compiler assume that value is not mutated, given the aliasing and referencing rules? The fact that the optimization is applied in version1 suggests that LLVM knows it can simplify the comparison because the value is unsigned. So could there be a missed optimization opportunity in the Rust frontend?

Doing something similar in C++ gives the optimal short assembly with GCC but not with Clang: https://godbolt.org/z/erYPYsvhf

Labels

A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
I-slow (Issue: Problems and improvements with respect to performance of generated code.)
