Closed
Description
Iterating with iter().skip() produces much worse optimized code than an equivalent hand-written loop over a range.
https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=b7ed8bf9e4fc3341a92f301fa5185cc5
pub fn test_1(a: [i32; 10]) -> i32 {
    let mut sum = 0;
    for v in a.iter().skip(8) {
        sum += v;
    }
    sum
}

pub fn test_2(a: [i32; 10]) -> i32 {
    let mut sum = 0;
    for index in 8..10 {
        sum += a[index];
    }
    sum
}
This produces the following asm output:
playground::test_1:
    movq %rdi, %r8
    addq $40, %r8
    xorl %esi, %esi
    movl $8, %edx
    xorl %eax, %eax
    testb $1, %sil
    jne .LBB0_2
.LBB0_5:
    leaq -1(%rdx), %rsi
    movq %r8, %rcx
    subq %rdi, %rcx
    shrq $2, %rcx
    cmpq %rsi, %rcx
    jbe .LBB0_4
    leaq (%rdi,%rdx,4), %rdi
.LBB0_2:
    cmpq %r8, %rdi
    je .LBB0_4
    testq %rdi, %rdi
    je .LBB0_4
    addl (%rdi), %eax
    addq $4, %rdi
    movb $1, %sil
    xorl %edx, %edx
    testb $1, %sil
    je .LBB0_5
    jmp .LBB0_2
.LBB0_4:
    retq

playground::test_2:
    movl 36(%rdi), %eax
    addl 32(%rdi), %eax
    retq
Given the zero-cost abstraction principle and the fact that the compiler knows the size of the array, test_1 should be optimized to at least the same form as test_2, where the compiler correctly detects that only two values need to be summed. Instead, test_1 compiles to a sizable chunk of asm with many branches.
The issue is present on stable (1.63.0) as well as on the beta and nightly channels.
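For comparison, here is a minimal sketch of a third variant that slices the array before iterating, so the Skip adapter is not involved at all. The name test_3 is hypothetical and not part of the original report. The sketch only demonstrates that the result matches test_1; whether it compiles to the same two-instruction form as test_2 on a given compiler version is an assumption that has not been verified here and would need to be checked in the playground.

```rust
/// Copy of `test_1` from the report, kept here so the sketch is self-contained.
pub fn test_1(a: [i32; 10]) -> i32 {
    let mut sum = 0;
    for v in a.iter().skip(8) {
        sum += v;
    }
    sum
}

/// Hypothetical variant (not from the report): slice first, then iterate.
/// `a[8..]` covers the last two elements of the array, so the loop runs
/// over a plain slice iterator instead of `Skip<Iter<i32>>`.
pub fn test_3(a: [i32; 10]) -> i32 {
    let mut sum = 0;
    for v in a[8..].iter() {
        sum += v;
    }
    sum
}
```

Both functions take the array by value, and [i32; 10] is Copy, so the same array can be passed to each of them to compare results.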
Metadata
Labels
Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
Category: This is a bug.
Call for participation: An issue has been fixed and does not reproduce, but no test has been added.
Issue: Problems and improvements with respect to performance of generated code.
Relevant to the compiler team, which will review and decide on the PR/issue.