Description
After 90ba330, I see a very noticeable build time regression when building drivers/net/wireless/ralink/rt2x00/rt2800lib.c from the Linux kernel for ARCH=hexagon at -O2. I am still working on a concise reproducer, but the reduction is taking quite long, so I am posting what I have so far to raise visibility. I'll post a more concise reproducer once the reduction finishes.
With a base command of clang -fno-short-enums --target=hexagon-linux-musl -O2 -c -o /dev/null rt2800lib.i:
Benchmark 1: clang version 19.0.0git (https://github.com/llvm/llvm-project 98509c7f9792c79b05a41b95c24607f6dd489c5a) @ -O0
Time (abs ≡): 3.024 s [User: 2.924 s, System: 0.100 s]
Benchmark 2: clang version 19.0.0git (https://github.com/llvm/llvm-project 98509c7f9792c79b05a41b95c24607f6dd489c5a) @ -O2
Time (abs ≡): 42.393 s [User: 42.293 s, System: 0.100 s]
Benchmark 3: clang version 19.0.0git (https://github.com/llvm/llvm-project 90ba33099cbb17e7c159e9ebc5a512037db99d6d) @ -O0
Time (abs ≡): 2.646 s [User: 2.586 s, System: 0.060 s]
Benchmark 4: clang version 19.0.0git (https://github.com/llvm/llvm-project 90ba33099cbb17e7c159e9ebc5a512037db99d6d) @ -O2
Time (abs ≡): 2113.743 s [User: 2113.603 s, System: 0.110 s]
Summary
clang version 19.0.0git (https://github.com/llvm/llvm-project 90ba33099cbb17e7c159e9ebc5a512037db99d6d) @ -O0 ran
1.14 times faster than clang version 19.0.0git (https://github.com/llvm/llvm-project 98509c7f9792c79b05a41b95c24607f6dd489c5a) @ -O0
16.02 times faster than clang version 19.0.0git (https://github.com/llvm/llvm-project 98509c7f9792c79b05a41b95c24607f6dd489c5a) @ -O2
798.75 times faster than clang version 19.0.0git (https://github.com/llvm/llvm-project 90ba33099cbb17e7c159e9ebc5a512037db99d6d) @ -O2
This appears related to the compile-time ffs() macros this driver implements, given the massive number of __builtin_choose_expr() calls involved: FIELD32 expands to compile_ffs32, which in turn expands through all of the other compile_ffs macros down to compile_ffs2... Perhaps the kernel should not be doing this (and that is reflected in the original baseline of 42s), but I do not think a roughly 50x compile time regression (42s to 2114s at -O2) should happen either. I have not seen this level of regression in any of my other builds for other architectures, so perhaps this is related to something specific to Hexagon?
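For anyone not familiar with the driver, the macro nesting described above looks roughly like the sketch below. This is an illustrative reconstruction modeled on the compile_ffs pattern, not a verbatim copy of the kernel header, and it is not the reduced reproducer. The compile_ffs_selfcheck helper is hypothetical and only there to show what the chain evaluates to. __builtin_choose_expr() is a GNU C extension, so this assumes clang or gcc in C mode.

```c
#include <assert.h>

/*
 * Sketch of a compile-time ffs() chain: each level tests the low half of
 * the remaining bits and recurses into one of two copies of the level
 * below, so a single compile_ffs32() use textually contains 16
 * compile_ffs2() expansions.
 */
#define compile_ffs2(x) \
	__builtin_choose_expr(((x) & 0x1), 0, 1)

#define compile_ffs4(x) \
	__builtin_choose_expr(((x) & 0x3), \
			      (compile_ffs2((x))), \
			      (compile_ffs2((x) >> 2) + 2))

#define compile_ffs8(x) \
	__builtin_choose_expr(((x) & 0xf), \
			      (compile_ffs4((x))), \
			      (compile_ffs4((x) >> 4) + 4))

#define compile_ffs16(x) \
	__builtin_choose_expr(((x) & 0xff), \
			      (compile_ffs8((x))), \
			      (compile_ffs8((x) >> 8) + 8))

#define compile_ffs32(x) \
	__builtin_choose_expr(((x) & 0xffff), \
			      (compile_ffs16((x))), \
			      (compile_ffs16((x) >> 16) + 16))

/* Hypothetical helper, not part of the driver: checks a few masks. */
void compile_ffs_selfcheck(void)
{
	assert(compile_ffs32(0x00000001u) == 0);  /* lowest bit set */
	assert(compile_ffs32(0x00000ff0u) == 4);  /* field starts at bit 4 */
	assert(compile_ffs32(0x80000000u) == 31); /* highest bit set */
}
```

The argument is substituted into both branches at every level, so the preprocessed source grows quickly with each use, which would fit the already slow 42s baseline. Whether this is what actually triggers the slowdown after 90ba330 is still a guess on my part.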
cc @nikic @androm3da @SundeepKushwaha @SidManning
Bisect log
# bad: [795090739cf3b295be750dfba0af2ba993e60cdd] [RISCV] Fix a bug accidentally introduced in e9311f9
# good: [4c3de45ecf9eea6b4ad850a042706f7865a2aab2] [LoongArch][test] Add tests reporting error if lsx/lasx feature is missing when lsx/lasx builtins are called (#79250)
git bisect start '795090739cf3b295be750dfba0af2ba993e60cdd' '4c3de45ecf9eea6b4ad850a042706f7865a2aab2'
# bad: [8d43dad9b86ad0f72100b6f75450f2982f2663b9] [clang][bazel] Fix BUILD after 4a582845597e97d245e8ffdc14281f922b835e56.
git bisect bad 8d43dad9b86ad0f72100b6f75450f2982f2663b9
# good: [9dddb3d5f3bf323b7b7f8281bb848731f69fddfa] [AST] Mark the fallthrough coreturn statement implicit. (#77465)
git bisect good 9dddb3d5f3bf323b7b7f8281bb848731f69fddfa
# good: [98509c7f9792c79b05a41b95c24607f6dd489c5a] [AArch64] Add vec3 tests with different load/store alignments.
git bisect good 98509c7f9792c79b05a41b95c24607f6dd489c5a
# bad: [aaa93ce7323332d8290b8f563d4d71689c1094c5] compiler-rt: Fix FLOAT16 feature detection
git bisect bad aaa93ce7323332d8290b8f563d4d71689c1094c5
# bad: [03a9f07e189db792b001c4001981d6e2da880221] [libc++][NFC] Fix leftover && in comment
git bisect bad 03a9f07e189db792b001c4001981d6e2da880221
# bad: [3d91d9613e294b242d853039209b40a0cb7853f2] [ConstraintElim] Make sure min/max intrinsic results are not poison.
git bisect bad 3d91d9613e294b242d853039209b40a0cb7853f2
# bad: [90ba33099cbb17e7c159e9ebc5a512037db99d6d] [InstCombine] Canonicalize constant GEPs to i8 source element type (#68882)
git bisect bad 90ba33099cbb17e7c159e9ebc5a512037db99d6d
# first bad commit: [90ba33099cbb17e7c159e9ebc5a512037db99d6d] [InstCombine] Canonicalize constant GEPs to i8 source element type (#68882)