Big compile time regression for Hexagon Linux kernel build after commit 90ba33099c #80185

Open
@nathanchance

After commit 90ba330, I see a very noticeable build time regression when building drivers/net/wireless/ralink/rt2x00/rt2800lib.c from the Linux kernel for ARCH=hexagon at -O2. I am still working on a concise reproducer, but the reduction is taking quite a long time, so I am posting what I have so far to raise visibility. I will post a smaller reproducer once the reduction finishes.

With a base command of clang -fno-short-enums --target=hexagon-linux-musl -O2 -c -o /dev/null rt2800lib.i:

Benchmark 1: clang version 19.0.0git (https://github.com/llvm/llvm-project 98509c7f9792c79b05a41b95c24607f6dd489c5a) @ -O0
  Time (abs ≡):         3.024 s               [User: 2.924 s, System: 0.100 s]

Benchmark 2: clang version 19.0.0git (https://github.com/llvm/llvm-project 98509c7f9792c79b05a41b95c24607f6dd489c5a) @ -O2
  Time (abs ≡):        42.393 s               [User: 42.293 s, System: 0.100 s]

Benchmark 3: clang version 19.0.0git (https://github.com/llvm/llvm-project 90ba33099cbb17e7c159e9ebc5a512037db99d6d) @ -O0
  Time (abs ≡):         2.646 s               [User: 2.586 s, System: 0.060 s]

Benchmark 4: clang version 19.0.0git (https://github.com/llvm/llvm-project 90ba33099cbb17e7c159e9ebc5a512037db99d6d) @ -O2
  Time (abs ≡):        2113.743 s               [User: 2113.603 s, System: 0.110 s]

Summary
  clang version 19.0.0git (https://github.com/llvm/llvm-project 90ba33099cbb17e7c159e9ebc5a512037db99d6d) @ -O0 ran
    1.14 times faster than clang version 19.0.0git (https://github.com/llvm/llvm-project 98509c7f9792c79b05a41b95c24607f6dd489c5a) @ -O0
   16.02 times faster than clang version 19.0.0git (https://github.com/llvm/llvm-project 98509c7f9792c79b05a41b95c24607f6dd489c5a) @ -O2
  798.75 times faster than clang version 19.0.0git (https://github.com/llvm/llvm-project 90ba33099cbb17e7c159e9ebc5a512037db99d6d) @ -O2

rt2800lib.i.txt

This appears related to the compile-time ffs() macros this driver implements, judging by the massive number of __builtin_choose_expr() calls in the preprocessed source: FIELD32 expands to compile_ffs32, which in turn expands all of the other compile_ffs macros down to compile_ffs2. Perhaps the kernel should not be doing this (and that is reflected in the original baseline of 42s), but I do not think a roughly 50x compile time regression (42.4s to 2113.7s at -O2) should happen either. I have not seen this level of regression in any of my other builds for other architectures, so perhaps this is related to something specific to Hexagon?
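For readers unfamiliar with the pattern, here is a minimal sketch of how such a nested compile-time ffs macro chain can be built on __builtin_choose_expr(). The sketch_ffs* names are hypothetical and the bodies are written from the description above, not copied from the kernel, so the real compile_ffs macros may differ in detail. Each level halves the bit width and expands the level below it twice, so a single sketch_ffs32() use contains 1 + 2 + 4 + 8 + 16 = 31 __builtin_choose_expr() calls, and the unselected branches are still parsed even though they are never evaluated.

```c
#include <assert.h>

/*
 * Hypothetical sketch of a compile-time "find first set" macro chain.
 * Each macro returns the zero-based index of the lowest set bit of a
 * constant mask. Names and bodies are illustrative assumptions only.
 */
#define sketch_ffs2(x) \
	__builtin_choose_expr(((x) & 0x1), 0, 1)

#define sketch_ffs4(x) \
	__builtin_choose_expr(((x) & 0x3), \
			      sketch_ffs2(x), sketch_ffs2((x) >> 2) + 2)

#define sketch_ffs8(x) \
	__builtin_choose_expr(((x) & 0xf), \
			      sketch_ffs4(x), sketch_ffs4((x) >> 4) + 4)

#define sketch_ffs16(x) \
	__builtin_choose_expr(((x) & 0xff), \
			      sketch_ffs8(x), sketch_ffs8((x) >> 8) + 8)

#define sketch_ffs32(x) \
	__builtin_choose_expr(((x) & 0xffff), \
			      sketch_ffs16(x), sketch_ffs16((x) >> 16) + 16)

/* Example use: the shift needed to reach a register field at bits 4..11. */
static int lowest_bit_of_mask(void)
{
	return sketch_ffs32(0x00000ff0u);
}
```

A driver with many register field definitions repeats this expansion once per field, which is consistent with the large number of __builtin_choose_expr() calls seen in rt2800lib.i.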

cc @nikic @androm3da @SundeepKushwaha @SidManning

Bisect log
# bad: [795090739cf3b295be750dfba0af2ba993e60cdd] [RISCV] Fix a bug accidentally introduced in e9311f9
# good: [4c3de45ecf9eea6b4ad850a042706f7865a2aab2] [LoongArch][test] Add tests reporting error if lsx/lasx feature is missing when lsx/lasx builtins are called (#79250)
git bisect start '795090739cf3b295be750dfba0af2ba993e60cdd' '4c3de45ecf9eea6b4ad850a042706f7865a2aab2'
# bad: [8d43dad9b86ad0f72100b6f75450f2982f2663b9] [clang][bazel] Fix BUILD after 4a582845597e97d245e8ffdc14281f922b835e56.
git bisect bad 8d43dad9b86ad0f72100b6f75450f2982f2663b9
# good: [9dddb3d5f3bf323b7b7f8281bb848731f69fddfa] [AST] Mark the fallthrough coreturn statement implicit. (#77465)
git bisect good 9dddb3d5f3bf323b7b7f8281bb848731f69fddfa
# good: [98509c7f9792c79b05a41b95c24607f6dd489c5a] [AArch64] Add vec3 tests with different load/store alignments.
git bisect good 98509c7f9792c79b05a41b95c24607f6dd489c5a
# bad: [aaa93ce7323332d8290b8f563d4d71689c1094c5] compiler-rt: Fix FLOAT16 feature detection
git bisect bad aaa93ce7323332d8290b8f563d4d71689c1094c5
# bad: [03a9f07e189db792b001c4001981d6e2da880221] [libc++][NFC] Fix leftover && in comment
git bisect bad 03a9f07e189db792b001c4001981d6e2da880221
# bad: [3d91d9613e294b242d853039209b40a0cb7853f2] [ConstraintElim] Make sure min/max intrinsic results are not poison.
git bisect bad 3d91d9613e294b242d853039209b40a0cb7853f2
# bad: [90ba33099cbb17e7c159e9ebc5a512037db99d6d] [InstCombine] Canonicalize constant GEPs to i8 source element type (#68882)
git bisect bad 90ba33099cbb17e7c159e9ebc5a512037db99d6d
# first bad commit: [90ba33099cbb17e7c159e9ebc5a512037db99d6d] [InstCombine] Canonicalize constant GEPs to i8 source element type (#68882)
