Skip to content

[InstCombine] Missed optimization: fold ((select C1, C2) | A) ^ C3 to (select C1^C3, C2^C3) | A) #79266

Open
@XChy

Description

@XChy

Alive2 proof: https://alive2.llvm.org/ce/z/oEmX9u

Motivating example

define i32 @src(i32 %a, i1 %c) {
entry:
  %s = select i1 %c, i32 0, i32 4
  %shl = shl i32 %a, 4
  %or = or disjoint i32 %s, %shl
  %xor = xor i32 %or, 4
  ret i32 %xor
}

can be folded to

define i32 @tgt(i32 %a, i1 %c) {
entry:
  %s = select i1 %c, i32 4, i32 0
  %shl = shl i32 %a, 4
  %or = or disjoint i32 %s, %shl
  ret i32 %or
}

Since or disjoint here can be regarded as xor, ((select C1, C2) | A) ^ C3 can be (select C1, C2) ^ C3 ^ A) and then we get (select C1^C3, C2^C3) | A).

Real-world motivation

This snippet of IR is derived from zstd/lib/compress/zstd_compress_literals.c@ZSTD_compressLiterals (after O3 pipeline).
The example above is a reduced version. If you're interested in the original suboptimal IR and optimal IR, contact me to get them, please.

Let me know if you can confirm that it's an optimization opportunity, thanks.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions