[VectorCombine] Don't shrink lshr if the shamt is not less than bitwidth #108705

dtcxzyw · 2024-09-14T15:30:58Z

Consider the following case:

define <2 x i32> @test(<2 x i64> %vec.ind16, <2 x i32> %broadcast.splat20) {
  %19 = icmp eq <2 x i64> %vec.ind16, zeroinitializer
  %20 = zext <2 x i1> %19 to <2 x i32>
  %21 = lshr <2 x i32> %20, %broadcast.splat20
  ret <2 x i32> %21
}

After #104606, we shrink the lshr into:

define <2 x i32> @test(<2 x i64> %vec.ind16, <2 x i32> %broadcast.splat20) {
  %1 = icmp eq <2 x i64> %vec.ind16, zeroinitializer
  %2 = trunc <2 x i32> %broadcast.splat20 to <2 x i1>
  %3 = lshr <2 x i1> %1, %2
  %4 = zext <2 x i1> %3 to <2 x i32>
  ret <2 x i32> %4
}

This is incorrect: a shift amount that is valid for i32 can be out of range for i1, and lshr i1 X, 1 returns poison.
This patch adds an additional check on the shamt operand. The lshr is shrunk only if the shamt is known to be less than the bit width of the smaller type. As lshr(zext(X), Y) never has more active bits than zext(X), the existing computeKnownBits(&I, *DL).countMaxActiveBits() > BW check never fires for it, so that check now applies only to the bitwise logical instructions.
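
To make the failure concrete, here is a minimal C++ sketch of the two forms for a single vector lane. It is a toy model, not LLVM code: poison is modelled as std::nullopt, and the names lshrModel, truncTo, wideForm and narrowForm are hypothetical helpers introduced only for this illustration.

```cpp
#include <cstdint>
#include <optional>

// Toy model of `lshr` on a BW-bit integer (BW <= 32). The result is poison,
// modelled here as std::nullopt, when the shift amount is >= BW.
constexpr std::optional<uint32_t> lshrModel(uint32_t X, uint32_t ShAmt,
                                            unsigned BW) {
  if (ShAmt >= BW)
    return std::nullopt;
  return X >> ShAmt;
}

// Toy model of `trunc` to a BW-bit integer: keep only the low BW bits.
constexpr uint32_t truncTo(uint32_t V, unsigned BW) {
  return BW >= 32 ? V : (V & ((1u << BW) - 1u));
}

// Original form: lshr i32 (zext i1 Cmp to i32), ShAmt
constexpr std::optional<uint32_t> wideForm(bool Cmp, uint32_t ShAmt) {
  return lshrModel(Cmp ? 1u : 0u, ShAmt, 32);
}

// Shrunk form: zext (lshr i1 Cmp, (trunc i32 ShAmt to i1)) to i32
constexpr std::optional<uint32_t> narrowForm(bool Cmp, uint32_t ShAmt) {
  return lshrModel(Cmp ? 1u : 0u, truncTo(ShAmt, 1), 1);
}

// ShAmt == 1: the wide form is a well-defined 0, the narrow form is poison.
static_assert(wideForm(true, 1) == std::optional<uint32_t>(0u), "");
static_assert(!narrowForm(true, 1).has_value(), "");
// ShAmt == 2: the truncated shamt becomes 0, so the narrow form yields 1
// where the wide form yields 0.
static_assert(wideForm(true, 2) == std::optional<uint32_t>(0u), "");
static_assert(narrowForm(true, 2) == std::optional<uint32_t>(1u), "");
```

Under this model the two forms agree only when the shift amount is 0, which matches the requirement that the shamt be provably less than the narrow bit width before shrinking.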

Alive2: https://alive2.llvm.org/ce/z/j_RmTa
Fixes #108698.

llvmbot · 2024-09-14T15:31:33Z

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/108705.diff

2 Files Affected:

(modified) llvm/lib/Transforms/Vectorize/VectorCombine.cpp (+14-6)
(added) llvm/test/Transforms/VectorCombine/X86/pr108698.ll (+16)

diff --git a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
index d7afe2f426d392..58701bfa60a33e 100644
--- a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+++ b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
@@ -2597,11 +2597,19 @@ bool VectorCombine::shrinkType(llvm::Instruction &I) {
   auto *SmallTy = cast<FixedVectorType>(ZExted->getType());
   unsigned BW = SmallTy->getElementType()->getPrimitiveSizeInBits();
 
-  // Check that the expression overall uses at most the same number of bits as
-  // ZExted
-  KnownBits KB = computeKnownBits(&I, *DL);
-  if (KB.countMaxActiveBits() > BW)
-    return false;
+  if (I.getOpcode() == Instruction::LShr) {
+    // Check that the shift amount is less than the number of bits in the
+    // smaller type. Otherwise, the smaller lshr will return a poison value.
+    KnownBits ShAmtKB = computeKnownBits(I.getOperand(1), *DL);
+    if (ShAmtKB.getMaxValue().uge(BW))
+      return false;
+  } else {
+    // Check that the expression overall uses at most the same number of bits as
+    // ZExted
+    KnownBits KB = computeKnownBits(&I, *DL);
+    if (KB.countMaxActiveBits() > BW)
+      return false;
+  }
 
   // Calculate costs of leaving current IR as it is and moving ZExt operation
   // later, along with adding truncates if needed
@@ -2628,7 +2636,7 @@ bool VectorCombine::shrinkType(llvm::Instruction &I) {
       return false;
 
     // Check if we can propagate ZExt through its other users
-    KB = computeKnownBits(UI, *DL);
+    KnownBits KB = computeKnownBits(UI, *DL);
     if (KB.countMaxActiveBits() > BW)
       return false;
 
diff --git a/llvm/test/Transforms/VectorCombine/X86/pr108698.ll b/llvm/test/Transforms/VectorCombine/X86/pr108698.ll
new file mode 100644
index 00000000000000..675cf6ed7da51f
--- /dev/null
+++ b/llvm/test/Transforms/VectorCombine/X86/pr108698.ll
@@ -0,0 +1,16 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -passes=vector-combine -S -mtriple=x86_64 | FileCheck %s
+
+define <2 x i32> @test(<2 x i64> %x, <2 x i32> %y) {
+; CHECK-LABEL: define <2 x i32> @test(
+; CHECK-SAME: <2 x i64> [[X:%.*]], <2 x i32> [[Y:%.*]]) {
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq <2 x i64> [[X]], zeroinitializer
+; CHECK-NEXT:    [[EXT:%.*]] = zext <2 x i1> [[CMP]] to <2 x i32>
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr <2 x i32> [[EXT]], [[Y]]
+; CHECK-NEXT:    ret <2 x i32> [[LSHR]]
+;
+  %cmp = icmp eq <2 x i64> %x, zeroinitializer
+  %ext = zext <2 x i1> %cmp to <2 x i32>
+  %lshr = lshr <2 x i32> %ext, %y
+  ret <2 x i32> %lshr
+}
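
The new guard in shrinkType asks whether the largest value the shift amount could take is still below BW. As a rough sketch of that idea, the following self-contained C++ uses a toy known-bits summary. ToyKnownBits, andWithConstant and shiftAmountFits are hypothetical names for this illustration and are not LLVM's KnownBits API. The sketch assumes 32-bit values and a constant mask.

```cpp
#include <cstdint>

// Toy stand-in for a known-bits summary of a 32-bit value.
// This is NOT LLVM's KnownBits class, only a model of the same idea.
struct ToyKnownBits {
  uint32_t Zero; // bits known to be 0
  uint32_t One;  // bits known to be 1

  // Largest value consistent with the known bits: assume every bit that is
  // not known to be zero is set.
  constexpr uint32_t maxValue() const { return ~Zero; }
};

// Known bits of (V & Mask) for a constant Mask: bits cleared in Mask become
// known zero, and known-one bits survive only where Mask is set.
constexpr ToyKnownBits andWithConstant(ToyKnownBits V, uint32_t Mask) {
  return ToyKnownBits{V.Zero | ~Mask, V.One & Mask};
}

// Same shape as the guard in the patch: shrinking an lshr to BW bits is only
// accepted when the shift amount is provably less than BW.
constexpr bool shiftAmountFits(ToyKnownBits ShAmt, unsigned BW) {
  return ShAmt.maxValue() < BW;
}

// A completely unknown shift amount (as in the regression test) is rejected.
constexpr ToyKnownBits Unknown{0u, 0u};
static_assert(!shiftAmountFits(Unknown, 1), "");
// A shift amount masked with 7 is provably < 8, so an 8-bit shrink is fine.
static_assert(shiftAmountFits(andWithConstant(Unknown, 7u), 8), "");
// Masking with 15 still allows values up to 15, which is too large for 8 bits.
static_assert(!shiftAmountFits(andWithConstant(Unknown, 15u), 8), "");
```

For the i1 case in this PR, the only shift amount that passes such a check is one known to be zero, which is why the test expects the lshr on the unknown %y to stay in <2 x i32>.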

nikic

LGTM

RKSimon

LGTM - cheers

dtcxzyw requested review from nikic, RKSimon, davemgreen, igogo-x86 and mikaelholmen September 14, 2024 15:30

llvmbot added the vectorizers and llvm:transforms labels Sep 14, 2024

nikic approved these changes Sep 14, 2024

dtcxzyw added 2 commits September 15, 2024 10:07

[VectorCombine] Add pre-commit tests. NFC.

96f3521

[VectorCombine] Don't shrink lshr if the shamt is not less than bitwidth

fe91054

dtcxzyw force-pushed the fix-pr108698 branch from 3a11412 to fe91054 September 15, 2024 02:14

RKSimon approved these changes Sep 15, 2024

dtcxzyw merged commit 87663fd into llvm:main Sep 15, 2024
8 checks passed

dtcxzyw deleted the fix-pr108698 branch September 15, 2024 10:38
