
[InstCombine] Drop exact flag instead of increasing demanded bits #70311


Merged
merged 1 commit on Oct 26, 2023

Conversation

nikic
Contributor

@nikic nikic commented Oct 26, 2023

Demanded-bits simplification for lshr/ashr currently demands the low bits when the exact flag is set, because those bits must be zero to satisfy the flag.

However, this means that our demanded-bits simplification is worse for lshr/ashr exact than for plain lshr/ashr, which is generally not desirable.

Instead, drop the exact flag if a demanded-bits simplification of the operand succeeds, since the simplified operand may no longer satisfy the flag.

This matches what we do for the exact flag on udiv, as well as the nuw/nsw flags on add/sub/mul.
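For intuition, take the `lshr_exact_and_not_demanded` test changed by this PR (simplified here to pure i8, ignoring the sext): `and i8 %x, -2` only clears bit 0, which a right shift by 1 discards anyway, so the `and` can be removed even though the `exact` guarantee on the shift then no longer holds. A small Python sketch (my own model, not LLVM code) checking that the values agree:

```python
# Sketch (not LLVM code): dropping the `and` that justified the `exact`
# flag does not change the shifted value, because the shift discards the
# cleared low bit anyway.
def lshr_exact_masked(x):
    # original: %y = and i8 %x, -2 ; %shr = lshr exact i8 %y, 1
    return (x & 0xFE) >> 1

def lshr_plain(x):
    # simplified: %shr = lshr i8 %x, 1  (exact flag dropped)
    return x >> 1

# Exhaustive check over all i8 values.
assert all(lshr_exact_masked(x) == lshr_plain(x) for x in range(256))
```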

@llvmbot
Member

llvmbot commented Oct 26, 2023

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)



Full diff: https://github.com/llvm/llvm-project/pull/70311.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp (+8-13)
  • (modified) llvm/test/Transforms/InstCombine/cast.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/select-2.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/shift.ll (+3-5)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
index be005e61a8d2d89..cd6b017874e8d6c 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
@@ -698,14 +698,11 @@ Value *InstCombinerImpl::SimplifyDemandedUseBits(Value *V, APInt DemandedMask,
 
       // Unsigned shift right.
       APInt DemandedMaskIn(DemandedMask.shl(ShiftAmt));
-
-      // If the shift is exact, then it does demand the low bits (and knows that
-      // they are zero).
-      if (cast<LShrOperator>(I)->isExact())
-        DemandedMaskIn.setLowBits(ShiftAmt);
-
-      if (SimplifyDemandedBits(I, 0, DemandedMaskIn, Known, Depth + 1))
+      if (SimplifyDemandedBits(I, 0, DemandedMaskIn, Known, Depth + 1)) {
+        // The exact flag may no longer hold.
+        I->dropPoisonGeneratingFlags();
         return I;
+      }
       assert(!Known.hasConflict() && "Bits known to be one AND zero?");
       Known.Zero.lshrInPlace(ShiftAmt);
       Known.One.lshrInPlace(ShiftAmt);
@@ -747,13 +744,11 @@ Value *InstCombinerImpl::SimplifyDemandedUseBits(Value *V, APInt DemandedMask,
       if (DemandedMask.countl_zero() <= ShiftAmt)
         DemandedMaskIn.setSignBit();
 
-      // If the shift is exact, then it does demand the low bits (and knows that
-      // they are zero).
-      if (cast<AShrOperator>(I)->isExact())
-        DemandedMaskIn.setLowBits(ShiftAmt);
-
-      if (SimplifyDemandedBits(I, 0, DemandedMaskIn, Known, Depth + 1))
+      if (SimplifyDemandedBits(I, 0, DemandedMaskIn, Known, Depth + 1)) {
+        // The exact flag may no longer hold.
+        I->dropPoisonGeneratingFlags();
         return I;
+      }
 
       assert(!Known.hasConflict() && "Bits known to be one AND zero?");
       // Compute the new bits that are at the top now plus sign bits.
diff --git a/llvm/test/Transforms/InstCombine/cast.ll b/llvm/test/Transforms/InstCombine/cast.ll
index 5b480b1157936eb..59e488f3f23d52a 100644
--- a/llvm/test/Transforms/InstCombine/cast.ll
+++ b/llvm/test/Transforms/InstCombine/cast.ll
@@ -1318,7 +1318,7 @@ define i64 @test83(i16 %a, i64 %k) {
 define i8 @test84(i32 %a) {
 ; ALL-LABEL: @test84(
 ; ALL-NEXT:    [[ADD:%.*]] = add i32 [[A:%.*]], 2130706432
-; ALL-NEXT:    [[SHR:%.*]] = lshr exact i32 [[ADD]], 23
+; ALL-NEXT:    [[SHR:%.*]] = lshr i32 [[ADD]], 23
 ; ALL-NEXT:    [[TRUNC:%.*]] = trunc i32 [[SHR]] to i8
 ; ALL-NEXT:    ret i8 [[TRUNC]]
 ;
@@ -1331,7 +1331,7 @@ define i8 @test84(i32 %a) {
 define i8 @test85(i32 %a) {
 ; ALL-LABEL: @test85(
 ; ALL-NEXT:    [[ADD:%.*]] = add i32 [[A:%.*]], 2130706432
-; ALL-NEXT:    [[SHR:%.*]] = lshr exact i32 [[ADD]], 23
+; ALL-NEXT:    [[SHR:%.*]] = lshr i32 [[ADD]], 23
 ; ALL-NEXT:    [[TRUNC:%.*]] = trunc i32 [[SHR]] to i8
 ; ALL-NEXT:    ret i8 [[TRUNC]]
 ;
diff --git a/llvm/test/Transforms/InstCombine/select-2.ll b/llvm/test/Transforms/InstCombine/select-2.ll
index 2e4161f5d80aaaa..148b0dcf1025980 100644
--- a/llvm/test/Transforms/InstCombine/select-2.ll
+++ b/llvm/test/Transforms/InstCombine/select-2.ll
@@ -45,7 +45,7 @@ define float @t3(float %x, float %y) {
 
 define i8 @ashr_exact_poison_constant_fold(i1 %b, i8 %x) {
 ; CHECK-LABEL: @ashr_exact_poison_constant_fold(
-; CHECK-NEXT:    [[TMP1:%.*]] = ashr exact i8 [[X:%.*]], 3
+; CHECK-NEXT:    [[TMP1:%.*]] = ashr i8 [[X:%.*]], 3
 ; CHECK-NEXT:    [[R:%.*]] = select i1 [[B:%.*]], i8 [[TMP1]], i8 5
 ; CHECK-NEXT:    ret i8 [[R]]
 ;
diff --git a/llvm/test/Transforms/InstCombine/shift.ll b/llvm/test/Transforms/InstCombine/shift.ll
index 7b9626331ff2907..913ef2a74aebd9a 100644
--- a/llvm/test/Transforms/InstCombine/shift.ll
+++ b/llvm/test/Transforms/InstCombine/shift.ll
@@ -2141,9 +2141,8 @@ define i16 @lshr_and_not_demanded(i8 %x) {
 
 define i16 @lshr_exact_and_not_demanded(i8 %x) {
 ; CHECK-LABEL: @lshr_exact_and_not_demanded(
-; CHECK-NEXT:    [[Y:%.*]] = and i8 [[X:%.*]], -2
-; CHECK-NEXT:    [[Y_EXT:%.*]] = sext i8 [[Y]] to i16
-; CHECK-NEXT:    [[SHR:%.*]] = lshr exact i16 [[Y_EXT]], 1
+; CHECK-NEXT:    [[Y_EXT:%.*]] = sext i8 [[X:%.*]] to i16
+; CHECK-NEXT:    [[SHR:%.*]] = lshr i16 [[Y_EXT]], 1
 ; CHECK-NEXT:    ret i16 [[SHR]]
 ;
   %y = and i8 %x, -2
@@ -2177,8 +2176,7 @@ define i16 @ashr_umax_not_demanded(i16 %x) {
 
 define i16 @ashr_exact_umax_not_demanded(i16 %x) {
 ; CHECK-LABEL: @ashr_exact_umax_not_demanded(
-; CHECK-NEXT:    [[Y:%.*]] = call i16 @llvm.umax.i16(i16 [[X:%.*]], i16 1)
-; CHECK-NEXT:    [[SHR:%.*]] = ashr exact i16 [[Y]], 1
+; CHECK-NEXT:    [[SHR:%.*]] = ashr i16 [[X:%.*]], 1
 ; CHECK-NEXT:    ret i16 [[SHR]]
 ;
   %y = call i16 @llvm.umax.i16(i16 %x, i16 1)

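Why is it always safe to drop the flag? `exact` asserts that no nonzero bits are shifted out; if any are, the result is poison. Removing the flag only replaces poison outcomes with defined ones, so the plain shift refines the exact shift. A Python sketch (my own model of the semantics, not LLVM code), with poison modeled as `None`:

```python
# Simplified model of LLVM shift semantics; poison is modeled as None.
def lshr_exact(x, amt, bits=8):
    # `lshr exact` is poison if any shifted-out bit is nonzero.
    if x & ((1 << amt) - 1):
        return None
    return (x % (1 << bits)) >> amt

def lshr(x, amt, bits=8):
    # Plain `lshr` is always defined.
    return (x % (1 << bits)) >> amt

# Refinement: wherever the exact form is defined, the plain form agrees,
# so dropping `exact` only removes poison cases.
assert all(lshr_exact(x, 3) in (None, lshr(x, 3)) for x in range(256))
```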
Collaborator

@RKSimon RKSimon left a comment


LGTM - I'll look at tweaking the DAG versions to match

@nikic nikic merged commit b47ff36 into llvm:main Oct 26, 2023
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Oct 26, 2023