
[AArch64] Fix Fold of Compare with Right-shifted Value #127209

Merged: 2 commits merged into llvm:main from fix_122380_lsr_ne_0_64bit on Feb 17, 2025

Conversation

@mskamp (Contributor) commented Feb 14, 2025

This change folds (setcc ne (lshr x c) 0) for 64-bit types and constants c >= 32. The fold already existed for other types and for smaller constants, but it did not apply to 64-bit types with constants >= 32 because the constant c was compared against the bit width of the setcc operation, whose result type is legalized to i32 and therefore does not necessarily match the type of the lshr operation. For example, for an i64 lshr by 33 the old guard rejected the fold because 33 is not less than 32, even though 33 is a valid shift amount for i64. Use the bit width of the lshr type for the comparison instead.

Fixes #122380.
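
As background, the identity behind this fold can be checked with a small standalone C++ sketch (not part of the patch; viaShift and viaMask are illustrative names): for any shift amount c with 0 <= c < 64, (x >> c) != 0 holds exactly when x has a set bit at position >= c, which is what the and-with-mask form tests and what AArch64 can then emit as a single tst instruction.

#include <cassert>
#include <cstdint>

// (x >> c) != 0  <==>  (x & (~0ULL << c)) != 0  for 0 <= c < 64:
// both sides ask whether any bit at position >= c is set in x.
static bool viaShift(uint64_t X, unsigned C) { return (X >> C) != 0; }
static bool viaMask(uint64_t X, unsigned C) { return (X & (~0ULL << C)) != 0; }

int main() {
  const uint64_t Samples[] = {0, 1, 0x80000000ULL, 0xfffffffe00000000ULL, ~0ULL};
  for (unsigned C = 0; C < 64; ++C)   // every legal shift amount for i64
    for (uint64_t X : Samples)        // a few representative values
      assert(viaShift(X, C) == viaMask(X, C));
  return 0;
}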

@llvmbot (Member) commented Feb 14, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Marius Kamp (mskamp)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/127209.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+2-2)
  • (added) llvm/test/CodeGen/AArch64/shift-const-ne-0.ll (+122)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 4263be1098899..8f849af6f4d35 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -25070,10 +25070,10 @@ static SDValue performSETCCCombine(SDNode *N,
   // setcc (srl x, imm), 0, ne ==> setcc (and x, (-1 << imm)), 0, ne
   if (Cond == ISD::SETNE && isNullConstant(RHS) &&
       LHS->getOpcode() == ISD::SRL && isa<ConstantSDNode>(LHS->getOperand(1)) &&
-      LHS->getConstantOperandVal(1) < VT.getScalarSizeInBits() &&
       LHS->hasOneUse()) {
     EVT TstVT = LHS->getValueType(0);
-    if (TstVT.isScalarInteger() && TstVT.getFixedSizeInBits() <= 64) {
+    if (TstVT.isScalarInteger() && TstVT.getFixedSizeInBits() <= 64 &&
+        LHS->getConstantOperandVal(1) < TstVT.getFixedSizeInBits()) {
       // this pattern will get better opt in emitComparison
       uint64_t TstImm = -1ULL << LHS->getConstantOperandVal(1);
       SDValue TST = DAG.getNode(ISD::AND, DL, TstVT, LHS->getOperand(0),
diff --git a/llvm/test/CodeGen/AArch64/shift-const-ne-0.ll b/llvm/test/CodeGen/AArch64/shift-const-ne-0.ll
new file mode 100644
index 0000000000000..be064d591613c
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/shift-const-ne-0.ll
@@ -0,0 +1,122 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 2
+; RUN: llc -mtriple=aarch64-unknown-unknown < %s -o -| FileCheck %s
+
+define i1 @lsr_1_ne_0_16(i16 %x) {
+; CHECK-LABEL: lsr_1_ne_0_16:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    tst w0, #0xfffe
+; CHECK-NEXT:    cset w0, ne
+; CHECK-NEXT:    ret
+  %shr = lshr i16 %x, 1
+  %cmp = icmp ne i16 %shr, 0
+  ret i1 %cmp
+}
+
+define i1 @lsr_1_ne_0_32(i32 %x) {
+; CHECK-LABEL: lsr_1_ne_0_32:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    tst w0, #0xfffffffe
+; CHECK-NEXT:    cset w0, ne
+; CHECK-NEXT:    ret
+  %shr = lshr i32 %x, 1
+  %cmp = icmp ne i32 %shr, 0
+  ret i1 %cmp
+}
+
+define i1 @lsr_30_ne_0_32(i32 %x) {
+; CHECK-LABEL: lsr_30_ne_0_32:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    tst w0, #0xc0000000
+; CHECK-NEXT:    cset w0, ne
+; CHECK-NEXT:    ret
+  %shr = lshr i32 %x, 30
+  %cmp = icmp ne i32 %shr, 0
+  ret i1 %cmp
+}
+
+define i1 @lsr_31_ne_0_32(i32 %x) {
+; CHECK-LABEL: lsr_31_ne_0_32:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    lsr w0, w0, #31
+; CHECK-NEXT:    ret
+  %shr = lshr i32 %x, 31
+  %cmp = icmp ne i32 %shr, 0
+  ret i1 %cmp
+}
+
+define i1 @lsr_1_ne_0_64(i64 %x) {
+; CHECK-LABEL: lsr_1_ne_0_64:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    tst x0, #0xfffffffffffffffe
+; CHECK-NEXT:    cset w0, ne
+; CHECK-NEXT:    ret
+  %shr = lshr i64 %x, 1
+  %cmp = icmp ne i64 %shr, 0
+  ret i1 %cmp
+}
+
+define i1 @lsr_31_ne_0_64(i64 %x) {
+; CHECK-LABEL: lsr_31_ne_0_64:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    tst x0, #0xffffffff80000000
+; CHECK-NEXT:    cset w0, ne
+; CHECK-NEXT:    ret
+  %shr = lshr i64 %x, 31
+  %cmp = icmp ne i64 %shr, 0
+  ret i1 %cmp
+}
+
+define i1 @lsr_32_ne_0_64(i64 %x) {
+; CHECK-LABEL: lsr_32_ne_0_64:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    tst x0, #0xffffffff00000000
+; CHECK-NEXT:    cset w0, ne
+; CHECK-NEXT:    ret
+  %shr = lshr i64 %x, 32
+  %cmp = icmp ne i64 %shr, 0
+  ret i1 %cmp
+}
+
+define i1 @lsr_33_ne_0_64(i64 %x) {
+; CHECK-LABEL: lsr_33_ne_0_64:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    tst x0, #0xfffffffe00000000
+; CHECK-NEXT:    cset w0, ne
+; CHECK-NEXT:    ret
+  %shr = lshr i64 %x, 33
+  %cmp = icmp ne i64 %shr, 0
+  ret i1 %cmp
+}
+
+define i1 @lsr_62_ne_0_64(i64 %x) {
+; CHECK-LABEL: lsr_62_ne_0_64:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    tst x0, #0xc000000000000000
+; CHECK-NEXT:    cset w0, ne
+; CHECK-NEXT:    ret
+  %shr = lshr i64 %x, 62
+  %cmp = icmp ne i64 %shr, 0
+  ret i1 %cmp
+}
+
+define i1 @lsr_63_ne_0_64(i64 %x) {
+; CHECK-LABEL: lsr_63_ne_0_64:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    lsr x0, x0, #63
+; CHECK-NEXT:    // kill: def $w0 killed $w0 killed $x0
+; CHECK-NEXT:    ret
+  %shr = lshr i64 %x, 63
+  %cmp = icmp ne i64 %shr, 0
+  ret i1 %cmp
+}
+
+define <4 x i1> @lsr_1_ne_0_v4i16(<4 x i16> %x) {
+; CHECK-LABEL: lsr_1_ne_0_v4i16:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ushr v0.4h, v0.4h, #1
+; CHECK-NEXT:    cmtst v0.4h, v0.4h, v0.4h
+; CHECK-NEXT:    ret
+  %shr = lshr <4 x i16> %x, <i16 1, i16 1, i16 1, i16 1>
+  %cmp = icmp ne <4 x i16> %shr, <i16 0, i16 0, i16 0, i16 0>
+  ret <4 x i1> %cmp
+}

@mskamp force-pushed the fix_122380_lsr_ne_0_64bit branch from 10cf9ed to d140713 on February 14, 2025, 18:50
@davemgreen (Collaborator) left a comment:
LGTM

@davemgreen (Collaborator) commented:
(Let us know if we should squash and merge).

@mskamp (Contributor, Author) commented Feb 17, 2025

From my point of view, this PR is ready to be squashed and merged. Thank you!

@davemgreen merged commit 2dda529 into llvm:main on Feb 17, 2025
8 checks passed

Successfully merging this pull request may close this issue: [AArch64] Failure to fold lsr or asr into cmp (#122380).