Skip to content

[X86][CodeGen] Support flags copy lowering for CCMP/CTEST #91849

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 18, 2024

Conversation

KanRobert
Copy link
Contributor

%1:gr64 = COPY $eflags
OP1 may update eflags
$eflags = COPY %1
OP2 may use eflags

To use eflags as input at 4th instruction, we need to use SETcc to preserve the eflags before 2, and update the source condition of OP2 according to value in GPR %1.

In this patch, we support CCMP/CTEST as OP2.

```
%1:gr64 = COPY $eflags
OP1 may update eflags
$eflags = COPY %1
OP2 may use eflags
```

To use eflags as input at 4th instruction, we need to use SETcc to
preserve the eflags before 2, and update the source condition of OP2
according to value in GPR %1.

In this patch, we support CCMP/CTEST as OP2.
@llvmbot
Copy link
Member

llvmbot commented May 11, 2024

@llvm/pr-subscribers-backend-x86

Author: Shengchen Kan (KanRobert)

Changes
%1:gr64 = COPY $eflags
OP1 may update eflags
$eflags = COPY %1
OP2 may use eflags

To use eflags as input at 4th instruction, we need to use SETcc to preserve the eflags before 2, and update the source condition of OP2 according to value in GPR %1.

In this patch, we support CCMP/CTEST as OP2.


Full diff: https://github.com/llvm/llvm-project/pull/91849.diff

4 Files Affected:

  • (modified) llvm/lib/Target/X86/X86FlagsCopyLowering.cpp (+33)
  • (modified) llvm/lib/Target/X86/X86InstrInfo.cpp (+8-1)
  • (modified) llvm/lib/Target/X86/X86InstrInfo.h (+3)
  • (added) llvm/test/CodeGen/X86/apx/ccmp-flags-copy-lowering.mir (+51)
diff --git a/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp b/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
index 78355d3550833..ea5ef5b5a602c 100644
--- a/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
+++ b/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
@@ -124,6 +124,10 @@ class X86FlagsCopyLoweringPass : public MachineFunctionPass {
                     MachineBasicBlock::iterator TestPos,
                     const DebugLoc &TestLoc, MachineInstr &SetCCI,
                     MachineOperand &FlagUse, CondRegArray &CondRegs);
+  void rewriteCCMP(MachineBasicBlock &TestMBB,
+                   MachineBasicBlock::iterator TestPos, const DebugLoc &TestLoc,
+                   MachineInstr &CMovI, MachineOperand &FlagUse,
+                   CondRegArray &CondRegs);
 };
 
 } // end anonymous namespace
@@ -613,6 +617,9 @@ bool X86FlagsCopyLoweringPass::runOnMachineFunction(MachineFunction &MF) {
           rewriteFCMov(*TestMBB, TestPos, TestLoc, MI, *FlagUse, CondRegs);
         } else if (X86::getCondFromSETCC(MI) != X86::COND_INVALID) {
           rewriteSetCC(*TestMBB, TestPos, TestLoc, MI, *FlagUse, CondRegs);
+        } else if (X86::getCondFromCCMP(MI) != X86::COND_INVALID) {
+          rewriteCCMP(*TestMBB, TestPos, TestLoc, MI, *FlagUse, CondRegs);
+          FlagsKilled = true;
         } else if (MI.getOpcode() == TargetOpcode::COPY) {
           rewriteCopy(MI, *FlagUse, CopyDefI);
         } else {
@@ -970,3 +977,29 @@ void X86FlagsCopyLoweringPass::rewriteSetCC(MachineBasicBlock &TestMBB,
 
   SetCCI.eraseFromParent();
 }
+
+void X86FlagsCopyLoweringPass::rewriteCCMP(MachineBasicBlock &TestMBB,
+                                           MachineBasicBlock::iterator TestPos,
+                                           const DebugLoc &TestLoc,
+                                           MachineInstr &CCMPI,
+                                           MachineOperand &FlagUse,
+                                           CondRegArray &CondRegs) {
+  // First get the register containing this specific condition.
+  X86::CondCode Cond = X86::getCondFromCCMP(CCMPI);
+  unsigned CondReg;
+  bool Inverted;
+  std::tie(CondReg, Inverted) =
+      getCondOrInverseInReg(TestMBB, TestPos, TestLoc, Cond, CondRegs);
+
+  MachineBasicBlock &MBB = *CCMPI.getParent();
+
+  // Insert a direct test of the saved register.
+  insertTest(MBB, CCMPI.getIterator(), CCMPI.getDebugLoc(), CondReg);
+
+  // Rewrite the CCMP/CTEST to use the !ZF flag from the test, and then kill its
+  // use of the flags afterward.
+  CCMPI.getOperand(CCMPI.getDesc().getNumOperands() - 1)
+      .setImm(Inverted ? X86::COND_E : X86::COND_NE);
+  FlagUse.setIsKill(true);
+  LLVM_DEBUG(dbgs() << "    fixed ccmp/ctest: "; CCMPI.dump());
+}
diff --git a/llvm/lib/Target/X86/X86InstrInfo.cpp b/llvm/lib/Target/X86/X86InstrInfo.cpp
index 0e5e52d4d88e8..26c68ce3c1a2d 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.cpp
+++ b/llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -3121,7 +3121,8 @@ bool X86InstrInfo::hasCommutePreference(MachineInstr &MI, bool &Commute) const {
 int X86::getCondSrcNoFromDesc(const MCInstrDesc &MCID) {
   unsigned Opcode = MCID.getOpcode();
   if (!(X86::isJCC(Opcode) || X86::isSETCC(Opcode) || X86::isCMOVCC(Opcode) ||
-        X86::isCFCMOVCC(Opcode)))
+        X86::isCFCMOVCC(Opcode) || X86::isCCMPCC(Opcode) ||
+        X86::isCTESTCC(Opcode)))
     return -1;
   // Assume that condition code is always the last use operand.
   unsigned NumUses = MCID.getNumOperands() - MCID.getNumDefs();
@@ -3157,6 +3158,12 @@ X86::CondCode X86::getCondFromCFCMov(const MachineInstr &MI) {
                                          : X86::COND_INVALID;
 }
 
+X86::CondCode X86::getCondFromCCMP(const MachineInstr &MI) {
+  return X86::isCCMPCC(MI.getOpcode()) || X86::isCTESTCC(MI.getOpcode())
+             ? X86::getCondFromMI(MI)
+             : X86::COND_INVALID;
+}
+
 /// Return the inverse of the specified condition,
 /// e.g. turning COND_E to COND_NE.
 X86::CondCode X86::GetOppositeBranchCondition(X86::CondCode CC) {
diff --git a/llvm/lib/Target/X86/X86InstrInfo.h b/llvm/lib/Target/X86/X86InstrInfo.h
index 5407ede69a91c..55deca73b1f3a 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.h
+++ b/llvm/lib/Target/X86/X86InstrInfo.h
@@ -71,6 +71,9 @@ CondCode getCondFromCMov(const MachineInstr &MI);
 // Turn CFCMOV instruction into condition code.
 CondCode getCondFromCFCMov(const MachineInstr &MI);
 
+// Turn CCMP instruction into condition code.
+CondCode getCondFromCCMP(const MachineInstr &MI);
+
 /// GetOppositeBranchCondition - Return the inverse of the specified cond,
 /// e.g. turning COND_E to COND_NE.
 CondCode GetOppositeBranchCondition(CondCode CC);
diff --git a/llvm/test/CodeGen/X86/apx/ccmp-flags-copy-lowering.mir b/llvm/test/CodeGen/X86/apx/ccmp-flags-copy-lowering.mir
new file mode 100644
index 0000000000000..52d4c4cfb2aaa
--- /dev/null
+++ b/llvm/test/CodeGen/X86/apx/ccmp-flags-copy-lowering.mir
@@ -0,0 +1,51 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4
+# RUN: llc -mtriple=x86_64 -run-pass x86-flags-copy-lowering -verify-machineinstrs -o - %s | FileCheck %s
+# Lower various interesting copy patterns of EFLAGS without using LAHF/SAHF.
+
+...
+---
+name:            test_ccmp
+body:             |
+  bb.0:
+    liveins: $edi
+
+    ; CHECK-LABEL: name: test_ccmp
+    ; CHECK: liveins: $edi
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: MUL32r $edi, implicit-def $eax, implicit-def dead $edx, implicit-def $eflags, implicit $eax
+    ; CHECK-NEXT: [[SETCCr:%[0-9]+]]:gr8 = SETCCr 1, implicit $eflags
+    ; CHECK-NEXT: [[ADD32rr:%[0-9]+]]:gr32 = ADD32rr $edi, $edi, implicit-def $eflags
+    ; CHECK-NEXT: TEST8rr [[SETCCr]], [[SETCCr]], implicit-def $eflags
+    ; CHECK-NEXT: CCMP32rr [[ADD32rr]], [[ADD32rr]], 0, 5, implicit-def $eflags, implicit killed $eflags
+    ; CHECK-NEXT: RET 0, $al
+    MUL32r $edi, implicit-def $eax, implicit-def dead $edx, implicit-def $eflags, implicit $eax
+    %1:gr64 = COPY $eflags
+    %2:gr32 = ADD32rr $edi, $edi, implicit-def $eflags
+    $eflags = COPY %1
+    CCMP32rr %2, %2, 0, 1, implicit-def $eflags, implicit $eflags
+    RET 0, $al
+
+...
+---
+name:            test_ctest
+body:             |
+  bb.0:
+    liveins: $edi
+
+    ; CHECK-LABEL: name: test_ctest
+    ; CHECK: liveins: $edi
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: MUL32r $edi, implicit-def $eax, implicit-def dead $edx, implicit-def $eflags, implicit $eax
+    ; CHECK-NEXT: [[SETCCr:%[0-9]+]]:gr8 = SETCCr 1, implicit $eflags
+    ; CHECK-NEXT: [[ADD32rr:%[0-9]+]]:gr32 = ADD32rr $edi, $edi, implicit-def $eflags
+    ; CHECK-NEXT: TEST8rr [[SETCCr]], [[SETCCr]], implicit-def $eflags
+    ; CHECK-NEXT: CTEST32rr [[ADD32rr]], [[ADD32rr]], 0, 5, implicit-def $eflags, implicit killed $eflags
+    ; CHECK-NEXT: RET 0, $al
+    MUL32r $edi, implicit-def $eax, implicit-def dead $edx, implicit-def $eflags, implicit $eax
+    %1:gr64 = COPY $eflags
+    %2:gr32 = ADD32rr $edi, $edi, implicit-def $eflags
+    $eflags = COPY %1
+    CTEST32rr %2, %2, 0, 1, implicit-def $eflags, implicit $eflags
+    RET 0, $al
+
+...

@@ -970,3 +977,29 @@ void X86FlagsCopyLoweringPass::rewriteSetCC(MachineBasicBlock &TestMBB,

SetCCI.eraseFromParent();
}

void X86FlagsCopyLoweringPass::rewriteCCMP(MachineBasicBlock &TestMBB,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This look very similar to rewriteCMov, can we merge them together?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the rewrite* functions can be merged together, including rewriteCMov. I will do it after this PR.

Copy link
Contributor

@phoebewang phoebewang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@KanRobert KanRobert merged commit 4b62afc into llvm:main May 18, 2024
6 checks passed
KanRobert added a commit that referenced this pull request May 22, 2024
1. MF.begin() == MF.end() -> MF.empty()
2. Set FlagsKilled by API modifiesRegister
3. Utilize APIs in X86GenMnemonicTables.inc to check arithmetic op
4. Merge duplicated code for rewrite*
5. Clang format

This is to address review comments in #91849
qiaojbao pushed a commit to GPUOpen-Drivers/llvm-project that referenced this pull request Jun 5, 2024
…c7619dffd

Local branch amd-gfx 6f4c761 Merged main:f7b0b99c52ee36ed6ca8abcce74a752e483768d6 into amd-gfx:4c6784cd49d6
Remote branch main 4b62afc [X86][CodeGen] Support flags copy lowering for CCMP/CTEST (llvm#91849)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants