Skip to content

[CFIFixup] Add frame info to the first block of each section #113626

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 20, 2024

Conversation

dhoekwater
Copy link
Contributor

Now that -fbasic-block-sections=list is enabled for Arm, functions may
be split aross multiple sections, and CFI information must be handled
independently for each section.

On x86, this is handled in llvm/lib/CodeGen/CFIInstrInserter.cpp.
However, this pass does not run on Arm, so we must add logic for it
to llvm/lib/CodeGen/CFIFixup.cpp.

@llvmbot
Copy link
Member

llvmbot commented Oct 24, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Daniel Hoekwater (dhoekwater)

Changes

Now that -fbasic-block-sections=list is enabled for Arm, functions may
be split aross multiple sections, and CFI information must be handled
independently for each section.

On x86, this is handled in llvm/lib/CodeGen/CFIInstrInserter.cpp.
However, this pass does not run on Arm, so we must add logic for it
to llvm/lib/CodeGen/CFIFixup.cpp.


Full diff: https://github.com/llvm/llvm-project/pull/113626.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/CFIFixup.cpp (+56-11)
  • (added) llvm/test/CodeGen/AArch64/cfi-fixup-multi-section.mir (+200)
diff --git a/llvm/lib/CodeGen/CFIFixup.cpp b/llvm/lib/CodeGen/CFIFixup.cpp
index ccf0738c197eb2..5c7742391742c8 100644
--- a/llvm/lib/CodeGen/CFIFixup.cpp
+++ b/llvm/lib/CodeGen/CFIFixup.cpp
@@ -67,8 +67,12 @@
 
 #include "llvm/CodeGen/CFIFixup.h"
 
+#include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/PostOrderIterator.h"
-#include "llvm/ADT/SmallBitVector.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/CodeGen/MachineBasicBlock.h"
+#include "llvm/CodeGen/MachineFunction.h"
 #include "llvm/CodeGen/Passes.h"
 #include "llvm/CodeGen/TargetFrameLowering.h"
 #include "llvm/CodeGen/TargetInstrInfo.h"
@@ -77,6 +81,8 @@
 #include "llvm/MC/MCDwarf.h"
 #include "llvm/Target/TargetMachine.h"
 
+#include <iterator>
+
 using namespace llvm;
 
 #define DEBUG_TYPE "cfi-fixup"
@@ -121,7 +127,7 @@ findPrologueEnd(MachineFunction &MF, MachineBasicBlock::iterator &PrologueEnd) {
 // iterator can point to the end of the block. Instructions are inserted
 // *before* the iterator.
 struct InsertionPoint {
-  MachineBasicBlock *MBB;
+  MachineBasicBlock *MBB = nullptr;
   MachineBasicBlock::iterator Iterator;
 };
 
@@ -151,6 +157,30 @@ insertRememberRestorePair(const InsertionPoint &RememberInsertPt,
                         ->getIterator())};
 }
 
+// Copies all CFI instructions before PrologueEnd and inserts them before
+// DstInsertPt. Returns the iterator to the first instruction after the
+// inserted instructions.
+static InsertionPoint cloneCfiPrologue(const InsertionPoint &PrologueEnd,
+                                       const InsertionPoint &DstInsertPt) {
+  MachineFunction &MF = *DstInsertPt.MBB->getParent();
+
+  auto cloneCfiInstructions = [&](MachineBasicBlock::iterator Begin,
+                                  MachineBasicBlock::iterator End) {
+    auto ToClone = map_range(
+        make_filter_range(make_range(Begin, End), isPrologueCFIInstruction),
+        [&](const MachineInstr &MI) { return MF.CloneMachineInstr(&MI); });
+    DstInsertPt.MBB->insert(DstInsertPt.Iterator, ToClone.begin(),
+                            ToClone.end());
+  };
+
+  // Clone all CFI instructions from previous blocks.
+  for (auto &MBB : make_range(MF.begin(), PrologueEnd.MBB->getIterator()))
+    cloneCfiInstructions(MBB.begin(), MBB.end());
+  // Clone all CFI instructions from the final prologue block.
+  cloneCfiInstructions(PrologueEnd.MBB->begin(), PrologueEnd.Iterator);
+  return DstInsertPt;
+}
+
 bool CFIFixup::runOnMachineFunction(MachineFunction &MF) {
   const TargetFrameLowering &TFL = *MF.getSubtarget().getFrameLowering();
   if (!TFL.enableCFIFixup(MF))
@@ -210,10 +240,11 @@ bool CFIFixup::runOnMachineFunction(MachineFunction &MF) {
   // of the previous block. If the intended frame state is different, insert
   // compensating CFI instructions.
   bool Change = false;
-  // `InsertPt` always points to the point in a preceding block where we have to
-  // insert a `.cfi_remember_state`, in the case that the current block needs a
-  // `.cfi_restore_state`.
-  InsertionPoint InsertPt = {PrologueBlock, PrologueEnd};
+  // `InsertPt[sectionID]` always points to the point in a preceding block where
+  // we have to insert a `.cfi_remember_state`, in the case that the current
+  // block needs a `.cfi_restore_state`.
+  SmallDenseMap<MBBSectionID, InsertionPoint> InsertionPts;
+  InsertionPts[PrologueBlock->getSectionID()] = {PrologueBlock, PrologueEnd};
 
   assert(PrologueEnd != PrologueBlock->begin() &&
          "Inconsistent notion of \"prologue block\"");
@@ -240,14 +271,28 @@ bool CFIFixup::runOnMachineFunction(MachineFunction &MF) {
       }
     }
 #endif
+
+    // If the block is the first block in its section, then it doesn't have a
+    // frame on entry.
+    HasFrame &= !CurrBB->isBeginSection();
     if (!Info.StrongNoFrameOnEntry && Info.HasFrameOnEntry && !HasFrame) {
       // Reset to the "after prologue" state.
 
-      // There's an earlier block known to have a stack frame. Insert a
-      // `.cfi_remember_state` instruction into that block and a
-      // `.cfi_restore_state` instruction at the beginning of the current block.
-      InsertPt = insertRememberRestorePair(
-          InsertPt, InsertionPoint{&*CurrBB, CurrBB->begin()});
+      InsertionPoint &InsertPt = InsertionPts[CurrBB->getSectionID()];
+      if (InsertPt.MBB == nullptr) {
+        // CurBB is the first block in its section, so there is no "after
+        // prologue" state. Clone the CFI instructions from the prologue block
+        // to create it.
+        InsertPt = cloneCfiPrologue({PrologueBlock, PrologueEnd},
+                                    {&*CurrBB, CurrBB->begin()});
+      } else {
+        // There's an earlier block known to have a stack frame. Insert a
+        // `.cfi_remember_state` instruction into that block and a
+        // `.cfi_restore_state` instruction at the beginning of the current
+        // block.
+        InsertPt =
+            insertRememberRestorePair(InsertPt, {&*CurrBB, CurrBB->begin()});
+      }
       Change = true;
     } else if ((Info.StrongNoFrameOnEntry || !Info.HasFrameOnEntry) &&
                HasFrame) {
diff --git a/llvm/test/CodeGen/AArch64/cfi-fixup-multi-section.mir b/llvm/test/CodeGen/AArch64/cfi-fixup-multi-section.mir
new file mode 100644
index 00000000000000..a24972d1388320
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/cfi-fixup-multi-section.mir
@@ -0,0 +1,200 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=aarch64 -run-pass=cfi-fixup %s -o - | FileCheck %s
+--- |
+  define i32 @f0(i32 %x) #0 {
+  entry: br label %return
+  if.end: br label %return
+  if.then2: br label %return
+  if.else: br label %return
+  return:
+    ret i32 0
+  }
+
+  declare i32 @g(i32)
+
+  attributes #0 = { nounwind shadowcallstack uwtable "sign-return-address"="non-leaf" "target-features"="+reserve-x18" }
+
+...
+---
+name:            f0
+alignment:       4
+exposesReturnsTwice: false
+legalized:       false
+regBankSelected: false
+selected:        false
+failedISel:      false
+tracksRegLiveness: true
+hasWinCFI:       false
+failsVerification: false
+registers:       []
+liveins:
+  - { reg: '$w0', virtual-reg: '' }
+frameInfo:
+  isFrameAddressTaken: false
+  isReturnAddressTaken: false
+  hasStackMap:     false
+  hasPatchPoint:   false
+  stackSize:       16
+  offsetAdjustment: 0
+  maxAlignment:    16
+  adjustsStack:    true
+  hasCalls:        true
+  stackProtector:  ''
+  maxCallFrameSize: 0
+  cvBytesOfCalleeSavedRegisters: 0
+  hasOpaqueSPAdjustment: false
+  hasVAStart:      false
+  hasMustTailInVarArgFunc: false
+  hasTailCall:     false
+  localFrameSize:  0
+  savePoint:       ''
+  restorePoint:    ''
+fixedStack:      []
+stack:
+  - { id: 0, name: '', type: spill-slot, offset: -16, size: 8, alignment: 16,
+      stack-id: default, callee-saved-register: '$lr', callee-saved-restored: true,
+      debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
+callSites:       []
+debugValueSubstitutions: []
+constants:       []
+machineFunctionInfo:
+  hasRedZone:      false
+body:             |
+  ; CHECK-LABEL: name: f0
+  ; CHECK: bb.0.entry:
+  ; CHECK-NEXT:   successors: %bb.4(0x30000000), %bb.1(0x50000000)
+  ; CHECK-NEXT:   liveins: $w0, $lr, $x18
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   CBZW renamable $w0, %bb.4
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1.if.end:
+  ; CHECK-NEXT:   successors: %bb.3(0x30000000), %bb.2(0x50000000)
+  ; CHECK-NEXT:   liveins: $w0, $lr, $x18
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   early-clobber $x18 = frame-setup STRXpost $lr, $x18, 8
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION escape 0x16, 0x12, 0x02, 0x82, 0x78
+  ; CHECK-NEXT:   frame-setup PACIASP implicit-def $lr, implicit killed $lr, implicit $sp
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION negate_ra_sign_state
+  ; CHECK-NEXT:   early-clobber $sp = frame-setup STRXpre killed $lr, $sp, -16 :: (store (s64) into %stack.0)
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION def_cfa_offset 16
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION offset $w30, -16
+  ; CHECK-NEXT:   CFI_INSTRUCTION remember_state
+  ; CHECK-NEXT:   TBNZW renamable $w0, 31, %bb.3
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2.if.else:
+  ; CHECK-NEXT:   successors: %bb.5(0x80000000)
+  ; CHECK-NEXT:   liveins: $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   renamable $w0 = nuw nsw ADDWri killed renamable $w0, 1, 0
+  ; CHECK-NEXT:   BL @g, csr_aarch64_aapcs_scs, implicit-def dead $lr, implicit $sp, implicit killed $w0, implicit-def $sp, implicit-def $w0
+  ; CHECK-NEXT:   renamable $w8 = MOVZWi 1, 0
+  ; CHECK-NEXT:   $w0 = SUBWrs killed renamable $w8, killed renamable $w0, 0
+  ; CHECK-NEXT:   B %bb.5
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.3.if.then2 (bbsections 1):
+  ; CHECK-NEXT:   successors: %bb.5(0x80000000)
+  ; CHECK-NEXT:   liveins: $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION escape 0x16, 0x12, 0x02, 0x82, 0x78
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION negate_ra_sign_state
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION def_cfa_offset 16
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION offset $w30, -16
+  ; CHECK-NEXT:   renamable $w0 = nsw SUBWri killed renamable $w0, 1, 0
+  ; CHECK-NEXT:   BL @g, csr_aarch64_aapcs_scs, implicit-def dead $lr, implicit $sp, implicit killed $w0, implicit-def $sp, implicit-def $w0
+  ; CHECK-NEXT:   renamable $w0 = nsw ADDWri killed renamable $w0, 1, 0
+  ; CHECK-NEXT:   B %bb.5
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.4.return:
+  ; CHECK-NEXT:   liveins: $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   RET undef $lr, implicit killed $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.5.return:
+  ; CHECK-NEXT:   successors: %bb.7(0x80000000)
+  ; CHECK-NEXT:   liveins: $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   CFI_INSTRUCTION restore_state
+  ; CHECK-NEXT:   CFI_INSTRUCTION remember_state
+  ; CHECK-NEXT:   B %bb.7
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.6.return:
+  ; CHECK-NEXT:   liveins: $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   early-clobber $sp, $lr = frame-destroy LDRXpost $sp, 16 :: (load (s64) from %stack.0)
+  ; CHECK-NEXT:   frame-destroy CFI_INSTRUCTION def_cfa_offset 0
+  ; CHECK-NEXT:   frame-destroy AUTIASP implicit-def $lr, implicit killed $lr, implicit $sp
+  ; CHECK-NEXT:   frame-destroy CFI_INSTRUCTION negate_ra_sign_state
+  ; CHECK-NEXT:   early-clobber $x18, $lr = frame-destroy LDRXpre $x18, -8
+  ; CHECK-NEXT:   frame-destroy CFI_INSTRUCTION restore $w18
+  ; CHECK-NEXT:   frame-destroy CFI_INSTRUCTION restore $w30
+  ; CHECK-NEXT:   RET undef $lr, implicit killed $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.7.return:
+  ; CHECK-NEXT:   successors: %bb.6(0x80000000)
+  ; CHECK-NEXT:   liveins: $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   CFI_INSTRUCTION restore_state
+  ; CHECK-NEXT:   B %bb.6
+  bb.0.entry:
+    successors: %bb.4(0x30000000), %bb.1(0x50000000)
+    liveins: $w0, $lr, $x18
+
+    CBZW renamable $w0, %bb.4
+
+  bb.1.if.end:
+    successors: %bb.3(0x30000000), %bb.2(0x50000000)
+    liveins: $w0, $lr, $x18
+
+    early-clobber $x18 = frame-setup STRXpost $lr, $x18, 8
+    frame-setup CFI_INSTRUCTION escape 0x16, 0x12, 0x02, 0x82, 0x78
+    frame-setup PACIASP implicit-def $lr, implicit killed $lr, implicit $sp
+    frame-setup CFI_INSTRUCTION negate_ra_sign_state
+    early-clobber $sp = frame-setup STRXpre killed $lr, $sp, -16 :: (store (s64) into %stack.0)
+    frame-setup CFI_INSTRUCTION def_cfa_offset 16
+    frame-setup CFI_INSTRUCTION offset $w30, -16
+    TBNZW renamable $w0, 31, %bb.3
+
+  bb.2.if.else:
+    successors: %bb.5(0x80000000)
+    liveins: $w0
+
+    renamable $w0 = nuw nsw ADDWri killed renamable $w0, 1, 0
+    BL @g, csr_aarch64_aapcs_scs, implicit-def dead $lr, implicit $sp, implicit killed $w0, implicit-def $sp, implicit-def $w0
+    renamable $w8 = MOVZWi 1, 0
+    $w0 = SUBWrs killed renamable $w8, killed renamable $w0, 0
+    B %bb.5
+
+  bb.3.if.then2 (bbsections 1):
+    successors: %bb.5(0x80000000)
+    liveins: $w0
+
+    renamable $w0 = nsw SUBWri killed renamable $w0, 1, 0
+    BL @g, csr_aarch64_aapcs_scs, implicit-def dead $lr, implicit $sp, implicit killed $w0, implicit-def $sp, implicit-def $w0
+    renamable $w0 = nsw ADDWri killed renamable $w0, 1, 0
+    B %bb.5
+
+  bb.4.return:
+    liveins: $w0
+    RET undef $lr, implicit killed $w0
+
+  bb.5.return:
+    liveins: $w0
+    B %bb.6
+
+  bb.7.return:
+    liveins: $w0
+    early-clobber $sp, $lr = frame-destroy LDRXpost $sp, 16 :: (load (s64) from %stack.0)
+    frame-destroy CFI_INSTRUCTION def_cfa_offset 0
+    frame-destroy AUTIASP implicit-def $lr, implicit killed $lr, implicit $sp
+    frame-destroy CFI_INSTRUCTION negate_ra_sign_state
+    early-clobber $x18, $lr = frame-destroy LDRXpre $x18, -8
+    frame-destroy CFI_INSTRUCTION restore $w18
+    frame-destroy CFI_INSTRUCTION restore $w30
+    RET undef $lr, implicit killed $w0
+
+  bb.6.return:
+    liveins: $w0
+    B %bb.7
+
+
+...

@dhoekwater dhoekwater requested review from echristo and removed request for MaskRay, jrtc27 and SamTebbs33 November 20, 2024 19:13
Now that `-fbasic-block-sections=list` is enabled for Arm, functions may
be split aross multiple sections, and CFI information must be handled
independently for each section.

On x86, this is handled in `llvm/lib/CodeGen/CFIInstrInserter.cpp`.
However, this pass does not run on Arm, so we must add logic for it
to `llvm/lib/CodeGen/CFIFixup.cpp`.
@dhoekwater dhoekwater force-pushed the aarch64-multisection-cfi branch from de39f22 to 1e87a39 Compare November 20, 2024 19:17
Copy link
Contributor

@echristo echristo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel comfortable approving this for now. We should really merge the two separate passes at some point :)

@dhoekwater dhoekwater merged commit 07137ce into llvm:main Nov 20, 2024
8 checks passed
@dhoekwater dhoekwater deleted the aarch64-multisection-cfi branch November 20, 2024 22:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants