Skip to content

Reland [AMDGPU] Support block load/store for CSR #130013 #137169

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 25, 2025

Conversation

rovka
Copy link
Collaborator

@rovka rovka commented Apr 24, 2025

Add support for using the existing SCRATCH_STORE_BLOCK and SCRATCH_LOAD_BLOCK instructions for saving and restoring callee-saved VGPRs. This is controlled by a new subtarget feature, block-vgpr-csr. It does not include WWM registers - those will be saved and restored individually, just like before. This patch does not change the ABI.

Use of this feature may lead to slightly increased stack usage, because the memory is not compacted if certain registers don't have to be transferred (this will happen in practice for calling conventions where the callee and caller saved registers are interleaved in groups of 8). However, if the registers at the end of the block of 32 don't have to be transferred, we don't need to use a whole 128-byte stack slot - we can trim some space off the end of the range.

In order to implement this feature, we need to rely less on the target-independent code in the PrologEpilogInserter, so we override several new methods in SIFrameLowering. We also add new pseudos, SI_BLOCK_SPILL_V1024_SAVE/RESTORE.

One peculiarity is that both the SI_BLOCK_V1024_RESTORE pseudo and the SCRATCH_LOAD_BLOCK instructions will have all the registers that are not transferred added as implicit uses. This is done in order to inform LiveRegUnits that those registers are not available before the restore (since we're not really restoring them - so we can't afford to scavenge them). Unfortunately, this trick doesn't work with the save, so before the save all the registers in the block will be unavailable (see the unit test).

This was reverted due to failures in the builds with expensive checks on, now fixed by always updating LiveIntervals and SlotIndexes in SILowerSGPRSpills.

@llvmbot
Copy link
Member

llvmbot commented Apr 24, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Diana Picus (rovka)

Changes

Add support for using the existing SCRATCH_STORE_BLOCK and SCRATCH_LOAD_BLOCK instructions for saving and restoring callee-saved VGPRs. This is controlled by a new subtarget feature, block-vgpr-csr. It does not include WWM registers - those will be saved and restored individually, just like before. This patch does not change the ABI.

Use of this feature may lead to slightly increased stack usage, because the memory is not compacted if certain registers don't have to be transferred (this will happen in practice for calling conventions where the callee and caller saved registers are interleaved in groups of 8). However, if the registers at the end of the block of 32 don't have to be transferred, we don't need to use a whole 128-byte stack slot - we can trim some space off the end of the range.

In order to implement this feature, we need to rely less on the target-independent code in the PrologEpilogInserter, so we override several new methods in SIFrameLowering. We also add new pseudos, SI_BLOCK_SPILL_V1024_SAVE/RESTORE.

One peculiarity is that both the SI_BLOCK_V1024_RESTORE pseudo and the SCRATCH_LOAD_BLOCK instructions will have all the registers that are not transferred added as implicit uses. This is done in order to inform LiveRegUnits that those registers are not available before the restore (since we're not really restoring them - so we can't afford to scavenge them). Unfortunately, this trick doesn't work with the save, so before the save all the registers in the block will be unavailable (see the unit test).

This was reverted due to failures in the builds with expensive checks on, now fixed by always updating LiveIntervals and SlotIndexes in SILowerSGPRSpills.


Patch is 84.43 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/137169.diff

20 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/MachineFrameInfo.h (+1)
  • (modified) llvm/include/llvm/CodeGen/TargetFrameLowering.h (+17)
  • (modified) llvm/lib/CodeGen/PrologEpilogInserter.cpp (+6-29)
  • (modified) llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp (+35)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPU.td (+8)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp (+37)
  • (modified) llvm/lib/Target/AMDGPU/GCNSubtarget.h (+3)
  • (modified) llvm/lib/Target/AMDGPU/SIFrameLowering.cpp (+204)
  • (modified) llvm/lib/Target/AMDGPU/SIFrameLowering.h (+17)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+14)
  • (modified) llvm/lib/Target/AMDGPU/SIInstructions.td (+16-5)
  • (modified) llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp (+21-9)
  • (modified) llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h (+33)
  • (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp (+64-7)
  • (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.h (+16)
  • (added) llvm/test/CodeGen/AMDGPU/pei-vgpr-block-spill-csr.mir (+294)
  • (added) llvm/test/CodeGen/AMDGPU/spill-vgpr-block.ll (+95)
  • (added) llvm/test/CodeGen/AMDGPU/vgpr-blocks-funcinfo.mir (+47)
  • (modified) llvm/unittests/Target/AMDGPU/CMakeLists.txt (+1)
  • (added) llvm/unittests/Target/AMDGPU/LiveRegUnits.cpp (+160)
diff --git a/llvm/include/llvm/CodeGen/MachineFrameInfo.h b/llvm/include/llvm/CodeGen/MachineFrameInfo.h
index 172c3e8c9a847..9d1b536d23331 100644
--- a/llvm/include/llvm/CodeGen/MachineFrameInfo.h
+++ b/llvm/include/llvm/CodeGen/MachineFrameInfo.h
@@ -61,6 +61,7 @@ class CalleeSavedInfo {
   MCRegister getReg()                      const { return Reg; }
   int getFrameIdx()                        const { return FrameIdx; }
   MCRegister getDstReg()                   const { return DstReg; }
+  void setReg(MCRegister R) { Reg = R; }
   void setFrameIdx(int FI) {
     FrameIdx = FI;
     SpilledToReg = false;
diff --git a/llvm/include/llvm/CodeGen/TargetFrameLowering.h b/llvm/include/llvm/CodeGen/TargetFrameLowering.h
index cdbefb36c00c7..58b63f1769003 100644
--- a/llvm/include/llvm/CodeGen/TargetFrameLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetFrameLowering.h
@@ -270,6 +270,14 @@ class TargetFrameLowering {
     return false;
   }
 
+  /// spillCalleeSavedRegister - Default implementation for spilling a single
+  /// callee saved register.
+  void spillCalleeSavedRegister(MachineBasicBlock &SaveBlock,
+                                MachineBasicBlock::iterator MI,
+                                const CalleeSavedInfo &CS,
+                                const TargetInstrInfo *TII,
+                                const TargetRegisterInfo *TRI) const;
+
   /// restoreCalleeSavedRegisters - Issues instruction(s) to restore all callee
   /// saved registers and returns true if it isn't possible / profitable to do
   /// so by issuing a series of load instructions via loadRegToStackSlot().
@@ -284,6 +292,15 @@ class TargetFrameLowering {
     return false;
   }
 
+  // restoreCalleeSavedRegister - Default implementation for restoring a single
+  // callee saved register. Should be called in reverse order. Can insert
+  // multiple instructions.
+  void restoreCalleeSavedRegister(MachineBasicBlock &MBB,
+                                  MachineBasicBlock::iterator MI,
+                                  const CalleeSavedInfo &CS,
+                                  const TargetInstrInfo *TII,
+                                  const TargetRegisterInfo *TRI) const;
+
   /// hasFP - Return true if the specified function should have a dedicated
   /// frame pointer register. For most targets this is true only if the function
   /// has variable sized allocas or if frame pointer elimination is disabled.
diff --git a/llvm/lib/CodeGen/PrologEpilogInserter.cpp b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
index ac4090252cea0..0cd25c4feb8b9 100644
--- a/llvm/lib/CodeGen/PrologEpilogInserter.cpp
+++ b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
@@ -476,8 +476,8 @@ static void assignCalleeSavedSpillSlots(MachineFunction &F,
     // Now that we know which registers need to be saved and restored, allocate
     // stack slots for them.
     for (auto &CS : CSI) {
-      // If the target has spilled this register to another register, we don't
-      // need to allocate a stack slot.
+      // If the target has spilled this register to another register or already
+      // handled it , we don't need to allocate a stack slot.
       if (CS.isSpilledToReg())
         continue;
 
@@ -597,25 +597,14 @@ static void updateLiveness(MachineFunction &MF) {
 static void insertCSRSaves(MachineBasicBlock &SaveBlock,
                            ArrayRef<CalleeSavedInfo> CSI) {
   MachineFunction &MF = *SaveBlock.getParent();
-  const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
+  const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
   const TargetFrameLowering *TFI = MF.getSubtarget().getFrameLowering();
   const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
 
   MachineBasicBlock::iterator I = SaveBlock.begin();
   if (!TFI->spillCalleeSavedRegisters(SaveBlock, I, CSI, TRI)) {
     for (const CalleeSavedInfo &CS : CSI) {
-      // Insert the spill to the stack frame.
-      MCRegister Reg = CS.getReg();
-
-      if (CS.isSpilledToReg()) {
-        BuildMI(SaveBlock, I, DebugLoc(), TII.get(TargetOpcode::COPY),
-                CS.getDstReg())
-            .addReg(Reg, getKillRegState(true));
-      } else {
-        const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(Reg);
-        TII.storeRegToStackSlot(SaveBlock, I, Reg, true, CS.getFrameIdx(), RC,
-                                TRI, Register());
-      }
+      TFI->spillCalleeSavedRegister(SaveBlock, I, CS, TII, TRI);
     }
   }
 }
@@ -624,7 +613,7 @@ static void insertCSRSaves(MachineBasicBlock &SaveBlock,
 static void insertCSRRestores(MachineBasicBlock &RestoreBlock,
                               std::vector<CalleeSavedInfo> &CSI) {
   MachineFunction &MF = *RestoreBlock.getParent();
-  const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
+  const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
   const TargetFrameLowering *TFI = MF.getSubtarget().getFrameLowering();
   const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
 
@@ -634,19 +623,7 @@ static void insertCSRRestores(MachineBasicBlock &RestoreBlock,
 
   if (!TFI->restoreCalleeSavedRegisters(RestoreBlock, I, CSI, TRI)) {
     for (const CalleeSavedInfo &CI : reverse(CSI)) {
-      MCRegister Reg = CI.getReg();
-      if (CI.isSpilledToReg()) {
-        BuildMI(RestoreBlock, I, DebugLoc(), TII.get(TargetOpcode::COPY), Reg)
-            .addReg(CI.getDstReg(), getKillRegState(true));
-      } else {
-        const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(Reg);
-        TII.loadRegFromStackSlot(RestoreBlock, I, Reg, CI.getFrameIdx(), RC,
-                                 TRI, Register());
-        assert(I != RestoreBlock.begin() &&
-               "loadRegFromStackSlot didn't insert any code!");
-        // Insert in reverse order.  loadRegFromStackSlot can insert
-        // multiple instructions.
-      }
+      TFI->restoreCalleeSavedRegister(RestoreBlock, I, CI, TII, TRI);
     }
   }
 }
diff --git a/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp b/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp
index be73b73c93989..70c3b2cbae9a6 100644
--- a/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp
+++ b/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp
@@ -15,6 +15,7 @@
 #include "llvm/CodeGen/MachineFunction.h"
 #include "llvm/CodeGen/MachineRegisterInfo.h"
 #include "llvm/CodeGen/TargetFrameLowering.h"
+#include "llvm/CodeGen/TargetInstrInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
 #include "llvm/IR/Attributes.h"
 #include "llvm/IR/Function.h"
@@ -182,3 +183,37 @@ TargetFrameLowering::getDwarfFrameBase(const MachineFunction &MF) const {
   const TargetRegisterInfo *RI = MF.getSubtarget().getRegisterInfo();
   return DwarfFrameBase{DwarfFrameBase::Register, {RI->getFrameRegister(MF).id()}};
 }
+
+void TargetFrameLowering::spillCalleeSavedRegister(
+    MachineBasicBlock &SaveBlock, MachineBasicBlock::iterator MI,
+    const CalleeSavedInfo &CS, const TargetInstrInfo *TII,
+    const TargetRegisterInfo *TRI) const {
+  // Insert the spill to the stack frame.
+  MCRegister Reg = CS.getReg();
+
+  if (CS.isSpilledToReg()) {
+    BuildMI(SaveBlock, MI, DebugLoc(), TII->get(TargetOpcode::COPY),
+            CS.getDstReg())
+        .addReg(Reg, getKillRegState(true));
+  } else {
+    const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(Reg);
+    TII->storeRegToStackSlot(SaveBlock, MI, Reg, true, CS.getFrameIdx(), RC,
+                             TRI, Register());
+  }
+}
+
+void TargetFrameLowering::restoreCalleeSavedRegister(
+    MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
+    const CalleeSavedInfo &CS, const TargetInstrInfo *TII,
+    const TargetRegisterInfo *TRI) const {
+  MCRegister Reg = CS.getReg();
+  if (CS.isSpilledToReg()) {
+    BuildMI(MBB, MI, DebugLoc(), TII->get(TargetOpcode::COPY), Reg)
+        .addReg(CS.getDstReg(), getKillRegState(true));
+  } else {
+    const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(Reg);
+    TII->loadRegFromStackSlot(MBB, MI, Reg, CS.getFrameIdx(), RC, TRI,
+                              Register());
+    assert(MI != MBB.begin() && "loadRegFromStackSlot didn't insert any code!");
+  }
+}
diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index b2098b41acb7e..d896589825fc7 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -1275,6 +1275,14 @@ def FeatureDynamicVGPRBlockSize32 : SubtargetFeature<"dynamic-vgpr-block-size-32
   "Use a block size of 32 for dynamic VGPR allocation (default is 16)"
 >;
 
+// Enable the use of SCRATCH_STORE/LOAD_BLOCK instructions for saving and
+// restoring the callee-saved registers.
+def FeatureUseBlockVGPROpsForCSR : SubtargetFeature<"block-vgpr-csr",
+  "UseBlockVGPROpsForCSR",
+  "true",
+  "Use block load/store for VGPR callee saved registers"
+>;
+
 def FeatureLshlAddU64Inst
     : SubtargetFeature<"lshl-add-u64-inst", "HasLshlAddU64Inst", "true",
                        "Has v_lshl_add_u64 instruction">;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp b/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
index 3d6b974d1f027..2dec16de940d1 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
@@ -19,6 +19,7 @@
 #include "MCTargetDesc/AMDGPUInstPrinter.h"
 #include "MCTargetDesc/AMDGPUMCExpr.h"
 #include "MCTargetDesc/AMDGPUMCTargetDesc.h"
+#include "SIMachineFunctionInfo.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
 #include "llvm/CodeGen/MachineInstr.h"
 #include "llvm/IR/Constants.h"
@@ -243,6 +244,36 @@ const MCExpr *AMDGPUAsmPrinter::lowerConstant(const Constant *CV,
   return AsmPrinter::lowerConstant(CV, BaseCV, Offset);
 }
 
+static void emitVGPRBlockComment(const MachineInstr *MI, const SIInstrInfo *TII,
+                                 const TargetRegisterInfo *TRI,
+                                 const SIMachineFunctionInfo *MFI,
+                                 MCStreamer &OS) {
+  // The instruction will only transfer a subset of the registers in the block,
+  // based on the mask that is stored in m0. We could search for the instruction
+  // that sets m0, but most of the time we'll already have the mask stored in
+  // the machine function info. Try to use that. This assumes that we only use
+  // block loads/stores for CSR spills.
+  Register RegBlock =
+      TII->getNamedOperand(*MI, MI->mayLoad() ? AMDGPU::OpName::vdst
+                                              : AMDGPU::OpName::vdata)
+          ->getReg();
+  Register FirstRegInBlock = TRI->getSubReg(RegBlock, AMDGPU::sub0);
+  uint32_t Mask = MFI->getMaskForVGPRBlockOps(RegBlock);
+
+  if (!Mask)
+    return; // Nothing to report
+
+  SmallString<512> TransferredRegs;
+  for (unsigned I = 0; I < sizeof(Mask) * 8; ++I) {
+    if (Mask & (1 << I)) {
+      (llvm::Twine(" ") + TRI->getRegAsmName(FirstRegInBlock + I))
+          .toVector(TransferredRegs);
+    }
+  }
+
+  OS.emitRawComment(" transferring at most " + TransferredRegs);
+}
+
 void AMDGPUAsmPrinter::emitInstruction(const MachineInstr *MI) {
   // FIXME: Enable feature predicate checks once all the test pass.
   // AMDGPU_MC::verifyInstructionPredicates(MI->getOpcode(),
@@ -331,6 +362,12 @@ void AMDGPUAsmPrinter::emitInstruction(const MachineInstr *MI) {
       return;
     }
 
+    if (isVerbose())
+      if (STI.getInstrInfo()->isBlockLoadStore(MI->getOpcode()))
+        emitVGPRBlockComment(MI, STI.getInstrInfo(), STI.getRegisterInfo(),
+                             MF->getInfo<SIMachineFunctionInfo>(),
+                             *OutStreamer);
+
     MCInst TmpInst;
     MCInstLowering.lower(MI, TmpInst);
     EmitToStreamer(*OutStreamer, TmpInst);
diff --git a/llvm/lib/Target/AMDGPU/GCNSubtarget.h b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
index 7dd91c0775a48..fea17baa17722 100644
--- a/llvm/lib/Target/AMDGPU/GCNSubtarget.h
+++ b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
@@ -262,6 +262,7 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
   bool HasPointSampleAccel = false;
 
   bool RequiresCOV6 = false;
+  bool UseBlockVGPROpsForCSR = false;
 
   // Dummy feature to use for assembler in tablegen.
   bool FeatureDisable = false;
@@ -1277,6 +1278,8 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
 
   bool requiresCodeObjectV6() const { return RequiresCOV6; }
 
+  bool useVGPRBlockOpsForCSR() const { return UseBlockVGPROpsForCSR; }
+
   bool hasVALUMaskWriteHazard() const { return getGeneration() == GFX11; }
 
   bool hasVALUReadSGPRHazard() const { return getGeneration() == GFX12; }
diff --git a/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp b/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
index 9c737b4f3e378..0c1cd9ceddb02 100644
--- a/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
@@ -1739,6 +1739,105 @@ void SIFrameLowering::determineCalleeSavesSGPR(MachineFunction &MF,
   }
 }
 
+static void assignSlotsUsingVGPRBlocks(MachineFunction &MF,
+                                       const GCNSubtarget &ST,
+                                       std::vector<CalleeSavedInfo> &CSI,
+                                       unsigned &MinCSFrameIndex,
+                                       unsigned &MaxCSFrameIndex) {
+  SIMachineFunctionInfo *FuncInfo = MF.getInfo<SIMachineFunctionInfo>();
+  MachineFrameInfo &MFI = MF.getFrameInfo();
+  const SIRegisterInfo *TRI = ST.getRegisterInfo();
+
+  assert(std::is_sorted(CSI.begin(), CSI.end(),
+                        [](const CalleeSavedInfo &A, const CalleeSavedInfo &B) {
+                          return A.getReg() < B.getReg();
+                        }) &&
+         "Callee saved registers not sorted");
+
+  auto CanUseBlockOps = [&](const CalleeSavedInfo &CSI) {
+    return !CSI.isSpilledToReg() &&
+           TRI->getPhysRegBaseClass(CSI.getReg()) == &AMDGPU::VGPR_32RegClass &&
+           !FuncInfo->isWWMReservedRegister(CSI.getReg());
+  };
+
+  auto CSEnd = CSI.end();
+  for (auto CSIt = CSI.begin(); CSIt != CSEnd; ++CSIt) {
+    Register Reg = CSIt->getReg();
+    if (!CanUseBlockOps(*CSIt))
+      continue;
+
+    // Find all the regs that will fit in a 32-bit mask starting at the current
+    // reg and build said mask. It should have 1 for every register that's
+    // included, with the current register as the least significant bit.
+    uint32_t Mask = 1;
+    CSEnd = std::remove_if(
+        CSIt + 1, CSEnd, [&](const CalleeSavedInfo &CSI) -> bool {
+          if (CanUseBlockOps(CSI) && CSI.getReg() < Reg + 32) {
+            Mask |= 1 << (CSI.getReg() - Reg);
+            return true;
+          } else {
+            return false;
+          }
+        });
+
+    const TargetRegisterClass *BlockRegClass = TRI->getRegClassForBlockOp(MF);
+    Register RegBlock =
+        TRI->getMatchingSuperReg(Reg, AMDGPU::sub0, BlockRegClass);
+    if (!RegBlock) {
+      // We couldn't find a super register for the block. This can happen if
+      // the register we started with is too high (e.g. v232 if the maximum is
+      // v255). We therefore try to get the last register block and figure out
+      // the mask from there.
+      Register LastBlockStart =
+          AMDGPU::VGPR0 + alignDown(Reg - AMDGPU::VGPR0, 32);
+      RegBlock =
+          TRI->getMatchingSuperReg(LastBlockStart, AMDGPU::sub0, BlockRegClass);
+      assert(RegBlock && TRI->isSubRegister(RegBlock, Reg) &&
+             "Couldn't find super register");
+      int RegDelta = Reg - LastBlockStart;
+      assert(RegDelta > 0 && llvm::countl_zero(Mask) >= RegDelta &&
+             "Bad shift amount");
+      Mask <<= RegDelta;
+    }
+
+    FuncInfo->setMaskForVGPRBlockOps(RegBlock, Mask);
+
+    // The stack objects can be a bit smaller than the register block if we know
+    // some of the high bits of Mask are 0. This may happen often with calling
+    // conventions where the caller and callee-saved VGPRs are interleaved at
+    // a small boundary (e.g. 8 or 16).
+    int UnusedBits = llvm::countl_zero(Mask);
+    unsigned BlockSize = TRI->getSpillSize(*BlockRegClass) - UnusedBits * 4;
+    int FrameIdx =
+        MFI.CreateStackObject(BlockSize, TRI->getSpillAlign(*BlockRegClass),
+                              /*isSpillSlot=*/true);
+    if ((unsigned)FrameIdx < MinCSFrameIndex)
+      MinCSFrameIndex = FrameIdx;
+    if ((unsigned)FrameIdx > MaxCSFrameIndex)
+      MaxCSFrameIndex = FrameIdx;
+
+    CSIt->setFrameIdx(FrameIdx);
+    CSIt->setReg(RegBlock);
+  }
+  CSI.erase(CSEnd, CSI.end());
+}
+
+bool SIFrameLowering::assignCalleeSavedSpillSlots(
+    MachineFunction &MF, const TargetRegisterInfo *TRI,
+    std::vector<CalleeSavedInfo> &CSI, unsigned &MinCSFrameIndex,
+    unsigned &MaxCSFrameIndex) const {
+  if (CSI.empty())
+    return true; // Early exit if no callee saved registers are modified!
+
+  const GCNSubtarget &ST = MF.getSubtarget<GCNSubtarget>();
+  bool UseVGPRBlocks = ST.useVGPRBlockOpsForCSR();
+
+  if (UseVGPRBlocks)
+    assignSlotsUsingVGPRBlocks(MF, ST, CSI, MinCSFrameIndex, MaxCSFrameIndex);
+
+  return assignCalleeSavedSpillSlots(MF, TRI, CSI) || UseVGPRBlocks;
+}
+
 bool SIFrameLowering::assignCalleeSavedSpillSlots(
     MachineFunction &MF, const TargetRegisterInfo *TRI,
     std::vector<CalleeSavedInfo> &CSI) const {
@@ -1808,6 +1907,111 @@ bool SIFrameLowering::allocateScavengingFrameIndexesNearIncomingSP(
   return true;
 }
 
+bool SIFrameLowering::spillCalleeSavedRegisters(
+    MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
+    ArrayRef<CalleeSavedInfo> CSI, const TargetRegisterInfo *TRI) const {
+  MachineFunction *MF = MBB.getParent();
+  const GCNSubtarget &ST = MF->getSubtarget<GCNSubtarget>();
+  if (!ST.useVGPRBlockOpsForCSR())
+    return false;
+
+  MachineFrameInfo &FrameInfo = MF->getFrameInfo();
+  SIMachineFunctionInfo *MFI = MF->getInfo<SIMachineFunctionInfo>();
+  const SIInstrInfo *TII = ST.getInstrInfo();
+  SIMachineFunctionInfo *FuncInfo = MF->getInfo<SIMachineFunctionInfo>();
+
+  const TargetRegisterClass *BlockRegClass =
+      static_cast<const SIRegisterInfo *>(TRI)->getRegClassForBlockOp(*MF);
+  for (const CalleeSavedInfo &CS : CSI) {
+    Register Reg = CS.getReg();
+    if (!BlockRegClass->contains(Reg) ||
+        !FuncInfo->hasMaskForVGPRBlockOps(Reg)) {
+      spillCalleeSavedRegister(MBB, MI, CS, TII, TRI);
+      continue;
+    }
+
+    // Build a scratch block store.
+    uint32_t Mask = FuncInfo->getMaskForVGPRBlockOps(Reg);
+    int FrameIndex = CS.getFrameIdx();
+    MachinePointerInfo PtrInfo =
+        MachinePointerInfo::getFixedStack(*MF, FrameIndex);
+    MachineMemOperand *MMO =
+        MF->getMachineMemOperand(PtrInfo, MachineMemOperand::MOStore,
+                                 FrameInfo.getObjectSize(FrameIndex),
+                                 FrameInfo.getObjectAlign(FrameIndex));
+
+    BuildMI(MBB, MI, MI->getDebugLoc(),
+            TII->get(AMDGPU::SI_BLOCK_SPILL_V1024_SAVE))
+        .addReg(Reg, getKillRegState(false))
+        .addFrameIndex(FrameIndex)
+        .addReg(MFI->getStackPtrOffsetReg())
+        .addImm(0)
+        .addImm(Mask)
+        .addMemOperand(MMO);
+
+    FuncInfo->setHasSpilledVGPRs();
+
+    // Add the register to the liveins. This is necessary because if any of the
+    // VGPRs in the register block is reserved (e.g. if it's a WWM register),
+    // then the whole block will be marked as reserved and `updateLiveness` will
+    // skip it.
+    MBB.addLiveIn(Reg);
+  }
+  MBB.sortUniqueLiveIns();
+
+  return true;
+}
+
+bool SIFrameLowering::restoreCalleeSavedRegisters(
+    MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
+    MutableArrayRef<CalleeSavedInfo> CSI, const TargetRegisterInfo *TRI) const {
+  MachineFunction *MF = MBB.getParent();
+  const GCNSubtarget &ST = MF->getSubtarget<GCNSubtarget>();
+  if (!ST.useVGPRBlockOpsForCSR())
+    return false;
+
+  SIMachineFunctionInfo *FuncInfo = MF->getInfo<SIMachineFunctionInfo>();
+  MachineFrameInfo &MFI = MF->getFrameInfo();
+  const SIInstrInfo *TII = ST.getInstrInfo();
+  const SIRegisterInfo *SITRI = static_cast<const SIRegisterInfo *>(TRI);
+  const TargetRegisterClass *BlockRegClass = SITRI->getRegClassForBlockOp(*MF);
+  for (const CalleeSavedInfo &CS : reverse(CSI)) {
+    Register Reg = CS.getReg();
+    if (!BlockRegClass->contains(Reg) ||
+        !FuncInfo->hasMaskForVGPRBlockOps(Reg)) {
+      restoreCalleeSavedRegister(MBB, MI, CS, TII, TRI);
+      co...
[truncated]

Copy link

github-actions bot commented Apr 24, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@shiltian shiltian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given it is a reland, no complaint as long as it can pass tests.

@rovka rovka merged commit 5bad5d8 into llvm:main Apr 25, 2025
6 of 10 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 25, 2025

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux running on sanitizer-buildbot1 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/66/builds/13065

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
[4113/5524] Building CXX object lib/Target/VE/CMakeFiles/LLVMVECodeGen.dir/VEISelDAGToDAG.cpp.o
[4114/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPULegalizerInfo.cpp.o
[4115/5524] Building CXX object lib/Target/SystemZ/CMakeFiles/LLVMSystemZCodeGen.dir/SystemZISelDAGToDAG.cpp.o
[4116/5524] Building CXX object lib/Target/SystemZ/CMakeFiles/LLVMSystemZCodeGen.dir/SystemZAsmPrinter.cpp.o
[4117/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIFormMemoryClauses.cpp.o
[4118/5524] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86InstrFMA3Info.cpp.o
[4119/5524] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86InstrFoldTables.cpp.o
[4120/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyDebugValueManager.cpp.o
[4121/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIWholeQuadMode.cpp.o
[4122/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o
FAILED: lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o 
CCACHE_CPP2=yes CCACHE_HASHDIR=yes CCACHE_SLOPPINESS=pch_defines,time_macros /usr/bin/ccache /home/b/sanitizer-x86_64-linux/build/llvm_build0/bin/clang++ -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/b/sanitizer-x86_64-linux/build/build_default/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/build_default/include -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fvisibility=hidden  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -MF lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o.d -o lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -c /home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp:172:25: error: unused variable 'RI' [-Werror,-Wunused-variable]
  172 |   const SIRegisterInfo *RI = ST.getRegisterInfo();
      |                         ^~
1 error generated.
[4123/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyLateEHPrepare.cpp.o
[4124/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/R600ISelDAGToDAG.cpp.o
[4125/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIMachineFunctionInfo.cpp.o
[4126/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyFixBrTableDefaults.cpp.o
[4127/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIOptimizeVGPRLiveRange.cpp.o
[4128/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIPreAllocateWWMRegs.cpp.o
[4129/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUInstructionSelector.cpp.o
[4130/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIFoldOperands.cpp.o
[4131/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyLowerRefTypesIntPtrConv.cpp.o
[4132/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILateBranchLowering.cpp.o
[4133/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyRegNumbering.cpp.o
[4134/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyExplicitLocals.cpp.o
[4135/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyDebugFixup.cpp.o
[4136/5524] Building CXX object lib/Target/SPIRV/CMakeFiles/LLVMSPIRVCodeGen.dir/SPIRVUtils.cpp.o
[4137/5524] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86ArgumentStackSlotRebase.cpp.o
[4138/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIFrameLowering.cpp.o
[4139/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblySelectionDAGInfo.cpp.o
[4140/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyInstrInfo.cpp.o
[4141/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyFixIrreducibleControlFlow.cpp.o
[4142/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyRegisterInfo.cpp.o
[4143/5524] Building CXX object lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/GISel/RISCVInstructionSelector.cpp.o
[4144/5524] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86AvoidTrailingCall.cpp.o
[4145/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyLowerBrUnless.cpp.o
[4146/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/R600TargetMachine.cpp.o
[4147/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyRegColoring.cpp.o
[4148/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyMachineFunctionInfo.cpp.o
[4149/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyFrameLowering.cpp.o
[4150/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyMCInstLower.cpp.o
[4151/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyISelDAGToDAG.cpp.o
[4152/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblySubtarget.cpp.o
[4153/5524] Building CXX object lib/Target/VE/CMakeFiles/LLVMVECodeGen.dir/VETargetMachine.cpp.o
[4154/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyOptimizeLiveIntervals.cpp.o
[4155/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyFastISel.cpp.o
Step 8 (build compiler-rt symbolizer) failure: build compiler-rt symbolizer (failure)
...
[4113/5524] Building CXX object lib/Target/VE/CMakeFiles/LLVMVECodeGen.dir/VEISelDAGToDAG.cpp.o
[4114/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPULegalizerInfo.cpp.o
[4115/5524] Building CXX object lib/Target/SystemZ/CMakeFiles/LLVMSystemZCodeGen.dir/SystemZISelDAGToDAG.cpp.o
[4116/5524] Building CXX object lib/Target/SystemZ/CMakeFiles/LLVMSystemZCodeGen.dir/SystemZAsmPrinter.cpp.o
[4117/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIFormMemoryClauses.cpp.o
[4118/5524] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86InstrFMA3Info.cpp.o
[4119/5524] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86InstrFoldTables.cpp.o
[4120/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyDebugValueManager.cpp.o
[4121/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIWholeQuadMode.cpp.o
[4122/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o
FAILED: lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o 
CCACHE_CPP2=yes CCACHE_HASHDIR=yes CCACHE_SLOPPINESS=pch_defines,time_macros /usr/bin/ccache /home/b/sanitizer-x86_64-linux/build/llvm_build0/bin/clang++ -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/b/sanitizer-x86_64-linux/build/build_default/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/build_default/include -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fvisibility=hidden  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -MF lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o.d -o lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -c /home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp:172:25: error: unused variable 'RI' [-Werror,-Wunused-variable]
  172 |   const SIRegisterInfo *RI = ST.getRegisterInfo();
      |                         ^~
1 error generated.
[4123/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyLateEHPrepare.cpp.o
[4124/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/R600ISelDAGToDAG.cpp.o
[4125/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIMachineFunctionInfo.cpp.o
[4126/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyFixBrTableDefaults.cpp.o
[4127/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIOptimizeVGPRLiveRange.cpp.o
[4128/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIPreAllocateWWMRegs.cpp.o
[4129/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUInstructionSelector.cpp.o
[4130/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIFoldOperands.cpp.o
[4131/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyLowerRefTypesIntPtrConv.cpp.o
[4132/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILateBranchLowering.cpp.o
[4133/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyRegNumbering.cpp.o
[4134/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyExplicitLocals.cpp.o
[4135/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyDebugFixup.cpp.o
[4136/5524] Building CXX object lib/Target/SPIRV/CMakeFiles/LLVMSPIRVCodeGen.dir/SPIRVUtils.cpp.o
[4137/5524] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86ArgumentStackSlotRebase.cpp.o
[4138/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIFrameLowering.cpp.o
[4139/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblySelectionDAGInfo.cpp.o
[4140/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyInstrInfo.cpp.o
[4141/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyFixIrreducibleControlFlow.cpp.o
[4142/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyRegisterInfo.cpp.o
[4143/5524] Building CXX object lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/GISel/RISCVInstructionSelector.cpp.o
[4144/5524] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86AvoidTrailingCall.cpp.o
[4145/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyLowerBrUnless.cpp.o
[4146/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/R600TargetMachine.cpp.o
[4147/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyRegColoring.cpp.o
[4148/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyMachineFunctionInfo.cpp.o
[4149/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyFrameLowering.cpp.o
[4150/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyMCInstLower.cpp.o
[4151/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyISelDAGToDAG.cpp.o
[4152/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblySubtarget.cpp.o
[4153/5524] Building CXX object lib/Target/VE/CMakeFiles/LLVMVECodeGen.dir/VETargetMachine.cpp.o
[4154/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyOptimizeLiveIntervals.cpp.o
[4155/5524] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyFastISel.cpp.o
Step 9 (test compiler-rt symbolizer) failure: test compiler-rt symbolizer (failure)
...
[632/677] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86Subtarget.cpp.o
[633/677] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/GISel/X86LegalizerInfo.cpp.o
[634/677] Building CXX object lib/Target/XCore/CMakeFiles/LLVMXCoreCodeGen.dir/XCoreAsmPrinter.cpp.o
[635/677] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86PartialReduction.cpp.o
[636/677] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86SpeculativeLoadHardening.cpp.o
[637/677] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86OptimizeLEAs.cpp.o
[638/677] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86MCInstLower.cpp.o
[639/677] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86RegisterInfo.cpp.o
[640/677] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/GISel/X86InstructionSelector.cpp.o
[641/677] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o
FAILED: lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o 
CCACHE_CPP2=yes CCACHE_HASHDIR=yes CCACHE_SLOPPINESS=pch_defines,time_macros /usr/bin/ccache /home/b/sanitizer-x86_64-linux/build/llvm_build0/bin/clang++ -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/b/sanitizer-x86_64-linux/build/build_default/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/build_default/include -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fvisibility=hidden  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -MF lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o.d -o lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -c /home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp:172:25: error: unused variable 'RI' [-Werror,-Wunused-variable]
  172 |   const SIRegisterInfo *RI = ST.getRegisterInfo();
      |                         ^~
1 error generated.
[642/677] Building CXX object lib/Target/XCore/CMakeFiles/LLVMXCoreCodeGen.dir/XCoreTargetMachine.cpp.o
[643/677] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86TargetMachine.cpp.o
[644/677] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86TargetTransformInfo.cpp.o
ninja: build stopped: subcommand failed.

How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild




Step 10 (build compiler-rt debug) failure: build compiler-rt debug (failure)
...
[5448/5524] Linking CXX executable bin/llvm-dwarfdump
[5449/5524] Copying llvm-locstats into /home/b/sanitizer-x86_64-linux/build/build_default/./bin
[5450/5524] Linking CXX executable bin/llvm-cfi-verify
[5451/5524] Linking CXX executable bin/llvm-mc
[5452/5524] Linking CXX executable bin/llvm-rtdyld
[5453/5524] Linking CXX executable bin/llvm-debuginfo-analyzer
[5454/5524] Linking CXX executable bin/llvm-profgen
[5455/5524] Linking CXX executable bin/llvm-mca
[5456/5524] Building CXX object tools/llvm-reduce/CMakeFiles/llvm-reduce.dir/deltas/ReduceIRReferences.cpp.o
[5457/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o
FAILED: lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o 
CCACHE_CPP2=yes CCACHE_HASHDIR=yes CCACHE_SLOPPINESS=pch_defines,time_macros /usr/bin/ccache /home/b/sanitizer-x86_64-linux/build/llvm_build0/bin/clang++ -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/b/sanitizer-x86_64-linux/build/build_default/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/build_default/include -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fvisibility=hidden  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -MF lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o.d -o lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -c /home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp:172:25: error: unused variable 'RI' [-Werror,-Wunused-variable]
  172 |   const SIRegisterInfo *RI = ST.getRegisterInfo();
      |                         ^~
1 error generated.
[5458/5524] Building CXX object tools/llvm-reduce/CMakeFiles/llvm-reduce.dir/ReducerWorkItem.cpp.o
ninja: build stopped: subcommand failed.

How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild




Step 11 (test compiler-rt debug) failure: test compiler-rt debug (failure)
@@@BUILD_STEP test compiler-rt debug@@@
ninja: Entering directory `build_default'
[1/31] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o
FAILED: lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o 
CCACHE_CPP2=yes CCACHE_HASHDIR=yes CCACHE_SLOPPINESS=pch_defines,time_macros /usr/bin/ccache /home/b/sanitizer-x86_64-linux/build/llvm_build0/bin/clang++ -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/b/sanitizer-x86_64-linux/build/build_default/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/build_default/include -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fvisibility=hidden  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -MF lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o.d -o lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -c /home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp:172:25: error: unused variable 'RI' [-Werror,-Wunused-variable]
  172 |   const SIRegisterInfo *RI = ST.getRegisterInfo();
      |                         ^~
1 error generated.
ninja: build stopped: subcommand failed.

How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild




Step 12 (build compiler-rt tsan_debug) failure: build compiler-rt tsan_debug (failure)
...
[5430/5505] Generating ../../bin/llvm-ml64
[5431/5505] Linking CXX executable bin/llvm-cfi-verify
[5432/5505] Linking CXX executable bin/llvm-mc
[5433/5505] Linking CXX executable bin/llvm-mca
[5434/5505] Linking CXX executable bin/llvm-debuginfo-analyzer
[5435/5505] Linking CXX executable bin/sancov
[5436/5505] Linking CXX executable bin/llvm-rtdyld
[5437/5505] Linking CXX executable bin/llvm-jitlink
[5438/5505] Linking CXX executable bin/llvm-profgen
[5439/5505] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o
FAILED: lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o 
CCACHE_CPP2=yes CCACHE_HASHDIR=yes CCACHE_SLOPPINESS=pch_defines,time_macros /usr/bin/ccache /home/b/sanitizer-x86_64-linux/build/llvm_build0/bin/clang++ -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/b/sanitizer-x86_64-linux/build/build_default/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/build_default/include -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fvisibility=hidden  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -MF lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o.d -o lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -c /home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp:172:25: error: unused variable 'RI' [-Werror,-Wunused-variable]
  172 |   const SIRegisterInfo *RI = ST.getRegisterInfo();
      |                         ^~
1 error generated.
ninja: build stopped: subcommand failed.

How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild




Step 13 (build compiler-rt default) failure: build compiler-rt default (failure)
...
[5449/5524] Linking CXX executable bin/llvm-dwarfdump
[5450/5524] Copying llvm-locstats into /home/b/sanitizer-x86_64-linux/build/build_default/./bin
[5451/5524] Linking CXX executable bin/llvm-mca
[5452/5524] Linking CXX executable bin/llvm-cfi-verify
[5453/5524] Linking CXX executable bin/llvm-jitlink
[5454/5524] Linking CXX executable bin/llvm-rtdyld
[5455/5524] Linking CXX executable bin/llvm-mc
[5456/5524] Linking CXX executable bin/llvm-debuginfo-analyzer
[5457/5524] Linking CXX executable bin/llvm-profgen
[5458/5524] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o
FAILED: lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o 
CCACHE_CPP2=yes CCACHE_HASHDIR=yes CCACHE_SLOPPINESS=pch_defines,time_macros /usr/bin/ccache /home/b/sanitizer-x86_64-linux/build/llvm_build0/bin/clang++ -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/b/sanitizer-x86_64-linux/build/build_default/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/build_default/include -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fvisibility=hidden  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -MF lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o.d -o lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -c /home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp:172:25: error: unused variable 'RI' [-Werror,-Wunused-variable]
  172 |   const SIRegisterInfo *RI = ST.getRegisterInfo();
      |                         ^~
1 error generated.
ninja: build stopped: subcommand failed.

How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild




Step 14 (test compiler-rt default) failure: test compiler-rt default (failure)
@@@BUILD_STEP test compiler-rt default@@@
ninja: Entering directory `build_default'
[1/31] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o
FAILED: lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o 
CCACHE_CPP2=yes CCACHE_HASHDIR=yes CCACHE_SLOPPINESS=pch_defines,time_macros /usr/bin/ccache /home/b/sanitizer-x86_64-linux/build/llvm_build0/bin/clang++ -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/b/sanitizer-x86_64-linux/build/build_default/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU -I/home/b/sanitizer-x86_64-linux/build/build_default/include -I/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fvisibility=hidden  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -MF lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o.d -o lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -c /home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp:172:25: error: unused variable 'RI' [-Werror,-Wunused-variable]
  172 |   const SIRegisterInfo *RI = ST.getRegisterInfo();
      |                         ^~
1 error generated.
ninja: build stopped: subcommand failed.

How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild




Step 15 (build standalone compiler-rt) failure: build standalone compiler-rt (failure)
...
-- The C compiler identification is unknown
-- The CXX compiler identification is unknown
-- The ASM compiler identification is unknown
-- Didn't find assembler
CMake Error at CMakeLists.txt:22 (project):
  The CMAKE_C_COMPILER:

    /home/b/sanitizer-x86_64-linux/build/build_default/bin/clang

  is not a full path to an existing compiler tool.

  Tell CMake where to find the compiler by setting either the environment
  variable "CC" or the CMake cache entry CMAKE_C_COMPILER to the full path to
  the compiler, or to the compiler name if it is in the PATH.


CMake Error at CMakeLists.txt:22 (project):
  The CMAKE_CXX_COMPILER:

    /home/b/sanitizer-x86_64-linux/build/build_default/bin/clang++

  is not a full path to an existing compiler tool.

  Tell CMake where to find the compiler by setting either the environment
  variable "CXX" or the CMake cache entry CMAKE_CXX_COMPILER to the full path
  to the compiler, or to the compiler name if it is in the PATH.


CMake Error at CMakeLists.txt:22 (project):
  No CMAKE_ASM_COMPILER could be found.

  Tell CMake where to find the compiler by setting either the environment
  variable "ASM" or the CMake cache entry CMAKE_ASM_COMPILER to the full path
  to the compiler, or to the compiler name if it is in the PATH.
-- Warning: Did not find file Compiler/-ASM
-- Configuring incomplete, errors occurred!

How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild




ninja: Entering directory `compiler_rt_build'
ninja: error: loading 'build.ninja': No such file or directory

How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild




Step 16 (test standalone compiler-rt) failure: test standalone compiler-rt (failure)
@@@BUILD_STEP test standalone compiler-rt@@@
ninja: Entering directory `compiler_rt_build'
ninja: error: loading 'build.ninja': No such file or directory

How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild





@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 25, 2025

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-android running on sanitizer-buildbot-android while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/186/builds/8503

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
[2324/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerControlFlow.cpp.o
[2325/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerI1Copies.cpp.o
[2326/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIInsertWaitcnts.cpp.o
[2327/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILateBranchLowering.cpp.o
[2328/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIModeRegisterDefaults.cpp.o
[2329/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIModeRegister.cpp.o
[2330/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIOptimizeExecMasking.cpp.o
[2331/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIInstrInfo.cpp.o
[2332/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerWWMCopies.cpp.o
[2333/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o
FAILED: lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o 
CCACHE_CPP2=yes CCACHE_HASHDIR=yes CCACHE_SLOPPINESS=pch_defines,time_macros /usr/bin/ccache /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build0/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/lib/Target/AMDGPU -I/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/lib/Target/AMDGPU -I/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/include -I/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fvisibility=hidden  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -MF lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o.d -o lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -c /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp:172:25: error: unused variable 'RI' [-Werror,-Wunused-variable]
  172 |   const SIRegisterInfo *RI = ST.getRegisterInfo();
      |                         ^~
1 error generated.
[2334/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIMemoryLegalizer.cpp.o
[2335/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIOptimizeExecMaskingPreRA.cpp.o
[2336/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIMachineScheduler.cpp.o
[2337/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIPeepholeSDWA.cpp.o
[2338/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIMachineFunctionInfo.cpp.o
[2339/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIPostRABundler.cpp.o
[2340/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIISelLowering.cpp.o
[2341/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIProgramInfo.cpp.o
[2342/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIPreEmitPeephole.cpp.o
[2343/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIShrinkInstructions.cpp.o
[2344/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIOptimizeVGPRLiveRange.cpp.o
[2345/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIPreAllocateWWMRegs.cpp.o
[2346/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIRegisterInfo.cpp.o
ninja: build stopped: subcommand failed.

How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild


@@@STEP_FAILURE@@@

@@@STEP_FAILURE@@@

@@@STEP_FAILURE@@@
Step 8 (bootstrap clang) failure: bootstrap clang (failure)
...
[2324/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerControlFlow.cpp.o
[2325/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerI1Copies.cpp.o
[2326/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIInsertWaitcnts.cpp.o
[2327/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILateBranchLowering.cpp.o
[2328/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIModeRegisterDefaults.cpp.o
[2329/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIModeRegister.cpp.o
[2330/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIOptimizeExecMasking.cpp.o
[2331/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIInstrInfo.cpp.o
[2332/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerWWMCopies.cpp.o
[2333/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o
FAILED: lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o 
CCACHE_CPP2=yes CCACHE_HASHDIR=yes CCACHE_SLOPPINESS=pch_defines,time_macros /usr/bin/ccache /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build0/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/lib/Target/AMDGPU -I/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/lib/Target/AMDGPU -I/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/include -I/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fvisibility=hidden  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -MF lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o.d -o lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -c /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp:172:25: error: unused variable 'RI' [-Werror,-Wunused-variable]
  172 |   const SIRegisterInfo *RI = ST.getRegisterInfo();
      |                         ^~
1 error generated.
[2334/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIMemoryLegalizer.cpp.o
[2335/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIOptimizeExecMaskingPreRA.cpp.o
[2336/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIMachineScheduler.cpp.o
[2337/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIPeepholeSDWA.cpp.o
[2338/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIMachineFunctionInfo.cpp.o
[2339/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIPostRABundler.cpp.o
[2340/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIISelLowering.cpp.o
[2341/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIProgramInfo.cpp.o
[2342/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIPreEmitPeephole.cpp.o
[2343/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIShrinkInstructions.cpp.o
[2344/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIOptimizeVGPRLiveRange.cpp.o
[2345/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIPreAllocateWWMRegs.cpp.o
[2346/5495] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIRegisterInfo.cpp.o
ninja: build stopped: subcommand failed.

How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild




program finished with exit code 2
elapsedTime=595.502958

@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 25, 2025

LLVM Buildbot has detected a new failure on builder clang-ppc64le-rhel running on ppc64le-clang-rhel-test while building llvm at step 5 "build-unified-tree".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/145/builds/6587

Here is the relevant piece of the build log for the reference
Step 5 (build-unified-tree) failure: build (failure)
...
245.964 [145/72/6290] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIShrinkInstructions.cpp.o
245.994 [145/71/6291] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/GCNDPPCombine.cpp.o
246.175 [145/70/6292] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/GCNRegPressure.cpp.o
246.470 [145/69/6293] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPULibCalls.cpp.o
246.484 [145/68/6294] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIFixSGPRCopies.cpp.o
246.520 [145/67/6295] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUInsertDelayAlu.cpp.o
247.341 [145/66/6296] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIOptimizeExecMaskingPreRA.cpp.o
248.582 [145/65/6297] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIAnnotateControlFlow.cpp.o
248.651 [145/64/6298] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUMCInstLower.cpp.o
249.511 [145/63/6299] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o
FAILED: lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o 
ccache /home/buildbots/llvm-external-buildbots/clang.19.1.7/bin/clang++ --gcc-toolchain=/gcc-toolchain/usr -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/lib/Target/AMDGPU -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AMDGPU -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fPIC  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -MF lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o.d -o lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -c /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp:172:25: error: unused variable 'RI' [-Werror,-Wunused-variable]
  172 |   const SIRegisterInfo *RI = ST.getRegisterInfo();
      |                         ^~
1 error generated.
249.553 [145/62/6300] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/R600TargetTransformInfo.cpp.o
250.322 [145/61/6301] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUAttributor.cpp.o
250.612 [145/60/6302] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUPreloadKernArgProlog.cpp.o
251.002 [145/59/6303] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUSubtarget.cpp.o
251.040 [145/58/6304] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUPerfHintAnalysis.cpp.o
251.065 [145/57/6305] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIFormMemoryClauses.cpp.o
251.289 [145/56/6306] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUPostLegalizerCombiner.cpp.o
251.623 [145/55/6307] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/GCNNSAReassign.cpp.o
251.844 [145/54/6308] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIMachineScheduler.cpp.o
251.972 [145/53/6309] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/R600ISelLowering.cpp.o
252.541 [145/52/6310] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/GCNSubtarget.cpp.o
252.764 [145/51/6311] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUAlwaysInlinePass.cpp.o
252.805 [145/50/6312] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUInstCombineIntrinsic.cpp.o
252.929 [145/49/6313] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/R600MCInstLower.cpp.o
252.937 [145/48/6314] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUPromoteAlloca.cpp.o
253.045 [145/47/6315] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUImageIntrinsicOptimizer.cpp.o
253.240 [145/46/6316] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/GCNPreRALongBranchReg.cpp.o
253.304 [145/45/6317] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUMIRFormatter.cpp.o
253.386 [145/44/6318] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIWholeQuadMode.cpp.o
253.913 [145/43/6319] Building CXX object lib/Target/AMDGPU/Utils/CMakeFiles/LLVMAMDGPUUtils.dir/AMDGPUBaseInfo.cpp.o
254.088 [145/42/6320] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILateBranchLowering.cpp.o
254.299 [145/41/6321] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUReserveWWMRegs.cpp.o
255.052 [145/40/6322] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerWWMCopies.cpp.o
255.115 [145/39/6323] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUHSAMetadataStreamer.cpp.o
255.150 [145/38/6324] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIPreAllocateWWMRegs.cpp.o
255.740 [145/37/6325] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUResourceUsageAnalysis.cpp.o
256.004 [145/36/6326] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUCallLowering.cpp.o
256.066 [145/35/6327] Building CXX object tools/llvm-reduce/CMakeFiles/llvm-reduce.dir/ReducerWorkItem.cpp.o
256.178 [145/34/6328] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/GCNIterativeScheduler.cpp.o
256.407 [145/33/6329] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPULateCodeGenPrepare.cpp.o
256.591 [145/32/6330] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/GCNHazardRecognizer.cpp.o
256.739 [145/31/6331] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUPreLegalizerCombiner.cpp.o
256.827 [145/30/6332] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPULowerModuleLDSPass.cpp.o

@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 25, 2025

LLVM Buildbot has detected a new failure on builder clang-ppc64-aix running on aix-ppc64 while building llvm at step 5 "build-unified-tree".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/64/builds/3202

Here is the relevant piece of the build log for the reference
Step 5 (build-unified-tree) failure: build (failure)
...
3142.676 [2014/10/3359] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILoadStoreOptimizer.cpp.o
3146.836 [2013/10/3360] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerControlFlow.cpp.o
3149.448 [2012/10/3361] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIInsertWaitcnts.cpp.o
3150.808 [2011/10/3362] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILateBranchLowering.cpp.o
3151.781 [2010/10/3363] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerI1Copies.cpp.o
3156.611 [2009/10/3364] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIInstrInfo.cpp.o
3160.223 [2008/10/3365] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIMachineScheduler.cpp.o
3160.654 [2007/10/3366] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerWWMCopies.cpp.o
3160.798 [2006/10/3367] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIModeRegister.cpp.o
3161.055 [2005/10/3368] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o
FAILED: lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o 
/home/llvm/llvm-external-buildbots/clang.19.1.2/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_LARGE_FILE_API -D_XOPEN_SOURCE=700 -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/llvm/llvm-external-buildbots/workers/aix-ppc64/clang-ppc64-aix/build/lib/Target/AMDGPU -I/home/llvm/llvm-external-buildbots/workers/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/lib/Target/AMDGPU -I/home/llvm/llvm-external-buildbots/workers/aix-ppc64/clang-ppc64-aix/build/include -I/home/llvm/llvm-external-buildbots/workers/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/include -mcmodel=large -fPIC -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -pthread -MD -MT lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -MF lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o.d -o lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SILowerSGPRSpills.cpp.o -c /home/llvm/llvm-external-buildbots/workers/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
/home/llvm/llvm-external-buildbots/workers/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp:172:25: error: unused variable 'RI' [-Werror,-Wunused-variable]
  172 |   const SIRegisterInfo *RI = ST.getRegisterInfo();
      |                         ^~
1 error generated.
3161.056 [2005/9/3369] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIModeRegisterDefaults.cpp.o
3163.151 [2005/8/3370] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIMemoryLegalizer.cpp.o
3163.765 [2005/7/3371] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIMachineFunctionInfo.cpp.o
3164.471 [2005/6/3372] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIOptimizeExecMasking.cpp.o
3169.693 [2005/5/3373] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIOptimizeExecMaskingPreRA.cpp.o
3171.316 [2005/4/3374] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIISelLowering.cpp.o
3172.465 [2005/3/3375] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIPostRABundler.cpp.o
3173.400 [2005/2/3376] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIPeepholeSDWA.cpp.o
3183.096 [2005/1/3377] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/SIOptimizeVGPRLiveRange.cpp.o
ninja: build stopped: subcommand failed.

@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 25, 2025

LLVM Buildbot has detected a new failure on builder clang-s390x-linux running on systemz-1 while building llvm at step 5 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/42/builds/4294

Here is the relevant piece of the build log for the reference
Step 5 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'libFuzzer-s390x-default-Linux :: fuzzer-timeout.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/./bin/clang    -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/lib/fuzzer  /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/TimeoutTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest # RUN: at line 1
+ /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/lib/fuzzer /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/TimeoutTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest
/home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/./bin/clang    -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/lib/fuzzer  /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/TimeoutEmptyTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutEmptyTest # RUN: at line 2
+ /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/lib/fuzzer /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/TimeoutEmptyTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutEmptyTest
not  /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 2>&1 | FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=TimeoutTest # RUN: at line 3
+ not /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1
+ FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=TimeoutTest
not  /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/hi.txt 2>&1 | FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=SingleInputTimeoutTest # RUN: at line 12
+ not /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/hi.txt
+ FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=SingleInputTimeoutTest
/home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 -timeout_exitcode=0 # RUN: at line 16
+ /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 -timeout_exitcode=0
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 152114228
INFO: Loaded 1 modules   (13 inline 8-bit counters): 13 [0x2aa22251e88, 0x2aa22251e95), 
INFO: Loaded 1 PC tables (13 PCs): 13 [0x2aa22251e98,0x2aa22251f68), 
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2	INITED cov: 2 ft: 2 corp: 1/1b exec/s: 0 rss: 31Mb
#2713	NEW    cov: 3 ft: 3 corp: 2/2b lim: 29 exec/s: 0 rss: 32Mb L: 1/1 MS: 1 ChangeByte-
#2719	NEW    cov: 4 ft: 4 corp: 3/4b lim: 29 exec/s: 0 rss: 32Mb L: 2/2 MS: 1 InsertByte-
#12220	NEW    cov: 5 ft: 5 corp: 4/14b lim: 122 exec/s: 0 rss: 33Mb L: 10/10 MS: 1 CMP- DE: "i\000\000\000\000\000\000\000"-
#12251	REDUCE cov: 5 ft: 5 corp: 4/11b lim: 122 exec/s: 0 rss: 33Mb L: 7/7 MS: 1 EraseBytes-
#12437	REDUCE cov: 5 ft: 5 corp: 4/8b lim: 122 exec/s: 0 rss: 33Mb L: 4/4 MS: 1 EraseBytes-
#13013	REDUCE cov: 5 ft: 5 corp: 4/7b lim: 122 exec/s: 0 rss: 33Mb L: 3/3 MS: 1 EraseBytes-
#13041	NEW    cov: 6 ft: 6 corp: 5/9b lim: 122 exec/s: 0 rss: 33Mb L: 2/3 MS: 3 CopyPart-CopyPart-CrossOver-
ALARM: working on the last Unit for 1 seconds
       and the timeout value is 1 (use -timeout=N to change)
MS: 1 InsertByte-; base unit: d0c024ef2b7b748a5af917ab7fd801b15d318c88
0x48,0x69,0x21,0x0,
Hi!\000
artifact_prefix='./'; Test unit written to ./timeout-86e0e6eb5c36c1ef156056dfdcebbfba5bca776a
Base64: SGkhAA==
==4148425== ERROR: libFuzzer: timeout after 1 seconds
AddressSanitizer:DEADLYSIGNAL
=================================================================
AddressSanitizer:DEADLYSIGNAL
=================================================================
AddressSanitizer: CHECK failed: asan_report.cpp:227 "((current_error_.kind)) == ((kErrorKindInvalid))" (0x1, 0x0) (tid=4148425)
    <empty stack>

MS: 1 InsertByte-; base unit: d0c024ef2b7b748a5af917ab7fd801b15d318c88
0x48,0x69,0x21,0x0,
Hi!\000
artifact_prefix='./'; Test unit written to ./crash-86e0e6eb5c36c1ef156056dfdcebbfba5bca776a
...

searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Apr 25, 2025
…lvm#137169)"

needs to be integrated into the downstream divergence.

This reverts commit 5bad5d8.
@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 25, 2025

LLVM Buildbot has detected a new failure on builder lld-x86_64-win running on as-worker-93 while building llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/2785

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM-Unit :: Support/./SupportTests.exe/90/95' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe-LLVM-Unit-24292-90-95.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=95 GTEST_SHARD_INDEX=90 C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe
--

Script:
--
C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe --gtest_filter=ProgramEnvTest.CreateProcessLongPath
--
C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(160): error: Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(163): error: fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied



C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:160
Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:163
fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied




********************


searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Apr 29, 2025
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…7169)

Add support for using the existing SCRATCH_STORE_BLOCK and
SCRATCH_LOAD_BLOCK instructions for saving and restoring callee-saved
VGPRs. This is controlled by a new subtarget feature, block-vgpr-csr. It
does not include WWM registers - those will be saved and restored
individually, just like before. This patch does not change the ABI.

Use of this feature may lead to slightly increased stack usage, because
the memory is not compacted if certain registers don't have to be
transferred (this will happen in practice for calling conventions where
the callee and caller saved registers are interleaved in groups of 8).
However, if the registers at the end of the block of 32 don't have to be
transferred, we don't need to use a whole 128-byte stack slot - we can
trim some space off the end of the range.

In order to implement this feature, we need to rely less on the
target-independent code in the PrologEpilogInserter, so we override
several new methods in SIFrameLowering. We also add new pseudos,
SI_BLOCK_SPILL_V1024_SAVE/RESTORE.

One peculiarity is that both the SI_BLOCK_V1024_RESTORE pseudo and the
SCRATCH_LOAD_BLOCK instructions will have all the registers that are not
transferred added as implicit uses. This is done in order to inform
LiveRegUnits that those registers are not available before the restore
(since we're not really restoring them - so we can't afford to scavenge
them). Unfortunately, this trick doesn't work with the save, so before
the save all the registers in the block will be unavailable (see the
unit test).

This was reverted due to failures in the builds with expensive checks
on, now fixed by always updating LiveIntervals and SlotIndexes in
SILowerSGPRSpills.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…7169)

Add support for using the existing SCRATCH_STORE_BLOCK and
SCRATCH_LOAD_BLOCK instructions for saving and restoring callee-saved
VGPRs. This is controlled by a new subtarget feature, block-vgpr-csr. It
does not include WWM registers - those will be saved and restored
individually, just like before. This patch does not change the ABI.

Use of this feature may lead to slightly increased stack usage, because
the memory is not compacted if certain registers don't have to be
transferred (this will happen in practice for calling conventions where
the callee and caller saved registers are interleaved in groups of 8).
However, if the registers at the end of the block of 32 don't have to be
transferred, we don't need to use a whole 128-byte stack slot - we can
trim some space off the end of the range.

In order to implement this feature, we need to rely less on the
target-independent code in the PrologEpilogInserter, so we override
several new methods in SIFrameLowering. We also add new pseudos,
SI_BLOCK_SPILL_V1024_SAVE/RESTORE.

One peculiarity is that both the SI_BLOCK_V1024_RESTORE pseudo and the
SCRATCH_LOAD_BLOCK instructions will have all the registers that are not
transferred added as implicit uses. This is done in order to inform
LiveRegUnits that those registers are not available before the restore
(since we're not really restoring them - so we can't afford to scavenge
them). Unfortunately, this trick doesn't work with the save, so before
the save all the registers in the block will be unavailable (see the
unit test).

This was reverted due to failures in the builds with expensive checks
on, now fixed by always updating LiveIntervals and SlotIndexes in
SILowerSGPRSpills.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…7169)

Add support for using the existing SCRATCH_STORE_BLOCK and
SCRATCH_LOAD_BLOCK instructions for saving and restoring callee-saved
VGPRs. This is controlled by a new subtarget feature, block-vgpr-csr. It
does not include WWM registers - those will be saved and restored
individually, just like before. This patch does not change the ABI.

Use of this feature may lead to slightly increased stack usage, because
the memory is not compacted if certain registers don't have to be
transferred (this will happen in practice for calling conventions where
the callee and caller saved registers are interleaved in groups of 8).
However, if the registers at the end of the block of 32 don't have to be
transferred, we don't need to use a whole 128-byte stack slot - we can
trim some space off the end of the range.

In order to implement this feature, we need to rely less on the
target-independent code in the PrologEpilogInserter, so we override
several new methods in SIFrameLowering. We also add new pseudos,
SI_BLOCK_SPILL_V1024_SAVE/RESTORE.

One peculiarity is that both the SI_BLOCK_V1024_RESTORE pseudo and the
SCRATCH_LOAD_BLOCK instructions will have all the registers that are not
transferred added as implicit uses. This is done in order to inform
LiveRegUnits that those registers are not available before the restore
(since we're not really restoring them - so we can't afford to scavenge
them). Unfortunately, this trick doesn't work with the save, so before
the save all the registers in the block will be unavailable (see the
unit test).

This was reverted due to failures in the builds with expensive checks
on, now fixed by always updating LiveIntervals and SlotIndexes in
SILowerSGPRSpills.
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025
…7169)

Add support for using the existing SCRATCH_STORE_BLOCK and
SCRATCH_LOAD_BLOCK instructions for saving and restoring callee-saved
VGPRs. This is controlled by a new subtarget feature, block-vgpr-csr. It
does not include WWM registers - those will be saved and restored
individually, just like before. This patch does not change the ABI.

Use of this feature may lead to slightly increased stack usage, because
the memory is not compacted if certain registers don't have to be
transferred (this will happen in practice for calling conventions where
the callee and caller saved registers are interleaved in groups of 8).
However, if the registers at the end of the block of 32 don't have to be
transferred, we don't need to use a whole 128-byte stack slot - we can
trim some space off the end of the range.

In order to implement this feature, we need to rely less on the
target-independent code in the PrologEpilogInserter, so we override
several new methods in SIFrameLowering. We also add new pseudos,
SI_BLOCK_SPILL_V1024_SAVE/RESTORE.

One peculiarity is that both the SI_BLOCK_V1024_RESTORE pseudo and the
SCRATCH_LOAD_BLOCK instructions will have all the registers that are not
transferred added as implicit uses. This is done in order to inform
LiveRegUnits that those registers are not available before the restore
(since we're not really restoring them - so we can't afford to scavenge
them). Unfortunately, this trick doesn't work with the save, so before
the save all the registers in the block will be unavailable (see the
unit test).

This was reverted due to failures in the builds with expensive checks
on, now fixed by always updating LiveIntervals and SlotIndexes in
SILowerSGPRSpills.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants