[ctx_prof] Flattened profile lowering pass #107329

Merged

mtrofin (Member) commented Sep 4, 2024

Pass to flatten the contextual profile and lower it to profile (i.e. MD_prof) metadata. It is expected to run after all IPO transformations have happened.

Prior to lowering, the instrumentation is maintained during IPO and the contextual profile is kept in sync (see PRs #105469, #106154). Flattening (#104539) sums, index by index, the counters belonging to all of a function's context nodes.

We first propagate counter values (from the flattened profile) using the same propagation algorithm as PGOUseFunc::populateCounters, then map the edge values to branch_weights. Functions in the module that don't have an entry in the flattened profile are deemed cold, and any MD_prof metadata they may have is reset. The profile summary is also reset at this point.

Issue #89287
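
As a rough illustration of the flattening step described above, here is a minimal, self-contained C++ sketch. It is not the code from #104539: `CtxNode`, `FlatProfile`, `flattenInto` and `flatten` are hypothetical names, and the real contextual profile is organized per callsite rather than as a plain list of callees. The sketch only shows the idea of summing counters at the same index across all contexts of the same function (identified by GUID).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for a contextual profile node: a function GUID, its
// counters, and the callee contexts reached from it.
struct CtxNode {
  uint64_t Guid;
  std::vector<uint64_t> Counters;
  std::vector<CtxNode> Callees;
};

// Flattened result: one counter vector per function GUID.
using FlatProfile = std::map<uint64_t, std::vector<uint64_t>>;

// Add Node's counters, index by index, into the entry for its GUID, then
// recurse into the callee contexts.
void flattenInto(const CtxNode &Node, FlatProfile &Flat) {
  std::vector<uint64_t> &Sum = Flat[Node.Guid];
  if (Sum.size() < Node.Counters.size())
    Sum.resize(Node.Counters.size(), 0);
  for (size_t I = 0; I < Node.Counters.size(); ++I)
    Sum[I] += Node.Counters[I];
  for (const CtxNode &Callee : Node.Callees)
    flattenInto(Callee, Flat);
}

// Flatten every context tree in Roots into a single per-function profile.
FlatProfile flatten(const std::vector<CtxNode> &Roots) {
  FlatProfile Flat;
  for (const CtxNode &Root : Roots)
    flattenInto(Root, Flat);
  return Flat;
}
```

Under these assumptions, a root with counters [100, 40] that reaches the same callee in two contexts, with counters [40, 10] and [5, 1], would yield a flattened callee entry of [45, 11], while the root's entry stays [100, 40].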

mtrofin force-pushed the users/mtrofin/09-03-_ctx_prof_flattened_profile_lowering_pass branch 2 times, most recently from 263d6e7 to 1e4742a on September 5, 2024 04:57
mtrofin marked this pull request as ready for review on September 5, 2024 05:00
mtrofin force-pushed the users/mtrofin/09-03-_ctx_prof_flattened_profile_lowering_pass branch from 1e4742a to 657e4fd on September 5, 2024 05:01
llvmbot added the PGO (Profile Guided Optimizations), llvm:analysis, and llvm:transforms labels on Sep 5, 2024
llvmbot (Member) commented Sep 5, 2024

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-pgo

@llvm/pr-subscribers-llvm-analysis

Author: Mircea Trofin (mtrofin)

Full diff: https://github.com/llvm/llvm-project/pull/107329.diff

7 Files Affected:

  • (modified) llvm/include/llvm/ProfileData/ProfileCommon.h (+3-3)
  • (added) llvm/include/llvm/Transforms/Instrumentation/PGOCtxProfFlattening.h (+24)
  • (modified) llvm/lib/Passes/PassBuilder.cpp (+1)
  • (modified) llvm/lib/Passes/PassRegistry.def (+1)
  • (modified) llvm/lib/Transforms/Instrumentation/CMakeLists.txt (+1)
  • (added) llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp (+324)
  • (added) llvm/test/Analysis/CtxProfAnalysis/flatten-and-annotate.ll (+88)
diff --git a/llvm/include/llvm/ProfileData/ProfileCommon.h b/llvm/include/llvm/ProfileData/ProfileCommon.h
index eaab59484c947a..edd8e1f644ad12 100644
--- a/llvm/include/llvm/ProfileData/ProfileCommon.h
+++ b/llvm/include/llvm/ProfileData/ProfileCommon.h
@@ -79,13 +79,13 @@ class ProfileSummaryBuilder {
 class InstrProfSummaryBuilder final : public ProfileSummaryBuilder {
   uint64_t MaxInternalBlockCount = 0;
 
-  inline void addEntryCount(uint64_t Count);
-  inline void addInternalCount(uint64_t Count);
-
 public:
   InstrProfSummaryBuilder(std::vector<uint32_t> Cutoffs)
       : ProfileSummaryBuilder(std::move(Cutoffs)) {}
 
+  void addEntryCount(uint64_t Count);
+  void addInternalCount(uint64_t Count);
+
   void addRecord(const InstrProfRecord &);
   std::unique_ptr<ProfileSummary> getSummary();
 };
diff --git a/llvm/include/llvm/Transforms/Instrumentation/PGOCtxProfFlattening.h b/llvm/include/llvm/Transforms/Instrumentation/PGOCtxProfFlattening.h
new file mode 100644
index 00000000000000..56876740264379
--- /dev/null
+++ b/llvm/include/llvm/Transforms/Instrumentation/PGOCtxProfFlattening.h
@@ -0,0 +1,24 @@
+//===-- PGOCtxProfFlattening.h - Contextual Instr. Flattening ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file declares the PGOCtxProfFlattening class.
+//
+//===----------------------------------------------------------------------===//
+#ifndef LLVM_TRANSFORMS_INSTRUMENTATION_PGOCTXPROFFLATTENING_H
+#define LLVM_TRANSFORMS_INSTRUMENTATION_PGOCTXPROFFLATTENING_H
+
+#include "llvm/IR/PassManager.h"
+namespace llvm {
+
+class PGOCtxProfFlattening : public PassInfoMixin<PGOCtxProfFlattening> {
+public:
+  explicit PGOCtxProfFlattening() = default;
+  PreservedAnalyses run(Module &M, ModuleAnalysisManager &MAM);
+};
+} // namespace llvm
+#endif
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index 1df1449fce597c..a8827d1d909f6f 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -197,6 +197,7 @@
 #include "llvm/Transforms/Instrumentation/MemProfiler.h"
 #include "llvm/Transforms/Instrumentation/MemorySanitizer.h"
 #include "llvm/Transforms/Instrumentation/NumericalStabilitySanitizer.h"
+#include "llvm/Transforms/Instrumentation/PGOCtxProfFlattening.h"
 #include "llvm/Transforms/Instrumentation/PGOCtxProfLowering.h"
 #include "llvm/Transforms/Instrumentation/PGOForceFunctionAttrs.h"
 #include "llvm/Transforms/Instrumentation/PGOInstrumentation.h"
diff --git a/llvm/lib/Passes/PassRegistry.def b/llvm/lib/Passes/PassRegistry.def
index d6067089c6b5c1..923a799cda53e7 100644
--- a/llvm/lib/Passes/PassRegistry.def
+++ b/llvm/lib/Passes/PassRegistry.def
@@ -58,6 +58,7 @@ MODULE_PASS("coro-early", CoroEarlyPass())
 MODULE_PASS("cross-dso-cfi", CrossDSOCFIPass())
 MODULE_PASS("ctx-instr-gen",
             PGOInstrumentationGen(PGOInstrumentationType::CTXPROF))
+MODULE_PASS("ctx-prof-flatten", PGOCtxProfFlattening())
 MODULE_PASS("deadargelim", DeadArgumentEliminationPass())
 MODULE_PASS("debugify", NewPMDebugifyPass())
 MODULE_PASS("dfsan", DataFlowSanitizerPass())
diff --git a/llvm/lib/Transforms/Instrumentation/CMakeLists.txt b/llvm/lib/Transforms/Instrumentation/CMakeLists.txt
index deab37801ff1df..d45b07447d09da 100644
--- a/llvm/lib/Transforms/Instrumentation/CMakeLists.txt
+++ b/llvm/lib/Transforms/Instrumentation/CMakeLists.txt
@@ -15,6 +15,7 @@ add_llvm_component_library(LLVMInstrumentation
   InstrProfiling.cpp
   KCFI.cpp
   LowerAllowCheckPass.cpp
+  PGOCtxProfFlattening.cpp
   PGOCtxProfLowering.cpp
   PGOForceFunctionAttrs.cpp
   PGOInstrumentation.cpp
diff --git a/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp b/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp
new file mode 100644
index 00000000000000..5cc3af56ef23ff
--- /dev/null
+++ b/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp
@@ -0,0 +1,324 @@
+//===- PGOCtxProfFlattening.cpp - Contextual Instr. Flattening ------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Flattens the contextual profile and lowers it to MD_prof.
+// This should happen after all IPO (which is assumed to have maintained the
+// contextual profile) happened. Flattening consists of summing the values at
+// the same index of the counters belonging to all the contexts of a function.
+// The lowering consists of materializing the counter values to function
+// entrypoint counts and branch probabilities.
+//
+// This pass also removes contextual instrumentation, which has been kept around
+// to facilitate its functionality.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/Instrumentation/PGOCtxProfFlattening.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/Analysis/CtxProfAnalysis.h"
+#include "llvm/Analysis/OptimizationRemarkEmitter.h"
+#include "llvm/Analysis/ProfileSummaryInfo.h"
+#include "llvm/CodeGen/MachineBasicBlock.h"
+#include "llvm/IR/Analysis.h"
+#include "llvm/IR/CFG.h"
+#include "llvm/IR/Dominators.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/PassManager.h"
+#include "llvm/IR/ProfileSummary.h"
+#include "llvm/ProfileData/ProfileCommon.h"
+#include "llvm/Transforms/Instrumentation/PGOInstrumentation.h"
+#include "llvm/Transforms/Scalar/DCE.h"
+#include "llvm/Transforms/Utils/BasicBlockUtils.h"
+
+using namespace llvm;
+
+namespace {
+
+class ProfileAnnotator final {
+  class BBInfo;
+  struct EdgeInfo {
+    BBInfo *const Src;
+    BBInfo *const Dest;
+    std::optional<uint64_t> Count;
+
+    explicit EdgeInfo(BBInfo &Src, BBInfo &Dest) : Src(&Src), Dest(&Dest) {}
+  };
+
+  class BBInfo {
+    std::optional<uint64_t> Count;
+    SmallVector<EdgeInfo *> OutEdges;
+    SmallVector<EdgeInfo *> InEdges;
+    size_t UnknownCountOutEdges = 0;
+    size_t UnknownCountInEdges = 0;
+
+    uint64_t getEdgeSum(const SmallVector<EdgeInfo *> &Edges,
+                        bool AssumeAllKnown) const {
+      uint64_t Sum = 0;
+      for (const auto *E : Edges)
+        if (E)
+          Sum += AssumeAllKnown ? *E->Count : E->Count.value_or(0U);
+      return Sum;
+    }
+
+    void takeCountFrom(const SmallVector<EdgeInfo *> &Edges) {
+      assert(!Count.has_value());
+      Count = getEdgeSum(Edges, true);
+    }
+
+    void setSingleUnknownEdgeCount(SmallVector<EdgeInfo *> &Edges) {
+      uint64_t KnownSum = getEdgeSum(Edges, false);
+      uint64_t EdgeVal = *Count > KnownSum ? *Count - KnownSum : 0U;
+      EdgeInfo *E = nullptr;
+      for (auto *I : Edges)
+        if (I && !I->Count.has_value()) {
+          E = I;
+#ifdef NDEBUG
+          break;
+#else
+          assert((!E || E == I) &&
+                 "Expected exactly one edge to have an unknown count, "
+                 "found a second one");
+          continue;
+#endif
+        }
+      assert(E && "Expected exactly one edge to have an unknown count");
+      assert(!E->Count.has_value());
+      E->Count = EdgeVal;
+      assert(E->Src->UnknownCountOutEdges > 0);
+      assert(E->Dest->UnknownCountInEdges > 0);
+      --E->Src->UnknownCountOutEdges;
+      --E->Dest->UnknownCountInEdges;
+    }
+
+  public:
+    BBInfo(size_t NumInEdges, size_t NumOutEdges, std::optional<uint64_t> Count)
+        : Count(Count) {
+      InEdges.reserve(NumInEdges);
+      OutEdges.resize(NumOutEdges);
+    }
+
+    bool tryTakeCountFromKnownOutEdges(const BasicBlock &BB) {
+      if (!succ_empty(&BB) && !UnknownCountOutEdges) {
+        takeCountFrom(OutEdges);
+        return true;
+      }
+      return false;
+    }
+
+    bool tryTakeCountFromKnownInEdges(const BasicBlock &BB) {
+      if (!BB.isEntryBlock() && !UnknownCountInEdges) {
+        takeCountFrom(InEdges);
+        return true;
+      }
+      return false;
+    }
+
+    void addInEdge(EdgeInfo *Info) {
+      InEdges.push_back(Info);
+      ++UnknownCountInEdges;
+    }
+
+    void addOutEdge(size_t Index, EdgeInfo *Info) {
+      OutEdges[Index] = Info;
+      ++UnknownCountOutEdges;
+    }
+
+    bool hasCount() const { return Count.has_value(); }
+
+    bool trySetSingleUnknownInEdgeCount() {
+      if (UnknownCountInEdges == 1) {
+        setSingleUnknownEdgeCount(InEdges);
+        return true;
+      }
+      return false;
+    }
+
+    bool trySetSingleUnknownOutEdgeCount() {
+      if (UnknownCountOutEdges == 1) {
+        setSingleUnknownEdgeCount(OutEdges);
+        return true;
+      }
+      return false;
+    }
+    size_t getNumOutEdges() const { return OutEdges.size(); }
+
+    uint64_t getEdgeCount(size_t Index) const {
+      if (auto *E = OutEdges[Index])
+        return *E->Count;
+      return 0U;
+    }
+  };
+
+  Function &F;
+  const SmallVectorImpl<uint64_t> &Counters;
+  // To be accessed through getBBInfo() after construction.
+  std::map<const BasicBlock *, BBInfo> BBInfos;
+  std::vector<EdgeInfo> EdgeInfos;
+  InstrProfSummaryBuilder &PB;
+
+  // This is an adaptation of PGOUseFunc::populateCounters.
+  // FIXME(mtrofin): look into factoring the code to share one implementation.
+  void propagateCounterValues(const SmallVectorImpl<uint64_t> &Counters) {
+    bool KeepGoing = true;
+    while (KeepGoing) {
+      KeepGoing = false;
+      for (const auto &BB : reverse(F)) {
+        auto &Info = getBBInfo(BB);
+        if (!Info.hasCount())
+          KeepGoing |= Info.tryTakeCountFromKnownOutEdges(BB) ||
+                       Info.tryTakeCountFromKnownInEdges(BB);
+        if (Info.hasCount()) {
+          KeepGoing |= Info.trySetSingleUnknownOutEdgeCount();
+          KeepGoing |= Info.trySetSingleUnknownInEdgeCount();
+        }
+      }
+    }
+  }
+  // The only criteria for exclusion is faux suspend -> exit edges in presplit
+  // coroutines. The API serves for readability, currently.
+  bool shouldExcludeEdge(const BasicBlock &Src, const BasicBlock &Dest) const {
+    return llvm::isPresplitCoroSuspendExitEdge(Src, Dest);
+  }
+
+  BBInfo &getBBInfo(const BasicBlock &BB) { return BBInfos.find(&BB)->second; }
+
+public:
+  ProfileAnnotator(Function &F, const SmallVectorImpl<uint64_t> &Counters,
+                   InstrProfSummaryBuilder &PB)
+      : F(F), Counters(Counters), PB(PB) {
+    assert(!F.isDeclaration());
+    assert(!Counters.empty());
+    size_t NrEdges = 0;
+    for (const auto &BB : F) {
+      std::optional<uint64_t> Count;
+      if (auto *Ins = CtxProfAnalysis::getBBInstrumentation(
+              const_cast<BasicBlock &>(BB)))
+        Count = Counters[Ins->getIndex()->getZExtValue()];
+      auto [It, Ins] =
+          BBInfos.insert({&BB, {pred_size(&BB), succ_size(&BB), Count}});
+      (void)Ins;
+      assert(Ins);
+      NrEdges += llvm::count_if(successors(&BB), [&](const auto *Succ) {
+        return !shouldExcludeEdge(BB, *Succ);
+      });
+    }
+    // Pre-allocate the vector, we want references to its contents to be stable.
+    EdgeInfos.reserve(NrEdges);
+    for (const auto &BB : F) {
+      auto &Info = getBBInfo(BB);
+      for (auto I = 0U; I < BB.getTerminator()->getNumSuccessors(); ++I) {
+        const auto *Succ = BB.getTerminator()->getSuccessor(I);
+        if (!shouldExcludeEdge(BB, *Succ)) {
+          assert(EdgeInfos.size() < NrEdges);
+          auto &EI = EdgeInfos.emplace_back(getBBInfo(BB), getBBInfo(*Succ));
+          Info.addOutEdge(I, &EI);
+          getBBInfo(*Succ).addInEdge(&EI);
+        }
+      }
+    }
+  }
+
+  /// Assign branch weights and function entry count. Also update the PSI
+  /// builder.
+  void assignProfileData() {
+    assert(!Counters.empty());
+    propagateCounterValues(Counters);
+    F.setEntryCount(Counters[0]);
+    PB.addEntryCount(Counters[0]);
+
+    for (auto &BB : F) {
+      if (succ_size(&BB) < 2)
+        continue;
+      auto *Term = BB.getTerminator();
+      SmallVector<uint64_t, 2> EdgeCounts(Term->getNumSuccessors(), 0);
+      uint64_t MaxCount = 0;
+      const auto &BBInfo = getBBInfo(BB);
+      for (unsigned SuccIdx = 0, Size = BBInfo.getNumOutEdges(); SuccIdx < Size;
+           ++SuccIdx) {
+        uint64_t EdgeCount = BBInfo.getEdgeCount(SuccIdx);
+        if (EdgeCount > MaxCount)
+          MaxCount = EdgeCount;
+        EdgeCounts[SuccIdx] = EdgeCount;
+        PB.addInternalCount(EdgeCount);
+      }
+
+      if (MaxCount == 0)
+        F.getContext().emitError(
+            "[ctx-prof] Encountered a BB with more than one successor, where "
+            "all outgoing edges have a 0 count. This occurs in non-exiting "
+            "functions (message pumps, usually) which are not supported in the "
+            "contextual profiling case");
+      setProfMetadata(F.getParent(), Term, EdgeCounts, MaxCount);
+    }
+  }
+};
+
+bool areAllBBsReachable(const Function &F, FunctionAnalysisManager &FAM) {
+  auto &DT = FAM.getResult<DominatorTreeAnalysis>(const_cast<Function &>(F));
+  for (const auto &BB : F)
+    if (!DT.isReachableFromEntry(&BB))
+      return false;
+  return true;
+}
+
+void clearColdFunctionProfile(Function &F) {
+  for (auto &BB : F)
+    BB.getTerminator()->setMetadata(LLVMContext::MD_prof, nullptr);
+  F.setEntryCount(0U);
+}
+
+void removeInstrumentation(Function &F) {
+  for (auto &BB : F)
+    for (auto &I : llvm::make_early_inc_range(BB))
+      if (isa<InstrProfCntrInstBase>(I))
+        I.eraseFromParent();
+}
+
+} // namespace
+
+PreservedAnalyses PGOCtxProfFlattening::run(Module &M,
+                                            ModuleAnalysisManager &MAM) {
+  auto &CtxProf = MAM.getResult<CtxProfAnalysis>(M);
+  if (!CtxProf)
+    return PreservedAnalyses::all();
+
+  const auto FlattenedProfile = CtxProf.flatten();
+
+  InstrProfSummaryBuilder PB(ProfileSummaryBuilder::DefaultCutoffs);
+  for (auto &F : M) {
+    if (F.isDeclaration())
+      continue;
+
+    if (!areAllBBsReachable(F,
+                            MAM.getResult<FunctionAnalysisManagerModuleProxy>(M)
+                                .getManager())) {
+      M.getContext().emitError(
+          "[ctx-prof] Function has unreacheable basic blocks: " + F.getName());
+      continue;
+    }
+
+    const auto &FlatProfile =
+        FlattenedProfile.lookup(AssignGUIDPass::getGUID(F));
+    // If this function didn't appear in the contextual profile, it's cold.
+    if (FlatProfile.empty())
+      clearColdFunctionProfile(F);
+    else {
+      ProfileAnnotator S(F, FlatProfile, PB);
+      S.assignProfileData();
+    }
+    removeInstrumentation(F);
+  }
+
+  auto &PSI = MAM.getResult<ProfileSummaryAnalysis>(M);
+
+  M.setProfileSummary(PB.getSummary()->getMD(M.getContext()),
+                      ProfileSummary::Kind::PSK_Instr);
+  PSI.refresh();
+  return PreservedAnalyses::none();
+}
\ No newline at end of file
diff --git a/llvm/test/Analysis/CtxProfAnalysis/flatten-and-annotate.ll b/llvm/test/Analysis/CtxProfAnalysis/flatten-and-annotate.ll
new file mode 100644
index 00000000000000..7928cc96552630
--- /dev/null
+++ b/llvm/test/Analysis/CtxProfAnalysis/flatten-and-annotate.ll
@@ -0,0 +1,88 @@
+; REQUIRES: x86_64-linux
+;
+; RUN: rm -rf %t
+; RUN: split-file %s %t
+; RUN: llvm-ctxprof-util fromJSON --input=%t/profile.json --output=%t/profile.ctxprofdata
+; RUN: opt -module-summary -passes='thinlto-pre-link<O2>' -use-ctx-profile=%t/profile.ctxprofdata \
+; RUN:   %t/example.ll -S -o %t/prelink.ll
+; RUN: FileCheck --input-file %t/prelink.ll %s --check-prefix=PRELINK
+; RUN: opt -passes='ctx-prof-flatten' -use-ctx-profile=%t/profile.ctxprofdata %t/prelink.ll -S  | FileCheck %s
+;
+;
+; Check that instrumentation occurs where expected: the "no" block for foo, and
+; the "yes" block for an_entrypoint - which explains the subsequent branch weights
+;
+; PRELINK-LABEL: @foo
+; PRELINK-LABEL: no:
+; PRELINK:         call void @llvm.instrprof.increment(ptr @foo, i64 [[#]], i32 2, i32 1)
+
+; PRELINK-LABEL: @an_entrypoint
+; PRELINK-LABEL: yes:
+; PRELINK:         call void @llvm.instrprof.increment(ptr @an_entrypoint, i64 [[#]], i32 2, i32 1)
+; PRELINK-NOT: "ProfileSummary"
+
+; Check that the output has:
+;  - no instrumentation
+;  - the 2 functions have an entry count
+;  - each conditional branch has profile annotation
+;
+; CHECK-NOT:   call void @llvm.instrprof
+;
+; make sure we have function entry counts, branch weights, and a profile summary.
+; CHECK-LABEL: @foo
+; CHECK-SAME:    !prof !29
+; CHECK:          br i1 %t, label %yes, label %no, !prof !31
+; CHECK-LABEL: @an_entrypoint
+; CHECK-SAME:    !prof !32
+; CHECK:          br i1 %t, label %yes, label %common.ret, !prof !34
+; CHECK:       !0 = !{i32 1, !"ProfileSummary", !1}
+; CHECK:       !29 = !{!"function_entry_count", i64 40} 
+; CHECK:       !31 = !{!"branch_weights", i32 30, i32 10} 
+; CHECK:       !32 = !{!"function_entry_count", i64 100}
+; CHECK:       !34 = !{!"branch_weights", i32 40, i32 60} 
+
+;--- profile.json
+[
+  {
+    "Guid": 4909520559318251808,
+    "Counters": [100, 40],
+    "Callsites": [
+      [
+        {
+          "Guid": 11872291593386833696,
+          "Counters": [ 40, 10 ]
+        }
+      ]
+    ]
+  }
+]
+;--- example.ll
+declare void @bar()
+
+define void @foo(i32 %a, ptr %fct) #0 !guid !0 {
+  %t = icmp sgt i32 %a, 7
+  br i1 %t, label %yes, label %no
+yes:
+  call void %fct(i32 %a)
+  br label %exit
+no:
+  call void @bar()
+  br label %exit
+exit:
+  ret void
+}
+
+define void @an_entrypoint(i32 %a) !guid !1 {
+  %t = icmp sgt i32 %a, 0
+  br i1 %t, label %yes, label %no
+
+yes:
+  call void @foo(i32 1, ptr null)
+  ret void
+no:
+  ret void
+}
+
+attributes #0 = { noinline }
+!0 = !{ i64 11872291593386833696 }
+!1 = !{i64 4909520559318251808}
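
The propagation and lowering that this test exercises can be sketched outside of LLVM. The following is a simplified, standalone C++ model, not the pass's implementation: `Cfg`, `Edge` and `propagate` are hypothetical names, and the block numbering used below for `foo` (0 = entry, 1 = `yes`, 2 = `no`, 3 = `exit`) is an assumption made for illustration. It applies the two rules visible in `ProfileAnnotator`: a block takes the sum of its in-edges or out-edges once they are all known, and a block with a known count assigns the remainder to its single unknown edge.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <optional>
#include <vector>

// Hypothetical, simplified CFG model: each edge joins two block indices.
struct Edge {
  size_t Src, Dest;
  std::optional<uint64_t> Count;
};

struct Cfg {
  std::vector<std::optional<uint64_t>> BlockCounts; // index 0 is the entry
  std::vector<Edge> Edges;
};

// Repeat until nothing changes:
//  1. a block whose in-edges (or out-edges) are all known takes their sum;
//  2. a block with a known count and exactly one unknown in-edge (or
//     out-edge) gives that edge the remainder, clamped at 0.
void propagate(Cfg &G) {
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (size_t B = 0; B < G.BlockCounts.size(); ++B) {
      for (bool Out : {true, false}) {
        uint64_t KnownSum = 0;
        size_t NumEdges = 0, NumUnknown = 0;
        Edge *Unknown = nullptr;
        for (Edge &E : G.Edges) {
          if ((Out ? E.Src : E.Dest) != B)
            continue;
          ++NumEdges;
          if (E.Count) {
            KnownSum += *E.Count;
          } else {
            ++NumUnknown;
            Unknown = &E;
          }
        }
        std::optional<uint64_t> &Count = G.BlockCounts[B];
        if (!Count && NumEdges > 0 && NumUnknown == 0) {
          Count = KnownSum;
          Changed = true;
        } else if (Count && NumUnknown == 1) {
          Unknown->Count = *Count > KnownSum ? *Count - KnownSum : 0;
          Changed = true;
        }
      }
    }
  }
}
```

Modelling `foo` from example.ll with the flattened counters [40, 10], that is, seeding block 0 with 40 and block 2 with 10 and listing the edges 0->1, 0->2, 1->3, 2->3, the fixed point assigns 30 to the entry->yes edge and 10 to the entry->no edge. That lines up with the `branch_weights` 30, 10 the test expects for `foo`.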

mtrofin force-pushed the users/mtrofin/09-03-_ctx_prof_flattened_profile_lowering_pass branch 2 times, most recently from 044c0ce to 3e8de96 on September 5, 2024 14:40
mtrofin force-pushed the users/mtrofin/09-03-_ctx_prof_flattened_profile_lowering_pass branch 4 times, most recently from 2b44af1 to 59e6bd7 on September 5, 2024 19:50

github-actions bot commented Sep 5, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

mtrofin force-pushed the users/mtrofin/09-03-_ctx_prof_flattened_profile_lowering_pass branch from 59e6bd7 to 083e94b on September 5, 2024 19:57
In llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp:

}

void clearColdFunctionProfile(Function &F) {
  for (auto &BB : F)
Contributor

Is this step needed? Setting the entry point count seems enough.

Member Author

For consistency.

Contributor

Assuming most of the functions are cold, this will just waste compile time, though.

Member Author

This only runs in the module(s) that the profile is applicable to, though. So the expectation is that most functions participate in the profile (because ThinLTO put them there).

I'd prefer keeping things consistent here and revisiting if compile time due to this does turn out to be an issue. The value of consistency (here) is that one doesn't need to know to check the function entry count first to make sense of the branch weights; also, by eliding any weights, should there be any, we help later BFI updates (OK, unless those also discard any entrycount == 0 functions).

mtrofin force-pushed the users/mtrofin/09-03-_ctx_prof_flattened_profile_lowering_pass branch from 083e94b to 1d1d9e7 on September 5, 2024 20:40
mtrofin changed the base branch from main to users/mtrofin/09-05-_ctx_prof_handle_case_when_no_root_is_in_this_module on September 5, 2024 20:40
mtrofin force-pushed the users/mtrofin/09-03-_ctx_prof_flattened_profile_lowering_pass branch from 1d1d9e7 to 856568c on September 5, 2024 21:04
In llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp:

                        bool AssumeAllKnown) const {
      uint64_t Sum = 0;
      for (const auto *E : Edges)
        if (E)
Contributor

Why can E be null?

Member Author

We place edges at their position in the terminator's successor list, i.e. the position at which that edge's count must appear in the branch_weights vector. Some edges (currently, the coro suspend ones) are not real. PGO marks those as removed during MST buildup, but we don't do that here, so we use nullptr instead.

Member Author

To be clear, this applies to the out-edges; for the in-edges we don't care about position.
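
The positional layout discussed in this thread can be illustrated with a small standalone C++ sketch. It is only a model under stated assumptions: `EdgeRecord` and `weightsBySuccessorIndex` are hypothetical names, not part of the patch. The point it shows is that keeping a null slot for an excluded edge preserves the one-to-one mapping between out-edge index and successor index, so the resulting weights line up with the terminator's successors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the pass's per-edge record.
struct EdgeRecord {
  uint64_t Count;
};

// Out-edges are stored at the successor's position in the terminator, so a
// slot may be null when that successor edge was excluded. Reading the slots
// in order yields one weight per successor, with 0 for excluded edges.
std::vector<uint64_t>
weightsBySuccessorIndex(const std::vector<const EdgeRecord *> &OutEdges) {
  std::vector<uint64_t> Weights;
  Weights.reserve(OutEdges.size());
  for (const EdgeRecord *E : OutEdges)
    Weights.push_back(E ? E->Count : 0);
  return Weights;
}
```

If the excluded edges were simply dropped from the list instead, every later edge would shift down by one slot and its count would be attributed to the wrong successor.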

mtrofin force-pushed the users/mtrofin/09-03-_ctx_prof_flattened_profile_lowering_pass branch 2 times, most recently from f064fd3 to 40f0581 on September 5, 2024 21:31
mtrofin force-pushed the users/mtrofin/09-03-_ctx_prof_flattened_profile_lowering_pass branch from 40f0581 to 0c3c448 on September 5, 2024 21:39
mtrofin force-pushed the users/mtrofin/09-05-_ctx_prof_handle_case_when_no_root_is_in_this_module branch from 53b11dd to 1d3df1b on September 6, 2024 17:23
mtrofin force-pushed the users/mtrofin/09-03-_ctx_prof_flattened_profile_lowering_pass branch from 0c3c448 to 22e94e4 on September 6, 2024 18:21
mtrofin (Member Author) commented Sep 6, 2024

Merge activity

  • Sep 6, 4:40 PM EDT: @mtrofin started a stack merge that includes this pull request via Graphite.
  • Sep 6, 4:45 PM EDT: Graphite rebased this pull request as part of a merge.
  • Sep 6, 4:47 PM EDT: @mtrofin merged this pull request with Graphite.

mtrofin force-pushed the users/mtrofin/09-05-_ctx_prof_handle_case_when_no_root_is_in_this_module branch from 1d3df1b to cdd6c12 on September 6, 2024 20:42
Base automatically changed from users/mtrofin/09-05-_ctx_prof_handle_case_when_no_root_is_in_this_module to main on September 6, 2024 20:44
mtrofin force-pushed the users/mtrofin/09-03-_ctx_prof_flattened_profile_lowering_pass branch from 22e94e4 to d75a5da on September 6, 2024 20:45
mtrofin merged commit 775c507 into main on Sep 6, 2024
5 of 7 checks passed
mtrofin deleted the users/mtrofin/09-03-_ctx_prof_flattened_profile_lowering_pass branch on September 6, 2024 20:47
llvm-ci (Collaborator) commented Sep 6, 2024

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime running on omp-vega20-0 while building llvm at step 7 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/30/builds/5651

Here is the relevant piece of the build log for reference:
Step 7 (Add check check-offload) failure: test (failure)
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: sanitizer/kernel_crash_async.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# note: command had no output on stdout or stderr
# RUN: at line 3
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash env -u LLVM_DISABLE_SYMBOLIZATION OFFLOAD_TRACK_NUM_KERNEL_LAUNCH_TRACES=1 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp 2>&1 | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=TRACE
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash env -u LLVM_DISABLE_SYMBOLIZATION OFFLOAD_TRACK_NUM_KERNEL_LAUNCH_TRACES=1 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=TRACE
# note: command had no output on stdout or stderr
# RUN: at line 4
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp 2>&1 | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=CHECK
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=CHECK
# note: command had no output on stdout or stderr
# RUN: at line 5
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -g
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -g
# note: command had no output on stdout or stderr
# RUN: at line 6
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash env -u LLVM_DISABLE_SYMBOLIZATION OFFLOAD_TRACK_NUM_KERNEL_LAUNCH_TRACES=1 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp 2>&1 | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=TRACE
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash env -u LLVM_DISABLE_SYMBOLIZATION OFFLOAD_TRACK_NUM_KERNEL_LAUNCH_TRACES=1 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=TRACE
# note: command had no output on stdout or stderr
# RUN: at line 7
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp 2>&1 | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=CHECK
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=CHECK
# .---command stderr------------
# | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c:39:11: error: CHECK: expected string not found in input
# | // CHECK: Kernel {{[0-9]}}: {{.*}} (__omp_offloading_{{.*}}_main_l29)
# |           ^
# | <stdin>:1:1: note: scanning from here
# | Display only launched kernel:
# | ^
# | <stdin>:2:23: note: possible intended match here
# | Kernel 'omp target in main @ 29 (__omp_offloading_802_b38838e_main_l29)'
# |                       ^
# | 
# | Input file: <stdin>
# | Check file: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c
# | 
...

mtrofin added a commit that referenced this pull request Sep 6, 2024
Re. PR #107329: two includes weren't necessary. The CodeGen one, in
particular, seems to have been introduced accidentally (by the IDE).
@llvm-ci
Collaborator

llvm-ci commented Sep 6, 2024

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-expensive-checks-ubuntu running on as-builder-4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/187/builds/987

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Analysis/CtxProfAnalysis/flatten-always-removes-instrumentation.ll' FAILED ********************
Exit Code: 2

Command Output (stderr):
--
RUN: at line 1: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt -passes=ctx-prof-flatten /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/Analysis/CtxProfAnalysis/flatten-always-removes-instrumentation.ll -S | /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/Analysis/CtxProfAnalysis/flatten-always-removes-instrumentation.ll
+ /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/Analysis/CtxProfAnalysis/flatten-always-removes-instrumentation.ll
+ /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt -passes=ctx-prof-flatten /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/Analysis/CtxProfAnalysis/flatten-always-removes-instrumentation.ll -S
LLVM ERROR: Function @foo changed by PGOCtxProfFlatteningPass without invalidating analyses
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt -passes=ctx-prof-flatten /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/Analysis/CtxProfAnalysis/flatten-always-removes-instrumentation.ll -S
1.	Running pass "ctx-prof-flatten" on module "/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/Analysis/CtxProfAnalysis/flatten-always-removes-instrumentation.ll"
 #0 0x000056133ed83a2c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:22
 #1 0x000056133ed83e4d PrintStackTraceSignalHandler(void*) /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/lib/Support/Unix/Signals.inc:798:1
 #2 0x000056133ed8129d llvm::sys::RunSignalHandlers() /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/lib/Support/Signals.cpp:105:20
 #3 0x000056133ed832c4 SignalHandler(int) /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #4 0x00007f2416ed4520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #5 0x00007f2416f289fc pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x969fc)
 #6 0x00007f2416ed4476 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x42476)
 #7 0x00007f2416eba7f3 abort (/lib/x86_64-linux-gnu/libc.so.6+0x287f3)
 #8 0x000056133ec945ef llvm::report_fatal_error(llvm::Twine const&, bool) /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/lib/Support/ErrorHandling.cpp:126:9
 #9 0x000056133bdfda11 llvm::PreservedCFGCheckerInstrumentation::registerCallbacks(llvm::PassInstrumentationCallbacks&, llvm::AnalysisManager<llvm::Module>&)::'lambda1'(llvm::StringRef, llvm::Any, llvm::PreservedAnalyses const&)::operator()(llvm::StringRef, llvm::Any, llvm::PreservedAnalyses const&) const /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/lib/Passes/StandardInstrumentations.cpp:1436:63
#10 0x000056133be10de7 void llvm::detail::UniqueFunctionBase<void, llvm::StringRef, llvm::Any, llvm::PreservedAnalyses const&>::CallImpl<llvm::PreservedCFGCheckerInstrumentation::registerCallbacks(llvm::PassInstrumentationCallbacks&, llvm::AnalysisManager<llvm::Module>&)::'lambda1'(llvm::StringRef, llvm::Any, llvm::PreservedAnalyses const&)>(void*, llvm::StringRef, llvm::Any&, llvm::PreservedAnalyses const&) /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/include/llvm/ADT/FunctionExtras.h:222:16
#11 0x000056133c1b8729 llvm::unique_function<void (llvm::StringRef, llvm::Any, llvm::PreservedAnalyses const&)>::operator()(llvm::StringRef, llvm::Any, llvm::PreservedAnalyses const&) /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/include/llvm/ADT/FunctionExtras.h:387:62
#12 0x000056133eaac99e void llvm::PassInstrumentation::runAfterPass<llvm::Module, llvm::detail::PassConcept<llvm::Module, llvm::AnalysisManager<llvm::Module>>>(llvm::detail::PassConcept<llvm::Module, llvm::AnalysisManager<llvm::Module>> const&, llvm::Module const&, llvm::PreservedAnalyses const&) const /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/include/llvm/IR/PassInstrumentation.h:265:30
#13 0x000056133eaa98f2 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/include/llvm/IR/PassManagerImpl.h:93:27
#14 0x0000561338bbc8d2 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/tools/opt/NewPMDriver.cpp:541:10
#15 0x0000561338b8cfa9 optMain /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/tools/opt/optdriver.cpp:738:27
#16 0x0000561338b8a7c1 main /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/tools/opt/opt.cpp:25:64
#17 0x00007f2416ebbd90 (/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
#18 0x00007f2416ebbe40 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e40)
#19 0x0000561338b8a6a5 _start (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0xc896a5)
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/Analysis/CtxProfAnalysis/flatten-always-removes-instrumentation.ll

--

********************


mtrofin added a commit that referenced this pull request Sep 7, 2024
…:none()`

This is because the pass always removes instrumentation. This fixes failures
detectable with expensive checks, e.g. https://lab.llvm.org/buildbot/#/builders/187/builds/987

(Related to PR #107329)
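
The failure reported above ("Function @foo changed by PGOCtxProfFlatteningPass without invalidating analyses") can be illustrated with a toy model. This is only a sketch under stated assumptions: it does not use LLVM's headers. `ToyFunction`, `ToyPreserved`, `buggyPass`, `fixedPass`, `passesExpensiveCheck` and the `"instr"` marker are hypothetical names invented for illustration, and it assumes the earlier version of the pass could report everything as preserved on some path.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for an IR function: a name plus a list of instruction names.
struct ToyFunction {
  std::string Name;
  std::vector<std::string> Instructions;
};

// Toy stand-in for the "preserved analyses" result of a pass:
// either everything is claimed preserved, or nothing is.
enum class ToyPreserved { All, None };

// Removes every "instr" marker (a placeholder for instrumentation).
static void removeInstrumentation(ToyFunction &F) {
  auto &I = F.Instructions;
  I.erase(std::remove(I.begin(), I.end(), std::string("instr")), I.end());
}

// Assumed buggy shape: changes the function but claims nothing changed.
ToyPreserved buggyPass(ToyFunction &F) {
  removeInstrumentation(F);
  return ToyPreserved::All;
}

// Fixed shape: the pass always removes instrumentation, so it always
// reports that nothing is preserved.
ToyPreserved fixedPass(ToyFunction &F) {
  removeInstrumentation(F);
  return ToyPreserved::None;
}

// Toy version of an expensive check: snapshot the function, run the pass,
// and reject the case "function changed, yet everything claimed preserved".
// F is taken by value so the caller's function is left untouched.
bool passesExpensiveCheck(
    ToyFunction F, const std::function<ToyPreserved(ToyFunction &)> &Pass) {
  const std::vector<std::string> Before = F.Instructions;
  const ToyPreserved PA = Pass(F);
  const bool Changed = (F.Instructions != Before);
  return !(Changed && PA == ToyPreserved::All);
}
```

In this model, a function containing an `"instr"` marker fails the check under `buggyPass` and passes it under `fixedPass`, which mirrors why the problem only shows up on builders that run with expensive checks enabled.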
@llvm-ci
Collaborator

llvm-ci commented Sep 7, 2024

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-expensive-checks-win running on as-worker-93 while building llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/14/builds/1341

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Analysis/CtxProfAnalysis/flatten-always-removes-instrumentation.ll' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 1
c:\a\llvm-clang-x86_64-expensive-checks-win\build\bin\opt.exe -passes=ctx-prof-flatten C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\test\Analysis\CtxProfAnalysis\flatten-always-removes-instrumentation.ll -S | c:\a\llvm-clang-x86_64-expensive-checks-win\build\bin\filecheck.exe C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\test\Analysis\CtxProfAnalysis\flatten-always-removes-instrumentation.ll
# executed command: 'c:\a\llvm-clang-x86_64-expensive-checks-win\build\bin\opt.exe' -passes=ctx-prof-flatten 'C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\test\Analysis\CtxProfAnalysis\flatten-always-removes-instrumentation.ll' -S
# .---command stderr------------
# | LLVM ERROR: Function @foo changed by PGOCtxProfFlatteningPass without invalidating analyses
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
# | Stack dump:
# | 0.	Program arguments: c:\\a\\llvm-clang-x86_64-expensive-checks-win\\build\\bin\\opt.exe -passes=ctx-prof-flatten C:\\a\\llvm-clang-x86_64-expensive-checks-win\\llvm-project\\llvm\\test\\Analysis\\CtxProfAnalysis\\flatten-always-removes-instrumentation.ll -S
# | 1.	Running pass "ctx-prof-flatten" on module "C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\test\Analysis\CtxProfAnalysis\flatten-always-removes-instrumentation.ll"
# | Exception Code: 0x80000003
# |  #0 0x00007ff74d23addc HandleAbort C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\lib\Support\Windows\Signals.inc:429:0
# |  #1 0x00007ffddb7690ed (C:\WINDOWS\SYSTEM32\ucrtbased.dll+0xa90ed)
# |  #2 0x00007ffddb76ae49 (C:\WINDOWS\SYSTEM32\ucrtbased.dll+0xaae49)
# |  #3 0x00007ff74d06140f llvm::report_fatal_error(class llvm::Twine const &, bool) C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\lib\Support\ErrorHandling.cpp:124:0
# |  #4 0x00007ff74d86682d `llvm::PreservedCFGCheckerInstrumentation::registerCallbacks'::`2'::<lambda_3>::operator() C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\lib\Passes\StandardInstrumentations.cpp:1415:0
# |  #5 0x00007ff74d859902 llvm::detail::UniqueFunctionBase<void,llvm::StringRef,llvm::Any,llvm::PreservedAnalyses const &>::CallImpl<`llvm::PreservedCFGCheckerInstrumentation::registerCallbacks'::`2'::<lambda_3> > C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\include\llvm\ADT\FunctionExtras.h:222:0
# |  #6 0x00007ff74ada3c8b llvm::unique_function<(class llvm::StringRef, class llvm::Any, class llvm::PreservedAnalyses const &)>::operator()(class llvm::StringRef, class llvm::Any, class llvm::PreservedAnalyses const &) C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\include\llvm\ADT\FunctionExtras.h:387:0
# |  #7 0x00007ff74c1c9a54 llvm::PassInstrumentation::runAfterPass<class llvm::Module, struct llvm::detail::PassConcept<class llvm::Module, class llvm::AnalysisManager<class llvm::Module>>>(struct llvm::detail::PassConcept<class llvm::Module, class llvm::AnalysisManager<class llvm::Module>> const &, class llvm::Module const &, class llvm::PreservedAnalyses const &) const C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\include\llvm\IR\PassInstrumentation.h:265:0
# |  #8 0x00007ff74c1da22e llvm::PassManager<class llvm::Module, class llvm::AnalysisManager<class llvm::Module>>::run(class llvm::Module &, class llvm::AnalysisManager<class llvm::Module> &) C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\include\llvm\IR\PassManagerImpl.h:93:0
# |  #9 0x00007ff746def5fb llvm::runPassPipeline(class llvm::StringRef, class llvm::Module &, class llvm::TargetMachine *, class llvm::TargetLibraryInfoImpl *, class llvm::ToolOutputFile *, class llvm::ToolOutputFile *, class llvm::ToolOutputFile *, class llvm::StringRef, class llvm::ArrayRef<class llvm::PassPlugin>, class llvm::ArrayRef<class std::function<(class llvm::PassBuilder &)>>, enum llvm::opt_tool::OutputKind, enum llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\tools\opt\NewPMDriver.cpp:541:0
# | #10 0x00007ff746daf2de optMain C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\tools\opt\optdriver.cpp:738:0
# | #11 0x00007ff746dab3b6 main C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\tools\opt\opt.cpp:25:0
# | #12 0x00007ff746e26dd9 invoke_main D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:79:0
# | #13 0x00007ff746e26c82 __scrt_common_main_seh D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288:0
# | #14 0x00007ff746e26b3e __scrt_common_main D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:331:0
# | #15 0x00007ff746e26e6e mainCRTStartup D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp:17:0
# | #16 0x00007ffdf2347374 (C:\WINDOWS\System32\KERNEL32.DLL+0x17374)
# | #17 0x00007ffdf275cc91 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x4cc91)
# `-----------------------------
# error: command failed with exit status: 0x80000003
# executed command: 'c:\a\llvm-clang-x86_64-expensive-checks-win\build\bin\filecheck.exe' 'C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\test\Analysis\CtxProfAnalysis\flatten-always-removes-instrumentation.ll'
# .---command stderr------------
# | FileCheck error: '<stdin>' is empty.
# | FileCheck command line:  c:\a\llvm-clang-x86_64-expensive-checks-win\build\bin\filecheck.exe C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\test\Analysis\CtxProfAnalysis\flatten-always-removes-instrumentation.ll
# `-----------------------------
# error: command failed with exit status: 2

--

********************


@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 7, 2024

LLVM Buildbot has detected a new failure on builder lld-x86_64-win running on as-worker-93 while building llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/1091

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM-Unit :: Support/./SupportTests.exe/43/86' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe-LLVM-Unit-12164-43-86.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=86 GTEST_SHARD_INDEX=43 C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe
--

Script:
--
C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe --gtest_filter=ProgramEnvTest.CreateProcessLongPath
--
C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(160): error: Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(163): error: fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied



C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:160
Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:163
fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied




********************


Labels
llvm:analysis llvm:transforms PGO Profile Guided Optimizations

4 participants