[ctxprof] Prepare profile format for flat profiles #129626

@mtrofin mtrofin commented Mar 4, 2025

The profile format now has a separate section called "Contexts"; a corresponding section for flat profiles will follow. The root gets its own tag because, unlike the context nodes under it, it has no callsite ID, and it will gain additional fields in subsequent patches.
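
For readers skimming the diff, here is a rough sketch of the block nesting this implies. It is not code from the patch: the enum mirrors `PGOCtxProfileBlockIDs`, but `FirstApplicationBlockID` (with an assumed value of 8, standing in for `bitc::FIRST_APPLICATION_BLOCKID`), `mayNest`, and `expectsCalleeIndex` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for bitc::FIRST_APPLICATION_BLOCKID. The value 8 is an assumption
// made so that this sketch compiles without any LLVM headers.
constexpr uint32_t FirstApplicationBlockID = 8;

// Mirrors the PGOCtxProfileBlockIDs enum as it looks after this patch.
enum BlockID : uint32_t {
  ProfileMetadataBlockID = FirstApplicationBlockID,
  ContextsSectionBlockID = ProfileMetadataBlockID + 1,
  ContextRootBlockID = ContextsSectionBlockID + 1,
  ContextNodeBlockID = ContextRootBlockID + 1,
};

// Hypothetical helper: true if a block of kind Child may appear directly
// inside a block of kind Parent. The "Contexts" section holds roots; roots
// and nodes hold (sub)context nodes.
constexpr bool mayNest(BlockID Parent, BlockID Child) {
  switch (Parent) {
  case ContextsSectionBlockID:
    return Child == ContextRootBlockID;
  case ContextRootBlockID:
  case ContextNodeBlockID:
    return Child == ContextNodeBlockID;
  default:
    return false;
  }
}

// Hypothetical helper: only non-root context nodes carry a callsite index,
// matching how the reader derives ExpectIndex from the block kind.
constexpr bool expectsCalleeIndex(BlockID Kind) {
  return Kind == ContextNodeBlockID;
}
```

The point of the separate root tag shows up in `expectsCalleeIndex`: the block kind alone tells the reader whether a callsite index is expected, so no extra flag has to be passed around.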

The rest of this patch is some refactoring of the reader/writer (for better reuse later) and test fixups.
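
To illustrate the calling convention the writer interface now expects, here is a minimal standalone sketch. The `ProfileWriter` shape follows the interface in this patch, but `Node` is a simplified, hypothetical stand-in for `ctx_profile::ContextNode`, and `RecordingWriter` and `writeRoots` are illustrative names that do not exist in LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified node. The real ctx_profile::ContextNode also
// carries counters and subcontexts.
struct Node {
  uint64_t Guid;
};

// Same shape as ctx_profile::ProfileWriter after this patch, but taking the
// simplified Node above.
class ProfileWriter {
public:
  virtual void startContextSection() = 0;
  virtual void writeContextual(const Node &RootNode) = 0;
  virtual void endContextSection() = 0;
  virtual ~ProfileWriter() = default;
};

// Illustrative test double that logs the calls it receives, in order.
class RecordingWriter final : public ProfileWriter {
public:
  std::vector<std::string> Events;
  void startContextSection() override { Events.push_back("start"); }
  void writeContextual(const Node &RootNode) override {
    Events.push_back("root:" + std::to_string(RootNode.Guid));
  }
  void endContextSection() override { Events.push_back("end"); }
};

// Open the section only when there is at least one root, then write every
// root and close the section. This follows the pattern used by
// createCtxProfFromYAML in the patch.
void writeRoots(ProfileWriter &W, const std::vector<Node> &Roots) {
  if (Roots.empty())
    return;
  W.startContextSection();
  for (const Node &R : Roots)
    W.writeContextual(R);
  W.endContextSection();
}
```

Calling `writeRoots` with two roots should record `start`, two `root:` entries, then `end`; with no roots, nothing is recorded and no empty section is emitted.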

@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_prepare_profile_format_for_flat_profiles branch 2 times, most recently from 7278206 to d2702d9 Compare March 4, 2025 03:05
@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_nfc_prepare_ctxprofanalysis_for_flat_profiles branch from 3e303bf to a8e4d64 Compare March 4, 2025 03:05
@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_prepare_profile_format_for_flat_profiles branch 3 times, most recently from 3a05bf3 to 96aad6c Compare March 4, 2025 16:33
@mtrofin mtrofin marked this pull request as ready for review March 4, 2025 16:35
@llvmbot llvmbot added compiler-rt PGO Profile Guided Optimizations LTO Link time optimization (regular/full LTO or ThinLTO) llvm:analysis llvm:transforms labels Mar 4, 2025
llvmbot commented Mar 4, 2025

@llvm/pr-subscribers-lto
@llvm/pr-subscribers-pgo

@llvm/pr-subscribers-llvm-transforms

Author: Mircea Trofin (mtrofin)

Changes

The profile format now has a separate section called "Contexts"; a corresponding section for flat profiles will follow. The root gets its own tag because, unlike the context nodes under it, it has no callsite ID, and it will gain additional fields in subsequent patches.

The rest of this patch is some refactoring of the reader/writer (for better reuse later) and test fixups.


Patch is 46.16 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/129626.diff

29 Files Affected:

  • (modified) compiler-rt/lib/ctx_profile/CtxInstrContextNode.h (+2)
  • (modified) llvm/include/llvm/Analysis/CtxProfAnalysis.h (+2)
  • (modified) llvm/include/llvm/ProfileData/CtxInstrContextNode.h (+2)
  • (modified) llvm/include/llvm/ProfileData/PGOCtxProfReader.h (+7-4)
  • (modified) llvm/include/llvm/ProfileData/PGOCtxProfWriter.h (+15-6)
  • (modified) llvm/lib/Analysis/CtxProfAnalysis.cpp (+1-1)
  • (modified) llvm/lib/ProfileData/PGOCtxProfReader.cpp (+53-24)
  • (modified) llvm/lib/ProfileData/PGOCtxProfWriter.cpp (+61-20)
  • (modified) llvm/test/Analysis/CtxProfAnalysis/flatten-and-annotate.ll (+9-8)
  • (modified) llvm/test/Analysis/CtxProfAnalysis/flatten-check-path.ll (+9-6)
  • (modified) llvm/test/Analysis/CtxProfAnalysis/flatten-icp.ll (+15-14)
  • (modified) llvm/test/Analysis/CtxProfAnalysis/flatten-zero-path.ll (+3-2)
  • (modified) llvm/test/Analysis/CtxProfAnalysis/full-cycle.ll (+29-27)
  • (modified) llvm/test/Analysis/CtxProfAnalysis/handle-select.ll (+9-8)
  • (modified) llvm/test/Analysis/CtxProfAnalysis/inline.ll (+27-25)
  • (modified) llvm/test/Analysis/CtxProfAnalysis/load-unapplicable.ll (+10-9)
  • (modified) llvm/test/Analysis/CtxProfAnalysis/load.ll (+19-17)
  • (modified) llvm/test/ThinLTO/X86/ctxprof.ll (+2-2)
  • (modified) llvm/test/Transforms/EliminateAvailableExternally/transform-to-local.ll (+2-2)
  • (modified) llvm/test/tools/llvm-ctxprof-util/Inputs/invalid-bad-subctx.yaml (+4-3)
  • (modified) llvm/test/tools/llvm-ctxprof-util/Inputs/invalid-no-counters.yaml (+2-1)
  • (added) llvm/test/tools/llvm-ctxprof-util/Inputs/invalid-no-section.yaml (+1)
  • (removed) llvm/test/tools/llvm-ctxprof-util/Inputs/invalid-no-vector.yaml (-1)
  • (modified) llvm/test/tools/llvm-ctxprof-util/Inputs/valid.yaml (+13-12)
  • (modified) llvm/test/tools/llvm-ctxprof-util/llvm-ctxprof-util-negative.test (+6-6)
  • (modified) llvm/test/tools/llvm-ctxprof-util/llvm-ctxprof-util.test (+29-27)
  • (modified) llvm/tools/llvm-ctxprof-util/llvm-ctxprof-util.cpp (+1-1)
  • (modified) llvm/unittests/ProfileData/PGOCtxProfReaderWriterTest.cpp (+72-6)
  • (modified) llvm/unittests/Transforms/Utils/CallPromotionUtilsTest.cpp (+30-28)
diff --git a/compiler-rt/lib/ctx_profile/CtxInstrContextNode.h b/compiler-rt/lib/ctx_profile/CtxInstrContextNode.h
index bea6311af5c65..6b020733e1f37 100644
--- a/compiler-rt/lib/ctx_profile/CtxInstrContextNode.h
+++ b/compiler-rt/lib/ctx_profile/CtxInstrContextNode.h
@@ -115,7 +115,9 @@ class ContextNode final {
 
 class ProfileWriter {
 public:
+  virtual void startContextSection() = 0;
   virtual void writeContextual(const ctx_profile::ContextNode &RootNode) = 0;
+  virtual void endContextSection() = 0;
   virtual ~ProfileWriter() = default;
 };
 } // namespace ctx_profile
diff --git a/llvm/include/llvm/Analysis/CtxProfAnalysis.h b/llvm/include/llvm/Analysis/CtxProfAnalysis.h
index a763cf3ddcf72..ede8bd2fe5001 100644
--- a/llvm/include/llvm/Analysis/CtxProfAnalysis.h
+++ b/llvm/include/llvm/Analysis/CtxProfAnalysis.h
@@ -54,6 +54,8 @@ class PGOContextualProfile {
     return Profiles.Contexts;
   }
 
+  const PGOCtxProfile &profiles() const { return Profiles; }
+
   bool isFunctionKnown(const Function &F) const {
     return getDefinedFunctionGUID(F) != 0;
   }
diff --git a/llvm/include/llvm/ProfileData/CtxInstrContextNode.h b/llvm/include/llvm/ProfileData/CtxInstrContextNode.h
index bea6311af5c65..6b020733e1f37 100644
--- a/llvm/include/llvm/ProfileData/CtxInstrContextNode.h
+++ b/llvm/include/llvm/ProfileData/CtxInstrContextNode.h
@@ -115,7 +115,9 @@ class ContextNode final {
 
 class ProfileWriter {
 public:
+  virtual void startContextSection() = 0;
   virtual void writeContextual(const ctx_profile::ContextNode &RootNode) = 0;
+  virtual void endContextSection() = 0;
   virtual ~ProfileWriter() = default;
 };
 } // namespace ctx_profile
diff --git a/llvm/include/llvm/ProfileData/PGOCtxProfReader.h b/llvm/include/llvm/ProfileData/PGOCtxProfReader.h
index 19d1329fa4750..dbd8288caaff5 100644
--- a/llvm/include/llvm/ProfileData/PGOCtxProfReader.h
+++ b/llvm/include/llvm/ProfileData/PGOCtxProfReader.h
@@ -190,8 +190,12 @@ class PGOCtxProfileReader final {
   Error unsupported(const Twine &);
 
   Expected<std::pair<std::optional<uint32_t>, PGOCtxProfContext>>
-  readContext(bool ExpectIndex);
-  bool canReadContext();
+  readProfile(PGOCtxProfileBlockIDs Kind);
+
+  bool canEnterBlockWithID(PGOCtxProfileBlockIDs ID);
+  Error enterBlockWithID(PGOCtxProfileBlockIDs ID);
+
+  Error loadContexts(CtxProfContextualProfiles &);
 
 public:
   PGOCtxProfileReader(StringRef Buffer)
@@ -201,7 +205,6 @@ class PGOCtxProfileReader final {
   Expected<PGOCtxProfile> loadProfiles();
 };
 
-void convertCtxProfToYaml(raw_ostream &OS,
-                          const PGOCtxProfContext::CallTargetMapTy &);
+void convertCtxProfToYaml(raw_ostream &OS, const PGOCtxProfile &);
 } // namespace llvm
 #endif
diff --git a/llvm/include/llvm/ProfileData/PGOCtxProfWriter.h b/llvm/include/llvm/ProfileData/PGOCtxProfWriter.h
index 43a190ae0aa05..8923fe57c180c 100644
--- a/llvm/include/llvm/ProfileData/PGOCtxProfWriter.h
+++ b/llvm/include/llvm/ProfileData/PGOCtxProfWriter.h
@@ -23,7 +23,9 @@ enum PGOCtxProfileRecords { Invalid = 0, Version, Guid, CalleeIndex, Counters };
 
 enum PGOCtxProfileBlockIDs {
   ProfileMetadataBlockID = bitc::FIRST_APPLICATION_BLOCKID,
-  ContextNodeBlockID = ProfileMetadataBlockID + 1
+  ContextsSectionBlockID = ProfileMetadataBlockID + 1,
+  ContextRootBlockID = ContextsSectionBlockID + 1,
+  ContextNodeBlockID = ContextRootBlockID + 1,
 };
 
 /// Write one or more ContextNodes to the provided raw_fd_stream.
@@ -60,23 +62,30 @@ enum PGOCtxProfileBlockIDs {
 /// like value profiling - which would appear as additional records. For
 /// example, value profiling would produce a new record with a new record ID,
 /// containing the profiled values (much like the counters)
-class PGOCtxProfileWriter final {
+class PGOCtxProfileWriter : public ctx_profile::ProfileWriter {
+  enum class EmptyContextCriteria { None, EntryIsZero, AllAreZero };
+
   BitstreamWriter Writer;
+  const bool IncludeEmpty;
 
-  void writeCounters(const ctx_profile::ContextNode &Node);
+  void writeGuid(ctx_profile::GUID Guid);
+  void writeCounters(ArrayRef<uint64_t> Counters);
   void writeImpl(std::optional<uint32_t> CallerIndex,
                  const ctx_profile::ContextNode &Node);
 
 public:
   PGOCtxProfileWriter(raw_ostream &Out,
-                      std::optional<unsigned> VersionOverride = std::nullopt);
+                      std::optional<unsigned> VersionOverride = std::nullopt,
+                      bool IncludeEmpty = false);
   ~PGOCtxProfileWriter() { Writer.ExitBlock(); }
 
-  void write(const ctx_profile::ContextNode &);
+  void startContextSection() override;
+  void writeContextual(const ctx_profile::ContextNode &RootNode) override;
+  void endContextSection() override;
 
   // constants used in writing which a reader may find useful.
   static constexpr unsigned CodeLen = 2;
-  static constexpr uint32_t CurrentVersion = 1;
+  static constexpr uint32_t CurrentVersion = 2;
   static constexpr unsigned VBREncodingBits = 6;
   static constexpr StringRef ContainerMagic = "CTXP";
 };
diff --git a/llvm/lib/Analysis/CtxProfAnalysis.cpp b/llvm/lib/Analysis/CtxProfAnalysis.cpp
index aaa9ffb8b3c5d..e021e2a801006 100644
--- a/llvm/lib/Analysis/CtxProfAnalysis.cpp
+++ b/llvm/lib/Analysis/CtxProfAnalysis.cpp
@@ -180,7 +180,7 @@ PreservedAnalyses CtxProfAnalysisPrinterPass::run(Module &M,
 
   if (Mode == PrintMode::Everything)
     OS << "\nCurrent Profile:\n";
-  convertCtxProfToYaml(OS, C.contexts());
+  convertCtxProfToYaml(OS, C.profiles());
   OS << "\n";
   if (Mode == PrintMode::YAML)
     return PreservedAnalyses::all();
diff --git a/llvm/lib/ProfileData/PGOCtxProfReader.cpp b/llvm/lib/ProfileData/PGOCtxProfReader.cpp
index dfe0d3e428a18..660109c3177d1 100644
--- a/llvm/lib/ProfileData/PGOCtxProfReader.cpp
+++ b/llvm/lib/ProfileData/PGOCtxProfReader.cpp
@@ -58,19 +58,26 @@ Error PGOCtxProfileReader::unsupported(const Twine &Msg) {
   return make_error<InstrProfError>(instrprof_error::unsupported_version, Msg);
 }
 
-bool PGOCtxProfileReader::canReadContext() {
+bool PGOCtxProfileReader::canEnterBlockWithID(PGOCtxProfileBlockIDs ID) {
   auto Blk = advance();
   if (!Blk) {
     consumeError(Blk.takeError());
     return false;
   }
-  return Blk->Kind == BitstreamEntry::SubBlock &&
-         Blk->ID == PGOCtxProfileBlockIDs::ContextNodeBlockID;
+  return Blk->Kind == BitstreamEntry::SubBlock && Blk->ID == ID;
+}
+
+Error PGOCtxProfileReader::enterBlockWithID(PGOCtxProfileBlockIDs ID) {
+  RET_ON_ERR(Cursor.EnterSubBlock(ID));
+  return Error::success();
 }
 
 Expected<std::pair<std::optional<uint32_t>, PGOCtxProfContext>>
-PGOCtxProfileReader::readContext(bool ExpectIndex) {
-  RET_ON_ERR(Cursor.EnterSubBlock(PGOCtxProfileBlockIDs::ContextNodeBlockID));
+PGOCtxProfileReader::readProfile(PGOCtxProfileBlockIDs Kind) {
+  assert((Kind == PGOCtxProfileBlockIDs::ContextRootBlockID ||
+          Kind == PGOCtxProfileBlockIDs::ContextNodeBlockID) &&
+         "Unexpected profile kind");
+  RET_ON_ERR(enterBlockWithID(Kind));
 
   std::optional<ctx_profile::GUID> Guid;
   std::optional<SmallVector<uint64_t, 16>> Counters;
@@ -78,6 +85,7 @@ PGOCtxProfileReader::readContext(bool ExpectIndex) {
 
   SmallVector<uint64_t, 1> RecordValues;
 
+  const bool ExpectIndex = Kind == PGOCtxProfileBlockIDs::ContextNodeBlockID;
   // We don't prescribe the order in which the records come in, and we are ok
   // if other unsupported records appear. We seek in the current subblock until
   // we get all we know.
@@ -121,8 +129,8 @@ PGOCtxProfileReader::readContext(bool ExpectIndex) {
 
   PGOCtxProfContext Ret(*Guid, std::move(*Counters));
 
-  while (canReadContext()) {
-    EXPECT_OR_RET(SC, readContext(true));
+  while (canEnterBlockWithID(PGOCtxProfileBlockIDs::ContextNodeBlockID)) {
+    EXPECT_OR_RET(SC, readProfile(PGOCtxProfileBlockIDs::ContextNodeBlockID));
     auto &Targets = Ret.callsites()[*SC->first];
     auto [_, Inserted] =
         Targets.insert({SC->second.guid(), std::move(SC->second)});
@@ -168,15 +176,23 @@ Error PGOCtxProfileReader::readMetadata() {
   return Error::success();
 }
 
+Error PGOCtxProfileReader::loadContexts(CtxProfContextualProfiles &P) {
+  if (canEnterBlockWithID(PGOCtxProfileBlockIDs::ContextsSectionBlockID)) {
+    RET_ON_ERR(enterBlockWithID(PGOCtxProfileBlockIDs::ContextsSectionBlockID));
+    while (canEnterBlockWithID(PGOCtxProfileBlockIDs::ContextRootBlockID)) {
+      EXPECT_OR_RET(E, readProfile(PGOCtxProfileBlockIDs::ContextRootBlockID));
+      auto Key = E->second.guid();
+      if (!P.insert({Key, std::move(E->second)}).second)
+        return wrongValue("Duplicate roots");
+    }
+  }
+  return Error::success();
+}
+
 Expected<PGOCtxProfile> PGOCtxProfileReader::loadProfiles() {
-  PGOCtxProfile Ret;
   RET_ON_ERR(readMetadata());
-  while (canReadContext()) {
-    EXPECT_OR_RET(E, readContext(false));
-    auto Key = E->second.guid();
-    if (!Ret.Contexts.insert({Key, std::move(E->second)}).second)
-      return wrongValue("Duplicate roots");
-  }
+  PGOCtxProfile Ret;
+  RET_ON_ERR(loadContexts(Ret.Contexts));
   return std::move(Ret);
 }
 
@@ -224,7 +240,9 @@ void toYaml(yaml::Output &Out,
   Out.endSequence();
 }
 
-void toYaml(yaml::Output &Out, const PGOCtxProfContext &Ctx) {
+void toYaml(yaml::Output &Out, GlobalValue::GUID Guid,
+            const SmallVectorImpl<uint64_t> &Counters,
+            const PGOCtxProfContext::CallsiteMapTy &Callsites) {
   yaml::EmptyContext Empty;
   Out.beginMapping();
   void *SaveInfo = nullptr;
@@ -232,33 +250,44 @@ void toYaml(yaml::Output &Out, const PGOCtxProfContext &Ctx) {
   {
     Out.preflightKey("Guid", /*Required=*/true, /*SameAsDefault=*/false,
                      UseDefault, SaveInfo);
-    auto Guid = Ctx.guid();
     yaml::yamlize(Out, Guid, true, Empty);
     Out.postflightKey(nullptr);
   }
   {
     Out.preflightKey("Counters", true, false, UseDefault, SaveInfo);
     Out.beginFlowSequence();
-    for (size_t I = 0U, E = Ctx.counters().size(); I < E; ++I) {
+    for (size_t I = 0U, E = Counters.size(); I < E; ++I) {
       Out.preflightFlowElement(I, SaveInfo);
-      uint64_t V = Ctx.counters()[I];
+      uint64_t V = Counters[I];
       yaml::yamlize(Out, V, true, Empty);
       Out.postflightFlowElement(SaveInfo);
     }
     Out.endFlowSequence();
     Out.postflightKey(nullptr);
   }
-  if (!Ctx.callsites().empty()) {
+  if (!Callsites.empty()) {
     Out.preflightKey("Callsites", true, false, UseDefault, SaveInfo);
-    toYaml(Out, Ctx.callsites());
+    toYaml(Out, Callsites);
     Out.postflightKey(nullptr);
   }
   Out.endMapping();
 }
+void toYaml(yaml::Output &Out, const PGOCtxProfContext &Ctx) {
+  toYaml(Out, Ctx.guid(), Ctx.counters(), Ctx.callsites());
+}
+
 } // namespace
 
-void llvm::convertCtxProfToYaml(
-    raw_ostream &OS, const PGOCtxProfContext::CallTargetMapTy &Profiles) {
+void llvm::convertCtxProfToYaml(raw_ostream &OS,
+                                const PGOCtxProfile &Profiles) {
   yaml::Output Out(OS);
-  toYaml(Out, Profiles);
-}
\ No newline at end of file
+  void *SaveInfo = nullptr;
+  bool UseDefault = false;
+  Out.beginMapping();
+  if (!Profiles.Contexts.empty()) {
+    Out.preflightKey("Contexts", false, false, UseDefault, SaveInfo);
+    toYaml(Out, Profiles.Contexts);
+    Out.postflightKey(nullptr);
+  }
+  Out.endMapping();
+}
diff --git a/llvm/lib/ProfileData/PGOCtxProfWriter.cpp b/llvm/lib/ProfileData/PGOCtxProfWriter.cpp
index 3d3da84817489..809672b97a8e4 100644
--- a/llvm/lib/ProfileData/PGOCtxProfWriter.cpp
+++ b/llvm/lib/ProfileData/PGOCtxProfWriter.cpp
@@ -13,17 +13,25 @@
 #include "llvm/ProfileData/PGOCtxProfWriter.h"
 #include "llvm/Bitstream/BitCodeEnums.h"
 #include "llvm/ProfileData/CtxInstrContextNode.h"
+#include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Error.h"
-#include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/YAMLTraits.h"
 #include "llvm/Support/raw_ostream.h"
 
 using namespace llvm;
 using namespace llvm::ctx_profile;
 
+static cl::opt<bool>
+    IncludeEmptyOpt("ctx-prof-include-empty", cl::init(false),
+                    cl::desc("Also write profiles with all-zero counters. "
+                             "Intended for testing/debugging."));
+
 PGOCtxProfileWriter::PGOCtxProfileWriter(
-    raw_ostream &Out, std::optional<unsigned> VersionOverride)
-    : Writer(Out, 0) {
+    raw_ostream &Out, std::optional<unsigned> VersionOverride,
+    bool IncludeEmpty)
+    : Writer(Out, 0),
+      IncludeEmpty(IncludeEmptyOpt.getNumOccurrences() > 0 ? IncludeEmptyOpt
+                                                           : IncludeEmpty) {
   static_assert(ContainerMagic.size() == 4);
   Out.write(ContainerMagic.data(), ContainerMagic.size());
   Writer.EnterBlockInfoBlock();
@@ -43,6 +51,10 @@ PGOCtxProfileWriter::PGOCtxProfileWriter(
     };
     DescribeBlock(PGOCtxProfileBlockIDs::ProfileMetadataBlockID, "Metadata");
     DescribeRecord(PGOCtxProfileRecords::Version, "Version");
+    DescribeBlock(PGOCtxProfileBlockIDs::ContextsSectionBlockID, "Contexts");
+    DescribeBlock(PGOCtxProfileBlockIDs::ContextRootBlockID, "Root");
+    DescribeRecord(PGOCtxProfileRecords::Guid, "GUID");
+    DescribeRecord(PGOCtxProfileRecords::Counters, "Counters");
     DescribeBlock(PGOCtxProfileBlockIDs::ContextNodeBlockID, "Context");
     DescribeRecord(PGOCtxProfileRecords::Guid, "GUID");
     DescribeRecord(PGOCtxProfileRecords::CalleeIndex, "CalleeIndex");
@@ -55,12 +67,16 @@ PGOCtxProfileWriter::PGOCtxProfileWriter(
                     SmallVector<unsigned, 1>({Version}));
 }
 
-void PGOCtxProfileWriter::writeCounters(const ContextNode &Node) {
+void PGOCtxProfileWriter::writeCounters(ArrayRef<uint64_t> Counters) {
   Writer.EmitCode(bitc::UNABBREV_RECORD);
   Writer.EmitVBR(PGOCtxProfileRecords::Counters, VBREncodingBits);
-  Writer.EmitVBR(Node.counters_size(), VBREncodingBits);
-  for (uint32_t I = 0U; I < Node.counters_size(); ++I)
-    Writer.EmitVBR64(Node.counters()[I], VBREncodingBits);
+  Writer.EmitVBR(Counters.size(), VBREncodingBits);
+  for (uint32_t I = 0U; I < Counters.size(); ++I)
+    Writer.EmitVBR64(Counters[I], VBREncodingBits);
+}
+
+void PGOCtxProfileWriter::writeGuid(ctx_profile::GUID Guid) {
+  Writer.EmitRecord(PGOCtxProfileRecords::Guid, SmallVector<uint64_t, 1>{Guid});
 }
 
 // recursively write all the subcontexts. We do need to traverse depth first to
@@ -69,13 +85,18 @@ void PGOCtxProfileWriter::writeCounters(const ContextNode &Node) {
 // keep the implementation simple.
 void PGOCtxProfileWriter::writeImpl(std::optional<uint32_t> CallerIndex,
                                     const ContextNode &Node) {
-  Writer.EnterSubblock(PGOCtxProfileBlockIDs::ContextNodeBlockID, CodeLen);
-  Writer.EmitRecord(PGOCtxProfileRecords::Guid,
-                    SmallVector<uint64_t, 1>{Node.guid()});
+  // A node with no counters is an error. We don't expect this to happen from
+  // the runtime, rather, this is interesting for testing the reader.
+  if (!IncludeEmpty && (Node.counters_size() > 0 && Node.entrycount() == 0))
+    return;
+  Writer.EnterSubblock(CallerIndex ? PGOCtxProfileBlockIDs::ContextNodeBlockID
+                                   : PGOCtxProfileBlockIDs::ContextRootBlockID,
+                       CodeLen);
+  writeGuid(Node.guid());
   if (CallerIndex)
     Writer.EmitRecord(PGOCtxProfileRecords::CalleeIndex,
                       SmallVector<uint64_t, 1>{*CallerIndex});
-  writeCounters(Node);
+  writeCounters({Node.counters(), Node.counters_size()});
   for (uint32_t I = 0U; I < Node.callsites_size(); ++I)
     for (const auto *Subcontext = Node.subContexts()[I]; Subcontext;
          Subcontext = Subcontext->next())
@@ -83,7 +104,13 @@ void PGOCtxProfileWriter::writeImpl(std::optional<uint32_t> CallerIndex,
   Writer.ExitBlock();
 }
 
-void PGOCtxProfileWriter::write(const ContextNode &RootNode) {
+void PGOCtxProfileWriter::startContextSection() {
+  Writer.EnterSubblock(PGOCtxProfileBlockIDs::ContextsSectionBlockID, CodeLen);
+}
+
+void PGOCtxProfileWriter::endContextSection() { Writer.ExitBlock(); }
+
+void PGOCtxProfileWriter::writeContextual(const ContextNode &RootNode) {
   writeImpl(std::nullopt, RootNode);
 }
 
@@ -96,6 +123,9 @@ struct SerializableCtxRepresentation {
   std::vector<uint64_t> Counters;
   std::vector<std::vector<SerializableCtxRepresentation>> Callsites;
 };
+struct SerializableProfileRepresentation {
+  std::vector<SerializableCtxRepresentation> Contexts;
+};
 
 ctx_profile::ContextNode *
 createNode(std::vector<std::unique_ptr<char[]>> &Nodes,
@@ -142,10 +172,16 @@ template <> struct yaml::MappingTraits<SerializableCtxRepresentation> {
   }
 };
 
+template <> struct yaml::MappingTraits<SerializableProfileRepresentation> {
+  static void mapping(yaml::IO &IO, SerializableProfileRepresentation &SPR) {
+    IO.mapOptional("Contexts", SPR.Contexts);
+  }
+};
+
 Error llvm::createCtxProfFromYAML(StringRef Profile, raw_ostream &Out) {
   yaml::Input In(Profile);
-  std::vector<SerializableCtxRepresentation> DCList;
-  In >> DCList;
+  SerializableProfileRepresentation SPR;
+  In >> SPR;
   if (In.error())
     return createStringError(In.error(), "incorrect yaml content");
   std::vector<std::unique_ptr<char[]>> Nodes;
@@ -153,12 +189,17 @@ Error llvm::createCtxProfFromYAML(StringRef Profile, raw_ostream &Out) {
   if (EC)
     return createStringError(EC, "failed to open output");
   PGOCtxProfileWriter Writer(Out);
-  for (const auto &DC : DCList) {
-    auto *TopList = createNode(Nodes, DC);
-    if (!TopList)
-      return createStringError(
-          "Unexpected error converting internal structure to ctx profile");
-    Writer.write(*TopList);
+
+  if (!SPR.Contexts.empty()) {
+    Writer.startContextSection();
+    for (const auto &DC : SPR.Contexts) {
+      auto *TopList = createNode(Nodes, DC);
+      if (!TopList)
+        return createStringError(
+            "Unexpected error converting internal structure to ctx profile");
+      Writer.writeContextual(*TopList);
+    }
+    Writer.endContextSection();
   }
   if (EC)
     return createStringError(EC, "failed to write output");
diff --git a/llvm/test/Analysis/CtxProfAnalysis/flatten-and-annotate.ll b/llvm/test/Analysis/CtxProfAnalysis/flatten-and-annotate.ll
index 9eedade925b01..20eaf59576855 100644
--- a/llvm/test/Analysis/CtxProfAnalysis/flatten-and-annotate.ll
+++ b/llvm/test/Analysis/CtxProfAnalysis/flatten-and-annotate.ll
@@ -59,14 +59,15 @@
 ; CHECK:       ![[AN_ENTRYPOINT_BW]] = !{!"branch_weights", i32 40, i32 60} 
 
 ;--- profile.yaml
-- Guid: 4909520559318251808
-  Counters: [100, 40]
-  Callsites: -
-              - Guid: 11872291593386833696
-                Counters: [ 100, 5 ]
-             -
-              - Guid: 11872291593386833696
-                Counters: [ 40, 10 ]
+Contexts:
+  - Guid: 4909520559318251808
+    Counters: [100, 40]
+    Callsites: -
+                - Guid: 11872291593386833696
+                  Counters: [ 100, 5 ]
+               -
+                - Guid: 11872291593386833696
+                  Counters: [ 40, 10 ]
 ;--- example.ll
 declare void @bar()
 
diff --git a/llvm/test/Analysis/CtxProfAnalysis/flatten-check-path.ll b/llvm/test/Analysis/CtxProfAnalysis/flatten-check-path.ll
index c84a72f60a3d0..eb697b69e2c02 100644
--- a/llvm/test/Analysis/CtxProfAnalysis/flatten-check-path.ll
+++ b/llvm/test/Analysis/CtxProfAnalysis/flatten-check-path.ll
@@ -39,8 +39,9 @@ exit:
 !0 = !{i64 1234}
 
 ;--- profile_ok.yaml
-- Guid: 1234 
-  Counters: [2, 2, 1, 2]
+Contexts:
+  - Guid: 1234 
+    Counters: [2, 2, 1, 2]
 
 ;--- message_pump.ll
 ; This is a message pump: the loop never exits. This should result in an
@@ -61,8 +62,9 @@ exit:
 !0 = !{i64 1234}
 
 ;--- profile_pump.yaml
-- Guid: 1234
-  Counters: [2, 10, 0]
+Contexts:
+  - Guid: 1234
+    Counters: [2, 10, 0]
 
 ;--- unreachable.ll
 ; An unreachable block is reached, that's an error
@@ -84,5 +86,6 @@ exit:
 !0 = !{i64 1234}
 
 ;--- profile_unreachable.yaml
-- Guid: 1234
-  Counters: [2, 1, 1, 2]
\ No newline at end of file
+Contexts:
+  - Guid: 1234
+    Counters: [2, 1, 1, 2]
diff --git a/llvm/test/Analysis/CtxProfAnalysis/flatten-icp.ll b/llvm/test/Analysis/CtxProfAnalysis/flatten-icp.ll
index 46c17377710d0..18f85e6f7f984 100644
--- a/llvm/test/Analysis/CtxProfAnalysis/flatten-icp.ll
+++ b/llvm/test/Analysis/CtxProfAnalysis/flatten-icp.ll
@@ -46,17 +46,18 @@ attributes #1 = { noinline }
 !2 = !{i64 4000}
 
...
[truncated]

llvmbot commented Mar 4, 2025

@llvm/pr-subscribers-llvm-analysis

 }
 
-bool PGOCtxProfileReader::canReadContext() {
+bool PGOCtxProfileReader::canEnterBlockWithID(PGOCtxProfileBlockIDs ID) {
   auto Blk = advance();
   if (!Blk) {
     consumeError(Blk.takeError());
     return false;
   }
-  return Blk->Kind == BitstreamEntry::SubBlock &&
-         Blk->ID == PGOCtxProfileBlockIDs::ContextNodeBlockID;
+  return Blk->Kind == BitstreamEntry::SubBlock && Blk->ID == ID;
+}
+
+Error PGOCtxProfileReader::enterBlockWithID(PGOCtxProfileBlockIDs ID) {
+  RET_ON_ERR(Cursor.EnterSubBlock(ID));
+  return Error::success();
 }
 
 Expected<std::pair<std::optional<uint32_t>, PGOCtxProfContext>>
-PGOCtxProfileReader::readContext(bool ExpectIndex) {
-  RET_ON_ERR(Cursor.EnterSubBlock(PGOCtxProfileBlockIDs::ContextNodeBlockID));
+PGOCtxProfileReader::readProfile(PGOCtxProfileBlockIDs Kind) {
+  assert((Kind == PGOCtxProfileBlockIDs::ContextRootBlockID ||
+          Kind == PGOCtxProfileBlockIDs::ContextNodeBlockID) &&
+         "Unexpected profile kind");
+  RET_ON_ERR(enterBlockWithID(Kind));
 
   std::optional<ctx_profile::GUID> Guid;
   std::optional<SmallVector<uint64_t, 16>> Counters;
@@ -78,6 +85,7 @@ PGOCtxProfileReader::readContext(bool ExpectIndex) {
 
   SmallVector<uint64_t, 1> RecordValues;
 
+  const bool ExpectIndex = Kind == PGOCtxProfileBlockIDs::ContextNodeBlockID;
   // We don't prescribe the order in which the records come in, and we are ok
   // if other unsupported records appear. We seek in the current subblock until
   // we get all we know.
@@ -121,8 +129,8 @@ PGOCtxProfileReader::readContext(bool ExpectIndex) {
 
   PGOCtxProfContext Ret(*Guid, std::move(*Counters));
 
-  while (canReadContext()) {
-    EXPECT_OR_RET(SC, readContext(true));
+  while (canEnterBlockWithID(PGOCtxProfileBlockIDs::ContextNodeBlockID)) {
+    EXPECT_OR_RET(SC, readProfile(PGOCtxProfileBlockIDs::ContextNodeBlockID));
     auto &Targets = Ret.callsites()[*SC->first];
     auto [_, Inserted] =
         Targets.insert({SC->second.guid(), std::move(SC->second)});
@@ -168,15 +176,23 @@ Error PGOCtxProfileReader::readMetadata() {
   return Error::success();
 }
 
+Error PGOCtxProfileReader::loadContexts(CtxProfContextualProfiles &P) {
+  if (canEnterBlockWithID(PGOCtxProfileBlockIDs::ContextsSectionBlockID)) {
+    RET_ON_ERR(enterBlockWithID(PGOCtxProfileBlockIDs::ContextsSectionBlockID));
+    while (canEnterBlockWithID(PGOCtxProfileBlockIDs::ContextRootBlockID)) {
+      EXPECT_OR_RET(E, readProfile(PGOCtxProfileBlockIDs::ContextRootBlockID));
+      auto Key = E->second.guid();
+      if (!P.insert({Key, std::move(E->second)}).second)
+        return wrongValue("Duplicate roots");
+    }
+  }
+  return Error::success();
+}
+
 Expected<PGOCtxProfile> PGOCtxProfileReader::loadProfiles() {
-  PGOCtxProfile Ret;
   RET_ON_ERR(readMetadata());
-  while (canReadContext()) {
-    EXPECT_OR_RET(E, readContext(false));
-    auto Key = E->second.guid();
-    if (!Ret.Contexts.insert({Key, std::move(E->second)}).second)
-      return wrongValue("Duplicate roots");
-  }
+  PGOCtxProfile Ret;
+  RET_ON_ERR(loadContexts(Ret.Contexts));
   return std::move(Ret);
 }
 
@@ -224,7 +240,9 @@ void toYaml(yaml::Output &Out,
   Out.endSequence();
 }
 
-void toYaml(yaml::Output &Out, const PGOCtxProfContext &Ctx) {
+void toYaml(yaml::Output &Out, GlobalValue::GUID Guid,
+            const SmallVectorImpl<uint64_t> &Counters,
+            const PGOCtxProfContext::CallsiteMapTy &Callsites) {
   yaml::EmptyContext Empty;
   Out.beginMapping();
   void *SaveInfo = nullptr;
@@ -232,33 +250,44 @@ void toYaml(yaml::Output &Out, const PGOCtxProfContext &Ctx) {
   {
     Out.preflightKey("Guid", /*Required=*/true, /*SameAsDefault=*/false,
                      UseDefault, SaveInfo);
-    auto Guid = Ctx.guid();
     yaml::yamlize(Out, Guid, true, Empty);
     Out.postflightKey(nullptr);
   }
   {
     Out.preflightKey("Counters", true, false, UseDefault, SaveInfo);
     Out.beginFlowSequence();
-    for (size_t I = 0U, E = Ctx.counters().size(); I < E; ++I) {
+    for (size_t I = 0U, E = Counters.size(); I < E; ++I) {
       Out.preflightFlowElement(I, SaveInfo);
-      uint64_t V = Ctx.counters()[I];
+      uint64_t V = Counters[I];
       yaml::yamlize(Out, V, true, Empty);
       Out.postflightFlowElement(SaveInfo);
     }
     Out.endFlowSequence();
     Out.postflightKey(nullptr);
   }
-  if (!Ctx.callsites().empty()) {
+  if (!Callsites.empty()) {
     Out.preflightKey("Callsites", true, false, UseDefault, SaveInfo);
-    toYaml(Out, Ctx.callsites());
+    toYaml(Out, Callsites);
     Out.postflightKey(nullptr);
   }
   Out.endMapping();
 }
+void toYaml(yaml::Output &Out, const PGOCtxProfContext &Ctx) {
+  toYaml(Out, Ctx.guid(), Ctx.counters(), Ctx.callsites());
+}
+
 } // namespace
 
-void llvm::convertCtxProfToYaml(
-    raw_ostream &OS, const PGOCtxProfContext::CallTargetMapTy &Profiles) {
+void llvm::convertCtxProfToYaml(raw_ostream &OS,
+                                const PGOCtxProfile &Profiles) {
   yaml::Output Out(OS);
-  toYaml(Out, Profiles);
-}
\ No newline at end of file
+  void *SaveInfo = nullptr;
+  bool UseDefault = false;
+  Out.beginMapping();
+  if (!Profiles.Contexts.empty()) {
+    Out.preflightKey("Contexts", false, false, UseDefault, SaveInfo);
+    toYaml(Out, Profiles.Contexts);
+    Out.postflightKey(nullptr);
+  }
+  Out.endMapping();
+}
diff --git a/llvm/lib/ProfileData/PGOCtxProfWriter.cpp b/llvm/lib/ProfileData/PGOCtxProfWriter.cpp
index 3d3da84817489..809672b97a8e4 100644
--- a/llvm/lib/ProfileData/PGOCtxProfWriter.cpp
+++ b/llvm/lib/ProfileData/PGOCtxProfWriter.cpp
@@ -13,17 +13,25 @@
 #include "llvm/ProfileData/PGOCtxProfWriter.h"
 #include "llvm/Bitstream/BitCodeEnums.h"
 #include "llvm/ProfileData/CtxInstrContextNode.h"
+#include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Error.h"
-#include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/YAMLTraits.h"
 #include "llvm/Support/raw_ostream.h"
 
 using namespace llvm;
 using namespace llvm::ctx_profile;
 
+static cl::opt<bool>
+    IncludeEmptyOpt("ctx-prof-include-empty", cl::init(false),
+                    cl::desc("Also write profiles with all-zero counters. "
+                             "Intended for testing/debugging."));
+
 PGOCtxProfileWriter::PGOCtxProfileWriter(
-    raw_ostream &Out, std::optional<unsigned> VersionOverride)
-    : Writer(Out, 0) {
+    raw_ostream &Out, std::optional<unsigned> VersionOverride,
+    bool IncludeEmpty)
+    : Writer(Out, 0),
+      IncludeEmpty(IncludeEmptyOpt.getNumOccurrences() > 0 ? IncludeEmptyOpt
+                                                           : IncludeEmpty) {
   static_assert(ContainerMagic.size() == 4);
   Out.write(ContainerMagic.data(), ContainerMagic.size());
   Writer.EnterBlockInfoBlock();
@@ -43,6 +51,10 @@ PGOCtxProfileWriter::PGOCtxProfileWriter(
     };
     DescribeBlock(PGOCtxProfileBlockIDs::ProfileMetadataBlockID, "Metadata");
     DescribeRecord(PGOCtxProfileRecords::Version, "Version");
+    DescribeBlock(PGOCtxProfileBlockIDs::ContextsSectionBlockID, "Contexts");
+    DescribeBlock(PGOCtxProfileBlockIDs::ContextRootBlockID, "Root");
+    DescribeRecord(PGOCtxProfileRecords::Guid, "GUID");
+    DescribeRecord(PGOCtxProfileRecords::Counters, "Counters");
     DescribeBlock(PGOCtxProfileBlockIDs::ContextNodeBlockID, "Context");
     DescribeRecord(PGOCtxProfileRecords::Guid, "GUID");
     DescribeRecord(PGOCtxProfileRecords::CalleeIndex, "CalleeIndex");
@@ -55,12 +67,16 @@ PGOCtxProfileWriter::PGOCtxProfileWriter(
                     SmallVector<unsigned, 1>({Version}));
 }
 
-void PGOCtxProfileWriter::writeCounters(const ContextNode &Node) {
+void PGOCtxProfileWriter::writeCounters(ArrayRef<uint64_t> Counters) {
   Writer.EmitCode(bitc::UNABBREV_RECORD);
   Writer.EmitVBR(PGOCtxProfileRecords::Counters, VBREncodingBits);
-  Writer.EmitVBR(Node.counters_size(), VBREncodingBits);
-  for (uint32_t I = 0U; I < Node.counters_size(); ++I)
-    Writer.EmitVBR64(Node.counters()[I], VBREncodingBits);
+  Writer.EmitVBR(Counters.size(), VBREncodingBits);
+  for (uint32_t I = 0U; I < Counters.size(); ++I)
+    Writer.EmitVBR64(Counters[I], VBREncodingBits);
+}
+
+void PGOCtxProfileWriter::writeGuid(ctx_profile::GUID Guid) {
+  Writer.EmitRecord(PGOCtxProfileRecords::Guid, SmallVector<uint64_t, 1>{Guid});
 }
 
 // recursively write all the subcontexts. We do need to traverse depth first to
@@ -69,13 +85,18 @@ void PGOCtxProfileWriter::writeCounters(const ContextNode &Node) {
 // keep the implementation simple.
 void PGOCtxProfileWriter::writeImpl(std::optional<uint32_t> CallerIndex,
                                     const ContextNode &Node) {
-  Writer.EnterSubblock(PGOCtxProfileBlockIDs::ContextNodeBlockID, CodeLen);
-  Writer.EmitRecord(PGOCtxProfileRecords::Guid,
-                    SmallVector<uint64_t, 1>{Node.guid()});
+  // A node with no counters is an error. We don't expect this to happen from
+  // the runtime, rather, this is interesting for testing the reader.
+  if (!IncludeEmpty && (Node.counters_size() > 0 && Node.entrycount() == 0))
+    return;
+  Writer.EnterSubblock(CallerIndex ? PGOCtxProfileBlockIDs::ContextNodeBlockID
+                                   : PGOCtxProfileBlockIDs::ContextRootBlockID,
+                       CodeLen);
+  writeGuid(Node.guid());
   if (CallerIndex)
     Writer.EmitRecord(PGOCtxProfileRecords::CalleeIndex,
                       SmallVector<uint64_t, 1>{*CallerIndex});
-  writeCounters(Node);
+  writeCounters({Node.counters(), Node.counters_size()});
   for (uint32_t I = 0U; I < Node.callsites_size(); ++I)
     for (const auto *Subcontext = Node.subContexts()[I]; Subcontext;
          Subcontext = Subcontext->next())
@@ -83,7 +104,13 @@ void PGOCtxProfileWriter::writeImpl(std::optional<uint32_t> CallerIndex,
   Writer.ExitBlock();
 }
 
-void PGOCtxProfileWriter::write(const ContextNode &RootNode) {
+void PGOCtxProfileWriter::startContextSection() {
+  Writer.EnterSubblock(PGOCtxProfileBlockIDs::ContextsSectionBlockID, CodeLen);
+}
+
+void PGOCtxProfileWriter::endContextSection() { Writer.ExitBlock(); }
+
+void PGOCtxProfileWriter::writeContextual(const ContextNode &RootNode) {
   writeImpl(std::nullopt, RootNode);
 }
 
@@ -96,6 +123,9 @@ struct SerializableCtxRepresentation {
   std::vector<uint64_t> Counters;
   std::vector<std::vector<SerializableCtxRepresentation>> Callsites;
 };
+struct SerializableProfileRepresentation {
+  std::vector<SerializableCtxRepresentation> Contexts;
+};
 
 ctx_profile::ContextNode *
 createNode(std::vector<std::unique_ptr<char[]>> &Nodes,
@@ -142,10 +172,16 @@ template <> struct yaml::MappingTraits<SerializableCtxRepresentation> {
   }
 };
 
+template <> struct yaml::MappingTraits<SerializableProfileRepresentation> {
+  static void mapping(yaml::IO &IO, SerializableProfileRepresentation &SPR) {
+    IO.mapOptional("Contexts", SPR.Contexts);
+  }
+};
+
 Error llvm::createCtxProfFromYAML(StringRef Profile, raw_ostream &Out) {
   yaml::Input In(Profile);
-  std::vector<SerializableCtxRepresentation> DCList;
-  In >> DCList;
+  SerializableProfileRepresentation SPR;
+  In >> SPR;
   if (In.error())
     return createStringError(In.error(), "incorrect yaml content");
   std::vector<std::unique_ptr<char[]>> Nodes;
@@ -153,12 +189,17 @@ Error llvm::createCtxProfFromYAML(StringRef Profile, raw_ostream &Out) {
   if (EC)
     return createStringError(EC, "failed to open output");
   PGOCtxProfileWriter Writer(Out);
-  for (const auto &DC : DCList) {
-    auto *TopList = createNode(Nodes, DC);
-    if (!TopList)
-      return createStringError(
-          "Unexpected error converting internal structure to ctx profile");
-    Writer.write(*TopList);
+
+  if (!SPR.Contexts.empty()) {
+    Writer.startContextSection();
+    for (const auto &DC : SPR.Contexts) {
+      auto *TopList = createNode(Nodes, DC);
+      if (!TopList)
+        return createStringError(
+            "Unexpected error converting internal structure to ctx profile");
+      Writer.writeContextual(*TopList);
+    }
+    Writer.endContextSection();
   }
   if (EC)
     return createStringError(EC, "failed to write output");
diff --git a/llvm/test/Analysis/CtxProfAnalysis/flatten-and-annotate.ll b/llvm/test/Analysis/CtxProfAnalysis/flatten-and-annotate.ll
index 9eedade925b01..20eaf59576855 100644
--- a/llvm/test/Analysis/CtxProfAnalysis/flatten-and-annotate.ll
+++ b/llvm/test/Analysis/CtxProfAnalysis/flatten-and-annotate.ll
@@ -59,14 +59,15 @@
 ; CHECK:       ![[AN_ENTRYPOINT_BW]] = !{!"branch_weights", i32 40, i32 60} 
 
 ;--- profile.yaml
-- Guid: 4909520559318251808
-  Counters: [100, 40]
-  Callsites: -
-              - Guid: 11872291593386833696
-                Counters: [ 100, 5 ]
-             -
-              - Guid: 11872291593386833696
-                Counters: [ 40, 10 ]
+Contexts:
+  - Guid: 4909520559318251808
+    Counters: [100, 40]
+    Callsites: -
+                - Guid: 11872291593386833696
+                  Counters: [ 100, 5 ]
+               -
+                - Guid: 11872291593386833696
+                  Counters: [ 40, 10 ]
 ;--- example.ll
 declare void @bar()
 
diff --git a/llvm/test/Analysis/CtxProfAnalysis/flatten-check-path.ll b/llvm/test/Analysis/CtxProfAnalysis/flatten-check-path.ll
index c84a72f60a3d0..eb697b69e2c02 100644
--- a/llvm/test/Analysis/CtxProfAnalysis/flatten-check-path.ll
+++ b/llvm/test/Analysis/CtxProfAnalysis/flatten-check-path.ll
@@ -39,8 +39,9 @@ exit:
 !0 = !{i64 1234}
 
 ;--- profile_ok.yaml
-- Guid: 1234 
-  Counters: [2, 2, 1, 2]
+Contexts:
+  - Guid: 1234 
+    Counters: [2, 2, 1, 2]
 
 ;--- message_pump.ll
 ; This is a message pump: the loop never exits. This should result in an
@@ -61,8 +62,9 @@ exit:
 !0 = !{i64 1234}
 
 ;--- profile_pump.yaml
-- Guid: 1234
-  Counters: [2, 10, 0]
+Contexts:
+  - Guid: 1234
+    Counters: [2, 10, 0]
 
 ;--- unreachable.ll
 ; An unreachable block is reached, that's an error
@@ -84,5 +86,6 @@ exit:
 !0 = !{i64 1234}
 
 ;--- profile_unreachable.yaml
-- Guid: 1234
-  Counters: [2, 1, 1, 2]
\ No newline at end of file
+Contexts:
+  - Guid: 1234
+    Counters: [2, 1, 1, 2]
diff --git a/llvm/test/Analysis/CtxProfAnalysis/flatten-icp.ll b/llvm/test/Analysis/CtxProfAnalysis/flatten-icp.ll
index 46c17377710d0..18f85e6f7f984 100644
--- a/llvm/test/Analysis/CtxProfAnalysis/flatten-icp.ll
+++ b/llvm/test/Analysis/CtxProfAnalysis/flatten-icp.ll
@@ -46,17 +46,18 @@ attributes #1 = { noinline }
 !2 = !{i64 4000}
 
...
[truncated]
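
The block nesting introduced by the writer changes above can be modeled without LLVM. What follows is a minimal sketch under stated assumptions, not the LLVM implementation: `ToyNode`, `ToyWriter`, and `Log` are hypothetical stand-ins, the base value 8 is assumed to match `bitc::FIRST_APPLICATION_BLOCKID`, and `Counters[0]` is assumed to play the role of `entrycount()`. It illustrates that a root lands in a `ContextRootBlockID` block, callees land in `ContextNodeBlockID` blocks, and nodes with a zero entry count are skipped unless `IncludeEmpty` is set.

```cpp
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Block IDs as laid out by this patch. The base value is an assumption
// standing in for bitc::FIRST_APPLICATION_BLOCKID.
enum BlockID : unsigned {
  ProfileMetadataBlockID = 8,
  ContextsSectionBlockID = ProfileMetadataBlockID + 1,
  ContextRootBlockID = ContextsSectionBlockID + 1,
  ContextNodeBlockID = ContextRootBlockID + 1,
};

// Hypothetical stand-in for ctx_profile::ContextNode.
struct ToyNode {
  std::uint64_t Guid;
  std::vector<std::uint64_t> Counters;         // Counters[0]: assumed entry count
  std::vector<std::vector<ToyNode>> Callsites; // outer index is the callsite ID
};

// Hypothetical writer that records (nesting depth, block ID) instead of bits.
struct ToyWriter {
  bool IncludeEmpty = false;
  unsigned Depth = 0;
  std::vector<std::pair<unsigned, BlockID>> Log;

  void enter(BlockID ID) { Log.emplace_back(Depth++, ID); }
  void exit() { --Depth; }

  void startContextSection() { enter(ContextsSectionBlockID); }
  void endContextSection() { exit(); }
  void writeContextual(const ToyNode &Root) { writeImpl(std::nullopt, Root); }

  void writeImpl(std::optional<std::uint32_t> CallerIndex, const ToyNode &N) {
    // Same predicate as in the patch: skip nodes whose entry count is zero.
    if (!IncludeEmpty && !N.Counters.empty() && N.Counters[0] == 0)
      return;
    // No caller index means this node is a root.
    enter(CallerIndex ? ContextNodeBlockID : ContextRootBlockID);
    for (std::uint32_t I = 0; I < N.Callsites.size(); ++I)
      for (const ToyNode &Sub : N.Callsites[I])
        writeImpl(I, Sub);
    exit();
  }
};
```

Writing one root with a live callee and a zero-entry callee through this toy logs three entries (section, root, one context); setting `IncludeEmpty` adds a fourth for the zero-entry callee.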

@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_nfc_prepare_ctxprofanalysis_for_flat_profiles branch from a8e4d64 to 3ab0c19 Compare March 4, 2025 16:47
@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_prepare_profile_format_for_flat_profiles branch 2 times, most recently from 47fb287 to f1efe7d Compare March 4, 2025 16:48
@mtrofin mtrofin requested a review from snehasish March 4, 2025 16:54
@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_nfc_prepare_ctxprofanalysis_for_flat_profiles branch from 3ab0c19 to 5ecc730 Compare March 4, 2025 16:55
@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_prepare_profile_format_for_flat_profiles branch from f1efe7d to 4be71c3 Compare March 4, 2025 16:55
Contributor

@kazutakahirata kazutakahirata left a comment

LGTM with a minor comment about a range-based for loop.

@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_nfc_prepare_ctxprofanalysis_for_flat_profiles branch from 5ecc730 to 0c5d403 Compare March 4, 2025 18:40
@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_prepare_profile_format_for_flat_profiles branch 2 times, most recently from c1361a8 to 4184827 Compare March 4, 2025 18:45
@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_nfc_prepare_ctxprofanalysis_for_flat_profiles branch from 0c5d403 to e06be88 Compare March 4, 2025 18:53
@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_prepare_profile_format_for_flat_profiles branch from 4184827 to ee68f77 Compare March 4, 2025 18:53
@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_nfc_prepare_ctxprofanalysis_for_flat_profiles branch from e06be88 to 5f470a4 Compare March 4, 2025 20:43
@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_prepare_profile_format_for_flat_profiles branch from ee68f77 to 87ad9f4 Compare March 4, 2025 20:43
@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_nfc_prepare_ctxprofanalysis_for_flat_profiles branch from 5f470a4 to c8db998 Compare March 4, 2025 22:34
@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_prepare_profile_format_for_flat_profiles branch from 87ad9f4 to 583f74b Compare March 4, 2025 22:34
Base automatically changed from users/mtrofin/03-03-_ctxprof_nfc_prepare_ctxprofanalysis_for_flat_profiles to main March 5, 2025 00:42
@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_prepare_profile_format_for_flat_profiles branch 7 times, most recently from 19e8074 to 9cd9f3a Compare March 5, 2025 01:34
@mtrofin mtrofin force-pushed the users/mtrofin/03-03-_ctxprof_prepare_profile_format_for_flat_profiles branch from 9cd9f3a to 0c3056d Compare March 5, 2025 02:02
Contributor

@snehasish snehasish left a comment

lgtm

@@ -201,7 +205,6 @@ class PGOCtxProfileReader final {
   Expected<PGOCtxProfile> loadProfiles();
 };
 
-void convertCtxProfToYaml(raw_ostream &OS,
-                          const PGOCtxProfContext::CallTargetMapTy &);
+void convertCtxProfToYaml(raw_ostream &OS, const PGOCtxProfile &);
Contributor

nit: name the second param too.

Member Author

addressed in PR #129626

@@ -60,23 +62,30 @@ enum PGOCtxProfileBlockIDs {
 /// like value profiling - which would appear as additional records. For
 /// example, value profiling would produce a new record with a new record ID,
 /// containing the profiled values (much like the counters)
-class PGOCtxProfileWriter final {
+class PGOCtxProfileWriter : public ctx_profile::ProfileWriter {
Contributor

Can we still keep the `final` keyword here?

Member Author

addressed in PR #129626

+  bool canEnterBlockWithID(PGOCtxProfileBlockIDs ID);
+  Error enterBlockWithID(PGOCtxProfileBlockIDs ID);
+
+  Error loadContexts(CtxProfContextualProfiles &);
Contributor

nit: name the parameter.

Member Author

addressed in PR #129626

class ProfileWriter {
public:
virtual void startContextSection() = 0;
Contributor

You could have this method return an object which runs endContextSection on destruction. Then we wouldn't have to worry about pairing up the calls to start and end each time. Perhaps it's too much for a simple interface like this though.

Member Author

I thought of that but yes, seemed overkill. It's a low-level interface, really meant to go between compiler-rt and whatever service handles writing the profile.
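
For reference, the scoped-object idea discussed in this thread could look roughly like the following. This is a sketch only, not part of the patch: `ContextSectionScope` and `RecordingWriter` are hypothetical names, and the `ProfileWriter` stand-in here models just the two section methods quoted above rather than the full interface.

```cpp
#include <string>
#include <vector>

// Reduced stand-in for the interface under review; only the section
// methods are modeled here.
class ProfileWriter {
public:
  virtual void startContextSection() = 0;
  virtual void endContextSection() = 0;
  virtual ~ProfileWriter() = default;
};

// Hypothetical RAII guard: opens the section on construction and closes it
// on destruction, so the start/end calls cannot be left unpaired.
class ContextSectionScope {
  ProfileWriter &W;

public:
  explicit ContextSectionScope(ProfileWriter &Writer) : W(Writer) {
    W.startContextSection();
  }
  ~ContextSectionScope() { W.endContextSection(); }

  // Copying would close the section twice, so it is disabled.
  ContextSectionScope(const ContextSectionScope &) = delete;
  ContextSectionScope &operator=(const ContextSectionScope &) = delete;
};

// Hypothetical test double that records the order of calls.
struct RecordingWriter final : ProfileWriter {
  std::vector<std::string> Events;
  void startContextSection() override { Events.push_back("start"); }
  void endContextSection() override { Events.push_back("end"); }
};
```

A caller would declare a `ContextSectionScope` at the top of a block, write its roots, and let the closing brace end the section. The trade-off raised in the reply still applies: the guard adds a type to an interface that is meant to stay low-level.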

Member Author

mtrofin commented Mar 5, 2025

Merge activity

  • Mar 5, 10:20 AM EST: A user started a stack merge that includes this pull request via Graphite.
  • Mar 5, 10:22 AM EST: A user merged this pull request with Graphite.

@mtrofin mtrofin merged commit 5223ddd into main Mar 5, 2025
11 checks passed
@mtrofin mtrofin deleted the users/mtrofin/03-03-_ctxprof_prepare_profile_format_for_flat_profiles branch March 5, 2025 15:22
jph-13 pushed a commit to jph-13/llvm-project that referenced this pull request Mar 21, 2025
jph-13 pushed a commit to jph-13/llvm-project that referenced this pull request Mar 21, 2025
Labels
compiler-rt llvm:analysis llvm:transforms LTO Link time optimization (regular/full LTO or ThinLTO) PGO Profile Guided Optimizations
4 participants