Skip to content

obj2yaml: Introduce CovMap dump #127432

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: users/chapuni/yaml/enc
Choose a base branch
from

Conversation

chapuni
Copy link
Contributor

@chapuni chapuni commented Feb 17, 2025

obj2yaml is able to understand and dump Coverage Mapping related sections with --covmap-raw.

For now, this is dedicated to ELF and disabled by default. Affected sections are as below:

  • __llvm_covmap
  • __llvm_covfun
  • __llvm_prf_names

Depends on #129472

@llvmbot llvmbot added PGO Profile Guided Optimizations bazel "Peripheral" support tier build system: utils/bazel objectyaml labels Feb 17, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 17, 2025

@llvm/pr-subscribers-pgo

@llvm/pr-subscribers-objectyaml

Author: NAKAMURA Takumi (chapuni)

Changes

obj2yaml is able to understand and dump Coverage Mapping related sections.

For now, this is dedicated to ELF and disabled by default. Affected sections are as below:

  • __llvm_covmap
  • __llvm_covfun
  • __llvm_prf_names

There are options for enabling CovMap in obj2yaml.

  • --covmap dumps CovMap sections prettyprinted.
    • --covmap-dloc dumps location information as dLoc, that stores line numbers as delta.
  • --covmap-raw dumps CovMap sections as raw level oriented. Name indices are not resolved but shown as raw numbers.

yaml2obj can understand and encode CovMap sections without any options. For now, yaml2obj assumes input files generated by obj2yaml. yaml2obj is expected to generate identical CovMap sections dumped by obj2yaml.

obj2yaml --covmap eliminates redundant elements shown by --covmap-raw if they are recoverable with other data.

  • CovMapTy::Filenames is reconstructible with file names in CovFunTy::Files. This is applicable to relocatable object files but not always to linked files due to gc-sections.
  • CovFunTy::FileIDs is eliminated and expanded onto Filename in CovFunTy::Files.
  • PrfNames is not eliminated since they are picked up in clangCodeGen and not recoverable with section order.

There are a few namespaces. Other implementations are sunk into anonymous namespace.

  • llvm::coverage::yaml contains YAML definitions and encoder/decoder interfaces of CovMap data. This can be split out of ObjectYAML and embedded into other component like Coverage.
  • llvm::covmap defines interfaces to ObjectYAML.

Test input files in llvm/test/tools/llvm-cov/Inputs are updated with obj2yaml --covmap --dloc. I think dLoc is more tolerant than absolute Loc for updates.


Patch is 129.99 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/127432.diff

20 Files Affected:

  • (added) llvm/include/llvm/ObjectYAML/CovMap.h (+385)
  • (modified) llvm/include/llvm/ObjectYAML/ELFYAML.h (+14)
  • (modified) llvm/include/llvm/ProfileData/InstrProf.h (+6)
  • (modified) llvm/lib/ObjectYAML/CMakeLists.txt (+3)
  • (added) llvm/lib/ObjectYAML/CovMap.cpp (+977)
  • (modified) llvm/lib/ObjectYAML/ELFEmitter.cpp (+38)
  • (modified) llvm/lib/ObjectYAML/ELFYAML.cpp (+12)
  • (modified) llvm/lib/ProfileData/InstrProf.cpp (+20-3)
  • (modified) llvm/test/tools/llvm-cov/Inputs/branch-c-general-single.yaml (+467-94)
  • (modified) llvm/test/tools/llvm-cov/Inputs/branch-logical-mixed-single.yaml (+121-24)
  • (modified) llvm/test/tools/llvm-cov/Inputs/branch-macros-single.yaml (+211-31)
  • (modified) llvm/test/tools/llvm-cov/Inputs/branch-templates-single.yaml (+81-38)
  • (modified) llvm/test/tools/llvm-cov/Inputs/showLineExecutionCounts-single.yaml (+40-17)
  • (modified) llvm/test/tools/llvm-cov/Inputs/yaml.makefile (+3-3)
  • (added) llvm/test/tools/obj2yaml/ELF/covmap.yaml (+160)
  • (modified) llvm/test/tools/obj2yaml/help.test (+1)
  • (modified) llvm/tools/obj2yaml/elf2yaml.cpp (+56-1)
  • (modified) llvm/tools/obj2yaml/obj2yaml.cpp (+1-1)
  • (modified) llvm/tools/obj2yaml/obj2yaml.h (+4)
  • (modified) utils/bazel/llvm-project-overlay/llvm/BUILD.bazel (+2)
diff --git a/llvm/include/llvm/ObjectYAML/CovMap.h b/llvm/include/llvm/ObjectYAML/CovMap.h
new file mode 100644
index 0000000000000..fa9492fc1ee72
--- /dev/null
+++ b/llvm/include/llvm/ObjectYAML/CovMap.h
@@ -0,0 +1,385 @@
+//===- CovMap.h - ObjectYAML Interface for coverage map ---------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// - llvm::coverage::yaml
+//
+//   Describes binary file formats and YAML structures of coverage map.
+//
+// - llvm::yaml
+//
+//   Attachments for YAMLTraits.
+//
+// - llvm::covmap
+//
+//   Provides YAML encoder and decoder for coverage map.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_OBJECTYAML_COVMAP_H
+#define LLVM_OBJECTYAML_COVMAP_H
+
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/ObjectYAML/ELFYAML.h"
+#include "llvm/Support/Error.h"
+#include "llvm/Support/YAMLTraits.h"
+#include <cstdint>
+#include <memory>
+#include <optional>
+#include <variant>
+
+namespace llvm {
+class InstrProfSymtab;
+class raw_ostream;
+} // namespace llvm
+
+namespace llvm::coverage::yaml {
+
+/// This works like vector container but can be replaced with
+/// MutableArrayRef. See also SequenceTraits<VectorOrRef>.
+template <typename T, typename Vec = std::vector<T>> class VectorOrRef {
+  using Ref = MutableArrayRef<T>;
+
+  /// Holds vector type initially.
+  std::variant<Vec, Ref> Array;
+
+public:
+  // FIXME: Iterator impl is minimal easy.
+  using iterator = T *;
+
+  iterator begin() {
+    if (auto *V = std::get_if<Vec>(&Array))
+      return &V->front();
+    else
+      return &std::get<Ref>(Array).front();
+  }
+
+  iterator end() {
+    if (auto *V = std::get_if<Vec>(&Array))
+      return &V->back() + 1;
+    else
+      return &std::get<Ref>(Array).back() + 1;
+  }
+
+  size_t size() const {
+    if (const auto *V = std::get_if<Vec>(&Array))
+      return V->size();
+    else
+      return std::get<Ref>(Array).size();
+  }
+
+  T &operator[](int Idx) {
+    if (auto *V = std::get_if<Vec>(&Array))
+      return (*V)[Idx];
+    else
+      return std::get<Ref>(Array)[Idx];
+  }
+
+  void resize(size_t Size) { std::get<Vec>(Array).resize(Size); }
+
+  VectorOrRef() = default;
+
+  /// Initialize with MutableArrayRef.
+  VectorOrRef(Ref &&Tmp) : Array(std::move(Tmp)) {}
+};
+
+/// Options for Decoder.
+struct DecoderParam {
+  bool Detailed; ///< Generate and show processed records.
+  bool Raw;      ///< Show raw data oriented records.
+  bool dLoc;     ///< Show raw dLoc (differential Loc).
+};
+
+struct DecoderContext;
+
+/// Base Counter, corresponding to coverage::Counter.
+struct CounterTy {
+  enum TagTy : uint8_t {
+    Zero = 0,
+    Ref,
+    Sub,
+    Add,
+  };
+
+  /// Optional in detailed view, since most Tag can be determined from
+  /// other optional fields.
+  std::optional<TagTy> Tag;
+
+  /// Internal use.
+  std::optional<uint64_t> Val;
+
+  std::optional<uint64_t> RefOpt;
+  std::optional<uint64_t> SubOpt;
+  std::optional<uint64_t> AddOpt;
+
+  virtual ~CounterTy() {}
+
+  virtual void mapping(llvm::yaml::IO &IO);
+
+  /// Holds Val for extensions.
+  Error decodeOrTag(DecoderContext &Data);
+
+  /// Raise Error if Val isn't empty.
+  Error decode(DecoderContext &Data);
+
+  void encode(raw_ostream &OS) const;
+};
+
+/// Holds a pair of both hands but doesn't hold ops(add or sub).
+/// Ops is stored in CounterTy::Tag.
+using ExpressionTy = std::array<CounterTy, 2>;
+
+/// {True, False}
+using BranchTy = std::array<CounterTy, 2>;
+
+/// {ID, TrueID, FalseID}
+/// Note: This has +1 offset unlike mcdc::ConditionID.
+using MCDCBranchTy = std::array<uint16_t, 3>;
+
+struct DecisionTy {
+  uint64_t BIdx; ///< Bitmap index
+  uint64_t NC;   ///< NumConds
+
+  void mapping(llvm::yaml::IO &IO);
+
+  Error decode(DecoderContext &Data);
+
+  void encode(raw_ostream &OS) const;
+};
+
+/// {LineStart, ColumnStart, LineEnd, ColumnEnd}
+using LocTy = std::array<uint64_t, 4>;
+
+///
+struct RecTy : CounterTy {
+  enum ExtTagTy : uint8_t {
+    Skip = 2,
+    Branch = 4,
+    Decision = 5,
+    MCDCBranch = 6,
+  };
+
+  /// This is optional in detailed view.
+  std::optional<ExtTagTy> ExtTag;
+
+  // Options for extensions.
+  std::optional<uint64_t> Expansion; ///< Doesn't have ExtTag.
+  std::optional<BranchTy> BranchOpt; ///< Optionally has MCDC.
+  std::optional<MCDCBranchTy> MCDC;
+  std::optional<DecisionTy> DecisionOpt;
+
+  /// True or None.
+  /// Stored in ColumnEnd:31.
+  std::optional<bool> isGap;
+
+  std::optional<LocTy> Loc;  ///< Absolute line numbers.
+  std::optional<LocTy> dLoc; ///< Differential line numbers.
+
+  void mapping(llvm::yaml::IO &IO) override;
+
+  Error decode(DecoderContext &Data);
+
+  void encode(uint64_t &StartLoc, raw_ostream &OS) const;
+};
+
+/// {NumRecs, Recs...}
+struct FileRecsTy {
+  std::optional<unsigned> Index;       ///< Shown in detailed view.
+  std::optional<std::string> Filename; ///< Resolved by FileIDs.
+  std::vector<RecTy> Recs;
+
+  void mapping(llvm::yaml::IO &IO);
+};
+
+/// Key is FilenamesRef.
+using CovMapByRefTy = llvm::DenseMap<uint64_t, struct CovMapTy *>;
+
+/// An element of CovFun array.
+struct CovFunTy {
+  std::optional<llvm::yaml::Hex64> NameRef;     ///< Hash value of the symbol.
+  std::optional<std::string> FuncName;          ///< Resolved by symtab.
+  llvm::yaml::Hex64 FuncHash;                   ///< Signature of this function.
+  llvm::yaml::Hex64 FilenamesRef;               ///< Pointer to CovMap
+  std::optional<std::vector<unsigned>> FileIDs; ///< Resolved by CovMap
+  std::vector<ExpressionTy> Expressions;
+  std::vector<FileRecsTy> Files; ///< 2-dimension array of Recs.
+
+  void mapping(llvm::yaml::IO &IO);
+
+  /// Depends on CovMap and SymTab(IPSK_names)
+  Expected<uint64_t> decode(CovMapByRefTy &CovMapByRef, InstrProfSymtab *SymTab,
+                            const ArrayRef<uint8_t> Content, uint64_t Offset,
+                            const DecoderParam &Param, bool IsLE = true);
+
+  void encode(raw_ostream &OS) const;
+};
+
+/// An element of CovMap array.
+struct CovMapTy {
+  /// This is the key of CovMap but not present in the file
+  /// format. Calculate and store with Filenames.
+  llvm::yaml::Hex64 FilenamesRef;
+
+  std::optional<uint32_t> Version;
+
+  /// Raw Filenames (and storage of Files)
+  std::optional<std::vector<std::string>> Filenames;
+
+  /// Since Version5: Filenames[0] is the working directory (or
+  /// zero-length string). Note that indices in CovFun::FileIDs is
+  /// base on Filenames. (Then, +0, as WD, is not expected to appear)
+  std::optional<std::string> WD;
+  /// This may be ArrayRef in Decoder since Filenames has been
+  /// filled. On the other hand in Encoder, this should be a vector
+  /// since YAML parser doesn't endorse references.
+  std::optional<VectorOrRef<std::string>> Files;
+
+  void mapping(llvm::yaml::IO &IO);
+
+  bool useWD() const { return (!Version || *Version >= 4); }
+  StringRef getWD() const { return (WD ? *WD : StringRef()); }
+
+  Expected<uint64_t> decode(const ArrayRef<uint8_t> Content, uint64_t Offset,
+                            const DecoderParam &Param, bool IsLE = true);
+
+  /// Generate Accumulated list with WD.
+  /// Returns a single element {WD} if AccFiles is not given.
+  std::vector<std::string>
+  generateAccFilenames(const std::optional<ArrayRef<StringRef>> &AccFilesOpt =
+                           std::nullopt) const;
+  /// Regenerate Filenames with WD.
+  /// Use Files if it is not None. Or given AccFiles is used.
+  void
+  regenerateFilenames(const std::optional<ArrayRef<StringRef>> &AccFilesOpt);
+
+  /// Encode Filenames. This is mostly used just to obtain FilenamesRef.
+  std::pair<uint64_t, std::string> encodeFilenames(
+      const std::optional<ArrayRef<StringRef>> &AccFilesOpt = std::nullopt,
+      bool Compress = false) const;
+
+  void encode(raw_ostream &OS) const;
+};
+
+} // namespace llvm::coverage::yaml
+
+namespace llvm::yaml {
+template <typename T>
+struct SequenceTraits<llvm::coverage::yaml::VectorOrRef<T>> {
+  static size_t size(IO &io, llvm::coverage::yaml::VectorOrRef<T> &seq) {
+    return seq.size();
+  }
+  static T &element(IO &, llvm::coverage::yaml::VectorOrRef<T> &seq,
+                    size_t index) {
+    if (index >= seq.size())
+      seq.resize(index + 1);
+    return seq[index];
+  }
+};
+} // namespace llvm::yaml
+
+LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::coverage::yaml::CovMapTy)
+LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::coverage::yaml::CovFunTy)
+LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::coverage::yaml::ExpressionTy)
+LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::coverage::yaml::RecTy)
+LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::coverage::yaml::FileRecsTy)
+LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR(llvm::coverage::yaml::CounterTy)
+
+#define LLVM_COVERAGE_YAML_ELEM_MAPPING(Ty)                                    \
+  namespace llvm::yaml {                                                       \
+  template <> struct MappingTraits<llvm::coverage::yaml::Ty> {                 \
+    static void mapping(IO &IO, llvm::coverage::yaml::Ty &Obj) {               \
+      Obj.mapping(IO);                                                         \
+    }                                                                          \
+  };                                                                           \
+  }
+
+/// `Flow` is used for emission of a compact oneliner for RecTy.
+#define LLVM_COVERAGE_YAML_ELEM_MAPPING_FLOW(Ty)                               \
+  namespace llvm::yaml {                                                       \
+  template <> struct MappingTraits<llvm::coverage::yaml::Ty> {                 \
+    static void mapping(IO &IO, llvm::coverage::yaml::Ty &Obj) {               \
+      Obj.mapping(IO);                                                         \
+      (void)flow;                                                              \
+    }                                                                          \
+    static const bool flow = true;                                             \
+  };                                                                           \
+  }
+
+#define LLVM_COVERAGE_YAML_ENUM(Ty)                                            \
+  namespace llvm::yaml {                                                       \
+  template <> struct ScalarEnumerationTraits<llvm::coverage::yaml::Ty> {       \
+    static void enumeration(IO &IO, llvm::coverage::yaml::Ty &Value);          \
+  };                                                                           \
+  }
+
+LLVM_COVERAGE_YAML_ENUM(CounterTy::TagTy)
+LLVM_COVERAGE_YAML_ENUM(RecTy::ExtTagTy)
+LLVM_COVERAGE_YAML_ELEM_MAPPING_FLOW(CounterTy)
+LLVM_COVERAGE_YAML_ELEM_MAPPING_FLOW(DecisionTy)
+LLVM_COVERAGE_YAML_ELEM_MAPPING_FLOW(RecTy)
+LLVM_COVERAGE_YAML_ELEM_MAPPING(FileRecsTy)
+LLVM_COVERAGE_YAML_ELEM_MAPPING(CovFunTy)
+LLVM_COVERAGE_YAML_ELEM_MAPPING(CovMapTy)
+
+namespace llvm::covmap {
+
+class Decoder {
+public:
+  virtual ~Decoder() {}
+
+  /// Returns DecoderImpl.
+  static std::unique_ptr<Decoder>
+  get(const coverage::yaml::DecoderParam &Param);
+
+  /// Called from the Sections loop in advance of the final dump.
+  /// Decoder predecodes Names and CovMap, and captures Contents of
+  /// CovFuns.
+  virtual Error
+  acquire(uint64_t Offset, unsigned AddressAlign, StringRef Name,
+          std::function<Expected<ArrayRef<uint8_t>>()> getSectionContents) = 0;
+
+  /// Called before the final dump after `acquire`.
+  /// Decode contents partially and resolve names.
+  virtual Error fixup() = 0;
+
+  /// Create an ELFYAML object. This just puts predecoded data in
+  /// `fixup`.
+  virtual Expected<ELFYAML::Section *>
+  make(uint64_t Offset, StringRef Name,
+       std::function<Error(ELFYAML::Section &S)> dumpCommonSection) = 0;
+
+  /// Suppress emission of CovMap unless enabled.
+  static bool enabled;
+};
+
+class Encoder {
+public:
+  virtual ~Encoder() {}
+
+  /// Returns EncoderImpl.
+  static std::unique_ptr<Encoder> get();
+
+  /// Called from the Sections loop.
+  virtual void collect(ELFYAML::Chunk *Chunk) = 0;
+
+  /// Resolves names along DecoderParam in advance of Emitter. It'd be
+  /// too late to resolve sections in Emitter since they are immutable
+  /// then.
+  virtual void fixup() = 0;
+};
+
+/// Returns whether Name is interested.
+bool nameMatches(StringRef Name);
+
+/// Returns a new ELFYAML Object.
+std::unique_ptr<ELFYAML::Section> make_unique(StringRef Name);
+
+} // namespace llvm::covmap
+
+#endif // LLVM_OBJECTYAML_COVMAP_H
diff --git a/llvm/include/llvm/ObjectYAML/ELFYAML.h b/llvm/include/llvm/ObjectYAML/ELFYAML.h
index dfdfa055d65fa..2dc1a861077c3 100644
--- a/llvm/include/llvm/ObjectYAML/ELFYAML.h
+++ b/llvm/include/llvm/ObjectYAML/ELFYAML.h
@@ -229,6 +229,7 @@ struct Chunk {
     DependentLibraries,
     CallGraphProfile,
     BBAddrMap,
+    CovMapBase,
 
     // Special chunks.
     SpecialChunksStart,
@@ -398,6 +399,19 @@ struct RawContentSection : Section {
   std::optional<std::vector<uint8_t>> ContentBuf;
 };
 
+struct CovMapSectionBase : Section {
+  std::optional<llvm::yaml::Hex64> Info;
+
+  CovMapSectionBase() : Section(ChunkKind::CovMapBase) {}
+
+  virtual void mapping(yaml::IO &IO) = 0;
+  virtual Error encode(raw_ostream &OS) const = 0;
+
+  static bool classof(const Chunk *S) {
+    return S->Kind == ChunkKind::CovMapBase;
+  }
+};
+
 struct NoBitsSection : Section {
   NoBitsSection() : Section(ChunkKind::NoBits) {}
 
diff --git a/llvm/include/llvm/ProfileData/InstrProf.h b/llvm/include/llvm/ProfileData/InstrProf.h
index 7133c0c6a302c..e20424da3cac2 100644
--- a/llvm/include/llvm/ProfileData/InstrProf.h
+++ b/llvm/include/llvm/ProfileData/InstrProf.h
@@ -545,6 +545,12 @@ class InstrProfSymtab {
   /// This method is a wrapper to \c readAndDecodeStrings method.
   Error create(StringRef NameStrings);
 
+  // PrfNames is nested array.
+  using PrfNamesTy = SmallVector<std::string>;
+  using PrfNamesChunksTy = SmallVector<PrfNamesTy, 1>;
+
+  Expected<PrfNamesChunksTy> createAndGetList(ArrayRef<uint8_t> Content);
+
   /// Initialize symtab states with function names and vtable names. \c
   /// FuncNameStrings is a string composed of one or more encoded function name
   /// strings, and \c VTableNameStrings composes of one or more encoded vtable
diff --git a/llvm/lib/ObjectYAML/CMakeLists.txt b/llvm/lib/ObjectYAML/CMakeLists.txt
index b36974d47d9f8..11054a1e91388 100644
--- a/llvm/lib/ObjectYAML/CMakeLists.txt
+++ b/llvm/lib/ObjectYAML/CMakeLists.txt
@@ -7,6 +7,7 @@ add_llvm_component_library(LLVMObjectYAML
   CodeViewYAMLTypes.cpp
   COFFEmitter.cpp
   COFFYAML.cpp
+  CovMap.cpp
   DWARFEmitter.cpp
   DWARFYAML.cpp
   DXContainerEmitter.cpp
@@ -34,7 +35,9 @@ add_llvm_component_library(LLVMObjectYAML
 
   LINK_COMPONENTS
   BinaryFormat
+  Coverage
   Object
+  ProfileData
   Support
   TargetParser
   DebugInfoCodeView
diff --git a/llvm/lib/ObjectYAML/CovMap.cpp b/llvm/lib/ObjectYAML/CovMap.cpp
new file mode 100644
index 0000000000000..12c03a651668c
--- /dev/null
+++ b/llvm/lib/ObjectYAML/CovMap.cpp
@@ -0,0 +1,977 @@
+//===- CovMap.cpp - ObjectYAML Interface for coverage map -----------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Implementations of CovMap, encoder, decoder.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/ObjectYAML/CovMap.h"
+#include "llvm/ADT/MapVector.h"
+#include "llvm/ObjectYAML/ELFYAML.h"
+#include "llvm/ProfileData/Coverage/CoverageMapping.h"
+#include "llvm/ProfileData/Coverage/CoverageMappingReader.h"
+#include "llvm/ProfileData/Coverage/CoverageMappingWriter.h"
+#include "llvm/ProfileData/InstrProf.h"
+#include "llvm/Support/Alignment.h"
+#include "llvm/Support/DataExtractor.h"
+#include "llvm/Support/LEB128.h"
+#include "llvm/Support/MD5.h"
+#include "llvm/Support/YAMLTraits.h"
+#include <cstdint>
+
+#define COVMAP_V3
+
+using namespace llvm;
+using namespace llvm::coverage::yaml;
+using namespace llvm::covmap;
+
+bool Decoder::enabled;
+
+// DataExtractor w/ single Cursor
+struct coverage::yaml::DecoderContext : DataExtractor,
+                                        DataExtractor::Cursor,
+                                        DecoderParam {
+  uint64_t LineStart = 0;
+
+  DecoderContext(const ArrayRef<uint8_t> Content, const DecoderParam &Param,
+                 bool IsLE)
+      : DataExtractor(Content, IsLE, /*AddressSize=*/0),
+        DataExtractor::Cursor(0), DecoderParam(Param) {}
+
+  bool eof() { return DataExtractor::eof(*this); }
+  uint32_t getU32() { return DataExtractor::getU32(*this); }
+  uint64_t getU64() { return DataExtractor::getU64(*this); }
+  Expected<uint64_t> getULEB128() {
+    uint64_t Result = DataExtractor::getULEB128(*this);
+    if (!*this)
+      return takeError();
+    return Result;
+  }
+  StringRef getBytes(size_t sz) { return DataExtractor::getBytes(*this, sz); }
+};
+
+void CounterTy::encode(raw_ostream &OS) const {
+  std::pair<unsigned, uint64_t> C;
+  if (RefOpt)
+    C = {Ref, *RefOpt};
+  else if (SubOpt)
+    C = {Sub, *SubOpt};
+  else if (AddOpt)
+    C = {Add, *AddOpt};
+  else if (Tag && *Tag == Zero)
+    C = {Zero, 0u};
+  else if (Tag && Val)
+    C = {*Tag, *Val};
+  else
+    llvm_unreachable("Null value cannot be met");
+
+  encodeULEB128(C.first | (C.second << 2), OS);
+}
+
+Error CounterTy::decodeOrTag(DecoderContext &Data) {
+  auto COrErr = Data.getULEB128();
+  if (!COrErr)
+    return COrErr.takeError();
+  auto T = static_cast<TagTy>(*COrErr & 0x03);
+  auto V = (*COrErr >> 2);
+  if (T == Zero) {
+    if (V == 0)
+      Tag = Zero; // w/o Val
+    else
+      Val = V; // w/o Tag
+  } else {
+    if (Data.Raw) {
+      Tag = T;
+      Val = V;
+    } else {
+      switch (T) {
+      case Zero:
+        llvm_unreachable("Zero should be handled in advance");
+      case Ref:
+        RefOpt = V;
+        break;
+      case Sub:
+        SubOpt = V;
+        break;
+      case Add:
+        AddOpt = V;
+        break;
+      }
+    }
+  }
+
+  return Error::success();
+}
+
+Error CounterTy::decode(DecoderContext &Data) {
+  if (auto E = decodeOrTag(Data))
+    return E;
+  if (!this->Tag && this->Val)
+    return make_error<CoverageMapError>(
+        coveragemap_error::malformed,
+        "Counter::Zero shouldn't have the Val: 0x" +
+            Twine::utohexstr(*this->Val));
+  return Error::success();
+}
+
+void DecisionTy::encode(raw_ostream &OS) const {
+  encodeULEB128(BIdx, OS);
+  encodeULEB128(NC, OS);
+}
+
+Error DecisionTy::decode(DecoderContext &Data) {
+  auto BIdxOrErr = Data.getULEB128();
+  if (!BIdxOrErr)
+    return BIdxOrErr.takeError();
+  BIdx = *BIdxOrErr;
+
+  auto NCOrErr = Data.getULEB128();
+  if (!NCOrErr)
+    return NCOrErr.takeError();
+  NC = *NCOrErr;
+
+  return Error::success();
+}
+
+void RecTy::encode(uint64_t &StartLoc, raw_ostream &OS) const {
+  if (Expansion) {
+    encodeULEB128(4 + (*Expansion << 3), OS);
+  } else if (ExtTag && *ExtTag == Skip) {
+    encodeULEB128(2 << 3, OS);
+  } else if (DecisionOpt) {
+    assert(!ExtTag || *ExtTag == Decision);
+    encodeULEB128(5 << 3, OS);
+    DecisionOpt->encode(OS);
+  } else if (MCDC) {
+    assert(!ExtTag || *ExtTag == MCDCBranch);
+    assert(BranchOpt);
+    encodeULEB128(6 << 3, OS);
+    (*BranchOpt)[0].encode(OS);
+    (*BranchOpt)[1].encode(OS);
+    encodeULEB128((*MCDC)[0], OS);
+    encodeULEB128((*MCDC)[1], OS);
+    encodeULEB128((*MCDC)[2], OS);
+  } else if (BranchOpt) {
+    assert(!ExtTag || *ExtTag == Branch);
+    encodeULEB128(4 << 3, OS);
+    (*BranchOpt)[0].encode(OS);
+    (*BranchOpt)[1].encode(OS);
+  } else {
+    // Non-tag CounterTy
+    CounterTy::encode(OS);
+  }
+
+  assert((!isGap || *isGap) && "Don't set isGap=false");
+  uint32_t Gap = (isGap ? (1u << 31) : 0u);
+  if (Loc) {
+    encodeULEB128((*Loc)[0] - St...
[truncated]

Copy link

github-actions bot commented Feb 17, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

iterator begin() {
if (auto *V = std::get_if<Vec>(&Array))
return &V->front();
else
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer symmetry here but I'll follow.

iterator end() {
if (auto *V = std::get_if<Vec>(&Array))
return &V->back() + 1;
else
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://llvm.org/docs/CodingStandards.html#don-t-use-else-after-a-return

and so on in other places throughout the code

iterator begin() {
if (auto *V = std::get_if<Vec>(&Array))
return &V->front();
else
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iterator end() {
if (auto *V = std::get_if<Vec>(&Array))
return &V->back() + 1;
else
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_t size() const {
if (const auto *V = std::get_if<Vec>(&Array))
return V->size();
else
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

T &operator[](int Idx) {
if (auto *V = std::get_if<Vec>(&Array))
return (*V)[Idx];
else
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cl::desc("Dump raw YAML in Coverage Map."),
cl::cat(ELFCat));
static cl::opt<bool> CovMapDLoc("covmap-dloc",
cl::desc("Prefer dLoc thatn absolute Loc."),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/thatn/over

Recommend rephrasing to

Prefer differential line numbers over absolute line numbers in dump.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I'd like to assume users are developers and they can understand Loc and dLoc. I suppose they are developers' options.

@@ -580,6 +596,31 @@ ELFDumper<ELFT>::dumpSections() {
return Error::success();
};

coverage::yaml::DecoderParam Param;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coverage::yaml::DecoderParam is only relevant when covmap::Decoder::enabled is true, right?

Would it make sense to sink all of this into the if (covmap::Decoder::enabled) part?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exposure of coverage::yaml::DecoderParam is not good but it is a minimal exposure to suppress covmap w/o explicit DecoderImpl context. (See ELFYAML.cpp)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm wrong, but I think @ornata is suggesting just moving the code down a few lines into the if block starting at line 604 that makes use of it.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct

@chapuni chapuni requested a review from jh7370 February 17, 2025 08:23
Copy link
Contributor Author

@chapuni chapuni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review.

iterator begin() {
if (auto *V = std::get_if<Vec>(&Array))
return &V->front();
else
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer symmetry here but I'll follow.

cl::desc("Dump raw YAML in Coverage Map."),
cl::cat(ELFCat));
static cl::opt<bool> CovMapDLoc("covmap-dloc",
cl::desc("Prefer dLoc thatn absolute Loc."),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I'd like to assume users are developers and they can understand Loc and dLoc. I suppose they are developers' options.

@@ -580,6 +596,31 @@ ELFDumper<ELFT>::dumpSections() {
return Error::success();
};

coverage::yaml::DecoderParam Param;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exposure of coverage::yaml::DecoderParam is not good but it is a minimal exposure to suppress covmap w/o explicit DecoderImpl context. (See ELFYAML.cpp)

@jh7370
Copy link
Collaborator

jh7370 commented Feb 21, 2025

Apologies for not looking at this so far this week - I've got a lot on between my regular workload and multiple other reviews in flight. I'd like to review this before it gets merged and will aim to do so Monday or Tuesday next week.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an awful lot of code in this PR now and I wouldn't be confident in anybody's ability to reliably review it all at once. Is there a way you could split it down into multiple pieces?

For example, could you have support for each relevant section in a separate PR, or are there parts of the section format that are not required to have a "minimal" section that can be generated. You could even have the section not technically correct, but have some of the fields or similar not emitted initially, even though they're required for the proper behaviour. This allows for reviewing to focus on individual bits at a time. Keeping this PR as-is would still be useful, in that it could act as a reference point so that reviewers can look at the "big picture" too.

@@ -21,6 +23,18 @@

using namespace llvm;

cl::OptionCategory ELFCat("ELF Options");

static cl::opt<bool>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you feel the need to guard covmap decoding behind an option?

No other special section type has this sort of option, so it feels unnecessary.

default:
return make_error<CoverageMapError>(
coveragemap_error::malformed,
"Record doesn't have an valid Tag: 0x" + Twine::utohexstr(Tag));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Record doesn't have an valid Tag: 0x" + Twine::utohexstr(Tag));
"Record doesn't have a valid Tag: 0x" + Twine::utohexstr(Tag));

@@ -580,6 +596,31 @@ ELFDumper<ELFT>::dumpSections() {
return Error::success();
};

coverage::yaml::DecoderParam Param;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm wrong, but I think @ornata is suggesting just moving the code down a few lines into the if block starting at line 604 that makes use of it.

static cl::opt<bool> CovMapRaw("covmap-raw",
cl::desc("Dump raw YAML in Coverage Map."),
cl::cat(ELFCat));
static cl::opt<bool> CovMapDLoc("covmap-dloc",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can the code for covmap-dloc be in a separate patch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it doesn't make sense. It means me to split out small amount of lines.

static cl::opt<bool>
CovMapDetailed("covmap", cl::desc("Dump detailed YAML in Coverage Map."),
cl::cat(ELFCat));
static cl::opt<bool> CovMapRaw("covmap-raw",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can the code for covmap-raw be in a separate patch?

it would be easiest to review if the patch introduces minimal functionality the first time through.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. "Raw" is more basic impl. I split out "detailed".

//
// - llvm::covmap
//
// Provides YAML encoder and decoder for coverage map.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be possible to do only the encoder or decoder first in this patch, and then in another patch, do the other one?

e.g.

  • patch 1: add ability to encode
  • patch 2: add ability to decode
  • patch 3: add support for covmap-raw
  • patch 4: add support for covmap-dloc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I split out "encoder" and "decoder detailed(--covmap)".

@chapuni chapuni changed the base branch from main to users/chapuni/yaml/enc March 3, 2025 03:50
@chapuni chapuni changed the title Introduce CovMap in ObjectYAML obj2yaml: Introduce CovMap dump Mar 3, 2025
@jh7370
Copy link
Collaborator

jh7370 commented Mar 4, 2025

I have my doubts about using options to control how obj2yaml dumps things, since we don't do that for other bits. What is your use-case for this?

Please don't mark my comments as resolved when you believe you have addressed them. It makes things harder to review. For more details, please see https://discourse.llvm.org/t/rfc-github-pr-resolve-conversation-button/73178.

Conflicts:
	llvm/include/llvm/ObjectYAML/CovMap.h
	llvm/lib/ObjectYAML/CovMap.cpp
chapuni added 3 commits March 15, 2025 13:02
Conflicts:
	llvm/include/llvm/ObjectYAML/CovMap.h
	llvm/lib/ObjectYAML/CovMap.cpp
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bazel "Peripheral" support tier build system: utils/bazel objectyaml PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants