Skip to content

Unittests and usability for BitstreamWriter incremental flushing #92983

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
May 30, 2024

Conversation

mtrofin
Copy link
Member

@mtrofin mtrofin commented May 22, 2024

  • added unittests for the raw_fd_stream output case.

  • the BitstreamWriter ctor was confusing, the relationship between the buffer and the file stream wasn't clear and in fact there was a potential bug in BitcodeWriter in the mach-o case, because that code assumed in-buffer only serialization. The incremental flushing behavior of flushing at end of block boundaries was an implementation detail that meant serializers not using blocks (for example) would need to know to check the buffer and flush. The bug was latent - in the sense that, today, because the stream being passed was not a raw_fd_stream, incremental buffering never kicked in.

The new design moves the responsibility of flushing to the BitstreamWriter, and makes it work with any raw_ostream (but incrementally flush only in the raw_fd_stream case). If the raw_ostream is over a buffer - i.e. a raw_svector_stream - then it's equivalent to today's buffer case. For all other raw_ostream cases, buffering is an implementation detail. In all cases, the buffer is flushed (well, in the buffer case, that's a moot statement).

This simplifies the state and state transitions the user has to track: you have a raw_ostream -> BitstreamWrite in it -> destroy the writer => the bitstream is completely written in your raw_ostream. The "buffer" case and the "raw_fd_stream" case become optimizations rather than imposing state transition concerns to the user.

- added unittests for the raw_fd_stream output case.

- the ctor documentation seemed to suggest that if there was no file stream
provided, the buffer will be written at the end. That's incorrect. Fixed the
documentation, clarifying flushing behavior and highlighting a few other
details.

- flushing everything in the dtor. While flushing to the file stream would
happen on subblock boundaries, if the choice of format didn't feature subblocks
or didn't end with a subblock, the user would have the expectation of flushing
to the given stream, and have no API indication that's not the case.
Copy link

github-actions bot commented May 22, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Writer.writeSymtab();
Writer.writeStrtab();

{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change needed? Is it to trigger the destructor earlier? Why is that needed when the rest of the buffer is written to Out just below here at line 5070?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is line 5070. The buffer should be cleared after being written. That is an alternative. Another one would be to expose a Flush and call it before line 5065.

Ideally we would use a common stream abstraction for both buffer and raw_fd_stream, but IIUC the bitstream writer is pretty chatty and the indirection caused by an abstraction would hurt performance. Templating would hurt reuse - all the users would now have to make a choice or go template, too.

My preference would be to mark very clearly the transitions of ownership: the BitcodeWriter would take ownership of the inputs (so std::move both of them); then if the user wants them back, would call something like auto [Buffer, FS] = BitcodeWriter::closeAndReclaim(std::move(Writer)) - so Writer would be consumed. This would, however, also impact maybe some the current BitstreamWriter derived classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing something...

With the BitcodeWriter now in its own scope, I believe what now happens is that when it is destructed on line 5064 it will write the contents of Buffer to Out, then clear Buffer. Then since Buffer will be empty, line 5070 won't trigger.

With the old code, we would instead see that Buffer wasn't empty on line 5069 and trigger the same write on line 5070. But without clearing Buffer afterwards. But that is the end of Buffer's lifetime, so why does it need to be cleared?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5070 will trigger if 5066 does.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does emitDarwinBCHeaderAndTrailer unconditionally overwrite Buffer instead of appending to it? Ah looks like that is the case. Is that the issue fixed by doing the additional flushing with this change?

If so, then probably add a comment about why Writer is in a nested scope (to trigger the destructor to complete the flushing), and probably assert that Buffer is empty after that scope completes? It probably also makes sense to move the write from 5070 under the if statement at line 5065, since that should be the only remaining case requiring writing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revisited the refactoring following our offline discussion. Basically, now, the user can pass any stream and it will just work.

Also fixed here the problem you pointed out (offline) about the nuance in emitDarwinBCHeaderAndTrailer, which assumes writing at position 0 in the buffer is correct. Reading more, the assumption is really a requirement: the header is written at the end, actually.

PTAL, and if this looks reasonable, I'll update the patch title and comments too.

@@ -55,4 +61,77 @@ TEST(BitstreamWriterTest, emitBlob4ByteAligned) {
EXPECT_EQ(StringRef("str0"), Buffer);
}

class BitstreamWriterFlushTest : public ::testing::TestWithParam<int> {
protected:
const unsigned BlkID = bitc::FIRST_APPLICATION_BLOCKID + 17;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs comment. E.g. why the +17?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh. random number. added comment.

@@ -93,31 +93,39 @@ class BitstreamWriter {

/// If the related file stream supports reading, seeking and writing, flush
/// the buffer if its size is above a threshold.
void FlushToFile() {
if (!FS)
void FlushToFile(bool OnClosing = false) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest adding a brief comment about new parameter meaning/behavior.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@llvmbot
Copy link
Member

llvmbot commented May 29, 2024

@llvm/pr-subscribers-pgo

@llvm/pr-subscribers-llvm-support

Author: Mircea Trofin (mtrofin)

Changes
  • added unittests for the raw_fd_stream output case.

  • the ctor documentation seemed to suggest that if there was no file stream provided, the buffer will be written at the end. That's not what was happening (well, can't see how that could happen either). In any case, fixed the documentation, clarifying flushing behavior and highlighting a few other details.

  • flushing everything in the dtor. While flushing to the file stream would happen on subblock boundaries, if the choice of format didn't feature subblocks or didn't end with a subblock, the user would have the expectation of flushing to the given stream, and have no API indication that's not the case.


Patch is 29.88 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/92983.diff

7 Files Affected:

  • (modified) llvm/include/llvm/Bitcode/BitcodeWriter.h (+75-75)
  • (modified) llvm/include/llvm/Bitstream/BitstreamWriter.h (+115-47)
  • (modified) llvm/include/llvm/ProfileData/PGOCtxProfWriter.h (+1-1)
  • (modified) llvm/include/llvm/Support/raw_ostream.h (+9-1)
  • (modified) llvm/lib/Bitcode/Writer/BitcodeWriter.cpp (+37-33)
  • (modified) llvm/lib/Support/raw_ostream.cpp (+4)
  • (modified) llvm/unittests/Bitstream/BitstreamWriterTest.cpp (+80)
diff --git a/llvm/include/llvm/Bitcode/BitcodeWriter.h b/llvm/include/llvm/Bitcode/BitcodeWriter.h
index 248d33f4502ef..ef00fa3b4b87e 100644
--- a/llvm/include/llvm/Bitcode/BitcodeWriter.h
+++ b/llvm/include/llvm/Bitcode/BitcodeWriter.h
@@ -29,81 +29,81 @@ class BitstreamWriter;
 class Module;
 class raw_ostream;
 
-  class BitcodeWriter {
-    SmallVectorImpl<char> &Buffer;
-    std::unique_ptr<BitstreamWriter> Stream;
-
-    StringTableBuilder StrtabBuilder{StringTableBuilder::RAW};
-
-    // Owns any strings created by the irsymtab writer until we create the
-    // string table.
-    BumpPtrAllocator Alloc;
-
-    bool WroteStrtab = false, WroteSymtab = false;
-
-    void writeBlob(unsigned Block, unsigned Record, StringRef Blob);
-
-    std::vector<Module *> Mods;
-
-  public:
-    /// Create a BitcodeWriter that writes to Buffer.
-    BitcodeWriter(SmallVectorImpl<char> &Buffer, raw_fd_stream *FS = nullptr);
-
-    ~BitcodeWriter();
-
-    /// Attempt to write a symbol table to the bitcode file. This must be called
-    /// at most once after all modules have been written.
-    ///
-    /// A reader does not require a symbol table to interpret a bitcode file;
-    /// the symbol table is needed only to improve link-time performance. So
-    /// this function may decide not to write a symbol table. It may so decide
-    /// if, for example, the target is unregistered or the IR is malformed.
-    void writeSymtab();
-
-    /// Write the bitcode file's string table. This must be called exactly once
-    /// after all modules and the optional symbol table have been written.
-    void writeStrtab();
-
-    /// Copy the string table for another module into this bitcode file. This
-    /// should be called after copying the module itself into the bitcode file.
-    void copyStrtab(StringRef Strtab);
-
-    /// Write the specified module to the buffer specified at construction time.
-    ///
-    /// If \c ShouldPreserveUseListOrder, encode the use-list order for each \a
-    /// Value in \c M.  These will be reconstructed exactly when \a M is
-    /// deserialized.
-    ///
-    /// If \c Index is supplied, the bitcode will contain the summary index
-    /// (currently for use in ThinLTO optimization).
-    ///
-    /// \p GenerateHash enables hashing the Module and including the hash in the
-    /// bitcode (currently for use in ThinLTO incremental build).
-    ///
-    /// If \p ModHash is non-null, when GenerateHash is true, the resulting
-    /// hash is written into ModHash. When GenerateHash is false, that value
-    /// is used as the hash instead of computing from the generated bitcode.
-    /// Can be used to produce the same module hash for a minimized bitcode
-    /// used just for the thin link as in the regular full bitcode that will
-    /// be used in the backend.
-    void writeModule(const Module &M, bool ShouldPreserveUseListOrder = false,
-                     const ModuleSummaryIndex *Index = nullptr,
-                     bool GenerateHash = false, ModuleHash *ModHash = nullptr);
-
-    /// Write the specified thin link bitcode file (i.e., the minimized bitcode
-    /// file) to the buffer specified at construction time. The thin link
-    /// bitcode file is used for thin link, and it only contains the necessary
-    /// information for thin link.
-    ///
-    /// ModHash is for use in ThinLTO incremental build, generated while the
-    /// IR bitcode file writing.
-    void writeThinLinkBitcode(const Module &M, const ModuleSummaryIndex &Index,
-                              const ModuleHash &ModHash);
-
-    void writeIndex(
-        const ModuleSummaryIndex *Index,
-        const std::map<std::string, GVSummaryMapTy> *ModuleToSummariesForIndex);
-  };
+class BitcodeWriter {
+  std::unique_ptr<BitstreamWriter> Stream;
+
+  StringTableBuilder StrtabBuilder{StringTableBuilder::RAW};
+
+  // Owns any strings created by the irsymtab writer until we create the
+  // string table.
+  BumpPtrAllocator Alloc;
+
+  bool WroteStrtab = false, WroteSymtab = false;
+
+  void writeBlob(unsigned Block, unsigned Record, StringRef Blob);
+
+  std::vector<Module *> Mods;
+
+public:
+  /// Create a BitcodeWriter that writes to Buffer.
+  BitcodeWriter(SmallVectorImpl<char> &Buffer);
+  BitcodeWriter(raw_ostream &FS);
+
+  ~BitcodeWriter();
+
+  /// Attempt to write a symbol table to the bitcode file. This must be called
+  /// at most once after all modules have been written.
+  ///
+  /// A reader does not require a symbol table to interpret a bitcode file;
+  /// the symbol table is needed only to improve link-time performance. So
+  /// this function may decide not to write a symbol table. It may so decide
+  /// if, for example, the target is unregistered or the IR is malformed.
+  void writeSymtab();
+
+  /// Write the bitcode file's string table. This must be called exactly once
+  /// after all modules and the optional symbol table have been written.
+  void writeStrtab();
+
+  /// Copy the string table for another module into this bitcode file. This
+  /// should be called after copying the module itself into the bitcode file.
+  void copyStrtab(StringRef Strtab);
+
+  /// Write the specified module to the buffer specified at construction time.
+  ///
+  /// If \c ShouldPreserveUseListOrder, encode the use-list order for each \a
+  /// Value in \c M.  These will be reconstructed exactly when \a M is
+  /// deserialized.
+  ///
+  /// If \c Index is supplied, the bitcode will contain the summary index
+  /// (currently for use in ThinLTO optimization).
+  ///
+  /// \p GenerateHash enables hashing the Module and including the hash in the
+  /// bitcode (currently for use in ThinLTO incremental build).
+  ///
+  /// If \p ModHash is non-null, when GenerateHash is true, the resulting
+  /// hash is written into ModHash. When GenerateHash is false, that value
+  /// is used as the hash instead of computing from the generated bitcode.
+  /// Can be used to produce the same module hash for a minimized bitcode
+  /// used just for the thin link as in the regular full bitcode that will
+  /// be used in the backend.
+  void writeModule(const Module &M, bool ShouldPreserveUseListOrder = false,
+                   const ModuleSummaryIndex *Index = nullptr,
+                   bool GenerateHash = false, ModuleHash *ModHash = nullptr);
+
+  /// Write the specified thin link bitcode file (i.e., the minimized bitcode
+  /// file) to the buffer specified at construction time. The thin link
+  /// bitcode file is used for thin link, and it only contains the necessary
+  /// information for thin link.
+  ///
+  /// ModHash is for use in ThinLTO incremental build, generated while the
+  /// IR bitcode file writing.
+  void writeThinLinkBitcode(const Module &M, const ModuleSummaryIndex &Index,
+                            const ModuleHash &ModHash);
+
+  void writeIndex(
+      const ModuleSummaryIndex *Index,
+      const std::map<std::string, GVSummaryMapTy> *ModuleToSummariesForIndex);
+};
 
   /// Write the specified module to the specified raw output stream.
   ///
diff --git a/llvm/include/llvm/Bitstream/BitstreamWriter.h b/llvm/include/llvm/Bitstream/BitstreamWriter.h
index c726508cd5285..b3bee328beaa8 100644
--- a/llvm/include/llvm/Bitstream/BitstreamWriter.h
+++ b/llvm/include/llvm/Bitstream/BitstreamWriter.h
@@ -18,6 +18,7 @@
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Bitstream/BitCodes.h"
+#include "llvm/Support/Casting.h"
 #include "llvm/Support/Endian.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/raw_ostream.h"
@@ -28,34 +29,42 @@
 namespace llvm {
 
 class BitstreamWriter {
-  /// Out - The buffer that keeps unflushed bytes.
-  SmallVectorImpl<char> &Out;
-
-  /// FS - The file stream that Out flushes to. If FS is nullptr, it does not
-  /// support read or seek, Out cannot be flushed until all data are written.
-  raw_fd_stream *FS;
-
-  /// FlushThreshold - If FS is valid, this is the threshold (unit B) to flush
-  /// FS.
+  /// Owned buffer, used to init Buffer if the provided stream doesn't happen to
+  /// be a buffer itself.
+  SmallVector<char, 0> OwnBuffer;
+  /// Internal buffer. The writer backpatches, so it is efficient to buffer.
+  SmallVectorImpl<char> &Buffer;
+
+  /// FS - The file stream that Buffer flushes to. If FS is a raw_fd_stream, the
+  /// writer will incrementally flush at subblock boundaries. Otherwise flushing
+  /// will happen at the end of BitstreamWriter's lifetime.
+  raw_ostream *const FS;
+
+  /// FlushThreshold - this is the threshold (unit B) to flush to FS, if FS is a
+  /// raw_fd_stream.
   const uint64_t FlushThreshold;
 
   /// CurBit - Always between 0 and 31 inclusive, specifies the next bit to use.
-  unsigned CurBit;
+  unsigned CurBit = 0;
 
   /// CurValue - The current value. Only bits < CurBit are valid.
-  uint32_t CurValue;
+  uint32_t CurValue = 0;
 
   /// CurCodeSize - This is the declared size of code values used for the
   /// current block, in bits.
-  unsigned CurCodeSize;
+  unsigned CurCodeSize = 2;
 
   /// BlockInfoCurBID - When emitting a BLOCKINFO_BLOCK, this is the currently
   /// selected BLOCK ID.
-  unsigned BlockInfoCurBID;
+  unsigned BlockInfoCurBID = 0;
 
   /// CurAbbrevs - Abbrevs installed at in this block.
   std::vector<std::shared_ptr<BitCodeAbbrev>> CurAbbrevs;
 
+  // Support for retrieving a section of the output, for purposes such as
+  // checksumming.
+  std::optional<size_t> BlockFlushingStartPos;
+
   struct Block {
     unsigned PrevCodeSize;
     size_t StartSizeWord;
@@ -77,13 +86,17 @@ class BitstreamWriter {
   void WriteWord(unsigned Value) {
     Value =
         support::endian::byte_swap<uint32_t, llvm::endianness::little>(Value);
-    Out.append(reinterpret_cast<const char *>(&Value),
-               reinterpret_cast<const char *>(&Value + 1));
+    Buffer.append(reinterpret_cast<const char *>(&Value),
+                  reinterpret_cast<const char *>(&Value + 1));
   }
 
-  uint64_t GetNumOfFlushedBytes() const { return FS ? FS->tell() : 0; }
+  uint64_t GetNumOfFlushedBytes() const {
+    return fdStream() ? fdStream()->tell() : 0;
+  }
 
-  size_t GetBufferOffset() const { return Out.size() + GetNumOfFlushedBytes(); }
+  size_t GetBufferOffset() const {
+    return Buffer.size() + GetNumOfFlushedBytes();
+  }
 
   size_t GetWordIndex() const {
     size_t Offset = GetBufferOffset();
@@ -91,33 +104,88 @@ class BitstreamWriter {
     return Offset / 4;
   }
 
-  /// If the related file stream supports reading, seeking and writing, flush
-  /// the buffer if its size is above a threshold.
-  void FlushToFile() {
-    if (!FS)
+  void flushAndClear() {
+    assert(FS);
+    assert(!Buffer.empty());
+    assert(!BlockFlushingStartPos &&
+           "a call to markAndBlockFlushing should have been paired with a "
+           "call to getMarkedBufferAndResumeFlushing");
+    FS->write(Buffer.data(), Buffer.size());
+    Buffer.clear();
+  }
+
+  /// If the related file stream is a raw_fd_stream, flush the buffer if its
+  /// size is above a threshold. If \p OnClosing is true, flushing happens
+  /// regardless of thresholds.
+  void FlushToFile(bool OnClosing = false) {
+    if (!FS || Buffer.empty())
       return;
-    if (Out.size() < FlushThreshold)
+    if (OnClosing)
+      return flushAndClear();
+    if (BlockFlushingStartPos)
       return;
-    FS->write((char *)&Out.front(), Out.size());
-    Out.clear();
+    if (fdStream() && Buffer.size() > FlushThreshold)
+      flushAndClear();
+  }
+
+  raw_fd_stream *fdStream() { return dyn_cast_or_null<raw_fd_stream>(FS); }
+
+  const raw_fd_stream *fdStream() const {
+    return dyn_cast_or_null<raw_fd_stream>(FS);
+  }
+
+  SmallVectorImpl<char> &getInternalBufferFromStream(raw_ostream &OutStream) {
+    if (auto *SV = dyn_cast<raw_svector_ostream>(&OutStream))
+      return SV->buffer();
+    return OwnBuffer;
   }
 
 public:
-  /// Create a BitstreamWriter that writes to Buffer \p O.
+  /// Create a BitstreamWriter over a raw_ostream \p OutStream.
+  /// If \p OutStream is a raw_svector_ostream, the BitstreamWriter will write
+  /// directly to the latter's buffer. In all other cases, the BitstreamWriter
+  /// will use an internal buffer and flush at the end of its lifetime.
   ///
-  /// \p FS is the file stream that \p O flushes to incrementally. If \p FS is
-  /// null, \p O does not flush incrementially, but writes to disk at the end.
+  /// In addition, if \p is a raw_fd_stream supporting seek, tell, and read
+  /// (besides write), the BitstreamWriter will also flush incrementally, when a
+  /// subblock is finished, and if the FlushThreshold is passed.
   ///
-  /// \p FlushThreshold is the threshold (unit M) to flush \p O if \p FS is
-  /// valid. Flushing only occurs at (sub)block boundaries.
-  BitstreamWriter(SmallVectorImpl<char> &O, raw_fd_stream *FS = nullptr,
-                  uint32_t FlushThreshold = 512)
-      : Out(O), FS(FS), FlushThreshold(uint64_t(FlushThreshold) << 20), CurBit(0),
-        CurValue(0), CurCodeSize(2) {}
+  /// NOTE: \p FlushThreshold's unit is MB.
+  BitstreamWriter(raw_ostream &OutStream, uint32_t FlushThreshold = 512)
+      : Buffer(getInternalBufferFromStream(OutStream)),
+        FS(!isa<raw_svector_ostream>(OutStream) ? &OutStream : nullptr),
+        FlushThreshold(uint64_t(FlushThreshold) << 20) {}
+
+  /// Convenience constructor for users that start with a vector - avoids
+  /// needing to wrap it in a raw_svector_ostream.
+  BitstreamWriter(SmallVectorImpl<char> &Buff)
+      : Buffer(Buff), FS(nullptr), FlushThreshold(0) {}
 
   ~BitstreamWriter() {
-    assert(CurBit == 0 && "Unflushed data remaining");
+    FlushToWord();
     assert(BlockScope.empty() && CurAbbrevs.empty() && "Block imbalance");
+    FlushToFile(/*OnClosing=*/true);
+  }
+
+  /// For scenarios where the user wants to access a section of the stream to
+  /// (for example) compute some checksum, disable flushing and remember the
+  /// position in the internal buffer where that happened. Must be paired with a
+  /// call to getMarkedBufferAndResumeFlushing.
+  void markAndBlockFlushing() {
+    assert(!BlockFlushingStartPos);
+    BlockFlushingStartPos = Buffer.size();
+  }
+
+  /// resumes flushing, but does not flush, and returns the section in the
+  /// internal buffer starting from the position marked with
+  /// markAndBlockFlushing. The return should be processed before any additional
+  /// calls to this object, because those may cause a flush and invalidate the
+  /// return.
+  StringRef getMarkedBufferAndResumeFlushing() {
+    assert(BlockFlushingStartPos);
+    size_t Start = *BlockFlushingStartPos;
+    BlockFlushingStartPos.reset();
+    return {&Buffer[Start], Buffer.size() - Start};
   }
 
   /// Retrieve the current position in the stream, in bits.
@@ -141,16 +209,16 @@ class BitstreamWriter {
     if (ByteNo >= NumOfFlushedBytes) {
       assert((!endian::readAtBitAlignment<uint8_t, llvm::endianness::little,
                                           unaligned>(
-                 &Out[ByteNo - NumOfFlushedBytes], StartBit)) &&
+                 &Buffer[ByteNo - NumOfFlushedBytes], StartBit)) &&
              "Expected to be patching over 0-value placeholders");
       endian::writeAtBitAlignment<uint8_t, llvm::endianness::little, unaligned>(
-          &Out[ByteNo - NumOfFlushedBytes], NewByte, StartBit);
+          &Buffer[ByteNo - NumOfFlushedBytes], NewByte, StartBit);
       return;
     }
 
     // If the byte offset to backpatch is flushed, use seek to backfill data.
     // First, save the file position to restore later.
-    uint64_t CurPos = FS->tell();
+    uint64_t CurPos = fdStream()->tell();
 
     // Copy data to update into Bytes from the file FS and the buffer Out.
     char Bytes[3]; // Use one more byte to silence a warning from Visual C++.
@@ -159,19 +227,19 @@ class BitstreamWriter {
     size_t BytesFromBuffer = BytesNum - BytesFromDisk;
 
     // When unaligned, copy existing data into Bytes from the file FS and the
-    // buffer Out so that it can be updated before writing. For debug builds
+    // buffer Buffer so that it can be updated before writing. For debug builds
     // read bytes unconditionally in order to check that the existing value is 0
     // as expected.
 #ifdef NDEBUG
     if (StartBit)
 #endif
     {
-      FS->seek(ByteNo);
-      ssize_t BytesRead = FS->read(Bytes, BytesFromDisk);
+      fdStream()->seek(ByteNo);
+      ssize_t BytesRead = fdStream()->read(Bytes, BytesFromDisk);
       (void)BytesRead; // silence warning
       assert(BytesRead >= 0 && static_cast<size_t>(BytesRead) == BytesFromDisk);
       for (size_t i = 0; i < BytesFromBuffer; ++i)
-        Bytes[BytesFromDisk + i] = Out[i];
+        Bytes[BytesFromDisk + i] = Buffer[i];
       assert((!endian::readAtBitAlignment<uint8_t, llvm::endianness::little,
                                           unaligned>(Bytes, StartBit)) &&
              "Expected to be patching over 0-value placeholders");
@@ -182,13 +250,13 @@ class BitstreamWriter {
         Bytes, NewByte, StartBit);
 
     // Copy updated data back to the file FS and the buffer Out.
-    FS->seek(ByteNo);
-    FS->write(Bytes, BytesFromDisk);
+    fdStream()->seek(ByteNo);
+    fdStream()->write(Bytes, BytesFromDisk);
     for (size_t i = 0; i < BytesFromBuffer; ++i)
-      Out[i] = Bytes[BytesFromDisk + i];
+      Buffer[i] = Bytes[BytesFromDisk + i];
 
     // Restore the file position.
-    FS->seek(CurPos);
+    fdStream()->seek(CurPos);
   }
 
   void BackpatchHalfWord(uint64_t BitNo, uint16_t Val) {
@@ -481,11 +549,11 @@ class BitstreamWriter {
 
     // Emit literal bytes.
     assert(llvm::all_of(Bytes, [](UIntTy B) { return isUInt<8>(B); }));
-    Out.append(Bytes.begin(), Bytes.end());
+    Buffer.append(Bytes.begin(), Bytes.end());
 
     // Align end to 32-bits.
     while (GetBufferOffset() & 3)
-      Out.push_back(0);
+      Buffer.push_back(0);
   }
   void emitBlob(StringRef Bytes, bool ShouldEmitSize = true) {
     emitBlob(ArrayRef((const uint8_t *)Bytes.data(), Bytes.size()),
diff --git a/llvm/include/llvm/ProfileData/PGOCtxProfWriter.h b/llvm/include/llvm/ProfileData/PGOCtxProfWriter.h
index edcf02c094697..e8b75d6f8cf4b 100644
--- a/llvm/include/llvm/ProfileData/PGOCtxProfWriter.h
+++ b/llvm/include/llvm/ProfileData/PGOCtxProfWriter.h
@@ -70,7 +70,7 @@ class PGOCtxProfileWriter final {
 public:
   PGOCtxProfileWriter(raw_fd_stream &Out,
                       std::optional<unsigned> VersionOverride = std::nullopt)
-      : Writer(Buff, &Out, 0) {
+      : Writer(Out, 0) {
     Writer.EnterSubblock(PGOCtxProfileBlockIDs::ProfileMetadataBlockID,
                          CodeLen);
     const auto Version = VersionOverride ? *VersionOverride : CurrentVersion;
diff --git a/llvm/include/llvm/Support/raw_ostream.h b/llvm/include/llvm/Support/raw_ostream.h
index 696290a5d99cf..0951ffb19ffa1 100644
--- a/llvm/include/llvm/Support/raw_ostream.h
+++ b/llvm/include/llvm/Support/raw_ostream.h
@@ -55,6 +55,7 @@ class raw_ostream {
   enum class OStreamKind {
     OK_OStream,
     OK_FDStream,
+    OK_SVecStream,
   };
 
 private:
@@ -703,7 +704,11 @@ class raw_svector_ostream : public raw_pwrite_stream {
   ///
   /// \param O The vector to write to; this should generally have at least 128
   /// bytes free to avoid any extraneous memory overhead.
-  explicit raw_svector_ostream(SmallVectorImpl<char> &O) : OS(O) {
+  explicit raw_svector_ostream(SmallVectorImpl<char> &O)
+      : raw_pwrite_stream(false, raw_ostream::OStreamKind::OK_SVecStream),
+        OS(O) {
+    // FIXME: here and in a few other places, set directly to unbuffered in the
+    // ctor.
     SetUnbuffered();
   }
 
@@ -713,10 +718,13 @@ class raw_svector_ostream : public raw_pwrite_stream {
 
   /// Return a StringRef for the vector contents.
   StringRef str() const { return StringRef(OS.data(), OS.size()); }
+  SmallVectorImpl<char> &buffer() { return OS; }
 
   void reserveExtraSpace(uint64_t ExtraSize) override {
     OS.reserve(tell() + ExtraSize);
   }
+
+  static bool classof(const raw_ostream *OS);
 };
 
 /// A raw_ostream that discards all output.
diff --git a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
index c4cea3d6eef2d..00e280356db7c 100644
--- a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ b/llvm/lib/Bitcode/Writer/BitcodeWriter...
[truncated]

Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice cleanup! some comments/questions below

const ModuleSummaryIndex *Index,
const std::map<std::string, GVSummaryMapTy> *ModuleToSummariesForIndex);
};
class BitcodeWriter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally don't change the spacing in this PR, because it creates a lot of spurious diffs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(it's really hard for me to identify what are the actual change in this class)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed the formatting upstream, then merged.

/// NOTE: \p FlushThreshold's unit is MB.
BitstreamWriter(raw_ostream &OutStream, uint32_t FlushThreshold = 512)
: Buffer(getInternalBufferFromStream(OutStream)),
FS(!isa<raw_svector_ostream>(OutStream) ? &OutStream : nullptr),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many classes derived from raw_ostream - is raw_svector_ostream the only one that needs to be treated specially?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because it is really a buffer dressed as a stream. So the intention of the user would be to capture the bytes in that buffer.

We also special-case raw_fd_stream for the incremental flushing scenario.

/// Owned buffer, used to init Buffer if the provided stream doesn't happen to
/// be a buffer itself.
SmallVector<char, 0> OwnBuffer;
/// Internal buffer. The writer backpatches, so it is efficient to buffer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this have the note that it is only the unflushed bytes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

return;
if (Out.size() < FlushThreshold)
if (OnClosing)
return flushAndClear();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What prevents this from being called when FS==nullptr, which will cause an assert in flushAndClear()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the line above - we do nothing if FS == nullptr

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

woops, missed that

return;
}

// If the byte offset to backpatch is flushed, use seek to backfill data.
// First, save the file position to restore later.
uint64_t CurPos = FS->tell();
uint64_t CurPos = fdStream()->tell();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this change and at least one below, do we need to invoke fdStream() or can it simply continue to use FD which should have been initialized in the constructor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could use FD, but we should only be here if fdStream() != nullptr. The reason that happens is not very apparent, though: GetNumOfFlushedBytes returns 0 in that case, which makes the if-check above always succeed. Added an assert and a comment.


if (TT.isOSDarwin() || TT.isOSBinFormatMachO())
BitcodeWriter Writer(Buffer);
Write(Writer);
emitDarwinBCHeaderAndTrailer(Buffer, TT);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert Buffer is empty before calling emitDarwinBCHeaderAndTrailer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it won't be - that's the thing, in this case, we use our own Buffer which won't get flushed - there's nowhere to flush (insofar as the bitcode writer is concerned). See the comment above the declaration of Buffer, too.

@llvmbot llvmbot added the PGO Profile Guided Optimizations label May 29, 2024
Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with one minor typo noted below

@@ -5093,7 +5093,7 @@ void llvm::WriteBitcodeToFile(const Module &M, raw_ostream &Out,
if (TT.isOSDarwin() || TT.isOSBinFormatMachO()) {
// If this is darwin or another generic macho target, reserve space for the
// header. Note that the header is computed *after* the output is known, so
// we currently expliclty use a buffer, write to it, and then subsequently
// we currently explicilty use a buffer, write to it, and then subsequently
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still a typo in that word =)

return;
if (Out.size() < FlushThreshold)
if (OnClosing)
return flushAndClear();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

woops, missed that

@mtrofin mtrofin merged commit 7cfffe7 into llvm:main May 30, 2024
5 of 6 checks passed
keith added a commit that referenced this pull request May 30, 2024
MaskRay added a commit that referenced this pull request May 30, 2024
@MaskRay
Copy link
Member

MaskRay commented May 30, 2024

Thanks! The new ctor is more reasonable. Pushed c4dad9a to fix an experimental target DirectX.

searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Jun 2, 2024
Revert : breaks lots of OCL tests.
7cfffe7 Unittests and usability for BitstreamWriter incremental flushing (llvm#92983)

Change-Id: Ic52106ccfb631168b99bae34ab6d14ad3a548408
mtrofin added a commit that referenced this pull request Jun 3, 2024
PR #92983 accidentally changed the buffer allocation in
`llvm::WriteBitcodeToFile` to be allocated on the stack, which is
problematic given it's a large-ish buffer (256K)
mtrofin added a commit that referenced this pull request Jun 28, 2024
Not needed anymore after PR #92983, and moreover, creating the same issue c49bc1a fixed in `BitcodeWriter.cpp`.
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Jun 29, 2024
reverts:
  993d238 [ctx_prof] Remove `Buffer` field in PGOCtxProfWriter
  1488fb4 [PAC][AArch64] Lower ptrauth constants in code (llvm#96879)
depends on
  7cfffe7 Unittests and usability for BitstreamWriter incremental flushing (llvm#92983)
whick breaks lots of ocl tests

see: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/2738/
Change-Id: I108ca3b1dd63499946b75a34629755ed2d7eb05b
lravenclaw pushed a commit to lravenclaw/llvm-project that referenced this pull request Jul 3, 2024
Not needed anymore after PR llvm#92983, and moreover, creating the same issue c49bc1a fixed in `BitcodeWriter.cpp`.
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Sep 6, 2024
…m#92983)

- added unittests for the raw_fd_stream output case.

- the `BitstreamWriter` ctor was confusing, the relationship between the buffer and the file stream wasn't clear and in fact there was a potential bug in `BitcodeWriter` in the mach-o case, because that code assumed in-buffer only serialization. The incremental flushing behavior of flushing at end of block boundaries was an implementation detail that meant serializers not using blocks (for example) would need to know to check the buffer and flush. The bug was latent - in the sense that, today, because the stream being passed was not a `raw_fd_stream`, incremental buffering never kicked in.

The new design moves the responsibility of flushing to the `BitstreamWriter`, and makes it work with any `raw_ostream` (but incrementally flush only in the `raw_fd_stream` case). If the `raw_ostream` is over a buffer - i.e. a `raw_svector_stream` - then it's equivalent to today's buffer case. For all other `raw_ostream` cases, buffering is an implementation detail. In all cases, the buffer is flushed (well, in the buffer case, that's a moot statement).

This simplifies the state and state transitions the user has to track: you have a raw_ostream -> BitstreamWrite in it -> destroy the writer => the bitstream is completely written in your raw_ostream. The "buffer" case and the "raw_fd_stream" case become optimizations rather than imposing state transition concerns to the user.

Change-Id: I36015a1ca3b87c6c89f4aa51142b829e526a0737
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Sep 6, 2024
PR llvm#92983 accidentally changed the buffer allocation in
`llvm::WriteBitcodeToFile` to be allocated on the stack, which is
problematic given it's a large-ish buffer (256K)

Change-Id: I0f30bfb030e160c9b158fa5a5d3451fb1a20c8a9
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Sep 6, 2024
Not needed anymore after PR llvm#92983, and moreover, creating the same issue c49bc1a fixed in `BitcodeWriter.cpp`.

Change-Id: I8e2ecce912725d3c8c800fc7746bcb524a2da121
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:support PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants