Skip to content

ELF: Introduce --randomize-section-padding option. #117653

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Dec 13, 2024

Conversation

pcc
Copy link
Contributor

@pcc pcc commented Nov 26, 2024

The --randomize-section-padding option randomly inserts padding between
input sections using the given seed. It is intended to be used in A/B
experiments to determine the average effect of a change on program
performance, while controlling for effects such as false sharing in
the cache which may introduce measurement bias. For more details,
see the RFC:

https://discourse.llvm.org/t/rfc-lld-feature-for-controlling-for-code-size-dependent-measurement-bias/83334

Created using spr 1.3.6-beta.1
@llvmbot
Copy link
Member

llvmbot commented Nov 26, 2024

@llvm/pr-subscribers-lld

Author: None (pcc)

Changes

The --shuffle-padding flag randomly inserts padding between input
sections using the given seed. It is intended to be used in A/B
experiments to determine the average effect of a change on program
performance, while controlling for effects such as false sharing in
the cache which may introduce measurement bias. For more details,
see the RFC:


Full diff: https://github.com/llvm/llvm-project/pull/117653.diff

9 Files Affected:

  • (modified) lld/ELF/Config.h (+1)
  • (modified) lld/ELF/Driver.cpp (+2)
  • (modified) lld/ELF/Options.td (+3)
  • (modified) lld/ELF/OutputSections.h (+2-2)
  • (modified) lld/ELF/SyntheticSections.cpp (+15)
  • (modified) lld/ELF/SyntheticSections.h (+9)
  • (modified) lld/ELF/Writer.cpp (+34)
  • (modified) lld/docs/ld.lld.1 (+9)
  • (added) lld/test/ELF/shuffle-padding.s (+29)
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
index a2836733c2715e..ed6bce405d1664 100644
--- a/lld/ELF/Config.h
+++ b/lld/ELF/Config.h
@@ -329,6 +329,7 @@ struct Config {
   bool relrPackDynRelocs = false;
   llvm::DenseSet<llvm::StringRef> saveTempsArgs;
   llvm::SmallVector<std::pair<llvm::GlobPattern, uint32_t>, 0> shuffleSections;
+  std::optional<uint64_t> shufflePadding;
   bool singleRoRx;
   bool shared;
   bool symbolic;
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index bc4b967ccbbbb4..d8bcbe4a2d19d8 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -1410,6 +1410,8 @@ static void readConfigs(Ctx &ctx, opt::InputArgList &args) {
   ctx.arg.searchPaths = args::getStrings(args, OPT_library_path);
   ctx.arg.sectionStartMap = getSectionStartMap(ctx, args);
   ctx.arg.shared = args.hasArg(OPT_shared);
+  if (args.hasArg(OPT_shuffle_padding))
+    ctx.arg.shufflePadding = args::getInteger(args, OPT_shuffle_padding, 0);
   ctx.arg.singleRoRx = !args.hasFlag(OPT_rosegment, OPT_no_rosegment, true);
   ctx.arg.soName = args.getLastArgValue(OPT_soname);
   ctx.arg.sortSection = getSortSection(ctx, args);
diff --git a/lld/ELF/Options.td b/lld/ELF/Options.td
index ebe77204264210..44ac0ee43a8502 100644
--- a/lld/ELF/Options.td
+++ b/lld/ELF/Options.td
@@ -434,6 +434,9 @@ defm section_start: Eq<"section-start", "Set address of section">,
 
 def shared: F<"shared">, HelpText<"Build a shared object">;
 
+def shuffle_padding: JJ<"shuffle-padding=">,
+  HelpText<"Randomly insert padding between input sections using given seed">;
+
 defm soname: Eq<"soname", "Set DT_SONAME">;
 
 defm sort_section:
diff --git a/lld/ELF/OutputSections.h b/lld/ELF/OutputSections.h
index 67191392d1dbe7..3ab36a21ce488d 100644
--- a/lld/ELF/OutputSections.h
+++ b/lld/ELF/OutputSections.h
@@ -124,14 +124,14 @@ class OutputSection final : public SectionBase {
   void sortInitFini();
   void sortCtorsDtors();
 
+  std::array<uint8_t, 4> getFiller(Ctx &);
+
   // Used for implementation of --compress-debug-sections and
   // --compress-sections.
   CompressedData compressed;
 
 private:
   SmallVector<InputSection *, 0> storage;
-
-  std::array<uint8_t, 4> getFiller(Ctx &);
 };
 
 struct OutputDesc final : SectionCommand {
diff --git a/lld/ELF/SyntheticSections.cpp b/lld/ELF/SyntheticSections.cpp
index 21fe2a25fa1bd2..70eca0e58036e0 100644
--- a/lld/ELF/SyntheticSections.cpp
+++ b/lld/ELF/SyntheticSections.cpp
@@ -2753,6 +2753,21 @@ RelroPaddingSection::RelroPaddingSection(Ctx &ctx)
     : SyntheticSection(ctx, ".relro_padding", SHT_NOBITS, SHF_ALLOC | SHF_WRITE,
                        1) {}
 
+ShufflePaddingSection::ShufflePaddingSection(Ctx &ctx, uint64_t size,
+                                             OutputSection *parent)
+    : SyntheticSection(ctx, ".shuffle_padding", SHF_ALLOC, SHT_PROGBITS, 1),
+      size(size) {
+  this->parent = parent;
+}
+
+void ShufflePaddingSection::writeTo(uint8_t *buf) {
+  std::array<uint8_t, 4> filler = getParent()->getFiller(ctx);
+  uint8_t *end = buf + size;
+  for (; buf + 4 <= end; buf += 4)
+    memcpy(buf, &filler[0], 4);
+  memcpy(buf, &filler[0], end - buf);
+}
+
 // The string hash function for .gdb_index.
 static uint32_t computeGdbHash(StringRef s) {
   uint32_t h = 0;
diff --git a/lld/ELF/SyntheticSections.h b/lld/ELF/SyntheticSections.h
index 4b643e86335510..177a0337607da8 100644
--- a/lld/ELF/SyntheticSections.h
+++ b/lld/ELF/SyntheticSections.h
@@ -796,6 +796,15 @@ class RelroPaddingSection final : public SyntheticSection {
   void writeTo(uint8_t *buf) override {}
 };
 
+class ShufflePaddingSection final : public SyntheticSection {
+  uint64_t size;
+
+public:
+  ShufflePaddingSection(Ctx &ctx, uint64_t size, OutputSection *parent);
+  size_t getSize() const override { return size; }
+  void writeTo(uint8_t *buf) override;
+};
+
 // Used by the merged DWARF32 .debug_names (a per-module index). If we
 // move to DWARF64, most of this data will need to be re-sized.
 class DebugNamesBaseSection : public SyntheticSection {
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index a7fbdc07907044..3c1ab3234801d9 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -1444,6 +1444,37 @@ static void finalizeSynthetic(Ctx &ctx, SyntheticSection *sec) {
   }
 }
 
+static bool canInsertPadding(OutputSection *sec) {
+    StringRef s = sec->name;
+    return s == ".bss" || s == ".data" || s == ".data.rel.ro" ||
+           s == ".rodata" || s.starts_with(".text");
+}
+
+static void shufflePadding(Ctx &ctx) {
+  std::mt19937 g(*ctx.arg.shufflePadding);
+  PhdrEntry *curPtLoad = nullptr;
+  for (OutputSection *os : ctx.outputSections) {
+    if (!canInsertPadding(os))
+      continue;
+    for (SectionCommand *bc : os->commands) {
+      if (auto *isd = dyn_cast<InputSectionDescription>(bc)) {
+        SmallVector<InputSection *, 0> tmp;
+        if (os->ptLoad != curPtLoad) {
+          tmp.push_back(
+              make<ShufflePaddingSection>(ctx, g() % ctx.arg.maxPageSize, os));
+          curPtLoad = os->ptLoad;
+        }
+        for (InputSection *isec : isd->sections) {
+          if (g() < (1<<28))
+            tmp.push_back(make<ShufflePaddingSection>(ctx, isec->addralign, os));
+          tmp.push_back(isec);
+        }
+        isd->sections = std::move(tmp);
+      }
+    }
+  }
+}
+
 // We need to generate and finalize the content that depends on the address of
 // InputSections. As the generation of the content may also alter InputSection
 // addresses we must converge to a fixed point. We do that here. See the comment
@@ -1470,6 +1501,9 @@ template <class ELFT> void Writer<ELFT>::finalizeAddressDependentContent() {
   if (ctx.arg.emachine == EM_HEXAGON)
     hexagonTLSSymbolUpdate(ctx);
 
+  if (ctx.arg.shufflePadding)
+    shufflePadding(ctx);
+
   uint32_t pass = 0, assignPasses = 0;
   for (;;) {
     bool changed = ctx.target->needsThunks
diff --git a/lld/docs/ld.lld.1 b/lld/docs/ld.lld.1
index b22cb362837715..5d3d4164622166 100644
--- a/lld/docs/ld.lld.1
+++ b/lld/docs/ld.lld.1
@@ -577,6 +577,15 @@ were concatenated in the order they appeared on the command line.
 Set address of section.
 .It Fl -shared , Fl -Bsharable
 Build a shared object.
+.It Fl -shuffle-padding Ns = Ns Ar seed
+Randomly insert padding between input sections using the given seed.
+Padding is inserted into output sections with names matching the following patterns:
+.Cm .bss ,
+.Cm .data ,
+.Cm .data.rel.ro ,
+.Cm .rodata
+and
+.Cm .text* .
 .It Fl -shuffle-sections Ns = Ns Ar seed
 Shuffle matched sections using the given seed before mapping them to the output sections.
 If -1, reverse the section order. If 0, use a random seed.
diff --git a/lld/test/ELF/shuffle-padding.s b/lld/test/ELF/shuffle-padding.s
new file mode 100644
index 00000000000000..f816100ffba1be
--- /dev/null
+++ b/lld/test/ELF/shuffle-padding.s
@@ -0,0 +1,29 @@
+# REQUIRES: x86
+# RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t.o
+
+# RUN: ld.lld %t.o -o %t.out
+# RUN: llvm-readelf -x .rodata %t.out | FileCheck %s
+# CHECK: Hex dump of section '.rodata':
+# CHECK-NEXT: 0x00201120 010203
+
+## --shuffle-padding= inserts segment offset padding and pre-section padding.
+# RUN: ld.lld --shuffle-padding=1 %t.o -o %t.out
+# RUN: llvm-readelf -x .rodata %t.out | FileCheck --check-prefix=PAD1 %s
+# PAD1: Hex dump of section '.rodata':
+# PAD1-NEXT: 0x00201548 0102cc03
+
+## Size of segment offset padding and location of pre-section padding is
+## dependent on the seed.
+# RUN: ld.lld --shuffle-padding=2 %t.o -o %t.out
+# RUN: llvm-readelf -x .rodata %t.out | FileCheck --check-prefix=PAD2 %s
+# PAD2: Hex dump of section '.rodata':
+# PAD2-NEXT: 0x00201dc8 cc010203
+
+.section .rodata.a,"ax"
+.byte 1
+
+.section .rodata.b,"ax"
+.byte 2
+
+.section .rodata.c,"ax"
+.byte 3

@llvmbot
Copy link
Member

llvmbot commented Nov 26, 2024

@llvm/pr-subscribers-lld-elf

Author: None (pcc)

Changes

The --shuffle-padding flag randomly inserts padding between input
sections using the given seed. It is intended to be used in A/B
experiments to determine the average effect of a change on program
performance, while controlling for effects such as false sharing in
the cache which may introduce measurement bias. For more details,
see the RFC:


Full diff: https://github.com/llvm/llvm-project/pull/117653.diff

9 Files Affected:

  • (modified) lld/ELF/Config.h (+1)
  • (modified) lld/ELF/Driver.cpp (+2)
  • (modified) lld/ELF/Options.td (+3)
  • (modified) lld/ELF/OutputSections.h (+2-2)
  • (modified) lld/ELF/SyntheticSections.cpp (+15)
  • (modified) lld/ELF/SyntheticSections.h (+9)
  • (modified) lld/ELF/Writer.cpp (+34)
  • (modified) lld/docs/ld.lld.1 (+9)
  • (added) lld/test/ELF/shuffle-padding.s (+29)
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
index a2836733c2715e..ed6bce405d1664 100644
--- a/lld/ELF/Config.h
+++ b/lld/ELF/Config.h
@@ -329,6 +329,7 @@ struct Config {
   bool relrPackDynRelocs = false;
   llvm::DenseSet<llvm::StringRef> saveTempsArgs;
   llvm::SmallVector<std::pair<llvm::GlobPattern, uint32_t>, 0> shuffleSections;
+  std::optional<uint64_t> shufflePadding;
   bool singleRoRx;
   bool shared;
   bool symbolic;
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index bc4b967ccbbbb4..d8bcbe4a2d19d8 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -1410,6 +1410,8 @@ static void readConfigs(Ctx &ctx, opt::InputArgList &args) {
   ctx.arg.searchPaths = args::getStrings(args, OPT_library_path);
   ctx.arg.sectionStartMap = getSectionStartMap(ctx, args);
   ctx.arg.shared = args.hasArg(OPT_shared);
+  if (args.hasArg(OPT_shuffle_padding))
+    ctx.arg.shufflePadding = args::getInteger(args, OPT_shuffle_padding, 0);
   ctx.arg.singleRoRx = !args.hasFlag(OPT_rosegment, OPT_no_rosegment, true);
   ctx.arg.soName = args.getLastArgValue(OPT_soname);
   ctx.arg.sortSection = getSortSection(ctx, args);
diff --git a/lld/ELF/Options.td b/lld/ELF/Options.td
index ebe77204264210..44ac0ee43a8502 100644
--- a/lld/ELF/Options.td
+++ b/lld/ELF/Options.td
@@ -434,6 +434,9 @@ defm section_start: Eq<"section-start", "Set address of section">,
 
 def shared: F<"shared">, HelpText<"Build a shared object">;
 
+def shuffle_padding: JJ<"shuffle-padding=">,
+  HelpText<"Randomly insert padding between input sections using given seed">;
+
 defm soname: Eq<"soname", "Set DT_SONAME">;
 
 defm sort_section:
diff --git a/lld/ELF/OutputSections.h b/lld/ELF/OutputSections.h
index 67191392d1dbe7..3ab36a21ce488d 100644
--- a/lld/ELF/OutputSections.h
+++ b/lld/ELF/OutputSections.h
@@ -124,14 +124,14 @@ class OutputSection final : public SectionBase {
   void sortInitFini();
   void sortCtorsDtors();
 
+  std::array<uint8_t, 4> getFiller(Ctx &);
+
   // Used for implementation of --compress-debug-sections and
   // --compress-sections.
   CompressedData compressed;
 
 private:
   SmallVector<InputSection *, 0> storage;
-
-  std::array<uint8_t, 4> getFiller(Ctx &);
 };
 
 struct OutputDesc final : SectionCommand {
diff --git a/lld/ELF/SyntheticSections.cpp b/lld/ELF/SyntheticSections.cpp
index 21fe2a25fa1bd2..70eca0e58036e0 100644
--- a/lld/ELF/SyntheticSections.cpp
+++ b/lld/ELF/SyntheticSections.cpp
@@ -2753,6 +2753,21 @@ RelroPaddingSection::RelroPaddingSection(Ctx &ctx)
     : SyntheticSection(ctx, ".relro_padding", SHT_NOBITS, SHF_ALLOC | SHF_WRITE,
                        1) {}
 
+ShufflePaddingSection::ShufflePaddingSection(Ctx &ctx, uint64_t size,
+                                             OutputSection *parent)
+    : SyntheticSection(ctx, ".shuffle_padding", SHF_ALLOC, SHT_PROGBITS, 1),
+      size(size) {
+  this->parent = parent;
+}
+
+void ShufflePaddingSection::writeTo(uint8_t *buf) {
+  std::array<uint8_t, 4> filler = getParent()->getFiller(ctx);
+  uint8_t *end = buf + size;
+  for (; buf + 4 <= end; buf += 4)
+    memcpy(buf, &filler[0], 4);
+  memcpy(buf, &filler[0], end - buf);
+}
+
 // The string hash function for .gdb_index.
 static uint32_t computeGdbHash(StringRef s) {
   uint32_t h = 0;
diff --git a/lld/ELF/SyntheticSections.h b/lld/ELF/SyntheticSections.h
index 4b643e86335510..177a0337607da8 100644
--- a/lld/ELF/SyntheticSections.h
+++ b/lld/ELF/SyntheticSections.h
@@ -796,6 +796,15 @@ class RelroPaddingSection final : public SyntheticSection {
   void writeTo(uint8_t *buf) override {}
 };
 
+class ShufflePaddingSection final : public SyntheticSection {
+  uint64_t size;
+
+public:
+  ShufflePaddingSection(Ctx &ctx, uint64_t size, OutputSection *parent);
+  size_t getSize() const override { return size; }
+  void writeTo(uint8_t *buf) override;
+};
+
 // Used by the merged DWARF32 .debug_names (a per-module index). If we
 // move to DWARF64, most of this data will need to be re-sized.
 class DebugNamesBaseSection : public SyntheticSection {
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index a7fbdc07907044..3c1ab3234801d9 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -1444,6 +1444,37 @@ static void finalizeSynthetic(Ctx &ctx, SyntheticSection *sec) {
   }
 }
 
+static bool canInsertPadding(OutputSection *sec) {
+    StringRef s = sec->name;
+    return s == ".bss" || s == ".data" || s == ".data.rel.ro" ||
+           s == ".rodata" || s.starts_with(".text");
+}
+
+static void shufflePadding(Ctx &ctx) {
+  std::mt19937 g(*ctx.arg.shufflePadding);
+  PhdrEntry *curPtLoad = nullptr;
+  for (OutputSection *os : ctx.outputSections) {
+    if (!canInsertPadding(os))
+      continue;
+    for (SectionCommand *bc : os->commands) {
+      if (auto *isd = dyn_cast<InputSectionDescription>(bc)) {
+        SmallVector<InputSection *, 0> tmp;
+        if (os->ptLoad != curPtLoad) {
+          tmp.push_back(
+              make<ShufflePaddingSection>(ctx, g() % ctx.arg.maxPageSize, os));
+          curPtLoad = os->ptLoad;
+        }
+        for (InputSection *isec : isd->sections) {
+          if (g() < (1<<28))
+            tmp.push_back(make<ShufflePaddingSection>(ctx, isec->addralign, os));
+          tmp.push_back(isec);
+        }
+        isd->sections = std::move(tmp);
+      }
+    }
+  }
+}
+
 // We need to generate and finalize the content that depends on the address of
 // InputSections. As the generation of the content may also alter InputSection
 // addresses we must converge to a fixed point. We do that here. See the comment
@@ -1470,6 +1501,9 @@ template <class ELFT> void Writer<ELFT>::finalizeAddressDependentContent() {
   if (ctx.arg.emachine == EM_HEXAGON)
     hexagonTLSSymbolUpdate(ctx);
 
+  if (ctx.arg.shufflePadding)
+    shufflePadding(ctx);
+
   uint32_t pass = 0, assignPasses = 0;
   for (;;) {
     bool changed = ctx.target->needsThunks
diff --git a/lld/docs/ld.lld.1 b/lld/docs/ld.lld.1
index b22cb362837715..5d3d4164622166 100644
--- a/lld/docs/ld.lld.1
+++ b/lld/docs/ld.lld.1
@@ -577,6 +577,15 @@ were concatenated in the order they appeared on the command line.
 Set address of section.
 .It Fl -shared , Fl -Bsharable
 Build a shared object.
+.It Fl -shuffle-padding Ns = Ns Ar seed
+Randomly insert padding between input sections using the given seed.
+Padding is inserted into output sections with names matching the following patterns:
+.Cm .bss ,
+.Cm .data ,
+.Cm .data.rel.ro ,
+.Cm .rodata
+and
+.Cm .text* .
 .It Fl -shuffle-sections Ns = Ns Ar seed
 Shuffle matched sections using the given seed before mapping them to the output sections.
 If -1, reverse the section order. If 0, use a random seed.
diff --git a/lld/test/ELF/shuffle-padding.s b/lld/test/ELF/shuffle-padding.s
new file mode 100644
index 00000000000000..f816100ffba1be
--- /dev/null
+++ b/lld/test/ELF/shuffle-padding.s
@@ -0,0 +1,29 @@
+# REQUIRES: x86
+# RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t.o
+
+# RUN: ld.lld %t.o -o %t.out
+# RUN: llvm-readelf -x .rodata %t.out | FileCheck %s
+# CHECK: Hex dump of section '.rodata':
+# CHECK-NEXT: 0x00201120 010203
+
+## --shuffle-padding= inserts segment offset padding and pre-section padding.
+# RUN: ld.lld --shuffle-padding=1 %t.o -o %t.out
+# RUN: llvm-readelf -x .rodata %t.out | FileCheck --check-prefix=PAD1 %s
+# PAD1: Hex dump of section '.rodata':
+# PAD1-NEXT: 0x00201548 0102cc03
+
+## Size of segment offset padding and location of pre-section padding is
+## dependent on the seed.
+# RUN: ld.lld --shuffle-padding=2 %t.o -o %t.out
+# RUN: llvm-readelf -x .rodata %t.out | FileCheck --check-prefix=PAD2 %s
+# PAD2: Hex dump of section '.rodata':
+# PAD2-NEXT: 0x00201dc8 cc010203
+
+.section .rodata.a,"ax"
+.byte 1
+
+.section .rodata.b,"ax"
+.byte 2
+
+.section .rodata.c,"ax"
+.byte 3

Copy link

github-actions bot commented Nov 26, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@pcc pcc requested review from MaskRay and smithp35 November 26, 2024 00:46
Created using spr 1.3.6-beta.1
Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sticking to the code here. I'm coming at this from a world where linker scripts are common. I think it should be possible to deal with the additional address requirements by disabling the padding in these cases.

A possible alternative that wouldn't involve a ShufflePadding section is to randomly double the section alignment. For a target like AArch64 where the most common alignment will be 4 and section size is a multiple of 4, then this will roughly approximate adding padding sections, albeit with half the sections already on an 8-byte boundary so it may need a higher probability to balance. I think some CPU targets overalign functions to cache-line boundaries at higher optimisation levels so there's less of an impact.

@@ -796,6 +796,15 @@ class RelroPaddingSection final : public SyntheticSection {
void writeTo(uint8_t *buf) override {}
};

class ShufflePaddingSection final : public SyntheticSection {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to call this a PaddingSection? While it can be used to implement shuffle padding, there isn't anything in here that's random. I guess the fixed SHT_PROGBITS would need to be changed if this were to be a completely generic padding section.

Just thinking of possible future uses for this type of section.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can always rename the section if it turns out to be needed for another purpose. For now let's name it after its only user so that it's clear what it's used for.

@@ -2753,6 +2753,21 @@ RelroPaddingSection::RelroPaddingSection(Ctx &ctx)
: SyntheticSection(ctx, ".relro_padding", SHT_NOBITS, SHF_ALLOC | SHF_WRITE,
1) {}

ShufflePaddingSection::ShufflePaddingSection(Ctx &ctx, uint64_t size,
OutputSection *parent)
: SyntheticSection(ctx, ".shuffle_padding", SHF_ALLOC, SHT_PROGBITS, 1),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these flags behave themselves in the case of .bss and .data?

I'd expect if there is at least one other .data section the combined OutputSection flags would be SHF_WRITE. Could be interesting if there's no existing .data, although I expect there won't be any of these section inserted in that case.

Would the SHT_PROGBITs padding sections in .bss result in the combined .bss Output Section being output as SHT_PROGBITS? This might not be ideal if benchmarking program startup, although shouldn't make a difference to the program runtime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new sections are added after the output section flags are resolved, so I don't think the values here matter. I just picked values arbitrarily. The new tests I added show that the correct section flags are preserved.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I see. Could possibly use parent->flags, and parent->type instead. I don't think it will make a difference at the moment, but may be a bit less surprising see.

# RUN: llvm-readelf -x .rodata %t.out | FileCheck --check-prefix=PAD2 %s
# PAD2: Hex dump of section '.rodata':
# PAD2-NEXT: 0x00201dc8 cc010203

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be worth adding a .data and .bss test case for the expected behaviour of RW and SHT_NOBITS sections.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I'll add that.

@MaskRay
Copy link
Member

MaskRay commented Nov 27, 2024

ELF: Introduce --shuffle-padding flag.

We call these options. Boolean options are also called flags. We don't use "flag" to name non-boolean options.
(The naming convention here is different from absl flags)

@pcc
Copy link
Contributor Author

pcc commented Nov 28, 2024

Sticking to the code here. I'm coming at this from a world where linker scripts are common. I think it should be possible to deal with the additional address requirements by disabling the padding in these cases.

A possible alternative that wouldn't involve a ShufflePadding section is to randomly double the section alignment. For a target like AArch64 where the most common alignment will be 4 and section size is a multiple of 4, then this will roughly approximate adding padding sections, albeit with half the sections already on an 8-byte boundary so it may need a higher probability to balance. I think some CPU targets overalign functions to cache-line boundaries at higher optimisation levels so there's less of an impact.

I considered increasing alignment but decided against it because that could introduce its own biases. For example, it makes functions more likely to be moved to a single cache line, but for the experiment scenario we want a roughly equal likelihood of the function being moved either to or from a single cache line because that matches the probabilities in a real world scenario. Increasing alignment isn't necessarily immune to an implicit expectation in a linker script either (e.g. it could assume that certain input sections have certain alignments).

The padding sections are a bit like range extension thunk sections which should already work with linker scripts (and are inserted at a similar point, after linker script processing) so I don't foresee any significant problems from the use of these two features together per se. I think that implicit expectations in linker scripts can be addressed in a followup change that would allow the user to override the output section names that receive padding (like --shuffle-sections already does). For the time being, by using the two features together, the user is essentially warranting that their linker script is compatible with shuffle padding with the default set of output section names.

Created using spr 1.3.6-beta.1
@pcc pcc changed the title ELF: Introduce --shuffle-padding flag. ELF: Introduce --shuffle-padding option. Nov 28, 2024
@pcc
Copy link
Contributor Author

pcc commented Nov 28, 2024

ELF: Introduce --shuffle-padding flag.

We call these options. Boolean options are also called flags. We don't use "flag" to name non-boolean options. (The naming convention here is different from absl flags)

Updated commit message.

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates.

I think it is fine to say that this is an opt-in feature, don't hand-roll your own table of pointers (like .init_array) and consolidate it into one of the OutputSections that shuffle-padding might be inserted in.

If we had documentation for LLD like GNU ld, I'd suggest that we highlight it as a limitation of the feature. The man page is a possibility, but I think it is probably not worth it as I would expect a custom section name to be chosen.

I don't have any more comments. As an aside there has been some interest in this by the folks that do benchmarking internally.

@@ -2753,6 +2753,21 @@ RelroPaddingSection::RelroPaddingSection(Ctx &ctx)
: SyntheticSection(ctx, ".relro_padding", SHT_NOBITS, SHF_ALLOC | SHF_WRITE,
1) {}

ShufflePaddingSection::ShufflePaddingSection(Ctx &ctx, uint64_t size,
OutputSection *parent)
: SyntheticSection(ctx, ".shuffle_padding", SHF_ALLOC, SHT_PROGBITS, 1),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I see. Could possibly use parent->flags, and parent->type instead. I don't think it will make a difference at the moment, but may be a bit less surprising see.

@@ -0,0 +1,36 @@
# REQUIRES: x86
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bss/data/rodata can be merged into one test.

It's important to test an output section with multiple InputSectionDescription.

By default .rodata .text .data are in different PT_LOAD segments. The os->ptLoad != curPtLoad condition isn't covered by tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I moved this into a single test and used the fact that bss and data are in the same PT_LOAD to cover os->ptLoad != curPtLoad .

curPtLoad = os->ptLoad;
}
for (InputSection *isec : isd->sections) {
if (g() < (1 << 28))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1<<28 needs a comment. This should be configurable by the command line option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment and made this a bit less magical (use modulus instead). I'm not sure we need to make it configurable from the start. In the future, if/when we make it configurable, we can just make 1 in 16 the default.

@mplatings
Copy link
Collaborator

I love the idea and I want to use it already to work around some problems I'm seeing.

If I wasn't already aware of the feature and I was scanning through a list of command-line options then I'm not sure I'd catch on to what this is from the name. I think shuffle implies reordering and it looks like there's no reordering going on here. --randomize-padding might be clearer. And --randomize-section-padding might be clearer still. Brevity is overrated!

Created using spr 1.3.6-beta.1
@pcc pcc changed the title ELF: Introduce --shuffle-padding option. ELF: Introduce --randomize-section-padding option. Dec 5, 2024
@pcc
Copy link
Contributor Author

pcc commented Dec 5, 2024

I love the idea and I want to use it already to work around some problems I'm seeing.

If I wasn't already aware of the feature and I was scanning through a list of command-line options then I'm not sure I'd catch on to what this is from the name. I think shuffle implies reordering and it looks like there's no reordering going on here. --randomize-padding might be clearer. And --randomize-section-padding might be clearer still. Brevity is overrated!

Okay, I renamed it to --randomize-section-padding.

Created using spr 1.3.6-beta.1
@pcc
Copy link
Contributor Author

pcc commented Dec 11, 2024

@MaskRay Ping.

@MaskRay
Copy link
Member

MaskRay commented Dec 11, 2024

@MaskRay Ping.

Done:) (You may want to click "Re-request review" in the future.)

peterwaller-arm added a commit that referenced this pull request Dec 11, 2024
This complements --pad-funcs, and by using both simultaneously, enables
moving a specific function through the address space without modifying
any code
other than the targeted function (and references to it) by doing
(before+after=constant).

See also: proposed functionality to enable inserting random padding in

https://discourse.llvm.org/t/rfc-lld-feature-for-controlling-for-code-size-dependent-measurement-bias
and #117653
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
@pcc pcc requested a review from MaskRay December 12, 2024 02:31
Copy link
Member

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Does smithp35 have opinions?

static bool canInsertPadding(OutputSection *sec) {
StringRef s = sec->name;
return s == ".bss" || s == ".data" || s == ".data.rel.ro" || s == ".lbss" ||
s == ".ldata" || s == ".lrodata" || s == ".ltext" || s == ".rodata" ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just .ltext, which is not used by large code models.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/just .ltext/remove .ltext/ ? I got the impression from

return IsLarge ? ".ltext" : ".text";
and
return getCodeModel() == CodeModel::Large;
that it was, or am I missing something?

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too.

I've got a mild preference for setting the type and flags of the RandomizePaddingSection using the type and flags of the OutputSection os passed in to the constructor. This won't make a functional difference at this point though.

As an aside, I'll be out of office for the next 3 weeks, may be a bit slow to respond.

@pcc pcc merged commit 64da33a into main Dec 13, 2024
9 checks passed
@pcc pcc deleted the users/pcc/spr/elf-introduce-shuffle-padding-flag branch December 13, 2024 19:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants