Skip to content

Reland "[llvm-exegesis] Add support for pinning benchmarking process to a CPU (#85168)" #109688

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

boomanaiden154
Copy link
Contributor

This reverts commit 2cd20c2.

This relands commit 9886788.

This was causing more buildbot failures due to getcpu not being available with glibc <=2.29. This patch fixes that by directly making the syscall, assuming the syscall number macro is available.

…to a CPU (llvm#85168)"

This reverts commit 2cd20c2.

This relands commit 9886788.

This was causing more buildbot failures due to getcpu not being
available with glibc <=2.29. This patch fixes that by directly making
the syscall, assuming the syscall number macro is available.
@llvmbot
Copy link
Member

llvmbot commented Sep 23, 2024

@llvm/pr-subscribers-tools-llvm-exegesis

Author: Aiden Grossman (boomanaiden154)

Changes

This reverts commit 2cd20c2.

This relands commit 9886788.

This was causing more buildbot failures due to getcpu not being available with glibc <=2.29. This patch fixes that by directly making the syscall, assuming the syscall number macro is available.


Full diff: https://github.com/llvm/llvm-project/pull/109688.diff

5 Files Affected:

  • (added) llvm/test/tools/llvm-exegesis/X86/latency/cpu-pinning-execution-mode.s (+5)
  • (added) llvm/test/tools/llvm-exegesis/X86/latency/cpu-pinning.s (+5)
  • (modified) llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp (+64-12)
  • (modified) llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h (+4-2)
  • (modified) llvm/tools/llvm-exegesis/llvm-exegesis.cpp (+10-1)
diff --git a/llvm/test/tools/llvm-exegesis/X86/latency/cpu-pinning-execution-mode.s b/llvm/test/tools/llvm-exegesis/X86/latency/cpu-pinning-execution-mode.s
new file mode 100644
index 00000000000000..b73ac26f2cfc74
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/X86/latency/cpu-pinning-execution-mode.s
@@ -0,0 +1,5 @@
+# REQUIRES: exegesis-can-measure-latency, x86_64-linux
+
+# RUN: not llvm-exegesis -mtriple=x86_64-unknown-unknown -mode=latency -opcode-name=ADD64rr -execution-mode=inprocess --benchmark-process-cpu=0 2>&1 | FileCheck %s
+
+# CHECK: llvm-exegesis error: The inprocess execution mode does not support benchmark core pinning.
diff --git a/llvm/test/tools/llvm-exegesis/X86/latency/cpu-pinning.s b/llvm/test/tools/llvm-exegesis/X86/latency/cpu-pinning.s
new file mode 100644
index 00000000000000..0ea3752fc3bb95
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/X86/latency/cpu-pinning.s
@@ -0,0 +1,5 @@
+# REQUIRES: exegesis-can-measure-latency, x86_64-linux
+
+# RUN: llvm-exegesis -mtriple=x86_64-unknown-unknown -mode=latency -opcode-name=ADD64rr -execution-mode=subprocess | FileCheck %s
+
+# CHECK: - { key: latency, value: {{[0-9.]*}}, per_snippet_value: {{[0-9.]*}}
diff --git a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
index 4e60d33fa2c637..9116b5ced02748 100644
--- a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
+++ b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
@@ -98,7 +98,8 @@ class InProcessFunctionExecutorImpl : public BenchmarkRunner::FunctionExecutor {
 public:
   static Expected<std::unique_ptr<InProcessFunctionExecutorImpl>>
   create(const LLVMState &State, object::OwningBinary<object::ObjectFile> Obj,
-         BenchmarkRunner::ScratchSpace *Scratch) {
+         BenchmarkRunner::ScratchSpace *Scratch,
+         std::optional<int> BenchmarkProcessCPU) {
     Expected<ExecutableFunction> EF =
         ExecutableFunction::create(State.createTargetMachine(), std::move(Obj));
 
@@ -190,27 +191,31 @@ class SubProcessFunctionExecutorImpl
 public:
   static Expected<std::unique_ptr<SubProcessFunctionExecutorImpl>>
   create(const LLVMState &State, object::OwningBinary<object::ObjectFile> Obj,
-         const BenchmarkKey &Key) {
+         const BenchmarkKey &Key, std::optional<int> BenchmarkProcessCPU) {
     Expected<ExecutableFunction> EF =
         ExecutableFunction::create(State.createTargetMachine(), std::move(Obj));
     if (!EF)
       return EF.takeError();
 
     return std::unique_ptr<SubProcessFunctionExecutorImpl>(
-        new SubProcessFunctionExecutorImpl(State, std::move(*EF), Key));
+        new SubProcessFunctionExecutorImpl(State, std::move(*EF), Key,
+                                           BenchmarkProcessCPU));
   }
 
 private:
   SubProcessFunctionExecutorImpl(const LLVMState &State,
                                  ExecutableFunction Function,
-                                 const BenchmarkKey &Key)
-      : State(State), Function(std::move(Function)), Key(Key) {}
+                                 const BenchmarkKey &Key,
+                                 std::optional<int> BenchmarkCPU)
+      : State(State), Function(std::move(Function)), Key(Key),
+        BenchmarkProcessCPU(BenchmarkCPU) {}
 
   enum ChildProcessExitCodeE {
     CounterFDReadFailed = 1,
     RSeqDisableFailed,
     FunctionDataMappingFailed,
-    AuxiliaryMemorySetupFailed
+    AuxiliaryMemorySetupFailed,
+    SetCPUAffinityFailed
   };
 
   StringRef childProcessExitCodeToString(int ExitCode) const {
@@ -223,6 +228,8 @@ class SubProcessFunctionExecutorImpl
       return "Failed to map memory for assembled snippet";
     case ChildProcessExitCodeE::AuxiliaryMemorySetupFailed:
       return "Failed to setup auxiliary memory";
+    case ChildProcessExitCodeE::SetCPUAffinityFailed:
+      return "Failed to set CPU affinity of the benchmarking process";
     default:
       return "Child process returned with unknown exit code";
     }
@@ -384,6 +391,41 @@ class SubProcessFunctionExecutorImpl
     return make_error<SnippetSignal>(ChildSignalInfo.si_signo);
   }
 
+  static void setCPUAffinityIfRequested(int CPUToUse) {
+// Special case this function for x86_64 for now as certain more esoteric
+// platforms have different definitions for some of the libc functions that
+// cause buildtime failures. Additionally, the subprocess executor mode (the
+// sole mode where this is supported) currently only supports x86_64.
+
+// Also check that we have the SYS_getcpu macro defined, meaning the syscall
+// actually exists within the build environment. We manually use the syscall
+// rather than the libc wrapper given the wrapper for getcpu is only available
+// in glibc 2.29 and later.
+#if defined(__x86_64__) && defined(SYS_getcpu)
+    // Set the CPU affinity for the child process, so that we ensure that if
+    // the user specified a CPU the process should run on, the benchmarking
+    // process is running on that CPU.
+    cpu_set_t CPUMask;
+    CPU_ZERO(&CPUMask);
+    CPU_SET(CPUToUse, &CPUMask);
+    // TODO(boomanaiden154): Rewrite this to use LLVM primitives once they
+    // are available.
+    int SetAffinityReturn = sched_setaffinity(0, sizeof(CPUMask), &CPUMask);
+    if (SetAffinityReturn == -1) {
+      exit(ChildProcessExitCodeE::SetCPUAffinityFailed);
+    }
+
+    // Check (if assertions are enabled) that we are actually running on the
+    // CPU that was specified by the user.
+    [[maybe_unused]] unsigned int CurrentCPU;
+    assert(syscall(SYS_getcpu, &CurrentCPU, nullptr) == 0 &&
+           "Expected getcpu call to succeed.");
+    assert(static_cast<int>(CurrentCPU) == CPUToUse &&
+           "Expected current CPU to equal the CPU requested by the user");
+#endif // defined(__x86_64__) && defined(SYS_getcpu)
+    exit(ChildProcessExitCodeE::SetCPUAffinityFailed);
+  }
+
   Error createSubProcessAndRunBenchmark(
       StringRef CounterName, SmallVectorImpl<int64_t> &CounterValues,
       ArrayRef<const char *> ValidationCounters,
@@ -416,6 +458,10 @@ class SubProcessFunctionExecutorImpl
     }
 
     if (ParentOrChildPID == 0) {
+      if (BenchmarkProcessCPU.has_value()) {
+        setCPUAffinityIfRequested(*BenchmarkProcessCPU);
+      }
+
       // We are in the child process, close the write end of the pipe.
       close(PipeFiles[1]);
       // Unregister handlers, signal handling is now handled through ptrace in
@@ -538,6 +584,7 @@ class SubProcessFunctionExecutorImpl
   const LLVMState &State;
   const ExecutableFunction Function;
   const BenchmarkKey &Key;
+  const std::optional<int> BenchmarkProcessCPU;
 };
 #endif // __linux__
 } // namespace
@@ -615,11 +662,15 @@ BenchmarkRunner::getRunnableConfiguration(
 Expected<std::unique_ptr<BenchmarkRunner::FunctionExecutor>>
 BenchmarkRunner::createFunctionExecutor(
     object::OwningBinary<object::ObjectFile> ObjectFile,
-    const BenchmarkKey &Key) const {
+    const BenchmarkKey &Key, std::optional<int> BenchmarkProcessCPU) const {
   switch (ExecutionMode) {
   case ExecutionModeE::InProcess: {
+    if (BenchmarkProcessCPU.has_value())
+      return make_error<Failure>("The inprocess execution mode does not "
+                                 "support benchmark core pinning.");
+
     auto InProcessExecutorOrErr = InProcessFunctionExecutorImpl::create(
-        State, std::move(ObjectFile), Scratch.get());
+        State, std::move(ObjectFile), Scratch.get(), BenchmarkProcessCPU);
     if (!InProcessExecutorOrErr)
       return InProcessExecutorOrErr.takeError();
 
@@ -628,7 +679,7 @@ BenchmarkRunner::createFunctionExecutor(
   case ExecutionModeE::SubProcess: {
 #ifdef __linux__
     auto SubProcessExecutorOrErr = SubProcessFunctionExecutorImpl::create(
-        State, std::move(ObjectFile), Key);
+        State, std::move(ObjectFile), Key, BenchmarkProcessCPU);
     if (!SubProcessExecutorOrErr)
       return SubProcessExecutorOrErr.takeError();
 
@@ -643,8 +694,8 @@ BenchmarkRunner::createFunctionExecutor(
 }
 
 std::pair<Error, Benchmark> BenchmarkRunner::runConfiguration(
-    RunnableConfiguration &&RC,
-    const std::optional<StringRef> &DumpFile) const {
+    RunnableConfiguration &&RC, const std::optional<StringRef> &DumpFile,
+    std::optional<int> BenchmarkProcessCPU) const {
   Benchmark &BenchmarkResult = RC.BenchmarkResult;
   object::OwningBinary<object::ObjectFile> &ObjectFile = RC.ObjectFile;
 
@@ -665,7 +716,8 @@ std::pair<Error, Benchmark> BenchmarkRunner::runConfiguration(
   }
 
   Expected<std::unique_ptr<BenchmarkRunner::FunctionExecutor>> Executor =
-      createFunctionExecutor(std::move(ObjectFile), RC.BenchmarkResult.Key);
+      createFunctionExecutor(std::move(ObjectFile), RC.BenchmarkResult.Key,
+                             BenchmarkProcessCPU);
   if (!Executor)
     return {Executor.takeError(), std::move(BenchmarkResult)};
   auto NewMeasurements = runMeasurements(**Executor);
diff --git a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h
index 9b4bb1d41149fe..e688b814d1c83d 100644
--- a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h
+++ b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h
@@ -68,7 +68,8 @@ class BenchmarkRunner {
 
   std::pair<Error, Benchmark>
   runConfiguration(RunnableConfiguration &&RC,
-                   const std::optional<StringRef> &DumpFile) const;
+                   const std::optional<StringRef> &DumpFile,
+                   std::optional<int> BenchmarkProcessCPU) const;
 
   // Scratch space to run instructions that touch memory.
   struct ScratchSpace {
@@ -135,7 +136,8 @@ class BenchmarkRunner {
 
   Expected<std::unique_ptr<FunctionExecutor>>
   createFunctionExecutor(object::OwningBinary<object::ObjectFile> Obj,
-                         const BenchmarkKey &Key) const;
+                         const BenchmarkKey &Key,
+                         std::optional<int> BenchmarkProcessCPU) const;
 };
 
 } // namespace exegesis
diff --git a/llvm/tools/llvm-exegesis/llvm-exegesis.cpp b/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
index e6a43cfc6db51c..546ec770a8d221 100644
--- a/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
+++ b/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
@@ -269,6 +269,11 @@ static cl::list<ValidationEvent> ValidationCounters(
         "counter to validate benchmarking assumptions"),
     cl::CommaSeparated, cl::cat(BenchmarkOptions), ValidationEventOptions());
 
+static cl::opt<int> BenchmarkProcessCPU(
+    "benchmark-process-cpu",
+    cl::desc("The CPU number that the benchmarking process should executon on"),
+    cl::cat(BenchmarkOptions), cl::init(-1));
+
 static ExitOnError ExitOnErr("llvm-exegesis error: ");
 
 // Helper function that logs the error(s) and exits.
@@ -418,8 +423,12 @@ static void runBenchmarkConfigurations(
         std::optional<StringRef> DumpFile;
         if (DumpObjectToDisk.getNumOccurrences())
           DumpFile = DumpObjectToDisk;
+        const std::optional<int> BenchmarkCPU =
+            BenchmarkProcessCPU == -1
+                ? std::nullopt
+                : std::optional(BenchmarkProcessCPU.getValue());
         auto [Err, BenchmarkResult] =
-            Runner.runConfiguration(std::move(RC), DumpFile);
+            Runner.runConfiguration(std::move(RC), DumpFile, BenchmarkCPU);
         if (Err) {
           // Errors from executing the snippets are fine.
           // All other errors are a framework issue and should fail.

@boomanaiden154 boomanaiden154 merged commit e093bb9 into llvm:main Sep 23, 2024
8 of 9 checks passed
@boomanaiden154 boomanaiden154 deleted the llvm-exegesis-sys-getcpu-macro branch September 23, 2024 18:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants