[libc][math] Improve performance test framework #134501

Merged: 4 commits, Apr 24, 2025

meltq (Contributor) commented Apr 5, 2025

  • Merges the BinaryOpSingleOutputPerf.h and SingleInputSingleOutputPerf.h files into a unified PerfTest.h and updates all performance tests to use it.
  • Improves the output printed to the log file for tests.
  • Removes the unused run_diff method and the redundant run_perf call in BINARY_INPUT_SINGLE_OUTPUT_PERF_EX (previously BINARY_OP_SINGLE_OUTPUT_PERF_EX).
  • Changes BINARY_INPUT_SINGLE_OUTPUT_PERF_EX and SINGLE_INPUT_SINGLE_OUTPUT_PERF so that they no longer define main.

@llvmbot llvmbot added the libc label Apr 5, 2025
llvmbot (Member) commented Apr 5, 2025

@llvm/pr-subscribers-libc

Author: Tejas Vipin (meltq)

Changes

Switches from the "Mine/Other function" labels to "Function A/B" and prints the names of the actual functions passed on the first line of the log. Also reports the ratio of ops per second instead of the ratio of average runtimes.


Full diff: https://github.com/llvm/llvm-project/pull/134501.diff

2 Files Affected:

  • (modified) libc/test/src/math/performance_testing/BinaryOpSingleOutputPerf.h (+26-25)
  • (modified) libc/test/src/math/performance_testing/SingleInputSingleOutputPerf.h (+22-21)
diff --git a/libc/test/src/math/performance_testing/BinaryOpSingleOutputPerf.h b/libc/test/src/math/performance_testing/BinaryOpSingleOutputPerf.h
index 98a1813bd7b54..8001710d83f5e 100644
--- a/libc/test/src/math/performance_testing/BinaryOpSingleOutputPerf.h
+++ b/libc/test/src/math/performance_testing/BinaryOpSingleOutputPerf.h
@@ -26,7 +26,7 @@ class BinaryOpSingleOutputPerf {
 public:
   typedef OutputType Func(InputType, InputType);
 
-  static void run_perf_in_range(Func myFunc, Func otherFunc,
+  static void run_perf_in_range(Func FuncA, Func FuncB,
                                 StorageType startingBit, StorageType endingBit,
                                 size_t N, size_t rounds, std::ofstream &log) {
     if (sizeof(StorageType) <= sizeof(size_t))
@@ -54,48 +54,48 @@ class BinaryOpSingleOutputPerf {
 
     Timer timer;
     timer.start();
-    runner(myFunc);
+    runner(FuncA);
     timer.stop();
 
-    double my_average = static_cast<double>(timer.nanoseconds()) / N / rounds;
-    log << "-- My function --\n";
+    double a_average = static_cast<double>(timer.nanoseconds()) / N / rounds;
+    log << "-- Function A --\n";
     log << "     Total time      : " << timer.nanoseconds() << " ns \n";
-    log << "     Average runtime : " << my_average << " ns/op \n";
+    log << "     Average runtime : " << a_average << " ns/op \n";
     log << "     Ops per second  : "
-        << static_cast<uint64_t>(1'000'000'000.0 / my_average) << " op/s \n";
+        << static_cast<uint64_t>(1'000'000'000.0 / a_average) << " op/s \n";
 
     timer.start();
-    runner(otherFunc);
+    runner(FuncB);
     timer.stop();
 
-    double other_average =
-        static_cast<double>(timer.nanoseconds()) / N / rounds;
-    log << "-- Other function --\n";
+    double b_average = static_cast<double>(timer.nanoseconds()) / N / rounds;
+    log << "-- Function B --\n";
     log << "     Total time      : " << timer.nanoseconds() << " ns \n";
-    log << "     Average runtime : " << other_average << " ns/op \n";
+    log << "     Average runtime : " << b_average << " ns/op \n";
     log << "     Ops per second  : "
-        << static_cast<uint64_t>(1'000'000'000.0 / other_average) << " op/s \n";
+        << static_cast<uint64_t>(1'000'000'000.0 / b_average) << " op/s \n";
 
-    log << "-- Average runtime ratio --\n";
-    log << "     Mine / Other's  : " << my_average / other_average << " \n";
+    log << "-- Average ops per second ratio --\n";
+    log << "     A / B  : " << b_average / a_average << " \n";
   }
 
-  static void run_perf(Func myFunc, Func otherFunc, int rounds,
-                       const char *logFile) {
+  static void run_perf(Func FuncA, Func FuncB, int rounds, const char *name_a,
+                       const char *name_b, const char *logFile) {
     std::ofstream log(logFile);
+    log << "Function A - " << name_a << " Function B - " << name_b << "\n";
     log << " Performance tests with inputs in denormal range:\n";
-    run_perf_in_range(myFunc, otherFunc, /* startingBit= */ StorageType(0),
+    run_perf_in_range(FuncA, FuncB, /* startingBit= */ StorageType(0),
                       /* endingBit= */ FPBits::max_subnormal().uintval(),
                       1'000'001, rounds, log);
     log << "\n Performance tests with inputs in normal range:\n";
-    run_perf_in_range(myFunc, otherFunc,
+    run_perf_in_range(FuncA, FuncB,
                       /* startingBit= */ FPBits::min_normal().uintval(),
                       /* endingBit= */ FPBits::max_normal().uintval(),
                       1'000'001, rounds, log);
     log << "\n Performance tests with inputs in normal range with exponents "
            "close to each other:\n";
     run_perf_in_range(
-        myFunc, otherFunc,
+        FuncA, FuncB,
         /* startingBit= */ FPBits(OutputType(0x1.0p-10)).uintval(),
         /* endingBit= */ FPBits(OutputType(0x1.0p+10)).uintval(), 1'000'001,
         rounds, log);
@@ -128,21 +128,22 @@ class BinaryOpSingleOutputPerf {
 } // namespace testing
 } // namespace LIBC_NAMESPACE_DECL
 
-#define BINARY_OP_SINGLE_OUTPUT_PERF(OutputType, InputType, myFunc, otherFunc, \
+#define BINARY_OP_SINGLE_OUTPUT_PERF(OutputType, InputType, FuncA, FuncB,      \
                                      filename)                                 \
   int main() {                                                                 \
     LIBC_NAMESPACE::testing::BinaryOpSingleOutputPerf<                         \
-        OutputType, InputType>::run_perf(&myFunc, &otherFunc, 1, filename);    \
+        OutputType, InputType>::run_perf(&FuncA, &FuncB, 1, #FuncA, #FuncB,    \
+                                         filename);                            \
     return 0;                                                                  \
   }
 
-#define BINARY_OP_SINGLE_OUTPUT_PERF_EX(OutputType, InputType, myFunc,         \
-                                        otherFunc, rounds, filename)           \
+#define BINARY_OP_SINGLE_OUTPUT_PERF_EX(OutputType, InputType, FuncA, FuncB,   \
+                                        rounds, filename)                      \
   {                                                                            \
     LIBC_NAMESPACE::testing::BinaryOpSingleOutputPerf<                         \
-        OutputType, InputType>::run_perf(&myFunc, &otherFunc, rounds,          \
+        OutputType, InputType>::run_perf(&FuncA, &FuncB, 1, #FuncA, #FuncB,    \
                                          filename);                            \
     LIBC_NAMESPACE::testing::BinaryOpSingleOutputPerf<                         \
-        OutputType, InputType>::run_perf(&myFunc, &otherFunc, rounds,          \
+        OutputType, InputType>::run_perf(&FuncA, &FuncB, 1, #FuncA, #FuncB,    \
                                          filename);                            \
   }
diff --git a/libc/test/src/math/performance_testing/SingleInputSingleOutputPerf.h b/libc/test/src/math/performance_testing/SingleInputSingleOutputPerf.h
index efad1259d6bf1..93c217de250e6 100644
--- a/libc/test/src/math/performance_testing/SingleInputSingleOutputPerf.h
+++ b/libc/test/src/math/performance_testing/SingleInputSingleOutputPerf.h
@@ -25,7 +25,7 @@ template <typename T> class SingleInputSingleOutputPerf {
 public:
   typedef T Func(T);
 
-  static void runPerfInRange(Func myFunc, Func otherFunc,
+  static void runPerfInRange(Func FuncA, Func FuncB,
                              StorageType startingBit, StorageType endingBit,
                              size_t rounds, std::ofstream &log) {
     size_t n = 10'010'001;
@@ -47,40 +47,41 @@ template <typename T> class SingleInputSingleOutputPerf {
 
     Timer timer;
     timer.start();
-    runner(myFunc);
+    runner(FuncA);
     timer.stop();
 
-    double myAverage = static_cast<double>(timer.nanoseconds()) / n / rounds;
-    log << "-- My function --\n";
+    double a_average = static_cast<double>(timer.nanoseconds()) / n / rounds;
+    log << "-- Function A --\n";
     log << "     Total time      : " << timer.nanoseconds() << " ns \n";
-    log << "     Average runtime : " << myAverage << " ns/op \n";
+    log << "     Average runtime : " << a_average << " ns/op \n";
     log << "     Ops per second  : "
-        << static_cast<uint64_t>(1'000'000'000.0 / myAverage) << " op/s \n";
+        << static_cast<uint64_t>(1'000'000'000.0 / a_average) << " op/s \n";
 
     timer.start();
-    runner(otherFunc);
+    runner(FuncB);
     timer.stop();
 
-    double otherAverage = static_cast<double>(timer.nanoseconds()) / n / rounds;
-    log << "-- Other function --\n";
+    double b_average = static_cast<double>(timer.nanoseconds()) / n / rounds;
+    log << "-- Function B --\n";
     log << "     Total time      : " << timer.nanoseconds() << " ns \n";
-    log << "     Average runtime : " << otherAverage << " ns/op \n";
+    log << "     Average runtime : " << b_average << " ns/op \n";
     log << "     Ops per second  : "
-        << static_cast<uint64_t>(1'000'000'000.0 / otherAverage) << " op/s \n";
+        << static_cast<uint64_t>(1'000'000'000.0 / b_average) << " op/s \n";
 
-    log << "-- Average runtime ratio --\n";
-    log << "     Mine / Other's  : " << myAverage / otherAverage << " \n";
+    log << "-- Average ops per second ratio --\n";
+    log << "     A / B  : " << b_average / a_average << " \n";
   }
 
-  static void runPerf(Func myFunc, Func otherFunc, size_t rounds,
-                      const char *logFile) {
+  static void runPerf(Func FuncA, Func FuncB, size_t rounds, const char *name_a,
+                      const char *name_b, const char *logFile) {
     std::ofstream log(logFile);
+    log << "Function A - " << name_a << " Function B - " << name_b << "\n";
     log << " Performance tests with inputs in denormal range:\n";
-    runPerfInRange(myFunc, otherFunc, /* startingBit= */ StorageType(0),
+    runPerfInRange(FuncA, FuncB, /* startingBit= */ StorageType(0),
                    /* endingBit= */ FPBits::max_subnormal().uintval(), rounds,
                    log);
     log << "\n Performance tests with inputs in normal range:\n";
-    runPerfInRange(myFunc, otherFunc,
+    runPerfInRange(FuncA, FuncB,
                    /* startingBit= */ FPBits::min_normal().uintval(),
                    /* endingBit= */ FPBits::max_normal().uintval(), rounds,
                    log);
@@ -90,16 +91,16 @@ template <typename T> class SingleInputSingleOutputPerf {
 } // namespace testing
 } // namespace LIBC_NAMESPACE_DECL
 
-#define SINGLE_INPUT_SINGLE_OUTPUT_PERF(T, myFunc, otherFunc, filename)        \
+#define SINGLE_INPUT_SINGLE_OUTPUT_PERF(T, FuncA, FuncB, filename)             \
   int main() {                                                                 \
     LIBC_NAMESPACE::testing::SingleInputSingleOutputPerf<T>::runPerf(          \
-        &myFunc, &otherFunc, 1, filename);                                     \
+        &FuncA, &FuncB, 1, #FuncA, #FuncB, filename);                          \
     return 0;                                                                  \
   }
 
-#define SINGLE_INPUT_SINGLE_OUTPUT_PERF_EX(T, myFunc, otherFunc, rounds,       \
+#define SINGLE_INPUT_SINGLE_OUTPUT_PERF_EX(T, FuncA, FuncB, rounds,            \
                                            filename)                           \
   {                                                                            \
     LIBC_NAMESPACE::testing::SingleInputSingleOutputPerf<T>::runPerf(          \
-        &myFunc, &otherFunc, rounds, filename);                                \
+        &FuncA, &FuncB, rounds, #FuncA, #FuncB, filename);                     \
   }

github-actions bot commented Apr 5, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

meltq (Contributor, Author) commented Apr 5, 2025

Old output

 Performance tests with inputs in denormal range:
-- My function --
     Total time      : 4545310 ns 
     Average runtime : 4443.12 ns/op 
     Ops per second  : 225067 op/s 
-- Other function --
     Total time      : 654400 ns 
     Average runtime : 639.687 ns/op 
     Ops per second  : 1563264 op/s 
-- Average runtime ratio --
     Mine / Other's  : 6.94577

New output

Function A - LIBC_NAMESPACE::hypotf16 Function B - LIBC_NAMESPACE::fputil::hypot<float16>
 Performance tests with inputs in denormal range:
-- Function A --
     Total time      : 2670736 ns 
     Average runtime : 2610.69 ns/op 
     Ops per second  : 383040 op/s 
-- Function B --
     Total time      : 499614 ns 
     Average runtime : 488.381 ns/op 
     Ops per second  : 2047580 op/s 
-- Average ops per second ratio --
     A / B  : 0.18707 

meltq (Contributor, Author) commented Apr 5, 2025

@lntue @overmighty requesting review

@overmighty overmighty requested review from overmighty and lntue April 5, 2025 19:35
meltq (Contributor, Author) commented Apr 15, 2025

Ping

overmighty (Member) commented

I would prefer function names to be printed like this personally:

 Performance tests with inputs in denormal range:
-- Function A: LIBC_NAMESPACE::hypotf16 --
     Total time      : 2670736 ns
     Average runtime : 2610.69 ns/op
     Ops per second  : 383040 op/s
-- Function B: LIBC_NAMESPACE::fputil::hypot<float16> --
     Total time      : 499614 ns
     Average runtime : 488.381 ns/op
     Ops per second  : 2047580 op/s
-- Average ops per second ratio --
     A / B  : 0.18707

@meltq meltq changed the title from "[libc][math] Improve performance test output" to "[libc][math] Improve performance test framework" Apr 18, 2025
meltq (Contributor, Author) commented Apr 18, 2025

Made all suggested changes.

michaelrj-google (Contributor) left a comment

LGTM, I'll let @lntue make the final call. Thanks for this cleanup!

meltq (Contributor, Author) commented Apr 24, 2025

I'll need help for this to be merged since I don't have write perms.

@lntue lntue merged commit dde00f5 into llvm:main Apr 24, 2025
16 checks passed
@meltq meltq deleted the perf_cleanup branch April 26, 2025 05:06
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
- Merges `BinaryOpSingleOutputPerf.h` and
`SingleInputSingleOutputPerf.h` files into a unified `PerfTest.h` and
update all performance tests to use this.
- Improve the output printed to log file for tests.
- Removes unused `run_diff` method and redundant `run_perf` call in
`BINARY_INPUT_SINGLE_OUTPUT_PERF_EX` (previously
`BINARY_OP_SINGLE_OUTPUT_PERF_EX`)
- Change `BINARY_INPUT_SINGLE_OUTPUT_PERF_EX` and
`SINGLE_INPUT_SINGLE_OUTPUT_PERF` to not define `main`
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025
5 participants