Skip to content

[Exegesis][AArch64] Use more generic cycles counter #133376

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Mar 30, 2025

Conversation

r-belenov
Copy link
Contributor

@r-belenov r-belenov commented Mar 28, 2025

CPU_CYCLES counter does not work on some Aarch64 CPUs; CYCLES is more generic and is equivalent to CPU_CYCLES in case the latter is supported.

Longer story - CPU_CYCLES work only on CPU models explicitly recognized by libpfm4 ( via pfm_arm_detect_*() functions in https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/lib/pfmlib_arm_armv8.c ) and its name is consistent with ARM documentation. However, the counter is architectural and is supported on all ARMv8 CPUs; libpfm4 recognizes generic PMU on unknown ARMv8 CPUs, but does not provide CPU_CYCLES event. Instead, CYCLES is provided (an alias to PERF_COUNT_HW_CPU_CYCLES). Physically, it is the same event with code 0x11. On supported architectures CYCLES also work, so the change should not introduce regression.

Change counter name to more generic one
Update the test according to the change in the code
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented Mar 28, 2025

@llvm/pr-subscribers-tools-llvm-exegesis

@llvm/pr-subscribers-backend-aarch64

Author: None (r-belenov)

Changes

[Exegesis][AArch64] Use more generic cycles counter
CPU_CYCLES counter does not on some Aarch64 CPUs; CYCLES is more generic and is equivalent to CPU_CYCLES in case the latter is supported.


Full diff: https://github.com/llvm/llvm-project/pull/133376.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64PfmCounters.td (+1-1)
  • (modified) llvm/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp (+1-1)
diff --git a/llvm/lib/Target/AArch64/AArch64PfmCounters.td b/llvm/lib/Target/AArch64/AArch64PfmCounters.td
index b1d1664e3f1b1..c7132b40ca2fe 100644
--- a/llvm/lib/Target/AArch64/AArch64PfmCounters.td
+++ b/llvm/lib/Target/AArch64/AArch64PfmCounters.td
@@ -10,7 +10,7 @@
 //
 //===----------------------------------------------------------------------===//
 
-def CpuCyclesPfmCounter : PfmCounter<"CPU_CYCLES">;
+def CpuCyclesPfmCounter : PfmCounter<"CYCLES">;
 
 def DefaultPfmCounters : ProcPfmCounters {
   let CycleCounter = CpuCyclesPfmCounter;
diff --git a/llvm/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp b/llvm/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp
index 71675d9f46739..ca5416eef39d5 100644
--- a/llvm/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp
+++ b/llvm/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp
@@ -65,7 +65,7 @@ TEST_F(AArch64TargetTest, SetRegToConstant) {
 }
 
 TEST_F(AArch64TargetTest, DefaultPfmCounters) {
-  const std::string Expected = "CPU_CYCLES";
+  const std::string Expected = "CYCLES";
   EXPECT_EQ(ExegesisTarget_->getPfmCounters("").CycleCounter, Expected);
   EXPECT_EQ(ExegesisTarget_->getPfmCounters("unknown_cpu").CycleCounter,
             Expected);

@boomanaiden154
Copy link
Contributor

@lakshayk-nv (since you aren't in the LLVM org and I can't request your review directly).

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a link to documentation mentioning the aliasing?

@r-belenov
Copy link
Contributor Author

r-belenov commented Mar 28, 2025

I'm not sure it is mentioned explicitly. libpfm4 uses PERF_COUNT_HW_CPU_CYCLES for "CYCLES" on all architectures (it's just the counter used for "perf stat -e cycles ..."). It is mapped to ARMV8_PMUV3_PERFCTR_CPU_CYCLES here - https://github.com/torvalds/linux/blob/acb4f33713b9f6cadb6143f211714c343465411c/drivers/perf/arm_pmuv3.c#L45 ; the definition of ARMV8_PMUV3_PERFCTR_CPU_CYCLES is here - https://github.com/torvalds/linux/blob/acb4f33713b9f6cadb6143f211714c343465411c/include/linux/perf/arm_pmuv3.h#L33

@r-belenov
Copy link
Contributor Author

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the additional info.

This seems reasonable enough to me, but please wait for approval from one of the other reviewers I've pinged as I've only ever worked on the X86 bits of llvm-exegesis.

Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've ran into this before but never bottomed out what the difference between the two was. I tested on a couple of machines and it did OK on both (one of which worked before, one of which didn't).

LGTM, thanks.

@boomanaiden154
Copy link
Contributor

(Merging since the author does not have commit access).

@boomanaiden154 boomanaiden154 merged commit c9d90f1 into llvm:main Mar 30, 2025
9 of 11 checks passed
Copy link

@r-belenov Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

@r-belenov r-belenov deleted the tmp branch March 31, 2025 05:48
SchrodingerZhu pushed a commit to SchrodingerZhu/llvm-project that referenced this pull request Mar 31, 2025
CPU_CYCLES counter does not work on some Aarch64 CPUs; CYCLES is more
generic and is equivalent to CPU_CYCLES in case the latter is supported.

Longer story - CPU_CYCLES work only on CPU models explicitly recognized
by libpfm4 ( via pfm_arm_detect_*() functions in
https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/lib/pfmlib_arm_armv8.c
) and its name is consistent with ARM documentation. However, the
counter is architectural and is supported on all ARMv8 CPUs; libpfm4
recognizes generic PMU on unknown ARMv8 CPUs, but does not provide
CPU_CYCLES event. Instead, CYCLES is provided (an alias to
PERF_COUNT_HW_CPU_CYCLES). Physically, it is the same event with code
0x11. On supported architectures CYCLES also work, so the change should
not introduce regression.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants