Skip to content

[GlobalMerge] Add MinSize feature to the GlobalMerge Pass. #93686

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 3, 2024

Conversation

michaelmaitland
Copy link
Contributor

We add a feature that prevents the GlobalMerge pass from considering data smaller than a minimum size in bytes for merging.

The MinSize is set in 3 ways:

  1. If global-merge-min-data-size is explicitly set, then it uses that value.
  2. If SmallDataLimit is set and non-zero, then SmallDataLimit + 1 is used.
  3. Otherwise, 0 is used, which means all sizes are considered for merging.

We found that this feature allowed us to see the benefit of the GlobalMerge pass while eliminating some merging that was not beneficial. This feature allowed us to enable the GlobalMerge pass on RISC-V in our downstream by default because it led to improvements on multiple benchmark suites.

I plan to post a separate patch to propose enabling this by default on RISC-V. But I do not want that discussion to be part of the discussion of adding this feature, so I am keeping the patches separate.

We add a feature that prevents the GlobalMerge pass from considering
data smaller than a minimum size in bytes for merging.

The MinSize is set in 3 ways:
1. If global-merge-min-data-size is explicitly set, then it uses that value.
2. If SmallDataLimit is set and non-zero, then SmallDataLimit + 1 is used.
3. Otherwise, 0 is used, which means all sizes are considered for merging.

This feature allowed us to enable the GlobalMerge pass on RISC-V in our
downstream by default because it led to improvements on multiple
benchmark suites without causing regressions to Geomeans.

We found that this feature allowed us to see the benefit of the
GlobalMerge pass while eliminating some merging that was not beneficial.

I plan to post a separate patch to propose enabling this by default on
RISC-V. But I do not want that discussion to be part of the discussion
of adding this feature, so I am keeping the patches separate.
@preames
Copy link
Collaborator

preames commented May 29, 2024

I don't remember off the top of my head what SmallDataLimit represents, and am not finding it documented in LangRef. Is this related to GP relative addressing? Or something else?

Code wise, this looks pretty straight forward. I just want to make sure I understand the intent.

@topperc
Copy link
Collaborator

topperc commented May 29, 2024

I don't remember off the top of my head what SmallDataLimit represents, and am not finding it documented in LangRef. Is this related to GP relative addressing? Or something else?

Code wise, this looks pretty straight forward. I just want to make sure I understand the intent.

It controls what variables end up in the .sdata and .sbss sections. Where s means small. Those sections are placed together and GP is placed somewhere in the middle so that GP relative addressing can apply to variables in those sections.

The intent here is for global merge to not prevent small variables from being placed in sdata or sbss.

@michaelmaitland
Copy link
Contributor Author

I don't remember off the top of my head what SmallDataLimit represents, and am not finding it documented in LangRef. Is this related to GP relative addressing? Or something else?

Code wise, this looks pretty straight forward. I just want to make sure I understand the intent.

An address must be loaded from a small section if its size is less than the SmallDataLimit. Data in this section could be addressed by using gp_rel.

I have measured that basing the default GlobalMergeMinDataSize off of SmallDataLimit has beneficial effects compared to other default values. I measured a bunch of different defaults (0, 4, 5, 8, 9, 16, 17, ... 512, SmallDataLimit) and Small DataLimit did the trick.

@preames
Copy link
Collaborator

preames commented May 29, 2024

The intent here is for global merge to not prevent small variables from being placed in sdata or sbss.

GlobalMerge already clusters globals into a couple of sets, would introduce a "small" vs "large" set solve the same problem? (From a theoretical perspective. Not asking for a change in implementation at this time.)

I'm wondering about banning all small clustering as a bunch of the cases I see where GM would seem most useful are pairs of small globals. :)

@topperc
Copy link
Collaborator

topperc commented May 29, 2024

The intent here is for global merge to not prevent small variables from being placed in sdata or sbss.

GlobalMerge already clusters globals into a couple of sets, would introduce a "small" vs "large" set solve the same problem? (From a theoretical perspective. Not asking for a change in implementation at this time.)

You'd need many sets. You don't want any "small" cluster to exceed to small data limit if you want it to be eligible for GP relaxation.

Copy link
Collaborator

@preames preames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - I think we can probably do better here for small data, but this seems like an entirely reasonable stepping stone.

@michaelmaitland michaelmaitland merged commit 0f66915 into llvm:main Jun 3, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants