[GlobalMerge] Add MinSize feature to the GlobalMerge Pass. #93686

michaelmaitland · 2024-05-29T13:50:59Z

We add a feature that prevents the GlobalMerge pass from considering data smaller than a minimum size in bytes for merging.

The MinSize is set in one of three ways, in order of precedence:

1. If global-merge-min-data-size is explicitly set, then that value is used.
2. If SmallDataLimit is set and non-zero, then SmallDataLimit + 1 is used.
3. Otherwise, 0 is used, which means all sizes are considered for merging.
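The selection order above can be sketched as a small standalone function. This is only an illustration of the three rules, not the code from the patch: `resolveMinSize` and its parameter names are hypothetical, and the two inputs stand in for the command-line option and the SmallDataLimit value.

```cpp
#include <optional>

// Hypothetical helper (not the LLVM implementation): returns the minimum
// size in bytes a global must have to be considered for merging.
constexpr unsigned resolveMinSize(std::optional<unsigned> explicitMinSize,
                                  std::optional<unsigned> smallDataLimit) {
  // 1. An explicitly set global-merge-min-data-size always wins,
  //    even when it is 0.
  if (explicitMinSize)
    return *explicitMinSize;
  // 2. A non-zero SmallDataLimit yields SmallDataLimit + 1.
  if (smallDataLimit && *smallDataLimit != 0)
    return *smallDataLimit + 1;
  // 3. Otherwise every size is considered for merging.
  return 0;
}

// Compile-time checks of the three rules (C++17).
static_assert(resolveMinSize(16, 8) == 16);
static_assert(resolveMinSize(std::nullopt, 8) == 9);
static_assert(resolveMinSize(std::nullopt, 0) == 0);
static_assert(resolveMinSize(std::nullopt, std::nullopt) == 0);
```

Note that under rule 1 an explicit value of 0 disables the size filter even when SmallDataLimit is set.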

We found that this feature let us keep the benefit of the GlobalMerge pass while eliminating some merging that was not beneficial. It allowed us to enable the GlobalMerge pass on RISC-V by default in our downstream, because it led to improvements on multiple benchmark suites without causing regressions to geomeans.

I plan to post a separate patch proposing to enable this by default on RISC-V. I do not want that discussion to be mixed into the discussion of adding this feature, so I am keeping the patches separate.

preames · 2024-05-29T14:37:22Z

I don't remember off the top of my head what SmallDataLimit represents, and am not finding it documented in LangRef. Is this related to GP relative addressing? Or something else?

Code-wise, this looks pretty straightforward. I just want to make sure I understand the intent.

topperc · 2024-05-29T14:42:08Z

> I don't remember off the top of my head what SmallDataLimit represents, and am not finding it documented in LangRef. Is this related to GP relative addressing? Or something else?
>
> Code-wise, this looks pretty straightforward. I just want to make sure I understand the intent.

It controls which variables end up in the .sdata and .sbss sections, where the s means small. Those sections are placed together and GP is placed somewhere in the middle so that GP relative addressing can apply to variables in those sections.

The intent here is for global merge to not prevent small variables from being placed in sdata or sbss.
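A minimal sketch of that intent, assuming the pass simply skips any global whose size is below MinSize. The `Global` struct and `mergeCandidates` function are hypothetical names for illustration; the real pass applies further eligibility checks that are omitted here.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a global variable: a name and a size in bytes.
struct Global {
  std::string name;
  std::uint64_t size;
};

// Returns the names of the globals that remain merge candidates.
// Globals smaller than minSize are left alone, so they can still be
// placed in .sdata/.sbss on their own.
std::vector<std::string> mergeCandidates(const std::vector<Global>& globals,
                                         std::uint64_t minSize) {
  std::vector<std::string> candidates;
  for (const Global& g : globals)
    if (g.size >= minSize)
      candidates.push_back(g.name);
  return candidates;
}
```

With a SmallDataLimit of 8, MinSize defaults to 9, so 4-byte and 8-byte globals are skipped while 9-byte and larger globals can still be merged.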

michaelmaitland · 2024-05-29T14:45:26Z

> I don't remember off the top of my head what SmallDataLimit represents, and am not finding it documented in LangRef. Is this related to GP relative addressing? Or something else?
>
> Code-wise, this looks pretty straightforward. I just want to make sure I understand the intent.

A global is placed in a small data section if its size is within the SmallDataLimit. Data in such a section can be addressed GP-relative (gp_rel).

I have measured that basing the default GlobalMergeMinDataSize on SmallDataLimit has beneficial effects compared to other default values. I measured a bunch of different defaults (0, 4, 5, 8, 9, 16, 17, ... 512, SmallDataLimit) and SmallDataLimit did the trick.

preames · 2024-05-29T14:50:15Z

> The intent here is for global merge to not prevent small variables from being placed in sdata or sbss.

GlobalMerge already clusters globals into a couple of sets. Would introducing a "small" vs "large" set solve the same problem? (From a theoretical perspective. Not asking for a change in implementation at this time.)

I'm wondering about banning all small clustering as a bunch of the cases I see where GM would seem most useful are pairs of small globals. :)

topperc · 2024-05-29T15:11:29Z

> The intent here is for global merge to not prevent small variables from being placed in sdata or sbss.
>
> GlobalMerge already clusters globals into a couple of sets. Would introducing a "small" vs "large" set solve the same problem? (From a theoretical perspective. Not asking for a change in implementation at this time.)

You'd need many sets. You don't want any "small" cluster to exceed the small data limit if you want it to be eligible for GP relaxation.
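To illustrate why many sets would be needed, here is a rough sketch of a greedy grouping in which no cluster's total size exceeds the small data limit. This is purely hypothetical: GlobalMerge does not do this, `packSmallClusters` is an invented name, and alignment padding is ignored.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical greedy packing: groups small globals (given by size in
// bytes) into consecutive clusters whose total size never exceeds limit.
// Globals that are empty or larger than limit are not "small" and are
// skipped.
std::vector<std::vector<std::uint64_t>>
packSmallClusters(const std::vector<std::uint64_t>& sizes,
                  std::uint64_t limit) {
  std::vector<std::vector<std::uint64_t>> clusters;
  std::uint64_t used = 0; // bytes already placed in the current cluster
  for (std::uint64_t size : sizes) {
    if (size == 0 || size > limit)
      continue;
    // Open a new cluster when the current one cannot hold this global.
    if (clusters.empty() || used + size > limit) {
      clusters.emplace_back();
      used = 0;
    }
    clusters.back().push_back(size);
    used += size;
  }
  return clusters;
}
```

Even a handful of small globals quickly splits into several clusters under a limit of 8 bytes, which is the cost the comment above points at.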

preames

LGTM - I think we can probably do better here for small data, but this seems like an entirely reasonable stepping stone.

michaelmaitland added the llvm:codegen label May 29, 2024

michaelmaitland requested review from asb, MaskRay, preames, topperc and kazutakahirata May 29, 2024 13:50

preames approved these changes May 29, 2024

michaelmaitland merged commit 0f66915 into llvm:main Jun 3, 2024
9 checks passed
