-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[GlobalMerge] Add MinSize feature to the GlobalMerge Pass. #93686
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We add a feature that prevents the GlobalMerge pass from considering data smaller than a minimum size in bytes for merging. The MinSize is set in 3 ways: 1. If global-merge-min-data-size is explicitly set, then it uses that value. 2. If SmallDataLimit is set and non-zero, then SmallDataLimit + 1 is used. 3. Otherwise, 0 is used, which means all sizes are considered for merging. This feature allowed us to enable the GlobalMerge pass on RISC-V in our downstream by default because it led to improvements on multiple benchmark suites without causing regressions to Geomeans. We found that this feature allowed us to see the benefit of the GlobalMerge pass while eliminating some merging that was not beneficial. I plan to post a separate patch to propose enabling this by default on RISC-V. But I do not want that discussion to be part of the discussion of adding this feature, so I am keeping the patches separate.
I don't remember off the top of my head what SmallDataLimit represents, and am not finding it documented in LangRef. Is this related to GP relative addressing? Or something else? Code wise, this looks pretty straight forward. I just want to make sure I understand the intent. |
It controls what variables end up in the .sdata and .sbss sections. Where s means small. Those sections are placed together and GP is placed somewhere in the middle so that GP relative addressing can apply to variables in those sections. The intent here is for global merge to not prevent small variables from being placed in sdata or sbss. |
An address must be loaded from a small section if its size is less than the SmallDataLimit. Data in this section could be addressed by using gp_rel. I have measured that basing the default GlobalMergeMinDataSize off of SmallDataLimit has beneficial effects compared to other default values. I measured a bunch of different defaults (0, 4, 5, 8, 9, 16, 17, ... 512, SmallDataLimit) and Small DataLimit did the trick. |
GlobalMerge already clusters globals into a couple of sets, would introduce a "small" vs "large" set solve the same problem? (From a theoretical perspective. Not asking for a change in implementation at this time.) I'm wondering about banning all small clustering as a bunch of the cases I see where GM would seem most useful are pairs of small globals. :) |
You'd need many sets. You don't want any "small" cluster to exceed to small data limit if you want it to be eligible for GP relaxation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - I think we can probably do better here for small data, but this seems like an entirely reasonable stepping stone.
We add a feature that prevents the GlobalMerge pass from considering data smaller than a minimum size in bytes for merging.
The MinSize is set in 3 ways:
We found that this feature allowed us to see the benefit of the GlobalMerge pass while eliminating some merging that was not beneficial. This feature allowed us to enable the GlobalMerge pass on RISC-V in our downstream by default because it led to improvements on multiple benchmark suites.
I plan to post a separate patch to propose enabling this by default on RISC-V. But I do not want that discussion to be part of the discussion of adding this feature, so I am keeping the patches separate.