Skip to content

Adjust GC threshold if num_roots higher than threshold after collection #13758

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 25, 2024

Conversation

arnaud-lb
Copy link
Member

@arnaud-lb arnaud-lb commented Mar 19, 2024

This fixes an edge case causing the GC to be triggered repeatedly.

Destructors might add potential garbage to the buffer, so it may happen that num_root it higher than threshold after collection, thus triggering a GC run almost immediately. This can happen by touching enough objects in a destructor, e.g. by iterating over an array. If this happens again in the new run, and the threshold is not updated, the GC may be triggered again.

The edge case requires specific conditions to be triggered and it must happen rarely in practice:

  • At least GC_THRESHOLD_TRIGGER (100) objects must be collected during each run for the threshold to not be updated
  • At least GC_G(gc_threshold) (initially 10k) objects must be touched (decref'ed to n>0) by any destructor during each run to fill the buffer

The fix is to increase the threshold if GC_G(num_roots) >= GC_G(gc_threshold) after GC. The threshold eventually reaches a point at which the second condition is not met anymore.

The included tests trigger more than 200 GC runs before the fix, and 2 after the fix (dtors always trigger a second run).

A related issue is that zend_gc_check_root_tmpvars() may add potential garbage before the threshold is adjusted, which may trigger GC and exhaust the stack. This is fixed by setting GC_G(active)=1 around zend_gc_check_root_tmpvars().

Fixes #13670

}
}

for ($i = 0; $i < 10001; $i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these magical numbers be defined using shared consts for all GC tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I've updated the test to use gc_status()['threshold'] to fetch the default threshold.

This fixes an edge causing the GC to be triggered repeatedly.

Destructors might add potential garbage to the buffer, so it may happen that
num_root it higher than threshold after collection, thus triggering a GC run
almost immediately. This can happen by touching enough objects in a destructor,
e.g. by iterating over an array. If the happens again in the new run, and the
threshold is not updated, the GC may be triggered repeatedly.

Also prevent GC runs during zend_gc_check_root_tmpvars(), as the threshold is
not adjusted yet a this point.
@arnaud-lb arnaud-lb changed the base branch from master to PHP-8.2 March 19, 2024 15:33
@arnaud-lb arnaud-lb marked this pull request as ready for review March 19, 2024 16:23
Comment on lines +1677 to 1681

/* Prevent GC from running during zend_gc_check_root_tmpvars, before
* gc_threshold is adjusted, as this may result in unbounded recursion */
GC_G(gc_active) = 1;
zend_gc_check_root_tmpvars();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this lead to store after gc_grow_root_buffer() instead of immediate collection?
I didn't understand, how this is related to the main part of the patch and why this might be a problem.
Please explain.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this lead to store after gc_grow_root_buffer() instead of immediate collection?

Yes exactly

I didn't understand, how this is related to the main part of the patch and why this might be a problem

The main part of the patch increases gc_threshold when num_roots is still >= gc_threshold after collection, to prevent a collection from being triggered immediately. However, this adjustment happens after zend_gc_check_root_tmpvars(), so this function may still trigger an immediate collection. Moreover this happens recursively and it leads to a stack overflow in one of the test cases.

By setting gc_active=1 before zend_gc_check_root_tmpvars(), we prevent collections temporarily.

An alternative would be to adjust the threshold before zend_gc_check_root_tmpvars(). This requires a few more changes because the threshold should not be adjusted when collection is triggered manually, for example.

@arnaud-lb arnaud-lb merged commit c13794c into php:PHP-8.2 Mar 25, 2024
arnaud-lb added a commit that referenced this pull request Mar 25, 2024
* PHP-8.2:
  [ci skip]
  Adjust GC threshold if num_roots is higher than gc_threshold after collection (#13758)
arnaud-lb added a commit that referenced this pull request Mar 25, 2024
* PHP-8.3:
  [ci skip]
  [ci skip]
  Adjust GC threshold if num_roots is higher than gc_threshold after collection (#13758)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GC does not scale well with a lot of objects created in destructor
3 participants