Skip to content

Use __attribute__((assume())) in ZEND_ASSUME when available #13171

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

iluuu1994
Copy link
Member

@iluuu1994 iluuu1994 commented Jan 16, 2024

On GCC, with this patch, I can see a reduction of assembly lines from 1134787 to 1132912 (-0.17%), and reduction of the sapi/cli/php binary size from 47974048 to 47897952 (-0.16%).

Prompted by #13159.

/cc @jorgsowa @dstogov

@iluuu1994
Copy link
Member Author

Hmm, it seems this only starts behaving correctly from GCC 13+. We'll have to pin it for now.

@iluuu1994 iluuu1994 force-pushed the use-assume-statement-attribute branch 2 times, most recently from 8b48851 to 36ea119 Compare January 16, 2024 23:27
@dstogov
Copy link
Member

dstogov commented Jan 17, 2024

Hmm, it seems this only starts behaving correctly from GCC 13+. We'll have to pin it for now.

What was wrong with the old versions?

@iluuu1994
Copy link
Member Author

I previously used autoconf to recognize whether the attribute was available. However, even though it's there in GCC 11/12 it seems not to work in various contexts, i.e. it failed to compile. It only works properly in GCC 13 for arbitrary expressions, as it's part of the C++23 specification (but works in C too).

Copy link
Member

@dstogov dstogov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I don't object.
Do you see any performance difference? (visible or callgrind)

@iluuu1994
Copy link
Member Author

iluuu1994 commented Jan 17, 2024

@dstogov I only tried Valgrind, but did not come to any definitive conclusions. Some benchmarks were faster and some were marginally slower, but unfortunately the Symfony Demo is still flaky, so performance differences are easier to see over a longer period of time with many runs.

@iluuu1994 iluuu1994 force-pushed the use-assume-statement-attribute branch from 36ea119 to adee760 Compare January 17, 2024 11:59
@iluuu1994
Copy link
Member Author

iluuu1994 commented Jan 17, 2024

There is one more issue. GCC emits a warning when the attribute appears directly after a label. Prefixing it with ; solves the issue. https://godbolt.org/z/q61W1q43o This should be ok, since ZEND_ASSERT/ZEND_ASSUME may only appear in statement context.

Edit: Actually we need a do-while loop for single-statement contexts. Macros are hard.

Here are the Valgrind results:

  • Zend/bench.php: 100 / 6989840533 * 6543980511 = 93.62%
  • Zend/bench.php JIT: 100 / 759865902 * 751348602 = 98.88%
  • Symfony Demo 2.2.3: 100 / 75355117 * 71435758 = 94.80%
  • Symfony Demo 2.2.3 JIT: 100 / 53077944 * 51314115 = 96.68%
  • Wordpress 6.2: 100 / 327560944 * 311141827 = 94.99%
  • Wordpress 6.2 JIT: 100 / 198305017 * 189822641 = 95.72%

These seem way too good. I did run them twice though...

As expected, I messed up. My configure script passed unrelated CFLAGS which implicitly removes -O2, even with --disable-debug. This is the second time this happened to me. -.- The new results are much less impressive.

  • Zend/bench.php: 100 / 2058388762 * 2057999638 = 99.981095699
  • Zend/bench.php JIT: 100 / 597585483 * 597780509 = 100.032635666
  • Symfony Demo 2.2.3: 100 / 28969593 * 28947299 = 99.923043448
  • Symfony Demo 2.2.3 JIT: 100 / 24593819 * 24577418 = 99.933312512
  • Wordpress 6.2: 100 / 115001172 * 115048478 = 100.041135233
  • Wordpress 6.2 JIT: 100 / 89080904 * 88962676 = 99.867280197

Hopefully the improvement Bob found is still legit, as Valgrind isn't always a good representation.

@iluuu1994 iluuu1994 force-pushed the use-assume-statement-attribute branch from adee760 to 952fc4a Compare January 17, 2024 12:05
@bwoebi
Copy link
Member

bwoebi commented Jan 18, 2024

I can confirm a 2.5% speedup (wall time) on Zend/bench.php (no jit) with this PR on gcc-13. Which is sort of astounding :-D

@iluuu1994
Copy link
Member Author

I will investigate the generated assembly, but I'm guessing that for if (foo()) __builtin_unreachable(); foo() won't get eliminated if the compiler cannot conclude that it doesn't have any side-effects, which would happen whenever the function comes from a different compilation unit (and LTOs are disabled).

@bwoebi
Copy link
Member

bwoebi commented Jan 19, 2024

I must have done something wrong, I could not believe that either. So recompiled it and got something between 0 and 0.5% difference now.

The generated assembly is a bit better though, like gcc allocating less stack space for a lot of frames now.

@iluuu1994
Copy link
Member Author

Heh, you and me both 😅 Anyway, at least we can see a small improvement. Thanks for checking!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants