Micro-optimize double comparison #11061

nielsdos · 2023-04-11T20:47:58Z

When using ZEND_NORMALIZE_BOOL(a - b) where a and b are doubles, this generates the following instruction sequence on x64:
subsd xmm0, xmm1
pxor xmm1, xmm1
comisd xmm0, xmm1
...

whereas if we use ZEND_THREEWAY_COMPARE we get two fewer instructions:
ucomisd xmm0, xmm1

The only difference in the comparison itself is that the three-way compare uses ucomisd instead of comisd. ucomisd raises a floating-point invalid-operation exception only when an operand is a signaling NaN, whereas comisd also raises it for quiet NaNs. As far as I'm aware this doesn't matter for our use case.
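As a rough illustration of the two comparison shapes discussed above, here is a minimal C sketch. The function names are hypothetical stand-ins, not the engine's macros, and the bodies are only an assumption about the general pattern (subtract then take the sign, versus compare the operands directly).

```c
/* Hypothetical stand-in for the old pattern: subtract first, then
 * normalize the sign of the difference to -1, 0 or 1. The subtraction
 * and the compare against zero are what cost the extra instructions. */
int compare_via_subtraction(double a, double b)
{
    double n = a - b;
    return n > 0 ? 1 : (n < 0 ? -1 : 0);
}

/* Hypothetical stand-in for the three-way pattern: compare the two
 * operands directly, with no intermediate difference. */
int compare_threeway(double a, double b)
{
    return a == b ? 0 : (a < b ? -1 : 1);
}
```

For ordinary finite inputs both functions return the same result. In this sketch they disagree when an operand is NaN (the subtraction form yields 0, the direct form yields 1), so that case is worth checking against the real macros.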

Similarly, the number of instructions on AArch64 is also quite a bit lower with this change than with the old code.

Results

Using the benchmark https://gist.github.com/nielsdos/b36517d81a1af74d96baa3576c2b70df
I used hyperfine: hyperfine --runs 25 --warmup 3 './sapi/cli/php sort_double.php'
No extensions such as opcache were used during benchmarking.

Before this patch

Time (mean ± σ): 255.5 ms ± 2.2 ms [User: 251.0 ms, System: 2.5 ms]
Range (min … max): 251.5 ms … 260.7 ms 25 runs

After this patch

Time (mean ± σ): 236.2 ms ± 2.8 ms [User: 228.9 ms, System: 5.0 ms]
Range (min … max): 231.5 ms … 242.7 ms 25 runs
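To put the two means side by side, here is a small sketch of the arithmetic. The helper name is made up for illustration. With the figures above, (255.5 - 236.2) / 255.5 comes to about 7.6%, and the two measured ranges do not overlap (251.5 ms minimum before versus 242.7 ms maximum after).

```c
/* Hypothetical helper: relative reduction in mean runtime, expressed
 * as a percentage of the "before" time. */
double relative_improvement_percent(double before_ms, double after_ms)
{
    return (before_ms - after_ms) / before_ms * 100.0;
}
```

Calling relative_improvement_percent(255.5, 236.2) reproduces the figure quoted above.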

Girgias

LGTM. The three-way compare also seems to make way more sense than the normalize-bool macro, even without the micro-optimization.

nielsdos requested review from iluuu1994 and Girgias as code owners April 11, 2023 20:47

github-actions bot added Category: Engine Extension: spl Extension: standard labels Apr 11, 2023

Girgias approved these changes Apr 12, 2023

nielsdos merged commit a0476fd into php:master Apr 14, 2023
