-
Notifications
You must be signed in to change notification settings - Fork 7.9k
ext/bcmath: Use SIMD for trailing zero counts during conversion #14166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ext/bcmath: Use SIMD for trailing zero counts during conversion #14166
Conversation
e55e0e2
to
fc7f7cb
Compare
ext/bcmath/libbcmath/src/str2num.c
Outdated
if (EXPECTED(mask != 0xffff)) { | ||
/* Move the pointer back and check each character in loop. */ | ||
str += sizeof(__m128i); | ||
break; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can also use code like the following, but a while loop has always been faster. This may be because the number of calculations increases by one.
return str + sizeof(__m128i) - __builtin_clz(~mask);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels a bit weird to optimize for something that (hopefully) shouldn't happen a lot. I see a slight performance decrease for benchmark 2, a small increase in bench 1 and a huge increase in bench 3. Do we think trailing zeros is common?
Note though that I am completely fine with removing the ineffective code and using UNEXPECTED.
ext/bcmath/libbcmath/src/str2num.c
Outdated
{ | ||
/* Check in bulk */ | ||
#ifdef __SSE2__ | ||
const __m128i c_zero_repeat = _mm_set1_epi8((signed char) '0'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Casting this to signed char shouldn't be necessary.
ext/bcmath/libbcmath/src/str2num.c
Outdated
@@ -76,6 +76,35 @@ static const char *bc_count_digits(const char *str, const char *end) | |||
return str; | |||
} | |||
|
|||
static inline const char *bc_skip_zero_reverse(const char *str, const char *end) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The argument names are swapped, which makes it very confusing.
Does this patch mean that Benchmark 2 is a bit slower in your environment? Could it be that the speedup in my measurements with this patch is due to the use of UNEXPECTED and the removal of unnecessary code, and that SIMD has a negative effect on patch 2? (edit) Or maybe the order of the measurements in the description is confusing? The first is before applying the patch, the second is the final state, and the rest are commit units, so you should compare the first and second. |
Trailing zeros are probably quite common given the use cases for BCMath, but 16 decimal digits is probably quite rare. I've opened a PR on this as it improved performance in all cases in my environment, but if not, I wouldn't be picky about using SIMD here. |
These are the results I'm getting:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with accepting this, the degradation for 2.php isn't severe and there are improvements for the other cases.
I'm fine with the argument names on second thought, but please remove the redundant case upon merging. Thanks.
Thx, as #14180, I will prepare a more stable benchmark environment and try measuring again. |
@nielsdos master:
this branch:
|
I removed the unnecessary cast and changed the variable name slightly. If the variable names are okay, merge this. |
03bc6bb
to
323e144
Compare
Scanner is written with double n |
Thanks, I didn't notice at all |
benchmark: #14132
before
Final state (after removing unnecessary code)
after SIMD
after UNEXPECTED
FYI: without SIMD