Optimize stripos #7852

iluuu1994 · 2021-12-29T12:05:37Z

Previously stripos would lowercase both the haystack and needle to reuse
strpos. The approach in this PR is similar to strpos. memchr is highly
optimized so we're using it to search for the first character of the
needle in the haystack. If we find it we compare the remaining
characters of the needle manually.

The new implementation seems to perform about half as well as strpos (as
two memchr calls are necessary).
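The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the names `sketch_lower`, `sketch_upper` and `sketch_stripos` are hypothetical, the case folding is ASCII-only (standing in for the `zend_tolower_ascii` / `zend_toupper_ascii` helpers), and returning the haystack for an empty needle is an assumption made for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ASCII-only folding helpers (hypothetical stand-ins, not the Zend ones). */
static unsigned char sketch_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + 32) : c;
}

static unsigned char sketch_upper(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? (unsigned char)(c - 32) : c;
}

/* Returns a pointer to the first case-insensitive match, or NULL. */
static const char *sketch_stripos(const char *haystack, size_t haystack_len,
                                  const char *needle, size_t needle_len)
{
    assert(haystack != NULL && needle != NULL);
    if (needle_len == 0) return haystack;          /* assumption: empty needle matches at 0 */
    if (needle_len > haystack_len) return NULL;

    const unsigned char lower = sketch_lower((unsigned char)needle[0]);
    const unsigned char upper = sketch_upper((unsigned char)needle[0]);
    /* Last position at which the whole needle still fits. */
    const char *last = haystack + (haystack_len - needle_len);
    const char *p = haystack;

    while (p <= last) {
        size_t span = (size_t)(last - p) + 1;
        /* Two memchr calls: one per case of the first needle character. */
        const char *lo = memchr(p, lower, span);
        const char *up = memchr(p, upper, span);
        const char *cand;
        if (lo == NULL) cand = up;
        else if (up == NULL) cand = lo;
        else cand = (lo < up) ? lo : up;
        if (cand == NULL) return NULL;

        /* Compare the remaining needle characters manually. */
        size_t i = 1;
        while (i < needle_len &&
               sketch_lower((unsigned char)cand[i]) == sketch_lower((unsigned char)needle[i])) {
            i++;
        }
        if (i == needle_len) return cand;
        p = cand + 1;
    }
    return NULL;
}
```

Note that this naive form restarts both memchr scans after every failed candidate, so a scan that has to run far ahead can be repeated many times.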

@cmb69 Thoughts?

cmb69

Thanks for the PR!

nikic

While this improves some cases, it will regress others. For large strings this makes the search quadratic, while the previous implementation is linear (if you will allow me the imprecision).

iluuu1994 · 2021-12-29T18:26:23Z

@nikic Yeah, right now it would be quadratic in the worst-case scenario (something like stripos('aaaaaaaaaaaaaaaaaAb', 'ab')). I think we can make it linear by caching the result of each memchr for the next iteration if we haven't passed it yet.
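A rough sketch of the caching idea, under the same caveats as any illustration here: `cached_lower`, `cached_upper` and `sketch_stripos_cached` are hypothetical names, the folding is ASCII-only, and this is not the code that ended up in the PR. Each memchr result is kept until the search has consumed it, so only the pointer that was just used gets re-scanned and every haystack byte is visited by at most one scan per case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static unsigned char cached_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + 32) : c;
}

static unsigned char cached_upper(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? (unsigned char)(c - 32) : c;
}

static const char *sketch_stripos_cached(const char *haystack, size_t haystack_len,
                                         const char *needle, size_t needle_len)
{
    assert(haystack != NULL && needle != NULL);
    if (needle_len == 0) return haystack;          /* assumption: empty needle matches at 0 */
    if (needle_len > haystack_len) return NULL;

    const unsigned char lower = cached_lower((unsigned char)needle[0]);
    const unsigned char upper = cached_upper((unsigned char)needle[0]);
    /* One past the last position where the needle still fits. */
    const char *end = haystack + (haystack_len - needle_len) + 1;

    /* Cached candidates; the upper scan is skipped for non-letters. */
    const char *p_lower = memchr(haystack, lower, (size_t)(end - haystack));
    const char *p_upper = (lower != upper)
        ? memchr(haystack, upper, (size_t)(end - haystack))
        : NULL;

    while (p_lower != NULL || p_upper != NULL) {
        /* Take whichever cached candidate comes first. */
        int use_lower = (p_upper == NULL) || (p_lower != NULL && p_lower < p_upper);
        const char *cand = use_lower ? p_lower : p_upper;

        size_t i = 1;
        while (i < needle_len &&
               cached_lower((unsigned char)cand[i]) == cached_lower((unsigned char)needle[i])) {
            i++;
        }
        if (i == needle_len) return cand;

        /* Re-scan only for the case that was just consumed. */
        const char *next = cand + 1;
        const char *found = (next < end)
            ? memchr(next, use_lower ? lower : upper, (size_t)(end - next))
            : NULL;
        if (use_lower) p_lower = found;
        else p_upper = found;
    }
    return NULL;
}
```

With the haystack from the example above, the scan for 'A' runs once and its result is reused while the lowercase candidates are tried one by one. The manual comparison of the remaining needle characters can still cost up to the needle length per candidate.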

iluuu1994 · 2021-12-29T20:08:48Z

It should be linear now. Performance is still approximately the same for me. Let me know if you think the added complexity is worth it.

KapitanOczywisty · 2021-12-29T20:56:55Z

I might be late to the party, but I've tried to improve this with two concurrent searches, somewhat close to what you did in the last commit. It might have slightly better performance overall, and it doesn't dip with the modified test:

$haystack = str_repeat('A', 1e+6) . "BCB";
$haystack = str_repeat($haystack, 10) . 'BBB';
//...
stripos($haystack, 'bbb');

Take a look if you will: https://gist.github.com/KapitanOczywisty/5c8c053ceb3ba687cbab41266120dac6

Also, I think that a more complex algorithm should be used with haystacks longer than 10KB or so.

cmb69 · 2021-12-29T22:23:43Z

This is certainly an interesting approach (thanks, @iluuu1994!), but I still have doubts that the performance gain outweighs the added complexity. Is stripos() used often with long haystacks? And even if so, that code could still use PCRE instead (i.e. document that in the manual). I also wonder about the performance of something like stripos(str_repeat("abc", 10000) . "abd", "abd"), which appears to ruin the potential optimizations of memchr() (such cases might be rare, though).

Anyhow, if we do this optimization, we also should consider doing it for stristr().

iluuu1994 · 2021-12-30T17:17:58Z

@cmb69 @KapitanOczywisty's version doesn't look too complicated. I'll check if the same can be applied to stristr. I'll also do some benchmarks for various cases (best-, average- and worst-case, real-world cases) to see how it compares. If the difference for real-world cases is negligible I'm happy to drop this PR.

> I also wonder about the performance of something like stripos(str_repeat("abc", 10000) . "abd", "abd"), which appears to ruin the potential optimizations of memchr() (such cases might be rare, though).

Yes, although the same applies to strpos too. The worst case for the current algorithm would be something like stripos(str_repeat("aA", 10000) . "ab", "ab"), as memchr then stops at almost every byte, but as you mentioned that's probably pretty rare.

iluuu1994 · 2021-12-30T19:50:04Z

@KapitanOczywisty Thanks for your implementation. There was a small bug here (only p should be checked because not both upper and lower case need to be present for a match). Other than that your implementation was simpler so I mostly copied it 1:1. I just removed the macro to avoid finding the right place to add it / fleshing it out / naming it properly.

@cmb69 It works fine for stristr too, except that php_stristr lowercases the haystack and needle passed to it as a side effect, so I created a new function that does not do that. We could still modify the old function and document the change, as this is in master.

I'll create the benchmarks next.

KapitanOczywisty · 2021-12-30T20:30:31Z

@iluuu1994 This was a fun challenge, since I don't usually work with C, and I'm glad I could help.

> There was a small bug here (only p should be checked because not both upper and lower case need to be present for a match).

Yeah, I wasn't sure about that part so I left it in the code for people smarter than me to clean up :)

iluuu1994 · 2021-12-31T16:56:50Z

Posted the results in the wrong issue. #7847 (comment)

Anyway, I think this is worth merging because all cases seem to be faster, and the implementation doesn't seem terribly complex.

Closes phpGH-7847 Closes phpGH-7852 Previously stripos/stristr would lowercase both the haystack and the needle to reuse strpos. The approach in this PR is similar to strpos. memchr is highly optimized so we're using it to search for the first character of the needle in the haystack. If we find it we compare the remaining characters of the needle manually. The new implementation seems to perform about half as well as strpos (as two memchr calls are necessary to find the next candidate).

iluuu1994 · 2022-01-03T22:46:39Z

Any objections to merging this?

TysonAndre · 2022-01-07T15:29:30Z

Zend/zend_operators.h

+
+	const char first_lower = zend_tolower_ascii(*needle);
+	const char first_upper = zend_toupper_ascii(*needle);
+	const char *p_lower = (const char *)memchr(haystack, first_lower, end - haystack);


For future scope: I think there may be a way to optimize this with SSE for really long strings, but it may not be worth it, especially with stripos not being used that often.

(both checking if the first character exists case-insensitively, and checking if long needles match)

Definitely not something to add to this PR.

A haystack byte matches a needle byte when ((*needle_byte ^ *haystack_byte) & *needle_mask) == 0, where needle_mask is 0xff or 0xdf (~0x20), since 'a' ^ 'A' is 0x20.

For a block of 8/16/32 bytes, that could be checked by:

Computing the needle and mask and extending it to 8 bytes

Bitwise xoring the needle block with the haystack block

Masking the xored block (0xff or 0xdf, depending on whether this was an alphabet character: 0xff ^ first_lower ^ first_upper)

Checking for 0 bytes in the masked block

Combining to a word

Checking if the combination was non-zero

Searching in that block normally for the first character

This would make the worst case of case-insensitive search faster (stripos(str_repeat('A', 1000000), 'a')) by stopping at the first lowercase or uppercase byte. The memchr could also start at whatever that result was if needle_len > 1.
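As a rough illustration of the block check outlined above, here is a portable sketch that uses a 64-bit word instead of SSE registers. The function name `block8_has_byte_ci` is hypothetical, the folding is ASCII-only, and the zero-byte test is the well-known "has zero byte" bit trick rather than anything taken from the PR. It only answers whether an 8-byte block contains the character in either case; finding the exact position would still need a normal search within that block.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if any of the 8 bytes at `block` equals `c` ignoring ASCII case. */
static int block8_has_byte_ci(const unsigned char *block, unsigned char c)
{
    assert(block != NULL);
    const unsigned char lower = (c >= 'A' && c <= 'Z') ? (unsigned char)(c | 0x20) : c;
    const unsigned char upper = (c >= 'a' && c <= 'z') ? (unsigned char)(c & 0xdf) : c;
    /* 0xdf for letters (ignore the case bit), 0xff for everything else. */
    const unsigned char mask = (unsigned char)(0xff ^ lower ^ upper);

    const uint64_t ones  = UINT64_C(0x0101010101010101);
    const uint64_t highs = UINT64_C(0x8080808080808080);

    uint64_t word;
    memcpy(&word, block, sizeof word);   /* unaligned-safe load of 8 bytes */

    /* Broadcast needle byte and mask, xor with the haystack block, then mask:
     * a byte of x is zero exactly where the haystack byte matches. */
    const uint64_t x = (word ^ (ones * lower)) & (ones * mask);

    /* Non-zero if and only if some byte of x is zero. */
    return ((x - ones) & ~x & highs) != 0;
}
```

A caller could skip whole blocks for which this returns 0 and fall back to a byte-wise search only in blocks where it returns 1.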

I know nothing about SSE, but as you mentioned, the new implementation should already be much better for large strings; we can always improve it at a later point.

iluuu1994 · 2022-01-27T21:55:36Z

I decided to actually alter php_stristr instead of creating a new function to make sure the faster implementation is actually used. It's unlikely that this changes behavior of existing code as normally haystack and needle were copied exactly to avoid the tolower side effect. I documented the change in UPGRADING.INTERNALS. If there's no more feedback I'll merge this in a day or two.

cmb69 reviewed Dec 29, 2021

iluuu1994 force-pushed the stripos-optimization branch from 75f3258 to 23c4443 Compare December 29, 2021 13:12

cmb69 added the Waiting on Review label Dec 29, 2021

nikic reviewed Dec 29, 2021

KapitanOczywisty reviewed Dec 29, 2021

iluuu1994 force-pushed the stripos-optimization branch from 2ab84b5 to acad59f Compare December 30, 2021 19:45

KapitanOczywisty reviewed Dec 31, 2021

iluuu1994 force-pushed the stripos-optimization branch from acad59f to f311aff Compare January 3, 2022 22:45

TysonAndre reviewed Jan 7, 2022

iluuu1994 force-pushed the stripos-optimization branch from f311aff to 70bd296 Compare January 26, 2022 13:44

Alter php_stristr() instead of creating new function

1b800b9

iluuu1994 force-pushed the stripos-optimization branch from 348297d to 1b800b9 Compare January 27, 2022 22:10

TysonAndre reviewed Jan 28, 2022

iluuu1994 closed this in 2f52956 Jan 31, 2022
