Skip to content

Improve performance of WeakReference/WeakMap. Avoid hash collisions on pointers. #7690

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 28, 2021

Conversation

TysonAndre
Copy link
Contributor

@TysonAndre TysonAndre commented Nov 25, 2021

Shift pointers by ZEND_MM_ALIGNMENT_LOG2
to avoid the noticeable performance degradation caused by hash table collisions.
in EG(weakrefs) and WeakMap entries

On 64-bit platforms, pointers are usually aligned to at least 8 bytes,
so only one in 8 hash buckets were actually getting used.
(With the metadata needed to track allocations,
alignment might be at least 16 bytes in practice)

Before this change with taskset 1 sapi/cli/php -d memory_limit=2G -d zend_extension=$PWD/modules/opcache.so -d opcache.enable_cli=1 -d opcache.enable=1 bench.php in a non-debug build

Benchmarking note: SplObjectStorage is slow because there's no native write handler (SplObjectStorage implementation was removed because it needed to call
offsetSet when subclasses overrid it and there were historically bugs), where it's the other way around for WeakMap which has a native write handler

(I forgot about that)

    /**
     * @param object $object
     * @tentative-return-type
     * @implementation-alias SplObjectStorage::attach
     * @no-verify Cannot specify arg type because ArrayAccess does not
     */
    public function offsetSet(mixed $object, mixed $info = null): void {}
# Before optimization
           bench_weak_map:  149048272 bytes create=0.356473 total=0.536920
 bench_spl_object_storage:  122724856 bytes create=0.336846 total=0.424876

           bench_weak_map:   98724768 bytes create=0.258280 total=0.437024
 bench_spl_object_storage:  122724856 bytes create=0.254590 total=0.345208

           bench_weak_map:   98724768 bytes create=0.258501 total=0.437509
 bench_spl_object_storage:  122724856 bytes create=0.223680 total=0.317106

           bench_weak_map:   98724768 bytes create=0.258847 total=0.437875
 bench_spl_object_storage:  122724856 bytes create=0.235694 total=0.322943

           bench_weak_map:   98724768 bytes create=0.260643 total=0.440493
 bench_spl_object_storage:  122724856 bytes create=0.244720 total=0.336108

           bench_weak_map:   98724768 bytes create=0.255080 total=0.434092
 bench_spl_object_storage:  122724856 bytes create=0.198827 total=0.290071

           bench_weak_map:   98724768 bytes create=0.255159 total=0.434814
 bench_spl_object_storage:  122724856 bytes create=0.200624 total=0.293820

After this change in php 8.2, performance is comparable to SplObjectStorage(first run grows EG(weakrefs) and allocates memory for the first time and is slower:

Note that EG(weakrefs) doesn't shrink capacity, so the first entry's memory is a more accurate representation of weakref memory usage for a long-running program.

# After optimization (v1: Only avoid hash collisions for EG(weakrefs))
           bench_weak_map:  149048272 bytes create=0.265922 total=0.374167
 bench_spl_object_storage:  122724856 bytes create=0.330690 total=0.416207

           bench_weak_map:   98724768 bytes create=0.214908 total=0.324684
 bench_spl_object_storage:  122724856 bytes create=0.251910 total=0.343701

           bench_weak_map:   98724768 bytes create=0.206807 total=0.315133
 bench_spl_object_storage:  122724856 bytes create=0.222431 total=0.319022

           bench_weak_map:   98724768 bytes create=0.217553 total=0.333927
 bench_spl_object_storage:  122724856 bytes create=0.225307 total=0.310985
After optimizations V2: Avoid collisions both for global EG(weakrefs) and individual WeakMap HashTables (zend_weakmap->ht)
           bench_weak_map:  149048272 bytes create=0.177938 total=0.225081
 bench_spl_object_storage:  122724856 bytes create=0.348268 total=0.441844

           bench_weak_map:   98724768 bytes create=0.130229 total=0.178878
 bench_spl_object_storage:  122724856 bytes create=0.263178 total=0.361414

           bench_weak_map:   98724768 bytes create=0.128731 total=0.177533
 bench_spl_object_storage:  122724856 bytes create=0.230146 total=0.326304

           bench_weak_map:   98724768 bytes create=0.135435 total=0.183836
 bench_spl_object_storage:  122724856 bytes create=0.241961 total=0.332324

           bench_weak_map:   98724768 bytes create=0.134626 total=0.185558
 bench_spl_object_storage:  122724856 bytes create=0.253069 total=0.345240

           bench_weak_map:   98724768 bytes create=0.125791 total=0.175136
 bench_spl_object_storage:  122724856 bytes create=0.208869 total=0.303268
<?php

class Example {
}

function bench_weak_map(int $n) {
    $wm = new WeakMap();
    $values = [];
    for ($i = 0; $i < $n; $i++) {
        $x = new Example();
        $wm[$x] = true;
        $values[] = $x;
    }
    return [$values, $wm];
}

function bench_spl_object_storage(int $n) {
    $wm = new SplObjectStorage();
    $values = [];
    for ($i = 0; $i < $n; $i++) {
        $x = new Example();
        $wm[$x] = true;
        $values[] = $x;
    }
    return [$values, $wm];
}

function main(int $n) {
    error_reporting(E_ALL & ~E_DEPRECATED);
    foreach (['bench_weak_map', 'bench_spl_object_storage'] as $f) {
        gc_collect_cycles();
        $t1 = hrtime(true);
        $m1 = memory_get_usage();
        $v = $f($n);
        $t2 = hrtime(true);
        $m2 = memory_get_usage();
        unset($v);
        $t3 = hrtime(true);
        printf("%25s: %10d bytes create=%.6f total=%.6f\n", $f, $m2 - $m1, ($t2 - $t1)/1e9, ($t3 - $t1)/1e9);
    }
    echo "\n";
}

while(true) {
    main(1000000);
}

@TysonAndre
Copy link
Contributor Author

TysonAndre commented Nov 25, 2021

As seen in #7672 - php's WeakReference/WeakMap handling is spending a lot of time in hash table lookups/modifications. This improves performance by:

  • Multiplying the usable hash entries by ZEND_MM_ALIGMENT (usually 8) and significantly reducing hash collisions
  • Improving cache locality for sequentially allocated objects when accessing the hash tables used to track WeakReference/WeakMap instances (especially in the benchmark)
=====================================================================
FAILED TEST SUMMARY
---------------------------------------------------------------------
FPM: Process manager config pm.process_idle_timeout [sapi/fpm/tests/proc-idle-timeout.phpt]
=====================================================================

The Travis build failure is unrelated. The azure failures are caused by running out of CI minutes.

@TysonAndre TysonAndre requested review from nikic and dstogov November 25, 2021 20:34
@nikic
Copy link
Member

nikic commented Nov 26, 2021

cc @bwoebi as the new commit changes keys for the zend_weakrefs_hash_add/del API -- they're no longer simply zend_object* cast to integer. Is that okay? It shouldn't matter if your just use zend_weakrefs_hash_add/del, but may matter is you use ZEND_HASH_FOREACH and interpret the keys.

@TysonAndre
Copy link
Contributor Author

TysonAndre commented Nov 26, 2021

I assume implementations that needed to inspect internals could add macros like these - 8.2 alphas aren't even out yet. If debuggers or instrumentation tools needed this I guess this could be added to the public API in 8.2.

static zend_always_inline zend_ulong copyof_zend_object_ptr_to_weakref_key(const zend_object *object)
{
#if PHP_VERSION_ID >= 80200
	ZEND_ASSERT(((uintptr_t)object) % ZEND_MM_ALIGNMENT == 0);
	return ((uintptr_t) object) >> ZEND_MM_ALIGNMENT_LOG2;
#else
	return (zend_long) object;
#endif
}

static zend_always_inline zend_object *copyof_zend_weakref_key_to_zend_object_ptr(zend_ulong key)
{
#if PHP_VERSION_ID >= 80200
	return (zend_object *) (((uintptr_t) key) << ZEND_MM_ALIGNMENT_LOG2);
#else
	return (zend_long) object;
#endif
}

@bwoebi
Copy link
Member

bwoebi commented Nov 26, 2021

I suppose that's fine, we just should have some macro in weakrefs.h to convert the bare key then.

@TysonAndre
Copy link
Contributor Author

I suppose that's fine, we just should have some macro in weakrefs.h to convert the bare key then.

That's done

@nikic
Copy link
Member

nikic commented Nov 26, 2021

ext/zend_test/tests/zend_weakmap.phpt is failing.

@TysonAndre
Copy link
Contributor Author

ext/zend_test/tests/zend_weakmap.phpt is failing.

I'd missed that since I thought php was still out of Azure Pipelines minutes (thanks for fixing that). That's fixed now, it needed to use the new function in zend_weakref.h to convert the hash key to a pointer.

TysonAndre added a commit to TysonAndre/php-src that referenced this pull request Nov 27, 2021
This makes reading/writing with `$splObjectStorage[$offset]` shorthand twice as
fast as it was previously, and a bit faster than offsetGet/offsetSet instead of
(previously) much slower.

Related to phpGH-7690
@TysonAndre TysonAndre requested a review from nikic November 27, 2021 17:15
TysonAndre added a commit to TysonAndre/php-src that referenced this pull request Nov 27, 2021
This makes reading/writing with `$splObjectStorage[$offset]` shorthand twice as
fast as it was previously, and a bit faster than offsetGet/offsetSet instead of
(previously) much slower.

Related to phpGH-7690
Shift pointers by ZEND_MM_ALIGNMENT_LOG2
to avoid the noticeable performance degradation caused by hash table collisions.
in `EG(weakrefs)` and zend_weakmap->ht

On 64-bit platforms, pointers are usually aligned to at least 8 bytes,
so only one in 8 hash buckets were actually getting used.
(With the metadata needed to track allocations,
alignment might be at least 16 bytes in practice)

Address review comments, add optimization

Make it public for any extensions that need to work with EG(weakrefs)
for instrumentation, debugging, etc. (e.g. zend_test)

PHP 8.1 and below would use the raw pointer value as a hash key instead.
@TysonAndre TysonAndre merged commit 7504cf1 into php:master Nov 28, 2021
@TysonAndre TysonAndre deleted the weakref-collision branch November 28, 2021 00:52
TysonAndre added a commit to TysonAndre/php-src that referenced this pull request Nov 28, 2021
This makes reading/writing with `$splObjectStorage[$offset]` shorthand twice as
fast as it was previously, and a bit faster than offsetGet/offsetSet instead of
(previously) much slower.

Related to phpGH-7690
TysonAndre added a commit to TysonAndre/php-src that referenced this pull request Nov 28, 2021
This makes reading/writing with `$splObjectStorage[$offset]` shorthand twice as
fast as it was previously, and a bit faster than offsetGet/offsetSet instead of
(previously) much slower.

Call destructor after overriding old SplObjectStorage entry.
Previously, it was called before, which was possibly unsafe
if the destructor had side effects.

Add tests.

Related to phpGH-7690
TysonAndre added a commit that referenced this pull request Nov 30, 2021
This makes reading/writing with `$splObjectStorage[$offset]` shorthand twice as
fast as it was previously, and a bit faster than offsetGet/offsetSet instead of
(previously) much slower.

Call destructor after overriding old SplObjectStorage entry.
Previously, it was called before, which was possibly unsafe
if the destructor had side effects.

Add tests.

Closes GH-7695

Related to GH-7690

Check for ref in SplObjectStorage->__unserialize, check for ref.

SplObjectStorage->unserialize may be a different cause of references
for malformed inputs, so continue checking

In internally used methods, convert references to non-references if they're
found.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants