RFC: Add final class Collections\Deque to PHP #7500

Open
wants to merge 3 commits into
base: master
from

TysonAndre
Contributor

@TysonAndre TysonAndre commented Sep 19, 2021

RFC: https://wiki.php.net/rfc/deque
Online WebAssembly Demo: https://tysonandre.github.io/php-rfc-demo/deque (outdated version of this)
Also see the final class Vector proposal: #7488
Discussion: Adding final class Deque to PHP

This proposes adding the final class Collections\Deque to PHP. From the Wikipedia article for the Double-Ended Queue:

In computer science, a double-ended queue (abbreviated to deque, pronounced deck, like “cheque”) is an abstract data type that generalizes a queue, for which elements can be added to or removed from either the front (head) or back (tail).

(omitted sections, see article)

The dynamic array approach uses a variant of a dynamic array that can grow from both ends, sometimes called array deques. These array deques have all the properties of a dynamic array, such as constant-time random access, good locality of reference, and inefficient insertion/removal in the middle, with the addition of amortized constant-time insertion/removal at both ends, instead of just one end. Three common implementations include:

  1. Storing deque contents in a circular buffer, and only resizing when the buffer becomes full. This decreases the frequency of resizings.
  2. (omitted, see article)

This has lower memory usage and better performance than SplDoublyLinkedList for
push/pop/shift/unshift operations.
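
To make the circular-buffer idea concrete, here is a minimal userland sketch. It is an illustration only, not the C implementation in this PR: the class name RingBufferDeque, the initial capacity of 4, and the doubling growth policy are assumptions chosen for the example, and it omits shrinking, iteration, and ArrayAccess.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative ring-buffer deque (hypothetical class, not the PR's C code).
 */
final class RingBufferDeque implements Countable
{
    /** @var array<int, mixed> fixed-size slot storage */
    private array $slots;
    private int $capacity = 4; // assumed starting capacity
    private int $head = 0;     // slot index of the first element
    private int $size = 0;     // number of stored elements

    public function __construct()
    {
        $this->slots = array_fill(0, $this->capacity, null);
    }

    public function count(): int
    {
        return $this->size;
    }

    public function push(mixed $value): void
    {
        $this->growIfFull();
        // The slot after the last element, wrapping past the end of the buffer.
        $this->slots[($this->head + $this->size) % $this->capacity] = $value;
        $this->size++;
    }

    public function pushFront(mixed $value): void
    {
        $this->growIfFull();
        // Step the head back one slot, wrapping around to the end of the buffer.
        $this->head = ($this->head - 1 + $this->capacity) % $this->capacity;
        $this->slots[$this->head] = $value;
        $this->size++;
    }

    public function pop(): mixed
    {
        $this->assertNotEmpty();
        $index = ($this->head + $this->size - 1) % $this->capacity;
        $value = $this->slots[$index];
        $this->slots[$index] = null; // release the stored value
        $this->size--;
        return $value;
    }

    public function popFront(): mixed
    {
        $this->assertNotEmpty();
        $value = $this->slots[$this->head];
        $this->slots[$this->head] = null;
        $this->head = ($this->head + 1) % $this->capacity;
        $this->size--;
        return $value;
    }

    /** @return list<mixed> elements from front to back */
    public function toArray(): array
    {
        $result = [];
        for ($i = 0; $i < $this->size; $i++) {
            $result[] = $this->slots[($this->head + $i) % $this->capacity];
        }
        return $result;
    }

    private function assertNotEmpty(): void
    {
        if ($this->size === 0) {
            throw new UnderflowException('Deque is empty');
        }
    }

    /** Doubles the buffer only when it is full, moving elements back to slot 0. */
    private function growIfFull(): void
    {
        if ($this->size < $this->capacity) {
            return;
        }
        $values = $this->toArray();
        $this->capacity *= 2;
        $this->slots = array_pad($values, $this->capacity, null);
        $this->head = 0;
    }
}

$deque = new RingBufferDeque();
foreach ([1, 2, 3] as $v) {
    $deque->push($v);
}
$deque->pushFront(0);
$deque->pushFront(-1); // the 4-slot buffer is full here, so this call resizes first
```

Because the head index wraps around instead of elements being moved, pushing or popping at either end touches a single slot; only the occasional resize copies every element, which is where the amortized constant-time bound comes from.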

The API is as follows:

namespace Collections;

/**
 * A double-ended queue (Typically abbreviated as Deque, pronounced "deck", like "cheque")
 * represented internally as a circular buffer.
 *
 * This has much lower memory usage than SplDoublyLinkedList or its subclasses (SplQueue, SplStack),
 * and operations are significantly faster than SplDoublyLinkedList.
 *
 * See https://en.wikipedia.org/wiki/Double-ended_queue
 *
 * This supports amortized constant time pushing and popping onto the start (i.e. front, first)
 * or end (i.e. back, last) of the Deque.
 *
 * Method naming is based on https://www.php.net/spldoublylinkedlist
 * and on array_push/pop/unshift/shift and array_key_first/array_key_last.
 */
final class Deque implements IteratorAggregate, Countable, JsonSerializable, ArrayAccess
{
    /** Construct the Deque from the values of the Traversable/array, ignoring keys */
    public function __construct(iterable $iterator = []) {}
    /**
     * Returns an iterator that accounts for calls to shift/unshift tracking the position of the start of the Deque.
     * Calls to shift/unshift will do the following:
     * - Increase/Decrease the value returned by the iterator's key()
     *   by the number of elements added/removed to/from the start of the Deque.
     *   (`$deque[$iteratorKey] === $iteratorValue` at the time the key and value are returned).
     * - Repeated calls to shift will cause valid() to return false if the iterator's
     *   position ends up before the start of the Deque at the time iteration resumes.
     * - They will not cause the remaining values to be iterated over more than once or skipped.
     */
    public function getIterator(): \InternalIterator {}
    /** Returns the number of elements in the Deque. */
    public function count(): int {}
    /** Returns true if there are 0 elements in the Deque. */
    public function isEmpty(): bool {}
    /** Removes all elements from the Deque. */
    public function clear(): void {}

    public function __serialize(): array {}
    public function __unserialize(array $data): void {}
    /** Construct the Deque from the values of the array, ignoring keys */
    public static function __set_state(array $array): Deque {}

    /** Appends value(s) to the end of the Deque. */
    public function push(mixed ...$values): void {}
    /** Prepends value(s) to the start of the Deque. */
    public function pushFront(mixed ...$values): void {}
    /**
     * Pops a value from the end of the Deque.
     * @throws \UnderflowException if the Deque is empty
     */
    public function pop(): mixed {}
    /**
     * Pops a value from the start of the Deque.
     * @throws \UnderflowException if the Deque is empty
     */
    public function popFront(): mixed {}

    /**
     * Peeks at the value at the start of the Deque.
     * @throws \UnderflowException if the Deque is empty
     */
    public function first(): mixed {}
    /**
     * Peeks at the value at the end of the Deque.
     * @throws \UnderflowException if the Deque is empty
     */
    public function last(): mixed {}

    /**
     * Returns a list of the elements from the start to the end.
     */
    public function toArray(): array {}

    /**
     * Insert 0 or more values at the given offset of the Deque.
     * @throws \OutOfBoundsException if the value of $offset is not within the bounds of this Deque.
     */
    public function insert(int $offset, mixed ...$values): void {}

    // $offset must be mixed in the methods below for compatibility with ArrayAccess
    /**
     * Returns the value at offset (int)$offset (relative to the start of the Deque)
     * @throws \OutOfBoundsException if the value of (int)$offset is not within the bounds of this Deque
     */
    public function offsetGet(mixed $offset): mixed {}
    /**
     * Returns true if `0 <= (int)$offset && (int)$offset < $this->count()`.
     */
    public function offsetExists(mixed $offset): bool {}
    /**
     * Sets the value at offset $offset (relative to the start of the Deque) to $value
     * @throws \OutOfBoundsException if the value of (int)$offset is not within the bounds of this Deque
     */
    public function offsetSet(mixed $offset, mixed $value): void {}
    /**
     * Removes the value at (int)$offset from the deque.
     * @throws \OutOfBoundsException if the value of (int)$offset is not within the bounds of this Deque.
     */
    public function offsetUnset(mixed $offset): void {}

    /**
     * This is JSON serialized as a JSON array with elements from the start to the end.
     */
    public function jsonSerialize(): array {}
}

Earlier work on the implementation can be found at
https://github.com/TysonAndre/pecl-teds
(though Teds\Deque hasn't been updated with new names yet)

This was originally based on spl_fixedarray.c and previous work I did on an RFC.

Notable features of Deque

  • Significantly lower memory usage and better performance than
    SplDoublyLinkedList

  • Amortized constant time operations for push/pop/unshift/shift.

  • Reclaims memory when roughly a quarter of the capacity is used,
    unlike array, which never releases allocated capacity
    https://www.npopov.com/2014/12/22/PHPs-new-hashtable-implementation.html (the article predates php 8.2-dev's memory optimization for packed arrays)

    One problem with the current implementation is that arData never shrinks
    (unless explicitly told to). So if you create an array with a few million
    elements and remove them afterwards, the array will still take a lot of
    memory. We should probably half the arData size if utilization falls below a
    certain level.

    In long-running applications where the maximum count of an array
    is much larger than its average count, this retained memory may be a concern.

  • Adds functionality that cannot be implemented nearly as efficiently with
    an array. For example, prepending a single element to an array
    (and making it first in iteration order) with array_unshift
    takes linear time, because all existing elements in the array
    need to be moved to make room for the new one.

  • Supports $deque[] = $element, like ArrayObject.

  • Having this functionality in php itself rather than a third party extension
    would encourage wider adoption of this
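
The cost of prepending to an array can be observed with a small script. This is a rough sketch, not one of the benchmarks in this PR: the size of 10000 is an arbitrary choice, SplDoublyLinkedList stands in for a structure with constant-time prepending (so the script runs on stock PHP), and the timings it prints will vary by machine.

```php
<?php
declare(strict_types=1);

$n = 10000; // arbitrary size chosen for illustration

// Prepending with array_unshift: each call moves and renumbers every existing element.
$start = hrtime(true);
$array = [];
for ($i = 0; $i < $n; $i++) {
    array_unshift($array, $i);
}
$arraySeconds = (hrtime(true) - $start) / 1e9;

// Prepending to a linked list: each call only links one new node at the head.
$start = hrtime(true);
$list = new SplDoublyLinkedList();
for ($i = 0; $i < $n; $i++) {
    $list->unshift($i);
}
$listSeconds = (hrtime(true) - $start) / 1e9;

printf("array_unshift:                %.4f s\n", $arraySeconds);
printf("SplDoublyLinkedList::unshift: %.4f s\n", $listSeconds);

// Both structures end up holding the same values in the same order.
$listValues = iterator_to_array($list, false);
```

Doubling $n should roughly quadruple the array_unshift time while only doubling the linked-list time, which is the quadratic-versus-linear gap the bullet above describes.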

Backwards incompatible changes:

  • Userland classlikes named \Collections\Deque would
    cause a compile error due to this class now being declared internally.

Benchmark

Benchmarks were re-run on November 9th, 2022 with opcache enabled in an NTS non-debug build with default cflags (-O2)

Two cycles of appending n values then shifting them from the front

Note that constant-time removal from the front of a PHP array is possible (as long as the key stays at the front of the array), but constant-time prepending (unshift) to the front of an array is not. array_unshift is a linear-time operation (it takes time proportional to the current array size). **This benchmark avoids array_unshift and other pitfalls for array**, so `unshift` is not benchmarked.

Because there's a second cycle, array becomes an associative array and uses more memory than a packed array (https://www.npopov.com/2014/12/22/PHPs-new-hashtable-implementation.html).
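
A small sketch of the array technique these notes refer to, using made-up values: key() plus unset() removes from the front without moving the remaining elements, and the keys appended in a second cycle continue from the previous highest key instead of restarting at 0.

```php
<?php
declare(strict_types=1);

$queue = [];
foreach ([10, 20, 30] as $value) {
    $queue[] = $value;
}

$shifted = [];
while (count($queue) > 0) {
    // key() reads the array's internal pointer. Unsetting the element at the
    // pointer advances it to the next element, so nothing is rescanned or moved.
    $key = key($queue);
    $shifted[] = $queue[$key];
    unset($queue[$key]);
}

// Second cycle: appended keys continue after the highest key used so far.
$queue[] = 40;
$queue[] = 50;
$keysAfterRefill = array_keys($queue);
```

After the loop, $shifted holds the values in insertion order, and $keysAfterRefill is [3, 4] rather than [0, 1].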

memory_get_usage is not counting the memory overhead of tracking the allocations of a lot of small objects, so the memory usage of SplDoublyLinkedList is under-reported (negligibly with the default emalloc allocator). SplQueue is a subclass of SplDoublyLinkedList and I expect it would have the same performance.

Source code for benchmarking 2 cycles of pushing `n` elements then shifting all elements:

<?php

const PUSH_SHIFT_CYCLES = 2;

function bench_array(int $n, int $iterations) {
    $totalReadTime = 0.0;
    $startTime = hrtime(true);
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = [];
        for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) {
            for ($i = 0; $i < $n; $i++) {
                $values[] = $i;
            }
            $maxMemory = memory_get_usage();
            $startReadTime = hrtime(true);
            while (count($values) > 0) {
                // Pretend we don't know the actual position of the first array key for this simulated benchmark
                // array_shift is not used because it is linear time (to move all other elements)
                // rather than constant time.
                //
                // This approach is simple at the cost of memory - it converts a packed array to a non-packed array
                // NOTE: Adding a call to reset() is not necessary in this case, and would result in worse performance.
                // NOTE: array_key_first results in quadratic performance for this synthetic benchmark.
                // $key = array_key_first($values);
                $key = key($values);
                $total += $values[$key];
                unset($values[$key]);
            }
            $endReadTime = hrtime(true);
            $totalReadTime += $endReadTime - $startReadTime;
        }

        unset($values);
    }
    $endTime = hrtime(true);

    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("2x Push then shift from array:               n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n",
        $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total);
}
function bench_deque(int $n, int $iterations) {
    $startTime = hrtime(true);
    $totalReadTime = 0.0;
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = new Collections\Deque();
        for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) {
            for ($i = 0; $i < $n; $i++) {
                $values[] = $i;
            }
            $maxMemory = memory_get_usage();

            $startReadTime = hrtime(true);
            while (count($values) > 0) {
                $total += $values->shift();
            }
            $endReadTime = hrtime(true);
            $totalReadTime += $endReadTime - $startReadTime;
        }

        unset($values);
    }
    $endTime = hrtime(true);
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("2x Push then shift from Collections\Deque:   n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n",
        $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total);
}
function bench_ds_deque(int $n, int $iterations) {
    $startTime = hrtime(true);
    $totalReadTime = 0.0;
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = new Ds\Deque();
        for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) {
            for ($i = 0; $i < $n; $i++) {
                $values[] = $i;
            }
            $maxMemory = memory_get_usage();

            $startReadTime = hrtime(true);
            while (count($values) > 0) {
                $total += $values->shift();
            }
            $endReadTime = hrtime(true);
            $totalReadTime += $endReadTime - $startReadTime;
        }

        unset($values);
    }
    $endTime = hrtime(true);
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("2x Push then shift from Ds\Deque:            n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n",
        $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total);
}
// SplDoublyLinkedList is a linked list that takes more memory than needed.
// Access to values in the middle of the SplDoublyLinkedList is also less efficient.
function bench_spl_dll(int $n, int $iterations) {
    $startTime = hrtime(true);
    $totalReadTime = 0.0;
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = new SplDoublyLinkedList();
        for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) {
            for ($i = 0; $i < $n; $i++) {
                $values->push($i);
            }
            $maxMemory = memory_get_usage();
            $startReadTime = hrtime(true);
            // shift() removes the first node of the linked list in constant time
            while (count($values) > 0) {
                $total += $values->shift();
            }
            $endReadTime = hrtime(true);
            $totalReadTime += $endReadTime - $startReadTime;
        }
        unset($values);
    }
    $endTime = hrtime(true);
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("2x Push then shift from SplDoublyLinkedList: n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n",
        $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total);
}
$n = 2**20;
$iterations = 10;
$sizes = [
    [1, 8000000],
    [4, 2000000],
    [8, 1000000],
    [1024, 20000],
    [2**20, 20],
];
echo "Test creating a collection, then two cycles of push+shift (adding n values to the end of collections then shifting all of them from front of collection)\n";
printf(
    "Results for php %s debug=%s with opcache enabled=%s\n\n",
    PHP_VERSION,
    PHP_DEBUG ? 'true' : 'false',
    json_encode(function_exists('opcache_get_status') && (opcache_get_status(false)['opcache_enabled'] ?? false))
);

foreach ($sizes as [$n, $iterations]) {
    bench_array($n, $iterations);
    bench_deque($n, $iterations);
    bench_ds_deque($n, $iterations);
    bench_spl_dll($n, $iterations);
    echo "\n";
}

Tests run on November 9, 2022 with php 8.3-dev

Test creating a collection, then two cycles of push+shift (adding n values to the end of collections then shifting all of them from front of collection)
Results for php 8.3.0-dev debug=false with opcache enabled=true

2x Push then shift from array:               n=       1 iterations= 8000000
=> max memory=     216 bytes, create+destroy time=1.547 shift time = 1.377 total time = 2.924 result=0
2x Push then shift from Collections\Deque:   n=       1 iterations= 8000000
=> max memory=     144 bytes, create+destroy time=2.194 shift time = 1.191 total time = 3.385 result=0
2x Push then shift from Ds\Deque:            n=       1 iterations= 8000000
=> max memory=     216 bytes, create+destroy time=2.596 shift time = 1.263 total time = 3.859 result=0
2x Push then shift from SplDoublyLinkedList: n=       1 iterations= 8000000
=> max memory=     184 bytes, create+destroy time=2.758 shift time = 1.280 total time = 4.038 result=0

2x Push then shift from array:               n=       4 iterations= 2000000
=> max memory=     216 bytes, create+destroy time=0.593 shift time = 1.004 total time = 1.597 result=24000000
2x Push then shift from Collections\Deque:   n=       4 iterations= 2000000
=> max memory=     144 bytes, create+destroy time=0.741 shift time = 0.654 total time = 1.395 result=24000000
2x Push then shift from Ds\Deque:            n=       4 iterations= 2000000
=> max memory=     216 bytes, create+destroy time=0.836 shift time = 0.695 total time = 1.531 result=24000000
2x Push then shift from SplDoublyLinkedList: n=       4 iterations= 2000000
=> max memory=     280 bytes, create+destroy time=1.377 shift time = 0.747 total time = 2.124 result=24000000

2x Push then shift from array:               n=       8 iterations= 1000000
=> max memory=     376 bytes, create+destroy time=0.474 shift time = 0.931 total time = 1.405 result=56000000
2x Push then shift from Collections\Deque:   n=       8 iterations= 1000000
=> max memory=     208 bytes, create+destroy time=0.675 shift time = 0.617 total time = 1.292 result=56000000
2x Push then shift from Ds\Deque:            n=       8 iterations= 1000000
=> max memory=     216 bytes, create+destroy time=0.831 shift time = 0.914 total time = 1.744 result=56000000
2x Push then shift from SplDoublyLinkedList: n=       8 iterations= 1000000
=> max memory=     408 bytes, create+destroy time=2.079 shift time = 1.271 total time = 3.350 result=56000000

2x Push then shift from array:               n=    1024 iterations=   20000
=> max memory=   41016 bytes, create+destroy time=0.913 shift time = 3.011 total time = 3.924 result=20951040000
2x Push then shift from Collections\Deque:   n=    1024 iterations=   20000
=> max memory=   16464 bytes, create+destroy time=0.605 shift time = 1.210 total time = 1.815 result=20951040000
2x Push then shift from Ds\Deque:            n=    1024 iterations=   20000
=> max memory=   16472 bytes, create+destroy time=0.618 shift time = 1.275 total time = 1.893 result=20951040000
2x Push then shift from SplDoublyLinkedList: n=    1024 iterations=   20000
=> max memory=   32920 bytes, create+destroy time=1.986 shift time = 1.287 total time = 3.273 result=20951040000

2x Push then shift from array:               n= 1048576 iterations=      20
=> max memory=41943120 bytes, create+destroy time=1.359 shift time = 2.025 total time = 3.384 result=21990211584000
2x Push then shift from Collections\Deque:   n= 1048576 iterations=      20
=> max memory=16777320 bytes, create+destroy time=1.269 shift time = 1.368 total time = 2.637 result=21990211584000
2x Push then shift from Ds\Deque:            n= 1048576 iterations=      20
=> max memory=16777328 bytes, create+destroy time=1.339 shift time = 1.368 total time = 2.707 result=21990211584000
2x Push then shift from SplDoublyLinkedList: n= 1048576 iterations=      20
=> max memory=33554584 bytes, create+destroy time=2.044 shift time = 1.314 total time = 3.357 result=21990211584000

Only appending to a Deque and reading elements without removal

Note that the proposed Deque as well as the existing SplDoublyLinkedList/SplStack are expected to perform equally well at unshifting to (adding to) or shifting from (removing from) the front of the collection (compared to adding to/removing from the back of a collection).

Deque is more efficient than other object data structures in the SPL at this benchmark, but is less efficient than array after array optimizations for #7491 were merged into PHP 8.2

Benchmark of only appending to a Deque and reading elements without removal:

<?php

function bench_array(int $n, int $iterations) {
    $totalReadTime = 0.0;
    $startTime = hrtime(true);
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = [];
        for ($i = 0; $i < $n; $i++) {
            $values[] = $i;
        }
        $startReadTime = hrtime(true);
        for ($i = 0; $i < $n; $i++) {
            $total += $values[$i];
        }
        $endReadTime = hrtime(true);
        $totalReadTime += $endReadTime - $startReadTime;

        $endMemory = memory_get_usage();
        unset($values);
    }
    $endTime = hrtime(true);

    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("Appending to array:             n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n",
        $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total);
}
function bench_deque(int $n, int $iterations) {
    $startTime = hrtime(true);
    $totalReadTime = 0.0;
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = new Collections\Deque();
        for ($i = 0; $i < $n; $i++) {
            $values[] = $i;
        }

        $startReadTime = hrtime(true);
        for ($i = 0; $i < $n; $i++) {
            $total += $values[$i];
        }
        $endReadTime = hrtime(true);
        $totalReadTime += $endReadTime - $startReadTime;

        $endMemory = memory_get_usage();
        unset($values);
    }
    $endTime = hrtime(true);
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("Appending to Collections\Deque: n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n",
        $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total);
}
function bench_ds_deque(int $n, int $iterations) {
    $startTime = hrtime(true);
    $totalReadTime = 0.0;
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = new Ds\Deque();
        for ($i = 0; $i < $n; $i++) {
            $values[] = $i;
        }

        $startReadTime = hrtime(true);
        for ($i = 0; $i < $n; $i++) {
            $total += $values[$i];
        }
        $endReadTime = hrtime(true);
        $totalReadTime += $endReadTime - $startReadTime;

        $endMemory = memory_get_usage();
        unset($values);
    }
    $endTime = hrtime(true);
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("Appending to Ds\Deque:          n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n",
        $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total);
}
// SplStack is a subclass of SplDoublyLinkedList, so it is a linked list that takes more memory than needed.
// Access to values in the middle of the SplStack is also less efficient.
function bench_spl_stack(int $n, int $iterations) {
    $startTime = hrtime(true);
    $totalReadTime = 0.0;
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = new SplStack();
        for ($i = 0; $i < $n; $i++) {
            $values->push($i);
        }
        $startReadTime = hrtime(true);
        // Random access is linear time in a linked list, so use foreach instead
        foreach ($values as $value) {
            $total += $value;
        }
        $endReadTime = hrtime(true);
        $totalReadTime += $endReadTime - $startReadTime;
        $endMemory = memory_get_usage();
        unset($values);
    }
    $endTime = hrtime(true);
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("Appending to SplStack:          n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n",
        $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total);
}
function bench_spl_fixed_array(int $n, int $iterations) {
    $startTime = hrtime(true);
    $totalReadTime = 0.0;
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = new SplFixedArray();
        for ($i = 0; $i < $n; $i++) {
            // Imitate how push() would be implemented in a situation
            // where the number of elements wasn't actually known ahead of time.
            // erealloc() tends to extend the existing array when possible.
            $size = $values->getSize();
            $values->setSize($size + 1);
            $values->offsetSet($size, $i);
        }
        $startReadTime = hrtime(true);
        for ($i = 0; $i < $n; $i++) {
            $total += $values[$i];
        }
        $endReadTime = hrtime(true);
        $totalReadTime += $endReadTime - $startReadTime;
        $endMemory = memory_get_usage();
        unset($values);
    }
    $endTime = hrtime(true);
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("Appending to SplFixedArray:     n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n\n",
        $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total);
}
$n = 2**20;
$iterations = 10;
$sizes = [
    [1, 8000000],
    [4, 2000000],
    [8, 1000000],
    [2**20, 20],
];
printf(
    "Results for php %s debug=%s with opcache enabled=%s\n\n",
    PHP_VERSION,
    PHP_DEBUG ? 'true' : 'false',
    json_encode(function_exists('opcache_get_status') && (opcache_get_status(false)['opcache_enabled'] ?? false))
);

foreach ($sizes as [$n, $iterations]) {
    bench_array($n, $iterations);
    bench_deque($n, $iterations);
    bench_ds_deque($n, $iterations);
    bench_spl_stack($n, $iterations);
    bench_spl_fixed_array($n, $iterations);
    echo "\n";
}

Caveats for comparison with Ds\Deque from ds PECL:

  • Ds\Deque was compiled and installed as a shared extension with phpize; ./configure; make install (installing as a shared extension reflects the default way to install PECL modules, e.g. what pecl install or an OS package manager would do or what copying the windows DLL would do), not statically compiled into php
  • Both Deque and Ds\Deque use circular buffers, so the performance is expected to be about the same.
  • This php-src PR has micro-optimizations such as storing the mask (instead of capacity) for faster bitwise ands, as well as the use of the EXPECTED/UNEXPECTED macros to hint the fast path to the compiler.
  • Having the extra step of installing a PECL extension may discourage application/library authors from writing code that uses it, or discourage some users from using those applications/libraries. While php-ds has a polyfill, it would be slower than a native solution - https://github.com/php-ds/ext-ds#compatibility
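
The mask micro-optimization mentioned in the list above can be illustrated in PHP, although the PR itself applies it in C. This is a sketch under assumptions: the helper names wrap_with_mask and wrap_with_modulo are hypothetical, and the trick only works when the capacity is a power of two.

```php
<?php
declare(strict_types=1);

$capacity = 8;         // must be a power of two for the mask trick to work
$mask = $capacity - 1; // binary 0111

/** Hypothetical helper: map a logical position to a buffer slot with a bitwise AND. */
function wrap_with_mask(int $position, int $mask): int
{
    return $position & $mask;
}

/** Equivalent mapping with the remainder operator, which needs a sign fix in PHP. */
function wrap_with_modulo(int $position, int $capacity): int
{
    $remainder = $position % $capacity; // PHP's % keeps the sign of the dividend
    return $remainder < 0 ? $remainder + $capacity : $remainder;
}

$positions = [-1, 0, 7, 8, 13];
$masked = array_map(fn (int $p): int => wrap_with_mask($p, $mask), $positions);
$modded = array_map(fn (int $p): int => wrap_with_modulo($p, $capacity), $positions);
```

Both mappings give [7, 0, 7, 0, 5] for these positions. The masked version needs no division and no branch for the negative case (stepping the head back from slot 0), which is why storing the mask rather than the capacity can shave a little time off each access.
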

Results for php 8.3.0-dev debug=false with opcache enabled=true

Appending to array:             n=       1 iterations= 8000000 memory=     216 bytes, create+destroy time=1.090 read time = 0.515 result=0
Appending to Collections\Deque: n=       1 iterations= 8000000 memory=     144 bytes, create+destroy time=1.807 read time = 0.713 result=0
Appending to Ds\Deque:          n=       1 iterations= 8000000 memory=     216 bytes, create+destroy time=2.125 read time = 0.668 result=0
Appending to SplStack:          n=       1 iterations= 8000000 memory=     184 bytes, create+destroy time=2.968 read time = 1.247 result=0
Appending to SplFixedArray:     n=       1 iterations= 8000000 memory=      80 bytes, create+destroy time=3.712 read time = 0.869 result=0


Appending to array:             n=       4 iterations= 2000000 memory=     216 bytes, create+destroy time=0.435 read time = 0.224 result=12000000
Appending to Collections\Deque: n=       4 iterations= 2000000 memory=     144 bytes, create+destroy time=0.585 read time = 0.334 result=12000000
Appending to Ds\Deque:          n=       4 iterations= 2000000 memory=     216 bytes, create+destroy time=0.698 read time = 0.334 result=12000000
Appending to SplStack:          n=       4 iterations= 2000000 memory=     280 bytes, create+destroy time=1.308 read time = 0.545 result=12000000
Appending to SplFixedArray:     n=       4 iterations= 2000000 memory=     128 bytes, create+destroy time=2.165 read time = 0.456 result=12000000


Appending to array:             n=       8 iterations= 1000000 memory=     216 bytes, create+destroy time=0.269 read time = 0.161 result=28000000
Appending to Collections\Deque: n=       8 iterations= 1000000 memory=     208 bytes, create+destroy time=0.424 read time = 0.286 result=28000000
Appending to Ds\Deque:          n=       8 iterations= 1000000 memory=     216 bytes, create+destroy time=0.461 read time = 0.294 result=28000000
Appending to SplStack:          n=       8 iterations= 1000000 memory=     408 bytes, create+destroy time=0.971 read time = 0.420 result=28000000
Appending to SplFixedArray:     n=       8 iterations= 1000000 memory=     192 bytes, create+destroy time=1.993 read time = 0.387 result=28000000


Appending to array:             n= 1048576 iterations=      20 memory=16781392 bytes, create+destroy time=0.930 read time = 0.297 result=10995105792000
Appending to Collections\Deque: n= 1048576 iterations=      20 memory=16777320 bytes, create+destroy time=1.016 read time = 0.533 result=10995105792000
Appending to Ds\Deque:          n= 1048576 iterations=      20 memory=16777328 bytes, create+destroy time=1.097 read time = 0.535 result=10995105792000
Appending to SplStack:          n= 1048576 iterations=      20 memory=33554584 bytes, create+destroy time=1.863 read time = 0.846 result=10995105792000
Appending to SplFixedArray:     n= 1048576 iterations=      20 memory=16777304 bytes, create+destroy time=4.707 read time = 0.715 result=10995105792000

@TysonAndre TysonAndre force-pushed the deque branch 2 times, most recently from 99adb5c to 84a8134 Compare September 19, 2021 19:17
@TysonAndre TysonAndre changed the title Proposal: Add final class Deque to PHP RFC: Add final class Deque to PHP Sep 19, 2021
@nikic nikic added the RFC label Sep 20, 2021
@morrisonlevi
Contributor

IMO, if you aren't going to put "spl" in the name somewhere, then please don't put it in the ext/spl directory.

@TysonAndre
Contributor Author

TysonAndre commented Sep 20, 2021

> IMO, if you aren't going to put "spl" in the name somewhere, then please don't put it in the ext/spl directory.

Any name ideas for an extension? E.g. 'std'/'standard'/'nds'? Move to core? Obviously, can't use pecl names that already exist such as 'ds'. Should I create an extension per data structure?

Still, I don't know if it's a good idea.

spl also has 'ArrayObject' and various iterables and interfaces that aren't prefixed with Spl already.

Also, this would inconvenience users trying to find data structures in the manual if they were split up among many extensions - a developer would see that SplDoublyLinkedList exists and not know there was a more efficient Deque, though a "See also" section in the overview would help.


EDIT: Overall, I believe that a programming language having multiple standard libraries ("Standard PHP Library") for generic data structures would lead to more confusion than it's worth and be regarded as a poor design choice in the future, especially if the original one would likely never be deprecated.

@morrisonlevi
Contributor

Theoretically you can just add it to ext/standard. We may want to make a new namespace + extension, but that discussion is ongoing on the ML.

@rtheunissen
Contributor

rtheunissen commented Jan 9, 2022

I would be curious to see benchmarks with the ds deque also - that is already implemented in C within the PHP internal API and has been released and tested for years.

My feeling there was that the API and behavior for deque and vector are so similar that you could just implement vector as a circular buffer and avoid the responsibility of choice. The benchmarks between deque and vector for all operations should be marginal and in all cases insignificant.

@TysonAndre
Contributor Author

> I would be curious to see benchmarks with the ds deque also - that is already implemented in C within the PHP internal API and has been released and tested for years.

Updated. Note that php 8.2's packed arrays use half the memory of packed arrays in PHP <=8.1

In the updated benchmark, Collections\Deque is a tiny bit faster than Ds\Deque.

  • Ds\Deque was compiled and installed as a shared extension with `phpize; ./configure; make install`, not statically compiled into php. (A shared build reflects the default way to install PECL modules, e.g. what `pecl install`, an OS package manager, or copying the Windows DLL would do.)
  • Both Deque and Ds\Deque use circular buffers, so the performance is expected to be about the same.
  • This php-src PR has micro-optimizations such as storing the mask (instead of capacity) for faster bitwise ands, as well as the use of the EXPECTED/UNEXPECTED macros to hint the fast path to the compiler.
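
To illustrate the mask trick from the last bullet, here is a minimal userland sketch. It is not the C code from this PR, and `RingBuffer` and its members are hypothetical names. It assumes the capacity is always a power of 2, so wrapping an index is a bitwise AND with `capacity - 1` rather than a modulo.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical userland model of a power-of-2 circular buffer.
 * It only illustrates index wrapping with a mask; it is not the PR's C implementation.
 */
final class RingBuffer
{
    /** @var array<int, mixed> backing storage, always a power-of-2 number of slots */
    private array $slots;
    private int $mask;      // capacity - 1, e.g. 0b0111 for 8 slots
    private int $head = 0;  // physical index of logical offset 0
    private int $size = 0;

    public function __construct(int $capacity = 8)
    {
        if ($capacity < 1 || ($capacity & ($capacity - 1)) !== 0) {
            throw new \InvalidArgumentException('capacity must be a power of 2');
        }
        $this->slots = array_fill(0, $capacity, null);
        $this->mask = $capacity - 1;
    }

    public function push(mixed $value): void
    {
        $this->growIfFull();
        $this->slots[($this->head + $this->size) & $this->mask] = $value;
        $this->size++;
    }

    public function unshift(mixed $value): void
    {
        $this->growIfFull();
        // Stepping back from 0 wraps to the last slot: (0 - 1) & mask === mask.
        $this->head = ($this->head - 1) & $this->mask;
        $this->slots[$this->head] = $value;
        $this->size++;
    }

    public function get(int $offset): mixed
    {
        if ($offset < 0 || $offset >= $this->size) {
            throw new \OutOfBoundsException("Offset $offset is out of range");
        }
        return $this->slots[($this->head + $offset) & $this->mask];
    }

    public function count(): int
    {
        return $this->size;
    }

    /** Doubles the capacity when full, copying values back into logical order. */
    private function growIfFull(): void
    {
        $capacity = $this->mask + 1;
        if ($this->size < $capacity) {
            return;
        }
        $newSlots = array_fill(0, $capacity * 2, null);
        for ($i = 0; $i < $this->size; $i++) {
            $newSlots[$i] = $this->slots[($this->head + $i) & $this->mask];
        }
        $this->slots = $newSlots;
        $this->mask = $capacity * 2 - 1;
        $this->head = 0;
    }
}
```

Keeping `mask` rather than `capacity` as the stored field means the hot paths (`push`, `unshift`, `get`) need no subtraction before the AND; the capacity is only recomputed on the rare resize.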

> My feeling there was that the API and behavior for deque and vector are so similar that you could just implement vector as a circular buffer and avoid the responsibility of choice. The benchmarks between deque and vector for all operations should be marginal and in all cases insignificant.

Note that php 8.2's packed arrays use half the memory of packed arrays in PHP <=8.1 and would outperform Vector, so the use case for Vectors is less compelling. There's still the argument for:

  • Expressing the intent of the code (e.g. not intended for use with shift/unshift)
  • Possibly being a cleaner way to express shared state than a reference to an array or object wrapping an array
  • Any methods that RFCs propose that conceptually make sense to the authors to add to a vector RFC, e.g. Vector::map()/Vector::filter()

There's the restriction that those Deque implementations need the capacity to be a power of 2 but Vector doesn't. In special cases, Vector could use less memory (e.g. if Vector::map() doesn't need subsequent resizing in the most common cases)
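
As a rough illustration of the capacity restriction above, the following sketch compares slot counts for a power-of-2 buffer and an exact-size buffer. `nextPowerOfTwo` is a hypothetical helper, and the actual minimum capacity and growth policy of the C code in this PR may differ.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: smallest power of 2 that is >= $n (and at least 1). */
function nextPowerOfTwo(int $n): int
{
    $capacity = 1;
    while ($capacity < $n) {
        $capacity <<= 1;
    }
    return $capacity;
}

$count = 1025;
$dequeSlots = nextPowerOfTwo($count);  // a power-of-2 buffer must round up
$vectorSlots = $count;                 // an exact-size buffer could allocate just this
$unusedSlots = $dequeSlots - $vectorSlots;
```

With 1025 elements the power-of-2 buffer needs 2048 slots, 1023 of which stay unused, which is the special case where an exact-size Vector could use less memory.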

@TysonAndre TysonAndre force-pushed the deque branch 2 times, most recently from 6b96098 to bfaf732 Compare January 26, 2022 15:19
@TysonAndre TysonAndre changed the title RFC: Add final class Deque to PHP RFC: Add final class Collections\Deque to PHP Jan 27, 2022
@iluuu1994
Member

@TysonAndre Is there any progress on this RFC?

@TysonAndre
Contributor Author

TysonAndre commented Oct 11, 2022

ext/standard/tests/general_functions/phpinfo.phpt is just failing because of the commit message being in the environment variable.

> Is there any progress on this RFC?

I had found a few things I'd wanted to change, and some that were brought up in https://externals.io/message/116100#116214

  • Change offsetUnset into a helper method to remove valid offsets, instead of unconditionally throwing
  • Add insert(int $offset, mixed ...$values) method to add 0 or more elements into a valid offset
  • Make the object iteration behavior with shift/unshift behave consistently.
    Iterators for a deque of size n now only have n+1 iteration states
    (one state for each offset, plus one state for iterators being at the end of
    the deque), rather than an arbitrary integer offset
  • Switch from tracking an iteration offset to tracking an intrusive doubly
    linked list of active iterators for each Deque instance.
    This has better results for foreach.phpt when shifting from the start of a
    deque during iteration. (see previous note)
  • The new iteration behavior (n+1 possible states, iterators associated with the deque instance) is more consistent with how
    iteration works on regular arrays (with deprecated reset/next/prev) and with SplDoublyLinkedList
    and has the benefit of working better with insert() and offsetUnset()

It was getting too close to a feature freeze, and I was concerned that it would be too rushed, or that there wouldn't be time to adjust the API in follow-up RFCs if needed, but I think that's it.

I'd also wanted to make sure that memory usage would be consistently low (and at the same time free of edge cases) for new data structures for handlers of get_properties and get_properties_for (even after calling var_export/debug_zval_dump) - In php 8.3-dev it's now straightforward to do that, and the patches have been merged into 8.3-dev for months. #8044 #8046 (etc)

So I need to

  • Run benchmarks again and update documented benchmark results here and on the rfc (EDIT: done)
  • Update the rfc with the latest api and change notes (EDIT: done)
  • EDIT: Finish the simple optimization TODO for ->insert/->offsetUnset (EDIT: done)
  • EDIT: Update Teds\Deque with the same optimization (EDIT: done)
  • Announce the updates on the mailing list

This has lower memory usage and better performance than SplDoublyLinkedList for
push/pop operations.

The API is as follows:

```php
namespace Collections;

/**
 * A double-ended queue (Typically abbreviated as Deque, pronounced "deck", like "cheque")
 * represented internally as a circular buffer.
 *
 * This has much lower memory usage than SplDoublyLinkedList or its subclasses (SplQueue, SplStack),
 * and operations are significantly faster than SplDoublyLinkedList.
 *
 * See https://en.wikipedia.org/wiki/Double-ended_queue
 *
 * This supports amortized constant time pushing and popping onto the front (i.e. start, first)
 * or back (i.e. end, last) of the Deque.
 *
 * Method naming is based on https://www.php.net/spldoublylinkedlist
 * and on array_push/pop/unshift/shift/ and array_key_first/array_key_last.
 */
final class Deque implements \IteratorAggregate, \Countable, \JsonSerializable, \ArrayAccess
{
    /** Construct the Deque from the values of the Traversable/array, ignoring keys */
    public function __construct(iterable $iterator = []) {}
    /**
     * Returns an iterator that accounts for calls to shift/unshift by tracking the position of the start of the Deque.
     * Calls to shift/unshift will do the following:
     * - Increase/Decrease the value returned by the iterator's key()
     *   by the number of elements added/removed to/from the start of the Deque.
     *   (`$deque[$iteratorKey] === $iteratorValue` at the time the key and value are returned).
     * - Repeated calls to shift will cause valid() to return false if the iterator's
     *   position ends up before the start of the Deque at the time iteration resumes.
     * - They will not cause the remaining values to be iterated over more than once or skipped.
     */
    public function getIterator(): \InternalIterator {}
    /** Returns the number of elements in the Deque. */
    public function count(): int {}
    /** Returns true if there are 0 elements in the Deque. */
    public function isEmpty(): bool {}
    /** Removes all elements from the Deque. */
    public function clear(): void {}

    public function __serialize(): array {}
    public function __unserialize(array $data): void {}
    /** Construct the Deque from the values of the array, ignoring keys */
    public static function __set_state(array $array): Deque {}

    /** Appends value(s) to the end of the Deque. */
    public function push(mixed ...$values): void {}
    /** Prepends value(s) to the start of the Deque. */
    public function unshift(mixed ...$values): void {}
    /**
     * Pops a value from the end of the Deque.
     * @throws \UnderflowException if the Deque is empty
     */
    public function pop(): mixed {}
    /**
     * Shifts a value from the start of the Deque.
     * @throws \UnderflowException if the Deque is empty
     */
    public function shift(): mixed {}

    /**
     * Peeks at the value at the start of the Deque.
     * @throws \UnderflowException if the Deque is empty
     */
    public function first(): mixed {}
    /**
     * Peeks at the value at the end of the Deque.
     * @throws \UnderflowException if the Deque is empty
     */
    public function last(): mixed {}

    /**
     * Returns a list of the elements from the start to the end.
     */
    public function toArray(): array {}

    /**
     * Insert 0 or more values at the given offset of the Deque.
     * @throws \OutOfBoundsException if the value of $offset is not within the bounds of this Deque.
     */
    public function insert(int $offset, mixed ...$values): void {}
    // The $offset parameters below must be mixed for compatibility with ArrayAccess
    /**
     * Returns the value at offset (int)$offset (relative to the start of the Deque)
     * @throws \OutOfBoundsException if the value of (int)$offset is not within the bounds of this Deque
     */
    public function offsetGet(mixed $offset): mixed {}
    /**
     * Returns true if `0 <= (int)$offset && (int)$offset < $this->count()`.
     */
    public function offsetExists(mixed $offset): bool {}
    /**
     * Sets the value at offset $offset (relative to the start of the Deque) to $value
     * @throws \OutOfBoundsException if the value of (int)$offset is not within the bounds of this Deque
     */
    public function offsetSet(mixed $offset, mixed $value): void {}
    /**
     * Removes the value at (int)$offset from the deque.
     * @throws \OutOfBoundsException if the value of (int)$offset is not within the bounds of this Deque.
     */
    public function offsetUnset(mixed $offset): void {}

    /**
     * This is JSON serialized as a JSON array with elements from the start to the end.
     */
    public function jsonSerialize(): array {}
}
```
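
As a usage sketch for the API above, the following breadth-first traversal relies only on `push()`, `shift()` and `isEmpty()`. `Collections\Deque` is only available on a PHP build with this PR applied, so the example falls back to `SplQueue`, which ships with PHP today and exposes the same three method names. The function name and the graph data are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Breadth-first traversal over an adjacency list, using only push(), shift() and isEmpty().
 * $graph maps a node name to the list of its neighbours (hypothetical example data).
 *
 * @param array<string, list<string>> $graph
 * @return list<string> nodes in the order they were visited
 */
function breadthFirstOrder(array $graph, string $start): array
{
    // Assumption: Collections\Deque exists only on a build with this PR applied.
    // SplQueue is used otherwise; both expose push(), shift() and isEmpty().
    $queue = class_exists('Collections\\Deque') ? new \Collections\Deque() : new \SplQueue();
    $queue->push($start);
    $seen = [$start => true];
    $order = [];
    while (!$queue->isEmpty()) {
        $node = $queue->shift();      // remove from the front
        $order[] = $node;
        foreach ($graph[$node] ?? [] as $next) {
            if (!isset($seen[$next])) {
                $seen[$next] = true;
                $queue->push($next);  // append to the back
            }
        }
    }
    return $order;
}

$graph = ['a' => ['b', 'c'], 'b' => ['d'], 'c' => ['d'], 'd' => []];
$order = breadthFirstOrder($graph, 'a');
```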

Earlier work on the implementation can be found at
https://github.com/TysonAndre/pecl-teds
(though `Teds\Deque` hasn't been updated with new names yet)

This was originally based on spl_fixedarray.c and previous work I did on an RFC.

Notable features of `Deque`

- Significantly lower memory usage and better performance than
  `SplDoublyLinkedList`
- Amortized constant time operations for push/pop/unshift/shift.
- Reclaims memory when only roughly a quarter of the capacity is in use,
  unlike array, which never releases allocated capacity. See
  https://www.npopov.com/2014/12/22/PHPs-new-hashtable-implementation.html

  > One problem with the current implementation is that arData never shrinks
  > (unless explicitly told to). So if you create an array with a few million
  > elements and remove them afterwards, the array will still take a lot of
  > memory. We should probably half the arData size if utilization falls below a
  > certain level.

  For long-running applications where a collection's maximum count
  is larger than its average count, capacity that is never released may be a concern.
- Adds functionality that cannot be implemented nearly as efficiently in
  an array. For example, prepending a single element to an array
  (and making it first in iteration order) with `array_unshift`
  takes linear time, because all elements in the array
  need to be moved to make room for the new first element.
- Supports `$deque[] = $element`, like ArrayObject.
- Having this functionality in php itself rather than a third party extension
  would encourage wider adoption of this

These operations are constant-time. Unlike array_shift/array_unshift,
they aren't actually shifting values in the representation,
and the new names are more self-explanatory and commonly used
in other Deque implementations.
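
For comparison, this short sketch shows what `array_unshift` and `array_shift` do to a plain list today: every remaining element is renumbered, which is why each call costs time proportional to the length of the array. The variable names are illustrative only.

```php
<?php
declare(strict_types=1);

$list = ['b', 'c', 'd'];

// Prepending renumbers every existing element: 'b' moves from key 0 to key 1, and so on.
array_unshift($list, 'a');
$keysAfterUnshift = array_keys($list);

// Removing from the front renumbers the remaining elements back down to 0..2.
$first = array_shift($list);
$keysAfterShift = array_keys($list);
```

A circular buffer avoids this renumbering by moving only the position of its start, leaving the stored elements where they are.
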
@CViniciusSDias
Contributor

Hey there, @TysonAndre . How are you doing?

Are there any updates on this RFC? I stumbled into a problem today where a better data structure (compared to PHP arrays) would be helpful so I remembered this.
:-D

Btw, thank you for all the effort you have put into this. <3
