getElementsByTagName returns collections with tagName-based indexing, causing loss of elements when converted to arrays #17572

Closed
@blar

Description:
Using getElementsByTagName in Dom\XMLDocument and Dom\HTMLDocument leads to inconsistent indexing of the returned collection, which causes elements to be lost when the result is converted to an array.

Problem:
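The Dom\HTMLCollection returned by getElementsByTagName uses the tagName of each element as the iteration key instead of a numeric index. The Dom\NodeList returned by querySelectorAll is not affected and uses numeric keys.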

Impact:
When the result of getElementsByTagName is converted to an array using iterator_to_array($elements), only one element is included in the resulting array, because the tagName is used as the key and it is the same for all elements.
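
To illustrate the mechanism, here is a minimal sketch that does not use the DOM extension at all. The generator `same_key_items()` is a hypothetical stand-in for a collection whose iterator yields the same string key for every element; it only shows how iterator_to_array() behaves when keys repeat.

```php
<?php

// Hypothetical stand-in for a collection that yields the key 'p'
// for every element, as described in this report.
function same_key_items(): Generator
{
    foreach ([0, 1, 2, 3, 4] as $id) {
        yield 'p' => "element #$id";
    }
}

// With preserve_keys = true (the default), every item is written to the
// same array key, so each one overwrites the previous one.
$collapsed = iterator_to_array(same_key_items(), true);

printf("Count: %d, Keys: %s\n", count($collapsed), implode(', ', array_keys($collapsed)));
// Only the last yielded value survives under the key 'p'.
printf("Surviving value: %s\n", $collapsed['p']);
```

Five items are yielded, but the resulting array has a single entry, which matches the "Count2: 1, Keys: p" lines in the output below.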

Expected Behavior:
The index should be numeric so that all elements are included when iterating or converting the collection to an array.

Workaround:
As a temporary workaround, using iterator_to_array($elements, false) allows all elements to be included in the array.
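
A minimal sketch of that workaround wrapped in a helper follows. The function name `elements_to_list()` is hypothetical and not part of PHP. The demo uses the classic DOMDocument only so that it also runs on PHP versions without the Dom\ namespace; the assumption is that the same call is passed a collection from Dom\HTMLDocument or Dom\XMLDocument on PHP 8.4.

```php
<?php

// Hypothetical helper: returns a zero-indexed list no matter which keys
// the collection's iterator yields, by discarding the keys.
function elements_to_list(Traversable $elements): array
{
    return iterator_to_array($elements, false);
}

// Build a small document with three <p> elements.
$document = new DOMDocument();
$root = $document->appendChild($document->createElement('html'));
for ($i = 0; $i < 3; $i++) {
    $root->appendChild($document->createElement('p'));
}

$list = elements_to_list($root->getElementsByTagName('p'));

printf("Count: %d, Keys: %s\n", count($list), implode(', ', array_keys($list)));
```

Because the keys are discarded, the result does not depend on what the iterator reports as the key, so the helper should keep working unchanged once the indexing is fixed.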

The following code:


<?php

use Dom\HTMLDocument;
use Dom\XMLDocument;

function create_document(
    $document
) {
    $html = $document->createElement('html');
    $document->appendChild($html);

    for($i = 0; $i < 5; $i++) {
        $p = $document->createElement('p');
        // setAttribute() expects string arguments, so cast the counter.
        $p->setAttribute('id', (string) $i);
        $html->appendChild($p);
    }
}

function test_document(
    $document
) {
    $html = $document->documentElement;
    printf("Class: %s\n", $document::class);

    foreach(['getElementsByTagName', 'querySelectorAll'] as $methodName) {
        if(!method_exists($document, $methodName)) {
            continue;
        }
        printf("    %s:\n", $methodName);
        $elements = $html->$methodName('p');
        printf("        Count1: %d\n", count($elements));

        $keys = array_keys(iterator_to_array($elements, true));
        printf("        Count2: %d, Keys: %s\n",
            count($keys),
            implode(', ', $keys)
        );
        $keys = array_keys(iterator_to_array($elements, false));
        printf("        Count3: %d, Keys: %s\n",
            count($keys),
            implode(', ', $keys)
        );
    }
}

foreach([DOMDocument::class, XMLDocument::class, HTMLDocument::class] as $className) {
    if(method_exists($className, 'createEmpty')) {
        $document = $className::createEmpty();
    }
    else {
        $document = new $className();
    }

    create_document($document);
    test_document($document);
}

Resulted in this output:

Class: DOMDocument
    getElementsByTagName:
        Count1: 5
        Count2: 5, Keys: 0, 1, 2, 3, 4
        Count3: 5, Keys: 0, 1, 2, 3, 4
Class: Dom\XMLDocument
    getElementsByTagName:
        Count1: 5
        Count2: 1, Keys: p
        Count3: 5, Keys: 0, 1, 2, 3, 4
    querySelectorAll:
        Count1: 5
        Count2: 5, Keys: 0, 1, 2, 3, 4
        Count3: 5, Keys: 0, 1, 2, 3, 4
Class: Dom\HTMLDocument
    getElementsByTagName:
        Count1: 5
        Count2: 1, Keys: p
        Count3: 5, Keys: 0, 1, 2, 3, 4
    querySelectorAll:
        Count1: 5
        Count2: 5, Keys: 0, 1, 2, 3, 4
        Count3: 5, Keys: 0, 1, 2, 3, 4

But I expected this output instead:

Class: DOMDocument
    getElementsByTagName:
        Count1: 5
        Count2: 5, Keys: 0, 1, 2, 3, 4
        Count3: 5, Keys: 0, 1, 2, 3, 4
Class: Dom\XMLDocument
    getElementsByTagName:
        Count1: 5
        Count2: 5, Keys: 0, 1, 2, 3, 4
        Count3: 5, Keys: 0, 1, 2, 3, 4
    querySelectorAll:
        Count1: 5
        Count2: 5, Keys: 0, 1, 2, 3, 4
        Count3: 5, Keys: 0, 1, 2, 3, 4
Class: Dom\HTMLDocument
    getElementsByTagName:
        Count1: 5
        Count2: 5, Keys: 0, 1, 2, 3, 4
        Count3: 5, Keys: 0, 1, 2, 3, 4
    querySelectorAll:
        Count1: 5
        Count2: 5, Keys: 0, 1, 2, 3, 4
        Count3: 5, Keys: 0, 1, 2, 3, 4

PHP Version

PHP 8.4.2

Operating System

Alpine 3.21
