Skip to content

Catalog Category Indexing takes very long on MariaDB 10.3 with many products #25199

Closed
@amenk

Description

@amenk

Description (*)

I know MariaDB 10.3 is not yet supported, as per dev docs, so I add this as feature requests, while it will manifest as a bug as soon as we support MariaDB 10.3

Magento should remove the quotes in the SELECT entity_id IN statement when creating indexes. The entity_ids are not strings, and this seems to confuse the optimizer.

Steps to reproduce:

  1. Change value for simple products to 50000 for example (setup/performance-toolkit/profiles/ce/small.xml);
  2. Run in console: bin/magento setup:perf:generate-fixtures /var/www/html/magento24/setup/performance-toolkit/profiles/ce/small.xml for example

Expected behavior (*)

Catalog Category Indexing should run quickly, on MariaDB 10.1 and on 10.3.18

Benefits

Shop ownes can use MariaDB 10.3.18

Additional information

See this answer on StackExchange: https://magento.stackexchange.com/a/293345/81

I took the full query which is run several times (for each store view) on the catalog category products indexer and extracted the select part

SELECT 2 AS `category_id`, `cp`.`entity_id` AS `product_id`, IF(ccp.product_id IS NOT NULL, ccp.position, 0) AS `position`, IF(ccp.product_id IS NOT NULL, 1, 0) AS `is_parent`, 2 AS `store_id`, IFNULL(cpvs.value, cpvd.value) AS `visibility` FROM `catalog_product_entity` AS `cp`
 INNER JOIN `catalog_product_website` AS `cpw` ON cpw.product_id = cp.entity_id
 INNER JOIN `catalog_product_entity_int` AS `cpsd` ON cpsd.entity_id = cp.entity_id AND cpsd.store_id = 0 AND cpsd.attribute_id = 97
 LEFT JOIN `catalog_product_entity_int` AS `cpss` ON cpss.entity_id = cp.entity_id AND cpss.attribute_id = cpsd.attribute_id AND cpss.store_id = 2
 INNER JOIN `catalog_product_entity_int` AS `cpvd` ON cpvd.entity_id = cp.entity_id AND cpvd.store_id = 0 AND cpvd.attribute_id = 99
 LEFT JOIN `catalog_product_entity_int` AS `cpvs` ON cpvs.entity_id = cp.entity_id AND cpvs.attribute_id = cpvd.attribute_id  AND cpvs.store_id = 2
 LEFT JOIN `catalog_category_product` AS `ccp` ON ccp.product_id = cp.entity_id WHERE (cpw.website_id = '1') AND (IFNULL(cpss.value, cpsd.value) = 1) AND           (IFNULL(cpvs.value, cpvd.value) IN (2, 3, 4)) AND (cp.entity_id IN ('107280', '107281', '107282', '107283', '107284'
    [... around 70 K entity IDs ...]
    187965', '187966')) GROUP BY `cp`.`entity_id`;

This query alone runs 8 minutes on Maria DB 10.3

When I simply remove all the quotes via s/'//g in vi in the line with the "IN" statement I get

SELECT 2 AS `category_id`, `cp`.`entity_id` AS `product_id`, IF(ccp.product_id IS NOT NULL, ccp.position, 0) AS `position`, IF(ccp.product_id IS NOT NULL, 1, 0) AS `is_parent`, 2 AS `store_id`, IFNULL(cpvs.value, cpvd.value) AS `visibility` FROM `catalog_product_entity` AS `cp`
 INNER JOIN `catalog_product_website` AS `cpw` ON cpw.product_id = cp.entity_id
 INNER JOIN `catalog_product_entity_int` AS `cpsd` ON cpsd.entity_id = cp.entity_id AND cpsd.store_id = 0 AND cpsd.attribute_id = 97
 LEFT JOIN `catalog_product_entity_int` AS `cpss` ON cpss.entity_id = cp.entity_id AND cpss.attribute_id = cpsd.attribute_id AND cpss.store_id = 2
 INNER JOIN `catalog_product_entity_int` AS `cpvd` ON cpvd.entity_id = cp.entity_id AND cpvd.store_id = 0 AND cpvd.attribute_id = 99
 LEFT JOIN `catalog_product_entity_int` AS `cpvs` ON cpvs.entity_id = cp.entity_id AND cpvs.attribute_id = cpvd.attribute_id  AND cpvs.store_id = 2
 LEFT JOIN `catalog_category_product` AS `ccp` ON ccp.product_id = cp.entity_id WHERE (cpw.website_id = '1') AND (IFNULL(cpss.value, cpsd.value) = 1) AND           (IFNULL(cpvs.value, cpvd.value) IN (2, 3, 4)) AND (cp.entity_id IN (107280, 107281, 107282, 107283, 107284
    [... around 70 K entity IDs ...]
    187965, 187966)) GROUP BY `cp`.`entity_id`;

Next steps

I will try to provide a patch.

Metadata

Metadata

Assignees

Labels

Auto-Tests: Not RequiredChanges in Pull Request does not require coverage by auto-testsComponent: IndexerFixed in 2.4.xThe issue has been fixed in 2.4-develop branchIssue: ConfirmedGate 3 Passed. Manual verification of the issue completed. Issue is confirmedIssue: Format is validGate 1 Passed. Automatic verification of issue format passedIssue: Ready for WorkGate 4. Acknowledged. Issue is added to backlog and ready for developmentPriority: P2A defect with this priority could have functionality issues which are not to expectations.Progress: ready for devReproduced on 2.4.xThe issue has been reproduced on latest 2.4-develop branchSeverity: S1Affects critical data or functionality and forces users to employ a workaround.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions