Description
Preconditions (*)
magento/product-community-edition: 2.3.5
Ubuntu 18.04.4 LTS
Steps to reproduce (*)
Load up and save a number of products like the following
$skus = [1000 product skus];
$products = [];
foreach ($skus as $sku) {
$products[] = $this->productRepository->get($sku, true, 0, true);
}
foreach ($products as $product) {
$product->setPrice(12);
$this->productRepository->save($product);
}
Expected result (*)
It's a performance issue. I expect it to take a roughly equal amount of time to complete each product save.
Actual result (*)
Put a debug in \Magento\Indexer\Model\Indexer\CacheCleaner
looking at the value of $identities
in cleanCache()
. Watch the array grow increasingly bigger with each product save.
Instead of 1000 product cache cleans being performed, there are over 500000 cache cleans. This is because for the first save, the first product is cleaned. But then on the second product save, the second product is cleaned AND the first product is cleaned again. On the third save, all three are cleaned, and so on and so forth.
When a product is saved, priceReindexCallback
is called for each one. This ultimately calls module-catalog\Model\Indexer\Product\Price->executeRow
which adds the entity to the cacheContext.
In addition to this, there is a plugin in module-indexer for Magento\Framework\Indexer\ActionInterface
which says afterExecuteRow
clean the cache of all identities currently in the cacheContext.
But it never removes those entities from the cacheContext. Every product save adds a new entry to cacheContext and therefore cleans all previously cleaned products again.
My suggestion is that in Magento\Indexer\Model\Indexer\CacheCleaner
where it cleans the cache, also remove the identity from the CacheContext. This is all new functionality since there is presently no way to remove entities or tags from the CacheContext.
Either that or restructure the code to not use a shared instance of cacheContext, as below. My workaround is to create a plugin on Magento\Catalog\Model\Indexer\Product\Price
which is basically to replace the functions with a non-shared instance of cacheContext. I also have a similar function for aroundExecuteList
.
public function aroundExecuteRow(
\Magento\Catalog\Model\Indexer\Product\Price $subject,
$proceed,
$id
) {
//Generate a sandboxed cache context which won't keep filling up as we save products
$cacheContext = $this->cacheContextFactory->create();
//Call the original executeRow code
$this->productPriceIndexerRow->execute($id);
$cacheContext->registerEntities(ProductModel::CACHE_TAG, [$id]);
//Emulate Magento\Indexer\Model\Indexer\CacheCleaner
$this->eventManager->dispatch('clean_cache_by_tags', ['object' => $cacheContext]);
$identities = $cacheContext->getIdentities();
if (!empty($identities)) {
$this->appCache->clean($identities);
}
}
Edit: In case it matters, I've also had to plugin aroundExecute
for Magento\Catalog\Model\Indexer\Category\Product
and Magento\Catalog\Model\Indexer\Product\Category
as well.
Please provide Severity assessment for the Issue as Reporter. This information will help during Confirmation and Issue triage processes.
- Severity: S0 - Affects critical data or functionality and leaves users without workaround.
- Severity: S1 - Affects critical data or functionality and forces users to employ a workaround.
- Severity: S2 - Affects non-critical data or functionality and forces users to employ a workaround.
- Severity: S3 - Affects non-critical data or functionality and does not force users to employ a workaround.
- Severity: S4 - Affects aesthetics, professional look and feel, “quality” or “usability”.