-
Notifications
You must be signed in to change notification settings - Fork 9.4k
Rewrote Magento\Framework\DB\Adapter\Pdo\Mysql->isTableExists #28516
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrote Magento\Framework\DB\Adapter\Pdo\Mysql->isTableExists #28516
Conversation
* Use INFORMATION_SCHEMA.TABLES instead of "SHOW TABLE STATUS" * Extend DDL cache for table existence and use it in isTableExists
Hi @jonashrem. Thank you for your contribution
❗ Automated tests can be triggered manually with an appropriate comment:
You can find more information about the builds here ℹ️ Please run only needed test builds instead of all when developing. Please run all test builds before sending your PR for review. For more details, please, review the Magento Contributor Guide documentation. |
@magento run all tests |
* Update Magento\Framework\DB\Adapter\Pdo\Mysql->isTableExists() to conform to code style guidelines (multi-line function calls, line length)
@magento run Static Tests |
to conform to code style guidelines (line length)
@magento run all tests |
@magento run Integration Tests |
The failing integration test doesn't seem related to this PR. |
Hi @aleron75, thank you for the review.
|
We tested the failing Integration test on 2.3.5-p1 and it didn't fail there, so I agree that this isn't related to our change. |
Pull Request state was updated. Re-review required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Integration test fails, even after restarting few times. Looks like your change caused regression in some specific case - image resize command fails in case if DB storage used.
@jonashrem could you try to run following test locally?
Lines 120 to 124 in 40a7876
public function testDatabaseStorageMissingFile() | |
{ | |
$this->tester->execute([]); | |
$this->assertStringContainsString('Product images resized successfully', $this->tester->getDisplay()); | |
} |
@magento give me 2.4-develop instance |
Hi @jonashrem. Thank you for your request. I'm working on Magento 2.4-develop instance for you |
Hi @jonashrem, here is your Magento instance. |
@ihor-sviziev thanks for the review. We tested this integrationtest locally on 2.3.5-p1 before. We now did another local test on our PR directly, but also couldn't reproduce this error. Can you check we run the test correctly or do we need to do some special Magento settings here (for preparing DB Storage for example)? We cloned our instance, did
|
@magento run all tests |
@magento run Functional Tests CE |
@magento run Functional Tests B2B |
Hi @ihor-sviziev, thank you for the review. |
Note for QA: the solution should be verified with cases when tables have been changed in DB. Example:
I suspect that we will get an error since the result is cached but the actual table was removed. |
One renaming and dropping tables, the cache will be cleared (
|
Hi @jonashrem, thank you for your contribution! |
update getNextAutoincrement to not use showTableStatus for performants reasons. See also magento#28516
Description
During heavier load on newer productive Magento2 installations (2.3 with MSI or 2.3.4+), we noticed increased execution time was spent in
SHOW TABLE STATUS
queries to MySQL. Analyzing this further, we found that these queries seemed to occur in higher frequency than in earlier Magento versions. This can be problematic, since the MySQL server uses an exclusive file system lock and a call to readdir() to answer them. Because of this, these calls can be slow. We also found that one web request can result in multiple calls ofSHOW TABLE STATUS
for the same tables, which seems very wasteful and inefficient.In more detailed analysis of some of our customers, we found instances where
SHOW TABLE STATUS
queries were executed nearly 100 times when requesting a single category page with 48 products.Looking at the code of the MySQL adapter, we identified several places that use the function
showTablesStatus()
, among themisTableExists()
.With this patch, we eliminate the calls of
SHOW TABLE STATUS
for the common case of checking table existence.Implementation details
In our implementation of
isTableExists()
, we use information from the INFORMATION_SCHEMA.TABLES table, which is provided without expensive file system operations and proved to be significantly faster in our tests (see below for details). Additionally, we use the already existing DDL cache to store a positive table existence, resulting in even fewer database calls.According to the official MySQL documentation[link], using the views from INFORMATION_SCHEMA is actually recommended over using SHOW queries, since this has some advantages. INFORMATION_SCHEMA views are read-only and respect the access rights of the database user, so it should be readily available on supported MySQL platforms.
The raw performance advantages of using INFORMATION_SCHEMA over
SHOW TABLES STATUS
can easily be measured with MySQL itself. We’ve used MySQL 5.7.30 on Ubuntu 18.04, withdatadir
running on a high performance NVME SSD (Samsung PM1725b).We’ve run 10 concurrent clients each executing 1,000,000 statements in a Magento database with demo data, once with the query
SHOW TABLES STATUS LIKE ‘catalog_product_entity’;
and once with our statementSELECT COUNT(1) AS tbl_exists FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = ‘catalog_product_entity’ AND TABLE_SCHEMA = DATABASE();
.Between these queries, we measured a difference of slightly over factor 10 in the execution time, favoring our solution with INFORMATION_SCHEMA. See below for details.
We assume the difference is even bigger when having more concurrent clients or slower hardware. As MySQL’s “SHOW TABLE STATUS” implementation is reading the whole directory, file system locking can have a stronger effect as well as which file system is used.
Our performance measurements:
time seq 1 10 | parallel -j 10 bash -c 'mysql < test1.sql > /dev/null’
time seq 1 10 | parallel -j 10 bash -c 'mysql < test2.sql > /dev/null’
Results:
For
SHOW TABLE STATUS
:For
INFORMATION_SCHEMA
:Manual testing scenarios
To see the impact our patch has, load the same set of pages on the frontend with and without the patch on any Magento instance (with real or demo data). The difference can be seen when analyzing the amount of SQL queries sent to the database and the time spent processing them. Here’s an easy way to do this:
Using Slow Query Log, you can make MySQL actually log all queries by setting the threshold to 0 seconds:
After you’ve opened the selected set of frontend pages, deactivate it again:
Now, using
pt-query-digest
from the Percona Toolkit on the slow query log file, you get an analysis of the logged queries, with queries grouped by type, their average and cumulative run time, and other information. Make sure to use two different log files so that they can easily be compared between M2 with and without our patch.In the analysis, you will usually see
SHOW TABLE STATUS
queries appearing in the top queries (this depends a bit on the module configuration and of course on which and how many pages you accessed for this test), meaning they're taking quite a good percentage of total execution time.Please see these extracts from the
pt-query-digest
output for an example, which we created when simply opening the single category page “Men > Tops” from the demo data (https://$BASEURL/men/tops-men.html?product_list_limit=36
).Analysis of SQL queries run, unpatched Magento:
You can see that queries of the type
SHOW TABLE STATUS
are the third most expensive queries with 12.5% of the total execution time inside the database and 3 calls in total just for this single page load.Patched Magento, opening the same category page:
We’ve had to include the full summary in this case to show that our implementation not only eliminates the common cause for
SHOW TABLE STATUS
queries, but also that the new queries to INFORMATION_SCHEMA.TABLES do not occur in the top 20 queries.This already illustrates that we’ve found a good solution to improve the performance of table existence checks. The complete impact can be estimated from observing production stores with many concurrent page views, where we see thousands of
SHOW TABLE STATUS
queries in 2–3 minutes.Questions or comments
Contribution checklist
Resolved issues: