Skip to content

Rewrote Magento\Framework\DB\Adapter\Pdo\Mysql->isTableExists #28516

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

jonashrem
Copy link
Contributor

@jonashrem jonashrem commented Jun 4, 2020

Description

During heavier load on newer productive Magento2 installations (2.3 with MSI or 2.3.4+), we noticed increased execution time was spent in SHOW TABLE STATUS queries to MySQL. Analyzing this further, we found that these queries seemed to occur in higher frequency than in earlier Magento versions. This can be problematic, since the MySQL server uses an exclusive file system lock and a call to readdir() to answer them. Because of this, these calls can be slow. We also found that one web request can result in multiple calls of SHOW TABLE STATUS for the same tables, which seems very wasteful and inefficient.

In more detailed analysis of some of our customers, we found instances where SHOW TABLE STATUS queries were executed nearly 100 times when requesting a single category page with 48 products.

Looking at the code of the MySQL adapter, we identified several places that use the function showTablesStatus(), among them isTableExists().
With this patch, we eliminate the calls of SHOW TABLE STATUS for the common case of checking table existence.

Implementation details

In our implementation of isTableExists(), we use information from the INFORMATION_SCHEMA.TABLES table, which is provided without expensive file system operations and proved to be significantly faster in our tests (see below for details). Additionally, we use the already existing DDL cache to store a positive table existence, resulting in even fewer database calls.

According to the official MySQL documentation[link], using the views from INFORMATION_SCHEMA is actually recommended over using SHOW queries, since this has some advantages. INFORMATION_SCHEMA views are read-only and respect the access rights of the database user, so it should be readily available on supported MySQL platforms.

The raw performance advantages of using INFORMATION_SCHEMA over SHOW TABLES STATUS can easily be measured with MySQL itself. We’ve used MySQL 5.7.30 on Ubuntu 18.04, with datadir running on a high performance NVME SSD (Samsung PM1725b).
We’ve run 10 concurrent clients each executing 1,000,000 statements in a Magento database with demo data, once with the query SHOW TABLES STATUS LIKE ‘catalog_product_entity’; and once with our statement SELECT COUNT(1) AS tbl_exists FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = ‘catalog_product_entity’ AND TABLE_SCHEMA = DATABASE();.

Between these queries, we measured a difference of slightly over factor 10 in the execution time, favoring our solution with INFORMATION_SCHEMA. See below for details.
We assume the difference is even bigger when having more concurrent clients or slower hardware. As MySQL’s “SHOW TABLE STATUS” implementation is reading the whole directory, file system locking can have a stronger effect as well as which file system is used.

Our performance measurements:

  • create the files test1.sql and test2.sql with 1.000.000 rows of the statements to test.
  • run these commands
    • time seq 1 10 | parallel -j 10 bash -c 'mysql < test1.sql > /dev/null’
    • time seq 1 10 | parallel -j 10 bash -c 'mysql < test2.sql > /dev/null’

Results:
For SHOW TABLE STATUS:

# time seq 1 10 |  parallel  -j 10 bash -c ' mysql   < test1.sql  > /dev/null’ 

real	85m21.374s
user	3m5.500s
sys	1m40.317s

For INFORMATION_SCHEMA:

time seq 1 10 |  parallel  -j 10 bash -c ' mysql   < test2.sql > /dev/null 


real	8m5.184s
user	1m54.587s
sys	1m11.687s

Manual testing scenarios

To see the impact our patch has, load the same set of pages on the frontend with and without the patch on any Magento instance (with real or demo data). The difference can be seen when analyzing the amount of SQL queries sent to the database and the time spent processing them. Here’s an easy way to do this:

Using Slow Query Log, you can make MySQL actually log all queries by setting the threshold to 0 seconds:

SET @@GLOBAL.long_query_time = 0; SET @@GLOBAL.slow_query_log = 1;

After you’ve opened the selected set of frontend pages, deactivate it again:

SET @@GLOBAL.slow_query_log = 0;

Now, using pt-query-digest from the Percona Toolkit on the slow query log file, you get an analysis of the logged queries, with queries grouped by type, their average and cumulative run time, and other information. Make sure to use two different log files so that they can easily be compared between M2 with and without our patch.

In the analysis, you will usually see SHOW TABLE STATUS queries appearing in the top queries (this depends a bit on the module configuration and of course on which and how many pages you accessed for this test), meaning they're taking quite a good percentage of total execution time.

Please see these extracts from the pt-query-digest output for an example, which we created when simply opening the single category page “Men > Tops” from the demo data (https://$BASEURL/men/tops-men.html?product_list_limit=36).

Analysis of SQL queries run, unpatched Magento:

# Profile
# Rank Query ID                         Response time Calls R/Call V/M   I
# ==== ================================ ============= ===== ====== ===== =
#    1 0xAADD55611E492BA965AF2F067A0...  0.0099 34.3%    22 0.0004  0.00 SELECT catalog_product_index_eav search_tmp_?ed?d?dcc?b?_? cataloginventory_stock_status
#    2 0x7009A071F49236A84A57562D461...  0.0050 17.2%     1 0.0050  0.00 SELECT catalog_product_entity catalog_category_product_index_store? catalog_product_index_price review_entity_summary review_entity cataloginventory_stock_status search_tmp_?ed?d?dccc?_?
#    3 0x8085D806F3631D0D30FE5C20326...  0.0036 12.5%     3 0.0012  0.00 SHOW TABLE STATUS
#    4 0x13262D434B03958CB4F62EC706B...  0.0011  3.9%     1 0.0011  0.00 INSERT SELECT search_tmp_?ed?d?dcc?b?_? catalog_product_index_eav cataloginventory_stock_status catalog_category_product_index_store?
#    5 0x2F9EDAB00084ED0A7EE1463DBA4...  0.0009  3.1%    72 0.0000  0.00 SELECT inventory_stock_sales_channel
[...]

You can see that queries of the type SHOW TABLE STATUS are the third most expensive queries with 12.5% of the total execution time inside the database and 3 calls in total just for this single page load.

Patched Magento, opening the same category page:

# Profile
# Rank Query ID                         Response time Calls R/Call V/M   I
# ==== ================================ ============= ===== ====== ===== =
#    1 0xAADD55611E492BA965AF2F067A0...  0.0105 29.9%    22 0.0005  0.00 SELECT catalog_product_index_eav search_tmp_?ed?cf?f?c?a?_? cataloginventory_stock_status
#    2 0x8EA4DF263A535F59EAC6ED75B20...  0.0073 20.7%    24 0.0003  0.00 SELECT catalog_product_entity catalog_product_super_link catalog_product_website catalog_product_entity_int cataloginventory_stock_status
#    3 0x7009A071F49236A84A57562D461...  0.0050 14.2%     1 0.0050  0.00 SELECT catalog_product_entity catalog_category_product_index_store? catalog_product_index_price review_entity_summary review_entity cataloginventory_stock_status search_tmp_?ed?cf?f?e?b?_?
#    4 0x13262D434B03958CB4F62EC706B...  0.0015  4.3%     1 0.0015  0.00 INSERT SELECT search_tmp_?ed?cf?f?c?a?_? catalog_product_index_eav cataloginventory_stock_status catalog_category_product_index_store?
#    5 0xCB3E7FA2C499AC0B69ACCD06C21...  0.0012  3.4%     2 0.0006  0.00 SELECT catalog_product_index_price search_tmp_?ed?cf?f?c?a?_?
#    6 0x2F9EDAB00084ED0A7EE1463DBA4...  0.0009  2.7%    72 0.0000  0.00 SELECT inventory_stock_sales_channel
#    7 0x593763BE686057BEA87DB0BC232...  0.0008  2.4%    72 0.0000  0.00 SELECT inventory_stock_sales_channel
#    8 0x23B42F57B465E32E1732E4070E9...  0.0008  2.2%    72 0.0000  0.00 SELECT inventory_stock
#    9 0x11C6E1A6D7347EEAF29EBEA1350...  0.0008  2.2%     1 0.0008  0.00 SELECT catalog_product_index_eav cataloginventory_stock_status catalog_category_product_index_store?
#   10 0xA788DCC6FE9679C3FDABA74662A...  0.0007  1.9%     1 0.0007  0.00 SELECT catalog_product_entity catalog_category_product_index_store? catalog_product_index_price cataloginventory_stock_status search_tmp_?ed?cf?f?e?b?_?
#   11 0x961084F55005407CD1C1E76C0BE...  0.0006  1.7%     2 0.0003  0.00 SELECT eav_attribute_option_swatch
#   12 0xE29FDCAD0B27AFD15D17697A415...  0.0005  1.5%     1 0.0005  0.00 SELECT catalog_product_index_price search_tmp_?ed?cf?f?c?a?_?
#   13 0x233B56CFFF59C63BF02E08D9654...  0.0005  1.4%     2 0.0002  0.00 CREATE TABLE search_tmp_?ed?cf?f?e?b?_? `search_tmp_5ed8cf6f89e2b9_26325963`
#   14 0x2F0A3214F96ED5CD4E962B1E990...  0.0005  1.3%    24 0.0000  0.00 SELECT catalog_product_super_attribute catalog_product_entity catalog_product_super_link eav_attribute catalog_product_entity catalog_product_entity_int catalog_product_super_attribute_label eav_attribute_option
#   15 0xDD45A442CF95F62A11D72D7D0F8...  0.0004  1.2%     1 0.0004  0.00 SELECT catalog_category_product_index_store? search_tmp_?ed?cf?f?c?a?_? catalog_category_entity
#   16 0x616C07D9410BE0BBA2EBCD6D77C...  0.0003  0.8%    21 0.0000  0.00 SELECT eav_attribute
#   17 0xB3E930152F4418A1C656779CE44...  0.0003  0.7%    12 0.0000  0.00 SELECT catalog_product_super_attribute
#   18 0xDFE08A11F72783E309DFA58BA8F...  0.0003  0.7%    12 0.0000  0.00 SELECT catalog_category_product
#   19 0x58EA3033F7E981C7FACDF0DE8D3...  0.0002  0.7%    20 0.0000  0.00 SELECT catalog_eav_attribute
#   20 0x13867189A41845FE8E8C4366F7B...  0.0002  0.5%    12 0.0000  0.00 SELECT catalog_product_super_attribute_label
# MISC 0xMISC                            0.0020  5.6%    74 0.0000   0.0 <38 ITEMS>

We’ve had to include the full summary in this case to show that our implementation not only eliminates the common cause for SHOW TABLE STATUS queries, but also that the new queries to INFORMATION_SCHEMA.TABLES do not occur in the top 20 queries.
This already illustrates that we’ve found a good solution to improve the performance of table existence checks. The complete impact can be estimated from observing production stores with many concurrent page views, where we see thousands of SHOW TABLE STATUS queries in 2–3 minutes.

Questions or comments

  • The commit in this pull request is a combined work of @jonashrem and @leeps.
  • We haven’t created an issue for this improvement, but can do that if it’s needed/wanted.

Contribution checklist

  • Pull request has a meaningful description of its purpose
  • All commits are accompanied by meaningful commit messages
  • All new or changed code is covered with unit/integration tests (if applicable)
  • All automated tests passed successfully (all builds are green)

Resolved issues:

  1. resolves [Issue] Rewrote Magento\Framework\DB\Adapter\Pdo\Mysql->isTableExists #29662: Rewrote Magento\Framework\DB\Adapter\Pdo\Mysql->isTableExists

* Use INFORMATION_SCHEMA.TABLES instead of "SHOW TABLE STATUS"
* Extend DDL cache for table existence and use it in isTableExists
@m2-assistant
Copy link

m2-assistant bot commented Jun 4, 2020

Hi @jonashrem. Thank you for your contribution
Here is some useful tips how you can test your changes using Magento test environment.
Add the comment under your pull request to deploy test or vanilla Magento instance:

  • @magento give me test instance - deploy test instance based on PR changes
  • @magento give me 2.4-develop instance - deploy vanilla Magento instance

❗ Automated tests can be triggered manually with an appropriate comment:

  • @magento run all tests - run or re-run all required tests against the PR changes
  • @magento run <test-build(s)> - run or re-run specific test build(s)
    For example: @magento run Unit Tests

<test-build(s)> is a comma-separated list of build names. Allowed build names are:

  1. Database Compare
  2. Functional Tests CE
  3. Functional Tests EE,
  4. Functional Tests B2B
  5. Integration Tests
  6. Magento Health Index
  7. Sample Data Tests CE
  8. Sample Data Tests EE
  9. Sample Data Tests B2B
  10. Static Tests
  11. Unit Tests
  12. WebAPI Tests

You can find more information about the builds here

ℹ️ Please run only needed test builds instead of all when developing. Please run all test builds before sending your PR for review.

For more details, please, review the Magento Contributor Guide documentation.

@jonashrem
Copy link
Contributor Author

@magento run all tests

* Update Magento\Framework\DB\Adapter\Pdo\Mysql->isTableExists()
  to conform to code style guidelines (multi-line function calls, line length)
@jonashrem
Copy link
Contributor Author

@magento run Static Tests

  to conform to code style guidelines (line length)
@jonashrem
Copy link
Contributor Author

@magento run all tests

@jonashrem
Copy link
Contributor Author

@magento run Integration Tests

aleron75
aleron75 previously approved these changes Jun 5, 2020
@aleron75
Copy link
Contributor

aleron75 commented Jun 5, 2020

The failing integration test doesn't seem related to this PR.
The affected method is used in a lot of tests so to me this satisfies test coverage.
Thus I approved this PR.

@aleron75 aleron75 added the Auto-Tests: Not Required Changes in Pull Request does not require coverage by auto-tests label Jun 5, 2020
@magento-engcom-team
Copy link
Contributor

Hi @aleron75, thank you for the review.
ENGCOM-7622 has been created to process this Pull Request
✳️ @aleron75, could you please add one of the following labels to the Pull Request?

Label Description
Auto-Tests: Covered All changes in Pull Request is covered by auto-tests
Auto-Tests: Not Covered Changes in Pull Request requires coverage by auto-tests
Auto-Tests: Not Required Changes in Pull Request does not require coverage by auto-tests

@jonashrem
Copy link
Contributor Author

We tested the failing Integration test on 2.3.5-p1 and it didn't fail there, so I agree that this isn't related to our change.

@ghost ghost dismissed aleron75’s stale review June 19, 2020 07:12

Pull Request state was updated. Re-review required.

Copy link
Contributor

@ihor-sviziev ihor-sviziev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integration test fails, even after restarting few times. Looks like your change caused regression in some specific case - image resize command fails in case if DB storage used.

@jonashrem could you try to run following test locally?

public function testDatabaseStorageMissingFile()
{
$this->tester->execute([]);
$this->assertStringContainsString('Product images resized successfully', $this->tester->getDisplay());
}

@jonashrem
Copy link
Contributor Author

@magento give me 2.4-develop instance

@magento-engcom-team
Copy link
Contributor

Hi @jonashrem. Thank you for your request. I'm working on Magento 2.4-develop instance for you

@magento-engcom-team
Copy link
Contributor

Hi @jonashrem, here is your Magento instance.
Admin access: https://i-28516-2-4-develop.instances.magento-community.engineering/admin_6f30
Login: 0f11a9f8 Password: 4e5687cd5838
Instance will be terminated in up to 3 hours.

@jonashrem
Copy link
Contributor Author

@ihor-sviziev thanks for the review.

We tested this integrationtest locally on 2.3.5-p1 before.

We now did another local test on our PR directly, but also couldn't reproduce this error.

Can you check we run the test correctly or do we need to do some special Magento settings here (for preparing DB Storage for example)?

We cloned our instance, did composer install and run the test.

dev/tests/integration # ../../../vendor/bin/phpunit --testdox testsuite/Magento/MediaStorage/Console/Command/ImageResizeCommandTest.php
PHPUnit 9.1.5 by Sebastian Bergmann and contributors.

Warning:       Using a custom test suite loader is deprecated

Image Resize Command (Magento\MediaStorage\Console\Command\ImageResizeCommand)
 ✔ Run resize with missing file
 ✔ Execute with zero byte image
 ✔ Database storage missing file

Time: 00:19.762, Memory: 111.00 MB

OK (3 tests, 3 assertions)

=== Memory Usage System Stats ===
Memory usage (OS):      141.02M (129.38% of 109.00M reported by PHP)
Estimated memory leak:  32.02M (22.71% of used memory)
dev/tests/integration # git remote -v                                                                                                          
origin  https://github.com/jonashrem/magento2.git (fetch)
origin  https://github.com/jonashrem/magento2.git (push)
dev/tests/integration # git branch
  2.4-develop
* faster-isTableExists
dev/tests/integration # cat etc/install-config-mysql.php
<?php
/**
 * Copyright © Magento, Inc. All rights reserved.
 * See COPYING.txt for license details.
 */

return [
    'db-host' => 'localhost',
    'db-user' => 'db-user-4',
    'db-password' => '*',
    'db-name' => 'db-4',
    'db-prefix' => '',
    'backend-frontname' => 'backend',
    'admin-user' => \Magento\TestFramework\Bootstrap::ADMIN_NAME,
    'admin-password' => \Magento\TestFramework\Bootstrap::ADMIN_PASSWORD,
    'admin-email' => \Magento\TestFramework\Bootstrap::ADMIN_EMAIL,
    'admin-firstname' => \Magento\TestFramework\Bootstrap::ADMIN_FIRSTNAME,
    'admin-lastname' => \Magento\TestFramework\Bootstrap::ADMIN_LASTNAME,
//    'amqp-host' => 'localhost',
//    'amqp-port' => '5672',
//    'amqp-user' => 'guest',
//    'amqp-password' => 'guest',
];

@engcom-Charlie
Copy link
Contributor

@magento run all tests

@jonashrem
Copy link
Contributor Author

@magento run Functional Tests CE

@jonashrem
Copy link
Contributor Author

@magento run Functional Tests B2B

@magento magento deleted a comment from magento-engcom-team Sep 8, 2020
@magento-engcom-team
Copy link
Contributor

Hi @ihor-sviziev, thank you for the review.
ENGCOM-7622 has been created to process this Pull Request

@sidolov
Copy link
Contributor

sidolov commented Sep 11, 2020

Note for QA: the solution should be verified with cases when tables have been changed in DB.

Example:

  1. Check table exists for the first time (cache is generated)
  2. Remove or rename a table
  3. Check table presence once again.

I suspect that we will get an error since the result is cached but the actual table was removed.

@jonashrem
Copy link
Contributor Author

Note for QA: the solution should be verified with cases when tables have been changed in DB.

Example:

1. Check table exists for the first time (cache is generated)

2. Remove or rename a table

3. Check table presence once again.

I suspect that we will get an error since the result is cached but the actual table was removed.

One renaming and dropping tables, the cache will be cleared (

$this->resetDdlCache($oldTableName, $schemaName);
and
$this->resetDdlCache($tableName, $schemaName);
), so both should be fine.

@engcom-Echo
Copy link
Contributor

engcom-Echo commented Sep 28, 2020

✔️ QA Passed

Before: ✖️
image

After: ✔️

The request was not included in the list as the share of its participation was very small.

@engcom-Echo engcom-Echo added the QA: Ready to add to Regression Scope Should be analyzed and added to Regression Testing Scope(if applicable) label Sep 28, 2020
@ihor-sviziev ihor-sviziev added the Area: Perf/Frontend All tickets related with improving frontend performance. label Sep 29, 2020
@engcom-Echo engcom-Echo self-assigned this Sep 30, 2020
@engcom-Alfa engcom-Alfa self-assigned this Sep 30, 2020
@engcom-Alfa engcom-Alfa added QA: Added to Regression Scope Scenario was analysed and added to Regression Testing Scope and removed QA: Ready to add to Regression Scope Should be analyzed and added to Regression Testing Scope(if applicable) labels Sep 30, 2020
@magento-engcom-team magento-engcom-team merged commit 6d89270 into magento:2.4-develop Sep 30, 2020
@m2-assistant
Copy link

m2-assistant bot commented Sep 30, 2020

Hi @jonashrem, thank you for your contribution!
Please, complete Contribution Survey, it will take less than a minute.
Your feedback will help us to improve contribution process.

jonashrem added a commit to jonashrem/magento2 that referenced this pull request Jul 6, 2021
update getNextAutoincrement to not use showTableStatus for performants reasons. See also magento#28516
@jonashrem jonashrem mentioned this pull request Jul 6, 2021
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Perf/Frontend All tickets related with improving frontend performance. Auto-Tests: Not Required Changes in Pull Request does not require coverage by auto-tests Award: bug fix Award: category of expertise Component: DB Priority: P2 A defect with this priority could have functionality issues which are not to expectations. Progress: accept QA: Added to Regression Scope Scenario was analysed and added to Regression Testing Scope Release Line: 2.4 Severity: S2 Major restrictions or short-term circumventions are required until a fix is available. Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

[Issue] Rewrote Magento\Framework\DB\Adapter\Pdo\Mysql->isTableExists