- asyncpg version: 0.22.0
- PostgreSQL version: 11.10
- Do you use a PostgreSQL SaaS? If so, which? Can you reproduce the issue with a local PostgreSQL install?: No, and I was able to reproduce it locally.
- Python version: 3.7.9
- Platform: Fedora 31
- Do you use pgbouncer?: No
- Did you install asyncpg with pip?: Yes
- If you built asyncpg locally, which version of Cython did you use?:
- Can the issue be reproduced under both asyncio and uvloop?: We use Starlette for serving API requests.
We have a large (~30 million rows) database table. I have noticed that the SQL runs terribly slowly when executed via `asyncpg`, but very fast via `psql`.
Schema for the table:

```sql
CREATE TABLE addresses(
    zipcode VARCHAR(12),
    line1 text,
    -- 13x more fields of text type
);

CREATE INDEX z_idx ON addresses(zipcode, line1);
```
Query to execute:

```sql
SELECT line_1, ..., zipcode FROM addresses WHERE REPLACE(zipcode, ' ', '')=$1 GROUP BY A, B, line1 ORDER BY A, B
```
This query never returns more than 100 rows (out of the 30 million mentioned).
- Time to execute via `psql`: less than 1 ms
- Same query via `asyncpg`: more than 6 seconds (!)
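
For reference, this is roughly how the query is executed from `asyncpg` (a minimal, self-contained sketch; the DSN, the parameter value, and the trimmed-down column list are placeholders rather than our actual Starlette code):

```python
import asyncio
import time

import asyncpg

# Placeholder DSN and zipcode value, for illustration only.
DSN = "postgresql://user:password@localhost/mydb"

# Simplified version of the query above (the full column list is omitted).
QUERY = """
    SELECT line1, zipcode
    FROM addresses
    WHERE REPLACE(zipcode, ' ', '') = $1
    GROUP BY line1, zipcode
    ORDER BY zipcode
"""

async def main():
    conn = await asyncpg.connect(DSN)
    try:
        start = time.perf_counter()
        rows = await conn.fetch(QUERY, "90210")  # value is bound to $1
        elapsed = time.perf_counter() - start
        print(f"{len(rows)} rows in {elapsed:.3f}s")
    finally:
        await conn.close()

asyncio.run(main())
```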
I have not looked at the source code of `asyncpg`, so I'm not sure what's going on here. Can someone tell me why `asyncpg` runs this query so slowly? Thanks!
FYI, I have fixed this "temporarily" with some quick data normalisation: I added a `zipcode_x` column that contains no spaces, so I could get rid of the SQL `REPLACE` function.
```sql
SELECT line_1, ..., zipcode FROM addresses WHERE zipcode_x=$1 GROUP BY A, B, line1 ORDER BY A, B
```
and now `asyncpg` is very fast (as it should be).
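
The normalisation itself was nothing fancy; roughly the following one-off migration (sketch only: the DSN, the index name `zx_idx`, and the exact statements are illustrative rather than my actual migration):

```python
import asyncio

import asyncpg

# Placeholder DSN, for illustration only.
DSN = "postgresql://user:password@localhost/mydb"

async def normalise():
    conn = await asyncpg.connect(DSN)
    try:
        # Add a space-free copy of the zipcode.
        await conn.execute(
            "ALTER TABLE addresses ADD COLUMN IF NOT EXISTS zipcode_x VARCHAR(12)"
        )
        # One-off backfill; on a ~30 million row table this is a heavy statement.
        await conn.execute(
            "UPDATE addresses SET zipcode_x = REPLACE(zipcode, ' ', '')"
        )
        # Index the new column so the equality lookup stays fast.
        await conn.execute(
            "CREATE INDEX IF NOT EXISTS zx_idx ON addresses(zipcode_x, line1)"
        )
    finally:
        await conn.close()

asyncio.run(normalise())
```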