Description
- asyncpg version: 0.21.0
- PostgreSQL version: 11.8 fedora
- Do you use a PostgreSQL SaaS? If so, which? Can you reproduce the issue with a local PostgreSQL install?: N/A
- Python version: 3.8.3
- Platform: Fedora 31
- Do you use pgbouncer?: no
- Did you install asyncpg with pip?: yes
- If you built asyncpg locally, which version of Cython did you use?: N/A
- Can the issue be reproduced under both asyncio and uvloop?: N/A

It appears that the implementation of set_type_codec() relies on the results of the TYPE_BY_NAME query, which is itself assumed to return a bytes value for the PostgreSQL "char" datatype.
I was previously unaware that PostgreSQL actually has two "char" variants, bpchar and "char", and that the documentation at https://magicstack.github.io/asyncpg/current/usage.html#type-conversion is talking about the "bpchar" datatype. That's fine. However, I was trying to normalize asyncpg's behavior against that of the psycopg2 and pg8000 drivers, both of which give you back a string for both of these types. (We have determined this is also a bug in those drivers: they fail to return arbitrary bytes for such a datatype, which was likely missed when they migrated to Python 3.) To that end, I tried setting up a type codec for "char" that would allow it to return strings:
await conn.set_type_codec(
    "char",
    schema="pg_catalog",
    encoder=lambda value: value,
    decoder=lambda value: value,
    format="text",
)
That works, but once you do that, you can no longer use the set_type_codec() method for anything else, because the behavior of the "char" type is now redefined outside of the assumptions made by is_scalar_type.
The example program below illustrates this failure when attempting to subsequently set up a codec for the JSONB datatype:
import asyncio
import json

import asyncpg


async def main(illustrate_bug):
    conn = await asyncpg.connect(
        user="scott", password="tiger", database="test"
    )

    if illustrate_bug:
        await conn.set_type_codec(
            "char",
            schema="pg_catalog",
            encoder=lambda value: value,
            decoder=lambda value: value,
            format="text",
        )

    await conn.set_type_codec(
        "jsonb",
        schema="pg_catalog",
        encoder=lambda value: value,
        decoder=json.loads,
        format="text",
    )


print("no bug")
asyncio.run(main(False))

print("bug")
asyncio.run(main(True))
output:
no bug
bug
Traceback (most recent call last):
  File "test3.py", line 35, in <module>
    asyncio.run(main(True))
  File "/opt/python-3.8.3/lib/python3.8/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/opt/python-3.8.3/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "test3.py", line 21, in main
    await conn.set_type_codec(
  File "/home/classic/.venv3/lib/python3.8/site-packages/asyncpg/connection.py", line 991, in set_type_codec
    raise ValueError(
ValueError: cannot use custom codec on non-scalar type pg_catalog.jsonb
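For what it's worth, the ValueError above is consistent with a str-versus-bytes mismatch in the scalar check. The sketch below is a standalone simulation, not asyncpg's actual code: it assumes the check compares the introspected type kind against bytes literals such as b"b", and the names SCALAR_TYPE_KINDS and is_scalar_type, along with the typeinfo dicts, are illustrative stand-ins.

```python
# Standalone simulation; these names mirror the issue text but are stand-ins,
# not imports from asyncpg.
SCALAR_TYPE_KINDS = (b"b", b"d", b"e")  # assumption: kinds are compared as bytes


def is_scalar_type(typeinfo):
    """Return True when the kind is a scalar kind and there is no element type."""
    return typeinfo["kind"] in SCALAR_TYPE_KINDS and not typeinfo["elemtype"]


# Default decoding: the "char" column of the introspection row is bytes.
jsonb_default = {"kind": b"b", "elemtype": 0}

# After installing a text codec for "char": the same column arrives as str.
jsonb_after_char_codec = {"kind": "b", "elemtype": 0}

print(is_scalar_type(jsonb_default))           # True
print(is_scalar_type(jsonb_after_char_codec))  # False
```

Under that assumption, jsonb itself has not changed; only the Python type of the value used to classify it has, which would be enough to trip the "non-scalar type" error.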
Since the "char" datatype is kind of an obscure construct, it's likely reasonable that asyncpg disallow setting up a type codec for this particular type, or perhaps it could emit a warning, but at the moment there doesn't seem to be documentation suggesting there are limitations on what kinds of type codecs can be constructed.
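To make the "emit a warning" idea concrete, here is a minimal sketch of a user-side guard. set_type_codec_guarded is a hypothetical helper, not part of asyncpg; it only assumes that the connection object exposes the set_type_codec() coroutine used above, and the warning text is made up for illustration.

```python
import warnings


async def set_type_codec_guarded(conn, typename, *, schema="public", **kwargs):
    """Hypothetical wrapper: warn before overriding pg_catalog."char"."""
    if typename == "char" and schema == "pg_catalog":
        warnings.warn(
            'a custom codec for pg_catalog."char" may break later '
            "set_type_codec() calls on this connection",
            stacklevel=2,
        )
    # Delegate to the real method; all other arguments pass through untouched.
    await conn.set_type_codec(typename, schema=schema, **kwargs)
```

A stricter variant could raise instead of warning; which is preferable depends on whether anyone legitimately needs a custom "char" codec.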
None of this is blocking us; it's just something we came across, and I hope it's helpful to the asyncpg project. cc @fantix