set_type_codec() appears to assume a particular set_type_codec for the "char" datatype


* **asyncpg version**:  0.21.0

* **PostgreSQL version**:  11.8 fedora 

* **Do you use a PostgreSQL SaaS?  If so, which?  Can you reproduce
  the issue with a local PostgreSQL install?**: N/A

* **Python version**: 3.8.3
* **Platform**: Fedora 31

* **Do you use pgbouncer?**: no

* **Did you install asyncpg with pip?**: yes

* **If you built asyncpg locally, which version of Cython did you use?**: N/A

* **Can the issue be reproduced under both asyncio and
  [uvloop](https://github.com/magicstack/uvloop)?**: N/A

It appears that the implementation of set_type_codec() relies upon the results of the query [TYPE_BY_NAME](https://github.com/MagicStack/asyncpg/blob/2bac166c1ba098b9ebdfca3dc5b8264ae850213c/asyncpg/introspection.py#L137), which in turn is assumed to return a bytes value for columns of the PostgreSQL "char" datatype.

I was previously unaware that PostgreSQL actually has two "char" variants, bpchar and "char", and that the documentation at https://magicstack.github.io/asyncpg/current/usage.html#type-conversion is talking about the "bpchar" datatype. That's fine. However, the psycopg2 and pg8000 drivers will both give you back a string for both of these types (we have determined this is also a bug in those drivers, as they fail to return arbitrary bytes for such a datatype; it was likely missed when they migrated to Python 3). When trying to normalize asyncpg's behavior against theirs, I tried setting up a type codec for "char" that would allow it to return strings:

        await conn.set_type_codec(
            "char",
            schema="pg_catalog",
            encoder=lambda value: value,
            decoder=lambda value: value,
            format="text",
        )

That works, but once you do that, you can no longer use the ``set_type_codec`` method for anything else, because the behavior of the type is redefined outside of the assumptions made by [is_scalar_type](https://github.com/MagicStack/asyncpg/blob/2bac166c1ba098b9ebdfca3dc5b8264ae850213c/asyncpg/introspection.py#L154).

The example program below illustrates this failure when attempting to subsequently set up a codec for the JSONB datatype:

```python
import asyncio
import json

import asyncpg


async def main(illustrate_bug):
    conn = await asyncpg.connect(
        user="scott", password="tiger", database="test"
    )

    if illustrate_bug:
        await conn.set_type_codec(
            "char",
            schema="pg_catalog",
            encoder=lambda value: value,
            decoder=lambda value: value,
            format="text",
        )

    await conn.set_type_codec(
        "jsonb",
        schema="pg_catalog",
        encoder=lambda value: value,
        decoder=json.loads,
        format="text",
    )


print("no bug")
asyncio.run(main(False))


print("bug")
asyncio.run(main(True))

```

output:

```
no bug
bug
Traceback (most recent call last):
  File "test3.py", line 35, in <module>
    asyncio.run(main(True))
  File "/opt/python-3.8.3/lib/python3.8/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/opt/python-3.8.3/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "test3.py", line 21, in main
    await conn.set_type_codec(
  File "/home/classic/.venv3/lib/python3.8/site-packages/asyncpg/connection.py", line 991, in set_type_codec
    raise ValueError(
ValueError: cannot use custom codec on non-scalar type pg_catalog.jsonb

```
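The ValueError is consistent with a check that compares the introspected type kind against bytes literals. Below is a minimal sketch of that failure mode, not asyncpg's actual code: `looks_scalar` and `SCALAR_KINDS` are illustrative names, and the row dictionaries stand in for what the TYPE_BY_NAME query would return for jsonb, assuming it exposes a `kind` column of type "char" and an `elemtype` column.

```python
# Simplified stand-in for the scalar-type check. The names here are
# illustrative and are not asyncpg's actual internals.
SCALAR_KINDS = (b"b", b"d", b"e")  # base, domain, enum, compared as bytes


def looks_scalar(typeinfo):
    """Return True if the introspected row describes a scalar type."""
    return typeinfo["kind"] in SCALAR_KINDS and not typeinfo["elemtype"]


# Default behavior: "char" columns decode to bytes, so the check passes.
default_row = {"kind": b"b", "elemtype": 0}
print(looks_scalar(default_row))

# With a str-returning "char" codec installed, the same row carries "b"
# instead of b"b", and a str never equals a bytes object in Python 3.
redefined_row = {"kind": "b", "elemtype": 0}
print(looks_scalar(redefined_row))
```

Under these assumptions the first call reports a scalar type and the second does not, which matches the "non-scalar type pg_catalog.jsonb" message even though jsonb itself is unchanged.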

Since the "char" datatype is kind of an obscure construct, it's likely reasonable that asyncpg disallow setting up a type codec for this particular type, or perhaps it could emit a warning, but at the moment there doesn't seem to be documentation suggesting there are limitations on what kinds of type codecs can be constructed.
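For anyone who only needs strings on the application side, one option that leaves the "char" codec alone is to decode the bytes after fetching. This is a minimal sketch, not something asyncpg provides: `char_to_str` is a hypothetical helper, and the latin-1 default is an assumption chosen so that arbitrary byte values never raise.

```python
def char_to_str(value, encoding="latin-1"):
    """Convert a value fetched from a "char" column to str.

    latin-1 is an assumption: it maps every byte 0-255 to a code point,
    so arbitrary bytes never raise. Pass "ascii" to reject non-ASCII.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    # None (SQL NULL) and values that are already str pass through.
    return value


print(char_to_str(b"b"))
```

Because the connection's own decoding of "char" stays as bytes, later `set_type_codec` calls should keep seeing the values they expect.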

None of this is blocking us; it's just something we came across, and I hope it's helpful to the asyncpg project. cc @fantix



set_type_codec() appears to assume a particular set_type_codec for the "char" datatype #617
