Allow replacing ... by object / Any / Incomplete? #191

Open
@bluenote10


I'm currently experimenting with migrating from mypy's stubgen to pybind11-stubgen and noticed a bigger difference in the handling of invalid types (unfortunately not 100% preventable due to certain limitations in pybind11 itself). In the following example I have not added the #include <pybind11/stl/filesystem.h> on purpose to emulate the situation of a missing binding:

#include <filesystem>
#include <unordered_map>
#include <vector>

#include <pybind11/pybind11.h>
#include <pybind11/stl.h>

struct SomeStruct
{
  std::filesystem::path a;
  std::vector<std::filesystem::path> b;
  std::unordered_map<std::filesystem::path, int> c;
  std::unordered_map<int, std::filesystem::path> d;
};

std::filesystem::path funcA()
{
  return std::filesystem::path{"foobar"};
}

void funcB(std::filesystem::path, std::filesystem::path)
{
}

PYBIND11_MODULE(my_native_module, m)
{
  pybind11::class_<SomeStruct>(m, "SomeStruct")
      .def_readwrite("a", &SomeStruct::a)
      .def_readwrite("b", &SomeStruct::b)
      .def_readwrite("c", &SomeStruct::c)
      .def_readwrite("d", &SomeStruct::d);
  m.def("func_a", &funcA);
  m.def("func_b", &funcB);
}

pybind11-stubgen produces the following stub output:

from __future__ import annotations
__all__ = ['SomeStruct', 'func_a', 'func_b']
class SomeStruct:
    a: ...
    b: list[...]
    c: dict[..., int]
    d: dict[int, ...]
def func_a() -> ...:
    ...
def func_b(arg0: ..., arg1: ...) -> None:
    ...

We run mypy on our Python code to ensure proper usage of type stubs (including the stubs as an extra consistency check). Unfortunately, mypy does not like the generated stubs. First it complains about the last signature, and this error even prevents further checking of the file:

out/my_native_module.pyi:10: error: Ellipses cannot accompany other argument types in function type signature  [syntax]
Found 1 error in 1 file (errors prevented further checking)

After removing this last function, mypy still complains about all other usages of the ellipsis as a type with error: Unexpected "..." [misc]. Pyright seems to be a bit more permissive, but also complains about the usages in b, c, and d with "..." is not allowed in this context.

For comparison, mypy's stubgen produces the following, which isn't fully consistent either, but at least does not mess up the type checking:

from _typeshed import Incomplete

class SomeStruct:
    a: Incomplete
    b: Incomplete
    c: Incomplete
    d: Incomplete
    def __init__(self, *args, **kwargs) -> None: ...

def func_a(*args, **kwargs): ...
def func_b(arg0, arg1) -> None: ...

This made me wonder why pybind11-stubgen is actually using ... and what the typing specs say about using ... as a type. I've found a few related issues, but I'm still not quite sure if this is intended to work:

My understanding so far was that ... has a special role that is different from meaning Any. For instance, in tuple[int, ...] it expresses that the tuple is homogeneous with arbitrary length, whereas tuple[int, Any] means a tuple of arity 2 whose second field is arbitrarily typed. If I remember correctly, it also means something special in Callable[..., object] (again in the sense of "arbitrary arity"). Note that this is problematic when ... is used in the sense of an incomplete binding: if there is a std::pair<int, SomeIncompleteType>, then it makes a difference whether the stub generator outputs tuple[int, ...] (homogeneous tuple of ints, arbitrary length) or tuple[int, Incomplete] (length-2 tuple, with the second element untyped).
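To make the arity difference concrete, here is a small runtime sketch using typing.get_args. The helper tuple_arity is a hypothetical name introduced only for illustration, and the snippet assumes Python 3.9+ for the tuple[...] syntax.

```python
from typing import Any, Callable, Optional, get_args


def tuple_arity(tp) -> Optional[int]:
    """Return the fixed length of a tuple type, or None if it is variable-length."""
    args = get_args(tp)
    # `tuple[T, ...]` is stored as (T, Ellipsis): a homogeneous tuple of any length.
    if len(args) == 2 and args[1] is Ellipsis:
        return None
    return len(args)


print(tuple_arity(tuple[int, ...]))  # None: arbitrary length, all elements int
print(tuple_arity(tuple[int, Any]))  # 2: exactly two elements, second one untyped

# In Callable, `...` stands in for the whole parameter list.
print(get_args(Callable[..., object])[0] is Ellipsis)  # True
```

So a generator that prints ... for an unknown second element of a pair would silently change the meaning of the annotation rather than just leave a gap.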

That's why I'm surprised that the stub generator uses ... for incomplete types. Is there any reason not to use typeshed's Incomplete, which was meant for that purpose? From a practical perspective it would be nice if the generated stub would be understood by mypy (+ pyright etc.).


As a bonus feature, it would be even nicer if the handling of incomplete types could be controlled more explicitly. A random idea would be to have an argument --invalid-expression-handling-strategy allowing for

  • any: the type will get replaced by Any. This should always work, but is the least explicit / type-safe.
  • incomplete: the type will get replaced by _typeshed.Incomplete. This is technically equivalent to using Any, but communicates more clearly the fact that the type stub is not completely defined.
  • object: the type will get replaced by object. This adds extra type-safety because the usages will typically require explicit type checking.

These three could be accompanied by any-annotated, incomplete-annotated, and object-annotated, which do the same but wrap the type in e.g. Annotated[Incomplete, "<raw type annotation>"]. This would basically be the equivalent of the current --print-invalid-expressions-as-is, but because the raw annotation is moved into a string wrapped in Annotated, the resulting stub would be syntactically valid (at least in combination with pybind11, printing the raw type annotations directly typically just results in invalid syntax).
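To illustrate what I have in mind, here is a minimal sketch of the mapping. Everything in it is hypothetical: the function name render_invalid_expression and the strategy strings only mirror the proposal above and are not an existing pybind11-stubgen API.

```python
# Hypothetical strategy names from the proposal; not an existing pybind11-stubgen option.
_REPLACEMENTS = {"any": "Any", "incomplete": "Incomplete", "object": "object"}
_ANNOTATED_SUFFIX = "-annotated"


def render_invalid_expression(raw: str, strategy: str) -> str:
    """Return the annotation text to emit for an unparsable raw type string."""
    annotated = strategy.endswith(_ANNOTATED_SUFFIX)
    base = strategy[: -len(_ANNOTATED_SUFFIX)] if annotated else strategy
    if base not in _REPLACEMENTS:
        raise ValueError(f"unknown strategy: {strategy!r}")
    replacement = _REPLACEMENTS[base]
    if annotated:
        # The raw text becomes a string literal, so the stub stays valid Python.
        return f"Annotated[{replacement}, {raw!r}]"
    return replacement


print(render_invalid_expression("std::filesystem::path", "incomplete"))
print(render_invalid_expression("std::filesystem::path", "object-annotated"))
```

The generator would additionally have to emit the matching imports (Any, Annotated, or _typeshed.Incomplete), which the sketch leaves out.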

Labels: bug