Skip to content

[BUG] Keep pybind11 type casters alive recursively when needed #2527

Open
@kenoslund-google

Description

@kenoslund-google

Issue description

When pybind11 casts from python->C++ for a function call, the casters for each argument are kept alive for the duration of the function call. This is important when passing a pointer or reference to memory owned by the caster (rather than owned by python). std::string is a common example of this- functions which take a std::string* or std::string& as one of their arguments (const or non-const) would fail without this behavior because the cast string would be freed before the function begins executing.

However, this behavior does not work recursively. For example, the list_caster constructs a caster for each element, moves the result out of it, and then discards the caster:

for (auto it : s) {
value_conv conv;
if (!conv.load(it, convert))
return false;
value.push_back(cast_op<Value &&>(std::move(conv)));
}

Thus, if you attempt to bind a function which takes a std::vector<std::string*> as one of its arguments, it will fail (and probably corrupt memory) because the std::string is owned by the element caster, which is destroyed before the function call begins.

This is particularly a problem when writing a caster for Spans (https://abseil.io/tips/93). Since the span does not confer ownership, the caster must retain ownership of a vector for the span to reference. This breaks with nested spans (Span<const Span<const T>>) exactly the same way std::vector<std::string*> breaks.

However, just keeping the casters alive recursively would be unnecessary and inefficient in many cases. For efficiency, I think the element casters should only be kept alive recursively if both of the following are true:

  1. The element caster owns the C++ representation.
  2. The element can't be moved out of the caster because moving the C++ type is not possible or does not also transfer ownership, such as with raw pointers.

The second condition is easy enough check- if the element type is a pointer, l-value reference, or the caster does not support the movable_cast_op_type, then it is satisfied. This can be checked by adding a r-value reference to the element type and feeding it through the element caster's cast_op_type:

using ElementCaster = make_caster<ElementType>;
static constexpr bool kCondition2 =
  !std::is_rvalue_reference_v<typename ElementCaster::template cast_op_type<
     typename std::add_rvalue_reference_t<ElementType>>>;

However, there does not appear to be a way to check the the first condition; type_casters would probably have to declare it, which would require an addition to the type_caster API.

Reproducible example code

C++

    // `std::vector<std::string*> does not work because the string is owned by
    // the element caster, which is destroyed before the function is called.
    m.def("load_vector_str_ptr", [](const std::vector<std::string*>& v) {
        // print more reliably triggers a memory error, making the use-after-free more obvious.
        py::print(*v.at(0), " && ", *v.at(1));
        return *v.at(0) == "test1" && *v.at(1) == "test2";
    });

python

assert m.load_vector_str_ptr(["test1", "test2"])

I'll upload a PR as soon as I figure out how...

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions