Skip to content

[BUG]: invalid shared ptr conversion leads to crash? #4365

Open
@bluescarni

Description

@bluescarni

Required prerequisites

What version (or hash if on master) of pybind11 are you using?

2.10.1

Problem description

We are experiencing a crash in the exposition of a C++ class hierarchy which seems to be related to conversions between std::shared_ptrs within the class hierarchy. The attached minimal code is enough to reproduce the issue. Let me give a brief description.

On the C++ side, we have to base classes, Base0 and Base1, and a derived class Derived which inherits from the two bases. All classes are exposed via pybind11 using std::shared_ptr holder types, and the exposition of Derived signals that Base0 and Base1 are its bases.

Then, we expose a factory function make_object() which internally creates a shared ptr to Derived, converts it to std::shared_ptr< Base0 >, and then returns it. In the implementation of make_object(), we print to screen the memory addresses of the base and derived objects (more on that later).

Finally, we expose a print_object() function that takes in input a pointer to derived and tries to print to screen the size of its vector data member. It also prints to screen the pointer contained in the input shared pointer.

Now for the Python code:

In [1]: import example

In [2]: o = example.make_object(42)
ret der ptr: 0x557ad38c7a30
ret base ptr: 0x557ad38c7a50

In [3]: example.print_object(o)
der ptr: 0x557ad38c7a50
23501544953261

So there's something obviously going wrong here, since:

  • the vector data member definitely does not have 23501544953261 elements, and
  • the pointers seem to be all wrong. In particular, the memory address 0x557ad38c7a50 is a pointer to Base0 in make_object(), but it has magically been converted into a pointer to Derived in print_object(o), even though casting between Base0 and Derived should change the memory address in this hierarchy (as evidenced in the pointer values printed from make_object()).

In order to confirm that something fishy is going on, we can recompile the module with the address sanitizer on:

$ c++  -Wall -shared -std=c++11 -fPIC $(python3 -m pybind11 --includes) example.cpp -o example$(python3-config --extension-suffix) -O0 -fsanitize=address

Then we need to take care of invoking Python with the address sanitizer preloaded:

$ LD_PRELOAD=/home/yardbird/miniconda3/envs/heyoka_py_devel/lib/libasan.so ipython

(Note: this is from a conda installation, the libasan.so path should be adjusted as needed)

We can the rerun the code:

In [1]: import example

In [2]: o = example.make_object(42)
ret der ptr: 0x606000006c30
ret base ptr: 0x606000006c50

In [3]: example.print_object(o)
der ptr: 0x606000006c50
=================================================================
==13860==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x606000006c60 at pc 0x7fe2caf1dfcd bp 0x7ffe0813de20 sp 0x7ffe0813de18
READ of size 8 at 0x606000006c60 thread T0
    #0 0x7fe2caf1dfcc in std::vector<int, std::allocator<int> >::size() const (/home/yardbird/repos/bug_pybind11/example.cpython-39-x86_64-linux-gnu.so+0x8cfcc)
    #1 0x7fe2caedf9f7 in pybind11_init_example(pybind11::module_&)::{lambda(std::shared_ptr<Derived> const&)#2}::operator()(std::shared_ptr<Derived> const&) const (/home/yardbird/repos/bug_pybind11/example.cpython-39-x86_64-linux-gnu.so+0x4e9f7)
    #2 0x7fe2caee1dae in void pybind11::detail::argument_loader<std::shared_ptr<Derived> const&>::call_impl<void, pybind11_init_example(pybind11::module_&)::{lambda(std::shared_ptr<Derived> const&)#2}&, 0ul, pybind11::detail::void_type>(pybind11_init_example(pybind11::module_&)::{lambda(std::shared_ptr<Derived> const&)#2}&, pybind11::detail::index_sequence<0ul>, pybind11::detail::void_type&&) && (/home/yardbird/repos/bug_pybind11/example.cpython-39-x86_64-linux-gnu.so+0x50dae)
    #3 0x7fe2caee17aa in std::enable_if<std::is_void<void>::value, pybind11::detail::void_type>::type pybind11::detail::argument_loader<std::shared_ptr<Derived> const&>::call<void, pybind11::detail::void_type, pybind11_init_example(pybind11::module_&)::{lambda(std::shared_ptr<Derived> const&)#2}&>(pybind11_init_example(pybind11::module_&)::{lambda(std::shared_ptr<Derived> const&)#2}&) && (/home/yardbird/repos/bug_pybind11/example.cpython-39-x86_64-linux-gnu.so+0x507aa)

Thus it definitely looks like a wrong memory access.

I have tried to delve into the pybind11 source with a debugger to see what might be going on. I can't say I have fully understood the logic, but it seems to me that:

  • pybind11 "sees" that the dynamic type of the input argument to print_object() is of type Derived (I think this happens here: https://github.com/pybind/pybind11/blob/master/include/pybind11/detail/type_caster_base.h#L693), and thus
  • pybind11 forcefully casts via reinterpret_cast a shared_ptr<Base0> to shared_ptr<Derived>, which would be undefined behaviour in any case but works in practice if the pointers to Base0 and Derived coincide (which, in this case, they do NOT).

Please let me know if I can assist with more testing/triaging.

Reproducible example code

#include <memory>
#include <vector>
#include <iostream>

#include <pybind11/pybind11.h>

// The first base class.
struct Base0
{
    virtual ~Base0() { }
};

using Base0Ptr = std::shared_ptr< Base0 >;

// The second base class.
struct Base1
{
    virtual ~Base1() { }
    std::vector< int > vec = { 1, 2, 3, 4, 5 };
};

using Base1Ptr = std::shared_ptr< Base1 >;

// The derived class.
struct Derived : Base1, Base0
{
    virtual ~Derived() { }
};

using DerivedPtr = std::shared_ptr< Derived >;

PYBIND11_MODULE(example, m) {
    // Expose the bases.
    pybind11::class_< Base0, Base0Ptr > bs0( m, "Base0" );
    pybind11::class_< Base1, Base1Ptr > bs1( m, "Base1" );
    // Expose the derived class.
    pybind11::class_< Derived, DerivedPtr, Base0, Base1 >( m, "Derived" ).def( pybind11::init<>() );

    // A helper that returns a pointer to base.
    m.def( "make_object",
        []( int value ) -> Base0Ptr
        {
            auto ret_der = std::make_shared< Derived >();
            std::cout << "ret der ptr: " << ret_der.get() << std::endl;
            auto ret = Base0Ptr( ret_der );
            std::cout << "ret base ptr: " << ret.get() << std::endl;
            return ret;
        } );

    // A helper that accepts in input a pointer to derived.
    m.def( "print_object",
        []( const DerivedPtr& object )
        {
            std::cout << "der ptr: " << object.get() << std::endl;
            std::cout << object->vec.size() << std::endl;
        } );
}

Is this a regression? Put the last known working version here if it is.

Not a regression

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions