Boost.Python C++ object reference in Python: unexpected behaviour.
Hi,

I am having an issue with Boost.Python with a very simple use case.

I am returning a reference to an object, and it seems that my Python object loses its reference to the C++ object at some stage, for some reason.

Please see my *example* below reproducing this issue.

*C++ Code:*

    #include <iostream>
    #include <vector>
    #include <string>
    #include <cmath>
    #include <boost/python.hpp>
    #include <boost/python/suite/indexing/vector_indexing_suite.hpp>

    class Car {
    public:
        Car(std::string name) : m_name(name) {}

        bool operator==(const Car &other) const {
            return m_name == other.m_name;
        }

        std::string GetName() { return m_name; }
    private:
        std::string m_name;
    };

    class Factory {
    public:
        Factory(std::string name) : m_name(name) {}

        bool operator==(const Factory &other) const {
            return m_name == other.m_name
                && m_car_list == other.m_car_list;
        }

        // Stores the car by value and returns a reference into the vector.
        Car& create_car(std::string name) {
            m_car_list.emplace_back(Car(name));
            return m_car_list.back();
        }

        std::string GetName() { return m_name; }
        std::vector<Car>& GetCarList() { return m_car_list; }
    private:
        std::string m_name;
        std::vector<Car> m_car_list;
    };

    class Manufacturer {
    public:
        Manufacturer(std::string name) : m_name(name) {}

        bool operator==(const Manufacturer &other) const {
            return m_name == other.m_name
                && m_factory_list == other.m_factory_list;
        }

        // Stores the factory by value and returns a reference into the vector.
        Factory& create_factory(std::string name) {
            m_factory_list.emplace_back(Factory(name));
            return m_factory_list.back();
        }

        std::string GetName() { return m_name; }
        std::vector<Factory>& GetFactoryList() { return m_factory_list; }
    private:
        std::string m_name;
        std::vector<Factory> m_factory_list;
    };

    BOOST_PYTHON_MODULE(carManufacturer) {
        using namespace boost::python;

        class_<Manufacturer>("Manufacturer", init<std::string>())
            .add_property("factory_list",
                          make_function(&Manufacturer::GetFactoryList,
                                        return_internal_reference<1>()))
            .add_property("name", &Manufacturer::GetName)
            .def("create_factory", &Manufacturer::create_factory,
                 return_internal_reference<>());

        class_<Factory>("Factory", init<std::string>())
            .add_property("car_list",
                          make_function(&Factory::GetCarList,
                                        return_internal_reference<1>()))
            .add_property("name", &Factory::GetName)
            .def("create_car", &Factory::create_car,
                 return_internal_reference<>());

        class_<Car>("Car", init<std::string>())
            .add_property("name", &Car::GetName);

        // Expose the containers so the list properties can be indexed from Python.
        class_<std::vector<Factory> >("FactoryList")
            .def(vector_indexing_suite<std::vector<Factory> >());
        class_<std::vector<Car> >("CarList")
            .def(vector_indexing_suite<std::vector<Car> >());
    }

*Python Code:*

    import sys
    sys.path[:0] = [r"bin\Release"]

    from carManufacturer import *

    vw = Manufacturer("VW")
    vw_bra_factory = vw.create_factory("Brazil Factory")
    beetle = vw_bra_factory.create_car("Beetle69")

    if vw_bra_factory is vw.factory_list[0]:
        print("equal.")
    else:
        print("NOT EQUAL")
        print("## I expected them to be the same reference..?")

    print("vw_bra_factory Car List size : " + str(len(vw_bra_factory.car_list)))
    print("Actual Car List size : " + str(len(vw.factory_list[0].car_list)))
    print("## This still works. Maybe the python objects differ, but refer to the same C++ object. I can live with that.")

    vw_sa_factory = vw.create_factory("South Africa Factory")
    print("vw_bra_factory Car List size : " + str(len(vw_bra_factory.car_list)))
    print("Actual Car List size : " + str(len(vw.factory_list[0].car_list)))
    print("## .. what? why? brazil py object has no cars now? I don't get it. I can't have any of that.")

    print("## What will happen if I create another car in the brazil factory?")
    combi = vw_bra_factory.create_car("Hippie van")
    print("vw_bra_factory Car List size : " + str(len(vw_bra_factory.car_list)))
    print("Actual Car List size : " + str(len(vw.factory_list[0].car_list)))

    print("## And another.")
    citi_golf = vw_bra_factory.create_car("Citi golf")
    print("vw_bra_factory Car List size : " + str(len(vw_bra_factory.car_list)))
    print("Actual Car List size : " + str(len(vw.factory_list[0].car_list)))
    print("## 'vw_bra_factory' must have lost its C++ reference it had to 'vw.factory_list[0]' when I created a new factory. Why?")

*Python Output:*

    NOT EQUAL
    ## I expected them to be the same reference..?
    vw_bra_factory Car List size : 1
    Actual Car List size : 1
    ## This still works. Maybe the python objects differ, but refer to the same C++ object. I can live with that.
    vw_bra_factory Car List size : 0
    Actual Car List size : 1
    ## .. what? why? brazil py object has no cars now? I don't get it. I can't have any of that.
    ## What will happen if I create another car in the brazil factory?
    vw_bra_factory Car List size : 1
    Actual Car List size : 1
    ## And another.
    vw_bra_factory Car List size : 2
    Actual Car List size : 1
    ## 'vw_bra_factory' must have lost its C++ reference it had to 'vw.factory_list[0]' when I created a new factory. Why?

This is just an example made to reproduce my real work's problem in a presentable way. In my real work, Python crashes after I create a second "factory" and try to add a "car" to the first "factory". The crash occurs in the C++ "create_car" method when trying to access the "factory"'s "car" list.

Does anyone have insight as to what the problem is? Any useful input will be greatly appreciated.

Greetings,
Christoff
Hi,

This looks like a bug in Boost.Python to me. Could anyone confirm this? I provided a minimal, full working example. I would like to make sure it is a bug before reporting it as one.

Christoff

On 28 May 2015 at 09:29, Christoff Kok <christoff.kok@ex-mente.co.za> wrote:
-- Christoff Kok Software Engineer Ex Mente http://www.ex-mente.co.za christoff.kok@ex-mente.co.za PO Box 10214 Centurion 0046 South Africa tel: +27 12 743 6993 tel: +27 12 654 8198 fax: +27 85 150 1341
From: Christoff Kok <christoff.kok@ex-mente.co.za> To: cplusplus-sig@python.org Sent: Tuesday, June 2, 2015 7:34 AM Subject: Re: [C++-sig] Boost.Python C++ object reference in Python: unexpected behaviour.
Hi,
This looks like a bug in Boost.Python to me.
Could anyone confirm this? I provided a minimal, full working example.
I would like to make sure it is a bug before reporting it as one.
Christoff
What about making the Factory and Manufacturer classes noncopyable (and also exporting them to Python as noncopyable)? Has something changed?

Trigve
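As a rough illustration of what "noncopyable" means on the C++ side, here is a minimal sketch using only the standard library. `NoncopyableFactory` and `moved_factory_name` are hypothetical names, not part of the code posted in the thread. On the Boost.Python side, `boost::noncopyable` can be passed as an extra template argument to `class_`; whether that interacts well with `vector_indexing_suite` is not something this sketch tries to answer.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical variant of the thread's Factory: copying is disabled,
// moving is still allowed so the object can be handed around.
class NoncopyableFactory {
public:
    explicit NoncopyableFactory(std::string name) : m_name(std::move(name)) {}

    NoncopyableFactory(const NoncopyableFactory&) = delete;
    NoncopyableFactory& operator=(const NoncopyableFactory&) = delete;
    NoncopyableFactory(NoncopyableFactory&&) = default;
    NoncopyableFactory& operator=(NoncopyableFactory&&) = default;

    const std::string& name() const { return m_name; }

private:
    std::string m_name;
};

// Compile-time checks: an accidental copy is now rejected by the compiler.
static_assert(!std::is_copy_constructible<NoncopyableFactory>::value,
              "copy construction should be disabled");
static_assert(std::is_move_constructible<NoncopyableFactory>::value,
              "move construction should still work");

// Moving transfers the name into a new object; no copy is involved.
std::string moved_factory_name() {
    NoncopyableFactory original("Brazil Factory");
    NoncopyableFactory relocated(std::move(original));
    assert(relocated.name() == "Brazil Factory");
    return relocated.name();
}
```

Deleting the copy operations turns silent copies into compile errors, which makes it easier to spot places where a reference was expected but a copy was made. It does not by itself stop a `std::vector` from relocating movable elements when it grows.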
On 02/06/15 01:34 AM, Christoff Kok wrote:
Hi,
This looks like a bug in Boost.Python to me.
Could anyone confirm this? I provided a minimal, full working example.
I would like to make sure it is a bug before reporting it as one.
The 'is' operator compares the identities of the two Python objects, which differ. However, both are referencing the same C++ object. As a test, add

    bool identical(Factory &f1, Factory &f2) { return &f1 == &f2;}

to your C++ and expose that, then use that function to compare the factory references, instead of 'is'.

Yes, it would be nice if the same Python (wrapper) object would be returned. I'm not sure how to do that, though. I'll think about it some more...

HTH,
Stefan

--
...ich hab' noch einen Koffer in Berlin...
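The difference between identity and equality can be sketched in plain C++ without any Python bindings. The `Factory` struct below is a simplified stand-in for the class in the original post, and the two helper functions are hypothetical names chosen for illustration; only `identical` comes from the message above.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the Factory class from the original post.
struct Factory {
    std::string name;
    bool operator==(const Factory& other) const { return name == other.name; }
};

// Identity check as suggested in the thread: same address, same object.
bool identical(Factory& f1, Factory& f2) { return &f1 == &f2; }

// Two references bound to one object are identical.
bool same_object_through_two_references() {
    Factory brazil{"Brazil Factory"};
    Factory& first = brazil;
    Factory& second = brazil;
    return identical(first, second);
}

// A copy compares equal by value but lives at a different address.
bool equal_copy_is_identical() {
    Factory brazil{"Brazil Factory"};
    Factory copy = brazil;
    assert(brazil == copy);
    return identical(brazil, copy);
}
```

Here `same_object_through_two_references()` returns true and `equal_copy_is_identical()` returns false, which mirrors the situation where two distinct Python wrappers can still refer to one C++ object.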
Christoff,

I just noticed I wasn't really answering the real problem you report, which is the crash.

I believe the problem is in your code: You create two vectors of value-types (cars and factories). Then you take references to the stored objects, while there is no guarantee that the objects' addresses won't change over time. In particular, there is a good chance that these objects are copied when the vector is resized as new objects are added beyond its current capacity.

As a test, I called

    vector<...>::reserve(32)

on each of the vectors right in the Factory and Manufacturer constructors, with the effect of allocating enough storage upfront so that in your sample code no re-allocation is required, and thus objects won't be copied around. This prevents the crash from happening for me.

Obviously this is just to illustrate the problem; it's definitely not a solution to your problem, which still is that you reference objects beyond their lifetime.

HTH,
Stefan

--
...ich hab' noch einen Koffer in Berlin...
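The effect described above can be observed in plain C++. The following is a minimal sketch, assuming a simplified `Car` struct and a hypothetical helper `front_moves_after_growth`; it records the address of the first element as a number, so no stale reference is ever dereferenced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-in for the Car class from the original post.
struct Car {
    std::string name;
};

// Returns true if the first car sits at a different address after
// `extra_cars` more cars have been appended to the vector.
bool front_moves_after_growth(std::size_t reserved, std::size_t extra_cars) {
    std::vector<Car> cars;
    cars.reserve(reserved);
    assert(cars.capacity() >= reserved);

    cars.push_back(Car{"Beetle69"});
    // Keep only the numeric address; using a stale reference would be
    // undefined behaviour.
    const auto before = reinterpret_cast<std::uintptr_t>(&cars.front());

    for (std::size_t i = 0; i < extra_cars; ++i) {
        cars.push_back(Car{"car " + std::to_string(i)});
    }

    const auto after = reinterpret_cast<std::uintptr_t>(&cars.front());
    return before != after;
}
```

With `front_moves_after_growth(32, 2)` the vector never exceeds its reserved capacity, so the first car stays where it is. With no reservation and many appended cars, the vector has to reallocate and the first car ends up at a new address, which is what leaves an earlier reference dangling.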
Hi Stefan,

Thank you very much. That makes sense and my tests prove it. The code runs as expected when I reserve enough space for the vector.

I do not quite get why it works in C++ and not in Python. I know too little about the C++ and Python run-times.

I guess that the C++ run-time automatically updates objects holding references to its new address, whereas this is not the case in the Python run-time. Fixing that issue for me is way over my head at the moment.

Thank you very much for spending the time to think about the problem and kudos to you for discovering the reason.

I'm not sure how to solve the problem either; my objects are uniquely identifiable. I might be able to override the Python objects' __getattribute__ methods and set the Python object to that of its parent container's instance (only if the vector has not been resized). I don't know if this will work, and it might slow the code down a bit as well. I'll try it nonetheless.

Thanks again Stefan.

Regards,
Christoff

On 2 June 2015 at 15:11, Stefan Seefeld <stefan@seefeld.name> wrote:
_______________________________________________ Cplusplus-sig mailing list Cplusplus-sig@python.org https://mail.python.org/mailman/listinfo/cplusplus-sig
On 02/06/15 10:36 AM, Christoff Kok wrote:
Hi Stefan,
Thank you very much. That makes sense and my tests prove it. The code runs as expected when I reserve enough space for the vector.
I do not quite get it why it works in C++ and not python. I know too little about the C++ and python run-time.
What works in C++ and not in Python ? With the "reserve" calls the Python script you provided runs fine (for me).
I guess that the C++ run-time automatically updates objects holding references to its new address, whereas this is not the case in the Python run-time. Fixing that issue for me is way over my head at the moment.
But the problem really is in your C++ code, and has nothing to do with the Python bindings. So you should start by clarifying your desired API, then adjust the implementation. (Do you really want references to be usable? In that case, your vectors shouldn't store cars by-value. My guess is you shouldn't be using references, unless the real types are heavy objects, in which case you also want to prevent copy-construction, and store heap-allocated objects.) Once the C++ API is settled, you can reconsider the appropriate Python API for it.
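One possible reading of "prevent copy-construction, and store heap-allocated objects" is sketched below in plain C++. This is an assumption about what such a design could look like, not the code anyone in the thread actually used; `car_at`, `car_count` and `reference_survives_growth` are hypothetical names, and how to expose such a container to Python is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Car can no longer be copied, so it cannot silently change address.
class Car {
public:
    explicit Car(std::string name) : m_name(std::move(name)) {}
    Car(const Car&) = delete;
    Car& operator=(const Car&) = delete;

    const std::string& name() const { return m_name; }

private:
    std::string m_name;
};

// The vector owns pointers; each Car lives in its own heap allocation.
class Factory {
public:
    Car& create_car(std::string name) {
        m_cars.push_back(std::make_unique<Car>(std::move(name)));
        return *m_cars.back();
    }

    Car& car_at(std::size_t index) { return *m_cars.at(index); }
    std::size_t car_count() const { return m_cars.size(); }

private:
    std::vector<std::unique_ptr<Car>> m_cars;
};

// The vector reallocates its array of pointers many times here, but the
// first Car itself is never relocated, so the reference stays valid.
bool reference_survives_growth() {
    Factory factory;
    Car& beetle = factory.create_car("Beetle69");
    for (int i = 0; i < 1000; ++i) {
        factory.create_car("car " + std::to_string(i));
    }
    assert(factory.car_count() == 1001);
    return &factory.car_at(0) == &beetle && beetle.name() == "Beetle69";
}
```

The trade-off is one extra allocation and one extra indirection per object, in exchange for addresses that stay fixed for as long as the owning container keeps the pointer.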
Thank you very much for spending the time to think about the problem and kudos to you for discovering the reason.
You are welcome. Stefan -- ...ich hab' noch einen Koffer in Berlin...
Thank you again Stefan,
What works in C++ and not in Python ? With the "reserve" calls the Python script you provided runs fine (for me).
I tested the code in a C++ console application, using the same example as in the Python example I posted. I meant that, without the "reserve" calls, the C++ console application worked as I expected, and Python didn't. Python works as I expected with the "reserve" calls.
Do you really want references to be usable ? In that case, your vectors shouldn't store cars by-value. My guess is you shouldn't be using references, unless the real types are heavy objects, in which case you also want to prevent copy-construction, and store heap-allocated objects.
The types I am using are big objects and performance is a concern. C++11's 'move' semantics make storing large objects by value in a vector much more viable and performant (very little overhead). I liked this approach of using containers of objects by value everywhere (well, except where polymorphism is needed). I am sure that using heap-allocated objects will still be faster, however. Even though moving has little overhead, it is still overhead that heap-allocated objects don't have to deal with.
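A small plain C++ sketch can show both halves of this point: when a vector of values grows, elements with a `noexcept` move constructor are moved rather than copied, yet every such move still places the element at a new address. `Tracked`, `Counters` and `count_relocations` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Tallies how often elements were copied or moved.
struct Counters {
    int copies = 0;
    int moves = 0;
};

// Value type that reports each copy or move to a shared Counters object.
class Tracked {
public:
    Tracked(std::string name, Counters& counters)
        : m_name(std::move(name)), m_counters(&counters) {}

    Tracked(const Tracked& other)
        : m_name(other.m_name), m_counters(other.m_counters) {
        ++m_counters->copies;
    }

    // noexcept lets std::vector move elements when it reallocates.
    Tracked(Tracked&& other) noexcept
        : m_name(std::move(other.m_name)), m_counters(other.m_counters) {
        ++m_counters->moves;
    }

private:
    std::string m_name;
    Counters* m_counters;
};

// Appends n elements without reserving and reports how they were relocated.
Counters count_relocations(std::size_t n) {
    Counters counters;
    std::vector<Tracked> items;
    for (std::size_t i = 0; i < n; ++i) {
        items.emplace_back("item " + std::to_string(i), counters);
    }
    assert(items.size() == n);
    return counters;
}
```

For a call such as `count_relocations(100)` the copy count stays at zero while the move count is positive. Each of those moves is a relocation, so a reference or Python wrapper obtained before the growth would point at the old storage.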
Once the C++ API is settled, you can reconsider the appropriate Python API for it.
I tested the use of heap-allocated objects and it works. I am going to change my code to store heap-allocated objects for all my complex types instead (to keep it consistent, for the sake of maintenance and simplicity).

Thank you for all your assistance, I greatly appreciate it. You saved me a lot of time.

Regards,
Christoff

On 2 June 2015 at 16:55, Stefan Seefeld <stefan@seefeld.name> wrote:
participants (3)
- Christoff Kok
- Stefan Seefeld
- Trigve Siver