[C++-sig] Custom PyTypeObjects
Alex Leach
beamesleach at gmail.com
Mon Apr 29 18:26:38 CEST 2013
Dear list,
A thread I started was originally meant to discuss how to use C++ memory
management methods (operator new, delete etc.) with a boost python
instance. Rather than dwelling on the concern, I've been (successfully)
wrapping other code since, but have now arrived at a separate conundrum,
which I think could be addressed by the same conceptual solution. This
time I've found a working attempt at a solution here on this list[1], and
was hoping that more generic, template-ised versions could be introduced
into Boost.
[1] -
http://mail.python.org/pipermail/cplusplus-sig/2007-August/012438.html
The message has turned into a bit of an essay, so I'll summarise what I've
written here:-
* Python objects and protocols - they're not all the same.
* Python Buffers - An example and attempt at exposing one.
* C++ IO streams - exposing buffered object interfaces to Python.
* Customising PyTypeObjects already used in Boost Python.
* "There should be one-- and preferably only one --obvious way to
do it."
* Summary
What's in an Object?
--------------------
What I think it boils down to is a lack of support for the different type
objects defined in the Python C-API Abstract[2] and Concrete[3] Object
Layers.
The problem in [1] was related to PyBufferProcs and PyBufferObjects. How
can an object representing a buffer be properly exposed to Python? The
PyBuffer* structs were designed with this in mind, but are now deprecated
in favour of memory view objects [4]. Either way, a `grep` of the Boost
Python header and source files shows no sign of either API being used.
[2] - http://docs.python.org/2/c-api/abstract.html
[3] - http://docs.python.org/2/c-api/concrete.html
[4] - http://docs.python.org/2/c-api/buffer.html#memoryview-objects
A buffered solution
-------------------
The solution from [1] makes it about as simple as possible for the client
/ Python registration code to expose a return type that is managed by a
PyBuffer_Type-like PyTypeObject. A custom to-python converter is
registered and return_value_policy used.
However, this is still fairly cumbersome compared to current Boost Python
usage, as the C-Python API needs to be used directly and a custom
PyTypeObject defined, for any return-type that should use a different type
protocol. The solution also goes nowhere towards providing the functionality a
Python buffer expects, but instead just demonstrates how one might use a
new PyTypeObject.
A standards-compliant solution
------------------------------
With the C++ standard library in mind, I was wondering what boost python
might be able to do with IO streams. I have a family of C++ classes that
use iostream-like operators to serialise objects into either XML, plain
text, or binary formats. Providing this functionality via a buffered object
seems to be the appropriate solution... Using boost python to expose such
an interface though, looks non-trivial, to say the least.
A boost-friendly solution might be to recognise boost::asio::buffer
objects, perhaps using boost::mpl statements in the to-python converter
registrations.
I'm still trying to get to grips with standard library templates
personally, so would prefer if classes derived from ios_base could
automatically have their '<<' and '>>' operators exposed at compile time,
depending on whether they are read-only or read-write. An exposed seek
function would also be useful, when one is available in the C++ type.
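A minimal sketch of that compile-time decision, using only the standard library. The names `is_readable_stream`, `is_writable_stream`, `stream_access` and `classify_stream` are hypothetical and are not part of Boost Python; the sketch only shows how a wrapper could decide which of '<<' and '>>' to expose.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical trait names; nothing here is provided by Boost Python.
// A stream type can be read from if it derives from std::istream ...
template <class Stream>
struct is_readable_stream : std::is_base_of<std::istream, Stream> {};

// ... and written to if it derives from std::ostream.
template <class Stream>
struct is_writable_stream : std::is_base_of<std::ostream, Stream> {};

enum class stream_access { none, read_only, write_only, read_write };

// Classify a type at compile time, so a wrapper could choose whether to
// expose '>>' (read), '<<' (write), both, or neither.
template <class Stream>
constexpr stream_access classify_stream()
{
    return is_readable_stream<Stream>::value
        ? (is_writable_stream<Stream>::value ? stream_access::read_write
                                             : stream_access::read_only)
        : (is_writable_stream<Stream>::value ? stream_access::write_only
                                             : stream_access::none);
}
```

A registration helper could then dispatch on `classify_stream<T>()`, for example through tag dispatch or `std::enable_if`, so that read-only types never get a write operator exposed.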
Specialised PyTypeObjects
-------------------------
Discussing each of the different object types is too large a subject to
describe in full here, but would it not be sensible for Boost Python to
make it easier to expose other PyTypeObjects?
The NumPy C-API exposes 8 public and 4 private type specialisations[5],
for representing clearly different types of data. These are essentially
PyTypeObjects conforming to the API defined in the C-Python object layers
documentation[2,3].
With quite a lot more code, Boost Python could potentially provide
capability to specialise the type objects for a number of pre-defined base
types, by providing custom HolderGenerators[6] for each type
specialisation. These HolderGenerators can be referred to by creating
corresponding `return_value_policy`s. This is what the solution from [1]
does, by defining both a new HolderGenerator and a corresponding
return_value_policy.
This concept is not problem-free, however. In my case, I'd like to tie a
C++ class's streaming interface directly to the PyTypeObject. For Python
2.x this would mean pointing a new PyTypeObject's tp_as_buffer attribute
at a PyBufferProcs struct. The code from [1] could be modified to do this,
but it would take quite a lot more work. (It has.)
For Python 2.7 and above, there are of course the new buffer and memoryview
APIs, but I haven't really read up on or done anything with them yet...
[5] -
http://docs.scipy.org/doc/numpy/reference/c-api.types-and-structures.html
[6] -
http://www.boost.org/doc/libs/1_53_0/libs/python/doc/v2/HolderGenerator.html
A generalised solution
----------------------
To answer my question from the previous thread I started here, on how to
use a custom PyTypeObject on an exposed class_<> hierarchy, I think the
way to do this is to use `pytype_object_manager_traits<PyTypeObject*,
object>`, as is done in str.hpp, list.hpp, etc. e.g.:-
    namespace converter
    {
      template <>
      struct object_manager_traits<str>
          : pytype_object_manager_traits<&PyUnicode_Type, str>
      {
      };
    }
This seems to be the best way to register a PyTypeObject to a C++ class,
with Boost. But it does require a tremendous amount of work, when wanting to use
PyTypeObjects that should use STL functionality.
C++ IO streams
--------------
Mapping C++ STL functions to PyTypeObject attributes[7] does not appear to
have been done at all in Boost Python, as far as I can tell. Of course
there are the standard objects, bp::str, list, etc., which use core
Python's respective PyTypeObjects as instance managers, like above, but it
doesn't seem like there is a robust way to replace a PyTypeObject's
function pointers with STL-conforming implementations. I suppose it is
possible to edit the PyTypeObject, after getting it with
`object.get_type()`, but that seems a bit of an inefficient, run-time hack.
I was playing around with the code from [1] over the weekend, and have
started to map the C++ iostream template functions to a PyTypeObject's
`tp_as_buffer` member struct, to expose buffered access to C++ formatted
stream methods through a PyBufferProcs struct[8]. Admittedly, this was a
bit of a pointless exercise, as the old buffer protocol has been removed in
Python 3, but I am currently developing with Python 2.7 and wanted to try
out an initial, working implementation where a custom PyTypeObject is used.
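As a rough illustration of the C++ side of such a mapping, the sketch below publishes the bytes written to a stream as a (pointer, length) pair, which is the kind of result a Python 2 read-buffer slot hands back. It uses only the standard library; `read_region`, `viewable_stringbuf` and `copy_out` are hypothetical names, and this is not the code from [1] or [10].

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper type: a (pointer, length) pair describing one
// contiguous block of readable bytes.
struct read_region
{
    const char* data;
    std::size_t size;
};

// std::stringbuf keeps its put-area pointers protected; a derived class
// can publish them without copying the characters.
class viewable_stringbuf : public std::stringbuf
{
public:
    read_region written() const
    {
        // pbase() .. pptr() is the part of the put area written so far.
        return read_region{ pbase(),
                            static_cast<std::size_t>(pptr() - pbase()) };
    }
};

// Copy a region into a string, e.g. to compare it with stringbuf::str().
inline std::string copy_out(const read_region& region)
{
    return region.size ? std::string(region.data, region.size)
                       : std::string();
}
```

The region is only valid until the next write, because the stream buffer may reallocate its storage; a real buffer export would need to keep the data alive or copy it for as long as Python holds the view.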
For std::i/ostream, there is some production code available that can
perform Python file-like object conversions. In particular, the two
subsequent replies to this message[9] here on this list, mention
open-source libraries that can already do this. And from the code listed
in [1], I've made available yet another (partially complete)
implementation[10].
[7] - http://docs.python.org/2/c-api/typeobj.html#
[8] - http://docs.python.org/2/c-api/typeobj.html#PyBufferProcs
[9] - http://mail.python.org/pipermail/cplusplus-sig/2010-March/015411.html
[10] - https://github.com/alexleach/bp_helpers
Moving forward
--------------
Assuming Boost Python follows the Zen of Python, there should be one - and
only one - obvious way to achieve what I want. Currently, that is to
expose a future-proof, STL-compliant iostream interface through Boost
Python. I don't think any of the above implementations are compatible with
Python 3, since I don't think any of them use the new Python buffer or
memoryview APIs, but I'd like to make the switch soon, myself.
I'm sure adding buffer support to Boost Python would be valuable for a
number of users. From a backwards-compatibility perspective, it would
probably be good to have both the old and the new buffer APIs included in
Boost Python, to be selected with a Python preprocessor macro. Memoryviews
are a relatively fancy and new feature, but buffers have been around for
ages, so it would be good if they were supported, for basically all
versions of Python. Ideally though, we would also have memoryview
functionality in v2.7+.
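A minimal sketch of such a preprocessor switch follows. `PY_VERSION_HEX` is the version macro from the Python headers; the stand-in definition is only there so the sketch compiles without Python.h. The `BPH_HAS_*` macros, `buffer_api_support` and `detect_buffer_apis` are hypothetical names, and the cut-offs assume the old buffer protocol exists only in Python 2.x, the new one from 2.6 onwards, and memoryview objects from 2.7 onwards.

```cpp
// Stand-in so the sketch compiles without Python.h; in a real extension
// module the value comes from the Python headers.
#ifndef PY_VERSION_HEX
#  define PY_VERSION_HEX 0x02070000
#endif

// Hypothetical macro names, not defined by Python or Boost Python.
#if PY_VERSION_HEX < 0x03000000
#  define BPH_HAS_OLD_BUFFER 1   // PyBufferProcs read/write/segcount slots
#else
#  define BPH_HAS_OLD_BUFFER 0
#endif

#if PY_VERSION_HEX >= 0x02060000
#  define BPH_HAS_NEW_BUFFER 1   // Py_buffer based protocol
#else
#  define BPH_HAS_NEW_BUFFER 0
#endif

#if PY_VERSION_HEX >= 0x02070000
#  define BPH_HAS_MEMORYVIEW 1   // memoryview objects
#else
#  define BPH_HAS_MEMORYVIEW 0
#endif

// Summary of what the selected Python version offers.
struct buffer_api_support
{
    bool old_buffer;
    bool new_buffer;
    bool memoryview;
};

inline buffer_api_support detect_buffer_apis()
{
    return buffer_api_support{ BPH_HAS_OLD_BUFFER != 0,
                               BPH_HAS_NEW_BUFFER != 0,
                               BPH_HAS_MEMORYVIEW != 0 };
}
```

Converter code could then guard each implementation with the matching macro, so that a Python 2.7 build offers both APIs while a Python 3 build compiles only the new one.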
One way to rule them all
------------------------
Now, I've discovered a number of ways to write to_python converters, and
am not sure what is the "one obvious way" to define a new PyTypeObject's
API.
I would be grateful for feedback on which should be the preferred way to
expose a class with a custom PyTypeObject. Here are the methods I've
looked into:-
1. indexing_suite
Perhaps my favourite way to expose a to_python converter was with
boost python's indexing_suite, as I did for std::list[11] (also
attached to a message on this list, earlier this month). From the client's
perspective, all that needs to be done is to instantiate a template. For
examples, see the C++ test code[12]. However, I haven't really looked into
how the converter is registered internally, as the base classes take care
of that. Either way, the indexing suite functions are only attached to the
PyObject, not its respective PyTypeObject.
[11] -
https://github.com/alexleach/bp_helpers/blob/master/include/boost_helpers/make_list.hpp
[12] -
https://github.com/alexleach/bp_helpers/blob/master/src/tests/test_make_list.cpp
2. class_<..>
The code for the class_ template, its bases and typedefs is really quite
advanced, but it can't be said that it is inflexible. Still, I haven't
found an "obvious way" to replace a class's object manager. I get a
runtime warning if a to_python_converter is registered as well as a class.
bp::init has an operator[] method, which can be used to specify a
CallPolicy, but I haven't managed to get that to change an instance's base
type.
The registry is probably the way to do this, but for me at least, the
registry is very opaque, so I haven't found a good way to edit or replace
a PyTypeObject, either during or after an exposed class_<> has been
initialised.
3. to_python_converter<class T, class Conversion, bool has_get_pytype=false>
This is how the solution in [1] enables to-python (PyObject) conversion,
and is also how I've been doing it in the testing code I modified from
there[13-15]. It seems necessary to write a corresponding Conversion class
for each new type of PyTypeObject, e.g. as done in
return_opaque_pointer.hpp and opaque_pointer_converter.hpp.
[13] -
https://github.com/alexleach/bp_helpers/blob/master/src/tests/test_buffer_object.cpp
[14] -
https://github.com/alexleach/bp_helpers/blob/master/include/boost_helpers/return_buffer_object.hpp
[15] -
https://github.com/alexleach/bp_helpers/blob/master/include/boost_helpers/buffer_pointer_converter.hpp
4. Return value policies and HolderGenerators
In functions and methods where a CallPolicy can be specified, as I've
said already, a custom CallPolicy can be used to refer to a custom
HolderGenerator. These specify which PyTypeObject is used for managing the
python-converted object, but can a custom return value policy and holder
be specified with the class_<> template? I sure would like to find a way...
However, alone, this doesn't seem to put a type converter in the
registry. I thought that the MakeHolder::execute function should only need
to be called once, but my code is currently calling it every time I want a
new Python instance. So, I think I must not be registering the class' type
id properly[14]...
Then there are the lvalue and rvalue Python converters, which admittedly I
don't know much about. There are also some other concepts I haven't
mentioned above, like install_holders[16], for example, and whatever is
done when you add shared_ptr<X> to the class_ template's arguments.
[16] -
http://www.boost.org/doc/libs/1_53_0/libs/python/doc/v2/instance_holder.html#instance_holder-spec
Summary
-------
Should the ability to expose C++ istreams and ostreams be added to Boost
Python? How should this be done? I thought that having a chainable
return_value_policy for both istreams and ostreams would be great. That
way they could be both used in conjunction for an iostream, with the
functionality just incrementally added to a base PyTypeObject. But I don't
see how one could attach additional PyObject methods, as is done by a
class_ template's def methods.
What about memoryviews? If someone were to go ahead and write converters
for Python memoryviews, are there any C++ standards-compliant classes that
could be accommodated? i.e. Are there any classes defined in the C++
standard for multidimensional, buffered memory access? Which, if any, would
be an appropriate match to a Python memoryview? I guess that nested
std::vectors and lists might be good candidates, but I stand to be
corrected.
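One point worth keeping in mind here: a Python buffer view normally describes a single block of memory through an item size, a shape and per-dimension strides, whereas each row of a nested std::vector is allocated separately. A minimal sketch of that layout arithmetic, assuming a row-major (C-contiguous) block, is below. `nd_layout`, `c_contiguous_layout` and `byte_offset` are hypothetical names that only mirror the corresponding Py_buffer fields; they are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical descriptor mirroring the itemsize, shape and strides
// fields that a Python buffer view carries.
struct nd_layout
{
    std::size_t itemsize;
    std::vector<std::ptrdiff_t> shape;
    std::vector<std::ptrdiff_t> strides;   // in bytes, one per dimension
};

// Strides for a row-major block: the last dimension is contiguous and
// each earlier dimension steps over everything to its right.
inline nd_layout c_contiguous_layout(std::size_t itemsize,
                                     const std::vector<std::ptrdiff_t>& shape)
{
    nd_layout out{ itemsize, shape,
                   std::vector<std::ptrdiff_t>(shape.size()) };
    std::ptrdiff_t step = static_cast<std::ptrdiff_t>(itemsize);
    for (std::size_t i = shape.size(); i-- > 0; )
    {
        out.strides[i] = step;
        step *= shape[i];
    }
    return out;
}

// Byte offset of one element from the start of the block.
inline std::ptrdiff_t byte_offset(const nd_layout& layout,
                                  const std::vector<std::ptrdiff_t>& index)
{
    assert(index.size() == layout.shape.size());
    std::ptrdiff_t offset = 0;
    for (std::size_t i = 0; i < index.size(); ++i)
        offset += index[i] * layout.strides[i];
    return offset;
}
```

Under that assumption, a flat std::vector<T> paired with such a descriptor would be a closer match to a memoryview than nested containers, which would need either a copy into one block or some form of per-row indirection.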
Apologies for the length this became and thank you for sticking with me
this far. Any advice, suggestions, pointers to code or documentation I've
probably overlooked or neglected, or even criticism would be appreciated.
Further discussion on how best to improve Boost Python as it is would be
great! I do like to contribute to open source communities when possible,
but I am strained for time...
Thanks again!
Kind regards,
Alex