flextype.c -- extended type system
Hi Guido, py-dev,

preface:
--------

A week ago or so, I sent a patch to Guido that removes the "etype" struct. This is a hidden structure that extends types when they are allocated on the heap. One restriction with this structure was that types could not be extended by metatypes, for some internal reason. I fixed this. Now meta-types can define extra slots for types.

the point:
----------

I wasn't really after slots in types, but I wanted to have a type that can be extended as the user likes. Using the re-worked etype (now named PyHeapType_Type), I created a new meta-type with some cool new features which give you C++-like virtual methods and inheritance.

The new dynamic type (PyFlexType_Type) allows you to clone any existing type, and thereby to pass a virtual method table (VMT) which will be bound into the type. It is a bit like slots and slot definitions, but the VMT definition is written like a PyMethodDef list (in fact, I have PyCMethodDef), and the created virtual function entries are spelled explicitly in the type structure.

Structure of a VMT definition:

typedef struct _pycmethoddef {
    char *name;          /* name to lookup in __dict__ */
    PyCFunction match;   /* to be found if non-overridden */
    void *fast;          /* native C call */
    void *wrap;          /* wrapped call into Python */
    int offset;          /* slot offset in heap type */
} PyCMethodDef;

At creation time of a new flextype, all VMT entries in the accumulated bases (accessed via the MRO) are scanned from oldest to newest, and the new type's methods are retrieved by the "name" entry. Then it is checked whether the method descriptor still points to the original PyCFunction entry (the "match" field). If it is still original, the native C call (field "fast") is inserted into the VMT, otherwise the wrapped Python callback (field "wrap") is inserted.

As a result, it is now very cheap to use small overridable methods in your C implementations, since there is nearly no cost if the method isn't overridden.
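The binding step described above can be illustrated with a small stand-alone sketch. This is not code from flextype.c: it assumes a toy type with one explicit slot, a flat name table standing in for __dict__, and a single definition table instead of a scan over all bases. Every toy_* name, and user_send, is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All toy_* names are hypothetical stand-ins, not the flextype API. */
struct toy_type;
typedef int (*toy_method)(const struct toy_type *type, int arg);

typedef struct {
    const char *name;
    toy_method func;
} toy_dict_entry;

typedef struct {
    const char *name;   /* name to look up in the toy dict */
    toy_method match;   /* entry expected if not overridden */
    toy_method fast;    /* native C implementation */
    toy_method wrap;    /* wrapper that dispatches to the override */
    size_t offset;      /* slot offset inside toy_type */
} toy_cmethoddef;

typedef struct toy_type {
    const toy_dict_entry *dict;  /* terminated by a NULL name */
    toy_method send;             /* explicit VMT slot */
} toy_type;

static toy_method toy_lookup(const toy_type *type, const char *name)
{
    const toy_dict_entry *e;
    for (e = type->dict; e->name != NULL; e++)
        if (strcmp(e->name, name) == 0)
            return e->func;
    return NULL;
}

static int impl_channel_send(const toy_type *type, int arg)
{
    (void)type;
    return arg + 1;
}

/* The entry "seen from Python"; it simply forwards to the fast version. */
static int channel_send(const toy_type *type, int arg)
{
    return impl_channel_send(type, arg);
}

/* Slow path: look the override up by name and call it. */
static int wrap_channel_send(const toy_type *type, int arg)
{
    toy_method override = toy_lookup(type, "send");
    return override != NULL ? override(type, arg) : -1;
}

/* Fill each slot once: fast if the dict entry is still the original. */
static void toy_bind(toy_type *type, const toy_cmethoddef *defs)
{
    const toy_cmethoddef *d;
    for (d = defs; d->name != NULL; d++) {
        toy_method found = toy_lookup(type, d->name);
        toy_method chosen = (found == d->match) ? d->fast : d->wrap;
        memcpy((char *)type + d->offset, &chosen, sizeof chosen);
    }
}

static const toy_cmethoddef toy_defs[] = {
    {"send", channel_send, impl_channel_send, wrap_channel_send,
     offsetof(toy_type, send)},
    {NULL, NULL, NULL, NULL, 0}
};

/* A user-level override, as a subclass might define it. */
static int user_send(const toy_type *type, int arg)
{
    (void)type;
    return arg * 10;
}

static const toy_dict_entry base_dict[] = {{"send", channel_send}, {NULL, NULL}};
static const toy_dict_entry sub_dict[] = {{"send", user_send}, {NULL, NULL}};

/* Returns 1 when the base gets the fast path and the subclass the wrapper. */
static int toy_demo(void)
{
    toy_type base = {base_dict, NULL};
    toy_type sub = {sub_dict, NULL};
    toy_bind(&base, toy_defs);
    toy_bind(&sub, toy_defs);
    return base.send == impl_channel_send && sub.send == wrap_channel_send
        && base.send(&base, 4) == 5 && sub.send(&sub, 4) == 40;
}
```

After toy_bind, a call through the slot costs one indirect call in the non-overridden case, which is the point made above; the name lookup happens only inside the wrapper.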
It is also possible to have private methods, in the sense that you can use inheritance between your flextypes without publishing every virtual method to Python at all.

Here is an example from my Stackless type system, where I made my channel interface overridable:

(channelobject.h)
"""
#define CHANNEL_SEND_HEAD(func) \
    int func (PyChannelObject *self, PyObject *arg)

#define CHANNEL_SEND_EXCEPTION_HEAD(func) \
    int func (PyChannelObject *self, PyObject *klass, PyObject *value)

#define CHANNEL_RECEIVE_HEAD(func) \
    PyObject * func (PyChannelObject *self)

typedef struct _pychannel_heaptype {
    PyFlexTypeObject type;
    /* the fast callbacks */
    CHANNEL_SEND_HEAD( (*send) );
    CHANNEL_SEND_EXCEPTION_HEAD( (*send_exception) );
    CHANNEL_RECEIVE_HEAD( (*receive) );
} PyChannel_HeapType;

int init_channeltype(void);
"""

Here is the VMT definition from channelobject.c:
"""
static PyCMethodDef channel_cmethods[] = {
    CMETHOD_PUBLIC_ENTRY(PyChannel_HeapType, channel, send),
    CMETHOD_PUBLIC_ENTRY(PyChannel_HeapType, channel, send_exception),
    CMETHOD_PUBLIC_ENTRY(PyChannel_HeapType, channel, receive),
    {NULL} /* sentinel */
};
"""

where the CMETHOD_PUBLIC_ENTRY macro looks like this:

/*
 * a public entry defines
 * - the function name "name"
 * - the PyCFunction class_name seen from Python,
 * - the fast function impl_class_name implements the method for C
 * - the wrapper function wrap_class_name that calls back into a Python override.
 */
#define CMETHOD_PUBLIC_ENTRY(type, prefix, name) \
    {#name, (PyCFunction)prefix##_##name, &impl_##prefix##_##name, \
     &wrap_##prefix##_##name, offsetof(type, name)}

So basically three functions are involved in a virtual method: the PyCFunction, the C implementation and a wrapper. Normally, the PyCFunction and the implementation can be identical, but usually my C interface looks slightly different from the Python interface, for convenience.
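To make the macro's token pasting and offsetof use concrete, here is a reduced, stand-alone sketch. It assumes a simplified signature (int -> int) for all three functions and an int header in place of PyFlexTypeObject; the demo_* names and DEMO_PUBLIC_ENTRY are hypothetical and only mirror the shape of CMETHOD_PUBLIC_ENTRY.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the real structures. */
typedef int (*demo_fn)(int);

typedef struct {
    const char *name;   /* method name as a string */
    demo_fn match;      /* function seen from Python */
    demo_fn fast;       /* C implementation */
    demo_fn wrap;       /* wrapper calling back into an override */
    size_t offset;      /* slot offset in the heap type */
} demo_cmethoddef;

typedef struct {
    int header;         /* stands in for PyFlexTypeObject */
    demo_fn send;
    demo_fn receive;
} demo_heaptype;

/* #name stringifies; prefix##_##name pastes e.g. channel_send. */
#define DEMO_PUBLIC_ENTRY(type, prefix, name) \
    {#name, prefix##_##name, impl_##prefix##_##name, \
     wrap_##prefix##_##name, offsetof(type, name)}

static int impl_channel_send(int x) { return x + 1; }
static int wrap_channel_send(int x) { return -x; }
static int channel_send(int x) { return impl_channel_send(x); }

static int impl_channel_receive(int x) { return x + 2; }
static int wrap_channel_receive(int x) { return -2 * x; }
static int channel_receive(int x) { return impl_channel_receive(x); }

static const demo_cmethoddef demo_cmethods[] = {
    DEMO_PUBLIC_ENTRY(demo_heaptype, channel, send),
    DEMO_PUBLIC_ENTRY(demo_heaptype, channel, receive),
    {NULL, NULL, NULL, NULL, 0} /* sentinel */
};

/* Checks that the first entry expanded to the expected fields. */
static void demo_check(void)
{
    assert(strcmp(demo_cmethods[0].name, "send") == 0);
    assert(demo_cmethods[0].match == channel_send);
    assert(demo_cmethods[0].fast == impl_channel_send);
    assert(demo_cmethods[0].wrap == wrap_channel_send);
    assert(demo_cmethods[0].offset == offsetof(demo_heaptype, send));
}
```

One naming convention (prefix_name, impl_prefix_name, wrap_prefix_name) plus a struct member called name is all the macro needs, so each table row stays a single line.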
Here is an excerpt from channel_send:
"""
int
PyChannel_Send(PyChannelObject *self, PyObject *arg)
{
    PyChannel_HeapType *t = (PyChannel_HeapType *) self->ob_type;
    return t->send(self, arg);
}

static CHANNEL_SEND_HEAD(impl_channel_send)
{
    PyThreadState *ts = PyThreadState_GET();
    PyTaskletObject *sender, *receiver;
    .... implementation skipped ....
}

static CHANNEL_SEND_HEAD(wrap_channel_send)
{
    PyObject * ret = PyObject_CallMethod((PyObject *) self, "send", "(O)", arg);
    return slp_return_wrapper(ret);
}

static PyObject *
channel_send(PyObject *myself, PyObject *arg)
{
    if (impl_channel_send((PyChannelObject*)myself, arg))
        return NULL;
    Py_INCREF(Py_None);
    return Py_None;
}
"""
end of story.

Summary:
--------

Overridable methods have always been present in Python, via the built-in method slots. My extension methods give the same functionality to the user, at maximum possible speed (only templates can be faster). The benefit is that users get much more flexibility in C modules than before, without fear of speed loss. I believe that virtual methods will be used more often, since they are cheap, flexible and compatible with Python.

Please let me know if there is interest in using this technique in the Python core. I'm also not sure how to show the complete thing, since it is partially a patch to the existing type implementation (concerning the etype), partially a new C module flextype.c, and the rest is part of Stackless. Does it make sense (would somebody look at it) if I create a little demo application or something?

cheers - chris

--
Christian Tismer             :^)   mailto:tismer@tismer.com
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
     whom do you want to sponsor today?   http://www.stackless.com/
[Christian Tismer]
Hi Guido, py-dev,
preface: -------- a week ago or so, I sent a patch to Guido that removes the "etype" struct. This is a hidden structure that extends types when they are allocated on the heap. One restriction with this type was that types could not be extended by metatypes for some internal reason. I fixed this. Now meta-types can define extra slots for types.
I have never written a type or object in C, so bear with my newbie questions. Are you saying, Chris, that before you could not inherit a type written in C and override a method? Is this only in regards to the magic method slots or just any method?
Brett Cannon wrote:
[Christian Tismer]
Hi Guido, py-dev,
preface: -------- a week ago or so, I sent a patch to Guido that removes the "etype" struct. This is a hidden structure that extends types when they are allocated on the heap. One restriction with this type was that types could not be extended by metatypes for some internal reason. I fixed this. Now meta-types can define extra slots for types.
I have never written a type or object in C, so bear with my newbie questions. Are you saying, Chris, that before you could not inherit a type written in C and override a method? Is this only in regards to the magic method slots or just any method?
Sure you could. There just was no general interface to it. The magic method slots are already easy to override, assuming that you always call these via the type slots and don't call them directly.

For your own, non-magic methods, there was no support yet. Sure, you could override your methods, but you needed extra machinery to keep track of the methods, to find out which to call when, and so on. The proper way to store extra info about methods is to put this info into the type object itself. This was not possible before my patch. You could help yourself by extending some of the existing method tables, but this is hackish.

With my flextype stuff, you explicitly extend your type object with extra function pointers. Then you provide a table with your implementation and wrapper functions, and inheritance works by itself. That's what I was after.
From what I gather in your email, it seems like you came up with proper overriding inheritance in C for methods defined in a type. So does this mean you can now override the __contains__ magic slot in C code through some inherited type and this was not doable before? Perhaps an example of something from the Python core that was not possible before would solidify this for me.
I didn't care about the magic slots at all. I think they don't need to be changed, but I will have a look at it.

The difference with my dynamic methods is that the method tables are filled once, at the time when your type/class is created. After that, no further lookup is necessary. Methods which are not overridden are called with maximum possible speed. In order to support changes to the underlying classes *after* type creation, I will provide an extra type method that allows you to "re-bind" explicitly.

ciao - chris
[Christian Tismer] <snip>
For your own, non-magic methods, there was no support yet. Sure, you could override your methods, but you needed extra machinery to keep track of the methods, to find out which to call when, and so on. The proper way to store extra info about methods is to put this info into the type object itself. This was not possible before my patch. You could help yourself by extending some of the existing method tables, but this is hackish.
That sounds great. Anything to make coding C extensions easier.
I didn't care about the magic slots at all. I think they don't need to be changed, but I will have a look at it. <snip>
Part of the reason I asked about the magic slots is that I personally think it would be great if you didn't have to use the specific struct slots for magic slots but instead were called based on their name in Python. That way you would not have to view Include/object.h every time you wanted to use one of the magic methods; you could just add it just like any other method and just give it a Python name that matched its magic method name. The obvious drawback is you would lose compiler checking that the arguments were correct for the method. But wouldn't this simplify keeping binary-compatibility if it was used since the struct would be pruned down significantly? I don't know how much of a stumbling block this all is for newbies, but I know when I looked at extending sre's pattern objects to add a __contains__ method it took me a little while to find where the slot was and what all the macros were for. But that might also be because I didn't read the C extension docs and just dove in. =) -Brett
[Christian Tismer]
<snip>
For your own, non-magic methods, there was no support yet. Sure, you could override your methods, but you needed extra machinery to keep track of the methods, to find out which to call when, and so on. The proper way to store extra info about methods is to put this info into the type object itself. This was not possible before my patch. You could help yourself by extending some of the existing method tables, but this is hackish.
[Brett Cannon]
That sounds great. Anything to make coding C extensions easier.
Brett, may I politely suggest that you try writing C extensions first before claiming it needs to be made easier? Christian's additions (as far as I understand them :-) are mostly intended for very esoteric situations.
I didn't care about the magic slots at all. I think they don't need to be changed, but I will have a look at it. <snip>
Part of the reason I asked about the magic slots is that I personally think it would be great if you didn't have to use the specific struct slots for magic slots but instead were called based on their name in Python. That way you would not have to view Include/object.h every time you wanted to use one of the magic methods; you could just add it just like any other method and just give it a Python name that matched its magic method name. The obvious drawback is you would lose compiler checking that the arguments were correct for the method. But wouldn't this simplify keeping binary-compatibility if it was used since the struct would be pruned down significantly?
Alas, it would cause a major slowdown if this was the only way to provide heavily-used operations like __add__ and __getitem__. Most of the machinery to allow this probably already exists, but I wouldn't recommend using it. Also, you'd have to provide two implementations for binary operators, e.g. __add__ and __radd__.
I don't know how much of a stumbling block this all is for newbies, but I know when I looked at extending sre's pattern objects to add a __contains__ method it took me a little while to find where the slot was and what all the macros were for. But that might also be because I didn't read the C extension docs and just dove in. =)
You could've picked a simpler extension to try to modify. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote: ...
Christian's additions (as far as I understand them :-) are mostly intended for very esoteric situations.
My additions support a subset of C++ virtual methods. How is that esoteric?

ciao - chris
Brett Cannon wrote: ...
Part of the reason I asked about the magic slots is that I personally think it would be great if you didn't have to use the specific struct slots for magic slots but instead were called based on their name in Python. That way you would not have to view Include/object.h every time you wanted to use one of the magic methods; you could just add it just like any other method and just give it a Python name that matched its magic method name. The obvious drawback is you would lose compiler checking that the arguments were correct for the method.
No, vice versa. I *could* support any magic slot and put it into the extended type object with a Python name. And even better, this version could have full type checking, as my other methods have as well! This could go far beyond what we have now. My system is explicit about types: you repeat the whole function argument list in the newly grown slot. This is as type safe as can be.

Esoterically y'rs - chris
Christian Tismer writes:
Christian's additions (as far as I understand them :-) are mostly intended for very esoteric situations.
My additions support a subset of C++ virtual methods. How is that esoteric?
Why would an extension writer ever want to do this? "Normal" extension types either wrap some C type, so you don't have inheritance at all, or some C++ type, in which case a single type method can wrap arbitrary virtual methods (since the VMT is done in C++). A real-world example would help. Regards, Martin
My additions support a subset of C++ virtual methods. How is that esoteric?
Why would an extension writer ever want to do this? "Normal" extension types either wrap some C type, so you don't have inheritance at all, or some C++ type, in which case a single type method can wrap arbitrary virtual methods (since the VMT is done in C++).
I'm still in favor of a 'clean' method to add additional C accessible structure fields to types. Currently I'm attaching them to the type's dict, as I reported before. As I understand it, Christian's first patch allows this.

Thomas
From: "Martin v. Loewis"
Christian Tismer writes:
Christian's additions (as far as I understand them :-) are mostly intended for very esoteric situations.
My additions support a subset of C++ virtual methods. How is that esoteric?
Why would an extension writer ever want to do this? "Normal" extension types either wrap some C type, so you don't have inheritance at all, or some C++ type, in which case a single type method can wrap arbitrary virtual methods (since the VMT is done in C++).
A real-world example would help.
Well, I want to do something like this, and I think it's for a fairly simple reason. All of my (dynamically-generated) extension classes need a piece of data which tells them how much extra data to allocate in the variable-sized area of their instances. This is an implementation detail which I don't want to expose to users. Right now I have to stick it in the class' __dict__, which not only means that it's exposed, but that users can change it at will. It also costs me an extra lookup every time an instance of the extension class is allocated. It would be much nicer if I could get a little data area in the type object where I could stick this value, but right now there's no place to put it.

Chris' patch allows me to handle the issue much more naturally. It doesn't seem esoteric to add information to a type which doesn't live in its __dict__. Not being able to do so makes types very different from other objects.

-----------------------------------------------------------
David Abrahams * Boost Consulting
dave@boost-consulting.com * http://www.boost-consulting.com

Of course, that makes it esoteric by its very definition ;-)
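The "little data area in the type object" can be pictured with the usual C idiom of embedding the base struct as the first member of a larger one. The sketch below is only an illustration under that assumption: toy_type stands in for the interpreter's type object, and toy_ext_type, extra_bytes and the helper functions are hypothetical names, not part of Chris' patch or of the Python C API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the interpreter's type object. */
typedef struct {
    const char *name;
    size_t basicsize;    /* fixed part of every instance */
} toy_type;

/* Extended type: the base must be the first member so that a
 * toy_type pointer to it can be cast back to toy_ext_type. */
typedef struct {
    toy_type base;
    size_t extra_bytes;  /* hidden per-type value, not in any dict */
} toy_ext_type;

/* Assumes `type` really points at the base of a toy_ext_type. */
static size_t toy_instance_size(const toy_type *type)
{
    const toy_ext_type *ext = (const toy_ext_type *)type;
    return type->basicsize + ext->extra_bytes;
}

/* Allocation reads the value straight from the struct: no lookup. */
static void *toy_alloc_instance(const toy_type *type)
{
    return calloc(1, toy_instance_size(type));
}

/* Small self-check of the size computation. */
static void toy_selfcheck(void)
{
    toy_ext_type t = {{"vec", 16}, 24};
    assert(toy_instance_size(&t.base) == 40);
}
```

Because the value lives in the struct rather than in a dictionary, user code has no name through which to read or change it, and allocation needs no lookup; those are the two costs named above.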
From: "Thomas Heller"
You can (but you probably know this already) replace the type's tp_dict by a custom subclass of PyDict_Object, which adds additional fields.
I probably knew that once. Thanks for reminding me. When I have time...

-- David Abrahams
From: "David Abrahams"
All of my (dynamically-generated) extension classes need a piece of data which tells them how much extra data to allocate in the variable-sized area of their instances. This is an implementation detail which I don't want to
Not so different from what I need...
expose to users. Right now I have to stick it in the class' __dict__, which not only means that it's exposed, but that users can change it at will. It also costs me an extra lookup every time an instance of the extension class is allocated. It would be much nicer if I could get a little data area in the type object where I could stick this value, but right now there's no place to put it.
You can (but you probably know this already) replace the type's tp_dict by a custom subclass of PyDict_Object, which adds additional fields.
Chris' patch allows me to handle the issue much more naturally. It doesn't seem esoteric to add information to a type which doesn't live in its __dict__. Not being able to do so makes types very different from other objects.
Actually this is not specific to types - it is for all variable size objects. Thomas
David Abrahams wrote: <snip>
Chris' patch allows me to handle the issue much more naturally. It doesn't seem esoteric to add information to a type which doesn't live in its __dict__. Not being able to do so makes types very different from other objects.
----------------------------------------------------------- David Abrahams * Boost Consulting dave@boost-consulting.com * http://www.boost-consulting.com
Of course, that makes it esoteric by its very definition ;-)
Hee hee :-))

-- Christian Tismer
Thomas Heller wrote:
My additions support a subset of C++ virtual methods. How is that esoteric?
Why would an extension writer ever want to do this? "Normal" extension types either wrap some C type, so you don't have inheritance at all, or some C++ type, in which case a single type method can wrap arbitrary virtual methods (since the VMT is done in C++).
I'm still in favor of a 'clean' method to add additional C accessible structure fields to types. Currently I'm attaching them to the type's dict, as I reported before.
As I understand it, Christian's first patch allows this.
Please let me know when you're actually going to use it. I know there is a bug in the 2.3 patch. For Stackless, I'm still hacking against 2.2.1, and the patch has been extended in several ways as well: I removed the assumption that objects generated from heap types always need to be GC objects. This was probably decided with classes too much in mind, but now this feature also makes sense for simple types where you might want to avoid GC for space or other reasons.

ciao - chris
participants (6)
- Brett Cannon
- Christian Tismer
- David Abrahams
- Guido van Rossum
- martin@v.loewis.de
- Thomas Heller