Proposed: drop unnecessary "context" pointer from PyGetSetDef
I should have brought this up to python-dev before--sorry for being so slow. It's already in the tracker for a couple of days: http://bugs.python.org/issue5880 The idea: PyGetSetDef has this "void *closure" field that acts like a context pointer. You stick it in the PyGetSetDef, and it gets passed back to you when your getter or setter is called. It's a reasonable API design, but in practice you almost never need it. Meanwhile, it clutters up CPython, particularly typeobject.c; there are all these function calls that end with ", NULL);", just to satisfy the getter/setter prototype internally. Most of the time, the "closure" parameter is not only unused, it is skipped. PyGetSetDef definitions generally skip it, and often getter and setter implementations omit it. The "closure" was only actually *used* once in CPython, a silly use in Objects/longobject.c where it was abused as an integer value. And yes, I said "was": inspired by this discussion, Mark Dickinson removed this use in r72202 (trunk) and r72203 (py3k). So the "closure" field is now 100% unused in the python and py3k trunks. Mr. Dickinson also located an extension using the "closure" pointer, pyephem, which... *also* uses it to store an integer. Indeed, I have yet to see a use where someone stores a pointer in "closure". Anyone who needed functionality like this could roll it themselves with stub functions: PyObject *my_getter_with_context(PyObject *self, void *context) { /* ... */ } PyObject *my_getter_A(PyObject *self) { return my_getter_with_context(self, "A"); } PyObject *my_getter_B(PyObject *self) { return my_getter_with_context(self, "B"); } /* etc. */ (Although it'd make my example more realistic if "context" were an int!) So: you don't need it, it clutters up our code (particularly typeobject.c), and it adds overhead. The only good reason to keep it is backwards compatibility, which I admit is a fine reason. Whaddya think? To be honest I'd be surprised if you guys went for this. But I thought it was worth suggesting. /larry/
On Mon, May 4, 2009 at 10:10 AM, Larry Hastings
So: you don't need it, it clutters up our code (particularly typeobject.c), and it adds overhead. The only good reason to keep it is backwards compatibility, which I admit is a fine reason.
Presumably whoever added the context field had a reason for doing so. Does anyone remember what the intended use was? Trawling through the history, all I could find was this comment, attached to revision 23270: [Modified Thu Sep 20 21:45:26 2001 UTC (7 years, 7 months ago) by gvanrossum] """ Add optional docstrings to getset descriptors. Fortunately, there's no backwards compatibility to worry about, so I just pushed the 'closure' struct member to the back -- it's never used in the current code base (I may eliminate it, but that's more work because the getter and setter signatures would have to change.) """ Still, binary compatibility seems like a fairly strong reason not to remove the closure field. Mark
Mark Dickinson wrote:
Still, binary compatibility seems like a fairly strong reason not to remove the closure field.
My understanding is that there a) 2.x extension modules are not binary compatible with 3.x, and b) there are essentially no 3.x extension modules in the field. Is that accurate? If we don't have an installed base (yet) to worry about, now's the time to make this change. /larry/
Hi, Larry Hastings wrote:
Mark Dickinson wrote:
Still, binary compatibility seems like a fairly strong reason not to remove the closure field.
My understanding is that there a) 2.x extension modules are not binary compatible with 3.x, and b) there are essentially no 3.x extension modules in the field. Is that accurate? If we don't have an installed base (yet) to worry about, now's the time to make this change.
cx_Oracle at least uses this closure field, and has already been ported to 3.x: http://www.google.com/codesearch?q=Connection_SetOCIAttr+trunk -- Amaury Forgeot d'Arc
Amaury Forgeot d'Arc wrote:
Larry Hastings wrote:
My understanding is that there a) 2.x extension modules are not binary compatible with 3.x, and b) there are essentially no 3.x extension modules in the field. Is that accurate? If we don't have an installed base (yet) to worry about, now's the time to make this change.
cx_Oracle at least uses this closure field, and has already been ported to 3.x: http://www.google.com/codesearch?q=Connection_SetOCIAttr+trunk
And they're using it as a pointer, too! Nice to see it not abused for once. If it helps, I volunteer to port cx_Oracle to the new PyGetSetDef if my patch is accepted. The resulting code would be backwards-compatible with Python 3.0, so it could be incorporated immediately. Given the lack of interest in the proposal so far, this is an easy vow to make! /larry/
On Mon, May 4, 2009 at 4:10 AM, Larry Hastings
So: you don't need it, it clutters up our code (particularly typeobject.c), and it adds overhead. The only good reason to keep it is backwards compatibility, which I admit is a fine reason.
If you make the change, will 3rd party code that relies on it fail in unexpected ways, or will they just get a compile error? -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC http://stutzbachenterprises.com
On Mon, May 4, 2009 at 8:11 PM, Daniel Stutzbach
If you make the change, will 3rd party code that relies on it fail in unexpected ways, or will they just get a compile error?
I *think* that third party code that's recompiled for 3.1 and that doesn't use the closure field will either just work, or will produce an easily-fixed compile error. Larry, does this sound right? But I guess the bigger issue is that extensions already compiled against 3.0 that use PyGetSetDef (even if they don't make use of the closure field) won't work with 3.1 without a recompile: they'll segfault, or otherwise behave unpredictably. If that's not considered a problem, then surely we ought to be getting rid of tp_reserved? Mark
On Mon, May 4, 2009 at 3:00 PM, Mark Dickinson
But I guess the bigger issue is that extensions already compiled against 3.0 that use PyGetSetDef (even if they don't make use of the closure field) won't work with 3.1 without a recompile: they'll segfault, or otherwise behave unpredictably.
I was under the impression that binary compatibility was only guaranteed within a minor revision (e.g., 2.6.1 must run code compiled for 2.6.0, but 2.7.0 doesn't have to). I've been wrong before, though. Certainly the C extension module I maintain is sprinkled with #ifdef's so it will compile under 2.5, 2.6, and 3.0. ;-) -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC http://stutzbachenterprises.com
Mark Dickinson wrote:
I *think* that third party code that's recompiled for 3.1 and that doesn't use the closure field will either just work, or will produce an easily-fixed compile error. Larry, does this sound right?
Yep.
But I guess the bigger issue is that extensions already compiled against 3.0 that use PyGetSetDef (even if they don't make use of the closure field) won't work with 3.1 without a recompile: they'll segfault, or otherwise behave unpredictably.
Well, I think they'd work if they didn't use the closure and they had only one entry in their array of PyGetSetDefs. But more than one, and yes it would behave unpredictably. Probably segfault.
If that's not considered a problem, then surely we ought to be getting rid of tp_reserved?
In principle they are equivalent, but in practice removing tp_reserved is a much bigger change. Removing the closure field would result in obvious compile errors, and plenty of folks wouldn't even experience those. Removing tp_reserved would affect everybody, with inscrutable compiler errors. Personally I'd be up for removing tp_reserved. But I lack the caution regarding backwards compatibility that has served Python so well, so you're ill-advised to listen to me. Daniel Stutzbach wrote:
I was under the impression that binary compatibility was only guaranteed within a minor revision (e.g., 2.6.1 must run code compiled for 2.6.0, but 2.7.0 doesn't have to). I've been wrong before, though.
My understanding is that that's the explicit guarantee. However Python has been well-served by being much more cautious than that, a policy with which I cannot find fault.
Certainly the C extension module I maintain is sprinkled with #ifdef's so it will compile under 2.5, 2.6, and 3.0. ;-)
Happily this is one change where you could maintain backwards compatibility without #ifdefs. If you use the closure field, change your code to use stub functions and pass the closure data in yourself. /larry/
Larry Hastings wrote:
Removing tp_reserved would affect everybody, with inscrutable compiler errors.
This would have to be considered in conjunction with the proposed programmatic type-building API, I think. I'd like to see a migration towards something like that, BTW. Recently I had occasion to do some work on a Ruby extension module, and I was struck by how much more pleasant it was to be able to create a class and add a few functions to it using calls, rather than having to wrestle with a huge static struct declaration. While I like the Python language better than Ruby, I think Ruby's extension API is ahead in this particular area. -- Greg
Mark Dickinson
I *think* that third party code that's recompiled for 3.1 and that doesn't use the closure field will either just work, or will produce an easily-fixed compile error. Larry, does this sound right?
This doesn't sound right. The functions in the third party code will get compiled with the wrong signature, so they can crash (or behave unexpectedly) when called by Python.
On Mon, May 4, 2009 at 9:15 PM, Antoine Pitrou
Mark Dickinson
writes: I *think* that third party code that's recompiled for 3.1 and that doesn't use the closure field will either just work, or will produce an easily-fixed compile error. Larry, does this sound right?
This doesn't sound right. The functions in the third party code will get compiled with the wrong signature, so they can crash (or behave unexpectedly) when called by Python.
Yes, of course the signature of the getters and setters changes. Please ignore me. :-) Mark
Mark Dickinson wrote:
This doesn't sound right. The functions in the third party code will get compiled with the wrong signature, so they can crash (or behave unexpectedly) when called by Python.
Yes, of course the signature of the getters and setters changes. Please ignore me. :-)
If they don't use the closure field, then either they won't compile due to type mismatches or they'll work fine. There's a lot of code in CPython that didn't need to be changed for my remove-closure patch; the functions didn't bother taking the "void * closure" that they were going to ignore anyway, and then they cast the function pointer in the PyGetSetDef to make the compiler shut up. Worked fine. And, in nearly all cases, the static PyGetSetDefs omit the closure member, which means C initializes them with a 0. /larry/
On May 4, 2009, at 3:10 AM, Larry Hastings wrote:
I should have brought this up to python-dev before--sorry for being so slow. It's already in the tracker for a couple of days:
http://bugs.python.org/issue5880
The idea: PyGetSetDef has this "void *closure" field that acts like a context pointer. You stick it in the PyGetSetDef, and it gets passed back to you when your getter or setter is called. It's a reasonable API design, but in practice you almost never need it. Meanwhile, it clutters up CPython, particularly typeobject.c; there are all these function calls that end with ", NULL);", just to satisfy the getter/setter prototype internally.
I think this is an important feature, which allows you to define generic, reusable getter and setter functions and pass static metadata to them at runtime. Admittedly I have never needed the full pointer, my typical usage is to pass in an offset. I think this should only be removed if a suitable mechanism replaces it, if not it will require some needless duplication of code in extensions that use it (in particular my own) 8^) -Casey
Casey Duncan wrote:
I think this is an important feature, which allows you to define generic, reusable getter and setter functions and pass static metadata to them at runtime. Admittedly I have never needed the full pointer, my typical usage is to pass in an offset.
I think this should only be removed if a suitable mechanism replaces it, if not it will require some needless duplication of code in extensions that use it (in particular my own) 8^)
I disagree; I think it is a minor convenience feature, and one which encourages a lack of type safety. A suitable replacement mechanism already exists in C: static PyObject *generic_getter(PyObject *o, int context) { /* your generic code goes here */ } static PyObject *getter_with_context_1(o) { return generic_getter(o, 1); } static PyObject *getter_with_context_2(o) { return generic_getter(o, 2); } static PyObject *getter_with_context_3(o) { return generic_getter(o, 3); } You would then use "getter_with_context_1" &c in your PyGetSetDef. With a clever optimizing compiler this should result in no detectable slowdown or code bloat. However, you will be happy to learn there wasn't much support for this change, so it didn't make it into Python 3.1. /larry/
participants (7)
-
Amaury Forgeot d'Arc
-
Antoine Pitrou
-
Casey Duncan
-
Daniel Stutzbach
-
Greg Ewing
-
Larry Hastings
-
Mark Dickinson