Optional extra globals dict for function objects
I set out trying to redo the 3.0 autosuper metaclass in 2.5 without bytecode hacking and ran into a problem: a function's func_globals isn't polymorphic. That is, the interpreter uses PyDict_* calls to access it, and in one case (LOAD_GLOBAL), actually inlines PyDict_GetItem manually. If it weren't for this, I could have easily done 3.0 super without bytecode hacking, by making a custom dict that allows another dict to shadow it, and putting the new super object in the shadowing dict.
I know it's for performance, and that if func_globals were made polymorphic, it'd bring the pystone benchmark to its knees, begging for a quick and merciful death. That's not what I'm proposing.
I propose adding a read-only attribute func_extra_globals to the function object, default NULL. In the interpreter loop, global lookups try func_extra_globals first if it's not NULL. It's accessed using PyObject_* functions.
Here are the reasons I think this is a good idea:
- It should have near zero impact on performance in the general case because NULL checks are quick. There would be another attribute in the frame object (f_extra_globals), almost always NULL.
- Language enhancement prototypes that currently use bytecode hacking could be accomplished with a method wrapper and a func_extra_globals dict. The prototypes could be pure Python, and thus more general, less brittle, and easier to get right. Hacking closures is nasty business.
- I'm sure lots of other stuff that I can't think of, where it'd be nice to dynamically add information to a method or function that can be accessed as a variable. Pure-Python function preambles whose results can be seen by the original function would be pretty sweet.
- Because func_extra_globals would be read-only and default NULL, it'd almost always be obvious when it's getting messed with. A wrapper/decorator or a metaclass, and a call to types.FunctionType() would signal that.
- func_globals would almost never have to be overridden: for most purposes (besides security), shadowing it is actually better, as it leaves the function's module fully accessible.
Anybody else think it's awesome? :) How about opinions of major suckage?
If it helps acceptance, I'd be willing to make a patch for this. It looks pretty straightforward.
Neil
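For readers following the thread, the proposed lookup order can be modelled in pure Python. This is only a sketch: the helper name lookup_global is hypothetical, and the fall-through to the ordinary globals and then builtins when a name is missing from the extra mapping is an assumption, since the message does not spell that step out.

```python
import builtins


def lookup_global(name, extra_globals, func_globals):
    """Model the proposed order: extra globals, then globals, then builtins.

    `extra_globals` stands in for func_extra_globals; None plays the
    role of NULL. The fall-through on a miss is an assumption.
    """
    # Proposed new step: consult the optional mapping first, if present.
    # Any mapping works here, mirroring access through PyObject_* calls.
    if extra_globals is not None:
        try:
            return extra_globals[name]
        except KeyError:
            pass
    # Existing behaviour: the module's globals dict, then builtins.
    if name in func_globals:
        return func_globals[name]
    try:
        return getattr(builtins, name)
    except AttributeError:
        raise NameError(f"name {name!r} is not defined") from None


module_globals = {"x": 1}
extra = {"x": 2, "super": "auto-super stand-in"}

plain = lookup_global("x", None, module_globals)      # no extra mapping set
shadowed = lookup_global("x", extra, module_globals)  # extra mapping wins
```

In the common case the only added cost in this model is the `is not None` test, which corresponds to the NULL check mentioned in the first bullet.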
On Nov 17, 2007 11:27 AM, Neil Toronto <ntoronto@cs.byu.edu> wrote:
I set out trying to redo the 3.0 autosuper metaclass in 2.5 without bytecode hacking and ran into a problem: a function's func_globals isn't polymorphic. That is, the interpreter uses PyDict_* calls to access it, and in one case (LOAD_GLOBAL), actually inlines PyDict_GetItem manually. If it weren't for this, I could have easily done 3.0 super without bytecode hacking, by making a custom dict that allows another dict to shadow it, and putting the new super object in the shadowing dict.
I know it's for performance, and that if func_globals were made polymorphic, it'd bring the pystone benchmark to its knees, begging for a quick and merciful death. That's not what I'm proposing.
I propose adding a read-only attribute func_extra_globals to the function object, default NULL. In the interpreter loop, global lookups try func_extra_globals first if it's not NULL. It's accessed using PyObject_* functions.
My initial response is "eww". I say this as I don't want to complicate the scoping rules any more than they already are. This adds yet another place to check for things. While it might not be a nasty performance hit (although you neglect to say what happens if something is not found in func_extra_globals; do you check func_globals as well? That will carry a performance penalty), it does complicate semantics slightly.
Here are the reasons I think this is a good idea:
- It should have near zero impact on performance in the general case because NULL checks are quick. There would be another attribute in the frame object (f_extra_globals), almost always NULL.
That is only true if you skip a func_globals check if the func_extra_globals check doesn't happen.
- Language enhancement prototypes that currently use bytecode hacking could be accomplished with a method wrapper and a func_extra_globals dict. The prototypes could be pure Python, and thus more general, less brittle, and easier to get right. Hacking closures is nasty business.
Which are what? The auto-super example is not exactly common.
- I'm sure lots of other stuff that I can't think of, where it'd be nice to dynamically add information to a method or function that can be accessed as a variable. Pure-Python function preambles whose results can be seen by the original function would be pretty sweet.
Basing an idea on unknown potential is not a good reason to add something to the language. I don't think the Air Force needs to protect against flying pigs just because there is the possibility someone might genetically engineer some to carry nuclear bombs. =)
- Because func_extra_globals would be read-only and default NULL, it'd almost always be obvious when it's getting messed with. A wrapper/decorator or a metaclass, and a call to types.FunctionType() would signal that.
Read-only? Then how are you supposed to set this? Do you want to introduce something like __build_class__ for functions and methods? Requiring the use of types.FunctionType() will be a pain and dilute the usefulness.
- func_globals would almost never have to be overridden: for most purposes (besides security), shadowing it is actually better, as it leaves the function's module fully accessible.
If that's the case why worry about func_extra_globals? =) It solves 95% of the uses you might have (and I suspect 94% of the uses are "I don't need to muck with func_globals").
Anybody else think it's awesome? :) How about opinions of major suckage?
I'm -1 on the idea personally.
If it helps acceptance, I'd be willing to make a patch for this. It looks pretty straightforward.
It always helps acceptance, it's just a question of whether it will push it over the edge into actually being accepted.
-Brett
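As background for the types.FunctionType() point, here is a sketch of what is already possible without the proposal: rebuilding a function around a copied globals dict with extra names added. It uses the Python 3 attribute spellings (__globals__, __code__) rather than the 2.5 func_* names discussed in the thread, and with_shadowed_globals is a hypothetical helper, not an existing API.

```python
import types


def with_shadowed_globals(func, **extra):
    """Return a copy of func whose globals are a snapshot plus `extra`.

    Hypothetical helper. Keyword-only defaults and function attributes
    are not copied in this sketch.
    """
    merged = dict(func.__globals__)  # snapshot of the module's globals
    merged.update(extra)             # extra names shadow module-level ones
    return types.FunctionType(
        func.__code__,
        merged,
        func.__name__,
        func.__defaults__,
        func.__closure__,
    )


def greet():
    # 'greeting' is compiled as a global lookup, resolved at call time.
    return greeting + ", world"


hello = with_shadowed_globals(greet, greeting="hello")
message = hello()
```

Because the globals are copied, later changes to the module are not seen by the rebuilt function. That limitation is what shadowing, as opposed to overriding func_globals, is meant to avoid.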
On 11/17/07, Neil Toronto <ntoronto@cs.byu.edu> wrote:
I set out trying to redo the 3.0 autosuper metaclass in 2.5 without bytecode hacking and ran into a problem: a function's func_globals isn't polymorphic. That is, the interpreter uses PyDict_* calls to access it, and in one case (LOAD_GLOBAL), actually inlines PyDict_GetItem manually.
(1) Is this just one of the "this must be a real dict, not just any mapping" limits, or is there something else I'm missing?
(2) Isn't the func_globals already (a read-only reference to) the module's __dict__? So is this really about changing the promise of the module type, instead of just about func_globals?
Note that weakening the module.__dict__ promise to only meeting the dict API would make it easier to implement the various speed-up-globals suggestions.
And to be honest, I think that assuming a UserDict.DictMixin wouldn't be that bad. How often is a module's dict used for anything time-critical except get (and maybe set, delete, iterate)?
If it weren't for this, I could have easily done 3.0 super without bytecode hacking, by making a custom dict that allows another dict to shadow it, and putting the new super object in the shadowing dict.
...
I propose adding a read-only attribute func_extra_globals to the function object, default NULL. In the interpreter loop, global lookups try func_extra_globals first if it's not NULL.
Would this really be a global dict though, or just a closure inserted between the func and the normal globals? Is the real problem that you can't change which variables are in a closure (rather than fully global) after the function is compiled?
-jJ
Jim Jewett wrote:
On 11/17/07, Neil Toronto <ntoronto@cs.byu.edu> wrote:
I set out trying to redo the 3.0 autosuper metaclass in 2.5 without bytecode hacking and ran into a problem: a function's func_globals isn't polymorphic. That is, the interpreter uses PyDict_* calls to access it, and in one case (LOAD_GLOBAL), actually inlines PyDict_GetItem manually.
(1) Is this just one of the "this must be a real dict, not just any mapping" limits, or is there something else I'm missing?
That's all it is, yes.
(2) Isn't the func_globals already (a read-only reference to) the module's __dict__? So is this really about changing the promise of the module type, instead of just about func_globals?
My original question was about extending (with an optional dictionary) the behavior of a function with regard to its func_globals. Because of speed concerns, I didn't suggest weakening the type constraint to allow just anything that meets the dict API.
Note that weakening the module.__dict__ promise to only meeting the dict API would make it easier to implement the various speed-up-globals suggestions.
By "implement" do you mean proof-of-concept, final, or both? At least for proof-of-concept, I totally agree. And thanks for the use case (which sort of applies to my original flawed idea), my lack of which Brett has raked me over the coals for. :) (But it didn't hurt much!)
And to be honest, I think that assuming a UserDict.DictMixin wouldn't be that bad. How often is a module's dict used for anything time-critical except get (and maybe set, delete, iterate)?
I doubt that delete and iterate are common enough that they'd have to be regarded as time-critical. Maybe set - maybe. It hardly happens (especially compared to get), and when it does, it's almost never in a time-critical inner loop. DictMixin is currently pure Python. That's a speed concern that wouldn't be *too* hard to address, I suppose.
I propose adding a read-only attribute func_extra_globals to the function object, default NULL. In the interpreter loop, global lookups try func_extra_globals first if it's not NULL.
Would this really be a global dict though, or just a closure inserted between the func and the normal globals?
Basically a customizable closure, yeah.
Is the real problem that you can't change which variables are in a closure (rather than fully global) after the function is compiled?
Really, that's it. That's why I made the silly bytecode hack to insert function parameters, which actually works better than augmenting a function's globals with a polymorphic dict. Assuming func_globals is a DictMixin is intriguing, though.
Neil
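The point about closures being fixed at compile time can be seen by inspecting a code object. This is a small illustrative sketch using Python 3 attribute names; the function and variable names are made up for the example.

```python
def outer():
    captured = "from closure"

    def inner():
        # 'captured' is compiled as a closure (free) variable;
        # 'some_global' is compiled as a global lookup.
        return captured, some_global

    return inner


inner = outer()

free_names = inner.__code__.co_freevars    # names bound through the closure
global_names = inner.__code__.co_names     # names looked up as globals/builtins
cell_value = inner.__closure__[0].cell_contents
```

Which names land in co_freevars is decided when the function is compiled, so adding a new closure variable afterwards means building a new code object, which is the kind of bytecode hacking the thread is trying to avoid.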
participants (3)
- Brett Cannon
- Jim Jewett
- Neil Toronto