[Python-ideas] __builtins__ behavior and... the FUTURE!
Guido van Rossum
guido at python.org
Mon Nov 26 18:46:44 CET 2007
The semantics of __builtins__ are an implementation detail used for
sandboxing, and assignment to __builtins__ is not supported. Alas, I
can't quite figure out what you're after; your post doesn't start with
a clear problem statement, so I'm not even sure if this is helpful
information. I just hope to discourage you from trying to change the
semantics of __builtins__. In 3.0, __builtins__ may well be renamed.
--Guido
On Nov 24, 2007 4:41 AM, Neil Toronto <ntoronto at cs.byu.edu> wrote:
> I'd post this on Python-dev, but it has more to do with the future of
> Python, and it directly impacts the fairly-well-received Python-idea I'm
> working on right now.
>
> The current behavior has persisted since revision 9877, nine years ago:
>
> http://svn.python.org/view?rev=9877&view=rev
>
> "Vladimir Marangozov' performance hack: copy f_builtins from ancestor
> if the globals are the same."
>
> A variant of the behavior has persisted since the age of the dinosaurs,
> as far as I can tell - or at least ever since Python had stack frames.
>
> Here's how the globals/builtins lookup is currently presented as working:
>
> 1. If 'name' is in globals, return globals['name']
> 2. Return globals['__builtins__']['name']
>
> Glossing over a lot of details, here's how it *actually* worked before
> the performance hack:
>
>   0. A code object gets executed, which creates a stack frame. It
>      sets frame.builtins = globals['__builtins__'].
>   While executing the code:
>   1. If 'name' is in globals, return globals['name'].
>   2. Otherwise return frame.builtins['name'].
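
The pre-hack steps quoted above can be imitated with a toy model. This is only a sketch: `Frame`, `lookup`, `module_globals`, `stale` and `fresh` are illustrative names, not CPython's real frame machinery, and it is written in Python 3 syntax even though the thread concerns Python 2.

```python
class Frame:
    """Toy stand-in for an interpreter frame (not CPython's real frame object)."""

    def __init__(self, globals_dict):
        self.globals = globals_dict
        # Step 0: snapshot the builtins mapping once, at frame creation.
        self.builtins = globals_dict['__builtins__']

    def lookup(self, name):
        # Step 1: globals win if the name is present there.
        if name in self.globals:
            return self.globals[name]
        # Step 2: otherwise fall back to the snapshot taken in step 0.
        return self.builtins[name]


module_globals = {'__builtins__': {'len': len}}
module_frame = Frame(module_globals)  # plays the role of the script's frame

# Rebinding __builtins__ after the frame exists does not touch its snapshot.
module_globals['__builtins__'] = {'len': lambda x: 1}

stale = module_frame.lookup('len')([1, 2, 3])           # frame created earlier
fresh = Frame(module_globals).lookup('len')([1, 2, 3])  # frame created afterwards
print(stale, fresh)
```

In this model the already-running frame keeps answering 3, while a frame created after the reassignment answers 1.
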
>
> A problem example, which is still a problem today:
>
>     __builtins__ = {'len': lambda x: 1}
>     print len([1, 2, 3])
>     # prints:
>     #   '3' when run as a script
>     #   '1' in interactive mode
>
> If running as a script or part of an import, the module's frame caches
> builtins, so it doesn't matter that it gets reassigned. When 'len' is
> looked up for the print statement, it's looked up in the cached version.
> But in interactive mode, each statement is executed in its own frame, so
> it doesn't have this problem.
>
> Well, at least module *functions* will run in their own frames, so
> they'll see the new builtins, right? But here's how it works now, after
> the performance hack:
>
>   0. A code object gets executed, which creates a stack frame.
>      a. If the stack frame has a parent (think "call site") and
>         the parent has the same globals, it sets
>         frame.builtins = parent.builtins.
>      b. Otherwise it sets frame.builtins = globals['__builtins__'].
>   While executing the code:
>   1. If 'name' is in globals, return globals['name'].
>   2. Otherwise return frame.builtins['name'].
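
The post-hack rule, with its inheritance from the parent frame, can be sketched the same way. Again this is a toy model under assumed names (`Frame`, `call_len`, `other_frame`, and so on are hypothetical), not the interpreter's actual code, and it uses Python 3 syntax.

```python
class Frame:
    """Toy frame that follows steps 0a/0b from the quoted description."""

    def __init__(self, globals_dict, parent=None):
        self.globals = globals_dict
        if parent is not None and parent.globals is globals_dict:
            # Step 0a: same globals as the call site, so inherit its snapshot.
            self.builtins = parent.builtins
        else:
            # Step 0b: no parent, or different globals: look it up afresh.
            self.builtins = globals_dict['__builtins__']

    def lookup(self, name):
        if name in self.globals:
            return self.globals[name]
        return self.builtins[name]


def call_len(frame):
    """Resolve 'len' through the given frame and apply it to a 3-item list."""
    return frame.lookup('len')([1, 2, 3])


module_globals = {'__builtins__': {'len': len}}
module_frame = Frame(module_globals)                    # the script's own frame
module_globals['__builtins__'] = {'len': lambda x: 1}   # reassigned afterwards

# A caller living in some other module has its own, different globals dict.
other_frame = Frame({'__builtins__': {'len': len}})

internal_call = call_len(Frame(module_globals, parent=module_frame))
external_call = call_len(Frame(module_globals, parent=other_frame))
print(internal_call, external_call)
```

The call made from inside the module inherits the stale snapshot and yields 3; the same function entered from a frame with different globals recomputes the mapping and yields 1.
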
>
> A problem example:
>
>     __builtins__ = {'len': lambda x: 1}
>     def f(): print len([1, 2, 3])
>     f()
>     # prints:
>     #   '3' when run as a script
>     #   '1' in interactive mode
>
>
> At the call site "f()", frame.builtins is the original, cached builtins.
> Before the hack, f()'s frame would have recalculated and re-cached it.
> After the hack, f()'s frame inherits the cached version. But this only
> happens in a script, which runs its code in a single frame. If you try
> this in interactive mode, you'll get correct behavior.
>
> If function calls stay within a module, builtins is effectively frozen
> at the value it had when the module started execution. But if outside
> modules call those same functions, builtins will have its new value!
> That could be bad:
>
>     import my_extra_special_builtins as __builtins__
>
>     <define extra-special library functions that use new builtins>
>
>     def run_tests_on_extra_special_functions():
>         <tests, etc.>
>
>     if __name__ == '__main__':
>         run_tests_on_extra_special_functions()
>
> The special library functions work, but the tests don't. The special
> builtins module only shows up when functions are called from outside
> modules (where the call sites have different globals) and the functions'
> frames are forced to recalculate builtins rather than inheriting it.
> Here are some ways around the problem:
>
> 1. Put all the tests in a different module.
> 2. Use a unit testing framework, which will call the module
> functions from outside the module.
> 3. Call functions using exec with custom globals.
> 4. Replace functions using types.FunctionType with custom globals.
>
> #3 and #4 are decidedly unlikely. :) #1 is generally discouraged (AFAIK)
> if not annoying, and #2 is encouraged.
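
For reference, workarounds #3 and #4 might look roughly like the following in present-day Python 3. This is a hedged sketch: `fake_builtins`, `count` and `rebound` are made-up names, and the `types.FunctionType` part relies on how CPython picks up `__builtins__` from the supplied globals, which is best treated as an implementation detail.

```python
import types

# Hypothetical replacement builtins: a 'len' that always answers 1.
fake_builtins = {'len': lambda x: 1}

# Workaround 3: exec with a globals dict that carries its own __builtins__.
ns = {'__builtins__': fake_builtins}
exec("result = len([1, 2, 3])", ns)
exec_result = ns['result']

# Workaround 4: rebuild an existing function over custom globals.
def count(items):
    return len(items)

rebound = types.FunctionType(count.__code__, {'__builtins__': fake_builtins})
rebound_result = rebound([1, 2, 3])

# The original function still resolves 'len' through the normal builtins.
normal_result = count([1, 2, 3])
print(exec_result, rebound_result, normal_result)
```

Both the exec'd code and the rebuilt function see the replacement `len`, while the untouched `count` keeps using the real one.
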
>
> In the last thread on __builtins__ vs. __builtin__, back in March, it
> seemed that Guido was open to new ideas for Python 3.0 on the subject.
> Well, keeping in mind this strange behavior and the length of time it's
> gone on, here's my recommendation:
>
> Kill __builtins__. Take it out of the module dict. Let LOAD_GLOBAL
> look in "builtins" (currently "__builtin__") for names after it
> checks globals. If modules want to hack at builtins, they can
> import it. But they hack it globally or not at all.
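
The "hack it globally" half of this recommendation can be illustrated with the module that Python 3 exposes as `builtins`. A minimal sketch, assuming a made-up name `shout_demo` that exists only for this example:

```python
import builtins


def greet():
    # 'shout_demo' is not defined in this module's globals, so the lookup
    # falls through to the builtins module.
    return shout_demo('hi')


# Install the name globally: every module in the process can now see it.
builtins.shout_demo = lambda s: s.upper()
try:
    result = greet()
finally:
    # Undo the global change so the example leaves no trace behind.
    del builtins.shout_demo

print(result)
```

The change is visible process-wide for as long as the attribute exists, which is exactly the "globally or not at all" trade-off described above.
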
>
> I honestly can't think of a use case you can handle by replacing a
> module's __builtins__ that can't be handled without. If there is one,
> nobody actually does it, because we would have heard them screaming in
> agony and banging their heads against the walls from thousands of miles
> away by now. You just can't do it reliably as of February 1998.
>
> The regression test suite doesn't even touch things like this. It only
> goes as far as injecting stuff into __builtin__.
>
> Finally, on to my practical problem.
>
> I'm working on the fast globals stuff, which is how I got onto this
> subject in the first place. Here are a few of my options:
>
> 1. I can make __builtins__ work like it was always supposed to, at
> the cost of decreased performance and extra complexity. It would
> still be much faster than it is now, though.
> 2. Status quo: I can make __builtins__ work like it does now. I
> think I can do this, anyway. It's actually more complex than #1,
> and very likely slower. I would rather not take this route.
> 3. For a given function, I can freeze __builtins__ at the value it
> was at when the function was defined.
> 4. I can make it work like I suggested for Python 3.0, but make
> __builtin__ automatically available to modules as __builtins__.
>
> With or without it, I should be posting my patch for fast globals soon.
> No, don't look at me like that. I'm serious!
>
> Wondering-what-to-do-ly,
> Neil
--
--Guido van Rossum (home page: http://www.python.org/~guido/)