[Python-ideas] __builtins__ behavior and... the FUTURE!

Guido van Rossum guido at python.org
Mon Nov 26 18:46:44 CET 2007


The semantics of __builtins__ are an implementation detail used for
sandboxing, and assignment to __builtins__ is not supported. Alas, I
can't quite figure out what you're after; your post doesn't start with
a clear problem statement, so I'm not even sure if this is helpful
information. I just hope to discourage you from trying to change the
semantics of __builtins__. In 3.0, __builtins__ may well be renamed.

--Guido

On Nov 24, 2007 4:41 AM, Neil Toronto <ntoronto at cs.byu.edu> wrote:
> I'd post this on Python-dev, but it has more to do with the future of
> Python, and it directly impacts the fairly-well-received Python-idea I'm
> working on right now.
>
> The current behavior has persisted since revision 9877, nine years ago:
>
> http://svn.python.org/view?rev=9877&view=rev
>
> "Vladimir Marangozov' performance hack: copy f_builtins from ancestor
> if the globals are the same."
>
> A variant of the behavior has persisted since the age of the dinosaurs,
> as far as I can tell - or at least ever since Python had stack frames.
>
> Here's how the globals/builtins lookup is currently presented as working:
>
>      1. If 'name' is in globals, return globals['name']
>      2. Return globals['__builtins__']['name']
>
> Glossing over a lot of details, here's how it *actually* worked before
> the performance hack:
>
>      0. A code object gets executed, which creates a stack frame. It
>         sets frame.builtins = globals['__builtins__'].
>      While executing the code:
>      1. If 'name' is in globals, return globals['name'].
>      2. Otherwise return frame.builtins['name'].
>
> A problem example, which is still a problem today:
>
>      __builtins__ = {'len': lambda x: 1}
>      print len([1, 2, 3])
>      # prints:
>      #   '3' when run as a script
>      #   '1' in interactive mode
>
> If running as a script or part of an import, the module's frame caches
> builtins, so it doesn't matter that it gets reassigned. When 'len' is
> looked up for the print statement, it's looked up in the cached version.
> But in interactive mode, each statement is executed in its own frame, so
> it doesn't have this problem.
>
> Well, at least module *functions* will run in their own frames, so
> they'll see the new builtins, right? But here's how it works now, after
> the performance hack:
>
>      0. A code object gets executed, which creates a stack frame.
>         a. If the stack frame has a parent (think "call site") and
>           the parent has the same globals, it sets
>           frame.builtins = parent.builtins.
>         b. Otherwise it sets frame.builtins = globals['__builtins__'].
>      While executing the code:
>      1. If 'name' is in globals, return globals['name'].
>      2. Otherwise return frame.builtins['name'].
>
> A problem example:
>
>      __builtins__ = {'len': lambda x: 1}
>      def f(): print len([1, 2, 3])
>      f()
>      # prints:
>      #   '3' when run as a script
>      #   '1' in interactive mode
>
>
> At the call site "f()", frame.builtins is the original, cached builtins.
> Before the hack, f()'s frame would have recalculated and re-cached it.
> After the hack, f()'s frame inherits the cached version. But this only
> happens in a script, which runs its code in a single frame. If you try
> this in interactive mode, you'll get correct behavior.
>
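
The frame rules quoted above can be modelled in a few lines. This is a toy sketch in Python 3 syntax, not CPython's actual implementation: the Frame class, its inherit flag, and the run_script and run_interactive helpers are all invented here for illustration.

```python
# Toy model of the quoted frame rules; names here are hypothetical.
class Frame:
    def __init__(self, globals_, parent=None, inherit=True):
        self.globals = globals_
        if inherit and parent is not None and parent.globals is globals_:
            # Step 0a: same globals as the call site, so reuse its cache.
            self.builtins = parent.builtins
        else:
            # Step 0b: read __builtins__ from globals once, at frame creation.
            self.builtins = globals_["__builtins__"]

    def lookup(self, name):
        # Steps 1 and 2: globals first, then the frame's cached builtins.
        if name in self.globals:
            return self.globals[name]
        return self.builtins[name]


def run_script(inherit):
    """Model a script: one long-lived module frame that later calls f()."""
    g = {"__builtins__": {"len": len}}
    module_frame = Frame(g)                    # caches the original builtins
    g["__builtins__"] = {"len": lambda x: 1}   # the reassignment
    f_frame = Frame(g, parent=module_frame, inherit=inherit)
    return f_frame.lookup("len")([1, 2, 3])


def run_interactive():
    """Model interactive mode: every statement gets a fresh top-level frame."""
    g = {"__builtins__": {"len": len}}
    g["__builtins__"] = {"len": lambda x: 1}   # done by an earlier statement
    statement_frame = Frame(g)                 # new frame for the "f()" line
    f_frame = Frame(g, parent=statement_frame)
    return f_frame.lookup("len")([1, 2, 3])


before_hack = run_script(inherit=False)   # f() re-reads __builtins__
after_hack = run_script(inherit=True)     # f() inherits the stale cache
print(before_hack, after_hack, run_interactive())
```

In this model the script case yields 1 without inheritance and 3 with it, while the interactive case yields 1, which matches the discrepancy described in the quoted example.
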
> If function calls stay within a module, builtins is effectively frozen
> at the value it had when the module started execution. But if outside
> modules call those same functions, builtins will have its new value!
> That could be bad:
>
>      import my_extra_special_builtins as __builtins__
>
>      <define extra-special library functions that use new builtins>
>
>      def run_tests_on_extra_special_functions():
>          <tests, etc.>
>
>      if __name__ == '__main__':
>          run_tests_on_extra_special_functions()
>
> The special library functions work, but the tests don't. The special
> builtins module only shows up when functions are called from outside
> modules (where the call sites have different globals) and the functions'
> frames are forced to recalculate builtins rather than inheriting it.
> Here are some ways around the problem:
>
>      1. Put all the tests in a different module.
>      2. Use a unit testing framework, which will call the module
>         functions from outside the module.
>      3. Call functions using exec with custom globals.
>      4. Replace functions using types.FunctionType with custom globals.
>
> #3 and #4 are decidedly unlikely. :) #1 is generally discouraged (AFAIK)
> if not annoying, and #2 is encouraged.
>
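
Workarounds 3 and 4 from the list above might look roughly like the following. This is a sketch in Python 3 syntax, checked only against how current CPython 3 treats a "__builtins__" key in a globals dict; the names custom_builtins, count, and rebound are made up for the example, and 2.x behavior at the time of this thread may differ in detail.

```python
import types

# Replacement builtins for the example (hypothetical).
custom_builtins = {"len": lambda x: 1}

# Workaround 3: exec with globals that carry their own __builtins__.
ns = {"__builtins__": custom_builtins}
exec("result = len([1, 2, 3])", ns)
exec_result = ns["result"]

# Workaround 4: rebuild a function around different globals.
def count(items):
    return len(items)

rebound = types.FunctionType(
    count.__code__,
    {"__builtins__": custom_builtins},  # new globals for the copy
    "count",
)

print(exec_result, rebound([1, 2, 3]), count([1, 2, 3]))
```

Both routes hand the code a globals dict that differs from the caller's, so builtins are looked up through the replacement rather than through whatever the calling frame had cached.
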
> In the last thread on __builtins__ vs. __builtin__, back in March, it
> seemed that Guido was open to new ideas for Python 3.0 on the subject.
> Well, keeping in mind this strange behavior and the length of time it's
> gone on, here's my recommendation:
>
>      Kill __builtins__. Take it out of the module dict. Let LOAD_GLOBAL
>      look in "builtins" (currently "__builtin__") for names after it
>      checks globals. If modules want to hack at builtins, they can
>      import it. But they hack it globally or not at all.
>
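
As an illustration of "hack it globally or not at all": in Python 3 the module is importable as builtins (it is __builtin__ in 2.x), and a name added there becomes visible from every namespace. The shout helper below is a hypothetical name used only for this sketch.

```python
import builtins

# Install a new global builtin (hypothetical helper, for illustration).
builtins.shout = lambda s: s.upper() + "!"

# An unrelated, freshly created namespace sees the new name too.
ns = {}
exec("greeting = shout('hello')", ns)
greeting = ns["greeting"]

# Clean up so the example leaves the interpreter as it found it.
del builtins.shout

print(greeting)
```

The change is process-wide by construction, so there is no per-module state for a frame to cache incorrectly.
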
> I honestly can't think of a use case you can handle by replacing a
> module's __builtins__ that can't be handled without. If there is one,
> nobody actually does it, because we would have heard them screaming in
> agony and banging their heads against the walls from thousands of miles
> away by now. You just haven't been able to do it reliably since
> February 1998.
>
> The regression test suite doesn't even touch things like this. It only
> goes as far as injecting stuff into __builtin__.
>
> Finally, on to my practical problem.
>
> I'm working on the fast globals stuff, which is how I got onto this
> subject in the first place. Here are a few of my options:
>
>      1. I can make __builtins__ work like it was always supposed to, at
>         the cost of decreased performance and extra complexity. It would
>         still be much faster than it is now, though.
>      2. Status quo: I can make __builtins__ work like it does now. I
>         think I can do this, anyway. It's actually more complex than #1,
>         and very likely slower. I would rather not take this route.
>      3. For a given function, I can freeze __builtins__ at the value it
>         was at when the function was defined.
>      4. I can make it work like I suggested for Python 3.0, but make
>         __builtin__ automatically available to modules as __builtins__.
>
> With or without it, I should be posting my patch for fast globals soon.
> No, don't look at me like that. I'm serious!
>
> Wondering-what-to-do-ly,
> Neil
>



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


