[Python-Dev] Is outlawing-nested-import-* only an implementation issue?

Guido van Rossum guido@digicool.com
Thu, 22 Feb 2001 22:31:36 -0500


> Hi all -- i've been reading the enormous thread on nested scopes
> with some concern, since i would very much like Python to support
> "proper" lexical scoping, yet i also care about compatibility.

Note that this is moot now -- see my previous post about how we've
decided to resolve this using a magical import to enable nested scopes
(in 2.1).

> There is something missing from my understanding here:
> 
>     - The model is, each environment has a pointer to the
>       enclosing environment, right?

Actually, no.

>     - Whenever you can't find what you're looking for, you
>       go up to the next level and keep looking, right?

That depends.  Our model is inspired by the semantics of locals in
Python 2.0 and before, and this all happens at compile time.  That
means that we must be able to know which names are defined in each
scope at compile time.
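
A minimal sketch of what "known at compile time" looks like in practice, assuming a present-day CPython 3 interpreter (the function names outer and inner are made up for illustration). The compiler records, per code object, which names are local, which are free, and which are shared with nested functions:

```python
def outer():
    a = 1              # assigned here, so 'a' belongs to outer's scope

    def inner():
        b = a + 1      # 'a' is never assigned in inner: a free variable
        return b

    return inner


inner = outer()

# These tuples are filled in by the compiler, before any call happens.
inner_locals = inner.__code__.co_varnames    # names local to inner
inner_free = inner.__code__.co_freevars      # names taken from an enclosing scope
outer_cells = outer.__code__.co_cellvars     # outer's names used by nested functions
```

Nothing here is looked up by walking a chain of environments at run time; each name's scope was decided when the def was compiled.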

>     - So what's the issue with not being able to determine
>       which variable binds in which scope?  With the model
>       just described, it's perfectly clear.  Is all this
>       breakage only caused by the particular optimizations
>       for lookup in the implementation (fast locals, etc.)?
>       Or have i missed something obvious?

You call it an optimization, and that's how it started.  But since it
clearly affects the semantics of the language, it's not really an
optimization -- it's a particular semantics that lends itself to more
and easier compile-time analysis and hence can be implemented more
efficiently, but the corner cases are different, and the language
semantics define what should happen, optimization or not.  In
particular:

    x = 1
    def f():
        print x
        x = 2

raises an UnboundLocalError at the point of the print statement.
Likewise, in the official semantics of nested scopes:

    x = 1
    def f():
        def g():
            print x
        g()
        x = 2

also raises an UnboundLocalError at the print statement.
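
Both cases can be checked with a short sketch, assuming Python 3 syntax (print as a function) rather than the 2.x syntax used in this post. The helper error_from and the name h are hypothetical, added only so both functions can live in one file. One caveat, stated as an observation about current CPython 3 and not about 2.1: the nested case is reported there as a plain NameError, of which UnboundLocalError is a subclass.

```python
x = 1


def f():
    print(x)       # 'x' is local to f because of the assignment below
    x = 2


def h():
    def g():
        print(x)   # 'x' resolves to h's scope, never the module's
    g()
    x = 2


def error_from(func):
    """Call func and return the exception it raises, or None."""
    try:
        func()
    except NameError as exc:   # also catches UnboundLocalError
        return exc
    return None


flat_error = error_from(f)
nested_error = error_from(h)
```

In neither case does the module-level x = 1 get printed, because the compiler has already decided that x belongs to the function scope.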

> I could probably go examine the source code of the nested scoping
> changes to find the answer to my own question, but in case others
> share this confusion with me, i thought it would be worth asking.

No need to go to the source -- this is all clearly explained in the
PEP (http://python.sourceforge.net/peps/pep-0227.html).

>                         *       *       *
> 
> Consider for a moment the following simple model of lookup:
> 
>     1. A scope maps names to objects.
> 
>     2. Each scope except the topmost also points to a parent scope.
> 
>     3. To look up a name, first ask the current scope.
> 
>     4. When lookup fails, go up to the parent scope and keep looking.
> 
> I believe the above rules are common among many languages and are
> commonly understood.

Actually, most languages do all this at compile time.  Very early
Python versions did do all this at run time, but by the time 1.0 was
released, the "locals are locals" rule was firmly in place.  You may
like the purely dynamic version better, but it's been outlawed long
ago.
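
For comparison, the purely dynamic model in the quoted rules 1 to 4 can be sketched as a small class. This is only an illustration of that model under assumed names (Scope, bind, lookup); it is not how any Python release resolves names.

```python
class Scope:
    """Rule 1: a scope maps names to objects."""

    def __init__(self, parent=None):
        self.names = {}
        self.parent = parent       # rule 2: every scope but the topmost has a parent

    def bind(self, name, value):
        self.names[name] = value

    def lookup(self, name):
        scope = self
        while scope is not None:   # rule 3: ask the current scope first
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent   # rule 4: on failure, go up and keep looking
        raise NameError(name)


module = Scope()
module.bind("y", 3)
func = Scope(parent=module)

before = func.lookup("y")          # found in the parent scope
func.bind("y", 1)
after = func.lookup("y")           # now found in the function scope
outer_after = module.lookup("y")   # the parent binding is untouched
```

Under this model the same name in the same function body refers to two different scopes depending on when it is evaluated, which is exactly what the "locals are locals" rule forbids.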

> The only Python-specific parts are then:
> 
>     5. The current scope is determined by the nearest enclosing 'def'.

For most purposes, 'class' also creates a scope.

>     6. These statements put a binding into the current scope:
>        assignment (=), def, class, for, except, import
> 
> And that's all.

Sure.

>                         *       *       *
> 
> Given this model, all of the scoping questions that have been
> raised have completely clear answers:
> 
>     Example I
> 
>     >>> y = 3
>     >>> def f():
>     ...     print y
>     ...
>     >>> f()
>     3

Sure.

>     Example II
> 
>     >>> y = 3
>     >>> def f():
>     ...     print y
>     ...     y = 1
>     ...     print y
>     ...
>     >>> f()
>     3
>     1
>     >>> y
>     3

You didn't try this, did you?  Or do you intend to say that it
"should" print this?  In fact it raises UnboundLocalError: local
variable 'y' referenced before assignment.  (Before 2.0 it would raise
NameError.)

>     Example III
> 
>     >>> y = 3
>     >>> def f():
>     ...     exec "y = 2"
>     ...     def g():
>     ...         return y
>     ...     return g()
>     ...
>     >>> f()
>     2

Wrong again.  This prints 3, both without and with nested scopes as
defined in 2.1a2.  However it raises an exception with the current CVS
version: SyntaxError: f: exec or 'import *' makes names ambiguous in
nested scope.
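
A rough Python 3 rendering of the quoted Example III, offered as a sketch and not as a statement about 2.1 itself: exec is a function there, and the compiler cannot see the assignment hidden inside the string, so g's reference to y is compiled as a global lookup.

```python
y = 3


def f():
    exec("y = 2")      # the assignment is invisible to the compiler
    def g():
        return y       # compiled as a lookup of the module-level 'y'
    return g()


result = f()
```

Run as a script, result ends up holding the module-level value, matching the "prints 3" behaviour described above.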

>     Example IV
> 
>     >>> m = open('foo.py', 'w')
>     >>> m.write('x = 1')
>     >>> m.close()
>     >>> def f():
>     ...     x = 3
>     ...     from foo import *
>     ...     def g():
>     ...         print x
>     ...     g()
>     ...
>     >>> f()
>     1

I didn't try this one, but I'm sure that it prints 3 in 2.1a1 and
raises the same SyntaxError as above with the current CVS version.

> In Example II, the model addresses even the current situation
> that sometimes surprises new users of Python.  Examples III and IV
> are the current issues of contention about nested scopes.
> 
>                         *       *       *
> 
> It's good to start with a simple model for the user to understand;
> the implementation can then do funky optimizations under the covers
> so long as the model is preserved.  So for example, if the compiler
> sees that there is no "import *" or "exec" in a particular scope it
> can short-circuit the lookup of local variables using fast locals.
> But the ability of the compiler to make this optimization should only
> affect performance, not affect the Python language model.

Too late.  The semantics have been bent since 1.0 or before.  The flow
analysis needed to optimize this in such a way that the user can't
tell whether this is optimized or not is too hard for the current
compiler.  The fully dynamic model also allows the user to play all
sorts of stupid tricks.  And the unoptimized code is so much slower
that it's well worth having the optimization.

> The model described above is approximately the one available in
> Scheme.  It exactly reflects the environment-diagram model of scoping
> as taught to most Scheme students and i would argue that it is the
> easiest to explain.

I don't know Scheme, but isn't it supposed to be a compiled language?

> Some implementations of Scheme, such as STk, do what is described
> above.  UMB scheme does what Python does now: the use-before-binding
> of 'y' in Example II would cause an error.  I was surprised that
> these gave different behaviours; it turns out that the Scheme
> standard actually forbids the use of internal defines not at the
> beginning of a function body, thus sidestepping the issue.

I'm not sure how you can say that Scheme sidesteps the issue when you
just quote an example where Scheme implementations differ?

> But we
> can't do this in Python; assignment must be allowed anywhere.
> 
> Given that internal assignment has to have some meaning, the above
> meaning makes the most sense to me.

Sorry.  Sometimes, reality bites. :-)

Note that I want to take more of the dynamism out of function
bodies.  The reference manual has for a long time outlawed import *
inside functions (but the implementation didn't enforce this).  I see
no good reason to allow this (it's causing a lot of work to happen
each time the function is called), and the needs of being able to
clearly define what happens with nested scopes make it necessary to
outlaw it.
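
Python 3 compilers do enforce this restriction. The sketch below, which assumes Python 3 and uses the os module purely as a stand-in, compiles a function body containing import * without running it and records whether the compiler rejects it:

```python
source = (
    "def f():\n"
    "    from os import *\n"
)

try:
    compile(source, "<example>", "exec")
    rejected = False
    message = None
except SyntaxError as exc:
    rejected = True
    message = exc.msg   # the compiler's explanation, kept for inspection
```

The error is raised while compiling, before f is ever called, which is the kind of compile-time decision argued for here.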

I also want to eventually completely outlaw exec without an 'in'
clause inside a class or function, and access to local variables
through locals() or vars().  I'm not sure yet about exec without an
'in' clause at the global level, but I'm tempted to think that even
there it's not much use.
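
What the explicit-namespace style looks like, sketched in Python 3 syntax, where the 2.x form "exec code in namespace" is spelled exec(code, namespace). The names namespace and f are illustrative. The second half shows, for current CPython 3, that writing through locals() does not change a function's real local variable:

```python
# exec with an explicit namespace: the binding lands in a dict we control.
namespace = {}
exec("y = 2", namespace)
value = namespace["y"]


def f():
    y = 1
    locals()["y"] = 99   # modifies a dict, not the local variable itself
    return y


unchanged = f()
```

With the namespace passed explicitly, the compiler never has to guess which names a function body might gain at run time.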

We'll start with warnings for some of these cases in 2.1.

I see that Tim posted another rebuttal, explaining better than I do
here *why* Ping's "simple" model is not good for Python, so I'll stop
now.

--Guido van Rossum (home page: http://www.python.org/~guido/)