[Python-Dev] Visibility scope for "for/while/if" statements
Josiah Carlson
jcarlson at uci.edu
Thu Sep 22 17:34:26 CEST 2005
Alexander Myodov <maa_public at sinn.ru> wrote:
> Hello,
>
> Don't want to be importunate annoyingly asking the things probably
> trivial for experienced community, but need to ask it anyway, after
> spending about two hours trying to find well-camouflaged error caused
> by it.
In the future you should test your assumptions before writing software
with them.
> Why the variables defined inside "for"/"while"/"if" statements
> (including loop variables for "for") are visible outside this scope?
if and while statements don't define variables, so they can't expose
them later in the scope.
With regard to 'for' loops: they have always done that, there is code
that relies on that behavior, and changing the behavior would
unnecessarily break code.
As for list comprehensions, they were literally meant to be a
completely equivalent translation of a set of for loops. That is:
    x = []
    for i in foo:
        if f(i):
            x.append(i)
could be translated into
    x = [i for i in foo if f(i)]
and one would get the exact same side effects, including the 'leaking'
of the most recently bound i into the local scope. The leakage was not
accidental.
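
A minimal sketch of that equivalence, assuming a hypothetical predicate f
and a hypothetical iterable foo (neither is defined in this message). It
checks that both forms build the same list and that the for-loop variable
is still bound afterwards. Note that the leak from the list comprehension
itself is specific to the Python 2 releases discussed here; Python 3 gives
comprehensions their own scope, so the sketch uses a separate name j there.

```python
def f(i):
    # Hypothetical predicate standing in for the f(i) used above.
    return i % 2 == 0

foo = range(6)  # hypothetical input data

# Explicit loop form.
x_loop = []
for i in foo:
    if f(i):
        x_loop.append(i)

# The loop variable is still bound to the last item after the loop.
last_i = i

# List comprehension form; builds the same list.
x_comp = [j for j in foo if f(j)]
```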
> Yes, I am aware of one use case for this... er, feature.
> It could be useful to remember the last item of the loop after the loop
> is done... sometimes. But what is it for in other cases, except
> confusing the programmer?
There is a great reason: there generally exist two namespaces in Python,
locals and globals. You can get a representation of locals by using
locals() (your changes to the dictionary won't change the local scope),
and get a reference to the globals dictionary by globals(). There is
also __builtins__, but you shouldn't be playing with that unless you
know what you are doing. These namespaces offer you access to other
namespaces (class, module, etc.).
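
The difference between the two calls can be checked with a short sketch.
The names demo, snapshot and created_via_globals are made up for
illustration. Inside a function, writing to the dictionary returned by
locals() leaves the real local variable alone, while writing to the
dictionary returned by globals() really does create a module-level name.

```python
def demo():
    value = 1
    snapshot = locals()      # a representation of the local names
    snapshot['value'] = 99   # changes the dictionary only
    return value             # the real local variable is untouched

result = demo()

# globals() returns a reference to the actual module namespace,
# so this assignment creates a real global name.
globals()['created_via_globals'] = 'hello'
```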
In most cases (the body of a function or method), the local scope is
defined as an array with offset lookups:
    >>> import dis
    >>> def foo(n):
    ...     n + 1
    ...
    >>> dis.dis(foo)
      2           0 LOAD_FAST                0 (n)    <- look at this opcode
                  3 LOAD_CONST               1 (1)
                  6 BINARY_ADD
                  7 POP_TOP
                  8 LOAD_CONST               0 (None)
                 11 RETURN_VALUE
    >>>
As a result, this bytecode executes significantly faster than it would
if it had to reference a dictionary (like globals):
    >>> dis.dis(compile('n+1', '_', 'single'))
      1           0 LOAD_NAME                0 (n)    <- compare with this
                  3 LOAD_CONST               0 (1)
                  6 BINARY_ADD
                  7 PRINT_EXPR
                  8 LOAD_CONST               1 (None)
                 11 RETURN_VALUE
Since the LOAD_NAME opcode does a dictionary lookup, it is necessarily
slower than an array + offset lookup.
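
The same comparison can be made programmatically. This is a sketch for
Python 3, where the exact opcode names and offsets differ from the listings
above and vary between releases, so it only collects the LOAD_* opcode
names instead of assuming a full listing. The helper load_opnames is a
made-up name.

```python
import dis

def foo(n):
    return n + 1

def load_opnames(code):
    # Collect the names of all LOAD_* opcodes in a function or code object.
    return [ins.opname for ins in dis.get_instructions(code)
            if ins.opname.startswith('LOAD_')]

# Inside a function, 'n' is a fast local (array + offset lookup).
function_loads = load_opnames(foo)

# Compiled outside a function, 'n' is looked up by name in a dictionary.
module_loads = load_opnames(compile('n + 1', '_', 'single'))
```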
Further, you should clarify what you would want this mythical non-leaky
for loop to do in various cases. What would happen in the following
case?
    i = None
    for i in ...:
        ...
    print i

... assuming that the loop executed more than once? Would you always
get 'None' printed, or would you get the last value bound to 'i'?
What about:
    x = None
    for i in ...:
        x = f(i)
        ...
    print x
Would 'x' be kept after the loop finished executing?
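
For reference, here is what the existing semantics do in both cases. A
small sketch, with a hypothetical f and a made-up wrapper function
after_loop so both names can be inspected once the loop is done:

```python
def f(item):
    # Hypothetical stand-in for the f(i) used above.
    return item * 2

def after_loop(items):
    i = None
    x = None
    for i in items:
        x = f(i)
    # Both names are plain locals of the function, so both survive the loop.
    return i, x

non_empty = after_loop([1, 2, 3])
empty = after_loop([])
```

With a non-empty sequence, i holds the last item and x the last result.
With an empty one, the loop body never runs and both keep their initial
None.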
I would imagine in your current code you are running something like
this:
    i = #some important value
    ... #many lines of code
    for i in ...:
        ...
    #you access the 'i' you bound a long time ago
In this case, you are making a common new programmer mistake by using
the same variable name for two disparate concepts. If the 'i' that was
initially bound was important through the lifetime of the scope, you
should have named it differently than the 'i' that was merely a loop
variable.
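
A runnable sketch of the mistake and the fix. The function and variable
names are invented for illustration:

```python
def clobbered():
    i = 'important value'
    for i in range(3):   # rebinds the same name
        pass
    return i             # no longer the important value

def distinct_names():
    important = 'important value'
    for index in range(3):   # the loop variable has its own name
        pass
    return important         # still intact
```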
I will also point out that in general, not leaking/exposing/etc. such
variables would necessarily slow down Python. Why? Because it would
necessitate nesting the concept of pseudo-namespaces. Now, when a
function is compiled, nearly every local name is known, and they can be
made fast (in the Python sense). If one were to nest pseudo namespaces,
one would necessarily have another namespace lookup for every nested for
loop. More specifically, accessing 'foo' in these three different
nestings would take different amounts of time.
    for i in xrange(10):
        x = foo

    for i in xrange(10):
        for j in xrange(10):
            x = foo

    for i in xrange(10):
        for j in xrange(10):
            for k in xrange(10):
                x = foo
And in the case that you do or don't want 'x' to 'leak' into the
surrounding scope, you either take the speed hit again, or break quite a
bit of code and be inconsistent.
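
To make the cost concrete, here is a toy model of such nested
pseudo-namespaces. It is not how CPython works or how anyone has proposed
implementing this; the function lookup and the list-of-dictionaries
representation are assumptions made purely for illustration. Each loop
level gets its own dictionary, and a name is searched from the innermost
level outwards.

```python
def lookup(scopes, name):
    # Search innermost-first; return (value, number of dictionaries probed).
    for probes, scope in enumerate(reversed(scopes), start=1):
        if name in scope:
            return scope[name], probes
    raise NameError(name)

function_scope = {'foo': 'value'}

# One extra pseudo-namespace per nested loop.
one_loop = [function_scope, {'i': 0}]
two_loops = one_loop + [{'j': 0}]
three_loops = two_loops + [{'k': 0}]

probes_needed = [lookup(scopes, 'foo')[1]
                 for scopes in (one_loop, two_loops, three_loops)]
```

In this model, reaching 'foo' takes one more dictionary probe for every
additional level of nesting, where the current scheme needs a single array
lookup regardless of depth.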
> Or maybe can someone hint me whether I can somehow switch the behaviour on
> source-level to avoid keeping the variables outside the statements?
No.
> Something like Perlish "import strict"? I couldn't find it myself.
> While global change of Python to the variables local to statements and
> list comprehension could definitely affect too much programs, adding
> it on a per-source level would keep the compatibility while making
> the life of Python programmers safer.
Python semantics seem to have been following the rule of "we are all
adults here". If your assumptions have caused you bugs, then you should
realize that your assumptions should have been tested before they were
relied upon. That is what the interactive Python console is for.
Generally though, Python follows a common C semantic of variable leakage.
C Code:
    int i; // required in some versions of C
    for (i=0;i<10;i++) {
        ...
    }
Equivalent Python:
    for i in xrange(10):
        ...
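In both languages 'i' remains bound after the loop. The final values are
not identical, though: as long as those loops don't have 'break' (or goto
in the case of C) in them, C leaves 'i' at 10, because the last increment
still runs, while Python leaves 'i' bound to 9, the last item produced.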
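
Testing that comparison in the console is worthwhile, because the final
values are not identical. A minimal sketch, written for Python 3 (range
instead of xrange), where c_style_for is a hypothetical hand translation of
the C loop into a while loop:

```python
def python_for():
    for i in range(10):
        pass
    return i   # last item produced by range(10)

def c_style_for():
    # Hand translation of: for (i=0;i<10;i++) { ... }
    i = 0
    while i < 10:
        i += 1   # the increment that ends the loop still happens
    return i
```

The name survives the loop in both forms; only the final value differs.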
Again: test your assumptions. If your untested assumptions are wrong,
don't complain that the language is broken.
Also: python-dev is a mailing list for the development /of/ Python.
Since your questions as of late have been in the realm of "why does
or doesn't Python do this?", you should go to python-list (or the
equivalent comp.lang.python newsgroup) for answers to questions
regarding current Python behavior, and why Python did or didn't do
something in its past.
- Josiah