[Python-Dev] replacing 'global'

Alex Martelli aleaxit at yahoo.com
Tue Oct 28 03:56:34 EST 2003

On Tuesday 28 October 2003 03:55 am, Guido van Rossum wrote:
> > If we adopt a method of nonlocal assignment that allows the
> > deprecation of "global", then we have a chance to change this,
> > if we think that such "at-a-distance" rules are undesirable
> > in general.
> >
> > Do we think that?
> Alex certainly seems to be arguing this, but I think it's a lost cause.

I must have some Don Quixote in my blood.  Ah, can anybody point
me to the nearest windmill, please...?-)

Seriously, I realize by now that I stand no chance of affecting your
decision in this matter.  Nevertheless, and that attitude may indeed
be quixotical, I still have (just barely) enough energy not to let your
explanation of your likely coming decision stand as if it were OK with
me, or as if I had no good response to your arguments.  If it's a lost
cause, I think it's because I'm not being very good at marshaling the
arguments for it, not because those arguments are weak.  So,
basically for the record, here goes, once more...

> Even Alex will have to accept the long-distance effect of
>   def f():
>       x = 42
>       .
>       . (hundreds of lines of unrelated code)
>       .
>       print x

I have absolutely no problem with that -- except that it's bad style, but
the language cannot, in general, force good style.  The language can
and should ALLOW good style, but enforcing it is not always possible.

In (old) C, there was often no alternative to putting a declaration far
away from the code that used the variable, because declarations had
to come at block start.  Sometimes you could enclose declaration and
use in a nested sub-block, but not always.  C++ and modern C have
removed this wart by letting declarations come at any point before the
variable is used, and _encouraging_ (stylistically -- no enforcement)
the declaration to come always together with the initialization.  That's
about all a language can be expected to do in this regard: not forbid
"action at a distance" (that would be too confining), but _allow_ and
_encourage_ most programs to avoid it.

Python is and always has been just as good or even better: there being
no separate declaration, you _always_ have the equivalent of it "at the
first initialization" (as C++ and modern C encourage but can't enforce),
and it's perfectly natural in most cases to keep that close to the region
in a function where the name is of interest, if that region comprises only
a subset of the function's body.
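For illustration -- a small sketch, with made-up data -- of keeping the
first binding right next to the region that uses the name:

```python
def summarize(amounts):
    # ... unrelated processing could go on for a while up here ...

    # the first binding of 'total' sits right next to the only region
    # that cares about it -- the Python equivalent of declaring a
    # C++ variable at its point of initialization
    total = 0
    for amount in amounts:
        total += amount
    return total

assert summarize([1, 2, 3]) == 6
```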

But this, to some extent, is a red herring.  "Reading" (accessing) the
value referred to by a name looks the name up by rules I mostly _like_,
even though it is quite possible that the name was set "far away".  As
AMK suggests in his "Python warts" essay, people don't often get in
trouble with that because _most_ global (module-level, and even more
built-in) names are NOT re-bound dynamically.  So, when I see, e.g.,
    print len(phonebook)
it's most often fine that phonebook is global, just as it's fine that len
is built-in (it may be argued that we have "too many" built-in names,
and similarly that having "too many" global names is not a good thing,
but having SOME such names is just fine, and indeed inevitable --
perhaps Python may remedy the "too many built-ins" in 3.0, and any
programmer can refactor his own code to alleviate the "too many
globals" -- no deep problem here, in either case).
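(A concrete sketch, with a hypothetical phonebook, of how merely
_reading_ a name looks it up outward through the scopes:)

```python
phonebook = {"alice": "555-0100", "bob": "555-0199"}   # made-up data

def count_entries():
    # 'phonebook' is found among the module's globals, 'len' among
    # the builtins; neither needs any declaration just to be read
    return len(phonebook)

assert count_entries() == 2
```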

Re-binding names is different.  It's far rarer than accessing them, of
course.  And while all uses of "print x" mean (semantics equivalent to)
"look x up in the locals, then if not found there in outer scopes, then
if not found there in the globals, then if not found there in the builtins" --
a single, reasonably simple and uniform rule, independent from any
"purely declarative statement", which just determines where the value
will come from -- the situation for "x=42" is currently different.  It's a
rarer situation than just accessing x; it's _more_ important to know
where x will be bound, because that will affect its future lifetime --
which we don't particularly care about when we're just accessing it, but
is more important when we're setting it; _and_ (alas!) it's affected
by a _possible_, purely-declarative, instruction-to-the-compiler "global"
statement SOMEwhere.  "Normally", "x=42" binds or rebinds x
locally.  That's the common case, as rebinding nonlocals is rare.
It's therefore a little trap that, some small percentage of the time, we
are instead rebinding a nonlocal _with no nearby reminder of the fact_.
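(Concretely -- the very same statement, "x = 42", with and without the
possibly-far-away declaration:)

```python
x = "module-level"

def binds_locally():
    x = 42        # binds a brand-new local x; the global is untouched
    return x

def rebinds_global():
    global x      # the purely-declarative statement, possibly far away
    x = 42        # the same assignment now rebinds the module-level x
    return x

binds_locally()
assert x == "module-level"   # no effect at a distance...
rebinds_global()
assert x == 42               # ...until 'global' changes the semantics
```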

No "nearby reminder" is really needed for the _more common_ case 
of _accessing_ a name -- partly because "where is this being accessed 
from" is often less crucial (while it IS crucial when _binding_ the name),
partly because it's totally common and expected that the "just access"
may be doing lookup in other namespaces (indeed, when I write len(x),
it's the rare case where len HAS been rebound that may be a trap!-).
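(That rare trap, sketched: a function's use of len resolves at call
time, so rebinding the name at module level changes the function's
behavior from afar:)

```python
def phonebook_size(entries):
    return len(entries)      # 'len' is looked up afresh at each call

assert phonebook_size([1, 2, 3]) == 3    # the builtin, as expected
len = lambda entries: -1                 # the rare, trappy rebinding
assert phonebook_size([1, 2, 3]) == -1   # same code, new behavior
del len                                  # the builtin shows through again
assert phonebook_size([1, 2, 3]) == 3
```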

> And at some point in the future Python *will* grow (optional) type
> declarations for all sorts of things (arguments, local variables,
> instance variables) and those will certainly have effect at a
> distance.

Can we focus on the locals?  Argument passing, and setting attributes
of objects with e.g. "x.y = z" notation, are already subject to rather
different rules than setting bare names, e.g. "x.y = z" might perfectly
well be calling a property setter x.setY(z) or x.__setattr__('y', z), so
I don't think refining those potentially-subtle rules will be a problem,
nor that the situation is parallel to "global".

However, optional type declarations for local variables might surely
be (both problems and parallel:-), depending on roughly what you
have in mind for that.  E.g., are you thinking, syntax sugar apart, of
some new statement "constrain_type" which might go something like:

def f():
    constrain_type(int) x, y, z, t
    x = 23      # ok
    y = 2.3     # ??? a
    z = "23"    # ??? b
    t = "foo"   # raise subclass of (TypeError ?)

If so, what semantics do you have in mind for cases a and b?  I can
imagine either an implicit int() call around the RHS (which is why I
guess the assignment to t would fail, though I don't know whether it
would fail with a type or value error), or an implicit isinstance
check, in which case a and b would also fail (and then no doubt with
a type error).
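(The two readings, sketched as plain functions -- constrain_type itself
being purely hypothetical:)

```python
def coercing_bind(value, T):
    # reading (a): an implicit T(...) call around the RHS
    return T(value)

def checking_bind(value, T):
    # reading (b): an implicit isinstance check, no conversion
    if not isinstance(value, T):
        raise TypeError("%r is not an instance of %s" % (value, T.__name__))
    return value

assert coercing_bind(2.3, int) == 2     # case a: silently truncated
assert coercing_bind("23", int) == 23   # case b: int("23") happens to work
# coercing_bind("foo", int) raises ValueError -- not TypeError! --
# exactly the type-or-value-error doubt raised above
# checking_bind(2.3, int) raises TypeError under the stricter reading
```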

I may be weird, but -- offhand, and not having had time to reflect
on this in depth -- it seems to me that having assignment to bare
names 'fail' in some circumstances, while revolutionary in Python,
would not be particularly troublesome in the "action at a distance"
sense.  After all the constrain_type would have the specific purpose
of forbidding some assignments that would otherwise succeed, would
be used specifically for that, and making "wrong" assignment fail
immediately and noisily would be exactly what it's for.  I may not
think it a GOOD idea to introduce it (for local variables), but if
I argued against it it would not be on the lines of "one can't tell
by just looking at y=2.3 whether it succeeds or fails".

If the concept is to make y=2.3 implicitly do y=int(2.3) I would
be much more worried.  THEN, with no clear indication to the
contrary, we'd have "y=2.3" leave y with a value of 2.3, or 2,
or maybe something else for sufficiently weird values of X in a
"constrain_type(X) y" -- the semantics of a CORRECT program would
suddenly grow subtle dependencies on "nonlocal" ``declarations''.
So, if THAT is your intention -- and indeed that would be far closer
to the way "global" works: it doesn't FORBID assignments, rather
it changes their semantics -- then I admit the parallel is indeed
strict, and I would be worried on basically the same grounds as
I'm grumbling about 'global' and its planned extensions.

Yes, I realize this seems to be arguing _against_ adaptation --
surely if we had "constrain_type(X) y", and set "y = z", we might
like an implicit "y = adapt(z, X)" to be the assignment's semantics?
My answer (again, this is a first-blush reaction, haven't thought
deeply about the issues) is that adaptation is good, but implicit
rather than explicit is ungood, and I'm not sure the good is
stronger than the ungood here; AND, adaptation is not typecasting:
e.g. y=adapt("23", int) should NOT succeed.  So, while I might be
more intrigued than horrified by such novel suggestions, I would
surely see the risks in them -- and some of the risks I'd see WOULD
be about "lack of local indication of nonobvious semantics shift".

Just like with 'global', yes.
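(A toy stand-in for a PEP 246-style adapt(), just to make the
adaptation-is-not-typecasting point concrete:)

```python
def adapt(obj, protocol):
    # toy sketch: objects already satisfying the protocol pass
    # through untouched; there is NO cross-type conversion
    if isinstance(obj, protocol):
        return obj
    raise TypeError("cannot adapt %r to %s" % (obj, protocol.__name__))

assert adapt(23, int) == 23    # already an int: passes through
# adapt("23", int) raises TypeError -- adaptation is not int("23")
```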

