strange behavior....

Sat Nov 13 18:28:36 EST 2010

Steven D'Aprano <steve at REMOVE-THIS-cybersource.com.au> writes:

> On Sat, 13 Nov 2010 20:01:42 +0000, Mark Wooding wrote:
> > Some object types are primitive, provided by the runtime system;
> > there are no `internal' variables to be assigned in these cases.
>
> You seem to be making up your own terminology here, or at least using 
> terminology that isn't normally used in the languages I'm used to.

I was attempting to define possibly unfamiliar terms as I went along.
Did you not notice?

> If you mean "primitive" in the sense of "built-in", your conclusion is 
> wrong. Such "primitive" types include dicts and lists,

Yes, those are precisely the ones I was thinking of.

> which are sophisticated container objects with internal "variables" to
> be assigned to:

They have internal state.  I don't think that internal state consists of
(Python) variables, though.  This is why I brought up the possibility of
a pure Python implementation, maintaining its state by assigning to
variables captured using closures.

> > There's a qualitative difference here: simple assignment has semantics
> > defined by the language and provided by the implementation, and can
> > therefore be understood in isolation, using only lexically apparent
> > information; whereas the complex assignments are implemented by invoking
> > methods on the objects mentioned on the left hand side.
>
> Again, you're using unfamiliar terminology.

I'm sorry.  I thought I defined those terms recently.  (It appears not.
Sorry.)  By `simple assignment' I meant an assignment in which the
target is an identifier (to use the actual terminology from the manual);
by `complex assignment', one with any other kind of target.

> "Simple" and "complex" assignment -- I can guess you mean "name
> binding" = simple and "any other reference binding" = complex in
> Python terminology:

Again: assignment is not binding.  See the explanation below.

> The thing is, the semantics of assigning to built-in components are
> equally defined by the language as the semantics of name binding.

Did you read what I said?

> You are right to point out that for arbitrary types:
>
> x[2] = 5
>
> need not be an assignment, 

Good.  This was /precisely/ what I was pointing out: determining
/whether/ the value named by an identifier inhabits one of the built-in
types is, in general, hard.

> The distinction between the two counts as an important proviso to the
> discussion: x[2] = 5 is only an assignment if the type of x is
> non-pathological and not broken.

The latter is not an assignment: it's a disguised method call.

> But putting aside such pathological cases, I don't think you can justify 
> distinguishing between "simple" and "complex" assignment.

To conflate them is to confuse two different levels of meaning.  Simple
assignments occur because the language is hard-wired that way; complex
assignments are disguised method calls which often mutate values.
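
As a minimal sketch of the `disguised method call' point: the class
below is hypothetical (not from this thread), and its __setitem__
deliberately stores nothing, so `x[2] = 5' leaves no item behind.

```python
class Recorder:
    """Hypothetical type whose item 'assignment' stores no items."""

    def __init__(self):
        self.calls = []

    def __setitem__(self, key, value):
        # Invoked for `obj[key] = value`; free to do anything at all.
        self.calls.append((key, value))


x = Recorder()
x[2] = 5                        # dispatches to type(x).__setitem__(x, 2, 5)
Recorder.__setitem__(x, 3, 7)   # the same kind of call, written out

y = [0, 0, 0]
y[2] = 5                        # the built-in list does store the value
```

Nothing at the site of `x[2] = 5' says which behaviour you get; that
depends on the type of whatever x currently refers to.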

> Names are only one type of reference in Python, not the only one, and
> assignments other than name-binding can be fully understood from the
> language semantics *provided you know the type is a built-in*.

> >> Assignment *always* binds an object to a target.
> > 
> > No!  Assignment /never/ binds.
>
> A shocking claim that requires more explanation. If it doesn't bind,
> what does it do?

Duh!  It assigns.  You're not usually this slow.  Fortunately I explain
below.

> > There is syntactic confusion here too, since Python interprets a
> > simple assignment in a function body -- in the absence of a
> > declaration such as `global' to the contrary -- as indicating that
> > the name in question should be bound to a fresh variable on
> > entry to the function.
>
> Well, there seems to be some confusion here, but I don't think it's 
> ours... local variables in functions aren't bound on *entry* to the 
> function. How could they be? They are bound *when the assignment is 
> executed*, which may be never -- hence it is possible to get an 
> UnboundLocalError exception, if you try to retrieve the value of a local 
> which hasn't yet had a value bound to it.

The exception name perpetuates the misunderstanding, alas; but it's
traditional, from Lisp, to say that a variable is `unbound' if it
contains no value.
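
A small sketch of the behaviour under discussion (function names are
illustrative).  It shows only that the name is already treated as local
before the assignment has run; it doesn't settle which word best
describes that.

```python
x = "global"


def reads_global():
    # No assignment to x in this body, so x refers to the global.
    return x


def reads_local_too_early():
    try:
        value = x               # x is local here: it is assigned below
    except UnboundLocalError:
        value = "unbound"       # the local exists but holds no value yet
    x = "local"
    return value, x
```

The local x shows up in reads_local_too_early.__code__.co_varnames
without the function ever being called.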

> > But assignment itself doesn't perform binding.  (This is a
> > persistent error in the Python community; or, less charitably, the
> > Python community gratuitously uses the word in a different sense
> > from the wider programming-language-theory community.  See Lisp
> > literature passim, for example.)
>
> *Less* charitably? I'm sorry, you think that being *wrong* is better
> than being *different*? That's not a moral judgment I can agree with.

Being wrong is perhaps justifiable, and is rectifiable by learning.
Being gratuitously different in such a case is to intentionally do a
disservice to those coming from or going to other communities where they
encounter more conventional uses for the terms in question.

> > There's a two step mapping: names -> storage locations -> values.
> > Binding affects the left hand part of the mapping; assignment affects
> > the right hand part.
>
> That gratuitously conflates the implementation ("storage locations")

There isn't a common term for that concept.  I chose `storage location'
because it's the term I remember seeing in the denotational semantics
for Scheme.  The phrase `bound to fresh locations' occurs frequently
elsewhere in the Scheme report.  It seemed apt.  Choose some other term
if you please; I merely wanted a term to describe a concept.

> with the interface. Objects in Python have no storage location, 

And now you're confusing `storage locations' (whatever you choose to
call them) with values (or `objects'), when my point was precisely that
the two are different.
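
One way to see the two halves of the mapping separately, as a sketch
with made-up function names: a loop repeatedly assigns to one variable,
whereas each function call binds its parameter to a fresh one.

```python
def shared_location():
    getters = []
    for i in range(3):          # every iteration assigns to the same i
        getters.append(lambda: i)
    return [g() for g in getters]


def fresh_locations():
    def make_getter(j):         # every call binds j to a fresh variable
        return lambda: j
    return [g() for g in [make_getter(i) for i in range(3)]]
```

shared_location() returns [2, 2, 2], because all three closures see the
last value assigned to the single i; fresh_locations() returns
[0, 1, 2], because each closure captured its own j.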

> There is nothing you can write in pure Python that can tell whether the 
> storage location of an object has changed

Indeed.  Several locations may contain the same value.  I remembered
that the term `reference' was controversial so I avoided using it. ;-)

> or even whether "storage location" is a well-defined concept.

I disagree; though it can be a little subtle.  Consider:

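        def make_cell(x):
          # x is the cell's storage location, shared by getter and setter.
          def s(y):
            nonlocal x
            x = y
          return (lambda: x), s

        def copy_cell(c):
          # Wrap an existing cell in new functions over the same location.
          g, s = c
          return (lambda: g()), (lambda y: s(y))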

Given two such cells (g, s) and (gg, ss), we can determine whether they
use the same storage location or not, despite possibly having been
constructed using copy_cell, perhaps one from the other, or perhaps
(indirectly) from a common proper ancestor.

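        def same_location_p(g, s, gg, ss):
          fresh = object()        # a value no location can already hold
          old = g()
          try:
            s(fresh)              # write through the first cell...
            return gg() is fresh  # ...and check whether the second sees it
          finally:
            s(old)                # restore the original contents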

Because of `copy_cell' you can't use `id' to solve this problem (at
least not without doing serious poking about inside code objects).
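
A short usage sketch, with the definitions restated so that the block
runs on its own.  The names h, t and v are illustrative and not part of
the code above.

```python
def make_cell(x):
    def s(y):
        nonlocal x
        x = y
    return (lambda: x), s


def copy_cell(c):
    g, s = c
    return (lambda: g()), (lambda y: s(y))


def same_location_p(g, s, gg, ss):
    fresh = object()
    old = g()
    try:
        s(fresh)
        return gg() is fresh
    finally:
        s(old)


v = object()
g, s = make_cell(v)
gg, ss = copy_cell((g, s))      # new functions, same location
h, t = make_cell(v)             # same value, different location

copy_shares = same_location_p(g, s, gg, ss)
other_shares = same_location_p(g, s, h, t)
```

Here copy_shares is True and other_shares is False, even though g and gg
are distinct function objects (so comparing them with `is' or `id' says
nothing useful), and even though g() and h() return the very same
object.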

> Nothing in the semantics of Python demands that objects must be
> implemented as single contiguous blocks of data with a well-defined
> location. 

Indeed.  I didn't claim otherwise.

> The closest you can come is that CPython exposes the memory location
> as the id(), but that's not a language promise: Jython and IronPython
> do not.

No.  `id' reveals object identity.  The clue is in the name.
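
For what it's worth, a tiny sketch of `id' as identity (the variable
names are arbitrary).  The language documents `id' as unique among
simultaneously existing objects; that CPython happens to use an address
for it is an implementation detail.

```python
a = [1, 2]
b = a               # a second name for the same object
c = [1, 2]          # an equal but distinct object

same_ab = id(a) == id(b)    # agrees with `a is b`
same_ac = id(a) == id(c)    # agrees with `a is c`
```

same_ab is True and same_ac is False on any implementation, whatever
the numbers themselves happen to be.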

-- [mdw]