anything like C++ references?

Mon Jul 14 11:36:12 EDT 2003

On 13 Jul 2003 22:46:41 GMT, bokr at oz.net (Bengt Richter) wrote:
[...]
Re-reading, I see that I should probably fill in some details (still subject to
disclaimer ;-)
>
>Python name assignment does not do anything to an object. It just creates
>an alias name for an object. It may reuse an existing name, but that only
>changes the meaning of that name as an alias, it does not affect the object
>that the name was previously an alias for.
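A minimal sketch of the aliasing point above, assuming nothing beyond the built-in list and str types. The names a, b and old_id are made up for illustration.

```python
# Rebinding a name changes what the name refers to, not the object itself.
a = [1, 2, 3]
b = a                    # b is now a second alias for the same list object
old_id = id(a)           # remember the identity of the list

a = "something else"     # rebinds the name a; the list object is untouched

print(b)                 # the list is still reachable through b
print(id(b) == old_id)   # b still refers to the original object
```

The rebinding of a has no effect on the list, which is why b still sees it unchanged.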
>
>You can of course have other assignment targets than plain names. In fact,
>a "plain name" is just the simplest expression for an assignment target.
>But whatever the target expression evaluates to, what gets put "there" is
>a reference to an object, not an object itself.
>
>The confusion, ISTM, is in what to think of as "there." "There" is definitely
>not a memory space for a Python object representation or "value". 
>
>The most accurate (UIAM) C concept would be a "there" of type PyObject* -- i.e.,
>a pointer variable that points to the representation of some kind of Python object.
>
I should have mentioned that there are various access mechanisms for storing the "pointer".
I.e., the left hand side doesn't evaluate to a raw address until you are in the machine language
of the implementation. When storage is actually happening, of course byte codes are being
interpreted by the virtual machine of the current Python version. The byte codes will
differ according to the translation of the target expression, and even for byte codes
that look the same, the implementation may involve complex dynamic behavior, e.g., in
searches through an inheritance graph for an appropriate method that will accept the
rh object reference and do the ultimate implementation-pointer storage.

The byte codes with STORE in their names, and some typical source statements that generate them,
are as follows (Python 2.2.3 on Windows):

STORE_SLICE:    x[a:b]=y # four byte codes numbered as STORE_SLICE+(1 if a is present)+(2 if b is present)
STORE_SUBSCR:   x[a]=y
STORE_NAME:     x=y #in global scope
STORE_ATTR:     x.a=y
STORE_GLOBAL:   def foo(): global x; x=y
STORE_FAST:     def foo(): x=y
STORE_DEREF:    def foo():
                    x=y # the STORE_DEREF
                    def bar():return x
                    return bar
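The list above can be checked with the standard dis module. This is only a sketch, assuming a current CPython 3 rather than the 2.2.3 used for the list. Opcode names are an implementation detail, so the output may differ between versions. The helper store_ops and the use_* functions are hypothetical names introduced here.

```python
import dis

def store_ops(code):
    """Return the names of the STORE* instructions found in a function or code object."""
    return [ins.opname for ins in dis.get_instructions(code)
            if ins.opname.startswith("STORE")]

# Module-level assignment, compiled but never executed.
module_code = compile("x = y", "<example>", "exec")

# The functions below are only disassembled, never called,
# so the undefined global y is harmless.
def use_local():
    x = y

def use_global():
    global x
    x = y

def use_attr(obj):
    obj.a = y

def use_subscr(obj, key):
    obj[key] = y

for label, code in [("module", module_code), ("local", use_local),
                    ("global", use_global), ("attr", use_attr),
                    ("subscr", use_subscr)]:
    print(label, store_ops(code))
```

Printing the opcode names rather than asserting exact values keeps the sketch honest about version differences.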

Note that, e.g., STORE_SUBSCR looks the same for lists and dictionaries, or even something
undetermined that will give a run time error because the type of x doesn't support that operation.
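To illustrate, here is a small sketch in which one subscript assignment serves a list, a dict, and an object that does not support the operation at all. The function name store is made up for the example.

```python
def store(target, key, value):
    # Compiled once, with no knowledge of what type target will be.
    target[key] = value

lst = [0, 0, 0]
dct = {}
store(lst, 1, "a")       # list item assignment
store(dct, 1, "a")       # dict item assignment

error = None
try:
    store(42, 1, "a")    # an int does not support item assignment
except TypeError as exc:
    error = exc          # the failure shows up only at run time

print(lst, dct, type(error).__name__)
```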

Storage is often mediated by methods such as __setitem__, but finding the ultimate
method may require a search through an inheritance graph, and/or the lookup could find
the setter function of a named property. That function in turn may do much work that is
not visible in the assignment statement source or in the immediate level of
corresponding byte codes.
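A hedged sketch of both cases follows: a __setitem__ found by searching the base classes, and a property setter that does extra work behind a plain-looking attribute assignment. The classes LoggedDict, Child and Person are hypothetical and exist only for this illustration.

```python
class LoggedDict(dict):
    """Dict that records the key of every item assignment."""
    def __init__(self):
        super().__init__()
        self.log = []

    def __setitem__(self, key, value):
        self.log.append(key)              # hidden extra work
        super().__setitem__(key, value)   # the actual storage

class Child(LoggedDict):
    pass  # defines no __setitem__ of its own; it is found on LoggedDict

class Person:
    """Attribute assignment to name goes through a property setter."""
    def __init__(self):
        self._name = ""

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        self._name = value.strip().title()   # normalizes before storing

d = Child()
d["k"] = 1                 # looks like plain subscript assignment
p = Person()
p.name = "  some name "    # looks like plain attribute assignment

print(d.log, d["k"])
print(p.name)
```

Neither assignment statement gives any hint of the method call behind it.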

>We don't have to know the implementation details to use the idea that the left hand
>side of a Python assignment is an *expression* (even if a single name) that yields
>a place to put a reference (effectively a pointer) to the object created by evaluating
>the expression on the right hand side.
Again, I should have been clearer. It *ultimately* yields a place, but potentially deep
in the bytecode and ultimately machine code of some chain of method invocations and implementations.
>
>A C programmer will tend to think that a symbol on the left is a static expression
>indicating a specific fixed memory space (static address or stack frame offset). But in Python
>it is first of all a dynamic expression (though an unqualified target name can only have its "there"
>be in the local or global namespace, and the choice between those is made statically
Hm. There is a third choice. When variables of the local namespace are destined to be captured
in a closure, the "there" is in the closure, stored by way of STORE_DEREF.
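A small sketch of that third case, assuming a current CPython 3, where the captured variable can be inspected through the function's __closure__ attribute. The variable names cell and ops are introduced here for illustration.

```python
import dis

def foo(y):
    x = y              # x is captured by bar, so it is stored in a cell
    def bar():
        return x
    return bar

bar = foo("hello")
cell = bar.__closure__[0]           # the cell that holds the reference for x
ops = [ins.opname for ins in dis.get_instructions(foo)]

print(cell.cell_contents)           # the object that x refers to
print(foo.__code__.co_cellvars)     # names of foo's locals that live in cells
print("STORE_DEREF" in ops)
```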
>(at least w.r.t. a given level of compiling/exec-ing)).
>
>In C terms, the Python assignment target expression always evaluates to a place to put a pointer,
I elided a fair amount of implementation detail in saying that, but I think the effective
semantics are ok.

>never a place to put object representation bits. Of course, "a place to put a pointer"
>may be *inside* an object. And such inside places are identified by target expressions
>such as x[2] or x.a or x[2]().b[3] etc. Even a bare name really does identify a place inside
>an object -- i.e., inside the local or global dict associated with the context.
>
>The "place" identified by x= after a global x declaration (or just in global scope) will be the
>same place as globals()['x']= unless someone has subverted something. Either way, the binding
>of the name to the object happens by evaluating to a "there" within the global dict object,
>uniquely associated with the name ('x' here). Evaluated on the right hand side, that name will
>produce the reference/pointer again for use in accessing the object or copying to another "there"
>associated with perhaps some other name or a target within a composite object like a list or tuple,
>or other namespace dict.
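A minimal sketch of the equivalence described above, written inside a function with a global declaration so that both forms of store are visible side by side. The function demo and its local names are hypothetical.

```python
def demo():
    global x
    x = "first"                           # store through the name
    same_object = globals()["x"] is x     # both routes reach the same object

    globals()["x"] = "second"             # store through the dict
    via_dict = x                          # read back through the name

    x = "third"                           # store through the name again
    via_name = globals()["x"]             # read back through the dict
    return same_object, via_dict, via_name

result = demo()
print(result)
```

Each store made by one route is seen by the other, since both end up in the same slot of the global dict.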
>
>>
>>The wart remains, even if my description was wrong. And even that is a
>>dubious claim.
>>
>Every assignment effectively stores a referential alias for an object,
>whether associated with a name or not. This is by design, not a wart.
>
The details of *how* an assignment "effectively stores" are a longer story though ;-)

>>A Python user is interested in how an object behaves - not how it is
>>internally implemented in the interpreter. Immutable objects don't
>When you say "immutable objects," are you really speaking of names
>bound to immutable objects? I.e., the integer 123 is an immutable object
>represented by the literal 123. You get an immutable object of the same value
>from the expression int('123'). These are the int *objects* -- how does
>"behave as references" apply to these actual "immutable objects"?
>I.e., could you rewrite your sentence (that this is breaking in two)
>to make it perhaps more understandable for me ?;-)
>
>Names are not objects. Nor are left-hand-side assignment target expressions
>in general, whether bare names or complicated.
>
>ISTM what you are discussing is not about objects but about the way names work,
>and ISTM you are thinking of names as if they were C++ variable references,
>which they are not, and they couldn't be. For one thing, a C++ reference type
>has to be initialized once to refer to a specific object. It can't be made to
>refer to something else during the life of that scope. Pointers are a better
>model, since you have to distinguish by expression syntax whether you mean
>to assign a new pointer value or modify the thing pointed to. In Python you
>can only assign pointers, if you want to think in those terms. When you
>modify a mutable, you are still assigning pointers into some part of the
>mutable object representation. When you assign to a bare name, you are still assigning
>a pointer into a place in some dictionary object (or workalike). If you
>want to modify an object in the usual C sense, you can code it in C and provide a
>Python interface to your C-implemented object. When you pass arguments
>to the various methods of your mutable, you will get pointers that you
>can dereference and you can do what you want to your mutable's C data representation.
>But when you pass the result back to the Python world, it will have to be
>as a standard object reference/pointer, and if you assign that, you will be
>storing a pointer somewhere.
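The point that modifying a mutable still means storing a reference inside it can be sketched as follows, using only built-in lists. The names inner, outer, shared and seen_through_outer are made up for the example.

```python
inner = ["payload"]
outer = [None, None]

outer[0] = inner             # stores a reference to inner, not a copy of it
inner.append("more")         # change the object through its other name

shared = outer[0] is inner           # the slot and the name refer to one object
seen_through_outer = outer[0][-1]    # the change is visible through the slot

outer[0] = "replaced"        # re-points the slot; inner itself is untouched

print(shared, seen_through_outer)
print(inner, outer)
```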
>
>>behave as references - the internal use of references for immutable
>>objects is basically a lazy copying optimisation and, apart from
>>performance and a couple of other technicalities (e.g. the 'is'
>>operator), has no relevance. Certainly it has no relevance to the
>>point I was making.
>>
>The way it works is part of the semantics of the language, not just
>an optimization issue.
>
>Names aren't variables.
>
>HTH ;-)
>
><disclaimer>I sometimes get things wrong. Corrections welcome.</disclaimer>
>

Regards,
Bengt Richter