[Tutor] <var:data> assignment

denis denis.spir at free.fr
Tue May 11 12:14:07 EDT 2004

[first thread of the 'var:data' class :-)]

Q: what happens with 'myVar = myVal'?
Qbis: why such a question?

When a variable is assigned for the first time, the language
itself --Python's 'decoder', which is written not in Python but in C--
has to create a link between the variable and its associated data.
There are at least two things (not to say 'objects') involved: the variable's
name and the value itself.
The value must obviously be stored somehow, if it isn't already, which
means it must be put somewhere in memory --that is, it becomes data.
Then Python must set up a kind of name --> value mapping to recover it
when needed; let's call that 'registration'. This is of course, as you have
noticed ;-), the structure of a Python dictionary:

name1: value1, name2: value2, name3: value3, ...

or a pair of associated lists (by indexes):

name1,  name2,  name3, ...
  |        |       |
value1, value2, value3, ...

Consider this:

class C:
    l=['voici', 'mes', 'articles']
    d={1:'un', 2:'deux'}
    i=1; x=1.0; t='text'

>>> dir(C)
['__doc__', '__module__', 'd', 'i', 'l', 't', 'x']

>>> C.__dict__
{'__module__': '__main__', 'd': {1: 'un', 2: 'deux'}, 'i': 1, 'l': ['voici',
'mes', 'articles'], 't': 'text', 'x': 1.0, '__doc__': None}
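
A small sketch of the same observation, to be taken as an illustration only:
it filters the class namespace down to the names we wrote ourselves and checks
that attribute access and dictionary lookup reach the very same object. The
variables 'namespace' and 'own_names' are mine, not part of Python.

```python
class C:
    l = ['voici', 'mes', 'articles']
    d = {1: 'un', 2: 'deux'}

# The class namespace can be read like a dictionary of name: value entries.
namespace = vars(C)

# Keep only the names defined in the class body (skip the __dunder__ entries).
own_names = sorted(n for n in namespace if not n.startswith('__'))
print(own_names)                    # ['d', 'l']

# Attribute access and dictionary lookup give back one and the same list.
print(C.l is namespace['l'])        # True
```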

Now, this doesn't mean that in the language's black box (in C), variable
registration is really implemented as a dictionary. Yet the language need
not know where the value actually is. And if someone used Python to design
another language, with dictionaries for variable registration, the actual
memory location of the data would be unknown and unneeded.
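
To make the 'registration' idea concrete, here is a toy sketch of what such a
home-made language could do with an ordinary dictionary. The names
'namespace', 'assign' and 'lookup' are invented for this example; nothing here
claims to be how Python itself does it.

```python
# Toy model of variable "registration": a plain dict maps names to values.
namespace = {}

def assign(name, value):
    """Register (or re-register) a name for a value."""
    namespace[name] = value

def lookup(name):
    """Recover the value registered under a name."""
    return namespace[name]

assign('myVar', 'myVal')
assign('count', 3)
print(lookup('myVar'))       # myVal
print(sorted(namespace))     # ['count', 'myVar']
```

Note that the toy never asks where in memory 'myVal' lives: the dictionary
takes care of that.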

Now, consider the following lines, which use particular Python features:
>>> t1='salut'
>>> t2='salut'
>>> t1==t2, id(t1), id(t2), t1 is t2
(True, 18654912, 18654912, True)
>>> t2='hallo'
>>> t1==t2, id(t1), id(t2), t1 is t2
(False, 18654912, 18655776, False)

'id()' returns the so-called 'identity' of a variable, in fact (in the
standard C implementation) the memory address of its bound data, and 'is'
checks whether two variables have the same id. The above lines tell us
several things:
First, when a value already exists, it isn't created again and stored in
another place.
Second, two variables can thus share the same value *and* the same id: they're
synonyms, or aliases.
Third, conversely, when a variable's value changes, its place changes, too.
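
Whether two equal literals end up at the same address is a detail of the
interpreter, so the sketch below builds the alias explicitly with 't2 = t1'.
That way the result does not depend on the implementation. The variables
'alias_is_same' and 'rebound_is_same' are only there to keep the results.

```python
t1 = 'salut'
t2 = t1                       # t2 becomes an alias of t1
alias_is_same = t1 is t2      # both names refer to one object

t2 = 'hallo'                  # rebind t2 to another value
rebound_is_same = t1 is t2    # the two names now refer to two objects

print(alias_is_same, rebound_is_same, t1)   # True False salut
```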

All of this leads (me) to question the above dictionary-like model. It seems
that, instead of values, the id-s are bound to variable names. This better
explains the three comments above:

name1: address1, name2: address2, name3: address3, ...    (address=id)

or a pair of associated lists:

name1,     name2,    name3, ...
  |          |         |
address1, address2, address3, ...

Of course, both models would be possible; the second one is the right one in
Python, as other noticeable facts will show. The two models can be better
drawn and distinguished with another picture:
[you'll need a fixed-width font]

>>> a=1; b=1; c=2
>>> id(a), id(b),id(c)
(7742480, 7742480, 7748592)
>>> a is b, a is c, b is c
(True, False, False)

'a' --> | 1 |
'b' --> | 1 |                 first model
'c' --> | 2 |


'a' --> | adr1 | \
         ------   \   ---
                   > | 1 |
         ------   /   ---
'b' --> | adr1 | /           second model

         ------       ---
'c' --> | adr2 | --> | 2 |
         ------       ---

In other words, in the second case we have on one side a set of name:address
pairs, and on the other side a set of data (see C pointers).
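
The second model can be imitated with two dictionaries, one for name:address
pairs and one for the data. This is only a toy: the addresses 1000 and 1001
are made up, and 'value_of' and 'same_object' are hypothetical helpers.

```python
# Made-up addresses; real ids are chosen by the interpreter.
data = {1000: 1, 1001: 2}                     # address -> value
names = {'a': 1000, 'b': 1000, 'c': 1001}     # name -> address

def value_of(name):
    """Follow name -> address -> value."""
    return data[names[name]]

def same_object(name1, name2):
    """Toy counterpart of 'is': compare addresses, not values."""
    return names[name1] == names[name2]

print(value_of('a'), value_of('b'), value_of('c'))     # 1 1 2
print(same_object('a', 'b'), same_object('a', 'c'))    # True False
```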

In the first case, we could see a variable as a kind of labeled box holding
a simple value, hence representing an assignment with:
a <- 1    'a' 'gets' 1
Or we could see variables as names whose purpose is to label data, the
data being boxes holding values, and thus an assignment could be
a -> 1    'a' 'is fixed to' 1

Neither (?) of these pictures matches the actual variable processing by
Python. In Python, we could rather see variables as labeled boxes
holding addresses pointing to data, while the data would be unlabeled boxes
holding values:
a --> 1    'a' 'points to' 1 (C pointers, bis)
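
One way to tell the 'box holding a value' picture from the 'pointing to'
picture is to change the data through one name and look at it through
another. A short sketch, using a list because a list can be changed in place:

```python
a = [1]
b = a            # b receives the address held by a, not a copy of the list
b.append(2)      # change the data through one name...
print(a)         # [1, 2]   ...and the other name sees the change
print(a is b)    # True
```

If each variable were a box with its own copy of the value, 'a' would still
show [1].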

Now, consider the case of compound data:

>>> l1=[1,2]
>>> l2=[1,2]
>>> l1 == l2, l1 is l2
(True, False)

This contradicts the first rule, which says that if data already exists, it
need not be created again. l1 and l2 hold the same data --they're equal-- but
not at the same place --they're not identical in the sense of Python's idiom.
We can explain this (at first sight) strange behaviour by recursively
applying the other rule, the one saying that a variable isn't bound to a
value, but to an address instead. Let's compare a list to an object holding
(named) data:

object data:     list data:
line_count       item#0
file_name        item#1

In the case of the object, it's obvious that both attributes, say
'sub_variables', will behave like 'real' variables, that is, names associated
with addresses where the actual data lives. Hence, the consequence is
that the object's data won't be a list of values, but a list of name:address
pairs instead!

                   -----       ------------------------
'object_name' --> | adr | --> | list of name:adr pairs |
                   -----       ------------------------
                                              '--> actual data
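
A quick check of this picture, as a sketch: the class 'Record' and the value
'notes.txt' are invented for the example, while 'file_name' and 'line_count'
are the attribute names used above.

```python
class Record:                 # hypothetical class, for illustration only
    pass

name = 'notes.txt'
rec = Record()
rec.file_name = name          # the attribute is bound to the existing object
rec.line_count = 2

# The instance keeps its attributes as name: reference pairs.
print(sorted(vars(rec)))          # ['file_name', 'line_count']
print(rec.file_name is name)      # True
```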

The same goes for a list, except that names are replaced by indexes (thus, an
index is indeed a particular form of key).

                 -----       ---------------------------
'list_name' --> | adr | --> | list of indexed addresses |
                 -----       ---------------------------
                                            '--> actual data
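
The sketch below shows two distinct lists whose first slots hold the same
address. The name 'inner' is only a helper for the demonstration.

```python
inner = ['shared']
l1 = [inner, 2]
l2 = [inner, 2]

print(l1 == l2, l1 is l2)      # True False  (equal lists, two list objects)
print(l1[0] is l2[0])          # True        (both slots point to 'inner')

inner.append('changed')        # change the data once...
print(l2[0])                   # ['shared', 'changed']  ...seen from both lists
```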

If you think you have understood the mysteries of pythonesque data, read this:

>>> l1=[1]
>>> l2=[1]
>>> l1 is l2

>>> l1=l2=[1]
>>> l1 is l2

>>> l1=[1]; l2=[1]
>>> l1 is l2

>>> l1=l2
>>> l1 is l2
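
For those who want to check their guesses afterwards, here is the same
sequence as a small script that collects the four results. The list 'answers'
is just a helper of mine; try to answer before you run it.

```python
answers = []

l1 = [1]
l2 = [1]
answers.append(l1 is l2)    # two separate list objects

l1 = l2 = [1]
answers.append(l1 is l2)    # one list, two names

l1 = [1]; l2 = [1]
answers.append(l1 is l2)    # two separate lists again

l1 = l2
answers.append(l1 is l2)    # l1 is rebound to the list l2 points to

print(answers)              # [False, True, False, True]
```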


