Why not an __assign__ method?
Carlos Alberto Reis Ribeiro
cribeiro at mail.inet.com.br
Fri Mar 30 23:39:59 EST 2001
Robin, please have some patience, and read my entire answer before
replying. I think I have a good point here.
At 14:41 30/03/01 -0500, Robin Thomas wrote:
>At 01:26 PM 3/30/01 -0300, Carlos Alberto Reis Ribeiro wrote:
>
>>The question is: Why not have a __assign__ method, that gets called on
>>assignment? It should be called on the *right side* of the expression (at
>>the left side it does not make sense - I leave the proof as an exercise
>>for the reader <wink>). The return value would be then assigned to the
>>left side of the expression.
>
>You should really get the source to CPython and start reading it.
I'll do it, I promise :-)
>>z = a + b + c
>
>The "no changes to Python" implementation of your example is:
>
>z = a + b
>z += c
I pointed out this alternative in an earlier email (this morning, under a
different subject). The concern is that it leads to code that's harder to
understand. It would be better to keep the expression in its original form
and leave the optimizations to the compiler. Nobody said it was easy :-)
I have also previously pointed out my belief that the intermediate object is
actually instantiated inside NumPy's __add__ method. I have some reason to
believe so, even before reading the code (more on this later).
As you suggested, I checked the bytecode that is generated. It's simply a
stack-based calculator:
>>> import dis
>>> p = compile('z = a + b + c', '<string>', 'single')
>>> dis.disassemble(p)
0 SET_LINENO 0
3 SET_LINENO 1
6 LOAD_NAME 0 (a)
9 LOAD_NAME 1 (b)
12 BINARY_ADD
13 LOAD_NAME 2 (c)
16 BINARY_ADD
17 STORE_NAME 3 (z)
20 LOAD_CONST 0 (None)
23 RETURN_VALUE
>>>
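The claim in point 1 below can be checked from pure Python: each BINARY_ADD
calls __add__, which returns a brand-new intermediate object. A small class
that logs its additions (my own illustration, not NumPy code) makes this
visible:

```python
class Tracer:
    """Toy numeric type that counts every __add__ call."""
    count = 0  # number of intermediate objects created so far

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        Tracer.count += 1  # a fresh object is born for each BINARY_ADD
        return Tracer(self.value + other.value)

a, b, c = Tracer(1), Tracer(2), Tracer(3)
z = a + b + c          # two BINARY_ADDs -> two intermediate objects
print(Tracer.count)    # 2
print(z.value)         # 6
```

For `z = a + b + c`, the first intermediate (a + b) is used exactly once and
then becomes garbage, which is precisely the waste the temp-flag scheme below
tries to recover.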
After reading this, I still think it is possible to implement my proposal.
Please note that I'm talking about two related things here: the __assign__
proposal (which only involves a change to STORE_NAME behavior, as shown in
step 5 below), and the NumPy optimization that __assign__ makes possible by
using a temporary flag on the intermediate objects.
1) The Python compiler itself didn't create any new object. The bytecode
only asks for a BINARY_ADD. This calls some C code, which in turn must
create an intermediate object to store the result before returning. In the
case of the NumPy array, the code for this must be in the numerical
extensions themselves.
(I'll check this, I promise. But please keep on reading...)
2) Now imagine that, while creating the object to store the intermediate
value, we set a 'temp' flag on the new object. The operation is performed
(in this case, the BINARY_ADD at offset 12), and the resulting object is
returned with the 'temp' flag set.
3) Then, at offset 16, another BINARY_ADD is issued, and some C code is
called again. Inside this code, we check the temp flag. If the temp flag is
true, *and* the operation can be done in place, then we do it that way. The
object is returned again, still with the temp flag set.
4) Note that not all operations can be done in place. However, for arrays,
any operation performed element by element can be done this way. Add is an
obvious example.
5) At offset 17, we have the STORE_NAME opcode. Again, inside C code, we
would call the __assign__ method of the object on top of the stack. The
__assign__ method would then return the object to be assigned. By default,
it is just a stub that returns the object itself.
6) By intercepting __assign__, the NumPy code could check the temp flag on
the array object. The flag is still set, so we clear it, to indicate that it
is no longer safe to do the operation in place.
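The six steps above can be sketched in pure Python. Since STORE_NAME cannot
actually be intercepted today, the explicit assign() call below is a
stand-in for the proposed __assign__ hook; the Vec class and its temp flag
are my own illustration, not NumPy internals:

```python
class Vec:
    """Toy array: element-wise add that reuses 'temp' intermediates in place."""
    def __init__(self, data, temp=False):
        self.data = list(data)
        self.temp = temp   # True only for intermediate results

    def __add__(self, other):
        if self.temp:
            # steps 3-4: an intermediate may safely be overwritten in place
            for i in range(len(self.data)):
                self.data[i] += other.data[i]
            return self
        # step 2: create a fresh intermediate, flagged as temporary
        return Vec([x + y for x, y in zip(self.data, other.data)], temp=True)

    def assign(self):
        # steps 5-6: stand-in for the proposed __assign__ hook; clear the
        # flag so the stored object is never reused in place later
        self.temp = False
        return self

a, b, c = Vec([1, 2]), Vec([10, 20]), Vec([100, 200])
z = (a + b + c).assign()   # only one intermediate Vec gets allocated
print(z.data)              # [111, 222]
print(z.temp)              # False
```

Here a + b allocates one flagged intermediate; the second add mutates it in
place instead of allocating again, and assign() clears the flag exactly as
step 6 describes.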
>Think about what your "temp flag" would do under the current behavior of
>threads in Python.
Why are threads not a problem?
- Because temporary objects have meaning only in the context of the
mathematical expression where they are created, evaluated, and returned;
- Because each thread has its own stack for mathematical operations
(even in Stackless Python; the word 'stackless' has nothing to do with the
mathematical stack);
- Because we don't do multithreaded evaluation of mathematical expressions;
or, to put it another way: any mathematical expression is always evaluated
in the context of a single thread. If we did otherwise, the code above could
fail; but that is not the case.
>into the syntax-illegal but perhaps (and perhaps not) bytecode-legal:
>
>z = ((a + b) += c)
I have some reason to believe that the hack above is bytecode-legal. I'll
try it with NumPy this weekend (I'll need to grab the bytecode assembler
first).
>z = a + b + c % (d * e) / f ^ (g or 42)
The apparent complexity of the expression changes nothing. In the end,
everything is turned into a very simple stack calculator. Anyone who has
ever programmed in Forth, PostScript, or an HP calculator will recognize it
<wink>.
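To make the analogy concrete, here is a minimal stack machine (my own
sketch, with the opcode names borrowed from the disassembly above) that
evaluates 'z = a + b + c' exactly the way CPython's bytecode does:

```python
def run(bytecode, names):
    """Tiny interpreter: LOAD_NAME pushes, BINARY_ADD pops two and pushes,
    STORE_NAME pops the result into the name table."""
    stack = []
    for op, arg in bytecode:
        if op == 'LOAD_NAME':
            stack.append(names[arg])
        elif op == 'BINARY_ADD':
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs)
        elif op == 'STORE_NAME':
            names[arg] = stack.pop()
    return names

env = run([('LOAD_NAME', 'a'), ('LOAD_NAME', 'b'), ('BINARY_ADD', None),
           ('LOAD_NAME', 'c'), ('BINARY_ADD', None), ('STORE_NAME', 'z')],
          {'a': 1, 'b': 2, 'c': 3})
print(env['z'])   # 6
```

However long the source expression gets, the compiler just emits more pushes
and binary operations against this one stack, so the temp-flag trick applies
uniformly to every intermediate result.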
Carlos Ribeiro