[Python-Dev] stackable ints [stupid idea (ignore) :v]
tim_one at email.msn.com
Sat Jun 12 23:37:08 CEST 1999
> I thought it would be good to be able to do the following loop
> with Numeric arrays
> for x in array1:
>     array2[x] = array3[x] + array4[x]
> without any memory management being involved. Right now, I think the
> for loop has to continually dynamically allocate each new x
Actually not, it just binds x to the sequence of PyObject*'s already in
array1, one at a time. It does bump & drop the refcount on that object a
lot. Also irksome is that it keeps allocating/deallocating a little integer
on each trip, for the under-the-covers loop index! Marc-Andre (I think)
had/has a patch to worm around that, but IIRC it didn't make much difference
(wouldn't expect it to, though -- not if the loop body does any real work).
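You can see both halves of that claim from Python itself (a present-day CPython demonstration; the variable names are just illustrative):

```python
import sys

# Iterating a list binds the loop variable to the objects already stored
# in it -- there is no per-element allocation of x.
array1 = [object() for _ in range(3)]

for i, x in enumerate(array1):
    assert x is array1[i]   # x *is* the stored object, not a copy

# What the loop does cost is refcount traffic: while x is bound to an
# element, that element carries an extra reference.
elem = array1[0]
baseline = sys.getrefcount(elem)   # list's ref + elem's ref + call arg
for x in array1:
    if x is elem:
        assert sys.getrefcount(elem) > baseline  # the binding to x adds one
```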
One thing a smarter Python compiler could do is notice the obvious <snort>:
the *internal* incref/decref operations on the object denoted by x in the
loop above must cancel out, so there's no need to do any of them.
"internal" == those due to the routine actions of the PVM itself, while
pushing and popping the eval stack. Exploiting that is tedious; e.g.,
inventing a pile of opcode variants that do the same thing as today's except
skip an incref here and a decref there.
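To see how many of those push/pop (incref/decref) pairs there are, disassemble the loop (opcode names below are from a modern CPython's `dis` module; the 1.5-era names differed, but the shape is the same -- each subscript pushes a result onto the eval stack, the add pops both operands, and the store pops again):

```python
import dis

def loop(array1, array2, array3, array4):
    for x in array1:
        array2[x] = array3[x] + array4[x]

# Every value pushed here is increfed and every pop is a decref; in the
# loop above those internal pairs all cancel out.
dis.dis(loop)
```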
> and intermediate sums (and immediately deallocate them)
The intermediate sum is allocated each time, but not deallocated (the
pre-existing object at array2[x] *may* be deallocated, though).
> and that makes the loop piteously slow.
A lot of things conspire to make it slow. David is certainly right that, in
this particular case, array2[array1] = array3[array1] + etc worms around the
worst of them.
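For readers who haven't seen Numeric's fancy indexing, here's a plain-Python, list-based sketch (illustrative only) of what that one-liner computes -- the real savings come from Numeric doing the per-element work in C, in bulk, instead of round-tripping through the Python loop:

```python
# Hypothetical helper standing in for Numeric's indexed assignment.
def fancy_assign(dst, idx, src):
    for i, v in zip(idx, src):
        dst[i] = v

array1 = [2, 0, 1]
array3 = [10.0, 20.0, 30.0]
array4 = [1.0, 2.0, 3.0]
array2 = [0.0, 0.0, 0.0]

# array2[array1] = array3[array1] + array4[array1], spelled out:
sums = [array3[i] + array4[i] for i in array1]
fancy_assign(array2, array1, sums)
assert array2 == [11.0, 22.0, 33.0]
```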
> The idea of replacing PyObject *'s with a struct [typedescr *, data *]
> was a space/time tradeoff to speed up operations like the above
> by eliminating any need for mallocs or other memory management.
Fleshing out details may make it look less attractive. For machines where
ints are no wider than pointers, the "data *" can be replaced with the int
directly and then there's real potential. If for a float the "data*" really
*is* a pointer, though, what does it point *at*? Some dynamically allocated
memory to hold the float appears to be the only answer, and you're right
back at the problem you were hoping to avoid.
Make the "data*" field big enough to hold a Python float directly, and the
descriptor likely zooms to 128 bits (assuming float is IEEE double and the
machine requires natural alignment).
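A back-of-the-envelope check of that size claim, using the `struct` module's native layout rules ("P" = native pointer, "d" = C double, natural alignment):

```python
import struct

# Descriptor holding a type pointer plus an inline IEEE double, vs. the
# single pointer a PyObject* slot costs today.
descriptor_size = struct.calcsize("Pd")
pyobject_ptr_size = struct.calcsize("P")

# On a typical 64-bit build these come out to 16 and 8 bytes.
print(descriptor_size, pyobject_ptr_size)
```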
Let's say we do that. Where does the "+" implementation get the 16 bytes it
needs to store its result? The space presumably already exists in the slot
indexed by array2[x], but the "+" implementation has no way to *know* that.
Figuring it out requires non-local analysis, which is quite a few steps
beyond what Python's compiler can do today. Easiest: internal functions
all grow a new PyDescriptor* argument into which they are to write their
result's descriptor. The PVM passes "+" the address of the slot indexed by
array2[x] if it's smart enough; or, if it's not, the address of the stack
slot descriptor into which today's PVM *would* push the result. In the
latter case the PVM would need to copy those 16 bytes into the slot indexed
by array2[x] later.
Neither of those is as simple as it sounds, though, at least because if
array2[x] holds a descriptor with a real pointer in its variant half, the
thing to which it points needs to get decref'ed iff the add succeeds. It
can get very messy!
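Here's a toy Python model of the write-into-a-caller-supplied-descriptor idea (every name here is hypothetical -- this is a sketch of the calling convention, not of any real CPython code). The messy part shows up as the last step: the old payload can only be retired after the add has succeeded:

```python
class Descriptor:
    """A [typedescr, payload] pair; payload is held inline for scalars."""
    __slots__ = ("typedescr", "payload")
    def __init__(self, typedescr, payload):
        self.typedescr = typedescr
        self.payload = payload

def add_into(dst, a, b):
    # Compute first, so a failure leaves dst untouched ...
    result = a.payload + b.payload      # may raise
    old = dst.payload
    dst.typedescr, dst.payload = float, result
    # ... then retire the old value; this del stands in for the
    # conditional decref of whatever the slot used to point at.
    del old

slot = Descriptor(float, 0.0)           # plays the role of array2[x]
add_into(slot, Descriptor(float, 2.5), Descriptor(float, 3.25))
assert slot.payload == 5.75
```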
> I really can't say whether it'd be worth it or not without some sort of
> real testing. Just a thought.
It's a good thought! Just hard to make real.
it<wink>-ly y'rs - tim