[Numpy-discussion] Assigning complex value to real array

Thu Oct 7 15:48:50 EDT 2010

On 7 October 2010 13:01, Pauli Virtanen <pav at iki.fi> wrote:
> On Thu, 2010-10-07 at 12:08 -0400, Andrew P. Mullhaupt wrote:
> [clip]
>> No. You can define the arrays as backed by mapped files with real and
>> imaginary parts separated. Then the imaginary part, being initially
>> zero, is a sparse part of the file, takes only a fraction of the
>> space (and, on a decent machine, doesn't incur memory bandwidth costs
>> either). You can then slipstream the cost of testing for whether the
>> imaginary part has been subsequently assigned to zero (so you can
>> re-sparsify the representation of a page) with any operation that
>> examines all the values on that page. Consistency would be provided by
>> the OS, so there wouldn't really be much numpy-specific code involved.
>>
>> So there is at least one efficient way to implement my suggestion.
>
> Interesting idea. Most OSes also offer page-allocated memory not backed
> by files. In fact, Glibc's malloc works just like this on Linux for
> large memory blocks.
>
> It would work automatically like this with complex arrays, if the
> imaginary part was stored after the real part, and additional branches
> were added to not write zeros to memory.
>
> But to implement this, you'd have to rewrite large parts of Numpy since
> the separated storage of re/im conflicts with its memory model. I
> believe this will simply not be done, since there seems to be little
> need for such a feature.

Years ago MATLAB did just this - store real and imaginary parts of
arrays separately (maybe it still does, I haven't used it in a long
time). It caused us terrible performance headaches, since it meant
that individual complex values were spread over two different memory
areas (parts of the disk in fact, since we were using gigabyte arrays
in 1996), so that writing operations in the natural way tended to
cause disk thrashing.
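
For contrast, numpy keeps the real and imaginary parts of each element
next to each other in one buffer. A small sketch of how to see that,
assuming a reasonably recent numpy and a complex128 array (the variable
names are only for illustration):

```python
import numpy as np

z = np.array([1 + 2j, 3 + 4j], dtype=np.complex128)

# Reinterpret the same buffer as float64: re, im, re, im, ...
flat = z.view(np.float64)
print(flat)

# .real and .imag are strided views into that one buffer,
# not separately stored arrays.
re_strides = z.real.strides
shares = np.shares_memory(z.imag, z)
print(re_strides, shares)
```

Each complex128 element takes 16 bytes, so stepping through only the
real parts skips over the imaginary part stored in between.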

As for what's right for numpy, well, I think it makes a lot more sense
to simply raise an exception when assigning a complex value to a real
array (or store a NaN). Usually when this happens it means you either
didn't know how numpy worked or you were feeding bogus values to some
special function so it went complex on you. If you actually wanted
potentially-complex values, you'd create complex arrays; as the OP
says, there's little extra cost.
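
To make the "raise an exception" option concrete, here is a rough sketch
of a wrapper that refuses to drop a nonzero imaginary part. Note that
assign_strict is a hypothetical helper written for this illustration,
not a numpy function, and it says nothing about what numpy itself does
on such an assignment:

```python
import numpy as np

def assign_strict(dest, index, value):
    """Assign value to dest[index], refusing to silently drop an
    imaginary part. Hypothetical helper, not part of numpy."""
    value = np.asarray(value)
    if np.iscomplexobj(value) and not np.iscomplexobj(dest):
        if np.any(value.imag != 0):
            raise TypeError(
                "complex value with nonzero imaginary part "
                "assigned to a real array"
            )
        # Imaginary part is exactly zero, so keeping only the real
        # part loses nothing.
        value = value.real
    dest[index] = value

a = np.zeros(3)
assign_strict(a, 0, 2.0 + 0j)      # accepted: imaginary part is zero

try:
    assign_strict(a, 1, 1 + 2j)    # rejected: would lose the 2j
    raised = False
except TypeError:
    raised = True
```

If complex results are actually wanted, the simpler route is to create
the destination with a complex dtype in the first place.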

Fortunately, this latter strategy is the way that numpy is already
headed; currently I believe it emits a warning, and ISTR this is
intended to be strengthened to an exception or NaN soon.

Anne