[Numpy-discussion] Re: Trying out Numeric3

Travis Oliphant oliphant at ee.byu.edu
Wed Mar 23 11:48:12 EST 2005

Scott Gilbert wrote:

>--- Michiel Jan Laurens de Hoon <mdehoon at ims.u-tokyo.ac.jp> wrote:
>>>>Do 4 gigabyte 1D numerical python arrays occur in practice?
>>Why? There needs to be a good reason to break compatibility. Who needs
>I (and others I work with) routinely deal with 1D datasets that are
>multiple gigabytes in length.  Working with terabyte datasets is on my near
>horizon.  For lots of reasons, I don't/can't use Python/Numeric for very
>much of this sort of thing, but it would be nice if I could.  The "32 bits
>is enough for anyone" design has bitten me with lots of tools (not just
>Python).  The Python core will fix its int/intp problem eventually; I
>can't see why Numeric3 wouldn't avoid the problem now.
Thanks for your comments, Scott.  These are exactly the kind of comments 
I'm looking for.  I want to hear the experiences of real users (I know 
there are a lot of silent-busy types out there).  It really helps in 
figuring out which issues are most important.
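
To make the int/intp point concrete, here is a small sketch using only the standard library's ctypes module. It is illustrative and not Numeric3 code; it simply compares the largest length a C int can hold with what a pointer-sized signed integer can hold on the machine running it.

```python
import ctypes

# Width in bits of a C int versus a pointer-sized signed integer (ssize_t).
int_bits = ctypes.sizeof(ctypes.c_int) * 8
intp_bits = ctypes.sizeof(ctypes.c_ssize_t) * 8

# Largest array length each signed type can represent.
max_len_int = 2 ** (int_bits - 1) - 1
max_len_intp = 2 ** (intp_bits - 1) - 1

# Size in bytes of a 1D array of C doubles at the C int limit.
bytes_at_int_limit = max_len_int * ctypes.sizeof(ctypes.c_double)

print("C int: %d bits, max length %d" % (int_bits, max_len_int))
print("intp:  %d bits, max length %d" % (intp_bits, max_len_intp))
print("doubles at the int limit: %.1f GiB" % (bytes_at_int_limit / 2.0 ** 30))
```

On platforms where int is narrower than a pointer, any length or index stored in an int caps a 1D array well below what the address space allows, which is the limitation described above.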

>If the new Numeric3 didn't break too much compatibility with the original
>Numeric but pickled much faster, we'd probably be in a hurry to upgrade
>based on this feature alone.
I'm hoping we can do this, so stay tuned.

>I agree with the other guy who pointed out that arrays are mutable and that
>likewise, rank-0 arrays should be mutable.  I know it's unlikely to happen,
>but it would also be nice to see the Python parser change slightly to treat
>a[] as a[()].  Then the mutability of rank-0 could fit elegantly with the
>rank-(n >= 1) arrays.  It's a syntax error now, so there wouldn't be a
>backwards compatibility issue.
Well, rank-0 arrays are and forever will be mutable.  But Python 
scalars (and the new Array-like Scalars) are not mutable.  I know this 
is not ideal, but making it ideal means fundamental changes to Python 
scalars.  So far the current scheme is the best idea I've heard.  I'm 
always open to better ones.
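
A minimal sketch of the distinction, written against present-day NumPy rather than Numeric3 itself (an assumption, since the Numeric3 interface was still in flux at the time):

```python
import numpy as np

a = np.array(5)          # rank-0 array: its shape is ()
a[()] = 7                # in-place assignment through the empty-tuple index
a[...] = a + 1           # Ellipsis indexing also writes in place

s = a[()]                # indexing with () extracts an array scalar
try:
    s[()] = 0            # array scalars do not support item assignment
    scalar_is_mutable = True
except TypeError:
    scalar_is_mutable = False

print(a.shape, int(a), type(s).__name__, scalar_is_mutable)
```

The a[()] spelling is what the proposed a[] syntax would abbreviate.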

>We commonly use data types that aren't in Numeric.  The most prevalent
>example at my work is complex-short.  It looks like I can wrap the new
>"Void" type to handle this to some extent.  Will indexing (subscripting) a
>class derived from a Numeric3 array return the derived class?
>     class Derived(Numeric3.ArrayType):
>         pass
>     d = Derived(shape=(200, 200, 2), typecode='s')
>     if isinstance(d[0], Derived):
>         print "This is what I mean"
Yes, indexing currently returns the derived type.  There are probably 
going to be some issues here, but it can be made to work.  I'm glad you 
are noticing that the VOID * type is for more than just record arrays.  
I've got ideas for hooks that allow new types to be defined, but I could 
definitely use examples.
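
For illustration, the same question can be asked of present-day NumPy (an assumption; this is not the Numeric3 API, and np.ndarray stands in for Numeric3.ArrayType). The second half sketches one possible way to represent complex-short as a record-style element; the field names "re" and "im" are made up for the example.

```python
import numpy as np

class Derived(np.ndarray):
    """Trivial subclass, mirroring the Derived class in the question."""
    pass

# 200 x 200 grid of (real, imag) pairs stored as 16-bit integers.
d = np.zeros((200, 200, 2), dtype=np.int16).view(Derived)

row = d[0]               # basic indexing returns a view of the same class
print(type(row).__name__, row.shape)

# Alternative: a structured dtype treats each (re, im) pair as one element.
complex_short = np.dtype([("re", np.int16), ("im", np.int16)])
r = np.zeros((200, 200), dtype=complex_short)
r[0, 0] = (3, -4)
print(complex_short.itemsize, r[0, 0]["re"], r[0, 0]["im"])
```

Neither form gives the elements complex arithmetic; they only carry the storage layout.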

>I don't really expect Numeric3 to add all of the possible oddball types,
>but I think it's important to remember that other types are out there
>(fixed point for DSP, mu-law for audio, 16 bit floats for graphics, IBM's
>decimal64 and decimal128 types, double-double and quad-double for increased
>precision, quaternions of standard types, ....).  It's one thing to treat
>these like "record arrays", it's another thing for them to have overloaded
>arithmetic operators.
I think using standard Python operator overloading (i.e. having these 
types define their own arithmetic operators) may be the way to go.
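
As a rough sketch of what that could look like for one of the types mentioned, here is a hypothetical fixed-point class in plain Python. The name Fixed, the 8 fractional bits, and the truncating multiply are all assumptions chosen for the example.

```python
class Fixed(object):
    """Hypothetical fixed-point number with FRAC_BITS fractional bits."""
    FRAC_BITS = 8

    def __init__(self, raw):
        self.raw = int(raw)           # underlying scaled integer

    @classmethod
    def from_float(cls, x):
        return cls(round(x * (1 << cls.FRAC_BITS)))

    def __float__(self):
        return self.raw / float(1 << self.FRAC_BITS)

    def __add__(self, other):
        return Fixed(self.raw + other.raw)

    def __mul__(self, other):
        # The product carries 2 * FRAC_BITS fractional bits; shift back down.
        return Fixed((self.raw * other.raw) >> self.FRAC_BITS)

    def __eq__(self, other):
        return isinstance(other, Fixed) and self.raw == other.raw

    def __repr__(self):
        return "Fixed(%r)" % float(self)

x = Fixed.from_float(1.5)
y = Fixed.from_float(2.25)
print(x + y, x * y)
```

Elements like these could be stored in an array of Python objects, at the cost of a method call per element for every operation.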

>Since Numeric3 can't support every type under the sun, it would be nice if,
>when the final version goes into the Python core, the C-API and Python
>library functions used "duck typing" so that other array implementations
>could work to whatever extent possible.  In other words, it would be better
>if users were not required to derive from the Numeric3 type in order to
>create new kinds of arrays that can be used with sufficiently generic
>Numeric3 routines.  Simply having the required attributes (shape, strides,
>itemsize, ...) of a Numeric3 array should be enough to be treated like a
>Numeric3 array.
I would really like to see this eventually too.  We need examples, 
though, to eventually make it work right.  One idea is to have classes 
define "coercion" routines that the ufunc machinery uses, and create an 
API wherein the ufunc can be made to call the right function.
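
The duck-typing half of this can be sketched in plain Python. Everything here is hypothetical: PlainBuffer is an array-like that derives from nothing, and get_item is a stand-in for a "sufficiently generic" routine that only looks at the data, shape, and strides attributes.

```python
import array
import struct

class PlainBuffer(object):
    """Hypothetical array-like: no base class, just descriptive attributes."""
    def __init__(self, values, shape):
        self.data = array.array("d", values)   # flat storage of C doubles
        self.itemsize = self.data.itemsize
        self.shape = tuple(shape)
        # Row-major strides in bytes, built from the last axis backwards.
        strides, step = [], self.itemsize
        for n in reversed(self.shape):
            strides.append(step)
            step *= n
        self.strides = tuple(reversed(strides))

def get_item(arr, index):
    """Generic routine: relies only on arr.data, arr.shape, arr.strides."""
    for i, n in zip(index, arr.shape):
        if not 0 <= i < n:
            raise IndexError(index)
    offset = sum(i * s for i, s in zip(index, arr.strides))
    return struct.unpack_from("d", arr.data, offset)[0]

b = PlainBuffer(range(6), shape=(2, 3))
print(b.strides, get_item(b, (1, 2)))
```

Any other object exposing the same three attributes would work with get_item unchanged, which is the property being asked for. The coercion hooks for ufuncs are a separate question that this sketch does not address.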

