[Numpy-discussion] Response to PEP suggestions

Thu Feb 17 17:01:27 EST 2005

David M. Cooke wrote:

>Travis Oliphant <oliphant at ee.byu.edu> writes:
>
>  
>
>>I'm glad to get the feedback.
>>
>>1) Types
>>
>>I like Francesc's suggestion that .typecode return a code and .type
>>return a Python class.   What is the attitude and opinion regarding
>>the use of attributes or methods for
>>this kind of thing?  It always seems to me so arbitrary as to what is
>>an attribute or what
>>is a method.
>>    
>>
>
>If it's an intrinsic attribute (heh) of the object, I usually try to
>make it an attribute. So I'd make these attributes.
>
>  
>
>>There will definitely be support for the numarray-style type
>>specification.   Something like that will be how they print (I like
>>the 'i4', 'f4', specification a bit better though). There will also be
>>support for specification in terms of a c-type.  The typecodes will
>>still be there, underneath.
>>    
>>
>
>+1. I think labelling types with their sizes at some level is necessary
>for cross-platform compatibility (more below).
>
>  
>
>>One thing has always bothered me though.  Why is a double complex type
>>Complex64 and a float complex type Complex32?  This seems to break
>>the idea that the number at the end specifies a bit width.   Why don't
>>we just call them Complex64 and Complex128?  Can we change this?
>>    
>>
>
>Or rename to ComplexFloat32 and ComplexFloat64?
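The arithmetic behind the proposed renaming can be checked with a minimal sketch. It assumes only the standard-library struct module; the variable names are illustrative and do not come from Numeric or numarray.

```python
import struct

# Size in bits of a C float and a C double in struct's "standard" mode
# ("="), where the sizes are fixed regardless of platform.
float_bits = struct.calcsize("=f") * 8
double_bits = struct.calcsize("=d") * 8

# A complex value stores two components (real and imaginary), so its
# total width is twice the width of one component.
complex_from_float_bits = 2 * float_bits
complex_from_double_bits = 2 * double_bits

print(complex_from_float_bits, complex_from_double_bits)
```

Under a "number means total bit width" convention, these two totals are the numbers that would appear in the type names.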
>
>  
>
>>I'm also glad that some recognize the problems with always requiring
>>specification of types in terms of bit-width or byte-widths as these
>>are not the same across platforms.  For some types (like Int8 or
>>Int16) this is not a problem.   But what about long double?  On an
>>intel machine long double is Float96 while on a PowerPC it is
>>Float128.   Wouldn't it just be easier to specify LDouble or 'g' than
>>to special-case your code?
>>    
>>
>
>One problem to consider (and where I first ran into this type of
>thing) is when pickling. A pickle containing an array of Int isn't
>portable, if the two machines have a different idea of what an Int is
>(Int32 or Int64, for instance). Another reason to keep the byte-width.
>
>LDouble, for instance, should probably be an alias to Float96 on
>Intel, and Float128 on PPC, and pickle accordingly.
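The portability concern can be illustrated with a small sketch that uses the standard-library struct module in place of a real array pickle. The helper names dump_int32 and load_int32 are hypothetical, and the choice of little-endian 4-byte integers is an assumption made for the example.

```python
import struct

values = [1, -2, 300000]

# Native layout: "@l" is a C long, whose size depends on the platform,
# so data written this way may not be readable on another machine.
native_long_bytes = struct.calcsize("@l")

def dump_int32(seq):
    """Serialize integers as explicit little-endian 4-byte values."""
    return struct.pack("<%di" % len(seq), *seq)

def load_int32(data):
    """Inverse of dump_int32; the element width is known to be 4 bytes."""
    count = len(data) // 4
    return list(struct.unpack("<%di" % count, data))

payload = dump_int32(values)
print(native_long_bytes, len(payload), load_int32(payload))
```

Because the width is part of the format rather than a property of the machine, the payload round-trips regardless of what the native long happens to be.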
>
>  
>
>>Problems also exist when you are interfacing with hardware or other C
>>or Fortran code.  You know you want single-precision floating point.
>>You don't know or care what the bit-width is.    I think with the
>>Integer types the bit-width specification is more important than
>>floating point types.  In sum, I think it is important to have the
>>ability to specify it both ways.   When printing the array, it's
>>probably better if it gives bit-width information.  I like the way
>>numarray prints arrays.
>>    
>>
>
>Do you mean adding bit-width info to str()? repr() definitely needs
>it, and it should be included in all cases, I think.
>
>You also run into the problem that sizeof(Python integer) isn't necessarily
>sizeof(C int) (a Python int being a C long), especially on 64-bit systems.
>
>I come from a C background, so things like Float64, etc., look wrong.
>I think more in terms of single- and double-precision, so I suggest
>adding some more descriptive types:
>
>CInt         (would be either Int32 or Int64, depending on the platform)
>CFloat       (can't do Float, for backwards-compatibility reasons)
>CDouble      (could just be Double)
>CLong        (or Long)
>CLongLong    (or LongLong)
>
>That could make it easier to match types in Python code to types in C
>extensions.
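One way to picture such aliases is to resolve each C-style name to a sized name on the running platform. The sketch below assumes the standard-library ctypes module; the table c_types and the function sized_name are hypothetical and are not part of Numeric, numarray, or the draft PEP.

```python
import ctypes

# Hypothetical alias table: C-style names mapped to ctypes types.
c_types = {
    "CInt": ctypes.c_int,
    "CFloat": ctypes.c_float,
    "CDouble": ctypes.c_double,
    "CLong": ctypes.c_long,
    "CLongLong": ctypes.c_longlong,
    "LDouble": ctypes.c_longdouble,
}

FLOAT_ALIASES = ("CFloat", "CDouble", "LDouble")

def sized_name(alias):
    """Return a name such as 'Int32' or 'Float64' for this platform.

    The width is the storage size reported by ctypes.sizeof, which
    includes any padding the platform adds (relevant for long double).
    """
    bits = ctypes.sizeof(c_types[alias]) * 8
    kind = "Float" if alias in FLOAT_ALIASES else "Int"
    return "%s%d" % (kind, bits)

for alias in sorted(c_types):
    print(alias, "->", sized_name(alias))
```

The output for CLong and LDouble differs between platforms, which is exactly why a pickle would want to record the sized name rather than the alias.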
>  
>
I guess the issue revolves around the characteristics of the target
users: if most are C aficionados, then the above has merit.  However,
this doesn't provide for the Int8's or the Int16's.  Neither does it 
provide for a bit array, which would be suitable for Booleans.

My guess is that most users would not be from a C background and so 
something along the lines of numerictypes makes sense.

>Oh, and the Python types int and float should be allowed (especially
>if you want this to go in the core!).
>
>And a Fortran integer could be something else, but I think that's
>more of a SciPy problem than Numeric or numarray. It could add
>FInteger and FBoolean, for instance.
>
>  
>
>>2) Multidimensional array indexing.
>>
>>Sometimes it is useful to select out of an array some elements based
>>on its linear (flattened) index in the array.   MATLAB, for example,
>>will allow you to take a three-dimensional array and index it with a
>>single integer based on its Fortran-order:  x(1,1,1),  x(2,1,1), ...
>>
>>What I'm proposing would have X[K] essentially equivalent to
>>X.flat[K].  The problem with always requiring the use of X.flat[K] is
>>that X.flat does not work for discontiguous arrays.   It could be made
>>to work if X.flat returned some kind of specially-marked array, which
>>would then have to be checked every time indexing occurred for any
>>array.  Or, there may be some way to have X.flat return an "indexable
>>iterator" for X which may be a more Pythonic thing to do anyway.  That
>>could solve the problem and solve the discontiguous X.flat problem as
>>well.
>>
>>If we can make X.flat[K] work for discontiguous arrays, then I would
>>be very happy to not special-case the single index array but always
>>treat it as a 1-tuple of integer index arrays.
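To make the notion of a linear index concrete, here is a minimal sketch of how a single 0-based integer maps to subscripts in Fortran order (first index varies fastest, as in the MATLAB example) and in C order (last index varies fastest). The function unravel is hypothetical and is not proposed API.

```python
def unravel(k, shape, order="C"):
    """Convert a 0-based linear index k into a tuple of subscripts."""
    size = 1
    for n in shape:
        size *= n
    if not 0 <= k < size:
        raise IndexError("linear index out of range")

    # Peel off the fastest-varying dimension first.
    dims = list(shape) if order == "F" else list(reversed(shape))
    subs = []
    for n in dims:
        subs.append(k % n)
        k //= n
    return tuple(subs) if order == "F" else tuple(reversed(subs))

shape = (2, 3, 4)
print(unravel(1, shape, order="F"))  # second element in Fortran order
print(unravel(1, shape, order="C"))  # second element in C order
```

MATLAB counts from 1, so its x(2,1,1) corresponds to k = 1 in Fortran order here.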
>>    
>>
>
>Right now, I find X.flat to be pretty useless, as you need a
>contiguous array. I'm +1 on making X.flat work in all cases (contiguous
>and discontiguous). Either
>
>a) X.flat returns a contiguous 1-dimensional array (like ravel(X)),
>   which may be a copy of X
>
>or
>
>b) X.flat returns a "flat-indexable" view of X
>
>I'd argue for b), as I feel that attributes should operate as views,
>not as potential copies. To me, attributes "feel like" they do no
>work, so making a copy by mere dereferencing would be surprising.
>
>If a), I'd rather flat() be a method (or have a ravel() method).
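Option b) might look roughly like the sketch below: a view object that translates a flat index into a position in the underlying buffer by way of the shape and strides, so no copy is needed for a discontiguous array. The class FlatView is hypothetical, a plain Python list stands in for the array memory, and strides are counted in elements rather than bytes to keep the example short.

```python
class FlatView:
    """Hypothetical flat-indexable view over a strided buffer (C order)."""

    def __init__(self, buffer, shape, strides, offset=0):
        self.buffer = buffer
        self.shape = tuple(shape)
        self.strides = tuple(strides)  # in elements, not bytes
        self.offset = offset

    def __len__(self):
        size = 1
        for n in self.shape:
            size *= n
        return size

    def _position(self, k):
        n = len(self)
        if k < 0:
            k += n
        if not 0 <= k < n:
            raise IndexError("flat index out of range")
        pos = self.offset
        # Last dimension varies fastest, so peel it off first.
        for dim, stride in zip(reversed(self.shape), reversed(self.strides)):
            pos += (k % dim) * stride
            k //= dim
        return pos

    def __getitem__(self, k):
        return self.buffer[self._position(k)]

    def __setitem__(self, k, value):
        self.buffer[self._position(k)] = value


# A 3x4 row-major buffer; the view selects every other column, which
# gives a discontiguous 3x2 array with element strides (4, 2).
buffer = list(range(12))
flat = FlatView(buffer, shape=(3, 2), strides=(4, 2))
print(list(flat))
```

Assignments through the view land in the original buffer, which is the behaviour that distinguishes b) from the possibly-copying a).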
>
>
>I think overloading X[K] starts to run into trouble: too many special
>cases.
>  
>
As someone else said, the draft PEP needs to have a much clearer 
statement of what datatype K is and just what X[K] would mean.

Colin W.