
Travis Oliphant <oliphant@ee.byu.edu> writes:
I'm glad to get the feedback.
1) Types
I like Francesc's suggestion that .typecode return a code and .type return a Python class. What is the attitude and opinion regarding the use of attributes or methods for this kind of thing? It always seems to me so arbitrary as to what is an attribute or what is a method.
If it's an intrinsic attribute (heh) of the object, I usually try to make it an attribute. So I'd make these attributes.
There will definitely be support for the numarray-style type specification. Something like that will be how they print (I like the 'i4', 'f4' specification a bit better, though). There will also be support for specification in terms of a C type. The typecodes will still be there, underneath.
+1. I think labelling types with their sizes at some level is necessary for cross-platform compatibility (more below).
One thing has always bothered me, though. Why is a double complex type Complex64, and a float complex type Complex32? This seems to break the idea that the number at the end specifies a bit width. Why don't we just call them Complex64 and Complex128? Can we change this?
Or rename to ComplexFloat32 and ComplexFloat64?
I'm also glad that some recognize the problems with always requiring specification of types in terms of bit-widths or byte-widths, as these are not the same across platforms. For some types (like Int8 or Int16) this is not a problem. But what about long double? On an Intel machine long double is Float96, while on a PowerPC it is Float128. Wouldn't it just be easier to specify LDouble or 'g' than to special-case your code?
One problem to consider (and where I first ran into this type of thing) is pickling. A pickle containing an array of Int isn't portable if the two machines have a different idea of what an Int is (Int32 or Int64, for instance). That's another reason to keep the byte-width. LDouble, for instance, should probably be an alias to Float96 on Intel and Float128 on PPC, and pickle accordingly.
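Roughly what I'm picturing for resolving such an alias at pickle time (just a sketch, not anything in Numeric or numarray; sized_name_for_long_double is a made-up helper, and ctypes is used only to get at the platform's storage size):

    import ctypes

    def sized_name_for_long_double():
        # Resolve the platform alias to a concrete storage size:
        # 96 bits on 32-bit x86, 128 bits on PPC or x86-64, etc.
        bits = ctypes.sizeof(ctypes.c_longdouble) * 8
        return "Float%d" % bits

    # A pickle would record the sized name rather than the alias,
    # so the receiving machine knows exactly what it is getting.
    print(sized_name_for_long_double())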
Problems also exist when you are interfacing with hardware or other C or Fortran code. You know you want single-precision floating point; you don't know or care what the bit-width is. I think the bit-width specification matters more for the integer types than for the floating-point types. In sum, I think it is important to have the ability to specify it both ways. When printing the array, it's probably better if it gives bit-width information. I like the way numarray prints arrays.
Do you mean adding bit-width info to str()? repr() definitely needs it, and it should be included in all cases, I think. You also run into the fact that sizeof(Python integer) isn't necessarily sizeof(C int) (a Python int being a C long), especially on 64-bit systems.

I come from a C background, so things like Float64, etc., look wrong. I think more in terms of single and double precision, so I think adding some more descriptive types would help:

CInt       (would be either Int32 or Int64, depending on the platform)
CFloat     (can't do Float, for backwards-compatibility reasons)
CDouble    (could just be Double)
CLong      (or Long)
CLongLong  (or LongLong)

That could make it easier to match types in Python code to types in C extensions. Oh, and the Python types int and float should be allowed (especially if you want this to go in the core!). And a Fortran integer could be something else, but I think that's more of a SciPy problem than a Numeric or numarray one. It could add FInteger and FBoolean, for instance.
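To make the platform dependence concrete, here is roughly how those aliases could be computed on the running platform (a sketch only, not existing code; the struct lookup is just one convenient way to get the C sizes):

    import struct

    # Map a C size in bytes to the usual sized integer name.
    _sized_int = {4: "Int32", 8: "Int64"}

    CInt      = _sized_int[struct.calcsize("i")]   # Int32 on common platforms
    CLong     = _sized_int[struct.calcsize("l")]   # Int32 or Int64, platform-dependent
    CLongLong = _sized_int[struct.calcsize("q")]   # usually Int64
    CFloat    = "Float%d" % (struct.calcsize("f") * 8)   # Float32
    CDouble   = "Float%d" % (struct.calcsize("d") * 8)   # Float64

    print("CInt=%s CLong=%s CLongLong=%s CFloat=%s CDouble=%s"
          % (CInt, CLong, CLongLong, CFloat, CDouble))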
2) Multidimensional array indexing.
Sometimes it is useful to select some elements out of an array based on their linear (flattened) index in the array. MATLAB, for example, will allow you to take a three-dimensional array and index it with a single integer based on its Fortran order: x(1,1,1), x(2,1,1), ...
What I'm proposing would have X[K] essentially equivalent to X.flat[K]. The problem with always requiring the use of X.flat[K] is that X.flat does not work for discontiguous arrays. It could be made to work if X.flat returned some kind of specially-marked array, which would then have to be checked every time indexing occurred for any array. Or, there may be some way to have X.flat return an "indexable iterator" for X, which may be a more Pythonic thing to do anyway (a rough sketch of such a view follows below). That could solve both this indexing problem and the discontiguous X.flat problem.
If we can make X.flat[K] work for discontiguous arrays, then I would be very happy to not special-case the single index array but always treat it as a 1-tuple of integer index arrays.
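A minimal sketch of what such a flat-indexable view could look like (FlatView is a made-up name, not existing Numeric or numarray code; the wrapped array is only assumed to accept tuple indexing and to have a .shape attribute):

    class FlatView:
        """Present any N-dimensional array as if it were 1-dimensional,
        translating each flat (C-order) index to a tuple index on the fly,
        so no copy is made and discontiguous arrays work too."""

        def __init__(self, arr):
            self._arr = arr
            self._shape = arr.shape

        def _unravel(self, k):
            # Convert a flat index into a tuple index, row-major order.
            idx = []
            for dim in reversed(self._shape):
                k, r = divmod(k, dim)
                idx.append(r)
            return tuple(reversed(idx))

        def __getitem__(self, k):
            return self._arr[self._unravel(k)]

        def __setitem__(self, k, value):
            self._arr[self._unravel(k)] = value

    # e.g. with a discontiguous slice b = a[::2, ::2] of some array a:
    #     FlatView(b)[3]       # fourth element in flattened order
    #     FlatView(b)[3] = 99  # assignment works the same way

X.flat could then hand back such a view, and X.flat[K] would work whether or not X is contiguous.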
Right now, I find X.flat to be pretty useless, as you need a contiguous array. I'm +1 on making X.flat work in all cases (contiguous and discontiguous). Either

a) X.flat returns a contiguous 1-dimensional array (like ravel(X)), which may be a copy of X, or
b) X.flat returns a "flat-indexable" view of X.

I'd argue for b), as I feel that attributes should operate as views, not as potential copies. To me, attributes "feel like" they do no work, so making a copy by mere dereferencing would be surprising. If a), I'd rather flat() be a method (or have a ravel() method). I think overloading X[K] starts to run into trouble: too many special cases.

--
David M. Cooke
http://arbutus.physics.mcmaster.ca/dmc/
cookedm@physics.mcmaster.ca