The PEP has:

In Python there will be a hierarchical set of classes defined:

    GenericArray (type classes)
        BooleanArray (Bool)
        NumericArray (XX is the bit-width of the type)
            SignedArray
            UnsignedArray
            IntegerArray
                SignedIntegerArray (IntXX)
                UnsignedIntegerArray (UIntXX)
            FloatArray (FloatXX)
            ComplexArray (FloatXX)
        FlexibleArray
            CharacterArray
                StringArray (String)
                UnicodeArray (Unicode)
            VoidArray (Void)
        ObjectArray

It seems that the record type is to be handled under the VoidArray. I hope that it will permit the setting and reading of fields. For example, if recs is an instance of an Array of records, then it should be possible to write:

    >>> recs[22, 5].sex = 'F'
    >>> recs[22, 5].sex
    'F'

It is not clear, from the current draft, what is intended for ObjectArray. Can these be any Python objects (tuple, list, etc.) or instances of any user-defined class?

There is also a tofile method (I would prefer toFile). Does this mean that pickling would be used for the object instances?

The term WRITABLE is used. Is this different from "mutable", the term used by Python?

"Methods such as x.r(), x.i(), and x.flatten() are proposed." Why not use properties: x.r, x.i and x.flatten? The parentheses appear redundant.

Colin W.
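The field access asked for above can be sketched in plain Python. This is only an illustration of the requested behaviour, not the PEP's design: `RecordArray`, `RecordView` and the field names are hypothetical, and a dict stands in for the void-array storage.

```python
class RecordView:
    """Hypothetical proxy for one record; attributes read and write the parent storage."""

    def __init__(self, storage, key):
        # Bypass our own __setattr__ while setting up internal state.
        object.__setattr__(self, "_storage", storage)
        object.__setattr__(self, "_key", key)

    def __getattr__(self, name):
        # Only called for names not found normally, i.e. for record fields.
        try:
            return self._storage[self._key][name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        record = self._storage[self._key]
        if name not in record:
            raise AttributeError(f"no field named {name!r}")
        record[name] = value


class RecordArray:
    """Hypothetical 2-D array of records, each with the same named fields."""

    def __init__(self, shape, fields):
        rows, cols = shape
        self.shape = shape
        self._data = {
            (r, c): dict.fromkeys(fields) for r in range(rows) for c in range(cols)
        }

    def __getitem__(self, key):
        if key not in self._data:
            raise IndexError(key)
        return RecordView(self._data, key)


recs = RecordArray((30, 10), fields=("name", "sex"))
recs[22, 5].sex = "F"
print(recs[22, 5].sex)
```

Because indexing returns a proxy rather than a copy of the record, the assignment is visible on the next lookup.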
> It seems that the record type is to be handled under the VoidArray. I hope that it will permit the setting and reading of fields.

Exactly. A Python record class can make use of the basic array structure.

This will need to be handled by the record class specifically (it will make use of a void array). I do not see a need to clutter the general array C type with this.

Along those lines, the current design is based heavily on Numeric, which places all array types under a single ArrayType. It "knows" about the type of each array instead of asking the type what it can do. I know numarray made strides in a different direction (but I'm not sure how far it got, since there is still this basic set of types). I am very willing to consider a more top-down design in which a basic array C type has no information about what is in the array (it just manages its shape, its memory usage, and the size of the memory being used). Subtypes could then be created to handle type-specific needs. This seems a more logical direction to pursue, but it is a bit of a switch and so carries even more risk. Someone from numarray could help here, perhaps.
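The top-down alternative described above can be sketched roughly as follows. This is a pure-Python illustration under assumptions of my own, not the Numeric3 or numarray design: the class names, the `fmt` attribute and the use of `struct` to decode items are all hypothetical.

```python
import struct
from math import prod


class BaseArray:
    """Knows only its shape, item size and raw memory, not what the bytes mean."""

    def __init__(self, shape, itemsize):
        self.shape = tuple(shape)
        self.itemsize = itemsize
        self.buffer = bytearray(prod(self.shape) * itemsize)

    @property
    def nbytes(self):
        return len(self.buffer)

    def _offset(self, index):
        # Row-major (C-order) byte offset of a full index tuple.
        if len(index) != len(self.shape):
            raise IndexError("need one index per dimension")
        flat = 0
        for i, n in zip(index, self.shape):
            if not 0 <= i < n:
                raise IndexError(index)
            flat = flat * n + i
        return flat * self.itemsize


class TypedArray(BaseArray):
    """Subtype layer: supplies the type-specific part, i.e. how to decode an item."""

    fmt = None  # struct format code, set by concrete subtypes

    def __init__(self, shape):
        super().__init__(shape, struct.calcsize(self.fmt))

    def __getitem__(self, index):
        return struct.unpack_from(self.fmt, self.buffer, self._offset(index))[0]

    def __setitem__(self, index, value):
        struct.pack_into(self.fmt, self.buffer, self._offset(index), value)


class Int32Array(TypedArray):
    fmt = "<i"


class Float64Array(TypedArray):
    fmt = "<d"


a = Int32Array((2, 3))
a[1, 2] = 7
print(a[1, 2], a.nbytes)
```

The base class never inspects `fmt`; everything type-specific lives in the subtypes, which is the split being proposed.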
ObjectArray is the familiar "O" type from Numeric. Yes, these objects can be any Python object whatever. Numeric already handles these (and numarray recently added them).
> There is also a tofile method (I would prefer toFile). Does this mean that pickling would be used for the object instances?

The naming convention is lowercase (camelCase is reserved for ClassInstances). As for pickling, I have not thought that far, but probably...
> The term WRITABLE is used. Is this different from "mutable", the term used by Python?

MUTABLE is a better term. Do numarray folks agree?
On methods versus properties: the distinction is that attributes do not return copies, but methods might. Thus, the idea is that x.i() would return zeros if the array were not complex, but x.imag would raise an error. I like the idea of attributes being intrinsic properties of an array, while methods could conceivably return or do anything.

Thanks for your continued help with clarification and improvement to the PEP.

-Travis
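A minimal sketch of the error-versus-zeros distinction drawn above. `SketchArray` is a hypothetical stand-in built on Python lists, the choice of `TypeError` is an assumption (the message only says "an error"), and view semantics are deliberately not modelled here.

```python
class SketchArray:
    """Hypothetical stand-in for an array; a list of numbers, real or complex."""

    def __init__(self, values):
        self._values = list(values)
        self.is_complex = any(isinstance(v, complex) for v in self._values)

    @property
    def imag(self):
        # Attribute: an intrinsic property, so it fails when there is no imaginary part.
        if not self.is_complex:
            raise TypeError("real array has no imaginary part")
        return [complex(v).imag for v in self._values]

    def i(self):
        # Method: free to build a new object, so a real array yields zeros.
        if not self.is_complex:
            return [0.0] * len(self._values)
        return self.imag


real = SketchArray([1.0, 2.0])
print(real.i())          # zeros, no error
cplx = SketchArray([1 + 2j, 3.0])
print(cplx.imag)
```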
Hi Travis,

First off, let me say that I'm encouraged to see some action towards unifying Numeric/Numarray; the split has been somewhat dismaying. Thank you for your efforts in this regard.

I'd like to lobby against flatten(), r() and i(). To a large extent, these duplicate the functionality of flat, real and imag. In addition, these three methods are defined to sometimes return copies and sometimes return views. That type of interface is a recipe for errors and should only be used as a last resort. Fortunately, in this case there are better alternatives.

flatten() is not necessary now that flat will be a faux array with a view to the original [I believe you are calling it an indexable iterator]. I would, however, recommend that A.flat.copy() work. In that case, A.flat would be used when no copy was desired, and A.flat.copy() when a copy was desired. I don't find the copy-when-discontiguous case useful enough to deserve its own function, and it's error prone, as I'll discuss more below.

r() appears to be around just for symmetry with i(), since A.r() will always be the same as A.real.

That leaves i(). My opinion is that this case would be better served by returning a read-only array of zeros when operating on a real array. This array could even be a faux array that doesn't allocate any space, although that may be a project for another day.

If it's really deemed necessary to have these functions in addition to their attribute brethren, I recommend that they always return copies rather than varying their behaviour depending on the situation. The problem with methods that sometimes return a copy is that it won't be long before someone types:

    def foobar(a):
        flat_view = a.flatten()
        # lots of code
        flat_view[some_index] = some_new_number

This works until someone passes in a discontiguous array, at which point it fails mysteriously. This type of problem tends to be somewhat resistant to unit tests, since tests often involve only contiguous arrays.

Regards,

-tim
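The failure mode described in this message can be reproduced with a small pure-Python stand-in. `Strided1D` is hypothetical and one-dimensional for brevity; its `flatten()` copies only when the step is not 1, which is the contested view-or-copy behaviour rather than any real library's implementation.

```python
class Strided1D:
    """Hypothetical 1-D array: a window onto a shared list, with a step."""

    def __init__(self, base, start=0, step=1, length=None):
        self.base = base
        self.start = start
        self.step = step
        self.length = len(base) if length is None else length

    def is_contiguous(self):
        return self.step == 1

    def __getitem__(self, i):
        return self.base[self.start + i * self.step]

    def __setitem__(self, i, value):
        self.base[self.start + i * self.step] = value

    def flatten(self):
        # The contested behaviour: a view when contiguous, otherwise a copy.
        if self.is_contiguous():
            return self
        return Strided1D([self[i] for i in range(self.length)])


def zero_first(a):
    flat_view = a.flatten()
    flat_view[0] = 0  # intended to modify `a` in place


data = [1, 2, 3, 4, 5, 6]

contiguous = Strided1D(data)
zero_first(contiguous)
print(data[0])            # the write reached the shared data

every_other = Strided1D(data, start=1, step=2, length=3)
zero_first(every_other)
print(every_other[0])     # unchanged: the write went to a hidden copy
```

A test suite that only ever builds the contiguous case would pass, which is the point made above about unit tests.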
Tim Hochberg wrote:
There is, however, a blisteringly common use case for such an interface: you are using the result directly in an expression, so it is only going to be read and never written to. In that case, you want it to never fail (except in truly pathological cases like being out of memory), and you want it to be as efficient as possible, so it should never produce a copy where it can produce a view.

So, I think we need three interfaces for each attribute of this kind:

1) Getting a view. If a view cannot be obtained, raise an error. Never copy.
2) Getting a copy. Never return a view.
3) Getting *something* the most efficient way possible. Caller beware.

While I lean towards making the syntaxes for the first two The Most Obvious Ways To Do It, I think it may be rather important to keep the syntax of the third convenient and short, particularly since that is the case that usually occurs in the middle of already-complicated expressions.

--
Robert Kern
rkern@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
  -- Richard Harter
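The three interfaces listed in this message might look like the sketch below. All names are hypothetical, `ValueError` is an assumed error type, and contiguity is reduced to a flag on a stand-in `Block` class rather than computed from strides.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Stand-in for an array: some items plus a flag saying whether they are contiguous."""

    items: list
    contiguous: bool = True


def as_view(block):
    """Interface 1: a view or an error, never a copy."""
    if not block.contiguous:
        raise ValueError("cannot take a view of discontiguous data")
    return block.items  # the same list object, so writes reach the owner


def as_copy(block):
    """Interface 2: always a fresh copy, never a view."""
    return list(block.items)


def as_cheapest(block):
    """Interface 3: a view when possible, else a copy. Read-only use; caller beware."""
    return block.items if block.contiguous else list(block.items)


packed = Block([1, 2, 3])
sparse = Block([1, 2, 3], contiguous=False)

# Typical use of interface 3: read once inside an expression, never write.
total = sum(as_cheapest(packed)) + sum(as_cheapest(sparse))
print(total)
```

Only `as_cheapest` has a result whose aliasing depends on the input, which is why it carries the "caller beware" label.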
Robert Kern wrote:

> 1) Getting a view. If a view cannot be obtained, raise an error. Never copy.

The proposal for flat is to always return a view; if the array is not contiguous, a special view-of-a-discontiguous-array will be returned. This special object would obviously be less efficient to index than a contiguous array, but if implemented carefully it could probably be made more efficient than a copy plus indexing for one-shot uses (i.e., in an expression).

> 2) Getting a copy. Never return a view.
> 3) Getting *something* the most efficient way possible. Caller beware.

By this you mean return a view if contiguous, a copy if not, correct?
Personally I don't find this all that compelling, primarily because the blanket statement that (3) is more efficient than (1) is extremely suspect. In many, if not most, cases (1) [as clarified above] will be more efficient than (3) anyway. Our disagreement here may stem from having different understandings of the proposed behaviour for flat.

There are cases where you probably *do* want to copy if discontiguous, but not in expressions. I'm thinking of cases where you are going to be reusing some flattened slice of a larger array multiple times, and you plan to use it read-only (already that's sounding pretty rare though). In that case, my preferred spelling is:

    a_contig_flat_array = ascontiguous(an_array).flat

In other words, I'd like to segregate all of the sometimes-copy behaviour into functions called asXXX, so that it's easy to see and easier to debug when it goes wrong. Of course, ascontiguous doesn't exist at present, as far as I can tell, but it'd be easy enough to add:

    from Numeric import asarray

    def ascontiguous(a):
        # Return `a` itself when already contiguous, otherwise a contiguous copy.
        a = asarray(a)
        if not a.iscontiguous():
            a = a.copy()
        return a

Regards,

-tim
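The "special view-of-a-discontiguous-array" proposed for flat can be sketched as an indexable iterator that maps a flat index onto the parent's storage without copying. `Window2D` and `FlatView` are hypothetical names, and a row-major Python list stands in for array memory; this is an illustration of the idea, not the proposed implementation.

```python
class Window2D:
    """Hypothetical 2-D window onto a row-major list; discontiguous if narrower than the parent."""

    def __init__(self, base, base_cols, row0, col0, rows, cols):
        self.base, self.base_cols = base, base_cols
        self.row0, self.col0 = row0, col0
        self.shape = (rows, cols)

    @property
    def flat(self):
        return FlatView(self)


class FlatView:
    """Indexable iterator over a Window2D in row-major order; never copies."""

    def __init__(self, window):
        self.window = window

    def __len__(self):
        rows, cols = self.window.shape
        return rows * cols

    def _offset(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        w = self.window
        r, c = divmod(i, w.shape[1])
        return (w.row0 + r) * w.base_cols + (w.col0 + c)

    def __getitem__(self, i):
        return self.window.base[self._offset(i)]

    def __setitem__(self, i, value):
        self.window.base[self._offset(i)] = value

    def __iter__(self):
        return (self[i] for i in range(len(self)))

    def copy(self):
        # The explicit spelling for "I want a copy".
        return list(self)


base = list(range(12))  # a 3x4 array stored row-major
right = Window2D(base, base_cols=4, row0=0, col0=2, rows=3, cols=2)
right.flat[2] = -1      # writes through to the parent storage
snapshot = right.flat.copy()
print(snapshot)
```

Each index costs a `divmod` and a multiply, which is the extra indexing cost acknowledged above, but a one-shot read never pays for a full copy.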
A small plea: exciting work on Numeric3 notwithstanding, would it be possible to fix setup.py in Numeric 23.7 so that it doesn't fall over on Linux? A few versions ago, if BLAS and LAPACK weren't installed or weren't found, NumPy would automatically build with its own versions of BLAS_lite and LAPACK_lite. Now it just falls over, and it is very inconvenient to have to instruct users of our app, which depends on Numeric, how to patch setup.py, and/or to provide a replacement setup.py.

Also, the new home page for Numpy, to which http://numpy.sf.net is redirected, does not have a link to the SourceForge project page for Numpy, from which released tarballs or zip files can be downloaded. That's another source of grief for users, who may not know to go to http://www.sf.net/projects/numpy

Thanks,

Tim C
participants (7)

- Colin J. Williams
- Ralf Juengling
- Robert Kern
- Tim Churches
- Tim Hochberg
- Todd Miller
- Travis Oliphant