Re: [Matrix-SIG] An Experiment in code-cleanup.

Travis Oliphant writes:
 > > 1) The re-use of temporary arrays -- to conserve memory.
 >
 > Please elaborate about this request.
When Python evaluates the expression:
    Y = B*X + A
where A, B, X, and Y are all arrays, B*X creates a temporary array, T. A new array, Y, will be created to hold the result of T + A, and T will be deleted. If T and Y have the same shape and typecode, then instead of creating Y, T can be re-used to conserve memory.
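To make the idea concrete, here is a minimal sketch of the temporary re-use described above. It assumes a present-day NumPy, where ufuncs accept an `out=` argument; that API is not part of the NumPy discussed in this thread, and the helper name `multiply_add` is purely illustrative.

```python
import numpy as np


def multiply_add(B, X, A):
    """Compute B*X + A, re-using the temporary when shape and type allow."""
    T = np.multiply(B, X)  # the temporary array T from the text

    reusable = (
        isinstance(T, np.ndarray)                    # scalars cannot be written into
        and np.broadcast(T, A).shape == T.shape      # result has the same shape as T
        and np.result_type(T, A) == T.dtype          # result has the same type as T
    )
    if reusable:
        np.add(T, A, out=T)  # store T + A back into T, no second allocation
        return T
    return T + A  # otherwise fall back to creating a new array


B = np.array([1.0, 2.0, 3.0])
X = np.array([4.0, 5.0, 6.0])
A = np.array([0.5, 0.5, 0.5])
Y = multiply_add(B, X, A)
print(Y)
```

The explicit check mirrors the "same shape and typecode" condition: when it fails (say, integer B and X with a floating-point A), the sketch simply allocates a fresh result.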
 > > 2) A copy-on-write option -- to enhance performance.
 >
 > I need more explanation of this as well.
This would be an advanced feature of arrays that use memory-mapping or access their data from disk. It is similar to the secondary cache of a CPU. The data is held in memory until a write request is made.
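As a rough illustration of copy-on-write over a memory-mapped file, the sketch below uses the `mmap` module from Python's standard library with `ACCESS_COPY`. This is only an analogy for the proposed array option, not an implementation of it; the function name and the sample bytes are made up for the example.

```python
import mmap
import os
import tempfile


def copy_on_write_demo():
    """Write through a copy-on-write mapping and compare with the file on disk."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"0123456789")

        with open(path, "r+b") as f:
            # ACCESS_COPY: writes change the in-memory pages only,
            # the underlying file is left untouched.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as view:
                view[0:4] = b"ABCD"
                in_memory = bytes(view[:])

        with open(path, "rb") as f:
            on_disk = f.read()
        return in_memory, on_disk
    finally:
        os.remove(path)


in_memory, on_disk = copy_on_write_demo()
print(in_memory, on_disk)
```

The mapped view sees the modification while the file keeps its original contents, which is the behaviour a copy-on-write array would need from its backing store.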
For mixed-type (or object) arrays containing strings, zeros() and ones() would be confusing. Therefore by default, integer and floating types are initialized to 0 and string types to ' ', and the option would be available to not initialize the array for performance.
No, not exactly. But the last time I looked, I thought some improvements could be made to it.
When I last spoke to Jim about this at IPC6, I was under the impression that IEEE support was not fully implemented and much work still needed to be done. Has this situation changed since then?
recordmodule.c is part of my PyFITS module for dealing with FITS files. You can find it here:

    ftp://ra.stsci.edu/pub/barrett/PyFITS_0.3.tgz

I use NumPy to access fixed-type arrays and the record type for accessing mixed-type arrays. A common example is accessing the second element of a mixed-type (i.e. an object) from the entire array. This returns a record type with a single element, which is equivalent to a NumPy array of fixed type. Therefore users expect this object to be a NumPy array, and it isn't. They have to convert it to one.
Note that NumPy already has some support for an Object type. It has been proposed that it be removed, because it is not well supported and hence few people use it. I have the contrary opinion and feel we should enhance the Object type and make it much more usable. If you don't need it, then you don't have to use it. This enhancement really shouldn't get in the way of those who only use fixed-type arrays.

So what changes to NumPy are needed?

1) Instead of a typecode (or in addition to the typecode for backward compatibility), I suggest an optional format keyword, which can be used to specify the mixed-type or object format. Namely, format = 'i, f, s10', where 'i' is an integer type, 'f' a floating point type, and 's10' is a string of 10 characters.

2) Array access will be the same as it is now. For example:

    # Create a 10x10 mixed-type array.
    A = array((10, 10), format = 'i, f, s10')

    # Create a 10x10 fixed-type array.
    B = array((10, 10), typecode = 'i')

    # Print a 5x5 subarray of mixed-type.
    print A[:5,:5]

    # Print a 5x5 subarray of fixed-type
    print B[:5,:5]
    # Or
    # (Note that the 3rd index is optional for fixed-type arrays, it
    # always defaults to 0.)
    print B[:5,:5,0]

    # Print the second element of the mixed-type of the entire array.
    # Note that this is now an array of fixed-type.
    print A[:,:,1]

The major thorn that I see at this point is how to reconcile the behavior of numbers and strings during operations. But I don't see this as an intractable problem. I actually believe this enhancement will encourage us to create a better and more generic multi-dimensional array module by concentrating on the behavioral aspects of this extension type.

Note that J, which NumPy is based upon, allows such mixed-types.

-- 
Dr. Paul Barrett       Space Telescope Science Institute
Phone: 410-516-6714    DESD/DPT
FAX:   410-516-8615    Baltimore, MD 21218
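For comparison, the mixed-type array proposed above can be approximated with the structured dtypes of present-day NumPy. This is a sketch under that assumption, not the `format` keyword from the proposal: the dtype string `'i4, f4, S10'` and the automatic field names `f0`, `f1`, `f2` are NumPy's own conventions, and the field is selected by name rather than by a third index.

```python
import numpy as np

# A 10x10 mixed-type array: 32-bit integer, 32-bit float, 10-byte string.
A = np.zeros((10, 10), dtype="i4, f4, S10")

# A 10x10 fixed-type array.
B = np.zeros((10, 10), dtype="i4")

# 5x5 subarrays are taken the same way for both kinds of array.
sub_mixed = A[:5, :5]
sub_fixed = B[:5, :5]

# The second element of every record, over the entire array.
# The result is an ordinary fixed-type (float) array.
second = A["f1"]

print(sub_mixed.shape, sub_fixed.shape)
print(second.dtype, second.shape)
```

Selecting a single field gives back a plain array of one type, which is the behaviour asked for in the last step of the example in the proposal.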

I'd suggest going all the way and making it a real object, not just a string. That object can then have useful attributes, like size in bytes, maxval, minval, some indication of precision, etc. Logically, itemsize should be an attribute of the numeric type of an array, not of the array itself.

--david ascher
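A type object along these lines can be sketched with what present-day NumPy offers: `np.dtype` objects carry `itemsize`, while `np.iinfo` and `np.finfo` report limits and precision. These are assumptions about a much later NumPy than the one under discussion, shown only to illustrate the idea.

```python
import numpy as np

# The numeric type is an object in its own right, not just a typecode string.
int_type = np.dtype("int32")
float_type = np.dtype("float64")

# Limits and precision are looked up from the type, not from any array.
int_limits = np.iinfo(int_type)
float_limits = np.finfo(float_type)

print(int_type.itemsize, int_limits.min, int_limits.max)
print(float_type.itemsize, float_limits.precision, float_limits.max)

# An array refers to its type object; itemsize is reached through it.
values = np.zeros(3, dtype=int_type)
print(values.dtype.itemsize)
```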
