
I thought I would clarify some historical issues (at least as far as I recall them) to help explain how things got to be the way they are today.
Travis Oliphant wrote:
Francesc Altet wrote:
I think numarray has made some incredible strides in showing what the numeric object needs to be and in implementing some really neat functionality. I just think its combination of Python and C code must be redone to overcome the speed issues that have arisen. My opinion after perusing the numarray code is that it would be easier (for me anyway) to adjust Numeric to support the features of numarray, than to re-write and re-organize the relevant sections of numarray code. One of the advantages of Numeric is its tight implementation that added only two fundamental types, both written entirely in C. I was hoping that the Python dependencies for the fundamental types would fade as numarray matured, but it appears to me that this is not going to happen.
When we started this, one of those who suggested putting much of the code in Python was Guido himself, if I recall correctly (I'd have to dig up the message he wrote on the subject to see what exactly he said). That was one of the factors influencing us to go that route (as well as others mentioned below).
I did not have the time in the past to deal with this. I wish I had looked at it more closely two years ago. If I had done this I would have seen how to support the features that Perry wanted without completely re-writing everything. But, then again, Python 2.2 changed what is possible on the C level and that has had an impact on the discussion.
Indeed, numarray was started well before Python 2.2, and that was another reason it wasn't done in C. Knowing what would have been available in 2.2 would likely have changed the approach used.
- Memory-mapped objects: Allow working with on-disk numarray objects as if they were in-memory.
Numeric3 supports this cleanly, and old Numeric did too (there was a memory-mapped module); it's just that byteswapping and alignment had to be done manually.
Just to clarify (it may not be immediately apparent to those who haven't had to deal with it), many memory-mapped files do not use the machine representation (as is often the case for astronomical data files). Memory mapping isn't nearly as useful if one has to create temporaries to handle these cases.
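To make the byte-order point concrete, here is a minimal sketch using only Python's standard mmap and struct modules. It is not numarray or Numeric code; the helper names and the big-endian 32-bit float layout are assumptions chosen for illustration. It decodes values directly from the mapping, one at a time, rather than first copying the whole file into a byteswapped temporary.

```python
import mmap
import os
import struct
import tempfile

def write_big_endian_floats(path, values):
    """Write values as big-endian 32-bit floats (non-native on x86 machines)."""
    with open(path, "wb") as f:
        f.write(struct.pack(">%df" % len(values), *values))

def read_mapped_float(mm, index):
    """Decode one big-endian float straight from the mapping.

    Only the 4 bytes at the requested offset are touched; no byteswapped
    copy of the whole file is created.
    """
    return struct.unpack_from(">f", mm, index * 4)[0]

def demo():
    # Hypothetical data file, created here only so the sketch is runnable.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_big_endian_floats(path, [1.0, 2.5, -3.0])
        with open(path, "rb") as f:
            mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
            try:
                return [read_mapped_float(mm, i) for i in range(3)]
            finally:
                mm.close()
    finally:
        os.remove(path)
```

An array type that understands byte order would do this decoding transparently on element access, which is the convenience being argued for here.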
- RecArrays: Objects that allow dealing with heterogeneous datasets (tables) in an efficient manner. This ought to be very beneficial in many fields.
Heterogeneous arrays are the big one for old Numeric. They are a good idea. In Numeric3 this has required far fewer changes than I had at first imagined.
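For readers unfamiliar with the idea, a heterogeneous (record) array is essentially a table whose rows are fixed-size records packed into one contiguous buffer. The following is a rough sketch with the standard struct module, not the RecArray API; the record layout (a 32-bit integer, a 64-bit float and a 4-byte name) and the function names are hypothetical.

```python
import struct

# Hypothetical record layout: int32 id, float64 value, 4-byte name.
# "<" selects little-endian with no padding, so each record is 16 bytes.
RECORD = struct.Struct("<id4s")

def pack_table(rows):
    """Pack a list of (id, value, name) tuples into one contiguous buffer."""
    buf = bytearray(RECORD.size * len(rows))
    for i, row in enumerate(rows):
        RECORD.pack_into(buf, i * RECORD.size, *row)
    return buf

def read_column(buf, field):
    """Return one field (by position) from every record in the buffer."""
    return [rec[field] for rec in RECORD.iter_unpack(buf)]
```

For example, read_column(pack_table(rows), 1) pulls the float column out of the table. A real record array adds named fields and vectorized access on top of the same memory layout.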
Numeric has had this in mind for some time. In fact the early Numeric developers were quite instrumental in getting significant changes into Python itself, including Complex Objects, Ellipses, and Extended Slicing. Guido was quite keen on the idea of including Numeric at one point. Our infighting made him lose interest, I think. So claiming this as an advantage of numarray over Numeric is simply inaccurate.
Actually, as I understand it, Guido had ruled out including Numeric well before numarray even got started. I can't claim to know exactly his reasons (I've heard various things such as he looked at the code and didn't like it, or Paul Dubois and others advised him against it; I can't say exactly why), but I am sure that that decision was made before numarray. No doubt the split has prevented any further consideration of inclusion.

Perry