[Numpy-discussion] Proposal for making of Numarray a real Numeric 'NG'
oliphant at ee.byu.edu
Sat Feb 5 23:24:16 EST 2005
Francesc Altet wrote:
>I would like to make a formal proposal regarding the subject of
>previous discussions on this list. This message is a bit long, but I've
>tried my best to lay out my thoughts as clearly as possible.
I did not have time to respond to this mail, but it is very good. I
will be placing some of its comments in the scipy site.
>It's worth remembering that Numeric has been a major breakthrough in
>introducing the capability to deal with large (homogeneous) datasets in
>Python in a very efficient manner. In my opinion Numarray is, generally
>speaking, a very good package as well, with many interesting new features
>that Numeric lacks. Among the main advantages of Numarray over Numeric I can
>list the following (although I may be a bit misled here by my own use
>cases for both libraries):
I think numarray has made some incredible strides in showing what the
numeric object needs to be and in implementing some really neat
functionality. I just think its combination of Python and C code must
be redone to overcome the speed issues that have arisen. My opinion
after perusing the numarray code is that it would be easier (for me
anyway) to adjust Numeric to support the features of numarray, than to
re-write and re-organize the relevant sections of numarray code. One of
the advantages of Numeric is its tight implementation that added only
two fundamental types, both written entirely in C. I was hoping that
the Python dependencies for the fundamental types would fade as numarray
matured, but it appears to me that this is not going to happen.
I did not have the time in the past to deal with this. I wish I had
looked at it more closely two years ago. If I had done this I would
have seen how to support the features that Perry wanted without
completely re-writing everything. But, then again, Python 2.2 changed
what is possible on the C level and that has had an impact on the
>- Memory-mapped objects: Allow working with on-disk numarray objects as if
> they were in-memory.
Numeric3 supports this cleanly and old Numeric did too (there was a
memory-mapped module); it's just that byteswapping and alignment had to
be done manually.
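
To give an idea of what doing the byteswapping by hand looks like, here is
a rough sketch. It uses only the Python standard library (mmap, struct,
array) as a stand-in and is not the API of Numeric's memory-mapped module,
which is not shown in this message. The file contents and variable names
are made up for illustration.

```python
import mmap
import os
import struct
import sys
import tempfile
from array import array

# Hypothetical data: four 32-bit integers stored big-endian on disk.
values = [1, 2, 3, 4]
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(struct.pack(">4i", *values))

try:
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Copy the mapped bytes into a typed array of native C ints.
            data = array("i", mm[:])
    # Assumption: a C int is 4 bytes here, matching the on-disk layout.
    assert data.itemsize == 4
    # The manual step: swap bytes when the host is not big-endian.
    if sys.byteorder == "little":
        data.byteswap()
finally:
    os.remove(path)

print(data.tolist())
```

An array type that records byte order itself would make the explicit
byteswap() call unnecessary.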
>- RecArrays: Objects that make it possible to deal with heterogeneous datasets
> (tables) in an efficient manner. This ought to be very beneficial in many
Heterogeneous arrays are the big one for old Numeric. It is a good
idea. In Numeric3 it has required far fewer changes than I had at first
>- CharArrays: Allow working with large amounts of fixed- and variable-length
> strings. I see this implementation as much more powerful than Numeric's.
Also a good idea, and comes along for the ride in Numeric3.
Numeric had CHAR arrays but a vision was never specified for how to
make them more useful. This change would have been a good step towards
>- Index arrays within subscripts: e.g. if ind = array([4, 4, 0, 2])
> and x = 2*arange(6), x[ind] results in array([8, 8, 0, 4])
For scipy this was implemented on top of Numeric (so it is in Numeric3
too); the multidimensional version still needs to be worked on.
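
For reference, the one-dimensional case quoted above can be sketched as
follows. This uses NumPy as a stand-in (an assumption on my part, since
neither Numeric nor numarray code appears in this message); the last two
lines are only one possible reading of what a multidimensional version
could look like.

```python
import numpy as np

# One-dimensional case from the quoted example.
ind = np.array([4, 4, 0, 2])
x = 2 * np.arange(6)          # 0, 2, 4, 6, 8, 10
picked = x[ind]               # elements 4, 4, 0, 2 of x

# A possible multidimensional analogue: one index array per axis.
m = np.arange(12).reshape(3, 4)
rows = np.array([0, 2])
cols = np.array([1, 3])
pairs = m[rows, cols]         # m[0, 1] and m[2, 3]

print(picked, pairs)
```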
>- New design interface: We should not forget that numarray has been designed
> from the ground up with Python Library integration in mind (or at least, this
> is my impression). So, it should have a better chance (if there is some hope)
> of entering the Standard Library than Numeric.
Numeric has had this in mind for some time. In fact the early Numeric
developers were quite instrumental in getting significant changes into
Python itself, including Complex Objects, Ellipses, and Extended
Slicing. Guido was quite keen on the idea of including Numeric at one
point. Our infighting made him lose interest, I think. So claiming
this as an advantage of numarray over Numeric is simply inaccurate.
>The real problem for Numarray: Object Creation Time
>On the other hand, the main drawback of Numarray vs Numeric is, in my
>opinion, its poor performance regarding object creation. This might look
>like a banal thing at first glance, but it is not in many cases. One example
>recently reported on this list is:
Ah, and there's the rub. I don't think this object creation time will
go away until Numarray's infrastructure becomes essentially like that of
Numeric. One tight object all in C. Getting it there seems harder
than fixing Numeric, with the additional features of Numarray.
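
One way to see why creation time matters is to time the construction of a
tiny array against a large one. The sketch below uses NumPy and timeit as
stand-ins (an assumption; it measures neither Numeric nor numarray), and
the helper name and sizes are made up. The point is only that for small
arrays the fixed per-object cost, not the data, dominates.

```python
import timeit

import numpy as np


def seconds_per_creation(n, repeats=2000):
    """Average wall-clock time to create one zero-filled array of length n."""
    total = timeit.timeit(lambda: np.zeros(n), number=repeats)
    return total / repeats


small = seconds_per_creation(4)
large = seconds_per_creation(100_000)

# Cost per element: for tiny arrays this is mostly object-creation overhead.
print("per element, n=4:      %.3e s" % (small / 4))
print("per element, n=100000: %.3e s" % (large / 100_000))
```

Code that creates many small arrays in a loop pays that fixed cost every
time, which is where a tight all-C object helps.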
Thanks for these comments. It is very good to hear what the most
important features for users are.