[Numpy-discussion] Proposal for making of Numarray a real Numeric 'NG'

Travis Oliphant oliphant at ee.byu.edu
Sat Feb 5 23:24:16 EST 2005


Francesc Altet wrote:

>Hi List,
>
>I would like to make a formal proposal regarding the subject of
>previous discussions on this list. This message is a bit long, but I've
>tried my best to expose my thoughts as clearly as possible.
>  
>
I did not have time to respond to this mail, but it is very good.  I  
will be placing some of its comments in the scipy site.

>It's worth remembering that Numeric has been a major breakthrough in
>introducing the capability to deal with large (homogeneous) datasets in
>Python in a very efficient manner. In my opinion Numarray is, generally
>speaking, a very good package as well, with many interesting new features
>that Numeric lacks. Among the main advantages of Numarray over Numeric I can
>list the following (although I may be a bit misled here by my own use
>cases for both libraries):
>  
>
I think numarray has made some incredible strides in showing what the 
numeric object needs to be and in implementing some really neat 
functionality.   I just think its combination of Python and C code must 
be redone to overcome the speed issues that have arisen.  My opinion 
after perusing the numarray code is that it would be easier (for me 
anyway) to adjust Numeric to support the features of numarray, than to 
re-write and re-organize the relevant sections of numarray code.  One of 
the advantages of Numeric is its tight implementation that added only 
two fundamental types, both written entirely in C.   I was hoping that 
the Python dependencies for the fundamental types would fade as numarray 
matured, but it appears to me that this is not going to happen.

I did not have the time in the past to deal with this.  I wish I had 
looked at it more closely two years ago.   If I had done this I would 
have seen how to support the features that Perry wanted without 
completely re-writing everything.   But, then again, Python 2.2 changed 
what is possible on the C level and that has had an impact on the 
discussion.   

>- Memory-mapped objects: Allow working with on-disk numarray objects as if
>  they were in-memory.
>  
>
Numeric3 supports this cleanly and old Numeric did too (there was a 
memory-mapped module); it's just that byteswapping and alignment had to 
be done manually.
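
To make the byteswapping point concrete, here is a minimal sketch. It is written against present-day NumPy as a stand-in, which is an assumption: it is not the Numeric3 or numarray API, and the file name and data are made up for illustration. The idea is that the array type carries the byte order, so a big-endian file can be mapped and used without a manual swap.

```python
import os
import tempfile

import numpy as np

# Write six big-endian 32-bit integers to a scratch file (hypothetical data).
path = os.path.join(tempfile.mkdtemp(), "data.bin")
np.arange(6, dtype=">i4").tofile(path)

# Map the file read-only. The dtype records the byte order, so arithmetic
# on the mapped array needs no explicit byteswap step.
mm = np.memmap(path, dtype=">i4", mode="r", shape=(6,))
total = int(mm.sum())
first_three = mm[:3].tolist()

del mm  # drop the reference so the mapping can be released
```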

>  
>- RecArrays: Objects that make it possible to deal with heterogeneous datasets
>  (tables) in an efficient manner. This ought to be very beneficial in many
>  fields.
>  
>
Heterogeneous arrays are the big missing feature in old Numeric.  It is 
a good idea.  In Numeric3 it has required far fewer changes than I had 
at first imagined.
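
As a rough illustration of what a heterogeneous array looks like to the user, here is a sketch using present-day NumPy structured arrays. This is an assumption made for illustration: it is neither the numarray RecArray API nor the Numeric3 one, and the field names are hypothetical. Each element is a fixed-size record, and a field can be pulled out as an ordinary array.

```python
import numpy as np

# Hypothetical record layout: an 8-byte name, a 32-bit count, a 64-bit value.
table_dtype = np.dtype([("name", "S8"), ("count", "<i4"), ("value", "<f8")])

table = np.zeros(3, dtype=table_dtype)
table["name"] = [b"a", b"b", b"c"]
table["count"] = [1, 2, 3]
table["value"] = [0.5, 1.5, 2.5]

counts = table["count"]  # one field across all records, viewed on the same buffer
row = table[1]           # a single record
```

The records are packed here, so each one occupies 8 + 4 + 8 = 20 bytes.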

>  
>- CharArrays: Allow working with large numbers of fixed- and variable-length
>  strings. I see this implementation as much more powerful than Numeric's.
>  
>
Also a good idea, and it comes along for the ride in Numeric3.   
Numeric had CHAR arrays but a vision was never specified for how to 
make them more useful.  This change would have been a good step towards 
heterogeneous arrays.

>  
>- Index arrays within subscripts: e.g. if ind = array([4, 4, 0, 2])
>  and x = 2*arange(6), x[ind] results in array([8, 8, 0, 4])
>
>  
>
For scipy this was implemented on top of Numeric (so it is in Numeric3 
too); the multidimensional version still needs to be worked on.
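
The quoted example can be spelled out as below. The sketch uses present-day NumPy as a stand-in (an assumption, not the scipy-on-Numeric implementation referred to above). The second half shows one possible meaning of the multidimensional case, where one index array per axis selects individual elements; the names grid, rows and cols are hypothetical.

```python
import numpy as np

# One-dimensional case, as in the quoted example.
ind = np.array([4, 4, 0, 2])
x = 2 * np.arange(6)          # [0, 2, 4, 6, 8, 10]
picked = x[ind]               # elements 4, 4, 0 and 2 of x

# Multidimensional case: one index array per axis, paired elementwise.
grid = np.arange(12).reshape(3, 4)
rows = np.array([0, 2])
cols = np.array([1, 3])
pairs = grid[rows, cols]      # grid[0, 1] and grid[2, 3]
```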

>- New design interface: We should not forget that numarray has been designed
>  from the ground up with Python Library integration in mind (or at least, this
>  is my impression). So, it should have a better chance (if there is some hope)
>  of entering the Standard Library than Numeric.
>  
>

Numeric has had this in mind for some time.  In fact the early Numeric 
developers were quite instrumental in getting significant changes into 
Python itself, including Complex Objects, Ellipses, and Extended 
Slicing.   Guido was quite keen on the idea of including Numeric at one 
point.  Our infighting made him lose interest I think.   So claiming 
this as an advantage of numarray over Numeric is simply inaccurate.

>The real problem for Numarray: Object Creation Time
>===================================================
>
>On the other hand, the main drawback of Numarray vs Numeric is, in my
>opinion, its poor performance regarding object creation. This might look
>like a trivial thing at first glance, but it is not in many cases. One example
>recently reported on this list is:
>  
>
Ah, and there's the rub.  I don't think this object creation time will 
go away until Numarray's infrastructure becomes essentially like that of 
Numeric: one tight object, all in C.   Getting it there seems harder 
than adding Numarray's additional features to Numeric.
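
For anyone who wants to see where creation time shows up, here is a small timing sketch. It uses present-day NumPy and the standard timeit module as stand-ins (an assumption; it measures neither Numeric nor numarray), and the sizes are arbitrary. It compares creating many tiny arrays against creating one array with the same total number of elements; the first case is dominated by per-object overhead, which is the cost under discussion.

```python
import timeit

import numpy as np

n = 1000  # number of small arrays (arbitrary choice)

# Many tiny arrays: the cost is mostly per-object creation overhead.
small = timeit.timeit(lambda: [np.zeros(3) for _ in range(n)], number=20)

# One array holding the same number of elements: a single creation.
large = timeit.timeit(lambda: np.zeros(3 * n), number=20)

print("many small arrays: %.6f s" % small)
print("one large array:   %.6f s" % large)
```

The absolute numbers depend on the machine, so only the ratio between the two is of interest.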

Thanks for these comments.   It is very good to hear what the most 
important features for users are.

-Travis





