[Numpy-discussion] Numeric3

Travis Oliphant oliphant at ee.byu.edu
Mon Feb 7 12:38:17 EST 2005


>
> It's interesting that SciPy and matplotlib really do have much the 
> same goal, but taken from different tacks: SciPy focused on the 
> computational functionality, and matplotlib on the plotting. However, 
> what's become really clear in this discussion is that a good 
> scientific math environment really needs both, and they need to be 
> well integrated.

The problem was that we needed someone like John to join us on the scipy 
effort to get the plotting working well.   Our team lacked someone with 
his skill.  All of us working on SciPy would have loved to work with 
him.  So, I have nothing but respect for John and his efforts.  I just 
wish we at SciPy were better recruiters :-)

> I'm optimistic that we can all work together to get that: One set of 
> packages that work together to do everything we all need. As someone 
> else on this thread mentioned, most of the individual functionality is 
> all there, what's mostly needed now is for it all to be packaged well 
> and work together well. At the core of that is the numarray/NumPy 
> unification: I hope Numeric will get us there.

Yes, I think the numarray / Numeric split has been one of the biggest 
problems. 

>
> One question, Travis: Why did you decide to bring numarray stuff into 
> the Numeric code base, rather than the other way around? What I've 
> gathered is that the only real show-stopper with numarray is array 
> creation speed. Is that really impossible to fix without a complete 
> re-structuring?

A great question.    People deserve to hear what I think even if they 
disagree with me, so here is a summary of the issues I'm concerned about 
with numarray.  The basic answer to the question is that I feel that 
numarray is too different structurally (i.e. the classes and objects 
that get defined) from Numeric, and some of these differences are causing 
the speed issues.  I felt it would be more work to adapt numarray to the 
Numeric structure than to adapt Numeric to the numarray features. 

Here are some specifics.

1)  Numarray did not seem to build on Numeric at all.  It has thrown out 
far too much.  As just one example, the ufunc object in Numeric is a 
pretty good start, but numarray decided to completely change the 
interface for reasons that I do not understand.   One result of this 
decision is that numarray still does not provide a C-API for the 
creation of ufuncs similar to the one Numeric provides.  

2) The basic numarray _ndarray C object is way too big.  numarray added too 
many things to the underlying C-structure of an arrayobject.  I think 
this will have small-array performance implications.

3) While prototyping in Python was a good idea, numarray should have 
moved the entire object to C and not left so many things to the Python 
level.  I don't think there should be a Python-level arrayobject as the 
basic class (except for RecordArrays).    I think this move must still 
be done to solve the speed issues, and I see this as much harder than 
fixing Numeric, which is already all in C.

4) The numarray C-API seems way too big.  There are so many seemingly 
repeated calls that appear to do the same thing. Are all of these API 
calls really necessary? 

5) Numarray builds fundamentally on Int16, Int32, and Float32 objects.  
I understand the need for this in many applications, but some users will 
still need a way to define arrays based on the c-type that is desired.  
In addition, as the mapping from bit-width to c-type is quite platform 
dependent, this needs to be done more carefully.
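
As a small illustration of point 5, here is a minimal sketch of why a 
mapping from bit-width to C type cannot be hard-coded.  It uses Python's 
standard ctypes module purely as an assumption for illustration (neither 
Numeric nor numarray is involved), and the names C_INT_TYPES and 
c_types_by_bit_width are hypothetical.  The grouping it prints can differ 
between platforms, for example in the width reported for C long.

```python
import ctypes

# Hypothetical table of C integer types to inspect (illustration only).
C_INT_TYPES = {
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "long long": ctypes.c_longlong,
}


def c_types_by_bit_width():
    """Group C integer type names by their width in bits on this platform."""
    widths = {}
    for name, ctype in C_INT_TYPES.items():
        bits = 8 * ctypes.sizeof(ctype)  # sizeof reports bytes
        widths.setdefault(bits, []).append(name)
    return widths


if __name__ == "__main__":
    # The output depends on the platform and compiler ABI.
    for bits, names in sorted(c_types_by_bit_width().items()):
        print(f"{bits}-bit: {', '.join(names)}")
```

A user who asks for "a C long array" and a user who asks for "an Int32 
array" may or may not get the same thing, depending on where the code runs, 
which is why both ways of naming a type are useful.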

I'm not looking to debate these issues, because I agree that other 
opinions may be valid, I could be wrong, and the debate will just 
distract me.   But, fundamentally, my decision was based on a gut feeling, 
influenced no doubt by my familiarity with and appreciation of the Numeric 
code base.   If I'm wrong, it will be apparent in a couple of months. 

-Travis










