
Collin J. Williams Wrote:
I feel lower on the understanding tree with respect to what is being proposed in the draft PEP, but would still like to offer my 2 cents worth. I get the feeling that numarray is being bent out of shape to fit Numeric.
Todd and Gerard address this point well.
It was my understanding that Numeric had certain weaknesses which made it unacceptable as a Python component and that numarray was intended to provide the same or better functionality within a pythonic framework.
Let me reiterate what our motivations were. We wanted to use an array package for our software, and Numeric had enough shortcomings that we needed some changes in behavior (e.g., type coercion for scalars), changes in performance (particularly with regard to memory usage), and enhancements in capabilities (e.g., memory mapping, record arrays, etc.). It was the opinion of some (Paul Dubois, for example) that a rewrite was in order in any case since the code was not that maintainable (not everyone felt this way, though at the time that wasn't as clear). At the same time there was some hope that Numeric could be accepted into the standard Python distribution. That's something we thought would be good (but wasn't the highest priority for us) and I've come to believe that perhaps a better solution with regard to that is what this PEP is trying to address. In any case Guido made it clear that he would not accept Numeric in its (then) current form. That it be written mostly in Python was something suggested by Guido, and we started off that way, mainly because it would get us going much faster than writing it all in C. We definitely understood that it would also have the consequence of making small array performance worse. We said as much when we started; it wasn't as clear as it is now that many users objected to performance slower by a factor of a few (as it turned out, a mostly Python-based implementation was more than an order of magnitude slower for small arrays).
numarray has not achieved the expected performance level to date, but progress is being made and I believe that, for larger arrays, numarray has been shown to be superior to Numeric - please correct me if I'm wrong here.
We never expected numarray to reach the performance level for small arrays that Numeric has. If it were within a factor of two I would be thrilled (it's more like a factor of 3 or 4 currently for simple ufuncs). I still don't think it ever will be as fast for small arrays. The focus all along was on handling large arrays, which I think it does quite well, both with regard to memory and speed. Yes, there are some functions and operations that may be much slower. Mainly they need to be called out so they can be improved. Generally we only notice performance issues that affect our software. Others need to point out remaining large discrepancies. I'm still of the opinion that if small array performance is really important, a very different approach should be used, with a completely different implementation. I would think that improvements of an order of magnitude over what Numeric does now are possible. But since that isn't important to us (STScI), don't expect us to work on that :-)
The shock came for me when Todd Miller said:
I looked at this some, and while INCREFing __dict__ may be the right idea, I forgot that there *is no* Python NumArray.__init__ anymore.
Wasn't it the intent of numarray to work towards the full use of the Python class structure to provide the benefits which it offers?
The Python class has two constructors and one destructor.
The constructors are __init__ and __new__; the latter only provides the shell of an instance, which later has to be initialized. In version 0.9, which I use, there is no __new__, but there is a new function which has a functionality similar to that intended for __new__. Thus, with this change, numarray appears to be moving further away from being pythonic.
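For readers less familiar with the __new__/__init__ split described above, here is a minimal sketch in plain Python. The class Shell and its attributes events and shape are hypothetical names made up for illustration; this is not numarray code and says nothing about how NumArray is actually constructed.

```python
class Shell:
    """Toy class (hypothetical, not numarray code) that traces construction."""

    def __new__(cls, *args, **kwargs):
        # __new__ creates and returns the bare instance. The only attribute
        # set here is a trace list, so the call order can be inspected later.
        instance = super().__new__(cls)
        instance.events = ["__new__"]
        return instance

    def __init__(self, shape):
        # __init__ receives the instance returned by __new__ and fills it in.
        self.events.append("__init__")
        self.shape = shape


# Normal construction runs __new__ first, then __init__.
a = Shell((2, 3))
print(a.events)
print(a.shape)

# Calling __new__ directly gives only the uninitialized shell.
bare = Shell.__new__(Shell)
print(hasattr(bare, "shape"))
```

The point of the sketch is only that an instance can exist before __init__ has run, which is why an implementation that drops the Python-level __init__ has to do its initialization somewhere else.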
I'll agree that optimization is driving the underlying implementation to one that is more complex and that is the drawback (no surprise there). There's Pythonic in use and Pythonic in implementation. We are certainly receptive to better ideas for the implementation, but I doubt that a heavily Python-based implementation is ever going to be competitive for small arrays (unless something like psyco becomes universal, but I think there are a whole mess of problems to be solved for that kind of approach to work well generically).

Perry