Status of Numeric
Numarray is making great progress and is quite usable for many purposes. An idea that was championed by some is that the Numeric code base would stay static and be replaced entirely by Numarray.

However, Numeric is currently used in a large installed base. In particular, SciPy uses Numeric as its core array. While no doubt numarray arrays will be supported in the future, the speed of the less bulky Numeric arrays and the typical case that we encounter in SciPy of many small arrays will make it difficult for people to abandon Numeric entirely with its comparatively lightweight arrays.

In the development of SciPy we have encountered issues in Numeric that we feel need to be fixed. As this has become an important path to success of several projects (both commercial and open) it is absolutely necessary that these issues be addressed.

The purpose of this email is to assess the attitude of the community regarding how these changes to Numeric should be accomplished. These are the two options we can see:

* freeze old Numeric 23.x and make all changes to Numeric 24.x, still keeping Numeric separate from SciPy

* freeze old Numeric 23.x and subsume Numeric into SciPy, essentially creating a new SciPy arrayobject that is fast and lightweight. Anybody wanting this new array object would get it by installing scipy_base. Numeric would never change in the future but the array in scipy_base would.

It is not an option to wait for numarray to get fast enough as these issues need to be addressed now. Ultimately I think it will be a wise thing to have two implementations of arrays: one that is fast and lightweight, optimized for many relatively small arrays, and another that is optimized for large-scale arrays. Eventually, the use of these two underlying implementations should be automatic and invisible to the user.

A few of the particular changes we need to make to the Numeric arrayobject are:

1) change the coercion model to reflect Numarray's choice and eliminate the savespace crutch.

2) Add indexing capability to Numeric arrays (similar to Numarray's)

3) Improve the interaction between Numeric arrays and scalars.

4) Optimization:

Again, these changes are going to be made to some form of the Numeric arrays. What I am really interested in knowing is the attitude of the community towards keeping Numeric around. If most of the community wants to see Numeric go away then we will be forced to bring the Numeric array under the SciPy code-base and own it there.

Your feedback is welcome and appreciated.

Sincerely,

Travis Oliphant and other SciPy developers
Travis Oliphant writes:
Numarray is making great progress and is quite usable for many purposes. An idea that was championed by some is that the Numeric code base would stay static and be replaced entirely by Numarray.
However, Numeric is currently used in a large installed base. In particular SciPy uses Numeric as its core array. While no doubt numarray arrays will be supported in the future, the speed of the less bulky Numeric arrays and the typical case that we encounter in SciPy of many, small arrays will make it difficult for people to abandon Numeric entirely with its comparatively light-weight arrays.
I'd like to ask if the numarray option couldn't at least be considered. In particular with regard to speed, we'd like to know what the necessary threshold is. For many ufuncs, numarray is within a factor of 3 or so of Numeric for small arrays. Is this good enough or not? What would be good enough? It would probably be difficult to make it as fast in all cases, but how close does it have to be? A factor of 2? 1.5? We haven't gotten very much feedback on specific numbers in this regard.

Are there other aspects of numarray performance that are a problem? What specifically? We don't have the resources to optimize everything in case it might affect someone. We need to know that it is a particular problem for users to give it some priority (and know what the necessary threshold is for acceptable performance).

Perhaps the two (Numeric and numarray) may need to coexist for a while, but we would like to isolate the issues that make that necessary. That hasn't really happened yet. Travis, do you have any specific numarray speed issues that have arisen from your benchmarking or use that we can look at?

Perry Greenfield
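Editorial sketch, not part of the original message: one way to put a number on the "factor of 3" question is to time the same small-array operation under both packages and report the ratio. The helper names `per_call_seconds` and `slowdown` are hypothetical, and the two callables below are plain-Python stand-ins, since neither Numeric nor numarray is assumed to be installed.

```python
import timeit


def per_call_seconds(func, repeat=5, number=10_000):
    """Best-of-`repeat` wall time for one call of `func`, in seconds."""
    totals = timeit.repeat(func, repeat=repeat, number=number)
    return min(totals) / number


def slowdown(candidate, baseline, repeat=5, number=10_000):
    """Ratio of per-call times; 3.0 means `candidate` is three times slower."""
    return (per_call_seconds(candidate, repeat, number)
            / per_call_seconds(baseline, repeat, number))


# Stand-in workloads (assumption): ten-element lists play the role of small arrays.
a = [float(i) for i in range(10)]
b = [float(i) for i in range(10)]


def baseline_add():
    return [x + y for x, y in zip(a, b)]


def candidate_add():
    # Copies its inputs first, mimicking extra per-call setup cost.
    return [x + y for x, y in zip(list(a), list(b))]


if __name__ == "__main__":
    print(f"slowdown for 10 elements: {slowdown(candidate_add, baseline_add):.2f}x")
```

With the real packages, each callable would wrap the operation of interest (for example an elementwise add on two ten-element arrays), and the measurement would be repeated over several array sizes to see where the ratio approaches 1.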
On 19.01.2004, at 21:32, Travis Oliphant wrote:
These are the two options we can see: * freeze old Numeric 23.x and make all changes to Numeric 24.x still keeping Numeric separate from SciPy * freeze old Numeric 23.x and subsume Numeric into SciPy essentially creating a new SciPy arrayobject that is fast and lightweight. Anybody wanting this new array object would get it by installing scipy_base. Numeric would never change in the future but the array in scipy_base would.
That depends on the exact nature of the changes. My view is that any package that is upwards-compatible with Numeric (except for bug fixes of course) should be called Numeric and distributed as such. Any package that is intentionally incompatible with Numeric in some important aspect should not be called Numeric. There is a lot of code out there that builds on Numeric, and some of it is hardly maintained any more, although there are still users around. Those users expect to be able to upgrade Numeric without breaking their code. Konrad.
Konrad Hinsen wrote:
My view is that any package that is upwards-compatible with Numeric (except for bug fixes of course) should be called Numeric and distributed as such. Any package that is intentionally incompatible with Numeric in some important aspect should not be called Numeric.
I absolutely agree with this.

Travis Oliphant wrote:
1) change the coercion model to reflect Numarray's choice and eliminate the savespace crutch. 2) Add indexing capability to Numeric arrays (similar to Numarray's) 3) Improve the interaction between Numeric arrays and scalars.
These all look like backward-incompatible changes, so in that case, I vote for Sci-py-array, or whatever. However, it also looks like these are all moving toward the Numarray API. Is this the case? That would be great, as then Numarray could just be dropped in if/when it is deemed up to the task. It also leaves the door open for some sort of automagic selection of which array to use for a given instance.
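Editorial sketch, not part of the original message: the coercion change in item 1 can be pictured with a toy model. As I understand it, Numeric treats a Python scalar like an array of its default type (so a single-precision array times `2.0` becomes double precision unless savespace is set), whereas numarray lets the array's type win when the scalar is of the same or a lower kind. Everything below is hypothetical: types are modelled as `(kind, bits)` tuples, the function names are invented, and the result chosen for a higher-kind scalar is an assumption of the sketch rather than either package's documented behaviour.

```python
# Toy model only: nothing here imports Numeric or numarray.
KIND_RANK = {"int": 0, "float": 1, "complex": 2}

# Assumed default type a bare Python scalar takes when treated as an array.
SCALAR_DEFAULT = {int: ("int", 32), float: ("float", 64), complex: ("complex", 128)}


def _wider(a, b):
    """Return the type with the higher kind or, within one kind, more bits."""
    return max(a, b, key=lambda t: (KIND_RANK[t[0]], t[1]))


def upcast_everything(array_type, scalar):
    """Old-style rule: the scalar counts as a full array of its default type."""
    return _wider(array_type, SCALAR_DEFAULT[type(scalar)])


def scalar_keeps_array_type(array_type, scalar):
    """New-style rule: a same-or-lower-kind scalar never changes the array type."""
    scalar_type = SCALAR_DEFAULT[type(scalar)]
    if KIND_RANK[scalar_type[0]] <= KIND_RANK[array_type[0]]:
        return array_type
    # Assumption of this sketch: a higher-kind scalar brings its default type.
    return scalar_type


if __name__ == "__main__":
    float32 = ("float", 32)
    print(upcast_everything(float32, 2.0))        # scalar forces an upcast
    print(scalar_keeps_array_type(float32, 2.0))  # array type is preserved
```

Under the second rule there is no need for a per-array flag to suppress upcasting, which is presumably why the savespace mechanism could be dropped.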
4) Optimization:
Nothing wrong with that...as long as it's not premature!
Numarray is making great progress and is quite usable for many purposes. An idea that was championed by some is that the Numeric code base would stay static and be replaced entirely by Numarray.
However, Numeric is currently used in a large installed base. In particular SciPy uses Numeric as its core array. While no doubt numarray arrays will be supported in the future, the speed of the less bulky Numeric arrays and the typical case that we encounter in SciPy of many, small arrays will make it difficult for people to abandon Numeric entirely with its comparatively light-weight arrays.
It was said that making Numarray more efficient with small arrays was a goal of the project...is it still? I'm still unclear on why Numarrays are so much more "heavy"..is it just that no one has taken the time to optimize them, or is there really something inherent (and important) in the design?
As this has become an important path to success of several projects (both commercial and open) it is absolutely necessary that these issues be addressed.
From the small list above, it looks like what you need is an array that is like a Numarray, but faster for small arrays... Has anyone done an analysis of whether it would be harder to optimize Numarray than to make the above changes to Numeric, and continue to maintain two packages? You probably have, but I thought I'd ask anyway...
Ultimately I think it will be a wise thing to have two implementations of arrays: one that is fast and lightweight optimized for many relatively small arrays, and another that is optimized for large-scale arrays.
Are these really incompatible goals?
If most of the community wants to see Numeric go away then we will be forced to bring the Numeric array under the SciPy code-base and own it there.
I think it's quite the opposite... if most of the community wants to see Numeric continue on, it must be maintained (and improved) with little change to the API. If we're all going to switch to Numarray, then the SciPy project can do whatever it wants with Numeric...

In Summary:

- Anything called "Numeric" should have an API compatible with the current version

- I'd much rather have just one N-d array type, preferably one that is part of the Python Standard Library... is that likely to ever happen?

- I also want fast small arrays.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov
On Tuesday 20 January 2004 20:11, Chris Barker wrote:
As this has become an important path to success of several projects (both commercial and open) it is absolutely necessary that these issues be addressed.
From the small list above, it looks like what you need is an array that is like a Numarray, but faster for small arrays... Has anyone done an analysis of whether it would be harder to optimize Numarray than to make the above changes to Numeric, and continue to maintain two packages? You probably have, but I thought I'd ask anyway...
I agree. An analysis should be done to see whether it is better to concentrate on making numarray better for small arrays or to maintain several array implementations. The problem arises if numarray cannot be enhanced enough because of design problems, although I would bet that something can be done to get it close to Numeric performance. And I guess quite a few people on this list would be happy to collaborate in some way or another to achieve this goal. However, as Perry says, in order to do this analysis, the amount of speed-up needed should be estimated first.

I personally feel that it would be worth the effort to try to optimize the small-array case in numarray instead of having to fight against a jungle of Numeric/numarray/python array implementations. I strongly believe that numarray has enough advantages over Numeric to justify the effort of overcoming its present limitations rather than maintaining several packages.

Just my 2 cents,

--
Francesc Alted
Travis Oliphant wrote:
Numarray is making great progress and is quite usable for many purposes. An idea that was championed by some is that the Numeric code base would stay static and be replaced entirely by Numarray.
It was my impression that this idea had been generally accepted. It was not just one of the proposals under discussion.

I wonder how many others out there had assumed that, in spite of current speed problems, numarray was the way for the future, and had based their development endeavours on numarray. I did.

To this relative outsider, there seem to have been three groups involved in efforts to provide Python with numerical array capabilities: those connected with Numeric, SciPy and numarray. SciPy would appear to be the most recent addition to the list.

Is there any way that some agreement between these groups can be achieved to restore the hope for a common development path? This message from Travis Oliphant seems to envisage two paths. Is this the better way to go?
However, Numeric is currently used in a large installed base. In particular SciPy uses Numeric as its core array. While no doubt numarray arrays will be supported in the future, the speed of the less bulky Numeric arrays and the typical case that we encounter in SciPy of many, small arrays will make it difficult for people to abandon Numeric entirely with its comparatively light-weight arrays.
In the development of SciPy we have encountered issues in Numeric that we feel need to be fixed. As this has become an important path to success of several projects (both commercial and open) it is absolutely necessary that these issues be addressed.
The purpose of this email is to assess the attitude of the community regarding how these changes to Numeric should be accomplished. These are the two options we can see: * freeze old Numeric 23.x and make all changes to Numeric 24.x still keeping Numeric separate from SciPy * freeze old Numeric 23.x and subsume Numeric into SciPy essentially creating a new SciPy arrayobject that is fast and lightweight. Anybody wanting this new array object would get it by installing scipy_base. Numeric would never change in the future but the array in scipy_base would.
It is not an option to wait for numarray to get fast enough as these issues need to be addressed now. Ultimately I think it will be a wise thing to have two implementations of arrays: one that is fast and lightweight optimized for many relatively small arrays, and another that is optimized for large-scale arrays. Eventually, the use of these two underlying implementations should be automatic and invisible to the user.
Is this "automatic and invisible" practicable, except for trivial examples?
A few of the particular changes we need to make to the Numeric arrayobject are:
1) change the coercion model to reflect Numarray's choice and eliminate the savespace crutch. 2) Add indexing capability to Numeric arrays (similar to Numarray's) 3) Improve the interaction between Numeric arrays and scalars. 4) Optimization:
Again, these changes are going to be made to some form of the Numeric arrays. What I am really interested in knowing is the attitude of the community towards keeping Numeric around. If most of the community wants to see Numeric go away then we will be forced to bring the Numeric array under the SciPy code-base and own it there.
Your feedback is welcome and appreciated. Sincerely,
Travis Oliphant and other SciPy developers
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion
I hope that some cooperative approach can be devised. Colin W.
On Tuesday, January 20, 2004, at 05:18 PM, Colin J. Williams wrote:
Travis Oliphant wrote:
Numarray is making great progress and is quite usable for many purposes. An idea that was championed by some is that the Numeric code base would stay static and be replaced entirely by Numarray.
It was my impression that this idea had been generally accepted. It was not just one of the proposals under discussion.
I don't think there was ever any formal vote. I think Paul Dubois had accepted the idea; others had a more "wait and see" attitude. Realistically, I think one can safely say that, as one might expect, those that already were using Numeric probably were happy with its capabilities and that, given normal motivations, there would be significant inertia on the part of well established users (those with a lot of code already) to switch over.

But since it wasn't quite as usable for our needs, we decided that we needed a new version. We had to develop it to support our needs and would have done it regardless. We hoped that it would be suitable for all uses, and we've tried to involve all in the process as much as possible. As you might expect, we've devoted most of our attention to meeting our needs, but we have also expended significant energy trying to meet the needs of the more general community (and we will continue to try to do so within our resources).

I don't know if it is reasonable to expect that a certain outcome has been blessed by all, nor did most of the existing Numeric users ask us to do this. But many did recognize (as Paul Dubois alluded to) that there was a need to recode the array stuff. Maybe someone could have done a better job of it, but no one else has yet (it is a fair amount of work after all). We do intend to support all the important packages that Numeric does; it may take some time to get there.

I suppose our goal is to eventually attract all new users. We can't, nor should we, expect that existing Numeric users will switch at our desire or whim.
I wonder how many others out there had assumed that, in spite of current speed problems, numarray was the way for the future, and had based their development endeavours on numarray. I did.
To this relative outsider, there seem to have been three groups involved in efforts to provide Python with numerical array capabilities, those connected with Numeric, SciPy and numarray. SciPy would appear to be the most recent addition to the list.
Actually, I think it would be more accurate to say that SciPy is an attempt to collect a large base of numeric code and integrate it into an array package (currently Numeric) rather than to develop a new array package. It was started before we started numarray and thus was centered around Numeric. They have found occasions to modify and extend Numeric behavior. In that sense, it has long been somewhat incompatible with Numeric. (Travis can correct me if I got that wrong.)
Is there any way that some agreement between these groups can be achieved to restore the hope for a common development path?
I would certainly like to, and in any case, we want to adapt scipy to be compatible with numarray. Perry Greenfield
On Mon, 19 Jan 2004, Travis Oliphant wrote:
... Ultimately I think it will be a wise thing to have two implementations of arrays: one that is fast and lightweight optimized for many relatively small arrays, and another that is optimized for large-scale arrays.
I am *extremely* interested in the use case of the small arrays in SciPy. Which algorithms and modules are dominated by the small array speed? -a
participants (7)
- Andrew P. Lentvorski, Jr.
- Chris Barker
- Colin J. Williams
- Francesc Alted
- Konrad Hinsen
- Perry Greenfield
- Travis Oliphant