"eric jones"
The issue here is both consistency across a library and speed.
Consistency, fine. But not just within one package, also between that package and the language it is implemented in. Speed, no. If I need a sum along the first axis, I won't replace it by a sum across the last axis just because that is faster.
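To make that concrete, here is a minimal sketch. It assumes present-day NumPy (imported as `np`) as a stand-in for Numeric, and the array `a` is made up for illustration. The two sums are simply different quantities, so the faster one cannot stand in for the other:

```python
import numpy as np

# A small 2x3 example array: [[0, 1, 2], [3, 4, 5]]
a = np.arange(6).reshape(2, 3)

first = np.sum(a, axis=0)   # sum along the first axis: one value per column
last = np.sum(a, axis=-1)   # sum along the last axis: one value per row

print(first)  # [3 5 7]
print(last)   # [ 3 12]
```

Whichever of the two is faster for a given memory layout, only one of them is the result the caller asked for.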
From the numpy.pdf, Numeric looks to have about 16 functions using axis=0 (or index=0 which should really be axis=0) and, counting FFT, about 10 functions using axis=-1. To this day, I can't remember which
If you weight by frequency of usage, the first group gains a lot in importance. I just scanned through some of my code; almost all of the calls to Numeric routines are to functions whose default axis is zero.
code. Unfortunately, many of the Numeric functions that should still don't take axis as a keyword, so you end up just inserting -1 in the
That is certainly something that should be fixed, and I suppose no one objects to that. My vote is for keeping axis defaults as they are, both because the choices are reasonable (there was a long discussion about them in the early days of NumPy, and the defaults were chosen based on other array languages that had already been in use for years) and because any change would cause most existing NumPy code to break in many places, often giving wrong results instead of an error message. If a uniformization of the default is desired, I vote for axis=0, for two reasons: 1) Consistency with Python usage. 2) Minimization of code breakage.
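The "wrong results instead of an error message" case can be sketched as follows. This assumes present-day NumPy rather than Numeric, and `total_old` and `total_new` are hypothetical functions invented for the illustration, standing for one library routine before and after a change of default:

```python
import numpy as np

def total_old(a, axis=0):
    # Hypothetical routine with the existing default (first axis).
    return np.sum(a, axis=axis)

def total_new(a, axis=-1):
    # The same hypothetical routine after the default is changed to the last axis.
    return np.sum(a, axis=axis)

m = np.array([[1, 2],
              [3, 4]])

old = total_old(m)  # column sums: [4, 6]
new = total_new(m)  # row sums:    [3, 7]

# Same shape, different numbers: code relying on the default keeps running
# and nothing raises an error.
assert old.shape == new.shape

# Only calls that name the axis explicitly are unaffected by the change.
assert (total_old(m, axis=0) == total_new(m, axis=0)).all()
```

For a square array the shapes agree, so code written against the old default would only show the problem in its numbers.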
We should also strive to make it as easy as possible to write generic functions that work for all array types (Int, Float, Float32, Complex, etc.) -- yet another debate to come.
What needs to be improved in that area?
Changes are going to create some backward incompatibilities and that is definitely a bummer. But some changes are also necessary before the community gets big. I know the community is already a reasonable size,
I'd like to see evidence that changing the current NumPy behaviour would increase the size of the community. It would first of all split the current community, because many users (like myself) do not have enough time to spare to go through their code line by line in order to check for incompatibilities. That many others would switch to Python if only some changes were made is merely a hypothesis.
Some feel that is contrary to expectations that the least rapidly varying dimension should be operated on by default. There are good arguments for both sides. For example, Konrad Hinsen has
Actually the argument is not for the least rapidly varying dimension, but for the first dimension. The internal data layout is not significant for most Python array operations. We might for example offer a choice of C style and Fortran style data layout, enabling users to choose according to speed, compatibility, or just personal preference.

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------