[Numpy-discussion] RE: default axis for numarray

Konrad Hinsen hinsen at cnrs-orleans.fr
Tue Jun 11 06:17:04 EDT 2002


"eric jones" <eric at enthought.com> writes:

> The issue here is both consistency across a library and speed.

Consistency, fine. But not just within one package, also between
that package and the language it is implemented in.

Speed, no. If I need a sum along the first axis, I won't replace
it by a sum across the last axis just because that is faster.

> From the numpy.pdf, Numeric looks to have about 16 functions using
> axis=0 (or index=0 which should really be axis=0) and, counting FFT,
> about 10 functions using axis=-1.  To this day, I can't remember which

If you weight by frequency of usage, the first group gains a lot in
importance. I just scanned through some of my code; almost all of the
calls to Numeric routines are to functions whose default axis
is zero.

> code.  Unfortunately, many of the Numeric functions that should still
> don't take axis as a keyword, so you end up just inserting -1 in the

That is certainly something that should be fixed, and I suppose no one
objects to that.


My vote is for keeping axis defaults as they are, both because the
choices are reasonable (there was a long discussion about them in the
early days of NumPy, and the defaults were chosen based on other array
languages that had already been in use for years) and because any
change would cause most existing NumPy code to break in many places,
often giving wrong results instead of an error message.
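
The risk of silently wrong results can be illustrated with a small
sketch. It uses plain nested lists rather than Numeric itself, and the
helper sum_along is a hypothetical stand-in for a reduction with a
default axis, not an actual Numeric function.

```python
def sum_along(rows, axis=0):
    """Sum a rectangular list of lists along axis 0 or the last axis."""
    if axis == 0:
        # Collapse the first axis: one total per column.
        return [sum(col) for col in zip(*rows)]
    if axis in (1, -1):
        # Collapse the last axis: one total per row.
        return [sum(row) for row in rows]
    raise ValueError("axis must be 0, 1 or -1 for a 2-D input")


square = [[1, 2],
          [3, 4]]

old_default = sum_along(square)           # default axis=0
new_default = sum_along(square, axis=-1)  # what a changed default would compute

# Both results have the same length, so downstream code keeps running,
# but the numbers are different.
print(old_default, new_default)
```

For a square input the two results have the same shape, so nothing
downstream raises an error; only the values change.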

If a uniformization of the default is desired, I vote for axis=0,
for two reasons:
1) Consistency with Python usage.
2) Minimization of code breakage.
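
One way to read point 1, as an assumption about what "Python usage"
refers to: for nested sequences, indexing, len() and iteration all act
on the first dimension. A minimal sketch with plain lists:

```python
nested = [[1, 2, 3],
          [4, 5, 6]]  # 2 rows, 3 columns

first_item = nested[0]            # indexing selects along the first axis
length = len(nested)              # len() reports the size of the first axis
items = [row for row in nested]   # iteration walks the first axis

print(first_item, length, items)
```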


> We should also strive to make it as easy as possible to write generic
> functions that work for all array types (Int, Float,Float32,Complex,
> etc.) -- yet another debate to come.  

What needs to be improved in that area?

> Changes are going to create some backward incompatibilities and that is
> definitely a bummer.  But some changes are also necessary before the
> community gets big.  I know the community is already reasonable size,

I'd like to see evidence that changing the current NumPy behaviour
would increase the size of the community. It would first of all split
the current community, because many users (like myself) do not have
enough time to spare to go through their code line by line in order to
check for incompatibilities. That many others would switch to Python
if only some changes were made is merely a hypothesis.

> > Some feel that is contrary to expectations that the least rapidly
> > varying dimension should be operated on by default. There are
> > good arguments for both sides. For example, Konrad Hinsen has

Actually the argument is not for the least rapidly varying
dimension, but for the first dimension. The internal data layout
is not significant for most Python array operations. We might
for example offer a choice of C style and Fortran style data layout,
enabling users to choose according to speed, compatibility, or
just personal preference.
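
The separation between logical indexing and memory layout can be
sketched as follows. The functions flat_index and pack are hypothetical
helpers written for this illustration; they are not part of Numeric or
numarray.

```python
def flat_index(index, shape, order="C"):
    """Offset of a multi-dimensional index in a flat buffer.

    order="C": last index varies fastest (row-major).
    order="F": first index varies fastest (column-major, Fortran style).
    """
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same length")
    pairs = list(zip(index, shape))
    if order == "F":
        pairs.reverse()
    elif order != "C":
        raise ValueError("order must be 'C' or 'F'")
    offset = 0
    for i, n in pairs:
        offset = offset * n + i
    return offset


def pack(values, shape, order):
    """Store a 2-D nested list in a flat list using the given layout."""
    flat = [None] * (shape[0] * shape[1])
    for i in range(shape[0]):
        for j in range(shape[1]):
            flat[flat_index((i, j), shape, order)] = values[i][j]
    return flat


shape = (2, 3)
values = [[1, 2, 3],
          [4, 5, 6]]

c_buf = pack(values, shape, "C")
f_buf = pack(values, shape, "F")

# The buffers differ, but element (1, 0) is the same in both layouts.
elem_c = c_buf[flat_index((1, 0), shape, "C")]
elem_f = f_buf[flat_index((1, 0), shape, "F")]
print(c_buf, f_buf, elem_c, elem_f)
```

The first axis stays the first axis in either layout; only the order of
the elements in memory changes.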

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------



