scipy choice of defaults for matrix manipulation
The SciPy documentation says:

    To begin with, all of the Numeric functions have been subsumed into the scipy namespace so that all of those functions are available without additionally importing Numeric.

It was therefore unsettling to find that SciPy's function_base defines

    def cumsum(m,axis=-1):
        """Returns the cumulative sum of the elements along the given axis
        """
        if axis is None:
            m = ravel(m)
            axis = 0
        else:
            m = _asarray1d(m)
        return add.accumulate(m,axis)

This changes the default axis of Numeric and numarray. Bug or feature?? Example:
    x=[[1,2],[3,4]]
    from scipy import *
    print cumsum(x)
    [[1,3]
     [3,7]]
    print Numeric.cumsum(x)
    [[1,2]
     [4,6]]
I believe this should be considered a serious bug, either in implementation or documentation. Since SciPy is likely to be attracting Numeric and numarray users, I believe it should be considered an implementation bug. Naturally cumprod has the same problem. (I did not try to review the whole list, so there may be others.)

There is even a certain schizophrenia: sum and prod choose different axes! It is clear that this has been thought about at some point, since SciPy's sum definition includes a note suggesting that the axis might change in the future (to *conflict* with the choice in Numeric and numarray!!). This is likely to generate great user confusion. It cost me some time.

Thank you,
Alan Isaac
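For readers without Numeric installed, the two results in the example above can be reproduced with a minimal pure-Python sketch. The helper `cumsum_2d` is hypothetical (it is not part of SciPy or Numeric) and handles only 2-D nested lists; it is meant only to show how the choice of axis changes the output.

```python
from itertools import accumulate


def cumsum_2d(rows, axis=-1):
    """Cumulative sum of a 2-D nested list along the given axis.

    Hypothetical helper for illustration; supports axis 0, 1, or -1 only.
    """
    if axis in (1, -1):
        # Last axis: accumulate within each row.
        return [list(accumulate(row)) for row in rows]
    if axis == 0:
        # First axis: accumulate down each column, then transpose back.
        columns = [list(accumulate(col)) for col in zip(*rows)]
        return [list(row) for row in zip(*columns)]
    raise ValueError("axis must be 0, 1, or -1 for a 2-D input")


x = [[1, 2], [3, 4]]
print(cumsum_2d(x, axis=-1))  # [[1, 3], [3, 7]]  (matches the SciPy result reported above)
print(cumsum_2d(x, axis=0))   # [[1, 2], [4, 6]]  (matches the Numeric result reported above)
```

Passing the axis explicitly, as in both calls here, gives the same answer whichever default a library happens to choose.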
Alan G Isaac wrote:
The SciPy documentation says: To begin with, all of the Numeric functions have been subsumed into the scipy namespace so that all of those functions are available without additionally importing Numeric.
It was therefore unsettling to find that SciPy's function_base defines

    def cumsum(m,axis=-1):
        """Returns the cumulative sum of the elements along the given axis
        """
        if axis is None:
            m = ravel(m)
            axis = 0
        else:
            m = _asarray1d(m)
        return add.accumulate(m,axis)
This changes the default axis of Numeric and numarray. Bug or feature??
Implementation feature; documentation bug. Numeric's functions have various, inconsistent ("schizophrenic" as you say) choices for the default axis. An attempt, albeit incomplete as you note, was made with SciPy to standardize on always using axis=-1. I'm fairly positive this was documented somewhere at some point, but probably only on the old website. The tutorial really should be updated to prominently note this convention.

--
Robert Kern
rkern@ucsd.edu

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter
Alan wrote:
The SciPy documentation says: To begin with, all of the Numeric functions have been subsumed into the scipy namespace so that all of those functions are available without additionally importing Numeric.
It was therefore unsettling to find that SciPy's function_base defines

    def cumsum(m,axis=-1):
        """Returns the cumulative sum of the elements along the given axis
        """
        if axis is None:
            m = ravel(m)
            axis = 0
        else:
            m = _asarray1d(m)
        return add.accumulate(m,axis)
This changes the default axis of Numeric and numarray. Bug or feature??
On Wed, 04 Aug 2004, Robert Kern apparently wrote:
Implementation feature; documentation bug. Numeric's functions have various, inconsistent ("schizophrenic" as you say) choices for the default axis. An attempt, albeit incomplete as you note, was made with SciPy to standardize on always using axis=-1. I'm fairly positive this was documented somewhere at some point, but probably only on the old website. The tutorial really should be updated to prominently note this convention.
Looking briefly at Numeric and numarray, it seems they have made an effort to standardize on axis=0. This is also true in the Matlab module. Wouldn't it be wise for SciPy to join the parade? (I.e., shouldn't these closely related and interdependent packages share a convention?) Otherwise, this is going to be VERY confusing for users (as it was for me). Now when I call such a function, I have to remember: is it a SciPy function, or a Numeric/numarray function? Not a good situation, right?

Thanks,
Alan
Alan G Isaac wrote: [snip]
Looking briefly at Numeric and numarray, it seems they have made an effort to standardize on axis=0.
No, they don't. It's a fairly even mix of axis=0 and axis=-1.
This is also true in the Matlab module. Wouldn't it be wise for SciPy to join the parade?
Not at this time. There's too much SciPy code that would need to be rewritten, namely SciPy itself. The axis=-1 convention will be staying.
(I.e., shouldn't these closely related and interdependent packages share a convention?) Otherwise, this is going to be VERY confusing for users (as it was for me). Now when I call such a function, I have to remember: is it a SciPy function, or a Numeric/numarray function?
Well, if it's SciPy, axis=-1 almost always (the deviations usually being the functions which shadow Numeric functions). If it's Numeric, you also have to remember if it's axis=0 or axis=-1. If it's numarray, then you have other problems since SciPy is currently all-Numeric. The real question you have to ask yourself when coding is "did SciPy overwrite this Numeric function?", a question that needs to be asked for reasons other than the choice of default axis. I would also note that having a terminal running IPython is invaluable when coding. The ? and ?? magic is usually better than a reference manual.
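The "did one package overwrite this function?" question can also be answered programmatically. The following is a minimal sketch using only the standard library; the namespaces `base` and `derived` and the functions in them are hypothetical stand-ins, not the real Numeric or SciPy modules. It checks object identity to detect a replaced function and prints the signature, which is roughly the information IPython's ? shows.

```python
import inspect
from types import SimpleNamespace


def base_cumsum(m, axis=0):
    """Hypothetical stand-in for a base-library function."""
    raise NotImplementedError


def base_take(m, indices, axis=0):
    """Hypothetical stand-in for a function that is re-exported unchanged."""
    raise NotImplementedError


def override_cumsum(m, axis=-1):
    """Hypothetical stand-in for a replacement with a different default."""
    raise NotImplementedError


base = SimpleNamespace(cumsum=base_cumsum, take=base_take)
# The derived namespace re-exports `take` unchanged but replaces `cumsum`.
derived = SimpleNamespace(cumsum=override_cumsum, take=base.take)


def is_overridden(name):
    """True if `derived` exposes a different object than `base` under `name`."""
    return getattr(derived, name) is not getattr(base, name)


print(is_overridden("cumsum"))            # True
print(is_overridden("take"))              # False
print(inspect.signature(derived.cumsum))  # (m, axis=-1)
```

With real packages, the same identity comparison would be applied to the two imported modules instead of the stand-in namespaces.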
Not a good situation, right?
It's arguable that SciPy functions which shadow Numeric functions should have the same defaults. It is also arguable that SciPy functions should be internally consistent, so when a Numeric function is overwritten (to add genericity or whatever) it should follow the SciPy convention. Neither choice has been rigorously implemented, although after I poked around in IPython for a minute or two, cumsum() appears to be the only deviation from the first option. This might qualify as a bug.
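Poking around by hand for deviations could be automated along these lines. This is only a sketch: the two namespaces are hypothetical, and the defaults assigned to them are illustrative rather than a record of what Numeric or SciPy actually shipped. The function `axis_default_mismatches` is likewise an invented name.

```python
import inspect
from types import SimpleNamespace


def _make(default):
    """Build a hypothetical function whose `axis` default is `default`."""
    def func(m, axis=default):
        raise NotImplementedError
    return func


# Hypothetical namespaces with illustrative defaults.
base = SimpleNamespace(cumsum=_make(0), cumprod=_make(0), sort=_make(-1))
derived = SimpleNamespace(cumsum=_make(-1), cumprod=_make(-1), sort=_make(-1))


def axis_default_mismatches(a, b):
    """Map each shared name to (default in a, default in b) where they differ."""
    mismatches = {}
    for name in sorted(set(vars(a)) & set(vars(b))):
        params_a = inspect.signature(getattr(a, name)).parameters
        params_b = inspect.signature(getattr(b, name)).parameters
        # Only compare functions that both accept an `axis` parameter.
        if "axis" in params_a and "axis" in params_b:
            default_a = params_a["axis"].default
            default_b = params_b["axis"].default
            if default_a != default_b:
                mismatches[name] = (default_a, default_b)
    return mismatches


print(axis_default_mismatches(base, derived))
# {'cumprod': (0, -1), 'cumsum': (0, -1)}
```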
Thanks, Alan
--
Robert Kern
rkern@ucsd.edu

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter
Participants (2): Alan G Isaac, Robert Kern