
Andrew Straw wrote:
Considering that many of the statistical functions (mean, std, median) must iterate over all the data and that people (or at least myself) typically call them sequentially on the same data, it may make sense to make a super-function with less repetition.
http://currents.soest.hawaii.edu/hg/hgwebdir.cgi/pycurrents/file/df129ff36f6...
I have something like that, in the link above (if the mailer does not break the line). I think it is quite flexible and efficient; it calculates only as much as necessary, so, for example, it only calculates the median if you ask for it. In the file that the link points to, you can import numpy.ma as MA to remove its one external dependency.
Eric
Instead of:
    x_mean = np.mean(x)
    x_median = np.median(x)
    x_std = np.std(x)
    x_min = np.min(x)
    x_max = np.max(x)
We do:
    x_stats = np.get_descriptive_stats(x, stats=['mean', 'median', 'std', 'min', 'max'], axis=-1)
And x_stats is a dictionary with 'mean', 'median', 'std', 'min', and 'max' keys.
The implementation could reduce the number of iterations over the data in this case. The implementation wouldn't have to be optimized initially, but could be gradually sped up once the interface is in place. I bring this up now to suggest such an idea as a more-general alternative to the "medianwithaxis" function proposed. What do you think? (Perhaps something like this already exists?) And, finally, this all surely belongs in scipy, but we already have stuff in numpy that can't be removed without seriously breaking backwards compatibility...
-Andrew
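As a rough illustration of the proposal above, here is a minimal sketch of such a super-function. The name descriptive_stats is hypothetical (numpy has no get_descriptive_stats), the axis argument is assumed to be an integer, and the only savings shown are skipping statistics that were not requested and reusing the mean when computing the standard deviation; it is not a single-pass implementation.

```python
import numpy as np

# Statistics this sketch knows how to compute (an assumption for illustration).
_KNOWN = ("mean", "median", "std", "min", "max")


def descriptive_stats(x, stats=_KNOWN, axis=-1):
    """Hypothetical helper: return a dict holding only the requested statistics."""
    x = np.asarray(x, dtype=float)
    unknown = set(stats) - set(_KNOWN)
    if unknown:
        raise ValueError("unknown statistics: %s" % sorted(unknown))

    out = {}
    if "mean" in stats or "std" in stats:
        # Compute the mean once and share it between 'mean' and 'std'.
        mean_keep = x.mean(axis=axis, keepdims=True)
        if "mean" in stats:
            out["mean"] = np.squeeze(mean_keep, axis=axis)
        if "std" in stats:
            out["std"] = np.sqrt(((x - mean_keep) ** 2).mean(axis=axis))
    if "median" in stats:
        # The median needs a sort or partition, so compute it only on request.
        out["median"] = np.median(x, axis=axis)
    if "min" in stats:
        out["min"] = x.min(axis=axis)
    if "max" in stats:
        out["max"] = x.max(axis=axis)
    return out


x = np.array([[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]])
x_stats = descriptive_stats(x, stats=["mean", "std", "max"])
```

Here x_stats holds one entry per requested name, each reduced along the last axis; the median is never computed because it was not asked for.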
Matthew Brett wrote:
Hi,
median moved to mediandim0; implementation of medianwithaxis or similar, with the same call signature as mean.
Deprecation warning for use of median, which returns the mediandim0 result for now. Eventual move of median to return medianwithaxis.
This would confuse people even more, I'm afraid. First they're told that median() is deprecated, and then later on it becomes the standard function to use. I would actually prefer a short pain rather than a long one.
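For readers following the renaming plan under discussion, a minimal sketch of how it might look in code. The names mediandim0 and medianwithaxis come from the thread, but their bodies here are assumptions: mediandim0 is taken to reduce along the first axis, medianwithaxis is taken to mirror the call signature of mean, and the warning text is a shortened placeholder.

```python
import warnings

import numpy as np


def mediandim0(a):
    """Assumed legacy behaviour: median along the first axis."""
    return np.median(np.asarray(a), axis=0)


def medianwithaxis(a, axis=None):
    """Assumed new behaviour: same axis convention as np.mean."""
    return np.median(np.asarray(a), axis=axis)


def median(a):
    """Transitional wrapper: warn, then keep returning the legacy result."""
    warnings.warn(
        "median will change to match mean in a future version; "
        "call medianwithaxis explicitly to be forward compatible",
        DeprecationWarning,
        stacklevel=2,
    )
    return mediandim0(a)


data = [[1, 2], [3, 4]]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_result = median(data)      # warns, reduces along axis 0
new_result = medianwithaxis(data)     # reduces over the whole array
```

The objection in the reply applies directly to this sketch: callers are first warned away from median and would later be told that median is the name to use again.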
I was thinking the warning could be something like:
"The current and previous versions of numpy use a version of median that is not consistent with other summary functions such as mean. The calling convention of median will change in a future version of numpy to match that of the other summary functions. This compatible future version is implemented as medianwithaxis, and will become the default implementation of median. Please change any code using median to call medianwithaxis specifically, to maintain compatibility with future numpy APIs."
I would certainly like median to take the axis keyword. The axis keyword (and its friends) could be added to 1.0.5 with the default being 1 instead of None, so that it keeps compatibility with the 1.0 API. Then, with 1.1 (an API-breaking release) the default can be changed to None to restore consistency with mean, etc.
But that would be very surprising to a new user, and might lead to some hard to track down silent bugs at a later date.
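To make the concern concrete, a small sketch of what a transitional default would look like next to mean. The function median_transitional is hypothetical and simply wraps np.median with the default axis=1 suggested above; it is not how any numpy release was implemented.

```python
import numpy as np


def median_transitional(a, axis=1):
    """Hypothetical transitional median whose default axis differs from mean's."""
    return np.median(np.asarray(a), axis=axis)


x = np.array([[1.0, 2.0, 9.0], [4.0, 5.0, 6.0]])

mean_default = np.mean(x)                           # axis=None: one scalar
median_default = median_transitional(x)             # axis=1: one value per row
median_future = median_transitional(x, axis=None)   # behaviour after the default flips
```

Code that relies on the default would keep running after the default changes from 1 to None, but median_default would silently turn from a per-row array into a single scalar, which is the kind of hard-to-trace bug the reply warns about.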
Matthew
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion