
From the np.median doc string: "If the input contains integers, or floats of smaller precision than 64, then the output data-type is float64."
>>> arr = np.array([[0,1,2,3,4,5]], dtype='float32')
>>> np.median(arr, axis=0).dtype
dtype('float32')
>>> np.median(arr, axis=1).dtype
dtype('float32')
>>> np.median(arr, axis=None).dtype
dtype('float64')
So the output doesn't agree with the doc string.
What is the desired dtype of the accumulator and the output for when the input dtype is less than float64? Should it depend on axis?
I'm trying to duplicate the behavior of np.median (and other numpy/scipy functions) in the Bottleneck package and am running into a few corner cases while unit testing.
Here's another one:
>>> np.sum([np.nan]).dtype
dtype('float64')
>>> np.nansum([1,np.nan]).dtype
dtype('float64')
>>> np.nansum([np.nan]).dtype
<snip>
AttributeError: 'float' object has no attribute 'dtype'
I just duplicated the numpy behavior for that one since it was easy to do.
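
The corner cases above can be checked mechanically. The following is a minimal sketch, not part of the original post: `result_dtype` and `dtypes_by_axis` are hypothetical helper names, and the dtypes printed depend on the installed NumPy version, so no expected output is shown.

```python
import numpy as np

def result_dtype(value):
    """Return the dtype of a result, even if it is a plain Python scalar."""
    # np.asarray wraps Python floats, so this avoids the AttributeError
    # that occurs when a function returns a float instead of a NumPy scalar.
    return np.asarray(value).dtype

def dtypes_by_axis(func, arr):
    """Map axis=None and every valid axis to the dtype that func returns."""
    axes = [None] + list(range(arr.ndim))
    return {axis: result_dtype(func(arr, axis=axis)) for axis in axes}

arr = np.array([[0, 1, 2, 3, 4, 5]], dtype='float32')
for func in (np.median, np.mean, np.std, np.var):
    # A consistent function reports the same dtype for every axis.
    print(func.__name__, dtypes_by_axis(func, arr))
```

If a function reports more than one distinct dtype in its table, its output dtype depends on the axis argument, which is the inconsistency described in the message above.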

On 12/13/2010 11:59 AM, Keith Goodman wrote:
> From the np.median doc string: "If the input contains integers, or floats of smaller precision than 64, then the output data-type is float64."
> >>> arr = np.array([[0,1,2,3,4,5]], dtype='float32')
> >>> np.median(arr, axis=0).dtype
> dtype('float32')
> >>> np.median(arr, axis=1).dtype
> dtype('float32')
> >>> np.median(arr, axis=None).dtype
> dtype('float64')
> So the output doesn't agree with the doc string.
> What is the desired dtype of the accumulator and the output for when the input dtype is less than float64? Should it depend on axis?
> I'm trying to duplicate the behavior of np.median (and other numpy/scipy functions) in the Bottleneck package and am running into a few corner cases while unit testing.
> Here's another one:
> >>> np.sum([np.nan]).dtype
> dtype('float64')
> >>> np.nansum([1,np.nan]).dtype
> dtype('float64')
> >>> np.nansum([np.nan]).dtype
> <snip>
> AttributeError: 'float' object has no attribute 'dtype'
> I just duplicated the numpy behavior for that one since it was easy to do.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
Unless something has changed since the docstring was written, this is probably an inherited 'bug' from np.mean() as the author expected that the docstring of mean was correct. For my 'old' 2.0 dev version:
>>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype
dtype('float32')
>>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32')).dtype
dtype('float64')
Bruce

On Mon, Dec 13, 2010 at 12:20 PM, Bruce Southey <bsouthey@gmail.com> wrote:
> Unless something has changed since the docstring was written, this is probably an inherited 'bug' from np.mean() as the author expected that the docstring of mean was correct. For my 'old' 2.0 dev version:
> >>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype
> dtype('float32')
> >>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32')).dtype
> dtype('float64')
Same issue with np.std and np.var.

On Mon, Dec 13, 2010 at 12:20 PM, Bruce Southey <bsouthey@gmail.com> wrote:
> Unless something has changed since the docstring was written, this is probably an inherited 'bug' from np.mean() as the author expected that the docstring of mean was correct. For my 'old' 2.0 dev version:
> >>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype
> dtype('float32')
> >>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32')).dtype
> dtype('float64')
Are you saying the bug is in the doc string, the output, or both? I think it is both; I expect the second result above to be float32.

On Mon, Dec 13, 2010 at 4:53 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
> On Mon, Dec 13, 2010 at 12:20 PM, Bruce Southey <bsouthey@gmail.com> wrote:
>> Unless something has changed since the docstring was written, this is probably an inherited 'bug' from np.mean() as the author expected that the docstring of mean was correct. For my 'old' 2.0 dev version:
>> >>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype
>> dtype('float32')
>> >>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32')).dtype
>> dtype('float64')
> Are you saying the bug is in the doc string, the output, or both? I think it is both; I expect the second result above to be float32.
Both! This was a surprise to me as this 'misunderstanding' goes back to at least numpy 1.1. The documentation is wrong when the axis argument is used. There is a bug because the output should be the same dtype for all possible axis values, which should be a ticket regardless. The recent half-float dtype, and the possibility that users want the lower precision, suggest that it might be a good time to ensure the 'correct' option is used (whatever that is).
Bruce

On 12/13/2010 04:53 PM, Keith Goodman wrote:
> On Mon, Dec 13, 2010 at 12:20 PM, Bruce Southey <bsouthey@gmail.com> wrote:
>> Unless something has changed since the docstring was written, this is probably an inherited 'bug' from np.mean() as the author expected that the docstring of mean was correct. For my 'old' 2.0 dev version:
>> >>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype
>> dtype('float32')
>> >>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32')).dtype
>> dtype('float64')
> Are you saying the bug is in the doc string, the output, or both? I think it is both; I expect the second result above to be float32.
Sorry, I filed a bug for this as ticket 1710 http://projects.scipy.org/numpy/ticket/1710 but this is the same as ticket 518, which is listed as won't fix: http://projects.scipy.org/numpy/ticket/518
My expectation is that the internal and output dtypes should not depend on the axis argument. Related to this, I also think that internal dtypes should be the same as the output dtype (see ticket 465 regarding the internal precision http://projects.scipy.org/numpy/ticket/465). If the consensus is still won't fix, then I or someone else needs to edit the documentation to clearly reflect these situations.
Bruce
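
To illustrate why the internal (accumulator) precision discussed above matters, here is a small standard-library sketch. It emulates float32 rounding with `struct` and compares a mean accumulated in float32 with one accumulated in float64. It assumes naive left-to-right summation, which is not necessarily how NumPy sums, and the function names are hypothetical.

```python
import struct

def to_float32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack('f', struct.pack('f', x))[0]

def mean_float32_accumulator(values):
    """Mean where every partial sum is rounded to float32."""
    total = 0.0
    for v in values:
        total = to_float32(total + to_float32(v))
    return to_float32(total / len(values))

def mean_float64_accumulator(values):
    """Mean accumulated in float64, rounded to float32 only at the end."""
    total = 0.0
    for v in values:
        total += to_float32(v)
    return to_float32(total / len(values))

values = [0.1] * 1_000_000
mean32 = mean_float32_accumulator(values)
mean64 = mean_float64_accumulator(values)
print("float32 accumulator:", mean32)
print("float64 accumulator:", mean64)
```

Both functions return a value representable in float32, so the output dtype would be the same in either case; only the accumulated rounding error differs. That is the trade-off behind choosing an internal dtype separately from the output dtype.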

On Wed, Jan 12, 2011 at 8:20 AM, Bruce Southey <bsouthey@gmail.com> wrote:
> Sorry, I filed a bug for this as ticket 1710 http://projects.scipy.org/numpy/ticket/1710 but this is the same as ticket 518, which is listed as won't fix: http://projects.scipy.org/numpy/ticket/518
I fixed ticket 518 in bottleneck:
>>> a = np.array([1,2,3], dtype='float32')
>>> bn.median(a).dtype
dtype('float32')
>>> np.median(a).dtype
dtype('float64')
Not sure I would have done that if I knew that numpy has a won't fix on it.
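
For readers who want an axis-independent output dtype on top of np.median, one possible approach is a thin wrapper that casts the result. This is only a sketch under stated assumptions: `median_keep_dtype` is a hypothetical name, it is not how Bottleneck implements `bn.median`, and casting after the fact does not change the precision NumPy uses internally.

```python
import numpy as np

def median_keep_dtype(a, axis=None):
    """Hypothetical wrapper: median whose output dtype never depends on axis."""
    a = np.asarray(a)
    # Wrap in asarray so that scalar results also have an astype method.
    result = np.asarray(np.median(a, axis=axis))
    if a.dtype.kind in 'iub':
        # Integer and boolean input: return float64, as the doc string describes.
        return result.astype(np.float64)
    # Floating input: cast back so that float32 in gives float32 out.
    return result.astype(a.dtype)

a = np.array([[0, 1, 2, 3, 4, 5]], dtype='float32')
for axis in (None, 0, 1):
    print(axis, median_keep_dtype(a, axis=axis).dtype)
```

With axis=None the wrapper returns a 0-d array rather than a NumPy scalar; call `float(...)` or index with `[()]` on the result if a scalar is needed.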
participants (2)
- Bruce Southey
- Keith Goodman