From the np.median doc string: "If the input contains integers, or floats of smaller precision than 64, then the output datatype is float64."

>>> arr = np.array([[0,1,2,3,4,5]], dtype='float32')
>>> np.median(arr, axis=0).dtype
dtype('float32')
>>> np.median(arr, axis=1).dtype
dtype('float32')
>>> np.median(arr, axis=None).dtype
dtype('float64')

So the output doesn't agree with the doc string.

What is the desired dtype of the accumulator and the output for when the input dtype is less than float64? Should it depend on axis?

I'm trying to duplicate the behavior of np.median (and other numpy/scipy functions) in the Bottleneck package and am running into a few corner cases while unit testing.

Here's another one:

>>> np.sum([np.nan]).dtype
dtype('float64')
>>> np.nansum([1,np.nan]).dtype
dtype('float64')
>>> np.nansum([np.nan]).dtype
<snip>
AttributeError: 'float' object has no attribute 'dtype'

I just duplicated the numpy behavior for that one since it was easy to do.
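For unit tests that compare dtypes across corner cases like the one above, one way to sidestep the AttributeError is to coerce the result before reading its dtype. The helper below is only a minimal sketch: `result_dtype` is a hypothetical name, not a NumPy or Bottleneck function, and it reports the dtype NumPy assigns to a plain Python float rather than changing what np.nansum returns.

```python
import numpy as np

def result_dtype(value):
    """Return the dtype of a reduction result, even for a plain Python scalar."""
    # np.asarray keeps NumPy scalars and arrays at their own dtype and
    # maps a Python float to float64, so .dtype is always available.
    return np.asarray(value).dtype

print(result_dtype(np.float32(1.5)))  # a NumPy scalar keeps its dtype
print(result_dtype(float('nan')))     # a Python float is reported as float64
```

A test can then compare `result_dtype(bn_result)` with `result_dtype(np_result)` without caring whether either side handed back a Python float.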
_______________________________________________
NumPyDiscussion mailing list
NumPyDiscussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpydiscussion

On 12/13/2010 11:59 AM, Keith Goodman wrote:
> So the output doesn't agree with the doc string.

Unless something has changed since the docstring was written, this is probably an inherited 'bug' from np.mean() as the author expected that the docstring of mean was correct. For my 'old' 2.0 dev version:

>>> np.mean(np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype
dtype('float32')
>>> np.mean(np.array([[0,1,2,3,4,5]], dtype='float32')).dtype
dtype('float64')

Bruce
On Mon, Dec 13, 2010 at 12:20 PM, Bruce Southey wrote:
> Unless something has changed since the docstring was written, this is
> probably an inherited 'bug' from np.mean() as the author expected that
> the docstring of mean was correct. For my 'old' 2.0 dev version:
>
> >>> np.mean(np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype
> dtype('float32')
> >>> np.mean(np.array([[0,1,2,3,4,5]], dtype='float32')).dtype
> dtype('float64')

Same issue with np.std and np.var.
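
A quick way to check all four functions on whatever NumPy version is installed is to tabulate the result dtype per axis. This is a minimal sketch: `reduction_dtypes` is a hypothetical helper name, and the dtypes it prints depend on the NumPy version, so no output is shown here.

```python
import numpy as np

def reduction_dtypes(func, arr, axes=(None, 0, 1)):
    """Return {axis: result dtype} for a reduction such as np.median."""
    # np.asarray guards against reductions that hand back a Python scalar.
    return {axis: np.asarray(func(arr, axis=axis)).dtype for axis in axes}

arr = np.array([[0, 1, 2, 3, 4, 5]], dtype='float32')

for func in (np.median, np.mean, np.std, np.var):
    dtypes = reduction_dtypes(func, arr)
    # A single distinct dtype means the output does not depend on axis.
    verdict = 'consistent' if len(set(dtypes.values())) == 1 else 'axis-dependent'
    print(func.__name__, dtypes, verdict)
```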
On Mon, Dec 13, 2010 at 12:20 PM, Bruce Southey wrote:
> For my 'old' 2.0 dev version:
>
> >>> np.mean(np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype
> dtype('float32')
> >>> np.mean(np.array([[0,1,2,3,4,5]], dtype='float32')).dtype
> dtype('float64')

Are you saying the bug is in the doc string, the output, or both? I think it is both; I expect the second result above to be float32.
On Mon, Dec 13, 2010 at 4:53 PM, Keith Goodman wrote:
> Are you saying the bug is in the doc string, the output, or both? I
> think it is both; I expect the second result above to be float32.

This was a surprise to me, as this 'misunderstanding' goes back to at least numpy 1.1.

Both! The documentation is wrong when the axis argument is used. There is also a bug, because the output should have the same dtype for all possible axis values, which should be a ticket regardless. The recent half-float dtype, and the possibility that users want the lower precision, suggest that it might be a good time to ensure the 'correct' option is used (whatever that is).

Bruce
On 12/13/2010 04:53 PM, Keith Goodman wrote:
> Are you saying the bug is in the doc string, the output, or both? I
> think it is both; I expect the second result above to be float32.

Sorry, I filed a bug for this as ticket 1710 (http://projects.scipy.org/numpy/ticket/1710), but it is the same as ticket 518, which is listed as won't fix: http://projects.scipy.org/numpy/ticket/518

My expectation is that the internal and output dtypes should not depend on the axis argument. Related to this, I also think that internal dtypes should be the same as the output dtype (see ticket 465 regarding the internal precision: http://projects.scipy.org/numpy/ticket/465). If the consensus is still won't fix, then I or someone else needs to edit the documentation to clearly reflect these situations.

Bruce
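
Where a fixed output dtype matters, np.mean and np.sum accept an explicit dtype argument that is used for the computation and the result, whatever the axis. The snippet below is a minimal sketch, assuming the caller wants float32 in and float32 out; np.median takes no dtype argument, so this does not cover it.

```python
import numpy as np

arr = np.array([[0, 1, 2, 3, 4, 5]], dtype='float32')

# Pin the dtype explicitly instead of relying on the per-axis default.
mean_dtypes = {}
sum_dtypes = {}
for axis in (None, 0, 1):
    mean_dtypes[axis] = np.asarray(np.mean(arr, axis=axis, dtype=arr.dtype)).dtype
    sum_dtypes[axis] = np.asarray(np.sum(arr, axis=axis, dtype=arr.dtype)).dtype

print(mean_dtypes)
print(sum_dtypes)
```

Pinning the dtype trades precision for consistency: accumulating in float32 can lose accuracy on long arrays, which is the concern behind the internal precision ticket mentioned above.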
On Wed, Jan 12, 2011 at 8:20 AM, Bruce Southey wrote:
> Sorry, I filed a bug for this as ticket 1710
> (http://projects.scipy.org/numpy/ticket/1710), but it is the same as
> ticket 518, which is listed as won't fix:
> http://projects.scipy.org/numpy/ticket/518

I fixed ticket 518 in bottleneck:

>>> a = np.array([1,2,3], dtype='float32')
>>> bn.median(a).dtype
dtype('float32')
>>> np.median(a).dtype
dtype('float64')

Not sure I would have done that if I knew that numpy has a won't fix on it.

participants (2)

Bruce Southey

Keith Goodman