
From the np.median doc string: "If the input contains integers, or
floats of smaller precision than 64, then the output data-type is float64."
>>> arr = np.array([[0,1,2,3,4,5]], dtype='float32')
>>> np.median(arr, axis=0).dtype
dtype('float32')
>>> np.median(arr, axis=1).dtype
dtype('float32')
>>> np.median(arr, axis=None).dtype
dtype('float64')
So the output doesn't agree with the doc string.
What is the desired dtype of the accumulator and of the output when the input dtype has lower precision than float64? Should it depend on axis?
I'm trying to duplicate the behavior of np.median (and other numpy/scipy functions) in the Bottleneck package and am running into a few corner cases while unit testing.
Here's another one:
>>> np.sum([np.nan]).dtype
dtype('float64')
>>> np.nansum([1,np.nan]).dtype
dtype('float64')
>>> np.nansum([np.nan]).dtype
<snip> AttributeError: 'float' object has no attribute 'dtype'
I just duplicated the numpy behavior for that one since it was easy to do.
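For unit testing corner cases like these, a small helper can tabulate the output dtype for every axis value. This is only a sketch: `reduction_dtypes` is a hypothetical name, not part of numpy or Bottleneck, and the dtypes it reports depend on the installed numpy version.

```python
import numpy as np

def reduction_dtypes(func, arr):
    """Map each axis choice (None, 0, 1, ...) to the dtype that func returns.

    Hypothetical helper for illustration; not a numpy or Bottleneck function.
    """
    axes = [None] + list(range(arr.ndim))
    # np.asarray() also copes with a plain Python float, which has no .dtype
    return {axis: np.asarray(func(arr, axis=axis)).dtype for axis in axes}

arr = np.array([[0, 1, 2, 3, 4, 5]], dtype='float32')
for name in ('median', 'mean', 'sum'):
    print(name, reduction_dtypes(getattr(np, name), arr))
```

Wrapping the result in np.asarray() sidesteps the AttributeError case, since a bare Python float is reported as float64 instead of raising.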

On 12/13/2010 11:59 AM, Keith Goodman wrote:
> From the np.median doc string: "If the input contains integers, or
> floats of smaller precision than 64, then the output data-type is float64."
> >>> arr = np.array([[0,1,2,3,4,5]], dtype='float32')
> >>> np.median(arr, axis=0).dtype
> dtype('float32')
> >>> np.median(arr, axis=1).dtype
> dtype('float32')
> >>> np.median(arr, axis=None).dtype
> dtype('float64')
> So the output doesn't agree with the doc string.
> What is the desired dtype of the accumulator and the output for when the input dtype is less than float64? Should it depend on axis?
> I'm trying to duplicate the behavior of np.median (and other numpy/scipy functions) in the Bottleneck package and am running into a few corner cases while unit testing.
> Here's another one:
> >>> np.sum([np.nan]).dtype
> dtype('float64')
> >>> np.nansum([1,np.nan]).dtype
> dtype('float64')
> >>> np.nansum([np.nan]).dtype
> <snip> AttributeError: 'float' object has no attribute 'dtype'
> I just duplicated the numpy behavior for that one since it was easy to do.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
Unless something has changed since the docstring was written, this is probably a 'bug' inherited from np.mean(), since the author expected the np.mean docstring to be correct. With my 'old' 2.0 dev version:
>>> np.mean(np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype
dtype('float32')
>>> np.mean(np.array([[0,1,2,3,4,5]], dtype='float32')).dtype
dtype('float64')
Bruce

On Mon, Dec 13, 2010 at 12:20 PM, Bruce Southey bsouthey@gmail.com wrote:
> Unless something has changed since the docstring was written, this is probably an inherited 'bug' from np.mean() as the author expected that the docstring of mean was correct. For my 'old' 2.0 dev version:
> >>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype
> dtype('float32')
> >>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32')).dtype
> dtype('float64')
Same issue with np.std and np.var.

On Mon, Dec 13, 2010 at 12:20 PM, Bruce Southey bsouthey@gmail.com wrote:
> Unless something has changed since the docstring was written, this is probably an inherited 'bug' from np.mean() as the author expected that the docstring of mean was correct. For my 'old' 2.0 dev version:
> >>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype
> dtype('float32')
> >>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32')).dtype
> dtype('float64')
Are you saying the bug is in the doc string, the output, or both? I think it is both; I expect the second result above to be float32.

On Mon, Dec 13, 2010 at 4:53 PM, Keith Goodman kwgoodman@gmail.com wrote:
> Are you saying the bug is in the doc string, the output, or both? I think it is both; I expect the second result above to be float32.
This was a surprise to me as this 'misunderstanding' goes back to at least numpy 1.1.
Both!
The documentation is wrong when the axis argument is used.
There is also a bug, because the output dtype should be the same for all possible axis values; that deserves a ticket regardless. The recent addition of the half-float dtype, and the possibility that users want the lower precision, suggest that this might be a good time to ensure the 'correct' option is used (whatever that is).
Bruce

On 12/13/2010 04:53 PM, Keith Goodman wrote:
> Are you saying the bug is in the doc string, the output, or both? I think it is both; I expect the second result above to be float32.
Sorry, I filed a bug for this as ticket 1710 (http://projects.scipy.org/numpy/ticket/1710), but it is the same as ticket 518, which is listed as won't fix: http://projects.scipy.org/numpy/ticket/518
My expectation is that the internal and output dtypes should not depend on the axis argument. Related to this, I also think that internal dtypes should be the same as the output dtype (see ticket 465 regarding the internal precision http://projects.scipy.org/numpy/ticket/465).
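As a sketch of one way to get axis-independent behavior from np.mean today: its dtype argument requests the type used in the computation, and the result comes back in that type for every axis value. As far as I know np.median takes no such argument, so this covers np.mean (and likewise np.std and np.var) only.

```python
import numpy as np

a = np.array([[0, 1, 2, 3, 4, 5]], dtype='float32')

# Request the computation dtype explicitly so the result type does not
# depend on which axis (if any) is reduced.
for axis in (None, 0, 1):
    m32 = np.mean(a, axis=axis, dtype=np.float32)
    m64 = np.mean(a, axis=axis, dtype=np.float64)
    print(axis, m32.dtype, m64.dtype)
```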
If the consensus is still "won't fix", then I or someone else needs to edit the documentation so that it clearly reflects these situations.
Bruce

On Wed, Jan 12, 2011 at 8:20 AM, Bruce Southey bsouthey@gmail.com wrote:
> Sorry as I filed a bug for this as 1710 http://projects.scipy.org/numpy/ticket/1710 but this is the same as ticket 518 that is listed as won't fix: http://projects.scipy.org/numpy/ticket/518
I fixed ticket 518 in bottleneck:
>>> a = np.array([1,2,3], dtype='float32')
>>> bn.median(a).dtype
dtype('float32')
>>> np.median(a).dtype
dtype('float64')
Not sure I would have done that if I had known that numpy has a won't fix on it.
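For anyone who wants the same output dtype from plain numpy, a rough sketch follows. It is not how bottleneck does it: `median_keep_dtype` is a hypothetical wrapper that lets np.median compute in whatever precision it chooses and then casts the result back to the input's float dtype.

```python
import numpy as np

def median_keep_dtype(a, axis=None):
    """Hypothetical wrapper: median whose float output dtype matches the input."""
    a = np.asarray(a)
    result = np.asarray(np.median(a, axis=axis))
    if np.issubdtype(a.dtype, np.floating):
        # Cast back so float32 in gives float32 out for every axis value.
        result = result.astype(a.dtype)
    return result

a = np.array([[0, 1, 2, 3, 4, 5]], dtype='float32')
for axis in (None, 0, 1):
    print(axis, median_keep_dtype(a, axis=axis).dtype)
```

Casting afterwards only fixes the output dtype; it says nothing about the precision used internally, which is the separate question raised in ticket 465. Integer input is left as whatever np.median returns.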
participants (2)
- Bruce Southey
- Keith Goodman