[Numpy-discussion] Getting 95%/99% margin of ndarray

Thu Jul 23 08:08:03 EDT 2009

> 2009/7/23 Pierre GM <pgmdevlist at gmail.com>:
>
> On Jul 23, 2009, at 6:07 AM, Scott Sinclair wrote:
>
>>> 2009/7/22 Pierre GM <pgmdevlist at gmail.com>:
>>> You could try scipy.stats.scoreatpercentile,
>>> scipy.stats.mstats.plottingposition or scipy.stats.mstats.mquantiles,
>>> which will all approximate quantiles of your distribution.
>>
>> It seems that mquantiles doesn't do what you'd expect when the limit
>> keyword argument is specified. There's a patch for review here:
>
> Thx for the patch, I'll port it in the next few hours. However, I
> disagree with the last few lines (where the quantiles are transformed
> to a standard ndarray if the mask is nomask). For consistency, we
> should always have a MaskedArray, don't you think? (And anyway,
> taking a view as an ndarray is faster than using np.asarray...)

Agreed, it's more consistent to always return a MaskedArray.

I don't remember why I chose to return an ndarray. It probably had to
do with the fact that an ndarray is already returned when 'axis' isn't
specified:

>>> import numpy as np
>>> import scipy as sp
>>> sp.__version__
'0.8.0.dev5874'
>>> from scipy.stats.mstats import mquantiles
>>> a = np.array([6., 47., 49., 15., 42., 41., 7., 39., 43., 40., 36.])
>>> type(mquantiles(a))
<type 'numpy.ndarray'>
>>> type(mquantiles(np.ma.masked_array(a)))
<type 'numpy.ndarray'>
>>> type(mquantiles(a, axis=0))
<class 'numpy.ma.core.MaskedArray'>

This could be fixed by forcing _quantiles1D() to always return a MaskedArray.
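
As a rough illustration of the idea (not the actual scipy code), here is a
minimal sketch of a 1-D helper that always returns a MaskedArray. The name
quantiles_1d is hypothetical, and np.percentile is only a stand-in for the
real estimator, so the values will generally differ from what mquantiles
computes with its own plotting-position parameters:

```python
import numpy as np

def quantiles_1d(data, probs):
    """Hypothetical stand-in for a 1-D quantile helper (not scipy's code)."""
    # Drop any masked entries, then sort what is left.
    values = np.sort(np.ma.masked_array(data).compressed())
    if values.size == 0:
        # No valid data: return a fully masked result of the right length.
        return np.ma.masked_all(len(probs), dtype=float)
    # Stand-in estimator (assumption): linear interpolation via np.percentile.
    result = np.percentile(values, np.asarray(probs, dtype=float) * 100.0)
    # Wrap unconditionally so the return type never depends on the input.
    return np.ma.masked_array(result)

a = np.array([6., 47., 49., 15., 42., 41., 7., 39., 43., 40., 36.])
probs = [0.25, 0.5, 0.75]

q_plain = quantiles_1d(a, probs)
q_masked = quantiles_1d(np.ma.masked_array(a, mask=(a < 10)), probs)
print(type(q_plain), type(q_masked))

# A caller who really wants a plain ndarray can take a view afterwards.
as_ndarray = q_plain.view(np.ndarray)
print(type(as_ndarray))
```

With the wrapping done in one place, the result type is the same whether the
input is a plain ndarray or a MaskedArray, and whether or not 'axis' is given
by the caller.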

Cheers,
Scott