percentile function for masked array?

Dear all,
It seems that there is no percentile function for masked arrays in numpy or scipy? I checked numpy.percentile and scipy.percentile, and they seem to support only non-masked arrays. There is no percentile function in scipy.stats.mstats either, so I guess I have to use np.percentile(arr.compressed(), q).
Thanks for any comments.
Best,
Chao

On Mon, Jun 2, 2014 at 3:58 PM, Chao YUE chaoyuejoy@gmail.com wrote:
Dear all,
It seems that there is no percentile function for masked arrays in numpy or scipy? I checked numpy.percentile and scipy.percentile, and they seem to support only non-masked arrays. There is no percentile function in scipy.stats.mstats either, so I guess I have to use np.percentile(arr.compressed(), q).
Thanks for any comments.
There is currently no ma.percentile, but numpy 1.9, which we will hopefully release as a first beta very soon, will contain np.nanpercentile, which can be used to emulate ma.percentile for floating point data:
r = np.nanpercentile(maskedarray.filled(np.nan), (5, 95), axis=(0,1))
r = ma.masked_array(r, np.isnan(r))
For 1-dimensional arrays, np.percentile(arr.compressed(), (5, 95), overwrite_input=True) works fine and is also the fastest possible way.
A generic masked percentile would be useful, and patches for that are welcome. Ideally a patch should also take care of the poor performance of multidimensional nanpercentile along small axes, similar to how this PR fixes it for ma.median/nanmedian: https://github.com/numpy/numpy/pull/4760
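
A minimal sketch of the two approaches described above, assuming numpy 1.9 or later. The helper name `masked_percentile` is hypothetical (it is not part of numpy.ma), and the cast to float is an assumption needed so that NaN can stand in for masked entries; genuine NaN values in the data are treated as missing as well.

```python
import numpy as np
import numpy.ma as ma


def masked_percentile(marr, q, axis=None):
    """Percentile of a masked array that ignores masked entries.

    Hypothetical helper, not part of numpy.ma. It relies on
    np.nanpercentile, which requires numpy >= 1.9.
    """
    # Cast to float (assumption: NaN cannot be stored in integer arrays)
    # and replace masked entries with NaN so nanpercentile skips them.
    filled = ma.asarray(marr).astype(float).filled(np.nan)
    result = np.nanpercentile(filled, q, axis=axis)
    # A slice that was fully masked comes back as NaN; mask it again.
    return ma.masked_array(result, mask=np.isnan(result))


data = ma.masked_array(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    mask=[[False, False, True], [False, True, False]],
)

# Median along axis 0, skipping masked entries: values 2.5, 2.0, 6.0.
per_column = masked_percentile(data, 50, axis=0)

# Flattened case: compressed() returns a fresh 1-D copy of the unmasked
# values, so overwrite_input=True cannot damage the original data.
overall = np.percentile(data.compressed(), 50, overwrite_input=True)
```

For the flattened case both routes give the same number; the `compressed()` route avoids the NaN round trip, which is why it is preferable when no axis is needed.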

It seems that there is not a percentile function for masked array in numpy or scipy?
Percentile is not the only function missing in ma. See for example
https://github.com/numpy/numpy/issues/4356 https://github.com/numpy/numpy/issues/4355
It seems to me that ma has been treated on par with np.matrix in recent years while several attempts were made to replace it with something better.
I don't think any better alternative has materialized, so it is probably time to declare that ma is the supported mechanism to deal with missing values in numpy and make an effort to keep the np and ma interfaces in sync.

On Mon, Jun 2, 2014 at 9:30 AM, Alexander Belopolsky ndarray@mac.com wrote:
It seems that there is not a percentile function for masked array in numpy or scipy?
Percentile is not the only function missing in ma. See for example
https://github.com/numpy/numpy/issues/4356 https://github.com/numpy/numpy/issues/4355
It seems to me that ma has been treated on par with np.matrix in recent years while several attempts were made to replace it with something better.
I don't think any better alternative has materialized, so it is probably time to declare that ma is the supported mechanism to deal with missing values in numpy and make an effort to keep the np and ma interfaces in sync.
Masked arrays have no maintainer, and haven't for several years, nor do I see anyone coming along to take it up. It would be good to have methods for dealing with missing values in mainline as they would get more maintenance. Perhaps it is time to reopen that discussion and find a way forward.
Chuck

On Mon, Jun 2, 2014 at 11:48 AM, Charles R Harris <charlesr.harris@gmail.com> wrote:
Masked arrays have no maintainer, and haven't for several years, nor do I see anyone coming along to take it up.
I was effectively a maintainer of ma after the Numeric -> numpy transition and before it was rewritten to use inheritance from ndarray.
I cannot commit to implementing new features myself, but I will review the patches that come along.

On Mon, Jun 2, 2014 at 10:15 AM, Alexander Belopolsky ndarray@mac.com wrote:
On Mon, Jun 2, 2014 at 11:48 AM, Charles R Harris <charlesr.harris@gmail.com> wrote:
Masked arrays have no maintainer, and haven't for several years, nor do I see anyone coming along to take it up.
I was effectively a maintainer of ma after the Numeric -> numpy transition and before it was rewritten to use inheritance from ndarray.
I cannot commit to implementing new features myself, but I will review the patches that come along.
Most recent ma patches are coming from the astropy folks who want masked arrays to work better for numpy subclasses. We've put off committing them until 1.10 development begins, but there are several in the queue. I think the masked array code is also due a cleanup/rationalization. Any comments you have along that line are welcome.
Chuck

On Mon, Jun 2, 2014 at 12:25 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
I think the masked array code is also due a cleanup/rationalization. Any comments you have along that line are welcome.
Here are a few thoughts:
1. Please avoid another major rewrite.
2. Stop pretending that instances of ma.MaskedArray and ndarray have "is a" relationship. Use of inheritance should become an implementation detail and any method that is not explicitly overridden should raise an exception.
3. Add a mechanism to keep numpy and numpy.ma APIs in sync. At a minimum - add a test comparing public functions and methods and for pure python functions compare signatures.
4. Consider deprecating the ma.masked scalar.
5. Support duck-typing in MaskedArray constructors. If supplied data object has mask attribute it should be used as mask. This will allow interoperability with alternative missing values implementations. (ndarray may itself grow mask attribute one day which will be equivalent to isnan. Bit views, anyone?)
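
The API check suggested in point 3 could look roughly like the sketch below. The helper names `public_functions` and `signature_mismatches` are hypothetical, and treating "public" as "listed in `__all__`, callable, and not a class" is an assumption made for illustration, not an agreed definition.

```python
import inspect
import warnings

import numpy as np
import numpy.ma as ma


def public_functions(module):
    """Names of public, non-class callables exported by a module."""
    names = set()
    with warnings.catch_warnings():
        # Looking up deprecated attributes may emit warnings; ignore them.
        warnings.simplefilter("ignore")
        for name in getattr(module, "__all__", dir(module)):
            if name.startswith("_"):
                continue
            obj = getattr(module, name, None)
            if callable(obj) and not inspect.isclass(obj):
                names.add(name)
    return names


def signature_mismatches(names):
    """Map each name to its (np, ma) signatures when parameter names differ."""
    mismatches = {}
    for name in sorted(names):
        try:
            np_sig = inspect.signature(getattr(np, name))
            ma_sig = inspect.signature(getattr(ma, name))
        except (TypeError, ValueError):
            # Builtins and ufuncs may not expose an introspectable signature.
            continue
        if list(np_sig.parameters) != list(ma_sig.parameters):
            mismatches[name] = (str(np_sig), str(ma_sig))
    return mismatches


np_names = public_functions(np)
ma_names = public_functions(ma)

# Functions present in numpy but absent from numpy.ma.
missing = sorted(np_names - ma_names)

# Functions present in both whose parameter lists differ.
mismatches = signature_mismatches(np_names & ma_names)
```

A real test would need a whitelist, since many numpy functions have no meaningful masked counterpart and many ma wrappers accept `*args, **kwargs` by design; the sketch only shows where such a test would get its raw lists.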

Dear all,
Thank you for this information. I will return to this issue later and will probably write a patch (as a temporary solution) for it. Since I have never done this before, it may take me some time. The broader masked array design issues can be left for further discussion.
Best,
Chao
On Mon, Jun 2, 2014 at 6:50 PM, Alexander Belopolsky ndarray@mac.com wrote:
On Mon, Jun 2, 2014 at 12:25 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
I think the masked array code is also due a cleanup/rationalization. Any comments you have along that line are welcome.
Here are a few thoughts:
1. Please avoid another major rewrite.
2. Stop pretending that instances of ma.MaskedArray and ndarray have "is a" relationship. Use of inheritance should become an implementation detail and any method that is not explicitly overridden should raise an exception.
3. Add a mechanism to keep numpy and numpy.ma APIs in sync. At a minimum - add a test comparing public functions and methods and for pure python functions compare signatures.
4. Consider deprecating the ma.masked scalar.
5. Support duck-typing in MaskedArray constructors. If supplied data object has mask attribute it should be used as mask. This will allow interoperability with alternative missing values implementations. (ndarray may itself grow mask attribute one day which will be equivalent to isnan. Bit views, anyone?)
NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
participants (4): Alexander Belopolsky, Chao YUE, Charles R Harris, Julian Taylor