This PR changes the behaviour of masked record arrays. The rationale makes sense to me, but I don't actually use this functionality. Any masked array users want to chime in?
-n
The current behavior of MaskedArray when accessing a row object is that if all masks are False, the row is an np.void object, and if any are set to True, it is a np.ma.core.mvoid object. The issue with this is that users can access/modify masks for rows that already have at least one mask set to True, but not for rows that have no masks set. For example:
In [1]: a = ma.array(zip([1,2,3]), mask=[0,1,0], dtype=[('x', int)])
In [2]: a[0].mask
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-7-2221dd3fce8c> in <module>()
----> 1 a[0].mask
AttributeError: 'numpy.void' object has no attribute 'mask'
In [3]: a[1].mask
Out[3]: (True,)
There is no reason why a row should behave differently depending on whether any of its values are masked, so this PR always returns an mvoid object. Otherwise, mask values cannot be set for rows that start off with no mask values set.
(Of course, the present implementation in NumPy also has a performance cost: for arrays with many fields, each row access has to flatten the mask of the whole record just to decide what type to return, which is inefficient. With this PR, row access should be faster for masked arrays. But this is secondary.)
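For anyone who wants to check which behavior their installed NumPy has, here is a minimal sketch. The helper name describe_rows is hypothetical and only for illustration, and the mask is written as one tuple per record rather than the flat list used in the session above. What it prints depends on the NumPy version: without this change, unmasked rows should show up as plain void objects with no mask attribute.

```python
import numpy as np
import numpy.ma as ma


def describe_rows(arr):
    """Return (type name, has a .mask attribute) for each row of a 1-D masked record array.

    Hypothetical helper, not part of NumPy.
    """
    return [(type(arr[i]).__name__, hasattr(arr[i], "mask"))
            for i in range(len(arr))]


# Same data as the example above: only the middle record is masked.
a = ma.array([(1,), (2,), (3,)],
             mask=[(False,), (True,), (False,)],
             dtype=[("x", int)])

for i, (kind, has_mask) in enumerate(describe_rows(a)):
    print(i, kind, has_mask)
```

If every row reports the same type and a usable mask attribute, row access is consistent in the way this PR proposes.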
git pull https://github.com/astrofrog/numpy fix-masked-getitem
Or view, comment on, or merge it at:
https://github.com/numpy/numpy/pull/483