Pierre,

ma.masked_all does not seem to work with fancy dtypes and more than one dimension:

In [1]: import numpy as np
In [2]: dt = np.dtype({'names': ['a', 'b'], 'formats': ['f', 'f']})
In [3]: x = np.ma.masked_all((2,), dtype=dt)
In [4]: x
Out[4]:
masked_array(data = [(--, --) (--, --)],
      mask = [(True, True) (True, True)],
      fill_value=(1.0000000200408773e+20, 1.0000000200408773e+20))

In [5]: x = np.ma.masked_all((2,2), dtype=dt)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

/home/efiring/<ipython console> in <module>()

/usr/local/lib/python2.5/site-packages/numpy/ma/extras.pyc in masked_all(shape, dtype)
     78     """
     79     a = masked_array(np.empty(shape, dtype),
---> 80                      mask=np.ones(shape, bool))
     81     return a
     82

/usr/local/lib/python2.5/site-packages/numpy/ma/core.pyc in __new__(cls, data, mask, dtype, copy, subok, ndmin, fill_value, keep_mask, hard_mask, flag, shrink, **options)
   1304                 except TypeError:
   1305                     mask = np.array([tuple([m]*len(mdtype)) for m in mask],
-> 1306                                     dtype=mdtype)
   1307                 # Make sure the mask and the data have the same shape
   1308                 if mask.shape != _data.shape:

TypeError: expected a readable buffer object

-----------------
Eric
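For anyone who wants to check this against their own installation, the report above condenses to the short script below. It is only a sketch and assumes a NumPy release that already contains the later fix, in which case both calls are expected to succeed and the mask carries one boolean field per data field. On the 2008 build shown in the traceback, the second call raised the TypeError instead.

```python
import numpy as np

# Same two-field dtype as in the report.
dt = np.dtype({'names': ['a', 'b'], 'formats': ['f', 'f']})

# 1-D case, which already worked in the original report.
x1 = np.ma.masked_all((2,), dtype=dt)

# 2-D case, which raised TypeError in the original report.
x2 = np.ma.masked_all((2, 2), dtype=dt)

print(x2.shape)
print(x2.mask.dtype)  # assumed: a structured dtype with bool fields 'a' and 'b'
```

If the second call still fails, the installed numpy.ma most likely predates the fix discussed in this thread.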
Pierre,

Your change fixed masked_all for the example I gave, but I think it introduced a new failure in zeros:

dt = np.dtype([((' Pressure, Digiquartz [db]', 'P'), '<f4'),
               ((' Depth [salt water, m]', 'D'), '<f4'),
               ((' Temperature [ITS-90, deg C]', 'T'), '<f4'),
               ((' Descent Rate [m/s]', 'w'), '<f4'),
               ((' Salinity [PSU]', 'S'), '<f4'),
               ((' Density [sigma-theta, Kg/m^3]', 'sigtheta'), '<f4'),
               ((' Potential Temperature [ITS-90, deg C]', 'theta'), '<f4')])

np.ma.zeros((2,2), dt)

results in:

ValueError                                Traceback (most recent call last)

/home/efiring/<ipython console> in <module>()

/usr/local/lib/python2.5/site-packages/numpy/ma/core.pyc in __call__(self, a, *args, **params)
   4533     #
   4534     def __call__(self, a, *args, **params):
-> 4535         return self._func.__call__(a, *args, **params).view(MaskedArray)
   4536
   4537 arange = _convert2ma('arange')

/usr/local/lib/python2.5/site-packages/numpy/ma/core.pyc in __array_finalize__(self, obj)
   1548             odtype = obj.dtype
   1549             if odtype.names:
-> 1550                 _mask = getattr(obj, '_mask', make_mask_none(obj.shape, odtype))
   1551             else:
   1552                 _mask = getattr(obj, '_mask', nomask)

/usr/local/lib/python2.5/site-packages/numpy/ma/core.pyc in make_mask_none(newshape, dtype)
    921         result = np.zeros(newshape, dtype=MaskType)
    922     else:
--> 923         result = np.zeros(newshape, dtype=make_mask_descr(dtype))
    924     return result
    925

/usr/local/lib/python2.5/site-packages/numpy/ma/core.pyc in make_mask_descr(ndtype)
    819     if not isinstance(ndtype, np.dtype):
    820         ndtype = np.dtype(ndtype)
--> 821     return np.dtype(_make_descr(ndtype))
    822
    823 def get_mask(a):

/usr/local/lib/python2.5/site-packages/numpy/ma/core.pyc in _make_descr(datatype)
    806         descr = []
    807         for name in names:
--> 808             (ndtype, _) = datatype.fields[name]
    809             descr.append((name, _make_descr(ndtype)))
    810         return descr

ValueError: too many values to unpack
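The last frame of the traceback unpacks datatype.fields[name] into exactly two values. The sketch below isolates that step with a shortened, hypothetical stand-in for the dtype above (the field titles here are invented for brevity). It suggests that a field declared with a (title, name) pair carries a third element, which is what the two-name unpacking trips over.

```python
import numpy as np

# Hypothetical two-field stand-in for the CTD dtype, using (title, name) pairs.
dt = np.dtype([(('Pressure [db]', 'P'), '<f4'),
               (('Depth [m]', 'D'), '<f4')])

# For a titled field the entry is (dtype, offset, title), not (dtype, offset).
entry = dt.fields['P']
print(entry)

# The same unpacking pattern as line 808 in the traceback.
try:
    (ndtype, _) = entry
    unpack_error = None
except ValueError as exc:
    unpack_error = exc

print(unpack_error)
```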
Eric,

That's quite a handful you have with this dtype...

So yes, the fix I gave works with nested dtypes and flexible dtypes with a simple name (string, not tuple). I'm a bit surprised by numpy here. Consider:

    >>> dt.names
    ('P', 'D', 'T', 'w', 'S', 'sigtheta', 'theta')

So we lose the tuple and get a single string instead, corresponding to the right-hand element of the name. But this single string is one of the keys of dt.fields, whereas the tuple is not. Puzzling. I'm sure there must be some reference in the numpy book, but I can't look for it now.

Anyway: prior to revision 6127, make_mask_descr was substituting a bool for the 2nd element of each tuple of a dtype.descr, which failed for nested dtypes. Now we check the field corresponding to each name, which fails in this particular case.

I'll be working on it...

On Dec 2, 2008, at 1:59 AM, Eric Firing wrote:
> dt = np.dtype([((' Pressure, Digiquartz [db]', 'P'), '<f4'),
>                ((' Depth [salt water, m]', 'D'), '<f4'),
>                ((' Temperature [ITS-90, deg C]', 'T'), '<f4'),
>                ((' Descent Rate [m/s]', 'w'), '<f4'),
>                ((' Salinity [PSU]', 'S'), '<f4'),
>                ((' Density [sigma-theta, Kg/m^3]', 'sigtheta'), '<f4'),
>                ((' Potential Temperature [ITS-90, deg C]', 'theta'), '<f4')])
>
> np.ma.zeros((2,2), dt)
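The two approaches described in this message (rewriting dtype.descr versus looking up each field by name) can be contrasted with a small recursive helper. This is only a sketch under stated assumptions: mask_descr_sketch is a hypothetical name, it is not the code committed to numpy.ma, it drops titles rather than preserving them, and it ignores subarray shapes. It reads only the first element of each fields entry, so a third (title) element does no harm, and it recurses into nested dtypes.

```python
import numpy as np

def mask_descr_sketch(dtype):
    """Hypothetical helper: build a bool description mirroring `dtype`.

    Not numpy's make_mask_descr. Titles are dropped and subarray shapes
    are ignored to keep the sketch short.
    """
    dtype = np.dtype(dtype)
    if dtype.names is None:
        # Plain (unstructured) dtype: a single bool per element.
        return bool
    descr = []
    for name in dtype.names:
        # Index instead of unpacking, so (dtype, offset) and
        # (dtype, offset, title) entries are both accepted.
        field_dtype = dtype.fields[name][0]
        descr.append((name, mask_descr_sketch(field_dtype)))
    return descr

# Titled fields, as in the failing report (titles shortened and invented here).
titled = np.dtype([(('Pressure [db]', 'P'), '<f4'),
                   (('Depth [m]', 'D'), '<f4')])
# Nested fields, the case the pre-r6127 approach reportedly missed.
nested = np.dtype([('a', 'f4'), ('b', [('x', 'i4'), ('y', 'f8')])])

titled_mask = np.zeros((2, 2), dtype=np.dtype(mask_descr_sketch(titled)))
nested_mask = np.zeros((2, 2), dtype=np.dtype(mask_descr_sketch(nested)))
print(titled_mask.dtype)
print(nested_mask.dtype)
```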
Pierre GM wrote:
> Eric,
> That's quite a handful you have with this dtype...

Here is a simplified example of how I made it:

    dt = np.dtype({'names': ['a','b'], 'formats': ['f', 'f'], 'titles': ['aaa', 'bbb']})

From page 132 in the numpy book:

    The fields dictionary is indexed by keys that are the names of the fields. Each entry in the dictionary is a tuple fully describing the field: (dtype, offset[,title]). If present, the optional title can actually be any object (if it is string or unicode then it will also be a key in the fields dictionary, otherwise it’s meta-data).

I put the titles in as a sort of additional documentation, thinking that they might be useful for labeling plots; but it is rather hard to get the titles back out, since they are not directly accessible as an attribute, like names. Probably I should just omit them.

Eric
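The behaviour described in the book passage can be inspected directly with the simplified dtype above. A minimal sketch, assuming a NumPy version that still follows that description: names lists only the field names, while the keys of fields include the string titles as well, and the title sits in the third slot of each entry.

```python
import numpy as np

dt = np.dtype({'names': ['a', 'b'],
               'formats': ['f', 'f'],
               'titles': ['aaa', 'bbb']})

# names holds only the field names, in order.
print(dt.names)

# fields is keyed by names and, for string titles, by the titles too.
field_keys = sorted(dt.fields)
print(field_keys)

# Each entry is (dtype, offset, title); pull the title out by position.
titles = [dt.fields[name][2] for name in dt.names]
print(titles)
```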
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
On Dec 2, 2008, at 4:26 AM, Eric Firing wrote:
> From page 132 in the numpy book:
>
> The fields dictionary is indexed by keys that are the names of the fields. Each entry in the dictionary is a tuple fully describing the field: (dtype, offset[,title]). If present, the optional title can actually be any object (if it is string or unicode then it will also be a key in the fields dictionary, otherwise it’s meta-data).

I should read it more often...

> I put the titles in as a sort of additional documentation, and thinking that they might be useful for labeling plots;

That's actually quite a good idea...

> but it is rather hard to get the titles back out, since they are not directly accessible as an attribute, like names. Probably I should just omit them.

We could perhaps try a function:

    def gettitle(dtype, name):
        try:
            field = dtype.fields[name]
        except (TypeError, KeyError):
            return None
        else:
            if len(field) > 2:
                return field[-1]
            return None
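As a quick check of the proposal, here is the same function in runnable form together with a usage example. The dtype is the simplified one from earlier in the thread; the labels dictionary and the plain dtype are assumptions added purely for illustration, for instance as a source of axis labels when plotting.

```python
import numpy as np

def gettitle(dtype, name):
    """Return the title of field `name`, or None if it has no title."""
    try:
        field = dtype.fields[name]
    except (TypeError, KeyError):
        # TypeError: dtype.fields is None (unstructured dtype).
        # KeyError: no field with that name.
        return None
    if len(field) > 2:
        # Entry is (dtype, offset, title).
        return field[-1]
    return None

dt = np.dtype({'names': ['a', 'b'],
               'formats': ['f', 'f'],
               'titles': ['aaa', 'bbb']})
plain = np.dtype([('a', 'f4')])  # structured, but without titles

# Map each field name to its title.
labels = {name: gettitle(dt, name) for name in dt.names}
print(labels)
print(gettitle(plain, 'a'))
```

Returning None for every missing case keeps the caller simple: a plotting routine could fall back to the field name whenever no title is available.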
Pierre GM wrote:
> On Dec 2, 2008, at 1:59 AM, Eric Firing wrote:
>> Pierre,
>>
>> Your change fixed masked_all for the example I gave, but I think it introduced a new failure in zeros:
>
> Eric,
> Would you mind giving r6131 a try? It's rather ugly but looks like it works...

So far, so good. Thanks very much.

Eric
participants (2): Eric Firing, Pierre GM