[Numpy-discussion] Transparently reading complex arrays from netcdf4

Stephan Hoyer shoyer at gmail.com
Sun Mar 30 02:18:23 EDT 2014


Hi Glenn,

Here is a full example of how we wrap a netCDF4.Variable object,
implementing all of its ndarray-like methods:
https://github.com/akleeman/xray/blob/0c1a963be0542b7303dc875278f3b163a15429c5/src/xray/conventions.py#L91

The __array__ method would be the most relevant one for you: it means that
numpy knows how to convert the wrapper array into a numpy.ndarray when you
call np.mean(cplx_data). More generally, any function that calls
np.asarray(cplx_data) will properly convert the values, which should
include most functions from well-written libraries (including numpy and
scipy). netCDF4.Variable doesn't currently have such an __array__ method,
but it will in the next released version of the library.
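
A minimal sketch of the idea above, not the xray implementation linked
earlier. It assumes a numpy structured array with hypothetical 'real' and
'imag' fields as a stand-in for the netCDF4 compound variable, so that it
runs without a file on disk. The copy argument is accepted only because
newer numpy versions may pass it.

```python
import numpy as np


class WrapComplex(object):
    """Lazily expose compound (real, imag) records as complex values."""

    def __init__(self, nc_var):
        self.nc_var = nc_var

    def __getitem__(self, item):
        # Fetch only the requested records, then reinterpret as complex.
        return np.asarray(self.nc_var[item]).view(np.complex128)

    def __array__(self, dtype=None, copy=None):
        # np.asarray(wrapper) ends up here; this reads the whole variable.
        return np.asarray(self[...], dtype=dtype)


# Stand-in for the netCDF4 variable (assumption: field names are made up).
raw = np.zeros(4, dtype=[('real', 'f8'), ('imag', 'f8')])
raw['real'] = [1, 2, 3, 4]
raw['imag'] = [0, 1, 0, -1]

cplx_data = WrapComplex(raw)
as_array = np.asarray(cplx_data)   # converted through __array__
mean_value = np.mean(cplx_data)    # no cplx_data[:] needed
```

Note that cplx_data.mean() still fails with this wrapper; only functions
that coerce their input with np.asarray (or similar) pick up __array__.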

The quick and dirty hack to make all numpy methods work (now going beyond
what the netCDF4 library implements) would be to add something like the
following:

    def __getattr__(self, attr):
        return getattr(np.asarray(self), attr)

But this is a little dangerous, since some methods might silently fail or
give unpredictable results (e.g., those that modify data). It would be
safer to list the methods you want to implement explicitly, or to just
liberally use np.asarray. The latter is generally good practice when
writing library code anyway, to catch unusual ndarray subclasses like
np.matrix.
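
The two safer options could look roughly like the sketch below. It again
uses a numpy structured array with hypothetical 'real' and 'imag' fields
in place of the netCDF4 variable, and total_power is a made-up example of
a library-style function, not part of any package.

```python
import numpy as np


class WrapComplex(object):
    """Wrapper exposing only explicitly chosen, read-only methods."""

    def __init__(self, nc_var):
        self.nc_var = nc_var

    def __getitem__(self, item):
        return np.asarray(self.nc_var[item]).view(np.complex128)

    def __array__(self, dtype=None, copy=None):
        return np.asarray(self[...], dtype=dtype)

    # Option 1: list the methods you want, instead of a blanket __getattr__.
    def mean(self, *args, **kwargs):
        return np.asarray(self).mean(*args, **kwargs)

    def conj(self):
        return np.asarray(self).conj()


# Option 2: coerce inputs with np.asarray inside library code.
def total_power(data):
    arr = np.asarray(data)  # accepts ndarrays, subclasses, and wrappers
    return float(np.sum(np.abs(arr) ** 2))


# Stand-in for the netCDF4 variable (assumption: field names are made up).
raw = np.zeros(4, dtype=[('real', 'f8'), ('imag', 'f8')])
raw['real'] = [1, 2, 3, 4]
raw['imag'] = [0, 1, 0, -1]

cplx_data = WrapComplex(raw)
mean_value = cplx_data.mean()
power = total_power(cplx_data)
```

Methods that modify data in place are deliberately left out, since they
would act on a temporary copy and the change would be lost.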

Stephan


On Sat, Mar 29, 2014 at 8:42 PM, G Jones <glenn.caltech at gmail.com> wrote:

> Hi Stephan,
> Thanks for the reply. I was thinking of something along these lines but
> was hesitant because while this provides clean access to chunks of the
> data, you still have to remember to do cplx_data[:].mean() for example in
> the case that you want cplx_data.mean().
>
> I was hoping to basically have all of the ndarray methods at hand without
> any indexing, but then also being smart about taking advantage of the mmap
> when possible. But perhaps your solution is the best compromise.
>
> Thanks again,
> Glenn
> On Mar 29, 2014 10:59 PM, "Stephan Hoyer" <shoyer at gmail.com> wrote:
>
>> Hi Glenn,
>>
>> My usual strategy for this sort of thing is to make a light-weight
>> wrapper class which reads and converts values when you access them. For
>> example:
>>
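>> import netCDF4
>>
>> class WrapComplex(object):
>>     """Convert compound (real, imag) records to complex on access."""
>>     def __init__(self, nc_var):
>>         self.nc_var = nc_var
>>
>>     def __getitem__(self, item):
>>         # Only the indexed records are read before conversion.
>>         return self.nc_var[item].view('complex')
>>
>> nc = netCDF4.Dataset('my.nc')
>> cplx_data = WrapComplex(nc.groups['mygroup'].variables['cplx_stuff'])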
>>
>> Now you can index cplx_data (e.g., cplx_data[:10]) and only the values
>> you need will be read from disk and converted on the fly.
>>
>> Hope this helps!
>>
>> Cheers,
>> Stephan
>>
>>
>>
>>
>> On Sat, Mar 29, 2014 at 6:13 PM, G Jones <glenn.caltech at gmail.com> wrote:
>>
>>> Hi,
>>> I am using netCDF4 to store complex data using the recommended strategy
>>> of creating a compound data type with the real and imaginary parts. This
>>> all works well, but reading the data into a numpy array is a bit clumsy.
>>>
>>> Typically I do:
>>>
>>> nc = netCDF4.Dataset('my.nc')
>>> cplx_data = nc.groups['mygroup'].variables['cplx_stuff'][:].view('complex')
>>>
>>> which directly gives a nice complex numpy array. This is OK for small
>>> arrays, but is wasteful if I only need some chunks of the array because it
>>> reads all the data in, reducing the utility of the mmap feature of netCDF.
>>>
>>> I'm wondering if there is a better way to directly make a numpy array
>>> view that uses the netcdf variable's memory mapped buffer directly. Looking
>>> at the Variable class, there is no access to this buffer directly which
>>> could then be passed to np.ndarray(buffer=...).
>>>
>>> Any ideas of simple solutions to this problem?
>>>
>>> Thanks,
>>> Glenn
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>>
>>