Hi Glenn,

My usual strategy for this sort of thing is to make a lightweight wrapper class that reads and converts values only when you access them. For example:

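import netCDF4

class WrapComplex(object):
    """Lazily expose a compound (real, imag) netCDF variable as complex."""
    def __init__(self, nc_var):
        self.nc_var = nc_var

    def __getitem__(self, item):
        # Read only the requested slice, then reinterpret it as complex.
        return self.nc_var[item].view('complex')

nc = netCDF4.Dataset('my.nc')
cplx_data = WrapComplex(nc.groups['mygroup'].variables['cplx_stuff'])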

Now you can index cplx_data (e.g., cplx_data[:10]), and only the values you ask for will be read from disk. The conversion happens on the fly as a view of the array that was just read, so it does not add another copy.
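
To check the lazy behaviour without a netCDF file, here is a minimal sketch. FakeVariable is a hypothetical stand-in for the netCDF4 variable (it only records what was requested), and the field names 'real' and 'imag' are assumed. I spell the target dtype as np.complex128 here to make the 16-byte match with the compound type explicit.

```python
import numpy as np

# Assumed compound layout: two float64 fields per element.
CPLX_DTYPE = np.dtype([('real', 'f8'), ('imag', 'f8')])


class FakeVariable(object):
    """Hypothetical stand-in for a netCDF4 variable; logs each request."""
    def __init__(self, data):
        self._data = data
        self.requests = []

    def __getitem__(self, item):
        self.requests.append(item)
        # Copy to mimic a fresh read of just this slice.
        return self._data[item].copy()


class WrapComplex(object):
    def __init__(self, nc_var):
        self.nc_var = nc_var

    def __getitem__(self, item):
        # Reinterpret the (real, imag) pairs as complex numbers.
        return self.nc_var[item].view(np.complex128)


raw = np.zeros(1000, dtype=CPLX_DTYPE)
raw['real'] = np.arange(1000)
raw['imag'] = -np.arange(1000)

var = FakeVariable(raw)
cplx_data = WrapComplex(var)

chunk = cplx_data[:10]   # only this slice is requested from the variable
print(chunk.dtype, chunk.shape)
print(var.requests)
```

The wrapper only forwards indexing. If you also need things like shape or len(), you would have to forward those to the wrapped variable yourself.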

Hope this helps!

Cheers,
Stephan




On Sat, Mar 29, 2014 at 6:13 PM, G Jones <glenn.caltech@gmail.com> wrote:

Hi,
I am storing complex data with netCDF4, using the recommended strategy of creating a compound data type with real and imaginary parts. This all works well, but reading the data into a numpy array is a bit clumsy.
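
As a sketch of the layout in question, in plain NumPy only: the field names 'real' and 'imag' and the name complex128_t are assumptions for illustration. In netCDF4-python a dtype like this would be what gets passed to Dataset.createCompoundType.

```python
import numpy as np

# Assumed field names; each element is 16 bytes, the same as complex128.
complex128_t = np.dtype([('real', np.float64), ('imag', np.float64)])

z = np.array([1 + 2j, 3 - 4j])

# Complex -> compound: reinterpret the same bytes, no copy.
packed = z.view(complex128_t)

# Compound -> complex: the reverse reinterpretation.
unpacked = packed.view(np.complex128)

print(packed['real'], packed['imag'])
print(np.shares_memory(packed, z))
```

The view works only because the two float64 fields sit back to back in the same order as the real and imaginary parts of a complex128.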

Typically I do:

nc = netCDF4.Dataset('my.nc')
cplx_data = nc.groups['mygroup'].variables['cplx_stuff'][:].view('complex')

which directly gives a nice complex numpy array. This is OK for small arrays, but is wasteful if I only need some chunks of the array because it reads all the data in, reducing the utility of the mmap feature of netCDF.

I'm wondering if there is a better way to make a numpy array view that uses the netCDF variable's memory-mapped buffer directly. Looking at the Variable class, I see no direct access to this buffer, which could otherwise be passed to np.ndarray(buffer=...).

Any ideas of simple solutions to this problem?

Thanks,
Glenn


_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion