[Numpy-discussion] Reading a big netcdf file
Christopher Barker
Chris.Barker at noaa.gov
Wed Aug 3 17:15:06 EDT 2011
On 8/3/11 1:57 PM, Gökhan Sever wrote:
> This is what I get here:
>
> In [1]: a = np.zeros((21601, 10801), dtype=np.uint16)
>
> In [2]: a.tofile('temp.npa')
>
> In [3]: del a
>
> In [4]: timeit a = np.fromfile('temp.npa', dtype=np.uint16)
> 1 loops, best of 3: 251 ms per loop
so that's about 10 times faster than my machine. I didn't think disks
had gotten much faster -- they are still generally 7200 rpm (or slower
in laptops).
So I've either got a really slow disk, or you have a really fast one (or
both), or maybe you're getting a cache effect, since you wrote the file
just before reading it.
repeating, doing just what you did:
In [8]: timeit a = np.fromfile('temp.npa', dtype=np.uint16)
1 loops, best of 3: 2.53 s per loop
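Outside IPython, roughly the same measurement can be sketched with the standard timeit module. This is only a minimal sketch, not the session above: the function name time_fromfile and the small default shape are assumptions chosen so that it runs quickly; pass shape=(21601, 10801) for the full-size test. Because the file is written immediately before it is read, the result may well reflect the OS cache rather than the disk.

```python
import os
import tempfile
import timeit

import numpy as np


def time_fromfile(shape=(2000, 1000), dtype=np.uint16, repeat=3):
    """Write a zero array to a raw file, then time reading it back.

    Returns (best_seconds, n_bytes). The name and defaults are
    illustrative assumptions, not part of the original test.
    """
    a = np.zeros(shape, dtype=dtype)
    n_bytes = a.nbytes
    # Use a temporary directory so the file is removed afterwards.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "temp.npa")
        a.tofile(path)
        del a
        # One read per measurement, best of `repeat`, similar in spirit
        # to IPython's "1 loops, best of 3".
        times = timeit.repeat(
            lambda: np.fromfile(path, dtype=dtype), number=1, repeat=repeat
        )
    return min(times), n_bytes


if __name__ == "__main__":
    best, n_bytes = time_fromfile()
    print(f"read {n_bytes / 2**20:.1f} MiB, best of 3: {best:.4f} s")
```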
then I wrote a bunch of others to disk, and tried again:
In [17]: timeit a = np.fromfile('temp.npa', dtype=np.uint16)
1 loops, best of 3: 2.45 s per loop
so it seems I'm not seeing cache effects, but maybe you are.
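As a rough check on the cache hypothesis, the timings above can be converted to throughput. The sketch below uses only the array shape and the timings quoted in this thread; the helper name throughput_mib_per_s is made up for illustration.

```python
# Array from the test: 21601 x 10801 elements of uint16 (2 bytes each).
n_bytes = 21601 * 10801 * 2

# Timings quoted in this thread, in seconds.
timings = {
    "Gökhan (251 ms)": 0.251,
    "Chris, first run (2.53 s)": 2.53,
    "Chris, after writing other files (2.45 s)": 2.45,
}


def throughput_mib_per_s(n_bytes, seconds):
    """Average read rate in MiB/s for n_bytes read in the given time."""
    return n_bytes / seconds / 2**20


print(f"file size: {n_bytes / 2**20:.0f} MiB")
for label, seconds in timings.items():
    print(f"{label}: {throughput_mib_per_s(n_bytes, seconds):.0f} MiB/s")
```

This works out to about 1770 MiB/s for the 251 ms run and about 175 to 180 MiB/s for the 2.5 s runs. The first figure is far more than a single 7200 rpm disk would be expected to sustain, which is consistent with the file being served from the OS cache.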
Anyway, we haven't heard from the OP -- I'm not sure what s/he thought
was slow.
-Chris
>
> On Wed, Aug 3, 2011 at 10:50 AM, Christopher Barker
> <Chris.Barker at noaa.gov> wrote:
>
> On 8/3/11 9:30 AM, Kiko wrote:
> > I'm trying to read a big netcdf file (445 Mb) using netcdf4-python.
>
> I've never noticed that netCDF4 was particularly slow for reading
> (writing can be pretty slow sometimes). How slow is slow?
>
> > The data are described as:
>
> please post the results of:
>
> ncdump -h the_file_name.nc
>
> So we can see if there is anything odd in the structure (though I don't
> know what it might be)
>
> Post your code (in the simplest form you can).
>
> and post your timings and machine type
>
> Is the file netcdf4 or 3 format? (the python lib will read either)
>
> As a reference, reading that much data in from a raw file into a numpy
> array takes 2.57 s on my machine (a rather old Mac, but disks haven't
> gotten much faster). You can test that like this:
>
> a = np.zeros((21601, 10801), dtype=np.uint16)
>
> a.tofile('temp.npa')
>
> del a
>
> timeit a = np.fromfile('temp.npa', dtype=np.uint16)
>
> (using ipython's timeit)
>
> -Chris
>
>
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R (206) 526-6959 voice
> 7600 Sand Point Way NE (206) 526-6329 fax
> Seattle, WA 98115 (206) 526-6317 main reception
>
> Chris.Barker at noaa.gov
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
>
> --
> Gökhan
>
>
>
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov