On 11 March 2010, at 23:50, Lafras Uys wrote:
I need to save a fairly large set of arrays to disk. I have saved it using numpy.savez, and the resulting file is around 11 GB (yes, I did say fairly large ;D). When I try to load it using numpy.load, the zipfile module complains: BadZipfile: Bad magic number for file header
I can't open it with the normal zip utility present on the system, but it could be that it's barfing about files being larger than 2 GB. Is there some file size limit for npz files?
Yes, the ZIP file format has a 4GB limit. Unfortunately, Python does not yet support the ZIP64 format.
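For a rough sense of the 4GB limit mentioned above: the classic ZIP format stores sizes and offsets in unsigned 32-bit fields. Below is a minimal sketch, not anything numpy does itself, that checks a planned set of arrays against that limit before writing. The helper name `fits_classic_zip` and the example sizes are hypothetical, and header overhead is ignored.

```python
# Classic (non-ZIP64) ZIP stores sizes and offsets in unsigned 32-bit fields.
ZIP32_MAX_BYTES = 2**32 - 1


def fits_classic_zip(member_sizes):
    """Return True if each member and the combined payload fit in 32 bits.

    member_sizes: iterable of uncompressed sizes in bytes (hypothetical input).
    Header overhead is ignored, so this is an optimistic estimate.
    """
    sizes = list(member_sizes)
    each_fits = all(size <= ZIP32_MAX_BYTES for size in sizes)
    total_fits = sum(sizes) <= ZIP32_MAX_BYTES
    return each_fits and total_fits


# Eleven arrays of 10**9 bytes each, roughly the 11 GB case in this thread.
print(fits_classic_zip([10**9] * 11))  # False
print(fits_classic_zip([10**9] * 3))   # True
```

If the check fails, one workaround is to split the arrays across several smaller files.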
Is there any way I can recover the data? (I guess I could try decompressing the file with 7z and extracting the individual npy files.)
Possibly. However, if the normal zip utility isn't working, 7z probably won't, either. Worth a try, though.
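The same attempt can also be made from Python with the standard zipfile module. This is only a sketch, and it may well fail on a damaged archive in the same way the zip utility does. The function name `try_extract_npy` and the paths in the usage note are hypothetical.

```python
import zipfile
import zlib


def try_extract_npy(archive_path, out_dir):
    """Try to extract the .npy members of an archive into out_dir.

    Returns the list of member names that were extracted. Returns an
    empty list if the file cannot be opened as a ZIP archive at all.
    """
    try:
        with zipfile.ZipFile(archive_path) as zf:
            extracted = []
            for name in zf.namelist():
                if not name.endswith(".npy"):
                    continue
                try:
                    zf.extract(name, out_dir)
                    extracted.append(name)
                except (zipfile.BadZipFile, zlib.error, OSError, EOFError) as exc:
                    # A damaged member may leave a partial file behind.
                    print(f"could not extract {name}: {exc}")
            return extracted
    except zipfile.BadZipFile as exc:
        print(f"not readable as a ZIP archive: {exc}")
        return []
```

For example, `try_extract_npy("data.npz", "recovered")` would write whatever members it can read into the `recovered` directory, and each extracted .npy file can then be passed to numpy.load individually.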
I've had similar problems; my solution was to move to HDF5. There are two options for accessing and working with HDF files from Python: h5py (http://code.google.com/p/h5py/) and pytables (http://www.pytables.org/). Both packages have built-in numpy support.
Regards, Lafras
I've experienced similar issues too, but I moved to NetCDF. The only disadvantage was that I did not find any Python modules that work well _and_ support numpy. Hence, I am considering moving to HDF5. Which Python module would people here recommend? (Or, alternatively, did I miss a great NetCDF Python module that someone could tell me about?) Cheers, Paul.