Hello,
I am trying to save numpy arrays (actually a list of them) for later use and for distribution to others. Up until yesterday, I've been using the zpickle module from the Cookbook, which is just the binary pickle format with gzip compression. Yesterday I upgraded my operating system, and now I can't read those files. I am using numpy 0.9.9.2536; unfortunately I can't recall the version I was using before, but it was relatively recent. I also upgraded from Python 2.3 to 2.4. Trying to load the "old" files, I get:
AttributeError: 'module' object has no attribute 'dtypedescr'
The file consists of a single dictionary with two elements, like:
var={'im': numpy.zeros((5,5)),'im_scale_shift':[0.0,1.0]}
My question isn't how can I load these "old" files, because I can regenerate them. I would like to know what file format I should be using so that I don't have to worry about upgrades/version differences when I want to load them. Is there a preferred way to do this? I thought pickle was that way, but perhaps I don't understand how pickle works.
thanks,
Brian Blais
On Mon, 22 May 2006 at 12:49 -0400, Brian Blais wrote:
> I am trying to save numpy arrays (actually a list of them) for later use and for distribution to others. Up until yesterday, I've been using the zpickle module from the Cookbook, which is just the binary pickle format with gzip compression. Yesterday I upgraded my operating system, and now I can't read those files. I am using numpy 0.9.9.2536; unfortunately I can't recall the version I was using before, but it was relatively recent. I also upgraded from Python 2.3 to 2.4. Trying to load the "old" files, I get:
> AttributeError: 'module' object has no attribute 'dtypedescr'
> The file consists of a single dictionary with two elements, like:
> var={'im': numpy.zeros((5,5)),'im_scale_shift':[0.0,1.0]}
This could be because NumPy objects have undergone some changes in their structure over the last few months. After version 1.0 there will (hopefully) be no more changes in the structure, so your pickles will be more stable (but again, you might have problems in the long run, e.g. when NumPy 2.0 appears).
> My question isn't how can I load these "old" files, because I can regenerate them. I would like to know what file format I should be using so that I don't have to worry about upgrades/version differences when I want to load them. Is there a preferred way to do this? I thought pickle was that way, but perhaps I don't understand how pickle works.
If you need full compatibility, a better approach than pickle-based solutions is the .tofile() method together with fromfile(), but then you need to save the metadata for your objects (type, shape, etc.) separately.
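As a rough illustration of the raw-file approach, here is a minimal sketch that writes the array bytes with .tofile() and keeps the dtype and shape in a small JSON file next to it. The helper names save_raw and load_raw, the ".raw"/".json" file naming, and the choice of JSON for the metadata are assumptions made for this example only; they are not part of NumPy. It is written against a current NumPy and Python 3.

```python
import json
import os
import tempfile

import numpy as np


def save_raw(array, basename):
    """Write raw bytes to <basename>.raw and metadata to <basename>.json.

    Hypothetical helper for illustration; not a NumPy API.
    """
    array = np.asarray(array)
    # tofile() writes only the raw element bytes (in C order), no header.
    array.tofile(basename + ".raw")
    # dtype.str (e.g. '<f8') records both the element type and the byte order.
    meta = {"dtype": array.dtype.str, "shape": list(array.shape)}
    with open(basename + ".json", "w") as fh:
        json.dump(meta, fh)


def load_raw(basename):
    """Rebuild the array from the raw bytes plus the saved metadata."""
    with open(basename + ".json") as fh:
        meta = json.load(fh)
    flat = np.fromfile(basename + ".raw", dtype=np.dtype(meta["dtype"]))
    return flat.reshape(meta["shape"])


im = np.arange(25, dtype=np.float64).reshape(5, 5)
with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "im")
    save_raw(im, base)
    restored = load_raw(base)
```

Because the raw file carries no header at all, losing the metadata file makes the data ambiguous, which is the main cost of this approach compared with a self-describing format.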
If you need full support for saving data and metadata for your NumPy objects in a transparent way that is independent of pickle, you may want to have a look at PyTables [1] or NetCDF4 [2]. Both packages should be able to save NumPy datasets without any need to worry about future changes in NumPy data structures. Both are ultimately based on the HDF5 format [3], which has a pretty strong commitment to backward/forward format compatibility across its versions.
[1] http://www.pytables.org
[2] http://www.cdc.noaa.gov/people/jeffrey.s.whitaker/python/netCDF4.html
[3] http://hdf.ncsa.uiuc.edu/HDF5
Cheers,
Brian Blais wrote:
> Hello,
> I am trying to save numpy arrays (actually a list of them) for later use and for distribution to others. Up until yesterday, I've been using the zpickle module from the Cookbook, which is just the binary pickle format with gzip compression. Yesterday I upgraded my operating system, and now I can't read those files. I am using numpy 0.9.9.2536; unfortunately I can't recall the version I was using before, but it was relatively recent. I also upgraded from Python 2.3 to 2.4. Trying to load the "old" files, I get:
> AttributeError: 'module' object has no attribute 'dtypedescr'
The name "dtypedescr" was changed to "dtype" back in early February. The problem with pickle is that it is quite sensitive to this kind of change. Such changes are actually rare, though they were more common in the early stages of NumPy. Things should be more stable now, and I don't expect changes that will cause pickled NumPy arrays to fail in the future.
If you need to read the data in these files, it is likely possible with a little tweaking.
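To give an idea of what such tweaking can look like in general, here is a minimal sketch of remapping a renamed class at load time by overriding find_class on pickle.Unpickler. Everything named demo_lib, RENAMED and RenamingUnpickler is hypothetical and exists only for this example; the stand-in module merely imitates a library that renamed "dtypedescr" to "dtype". It assumes Python 3 and does not claim to be sufficient for real pickles from old NumPy versions, where more than a name may have changed.

```python
import io
import pickle
import sys
import types

# Hypothetical stand-in for a library that will later rename one of its classes.
lib = types.ModuleType("demo_lib")
sys.modules["demo_lib"] = lib
lib.dtypedescr = type("dtypedescr", (object,), {"__module__": "demo_lib"})

# Pickle an instance while the old name still exists.
obj = lib.dtypedescr()
obj.kind = "float64"
payload = pickle.dumps(obj)

# Simulate the upgrade: same class, new name, old name gone.
lib.dtype = lib.dtypedescr
del lib.dtypedescr

# A plain load now fails, because the pickle stores the class by name.
plain_load_failed = False
try:
    pickle.loads(payload)
except AttributeError:
    plain_load_failed = True

# Map (module, old name) to (module, new name) for names known to have moved.
RENAMED = {("demo_lib", "dtypedescr"): ("demo_lib", "dtype")}


class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that translates old names before looking them up."""

    def find_class(self, module, name):
        module, name = RENAMED.get((module, name), (module, name))
        return super().find_class(module, name)


restored = RenamingUnpickler(io.BytesIO(payload)).load()
```

The point of the sketch is only that a pickle refers to classes by module and name, so a rename can sometimes be bridged with a lookup table at load time.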
While pickle is convenient and the actual data is guaranteed to be readable, reconstructing the data requires that certain names do not change. Many people use other methods for persistence because of this.
-Travis