This works if run from Py3. Don't know if it will *always* work. From that
GH discussion you linked, it sounds like that is a bit of a hack.
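
For reference, a minimal sketch of the kind of workaround that GitHub discussion describes. It assumes the attribute comes back under Python 3 as the raw pickle bytes (the b'(lp1\ncnumpy...' string you quoted). The helper name load_py2_pickle is made up for illustration, and the byte string in the demo is a hand-written Python 2 pickle of a plain str, not data from your files.

```python
import pickle


def load_py2_pickle(raw, encoding="latin1"):
    """Unpickle bytes that were written by Python 2 (hypothetical helper).

    Python 3 decodes Python 2 `str` objects as ASCII by default, which
    fails for binary payloads such as numpy array buffers. Decoding as
    latin1 maps every byte to one code point, so nothing is rejected.
    """
    if not isinstance(raw, bytes):
        # Already a real Python object, nothing to unpickle.
        return raw
    return pickle.loads(raw, encoding=encoding)


# Hand-written Python 2 (protocol 0) pickle of the str '\xf0abc'.
py2_pickle = b"S'\\xf0abc'\np0\n."

try:
    pickle.loads(py2_pickle)  # default encoding is ASCII
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

restored = load_py2_pickle(py2_pickle)
print(default_failed, repr(restored))
```

In your script this would be used as result = load_py2_pickle(fpt.get_node_attr("/", "list_of_arrays")). Whether that holds for every one of your files is exactly the part I am not sure about.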
However, I would consider defining some sort of v2 of your HDF file format,
which converts all of the lists of arrays to CArrays or EArrays in the HDF
file. (https://pytables.github.io/usersguide/libref/homogenous_storage.html)
Otherwise, what is the advantage of using HDF files over just plain
shelves?... Just a thought.
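
To make the v2 idea concrete, here is a rough sketch that stores each array of the list as its own CArray inside a group, so no pickling is involved at all. The function names, the group name, the arr_%06d naming scheme and the file name are all assumptions for illustration, not anything from your code base.

```python
import os
import tempfile

import numpy as np
import tables as tb


def save_arrays(path, arrays, group_name="list_of_arrays"):
    """Store each array as a separate CArray under one group (sketch)."""
    with tb.open_file(path, mode="w") as h5:
        group = h5.create_group("/", group_name)
        for i, arr in enumerate(arrays):
            # Zero-padded names keep the original list order recoverable.
            h5.create_carray(group, "arr_%06d" % i, obj=arr)


def load_arrays(path, group_name="list_of_arrays"):
    """Read the arrays back in list order (sketch)."""
    with tb.open_file(path, mode="r") as h5:
        group = h5.get_node("/", group_name)
        count = group._v_nchildren
        return [h5.get_node(group, "arr_%06d" % i).read()
                for i in range(count)]


# Same test data as in the original script.
data = [np.linspace(0.0, 5.0, 6), np.linspace(0.0, 10.0, 11)]
path = os.path.join(tempfile.mkdtemp(), "tstdat_v2.h5")
save_arrays(path, data)
restored = load_arrays(path)
print(restored)
```

Since the arrays are stored natively in HDF5, the file should not care which Python version wrote it. A one-off conversion script run under Python 2 could read the old attributes and write them out in this layout.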
Ryan
On Thu, Mar 5, 2015 at 2:52 AM, Arnd Baecker <arnd.baecker at web.de> wrote:
> Dear all,
>
> when preparing the transition of our repositories from python 2
> to python 3, I encountered a problem loading pytables (.h5) files
> generated using python 2.
> I suspect that it is caused by a problem with pickling numpy arrays
> under python 3:
>
> The code appended at the end of this mail works
> fine on either python 2.7 or python 3.4, however,
> generating the data on python 2 and trying to load
> them on python 3 gives some strange string
> ( b'(lp1\ncnumpy.core.multiarray\n_reconstruct\np2\n(cnumpy\nndarray ...)
> instead of
> [array([ 0., 1., 2., 3., 4., 5.]),
> array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])]
>
> The problem sounds very similar to the one reported here
> https://github.com/numpy/numpy/issues/4879
> which was fixed with numpy 1.9.
>
> I tried different versions/combinations of numpy (including 1.9.2)
> and always end up with the above result.
> Also I tried to reduce the problem down to the level of pure numpy
> and pickle (as in the above bug report):
>
> import numpy as np
> import pickle
> arr1 = np.linspace(0.0, 1.0, 2)
> arr2 = np.linspace(0.0, 2.0, 3)
> data = [arr1, arr2]
>
> p = pickle.dumps(data)
> print(pickle.loads(p))
> p
>
> Using the resulting string for p as input string
> (with b added at the beginning) under python 3 gives
> UnicodeDecodeError: 'ascii' codec can't decode
> byte 0xf0 in position 14: ordinal not in range(128)
>
>
> Can someone reproduce the problem with pytables?
> Is there maybe a work-around?
> (And no: I can't re-generate the "old" data files - it's
> hundreds of .h5 files ... ;-).
>
> Many thanks, best, Arnd
>
>
> ##############################################################################
> """Illustrate problem with pytables data - python 2 to python 3."""
>
> from __future__ import print_function
>
> import sys
> import numpy as np
> import tables as tb
>
>
> def main():
>     """Run the example."""
>     print("np.__version__=", np.__version__)
>     check_on_same_version = False
>
>     arr1 = np.linspace(0.0, 5.0, 6)
>     arr2 = np.linspace(0.0, 10.0, 11)
>     data = [arr1, arr2]
>
>     # Only generate on python 2.X or check on the same python version:
>     if sys.version < "3.0" or check_on_same_version:
>         fpt = tb.open_file("tstdat.h5", mode="w")
>         fpt.set_node_attr(fpt.root, "list_of_arrays", data)
>         fpt.close()
>
>     # Load the saved file:
>     fpt = tb.open_file("tstdat.h5", mode="r")
>     result = fpt.get_node_attr("/", "list_of_arrays")
>     fpt.close()
>     print("Loaded:", result)
>
> main()
