[Numpy-discussion] numpy pickling problem - python 2 vs. python 3
Arnd Baecker
arnd.baecker at web.de
Thu Mar 5 02:52:21 EST 2015
Dear all,
when preparing the transition of our repositories from python 2
to python 3, I encountered a problem loading pytables (.h5) files
generated using python 2.
I suspect that it is caused by a problem with pickling numpy arrays
under python 3:
The code appended at the end of this mail works
fine when run entirely on python 2.7 or entirely on python 3.4.
However, generating the data on python 2 and trying to load
them on python 3 gives a raw pickle byte string
( b'(lp1\ncnumpy.core.multiarray\n_reconstruct\np2\n(cnumpy\nndarray ...)
instead of
[array([ 0., 1., 2., 3., 4., 5.]),
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])]
The problem sounds very similar to the one reported here
https://github.com/numpy/numpy/issues/4879
which was fixed with numpy 1.9.
I tried different versions/combinations of numpy (including 1.9.2)
and always end up with the above result.
Also I tried to reduce the problem down to the level of pure numpy
and pickle (as in the above bug report):
import numpy as np
import pickle
arr1 = np.linspace(0.0, 1.0, 2)
arr2 = np.linspace(0.0, 2.0, 3)
data = [arr1, arr2]
p = pickle.dumps(data)
print(pickle.loads(p))
p
Passing the string shown for p on python 2 to pickle.loads
(with a b prefix added at the beginning) under python 3 gives
UnicodeDecodeError: 'ascii' codec can't decode
byte 0xf0 in position 14: ordinal not in range(128)
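One direction that may be worth trying, sketched here with the standard library only: on python 3, `pickle.loads` takes an `encoding` argument that controls how python 2 `str` objects are decoded, and the default `"ASCII"` fails on any byte above 127. The hand-written protocol-0 payload below is an assumption standing in for python 2 output (it is not the numpy pickle from above), and I have not checked this against the .h5 files.

```python
import pickle

# Hand-written protocol-0 pickle of the python 2 str '\xf0'.
# Hypothetical stand-in for the raw buffer inside a python 2 numpy pickle.
py2_payload = b"S'\\xf0'\n."

try:
    pickle.loads(py2_payload)  # default encoding="ASCII"
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

# latin1 maps every byte 0-255 to one code point, so decoding cannot fail.
as_text = pickle.loads(py2_payload, encoding="latin1")
# "bytes" skips decoding and returns the python 2 str as a bytes object.
as_bytes = pickle.loads(py2_payload, encoding="bytes")

print(default_failed, repr(as_text), repr(as_bytes))
```

For pickled numpy arrays, `encoding="latin1"` is probably the variant to try first, since it keeps the byte values of the array buffer intact.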
Can someone reproduce the problem with pytables?
Is there perhaps a workaround?
(And no: I can't re-generate the "old" data files - it's
hundreds of .h5 files ... ;-).
Many thanks, best, Arnd
##############################################################################
"""Illustrate problem with pytables data - python 2 to python 3."""
from __future__ import print_function

import sys

import numpy as np
import tables as tb


def main():
    """Run the example."""
    print("np.__version__=", np.__version__)
    check_on_same_version = False

    arr1 = np.linspace(0.0, 5.0, 6)
    arr2 = np.linspace(0.0, 10.0, 11)
    data = [arr1, arr2]

    # Only generate on python 2.X or check on the same python version:
    if sys.version < "3.0" or check_on_same_version:
        fpt = tb.open_file("tstdat.h5", mode="w")
        fpt.set_node_attr(fpt.root, "list_of_arrays", data)
        fpt.close()

    # Load the saved file:
    fpt = tb.open_file("tstdat.h5", mode="r")
    result = fpt.get_node_attr("/", "list_of_arrays")
    fpt.close()
    print("Loaded:", result)


main()
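
If the attribute really does come back as the raw pickle bytes on python 3, a post-processing step along the following lines might serve as a stopgap. This is only a sketch: `decode_py2_attr` is a hypothetical helper of my own, not part of pytables, and it assumes the files are trusted, since unpickling executes whatever the pickle references.

```python
import pickle


def decode_py2_attr(value):
    """Unpickle value if it is a bytes pickle, otherwise return it unchanged.

    Hypothetical helper: meant for attribute values that python 3
    hands back as raw pickle bytes written by python 2.
    """
    if not isinstance(value, bytes):
        return value
    try:
        # latin1 decodes python 2 str objects byte for byte.
        return pickle.loads(value, encoding="latin1")
    except Exception:
        # Not a (loadable) pickle after all: leave the bytes untouched.
        return value


# Usage idea, applied to the value loaded in main() above:
#     result = decode_py2_attr(fpt.get_node_attr("/", "list_of_arrays"))
```

Values that are not `bytes` pass through unchanged, so the same call would be harmless for files written under python 3.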
