[issue2389] Array pickling exposes internal memory representation of elements
Hrvoje Nikšić
report at bugs.python.org
Tue Mar 18 15:38:12 CET 2008
New submission from Hrvoje Nikšić <hniksic at gmail.com>:
It would seem that pickling arrays directly exposes the underlying
machine words, making the pickle non-portable to platforms with
different layout of array elements. The guts of array.__reduce__ look
like this:
    if (array->ob_size > 0) {
        result = Py_BuildValue("O(cs#)O",
                               array->ob_type,
                               array->ob_descr->typecode,
                               array->ob_item,
                               array->ob_size * array->ob_descr->itemsize,
                               dict);
    }
The byte string that is pickled is directly created from the array's
contents. Unpickling calls array_new which in turn calls
array_fromstring, which ends up memcpying the string data to the new array.
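For readers who do not follow the C, here is a rough Python-level sketch of the same round trip. It assumes Python 3 (where tostring/fromstring are spelled tobytes/frombytes); the helper names reduce_like and rebuild_like are made up for illustration and are not part of the array module.

```python
from array import array

def reduce_like(arr):
    # Mirrors the "O(cs#)O" tuple: the type, (typecode, raw bytes), and
    # the instance dict (None for a plain array).
    return (type(arr), (arr.typecode, arr.tobytes()),
            getattr(arr, "__dict__", None))

def rebuild_like(cls, args):
    # Mirrors array_new + array_fromstring: the bytes are copied as-is,
    # with no conversion of any kind.
    typecode, raw = args
    new = cls(typecode)
    new.frombytes(raw)
    return new

a = array("d", [1.5, 2.5])
cls, args, _ = reduce_like(a)
b = rebuild_like(cls, args)
```

On a single machine b compares equal to a, which is why the problem goes unnoticed until the bytes travel to a different platform.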
As far as I can tell, array pickles created on one platform cannot be
correctly unpickled on a platform with a different endianness (for
integer arrays), wchar_t size (for unicode arrays), or floating-point
representation (rare in practice, but possible). If pickles are
supposed to be platform-independent, this should be fixed.
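The endianness case can be illustrated without a second machine. The sketch below uses the struct module to emulate a little-endian writer and a big-endian reader; the function name misread is hypothetical, and 32-bit integers are assumed purely for the demonstration.

```python
import struct

def misread(values, written="<", read=">"):
    # Pack the values with one byte order, then unpack the very same
    # bytes with the other, as a verbatim memcpy on the reader would do.
    fmt = "%di" % len(values)
    raw = struct.pack(written + fmt, *values)
    return list(struct.unpack(read + fmt, raw))

values = [1, 2, 3, 4]
garbled = misread(values)
print(garbled)  # [16777216, 33554432, 50331648, 67108864]
```

No error is raised; the reader simply ends up with different numbers, which is what makes this kind of breakage easy to miss.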
Maybe the "typecode" field when used with the constructor could be
augmented to include information about the elements, such as endianness
and floating-point format. Or we should simply punt and pickle the
array as a list of Python objects that comprise it...?
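Both ideas can be sketched in Python. This is only an illustration of the two options, not a proposed patch: the function names are hypothetical, Python 3 spellings are assumed, and the first variant records only item size and byte order, so it does not address differing floating-point formats.

```python
import sys
from array import array

def dump_tagged(arr):
    # Option 1: keep the raw bytes, but record how they are laid out.
    return (arr.typecode, arr.itemsize, sys.byteorder, arr.tobytes())

def load_tagged(state):
    typecode, itemsize, byteorder, raw = state
    arr = array(typecode)
    if arr.itemsize != itemsize:
        raise ValueError("element size differs on this platform")
    arr.frombytes(raw)
    if byteorder != sys.byteorder:
        arr.byteswap()  # convert from the writer's byte order
    return arr

def dump_as_list(arr):
    # Option 2: punt and store ordinary Python objects.
    return (arr.typecode, arr.tolist())

def load_from_list(state):
    typecode, items = state
    return array(typecode, items)

a = array("i", [1, -2, 30000])
same = load_tagged(dump_tagged(a))
from_list = load_from_list(dump_as_list(a))
```

The list form is the simplest and most portable, at the cost of a larger pickle and a slower load; the tagged form stays compact but has to enumerate every layout difference it wants to survive.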
----------
components: Extension Modules
messages: 63915
nosy: hniksic
severity: normal
status: open
title: Array pickling exposes internal memory representation of elements
type: behavior
versions: Python 2.5, Python 2.6
__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2389>
__________________________________