[issue2389] Array pickling exposes internal memory representation of elements

Hrvoje Nikšić report at bugs.python.org
Tue Mar 18 15:38:12 CET 2008


New submission from Hrvoje Nikšić <hniksic at gmail.com>:

It would seem that pickling arrays directly exposes the underlying
machine words, making the pickle non-portable to platforms with a
different layout of array elements.  The guts of array.__reduce__ look
like this:

	if (array->ob_size > 0) {
		result = Py_BuildValue("O(cs#)O", 
			array->ob_type, 
			array->ob_descr->typecode,
			array->ob_item,
			array->ob_size * array->ob_descr->itemsize,
			dict);
	}

The byte string that is pickled is directly created from the array's
contents.  Unpickling calls array_new which in turn calls
array_fromstring, which ends up memcpying the string data to the new array.

As far as I can tell, array pickles created on one platform cannot be
unpickled on a platform with different endianness (in the case of
integer arrays), wchar_t size (in the case of unicode arrays) or
floating-point representation (rare in practice, but possible).  If
pickles are supposed to be platform-independent, this should be fixed.
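
To make the endianness case concrete, here is a minimal sketch.  It
assumes a modern Python 3, where the array methods are called tobytes
and frombytes (the 2.x equivalents are tostring and fromstring), and the
helper name seen_on_opposite_endian is made up for illustration.  It
does not go through pickle at all; it only shows what happens when raw
element bytes written in one byte order are read back in the other.

```python
import array
import struct
import sys


def seen_on_opposite_endian(values, typecode="i"):
    """Return the values a machine of the opposite byte order would
    read if handed this machine's raw array bytes unchanged.

    Hypothetical helper, for illustration only.
    """
    raw = array.array(typecode, values).tobytes()  # native memory image
    other = array.array(typecode)
    other.frombytes(raw)   # what a memcpy-style load does
    other.byteswap()       # simulate the other platform's interpretation
    return other.tolist()


# The same idea with explicit byte orders, independent of the host:
# the int 1 packed little-endian and read back big-endian.
misread = struct.unpack(">i", struct.pack("<i", 1))[0]

if __name__ == "__main__":
    print("host byte order:", sys.byteorder)
    print("misread 1 as:", misread)
    print("array [1, 2, 3] misread as:", seen_on_opposite_endian([1, 2, 3]))
```

The wchar_t and floating-point cases fail in an analogous way, but they
cannot be simulated with a simple byte swap, so they are left out here.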

Maybe the "typecode" field when used with the constructor could be
augmented to include information about the elements, such as endianness
and floating-point format.  Or we should simply punt and pickle the
array as a list of Python objects that comprise it...?
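
Both ideas can be sketched in pure Python.  This is only an
illustration under stated assumptions, not a proposed patch: it assumes
a modern Python 3 (tobytes/frombytes), the function names and the shape
of the state tuple are invented here, and it handles byte order and item
size only, not differing floating-point formats.

```python
import array
import sys


def dump_as_list(arr):
    """The 'punt' option: typecode plus a plain list of Python objects."""
    return (arr.typecode, arr.tolist())


def load_from_list(state):
    typecode, items = state
    return array.array(typecode, items)


def dump_tagged(arr):
    """The 'augmented typecode' option: raw bytes in a fixed
    (little-endian) order, with the item size recorded alongside."""
    data = array.array(arr.typecode, arr)  # copy, so arr is not modified
    if sys.byteorder == "big":
        data.byteswap()
    return (arr.typecode, arr.itemsize, "little", data.tobytes())


def load_tagged(state):
    typecode, itemsize, byteorder, raw = state
    arr = array.array(typecode)
    if arr.itemsize != itemsize:
        # Element width differs on this platform (e.g. C long, wchar_t);
        # a real fix would need to convert rather than refuse.
        raise ValueError("item size mismatch for typecode %r" % typecode)
    arr.frombytes(raw)
    if sys.byteorder != byteorder:
        arr.byteswap()
    return arr


if __name__ == "__main__":
    a = array.array("i", [1, -2, 3])
    print(load_from_list(dump_as_list(a)) == a)
    print(load_tagged(dump_tagged(a)) == a)
```

The list form is the simplest and sidesteps representation issues
entirely, at the cost of size and speed for large arrays; the tagged
form stays compact but has to enumerate every representation detail it
wants to be portable across.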

----------
components: Extension Modules
messages: 63915
nosy: hniksic
severity: normal
status: open
title: Array pickling exposes internal memory representation of elements
type: behavior
versions: Python 2.5, Python 2.6

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2389>
__________________________________

