# [MATRIX-SIG] Possible bug in Numeric LoadArray function

Paul F. Dubois <dubois1@llnl.gov>
Thu, 8 Jan 1998 08:29:19 -0800

The PDB library has the following strategy: store the data in a format
native to the writing machine and also store info about the machine's
representation. If a reader discovers non-native data, it translates on the
way in. This gives you much better performance in normal cases than always
translating to some standard format.
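
A minimal Python sketch of that strategy follows. It is not PDB's actual file format or API: the two-byte header and the names `dump_ints` and `load_ints` are hypothetical, chosen only to show native data being stored next to a description of the writer's representation, with translation happening only when the reader's machine differs.

```python
import sys
from array import array

def dump_ints(values):
    """Store C ints in the writer's native format, preceded by a tiny
    (hypothetical) header describing that format."""
    data = array("i", values)
    little = 1 if sys.byteorder == "little" else 0
    header = bytes([little, data.itemsize])  # byte order flag, bytes per int
    return header + data.tobytes()

def load_ints(blob):
    """Read the data back, translating only if it is non-native."""
    little, itemsize = bool(blob[0]), blob[1]
    body = blob[2:]
    native = array("i")
    same_order = little == (sys.byteorder == "little")
    if same_order and itemsize == native.itemsize:
        # Normal case: the bytes are already in this machine's format.
        native.frombytes(body)
        return native.tolist()
    # Non-native data: decode each element using the stored description.
    order = "little" if little else "big"
    return [
        int.from_bytes(body[i:i + itemsize], order, signed=True)
        for i in range(0, len(body), itemsize)
    ]
```

On the writing machine, `load_ints(dump_ints([1, -2, 3]))` takes the fast path; a blob whose header names a different byte order or integer size goes through the element-by-element translation instead.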

Certainly there would be no reason to reinvent all that, if that was what
was wanted. But I know nothing about pickling so I will now shut up.

-----Original Message-----
From: Bill White <bill.white@technologist.com>
Cc: bsd@scripps.edu <bsd@scripps.edu>; matrix-sig@python.org
<matrix-sig@python.org>; sanner@joseph.scripps.edu
<sanner@joseph.scripps.edu>
Date: Thursday, January 08, 1998 6:38 AM
Subject: Re: [MATRIX-SIG] Possible bug in Numeric LoadArray function

>
>> > A pickled file for an integer array was created on an SGI Onyx.
>> > This machine has 4 byte ints and 4 byte longs.
>> >
>> > The pickled file would not read on a DEC alpha (64 bit longs).
>> > The reshape function in LoadArray failed because the
>> > byte counts didn't match.
>>
>> I am not surprised - the pickling approach used in NumPy is not really
>> portable. Anyway, it will have to be rewritten to profit from cPickle
>> under Python 1.5. Unfortunately, there is no perfect solution; either
>> pickling must make assumptions about the binary format of its data
>> types (as it does now), or it must apply a conversion, which can
>> become very time consuming for large arrays.
>
>Well, forgive me for saying the obvious thing, but perhaps the best
>approach is to do the following, assuming that people will be moving
>mostly to similar or identical architectures, but with the occasional
>move to a different architecture:
>1.) Define a language for describing numeric representations.  This
>    could be as simple as a bit string, with k bit fields to store
>    the sizes of the various sized integers, a field to store a
>    token to represent the floating point representation, and whatever
>    else needs to be done.
>2.) Add a copy of this record to each pickled object somehow.
>3.) Write routines to translate between non-standard representations.
>    You could go wild with this, but I'll bet most or all needs would
>    be met with routines to translate 2^m byte integers to 2^n byte
>    integers, both signed and unsigned, for n != m, n,m \in {1..3}.
>    That's 3 * 2 * 2 == 12 trivial routines, since the diagonals are
>    just copies.  Also, you'd have to write a floating point conversion
>    routine, which is more complicated.
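
The integer translations in step 3 can be sketched in Python as one generic routine rather than twelve separate ones. This is only an illustration under stated assumptions: the name `convert_ints` and its parameters are hypothetical, a single byte order is assumed for both source and destination, and nothing here reflects the existing NumPy pickling code.

```python
def convert_ints(data, src_size, dst_size, signed, byteorder="little"):
    """Re-encode a buffer of src_size-byte integers as dst_size-byte integers.

    Hypothetical helper: sizes are in bytes (for example 2, 4 or 8).
    """
    if len(data) % src_size != 0:
        raise ValueError("buffer length is not a multiple of the source size")
    if src_size == dst_size:
        # The "diagonal" case: nothing to translate, just copy.
        return bytes(data)
    out = bytearray()
    for i in range(0, len(data), src_size):
        value = int.from_bytes(data[i:i + src_size], byteorder, signed=signed)
        # to_bytes raises OverflowError if the value does not fit when
        # narrowing, so lossy conversions fail loudly instead of truncating.
        out += value.to_bytes(dst_size, byteorder, signed=signed)
    return bytes(out)
```

For instance, widening the 2-byte signed value -2 to 8 bytes sign-extends it, while the unsigned 2-byte value 0xFFFF is zero-extended. A production version would more likely do this in C or with array operations than in a Python loop.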
>
>This way, if you pickle something for your own use later, or for use
>on an identical machine, translation takes constant time.  If you
>pickle something to be sent to a different architecture, there is
>enough information to do the conversion.
>
>I think this is roughly the way DCE or Sun RPC does this.  However, I
>don't know the pickling code, so perhaps it is a silly idea after all.