[Numpy-discussion] saving groups of numpy arrays to disk

Robert Kern robert.kern at gmail.com
Fri Aug 26 12:05:19 EDT 2011


On Fri, Aug 26, 2011 at 07:04, Derek Homeier
<derek at astro.physik.uni-goettingen.de> wrote:
> On 25.08.2011, at 8:42PM, Chris.Barker wrote:
>
>> On 8/24/11 9:22 AM, Anthony Scopatz wrote:
>>>    You can use Python pickling, if you do *not* have a requirement for:
>>
>> I can't recall why, but it seems pickling of numpy arrays has been
>> fragile and not very performant.
>>
> Hmm, the pure Python version might be, but I've used cPickle for a long time
> and never noted any stability problems.

IIRC, there have been one or two releases where we accidentally broke
the ability to load some old pickles. I think that's the kind of
fragility Chris meant. As for the other kind of stability, we have
had, at times, problems passing unpickled arrays to linear algebra
functions. This is because the SSE instructions used by the optimized
linear algebra package required aligned memory, but the unpickling
machinery did not give us such an option. We do some nasty hacks to
make unpickling performant. The unpickling machinery reads the actual
byte data in as a str object, then passes that to a numpy function to
reconstruct the array object. We simply reuse the memory underlying
the str object. This is a hack, but it's the only way to avoid copying
potentially large amounts of data. This is the cause of the unaligned
memory.
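
To make the alignment point concrete, here is a minimal sketch of how one
might check where an array's data buffer starts and, if needed, copy it
into aligned memory before handing it to a linear algebra routine. The
helper names is_aligned and aligned_copy and the 16-byte default boundary
are assumptions made for illustration; they are not part of numpy. The
sketch also assumes a plain numeric dtype, not object arrays. Whether a
freshly unpickled array actually comes out misaligned depends on the numpy
and Python versions in use, so the demo only prints what it finds.

```python
import pickle

import numpy as np


def is_aligned(arr, boundary=16):
    """Return True if the array's data buffer starts on a `boundary`-byte address."""
    return arr.ctypes.data % boundary == 0


def aligned_copy(arr, boundary=16):
    """Copy `arr` into freshly allocated memory that starts on a `boundary`-byte address.

    Hypothetical helper: over-allocate a byte buffer, then slice it so that
    the first byte used sits on the requested boundary.
    """
    nbytes = arr.nbytes
    buf = np.empty(nbytes + boundary, dtype=np.uint8)
    # Number of bytes to skip so that the data pointer becomes aligned.
    offset = (-buf.ctypes.data) % boundary
    out = buf[offset:offset + nbytes].view(arr.dtype).reshape(arr.shape)
    out[...] = arr  # this is the copy that the unpickling hack avoids
    return out


if __name__ == "__main__":
    original = np.arange(12, dtype=np.float64).reshape(3, 4)
    restored = pickle.loads(pickle.dumps(original, protocol=2))

    # Result varies with the numpy/Python versions in use.
    print("unpickled array aligned:", is_aligned(restored))

    safe = aligned_copy(restored)
    print("copy aligned:", is_aligned(safe))
```

The cost of aligned_copy is exactly the extra copy that the str-reuse
trick was meant to avoid, so it only makes sense for arrays that are about
to be passed to code with a hard alignment requirement.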

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


