[Python-Dev] PEP 574 (pickle 5) implementation and backport available

Stefan Behnel stefan_ml at behnel.de
Sat May 26 03:12:43 EDT 2018


Antoine Pitrou wrote on 25.05.2018 at 23:11:
> On Fri, 25 May 2018 14:50:57 -0600
> Neil Schemenauer wrote:
>> On 2018-05-25, Antoine Pitrou wrote:
>>> Do you have something specific in mind?  
>>
>> I think compressed by default is a good idea.  My quick proposal:
>>
>> - Use fast compression like lz4 or zlib with Z_BEST_SPEED
>>
>> - Add a 'compress' keyword argument with a default of None.  For
>>   protocol 5, None means to compress.  Providing 'compress' != None
>>   for older protocols will raise an error.
> 
> The question is what purpose does it serve for pickle to do it rather
> than for the user to compress the pickle themselves.  You're basically
> saving one line of code.  Am I missing some other advantage?

Regarding the pickling side, if the pickle is large, then it can save
memory to compress while pickling, rather than compressing after pickling.
But that can also be done with file-like objects, so the advantage is small
here.
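
As a rough sketch of the file-like approach (not part of PEP 574; the
helper names dump_compressed and load_compressed are made up for this
example), the pickler can write straight into a gzip stream, so the
uncompressed pickle never has to exist as one large bytes object:

```python
import gzip
import io
import pickle


def dump_compressed(obj, fileobj, protocol=pickle.HIGHEST_PROTOCOL):
    """Pickle obj into fileobj, compressing the stream as it is written."""
    # compresslevel=1 is zlib's fastest setting (Z_BEST_SPEED).
    # Closing the GzipFile does not close the underlying fileobj.
    with gzip.GzipFile(fileobj=fileobj, mode="wb", compresslevel=1) as gz:
        pickle.dump(obj, gz, protocol=protocol)


def load_compressed(fileobj):
    """Unpickle from a gzip-compressed stream, decompressing on the fly."""
    with gzip.GzipFile(fileobj=fileobj, mode="rb") as gz:
        return pickle.load(gz)


# Round trip through an in-memory buffer.
data = {"values": list(range(1000))}
buf = io.BytesIO()
dump_compressed(data, buf)
buf.seek(0)
restored = load_compressed(buf)
```

Note that the reader still has to know that gzip was used, which is the
point made below.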

I think a major advantage is on the unpickling side rather than the
pickling side. Sure, users can compress a pickle after the fact. But if
there is a standard algorithm (or set of algorithms) that unpickle can
handle automatically, then it's enough to pass "something pickled" into
unpickle. Otherwise, the receiver has to know (or figure out) if and how
that pickle was originally compressed, and then build the decompression
pipeline for it in a way that gets everything uncompressed efficiently
without accidentally wasting memory or processing time.
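
To illustrate what the receiving side has to do today, here is a minimal
sketch that guesses the compression from the leading magic bytes. The
function name load_any and the choice of gzip, bz2 and xz are assumptions
for this example only; pickle itself does nothing like this, and the
sketch assumes a seekable binary file object:

```python
import bz2
import gzip
import io
import lzma
import pickle

# Leading magic bytes of each container format, paired with an opener
# that accepts an existing binary file object.
_OPENERS = (
    (b"\x1f\x8b", gzip.open),
    (b"BZh", bz2.open),
    (b"\xfd7zXZ\x00", lzma.open),
)


def load_any(fileobj):
    """Unpickle from fileobj, transparently decompressing if needed.

    This is a heuristic: it only inspects the first few bytes, and it
    requires fileobj to be seekable so the stream can be rewound.
    """
    head = fileobj.read(6)
    fileobj.seek(0)
    for magic, opener in _OPENERS:
        if head.startswith(magic):
            # Decompress incrementally while unpickling.
            with opener(fileobj, "rb") as stream:
                return pickle.load(stream)
    # No known magic found: treat the input as a plain pickle.
    return pickle.load(fileobj)


payload = pickle.dumps({"answer": 42})
plain = load_any(io.BytesIO(payload))
from_gzip = load_any(io.BytesIO(gzip.compress(payload)))
from_bz2 = load_any(io.BytesIO(bz2.compress(payload)))
from_xz = load_any(io.BytesIO(lzma.compress(payload)))
```

Every application that exchanges compressed pickles ends up writing some
variant of this, which is the duplication a standard mechanism would
remove.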

Obviously, auto-decompression opens the door to compression bombs, but
then, unpickling data from untrusted sources is discouraged anyway, so...
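
For reference, a compression bomb can be contained by capping the
decompressed size. The following is only a sketch using zlib; the helper
name bounded_decompress and the limit values are invented for this
example, and it does nothing to make unpickling untrusted data safe:

```python
import zlib


def bounded_decompress(data, limit):
    """Decompress zlib data, refusing to produce more than limit bytes."""
    decomp = zlib.decompressobj()
    # Ask for at most limit + 1 bytes so an overflow can be detected
    # without ever inflating the whole payload.
    out = decomp.decompress(data, limit + 1)
    if len(out) > limit:
        raise ValueError("decompressed data exceeds %d bytes" % limit)
    if not decomp.eof:
        raise ValueError("compressed stream is truncated")
    return out


small = bounded_decompress(zlib.compress(b"hello"), limit=1024)

# Ten million zero bytes shrink to a few kilobytes when compressed.
bomb = zlib.compress(b"\0" * 10_000_000)
try:
    bounded_decompress(bomb, limit=1_000_000)
    bomb_rejected = False
except ValueError:
    bomb_rejected = True
```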

Stefan


