Re: [Python-Dev] PEP 574 (pickle 5) implementation and backport available

Hi all,

I agree that compression is often a good idea when moving serialized objects around on a network, but for what it's worth, as a library author I would always set compress=False and then handle compression myself as a separate step. There are a few reasons for this:

1. Bandwidth is often pretty good, especially intra-node, on high-performance networks, or on decent modern disks (NVMe).
2. I often use different compression technologies in different situations. LZ4 is a great all-around default, but often snappy, blosc, or zstandard is better suited. This depends strongly on the characteristics of the data.
3. Very often data isn't compressible, or is already in some compressed form, such as images, and so compressing only hurts you.

In general, my thought is that compression is a complex topic with enough intricacies that setting a single sane default that works 70+% of the time probably isn't possible (at least not with the applications that I get exposed to).

Instead of baking a particular method into pickle.dumps, I would recommend trying to solve this problem through documentation, pointing users to the various compression libraries within the broader Python ecosystem, and perhaps pointing to one of the many blog posts that discuss their strengths and weaknesses.

Best,
-matt
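For readers following along, the "pickle first, compress as a separate step" approach described above might look roughly like the sketch below. It is only an illustration: the helper names, the one-byte tag, and the `min_saving` threshold are assumptions made up for this example, and zlib stands in for LZ4, snappy, blosc, or zstandard only because it ships with the standard library.

```python
import os
import pickle
import zlib

# Hypothetical one-byte tags marking whether the body was compressed.
TAG_COMPRESSED = b"Z"
TAG_RAW = b"R"


def dumps_then_compress(obj, compress=zlib.compress, min_saving=0.1):
    """Pickle obj, then compress the bytes only if that saves enough space."""
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    compressed = compress(payload)
    # Keep the compressed form only if it is at least min_saving smaller.
    if len(compressed) <= len(payload) * (1 - min_saving):
        return TAG_COMPRESSED + compressed
    # Incompressible or already-compressed data: send the pickle as-is.
    return TAG_RAW + payload


def decompress_then_loads(blob, decompress=zlib.decompress):
    """Reverse dumps_then_compress, decompressing only when the tag says so."""
    tag, body = blob[:1], blob[1:]
    if tag == TAG_COMPRESSED:
        body = decompress(body)
    return pickle.loads(body)


text = "abc" * 10_000        # highly repetitive, compresses well
noise = os.urandom(10_000)   # random bytes, effectively incompressible

text_blob = dumps_then_compress(text)
noise_blob = dumps_then_compress(noise)
```

Because the compressor is just a callable argument, a caller could pass a different codec per situation, which is the flexibility that a single default baked into pickle.dumps would not offer.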

+1 for not adding in-pickle compression, as it is already very easy to handle compression externally (for instance by passing a compressing file object as an argument to the pickler). Furthermore, as PEP 574 makes it possible to stream the buffer bytes directly to the file object without any temporary memory copy, I don't see any benefit in including compression in the pickle protocol.

However, adding lz4.LZ4File to the standard library in addition to gzip.GzipFile and lzma.LZMAFile is probably a good idea, as LZ4 is really fast compared to zlib/gzip. But this is not related to PEP 574.

-- Olivier
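As a rough illustration of "passing a compressing file object as an argument to the pickler", the sketch below wraps a destination in gzip.GzipFile and hands it to pickle.dump. The helper names `dump_compressed` and `load_compressed` are made up for this example, and gzip is used only because it is in the standard library today; any file-like compressor with the same interface should work in the same way.

```python
import gzip
import io
import pickle


def dump_compressed(obj, fileobj):
    """Pickle obj straight into a gzip stream layered over fileobj."""
    # Closing the GzipFile flushes the stream but leaves fileobj open.
    with gzip.GzipFile(fileobj=fileobj, mode="wb") as gz:
        pickle.dump(obj, gz, protocol=pickle.HIGHEST_PROTOCOL)


def load_compressed(fileobj):
    """Read a pickle back from a gzip stream layered over fileobj."""
    with gzip.GzipFile(fileobj=fileobj, mode="rb") as gz:
        return pickle.load(gz)


data = {"values": list(range(1000)), "label": "example"}

buffer = io.BytesIO()          # stands in for a real file or socket wrapper
dump_compressed(data, buffer)
buffer.seek(0)
restored = load_compressed(buffer)
```

The pickler never sees the compression here: it writes to what looks like an ordinary binary file, so nothing in the pickle protocol itself has to change.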

On Sat, 26 May 2018 18:42:42 +0200, Olivier Grisel <olivier.grisel@ensta.org> wrote:
> However adding lz4.LZ4File to the standard library in addition to gzip.GzipFile and lzma.LZMAFile is probably a good idea as LZ4 is really fast compared to zlib/gzip. But this is not related to PEP 574.

If we go that way, we may probably want zstd as well :-). But, yes, most likely unrelated to PEP 574.

Regards

Antoine.

participants (3)
- Antoine Pitrou
- Matthew Rocklin
- Olivier Grisel