On Thu, 29 Nov 2018 at 09:13, Gregory P. Smith firstname.lastname@example.org wrote:
Q: Are there other popular alternatives to fill that niche that we should strongly consider instead or as well?
5 years ago the answer would've been Snappy. 15 years ago the answer would've been LZO.
Today LZ4 hits a sweet spot for fast compression and decompression at the lower compression ratio end of the spectrum, offering significantly faster compression and decompression than zlib or bz2, but not as high compression ratios (at usable speeds). It's also had time to stabilize, and a standard frame format for compressed data has been adopted by the community.
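To make the speed versus ratio trade-off concrete, here is a minimal sketch that times one-shot compression with the stdlib codecs and, if it happens to be installed, the PyPI lz4 package via its frame format bindings (lz4.frame). The benchmark() helper and the sample payload are made up for illustration, and the numbers it prints depend heavily on the input data and the machine, so treat it as a way to poke at the trade-off rather than a proper benchmark.

```python
import bz2
import lzma
import time
import zlib

try:
    # Third-party package from PyPI; optional for this sketch.
    import lz4.frame as lz4_frame
except ImportError:
    lz4_frame = None


def benchmark(data):
    """Hypothetical helper: time one-shot compression for each codec.

    Returns {codec_name: {"ratio": float, "seconds": float}}.
    """
    codecs = {
        "zlib": (zlib.compress, zlib.decompress),
        "bz2": (bz2.compress, bz2.decompress),
        "lzma": (lzma.compress, lzma.decompress),
    }
    if lz4_frame is not None:
        codecs["lz4"] = (lz4_frame.compress, lz4_frame.decompress)

    results = {}
    for name, (compress, decompress) in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        # Sanity check: the round trip must reproduce the input exactly.
        assert decompress(packed) == data
        results[name] = {
            "ratio": len(data) / len(packed),
            "seconds": elapsed,
        }
    return results


if __name__ == "__main__":
    # Highly repetitive sample text; real data will compress far less well.
    sample = b"The quick brown fox jumps over the lazy dog. " * 20000
    for name, stats in benchmark(sample).items():
        print(f"{name:5s} ratio={stats['ratio']:8.1f} "
              f"compress={stats['seconds'] * 1000:.2f} ms")
```

All codecs run at their default settings here; each one also takes a compression level parameter that moves it along the same speed/ratio curve.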
The other main contenders in town are zstd, which was mentioned earlier in the thread, and brotli. Both are based on dictionary compression. Zstd is very impressive, offering high compression ratios, but is being very actively developed at present, so is a bit more of a moving target. Brotli is in the same ballpark as Zstd. Both cover a higher compression ratio end of the spectrum than lz4 does. Some nice visualizations are here (although the data is now a bit out of date: lz4 has had some speed improvements at the higher compression ratio end):
I suggest not rabbit-holing this on whether we should adopt a top level namespace for these such as "compress". A good question to ask, but we can resolve that larger topic on its own without blocking anything.
It's funny, but I had gone around in that loop in my head ahead of sending my email. My thinking was: there's a real need for some unification and simplification in the compression space, but I'll work on integrating LZ4, and in the process look at opportunities for the new interface design. I'm a fan of learning through iteration, rather than spending 5 years designing the ultimate compression abstraction and then finding a corner case that it doesn't fit.
lz4 has claimed the global PyPI lz4 module namespace today, so moving it to the stdlib under that name is normal: a pretty transparent transition. If we do that, the PyPI version of lz4 should remain for use on older CPython versions, but effectively be frozen, never to gain new features once lz4 has landed in its first actual CPython release.
Yes, that was what I was presuming would be the path forward.