14 Oct 2020, 11:01 a.m.
On Tue, 13 Oct 2020 05:58:45 -0000 "Ma Lin" <malincns@163.com> wrote:
I heard that in the data science domain the data is often huge, such as hundreds of GB or more. If people could make full use of multi-core CPUs for compression, the experience would be much better than with zlib.
This is true, but in data science it is extremely beneficial to use specialized file formats such as Parquet (which, incidentally, can use zstd under the hood). In that case, compression is built into the Parquet implementation and does not depend on zstd being available in the Python standard library.

Regards

Antoine.