[Python-Dev] versioned .so files for Python 3.2
John Arbash Meinel
john.arbash.meinel at gmail.com
Wed Jul 7 23:56:23 CEST 2010
Scott Dial wrote:
> On 6/30/2010 2:53 PM, Barry Warsaw wrote:
>> It might be amazing, but it's still a significant overhead. As I've
>> described, multiply that by all the py files in all the distro packages
>> containing Python source code, and then still try to fit it on a CDROM.
>
> I decided to prove to myself that it was not a significant issue to have
> parallel directory structures in a .tar.bz2, and I was surprised to find
> it much worse at that than I had imagined. For example,
>
> # cd /usr/lib/python2.6/site-packages
> # tar --exclude="*.pyc" --exclude="*.pyo" \
> -cjf mercurial.tar.bz2 mercurial
> # du -h mercurial.tar.bz2
> 640K mercurial.tar.bz2
>
> # cp -a mercurial mercurial2
> # tar --exclude="*.pyc" --exclude="*.pyo" \
> -cjf mercurial2.tar.bz2 mercurial mercurial2
> # du -h mercurial2.tar.bz2
> 1.3M mercurial2.tar.bz2
>
I believe the standard (and largest) block size for .bz2 is 900kB, and I
*think* that refers to the uncompressed size. I do know that bz2 can
chain blocks, though, since it compresses a stream of all NULL bytes
extremely well (multiple GB down to kB, IIRC).
There was a question as to whether LZMA would do better here. I'm using
7zip, but .xz should perform similarly.
$ du -sh mercurial*
2.6M mercurial
2.6M mercurial2
366K mercurial.tar.bz2
734K mercurial2.tar.bz2
303K mercurial.7z
310K mercurial2.7z
So LZMA with the 'normal' compression has a big enough window to find
almost all of the redundancy, and 310kB is certainly a very small
increase over the 303kB. And clearly bz2 does not, since 734kB is
actually slightly more than 2x 366kB.
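
The same effect can be reproduced without a Mercurial checkout. What
follows is only a rough sketch using the bz2 and lzma modules from the
Python standard library (lzma needs Python 3.3+). The 2 MiB of seeded
pseudo-random bytes is an assumption standing in for a source tree: it
is simply a payload larger than one bz2 block, and it is incompressible
on its own, so any saving on the doubled input has to come from
spotting the second copy. The name compressed_sizes is made up for this
example.

```python
import bz2
import lzma
import random


def compressed_sizes(payload):
    """Return the compressed length of payload for each codec."""
    return {
        # compresslevel=9 selects the largest bz2 block size (900 kB).
        "bz2": len(bz2.compress(payload, compresslevel=9)),
        # Default .xz preset; its dictionary is several MiB.
        "lzma": len(lzma.compress(payload)),
    }


# Hypothetical stand-in for one copy of a package directory:
# 2 MiB of reproducible pseudo-random bytes.
n = 2 * 1024 * 1024
tree = random.Random(0).getrandbits(8 * n).to_bytes(n, "little")

single = compressed_sizes(tree)
double = compressed_sizes(tree + tree)  # like archiving the tree twice

# Growth factor when the second copy is added.
ratios = {name: double[name] / single[name] for name in single}
for name in sorted(ratios):
    print("%-4s single=%d double=%d ratio=%.2f"
          % (name, single[name], double[name], ratios[name]))
```

With this input the bz2 ratio should land near 2 and the lzma ratio
near 1, which is the same pattern as the du numbers above. Real source
trees compress far better than random bytes, so only the ratios are
comparable, not the absolute sizes.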
John
=:->