[pypy-issue] [issue733] bz2 decompression is very slow

Xavier Morel tracker at bugs.pypy.org
Sat Jul 9 15:02:45 CEST 2011

Xavier Morel <bugs.pypy.org at masklinn.net> added the comment:

Pasting observations I put in duplicate 770 on the same problem:

Using a clone of pypy's hg repo (working copy included) as my tar base,
decompressing to fs using `tarfile`.

Test archives were created using bsdtar with default options (`tar cjf` and
`tar czf`); likewise for tar's decompression baseline (`tar xf` in both cases).

hg id of local Pypy clone is 27df060341f0 tip

OS is OSX 10.6.8

Decompressors tested:
* CPython is Python 2.7.2
* Pypy 1.5 is Python 2.7.1 (?, May 22 2011, 11:59:12) [PyPy 1.5.0-alpha0 with
GCC 4.0.1] from macports
* Pypy trunk is Pypy-65b1ed60d7da from nightlies
* Tar is bsdtar 2.6.2 - libarchive 2.6.2

CPython and Pypy were running the exact same script, which can be found at the
end of the comment

All measurements were performed via `time` and are in minutes:seconds; they are
the decompression times.

First I tested the behavior for gzipped files, in order to get an idea of what I
could expect:
* tar: 0:19
* CPython: 0:31
* Pypy 1.5: 0:47
* Pypy trunk: 0:43

Pypy is ~40-50% slower than CPython, itself ~60% slower than the native tar.

Then I tested using a bz2-compressed archive:
* tar: 0:54
* CPython: 1:10
* Pypy 1.5: hard crash
* Pypy trunk: 2:58

Pypy trunk takes about 2.5 times as long as CPython (~150% slower), which is a
significant slowdown. I believe it might be a source of performance issues when
installing bz2-packed modules via pip.
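
For reference, the relative slowdowns can be recomputed from the times listed
above. This is only a sketch of the arithmetic; the helper names `to_seconds`
and `ratio` are made up for illustration, and the numbers are copied from the
two lists in this comment (Pypy 1.5 has no bz2 entry because it crashed).

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string such as '2:58' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Times as reported in this comment, in minutes:seconds.
timings = {
    "gzip": {"tar": "0:19", "CPython": "0:31",
             "Pypy 1.5": "0:47", "Pypy trunk": "0:43"},
    "bz2": {"tar": "0:54", "CPython": "1:10", "Pypy trunk": "2:58"},
}


def ratio(fmt, slower, faster):
    """How many times longer `slower` took than `faster` for a given format."""
    return to_seconds(timings[fmt][slower]) / float(to_seconds(timings[fmt][faster]))


for fmt, slower, faster in [
    ("gzip", "CPython", "tar"),
    ("gzip", "Pypy trunk", "CPython"),
    ("bz2", "CPython", "tar"),
    ("bz2", "Pypy trunk", "CPython"),
]:
    print("%s: %s takes %.2fx the time of %s"
          % (fmt, slower, ratio(fmt, slower, faster), faster))
```

A ratio of 2.54 for bz2 (Pypy trunk vs CPython) corresponds to roughly 154%
more time, against a ratio of about 1.39 for gzip.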

Decompression script:
import tarfile
import sys

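tar = tarfile.open(sys.argv[1])
# The archived message cuts off here; the extraction call below is an
# assumption based on the description "decompressing to fs using `tarfile`".
tar.extractall()
tar.close()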
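
To separate extraction time from interpreter startup (which `time` also
counts), the extraction could be timed from inside the interpreter. The
following is a rough sketch, not the script used for the numbers above;
`time_extract` and the temporary destination directory are hypothetical
additions.

```python
import sys
import tarfile
import tempfile
import time


def time_extract(archive_path, dest_dir):
    """Extract archive_path into dest_dir; return elapsed wall-clock seconds."""
    start = time.time()
    # Mode "r" lets tarfile detect gzip or bz2 compression transparently.
    tar = tarfile.open(archive_path, "r")
    try:
        tar.extractall(dest_dir)
    finally:
        tar.close()
    return time.time() - start


if __name__ == "__main__" and len(sys.argv) > 1:
    # Extract into a fresh temporary directory so runs do not interfere.
    dest = tempfile.mkdtemp()
    elapsed = time_extract(sys.argv[1], dest)
    print("%s: %.1fs (extracted to %s)" % (sys.argv[1], elapsed, dest))
```

Running the same file under CPython and Pypy against the `.tar.gz` and
`.tar.bz2` archives would give figures comparable to the lists above, minus
startup cost.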

nosy: +masklinn

PyPy bug tracker <tracker at bugs.pypy.org>
