[pypy-dev] Speeding up zlib in standard library

Maciej Fijalkowski fijall at gmail.com
Mon Mar 19 19:36:43 CET 2012


On Mon, Mar 19, 2012 at 5:49 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> (Replying on list - assuming Justin went off list by mistake)
>
> On Fri, Mar 16, 2012 at 4:54 AM, Justin Peel <peelpy at gmail.com> wrote:
>> Two things to mention. First, if you are going to use valgrind, you
>> will need to build your own pypy because, as far as I know, the
>> buildbot ones do not have the debug info so you won't have any helpful
>> function names in your profile. If I remember correctly, most of the
>> time used is in _operate in rzlib.py.
>>
>> Also, in my opinion, the next thing to try in speeding up this code is
>> to do a faster copy than a char by char copy for copying to the input
>> buffer that is sent to the external C function. I'm not sure if the
>> copying from the output buffer to the string builder is char by char
>> or not.
>
> Looking at the code for rzlib.py, _operate does seem to be the
> core function, and so likely the hot spot.
> https://bitbucket.org/pypy/pypy/src/default/pypy/rlib/rzlib.py
>
> It uses the StringBuilder class from rstring via the append_charpsize
> method, defined as follows:
> https://bitbucket.org/pypy/pypy/src/default/pypy/rlib/rstring.py
>
>    def append_charpsize(self, s, size):
>        l = []
>        for i in xrange(size):
>            l.append(s[i])
>        self.l.append(self.tp("").join(l))
>        self._grow(size)
>
> So it is indeed doing a char by char copy of the string from Python
> to zlib (in my case to decompress a long chunk of data). I don't
> know enough about PyPy's internals to say if something naive
> like this would work faster (a guess based on the append_slice
> method):
>
>    def append_charpsize(self, s, size):
>        assert 0 <= size
>        self.l.append(s[0:size])
>        self._grow(size)
>
> Presumably to try this idea out I'm going to first need to get
> PyPy to build locally?
>
> Thanks,
>
> Peter

append_charpsize is special - the code you quote is not the *actual*
implementation. The actual implementation is buried somewhere in
rpython/lltypesystem/rbuilder.py; the one you're mentioning is
just a fake implementation for tests. StringBuilder is special in the
sense that it has some special GC support (which we can probably
improve upon).
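
To make the "fake implementation for tests" point concrete, here is a
minimal sketch in plain Python. FakeStringBuilder and build_with are
hypothetical names invented for this example; this is not PyPy's
StringBuilder and it says nothing about what rbuilder.py does. It only
shows that the char by char body and the slice-based body are
interchangeable as a test stand-in, so swapping one for the other would
presumably not affect the translated code path.

```python
class FakeStringBuilder(object):
    """Hypothetical test-only stand-in; not PyPy's real StringBuilder."""

    def __init__(self):
        self.parts = []  # accumulated string pieces
        self.size = 0    # total number of characters appended

    def append_charpsize_loop(self, s, size):
        # Copy one character at a time, like the quoted test implementation.
        chars = []
        for i in range(size):
            chars.append(s[i])
        self.parts.append("".join(chars))
        self.size += size

    def append_charpsize_slice(self, s, size):
        # Copy the first `size` characters in one slice, like the proposal.
        assert 0 <= size <= len(s)
        self.parts.append(s[0:size])
        self.size += size

    def build(self):
        # Join all appended pieces into the final string.
        return "".join(self.parts)


def build_with(method_name, chunks):
    # Hypothetical helper: feed (data, size) pairs to the chosen method.
    builder = FakeStringBuilder()
    append = getattr(builder, method_name)
    for data, size in chunks:
        append(data, size)
    return builder.build()


chunks = [("hello world", 5), ("abcdef", 0), ("zlib!", 5)]
loop_result = build_with("append_charpsize_loop", chunks)
slice_result = build_with("append_charpsize_slice", chunks)
assert loop_result == slice_result == "hellozlib!"
```

Timing these two methods on top of an interpreter would only measure
the stand-in, not the code that ends up in a translated pypy.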

Cheers,
fijal

