[pypy-dev] Speeding up zlib in standard library

Maciej Fijalkowski fijall at gmail.com
Mon Mar 19 20:32:41 CET 2012

On Mon, Mar 19, 2012 at 9:15 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Mon, Mar 19, 2012 at 6:36 PM, Maciej Fijalkowski <fijall at gmail.com> wrote:
>> append_charpsize is special - it's not the *actual* implementation,
>> the actual implementation is buried somewhere in
>> rpython/lltypesystem/rbuilder.py, with the one you're mentioning being
>> just fake implementation for tests. StringBuilder is special in a
>> sense that it has some special GC support (which we can probably
>> improve upon).
>> Cheers,
>> fijal
> I guess you are referring to the copy_string_contents function here:
> https://bitbucket.org/pypy/pypy/src/default/pypy/rpython/lltypesystem/rstr.py
> However, neither ll_append_multiple_char nor ll_append_charpsize, as
> defined in rbuilder, seems to use this - they both use a char-by-char for loop,
> https://bitbucket.org/pypy/pypy/src/default/pypy/rpython/lltypesystem/rbuilder.py
> My hunch would be to replace this:
>    @staticmethod
>    def ll_append_charpsize(ll_builder, charp, size):
>        used = ll_builder.used
>        if used + size > ll_builder.allocated:
>            ll_builder.grow(ll_builder, size)
>        for i in xrange(size):
>            ll_builder.buf.chars[used] = charp[i]
>            used += 1
>        ll_builder.used = used
> with this:
>    @staticmethod
>    def ll_append_charpsize(ll_builder, charp, size):
>        used = ll_builder.used
>        if used + size > ll_builder.allocated:
>            ll_builder.grow(ll_builder, size)
>        assert size >= 0
>        ll_str.copy_contents(charp, ll_builder.buf, 0, used, size)
>        ll_builder.used += size
> (and similarly for ll_append_multiple_char above it)
> Like an onion - more and more layers ;) I'm beginning to suspect
> speeding up append_charpsize in order to make passing strings
> to/from C code faster is a bit too ambitious for a first contribution
> to PyPy! [*]
> Peter
> [*] Especially as after three hours it is still building from source:
> $ python translate.py --opt=jit targetpypystandalone.py

ok, so let me reply a bit more :)

First of all, you don't have to translate PyPy to see the effect of a
change. We mostly run tests to check that things work. You can also
write a very small RPython program in translator/goal (look at
targetnopstandalone.py) if you just want to test the performance of a
single function.
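
As a rough sketch of what such a small target could look like (modelled
on the entry_point/target shape of targetnopstandalone.py; build_string
and the timing code are hypothetical stand-ins for whatever function you
want to measure, and the printing may need adjusting to RPython's
restrictions):

```python
import time

def build_string(n):
    # Hypothetical function under test: joins n one-character pieces.
    parts = []
    for i in range(n):
        parts.append(chr(ord('a') + i % 26))
    return ''.join(parts)

def entry_point(argv):
    # argv is the list of command-line strings of the final executable.
    n = 100000
    if len(argv) > 1:
        n = int(argv[1])
    start = time.time()
    result = build_string(n)
    elapsed = time.time() - start
    print("built " + str(len(result)) + " chars in " + str(elapsed) + " s")
    return 0

def target(*args):
    # Return the entry point and None for the argument types, which
    # means "argv is a list of strings".
    return entry_point, None
```

The file also runs on a plain Python interpreter by calling
entry_point(["prog", "1000"]) directly, which is a quick sanity check
before translating it.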

I suppose your code is indeed a bit faster, but my bet would be it's
not too much faster (feel free to prove me wrong, especially on older
GCCs, which might not figure out that the loop is vectorizable).
The main reason passing strings to C is slow, however, is copying
the string from the GC-managed area to a non-moving one that is raw
malloced in C. There are various strategies for approaching this; one
would be pinning, so that the GC structures don't move and you can pass
a pointer to C. This is, however, definitely not a good first patch to
PyPy ;-)
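
To make the copy step concrete, here is an analogy in plain CPython
using ctypes (this is not RPython and not the PyPy implementation; the
two function names are made up for illustration). Both functions
allocate a raw, non-moving buffer and copy the string into it, once byte
by byte and once with a single bulk memmove:

```python
import ctypes

def copy_char_by_char(data):
    # Allocate a raw buffer and fill it one byte at a time.
    raw = ctypes.create_string_buffer(len(data))
    for i in range(len(data)):
        raw[i] = data[i:i + 1]
    return raw

def copy_bulk(data):
    # Allocate a raw buffer and fill it with one memmove call.
    raw = ctypes.create_string_buffer(len(data))
    ctypes.memmove(raw, data, len(data))
    return raw

payload = b"compressed zlib payload"
assert copy_char_by_char(payload).raw == copy_bulk(payload).raw
```

Either way the data is still copied once before C can see it; pinning
would avoid that copy altogether.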

What I would suggest:

* Your patch looks good to me, although I'm not sure if
copy_string_contents would accept raw memory. Check if the tests pass.
* If you want to benchmark, write a small test in translator/goal that
passes such strings and see if it works.
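
For the first point, a pure-Python model of the two variants can help to
convince yourself that they should behave the same before touching
rbuilder.py. Everything below (FakeBuilder, append_loop, append_bulk,
contents) is a hypothetical stand-in, not the real lltype code, and it
says nothing about whether copy_string_contents accepts raw memory:

```python
class FakeBuilder(object):
    # Toy model of a builder: a preallocated buffer plus a fill count.
    def __init__(self, allocated=4):
        self.buf = bytearray(allocated)
        self.used = 0
        self.allocated = allocated

    def grow(self, needed):
        # Make room for at least `needed` more bytes.
        new_allocated = max(self.allocated * 2, self.used + needed)
        self.buf.extend(bytearray(new_allocated - self.allocated))
        self.allocated = new_allocated

def append_loop(builder, data, size):
    # Char-by-char variant, like the current loop.
    used = builder.used
    if used + size > builder.allocated:
        builder.grow(size)
    for i in range(size):
        builder.buf[used] = data[i]
        used += 1
    builder.used = used

def append_bulk(builder, data, size):
    # Bulk variant, like the proposed single copy call.
    assert size >= 0
    used = builder.used
    if used + size > builder.allocated:
        builder.grow(size)
    builder.buf[used:used + size] = data[:size]
    builder.used = used + size

def contents(builder):
    return bytes(builder.buf[:builder.used])

a, b = FakeBuilder(), FakeBuilder()
for chunk in [b"zlib", b"", b"inflate/deflate"]:
    data = bytearray(chunk)
    append_loop(a, data, len(data))
    append_bulk(b, data, len(data))
assert contents(a) == contents(b)
```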

We're usually available for help on IRC. Thanks for tackling this problem!

