[pypy-dev] Speeding up zlib in standard library

Alex Gaynor alex.gaynor at gmail.com
Mon Mar 19 20:36:46 CET 2012


On Mon, Mar 19, 2012 at 2:32 PM, Maciej Fijalkowski <fijall at gmail.com> wrote:

> On Mon, Mar 19, 2012 at 9:15 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> > On Mon, Mar 19, 2012 at 6:36 PM, Maciej Fijalkowski <fijall at gmail.com>
> wrote:
> >>> http://mail.python.org/mailman/listinfo/pypy-dev
> >>
> >> append_charpsize is special - the one you're mentioning is not the
> >> *actual* implementation, just a fake implementation for tests. The
> >> actual implementation is buried somewhere in
> >> rpython/lltypesystem/rbuilder.py. StringBuilder is special in the
> >> sense that it has some special GC support (which we can probably
> >> improve upon).
> >>
> >> Cheers,
> >> fijal
> >
> > I guess you are referring to the copy_string_contents function here:
> >
> > https://bitbucket.org/pypy/pypy/src/default/pypy/rpython/lltypesystem/rstr.py
> >
> > However, neither ll_append_multiple_char nor ll_append_charpsize,
> > both defined in rbuilder, seems to use this - they both use a
> > char-by-char for loop:
> >
> > https://bitbucket.org/pypy/pypy/src/default/pypy/rpython/lltypesystem/rbuilder.py
> >
> > My hunch would be to replace this:
> >
> >    @staticmethod
> >    def ll_append_charpsize(ll_builder, charp, size):
> >        used = ll_builder.used
> >        if used + size > ll_builder.allocated:
> >            ll_builder.grow(ll_builder, size)
> >        for i in xrange(size):
> >            ll_builder.buf.chars[used] = charp[i]
> >            used += 1
> >        ll_builder.used = used
> >
> > with this:
> >
> >    @staticmethod
> >    def ll_append_charpsize(ll_builder, charp, size):
> >        used = ll_builder.used
> >        if used + size > ll_builder.allocated:
> >            ll_builder.grow(ll_builder, size)
> >        assert size >= 0
> >        ll_str.copy_contents(charp, ll_builder.buf, 0, used, size)
> >        ll_builder.used += size
> >
> > (and similarly for ll_append_multiple_char above it)
> >
> > Like an onion - more and more layers ;) I'm beginning to suspect
> > speeding up append_charpsize in order to make passing strings
> > to/from C code faster is a bit too ambitious for a first contribution
> > to PyPy! [*]
> >
> > Peter
> >
> > [*] Especially as after three hours it is still building from source:
> > $ python translate.py --opt=jit targetpypystandalone.py
>
> ok, so let me reply a bit more :)
>
> First of all, you don't have to translate pypy to see changes. We
> mostly run tests to see if they work. You can also write a very small
> rpython program in translator/goal (look at targetnopstandalone.py) if
> you just want to test the performance of a single function.
>
> I suppose your code is indeed a bit faster, but my bet would be that
> it's not much faster (feel free to prove me wrong, especially on older
> GCCs, which might not figure out that the loop is vectorizable, for
> example).
>
> The main reason passing strings to C is slow, however, is copying the
> string from the GC area to a non-moving one, raw malloced in C.
> There are various strategies for approaching this; one of them would
> be pinning, so that the GC structures don't move and you can pass a
> pointer to C. This is, however, definitely not a good first patch to
> pypy ;-)
>
> What I would suggest:
>
> * Your patch looks good to me, although I'm not sure if
> copy_string_contents would accept raw memory. Check if the tests pass.
> * If you want to benchmark, write a small test for passing such
> strings in translator/goal and see if it works.
>
> We're usually available for help on IRC and thanks for tackling this
> problem!
>
> Cheers,
> fijal

FWIW, copy_string_contents definitely doesn't take raw memory, it takes an
rstr.
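
To make the distinction concrete, here is a toy model in plain CPython
with ctypes. It is not RPython and none of these names are PyPy's: the
Builder class, append_loop and append_bulk are all made up for
illustration. The point is only that the source (charp) is raw,
non-GC memory, so a bulk copy into the builder needs a raw-memory
primitive (memmove here) rather than a string-to-string copy.

```python
import ctypes


class Builder:
    """Toy stand-in for a low-level string builder (hypothetical, not PyPy's)."""

    def __init__(self, allocated=16):
        self.buf = bytearray(allocated)  # plays the role of the GC-managed buffer
        self.used = 0

    def grow(self, needed):
        # Make room for at least `needed` more bytes, doubling when possible.
        new_size = max(len(self.buf) * 2, self.used + needed)
        self.buf.extend(bytes(new_size - len(self.buf)))

    def result(self):
        return bytes(self.buf[:self.used])


def append_loop(builder, charp, size):
    """Char-by-char copy, mirroring the shape of the loop in the quoted code."""
    if builder.used + size > len(builder.buf):
        builder.grow(size)
    used = builder.used
    for i in range(size):
        builder.buf[used] = ord(charp[i])  # charp[i] is a 1-byte bytes object
        used += 1
    builder.used = used


def append_bulk(builder, charp, size):
    """Single raw-memory copy from charp into the builder's buffer."""
    if size == 0:
        return
    if builder.used + size > len(builder.buf):
        builder.grow(size)
    # View the free tail of the buffer as a C array, then memmove into it.
    dest = (ctypes.c_char * size).from_buffer(builder.buf, builder.used)
    ctypes.memmove(dest, charp, size)
    del dest  # drop the buffer export so the bytearray can be resized later
    builder.used += size


if __name__ == "__main__":
    raw = ctypes.create_string_buffer(b"hello world")  # stands in for a char*
    b = Builder(4)
    append_bulk(b, raw, 11)
    print(b.result())
```

In this model both functions produce the same bytes; they differ only
in how the data gets from raw memory into the buffer. Whatever replaces
the loop in rbuilder would similarly need to be something that accepts
a raw pointer as its source.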

Alex

-- 
"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero