On Thursday, March 15, 2012, Armin Rigo <arigo@tunes.org> wrote:
> Hi,
>
> On Wed, Mar 14, 2012 at 03:19, Peter Cock <p.j.a.cock@googlemail.com> wrote:
>> I don't know - I was assuming any buffering would be the same
>> comparing PyPy 1.8 against Python 2.6 (and 3.2). That was one
>> reason for my email - is binding to C relatively slow (compared to
>> the rest of PyPy running pure Python)?
>
> Not necessarily. You get direct C calls, both from the translated
> pypy and from JITted assembler code. There are performance hits when
> e.g. the C library relies on macros, but I don't think that's the case
> of zlib.
>
> Passing big strings around, on the other hand, is typically slower on
> PyPy because they need to be copied between GC-managed areas and
> non-GC-managed areas. There are vague ideas on how to improve but
> nothing I can summarize in two words.
>
> At this level, for profiling, you can use valgrind. You'll see the
> time spent in zlib itself, the time spent copying big strings around,
> and the time spent actually executing the JIT-generated assembler
> (this ends up in "functions" with no name, just an address).

I think in my case it could be this "big string" issue then, rather
than the interface with zlib itself. I'm dealing with 64 KB chunks
of data which are zlib-compressed, and I'm using (bytes) strings to
hold these in Python - roughly the pattern in the sketch below.

I've used valgrind before, but never with PyPy - hopefully I can
find some time to dig into this a bit further.
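For the record, here is a minimal, hypothetical sketch (not the actual
Biopython BGZF code) just to show where the ~64 KB bytes strings cross
the Python/zlib boundary - the calls where, if I follow Armin, PyPy
has to copy between GC-managed and non-GC-managed memory:

    import zlib

    def roundtrip(chunk):
        # chunk is a ~64 KB bytes string; each call hands the whole
        # string to the C zlib library and gets a new string back.
        compressed = zlib.compress(chunk, 6)
        return zlib.decompress(compressed)

    data = b"x" * 65536    # one (trivially compressible) 64 KB block
    assert roundtrip(data) == data

To profile this as suggested, I'm assuming something along the lines
of "valgrind --tool=callgrind pypy script.py" and then reading the
output with callgrind_annotate will do, with the JIT-generated code
showing up as the unnamed addresses Armin mentions.
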
Thanks,

Peter

P.S. This is for the blocked gzip format (BGZF) if you're curious,
http://blastedbio.blogspot.com/2011/11/bgzf-blocked-bigger-better-gzip.html
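(For anyone curious about what reading a block involves: as I
understand the BGZF spec, each block is a self-contained gzip member
of at most 64 KB whose required "BC" extra subfield records BSIZE,
the total block length minus one. A simplified, untested sketch -
assuming the "BC" subfield comes first, so BSIZE sits at bytes 16-17
of the block - would be:

    import struct
    import zlib

    def split_first_bgzf_block(data):
        # BSIZE = total length of this gzip member minus one,
        # assuming the "BC" subfield is the first extra subfield.
        bsize = struct.unpack("<H", data[16:18])[0]
        block, rest = data[:bsize + 1], data[bsize + 1:]
        # wbits = 15 | 16 tells zlib to expect a gzip wrapper, so the
        # whole block is passed to C as a single bytes string.
        return zlib.decompress(block, 15 | 16), rest

That per-block decompress, repeated over many 64 KB blocks, is the
kind of string hand-off discussed above.)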