[pypy-dev] Benchmarking PyPy performance on real-world Django app

Maciej Fijalkowski fijall at gmail.com
Sat Oct 8 01:35:15 CEST 2011


On Sat, Oct 8, 2011 at 1:28 AM, Igor Katson <igor.katson at gmail.com> wrote:
> On 10/08/2011 02:50 AM, Maciej Fijalkowski wrote:
>>
>> On Sat, Oct 8, 2011 at 12:48 AM, Andy <angelflow at yahoo.com> wrote:
>>>
>>> 15 times more memory? That's a lot.
>>> Interestingly, Quora reported that their PyPy processes were only 50%
>>> larger than CPython ones:
>>>
>>> http://www.quora.com/Quora-Infrastructure/Did-Quoras-switch-to-PyPy-result-in-increased-memory-consumption
>>>
>>> "our PyPy worker processes themselves take approximately 50% more memory
>>> than our equivalent CPython worker processes, although we did not do a
>>> large amount of tuning of the GC. Regardless, this wasn't the main cause
>>> of our memory blowup.
>>>
>>> "In our development, we found that certain functions were not worth being
>>> ported from their C libraries to pure Python, things like crypto, lxml,
>>> PyML, and a couple other random libraries. Our solution for those
>>> functions was to run a parallel CPython process that would do nothing but
>>> take arguments via an execnet channel, and output return values via the
>>> same execnet channel.
>>>
>>> "The overhead for some of these Python processes, especially for the ones
>>> that required a lot of state (for example, PyML) is comparable to the
>>> amount of memory taken by the master PyPy process, effectively causing a
>>> 2-3x blowup in memory just to maintain the CPython processes; this is our
>>> main memory sink for our PyPy branch."
>>> ----
>>> I wonder what accounts for this large difference in PyPy memory
>>> consumption (50% more vs. 1,400% more). What type of "large amount of
>>> tuning of the GC" did Quora do?
>>
>> I think this is a bug, but a different stack was also used, right?
>> Indeed, PyPy should not use much more than 2x the memory of CPython. I
>> would like to give it a go if you can come up with a small
>> reproducible example.
>>
>> Cheers,
>> fijal
>
> Yeah, I will send you the test suite in a while. This is a slightly
> different setup: the same site with no data, and sqlite instead of pypq,
> but it's clear that the memory usage is also huge, though far more requests
> are needed to bump memory usage to 200 MB. CPython memory usage is constant.
>

It *might* be the same thing as with tornado, where memory usage grows
constantly. Justin Peel is working on it and the fix will be in 1.7 some
time soon (it does not have to be the same issue, though it does sound
remarkably similar).
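
For a small reproducible example of this kind of growth, something along
the lines of the sketch below may help. It is only an illustration, not the
test suite discussed above: measure_growth and fake_request are hypothetical
names, and fake_request is a stand-in for one request through the real app.
It uses only the standard library resource module (Unix only); ru_maxrss is
a peak value, reported in kilobytes on Linux and in bytes on macOS.

```python
import gc
import resource


def peak_rss():
    """Peak resident set size of this process (KB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure_growth(handle_request, rounds=5, requests_per_round=1000):
    """Call handle_request repeatedly and record peak RSS after each round."""
    samples = []
    for _ in range(rounds):
        for _ in range(requests_per_round):
            handle_request()
        gc.collect()  # let the GC reclaim garbage before sampling
        samples.append(peak_rss())
    return samples


if __name__ == "__main__":
    def fake_request():
        # Hypothetical stand-in: replace with one request against the app.
        return [str(i) for i in range(100)]

    for i, rss in enumerate(measure_growth(fake_request)):
        print("round %d: peak RSS %d" % (i, rss))
```

Running the same script under CPython and PyPy and comparing how the
per-round numbers evolve should show whether usage levels off or keeps
climbing as more requests are served.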

