[pypy-dev] Benchmarking PyPy performance on real-world Django app

Maciej Fijalkowski fijall at gmail.com
Sun Oct 9 21:15:17 CEST 2011


On Sun, Oct 9, 2011 at 8:58 PM, Igor Katson <igor.katson at gmail.com> wrote:
> On 10/08/2011 03:35 AM, Maciej Fijalkowski wrote:
>>
>> On Sat, Oct 8, 2011 at 1:28 AM, Igor Katson <igor.katson at gmail.com> wrote:
>>>
>>> On 10/08/2011 02:50 AM, Maciej Fijalkowski wrote:
>>>>
>>>> On Sat, Oct 8, 2011 at 12:48 AM, Andy <angelflow at yahoo.com> wrote:
>>>>>
>>>>> 15 times more memory? That's a lot.
>>>>> Interestingly, Quora reported that their PyPy processes were only 50%
>>>>> larger than CPython ones:
>>>>>
>>>>> http://www.quora.com/Quora-Infrastructure/Did-Quoras-switch-to-PyPy-result-in-increased-memory-consumption
>>>>>
>>>>> "our PyPy worker processes themselves take approximately 50% more memory
>>>>> than our equivalent CPython worker processes, although we did not do a
>>>>> large amount of tuning of the GC. Regardless, this wasn't the main cause
>>>>> of our memory blowup.
>>>>>
>>>>> "In our development, we found that certain functions were not worth being
>>>>> ported from their C libraries to pure Python, things like crypto, lxml,
>>>>> PyML, and a couple other random libraries. Our solution for those
>>>>> functions was to run a parallel CPython process that would do nothing but
>>>>> take arguments via an execnet channel, and output return values via the
>>>>> same execnet channel.
>>>>>
>>>>> "The overhead for some of these Python processes, especially for the ones
>>>>> that required a lot of state (for example, PyML) is comparable to the
>>>>> amount of memory taken by the master PyPy process, effectively causing a
>>>>> 2-3x blowup in memory just to maintain the CPython processes; this is our
>>>>> main memory sink for our PyPy branch."
>>>>> ----
>>>>> I wonder what accounts for this large difference in PyPy memory
>>>>> consumption (50% more vs. 1,400% more). What type of "large amount of
>>>>> tuning of the GC" did Quora do?
>>>>
>>>> I think this is a bug, but a different stack was also used, right?
>>>> Indeed, PyPy should not use much more than 2x CPython's memory. I
>>>> would like to give it a go if you can come up with a small
>>>> reproducible example.
>>>>
>>>> Cheers,
>>>> fijal
>>>
>>> Yeah, I will send you the test suite in a while. This is a slightly
>>> different setup: the same site with no data and sqlite instead of pypq,
>>> but it's clear that the memory usage is also huge, though far more
>>> requests are needed to push it to 200 MB. CPython memory usage is
>>> constant.
>>>
>> It *might* be the same thing as with tornado, where memory usage grows
>> constantly. Justin Peel is working on it and it'll be in 1.7 some time
>> soon (it does not have to be the same issue, though it does sound
>> remarkably similar).
>
> I tried with that branch, but there is no difference. Will you try to debug
> it with the stuff I gave you?
>

Well, the branch is not ready yet, so no point. Yes, we're trying.

