[pypy-dev] Help with memory usage

Kenny Lasse Hoff Levinsen kennylevinsen at gmail.com
Mon Feb 3 20:45:35 CET 2014


Hi,

Sorry for intervening, but a quick answer to your last request: no. PyPy can be compiled with one of several pluggable GCs (incminimark being the default now, IIRC), but refcounting is not an option. It is unlikely to do you any good anyway: refcounting only saves the memory held between the apparent death of an object and its collection, and that shouldn’t amount to much. With the new incminimark, that should have improved a lot.
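
To illustrate the gap between the death of an object and its collection, here is a minimal sketch. The `Resource` class and `drop_and_collect` function are hypothetical names made up for this example. Under CPython's refcounting I would expect the object to be gone right after `del`; under PyPy it normally stays around until a later collection, which `gc.collect()` forces here.

```python
import gc
import weakref


class Resource(object):
    """Hypothetical stand-in for an object that holds some memory."""

    def __init__(self, size):
        self.payload = bytearray(size)


def drop_and_collect(size=1024 * 1024):
    """Drop the only reference to a Resource and report whether it is
    still alive before and after an explicit collection."""
    obj = Resource(size)
    ref = weakref.ref(obj)   # a weak reference does not keep obj alive
    del obj                  # the last strong reference goes away here
    alive_before = ref() is not None
    gc.collect()             # force a full collection
    alive_after = ref() is not None
    return alive_before, alive_after


if __name__ == "__main__":
    before, after = drop_and_collect()
    print("alive before collect: %s, after collect: %s" % (before, after))
```

If the memory only drops after an explicit `gc.collect()`, the difference is delayed collection rather than a real leak.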

Can you try to reduce the memory leak to a reproducible example? That would be terribly helpful. Unless there is a known issue with one of those libraries, there isn’t enough information for the issue to be debugged.
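
A reproducer could start from something like the sketch below. It is only an outline under assumptions: `process_one` is a hypothetical placeholder for the real per-message work (the pycassa / cffi / pyelasticsearch calls), and the MB conversion assumes Linux, where `ru_maxrss` is reported in kilobytes.

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of this process, in MB.
    Assumes Linux, where ru_maxrss is reported in kilobytes."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0


def process_one(message):
    # Hypothetical placeholder: replace with the real per-message work.
    return len(message)


def run(iterations=1000, report_every=100):
    """Process dummy messages and sample peak memory at regular intervals."""
    samples = []
    for i in range(1, iterations + 1):
        process_one("message %d" % i)
        if i % report_every == 0:
            samples.append((i, peak_rss_mb()))
    return samples


if __name__ == "__main__":
    for count, mb in run():
        sys.stdout.write("%6d messages: peak RSS %.1f MB\n" % (count, mb))
```

Running the same script under both CPython and PyPy, and adding the real libraries back one at a time, should show which step makes the peak keep climbing.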

Best regards,
Kenny Levinsen

On 03 Feb 2014, at 20:07, Matheus Salvia <matheus2740 at gmail.com> wrote:

> Thanks for the answer, Armin.
> The application is currently using about 15 processes, but this is a configurable number.
> Based on your information, I can assume the process overhead would be about 600MB (15 processes × 40MB extra each),
> which is not a negligible number but still doesn't fully explain all of the memory usage.
> The application is an information processor and, simplified, the algorithm is something like:
> - start a worker process pool
> - receive messages from a queue
> - delegate messages to the workers
> 
> The workers themselves have a lifecycle roughly like this:
> - get message
> - fetch the data the message points to from the database (Apache Cassandra)
> - process data:
> - - call a C library to do some preprocessing (Zorba XQuery through CFFI)
> - - call a Python module to do intermediate and post processing
> - store the resulting data in the database (ElasticSearch)
> 
> The libraries I'm using are: pycassa, pyelasticsearch, cffi, boto and peewee
> My operating system is Ubuntu 12.04 LTS x64
> 
> Last but not least, is there a way to switch PyPy from using a GC to using refcounting like CPython?
> 
> Is that the information you needed?
> Feel free to ask anything.
> 
> Again, thank you,
> Matheus Salvia
> 
> 
> 2014-02-03 Armin Rigo <arigo at tunes.org>:
> Hi Matheus,
> 
> On 3 February 2014 03:55, Matheus Salvia <matheus2740 at gmail.com> wrote:
> > When running my app under CPython, it uses about 1GB of memory, but when
> > running with pypy it goes up to almost 3GB.
> 
> This is a question that doesn't have a single answer.  You need to
> give us a lot more information about what your application does,
> before we can start proposing ideas about why there is such a
> difference in your case.  In our own experience, the memory usage is
> usually very roughly around the same as CPython (a factor less than 2
> in one way or another).
> 
> As a first (likely completely random) guess, you are starting a very
> large number of processes, each of which consumes 60MB instead of 20MB
> in CPython.  If I'm correct, then there is not much we can do; it's
> known that PyPy's baseline memory usage is higher than CPython's, due
> to the JIT, which ends up storing a few dozen MB in every process.
> 
> 
> A bientôt,
> 
> Armin.
> 
> 
> 
