[Python-Dev] The untuned tunable parameter ARENA_SIZE

Antoine Pitrou solipsis at pitrou.net
Thu Jun 1 04:19:06 EDT 2017


On Thu, 1 Jun 2017 00:38:09 -0700
Larry Hastings <larry at hastings.org> wrote:
>   * CPython programs would use more memory.  How much?  Hard to say.  It
>     depends on their allocation strategy.  I suspect most of the time it
>     would be < 3mb additional memory.  But for pathological allocation
>     strategies the difference could be significant.  (e.g: lots of
>     allocs, followed by lots of frees, but the occasional object lives
>     forever, which means that the arena it's in can never be freed.  If
>     1 out of every 16 256k arenas is kept alive this way, and the objects
>     are spaced out precisely such that now it's 1 for every 4mb arena,
>     max memory use would be the same but later stable memory use would
>     hypothetically be 16x current)

Yes, this is the same kind of reason the default page size is still 4KB
on many platforms today, despite typical memory size having grown by a
huge amount.  Apart from the cost of fragmentation as you mentioned,
another issue is when many small Python processes are running on a
machine: a 2MB overhead per process can compound to large numbers if
you have many (e.g. hundreds) such processes.
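
To put rough numbers on both costs, here is a minimal sketch.  It is
only a model, not how obmalloc accounts for memory: retained_bytes(),
the 64 MiB peak and the 300-process fleet are all made-up names and
figures for illustration.  It assumes an arena stays mapped as long as
a single live object sits in it.

```python
KIB = 1024
MIB = 1024 * KIB


def retained_bytes(arena_size, pinned_offsets):
    """Memory that can never be returned to the OS.

    An arena stays mapped as long as it holds at least one live object,
    so count the distinct arenas touched by the long-lived objects.
    """
    pinned_arenas = {offset // arena_size for offset in pinned_offsets}
    return len(pinned_arenas) * arena_size


# Hypothetical workload: 64 MiB allocated at peak, then everything is
# freed except one long-lived object every 4 MiB.
peak = 64 * MIB
pinned = range(0, peak, 4 * MIB)

small_arenas = retained_bytes(256 * KIB, pinned)  # 1 in 16 arenas pinned
large_arenas = retained_bytes(4 * MIB, pinned)    # every arena pinned

print(small_arenas // MIB, "MiB retained with 256 KiB arenas")
print(large_arenas // MIB, "MiB retained with 4 MiB arenas")

# Hypothetical fleet: 300 small processes, each paying a 2 MiB overhead.
fleet_overhead = 300 * 2 * MIB
print(fleet_overhead // MIB, "MiB extra across 300 processes")
```

With these assumed inputs the peak is identical in both cases, but the
stable footprint goes from 4 MiB to 64 MiB, which is the 16x in the
quoted scenario.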

I would suggest we exert caution here.  Small benchmarks generally have
a nice memory behaviour: not only do they not allocate a lot of memory,
but often they will release it all at once after a single run.  Perhaps
some of those benchmarks would even be better off if we allocated 64MB
up front and never released it :-)

Long-running applications can be less friendly than that, having
various pieces of internal state with unpredictable lifetimes (especially
when they are talking over the network with other peers which come and go).
And long-running applications are typically where Python memory usage is
a sensitive matter.

If you'd like to go that way anyway, I would suggest 1MB as a starting
point in 3.7.

>   * Many programs would be slightly faster now and then, simply because
>     we call malloc() 1/16 as often.

malloc() you said?  Arenas are allocated using mmap() nowadays, right?
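
For readers unfamiliar with the distinction: an anonymous mmap() asks
the OS directly for a block of pages instead of going through the C
library heap.  The sketch below shows the idea from Python using the
stdlib mmap module.  It is only an illustration, not what obmalloc does
internally; the ARENA_SIZE constant here is just the 256k figure from
this thread.

```python
import mmap

ARENA_SIZE = 256 * 1024  # 256 KiB, the arena size discussed in this thread

# fileno=-1 requests an anonymous mapping: memory backed by no file,
# obtained straight from the OS.
arena = mmap.mmap(-1, ARENA_SIZE)

arena[:5] = b"hello"      # the mapping behaves like a mutable byte buffer
first = bytes(arena[:5])
size = len(arena)

# Closing the mapping hands the pages back to the OS in one go.
arena.close()
print(size, first)
```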

Regards

Antoine.



