[Python-Dev] Multigigabyte memory usage in the OpenIndiana Buildbot

Jesus Cea jcea at jcea.es
Wed Sep 7 13:38:23 CEST 2011



On 06/09/11 07:27, Nick Coghlan wrote:
> It may be the case that with the reduced memory limit, your
> machine may not be able to run concurrent slaves for 2.7, 3.2 and
> 3.x as I believe it does now.

Antoine has changed the buildmaster configuration to send me only one
build at a time. It doesn't solve the issue. I don't have enough
resources even for a single build.

I just sent this email to the owner of the machine:

"""
XXXXXXX, I know you are very busy, but I would like to request
formally the removal of the SWAP capping for my zone.

After investigating the issue, I learned this:

1. Python "make test" launches a python process that can consume >300MB
of RAM.

2. Under Solaris, a 300MB process doing a "fork()" will consume 600MB.
That is, Solaris reserves this much memory just in case the processes
modify their memory (to avoid an "out of memory" condition simply because
a process writes to its own memory space).

3. So, if a 300MB process is forked 10 times, it is going to "virtually"
use 3GB. The real memory used is actually far less in the buildbot case,
because the forked processes don't modify their own memory much
(forked processes use Copy On Write).

4. So, the required memory to run the buildbots is actually "modest"
compared with the "virtual" memory used.

5. A 4GB SWAP is not enough to run a single buildbot instance. I can
have up to 6 instances, but 4GB is not enough for 1. The Python devs have
modified the buildbot master to send me at most two builds
simultaneously, trying to help. It is not helping because 4GB of swap
is not enough even for a single instance.

6. With an uncapped SWAP, the actual swapping would be quite low,
because the swap is only used to guarantee the memory reservation for
the forked processes in the worst case, where every forked process
writes to its own Copy On Write (COW) copy of the 300MB address space.
In practice 4GB of RAM and uncapped SWAP would be enough, with no (or
little) actual swapping.

For these reasons I formally request a reconfiguration of my zone to
uncap my SWAP usage.

The proof is actually very simple:

"""
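import os
import time

# Allocate a 512MB string so that every forked process inherits it.
a = "a" * 1024 * 1024 * 512

os.fork()  # 2 processes
os.fork()  # 4 processes
os.fork()  # 8 processes

# Keep all the processes alive long enough to run "swap -s".
time.sleep(10)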
"""

Running the previous program does this to my swap: (Solaris 10 Update 9)

"""
[root@buffy /]# swap -s
total: 684704k bytes allocated + 3732892k reserved = 4417596k used,
31829688k available
"""

After the programs die, I have this:

"""
[root@buffy /]# swap -s
total: 156680k bytes allocated + 43284k reserved = 199964k used,
36118796k available
"""

On this machine, I have 4GB of RAM and 32GB of swap.

So, this trivial test requires >4GB of RAM+SWAP even if it is actually
using only ~512MB of RAM. Solaris is (rightly) playing it safe, making
sure the program can actually modify its memory space.

XXXXX, if you can't/don't want to modify my zone configuration, let me
know, so I can think about what to do next. If I have to talk to
somebody else, please let me know.

Sorry for bothering you with these details. I really appreciate the
effort you and your team are putting into OpenIndiana in general and
into supporting the Python buildbots under OI in particular. I hope we
can solve this situation.

Thanks for your time and effort.

PS: I think that such memory+swap requirements are quite high, anyway,
and I will pursue it. But in the meantime I need the buildbot online,
as it was a couple of weeks ago :-)

Thanks!
"""

So, the problem is that a) "make test" takes quite a bit of RAM and b)
the buildbot forks some "big" processes, so the virtual memory needed
is BIG.
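
To put a number on "BIG", here is a minimal sketch of the worst-case
arithmetic. It assumes strict accounting, where every process reserves
its full size after a fork(), and that each fork() call is executed by
every live process, as in the test program in the email above. The
function name is made up for illustration.

```python
def worst_case_reserved_mb(process_mb, fork_calls):
    """Worst-case reservation, in MB, under strict memory accounting.

    Hypothetical helper: assumes every process reserves its full size
    and that each fork() call is run by every process alive at that point.
    """
    processes = 2 ** fork_calls  # each fork() call doubles the process count
    return processes * process_mb


# The test program: a 512MB process followed by three fork() calls.
print(worst_case_reserved_mb(512, 3))  # 4096
```

For the 512MB test program this gives 4096MB, in line with the roughly
4GB jump in "used" between the two "swap -s" outputs.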

Linux is known for "overcommitting" memory. That is, playing fast and
risky by not actually reserving memory, hoping the process will not
actually use it or that it will do an "exec" immediately. So this
problem may not be apparent under Linux, but it is there.
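
On a Linux box, the kernel's policy can be checked by reading
/proc/sys/vm/overcommit_memory. The following is only a sketch: the
helper name is hypothetical, and the mode descriptions summarize the
kernel documentation (0 = heuristic, 1 = always overcommit, 2 = strict
accounting). Mode 2 is presumably the closest to the Solaris behaviour
described here.

```python
import os

OVERCOMMIT_PATH = "/proc/sys/vm/overcommit_memory"  # Linux only

MODES = {
    0: "heuristic overcommit (the default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(path=OVERCOMMIT_PATH):
    """Return a description of the overcommit mode, or None if unavailable."""
    if not os.path.exists(path):
        return None  # not Linux, or /proc is not mounted
    with open(path) as f:
        mode = int(f.read().strip())
    return MODES.get(mode, "unknown mode %d" % mode)


print(describe_overcommit())
```

On anything other than Linux the function simply returns None.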

So I have two questions:

1. Can we reduce the memory footprint of the tests? I can't
understand why the python test process is taking so much memory.

2. Why is buildbot "fork()"ing big processes? Can we do something to
change this?

I will wait a few days for the OpenIndiana team to reply. If the result
is not satisfactory, I will try to set up a virtual machine with the
required resources myself. Crossing fingers...

-- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz

