[Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2 3.x" buildbot

David Bolen db3l.net at gmail.com
Sat Nov 6 22:36:44 CET 2010


On Sat, Nov 6, 2010 at 7:19 AM, Victor Stinner <victor.stinner at haypocalc.com> wrote:
>
> I noticed "OSError: [Errno 23] Too many open files in system" errors on your
> FreeBSD buildbot. I would like to know if you configured a limit on the open
> files or maybe of child processes on this buildbot or not, or if it is a
> failure in Python?
>
> The first error always occurs in the first test of test_concurrent_futures. It's
> maybe because this test uses a lot of open files or processes?

I couldn't find the matching failures that you're talking about, but
then I figured out you mean the FreeBSD7 (7.2) buildbot, not the
FreeBSD (6.4) buildbot ....

I haven't configured any specific limits with respect to open files.
On both FreeBSD buildbots, kern.maxfiles is 3600 and
kern.maxfilesperproc is 3060.  Both have limits of 1530 processes.
The latter also agrees with the maximum descriptors as shown by limit.
Regarding R. David Murray's response, the buildbots are VMs with
limited memory, so the dynamic calculation he references for
descriptors yields a much lower value than on his system.
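
For anyone who wants to cross-check limits like these from inside Python
rather than via sysctl or limit, here is a minimal sketch.  It assumes a
Unix platform where the standard resource module is available;
describe_limits is a made-up helper name, not something the buildbots
run.

```python
import resource


def describe_limits():
    """Return (soft, hard) limits for open files and, if known, processes.

    describe_limits is an illustrative helper name, not an existing API.
    """
    limits = {"nofile": resource.getrlimit(resource.RLIMIT_NOFILE)}
    # RLIMIT_NPROC is not defined on every platform, so look it up defensively.
    nproc = getattr(resource, "RLIMIT_NPROC", None)
    if nproc is not None:
        limits["nproc"] = resource.getrlimit(nproc)
    return limits


if __name__ == "__main__":
    for name, (soft, hard) in describe_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
```

These are per-process limits as seen by the interpreter; the system-wide
kern.maxfiles value is not visible this way.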

Looks like the reason FreeBSD is ok and FreeBSD7 isn't is that the
relevant tests don't run on the former due to lack of POSIX semaphore support.  I
manually enabled their use on FreeBSD7 a while back (11/2009,
issue7272) since they aren't on by default.  I'd be surprised if at
least test_multiprocessing didn't pass at that point (since that's
what the issue was for) but even it seems to be generating the open
files error now.  The buildbots haven't changed, but I suppose the
tests might just have grown in the number of files they need over
time.

I noticed that the failures seem to always be on a semaphore call.
Some quick googling found a few references that seem to imply that
the number of POSIX semaphores is very limited (like 30), and can't
be changed without recompiling the kernel from source.  So that's not
so big a threshold for the tests to have perhaps started crossing
since issue7272 was fixed.  Certainly seems more likely than 3000+
files or 1500+ processes.
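
One way to test the semaphore theory would be to probe the limit
directly.  The following is only a rough sketch: it assumes the platform
exposes SC_SEM_NSEMS_MAX through os.sysconf (not all do), and both
function names are invented for illustration.  The second helper simply
creates multiprocessing semaphores until creation fails or a cap is
reached.

```python
import multiprocessing
import os


def posix_semaphore_limit():
    """Return the limit reported by sysconf, or None if it is unavailable."""
    name = "SC_SEM_NSEMS_MAX"
    # sysconf_names only exists on Unix; treat anything else as "unknown".
    if name not in getattr(os, "sysconf_names", {}):
        return None
    try:
        value = os.sysconf(name)
    except (OSError, ValueError):
        return None
    # A negative value means the limit is indeterminate.
    return value if value >= 0 else None


def count_creatable_semaphores(cap=64):
    """Create semaphores until one fails or cap is reached; return the count."""
    held = []
    try:
        for _ in range(cap):
            held.append(multiprocessing.Semaphore())
    except (OSError, ImportError):
        # OSError: the system refused another semaphore.
        # ImportError: this build has no working sem_open support at all.
        pass
    count = len(held)
    del held[:]  # drop our references so the semaphores can be released
    return count


if __name__ == "__main__":
    print("reported limit:", posix_semaphore_limit())
    print("created before failure (capped):", count_creatable_semaphores())
```

If the count stops well below the cap, that would point at the semaphore
limit rather than at open files or processes.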

I wonder if it's possible to deduce if this started recently or not?
The web buildbot interface doesn't go back that far, and an additional
complexity is that the FreeBSD builds tend to have various errors
somewhat consistently over time, but perhaps there are server logs we
can grep for this particular error?
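
If such logs exist as plain files on the server, the search could be as
simple as the sketch below.  The directory layout is an assumption (I
don't know how the master stores its logs), and find_error_logs is a
hypothetical helper name.

```python
import os

NEEDLE = "Too many open files"


def find_error_logs(log_dir, needle=NEEDLE):
    """Return the sorted paths of files under log_dir that contain needle."""
    matches = []
    for root, _dirs, files in os.walk(log_dir):
        for fname in files:
            path = os.path.join(root, fname)
            try:
                # errors="replace" keeps odd bytes in logs from aborting the scan.
                with open(path, errors="replace") as fh:
                    if any(needle in line for line in fh):
                        matches.append(path)
            except OSError:
                # Unreadable file: skip it and keep going.
                continue
    return sorted(matches)
```

Sorting the matching paths by name or modification time would then give
a rough idea of when the error first appeared.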

Not sure if the best approach at this point is to see if the tests can
use fewer semaphores, skip these tests under FreeBSD 7 like 6, or if
it's important enough to compile a new kernel with a higher semaphore
limit.
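
For the skip option, something along these lines might do.  This is only
a sketch, not a proposed patch: it assumes sys.platform starts with
"freebsd7" on that buildbot, and the class and test names are
placeholders rather than anything in the real test suite.

```python
import sys
import unittest

# Assumption: the FreeBSD 7 buildbot reports a platform string of "freebsd7".
LIMITED_SEMAPHORES = sys.platform.startswith("freebsd7")


@unittest.skipIf(LIMITED_SEMAPHORES,
                 "POSIX semaphore limit is too low on FreeBSD 7")
class SemaphoreHeavyTests(unittest.TestCase):
    """Placeholder standing in for the tests that need many semaphores."""

    def test_placeholder(self):
        self.assertTrue(True)


if __name__ == "__main__":
    unittest.main()
```

A skip keyed on the platform string is blunt; probing the actual
semaphore limit at import time would be more precise but also more work.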

-- David

