[issue2901] "error: can't allocate region" from mmap() when receiving big chunk of data

Gregory P. Smith report at bugs.python.org
Tue May 27 02:10:06 CEST 2008


Gregory P. Smith <greg at krypto.org> added the comment:

First and foremost:  do not use XML for bulk data transport.  It is
HORRIBLY inefficient.
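
To put a rough number on the encoding overhead alone, here is a minimal
sketch. It assumes Python 3, where xmlrpclib is named xmlrpc.client; the
method name "upload" and the helper encoded_size() are made up for
illustration. It compares the size of a raw blob with the size of the
XML-RPC request that carries it as a Binary value.

```python
import os
import xmlrpc.client  # Python 3 name of the xmlrpclib module (assumption: Python 3)


def encoded_size(blob):
    """Return the byte length of an XML-RPC request carrying blob as Binary.

    "upload" is a hypothetical method name used only for this sketch.
    """
    request = xmlrpc.client.dumps((xmlrpc.client.Binary(blob),),
                                  methodname="upload")
    return len(request.encode("utf-8"))


blob = os.urandom(300000)
wire = encoded_size(blob)
print("raw bytes: %d, on the wire: %d, ratio: %.2f"
      % (len(blob), wire, wire / len(blob)))
```

Binary values are base64 encoded, so the request is always more than 4/3
the size of the raw data before any parsing cost is paid.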

I've been testing this on Linux and OS X with several Python builds
(trunk 2.6, release25-maint and 2.5.2).

I was never able to reproduce the malloc failures on my systems, testing
with data sizes up to 100 MB.  It likely takes a specific set of
conditions to reproduce exactly that problem, but I do understand how it
could happen.

One likely source of such problems was the long-lived, over-allocated
and realloc'ed strings in the socket module's _fileobject.recv() code.
This was "fixed" in release25-maint (to become 2.5.3), but that fix
caused a perf regression in other code.  The regression has since been
fixed in trunk, and that follow-up fix will be backported.  There is too
much history to sum up here; see http://bugs.python.org/issue2632 and
the older issues it links to for details.
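
As an illustration of the general idea only (this is not the patch from
issue2632, and recv_exactly() is a hypothetical helper, written for
Python 3), a receive loop can avoid holding one large over-allocated
buffer by collecting right-sized chunks in a list and joining them once
at the end:

```python
import socket


def recv_exactly(sock, nbytes, chunk_size=65536):
    """Read exactly nbytes from sock without growing one big string.

    Each recv() asks for at most chunk_size bytes, so no single buffer is
    allocated far larger than the data that actually arrives.
    """
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(min(chunk_size, remaining))
        if not chunk:
            raise ConnectionError(
                "peer closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    # One final copy into a buffer of exactly the right size.
    return b"".join(chunks)
```

The total is still held in memory once, but there is no repeated
realloc of a buffer sized for the whole request up front.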

I cannot claim that the socket change solves this problem, because the
bulk of the memory actually used is the XML parser's fault:

Instrumenting the SimpleXMLRPCServer do_POST code, I see the following:

The majority of the memory bloat to handle a request (bloat appears to
be 5-10x the size of the Binary data blob in question!) comes from the
XML parser called by xmlrpclib.loads() from SimpleXMLRPCServer's
_marshaled_dispatch() method.
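
A comparable measurement can be sketched outside the server. The code
below is an assumption-laden stand-in, not the instrumentation used
above: it targets Python 3 (xmlrpc.client instead of xmlrpclib), uses
tracemalloc to record peak allocation during loads(), and uses a made-up
method name "upload".  The ratio it prints will depend on the Python
version and is not expected to match the 5-10x figure exactly.

```python
import os
import tracemalloc
import xmlrpc.client  # Python 3 name of xmlrpclib (assumption: Python 3)


def peak_parse_memory(blob):
    """Return peak bytes traced by tracemalloc while parsing one request."""
    # Build the request before tracing starts so only parsing is measured.
    request = xmlrpc.client.dumps((xmlrpc.client.Binary(blob),),
                                  methodname="upload")
    tracemalloc.start()
    try:
        params, method = xmlrpc.client.loads(request)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    if params[0].data != blob:
        raise ValueError("round trip did not reproduce the blob")
    return peak


blob = os.urandom(1000000)
peak = peak_parse_memory(blob)
print("blob: %d bytes, peak while parsing: %d bytes, ratio: %.1f"
      % (len(blob), peak, peak / len(blob)))
```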

Why?  It's XML.  On top of that, it is not being parsed and decoded as a
stream.
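
For contrast, here is a rough sketch of what stream parsing and decoding
could look like.  It is not how xmlrpclib or SimpleXMLRPCServer work;
stream_base64_payload() is a hypothetical helper, written for Python 3,
that feeds expat in small chunks and writes decoded <base64> content to
a sink as it arrives, so neither the XML text nor the decoded blob has
to be held in memory in full.

```python
import base64
import io
import xml.parsers.expat


def stream_base64_payload(source, sink, chunk_size=65536):
    """Decode every <base64> element read from source into sink, in chunks."""
    parser = xml.parsers.expat.ParserCreate()
    state = {"inside": False, "pending": ""}

    def flush(final=False):
        # Drop line breaks, then decode only whole 4-character groups;
        # any leftover characters wait for the next call.
        text = "".join(state["pending"].split())
        usable = len(text) if final else len(text) - len(text) % 4
        sink.write(base64.b64decode(text[:usable]))
        state["pending"] = text[usable:]

    def start(name, attrs):
        if name == "base64":
            state["inside"] = True

    def end(name):
        if name == "base64":
            flush(final=True)
            state["inside"] = False

    def chars(data):
        if state["inside"]:
            state["pending"] += data
            flush()

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars

    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        parser.Parse(chunk, False)
    parser.Parse(b"", True)  # tell expat the document is complete


# Usage: in-memory streams stand in for a socket file and an output file.
payload = bytes(range(256)) * 8
document = (b"<value><base64>" + base64.encodebytes(payload)
            + b"</base64></value>")
out = io.BytesIO()
stream_base64_payload(io.BytesIO(document), out, chunk_size=100)
print(out.getvalue() == payload)
```

With this shape, memory use is bounded by chunk_size plus a few pending
characters rather than by the size of the request.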

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2901>
__________________________________

