I'm attaching a patch which generalizes the at-exit PYTHONMALLOCSTATS memory usage report, so that it's available in a regular build and can be triggered from Python, by calling:

This can be useful when debugging memory usage issues: a script can call the debug hook before and after certain activities, and the before/after amounts can be compared.
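A minimal sketch of that before/after comparison. It assumes the hook is reachable as sys._debugmallocstats() (the name it has in CPython 3.3 and later, which may not match the name in this patch) and that it writes its report to the C-level stderr. The helpers capture_malloc_stats, parse_stats and changed are hypothetical names made up for this illustration, and the parser only picks up lines of the form "label = integer", so it is not tied to any exact report layout.

```python
import os
import sys
import tempfile


def capture_malloc_stats():
    """Return the report the hook writes to the C-level stderr, as text."""
    hook = getattr(sys, "_debugmallocstats", None)  # assumed hook name
    if hook is None:
        return ""  # interpreter without the hook
    if sys.stderr is not None:
        sys.stderr.flush()
    saved_fd = os.dup(2)
    try:
        # Unbuffered temp file: the C code writes to fd 2 directly, so
        # replacing sys.stderr would not catch the report.
        with tempfile.TemporaryFile(buffering=0) as tmp:
            os.dup2(tmp.fileno(), 2)
            try:
                hook()
            finally:
                os.dup2(saved_fd, 2)  # always restore the real stderr
            tmp.seek(0)
            return tmp.read().decode("utf-8", errors="replace")
    finally:
        os.close(saved_fd)


def parse_stats(report):
    """Collect 'label = integer' lines from a report into a dict."""
    stats = {}
    for line in report.splitlines():
        label, sep, value = line.rpartition("=")
        digits = value.strip().replace(",", "")
        if sep and digits.isdigit():
            stats[label.strip()] = int(digits)
    return stats


def changed(before, after):
    """Map each label present in both reports to (old, new) if it moved."""
    return {
        label: (before[label], after[label])
        for label in before
        if label in after and before[label] != after[label]
    }


if __name__ == "__main__":
    before = parse_stats(capture_malloc_stats())
    payload = [object() for _ in range(100_000)]  # activity being measured
    after = parse_stats(capture_malloc_stats())
    for label, (old, new) in changed(before, after).items():
        print(f"{label}: {old} -> {new}")
```

Labels that embed a count of their own will differ between the two reports and so drop out of the comparison; for those, diffing the raw text of the two reports is the simpler option.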

The patch moves this arena-accounting code, which runs when a new arena is allocated:
     if (narenas_currently_allocated > narenas_highwater)
         narenas_highwater = narenas_currently_allocated;
out of the #ifdef PYMALLOC_DEBUG guard and into all PYMALLOC configurations, so that the high-water mark is available within the dump.  Given how much other work happens when a new arena is allocated, I believe the extra comparison has no measurable impact on performance.

The patch also changes _PyObject_DebugMallocStats() to take an arbitrary FILE* rather than assuming stderr; being able to choose the destination was handy for my original use case of debugging a web server.  This function is already marked with PyAPI_FUNC() but is not documented, and is currently only present in a debug build.

Tested with --with-pymalloc, --without-pymalloc, and --with-pydebug.

FWIW, Red Hat has been using a version of this patch in RHEL 5 as of RHEL 5.6 (http://rhn.redhat.com/errata/RHSA-2011-0027.html), and also in Fedora since September 2011 with python-2.7.2-15 and python3-3.2.2-6 (for the forthcoming Fedora 17).

