
Hello,

On Fri, 3 Apr 2020 08:44:23 +1100 Chris Angelico <rosuav@gmail.com> wrote:
On Fri, Apr 3, 2020 at 8:26 AM Paul Sokolovsky <pmiscml@gmail.com> wrote:
Hello,
On Wed, 1 Apr 2020 21:25:46 -0400 Kyle Stanley <aeros167@gmail.com> wrote:
Also, on the point of memory usage: I'd very much like to see some real side-by-side comparisons of the ``''.join(parts)`` memory usage across Python implementations compared to ``StringIO.write()``. Some were posted earlier in the thread, but the results were inaccurate since they relied entirely on ``sys.getsizeof()``, as mentioned earlier. IMO, having accurate memory benchmarks is critical to this proposal. As Chris Angelico mentioned, this can be observed by monitoring the RSS (or its equivalent on platforms without it) before and after.
I would still find that too crude an approach. If it came to that, I would prefer to actually study the internal implementation(s) in detail, and patch up sys.getsizeof() to provide actual information.
But it's actually accurate. With getsizeof, you're trying to gauge how much memory something consumes, but it can't acknowledge certain types of memory savings, nor can it recognize certain types of memory consumption. By asking your operating system how much memory you're using, you guarantee to see the actual figure. OTOH, this only works for fairly large allocations, but then again, if the memory cost doesn't actually impact the RSS, is it really a cost?
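To make the limitation of getsizeof mentioned above concrete, here is a minimal sketch. It does not measure RSS: it uses tracemalloc, which counts bytes requested through Python's allocators, as a portable stand-in for "ask something other than getsizeof". The helper names shallow_and_deep_size and traced_bytes are made up for this illustration, and the figures will differ between Python versions and implementations.

```python
import sys
import tracemalloc


def shallow_and_deep_size(parts):
    """Return (shallow, deep) sizes in bytes for a list of strings.

    shallow is sys.getsizeof() of the list object alone;
    deep adds sys.getsizeof() of every element on top of that.
    """
    shallow = sys.getsizeof(parts)
    deep = shallow + sum(sys.getsizeof(p) for p in parts)
    return shallow, deep


def traced_bytes(build):
    """Call build() and return (net traced bytes, result).

    The byte count is what tracemalloc still sees allocated once
    build() has returned. This is not RSS, only a portable proxy.
    """
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    result = build()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before, result


if __name__ == "__main__":
    traced, parts = traced_bytes(lambda: [str(i) * 10 for i in range(10000)])
    shallow, deep = shallow_and_deep_size(parts)
    print("getsizeof(list) only:", shallow)
    print("list plus elements:  ", deep)
    print("tracemalloc net:     ", traced)
```

The shallow figure covers only the list's own pointer array, so it misses the strings entirely. That is the kind of consumption getsizeof cannot see unless it is applied recursively.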
But not exactly. Let me humbly explain what's really a cost. It's looking at PyObject_HEAD https://swenson.github.io/python-xr/Include/object.h.html#line-78 (damn, that's Python2 source, stupid google), and seeing that it's at least:

    Py_ssize_t ob_refcnt; \
    struct _typeobject *ob_type;

That's 2 word-sized fields, 16 bytes on a 64-bit machine. You can dig further and further, and understand how much memory it takes to store such-and-such a structure (and how it could be done differently).

Now a couple of words about RSS. The "R" is there for a reason, and you should wonder what happens to memory that is not "R". Modern OSes are very modern, and nobody knows what they do with virtual memory, or at least they can't fix bugs where something should be "R" but is actually "V", for decades: https://bugzilla.kernel.org/show_bug.cgi?id=12309 (damn, now self-isolated from spam).

I hope the idea is clear: RSS is largely outside of your control, but the bytes you allocate in your source are (or should be).

[]

--
Best regards,
 Paul                          mailto:pmiscml@gmail.com
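The header arithmetic above (two word-sized fields, a refcount and a type pointer) can be checked from Python itself. This is only a sketch, assuming a CPython build. The helper name header_estimate is hypothetical, and some builds carry extra header fields, so the comparison to make is "at least", not "exactly".

```python
import struct
import sys


def header_estimate():
    """Size in bytes of a Py_ssize_t followed by a pointer.

    "n" is struct's code for ssize_t and "P" for void*; with native
    alignment this mirrors the two fields quoted from PyObject_HEAD.
    """
    return struct.calcsize("nP")


if __name__ == "__main__":
    print("two-field header estimate:", header_estimate())
    print("object.__basicsize__:     ", object.__basicsize__)
    print("sys.getsizeof(object()):  ", sys.getsizeof(object()))
```

On a given interpreter, object.__basicsize__ should come out no smaller than the estimate. Any difference is header overhead beyond the two fields quoted above.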