[Python-Dev] PEP 393 memory savings update
"Martin v. Löwis"
martin at v.loewis.de
Wed Sep 28 00:56:58 CEST 2011
I have redone my memory benchmark, and added a few new
counters.
The application is a very small Django application. The same
source code of the app and Django itself is used on all Python
versions. The full list of results is at
http://www.dcl.hpi.uni-potsdam.de/home/loewis/djmemprof/
Here are some excerpts:
A. 32-bit builds, storage for Unicode objects
3.x, 32-bit wchar_t: 6378540
3.x, 16-bit wchar_t: 3694694
PEP 393: 2216807
Unlike in the previous results, there are now significant
savings even relative to a narrow Unicode build.
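As a rough check on the figures in A, the relative savings can be
computed directly. This is only a sketch: the constant and function
names are illustrative, and the figures are assumed to be byte counts.

```python
# Figures copied from section A (assumed to be bytes).
WIDE_3X = 6378540     # 3.x, 32-bit wchar_t
NARROW_3X = 3694694   # 3.x, 16-bit wchar_t
PEP_393 = 2216807     # PEP 393

def savings_percent(baseline, new):
    """Return how much smaller `new` is than `baseline`, in percent."""
    return 100.0 * (baseline - new) / baseline

vs_wide = savings_percent(WIDE_3X, PEP_393)
vs_narrow = savings_percent(NARROW_3X, PEP_393)
print(f"vs. wide build:   {vs_wide:.1f}%")
print(f"vs. narrow build: {vs_narrow:.1f}%")
```

With these inputs the savings come out to roughly 65% against the
wide build and roughly 40% against the narrow build.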
B. 3.x, number of strings by maxchar:
ASCII: 35713 (1,300,000 chars)
Latin-1: 235 (11,000 chars)
BMP: 260 (700 chars)
other: 0
total: 36,000 (1,310,000 chars)
This explains why the savings from shortening ASCII objects
are significant in this application. I have no good intuition
for how this effect would show up in "real" applications. It may be
that the ASCII strings (in number and chars) grow
proportionally with the total number of strings; it may also
be that the majority of these strings are a certain fixed overhead
(resulting from Python identifiers and other interned strings).
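The maxchar buckets in B can be reproduced for any collection of
strings with a few lines of Python. This is a minimal sketch, not
the code used for the benchmark: how the benchmark enumerated its
strings is not shown here, and maxchar_class and tally are
hypothetical helper names.

```python
from collections import Counter

def maxchar_class(s):
    """Bucket a str by its largest code point, using the categories from B."""
    top = max(map(ord, s), default=0)  # empty strings are counted as ASCII
    if top < 0x80:
        return "ASCII"
    if top < 0x100:
        return "Latin-1"
    if top < 0x10000:
        return "BMP"
    return "other"

def tally(strings):
    """Count strings and characters per maxchar class."""
    counts, chars = Counter(), Counter()
    for s in strings:
        cls = maxchar_class(s)
        counts[cls] += 1
        chars[cls] += len(s)
    return counts, chars

# Illustrative input only; a real run would pass the application's strings.
sample = ["django", "L\xf6wis", "\u20ac100", ""]
counts, chars = tally(sample)
for cls in ("ASCII", "Latin-1", "BMP", "other"):
    print(cls, counts[cls], chars[cls])
```

Running the same tally at several application sizes would be one way
to see whether the ASCII share stays constant or behaves like a fixed
overhead.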
C. String-ish objects in 2.7 and 3.3-trunk:
                     2.x          3.x
#unicode             370       36,000
#bytes            43,000       14,000
#total            43,400       50,000
len(unicode)       5,300    1,306,000
len(bytes)     2,040,000      860,000
len(total)     2,046,000    2,200,000
(Note: the computations in the results are slightly messed up:
the number of bytes for bytes objects is actually the sum
of the lengths, not the sum of the sizeofs; this gets added
in the "total" lines to the sum of sizeofs of unicode strings,
which is nonsensical. The table above corrects this.)
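The difference between the two kinds of totals mentioned in the note
can be illustrated as follows. This is a sketch assuming CPython,
where sys.getsizeof includes the per-object header; the helper names
and the sample data are made up for illustration.

```python
import sys

def sum_of_lengths(objs):
    """Total payload length: counts only the characters or bytes stored."""
    return sum(len(o) for o in objs)

def sum_of_sizeofs(objs):
    """Total of sys.getsizeof: also includes each object's header overhead."""
    return sum(sys.getsizeof(o) for o in objs)

# Illustrative bytes objects, not data from the benchmark.
sample = [b"GET", b"/index.html", b"HTTP/1.1"]
length_total = sum_of_lengths(sample)
sizeof_total = sum_of_sizeofs(sample)
print(length_total, sizeof_total)
```

Because the two totals measure different things, adding a sum of
lengths for one type to a sum of sizeofs for another does not yield a
meaningful number.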
As you can see, Python 3 creates more string objects in total.
D. Memory consumption for 2.x, 3.x, PEP 393, accounting for both
unicode and bytes objects, using 32-bit builds and 32-bit
wchar_t:
2.x: 3,620,000 bytes
3.x: 7,750,000 bytes
PEP 393: 3,340,000 bytes
This suggests that PEP 393 actually reduces memory consumption
for strings below what 2.7 uses. This is offset, though, by "other"
(non-string) objects, which take 300 KB more in 3.x.
Regards,
Martin