[Python-Dev] Python startup time: String objects
Martin v. Löwis
martin at v.loewis.de
Wed Mar 24 15:19:22 EST 2004
At PyCon, I have been looking into Python startup time.
I found that CVS-Python allocates roughly 12,000 string objects on startup,
whereas Python 2.2 only allocates 8,000 string objects. In either case,
most strings come from unmarshalling string objects, and the increase is
(probably) due to the increased number of modules loaded at startup
(up from 26 to 34).
The string objects allocated during unmarshalling are often discarded
almost immediately: they are identifiers and get interned, so only the
interned version of the string survives, and the freshly unmarshalled
copy is deallocated.
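
The effect described above can be illustrated with a small sketch. It
assumes a current CPython, where the interning function is sys.intern
(in the Python of this thread it was the builtin intern); the string
"startup" is only an example value.

```python
import sys

# Build two equal strings at run time, the way the unmarshaller creates
# a fresh string object for every identifier it reads.
first_copy = "".join(["start", "up"])
second_copy = "".join(["start", "up"])

# Interning maps every equal string to one canonical object.
canonical_first = sys.intern(first_copy)
canonical_second = sys.intern(second_copy)

# Both names now refer to the single interned object; any fresh copy
# that is not the canonical one becomes garbage once nothing else
# references it.
same_object = canonical_first is canonical_second
```

Every identifier that is already interned therefore costs one
allocation and one deallocation that serve no lasting purpose.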
I'd like to change the marshal format to perform sharing of equal
strings, instead of marshalling the same identifiers multiple times.
To do so, a dictionary of strings would be created on marshalling and
a list on unmarshalling, and a new marshal code for string
back-references would be added.
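
A minimal sketch of the idea, not the actual marshal implementation:
it uses a toy byte format that holds only a flat sequence of strings.
The names dump_strings and load_strings, the type-code letters, and the
4-byte little-endian length/index fields are all assumptions made for
illustration.

```python
import io
import struct

TYPE_STRING = b"s"     # string written in full (illustrative code letter)
TYPE_STRINGREF = b"R"  # hypothetical back-reference to an earlier string


def dump_strings(strings):
    """Serialize strings, writing each distinct string in full only once."""
    out = io.BytesIO()
    seen = {}  # dictionary used on marshalling: string -> index
    for s in strings:
        if s in seen:
            # Already written: emit only the index of the first occurrence.
            out.write(TYPE_STRINGREF + struct.pack("<i", seen[s]))
        else:
            seen[s] = len(seen)
            data = s.encode("utf-8")
            out.write(TYPE_STRING + struct.pack("<i", len(data)) + data)
    return out.getvalue()


def load_strings(payload):
    """Rebuild the sequence, reusing one object per distinct string."""
    inp = io.BytesIO(payload)
    table = []  # list used on unmarshalling: index -> string
    result = []
    while True:
        code = inp.read(1)
        if not code:
            break
        (n,) = struct.unpack("<i", inp.read(4))
        if code == TYPE_STRINGREF:
            result.append(table[n])  # no new string object is allocated
        elif code == TYPE_STRING:
            s = inp.read(n).decode("utf-8")
            table.append(s)
            result.append(s)
        else:
            raise ValueError("unknown type code %r" % code)
    return result
```

With names = ["self", "name", "self", "self", "name"], only two string
objects are created on loading, and the three repeated entries are
written as 5-byte back-references instead of full 9-byte strings.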
What do you think?