[Guido]
... Because these literals look like identifiers, they are interned, so the unpickled data structure shares the string references -- while the original test data has 10,000 copies of each string!
If we really want this as a feature, a call to PyString_InternFromString() could be made under certain conditions in load_short_binstring() (e.g. when the length is at most 10 and all_name_chars() from compile.c returns true).
I'm not sure that this is a desirable feature though.
I hope Oren resumes his crusade to make interned strings follow the same refcount rules as everything else, and then we wouldn't have this fear of interning. BTW, nobody yet has reported any code where "indirect interning" pays -- or even triggers once in a non-eating-its-own-tail way.