[Python-Dev] undesirable unpickle behavior, proposed fix
Jake McGuire
jake at youtube.com
Tue Jan 27 21:25:02 CET 2009
On Jan 27, 2009, at 11:40 AM, Martin v. Löwis wrote:
>> Hm. This would change the pickling format though. Wouldn't just
>> interning (short) strings on unpickling be simpler?
>
> Sure - that's what Jake had proposed. However, it is always difficult
> to select which strings to intern - his heuristic (IIUC) is to intern
> all strings that appear as dictionary keys. Whether this is good
> enough, I don't know. In particular, it might intern very large
> strings that aren't identifiers at all.
I may have misunderstood how unpickling works, but I believe that my
patch only interns strings that are keys in a dictionary used to
populate an instance. This is very similar to how instance creation
and modification work in Python now. The only difference is that if you
set an attribute via "inst.__dict__['attribute_name'] = value", then
'attribute_name' will not be automatically interned, but if you pickle
the instance, 'attribute_name' will be interned on unpickling.
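
A minimal sketch of the idea, not the patch itself: the names
intern_state_keys, restore, make_key and Point are hypothetical and
exist only for illustration. It assumes the state dict is the one used
to populate an instance, and it interns only that dict's str keys via
sys.intern, leaving the values and all other strings untouched.

```python
import sys


def intern_state_keys(state):
    """Return a copy of `state` whose exact-str keys are interned.

    Hypothetical helper: only the keys of the dict that populates an
    instance are interned, never the values or other strings.
    """
    return {
        (sys.intern(k) if type(k) is str else k): v
        for k, v in state.items()
    }


class Point:
    pass


def restore(cls, state):
    """Rebuild an instance from a state dict, roughly as unpickling would."""
    inst = cls.__new__(cls)
    inst.__dict__.update(intern_state_keys(state))
    return inst


def make_key():
    # Built at runtime so the compiler does not intern a literal for us.
    return "".join(["attribute", "_name"])


p1 = restore(Point, {make_key(): 1})
p2 = restore(Point, {make_key(): 2})

key1 = next(iter(p1.__dict__))
key2 = next(iter(p2.__dict__))
print(key1 is key2)  # both instances share one interned key object
```

Without the sys.intern call, each restored instance would hold its own
copy of the key string, which is where the extra memory goes when many
instances are unpickled.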
There may be cases where users specifically go through __dict__ to
avoid interning attribute names, but I would be surprised to hear
about it and very interested in talking to the person who did that.
Creating a new pickle protocol to handle this case seems excessive...
-jake