On 4/22/2019 7:27 PM, Steve Dower wrote:

On 22Apr2019 1921, Steve Dower wrote:

On 22Apr2019 1822, Glenn Linderman wrote:

Inada is now proposing a way to allow the coder to suggest a group of dictionaries that might benefit from the same gains, by preclassifying non-__dict__ slot dictionaries to do similar sharing.

The CSV reader is a prime candidate, because it creates groups of dicts that all use the same keys (the column names). I have other code that does similar things and would get similar benefits.
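To make the CSV case concrete, here is a minimal sketch using the standard library's csv.DictReader. The sample data and variable names are made up for illustration. It only shows that every row dict carries the same keys, not how CPython stores them.

```python
import csv
import io

# Hypothetical sample data; any CSV file with a header row behaves the same way.
data = io.StringIO("name,qty\napple,3\npear,5\n")

# DictReader yields one dict per data row, keyed by the column names.
rows = list(csv.DictReader(data))

# Collect the distinct key tuples seen across all rows.
key_sets = {tuple(row) for row in rows}
print(key_sets)
```

With many rows, each dict repeats the same set of keys, which is the pattern a key-sharing scheme would target.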

Since it is just an interface to existing builtin code, it seems the one interface function (or dictionary factory class) could just as well be a builtin, instead of requiring an import.

Sounds like an optimisation similar to what sys.intern() is for strings.
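For readers less familiar with the analogy, a short sketch of what sys.intern() does: two equal strings built separately become one shared object once both are interned. The strings are built at runtime here so that the result does not depend on compile-time constant folding.

```python
import sys

# Build two equal strings at runtime from different pieces.
a = sys.intern("".join(["col", "umn_", "name"]))
b = sys.intern("".join(["col", "umn", "_name"]))

# After interning, both names refer to the same string object.
print(a == b, a is b)
```

As with interning, the caller opts in explicitly; the language semantics stay the same and only the memory behaviour changes.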

I see no reason to try and avoid an import here - it's definitely a special-case situation - but otherwise having a function to say "clone and update this dict" that starts by sharing the keys in the same way that __dict__ does (including the transformation when necessary) seems like an okay addition. Maybe copy() could just be enabled for this?

Or possibly just "dict(existing_dict).update(new_items)".
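One caveat on that spelling: dict.update() returns None, so the one-liner as written discards the new dict. A minimal sketch of the "clone and update" semantics follows; clone_and_update is a hypothetical helper name, not an existing or proposed API. Whether the clone shares its key table with the original is an implementation detail that this code neither requires nor demonstrates.

```python
def clone_and_update(existing_dict, new_items):
    """Hypothetical helper: copy existing_dict, then apply new_items to the copy."""
    clone = dict(existing_dict)   # shallow copy with the same keys, in the same order
    clone.update(new_items)       # update() mutates in place and returns None
    return clone


template = {"name": None, "qty": None}
row = clone_and_update(template, {"name": "apple", "qty": 3})

# The chained form evaluates to None, because update() returns None.
chained = dict(template).update({"name": "pear"})
print(row, chained)
```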

My primary concern is still to avoid making CPython performance characteristics part of the Python language definition. That only makes it harder for alternate implementations. (Even though I was out-voted last time on this issue since all the publicly-known alternate implementations said it would be okay... I'm still going to put in a vote for avoiding new language semantics for the sake of a single runtime's performance characteristics.)

While Inada's suggested DictBuilder interface was immediately obvious, I don't get how either copy or update would achieve the goal. Perhaps you could explain? In particular, what would be the trigger that makes dict() choose to create a shared-key dictionary from the start? Unless it is known, when the first one is created, that there will be lots of (mostly static) dictionaries with the same set of keys, wouldn't creating a shared-key dictionary in every case cause inefficiencies later, when they have to be converted as additional items are added? (I'm assuming, without knowing, that a single shared-key dictionary is less efficient than a single regular dictionary.)
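For context on the mechanism under discussion: CPython already shares keys among the instance __dict__s of a class (PEP 412). The sketch below is a rough way to look at that on one's own interpreter. Row is a made-up class, and the sizes reported by sys.getsizeof() vary by CPython version and are not guaranteed on other implementations, so no expected numbers are given.

```python
import sys


class Row:
    """Made-up class; every instance gets the same attribute names."""

    def __init__(self, name, qty):
        self.name = name
        self.qty = qty


r1 = Row("apple", 3)
r2 = Row("pear", 5)
plain = {"name": "apple", "qty": 3}

# Same contents, but the instance __dict__ may use a key table shared
# with other Row instances, while the literal dict owns its own keys.
instance_size = sys.getsizeof(r1.__dict__)
plain_size = sys.getsizeof(plain)
print(instance_size, plain_size)
```

This only measures memory for one snapshot; it says nothing about the cost of converting a shared-key dict when one instance later gains extra keys, which is the question raised above.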