[Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered

Mon Sep 12 12:27:14 EDT 2016

For the regular dict (non kwargs or namespace __dict__) use case I would
actually like to *see disorder preserved during iteration*.

If we don't, we will eventually to find ourselves in a similar state we
were in pre hash-randomization:
 (1) Over time, code will come to depend on the order for no good reason.
Especially true of tests. This greatly increases the engineering barrier
when trying to move a codebase between Python versions or Python VMs.

The underlying implementation is free to preserve order (as it now does,
great work!) but I think the behavior of iteration when an ordered type was
not explicitly requested or ordered iteration was not explicitly requested
should be perturbed in order to maintain long term code health.

Disorder for this purpose need not be a random shuffle (overkill). It just
needs to be regularly inconsistent. A simple thing to do on top of 3.6's
new dict implementation would be to pick a random starting point within the
order array rather than offset 0 to start iteration from. That small change
would be sufficient to guarantee that code depending on order must ask for
order. It could even allow us to get people ready for iteration within the
same process to become unstable.

Maybe I worry too much. Having slogged through fixing problems to enable
hash randomization on a code base of tens of millions of lines in 2012...
there is a lot of value in enforcing disorder where none is intended to be
guaranteed. Explicit is better than implicit.

-gps

On Mon, Sep 12, 2016 at 5:37 AM Victor Stinner <victor.stinner at gmail.com>
wrote:

> 2016-09-12 13:50 GMT+02:00 Antoine Pitrou <solipsis at pitrou.net>:
> > Besides, I don't think it has been proven that the compact-and-ordered
> > dict implementation is actually *faster* than the legacy one.
>
> Python 3.6 dict is slower than Python 3.5 dict, at least for a simple
> lookup:
> http://bugs.python.org/issue27350#msg275581
>
> But its memory usage is 25% smaller.
>
> I'm curious about the performance of the "compaction" needed after
> adding too many dummy entries (and to preserve insertion order), but I
> don't know how to benchmark this :-) Maybe add/remove many new keys? I
> expect bad performance on the compaction, but maybe not as bad as the
> "hash DoS".
>
> For regular Python code, I don't expect compaction to be a common
> operation, since it's rare to remove attributes. It's more common to
> modify attributes value, than to remove them and later add new
> attributes.
>
> Victor
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/greg%40krypto.org
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20160912/517d6b8e/attachment.html>