[Python-ideas] RFC: PEP: Add dict.__version__

Gregory P. Smith greg at krypto.org
Mon Jan 11 17:57:14 EST 2016


On Sat, Jan 9, 2016 at 4:09 AM M.-A. Lemburg <mal at egenix.com> wrote:

> On 09.01.2016 10:58, Victor Stinner wrote:
> > 2016-01-09 9:57 GMT+01:00 Serhiy Storchaka <storchaka at gmail.com>:
> >>>> This also can be used for better detecting dict mutating during
> >>>> iterating:
> >>>> https://bugs.python.org/issue19332.
> >> (...)
> >>
> >> This makes Raymond's objections even more strong.
> >
> > Raymond has two major objections: memory footprint and performance. I
> > opened an issue with a patch implementing dict__version__ and I ran
> > pybench:
> > https://bugs.python.org/issue26058#msg257810
> >
> > pybench doesn't seem reliable: microbenchmarks on dict seems faster
> > with the patch, it doesn't make sense. I expect worse or same
> > performance.
> >
> > With my own timeit microbenchmarks, I don't see any slowdown with the
> > patch. For an unknown reason (it's really strange), dict operations
> > seem even faster with the patch.
>
> This can well be caused by a better memory alignment, which
> depends on the CPU you're using.
>
> > For the memory footprint, it's clearly stated in the PEP that it adds
> > 8 bytes per dict (4 bytes on 32-bit platforms). See the "dict subtype"
> > section which explains why I proposed to modify directly the dict
> > type.
>
> Some questions:
>
> * How would the implementation deal with wrap around of the
>   version number for fast changing dicts (esp. on 32-bit platforms) ?
>
> * Given that this is an optimization and not meant to be exact
>   science, why would we need 64 bits worth of version information ?
>
>   AFAIK, you only need the version information to be able to
>   answer the question "did anything change compared to last time
>   I looked ?".
>
>   For an optimization it's good enough to get an answer "yes"
>   for slow changing dicts and "no" for all other cases. False
>   negatives don't really hurt. False positives are not allowed.
>
>   What you'd need to answer the question is a way for the
>   code in need of the information to remember the dict
>   state and then later compare it's remembered state
>   with the now current state of the dict.
>
>   dicts could do this with a 16-bit index into an array
>   of state object slots which are set by the code tracking
>   the dict.
>
>   When it's time to check, the code would simply ask for the
>   current index value and compare the state object in the
>   array with the one it had set.
>

Given it is for optimization only with the fallback slow path being to do
an actual dict lookup, we could implement this using a single bit.

Every modification sets the bit.  There exists an API to clear the bit and
to query the bit.  Nothing else is needed.  The bit could be stored
creatively to avoid increasing the struct size, though ABI compatibility
may prevent that...


>
> * Wouldn't it be possible to use the hash array itself to
>   store the state index ?
>
>   We could store the state object as regular key in the
>   dict and filter this out when accessing the dict.
>
>   Alternatively, we could try to use the free slots for
>   storing these state objects by e.g. declaring a free
>   slot as being NULL or a pointer to a state object.
>
> --
> Marc-Andre Lemburg
> eGenix.com
>
> Professional Python Services directly from the Experts (#1, Jan 09 2016)
> >>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
> >>> Python Database Interfaces ...           http://products.egenix.com/
> >>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
> ________________________________________________________________________
>
> ::: We implement business ideas - efficiently in both time and costs :::
>
>    eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
>     D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
>            Registered at Amtsgericht Duesseldorf: HRB 46611
>                http://www.egenix.com/company/contact/
>                       http://www.malemburg.com/
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20160111/60595e93/attachment-0001.html>


More information about the Python-ideas mailing list