
On 09.01.2016 10:58, Victor Stinner wrote:
2016-01-09 9:57 GMT+01:00 Serhiy Storchaka <storchaka@gmail.com>:
This also can be used for better detecting dict mutating during iterating: https://bugs.python.org/issue19332. (...)
This makes Raymond's objections even more strong.
Raymond has two major objections: memory footprint and performance. I opened an issue with a patch implementing dict__version__ and I ran pybench: https://bugs.python.org/issue26058#msg257810
pybench doesn't seem reliable: microbenchmarks on dict seems faster with the patch, it doesn't make sense. I expect worse or same performance.
With my own timeit microbenchmarks, I don't see any slowdown with the patch. For an unknown reason (it's really strange), dict operations seem even faster with the patch.
This can well be caused by a better memory alignment, which depends on the CPU you're using.
For the memory footprint, it's clearly stated in the PEP that it adds 8 bytes per dict (4 bytes on 32-bit platforms). See the "dict subtype" section which explains why I proposed to modify directly the dict type.
Some questions: * How would the implementation deal with wrap around of the version number for fast changing dicts (esp. on 32-bit platforms) ? * Given that this is an optimization and not meant to be exact science, why would we need 64 bits worth of version information ? AFAIK, you only need the version information to be able to answer the question "did anything change compared to last time I looked ?". For an optimization it's good enough to get an answer "yes" for slow changing dicts and "no" for all other cases. False negatives don't really hurt. False positives are not allowed. What you'd need to answer the question is a way for the code in need of the information to remember the dict state and then later compare it's remembered state with the now current state of the dict. dicts could do this with a 16-bit index into an array of state object slots which are set by the code tracking the dict. When it's time to check, the code would simply ask for the current index value and compare the state object in the array with the one it had set. * Wouldn't it be possible to use the hash array itself to store the state index ? We could store the state object as regular key in the dict and filter this out when accessing the dict. Alternatively, we could try to use the free slots for storing these state objects by e.g. declaring a free slot as being NULL or a pointer to a state object. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 09 2016)
Python Projects, Coaching and Consulting ... http://www.egenix.com/ Python Database Interfaces ... http://products.egenix.com/ Plone/Zope Database Interfaces ... http://zope.egenix.com/
::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/