
2016-01-09 13:09 GMT+01:00 M.-A. Lemburg <mal@egenix.com>:
* How would the implementation deal with wrap around of the version number for fast changing dicts (esp. on 32-bit platforms) ?
Let me try to do some maths. haypo@selma$ python3 -m timeit 'd={}' 'for i in range(2**16): d[i]=i' 100 loops, best of 3: 7.01 msec per loop haypo@selma$ python3 Python 3.4.3 (default, Jun 29 2015, 12:16:01)
t=7.01e-3 / 2**16 t*1e9 106.964111328125
It looks like __setitem__() takes 107 in average. I guess that the number depends a lot on the dictionary size, the number of required resize (rehash all keys), etc. But well, it's just to have an estimation.
print(datetime.timedelta(seconds=2**32 * t)) 0:07:39.407360
With a 32-bit version, less than 8 minutes are enough to hit the integer overflow if each dict operation changes the dict version and you modify a dict in a loop.
print(2016 + datetime.timedelta(seconds=2**64 * t) / datetime.timedelta(days=365.25)) 64541.02049400773
With a 64-bit version, the situation is very different: the next overflow will not occur before the year 64 541 :-) Maybe it's worth to use a 64-bit version on 32-bit platforms? Python 3.5 already uses a 64-bit integer on 32-bit platforms to store a timestamp in the private "pytime" API. Guard has only a bug on integer overflow if the new version modulo 2^32 (or modulo 2^64) is equal to the old version. The bet is also that it's "unlikely".
* Given that this is an optimization and not meant to be exact science, why would we need 64 bits worth of version information ?
If a guard says that nothing changes where something changes, it is a real issue for me. It means that the optimization changes the Python semantic.
For an optimization it's good enough to get an answer "yes" for slow changing dicts and "no" for all other cases. False negatives don't really hurt. False positives are not allowed.
False negative means that you loose the optimization. It would be annoying to see server performance degrades after N days before of an integer overflow :-/ It can be a big issue. How do you choose the number of servers if performances are not stable? Victor