Re: [Python-ideas] RFC: PEP: Add dict.version

Jan. 9, 2016

      On Sun, Jan 10, 2016 at 12:21 AM, Steven D'Aprano <steve@pearwood.info> wrote:
...
On Sat, Jan 09, 2016 at 01:09:13PM +0100, M.-A. Lemburg wrote:
...
* Given that this is an optimization and not meant to be exact
  science, why would we need 64 bits worth of version information ?
AFAIK, you only need the version information to be able to
  answer the question "did anything change compared to last time
  I looked ?".
For an optimization it's good enough to get an answer "yes"
  for slow changing dicts and "no" for all other cases.
I don't understand this. The question has nothing to do with
how quickly or slowly the dict has changed, but only on whether or not
it actually has changed. Maybe your dict has been stable for three
hours, except for one change; or it changes a thousand times a second.
Either way, it has still changed.
...
False
  negatives don't really hurt. False positives are not allowed.
I think you have this backwards. False negatives potentially will
introduce horrible bugs. A false negative means that you fail to notice
when the dict has changed, when it actually has. ("Has the dict
changed?" "No.") The result of that will be to apply the optimization
when you shouldn't, and that is potentially catastrophic (the entirely
wrong function is mysteriously called).
A false positive means you wrongly think the dict has changed when it
hasn't. ("Has the dict changed?" "Yes.") That's still bad, because you
miss out on the possibility of applying the optimization when you
actually could have, but it's not so bad. So false positives (wrongly
thinking the dict has changed when it hasn't) can be permitted, but
false negatives shouldn't be.
I think we're getting caught in terminology a bit. The original
question was "why a 64-bit counter". Here's my take on it:

* If the dict has changed but we say it hasn't, this is a critical
failure. M-A L called this a "false positive", which works if the
question is "may we use the optimized version".

* If the dict has changed exactly N times since it was last checked,
where N is the integer wrap-around period of the counter, a naive
counter comparison will show that it has not changed.

Consequently, a small counter is more problematic than a large one. If
the counter has 2**8 states, then collisions will be frequent, and
that would be bad. If it has 2**32 states, then a slow-changing dict
will last longer than any typical run of a program (if it changes,
say, once per second, you get over a century of uptime before it's a
problem), but a fast-changing dict could run into issues (change every
millisecond and you'll run into trouble after a couple of months). A
64-bit counter could handle ridiculously fast mutation (say, every
nanosecond) for a ridiculously long time (hundreds of years).

That's the only way that fast-changing and slow-changing have any meaning.

ChrisA

Re: [Python-ideas] RFC: PEP: Add dict.__version__

Chris Angelico

Re: [Python-ideas] RFC: PEP: Add dict.version