[Python-Dev] Intricacies of calling __eq__
Steven D'Aprano
steve at pearwood.info
Wed Mar 19 02:09:49 CET 2014
On Tue, Mar 18, 2014 at 04:42:29PM -0700, Kevin Modzelewski wrote:
> My 2 cents: it feels like a slippery slope to start guaranteeing the number
> and ordering of calls to comparison functions -- for instance, doing that
> for the sort() function would lock in the sort implementation.
Although I agree with your conclusion, I'm not so sure I agree with the
way you reach that conclusion. (Actually, I'm not even sure I agree with
my own reasoning!)
The problem here isn't that Maciej wants to change the implementation of
some method or function, like sort. The problem is that (as I understand
it) Maciej wants a blessing to change the semantics of multiple Python
statements.
Currently, the code:
    if key in dict:
        return dict[key]
performs two dictionary lookups. If you read the code, you can see the
two lookups: "key in dict" performs a lookup, and "dict[key]" performs a
lookup. Sorry to belabour the obvious, but this gets to the core of the
matter. Often this won't matter, but each lookup involves a call to
__hash__ and some variable number of calls to __eq__. (Again, as I
understand it) Maciej wants to optimize away the second lookup, so that
even if you write code like the above, what actually gets executed
(modulo guards for modifications to dicts, multiple threads running,
etc.) is very different, closer to this in semantics:
    try:
        _temp = dict[key]
    except KeyError:
        pass
    else:
        return _temp
I'm not suggesting that PyPy actually will translate the code exactly
like this, only that this will be the semantics. The critical point here
is that in the Python code you write, there are two *separate* lookups,
but in the code that is actually executed, there is only one.
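To make the difference between the two forms observable, here is a minimal sketch. CountingKey, two_lookups and one_lookup are hypothetical names invented for this illustration, and the counts it prints are what one would expect from CPython's dict; another implementation is free to differ, which is rather the point of this thread.

```python
# Hypothetical key type that counts calls to its special methods
# (illustration only, not part of any library).
class CountingKey:
    hash_calls = 0
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        CountingKey.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        CountingKey.eq_calls += 1
        return isinstance(other, CountingKey) and self.value == other.value


def reset_counts():
    CountingKey.hash_calls = 0
    CountingKey.eq_calls = 0


def two_lookups(d, key):
    # As written: "key in d" is one lookup, "d[key]" is another.
    if key in d:
        return d[key]
    return None


def one_lookup(d, key):
    # The single-lookup semantics described above.
    try:
        _temp = d[key]
    except KeyError:
        return None
    else:
        return _temp


d = {CountingKey("spam"): 1}
probe = CountingKey("spam")  # equal to the stored key, but a distinct object

reset_counts()
result_two = two_lookups(d, probe)
hashes_two, eqs_two = CountingKey.hash_calls, CountingKey.eq_calls

reset_counts()
result_one = one_lookup(d, probe)
hashes_one, eqs_one = CountingKey.hash_calls, CountingKey.eq_calls

print("two lookups:", hashes_two, "hash calls,", eqs_two, "eq calls")
print("one lookup: ", hashes_one, "hash calls,", eqs_one, "eq calls")
```

Both functions return the same value, so the only way to tell them apart is through side effects of __hash__ and __eq__, such as the counters used here.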
Maciej, is my analysis of what you are doing correct?
Although I have tentatively said I think this is okay, it is a change in
actual semantics of Python code: what you write is no longer what gets
run. That makes this *very* different from changing the implementation
of sort -- by analogy, it's more like changing the semantics of
    a = f(x) + f(x)
to only call f(x) once. I don't think you would call that an
implementation detail, would you? Even if justified -- f(x) is a pure,
deterministic function with no side-effects -- it would still be a
change to the high-level behaviour of the code.
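The analogy can be sketched as follows, assuming a hypothetical f that records how often it is called (the counter and the "optimized" form are illustrations, not anything a real implementation is known to do):

```python
calls = 0


def f(x):
    # Hypothetical function with an observable side effect:
    # it counts how many times it has been called.
    global calls
    calls += 1
    return x * 2


x = 21

calls = 0
a = f(x) + f(x)      # as written: f is called twice
calls_as_written = calls

calls = 0
_temp = f(x)         # hypothetical rewritten form: f is called once
b = _temp + _temp
calls_optimized = calls

print(a, b, calls_as_written, calls_optimized)
```

The results a and b agree, but the call counts do not, so any code that can observe the calls can tell the two versions apart.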
Since this proposal is limited only to built-in dicts in scenarios where
they cannot be modified between the two lookups, I think that it will be
okay. There's no language guarantee as to the number of times that
__eq__ will be called (although I would be surprised if __hash__
isn't called twice). But I worry that I have missed something.
--
Steven