<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Dec 30, 2016 at 9:21 PM, Ethan Furman <span dir="ltr"><<a href="mailto:ethan@stoneleaf.us" target="_blank">ethan@stoneleaf.us</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div id=":8ek" class="a3s aXjCH m15952af578235f10">I don't think so.  As someone else said, a hash can be calculated once and then cached, but __eq__ has to be called every time.  Depending on the clustering of your data that could be quick... or not.</div></blockquote><div><br></div><div>__eq__ only has to be called when a hash bucket is non-empty. In that case, it may be O(n) in pathological cases, but it could also be O(1) every time. On the other hand, __hash__ has to be called on every lookup, is O(n) on the first call, and ideally O(1) after. I'd expect that __eq__ may often not dominate, and avoiding an unnecessary large tuple allocation on the first __hash__ call could be helpful. If a __hash__ implementation can avoid creating an arbitrarily large tuple unnecessarily and still perform well, why not save the space? If the resulting hash distribution is worse, that's another story, but it seems worth documenting one way or the other, since the current docs give memory-conscious Python programmers pause for this use case.</div></div></div></div>