On 7/27/20 5:00 PM, Christopher Barker wrote:
I guess this is the part I find confusing:
when (and why) does __eq__ play a role?
__eq__ is the final authority on whether two objects are equal. The default __eq__ punts and uses identity.
On Mon, Jul 27, 2020 at 12:01 PM Ethan Furman wrote:
However, not all objects with equal hashes compare equal themselves.
That's the one I find confusing -- why is it not "bad" for two objects with the same hash (the 42 example above) to not be equal? That seems like it would be very dangerous. Is this because it's possible, if very unlikely, for ANY hash algorithm to create the same hash for two different inputs? So equality always has to be checked anyway?
Well, there are a finite number of integers to be used as hashes, and potentially many more than that number of objects needing to be hashed. So, yes, hashes can (and will) be shared, and equality must be checked also.
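A minimal sketch of that point, using a made-up class `Answer` whose `__hash__` always returns 42 (the class and its attribute names are invented here for illustration). Every instance shares one hash, yet a dict still keeps them apart because equality is checked as well:

```python
class Answer:
    """Hypothetical class: every instance hashes to 42."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # deliberately poor: all instances collide

    def __eq__(self, other):
        # Equality, not the hash, decides whether two objects match.
        return isinstance(other, Answer) and self.name == other.name


a, b = Answer("a"), Answer("b")
d = {a: "first", b: "second"}

same_hash = hash(a) == hash(b)   # the hashes collide
still_distinct = len(d) == 2     # __eq__ kept the two keys apart
found = d[Answer("b")]           # lookup by an equal key with the same hash
```

A constant hash is legal but slow: every lookup degrades to comparing against each stored key in turn.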
For example, if a hash algorithm decided to use short names, then a group of people might be sorted like this:
Bob: Bob, Robert
Chris: Christopher, Christine, Christian, Christina
Ed: Edmund, Edward, Edwin, Edwina
So if somebody draws a name from a hat:
You apply the hash to it:
Ignore the Bob and Ed buckets, then use equality checks on the Chris names to find the right one.
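The name example can be sketched roughly like this. The `SHORT_NAME` table, the helper names, and the name being looked up are all assumptions made up for the illustration; no real hash function works from a lookup table like this:

```python
# Hypothetical "short name" hash: this mapping is invented for illustration.
SHORT_NAME = {
    "Bob": "Bob", "Robert": "Bob",
    "Christopher": "Chris", "Christine": "Chris",
    "Christian": "Chris", "Christina": "Chris",
    "Edmund": "Ed", "Edward": "Ed", "Edwin": "Ed", "Edwina": "Ed",
}


def short_name_hash(name):
    """Return the bucket label (the 'hash') for a full name."""
    return SHORT_NAME[name]


# Sort everybody into buckets keyed by their short name.
buckets = {}
for name in SHORT_NAME:
    buckets.setdefault(short_name_hash(name), []).append(name)


def find(name):
    """Look only in the matching bucket, using == on each candidate."""
    comparisons = 0
    for candidate in buckets.get(short_name_hash(name), []):
        comparisons += 1
        if candidate == name:
            return candidate, comparisons
    return None, comparisons


# "Christina" is an arbitrary pick for the name drawn from the hat.
match, comparisons = find("Christina")
```

Only the names in the Chris bucket are ever compared; the six names in the Bob and Ed buckets are never touched.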
From a practical standpoint, think of dictionaries:
(that's the trick here -- you can't "get" this without knowing something about the implementation details of dicts.)
Depends on the person -- I always do better with a concrete application.
- objects are sorted into buckets based on their hash
- any one bucket can have several items with equal hashes
Is this mostly because there are many more possible hashes than buckets?
- those several items (obviously) will not compare equal
So the hash is a fast way to put stuff in buckets, so you only need to compare with the others that end up in the same bucket?
- get the hash of the object
- find the bucket that would hold that hash
- find the already stored objects with the same hash
- use __eq__ on each one to find the match
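Those lookup steps can be sketched as a toy hash table. `ToyDict` and its fixed bucket count are assumptions for illustration only; the real dict uses a different, more compact layout, but the hash-then-equality logic is the same idea:

```python
class ToyDict:
    """Toy hash table with a fixed number of buckets (not CPython's layout)."""

    def __init__(self, n_buckets=8):
        self._buckets = [[] for _ in range(n_buckets)]

    def _bucket_for(self, key_hash):
        # Many different hashes map onto the same bucket.
        return self._buckets[key_hash % len(self._buckets)]

    def __setitem__(self, key, value):
        key_hash = hash(key)
        bucket = self._bucket_for(key_hash)
        for entry in bucket:
            if entry[0] == key_hash and entry[1] == key:
                entry[2] = value          # existing key: overwrite
                return
        bucket.append([key_hash, key, value])

    def __getitem__(self, key):
        key_hash = hash(key)                  # get the hash of the object
        bucket = self._bucket_for(key_hash)   # find the bucket for that hash
        for stored_hash, stored_key, value in bucket:
            # Cheap hash comparison first, then __eq__ to confirm the match.
            if stored_hash == key_hash and stored_key == key:
                return value
        raise KeyError(key)


toy = ToyDict()
toy[1] = "one"
toy[9] = "nine"   # 9 % 8 == 1 % 8, so both keys land in the same bucket
toy[1] = "uno"    # same key again: replaces the earlier value
```

Here 1 and 9 share a bucket even though their hashes differ, which is the "more possible hashes than buckets" case; two objects with identical hashes would share a bucket for the same reason and be told apart by `==`.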
So here's my question: if there is only one object in that bucket, is __eq__ checked anyway?
Yes -- just because it has the same hash does not mean it's equal.
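One way to see this is to count `__eq__` calls. The `Key` class and its counter below are hypothetical. One caveat, which as far as I know is a CPython implementation detail rather than a language guarantee: if the key being looked up is the very same object as the stored key, the identity check succeeds first and `__eq__` is skipped.

```python
class Key:
    """Hypothetical key type that counts how often __eq__ is called."""

    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.value == other.value


stored = Key("spam")
d = {stored: 1}              # a single key, alone in the dict

Key.eq_calls = 0
d[Key("spam")]               # different object with the same hash
calls_for_equal_copy = Key.eq_calls

Key.eq_calls = 0
d[stored]                    # the very same object
calls_for_same_object = Key.eq_calls
```

So even with a single candidate, a matching hash alone is not accepted as proof of equality unless the two keys are literally the same object.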
So what happens when there is no __eq__? The object can still be hashable -- I guess that's because there IS an __eq__ -- it defaults to an id check, yes?
The default hash, I believe, also defaults to the object id -- so, by default, objects are hashable and compare equal only to themselves.
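A quick sketch of those defaults, with two throwaway classes invented for the purpose. In CPython the default hash is derived from the object's identity rather than being guaranteed equal to `id()`, so the example does not compare the two. It also shows a related Python 3 rule: a class that defines `__eq__` without `__hash__` gets `__hash__` set to `None` and stops being hashable.

```python
class Plain:
    """Hypothetical class that defines neither __eq__ nor __hash__."""


a, b = Plain(), Plain()

equal_to_itself = a == a                 # default __eq__ is an identity check
equal_to_other = a == b                  # distinct objects never compare equal
usable_as_key = {a: "a", b: "b"}[a]      # default __hash__ makes them usable as keys


class OnlyEq:
    """Hypothetical class that defines __eq__ but not __hash__."""

    def __eq__(self, other):
        return isinstance(other, OnlyEq)


try:
    hash(OnlyEq())
    only_eq_hashable = True
except TypeError:
    # Defining __eq__ alone sets __hash__ to None in Python 3.
    only_eq_hashable = False
```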