[Python-Dev] Immutability vs. hashability

Chris Barker chris.barker at noaa.gov
Mon Feb 12 13:51:06 EST 2018


On Mon, Feb 5, 2018 at 3:37 PM, Steven D'Aprano <steve at pearwood.info> wrote:

> On Sun, Feb 04, 2018 at 09:18:25PM -0800, Guido van Rossum wrote:
>
> > The way I think of it generally is that immutability is a property of
> > types, while hashability is a property of values.
>
> That's a great way to look at it, thanks.
>

hmm -- maybe we should get a ValueError then when you try to use a
non-hashable value?

In [9]: t = ([1,2,3],)

In [10]: set(t)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-a2fddf4dafd8> in <module>()
----> 1 set(t)

TypeError: unhashable type: 'list'


Of course, in this case, the error is triggered by the type of the zeroth
element of the tuple, not by the value of the tuple per se.

Which means that hashability really is a property of type -- but container
types require a specific way of thinking -- hashability is not determined
by the type of the container (or may not be), but by the types of its
contents. Is that the value of the container?

So maybe: an object is hashable if it is a hashable type, and, if it is a
container, if its contents are hashable types.
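
As a rough illustration of that rule (a minimal sketch, not anything from
the stdlib; is_hashable is a hypothetical helper name), the only reliable
test is to call hash() and see whether it raises. For a tuple, hash()
walks the elements, so the answer depends on the contents and not just on
the container type:

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


# Same container type (tuple), different answers depending on the contents.
print(is_hashable((1, 2, 3)))       # True
print(is_hashable(([1, 2, 3],)))    # False: the element is a list
print(is_hashable((1, (2, [3]))))   # False: a list nested two levels down
```

Note that this checks one particular object, which is why the same type
can give both answers.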


With dataclasses as they stand -- it seems the values of the fields do
not affect hashability:

(this is version 0.4 from PyPI -- disregard if it's out of date)

Unhashable by default:

In [14]: @dataclasses.dataclass()
    ...: class NoHash:
    ...:     x = 5
    ...:     l = [1,2,3]
    ...:

In [15]: set((nh,))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-80dc23ed0b24> in <module>()
----> 1 set((nh,))

TypeError: unhashable type: 'NoHash'



OK, that's what we expect.



But then if it is hashable:

In [19]: @dataclasses.dataclass(hash=True)
    ...: class Hash:
    ...:     x = 5
    ...:     l = [1,2,3]
    ...:

In [20]: h = Hash()

In [21]: set((h,))
Out[21]: {Hash()}



It all works, regardless of the values of the fields.

I haven't looked at the code -- but it appears the hash has nothing to do
with the values of the fields:

In [23]: hash(h)
Out[23]: 3527539

In [24]: h.l.append(6)

In [25]: hash(h)
Out[25]: 3527539

In [26]: h.x = 7

In [27]: hash(h)
Out[27]: 3527539

and it looks like all instances hash the same:

In [31]: h2 = Hash()

In [32]: hash(h2)
Out[32]: 3527539

In [33]: hash(h)
Out[33]: 3527539

So I'm wondering how hashability is useful at all?

But it sure looks like there's a lot of room for confusion and error, even
if it's a frozen dataclass.
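
One possible explanation for the constant hash, offered as a sketch and not
checked against the 0.4 code: x and l above carry no type annotations, and
dataclasses only treats annotated names as fields. With no fields, the
generated __hash__ has nothing to hash but an empty tuple. The example
below assumes the stdlib dataclasses module from Python 3.7 or later, where
the option is spelled unsafe_hash rather than hash; the class names
NoFields and WithFields are made up for illustration.

```python
import dataclasses
from typing import List


# No annotations: x and l are plain class attributes, not dataclass fields.
@dataclasses.dataclass(unsafe_hash=True)
class NoFields:
    x = 5
    l = [1, 2, 3]


# Annotated: x and l are real fields, so the generated __hash__ uses them.
# A mutable default has to go through default_factory.
@dataclasses.dataclass(unsafe_hash=True)
class WithFields:
    x: int = 5
    l: List[int] = dataclasses.field(default_factory=lambda: [1, 2, 3])


print(dataclasses.fields(NoFields))            # () -- no fields at all
print(hash(NoFields()) == hash(NoFields()))    # True: every instance hashes alike

try:
    hash(WithFields())
except TypeError as exc:
    # The list field takes part in the hash, so hashing fails.
    print(exc)                                 # unhashable type: 'list'
```

With annotated fields the hash follows the field values, so a list-valued
field makes that particular instance unhashable, much like the tuple case
earlier in this message.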

This may be a case where we need to really make sure the docs are good!

-CHB



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov