[Python-Dev] Immutability vs. hashability
Chris Barker
chris.barker at noaa.gov
Mon Feb 12 13:51:06 EST 2018
On Mon, Feb 5, 2018 at 3:37 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Sun, Feb 04, 2018 at 09:18:25PM -0800, Guido van Rossum wrote:
>
> > The way I think of it generally is that immutability is a property of
> > types, while hashability is a property of values.
>
> That's a great way to look at it, thanks.
>
hmm -- maybe we should get a ValueError then when you try to use a
non-hashable value?
In [9]: t = ([1,2,3],)
In [10]: set(t)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-a2fddf4dafd8> in <module>()
----> 1 set(t)
TypeError: unhashable type: 'list'
Of course, in this case, the error is triggered by the type of the zeroth
element of the tuple, not by the value of the tuple per se.
Which means that hashability really is a property of type -- but container
types require a specific way of thinking -- hashability is not determined
by the type of the container (or may not be), but by the types of its
contents. Is that the value of the container?
So maybe: an object is hashable if it is a hashable type and, if it is a
container, its contents are hashable types too.
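As a rough illustration of that rule (a sketch only; `is_hashable` is a hypothetical helper, not something from the standard library), one can simply call `hash()` and see whether it raises:

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


# Same container type (tuple), different contents, different answers.
print(is_hashable((1, 2, 3)))       # tuple of ints
print(is_hashable(([1, 2, 3],)))    # tuple holding a list
print(is_hashable(frozenset({1})))  # frozenset of ints
```

The first and third calls report True and the second reports False, so the answer depends on what the tuple holds, not on the fact that it is a tuple.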
With dataclasses as they stand -- it seems the values of the fields do
not affect hashability:
(this is version 0.4 from PyPI -- disregard if it's out of date)
Unhashable by default:
In [14]: @dataclasses.dataclass()
    ...: class NoHash:
    ...:     x = 5
    ...:     l = [1,2,3]
    ...:
In [15]: set((nh,))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-80dc23ed0b24> in <module>()
----> 1 set((nh,))
TypeError: unhashable type: 'NoHash'
OK, that's what we expect.
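For what it's worth, here is a sketch of why this raises, assuming the standard-library `dataclasses` module in Python 3.7+ (which may differ from the 0.4 backport used above): with the default `eq=True` and no hash option, the decorator sets `__hash__` to `None` on the class, which is the usual way to mark a type as unhashable.

```python
import dataclasses


@dataclasses.dataclass()
class NoHash:
    x = 5
    l = [1, 2, 3]


nh = NoHash()

# The generated __eq__ comes without a matching __hash__,
# so the class-level __hash__ is set to None.
print(NoHash.__hash__)

try:
    set((nh,))
except TypeError as err:
    print(err)  # the message names the type 'NoHash'
```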
But then if it is hashable:
In [19]: @dataclasses.dataclass(hash=True)
    ...: class Hash:
    ...:     x = 5
    ...:     l = [1,2,3]
    ...:
In [20]: h = Hash()
In [21]: set((h,))
Out[21]: {Hash()}
It all works, regardless of the values of the fields.
I haven't looked at the code -- but it appears the hash has nothing to do
with the values of the fields:
In [23]: hash(h)
Out[23]: 3527539
In [24]: h.l.append(6)
In [25]: hash(h)
Out[25]: 3527539
In [26]: h.x = 7
In [27]: hash(h)
Out[27]: 3527539
and it looks like all instances hash the same:
In [31]: h2 = Hash()
In [32]: hash(h2)
Out[32]: 3527539
In [33]: hash(h)
Out[33]: 3527539
So I'm wondering how hashability is useful at all?
But it sure looks like there's a lot of room for confusion and error, even
if it's a frozen dataclass.
This may be a case where we need to really make sure the docs are good!
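A possible explanation for the constant hash above, offered as a sketch against the standard-library `dataclasses` in Python 3.7+ rather than the 0.4 backport: there, only annotated class attributes become fields, and the `hash=True` option is spelled `unsafe_hash=True`. Since `x = 5` and `l = [1,2,3]` carry no annotations, the class would have no fields at all, which would be consistent with the repr `Hash()` and with every instance hashing alike. The class names `NoFields` and `WithFields` below are made up for illustration.

```python
import dataclasses


@dataclasses.dataclass(unsafe_hash=True)
class NoFields:
    # No annotations: these are plain class attributes, not fields.
    x = 5
    l = [1, 2, 3]


@dataclasses.dataclass(unsafe_hash=True)
class WithFields:
    # Annotated: these are real fields and take part in the hash.
    x: int = 5
    l: list = dataclasses.field(default_factory=lambda: [1, 2, 3])


print(dataclasses.fields(NoFields))          # empty tuple
print(hash(NoFields()) == hash(NoFields()))  # all instances hash alike

w = WithFields()
try:
    hash(w)
except TypeError as err:
    print(err)  # the list field makes the instance unhashable
```

With annotated fields the hash is computed from the field values, so a list-valued field makes `hash()` fail in the same way the tuple example at the top does.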
-CHB
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov