[Python-ideas] incremental hashing in __hash__
Neil Girdhar
mistersheik at gmail.com
Thu Jan 5 15:31:12 EST 2017
On Thu, Jan 5, 2017 at 10:58 AM Paul Moore <p.f.moore at gmail.com> wrote:
> On 5 January 2017 at 13:28, Neil Girdhar <mistersheik at gmail.com> wrote:
> > The point is that the OP doesn't want to write his own hash function, but
> > wants Python to provide a standard way of hashing an iterable. Today, the
> > standard way is to convert to tuple and call hash on that. That may not be
> > efficient. FWIW from a style perspective, I agree with OP.
>
> The debate here regarding tuple/frozenset indicates that there may not
> be a "standard way" of hashing an iterable (should order matter?).
> Although I agree that assuming order matters is a reasonable
> assumption to make in the absence of any better information.
>
That's another good point. In keeping with my ABC proposal, why not add
abstract base classes that provide __hash__:
* ImmutableIterable, and
* ImmutableSet.
ImmutableSet inherits from ImmutableIterable and overrides __hash__ in
such a way that order is ignored.
This presumably involves very little new code: it's just a matter of moving
the hashing logic that's already in tuple and frozenset up into the ABCs.
The advantage is that instead of implementing __hash__ for your type, you
declare your intention by inheriting from an ABC and get an
automatically provided hash function.
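
To make that concrete, here is a rough sketch of what the two ABCs might look like. Neither class exists in the stdlib; the bodies are only my assumption of the simplest behaviour that matches the description (they delegate to tuple and frozenset rather than hashing incrementally), and Point and Bag are made-up example types.

```python
from collections.abc import Iterable


class ImmutableIterable(Iterable):
    """Hypothetical ABC: order-sensitive hash over the iterated items."""

    __slots__ = ()

    def __hash__(self):
        # Simplest possible body: builds a temporary tuple.
        # A real implementation would presumably hash item by item.
        return hash(tuple(self))


class ImmutableSet(ImmutableIterable):
    """Hypothetical ABC: same idea, but iteration order is ignored."""

    __slots__ = ()

    def __hash__(self):
        return hash(frozenset(self))


class Point(ImmutableIterable):
    """Made-up ordered type that gets its hash from the ABC."""

    def __init__(self, x, y):
        self._coords = (x, y)

    def __iter__(self):
        return iter(self._coords)

    def __eq__(self, other):
        return isinstance(other, Point) and self._coords == other._coords

    # Defining __eq__ in a class body resets __hash__ to None,
    # so the inherited hash has to be restored explicitly.
    __hash__ = ImmutableIterable.__hash__


class Bag(ImmutableSet):
    """Made-up unordered type; no __eq__ here, so __hash__ is inherited."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)
```

One wrinkle the sketch exposes: a type that defines its own __eq__ still has to re-export __hash__, so the hash is not entirely automatic unless the ABC also supplies __eq__.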
> Hashing is low enough level that providing helpers in the stdlib is
> not unreasonable. It's not obvious (to me, at least) that it's a
> common enough need to warrant it, though. Do we have any information
> on how often people implement their own __hash__, or how often
> hash(tuple(my_iterable)) would be an acceptable hash, except for the
> cost of creating the tuple? The OP's request is the only time this has
> come up as a requirement, to my knowledge. Hence my suggestion to copy
> the tuple implementation, modify it to work with general iterables,
> and publish it as a 3rd party module - its usage might give us an idea
> of how often this need arises. (The other option would be for someone
> to do some analysis of published code).
>
> Assuming it is a sufficiently useful primitive to add, then we can
> debate naming. But I'd prefer it to be named in such a way that it
> makes it clear that it's a low-level helper for people writing their
> own __hash__ function, and not some sort of variant of hashing (which
> hash.from_iterable implies to me).
>
> Paul
>
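
For reference, the kind of low-level helper Paul describes might look roughly like the sketch below. The name hash_iterable, the seed, and the multiplier are all placeholders of mine, and the mixing step is a generic multiply-and-xor combiner, not a copy of CPython's tuple hash. The only point is that the items are consumed one at a time, so no intermediate tuple is built.

```python
import sys

# Keep the running value within the platform's hash width.
_MASK = (1 << sys.hash_info.width) - 1


def hash_iterable(iterable):
    """Hypothetical helper: order-sensitive hash of any iterable.

    Consumes the items one at a time instead of building a tuple.
    The result is NOT equal to hash(tuple(iterable)).
    """
    acc = 0x345678  # arbitrary seed (assumption)
    count = 0
    for item in iterable:
        # Generic multiply-and-xor mixing step (assumption).
        acc = ((acc * 1000003) ^ hash(item)) & _MASK
        count += 1
    acc ^= count  # fold in the length
    return hash(acc)  # reduce to a valid hash value


class Path:
    """Made-up type whose __hash__ delegates to the helper."""

    def __init__(self, parts):
        self._parts = list(parts)

    def __iter__(self):
        return iter(self._parts)

    def __eq__(self, other):
        return isinstance(other, Path) and self._parts == other._parts

    def __hash__(self):
        return hash_iterable(self)
```

Because it works on any iterable, the same helper accepts lists, generators, and user-defined types alike; equal sequences of items hash equally regardless of the container they come from.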