On Thu, Jan 5, 2017 at 10:58 AM Paul Moore <p.f.moore@gmail.com> wrote:
The point is that the OP doesn't want to write his own hash function, but wants Python to provide a standard way of hashing an iterable. Today,
On 5 January 2017 at 13:28, Neil Girdhar <mistersheik@gmail.com> wrote: the
standard way is to convert to tuple and call hash on that. That may not be efficient. FWIW from a style perspective, I agree with OP.
The debate here regarding tuple/frozenset indicates that there may not be a "standard way" of hashing an iterable (should order matter?). Although I agree that assuming order matters is a reasonable assumption to make in the absence of any better information.
That's another good point. In keeping with my abc proposal, why not add abstract base classes with __hash__: * ImmutableIterable, and * ImmutableSet. ImmutableSet inherits from ImmutableIterable, and overrides __hash__ in such a way that order is ignored. This presumably involves very little new code — it's just a propagating up of the code that's already in set and tuple. The advantage is that instead of implementing __hash__ for your type, you declare your intention by inheriting from an abc and get an automatically-provided hash function. Hashing is low enough level that providing helpers in the stdlib is
not unreasonable. It's not obvious (to me, at least) that it's a common enough need to warrant it, though. Do we have any information on how often people implement their own __hash__, or how often hash(tuple(my_iterable)) would be an acceptable hash, except for the cost of creating the tuple? The OP's request is the only time this has come up as a requirement, to my knowledge. Hence my suggestion to copy the tuple implementation, modify it to work with general iterables, and publish it as a 3rd party module - its usage might give us an idea of how often this need arises. (The other option would be for someone to do some analysis of published code).
Assuming it is a sufficiently useful primitive to add, then we can debate naming. But I'd prefer it to be named in such a way that it makes it clear that it's a low-level helper for people writing their own __hash__ function, and not some sort of variant of hashing (which hash.from_iterable implies to me).
Paul