Emit a SyntaxWarning for unhashable literals in hashable-dependent literals.
I'm not sure if this is the right place to bring this up; python-ideas seemed to be for language issues and python-dev for CPython issues.
There are several unhashable builtin types in CPython; as of today the ones I've noticed are lists, dicts, sets, and bytearrays. Two of these are containers that require (at least some of) their members to be hashable: dicts and sets.
By this logic you cannot have a set of lists or a dict mapping sets to ints.
CPython, however, does not stop or warn you if you attempt such a thing until it hits `BUILD_MAP` or `BUILD_SET` at runtime. This is reasonable behavior when it's not possible to infer the member type of the container, e.g. ``{f(x) for x in iterable}`` or ``{f(x): y for x, y in zip(xs, ys)}``.
However, when literals are nested, e.g. ``{[*gen] for gen in gens}`` or ``{{green: eggs}, {and_: ham}}``, the result is an unavoidable exception at runtime. I suggest emitting a SyntaxWarning when a literal that produces an unhashable type is used inside a literal whose members must be hashable. I don't think it should be a SyntaxError because it's not a language issue, it's an implementation issue. I don't think it should be a linter's responsibility because, for the most part, linters should consider language issues and idioms, not side effects of the running implementation.
I do understand that the cases this addresses may be uncommon, and that once you do get that TypeError it's a relatively quick and easy fix. But consider this being present in code paths that aren't taken as frequently, in large codebases where it becomes difficult to keep track of small one-liner literals like this, or in the hands of newer programmers toying with Python through CPython and naively using unhashables in places they shouldn't be.
Either way I'm interested in hearing what the core team thinks of this suggestion, thanks in advance! :D
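To make the behaviour described in this message concrete, here is a minimal sketch. The source strings and the helper name `compiles_but_fails` are illustrative assumptions, not taken from the original post. Each display compiles without complaint, and the TypeError only appears when the resulting code object is evaluated.

```python
# Nested unhashable displays are syntactically valid, so compile() accepts them.
SOURCES = ["{[1, 2]}", "{{'green': 'eggs'}: 1}", "{{0}}"]


def compiles_but_fails(source: str) -> str:
    """Compile `source`, evaluate it, and return the runtime error message."""
    code = compile(source, "<example>", "eval")  # no SyntaxError is raised here
    try:
        eval(code)
    except TypeError as exc:
        return str(exc)
    return "no error"


for src in SOURCES:
    print(f"{src!r}: {compiles_but_fails(src)}")
```

The exact wording of the error message differs between CPython versions, so the sketch prints it rather than assuming it.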
This is most definitely a language issue, not just a CPython issue -- the rules around hashability and (im)mutability are due to the language definition, not the whim of an implementer.
A tool like mypy will catch this for you.
As to the desirability of adding a syntax warning for such situations (when they can be detected statically), I'm not sure -- we generally only do syntax warnings when there is something that even experienced users get wrong, by mistake (e.g. assert (condition, message)).
I presume this caused you some grief, or you wouldn't be posting here -- can you describe more of how this bit you, and why the runtime error did not suffice in your case?
On Thu, Dec 12, 2019 at 7:49 AM mental na via Python-Dev <python-dev@python.org> wrote:

> I'm not sure if this is the right place to bring this up, python-ideas seemed like language issues and python-dev seemed like CPython issues.
--Guido van Rossum (python.org/~guido)
Guido van Rossum wrote:

> This is most definitely a language issue, not just a CPython issue -- the rules around hashability and (im)mutability are due to the language definition, not the whim of an implementer.

I was not aware of this. I assumed it was an implementation issue because I knew CPython's dicts use a hash table, and there are other ways to implement a mapping data structure, e.g. with trees. Guido, could you provide a link to the part of the language definition that dictates these rules about hashability and (im)mutability?
> A tool like mypy will catch this for you.

Perhaps I should raise this as a mypy issue then? I'm using mypy version 0.750 as of today and the following code passes with no errors:

```
"""foo bar."""
from typing import Set, Dict

SOME = {{0}}
print(SOME)


def green_eggs() -> Set[Set[int]]:
    """foo."""
    return {{1}}


def and_ham() -> Dict[Set[int], int]:
    """bar."""
    return {{1}: 0}
```

Additionally, the above code gets a perfect 10/10 from pylint version 2.4.4. And as a last-ditch, paranoia-fueled attempt, formatting with black (version 19.10b0) did nothing.
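For comparison, here is a sketch of a hashable counterpart to the snippet above. It assumes `frozenset` is an acceptable substitute for the inner sets, a choice the original message does not make. Unlike the original, this version runs without raising.

```python
from typing import Dict, FrozenSet, Set


def green_eggs() -> Set[FrozenSet[int]]:
    """Return a set whose members are hashable frozensets."""
    return {frozenset({1})}


def and_ham() -> Dict[FrozenSet[int], int]:
    """Return a dict keyed by a hashable frozenset."""
    return {frozenset({1}): 0}


print(green_eggs())
print(and_ham())
```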
> As to the desirability of adding a syntax warning for such situations (when they can be detected statically), I'm not sure -- we generally only do syntax warnings when there is something that even experienced users get wrong, by mistake (e.g. assert (condition, message)).

Unless they're writing tests, I don't think anyone wants their code to fail. This pattern is a guaranteed way to do that, and as shown above it's not being caught by linters or static type checkers as far as I can tell (if it is and I'm out of date, that's fantastic news, and I'll happily shut up about it :D). An experienced user wouldn't typically make this mistake, yes, but mistakes can be made by any user, and these ones inadvertently produce failing code.
> I presume this caused you some grief, or you wouldn't be posting here -- can you describe more of how this bit you, and why the runtime error did not suffice in your case?

Luckily there was no grief. I was playing around with the dis module, observing how far the compiler optimizes the source before deferring to the runtime, and I happened upon the case ``x in {"a", "b"}``, which led me towards ``x in {{"a"}}`` and ``x in {{0: 1}, {1: 2}}``. I was surprised that valid code was emitted for such a surefire way to raise an exception. I first posted here because I thought it was an implementation issue, but as you've pointed out it's most certainly a language issue, and one that can be detected statically. Now I'd like to suggest that, instead of relying on linters and static type checkers to catch these bad patterns, Python shouldn't have allowed them in the first place. I see it as a contradiction in the language's semantics that ultimately defers the job of denying the programmer to the last possible moment.
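The dis experiment described here can be reproduced roughly as follows. This is a sketch: the helper names `opnames` and `has_frozenset_const` are made up for illustration, and the exact bytecode differs between CPython versions, so the full listing is printed rather than assumed.

```python
import dis


def opnames(source: str) -> list:
    """Return the opcode names CPython emits for an expression."""
    code = compile(source, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


def has_frozenset_const(source: str) -> bool:
    """Report whether the compiler folded a set display into a frozenset constant."""
    code = compile(source, "<example>", "eval")
    return any(isinstance(const, frozenset) for const in code.co_consts)


# A membership test against a set of constants is folded at compile time.
print(has_frozenset_const('x in {"a", "b"}'))

# The nested display cannot be folded, so the sets are built at runtime.
print("BUILD_SET" in opnames('x in {{"a"}}'))
dis.dis(compile('x in {{"a"}}', "<example>", "eval"))
```

Nothing in the emitted code for the nested case marks it as doomed. The failure only surfaces once the outer `BUILD_SET` tries to hash the inner set.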
On Fri, Dec 13, 2019, at 02:20, mental na via Python-Dev wrote:
> I was not aware of this. I assumed it was an implementation issue because I knew CPython's dicts use a hash table, and there are other ways to implement a mapping data structure, e.g. with trees.

There are significant differences in semantics between a hash table and a tree-based mapping. Dict has to be a hash table on all implementations. (However, Java's mutable types are hashable and the sky hasn't fallen for them.)
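The message does not spell out which differences are meant. One that is easy to demonstrate, offered only as an illustration and not as what the author had in mind, is that a hash table needs `__hash__` and `__eq__` from its keys, whereas a search tree would need an ordering. The helper `can_order` below is a hypothetical name.

```python
def can_order(values) -> bool:
    """Return True if the values support the comparisons a search tree would need."""
    try:
        sorted(values)
    except TypeError:
        return False
    return True


keys = [1j, 2j]  # complex numbers are hashable but define no < or >
table = {key: str(key) for key in keys}  # works: only __hash__ and __eq__ are used

print(table[2j])
print(can_order(keys))
print(can_order([3, 1, 2]))
```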
On Fri., 13 Dec. 2019, 5:26 pm mental na via Python-Dev, <python-dev@python.org> wrote:
> Guido could you provide a link to the language definition that dictates these rules about hashability and (im)mutability?

https://docs.python.org/3/reference/datamodel.html#object.__hash__
Many of the special method descriptions spell out the semantic requirements of well-behaved objects (and we then abide by those requirements when implementing builtin types and standard library modules).
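As a small illustration of the rules on that page, the sketch below uses two made-up classes, `Point` and `HashablePoint`. A class that defines `__eq__` without `__hash__` has its `__hash__` set to None and so becomes unhashable. Defining `__hash__` consistently with `__eq__` restores hashability.

```python
from collections.abc import Hashable


class Point:
    """Defines __eq__ only, so Python implicitly sets __hash__ to None."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


class HashablePoint(Point):
    """Adds a __hash__ that agrees with __eq__: equal points hash equal."""

    def __hash__(self):
        # Mutating x or y after insertion into a set would break lookups.
        return hash((self.x, self.y))


print(Point.__hash__)                            # None
print(isinstance(Point(1, 2), Hashable))         # False
print(isinstance(HashablePoint(1, 2), Hashable)) # True
print(HashablePoint(1, 2) in {HashablePoint(1, 2)})
```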
>> A tool like mypy will catch this for you.
>
> Perhaps I should raise this as a mypy issue then?

Aye, a typechecker failing to catch this situation would definitely be a reasonable issue to raise. We'd never check for it in the compiler, as it wouldn't be worth the additional state tracking needed for even the local type inference you suggest.
Cheers, Nick.
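Outside the compiler, the literal-only cases raised in this thread can be approximated with a purely syntactic pass over the AST. The following is a rough sketch in the style of a linter, not a proposal for CPython. The names `UNHASHABLE` and `find_unhashable_members` are hypothetical. It only sees displays and comprehensions nested directly as set members or dict keys, so it misses names, calls such as `bytearray(...)`, and anything else that would need type inference.

```python
import ast

# Display and comprehension nodes that always evaluate to an unhashable object.
UNHASHABLE = (ast.List, ast.Dict, ast.Set, ast.ListComp, ast.DictComp, ast.SetComp)


def find_unhashable_members(source: str) -> list:
    """Return (lineno, col_offset) of unhashable displays used as set members or dict keys."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Set):
            members = node.elts
        elif isinstance(node, ast.Dict):
            # A key of None marks a ``**mapping`` unpacking entry; skip it.
            members = [key for key in node.keys if key is not None]
        elif isinstance(node, ast.SetComp):
            members = [node.elt]
        elif isinstance(node, ast.DictComp):
            members = [node.key]
        else:
            continue
        hits.extend(
            (member.lineno, member.col_offset)
            for member in members
            if isinstance(member, UNHASHABLE)
        )
    return hits


print(find_unhashable_members("{[*gen] for gen in gens}"))
print(find_unhashable_members("{{1}: 0}"))
print(find_unhashable_members('x in {"a", "b"}'))
```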
Nick Coghlan wrote:
>>> A tool like mypy will catch this for you.
>>
>> Perhaps I should raise this as a mypy issue then?
>
> Aye, a typechecker failing to catch this situation would definitely be a reasonable issue to raise.

Roger that, I've raised this on mypy: https://github.com/python/mypy/issues/8150
participants (4)
- Guido van Rossum
- mental na
- Nick Coghlan
- Random832