Inspired by Scala, a new syntax for Union type
Hello everybody,

Scala 3 proposes a new syntax for union types. See here <https://dotty.epfl.ch/docs/reference/new-types/union-types.html>. I propose to add a similar syntax to Python.

    # Operator for Union
    assert int | str == Union[int, str]
    assert int | str | float == Union[int, str, float]
    # Operator for Optional
    assert ~int == Optional[int]

Now, it's possible to write:

    def fn(bag: List[int | str], option: ~int = None) -> float | str: ...

in place of

    def fn(bag: List[Union[int, str]], option: Optional[int] = None) -> Union[float, str]: ...

I think this syntax is clearer, and can help with the adoption of typing.

I have tested and implemented these ideas in two forks: one for CPython <https://github.com/pprados/cpython> and one for MyPy <https://github.com/pprados/mypy>. See the branches add_OR_to_types (for the Union syntax) or add_INVERT_to_types (for the Union and Optional syntax).

How did I implement it? I added the operators __or__ and __invert__ to PyType_Type. The C code is similar to:

    from typing import *

    def type_or(self, right):
        return Union[self, right]

    # Illustration only: built-in types cannot be patched like this from Python.
    type(type).__or__ = type_or

Currently, the accepted syntax for typing is:

    annotation: name_type
    name_type: NAME (args)?
    args: '[' paramslist ']'
    paramslist: annotation (',' annotation)* [',']

I propose to extend the syntax to:

    annotation: ( name_type | or_type | invert_type )
    name_type: NAME (args)?
    args: '[' paramslist ']'
    paramslist: annotation (',' annotation)* [',']
    or_type: name_type '|' annotation
    invert_type: '~' annotation

What do you think about that?

The draft of a PEP is here <https://github.com/pprados/peps/blob/master/pep-9999.rst>.

Regards
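To try the idea without patching CPython, a minimal sketch (not the implementation in the forks above) can put the two operators on a metaclass instead of on `type` itself. The names `UnionOps`, `Apple`, and `Pear` are hypothetical and exist only for this illustration.

```python
from typing import Optional, Union


class UnionOps(type):
    """Hypothetical metaclass: gives its classes `|` and `~` for typing."""

    def __or__(cls, other):
        return Union[cls, other]

    def __ror__(cls, other):
        # Handles `int | Apple`, where the left operand is a plain class.
        return Union[other, cls]

    def __invert__(cls):
        return Optional[cls]


class Apple(metaclass=UnionOps):
    pass


class Pear(metaclass=UnionOps):
    pass


assert (Apple | Pear) == Union[Apple, Pear]
assert (int | Apple) == Union[int, Apple]
assert (~Apple) == Optional[Apple]
```

Only classes created with this metaclass gain the operators, which is why the proposal asks for them on the root type instead.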
I don't want to add runtime behaviors for static type hinting. There is PEP 563 instead. Tools like mypy can implement them without touching runtime behavior. On Thu, Aug 29, 2019 at 9:48 PM Philippe Prados <philippe.prados@gmail.com> wrote:
> Hello everybody,
>
> Scala 3 proposes a new syntax for union types. See here <https://dotty.epfl.ch/docs/reference/new-types/union-types.html>. I propose to add a similar syntax to Python.
> Regards
> _______________________________________________
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-leave@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/FCTXGD...
> Code of Conduct: http://python.org/psf/codeofconduct/
-- Inada Naoki <songofacandy@gmail.com>
No, it's not possible, because

    >>> int | str
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unsupported operand type(s) for |: 'type' and 'type'

Regards

Philippe Prados

On Thu, Aug 29, 2019 at 14:55, Inada Naoki <songofacandy@gmail.com> wrote:
> I don't want to add runtime behaviors for static type hinting.
>
> There is PEP 563 instead. Tools like mypy can implement them without touching runtime behavior.
... and runtime type checkers (like https://github.com/agronholm/typeguard) must have a synthesized Union[] in __annotations__.

On Thu, Aug 29, 2019 at 15:02, Philippe Prados <philippe.prados@gmail.com> wrote:
> No, it's not possible, because
>
>     >>> int | str
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     TypeError: unsupported operand type(s) for |: 'type' and 'type'
>
> Regards
>
> Philippe Prados
On Thu, Aug 29, 2019 at 10:07 PM Philippe Prados <philippe.prados@gmail.com> wrote:
> ... and runtime type checkers (like https://github.com/agronholm/typeguard) must have a synthesized Union[] in __annotations__.

Runtime type checking tools can parse "str | int" too. No need to implement runtime behavior for `str | int`.

Regards,
-- Inada Naoki <songofacandy@gmail.com>
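As a rough illustration of that point, a runtime checker could read the annotation string with the standard `ast` module instead of evaluating it. The helper `parse_union` below is hypothetical and deliberately tiny: it only understands bare builtin names joined by `|`.

```python
import ast
import builtins


def parse_union(annotation: str) -> tuple:
    """Turn a string such as "str | int" into a tuple of builtin classes."""

    def collect(node):
        # `a | b` parses as BinOp(left=a, op=BitOr, right=b).
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.BitOr):
            return collect(node.left) + collect(node.right)
        if isinstance(node, ast.Name):
            return (getattr(builtins, node.id),)
        raise ValueError(f"unsupported annotation: {annotation!r}")

    return collect(ast.parse(annotation, mode="eval").body)


members = parse_union("str | int")
print(members)                 # (<class 'str'>, <class 'int'>)
print(isinstance(3, members))  # True
```

A real tool would also have to resolve names in the annotated function's module and handle subscripted forms such as `List[int]`; that is left out here.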
Cool! My MyPy patch is already compatible with this new syntax.

Regards

On Thu, Aug 29, 2019 at 15:11, Inada Naoki <songofacandy@gmail.com> wrote:
> Runtime type checking tools can parse "str | int" too. No need to implement runtime behavior for `str | int`.
On Thu, Aug 29, 2019 at 10:03 PM Philippe Prados <philippe.prados@gmail.com> wrote:
> No, it's not possible, because
>
>     >>> int | str
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     TypeError: unsupported operand type(s) for |: 'type' and 'type'
>
> Regards

It is possible because:

    Python 3.7.4 (default, Jul 9 2019, 18:13:23)
    [Clang 10.0.1 (clang-1001.0.46.4)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from __future__ import annotations
    >>> def foo() -> int | str:
    ...     pass
    ...
    >>> foo.__annotations__
    {'return': 'int | str'}

Please read PEP 563.

Regards,
-- Inada Naoki <songofacandy@gmail.com>
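The same behaviour can be reproduced from a script. The sketch below compiles a small source string with the PEP 563 compiler flag, so it does not depend on a `from __future__` line at the top of the file; `SOURCE` and `namespace` are names chosen only for this example.

```python
import __future__

SOURCE = "def foo() -> int | str:\n    pass\n"

# Compile as if the module started with `from __future__ import annotations`.
code = compile(SOURCE, "<pep563-demo>", "exec",
               flags=__future__.annotations.compiler_flag)
namespace = {}
exec(code, namespace)

# The annotation is kept as source text; `int | str` is never evaluated.
print(namespace["foo"].__annotations__)
```

Because the annotation is stored as a string, no `type.__or__` is needed for the definition to succeed.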
With PEP 563, there's no runtime behavior.

It's strange to accept:

    def f() -> int | str: ...

but not

    a = int | str

The operator type.__or__() is called only if users know what they are doing. It's strictly equivalent to

    a = Union[int, str]

So, I think it's important to add this operator to the root type.

On Thu, Aug 29, 2019 at 14:55, Inada Naoki <songofacandy@gmail.com> wrote:
> I don't want to add runtime behaviors for static type hinting.
>
> There is PEP 563 instead. Tools like mypy can implement them without touching runtime behavior.
>
> -- Inada Naoki <songofacandy@gmail.com>
I propose to use a unary operator to help readability. With a lot of parameters:

    def f(source: str | None, destination: str | None, param: int | None): ...

I think it's more readable as

    def f(source: str?, destination: str?, param: int?): ...

I propose two branches in my implementation: add_OR_to_types (for the Union syntax) or add_INVERT_to_types (for the Union and Optional syntax).

For me, the best proposal is to use str? like Kotlin (with a new unary operator ?).

On Thu, Aug 29, 2019 at 17:00, Philippe Prados <philippe.prados@gmail.com> wrote:
> With PEP 563, there's no runtime behavior.
>
> It's strange to accept:
>
>     def f() -> int | str: ...
>
> but not
>
>     a = int | str
>
> The operator type.__or__() is called only if users know what they are doing. It's strictly equivalent to
>
>     a = Union[int, str]
>
> So, I think it's important to add this operator to the root type.
> I think it's more readable with
>
>     def f(source: str?, destination: str?, param: int?): ...

Hmmm. But then with default arguments you end up with:

    def f(source: str?=def_src, destination: str?=MISSING, param: int?=1): ...

?= looks... not great to me. Though it does look better with spaces:

    def f(source: str? = def_src, destination: str? = MISSING, param: int? = 1): ...
On Fri, Aug 30, 2019 at 1:17 AM Philippe Prados <philippe.prados@gmail.com> wrote:
> I propose to use a unary operator to help the readability. With a lot of parameters:
>
>     def f(source: str | None, destination: str | None, param: int | None):...
>
> I think it's more readable with
>
>     def f(source: str?, destination: str?, param: int?): ...

TBH I don't see much of an advantage here - not enough to justify the creation of an entire new operator. "int | None" is already far from terrible, and the slight abuse of "~int" meaning "maybe int" is pretty plausible (consider how "approximately equal" is written mathematically).

BTW, is there a strong reason for these union types to be disallowed in instance/subclass checks?

    >>> isinstance(3, Union[str, int])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib/python3.9/typing.py", line 764, in __instancecheck__
        return self.__subclasscheck__(type(obj))
      File "/usr/local/lib/python3.9/typing.py", line 772, in __subclasscheck__
        raise TypeError("Subscripted generics cannot be used with"
    TypeError: Subscripted generics cannot be used with class and instance checks

If they were permitted, then instance checks could use an extremely clean-looking notation for "any of these":

    isinstance(x, str | int) ==> "is x an instance of str or int"

It's very common for novices to write "if x == 3 or 5:", and I'm not sure whether that's an argument in favour or against.

ChrisA
On 29/08/2019 16:30:49, Chris Angelico wrote:
>     >>> isinstance(3, Union[str, int])
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>       File "/usr/local/lib/python3.9/typing.py", line 764, in __instancecheck__
>         return self.__subclasscheck__(type(obj))
>       File "/usr/local/lib/python3.9/typing.py", line 772, in __subclasscheck__
>         raise TypeError("Subscripted generics cannot be used with"
>     TypeError: Subscripted generics cannot be used with class and instance checks
>
> If they were permitted, then instance checks could use an extremely clean-looking notation for "any of these":
>
>     isinstance(x, str | int) ==> "is x an instance of str or int"

Er, is that necessary when you can already write

    isinstance(x, (str, int))

> It's very common for novices to write "if x == 3 or 5:", and I'm not sure whether that's an argument in favour or against.
> ChrisA
On Sep 3, 2019, at 19:45, Steven D'Aprano <steve@pearwood.info> wrote:
> On Thu, Aug 29, 2019 at 06:20:55PM +0100, Rob Cliffe via Python-ideas wrote:
> > > isinstance(x, str | int) ==> "is x an instance of str or int"
> >
> > Er, is that necessary when you can already write isinstance(x, (str, int))
>
> It's not *necessary* it's just nicer.

Definitely. It reads even more like what it means than the existing spelling, and it’s something novices are almost certain to expect to work, and so on.

But that implies that you can also write this:

    isinstance(x, Union[str, int])

… because, after all, str|int is defined as meaning exactly that. Which implies that the current rule that instantiated generic types cannot be used for runtime type checks needs to be changed.

That’s certainly plausible, and maybe reasonable. But as I understand it, the idea that these things can’t be used for runtime type checks was a fundamental guiding principle to the typing design. So, it’s not something to be changed lightly. Someone has to go back to the reason for that principle (which may not be clearly stated anywhere, in which case it has to be extracted from things that _have_ been argued), and then make the case for why it should be violated here. And I haven’t seen anyone make that case. (If someone has and I missed it, apologies; chunks of this thread keep getting flagged as spam for some reason…)

If we’re lucky, it was just a matter of “We haven’t really thought it through beyond the collection-like generics, so let’s defer it until later because it’s always easier to add features than to take them away.” Then the argument is just “That later is now” plus the specific argument for Union (and Optional), which seems like a pretty easy case to make. (After all, checking Iterator[int] at runtime is impossible, and checking Iterator is the same useful thing that the long-standing ABC has always checked; neither of those is true for Union, where Union[int, str] is both easy to do and obviously useful and matches functionality that’s been in the runtime type system since at least early 2.x, while checking just plain Union is both new to typing and completely useless.)
On Wed, Sep 4, 2019 at 1:15 PM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
> But that implies that you can also write this:
>
>     isinstance(x, Union[str, int])
>
> … because, after all, str|int is defined as meaning exactly that. Which implies that the current rule that instantiated generic types cannot be used for runtime type checks needs to be changed.
I dislike runtime behavior of static types because I am very afraid of accidental large performance or memory footprint regressions.

ABC has an extension module for speedup, but `isinstance([], Iterable)` is 4x slower than `isinstance([], (str, list))`.

```
$ python3 -m pyperf timeit -s 'from collections.abc import Iterable; x=[]' -- 'isinstance(x, Iterable)'
.....................
Mean +- std dev: 280 ns +- 8 ns
$ python3 -m pyperf timeit -s 'x=[]; T=(str, list)' -- 'isinstance(x, T)'
.....................
Mean +- std dev: 73.1 ns +- 1.0 ns
```

The typing module doesn't have a speedup extension. I expect `isinstance([], Union[str, list])` will be much slower than `isinstance([], (str, list))`.

Regards,
-- Inada Naoki <songofacandy@gmail.com>
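For readers without pyperf installed, a rough equivalent can be sketched with the standard-library `timeit` module. The repeat count `N` is an arbitrary choice for this sketch, and the lambda wrapper adds call overhead to both measurements, so the ratio will look smaller than in the pyperf run; absolute numbers depend on the machine and Python version.

```python
import timeit
from collections.abc import Iterable

x = []
T = (str, list)
N = 200_000  # arbitrary repeat count for this sketch

# Time the ABC check and the plain tuple check on the same object.
abc_time = timeit.timeit(lambda: isinstance(x, Iterable), number=N)
tuple_time = timeit.timeit(lambda: isinstance(x, T), number=N)

print(f"ABC check:   {abc_time / N * 1e9:.1f} ns per call")
print(f"tuple check: {tuple_time / N * 1e9:.1f} ns per call")
```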
On Sep 4, 2019, at 01:29, Inada Naoki <songofacandy@gmail.com> wrote:
> I dislike runtime behavior of static types because I am very afraid of accidental large performance or memory footprint regressions.
>
> ABC has an extension module for speedup, but `isinstance([], Iterable)` is 4x slower than `isinstance([], (str, list))`.

Does the ABC use the extension module to speed up isinstance checks? (Couldn’t you just repeat your test with typing.Iterable and see if it’s significantly slower than collections.abc.Iterable instead of guessing? typing.Iterable can be tested; it’s only instantiated types like Iterable[int] that can’t, not the generics themselves.)

At any rate, both collections.abc and typing checks have to go through the metaclass and down into the class’s subclasscheck method and possibly do attribute checks and/or registry lookups, depending on the type (checking a list for Iterable I believe ends up passing the first test for __iter__, but only after getting that far into the process). Which is certainly slower than just iterating a tuple, which explains your 280 vs. 73. And that wouldn’t be any different for the types returned by Union.__getitem__.

So, testing Union[int, str] is almost certainly going to be significantly slower than testing for (int, str), because it’ll be wrapping up the iteration over the types in a couple extra method lookups and calls.

And you’re right, because int|str _looks_ better than (int, str) here, many people will be encouraged to use it even though it’s slower, which could potentially be a bad thing for some programs.

Which means anyone proposing it now has to answer this performance issue, as well as researching and answering the other issue of whatever the reason is for not allowing instantiated generics to typecheck in the first place.
On Thu, Sep 5, 2019 at 1:02 AM Andrew Barnert <abarnert@yahoo.com> wrote:
> > I dislike runtime behavior of static types because I am very afraid of accidental large performance or memory footprint regressions.
> >
> > ABC has an extension module for speedup, but `isinstance([], Iterable)` is 4x slower than `isinstance([], (str, list))`.
>
> Does the ABC use the extension module to speed up isinstance checks? (Couldn’t you just repeat your test with typing.Iterable and see if it’s significantly slower than collections.abc.Iterable instead of guessing? typing.Iterable can be tested; it’s only instantiated types like Iterable[int] that can’t, not the generics themselves.)

Yes. See this code: https://github.com/python/cpython/blob/b9a0376b0dedf16a2f82fa43d851119d1f7a2...

ABC caches the instance check, so Iterable.__subclasshook__ is called only once in my benchmark. If __subclasshook__ were called every time isinstance is called, it would be much slower.

> And you’re right, because int|str _looks_ better than (int, str) here, many people will be encouraged to use it even though it’s slower, which could potentially be a bad thing for some programs.

That's exactly my point. If we say "you can use `isinstance(x, int | str)` for now", people may think it is a new and recommended way to write it.

I prefer "there is one preferable way to do it" to "there is a new way to do it, but it may be much slower and use much more memory, so you shouldn't use it unless you can ignore performance."

Regards,
-- Inada Naoki <songofacandy@gmail.com>
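The caching described in this message can be observed with a small sketch. `HasLen` and `hook_calls` are hypothetical names; the ABC below records each call to its `__subclasshook__` so the effect of the cache becomes visible.

```python
from abc import ABC

hook_calls = []


class HasLen(ABC):
    """Hypothetical ABC that records every __subclasshook__ call."""

    @classmethod
    def __subclasshook__(cls, subclass):
        hook_calls.append(subclass)
        return hasattr(subclass, "__len__")


results = [isinstance([], HasLen) for _ in range(3)]
print(results)          # [True, True, True]
print(len(hook_calls))  # 1: later checks are answered from the ABC cache
```

The first check pays for the hook; the repeats only pay for the cache lookup, which is what the pyperf figure for `Iterable` mostly measures.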
On Sep 4, 2019, at 20:51, Inada Naoki <songofacandy@gmail.com> wrote:
> > And you’re right, because int|str _looks_ better than (int, str) here, many people will be encouraged to use it even though it’s slower, which could potentially be a bad thing for some programs.
>
> That's exactly my point. If we say "you can use `isinstance(x, int | str)` for now", people may think it is a new and recommended way to write it.

Right; I was agreeing, and saying it may even be worse than you’re suggesting.

There’s always the possibility that any shiny new feature looks like the One True Way and gets overused. But in this case, it looks more obviously right even to someone who’s not interested in shiny new features and has never heard the word “pythonic” or read a new version’s release notes. So they’re going to use it. Which means if they run into a situation where they’re checking types of a zillion objects, they’re definitely not going to guess that it’s 5x as slow, so they’re definitely going to misuse it.
On Thu, Sep 5, 2019 at 7:08 PM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
> Right; I was agreeing, and saying it may even be worse than you’re suggesting.
>
> There’s always the possibility that any shiny new feature looks like the One True Way and gets overused. But in this case, it looks more obviously right even to someone who’s not interested in shiny new features and has never heard the word “pythonic” or read a new version’s release notes. So they’re going to use it. Which means if they run into a situation where they’re checking types of a zillion objects, they’re definitely not going to guess that it’s 5x as slow, so they’re definitely going to misuse it.

Hang on hang on.... what's this situation where you're checking types of a zillion objects?

I think there's a bigger problem there than whether isinstance(x, int|str) is slower than isinstance(x, (int,str)) ! Even if this change DOES have a measurable impact on the time to do those checks, it only applies to unions, and if that's a notable proportion of your total run time, maybe there's a better way to architect this.

The only situation I can think of would be exception handling. As a special case, BaseException could perhaps have an __or__ method that returns a tuple, but even there, I'd want to see a macrobenchmark that shows that the difference actually affects a larger program (especially since most exception checks are by hierarchy, not by union).

ChrisA
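The exception special case can be sketched in pure Python. An operator applied to classes has to live on their metaclass, so the sketch uses a hypothetical metaclass `OrTuple` on two made-up exception classes rather than changing BaseException; it is an illustration of the idea, not a proposal for how CPython would do it.

```python
class OrTuple(type):
    """Hypothetical metaclass: `A | B` builds the tuple that `except` accepts."""

    def __or__(cls, other):
        return (cls,) + (other if isinstance(other, tuple) else (other,))

    def __ror__(cls, other):
        return (other if isinstance(other, tuple) else (other,)) + (cls,)


class ParseError(Exception, metaclass=OrTuple):
    pass


class LookupFailure(Exception, metaclass=OrTuple):
    pass


def classify(exc):
    try:
        raise exc
    except ParseError | LookupFailure:  # evaluates to a plain tuple
        return "handled"
    except Exception:
        return "other"


print(classify(LookupFailure()))  # handled
print(classify(KeyError()))       # other
```

Since the `except` clause receives an ordinary tuple, matching costs the same as the existing `except (ParseError, LookupFailure):` spelling.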
On Thu, Sep 05, 2019 at 07:15:27PM +1000, Chris Angelico wrote:
> Hang on hang on.... what's this situation where you're checking types of a zillion objects?

An earlier version of the statistics module used lots of isinstance checks in order to support arbitrary numeric types, and was a lot slower. The current version avoids most of those at the cost of being a lot less clear and elegant, but it improved performance somewhat.

On my PC, an isinstance check against a single concrete type (not an ABC) is about three times as expensive as a float arithmetic operation, so in a tight loop it may not be an insignificant cost.

> I think there's a bigger problem there than whether isinstance(x, int|str) is slower than isinstance(x, (int,str)) ! Even if this change DOES have a measurable impact on the time to do those checks, it only applies to unions, and if that's a notable proportion of your total run time, maybe there's a better way to architect this.

Maybe... but in my experience, only at the cost of writing quite ugly code.

But having said all that, I'm not sure that we should be rejecting this proposal on the basis of performance when we haven't got any working code to measure performance of :)

isinstance is a wrapper around PyObject_IsInstance(obj, class_or_tuple), and if I'm reading the C code correctly, PyObject_IsInstance is roughly equivalent to this Python pseudo-code:

    # Except in C, not Python
    def isinstance(obj, class_or_tuple):
        if type(class_or_tuple) is tuple:
            for C in class_or_tuple:
                if isinstance(obj, C):
                    return True
        else:
            ...

If Union is a built-in, we could have something like this:

    def isinstance(obj, class_or_tuple):
        if type(class_or_tuple) is Union:
            class_or_tuple = class_or_tuple.__union_params__
        # followed by the same code as above

typing.Union already defines .__union_params__ which returns a tuple of the classes used to construct the union, so in principle at least, there need be no significant performance hit from supporting Unions.

-- Steven
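The unpacking step in that pseudo-code can be tried today in pure Python. This is only a sketch under the assumption of Python 3.8 or later, where the members of a union are read with `typing.get_origin` and `typing.get_args` rather than the `__union_params__` attribute named above; `isinstance_union` is a hypothetical wrapper, not a change to the built-in.

```python
from typing import Optional, Union, get_args, get_origin


def isinstance_union(obj, class_or_union):
    """Hypothetical wrapper: unpack a Union, then defer to isinstance()."""
    if get_origin(class_or_union) is Union:
        # get_args() returns the member types as a plain tuple.
        class_or_union = get_args(class_or_union)
    return isinstance(obj, class_or_union)


print(isinstance_union(3, Union[str, int]))    # True
print(isinstance_union(3.5, Union[str, int]))  # False
print(isinstance_union(None, Optional[int]))   # True
print(isinstance_union("a", (bytes, str)))     # True
```

After the unpacking, the check is the ordinary tuple check, which is the basis for the claim that no significant slowdown is required.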
On Thu, Sep 5, 2019 at 9:31 PM Steven D'Aprano <steve@pearwood.info> wrote:
> On Thu, Sep 05, 2019 at 07:15:27PM +1000, Chris Angelico wrote:
> > Hang on hang on.... what's this situation where you're checking types of a zillion objects?
>
> An earlier version of the statistics module used lots of isinstance checks in order to support arbitrary numeric types, and was a lot slower. The current version avoids most of those at the cost of being a lot less clear and elegant, but it improved performance somewhat.

That's fair, although if someone's concerned about squeaking the performance as hard as possible, they'll probably be using numpy.

> On my PC, an isinstance check against a single concrete type (not an ABC) is about three times as expensive as a float arithmetic operation, so in a tight loop it may not be an insignificant cost.

How often do you check against a union type (or, in current code, against a tuple of types)? This proposal wouldn't affect anything that checks against a single type.

> > I think there's a bigger problem there than whether isinstance(x, int|str) is slower than isinstance(x, (int,str)) ! Even if this change DOES have a measurable impact on the time to do those checks, it only applies to unions, and if that's a notable proportion of your total run time, maybe there's a better way to architect this.
>
> Maybe... but in my experience, only at the cost of writing quite ugly code.

Perhaps. But for this to matter, you would need:

1) some sort of complicated dispatch handler that has to handle subclasses (so you can't just look up type(x) in a dict)
2) handling of multiple types the same way (so you want to do union isinstances rather than each one being done individually)
3) little enough other code that a performance regression in isinstance makes a measurable difference
4) clean code that you don't want to disrupt for the sake of performance

Seems like a fairly uncommon case to me. Maybe I'm wrong.

> But having said all that, I'm not sure that we should be rejecting this proposal on the basis of performance when we haven't got any working code to measure performance of :)

Definitely. I don't think performance should be a major consideration until code cleanliness has been proven or disproven.
typing.Union already defines .__union_params__ which returns a tuple of the classes used to construct the union, so in principle at least, there need be no significant performance hit from supporting Unions.
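The unwrapping step described above can be tried in pure Python today. This is only a sketch under assumptions: the wrapper name isinstance_with_union is hypothetical, and it uses typing.get_origin and typing.get_args (available from Python 3.8) in place of the __union_params__ attribute mentioned in the message, since that attribute is not present in later typing versions.

```python
from typing import Union, get_args, get_origin


def isinstance_with_union(obj, class_or_tuple):
    """Hypothetical wrapper: unwrap a typing.Union, then defer to isinstance."""
    if get_origin(class_or_tuple) is Union:
        # Replace the union with the tuple of its member types.
        class_or_tuple = get_args(class_or_tuple)
    # Plain classes and tuples of classes pass through unchanged.
    return isinstance(obj, class_or_tuple)
```

The extra cost over a tuple check is one origin test and one attribute read, which is the reason the message expects no significant performance hit.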
That seems like a pretty good optimization! ChrisA
On Sep 5, 2019, at 04:24, Steven D'Aprano <steve@pearwood.info> wrote:
But having said all that, I'm not sure that we should be rejecting this proposal on the basis of performance when we haven't got any working code to measure performance of :)
Good point. But then if we never get any realistic use cases, it’ll probably already be rejected for that reason. :) So this is just something we should keep in mind when considering examples, not something we should use as an open-and-shut argument on its own, but I think it was still worth raising by [whoever raised it that I quoted in snipped text].
isinstance is a wrapper around PyObject_IsInstance(obj, class_or_tuple), and if I'm reading the C code correctly, PyObject_IsInstance is roughly equivalent to this Python pseudo-code:
    # Except in C, not Python
    def isinstance(obj, class_or_tuple):
        if type(class_or_tuple) is tuple:
            for C in class_or_tuple:
                if isinstance(obj, C):
                    return True
        else:
            ...

Wow, I didn’t realize it handled tuples recursively, but the docs say it does, and of course the docs and the code are right:

    >>> isinstance(2, (dict, (int, str)))
    True

Not really relevant to anything (Unions already explicitly handle that same thing at construction time: `Union[dict, Union[int, str]]` just returns `Union[dict, int, str]`), I’m just surprised that I never noticed that.
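Both behaviours mentioned here, recursive tuple handling in isinstance and flattening of nested unions at construction time, can be checked with a few assertions. The variable names nested and flat are just illustrative.

```python
from typing import Union

# isinstance walks nested tuples recursively.
assert isinstance(2, (dict, (int, str)))
assert not isinstance(2.5, (dict, (int, str)))

# Nested unions are flattened when they are constructed.
nested = Union[dict, Union[int, str]]
flat = Union[dict, int, str]
assert nested == flat
assert nested.__args__ == (dict, int, str)
```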
If Union is a built-in, we could have something like this:
    def isinstance(obj, class_or_tuple):
        if type(class_or_tuple) is Union:
            class_or_tuple = class_or_tuple.__union_params__
        # followed by the same code as above
typing.Union already defines .__union_params__ which returns a tuple of the classes used to construct the union, so in principle at least, there need be no significant performance hit from supporting Unions.
That’s a great point.

And if type.__or__ is going to return a union type, we probably don’t want that to go lazily importing typing and pulling something out of it to call __getitem__ on, so we probably want that result to be builtin.

I don’t think we need Union itself to be a builtin. But typing.Union.__getitem__ needs to return the same kind of builtin as type.__or__ (which is presumably exposed as something like types.UnionType, not only exposed in the typing module) instead of returning a typing._GenericAlias, or the whole point of this proposal (that `int|str == Union[int, str]`) breaks.

That does raise some more bikeshedding questions (what the constructor for types.union accepts, or whether it refuses to be constructed and forces you to use type.__or__ or Union.__getitem__; what its repr looks like; etc.).

And I suppose it also helps answer the question of why typing.Union is special for isinstance: it’s returning a completely different thing than every other generic’s __getitem__, so it’s not really the same kind of generic, so maybe nobody expects it to follow the same rules in the first place?

More generally, it changes the proposal to “create a new runtime union type that works the way you’d expect, then add syntax for that, and change typing.Union to take advantage of it”, which sounds conceptually better than “add syntax for creating typing.Union static types, and then add special support to just this one kind of static type to make it usable at runtime”, even if they’re close to equivalent in practice.
On Thu, Sep 05, 2019 at 12:12:05PM -0700, Andrew Barnert wrote:
On Sep 5, 2019, at 04:24, Steven D'Aprano <steve@pearwood.info> wrote:
But having said all that, I'm not sure that we should be rejecting this proposal on the basis of performance when we haven't got any working code to measure performance of :)
Good point. But then if we never get any realistic use cases, it’ll probably already be rejected for that reason. :)
Sorry, I don't understand your point about realistic use-cases. The realistic use-case for int|str (or Union[int, str]) in isinstance is exactly the same as the use-case for (int, str) and we've had that for many, many releases. I didn't think we still have to prove the utility of checking whether an object was an instance of class A or B.

This is about making the syntax look pretty, not adding new functionality. There are two proposals here:

* allow str|int in type annotations, as a nicer-looking and shorter equivalent for Union[str, int]

* allow Unions (whether spelled explicitly or with the | operator) to have runtime effects, specifically in isinstance and I presume issubclass, as a nicer-looking alternative to a tuple of classes.

(The first is a necessary but not sufficient condition for the second.)

Both are cosmetic changes. Neither involves new functionality which doesn't already exist. I don't think use-cases come into it. This seems to me to be a pair of purely design questions:

* are Unions important enough to get an operator? (I think so.)

* is it reasonable to relax the prohibition on isinstance(obj, Union)? (I think so.)

although there are subtleties that need to be considered, as your post goes on to mention.

-- Steven
Two more bits of bikeshedding… On Sep 5, 2019, at 12:12, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
On Sep 5, 2019, at 04:24, Steven D'Aprano <steve@pearwood.info> wrote:
If Union is a built-in, we could have something like this:
    def isinstance(obj, class_or_tuple):
        if type(class_or_tuple) is Union:
            class_or_tuple = class_or_tuple.__union_params__
        # followed by the same code as above
typing.Union already defines .__union_params__ which returns a tuple of the classes used to construct the union, so in principle at least, there need be no significant performance hit from supporting Unions.
That’s a great point.
And if type.__or__ is going to return a union type, we probably don’t want that to go lazily importing typing and pulling something out of it to call __getitem__ on, so we probably want that result to be builtin.
I don’t think we need Union itself to be a builtin. But typing.Union.__getitem__ needs to return the same kind of builtin as type.__or__ (which is presumably exposed as something like types.UnionType, not only exposed in the typing module) instead of returning a typing._GenericAlias, or the whole point of this proposal (that `int|str == Union[int, str]`) breaks.
That does raise some more bikeshedding questions (what the constructor for types.union accepts, or whether it refuses to be constructed and forces you to use type.__or__ or Union.__getitem__; what its repr looks like; etc.).
Also: Are runtime union types actually types, unlike the things in typing, or are they still non-type values that just have special handling as the second argument of isinstance and issubclass and maybe except statements?

I’d expect issubclass(int|str, int|str|bytes) to be true, and issubclass(int|str, int) to be false, not for both of them to raise exceptions about the first argument not being a type. And I don’t see any reason that “things designed to be used as types for runtime type checks” shouldn’t be types, with their type (types.UnionType or whatever) being a perfectly normal metaclass that inherits from type.

But, other than the issubclass question, I’m having a hard time imagining anywhere that it would make a difference.

While we’re at it:

    issubclass(int|str, types.UnionType)

I think this should be false because UnionType is not like typing.Union (which is a typing.Generic, and therefore on its own it has to have the useless meaning of “the type that includes all values of any unions of any 1 or more types”); it’s just a normal metaclass (with the normal meaning “the type of all union types”).

Finally, do we still need the existing Generic, typing.Union, at all? If types.UnionType defines a __getitem__, we could just do Union = types.UnionType. Would this do the right thing in every case, or could it break anything? I don’t know; I think it’s safer to leave typing.Union as-is (except for defining its __getitem__ to return the | of all of its arguments, instead of inheriting the _SpecialForm.__getitem__ behavior).
On Thu, Sep 05, 2019 at 05:41:50PM -0700, Andrew Barnert wrote:
Are runtime union types actually types, unlike the things in typing, or are they still non-type values that just have special handling as the second argument of isinstance and issubclass and maybe except statements?
Union, and unions, are currently types:

    py> isinstance(Union, type)
    True
    py> isinstance(Union[int, str], type)
    True

and I don't think that should change. I don't think you should be able to instantiate a Union (that's the current behaviour too). A Union of two types is not the same as inheriting from both types.
I’d expect issubclass(int|str, int|str|bytes) to be true, and issubclass(int|str, int) to be false, not for both of them to raise exceptions about the first argument not being a type.
Currently, issubclass accepts unions without raising, and that shouldn't change either. But I disagree that int|str is a subclass of int|str|bytes. There's no subclass relationship between the two: the (int|str).__bases__ won't include (int|str|bytes), and instances of int|str don't inherit from all three of int, str, bytes. Currently unions inherit from typing.Final, and that *may* change, but it surely won't change in such a way that Union[str|int] is equivalent to ``class Str_Int(str, int)``.
While we’re at it:
issubclass(int|str, types.UnionType)
I think this should be false
I have no opinion on that one :-)
Finally, do we still need the existing Generic, typing.Union, at all?
typing.Union will probably just become an alias to the (proposed) built-in types.Union. However, backwards compatibility requires that Union[int, str] is still supported, even if int|str is preferred.

-- Steven
On Friday, September 6, 2019, 1:51:35 AM PDT, Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Sep 05, 2019 at 05:41:50PM -0700, Andrew Barnert wrote:
Are runtime union types actually types, unlike the things in typing, or are they still non-type values that just have special handling as the second argument of isinstance and issubclass and maybe except statements?
Union, and unions, are currently types:
    py> isinstance(Union, type)
    True
    py> isinstance(Union[int, str], type)
    True
What version of Python are you using here? A python.org 3.7 install on my laptop, a fresh build of master (3.9.0a0) on my laptop, 3.6 on Pythonista, and 3.7 on repl.it (https://repl.it/repls/CriticalDismalProgrammer) all give me `False`. And, looking at the source to typing.py (https://github.com/python/cpython/blob/master/Lib/typing.py#L433) on either master or 3.7, I can't see how it _could_ return `True`.
and I don't think that should change. I don't think you should be able to instantiate a Union (that's the current behaviour too). A Union of two types is not the same as inheriting from both types.

Of course it isn't. That would be the opposite of a union. `Union[int, str]` is a type that can hold any value that's _either_ an int or a str. A type that can hold any value that's _both_ an int and a str would be an intersection. An intersection type would be a subtype of all of its types (but `Intersection[int, str]` still wouldn't be the same thing as `class IntStr(int, str): pass`), but a union type is not a subtype of any of its types.

(Of course there are no values that are both an int and a str, but with more protocol-y types, intersections are useful: plenty of things are both an Iterable and a Container, for example.)
I’d expect issubclass(int|str, int|str|bytes) to be true, and issubclass(int|str, int) to be false, not for both of them to raise exceptions about the first argument not being a type.
Currently, issubclass accepts unions without raising, and that shouldn't change either.

Again, what version are you using? Every version I try says it's a TypeError; again see repl.it 3.7 (TepidPowerlessSignature) for an example.

But I disagree that int|str is a subclass of int|str|bytes. There's no subclass relationship between the two: the (int|str).__bases__ won't include (int|str|bytes),

First, `issubclass` is about subtyping, not about declared inheritance. That's why we not only have, but extensively use, subclass hooks to provide subclass relationships entirely based on structure (like `Iterable`) or on registration (like `Sequence`):

    >>> issubclass(list, collections.abc.Sequence)
    True
    >>> issubclass(list, collections.abc.Iterable)
    True
    >>> list.__bases__
    (object,)

If `issubclass` were about inheritance rather than subtyping, `list` would have to inherit from a half dozen types in `collections.abc`, plus a half-dozen near-identical types in `typing` plus at least one more. Fortunately, it doesn't have to inherit from any of them, and doesn't, but Python can still recognize that it's a subtype of all of them.

And this is fundamental to the design of the static typing system, just as it is the dynamic typing system. If you declare a function to take a `typing.Sequence` argument, and you pass it a `list` value, it type-checks successfully.

So, static type checkers consider `Union[int, str]` to be a subtype of `Union[int, str, bytes]`. As they should. The dynamic checker, `issubclass`, currently refuses to do that test (except maybe on your machine?). But if it does do the test, why shouldn't it give the same answer as the static checker?

(Of course there _are_ cases where the dynamic and static type systems should, or even must, use different rules, but all those cases are for some specific reason. If you think there is such a reason here, you should be able to say what it is.)

and instances of int|str don't inherit from all three of int, str, bytes.

And again, you're confusing intersection and union. In fact, `int|str` doesn't inherit from _any_ of those three types and, more importantly, it isn't a subtype of any of them.

If `int|str` were a subtype of `int`, that would mean that every valid `int|str` value was also a valid `int` value. Which obviously isn't true; `"abc"` is an `int|str`, and it isn't an `int`.

But `int|str` _is_ a subtype of `int|str|bytes`. Every `int|str` value is an `int|str|bytes` value.

You can double-check with the applicability rule of thumb: If I have some value of type `int|str` and I try to use it with some code that requires an `int`, can it raise a `TypeError`? Yes; if the value is `"abc"`. What if I try to use it with some code that requires an `int|str|bytes`? No, it cannot raise a `TypeError`. You can do further checks with the LSP and other rules of thumb if you want, but they all point the same way (as, again, the static type system already recognizes).

Or, think about types as sets of values. The type `int` is literally just the infinite set of all possible `int` values. The type `int|str` is the union of the two sets `int` and `str`. The type `int|str|bytes` is the union of the three sets `int`, `str`, and `bytes`. And `issubclass` is just the subset relationship. So, is `int U str` a subset of `int U str U bytes`? Of course it is.
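The set-of-values argument can be turned into a small runtime check. The following is a sketch, not an existing API: members and is_union_subtype are hypothetical helper names, the code assumes unions of plain classes only, and it relies on typing.get_origin and typing.get_args (Python 3.8+).

```python
from typing import Union, get_args, get_origin


def members(tp):
    """Return the member types of a union, or a 1-tuple for a plain class."""
    return get_args(tp) if get_origin(tp) is Union else (tp,)


def is_union_subtype(sub, sup):
    """Hypothetical helper: every member of `sub` must fit some member of `sup`."""
    return all(
        any(issubclass(s, t) for t in members(sup))
        for s in members(sub)
    )
```

Under these assumptions, is_union_subtype(Union[int, str], Union[int, str, bytes]) is true and is_union_subtype(Union[int, str], int) is false, matching the answers argued for above.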
Finally, do we still need the existing Generic, typing.Union, at all?
typing.Union will probably just become an alias to the (proposed) built-in types.Union.

If `typing.Union` and `typing.Union[int, str]` actually are types as you say, then there's probably no harm in replacing the existing `typing.Union` with the new `types.UnionType`. But if they aren't, as testing and reading the code seems to show, then I'm less sure that it's harmless. Which is why I asked.
On Fri, Sep 06, 2019 at 07:44:19PM +0000, Andrew Barnert wrote:
Union, and unions, are currently types:
    py> isinstance(Union, type)
    True
    py> isinstance(Union[int, str], type)
    True
What version of Python are you using here?
Ah, I didn't realise that the behaviour has changed! I was using Python 3.5, but I just tried again in 3.8:

    py> isinstance(Union, type)
    False
    py> isinstance(Union[str, int], type)
    False

So it seems that both the implementation and the interface of unions have changed radically between 3.5 and now. My mistake for assuming that backwards compatibility would have meant they were the same. In 3.5:

    py> Union.__bases__
    (<class 'typing.Final'>,)
    py> Union.__mro__
    (typing.Union, <class 'typing.Final'>, <class 'object'>)
    py> Union.__class__
    <class 'typing.UnionMeta'>

In 3.8, the first two raise AttributeError, the third:

    py> Union.__class__
    <class 'typing._SpecialForm'>

I don't know the reason for this change, but in the absence of a compelling reason, I think it should be reversed.

Steven (me):
But I disagree that int|str is a subclass of int|str|bytes. There's no subclass relationship between the two: the (int|str).__bases__ won't include (int|str|bytes),
Andrew replied:
First, `issubclass` is about subtyping, not about declared inheritance.
That could be debated. help(issubclass) says:

    "Return whether 'cls' is a derived from another class or is the same class."

but of course ABCs and virtual subclassing do exist. My position is that in Python, issubclass is about inheritance, but you can fake it if you like, in which case "consenting adults" applies. If you do, it's still *subclassing*.

If you want to call this "subtyping", I won't argue too strongly. This Stackoverflow post:

https://stackoverflow.com/questions/45255270/union-types-in-scala-with-subty...

suggests that Scala considers that int|str is *not* a subtype of int|str|bytes, but Scala.js considers that it is. I don't know if that distinction still applies, or why, or the arguments for or against treating int|str as a subtype of int|str|bytes, but if Scala treated it as *not* a subtype, I don't think the question is as cut-and-dried as you make out.

But I will point out that the name of the function is *issubclass*, not *issubtype*.

If you want to test for a subtype relationship with unions, this should work:

    # Python 3.5
    py> Union[int, str].__union_set_params__ < Union[int, str, bytes].__union_set_params__
    True

    # Python 3.8
    py> set(Union[int, str].__args__) < set(Union[int, str, bytes].__args__)
    True

A little clunky, but if it were needed, we could improve the API. (Perhaps even by calling it "issubclass" as you say.)

-- Steven
On Friday, September 6, 2019, 5:41:23 PM PDT, Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, Sep 06, 2019 at 07:44:19PM +0000, Andrew Barnert wrote:
Union, and unions, are currently types:
    py> isinstance(Union, type)
    True
    py> isinstance(Union[int, str], type)
    True
What version of Python are you using here?
Ah, I didn't realise that the behaviour has changed! I was using Python 3.5, but I just tried again in 3.8:

I didn't even consider that this might be an old feature that was taken away, rather than a new feature that was added. Oh well, at least it gave me an excuse to configure and build 3.9. :)

I don't know the reason for this change, but in the absence of a compelling reason, I think it should be reversed.
My thinking is in the opposite direction. I'm assuming Guido or Jukka or whoever wouldn't have made this change without a compelling reason. And if we can't guess that reason, then someone (who cares more about pushing this change than me) has to do the research or ask the right people to find out. Without knowing that, we can't really decide whether it's worth reversing for all types, or just for union types specifically, or whether the proposal needs to be rethought from scratch (or scrapped, if that's not possible), and the conservative default should be the last one. But hopefully the default won't matter, because someone will care enough to find out the reasons and report them to us. (Or maybe Guido will just see this discussion and have the answers on the top of his head.)
Steven (me):
But I disagree that int|str is a subclass of int|str|bytes. There's no subclass relationship between the two: the (int|str).__bases__ won't include (int|str|bytes),
Andrew replied:
First, `issubclass` is about subtyping, not about declared inheritance.
That could be debated. help(issubclass) says:

    "Return whether 'cls' is a derived from another class or is the same class."

but of course ABCs and virtual subclassing do exist. My position is that in Python, issubclass is about inheritance, but you can fake it if you like, in which case "consenting adults" applies.

But it's not just a "consenting adults" feature that users can do whatever they want with, it's a feature that's used prevalently and fundamentally in the standard library (including being the very core of typing.py), and always with the consistent purpose of supporting (a couple of minor variations on the idea of) subtyping. So sure, technically it is inheritance testing with a bolt-on to add whatever you want (and of course historically, that _is_ how it evolved), but in practice, it's subtype testing.

Thanks to the usual practicality-beats-purity design, it isn't actually testing _exactly_ subtype either, of course. After all, while it would be silly to register `int` with `collections.abc.Sequence`, there's nothing actually stopping you from doing so, and that obviously won't actually make `int` a subtype of `Sequence`, but it will fool `issubclass` into believing it is. But, except when you're intentionally breaking things for good or bad reasons, what it tests is much closer to subtype than inheritance. In fact, even when you break things for good reasons, often it's to better match the subtyping expectation, not to violate it. (Consider `Sequence` and `Mapping`, which act as if they were structural subtyping tests like the other collection ABCs, despite the fact that it's actually impossible to distinguish the two types that way so they cheat with a registry. Try doing that with Swift or Go. :)

This Stackoverflow post:

union types in scala with subtyping: A|B <: A|B|C

suggests that Scala considers that int|str is *not* a subtype of int|str|bytes, but Scala.js considers that it is.

Without reading the whole question carefully, and without refreshing my fuzzy memory of Scala's type system, I think there are two things going on here.

First, the OP in that question seems to be trying to build his own disjunction type that acts like a union in other languages (including, apparently, Scala.js?), and he just didn't know how to code it right.

So, why would anyone do that in the first place? Well, the bigger issue is that Scala's unions aren't quite the same thing we're talking about here in the first place, despite the fact that they're what inspired this whole thread. In most languages, the union of int and str is a special type that's defined by including all int values and all str values (and nothing else), or something similar to that. In Scala, the union of int and str is defined as the least upper bound of int and str on the lattice of all types (which of course provably does include all int values and all str values, because int and str are subtypes). In simple cases this ends up doing the same thing. And when they differ, it's usually that Scala can infer something cool that other languages can't. But I do vaguely remember one case where Scala couldn't infer a type for me and (unlike, say, Haskell or Kotlin, which may fail more often but can always tell you exactly why they failed and what you need to add to fix it) it couldn't tell that it couldn't infer it, and it went and did something crazy like… an infinitely-long compile that ate up all my disk space or something? I forget.

But I will point out that the name of the function is *issubclass*, not *issubtype*.

Sure, and the `class` statement creates a `type` instance. Which are sometimes called `classes` and sometimes `types`. Since 2.3, all classes are types and all types are classes, and the root of the metaclass hierarchy is called `type`, and Python just mixes and matches the two words arbitrarily, and it almost never causes confusion. (Of course in pre-2.3 Python, they were completely unrelated things, and `issubclass` only worked on classes, hence the name.)
If you try and import a meaning from another language, then sometimes it can get confusing. But think about it: in, say, Java, "subclass" explicitly means only a concrete class inheriting implementation from another concrete class; a class implementing an interface is not subclassing. Which means a test for subclassing in the Java sense would be (a) not what Python does, and (b) pointless.
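The registration point made in this message can be seen directly. In this sketch, NotReallyASequence is a hypothetical class invented for illustration (used in place of registering `int` itself, which would alter a built-in's behaviour for the whole process).

```python
from collections.abc import Sequence


class NotReallyASequence:
    """Hypothetical class with none of the Sequence methods."""


# Subtyping without inheritance: list is registered as a Sequence.
assert issubclass(list, Sequence)
assert Sequence not in list.__mro__

# Registration is taken on trust; issubclass does not inspect the class.
assert not issubclass(NotReallyASequence, Sequence)
Sequence.register(NotReallyASequence)
assert issubclass(NotReallyASequence, Sequence)
assert not hasattr(NotReallyASequence, "__getitem__")
```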
In reply to Andrew Barnert via Python-ideas <python-ideas@python.org>, Sat, Sep 7, 2019, 04:34:

Hello,

I have tried to implement a patch for ``isinstance()`` and ``issubclass()``. It's here <https://github.com/pprados/cpython/tree/updage_isinstance> (variations are possible). I have not patched mypy yet.

    # Without patch and Tuple
    $ ./python -m timeit -s 'isinstance("",(int,str))'
    50000000 loops, best of 5: 6.29 nsec per loop

    # With patch and Tuple
    $ ./python -m timeit -s 'isinstance("",(int,str))'
    50000000 loops, best of 5: 5.27 nsec per loop

    # With patch and Union
    $ ./python -m timeit -s 'isinstance("",int|str)'
    50000000 loops, best of 5: 5.23 nsec per loop
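For anyone reproducing measurements like these: in `python -m timeit`, the `-s` option supplies setup code, which runs once and is not timed, so the call to be measured normally goes in the statement position. A minimal sketch using the timeit module follows; the SETUP string, the CANDIDATES table and the measure function are hypothetical names, and only spellings that work on an unpatched interpreter are included. On a build where `int|str` is accepted by isinstance, that spelling could be added as another candidate.

```python
import timeit

# Hypothetical micro-benchmark; absolute numbers vary by machine and build.
SETUP = 'value = ""'
CANDIDATES = {
    "tuple": "isinstance(value, (int, str))",
    "chained": "isinstance(value, int) or isinstance(value, str)",
}


def measure(number=200_000, repeat=5):
    """Return the best per-call time in nanoseconds for each candidate."""
    results = {}
    for name, stmt in CANDIDATES.items():
        # timeit.repeat returns one total time per run; keep the fastest run.
        best = min(timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat))
        results[name] = best / number * 1e9
    return results


if __name__ == "__main__":
    for name, nsec in measure().items():
        print(f"{name}: {nsec:.1f} nsec per call")
```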
On Sep 5, 2019, at 02:15, Chris Angelico <rosuav@gmail.com> wrote:
On Thu, Sep 5, 2019 at 7:08 PM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
On Sep 4, 2019, at 20:51, Inada Naoki <songofacandy@gmail.com> wrote:
And you’re right, because int|str _looks_ better than (int, str) here, many people will be encouraged to use it even though it’s slower, which could potentially be a bad thing for some programs.
That's exactly my point. If we say "you can use `isinstance(x, int | str)` for now", people may think it is a new and recommended way to write it.
Right; I was agreeing, and saying it may even be worse than you’re suggesting. There’s always the possibility that any shiny new feature looks like the One True Way and gets overused. But in this case, it looks more obviously right even to someone who’s not interested in shiny new features and has never heard the word “pythonic” or read a new version’s release notes. So they’re going to use it. Which means if they run into a situation where they’re checking types of a zillion objects, they’re definitely not going to guess that it’s 5x as slow, so they’re definitely going to misuse it.
Hang on hang on.... what's this situation where you're checking types of a zillion objects?
Exception was the first one I thought of, but, as you argue, that’s a bit of a special case. And my next thought was the kind of thing you’d do with pattern matching in a different language but which would be more naturally (and often more efficiently) done with method calls (or maybe @singledispatch) in Python. But here’s an example from the stdlib, in the json module: Deep inside the recursive encoder function, there’s a string of `elif isinstance(value, …):` calls. And at least one of them does a check with a tuple today (to special-case list and tuple for encoding as JSON arrays without needing to call any user callbacks). In a JSON structure with zillions of nodes, this check will happen zillions of times—once for every node that’s a list or tuple or any type that gets checked after those types. This isn’t the most compelling example. Most nodes are probably captured before this check, and the ones lower on the chain are all the ones that are inherently slow to process so the added cost of getting there is less important, and so on. Plus, the very fact that the author thought of optimizations like that implies that the author would have had no problem profiling and replacing `list|tuple` with `(list, tuple)` if it makes a difference. But the fact that I found an example in the stdlib in 2 minutes of searching implies that this probably isn’t nearly as rare as you’d at first expect.
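To make the json example concrete, here is a minimal sketch of that kind of per-node check. It is not the actual json encoder code: `check_tuple`, `check_union`, `ARRAY_TYPES` and `count_arrays` are hypothetical names, and the union variant unpacks a `typing.Union` on every call to emulate a naive implementation that has not been special-cased. Timings depend on the machine and the Python version, so none are claimed here.

```python
import timeit
from typing import Union, get_args

ARRAY_TYPES = Union[list, tuple]  # hypothetical alias, for illustration only


def check_tuple(value):
    # Today's spelling: a tuple of alternative types.
    return isinstance(value, (list, tuple))


def check_union(value):
    # Naive emulation of a union check: unpack the Union on every call.
    return isinstance(value, get_args(ARRAY_TYPES))


def count_arrays(nodes, check):
    # One check per node, as a recursive encoder would do for every node.
    return sum(1 for node in nodes if check(node))


nodes = [[], (), {}, "s", 1] * 1000

for check in (check_tuple, check_union):
    seconds = timeit.timeit(lambda: count_arrays(nodes, check), number=20)
    print(f"{check.__name__}: {seconds:.4f}s for 20 passes over {len(nodes)} nodes")
```

Both variants classify the nodes identically; only the cost of each check differs, which is the part that would need measuring against a real implementation.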
On Fri, Sep 6, 2019 at 4:12 AM Andrew Barnert <abarnert@yahoo.com> wrote:
On Sep 5, 2019, at 02:15, Chris Angelico <rosuav@gmail.com> wrote:
On Thu, Sep 5, 2019 at 7:08 PM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
Which means if they run into a situation where they’re checking types of a zillion objects, they’re definitely not going to guess that it’s 5x as slow, so they’re definitely going to misuse it.
Hang on hang on.... what's this situation where you're checking types of a zillion objects?
Exception was the first one I thought of, but, as you argue, that’s a bit of a special case. And my next thought was the kind of thing you’d do with pattern matching in a different language but which would be more naturally (and often more efficiently) done with method calls (or maybe @singledispatch) in Python.
But here’s an example from the stdlib, in the json module:
`elif isinstance(value, …):` ... with a tuple today (to special-case list and tuple).
But the fact that I found an example in the stdlib in 2 minutes of searching implies that this probably isn’t nearly as rare as you’d at first expect.
We need an implementation to benchmark before we can be sure, but part of my "hang on" was that, even when there ARE lots of objects to check, a single isinstance per object is a vanishingly small part of the overall job. I suppose you might find a performance regression on json.dumps([[]]*10000) but the other costs are normally going to dominate it. In any case, though, the Union type can itself be special-cased inside isinstance to make this efficient again (as Steven showed), which means this is highly unlikely to be "5x as slow", making this entire subthread fairly moot :) (As a side effect of this change, I wouldn't be sorry to bypass the grammatical ambiguity (from history) of the comma in an except clause. Currently "except Exc1, Exc2:" is a syntax error, but "except Exc1|Exc2:" would be perfectly valid.) ChrisA
I think the "foo | bar" syntax for Union is pretty clear, I like it! The ~foo for Optional is... not that obvious. Not sure it's a win. On Thu, 29 Aug 2019 at 13:49, Philippe Prados <philippe.prados@gmail.com> wrote:
Hello everybody,
Scala 3 proposes a new syntax for union types. See here <https://dotty.epfl.ch/docs/reference/new-types/union-types.html>. I propose to add a similar syntax in Python.
    # Operator for Union
    assert int | str == Union[int, str]
    assert int | str | float == Union[int, str, float]
    # Operator for Optional
    assert ~int == Optional[int]
Now, it's possible to write:
def fn(bag:List[int | str], option: ~int = None) -> float | str: ...
in place of
def fn(bag: List[Union[int, str]], option: Optional[int] = None) -> Union[float, str]: ...
I think these syntaxes are more clear, and can help with the adoption of typing.
I have implemented and tested these ideas in two forks: one for CPython <https://github.com/pprados/cpython> and one for MyPy <https://github.com/pprados/mypy>. See the branches add_OR_to_types (for the Union syntax) or add_INVERT_to_types (for the Union and Optional syntax).
How did I implement that? I added the operators __or__ and __invert__ to PyType_Type. The C code is roughly equivalent to:

    from typing import *

    def type_or(self, right):
        return Union[self, right]

    # Illustrative only: the built-in `type` cannot be patched from Python code.
    type(type).__or__ = type_or
Currently, the accepted syntax for typing is:

    annotation: name_type
    name_type: NAME (args)?
    args: '[' paramslist ']'
    paramslist: annotation (',' annotation)* [',']

I propose to extend the syntax to:

    annotation: ( name_type | or_type | invert_type )
    name_type: NAME (args)?
    args: '[' paramslist ']'
    paramslist: annotation (',' annotation)* [',']
    or_type: name_type '|' annotation
    invert_type: '~' annotation
What do you think about that ?
The draft of a PEP is here <https://github.com/pprados/peps/blob/master/pep-9999.rst>.
Regards
-- Gustavo J. A. M. Carneiro Gambit Research "The universe is always one step beyond logic." -- Frank Herbert
I tried +foo in place of ~foo. Kotlin offers null safety <https://kotlinlang.org/docs/reference/null-safety.html> with foo?, but that syntax would have an impact on the BNF and on all other tools. On Thu, Aug 29, 2019 at 15:58, Ricky Teachey <ricky@teachey.org> wrote:
I like this idea.
The ~foo for Optional is... not that obvious. Not sure it's a win.
I agree. Seems like `foo | None` is just as readable. Assuming that None would be swapped out for NoneType, of course.
I tried +foo in place of ~foo.
+foo seems a lot better to me than ~foo, but i still lean towards `foo | None` as "good enough". it's 3 fewer characters than `Optional[foo]`, or 30 fewer if you include the full removal of `from typing import Optional`. the additional gain of +foo is only 6 characters. i suppose 6 could be significant if you are hitting up against a line length limit. +foo definitely seems to say "foo, plus something else" to me much more than ~foo.
On 29/08/2019 15:32:38, Philippe Prados wrote:
I tried +foo in place of ~foo. Kotlin offers null safety <https://kotlinlang.org/docs/reference/null-safety.html> with foo?, but that syntax would have an impact on the BNF and on all other tools.
On Thu, Aug 29, 2019 at 15:58, Ricky Teachey <ricky@teachey.org> wrote:
I like this idea.
The ~foo for Optional is... not that obvious. Not sure it's a win.
I agree. Seems like `foo | None` is just as readable. Assuming that None would be swapped out for NoneType, of course.
This might be a nonsensical suggestion:-) as I don't understand annotations, but would it be possible to use one of (foo) [foo] or some such? Rob Cliffe
On Thu, 29 Aug 2019 at 14:58, Ricky Teachey <ricky@teachey.org> wrote:
I like this idea.
The ~foo for Optional is... not that obvious. Not sure it's a win.
I agree. Seems like `foo | None` is just as readable. Assuming that None would be swapped out for NoneType, of course.
Agreed, `foo | None` is short and readable. There really is no need for special syntax for Optional. -- Gustavo J. A. M. Carneiro Gambit Research "The universe is always one step beyond logic." -- Frank Herbert
On Aug 29, 2019, at 05:25, Philippe Prados <philippe.prados@gmail.com> wrote:
Hello everybody, Scala 3 proposes a new syntax for union types. See here. I propose to add a similar syntax in Python.

    # Operator for Union
    assert int | str == Union[int, str]
    assert int | str | float == Union[int, str, float]
    # Operator for Optional
    assert ~int == Optional[int]
One immediate problem here is that you’d have to add methods to the builtin type type or this would be illegal at runtime. Which means you couldn’t use this feature in Python 3.7, much less 2.7. I’m not sure maintaining backward compatibility in typing and in mypy is still as important today as it was 5 years ago, but I’m pretty sure it hasn’t been abandoned entirely.

Also, for ~, that seems pretty misleading. | means union, and not just in Python. And I’m pretty sure it’s the most common way to spell union/sum types and related things across languages. So, str|int isn’t just easier to type, it should actually aid comprehension. But ~ means complement, which is a completely different thing from |None. And the most common way to spell Optional as an operator across languages is ?. Of course I wouldn’t actually expect ~a to mean “any type but a”, but only because of the meta-thought that such a declaration would be completely useless so you must have intended something different. So ~int seems like it would actually harm comprehension instead of helping.

Also, IIRC, multiple shorthands for both Union and Optional were suggested back during the original discussion, including str|int and {str,int} (which doesn’t have the backward compatibility problem) and maybe others. If they were all rejected, the reasoning is probably in the list archives or the GitHub issues repo, so if you want to re-suggest one of them, you probably want to find the original rejection and explain why it no longer applies.
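The need for methods on the type of classes can be seen from pure Python. The built-in `type` cannot be patched from Python code, but a metaclass can emulate the proposed operators for the classes that use it. The sketch below is only an emulation under that assumption, not the C patch from the proposal; `UnionMeta`, `Foo` and `Bar` are hypothetical names.

```python
from typing import Optional, Union


class UnionMeta(type):
    """Hypothetical metaclass emulating the proposed operators on classes."""

    def __or__(cls, other):
        # Foo | Bar  ->  Union[Foo, Bar]
        return Union[cls, other]

    def __ror__(cls, other):
        # Reflected form, used when the left operand cannot handle `|` itself,
        # for example None | Foo.
        return Union[other, cls]

    def __invert__(cls):
        # ~Foo  ->  Optional[Foo]
        return Optional[cls]


class Foo(metaclass=UnionMeta):
    pass


class Bar(metaclass=UnionMeta):
    pass


print(Foo | Bar)
print(~Foo)
```

With `__ror__` defined, both `Foo | None` and `None | Foo` compare equal to `Optional[Foo]` without touching `None` itself. Classes that do not use the metaclass, including the built-in types, get nothing from this emulation, which is why the proposal needs a change to `type`.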
What about using `(int, str)` for indicating a `Union`? This doesn't have compatibility issues and it's similar to `isinstance(foo, (int, str))`, so it should be fairly intuitive: def bar(foo: (int, str) = 0): ... Also it's similar to `get_args(Union[int, str])`.
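A small sketch of the parallel being drawn here, using only `typing.get_args` and `isinstance`. The function `bar` mirrors the one above, except that it returns its argument so there is something to call; the tuple annotation is plain runtime data and no type checker is assumed to understand it as a union.

```python
from typing import Union, get_args

# get_args unpacks a Union into the same tuple shape that isinstance accepts.
alternatives = get_args(Union[int, str])
print(alternatives)


def bar(foo: (int, str) = 0):
    return foo


# The annotation is stored as an ordinary tuple, so it can be fed to isinstance.
annotation = bar.__annotations__["foo"]
print(isinstance("text", annotation), isinstance(1.5, annotation))
```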
On Aug 29, 2019, at 12:09, Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
What about using `(int, str)` for indicating a `Union`? This doesn't have compatibility issues and it's similar to `isinstance(foo, (int, str))`, so it should be fairly intuitive:
def bar(foo: (int, str) = 0): ...
In most languages with similar-ish type syntax, (int, str) means Tuple[int, str], not Union[int, str]. Scala and TypeScript copied this from ML just like Haskell and F# did. And I’d bet this is the main reason that {str, int} rather than (str, int) was proposed the first time around. But, balanced against the long-standing Python-specific use of tuples for small numbers of alternatives, including alternative types in places like isinstance, except, etc.? Maybe that beats the cross-linguistic issue. Either one seems a lot better than breaking backward compatibility by adding new operator methods to the type type.
On Fri, Aug 30, 2019 at 5:46 AM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
On Aug 29, 2019, at 12:09, Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
What about using `(int, str)` for indicating a `Union`? This doesn't have compatibility issues and it's similar to `isinstance(foo, (int, str))`, so it should be fairly intuitive:
def bar(foo: (int, str) = 0): ...
In most languages with similar-ish type syntax, (int, str) means Tuple[int, str], not Union[int, str]. Scala and TypeScript copied this from ML just like Haskell and F# did. And I’d bet this is the main reason that {str, int} rather than (str, int) was proposed the first time around.
But, balanced against the long-standing Python-specific use of tuples for small numbers of alternatives, including alternative types in places like isinstance, except, etc.? Maybe that beats the cross-linguistic issue.
Either one seems a lot better than breaking backward compatibility by adding new operator methods to the type type.
How does that break backward compat? ChrisA
On Aug 29, 2019, at 12:54, Chris Angelico <rosuav@gmail.com> wrote:
Either one seems a lot better than breaking backward compatibility by adding new operator methods to the type type.
How does that break backward compat?
It doesn’t make Python backward incompatible; it does mean that if typing or mypy relies on it, it becomes incompatible with earlier versions of Python (or has to fork different code for 3.8+ that relies on type.__or__ being available and 3.7- that doesn’t have whatever functionality relies on that).
On Fri, Aug 30, 2019 at 7:59 AM Andrew Barnert <abarnert@yahoo.com> wrote:
On Aug 29, 2019, at 12:54, Chris Angelico <rosuav@gmail.com> wrote:
Either one seems a lot better than breaking backward compatibility by adding new operator methods to the type type.
How does that break backward compat?
It doesn’t make Python backward incompatible; it does mean that if typing or mypy relies on it, it becomes incompatible with earlier versions of Python (or has to fork different code for 3.8+ that relies on type.__or__ being available and 3.7- that doesn’t have whatever functionality relies on that).
Ohh, gotcha. I'd describe that not as breaking backward compatibility but as breaking the backport (in that typing.py can easily be backported but core types can't). Still, I think it would be a valuable enhancement, even if it can't be depended upon for older versions - anything needing compatibility can have identical functionality with the longer spelling. ChrisA
On Thu, Aug 29, 2019 at 2:59 PM Andrew Barnert via Python-ideas < python-ideas@python.org> wrote:
It doesn’t make Python backward incompatible; it does mean that if typing or mypy relies on it, it becomes incompatible with earlier versions of Python (or has to fork different code for 3.8+ that relies on type.__or__ being available and 3.7- that doesn’t have whatever functionality relies on that).
That doesn't strike me as a blocker -- there are several things in the typing syntax that require a certain minimum version. E.g. type annotations require Python 3 (whereas type comments work in Python 2 too), type annotations on variables (PEP 526) require 3.6+, `from __future__ import annotations` (PEP 563) requires 3.7+. That said I'd rather not introduce new syntax (like `T?` or `?T`) for `Optional[T]` -- let's see what we can do with the existing operators. I think `~T` looks okay. I'm currently working on annotating a very large codebase, and `Optional[T]` is so frequent that I think `T | None` would not be enough of an improvement. Adding a default `__or__` overload to `type` seems a reasonable price to pay in 3.9, and ditto for `__invert__`. Type checkers can support this in older Python versions using PEP 563 or in type comments or in "forward references" (types hidden in string literals). A wart will be that we can make `int | None` work but we shouldn't make `None | int` work (I don't want to add any new operator overloads to `None`, it should always be an error). Open question: at runtime, what should `int | str` return? I don't want this to have to import the typing module. Maybe we could make a very simple `Union` builtin. This can then also be used by `~int` (which is equivalent to `int | None`). -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
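The "types hidden in string literals" route can be sketched as follows. The annotations are written as strings, so nothing is evaluated when the function is defined, whatever the interpreter version; PEP 563 gives the same effect implicitly. `fn` and `try_resolve` are hypothetical names, and whether the evaluation step succeeds depends on whether the running interpreter defines `|` on classes, so no particular outcome is claimed.

```python
def fn(option: "int | None" = None) -> "float | str":
    return 0.0


def try_resolve(func):
    """Evaluate each stored annotation string; return the error if that fails."""
    try:
        return {
            name: eval(text, func.__globals__)
            for name, text in func.__annotations__.items()
        }
    except TypeError as exc:
        # Raised when the interpreter has no `|` operator for classes.
        return exc


# The raw annotations are plain strings; a static checker can parse them
# without the runtime ever evaluating `int | None`.
print(fn.__annotations__)
print(try_resolve(fn))
```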
On Fri, Aug 30, 2019 at 8:28 AM Guido van Rossum <guido@python.org> wrote:
Open question: at runtime, what should `int | str` return? I don't want this to have to import the typing module. Maybe we could make a very simple `Union` builtin. This can then also be used by `~int` (which is equivalent to `int | None`).
Would it be okay to have a very simple Union builtin now, and it just always returns exactly that, and then in the future it might potentially actually return Union[int, str] ? I'm not pushing for it *now*, but it would be extremely handy in the future to be able to say isinstance(3, int|str) and have it be meaningful. ChrisA
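For illustration, the behaviour wished for here can be emulated today with a helper that unpacks a `typing.Union` before delegating to the built-in `isinstance`. This is only a sketch of the semantics, assuming Python 3.8+ for `get_origin` and `get_args`; `isinstance_union` is a hypothetical name, not a proposed API.

```python
from typing import Optional, Union, get_args, get_origin


def isinstance_union(obj, tp):
    """Like isinstance, but also accepts a typing.Union of plain classes."""
    if get_origin(tp) is Union:
        # Union[int, str] -> (int, str), which isinstance already understands.
        return isinstance(obj, get_args(tp))
    return isinstance(obj, tp)


print(isinstance_union(3, Union[int, str]))
print(isinstance_union(3.0, Union[int, str]))
print(isinstance_union(None, Optional[int]))
```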
On Thu, Aug 29, 2019 at 3:33 PM Chris Angelico <rosuav@gmail.com> wrote:
On Fri, Aug 30, 2019 at 8:28 AM Guido van Rossum <guido@python.org> wrote:
Open question: at runtime, what should `int | str` return? I don't want this to have to import the typing module. Maybe we could make a very simple `Union` builtin. This can then also be used by `~int` (which is equivalent to `int | None`).
Would it be okay to have a very simple Union builtin now, and it just always returns exactly that, and then in the future it might potentially actually return Union[int, str] ?
I'm not pushing for it *now*, but it would be extremely handy in the future to be able to say isinstance(3, int|str) and have it be meaningful.
Are you suggesting we introduce the "very simple Union builtin" earlier than the "int | str" notation/implementation? Why? 3.8 is closed for features, so it would be 3.9 at the earliest -- plenty of time to implement this right (including `isinstance(x, int|str)`). I do think we should probably review PEP 585 before doing anything about unions specifically -- likely there are bigger fish to fry. (And PEP 585 has not received much discussion.) -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
On Fri, Aug 30, 2019 at 8:43 AM Guido van Rossum <guido@python.org> wrote:
On Thu, Aug 29, 2019 at 3:33 PM Chris Angelico <rosuav@gmail.com> wrote:
On Fri, Aug 30, 2019 at 8:28 AM Guido van Rossum <guido@python.org> wrote:
Open question: at runtime, what should `int | str` return? I don't want this to have to import the typing module. Maybe we could make a very simple `Union` builtin. This can then also be used by `~int` (which is equivalent to `int | None`).
Would it be okay to have a very simple Union builtin now, and it just always returns exactly that, and then in the future it might potentially actually return Union[int, str] ?
I'm not pushing for it *now*, but it would be extremely handy in the future to be able to say isinstance(3, int|str) and have it be meaningful.
Are you suggesting we introduce the "very simple Union builtin" earlier than the "int | str" notation/implementation? Why? 3.8 is closed for features, so it would be 3.9 at the earliest -- plenty of time to implement this right (including `isinstance(x, int|str)`).
I do think we should probably review PEP 585 before doing anything about unions specifically -- likely there are bigger fish to fry. (And PEP 585 has not received much discussion.)
No, I mean that at run-time, int|str might return a very simple object in 3.9, rather than everything that you'd need to grab from importing typing. Wondering if doing so would close off the possibility of, in 3.12 or something, making it a more directly usable "type union" that has other value. ChrisA
On Thu, Aug 29, 2019 at 3:55 PM Chris Angelico <rosuav@gmail.com> wrote:
On Fri, Aug 30, 2019 at 8:43 AM Guido van Rossum <guido@python.org> wrote:
On Thu, Aug 29, 2019 at 3:33 PM Chris Angelico <rosuav@gmail.com> wrote:
On Fri, Aug 30, 2019 at 8:28 AM Guido van Rossum <guido@python.org> wrote:
Open question: at runtime, what should `int | str` return? I don't want this to have to import the typing module. Maybe we could make a very simple `Union` builtin. This can then also be used by `~int` (which is equivalent to `int | None`).
Would it be okay to have a very simple Union builtin now, and it just always returns exactly that, and then in the future it might potentially actually return Union[int, str] ?
I'm not pushing for it *now*, but it would be extremely handy in the future to be able to say isinstance(3, int|str) and have it be meaningful.
Are you suggesting we introduce the "very simple Union builtin" earlier than the "int | str" notation/implementation? Why? 3.8 is closed for features, so it would be 3.9 at the earliest -- plenty of time to implement this right (including `isinstance(x, int|str)`).
I do think we should probably review PEP 585 before doing anything about unions specifically -- likely there are bigger fish to fry. (And PEP 585 has not received much discussion.)
No, I mean that at run-time, int|str might return a very simple object in 3.9, rather than everything that you'd need to grab from importing typing. Wondering if doing so would close off the possibility of, in 3.12 or something, making it a more directly usable "type union" that has other value.
I think typing should just re-export the builtin Union and deal with it. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
On Thu, 29 Aug 2019 at 23:48, Guido van Rossum <guido@python.org> wrote:
On Thu, Aug 29, 2019 at 3:33 PM Chris Angelico <rosuav@gmail.com> wrote:
On Fri, Aug 30, 2019 at 8:28 AM Guido van Rossum <guido@python.org> wrote:
[...]
I do think we should probably review PEP 585 before doing anything about unions specifically -- likely there are bigger fish to fry. (And PEP 585 has not received much discussion.)
I also agree with this. Generally I am fine with Union[int, str] and Optional[int], but I also see how some people might want a shorter notation. Many things around typing have been previously rejected because we didn't want to introduce any (or at least minimal) changes to the syntax and runtime, but now that typing is much more widely used we can reconsider some of these. Importantly, I think this should be done in a systematic way (potentially using PEP 585 draft as a starting point). -- Ivan
With my implementation, I can check that `assert int | None == None | int` passes. On Mon, Sep 2, 2019 at 13:32, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
On Thu, 29 Aug 2019 at 23:48, Guido van Rossum <guido@python.org> wrote:
On Thu, Aug 29, 2019 at 3:33 PM Chris Angelico <rosuav@gmail.com> wrote:
On Fri, Aug 30, 2019 at 8:28 AM Guido van Rossum <guido@python.org> wrote:
[...]
I do think we should probably review PEP 585 before doing anything about unions specifically -- likely there are bigger fish to fry. (And PEP 585 has not received much discussion.)
I also agree with this. Generally I am fine with Union[int, str] and Optional[int], but I also see how some people might want a shorter notation. Many things around typing have been previously rejected because we didn't want to introduce any (or at least minimal) changes to the syntax and runtime, but now that typing is much more widely used we can reconsider some of these. Importantly, I think this should be done in a systematic way (potentially using PEP 585 draft as a starting point).
-- Ivan
Hello, I propose to summarize all the arguments (I add my remarks in italics) and the major questions.

Add a new operator for `Union[type1|type2]` ?
- CONS: This is not a new proposal. If I recall correctly, it was proposed way back at the very beginning of the type-hinting discussion, and there has been at least one closed feature request for it: https://github.com/python/typing/issues/387
  - It is maybe too late to change this; many people have already gotten used to the current notation.
    - I propose to add a new notation, not to replace the existing one
  - This syntax is difficult to google if someone encounters it in code
  - It is still not possible to use `|` for unions because of built-in types. (This would require a corresponding slot in type, which is a non-starter)
    - I did it
  - There is currently no volunteer to implement this in mypy
    - I implemented it (one patch for CPython <https://github.com/pprados/cpython> and one for MyPy <https://github.com/pprados/mypy>).
  - “but as @ilevkivskyi pointed out, that is not an option (at least until Python 4).”
    - Is it time now ?
- PRO: It’s similar to Scala <https://dotty.epfl.ch/docs/reference/new-types/union-types.html>
- PRO: Seems like `foo | None` is just as readable
- PRO: Which means you couldn’t use this feature in Python 3.7, much less 2.7. I’m not sure maintaining backward compatibility in typing and in mypy is still as important today as it was 5 years ago, but I’m pretty sure it hasn’t been abandoned entirely.
- CONS: adding the operator introduces a dependency on typing in builtins
- CONS: supporting this would likely break compatibility with existing code that overloads `|` for class objects using a metaclass. We could perhaps work around this by making `|` inside an annotation context different from the regular `|` operator.
  - A workaround is to use `Union[type1,type2]` in this case
- CONS: it breaks the backport (in that typing.py can easily be backported but core `types` can't)
  - There are several things in the typing syntax that require a certain minimum version. E.g. type annotations require Python 3 (whereas type comments work in Python 2 too), type annotations on variables (PEP 526) require 3.6+, `from __future__ import annotations` (PEP 563) requires 3.7+.
- PRO: I mean that at run-time, `int|str` might return a very simple object in 3.9, rather than everything that you'd need to grab from importing `typing`. Wondering if doing so would close off the possibility of, in 3.12 or something, making it a more directly usable "type union" that has other value.
- CONS: if Python itself doesn't have to be changed, we'd still need to implement it in mypy, Pyre, PyCharm, Pytype, and who knows what else.
  - My patch for mypy is just 20 lines of code

If yes,
- Change only PEP484 <https://www.python.org/dev/peps/pep-0484/> (Type hints) to accept the syntax `type1 | type2` ?
  - PRO: PEP563 <https://www.python.org/dev/peps/pep-0563/> (Postponed Evaluation of Annotations) is enough to accept this proposition
  - CONS: The section Resolving type hints at runtime <https://www.python.org/dev/peps/pep-0563/#resolving-type-hints-at-runtime> says: “For code which uses annotations for other purposes, a regular eval(ann, globals, locals) call is enough to resolve the annotation.”. Without adding a new operator `__or__` to the type `type`, it’s not possible to resolve type hints at runtime.
    >>> from __future__ import annotations
    >>> def foo() -> int | str: pass
    ...
    >>> eval(foo.__annotations__['return'])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 1, in <module>
    TypeError: unsupported operand type(s) for |: 'type' and 'type'
  - CONS: Without the operator, it’s not possible to write:

    >>> u = int | str
    >>> u
    typing.Union[int, str]
- Use `(int, str)` in place of `Union[int,str]` ?
  - PRO: This doesn't have compatibility issues and it's similar to `isinstance(foo, (int, str))`
  - PRO: Either one is better than breaking backward compatibility by adding new operator methods to the type `type`.
  - CONS: In most languages with similar-ish type syntax, `(int, str)` means `Tuple[int, str]`, not `Union[int, str]`.
- Use `{int, str}` in place of `Union[int,str]` ?
  - PRO: big advantage of `{int, str}` over `int|str`. It doesn't require adding anything to `type`, and we don't need to introduce a new lightweight builtin union type.

Add a new operator for `Optional[type]` ?
- CONS: `foo | None` is short and readable
- CONS: `foo | None` is 3 fewer characters than `Optional[foo]`, or 30 fewer if you include the full removal of `from typing import Optional`. The additional gain of `~foo` is only 6 characters.
- PRO: it helps readability when there are a lot of parameters:
    def f(source: str | None, destination: str | None, param: int | None):...
    def f(source: ~str, destination: ~str, param: ~int):...
- PRO: I'm currently working on annotating a very large codebase, and `Optional[T]` is so frequent that I think `T | None` would not be enough of an improvement.
- PRO: Adding a default `__or__` overload to `type` seems a reasonable price to pay in 3.9, and ditto for `__invert__`. Type checkers can support this in older Python versions using PEP 563 or in type comments or in "forward references" (types hidden in string literals).
- CONS: The `~` is easy to miss (at least for human readers) and its meaning is not obvious.
- PRO: Also, Python’s typing system is a lot easier to grasp if you’re familiar with an established modern-typed language (Swift, Scala, Haskell, F#, etc.), and they also use `Optional[T]` (or `optional<T>` or `Maybe t` or some other spelling of the same idea) all over the place, so often that many of them have added shortcuts like `T?` to make it easier to write and less intrusive to read.

If yes,
Add operator `__invert__` in type type to use syntax like `~int` ?
- CONS: `~` is not automatically readable
  - like `:` to separate variable and typing.
- CONS: `~` means complement, which is a completely different thing from `|None`. `~int` seems like it would actually harm comprehension instead of helping.
- PRO: the slight abuse of `~int` meaning "maybe int" is pretty plausible (consider how "approximately equal" is written mathematically).
- PRO: Possibly relevant for tilde: https://www.thecut.com/article/why-the-internet-tilde-is-our-most-perfect-to...
- CONS: With `~` there probably won't be a confusion in that sense, but someone reading it for the first time will definitely need to look it up (which is fine i.m.o.).
  - Like the first time someone reads the annotation
      def f(a=int):...
      def f(a:int):...
Add operator `__pos__` in type type to use syntax like `+int` ?
- PRO: `+foo` definitely seems to say "foo, plus something else" to me much more than `~foo`.
- CONS: `+foo` is less intuitive than `~foo` for `Optional`
Like Kotlin <https://kotlinlang.org/docs/reference/null-safety.html>, add a new `?` operator to use syntax like `int?` or `?int` ?
- CONS: It’s not compatible with IPython and Jupyter Lab, where `?smth` displays help for symbol `smth`
- CONS: With default arguments, `?=` looks... not great
  - def f(source: str?=def_src, destination: str?=MISSING, param: int?=1): ...

Extend `isinstance()` and `issubclass()` to accept `Union` ?
    isinstance(x, str | int) ==> "is x an instance of str or int"
- PRO: if they were permitted, then instance checks could use an extremely clean-looking notation for "any of these"

Do nothing, open a new PEP or extend PEP585 ?
- I do think we should probably review PEP 585 <https://www.python.org/dev/peps/pep-0585/> before doing anything about unions specifically -- likely there are bigger fish to fry
- PEP 585 has not received much discussion

So, I think it’s time to answer these questions:
- Add a new operator for `Union[type1|type2]` ?
  - If yes,
    - Change only PEP484 (Type hints) to accept the syntax `type1 | type2` ?
    - Use `(int, str)` in place of `Union[int,str]` ?
    - Use `{int, str}` in place of `Union[int,str]` ?
- Add a new operator for `Optional[type]` ?
  - If yes,
    - Add operator `__invert__` in type type to use syntax like `~int` ?
    - Add operator `__pos__` in type type to use syntax like `+int` ?
- Extend `isinstance()` and `issubclass()` to accept `Union` ?
- Do nothing, open a new PEP <https://github.com/pprados/peps/blob/master/pep-9999.rst> or extend PEP585 <https://www.python.org/dev/peps/pep-0585/> ?

Philippe

On Tue, Sep 3, 2019 at 08:30, Philippe Prados <python@prados.fr> wrote:
With my implementation, I can check that `assert int | None == None | int` holds.
On Mon, Sep 2, 2019 at 13:32, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
On Thu, 29 Aug 2019 at 23:48, Guido van Rossum <guido@python.org> wrote:
On Thu, Aug 29, 2019 at 3:33 PM Chris Angelico <rosuav@gmail.com> wrote:
On Fri, Aug 30, 2019 at 8:28 AM Guido van Rossum <guido@python.org> wrote:
[...]
I do think we should probably review PEP 585 before doing anything about unions specifically -- likely there are bigger fish to fry. (And PEP 585 has not received much discussion.)
I also agree with this. Generally I am fine with Union[int, str] and Optional[int], but I also see how some people might want a shorter notation. Many things around typing have been previously rejected because we didn't want to introduce any (or at least minimal) changes to the syntax and runtime, but now that typing is much more widely used we can reconsider some of these. Importantly, I think this should be done in a systematic way (potentially using PEP 585 draft as a starting point).
-- Ivan
_______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/YZHAER... Code of Conduct: http://python.org/psf/codeofconduct/
Regarding the discussion about using the tilde operator (an idea which has grown on me a bit):

Would it help, for the purpose of alerting the user to potential backwards compatibility issues, to add a *second* dunder method, __optional__, or perhaps __tilde__, to the tilde operator resolution order (is that the correct phrase...?) that applies to `type` objects only?

Possible additional benefit:

- A FooMcls.__optional__ method could provide ability to use typing.Optional to type against a custom MissingType
  - example: if Foo's metaclass is defined such that Optional[Foo] produces Union[Foo, MissingType]:

    class MissingType: ...
    MISSING = MissingType()
    def func(x: Optional[Foo] = MISSING): ...

  - The Optional type would need to be changed to look for the __optional__ method at mpy runtime

The logic could be like so:

- when ~foo is invoked...
  - if foo has an __invert__ method, tilde always uses that; this will preserve compatibility for objects
  - if foo is a type, use the __optional__ method, if it is present (this step is skipped for non-type objects)
  - if a type object provides *both*, when used in the context of typing, provide a warning to the user that the Foo type object is returning a value from __invert__, which is not as expected

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On Tue, Sep 3, 2019 at 11:42 AM Philippe Prados <python@prados.fr> wrote:
On Sep 2, 2019, at 23:50, Philippe Prados <python@prados.fr> wrote:
Add a new operator for `Union[type1|type2]` ?
Hold on. Are you proposing `Union[t1 | t2]` as a new spelling for `Union[t1, t2]`? That seems pointless. I thought you were proposing just `t1 | t2`, which seems a whole lot more useful (and no more disruptive)? I’m going to assume this is just a typo?

Also, as far as the runtime-vs.-static, etc., issues of `int|str` vs. `(int,str)` vs. `{int,str}`, I think it’s worth getting them all down in one place.

* `(int, str)`
  * Not just valid syntax, but a legal value in every version of Python, so you can use it in annotations and still run your code in 3.5 or something (you can’t type-check it as 3.5 code, but that rarely matters).
  * The same value can be passed to `isinstance` and will work as intended in every version of Python, unlike `Union[int, str]`, which fails because `Union` (like all generic types) refuses to allow its instantiated types in `isinstance`. Whether this is a good or bad thing is an open question, but it seems good at first glance.
  * Potentially confusing, implying `Tuple` rather than `Union`, but maybe that’s just something for people to get used to.
* `{int, str}`
  * Also a legal value in all existing Pythons.
  * Not a legal argument to `isinstance`. Or `except` statements. This could be changed very easily and without much disruption, although of course only for 3.9+. It also opens the question of whether other Spam-or-tuple-of-Spam functions (like all the str methods) should also change to also accept -or-set-of-Spam, but Guido suggested that this could be left open until people either do or do not file bug reports about trying to pass a frozenset of suffixes to endswith.
  * Not confusing; it’s hard to imagine what it could mean other than “must be one of this set of types”.
* `int|str`
  * Valid syntax in 3.8, but raises a TypeError when evaluated, so it can’t even be used in annotations without quoting it.
  * Can add `type.__or__` and `__ror__` methods in 3.9, so it _can_ be used unquoted, and can be inspected as a meaningful value at runtime. That means code that used it won’t run in 3.8, but that’s nothing unprecedented, and might be worth it. And it’s not that major of a change.
  * Can’t be passed to `isinstance`. This would require an entirely independent change (which might be a good one, but should probably be argued out separately) to special-case `Union` as not like other generic types, and usable in `isinstance` calls.
  * Unlike the other options, this one has an obvious extension to `Optional` via a unary operator like `~`.
  * Close enough to similar meanings in other languages that it would be an aid to memory rather than a source of confusion. Plus, even if you don’t know any of those other languages, and don’t know much about typing, and read it as “int or string”, you still get the right idea.
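A rough way to see these runtime differences for yourself, using only the standard library. The names `NumberOrText` and `parse` are invented for this sketch, and what `int | str` does when evaluated depends on the interpreter version, so that step is guarded rather than assumed.

```python
import typing

# A tuple of types is an ordinary value and a valid second argument to
# isinstance().
NumberOrText = (int, str)
assert isinstance(3, NumberOrText)
assert isinstance("spam", NumberOrText)
assert not isinstance(3.5, NumberOrText)

# A set of types is also an ordinary value, but isinstance() rejects it.
try:
    isinstance(3, {int, str})
    set_accepted = True
except TypeError:
    set_accepted = False

# A quoted annotation is stored as a plain string, so `int | str` is not
# evaluated when the function is defined.
def parse(value: "int | str") -> None:
    ...

quoted = parse.__annotations__["value"]

# Evaluating the expression is the step that needs `type.__or__`;
# interpreters without it raise TypeError here.
try:
    evaluated = eval(quoted)
except TypeError:
    evaluated = None

# When evaluation succeeded, the members can be inspected at runtime.
members = typing.get_args(evaluated) if evaluated is not None else ()
```

On an interpreter without `type.__or__`, `evaluated` stays `None` and only the quoted string form is usable, which is the situation described for 3.8 above.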
A problem with `(int, str)` that I believe hasn't been brought up yet is that it interferes with the existing use of subscription syntax, particularly by `Tuple`. `Tuple[int, str]` is equivalent to `Tuple[(int, str)]` but not to `Tuple[Union[int, str]]`, because `__getitem__` receives a single tuple instead of multiple arguments. I think all other subscriptable types could resolve the ambiguity in principle because they aren't variadic.
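The point about `__getitem__` can be checked with a tiny stand-in class. `Probe` is hypothetical and only echoes back whatever key it receives; it is not how `typing.Tuple` is implemented.

```python
class Probe:
    """Hypothetical stand-in for a subscriptable generic such as Tuple."""

    def __getitem__(self, key):
        # Return the key unchanged so the caller can see what arrived.
        return key


probe = Probe()

bare = probe[int, str]        # written like Tuple[int, str]
wrapped = probe[(int, str)]   # written like Tuple[(int, str)]
single = probe[int]           # a lone item is not wrapped in a tuple
```

Since `bare` and `wrapped` are the same tuple, a subscriptable type has no way to tell "two parameters" apart from "one parameter that is a tuple-spelled union".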
On Tue, Sep 03, 2019 at 12:19:15PM -0700, Andrew Barnert via Python-ideas wrote:
On Sep 2, 2019, at 23:50, Philippe Prados <python@prados.fr> wrote:
Add a new operator for `Union[type1|type2]` ?
Hold on. Are you proposing `Union[t1 | t2]` as a new spelling for `Union[t1, t2]`? That seems pointless. I thought you were proposing just `t1 | t2`, which seems a whole lot more useful (and no more disruptive)?
I’m going to assume this is just a typo?
I too assume it was a typo, but for the record, since Union already flattens nested Unions, Union[str|int] would be equivalent to Union[Union[str|int]] which flattens to just str|int. -- Steven
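The flattening behaviour mentioned here is easy to confirm with the existing `typing.Union` spelling; this sketch does not use the proposed `|` operator at all.

```python
from typing import Union, get_args

# Nested unions are flattened into a single union.
nested = Union[Union[str, int], float]
flat = Union[str, int, float]

# Unions compare equal regardless of member order.
reordered = Union[int, str] == Union[str, int]

# A union whose members are all the same type collapses to that type.
collapsed = Union[int, int]

# The flattened members, in the order they were written.
nested_members = get_args(nested)
```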
On Sep 2, 2019, at 23:50, Philippe Prados <python@prados.fr> wrote:
Like Kotlin, add a new `?` operator to use syntax like `int?` or `?int` ?
- CONS: It’s not compatible with IPython and Jupyter Lab: `?smth` displays help for symbol `smth`
- CONS: With default arguments, `?=` looks... not great

    def f(source: str?=def_src, destination: str?=MISSING, param: int?=1): ...
This has a lot more precedents than Kotlin; it’s a widespread spelling across a variety of modern languages.

The incompatibility with IPython isn’t that big a deal in practice, for the reasons I explained when I raised the issue in the first place.

Using a `?` suffix isn’t just potentially ugly as in your example, it’s also potentially confusing given languages that use `?=` as null-coalescing assignment or equality, not to mention forever closing off the possibility of adding that feature to Python (which was rejected both times I remember it coming up, but people do still occasionally propose it anew).

But I don’t think `?` as a prefix has either the ugliness problem or the cross-language confusion problem:

    def func(source: ?str=def_src, destination: ?str=MISSING):

(I mean, it’s still not beautiful, but that’s just the usual brevity vs. familiarity issue, not a matter of having to parse in your head which part of the expression the `?` belongs to.)

Also, `?` is a new operator. And it uses up one of the few ASCII symbols that Python hasn’t yet given a meaning to. Which doesn’t rule it out (unless we want to be like Go and explicitly permanently reserve `?` to mean “some unknown future feature that’s so amazing that it’s better than whatever feature you think you want to use it for”), but it does make the hurdle a lot higher than using `~`.
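To make the "higher hurdle" concrete, here is a small sketch that asks the running interpreter whether each spelling is accepted by its grammar. The helper `parses` is made up for this illustration; the outcome for `?` assumes an interpreter where `?` has no meaning, which is the situation described above.

```python
import ast


def parses(source: str) -> bool:
    """Return True if `source` is accepted by this interpreter's grammar."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# `~` is an existing unary operator, so the annotation already parses.
tilde_ok = parses("def f(x: ~str = None): ...")

# `?` is not an operator, so both placements are rejected by the parser.
prefix_q_ok = parses("def f(x: ?str = None): ...")
suffix_q_ok = parses("def f(x: str? = None): ...")

# Parsing is not the same as evaluating: `~str` is valid syntax, but
# applying it only succeeds if `type` implements `__invert__`.
try:
    ~str
    tilde_evaluates = True
except TypeError:
    tilde_evaluates = False
```

So `~` only needs a new method on `type`, while `?` would need a change to the tokenizer and grammar before any code using it could even be compiled.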
I never really understood the importance of `Optional`. Often it can be left out altogether and in other cases I find `Union[T, None]` more expressive (explicit) than `Optional[T]` (+ the latter saves only 3 chars).

Especially for people not familiar with typing, the meaning of `Optional` is not obvious at first sight. `Union[T, None]` on the other hand is pretty clear. Also in other cases, where the default (fallback) is different from `None`, you'd have to use `Union` anyway. For example a function that normally returns an object of type `T` but in some circumstances it cannot and then it returns the reason as a `str`, i.e. `-> Union[T, str]`; `Optional` won't help here. Scanning through the docs and PEP I can't find strongly motivating examples for `Optional` (over `Union[T, None]`). E.g. in the following:

    def lookup(self, name: str) -> Optional[Node]:
        nodes = self.get(name)
        if nodes:
            return nodes[-1]
        return None

I would rather write `Union[Node, None]` because that's much more explicit about what happens.

Then introducing `~T` in place of `Optional[T]` just further obfuscates the meaning of the code:

    def lookup(self, name: str) -> ~Node:

The `~` is easy to be missed (at least by human readers) and the meaning not obvious.

For `Union` on the other hand it would be more helpful to have a shorter syntax, `int | str` seems pretty clear, but what prevents tuples `(int, str)` from being interpreted as unions by type checkers. This doesn't require any changes to the built-in types and it is aligned with the already existing syntax for checking multiple types with `isinstance` or `issubclass`: `isinstance(x, (int, str))`. Having used this a couple of times, whenever I see a tuple of types I immediately think of them as `or` options.
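For reference, a self-contained version of the `lookup` example. `Node` and `SymbolTable` are placeholders invented here, since the message does not define the surrounding class. The last line shows that the two spellings being compared are the same type at runtime, so the choice between them is purely about readability.

```python
from typing import Dict, List, Optional, Union


class Node:
    """Placeholder node type; the thread does not define one."""

    def __init__(self, label: str) -> None:
        self.label = label


class SymbolTable:
    """Hypothetical container mapping names to lists of nodes."""

    def __init__(self) -> None:
        self._nodes: Dict[str, List[Node]] = {}

    def add(self, name: str, node: Node) -> None:
        self._nodes.setdefault(name, []).append(node)

    def lookup(self, name: str) -> Optional[Node]:
        # Return the most recently added node, or None if there is none.
        nodes = self._nodes.get(name)
        if nodes:
            return nodes[-1]
        return None


# Both spellings denote the same type.
same_type = Optional[Node] == Union[Node, None]
```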
On Thu, Aug 29, 2019 at 4:04 PM Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
I never really understood the importance of `Optional`. Often it can be left out altogether and in other cases I find `Union[T, None]` more expressive (explicit) than `Optional[T]` (+ the latter saves only 3 chars).
I respectfully disagree. In our (huge) codebase we see way more occurrences of Optional than of Union. It's not that it saves a tremendous amount of typing -- it's a much more intuitive meaning. Every time I see Union[T, None] I have to read it carefully to see what it means. When I see Optional[T] my brain moves on immediately (in a sense it's only one bit of information).
Especially for people not familiar with typing, the meaning of `Optional` is not obvious at first sight. `Union[T, None]` on the other hand is pretty clear. Also in other cases, where the default (fallback) is different from `None`, you'd have to use `Union` anyway. For example a function that normally returns an object of type `T` but in some circumstances it cannot and then it returns the reason as a `str`, i.e. `-> Union[T, str]`; `Optional` won't help here. Scanning through the docs and PEP I can't find strongly motivating examples for `Optional` (over `Union[T, None]`). E.g. in the following:
    def lookup(self, name: str) -> Optional[Node]:
        nodes = self.get(name)
        if nodes:
            return nodes[-1]
        return None
I would rather write `Union[Node, None]` because that's much more explicit about what happens.
Then introducing `~T` in place of `Optional[T]` just further obfuscates the meaning of the code:
def lookup(self, name: str) -> ~Node:
The `~` is easy to be missed (at least by human readers) and the meaning not obvious.
Do you easily miss the `-` in an expression like `-2`? Surely the meaning of `?` in a programming language also has to be learned. And not every language uses it to mean "optional" (IIRC there's a language where it means "boolean" -- maybe Scheme?)
For `Union` on the other hand it would be more helpful to have a shorter syntax, `int | str` seems pretty clear, but what prevents tuples `(int, str)` from being interpreted as unions by type checkers. This doesn't require any changes to the built-in types and it is aligned with the already existing syntax for checking multiple types with `isinstance` or `issubclass`: `isinstance(x, (int, str))`. Having used this a couple of times, whenever I see a tuple of types I immediately think of them as `or` options.
First, *if* we were to introduce `(int, str)` it would make more sense for it to mean `Tuple[int, str]` (tuples are also a very common type). Second, comma is already very overloaded. Yes, it's unfortunate that `(int, str)` means "union" in `isinstance()` but it's not enough to sway me.

Anyway, let's just paint the bikeshed *some* color. :-)

--
--Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
Guido van Rossum wrote:
On Thu, Aug 29, 2019 at 4:04 PM Dominik Vilsmeier dominik.vilsmeier@gmx.de wrote:
I never really understood the importance of Optional. Often it can be left out altogether and in other cases I find Union[T, None] more expressive (explicit) than Optional[T] (+ the latter saves only 3 chars). I respectfully disagree. In our (huge) codebase we see way more occurrences of Optional than of Union. It's not that it saves a tremendous amount of typing -- it's a much more intuitive meaning. Every time I see Union[T, None] I have to read it carefully to see what it means. When I see Optional[T] my brain moves on immediately (in a sense it's only one bit of information).
You are probably right, it's all a matter of how used our brains are to seeing stuff. So if I started using it more frequently, after some time I would probably appreciate it over `Union[T, None]`.
Especially for people not familiar with typing, the meaning of Optional is not obvious at first sight. Union[T, None] on the other hand is pretty clear. Also in other cases, where the default (fallback) is different from None, you'd have to use Union anyway. For example a function that normally returns an object of type T but in some circumstances it cannot and then it returns the reason as a str, i.e. -> Union[T, str]; Optional won't help here. Scanning through the docs and PEP I can't find strongly motivating examples for Optional (over Union[T, None]). E.g. in the following:

    def lookup(self, name: str) -> Optional[Node]:
        nodes = self.get(name)
        if nodes:
            return nodes[-1]
        return None
I would rather write Union[Node, None] because that's much more explicit about what happens. Then introducing ~T in place of Optional[T] just further obfuscates the meaning of the code: def lookup(self, name: str) -> ~Node:
The ~ is easy to be missed (at least by human readers) and the meaning not obvious. Do you easily miss the - in an expression like -2?
I don't miss the `-` in the context because my brain is trained on recognizing such patterns. We encounter negative numbers everywhere, from (pre-)school on, so this pattern is easy to recognize. However `~Noun` is not something you've likely seen in the real world (or anywhere), so it's much harder to recognize. I cn wrt ths txt wtht vwls or eevn rdoerer teh lterets and you'll still be able to read it because your brain just fills in what it expects (i.e. what it is accustomed to). For that reason `~Node` is much harder to recognize than `-3637` because I wouldn't expect a `~` to appear in that place.
Surely the meaning of ? in a programming language also has to be learned. And not every language uses it to mean "optional" (IIRC there's a language where it means "boolean" -- maybe Scheme?)
For Union on the other hand it would be more helpful to have a shorter syntax, int | str seems pretty clear, but what prevents tuples (int, str) from being interpreted as unions by type checkers. This doesn't require any changes to the built-in types and it is aligned with the already existing syntax for checking multiple types with isinstance or issubclass: isinstance(x, (int, str)). Having used this a couple of times, whenever I see a tuple of types I immediately think of them as or options.

First, if we were to introduce (int, str) it would make more sense for it to mean Tuple[int, str] (tuples are also a very common type). Second, comma is already very overloaded. Yes, it's unfortunate that (int, str) means "union" in isinstance() but it's not enough to sway me. Anyway, let's just paint the bikeshed some color. :-)
I don't think it's unfortunate, it's pretty neat syntax (and intuitive). Checking if `x` is an instance of `y` it makes sense to "list" (`tuple`) multiple options for `y`. It's a clever way of reusing the available syntax / functionality of the language. And I think this is what typing should do as well: build around the existing language and use whatever is available. Adding `__or__` to `type` for allowing things like `int | str` on the other hand bends the language toward typing and thus is a step in the opposite direction.

Then I don't think it's the comma that receives emphasis in the syntax `(int, str)`, it's rather the parens - and those, as a bonus, provide visual boundaries for the beginning and end of the union. Consider

    def foo(x: str | int, y: list):

versus

    def foo(x: (str, int), y: list):

The comma is a small character, visually the scene will be dominated by the matching parens - and whatever is inside is anyway a common sight as we are used to seeing tuples. Hence I think from readability perspective it's a plus to reuse existing, common syntax.

I agree that there is the ambiguity with `(int, str)` being interpreted as `Tuple[int, str]` and this is a valid argument. Since I've used the `isinstance(x, (y,z))` syntax quite often I wouldn't interpret `(int, str)` as a tuple but of course for other people the situation might be completely different. Hence that could really be a blocker.
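As a sketch of what "reusing the existing language" could look like at runtime: a tuple annotation is already a legal value, and it can be handed straight to `isinstance`. The `check_args` helper below is invented for this illustration and only handles positional arguments; it says nothing about how static type checkers would treat such annotations.

```python
import inspect


def foo(x: (str, int), y: list):
    return x, y


def check_args(func, *args):
    """Check positional args against annotations using isinstance().

    Works here only because every annotation is a type or a tuple of
    types, which is exactly what isinstance() accepts.
    """
    params = inspect.signature(func).parameters.values()
    for param, value in zip(params, args):
        expected = param.annotation
        if expected is inspect.Parameter.empty:
            continue  # unannotated parameters are not checked
        if not isinstance(value, expected):
            raise TypeError(
                f"{param.name}={value!r} is not an instance of {expected!r}"
            )
    return func(*args)
```

Calling `check_args(foo, "a", [])` passes the arguments through, while `check_args(foo, 1.5, [])` raises `TypeError` because a float matches neither member of the tuple.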
Possibly relevant for tilde: https://www.thecut.com/article/why-the-internet-tilde-is-our-most-perfect-to... On Thu, Aug 29, 2019 at 5:09 PM Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
Guido van Rossum wrote:
On Thu, Aug 29, 2019 at 4:04 PM Dominik Vilsmeier dominik.vilsmeier@gmx.de wrote:
I never really understood the importance of Optional. Often it can be left out altogether and in other cases I find Union[T, None] more expressive (explicit) than Optional[T] (+ the latter saves only 3 chars). I respectfully disagree. In our (huge) codebase we see way more occurrences of Optional than of Union. It's not that it saves a tremendous amount of typing -- it's a much more intuitive meaning. Every time I see Union[T, None] I have to read it carefully to see what it means. When I see Optional[T] my brain moves on immediately (in a sense it's only one bit of information).
You are probably right, it's all a matter of how used our brains are to seeing stuff. So if I started using it more frequently, after some time I would probably appreciate it over `Union[T, None]`.
Especially for people not familiar with typing, the meaning of Optional is not obvious at first sight. Union[T, None] on the other hand is pretty clear. Also in other cases, where the default (fallback) is different from None, you'd have to use Union anyway. For example a function that normally returns an object of type T but in some circumstances it cannot and then it returns the reason as a str, i.e. -> Union[T, str]; Optional won't help here. Scanning through the docs and PEP I can't find strongly motivating examples for Optional (over Union[T, None]). E.g. in the following: def lookup(self, name: str) -> Optional[Node]: nodes = self.get(name) if nodes: return nodes[-1] return None
I would rather write Union[Node, None] because that's much more explicit about what happens. Then introducing ~T in place of Optional[T] just further obfuscates the meaning of the code: def lookup(self, name: str) -> ~Node:
The ~ is easy to be missed (at least by human readers) and the meaning not obvious. Do you easily miss the - in an expression like -2?
I don't miss the `-` in the context because my brain is trained on recognizing such patterns. We encounter negative numbers everywhere, from (pre-)school on, so this pattern is easy to recognize. However `~Noun` is not something you've likely seen in the real world (or anywhere), so it's much harder to recognize. I cn wrt ths txt wtht vwls or eevn rdoerer teh lterets and you'll still be able to read it because your brain just fills in what it expects (i.e. what it is accustomed to). For that reason `~Node` is much harder to recognize than `-3637` because I wouldn't expect a `~` to appear in that place.
Surely the meaning of ? in a programming language also has to be learned. And not every language uses it to mean "optional" (IIRC there's a language where it means "boolean" -- maybe Scheme?)
For Union on the other hand it would be more helpful to have a shorter syntax, int | str seems pretty clear, but what prevents tuples (int, str) from being interpreted as unions by type checkers? This doesn't require any changes to the built-in types and it is aligned with the already existing syntax for checking multiple types with isinstance or issubclass: isinstance(x, (int, str)). Having used this a couple of times, whenever I see a tuple of types I immediately think of them as or options.
First, if we were to introduce (int, str) it would make more sense for it to mean Tuple[int, str] (tuples are also a very common type). Second, comma is already very overloaded. Yes, it's unfortunate that (int, str) means "union" in isinstance() but it's not enough to sway me. Anyway, let's just paint the bikeshed some color. :-)
I don't think it's unfortunate, it's pretty neat syntax (and intuitive). Checking if `x` is an instance of `y` it makes sense to "list" (`tuple`) multiple options for `y`. It's a clever way of reusing the available syntax / functionality of the language. And I think this is what typing should do as well: build around the existing language and use whatever is available. Adding `__or__` to `type` for allowing things like `int | str` on the other hand bends the language toward typing and thus is a step in the opposite direction.
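To make the `__or__` idea concrete without patching CPython: attributes of the built-in `type` cannot be set from Python code, so the sketch below uses a metaclass instead. `OrMeta`, `Node` and `Leaf` are hypothetical names for illustration only; this is not the proposal's C implementation.

```python
from typing import Union


class OrMeta(type):
    """Hypothetical metaclass: `A | B` on its classes builds a typing.Union."""

    def __or__(cls, other):
        return Union[cls, other]

    def __ror__(cls, other):
        return Union[other, cls]


class Node(metaclass=OrMeta):
    pass


class Leaf(metaclass=OrMeta):
    pass


# The operator spelling and the explicit Union spelling compare equal.
assert (Node | Leaf) == Union[Node, Leaf]
assert (Node | None) == Union[Node, None]
```

This only affects classes that opt in through the metaclass, which is exactly the limitation the proposal avoids by changing `type` itself.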
Then I don't think it's the comma that receives emphasis in the syntax `(int, str)`, it's rather the parens - and those, as a bonus, provide visual boundaries for the beginning and end of the union. Consider
def foo(x: str | int, y: list):
versus
def foo(x: (str, int), y: list):
The comma is a small character, visually the scene will be dominated by the matching parens - and whatever is inside is anyway a common sight as we are used to seeing tuples. Hence I think from readability perspective it's a plus to reuse existing, common syntax.
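A rough sketch of how a tuple annotation could be read as "any of these types", reusing what isinstance already accepts. `check_args` is a hypothetical helper written for this illustration, not something mypy or any other checker provides:

```python
def foo(x: (str, int), y: list):
    return x


def check_args(func, **kwargs):
    """Hypothetical checker: a tuple annotation means 'any of these types'."""
    for name, value in kwargs.items():
        expected = func.__annotations__.get(name)
        # isinstance() accepts either a single class or a tuple of classes.
        if expected is not None and not isinstance(value, expected):
            return False
    return True


assert check_args(foo, x="a", y=[])
assert check_args(foo, x=1, y=[])
assert not check_args(foo, x=1.5, y=[])
```

Nothing in the language changes here: the tuple is stored in `__annotations__` as-is and the meaning comes entirely from the function that consumes it.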
I agree that there is the ambiguity with `(int, str)` being interpreted as `Tuple[int, str]` and this is a valid argument. Since I've used the `isinstance(x, (y,z))` syntax quite often I wouldn't interpret `(int, str)` as a tuple but of course for other people the situation might be completely different. Hence that could really be a blocker. _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ZONQON... Code of Conduct: http://python.org/psf/codeofconduct/
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
On Aug 29, 2019, at 16:32, Guido van Rossum <guido@python.org> wrote:
Surely the meaning of `?` in a programming language also has to be learned. And not every language uses it to mean "optional" (IIRC there's a language where it means "boolean" -- maybe Scheme?)
Sure, ? does mean lots of different things that have nothing to do with Optional. The C ?: operator is probably the most famous. But as an operator on types, I can’t think of any uses other than Optional. But, just for fun:
* ?: from C
* null-coalescing, as in C#
* Optional-chaining, as in Swift
* throwing, as in Rust
* ordinary identifier character (idiomatically, predicate functions end in ?), as in Scheme
* ordinary operator (as opposed to identifier) character, used for a wide variety of completely unrelated monad things you don’t want to know about, as in Haskell
* special 1-char identifier that’s idiomatically sort of like _ from C/gettext, as in Smalltalk
* expand as macro, as in… Erlang?
* print, as in Basic
* random, as in APL
* whatever the hell Ruby is doing that gives me "P" in one interpreter and 80 in another when I type ?P. Ruby _also_ has ? as an id-cont character, with the Schemeish convention, and the ?: operator, and in some other contexts it’s a syntax error but in some it does… whatever that string-or-ord thing is.
On Aug 30, 2019, at 18:40, Pasha Stetsenko <stpasha@gmail.com> wrote:
I can’t think of any uses other than Optional.
Also, in IPython and Jupyter Lab `?smth` displays help for symbol `smth`.
Yeah, and probably more Python users regularly use IPython than, say, BASIC, so maybe a bit more potential for confusion. Plus, it allows `smth?` too, which would actually be ambiguous if `int?` meant `Optional[int]`. But then, it would rarely be _useful_ to type `int?` on a line by itself just to see `Optional[int]`, so as long as there were an option to disable it, I don’t think anyone would complain if it still means help(int) by default. (IPython already has lots of features like that, which usually are convenient but very rarely are a problem, so you can disable them but they’re on by default.) I’m pretty sure the same thing came up during the discussion about adding null coalescing operators, and that was the consensus then, but don’t quote me on that.
On Aug 29, 2019, at 16:03, Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
I never really understood the importance of `Optional`. Often it can be left out altogether and in other cases I find `Union[T, None]` more expressive (explicit) than `Optional[T]` (+ the latter saves only 3 chars).
Especially for people not familiar with typing, the meaning of `Optional` is not obvious at first sight. `Union[T, None]` on the other hand is pretty clear. Also in other cases, where the default (fallback) is different from `None`, you'd have to use `Union` anyway. For example a function that normally returns an object of type `T` but in some circumstances it cannot and then it returns the reason as a `str`, i.e. `-> Union[T, str]`; `Optional` won't help here.
But this should be very rare. Most functions that can return a fallback value return a fallback value of the expected return type. For example, a get(key, default) method will return the default param, and the caller should pass in a default value of the type they’re expecting to look up. So, this shouldn’t be get(key: KeyType, default: T) -> Union[ValueType, T], it should be get(key: KeyType, default: ValueType) -> ValueType. Or maybe get(key: KeyType, default: Optional[ValueType]=None) -> Optional[ValueType]. Most functions that want to explain why they failed do so by raising an exception, not by returning a string. And what other cases are there? Of course you could be trying to add type checking to some weird legacy codebase that doesn’t do things Pythonically, so you have to use Union returns. But that’s specific to that one weird codebase. Meanwhile, Optional return values are common all over Python. Also, Python’s typing system is a lot easier to grasp if you’re familiar with an established modern-typed language (Swift, Scala, Haskell, F#, etc.), and they also use Optional[T] (or optional<T> or Maybe t or some other spelling of the same idea) all over the place, so often that many of them have added shortcuts like T? to make it easier to write and less intrusive to read. I think there may be a gap in the docs. They make perfect sense to someone with experience in one of those languages, but a team that has nobody with that experience might be a little lost. There’s a mile-high overview, a theory paper, and then basically just reference docs that expect you to already know all the key concepts that you don’t already know. Maybe that’s something that an outsider who’s trying to learn from the docs plus trial and error could help improve?
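The two get() signatures described above can be sketched with type variables. The function names `get_or` and `get_optional` and the sample dict are assumptions made up for this illustration:

```python
from typing import Dict, Optional, TypeVar

K = TypeVar("K")
V = TypeVar("V")


def get_or(mapping: Dict[K, V], key: K, default: V) -> V:
    """The fallback has the same type as the values, so no Union is needed."""
    return mapping.get(key, default)


def get_optional(
    mapping: Dict[K, V], key: K, default: Optional[V] = None
) -> Optional[V]:
    """The fallback may be None, so the result is Optional[V]."""
    return mapping.get(key, default)


ages = {"ann": 31}
assert get_or(ages, "bob", 0) == 0
assert get_optional(ages, "ann") == 31
assert get_optional(ages, "bob") is None
```

In neither variant does the return type need a Union with an unrelated type, which is the point being made about fallback values.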
Scanning through the docs and PEP I can't find strongly motivating examples for `Optional` (over `Union[T, None]`). E.g. in the following:
def lookup(self, name: str) -> Optional[Node]: nodes = self.get(name) if nodes: return nodes[-1] return None
I would rather write `Union[Node, None]` because that's much more explicit about what happens.
Then introducing `~T` in place of `Optional[T]` just further obfuscates the meaning of the code:
def lookup(self, name: str) -> ~Node:
The `~` is easy to be missed (at least by human readers) and the meaning not obvious.
That’s kind of funny, because I had to read your Union[Node, None] a couple times before I realized you hadn’t written Union[Node, Node]. :) I do dislike ~ for other reasons (but I already mentioned them, Guido isn’t convinced, so… fine, I don’t hate it that much). But I don’t think ~ is easy to miss. It’s not like a period or backtick that can be mistaken for grit on your screen; it’s more visible than things like - that everyone expects to be able to pick out.
For `Union` on the other hand it would be more helpful to have a shorter syntax, `int | str` seems pretty clear, but what prevents tuples `(int, str)` from being interpreted as unions by type checkers. This doesn't require any changes to the built-in types and it is aligned with the already existing syntax for checking multiple types with `isinstance` or `issubclass`: `isinstance(x, (int, str))`. Having used this a couple of times, whenever I see a tuple of types I immediately think of them as `or` options.
The biggest problem with tuple is that in every other language with a similar type system, (int, str) means Tuple[int, str]. I think {int, str}, which someone proposed in one of the earlier discussions, is nice. What else would a set of types mean (unless you’re doing mathematical type theory rather than programming language typing)? But it’s unfortunate that things like isinstance and except take a tuple of types (and it has to be a tuple, not any other kind of iterable), so a set might be just as confusing for hardcore Python types as a tuple would be for polyglots. If the compatibility issue isn’t a big deal (and I trust Guido that it isn’t), I think int | str is the best option. It’s an operator that means union, it’s used for sum/union types in other languages, it makes perfect sense if you read it as “int or str”… I can’t imagine anyone being confused or put off by it.
On Thu, Aug 29, 2019 at 5:56 PM Andrew Barnert via Python-ideas < python-ideas@python.org> wrote:
I think {int, str}, which someone proposed in one of the earlier discussions, is nice. What else would a set of types mean (unless you’re doing mathematical type theory rather than programming language typing)? But it’s unfortunate that things like isinstance and except take a tuple of types (and it has to be a tuple, not any other kind of iterable), so a set might be just as confusing for hardcore Python types as a tuple would be for polyglots.
If the compatibility issue isn’t a big deal (and I trust Guido that it isn’t), I think int | str is the best option. It’s an operator that means union, it’s used for sum/union types in other languages, it makes perfect sense if you read it as “int or str”… I can’t imagine anyone being confused or put off by it.
I just realized one big advantage of `{int, str}` over `int|str`. It doesn't require adding anything to `type`, and we don't need to introduce a new lightweight builtin union type. We could still do `~int` -- it would just return `{int, None}`. (But that *would* require adding to `type`.) If we did this, it would behoove us to support `isinstance(x, {int, str})` as well. A wrinkle with that: the current code is naively recursive, because it's impossible to create a loop in a tuple (other than using the C API). Tuples are also easier to traverse than sets. But those are surmountable problems. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
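For illustration, here is one way a set-accepting isinstance could avoid naive recursion by walking the nested collections with an explicit stack. `isinstance_any` is a hypothetical helper invented for this sketch, not a proposed or existing API:

```python
def isinstance_any(obj, classinfo):
    """Hypothetical helper: like isinstance(), but also accepts sets of types."""
    pending = [classinfo]
    while pending:  # explicit stack instead of recursive calls
        item = pending.pop()
        if isinstance(item, (tuple, set, frozenset)):
            pending.extend(item)
        elif isinstance(obj, item):
            return True
    return False


# Nested tuples are already accepted by the built-in today.
assert isinstance(1, (str, (int, float)))

# The helper extends the same idea to sets, including mixed nesting.
assert isinstance_any(1, {int, str})
assert isinstance_any("a", (bytes, {int, str}))
assert not isinstance_any(1.5, {int, str})
```

Because set iteration order is arbitrary, the order in which the types are tested is unspecified, but the result of an "any of these" check does not depend on it.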
On Aug 29, 2019, at 18:10, Guido van Rossum <guido@python.org> wrote:
On Thu, Aug 29, 2019 at 5:56 PM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote: I think {int, str}, which someone proposed in one of the earlier discussions, is nice. What else would a set of types mean (unless you’re doing mathematical type theory rather than programming language typing)? But it’s unfortunate that things like isinstance and except take a tuple of types (and it has to be a tuple, not any other kind of iterable), so a set might be just as confusing for hardcore Python types as a tuple would be for polyglots.
If the compatibility issue isn’t a big deal (and I trust Guido that it isn’t), I think int | str is the best option. It’s an operator that means union, it’s used for sum/union types in other languages, it makes perfect sense if you read it as “int or str”… I can’t imagine anyone being confused or put off by it.
I just realized one big advantage of `{int, str}` over `int|str`. It doesn't require adding anything to `type`,
Well, that was my initial worry with |, and I thought you did a good job arguing that it wasn’t a big deal… but I’m fine either way.
We could still do `~int` -- it would just return `{int, None}`. (But that *would* require adding to `type`.)
If we did this, it would behoove us to support `isinstance(x, {int, str})` as well.
What about other constructs that take a tuple of types? Should try/except take sets? (Are there any others?) Would anyone be surprised that isinstance/issubclass take a tuple or set of types but all those string methods take only a tuple, not a set, of strings? I think people would get over that quickly, there’d just be another entry in the StackOverflow Python FAQ list.
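A short demonstration of the current tuple-only behaviour mentioned here, for `except` and for a string method. The `parse` function is a made-up example:

```python
def parse(text):
    """Return int(text), or None if text cannot be converted."""
    try:
        return int(text)
    except (ValueError, TypeError):  # a tuple of exception classes
        return None


assert parse("42") == 42
assert parse("x") is None    # int("x") raises ValueError
assert parse(None) is None   # int(None) raises TypeError

# str.startswith accepts a tuple of prefixes...
assert "spam".startswith(("sp", "eg"))

# ...but rejects a set of prefixes with a TypeError.
try:
    "spam".startswith({"sp", "eg"})
except TypeError:
    set_rejected = True
else:
    set_rejected = False
assert set_rejected
```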
On Thu, Aug 29, 2019 at 7:14 PM Andrew Barnert <abarnert@yahoo.com> wrote:
What about other constructs that take a tuple of types? Should try/except take sets? (Are there any others?)
Would anyone be surprised that isinstance/issubclass take a tuple or set of types but all those string methods take only a tuple, not a set, of strings? I think people would get over that quickly, there’d just be another entry in the StackOverflow Python FAQ list.
Yeah, we could gradually fix those over time based on feedback. Note that `except (E1, E2)` has the same concern as `isinstance(x, (T1, T2))` that we need to be careful not to dive into infinite recursion. Also a caveat: I'm participating in this discussion but that doesn't mean this will all happen (soon, or ever). It takes a lot of work to implement such changes: even if Python itself doesn't have to be changed, we'd still need to implement it in mypy, Pyre, PyCharm, Pytype, and who knows what else. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
Andrew Barnert wrote:
On Aug 29, 2019, at 16:03, Dominik Vilsmeier dominik.vilsmeier@gmx.de wrote:
I never really understood the importance of Optional. Often it can be left out altogether and in other cases I find Union[T, None] more expressive (explicit) than Optional[T] (+ the latter saves only 3 chars). Especially for people not familiar with typing, the meaning of Optional is not obvious at first sight. Union[T, None] on the other hand is pretty clear. Also in other cases, where the default (fallback) is different from None, you'd have to use Union anyway. For example a function that normally returns an object of type T but in some circumstances it cannot and then it returns the reason as a str, i.e. -> Union[T, str]; Optional won't help here.
But this should be very rare. Most functions that can return a fallback value return a fallback value of the expected return type. For example, a get(key, default) method will return the default param, and the caller should pass in a default value of the type they’re expecting to look up. So, this shouldn’t be get(key: KeyType, default: T) -> Union[ValueType, T], it should be get(key: KeyType, default: ValueType) -> ValueType. Or maybe get(key: KeyType, default: Optional[ValueType]=None) -> Optional[ValueType]. Most functions that want to explain why they failed do so by raising an exception, not by returning a string. And what other cases are there?
Well, I actually made this up, so I can't think of any other real cases either :-)
Of course you could be trying to add type checking to some weird legacy codebase that doesn’t do things Pythonically, so you have to use Union returns. But that’s specific to that one weird codebase. Meanwhile, Optional return values are common all over Python. Also, Python’s typing system is a lot easier to grasp if you’re familiar with an established modern-typed language (Swift, Scala, Haskell, F#, etc.), and they also use Optional[T] (or optional<T> or Maybe t or some other spelling of the same idea) all over the place, so often that many of them have added shortcuts like T? to make it easier to write and less intrusive to read.
I don't have experience in any of these languages (basically I'm self-taught Python), so I learned it mostly from the docs (also `Optional`). That doesn't necessarily imply understanding the importance of the concept, but I acknowledge that `Optional[T]` is much easier to read than `Union[T, None]`; the former has less visual overhead and it reads more like "natural" language, so once you combine this with the fact that functions return `None` when they don't hit a `return` statement (or the convention of explicitly putting `return None` at the end), the meaning of `Optional[T]` becomes more clear.
I think there may be a gap in the docs. They make perfect sense to someone with experience in one of those languages, but a team that has nobody with that experience might be a little lost. There’s a mile-high overview, a theory paper, and then basically just reference docs that expect you to already know all the key concepts that you don’t already know. Maybe that’s something that an outsider who’s trying to learn from the docs plus trial and error could help improve?
Scanning through the docs and PEP I can't find strongly motivating examples for Optional (over Union[T, None]). E.g. in the following: def lookup(self, name: str) -> Optional[Node]: nodes = self.get(name) if nodes: return nodes[-1] return None I would rather write Union[Node, None] because that's much more explicit about what happens. Then introducing ~T in place of Optional[T] just further obfuscates the meaning of the code: def lookup(self, name: str) -> ~Node: The ~ is easy to be missed (at least by human readers) and the meaning not obvious. That’s kind of funny, because I had to read your Union[Node, None] a couple times before I realized you hadn’t written Union[Node, Node]. :)
I had a similar thought when writing this, so I get the point. I'm not arguing against `Optional` I just think it's less self-explanatory than `Union[T, None]` when you see it for the first time and if you're not familiar with the concept in general. But that doesn't mean you shouldn't familiarize yourself with it :-)
I do dislike ~ for other reasons (but I already mentioned them, Guido isn’t convinced, so… fine, I don’t hate it that much). But I don’t think ~ is easy to miss. It’s not like a period or backtick that can be mistaken for grit on your screen; it’s more visible than things like - that everyone expects to be able to pick out.
As I mentioned in my other reply to Guido, patterns like -341 are easily recognizable as a negative number (i.e. you won't miss the `-`) because our brains are accustomed to seeing it. ~Noun on the other hand is not something you're likely to encounter in everyday language and thus it is an unfamiliar pattern. Noun? on the other hand is easily recognizable. Regarding the meaning, `T?` should be pretty clear (read as "maybe T", i.e. maybe you hit a return statement with `T` and if not it's going to be `None` by default); for `~` on the other hand I'm not aware of any meaning in natural language. I did a bit of internet search for symbols representing "optional" but I couldn't find any (e.g. none of the icon websites I tried gave satisfying results, or any results at all). Also the guys over at ux.stackexchange seem to agree that the only way to mark something optional is to write "optional" (https://ux.stackexchange.com/q/102930, https://ux.stackexchange.com/q/9684). Python code reads very natural, but I'm not convinced `~` would add to that; it's rather a step away. Personally, for me that's not important, I'm more of the style "look-up the docs and learn from there" rather than relying on my intuition of the meaning of something. But from other discussions on this list I had the impression that Python wants to keep possible confusions to a minimum, especially for newcomers (I remember the discussion about `while ... except`, with the main argument against, that this syntax could easily be confused). With `~` there probably won't be a confusion in that sense, but someone reading it for the first time will definitely need to look it up (which is fine i.m.o.).
For Union on the other hand it would be more helpful to have a shorter syntax, int | str seems pretty clear, but what prevents tuples (int, str) from being interpreted as unions by type checkers. This doesn't require any changes to the built-in types and it is aligned with the already existing syntax for checking multiple types with isinstance or issubclass: isinstance(x, (int, str)). Having used this a couple of times, whenever I see a tuple of types I immediately think of them as or options. The biggest problem with tuple is that in every other language with a similar type system, (int, str) means Tuple[int, str]. I think {int, str}, which someone proposed in one of the earlier discussions, is nice. What else would a set of types mean (unless you’re doing mathematical type theory rather than programming language typing)? But it’s unfortunate that things like isinstance and except take a tuple of types (and it has to be a tuple, not any other kind of iterable), so a set might be just as confusing for hardcore Python types as a tuple would be for polyglots.
The possible confusion with `Tuple[x, y]` is a strong counter-argument, but as you mention, `{int, str}` doesn't have this particular problem. The unfortunate part about `isinstance` is that it takes _only_ a tuple and not any kind of collection of types.
If the compatibility issue isn’t a big deal (and I trust Guido that it isn’t), I think int | str is the best option. It’s an operator that means union, it’s used for sum/union types in other languages, it makes perfect sense if you read it as “int or str”… I can’t imagine anyone being confused or put off by it.
I also like the `int | str` syntax, and I can't imagine that it will cause any kind of confusion. One difference about `int | str` and `{int, str}` however is that successfully interpreting the meaning of the ` | ` syntax likely requires a context while for the `{ }` syntax it is clear that it defers the interpretation to whatever context it is used in. For example:
def foo(x: int | str):
def foo(x: {int, str}):
Here it's pretty clear that both versions indicate multiple type options.
However when reading something like:
x = int | str
it's not immediately clear what this means and what `x` actually is (or represents). Probably `x` is used in type annotations later on but someone reading this statement (and maybe being unfamiliar with typing) could also assume something of the following:
1. Some kind of type chain with fallbacks, so that you can do `x(2.0) == 2` with a fallback on `str`: `x('foo') == 'foo'`.
2. (shell style) Some kind of compound type, piping the output from `int` to `str`: `x(2.3) == str(int(2.3)) == '2'`.
Only if you're familiar with `__or__`'ing types or you see this in a typing context it becomes clear that this means a type union.
On the other hand `{int, str}` is just a collection of types, nothing more, no further meaning attached to it. Whatever meaning is eventually assigned to such a type collection is deferred to the context that uses it, e.g. type annotations or usage with `isinstance`. Similar for `x = (int, str)`, this can be used as `isinstance(foo, x)` or `type_chain(foo, x)` or `type_pipe(foo, x)`. Here it's the functions that give meaning to the type collection, not the collection itself.
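The two alternative readings listed above can be sketched as ordinary functions over a tuple of types. `type_chain` and `type_pipe` are only named in the message, so the bodies below are one possible interpretation, not an existing API:

```python
def type_chain(value, types):
    """Try each constructor in order and return the first result that works."""
    for t in types:
        try:
            return t(value)
        except (TypeError, ValueError):
            continue
    raise TypeError("no type in the chain accepted the value")


def type_pipe(value, types):
    """Feed the value through each constructor in turn, shell-pipe style."""
    for t in types:
        value = t(value)
    return value


x = (int, str)
assert type_chain(2.0, x) == 2        # int(2.0) succeeds
assert type_chain("foo", x) == "foo"  # int("foo") fails, str is the fallback
assert type_pipe(2.3, x) == "2"       # str(int(2.3))
assert isinstance("foo", x)           # the same tuple read as "int or str"
```

A tuple is used rather than a set because both functions depend on the order of the types, whereas the union reading does not.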
Why not extend isinstance to: isinstance(object: Any, classinfo: Iterable | Union)?
On Fri, Aug 30, 2019 at 09:41, Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
Andrew Barnert wrote:
I never really understood the importance of Optional. Often it can be left out altogether and in other cases I find Union[T, None] more expressive (explicit) than Optional[T] (+ the latter saves only 3 chars). Especially for people not familiar with typing, the meaning of Optional is not obvious at first sight. Union[T, None] on the other hand is pretty clear. Also in other cases, where the default (fallback) is different from None, you'd have to use Union anyway. For example a function that normally returns an object of type T but in some circumstances it cannot and then it returns the reason as a str, i.e. -> Union[T, str]; Optional won't help here. But this should be very rare. Most functions that can return a fallback value return a fallback value of the expected return type. For example, a get(key, default) method will return the default param, and
On Aug 29, 2019, at 16:03, Dominik Vilsmeier dominik.vilsmeier@gmx.de wrote: the caller should pass in a default value of the type they’re expecting to look up. So, this shouldn’t be get(key: KeyType, default: T) -> Union[ValueType, T], it should be get(key: KeyType, default: ValueType) -> ValueType. Or maybe get(key: KeyType, default: Optional[ValueType]=None) -> Optional[ValueType]. Most functions that want to explain why they failed do so by raising an exception, not by returning a string. And what other cases are there?
Well, I actually made this up, so I can't think of any other real cases either :-)
Of course you could be trying to add type checking to some weird legacy codebase that doesn’t do things Pythonically, so you have to use Union returns. But that’s specific to that one weird codebase. Meanwhile, Optional return values are common all over Python. Also, Python’s typing system is a lot easier to grasp if you’re familiar with an established modern-typed language (Swift, Scala, Haskell, F#, etc.), and they also use Optional[T] (or optional<T> or Maybe t or some other spelling of the same idea) all over be place—so often that many of them have added shortcuts like T? to make it easier to write and less intrusive to read.
I don't have experience in any of these languages (basically I'm self-taught Python), so I learned it mostly from the docs (also `Optional`). That doesn't necessarily imply understanding the importance of the concept, but I acknowledge that `Optional[T]` is much easier to read than `Union[T, None]`; the former has less visual overhead and it reads more like "natural" language, so once you combine this with the fact that functions return `None` when they don't hit a `return` statement (or the convention of explicitly putting `return None` at the end), the meaning of `Optional[T]` becomes more clear.
I think there may be a gap in the docs. They make perfect sense to someone with experience in one of those languages, but a team that has nobody with that experience might be a little lost. There’s a mile-high overview, a theory paper, and then basically just reference docs that expect you to already know all the key concepts that you don’t already know. Maybe that’s something that an outsider who’s trying to learn from the docs plus trial and error could help improve?
Scanning through the docs and PEP I can't find strongly motivating examples for Optional (over Union[T, None]). E.g. in the following: def lookup(self, name: str) -> Optional[Node]: nodes = self.get(name) if nodes: return nodes[-1] return None I would rather write Union[Node, None] because that's much more explicit about what happens. Then introducing ~T in place of Optional[T] just further obfuscates the meaning of the code: def lookup(self, name: str) -> ~Node: The ~ is easy to be missed (at least by human readers) and the meaning not obvious. That’s kind of funny, because I had to read your Union[Node, None] a couple times before I realized you hadn’t written Union[Node, Node]. :)
I had a similar thought when writing this, so I get the point. I'm not arguing against `Optional` I just think it's less self-explanatory than `Union[T, None]` when you see it for the first time and if you're not familiar with the concept in general. But that doesn't mean you shouldn't familiarize yourself with it :-)
I do dislike ~ for other reasons (but I already mentioned them, Guido isn’t convinced, so… fine, I don’t hate it that much). But I don’t think ~ is easy to miss. It’s not like a period or backtick that can be mistaken for grit on your screen; it’s more visible than things like - that everyone expects to be able to pick out.
As I mentioned in my other relpy to Guido, patterns like -341 are easily recognizable as a negative number (i.e. you won't miss the `-`) because our brains are accustomed to seeing it. ~Noun on the other hand is not something you're likely to encounter in everyday language and thus it is an unfamiliar pattern. Noun? on the other hand is easily recognizable. Regarding the meaning, `T?` should be pretty clear (read as "maybe T", i.e. maybe you hit a return statement with `T` and if not it's going to be `None` by default); for `~` on the other hand I'm not aware of any meaning in natural language. I did a bit of internet search for symbols representing "optional" but I couldn't find any (e.g. none of the icon websites I tried gave satisfying results, or any results at all). Also the guys over at ux.stackexchange seem to agree that the only way to mark something optional is to write "optional" ( https://ux.stackexchange.com/q/102930, https://ux.stackexchange.com/q/9684). Python code reads very natural, but I'm not convinced `~` would add to that; it's rather a step away. Personally, for me that's not important, I'm more of the style "look-up the docs and learn from there" rather than relying on my intuition of the meaning of something. But from other discussions on this list I had the impression that Python wants to keep possible confusions to a minimum, especially for newcomers (I remember the discussion about `while ... except` , with the main argument against, that this syntax could easily be confused). With `~` there probably won't be a confusion in that sense, but someone reading it for the first time will definitely need to look it up (which is fine i.m.o.).
For Union on the other hand it would be more helpful to have a shorter syntax, int | str seems pretty clear, but what prevents tuples (int, str) from being interpreted as unions by type checkers. This doesn't require any changes to the built-in types and it is aligned with the already existing syntax for checking multiple types with isinstance or issubclass: isinstance(x, (int, str)). Having used this a couple of times, whenever I see a tuple of types I immediately think of them as or options. The biggest problem with tuple is that in every other language with a similar type system, (int, str) means Tuple[int, str]. I think {int, str}, which someone proposed in one of the earlier discussions, is nice. What else would a set of types mean (unless you’re doing mathematical type theory rather than programming language typing)? But it’s unfortunate that things like isinstance and except take a tuple of types (and it has to be a tuple, not any other kind of iterable), so a set might be just as confusing for hardcore Python types as a tuple would be for polyglots.
The possible confusion with `Tuple[x, y]` is a strong counter-argument, but as you mention, `{int, str}` doesn't have this particular problem. The unfortunate part about `isinstance` is that it takes _only_ a tuple and not any kind of collection of types.
If the compatibility issue isn’t a big deal (and I trust Guido that it isn’t), I think int | str is the best option. It’s an operator that means union, it’s used for sum/union types in other languages, it makes perfect sense if you read it as “int or str”… I can’t imagine anyone being confused or put off by it.
I also like the `int | str` syntax, and I can't imagine that it will cause any kind of confusion. One difference about `int | str` and `{int, str}` however is that successfully interpreting the meaning of the ` | ` syntax likely requires a context while for the `{ }` syntax it is clear that it defers the interpretation to whatever context it is used in. For example:
def foo(x: int | str):
def foo(x: {int, str}):
Here it's pretty clear that both versions indicate multiple type options.
However when reading something like:
x = int | str
it's not immediately clear what this means and what `x` actually is (or represents). Probably `x` is used in type annotations later on but someone reading this statement (and maybe being unfamiliar with typing) could also assume something of the following:
1. Some kind of type chain with fallbacks, so that you can do `x(2.0) == 2` with a fallback on `str`: `x('foo') == 'foo'`.
2. (shell style) Some kind of compound type, piping the output from `int` to `str`: `x(2.3) == str(int(2.3)) == '2'`.
Only if you're familiar with `__or__`'ing types or you see this in a typing context it becomes clear that this means a type union.
On the other hand `{int, str}` is just a collection of types, nothing more, no further meaning attached to it. Whatever meaning is eventually assigned to such a type collection is deferred to the context that uses it, e.g. type annotations or usage with `isinstance`. Similar for `x = (int, str)`, this can be used as `isinstance(foo, x)` or `type_chain(foo, x)` or `type_pipe(foo, x)`. Here it's the functions that give meaning to the type collection, not the collection itself.
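To make the point concrete, here is a rough sketch of what the two alternative readings could look like as functions. `type_chain` and `type_pipe` are only names used for illustration in the message above; they do not exist in Python, and the bodies below are one assumed interpretation of them.

```python
def type_chain(value, types):
    """Hypothetical: try each constructor in order, falling back on failure."""
    for t in types:
        try:
            return t(value)
        except (TypeError, ValueError):
            continue
    raise ValueError(f"no type in {types!r} accepts {value!r}")


def type_pipe(value, types):
    """Hypothetical: feed the value through each constructor in turn."""
    for t in types:
        value = t(value)
    return value


x = (int, str)

chained_number = type_chain(2.0, x)   # int(2.0) succeeds
chained_text = type_chain("foo", x)   # int("foo") fails, str("foo") is used
piped = type_pipe(2.3, x)             # str(int(2.3))
checked = isinstance(2.3, x)          # 2.3 is neither an int nor a str
```

The same tuple `x` is passed everywhere; only the function that receives it decides whether it means "fallback chain", "pipeline" or "any of these types".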
On Aug 30, 2019, at 02:09, Philippe Prados <philippe.prados@gmail.com> wrote:
Why not extend isInstance to :
isinstance(object:Any,classinfo: Iterable | Union) ?
How would you pass a single type? And once you allow a single type, you shouldn’t have to do anything special to allow a Union, because a Union is already a type. And there’s a good reason isinstance only takes tuples of types, not arbitrary iterables: because a type can be iterable (e.g., an Enum contains all of its members), so it would make single types ambiguous. Just like str.endswith can’t take a string or iterable (strings contain their characters), so it gets the same solution used there and dozens of other places: tuples are special-cased. If we already have | (you’re using it in your annotation), we don’t have a problem to solve in the first place. The problem only comes up if we allow sets of types as shorthand for unions in annotations instead. And the only problem is that sets as shorthand don’t match the “tuples are special” rule in isinstance and elsewhere, so that rule has to change to something like “tuples and sets (and frozensets?) are special”. I don’t think we need (or want) any more complicated change than that.
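A small sketch of the ambiguity described above, using only the standard library. The `Color` enum is an illustrative name, not something from the thread.

```python
from enum import Enum


class Color(Enum):
    RED = 1
    GREEN = 2


# An Enum class is a single type, yet iterating over it yields its members,
# so "any iterable of types" could not be told apart from "one type".
members = list(Color)
assert isinstance(Color.RED, Color)

# str.endswith has the same issue (a str is an iterable of characters)
# and solves it the same way: a tuple of suffixes is accepted...
tuple_accepted = "setup.py".endswith((".py", ".pyi"))

# ...while any other iterable, such as a list, is rejected.
try:
    "setup.py".endswith([".py", ".pyi"])
except TypeError:
    list_rejected = True
else:
    list_rejected = False
```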
Maybe, with

isinstance(object: Any, classinfo: Tuple | Union) ?

it's possible to accept

isinstance(anUnion, [Union]) # Force tuple for Union
isinstance(aTuple, [Tuple]) # Force Tuple
isinstance(aStr, int | str)

Correct ?
On Aug 30, 2019, at 11:26, Philippe Prados <philippe.prados@gmail.com> wrote:
May be, with
isinstance(object:Any,classinfo: Tuple | Union) ?
it's possible to accept
isinstance(anUnion, [Union]) # Force tuple for Union
isinstance(aTuple, [Tuple]) # Force Tuple
isinstance(aStr, int | str)
Correct ?
No. Currently, isinstance takes a type, or a tuple of types. You’re changing it to take a tuple of anything, or a value of any Union of any types. What is that supposed to do? I think you’re mixing up types and values here (it may be easier to think of List instead of Union here: [1, 2, 3] is a List, because it’s a List[int], a list of ints; List[int] is not a List, because it’s not a list of anything, it’s just a single value).

But even beyond that, I don’t understand why you want to stop isinstance from taking individual types. Or why you think you need to add anything for Union types. Union types are already types. So isinstance already accepts them.

Meanwhile, [Union] and [Tuple] are not tuples, they’re lists. So, they don’t work today, and they wouldn’t work with your change. If you pass a single-value tuple (Tuple,), that does work today, but I’m not sure why you’d bother given that it means the exact same thing as just passing Tuple. Anyway, either way, that already works.

As for int|str, as you said at the start of this thread, that just returns Union[int, str]. It’s the exact same value, so it has the exact same type, so you don’t need to add anything new to anyone’s signature to handle it the same.

Taking a step back, I suspect your real issue is with this:

>>> isinstance(2, Union[int|str])
TypeError: Subscripted generics cannot be used with class and instance checks

But this has nothing to do with the signature of isinstance. That Union is a type, and isinstance correctly accepts it as a type, and correctly follows the protocol, which includes calling its __subclasscheck__ method. It’s that method, Union.__subclasscheck__, that’s preventing you from making this test, as you can see from the traceback, and it’s even explaining why. And this isn’t a bug; it’s implementing exactly what the design and docs say it should.
You’re not supposed to use instantiated generic types in isinstance because that’s almost always a sign of confusion between static and dynamic types, not a useful thing to do. You could argue that Union is a special case, unlike most of the other generics, and it actually is useful more often than a mistake, and therefore it should have different rules. But you have to make that argument, and then come up with those different rules, and then convince people they’re better, and then implement them in the subclasscheck method. Changing the signature of isinstance is irrelevant. And that’s all orthogonal to, and almost completely unrelated to, adding |.
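As a hedged illustration of the distinction between the isinstance signature and the check performed by the type itself: the sketch below shows a subscripted generic being rejected, and one possible way to test against the members of a Union without touching isinstance. `isinstance_of_union` is a hypothetical helper, it assumes `typing.get_args` is available (Python 3.8 or later) and that every member of the Union is a plain class. How isinstance treats a Union passed directly may differ between Python versions, so the sketch does not rely on that.

```python
from typing import List, Union, get_args

# A subscripted generic is refused by the type's own check,
# not by the signature of isinstance.
try:
    isinstance([1, 2, 3], List[int])
except TypeError:
    subscripted_rejected = True
else:
    subscripted_rejected = False


def isinstance_of_union(value, union_type):
    """Hypothetical helper: check value against the members of a Union."""
    # get_args(Union[int, str]) returns the tuple (int, str), which is
    # exactly the form isinstance already accepts.
    return isinstance(value, get_args(union_type))


int_matches = isinstance_of_union(2, Union[int, str])
float_matches = isinstance_of_union(2.5, Union[int, str])
```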
On Aug 29, 2019, at 15:28, Guido van Rossum <guido@python.org> wrote:
A wart will be that we can make `int | None` work but we shouldn't make `None | int` work (I don't want to add any new operator overloads to `None`, it should always be an error).
Is there a reason that type.__ror__ wouldn’t handle that, something funky about type, or about builtin types in general, that I’m forgetting?
On Thu, Aug 29, 2019 at 5:00 PM Andrew Barnert <abarnert@yahoo.com> wrote:
On Aug 29, 2019, at 15:28, Guido van Rossum <guido@python.org> wrote:
A wart will be that we can make `int | None` work but we shouldn't make `None | int` work (I don't want to add any new operator overloads to `None`, it should always be an error).
Is there a reason that type.__ror__ wouldn’t handle that, something funky about type, or about builtin types in general, that I’m forgetting?
Hadn't thought of `__ror__`. I guess that would work, so never mind on that wart :-)

--
--Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
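A minimal sketch of how a reflected operator can cover `None | T` without adding anything to `None`. The proposal would put these methods on `type` itself; here a hypothetical metaclass `UnionMeta` and class `Foo` stand in for that, so this is an illustration of the mechanism and not the actual patch.

```python
from typing import Optional, Union


class UnionMeta(type):
    """Hypothetical metaclass standing in for the proposed change to type."""

    def __or__(cls, other):
        # Handles Foo | something
        return Union[cls, other]

    def __ror__(cls, other):
        # Handles something | Foo when the left operand cannot handle `|`,
        # which is the case for None.
        return Union[other, cls]


class Foo(metaclass=UnionMeta):
    pass


left = Foo | None    # resolved by UnionMeta.__or__
right = None | Foo   # None has no `|` for this, so UnionMeta.__ror__ runs
```

Both expressions compare equal to `Optional[Foo]`, and `None` itself gains no new operator.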
On Thu, Aug 29, 2019 at 02:25:56PM +0200, Philippe Prados wrote:
# Operator for Union
assert( int | str == Union[int,str])
[Aside: assert is not a function and you don't need the parentheses.]

This is not a new proposal. If I recall correctly, it was proposed way back at the very beginning of the type-hinting discussion, and there has been at least one closed feature request for it:

https://github.com/python/typing/issues/387

-- Steven
participants (15)
- Andrew Barnert
- Chris Angelico
- Dominik Vilsmeier
- Guido van Rossum
- Gustavo Carneiro
- Inada Naoki
- Ivan Levkivskyi
- Jan Verbeek
- Pasha Stetsenko
- Philippe Prados
- Philippe Prados
- Rhodri James
- Ricky Teachey
- Rob Cliffe
- Steven D'Aprano