Experiences with Creating PEP 484 Stub Files
I've been adding support to the SIP wrapper generator for automatically generating PEP 484 compatible stub files so that future versions of PyQt can be shipped with them. By way of feedback I thought I'd share my experience, confusions and suggestions.
There are a number of things I'd like to express but cannot find a way to do so...
- objects that implement the buffer protocol
- type objects
- slice objects
- capsules
- sequences of fixed size (ie. specified in the same way as Tuple)
- distinguishing between instance and class attributes.
The documentation is incomplete - there is no mention of Set or Tuple for example.
I found the documentation confusing regarding Optional. Intuitively it seems to be the way to specify arguments with default values. However it is explained in terms of (for example) Union[str, None] and I (intuitively but incorrectly) read that as meaning "a str or None" as opposed to "a str or nothing".
bytes can be used as shorthand for bytes, bytearray and memoryview - but what about objects that really only support bytes? Shouldn't the shorthand be called something like AnyBytes?
Is there any recommended way to test the validity and completeness of stub files? What's the recommended way to parse them?
Phil
On Feb 9, 2016, at 03:44, Phil Thompson
There are a number of things I'd like to express but cannot find a way to do so...
- objects that implement the buffer protocol
That seems like it should be filed as a bug with the typing repo. Presumably this is just an empty type that registers bytes, bytearray, and memoryview, and third-party classes have to register with it manually?
- type objects - slice objects
Can't you just use the concrete types type and slice for these two? I don't think you need generic or abstract "any metaclass, whether inheriting from type or not" or "any class that meets the slice protocol", do you?
- capsules
That one seems reasonable. But maybe there should just be a types.CapsuleType or types.PyCapsule, and then you can just check that the same as any other concrete type? But how often do you need to verify that something is a capsule, without knowing that it's the *right* capsule? At runtime, you'd usually use PyCapsule_IsValid, not PyCapsule_CheckExact, right? So should the type checker be tracking the name too?
- sequences of fixed size (ie. specified in the same way as Tuple)
How would you disambiguate between a sequence of one int and a sequence of 0 or more ints if they're both spelled "Sequence[int]"? That isn't a problem for Tuple, because it's assumed to be heterogeneous, so Tuple[int] can only be a 1-tuple. (This was actually discussed in some depth. I thought it would be a problem, because some types--including tuple itself--are sometimes used as homogeneous arbitrary-length containers and sometimes as heterogeneous fixed-length containers, but Guido and others had some good answers for that, even if I can't remember what they were.)
- distinguishing between instance and class attributes.
Where? Are you building a protocol that checks the data members of a type for conformance or something? If so, why is an object that has "spam" and "eggs" as instance attributes but "cheese" as a class attribute not usable as an object conforming to the protocol with all three attributes? (Also, does @property count as a class or instance attribute? What about an arbitrary data descriptor? Or a non-data descriptor?)
[Just adding to Andrew's response]
On Tue, Feb 9, 2016 at 9:58 AM, Andrew Barnert via Python-Dev
On Feb 9, 2016, at 03:44, Phil Thompson
wrote: There are a number of things I'd like to express but cannot find a way to do so...
- objects that implement the buffer protocol
That seems like it should be filed as a bug with the typing repo. Presumably this is just an empty type that registers bytes, bytearray, and memoryview, and third-party classes have to register with it manually?
Hm, there's no way to talk about these in regular Python code either, is there? I think that issue should be resolved first. Probably by adding something to collections.abc. And then we can add the corresponding name to typing.py. This will take time though (have to wait for 3.6) so I'd recommend 'Any' for now (and filing those bugs).
- type objects
You can use 'type' for this (i.e. the builtin). You can't specify any properties for types though; that's a feature request: https://github.com/python/typing/issues/107 -- but it may be a while before we address it (it's not entirely clear how it should work, and we have many other pressing issues still).
- slice objects
Can't you just use the concrete types type and slice for these two? I don't think you need generic or abstract "any metaclass, whether inheriting from type or not" or "any class that meets the slice protocol", do you?
Can't you use 'slice' (i.e. the builtin)? Mypy supports that.
- capsules
That one seems reasonable. But maybe there should just be a types.CapsuleType or types.PyCapsule, and then you can just check that the same as any other concrete type?
But how often do you need to verify that something is a capsule, without knowing that it's the *right* capsule? At runtime, you'd usually use PyCapsule_IsValid, not PyCapsule_CheckExact, right? So should the type checker be tracking the name too?
- sequences of fixed size (ie. specified in the same way as Tuple)
That's kind of a poor data structure. :-( Why can't you use Tuple here?
How would you disambiguate between a sequence of one int and a sequence of 0 or more ints if they're both spelled "Sequence[int]"? That isn't a problem for Tuple, because it's assumed to be heterogeneous, so Tuple[int] can only be a 1-tuple. (This was actually discussed in some depth. I thought it would be a problem, because some types--including tuple itself--are sometimes used as homogeneous arbitrary-length containers and sometimes as heterogeneous fixed-length containers, but Guido and others had some good answers for that, even if I can't remember what they were.)
We solved that by allowing Tuple[int, ...] to spell a homogeneous tuple of integers.
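A minimal sketch of the two Tuple spellings discussed here. The alias and function names (Pair, IntTuple, total, describe) are hypothetical and chosen only for illustration:

```python
from typing import Tuple

# Fixed-length, heterogeneous: exactly two items, an int followed by a str.
Pair = Tuple[int, str]

# Variable-length, homogeneous: zero or more ints.
IntTuple = Tuple[int, ...]


def total(values: IntTuple) -> int:
    """Sum a tuple of ints of any length (hypothetical helper)."""
    return sum(values)


def describe(pair: Pair) -> str:
    """Format a (number, label) pair (hypothetical helper)."""
    number, label = pair
    return "{}={}".format(label, number)
```

A type checker would accept total(()) and total((1, 2, 3)), while describe expects exactly two items of the stated types. Nothing is enforced at runtime.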
- distinguishing between instance and class attributes.
Where? Are you building a protocol that checks the data members of a type for conformance or something? If so, why is an object that has "spam" and "eggs" as instance attributes but "cheese" as a class attribute not usable as an object conforming to the protocol with all three attributes? (Also, does @property count as a class or instance attribute? What about an arbitrary data descriptor? Or a non-data descriptor?)
It's a known mypy bug. :-( It's somewhat convoluted to fix. https://github.com/JukkaL/mypy/issues/1097
Some things Andrew snipped:
The documentation is incomplete - there is no mention of Set or Tuple for example.
Tuple is here: https://docs.python.org/3/library/typing.html#typing.Tuple
collections.Set maps to typing.AbstractSet (https://docs.python.org/3/library/typing.html#typing.AbstractSet; present twice in the docs somehow :-( ). typing.Set (corresponding to builtins.set) is indeed missing, I've a note of that: http://bugs.python.org/issue26322.
I found the documentation confusing regarding Optional. Intuitively it seems to be the way to specify arguments with default values. However it is explained in terms of (for example) Union[str, None] and I (intuitively but incorrectly) read that as meaning "a str or None" as opposed to "a str or nothing".
But it *does* mean 'str or None'. The *type* of an argument doesn't have any bearing on whether it may be omitted from the argument list by the caller -- these are orthogonal concepts (though sadly the word optional might apply to both). It's possible (though unusual) to have an optional argument that must be a str when given; it's also possible to have a mandatory argument that may be a str or None. Can you help improve the wording in the docs (preferably by filing an issue)?
bytes can be used as shorthand for bytes, bytearray and memoryview - but what about objects that really only support bytes? Shouldn't the shorthand be called something like AnyBytes?
We debated that, but found it too annoying to have to import and write AnyBytes in so many places. The type checker may not be precise for cases that only accept bytes, but hopefully it's more useful in general this way.
Is there any recommended way to test the validity and completeness of stub files? What's the recommended way to parse them?
That's also an open issue. For a quick check I tend to just point mypy at a stub file, since it is the most mature implementation of PEP 484 to date (Google's pytype is still working on PEP 484 compatibility). While this doesn't always catch all errors, it will at least find syntax errors and cases that mypy doesn't support. :-) -- --Guido van Rossum (python.org/~guido)
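On the parsing question: stub files are syntactically valid Python 3, so one option is the standard-library ast module. This is an assumption here, not a recommendation made in the thread. The sketch below only checks syntax and lists the declared functions. It does not validate the annotations themselves, which is what pointing mypy at the stub is for. The stub text and the helper name list_stub_functions are hypothetical:

```python
import ast

# Hypothetical stub contents, held in a string so the example is self-contained.
STUB_SOURCE = '''
from typing import Optional

def foo(a: str = ...) -> str: ...
def bar(b: Optional[str]) -> None: ...

class Widget:
    def name(self) -> str: ...
'''


def list_stub_functions(source: str) -> list:
    """Parse stub text and return the sorted names of all functions and methods.

    ast.parse raises SyntaxError if the text is not valid Python syntax.
    """
    tree = ast.parse(source)
    names = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            names.append(node.name)
    return sorted(names)
```

Comparing such a list against the names a wrapper generator intended to emit would give a rough completeness check.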
On 9 Feb 2016, at 8:54 pm, Guido van Rossum
[Just adding to Andrew's response]
On Tue, Feb 9, 2016 at 9:58 AM, Andrew Barnert via Python-Dev
wrote: On Feb 9, 2016, at 03:44, Phil Thompson
wrote: There are a number of things I'd like to express but cannot find a way to do so...
- objects that implement the buffer protocol
That seems like it should be filed as a bug with the typing repo. Presumably this is just an empty type that registers bytes, bytearray, and memoryview, and third-party classes have to register with it manually?
Hm, there's no way to talk about these in regular Python code either, is there? I think that issue should be resolved first. Probably by adding something to collections.abc. And then we can add the corresponding name to typing.py. This will take time though (have to wait for 3.6) so I'd recommend 'Any' for now (and filing those bugs).
Ok.
- type objects
You can use 'type' for this (i.e. the builtin). You can't specify any properties for types though; that's a feature request: https://github.com/python/typing/issues/107 -- but it may be a while before we address it (it's not entirely clear how it should work, and we have many other pressing issues still).
Yes, I can use type.
- slice objects
Can't you just use the concrete types type and slice for these two? I don't think you need generic or abstract "any metaclass, whether inheriting from type or not" or "any class that meets the slice protocol", do you?
Can't you use 'slice' (i.e. the builtin)? Mypy supports that.
Yes, I can use slice.
- capsules
That one seems reasonable. But maybe there should just be a types.CapsuleType or types.PyCapsule, and then you can just check that the same as any other concrete type?
But how often do you need to verify that something is a capsule, without knowing that it's the *right* capsule? At runtime, you'd usually use PyCapsule_IsValid, not PyCapsule_CheckExact, right? So should the type checker be tracking the name too?
- sequences of fixed size (ie. specified in the same way as Tuple)
That's kind of a poor data structure. :-( Why can't you use Tuple here?
Because allowing any sequence is more flexible than only allowing a tuple.
How would you disambiguate between a sequence of one int and a sequence of 0 or more ints if they're both spelled "Sequence[int]"? That isn't a problem for Tuple, because it's assumed to be heterogeneous, so Tuple[int] can only be a 1-tuple. (This was actually discussed in some depth. I thought it would be a problem, because some types--including tuple itself--are sometimes used as homogeneous arbitrary-length containers and sometimes as heterogeneous fixed-length containers, but Guido and others had some good answers for that, even if I can't remember what they were.)
We solved that by allowing Tuple[int, ...] to spell a homogeneous tuple of integers.
- distinguishing between instance and class attributes.
Where? Are you building a protocol that checks the data members of a type for conformance or something? If so, why is an object that has "spam" and "eggs" as instance attributes but "cheese" as a class attribute not usable as an object conforming to the protocol with all three attributes? (Also, does @property count as a class or instance attribute? What about an arbitrary data descriptor? Or a non-data descriptor?)
It's a known mypy bug. :-( It's somewhat convoluted to fix. https://github.com/JukkaL/mypy/issues/1097
Some things Andrew snipped:
The documentation is incomplete - there is no mention of Set or Tuple for example.
Tuple is here: https://docs.python.org/3/library/typing.html#typing.Tuple
Yes, I missed that.
collections.Set maps to typing.AbstractSet (https://docs.python.org/3/library/typing.html#typing.AbstractSet; present twice in the docs somehow :-( ). typing.Set (corresponding to builtins.set) is indeed missing, I've a note of that: http://bugs.python.org/issue26322.
I found the documentation confusing regarding Optional. Intuitively it seems to be the way to specify arguments with default values. However it is explained in terms of (for example) Union[str, None] and I (intuitively but incorrectly) read that as meaning "a str or None" as opposed to "a str or nothing".
But it *does* mean 'str or None'. The *type* of an argument doesn't have any bearing on whether it may be omitted from the argument list by the caller -- these are orthogonal concepts (though sadly the word optional might apply to both). It's possible (though unusual) to have an optional argument that must be a str when given; it's also possible to have a mandatory argument that may be a str or None.
In the case of Python wrappers around a C++ library then *every* optional argument will have to have a specific type when given. So you are saying that a mandatory argument that may be a str or None would be specified as Union[str, None]? But the docs say that that is the underlying implementation of Option[str] - which (to me) means an optional argument that should be a string when given.
Can you help improve the wording in the docs (preferably by filing an issue)?
When I eventually understand what it means...
bytes can be used as shorthand for bytes, bytearray and memoryview - but what about objects that really only support bytes? Shouldn't the shorthand be called something like AnyBytes?
We debated that, but found it too annoying to have to import and write AnyBytes in so many places. The type checker may not be precise for cases that only accept bytes, but hopefully it's more useful in general this way.
Is there any recommended way to test the validity and completeness of stub files? What's the recommended way to parse them?
That's also an open issue. For a quick check I tend to just point mypy at a stub file, since it is the most mature implementation of PEP 484 to date (Google's pytype is still working on PEP 484 compatibility). While this doesn't always catch all errors, it will at least find syntax errors and cases that mypy doesn't support. :-)
Ok I'll try that. Phil
[Phil]
I found the documentation confusing regarding Optional. Intuitively it seems to be the way to specify arguments with default values. However it is explained in terms of (for example) Union[str, None] and I (intuitively but incorrectly) read that as meaning "a str or None" as opposed to "a str or nothing".
[me] But it *does* mean 'str or None'. The *type* of an argument doesn't have any bearing on whether it may be omitted from the argument list by the caller -- these are orthogonal concepts (though sadly the word optional might apply to both). It's possible (though unusual) to have an optional argument that must be a str when given; it's also possible to have a mandatory argument that may be a str or None.
[Phil] In the case of Python wrappers around a C++ library then *every* optional argument will have to have a specific type when given.
IIUC you're saying that every argument that may be omitted must still have a definite type other than None. Right? In that case just don't use Optional[]. If a signature has the form
def foo(a: str = 'xyz') -> str: ...
then this means that a may be omitted or it may be a str -- you cannot call foo(a=None).
You can even (in a stub file) write this as:
def foo(a: str = ...) -> str: ...
(literal '...' i.e. ellipsis) if you don't want to commit to a specific default value (it makes no difference to mypy).
So you are saying that a mandatory argument that may be a str or None would be specified as Union[str, None]?
Or as Optional[str], which means the same.
But the docs say that that is the underlying implementation of Option[str] - which (to me) means an optional argument that should be a string when given.
(Assuming you meant Option*al*.) There seems to be an utter confusion of the two uses of the term "optional" here. An "optional argument" (outside PEP 484) is one that has a default value. The "Optional[T]" notation in PEP 484 means "Union[T, None]". They mean different things.
Can you help improve the wording in the docs (preferably by filing an issue)?
When I eventually understand what it means...
-- --Guido van Rossum (python.org/~guido)
On 02/09/2016 03:48 PM, Guido van Rossum wrote:
(Assuming you meant Option*al*.) There seems to be an utter confusion of the two uses of the term "optional" here. An "optional argument" (outside PEP 484) is one that has a default value. The "Optional[T]" notation in PEP 484 means "Union[T, None]". They mean different things.
In an effort to be (crystal) clear:
optional argument in Python: has a default value, so may be omitted when the function is called.
Optional[T] in MyPy: the argument has no default value, and must be supplied when the function is called, but the argument can be None.
-- ~Ethan~
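Since the type of an argument and whether it may be omitted are orthogonal, all four combinations can occur. A minimal sketch, with hypothetical function names, of how each would be annotated:

```python
from typing import Optional


def required_str(a: str) -> str:
    """Mandatory argument; must be a str."""
    return a


def defaulted_str(a: str = "xyz") -> str:
    """May be omitted; must be a str when given (no Optional needed)."""
    return a


def required_optional(a: Optional[str]) -> str:
    """Mandatory argument; may be a str or None."""
    return "<none>" if a is None else a


def defaulted_optional(a: Optional[str] = None) -> str:
    """May be omitted; may be a str or None when given."""
    return "<none>" if a is None else a
```

Only the presence of a default decides whether the caller may leave the argument out. Optional[str] only widens the accepted type to include None.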
On 9 Feb 2016, at 11:48 pm, Guido van Rossum
wrote: [Phil]
I found the documentation confusing regarding Optional. Intuitively it seems to be the way to specify arguments with default values. However it is explained in terms of (for example) Union[str, None] and I (intuitively but incorrectly) read that as meaning "a str or None" as opposed to "a str or nothing". [me] But it *does* mean 'str or None'. The *type* of an argument doesn't have any bearing on whether it may be omitted from the argument list by the caller -- these are orthogonal concepts (though sadly the word optional might apply to both). It's possible (though unusual) to have an optional argument that must be a str when given; it's also possible to have a mandatory argument that may be a str or None. [Phil] In the case of Python wrappers around a C++ library then *every* optional argument will have to have a specific type when given.
IIUC you're saying that every argument that may be omitted must still have a definite type other than None. Right? In that case just don't use Optional[]. If a signature has the form
def foo(a: str = 'xyz') -> str: ...
then this means that a may be omitted or it may be a str -- you cannot call foo(a=None).
You can even (in a stub file) write this as:
def foo(a: str = ...) -> str: ...
(literal '...' i.e. ellipsis) if you don't want to commit to a specific default value (it makes no difference to mypy).
So you are saying that a mandatory argument that may be a str or None would be specified as Union[str, None]?
Or as Optional[str], which means the same.
But the docs say that that is the underlying implementation of Option[str] - which (to me) means an optional argument that should be a string when given.
(Assuming you meant Option*al*.) There seems to be an utter confusion of the two uses of the term "optional" here. An "optional argument" (outside PEP 484) is one that has a default value. The "Optional[T]" notation in PEP 484 means "Union[T, None]". They mean different things.
Can you help improve the wording in the docs (preferably by filing an issue)?
When I eventually understand what it means...
I understand now. The documentation, as it stands, is correct and consistent but (to me) the meaning of Optional is completely counter-intuitive. What you suggest with str = ... is exactly what I need. Adding a section to the docs describing that should clear up the confusion. Thanks, Phil
On Wed, Feb 10, 2016 at 1:11 AM, Phil Thompson
I understand now. The documentation, as it stands, is correct and consistent but (to me) the meaning of Optional is completely counter-intuitive. What you suggest with str = ... is exactly what I need. Adding a section to the docs describing that should clear up the confusion.
I tried to add some clarity to the docs with this paragraph:
Note that this is not the same concept as an optional argument, which is one that has a default. An optional argument with a default needn't use the ``Optional`` qualifier on its type annotation (although it is inferred if the default is ``None``). A mandatory argument may still have an ``Optional`` type if an explicit value of ``None`` is allowed.
Should be live on docs.python.org with the next push (I don't recall the delay, at most a day IIRC).
-- --Guido van Rossum (python.org/~guido)
On 10 Feb 2016, at 5:52 pm, Guido van Rossum
On Wed, Feb 10, 2016 at 1:11 AM, Phil Thompson
wrote: I understand now. The documentation, as it stands, is correct and consistent but (to me) the meaning of Optional is completely counter-intuitive. What you suggest with str = ... is exactly what I need. Adding a section to the docs describing that should clear up the confusion.
I tried to add some clarity to the docs with this paragraph:
Note that this is not the same concept as an optional argument, which is one that has a default. An optional argument with a default needn't use the ``Optional`` qualifier on its type annotation (although it is inferred if the default is ``None``). A mandatory argument may still have an ``Optional`` type if an explicit value of ``None`` is allowed.
Should be live on docs.python.org with the next push (I don't recall the delay, at most a day IIRC).
That should do it, thanks. A followup question... Is...
def foo(bar: str = Optional[str])
...valid? In other words, bar can be omitted, but if specified must be a str or None?
Thanks, Phil
On Wed, Feb 10, 2016 at 10:01 AM, Phil Thompson
On 10 Feb 2016, at 5:52 pm, Guido van Rossum
wrote: [...] That should do it, thanks. A followup question... Is...
def foo(bar: str = Optional[str])
...valid? In other words, bar can be omitted, but if specified must be a str or None?
The syntax you gave makes no sense (the default value shouldn't be a type) but to do what your words describe you can do
def foo(bar: Optional[str] = ...): ...
That's literally what you would put in the stub file (the ... are literal ellipses). In a .py file you'd have to specify a concrete default value. If your concrete default is neither str nor None you'd have to use cast(str, default_value), e.g.
_NO_VALUE = object()  # marker
def foo(bar: Optional[str] = cast(str, _NO_VALUE)):
    ...implementation...
Now the implementation can distinguish between foo(), foo(None) and foo('').
-- --Guido van Rossum (python.org/~guido)
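A runnable sketch of the marker pattern above. The body of foo and its return strings are assumptions added for illustration, standing in for the elided implementation:

```python
from typing import Optional, cast

_NO_VALUE = object()  # marker: distinct from every str and from None


def foo(bar: Optional[str] = cast(str, _NO_VALUE)) -> str:
    # cast() returns its argument unchanged at runtime; it only tells the
    # type checker to treat the marker as a str.
    if bar is _NO_VALUE:
        return "omitted"   # called as foo()
    if bar is None:
        return "None"      # called as foo(None)
    return "str:" + bar    # called with a str, including ''
```

The identity check against the marker comes first so that an explicit None is still distinguishable from a missing argument.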
On 10 February 2016 at 06:54, Guido van Rossum
[Just adding to Andrew's response]
On Tue, Feb 9, 2016 at 9:58 AM, Andrew Barnert via Python-Dev
wrote: On Feb 9, 2016, at 03:44, Phil Thompson
wrote: There are a number of things I'd like to express but cannot find a way to do so...
- objects that implement the buffer protocol
That seems like it should be filed as a bug with the typing repo. Presumably this is just an empty type that registers bytes, bytearray, and memoryview, and third-party classes have to register with it manually?
Hm, there's no way to talk about these in regular Python code either, is there? I think that issue should be resolved first. Probably by adding something to collections.abc. And then we can add the corresponding name to typing.py. This will take time though (have to wait for 3.6) so I'd recommend 'Any' for now (and filing those bugs).
Somewhat related, there's actually no way to export PEP 3118 buffers directly from a type implemented in Python: http://bugs.python.org/issue13797
Cython and PyPy each have their own approach to handling that, but there's no language level cross-interpreter convention.
A type (e.g. BytesLike, given the change we made to relevant error messages) could still be added to collections.abc without addressing that problem, it would just need to be empty and used only for explicit registration without any structural typing support.
Regards, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
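A minimal sketch of what such an empty, registration-only type could look like. BytesLike here is a hypothetical class defined locally, not an existing name in collections.abc, and MyBuffer is a made-up third-party class:

```python
from abc import ABC


class BytesLike(ABC):
    """Hypothetical empty ABC: no methods, no structural checks.

    Membership is established only by explicit registration.
    """


# The builtin buffer exporters would be registered up front.
BytesLike.register(bytes)
BytesLike.register(bytearray)
BytesLike.register(memoryview)


class MyBuffer:
    """Stand-in for a third-party type that exports the buffer protocol."""


# Third-party classes have to opt in manually.
BytesLike.register(MyBuffer)
```

Because there is no structural typing support, isinstance() reports only what was registered. It cannot detect whether an object really implements the buffer protocol.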
participants (5)
- Andrew Barnert
- Ethan Furman
- Guido van Rossum
- Nick Coghlan
- Phil Thompson