Opt-in “iter def” and/or “gen def” syntax for generator functions
After getting used to writing async functions, I’ve been wanting to use a similar syntax to declare generator functions. Something along the lines of
`iter def my_iterator() -> T`
and/or
`gen def my_generator() -> (T, U, V)`
Obviously, for backwards compatibility, this would need to be optional or have an opt-in mechanism. Would a feature like this be at all within the realm of possibility? I’d be happy to write up a much longer discussion if so.
(I found a short discussion of such a feature in the archives about 8 years ago[1]. But, since it predates both `async def` and the current type checker regime, I thought it might be worth discussing. Apologies if I missed any more recent discussions.)
Thanks,
Aaron
[1] https://mail.python.org/archives/list/python-ideas@python.org/thread/OVIHVRK...
On Tue, 31 May 2022 at 23:00, Aaron L via Python-ideas <python-ideas@python.org> wrote:
After getting used to writing async functions, I’ve been wanting to use a similar syntax to declare generator functions. Something along the lines of
`iter def my_iterator() -> T`
and/or
`gen def my_generator() -> (T, U, V)`
Obviously, for backwards compatibility, this would need to be optional or have an opt-in mechanism. Would a feature like this be at all within the realm of possibility? I’d be happy to write up a much longer discussion if so.
(I found a short discussion of such a feature in the archives about 8 years ago[1]. But, since it predates both `async def` and the current type checker regime, I thought it might be worth discussing. Apologies if I missed any more recent discussions.)
What's the advantage? You can just use normal function syntax to define them, and it works correctly. Do you need the ability to declare that it's a generator even without any yields in it? ChrisA
Thanks for your reply.
What's the advantage?
I brought this up thinking about explicitness and readability. Say you want to figure out what this function is doing:
```
def foo() -> t.Iterator[T]:
    [... 300 lines of code]
```
Is this a generator function? I'd argue that whether it's a generator function or not is fundamental to being able to read it. The type hint alone doesn't tell you whether you're looking at a generator function or not - it might just construct and return an iterator. So, you have to look for a `yield`. And if `yield` is somewhere in the function body, it abruptly changes the semantics of the entire definition. This feels like spooky action at a distance to me - I'd much rather have the information up top in the function declaration, the same way that `async def` declares a coroutine function.
However: this was actually discussed in PEP 255, where there was a decision *not* to introduce a new keyword for generator functions. From the BDFL decision:
No argument on either side is totally convincing, so I have consulted my language designer’s intuition. It tells me that the syntax proposed in the PEP is exactly right - not too hot, not too cold. But, like the Oracle at Delphi in Greek mythology, it doesn’t tell me why, so I don’t have a rebuttal for the arguments against the PEP syntax.
- Aaron
On Wed, 1 Jun 2022 at 03:34, Aaron L via Python-ideas <python-ideas@python.org> wrote:
Thanks for your reply.
What's the advantage?
I brought this up thinking about explicitness and readability. Say you want to figure out what this function is doing:
```
def foo() -> t.Iterator[T]:
    [... 300 lines of code]
```
Is this a generator function? I'd argue that whether it's a generator function or not is fundamental to being able to read it. The type hint alone doesn't tell you whether you're looking at a generator function or not - it might just construct and return an iterator.
Does it actually matter whether it's a generator, or returns some other type of iterable? What's the difference between these two functions:
def enumerate_spam(n):
    yield "spam 1"
    yield "spam 2"
    yield "spam 3"
    yield n
    yield "the rest of the spam"

def enumerate_default_spam():
    return enumerate_spam("default")
Technically, one of these is a generator, and one is not. But the return value from both of them is a generator object. You can send it values, get values back, all the things you can do with a generator. How fundamental is it that THIS function is a generator, rather than simply that it returns an iterator (or that it returns a generator/coroutine object, etc)?
If your function is really just named "foo" and has 300 lines of code, you have other problems. Normally, the function's name should tell you a lot. In your case, you seem to also have a return type hint, which tells you a bit more, so that ought to be sufficient?
Maybe that's not sufficient for your codebase. Well, that's what docstrings and decorators and code comments are for :)
ChrisA
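For readers following along, the distinction drawn above can be checked with the standard `inspect` module. This is only a small sketch using shortened versions of the two functions from the message; the variable names `is_generator_function` and `returns_generator_object` are introduced here for illustration.

```python
import inspect


def enumerate_spam(n):
    yield "spam 1"
    yield n
    yield "the rest of the spam"


def enumerate_default_spam():
    # A plain function that simply forwards to the generator function.
    return enumerate_spam("default")


# Only the first one is a generator *function* ...
is_generator_function = [
    inspect.isgeneratorfunction(enumerate_spam),
    inspect.isgeneratorfunction(enumerate_default_spam),
]

# ... but calling either one hands back a generator *object*.
returns_generator_object = [
    inspect.isgenerator(enumerate_spam("x")),
    inspect.isgenerator(enumerate_default_spam()),
]
```

From the caller's side the two are indistinguishable, which is the point being made; the difference is only visible by inspecting the function itself.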
I don't really disagree with most of what you wrote! And agree that decorators, specifically, are a pretty good solution within the scope of an individual package. But I would quibble with this:
How fundamental is it that THIS function is a generator, rather than simply that it returns an iterator (or that it returns a generator/coroutine object, etc)?
The delayed execution of a generator function is, IMO, very, very different from that of a regular function!
```
def f():
    result = side_effect1()
    yield 0
    yield 1
    return result
```
vs.
```
def f():
    side_effect()
    return range(2)
```
These are contrived examples, obviously, but I've specifically encountered issues like this when working with code that handles streaming data - from a db or REST API, for instance. My experience of this type of code is often along the following lines:
```
def stream_rows(self):
    [mutex handling ...]
    [retry logic]
    [more mutexes?]
    [some timeout stuff]
    [...]
```
I don't mean to belabor the point - the decision in PEP 255 is pretty definitive, and this was more or less a "modest proposal" anyway. I do appreciate the responses and discussion!
Thanks,
Aaron
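To make the timing difference concrete, here is a minimal runnable sketch of the two contrived shapes above. The `calls` list and the names `lazy` and `eager` are assumptions added for illustration; they stand in for whatever real side effect (a lock, a connection, a request) the function performs.

```python
calls = []


def side_effect():
    # Stand-in for acquiring a lock, opening a connection, and so on.
    calls.append("ran")


def lazy():
    side_effect()  # runs only when the generator is first advanced
    yield 0
    yield 1


def eager():
    side_effect()  # runs as soon as eager() is called
    return range(2)


gen = lazy()
calls_after_lazy_call = list(calls)       # nothing has run yet
lazy_values = list(gen)                   # iterating triggers the body
calls_after_lazy_iteration = list(calls)

calls.clear()
rng = eager()
calls_after_eager_call = list(calls)      # the side effect already happened
eager_values = list(rng)
```

Both functions produce the same values, but only the second performs its side effect at call time.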
31.05.22 16:21, Chris Angelico wrote:
On Tue, 31 May 2022 at 23:00, Aaron L via Python-ideas <python-ideas@python.org> wrote:
After getting used to writing async functions, I’ve been wanting to use a similar syntax to declare generator functions.
What's the advantage? You can just use normal function syntax to define them, and it works correctly. Do you need the ability to declare that it's a generator even without any yields in it?
The advantage is that you cannot accidentally turn a function into a generator by adding "yield". If the result of the call is ignored (it is expected to be None), this bug can live a long time. It is a common issue: a test containing yield always passes. Since 3.11 the unittest module emits a warning if a test method returns a value other than None, but that will not solve all problems: the test can call helpers, and if they are generators, the call is virtually a no-op. This error can also occur in non-test code.
Asynchronous functions are more reliable. "async" is mandatory, and if you do not await the result of an asynchronous function call you will get a loud warning. I think it is time to apply the same approach to generator functions.
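The helper case described above can be reproduced in a few lines. This is a sketch, not code from any real test suite: `check_positive` and `Demo` are hypothetical names, and the stray `yield` is placed there deliberately to show the effect.

```python
import unittest


def check_positive(value):
    # This assertion never executes: the stray `yield` below turns the
    # whole function into a generator function, so calling it only
    # creates a generator object.
    assert value > 0, "value must be positive"
    yield  # accidentally left in


class Demo(unittest.TestCase):
    def test_helper(self):
        # Looks like a failing check, but the generator is created and
        # immediately discarded without running its body.
        check_positive(-1)


result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(Demo).run(result)
passed = result.wasSuccessful()
```

The test method itself returns None, so a warning about test methods returning a value would not apply here; the problem sits one call deeper, in the helper.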
On Wed, 1 Jun 2022 at 23:55, Serhiy Storchaka <storchaka@gmail.com> wrote:
31.05.22 16:21, Chris Angelico wrote:
On Tue, 31 May 2022 at 23:00, Aaron L via Python-ideas <python-ideas@python.org> wrote:
After getting used to writing async functions, I’ve been wanting to use a similar syntax to declare generator functions.
What's the advantage? You can just use normal function syntax to define them, and it works correctly. Do you need the ability to declare that it's a generator even without any yields in it?
The advantage is that you cannot accidentally turn a function into a generator by adding "yield". If the result of the call is ignored (it is expected to be None), this bug can live a long time. It is a common issue: a test containing yield always passes. Since 3.11 the unittest module emits a warning if a test method returns a value other than None, but that will not solve all problems: the test can call helpers, and if they are generators, the call is virtually a no-op. This error can also occur in non-test code.
That might be nice, but without a massive backward compatibility break (or another keyword for non-generator functions), it can't happen. Is there any advantage to being able to declare that it must be a generator (as opposed to simply returning a generator object)? Maybe I just don't work on the right sorts of codebases. ChrisA
On 6/1/2022 9:59 AM, Chris Angelico wrote:
On Wed, 1 Jun 2022 at 23:55, Serhiy Storchaka <storchaka@gmail.com> wrote:
31.05.22 16:21, Chris Angelico wrote:
On Tue, 31 May 2022 at 23:00, Aaron L via Python-ideas <python-ideas@python.org> wrote:
After getting used to writing async functions, I’ve been wanting to use a similar syntax to declare generator functions.
What's the advantage? You can just use normal function syntax to define them, and it works correctly. Do you need the ability to declare that it's a generator even without any yields in it?
The advantage is that you cannot accidentally turn a function into a generator by adding "yield". If the result of the call is ignored (it is expected to be None), this bug can live a long time. It is a common issue: a test containing yield always passes. Since 3.11 the unittest module emits a warning if a test method returns a value other than None, but that will not solve all problems: the test can call helpers, and if they are generators, the call is virtually a no-op. This error can also occur in non-test code.
That might be nice, but without a massive backward compatibility break (or another keyword for non-generator functions), it can't happen. Is there any advantage to being able to declare that it must be a generator (as opposed to simply returning a generator object)?
Serhiy explains the issues above, and I've been bitten by it.
def fn():
    # Do something with side effects, or maybe mutating parameters.
    print("foo")
    yield 3

fn()  # Called for the side effects, but not iterated over. It does not print "foo".
I agree that the compatibility issues are large and tricky.
Eric
Maybe I just don't work on the right sorts of codebases.
ChrisA
01.06.22 16:59, Chris Angelico wrote:
On Wed, 1 Jun 2022 at 23:55, Serhiy Storchaka <storchaka@gmail.com> wrote:
The advantage is that you cannot accidentally turn a function into a generator by adding "yield". If the result of the call is ignored (it is expected to be None), this bug can live a long time. It is a common issue: a test containing yield always passes. Since 3.11 the unittest module emits a warning if a test method returns a value other than None, but that will not solve all problems: the test can call helpers, and if they are generators, the call is virtually a no-op. This error can also occur in non-test code.
That might be nice, but without a massive backward compatibility break (or another keyword for non-generator functions), it can't happen. Is there any advantage to being able to declare that it must be a generator (as opposed to simply returning a generator object)?
The advantage is not in the ability to declare that it must be a generator, but in the ability to declare that it must not be a generator (and this should be the default).
Of course it is a massive backward compatibility break, and the transition should be very slow and careful. First introduce a new optional syntax for generator functions; after 4 or 5 years (when all maintained Python versions support the new syntax) start to emit a quiet warning for a generator function with the old syntax; after a few more years make it louder; and finally, after yet more years, make it an error. In between, third-party linters can start complaining about the old syntax, with these warnings disabled by default at first. It will all take 8-10 years or more.
Or we can just break the world in Python 4.0 (but this is not the plan).
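Until any such syntax exists, the "must not be a generator" declaration can be approximated with decorators, as suggested earlier in the thread. The sketch below is only an illustration: `not_generator` and `generator` are hypothetical names, not part of the standard library, and they check at definition time using `inspect`.

```python
import inspect


def not_generator(func):
    """Hypothetical marker: refuse functions that contain a yield."""
    if inspect.isgeneratorfunction(func) or inspect.isasyncgenfunction(func):
        raise TypeError(
            f"{func.__qualname__} is marked @not_generator but contains 'yield'"
        )
    return func


def generator(func):
    """Hypothetical marker: require that the function is a generator function."""
    if not inspect.isgeneratorfunction(func):
        raise TypeError(
            f"{func.__qualname__} is marked @generator but has no 'yield'"
        )
    return func


@generator
def countdown(n):
    while n > 0:
        yield n
        n -= 1


try:
    @not_generator
    def log_and_forget(msg):
        print(msg)
        yield  # stray yield, caught when the function is defined
except TypeError as exc:
    error = str(exc)
else:
    error = None
```

The obvious limitation is that this is opt-in per function, so it does nothing for undecorated code, which is exactly the default behavior the message argues should change.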
Serhiy Storchaka writes:
The advantage is that you cannot accidentally turn a function into a generator by adding "yield".
Can't mypy catch this?
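Alongside whatever a static checker reports, detecting a generator that is created but never iterated can also be approximated at runtime. The following is a rough sketch under several assumptions: `warn_if_unused` and `_TrackedGenerator` are hypothetical names, the wrapper supports plain iteration only (no `send` or `throw`), and the timing of the warning relies on the wrapper being garbage collected.

```python
import functools
import gc
import inspect
import warnings


class _TrackedGenerator:
    """Thin iterator wrapper that remembers whether iteration ever started."""

    def __init__(self, gen, name):
        self._gen = gen
        self._name = name
        self._started = False

    def __iter__(self):
        return self

    def __next__(self):
        self._started = True
        return next(self._gen)

    def __del__(self):
        # Called when the wrapper is collected; warn if nobody iterated it.
        if not self._started:
            warnings.warn(
                f"generator {self._name!r} was never iterated", RuntimeWarning
            )


def warn_if_unused(func):
    """Hypothetical decorator: warn when the generator is discarded unused."""
    if not inspect.isgeneratorfunction(func):
        raise TypeError(f"{func.__qualname__} is not a generator function")

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return _TrackedGenerator(func(*args, **kwargs), func.__name__)

    return wrapper


@warn_if_unused
def fn():
    yield 3


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fn()          # result discarded without iterating
    gc.collect()  # make sure the wrapper has been collected
    used = list(fn())  # iterated normally, so no warning for this call

messages = [str(w.message) for w in caught]
```

Like the static approach, this only helps for functions that have been explicitly marked, so it does not give the guarantees that mandatory syntax would.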
Asynchronous functions are more reliable. "async" is mandatory, and if you do not await the result of an asynchronous function call you will get a loud warning.
"yield" is mandatory in generators, and it should be possible to detect if a generator is called other than in an iteration context (and possibly we would want to special-case functions like 'dir', 'help', and 'type' for this purpose). As far as the signature goes, I would suppose -> GeneratorType (or GeneratorType[T] if you know the type of object to be yielded) would clue folks in, and mypy could warn if it were missing. Sure, you would like to have the guarantees of mandatory syntax, but (unless I'm missing something) we can do this now. Steve
participants (6)
- Aaron L
- armload_lakes0c@icloud.com
- Chris Angelico
- Eric V. Smith
- Serhiy Storchaka
- Stephen J. Turnbull