Mailman 3 Discrepancy between what aiter() and `async for` requires on purpose? - Python-Dev

Discrepancy between what aiter() and `async for` requires on purpose?

Brett Cannon

29 Aug 2021

3:16 p.m.

If you look at https://github.com/python/cpython/blob/b11a951f16f0603d98de24fee5c023df83ea5... you will see that `async for` requires that the iterator returned from `__aiter__` define `__anext__`. But if you look at aiter(), which uses PyObject_GetAiter() from https://github.com/python/cpython/blob/f0a6fde8827d5d4f7a1c741ab1a8b206b66ff... and PyAiter_Check() from https://github.com/python/cpython/blob/f0a6fde8827d5d4f7a1c741ab1a8b206b66ff..., you will notice that aiter() requires `__anext__` *and* `__aiter__` on the async iterator that gets returned from `__aiter__`.

Now the docs for aiter() at https://docs.python.org/3.10/library/functions.html#aiter point out that the async iterator is expected to define both methods, as does the glossary definition for "asynchronous iterator" (https://docs.python.org/3.8/glossary.html#term-asynchronous-iterator).

So my question is whether the discrepancy between what `async for` expects and what `aiter()` expects is on purpose. https://bugs.python.org/issue31861 was the issue for creating aiter() and I didn't notice a discussion of this difference. The key reason I'm asking is that this causes a deviation compared to the relationship between `for` and `iter()` (which does not require `__iter__` to be defined on the iterator, although collections.abc.Iterator does). It also makes the glossary definition linked from https://docs.python.org/3.10/reference/compound_stmts.html#the-async-for-sta... incorrect.
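To make the question concrete, here is a minimal sketch of an async iterator that defines `__anext__` but not `__aiter__`. The class names `AnextOnly` and `Source` are hypothetical and exist only for illustration. The `async for` path accepts such an object; whether aiter() accepts it depends on the interpreter version, so the sketch reports the outcome instead of assuming one.

```python
import asyncio
import builtins


class AnextOnly:
    """Hypothetical async iterator: has __anext__, deliberately lacks __aiter__."""

    def __init__(self, limit):
        self._limit = limit
        self._next = 0

    async def __anext__(self):
        if self._next >= self._limit:
            raise StopAsyncIteration
        value = self._next
        self._next += 1
        return value


class Source:
    """Hypothetical async iterable whose __aiter__ returns an AnextOnly."""

    def __aiter__(self):
        return AnextOnly(3)


async def via_async_for():
    # The async comprehension goes through the same machinery as `async for`,
    # which only needs __anext__ on the object returned by __aiter__.
    return [item async for item in Source()]


def try_aiter():
    # aiter() only exists on Python 3.10+, so look it up defensively.
    aiter_builtin = getattr(builtins, "aiter", None)
    if aiter_builtin is None:
        return "aiter() is not available on this interpreter"
    try:
        iterator = aiter_builtin(Source())
    except TypeError as exc:
        return f"aiter() rejected it: {exc}"
    return f"aiter() accepted it: {type(iterator).__name__}"


collected = asyncio.run(via_async_for())
print(collected)
print(try_aiter())
```

The point of the sketch is the asymmetry: the first call succeeds regardless, while the second depends on how strict PyAiter_Check() is in the interpreter being used.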

Serhiy Storchaka

29 Aug

3:59 p.m.

29.08.21 23:16, Brett Cannon wrote:

If you look at https://github.com/python/cpython/blob/b11a951f16f0603d98de24fee5c023df83ea552c/Python/ceval.c#L2409-L2451 you will see that `async for` requires that the iterator returned from `__aiter__` define `__anext__`. But if you look at aiter(), which uses PyObject_GetAiter() from https://github.com/python/cpython/blob/f0a6fde8827d5d4f7a1c741ab1a8b206b66ffd57/Objects/abstract.c#L2741-L2759 and PyAiter_Check() from https://github.com/python/cpython/blob/f0a6fde8827d5d4f7a1c741ab1a8b206b66ffd57/Objects/abstract.c#L2769-L2778, you will notice that aiter() requires `__anext__` *and* `__aiter__` on the async iterator that gets returned from `__aiter__`.

Now the docs for aiter() at https://docs.python.org/3.10/library/functions.html#aiter point out that the async iterator is expected to define both methods, as does the glossary definition for "asynchronous iterator" (https://docs.python.org/3.8/glossary.html#term-asynchronous-iterator).

So my question is whether the discrepancy between what `async for` expects and what `aiter()` expects is on purpose. https://bugs.python.org/issue31861 was the issue for creating aiter() and I didn't notice a discussion of this difference. The key reason I'm asking is that this causes a deviation compared to the relationship between `for` and `iter()` (which does not require `__iter__` to be defined on the iterator, although collections.abc.Iterator does). It also makes the glossary definition linked from https://docs.python.org/3.10/reference/compound_stmts.html#the-async-for-statement incorrect.

PyIter_Check() only checks existence of __next__, not __iter__ (perhaps for performance reasons). I just ported changes from PyPy in SQLite tests (https://github.com/python/cpython/pull/28021) because a test class with __next__ but without __iter__ passed tests on CPython but failed on PyPy.
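The synchronous behaviour described here can be sketched in a few lines. The classes `NextOnly` and `Container` are hypothetical names for illustration, and the behaviour shown is that of CPython (as noted above, PyPy was stricter in the SQLite tests).

```python
from collections.abc import Iterator


class NextOnly:
    """Hypothetical iterator: defines __next__ but not __iter__."""

    def __init__(self, items):
        self._items = list(items)

    def __next__(self):
        if not self._items:
            raise StopIteration
        return self._items.pop(0)


class Container:
    """Hypothetical iterable whose __iter__ returns a NextOnly."""

    def __iter__(self):
        return NextOnly("abc")


# Looping over the container works: only __next__ is needed on the result.
direct = [x for x in Container()]

# iter() also accepts the result, because the check is for __next__ only.
it = iter(Container())

# The ABC is stricter: it requires both __iter__ and __next__.
is_abc_iterator = isinstance(it, Iterator)

# Looping over the iterator itself fails, since it has no __iter__.
try:
    for _ in it:
        pass
except TypeError:
    reiterate_failed = True
else:
    reiterate_failed = False

print(direct, is_abc_iterator, reiterate_failed)
```

This is the `for`/`iter()` relationship that the original question uses as its baseline: both accept an iterator without `__iter__`, and only collections.abc.Iterator insists on it.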

Brett Cannon

30 Aug

11:46 a.m.

On Sun, Aug 29, 2021 at 2:01 PM Serhiy Storchaka <storchaka@gmail.com> wrote:

PyIter_Check() only checks existence of __next__, not __iter__ (perhaps for performance reasons).

Or maybe no one thought to require __iter__ for iterators?

I just ported changes from PyPy in SQLite tests (https://github.com/python/cpython/pull/28021) because a test class with __next__ but without __iter__ passed tests on CPython but failed on PyPy.

I'm going to wait to hear from anyone who may have been involved with implementing aiter() and `async for` before proposing various ways to align them with iter() and `for`.

Nick Coghlan

1 Sep

6:04 p.m.

On Tue, 31 Aug 2021, 2:52 am Brett Cannon, <brett@python.org> wrote:

On Sun, Aug 29, 2021 at 2:01 PM Serhiy Storchaka <storchaka@gmail.com> wrote:

PyIter_Check() only checks existence of __next__, not __iter__ (perhaps for performance reasons).

Or maybe no one thought to require __iter__ for iterators?

I don't think PyIter_Check is testing the formal definition of an iterator, I think it's just testing if calling __iter__ can be skipped (as you say, for performance reasons).

I'm surprised iter() would skip calling __iter__ just because an object defines __next__, though. Even though "__iter__ is defined and returns self" is part of the iterator definition, it still feels like a leap from there to "if __next__ is defined, skip calling __iter__ in iter()".

The optimisation that bypasses the __[a]iter__ method call feels more legitimate in the actual for loop syntax, it just feels odd to me if the builtin isn't forcing the call.

Cheers, Nick.

Guido van Rossum

2 Sep

8:18 p.m.

First of all, we should ping Yury, who implemented `async for` about 6 years ago (see PEP 492), and Joshua Bronson, who implemented aiter() and anext() about 5 months ago (see https://bugs.python.org/issue31861). I've CC'ed them here.

My own view:

A. iter() doesn't check that the thing returned implements __iter__, because it's not needed -- iterators having an __iter__ method is a convention, not a requirement. You shouldn't implement __iter__ returning something that doesn't implement __iter__ itself, because then "for x in iter(a)" would fail even though "for x in a" works. But you get an error, and anyone who implements something like that (or uses it) deserves what they get. People know about this convention and the ABC enforces it, so in practice it will be very rare that someone gets bitten by this.

B. aiter() shouldn't need to check either, for exactly the same reason. I *suspect* (but do not know) that the extra check for the presence of __aiter__ is simply an attempt by the implementer to enforce the convention. There is no *need* other than ensuring that "async for x in aiter(a)" works when "async for x in a" works.

Note that PEP 525, which defines async generators, seems to imply that an __aiter__ returning self is always necessary, but I don't think it gives a reason.

I do notice there's some backwards compatibility issue related to __aiter__, alluded to in both PEP 492 (https://www.python.org/dev/peps/pep-0492/#api-design-and-implementation-revi...) and PEP 525 (https://www.python.org/dev/peps/pep-0525/#aiter-and-anext-builtins). So it's *possible* that it has to do with this (maybe really old code implementing the 3.5 version of __aiter__ would be caught out by the extra check) but I don't think it is. Hopefully Yury and/or Joshua remember?

FWIW I don't think there are any optimizations that avoid calling __iter__ or __aiter__ if __next__ or __anext__ is present. And certainly I wouldn't endorse adding them (this would seem an ad-hoc optimization that could break user expectations unexpectedly, quite apart from the issue discussed here).

--Guido
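The claim that no optimization skips `__iter__` when `__next__` is present can be checked with a small counter. This is only a sketch: the class `Counting` is a hypothetical name, and the result reflects what CPython does.

```python
class Counting:
    """Hypothetical iterator that records how often __iter__ is called."""

    def __init__(self):
        self.iter_calls = 0
        self._remaining = 2

    def __iter__(self):
        self.iter_calls += 1
        return self

    def __next__(self):
        if self._remaining == 0:
            raise StopIteration
        self._remaining -= 1
        return self._remaining


obj = Counting()

# The builtin calls __iter__ even though the object already has __next__.
iter(obj)
calls_after_builtin = obj.iter_calls

# The for statement calls __iter__ as well.
for _ in obj:
    pass
calls_after_loop = obj.iter_calls

print(calls_after_builtin, calls_after_loop)
```

Both the builtin and the loop go through `__iter__`, so the presence of `__next__` affects only the check on the returned object, not whether `__iter__` is invoked.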

Yury Selivanov

9:41 p.m.

Comments inlined:

On Thu, Sep 2, 2021 at 6:23 PM Guido van Rossum <guido@python.org> wrote:

First of all, we should ping Yury, who implemented `async for` about 6 years ago (see PEP 492), and Joshua Bronson, who implemented aiter() and anext() about 5 months ago (see https://bugs.python.org/issue31861). I've CC'ed them here.

Looks like PyAiter_Check was added along with the aiter/anext builtins. I agree it's unnecessary to check for __aiter__ in it, so let's just fix it.

My own view:

A. iter() doesn't check that the thing returned implements __iter__, because it's not needed -- iterators having an __iter__ method is a convention, not a requirement.

Yeah.

You shouldn't implement __iter__ returning something that doesn't implement __iter__ itself, because then "for x in iter(a)" would fail even though "for x in a" works. But you get an error, and anyone who implements something like that (or uses it) deserves what they get. People know about this convention and the ABC enforces it, so in practice it will be very rare that someone gets bitten by this.

B. aiter() shouldn't need to check either, for exactly the same reason. I *suspect* (but do not know) that the extra check for the presence of __aiter__ is simply an attempt by the implementer to enforce the convention. There is no *need* other than ensuring that "async for x in aiter(a)" works when "async for x in a" works.

I agree.

Note that PEP 525, which defines async generators, seems to imply that an __aiter__ returning self is always necessary, but I don't think it gives a reason.

PEP 525 implies that specifically for asynchronous generators, not iterators. That's due to the fact that synchronous generators return self from their __iter__.

I do notice there's some backwards compatibility issue related to __aiter__, alluded to in both PEP 492 (https://www.python.org/dev/peps/pep-0492/#api-design-and-implementation-revi...) and PEP 525 (https://www.python.org/dev/peps/pep-0525/#aiter-and-anext-builtins). So it's *possible* that it has to do with this (maybe really old code implementing the 3.5 version of __aiter__ would be caught out by the extra check) but I don't think it is. Hopefully Yury and/or Joshua remember?

That wasn't related. In the first iteration of PEP 492, __aiter__ was required to be a coroutine. Some time after shipping 3.5.0 I realized that that would complicate asynchronous generators for no reason (and I think there were also some bigger problems than just complicating them). So I updated the PEP to change the __aiter__ return type from `Awaitable[AsyncIterator]` to `AsyncIterator`. The ceval code was changed to call __aiter__ and see if the object that it returned had __anext__. If not, it tried to await on it.

Bottom line: let's fix PyAiter_Check to only look for __anext__. It's a new function so we can still fix it to reflect PyIter_Check and not worry about anything.

Yury
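The `__aiter__` history described above can be illustrated with a rough sketch, assuming Python 3.7 or later, where the transitional "await the result" fallback is gone and `__aiter__` must return the async iterator directly. The classes `Ticker` and `LegacyTicker` are hypothetical names; the second one imitates the original 3.5.0 style in which `__aiter__` was a coroutine.

```python
import asyncio


class Ticker:
    """Hypothetical async iterator using the current protocol."""

    def __init__(self, count):
        self._count = count
        self._current = 0

    def __aiter__(self):
        # Plain method: returns the async iterator directly.
        return self

    async def __anext__(self):
        if self._current >= self._count:
            raise StopAsyncIteration
        self._current += 1
        return self._current


class LegacyTicker(Ticker):
    """Hypothetical variant with __aiter__ written as a coroutine (3.5.0 style)."""

    async def __aiter__(self):
        return self


async def drain(source):
    return [item async for item in source]


modern = asyncio.run(drain(Ticker(3)))

# On 3.7+ the coroutine returned by the legacy __aiter__ has no __anext__,
# so iteration is refused. The interpreter may also warn on stderr that the
# coroutine was never awaited.
try:
    asyncio.run(drain(LegacyTicker(3)))
except TypeError as exc:
    legacy_error = exc
else:
    legacy_error = None

print(modern, type(legacy_error).__name__)
```

The check applied to the object returned by `__aiter__` is the same one discussed in this thread: it looks for `__anext__` and nothing else.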

Dennis Sweeney

10:29 p.m.

I think the C implementation of PyAiter_Check was a translation of the Python code `isinstance(..., collections.abc.AsyncIterator)`, but I agree that it would be more consistent to just check for __anext__. There were comments at the time here: https://github.com/python/cpython/pull/8895#discussion_r532833905
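As a small sketch of the Python-level check mentioned here, the ABC requires both methods, which is stricter than what `async for` needs. The class names `AnextOnly` and `Both` are hypothetical.

```python
from collections.abc import AsyncIterator


class AnextOnly:
    """Hypothetical class with __anext__ but no __aiter__."""

    async def __anext__(self):
        raise StopAsyncIteration


class Both(AnextOnly):
    """Hypothetical class that adds __aiter__, satisfying the ABC."""

    def __aiter__(self):
        return self


# collections.abc.AsyncIterator looks for both __anext__ and __aiter__.
anext_only_is_abc = isinstance(AnextOnly(), AsyncIterator)
both_is_abc = isinstance(Both(), AsyncIterator)

print(anext_only_is_abc, both_is_abc)
```

A C translation of this isinstance() test would naturally end up checking for both methods, which matches the original PyAiter_Check() behaviour questioned at the top of the thread.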

Brett Cannon

8 Sep

2:48 p.m.

On Thu, Sep 2, 2021 at 7:43 PM Yury Selivanov <yselivanov.ml@gmail.com> wrote:

Bottom line: let's fix PyAiter_Check to only look for __anext__. It's a new function so we can still fix it to reflect PyIter_Check and not worry about anything.

I don't know if Pablo wants such a change in 3.10 since we are at rc2 at this point, so this might have to wait for 3.11 (although there's no deprecation here since it's a loosening of requirements so it could go in straight away).

Yury Selivanov

2:49 p.m.

We have already merged it; the fix is part of rc2.

yury

Brett Cannon

5:31 p.m.

On Wed, Sep 8, 2021 at 12:49 PM Yury Selivanov <yselivanov@gmail.com> wrote:

We have already merged it, the fix is part of the rc2.

Thanks! (If we were on Discourse I would have left a ♥ instead 😃)
