[docs] sum( ) in generator bugged
Zachary Ware
zachary.ware+pydocs at gmail.com
Wed Dec 9 15:10:21 EST 2015
On Tue, Dec 8, 2015 at 3:08 PM, Anselm Kiefner <python at anselm.kiefner.de> wrote:
> Hey Zach,
>
> Thanks for your answer and explanation.
> I just posted here following the advice on
> https://docs.python.org/3.5/bugs.html ("If you’re short on time, you can
> also email your bug report to docs at python.org. ‘docs@’ is a mailing list
> run by volunteers; your request will be noticed, though it may take a
> while to be processed.").
Ah, that line is meant for documentation bugs; I've clarified the
wording. For further discussion on this issue, please send your
message to python-list at python.org.
> Now, I'm fairly aware how that code works and how to work around it, but
> I must say I almost felt insulted by your claim that my "expectations
> are bugged".
My apologies; insulting you was far from my intent.
> Considering the Zen of Python ("There should be one - and
> preferably only one - obvious way to do it") and the Principle of least
> astonishment I'd rather argue that my expectations on this one are
> rather well within what could be considered normal.
>
> When I, as a user, replace a list or a list comprehension with a
> generator - that is simply by replacing [] with (), I would expect that
> the items are now generated on the fly and not held in memory anymore
> (that's the speed to memory tradeoff you were talking about) - but the
> result of both should be the same, logically.
> This is also how it works in most cases without any trouble.
As long as you don't try to iterate over the generator more than once.
> Now, applying sum() on a list in general returns the same result as it
> does when applied on a generator, and it works as expected inside the
> list comprehension when applied on the list. So you see, the described
> behaviour of sum() applied on a generator inside a list comprehension is
> clearly an exception of the general behaviour.
How should `sum` behave differently? A generator is a one-shot
iterator; it cannot be reset (how would you reset a generator that
yields random values?). `sum` is just a consumer of iterables, just
like `list()`, `max()`, `any()`, `for` loops, comprehensions, and
generator expressions. All of those follow the same protocol: call
`iter()` on the iterable, whose `__iter__()` method returns some object
with a `__next__()` method (an iterator), then call `next()` on that
object until `__next__()` raises `StopIteration`. Calling `iter()` on a
generator returns the generator itself, so every consumer shares the
same position in it.
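As a rough sketch of that protocol, here is a hand-written stand-in for
`sum`. The name `manual_sum` and the sample data are made up for
illustration; this is not how the built-in is actually implemented, only
the same sequence of protocol calls.

```python
def manual_sum(iterable):
    """Hypothetical stand-in for sum(), spelling out the protocol steps."""
    iterator = iter(iterable)      # calls iterable.__iter__()
    total = 0
    while True:
        try:
            item = next(iterator)  # calls iterator.__next__()
        except StopIteration:      # the iterator has nothing left
            break
        total += item
    return total


L = [1, 2, 3]            # assumed sample data
L_g = (x for x in L)     # a generator over the same values

list_first = manual_sum(L)     # 6: iter(L) builds a new list iterator
list_second = manual_sum(L)    # 6: and builds another one here
gen_first = manual_sum(L_g)    # 6: iter(L_g) is L_g itself
gen_second = manual_sum(L_g)   # 0: L_g was already exhausted above
```

The list can be summed twice because each `iter(L)` call starts over from
the beginning; the generator has no beginning to go back to.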
For the record, just replacing [] with () makes no difference. Your
example 'a' (`a = [x*sum(L) for x in L]`) is a list comprehension that
iterates over a list multiple times. Your example 'b' (`b =
(x*sum(L_g) for x in L_g)`) is a generator expression that attempts to
iterate over a generator multiple times. It's the second change that
makes the difference; `c = (x*sum(L) for x in L)` would give the same
result as 'a', as would `d = (x*sum(L) for x in L_g)`. This would
give the same answer as b: `e = [x*sum(L_g) for x in L_g]`. This
would give an entirely different answer: `f = [x*sum(L_g) for x in
L]`.
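To make those six cases concrete, here is a small sketch. It assumes
`L = [1, 2, 3]`, since the original list is not shown in this message,
and uses a hypothetical helper `fresh_gen()` so that each case starts
with an unconsumed generator.

```python
L = [1, 2, 3]  # assumed sample data, not the list from the original report


def fresh_gen():
    """Hypothetical helper: return a new, unconsumed generator over L."""
    return (x for x in L)


a = [x * sum(L) for x in L]            # [6, 12, 18]

L_g = fresh_gen()
b = list(x * sum(L_g) for x in L_g)    # [5]: x takes 1, sum() eats 2 and 3

c = list(x * sum(L) for x in L)        # [6, 12, 18], same as a

L_g = fresh_gen()
d = list(x * sum(L) for x in L_g)      # [6, 12, 18], L_g is walked only once

L_g = fresh_gen()
e = [x * sum(L_g) for x in L_g]        # [5], same as b

L_g = fresh_gen()
f = [x * sum(L_g) for x in L]          # [6, 0, 0]: only the first sum() sees data

print(a, b, c, d, e, f)
```

In 'b' and 'e' the outer loop and `sum` pull from the same generator, so
after the first item is taken by the loop, `sum` consumes the rest and
the loop ends.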
> Let me quote the Zen of Python again: Special cases aren't special
> enough to break the rules.
I'm not clear on what is breaking what rule. Everything is following
the iterator protocol, which is documented here:
https://docs.python.org/3/library/stdtypes.html#typeiter
> Yes, surely there are ways to work around it, but I hope you agree now
> that this is not a flaw in my expectations rather than in the code.
The examples I gave in my previous message were not workarounds, they
were the two options you have to get what you expected while using
generators everywhere. If you want a reusable lazy iterator, you'll
need something other than a generator.
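One possible shape for "something other than a generator" is an object
whose `__iter__()` builds a new generator on every call. This is only a
sketch: the class name `Reiterable` and the sample data are assumptions
for illustration, not something from the standard library or from the
earlier messages.

```python
class Reiterable:
    """Hypothetical wrapper: lazy like a generator, but restartable.

    It stores a zero-argument callable and calls it in __iter__(), so each
    consumer (a for loop, sum(), list(), ...) gets its own new iterator.
    """

    def __init__(self, make_iterator):
        self._make_iterator = make_iterator

    def __iter__(self):
        return self._make_iterator()


L = [1, 2, 3]                              # assumed sample data
L_r = Reiterable(lambda: (x for x in L))   # values are still produced lazily

result = [x * sum(L_r) for x in L_r]       # [6, 12, 18]
again = list(L_r)                          # [1, 2, 3], still reusable
```

The trade-off is that the values are recomputed on every pass, which is
the cost of not holding them in memory.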
I'm sorry if I'm not explaining things well. You may have better luck
starting a thread on python-list; there are several knowledgeable
people there who are quite willing to explain the finer points of
things like this better than I can.
Regards,
--
Zach