On Sun, Oct 7, 2012 at 3:43 PM, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On 7 October 2012 21:19, Guido van Rossum <guido@python.org> wrote:
On Sun, Oct 7, 2012 at 12:30 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 07.10.12 04:45, Guido van Rossum wrote:
But yes, this was all considered and accepted when PEP 380 was debated (endlessly :-), and I see no reason to change anything about this.
The reason is that when someone uses StopIteration.value for some purpose, they will lose this value if the iterator is wrapped in itertools.chain (a quite commonly used technique) or in any other standard iterator wrapper.
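For example, a minimal sketch of the loss (gen, direct and chained are just made-up names for illustration; Python 3.3):

import itertools

def gen():
    yield 1
    yield 2
    return 'result'            # stored as StopIteration.value

def direct():
    value = yield from gen()   # value == 'result'
    yield value

def chained():
    # chain() exhausts gen() and then raises a plain StopIteration,
    # so the attached value is gone and yield from returns None
    value = yield from itertools.chain(gen())
    yield value

print(list(direct()))    # [1, 2, 'result']
print(list(chained()))   # [1, 2, None]

The wrapped version yields the same items, but the return value is gone.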
If this is just about itertools.chain() I may see some value in it (but TBH the discussion so far has mostly been confusing -- please spend some more time coming up with good examples that show actually useful use cases, rather than f() and g() or foo() and bar())
OTOH yield from is not primarily for iterators -- it is for coroutines. I suspect most of the itertools functionality just doesn't work with coroutines.
I think what Serhiy is saying is that although PEP 380 mainly discusses generator functions, it has effectively changed the definition of what it means to be an iterator, for all iterators: previously an iterator was just something that yielded values, but now it can also return a value. Since the meaning of an iterator has changed, functions that work with iterators need to be updated.
I think there are different philosophical viewpoints possible on that issue. My own perspective is that there is no change in the definition of iterator -- only in the definition of generator. Note that the *ability* to attach a value to StopIteration is not new at all.
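To illustrate that point (a sketch; CountUp is a made-up class): a plain, non-generator iterator has always been able to pass an argument to the StopIteration it raises, and in 3.3 "yield from" surfaces that as its result:

class CountUp:
    # A plain iterator, no generator involved; passing an argument
    # to StopIteration has always been possible.
    def __init__(self, n):
        self.i, self.n = 0, n
    def __iter__(self):
        return self
    def __next__(self):
        if self.i >= self.n:
            raise StopIteration('counted to %d' % self.n)
        self.i += 1
        return self.i

def consume():
    result = yield from CountUp(3)   # yield from (3.3) picks the value up
    yield result

print(list(consume()))   # [1, 2, 3, 'counted to 3']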
Before PEP 380, filter(lambda x: True, obj) returned an object that was the same kind of iterator as obj (it would yield the same values). Now the "kind of iterator" that obj is depends not only on the values that it yields but also on the value that it returns. Since filter does not pass on the same return value, filter(lambda x: True, obj) is no longer the same kind of iterator as obj. The same considerations apply to many other functions, such as map, itertools.groupby, and itertools.dropwhile.
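To make "transparent wrapper" concrete, here is a sketch of what a value-propagating filter might look like (vfilter is a hypothetical name, not an actual stdlib function):

def vfilter(pred, iterable):
    # Unlike the builtin filter(), this propagates the wrapped
    # iterator's return value instead of discarding it.
    it = iter(iterable)
    while True:
        try:
            x = next(it)
        except StopIteration as e:
            return e.value      # re-attach the underlying return value
        if pred(x):
            yield x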
There are other differences between iterators and generators that are not preserved by the various forms of "iterator algebra" that can be applied -- in particular, non-generator iterators don't support send(). I think it's perfectly valid to view generators as a special kind of iterator, with properties that aren't preserved by applying generic iterator operations to them (like itertools or filter()).
Cases like itertools.chain and zip are trickier since they each act on multiple underlying iterables. Probably chain should return a tuple of the return values from each of its iterables.
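A sketch of that interpretation (vchain is a hypothetical name):

def vchain(*iterables):
    # Collect each iterable's return value and return them as a tuple.
    values = []
    for it in iterables:
        values.append((yield from it))
    return tuple(values)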
That's one possible interpretation, but I doubt it's the most useful one.
This feature is new in Python 3.3, which was released a week ago,
It's been in alpha/beta/candidate for a long time, and PEP 380 was first discussed in 2009.
so it is not yet widely used, but it has uses that have nothing to do with coroutines.
Yes, as a shortcut for "for x in <iterator>: yield x". Note that the for-loop ignores the value in the StopIteration -- would you want to change that too?
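For instance (toy names, just for illustration):

def inner():
    yield 1
    return 'value'

def shortcut():
    result = yield from inner()   # result == 'value'
    yield result

def spelled_out():
    for x in inner():             # the loop discards StopIteration.value
        yield x

print(list(shortcut()))      # [1, 'value']
print(list(spelled_out()))   # [1]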
As an example of how you could use it, consider parsing a file that can contain #include statements. When an #include statement is encountered, we need to insert the contents of the included file. This is easy to do with a recursive generator. The example uses the return value of the generator to keep track of which line is being parsed in relation to the flattened output file:
def parse(filename, output_lineno=0):
    with open(filename) as fin:
        for input_lineno, line in enumerate(fin):
            if line.startswith('#include '):
                subfilename = line.split()[1]
                output_lineno = yield from parse(subfilename, output_lineno)
            else:
                try:
                    yield parse_line(line)
                except ParseLineError:
                    raise ParseError(filename, input_lineno, output_lineno)
                output_lineno += 1
    return output_lineno
Hm. This example looks constructed to prove your point... It would be easier to count the output lines in the caller. Or you could use a class to hold that state. I think it's just a bad habit to start using the return value for this purpose. Please use the same approach as you would before 3.3, using "yield from" just as the shortcut I mentioned above.
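For example, a sketch of the class-based approach, reusing parse_line, ParseLineError and ParseError from your example (the Parser class itself is made up):

class Parser:
    # The counter lives on the instance instead of in the return value,
    # so the generator composes freely with filter, chain, etc.
    def __init__(self):
        self.output_lineno = 0

    def parse(self, filename):
        with open(filename) as fin:
            for input_lineno, line in enumerate(fin):
                if line.startswith('#include '):
                    subfilename = line.split()[1]
                    yield from self.parse(subfilename)
                else:
                    try:
                        yield parse_line(line)
                    except ParseLineError:
                        raise ParseError(filename, input_lineno,
                                         self.output_lineno)
                    self.output_lineno += 1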
When writing code like the above that depends on being able to get the value returned from an iterator, it is no longer possible to freely mix utilities like filter, map, zip, and itertools.chain with the iterators returned by parse(), as they no longer act as transparent wrappers over the underlying iterators (they do not propagate the value attached to StopIteration).
I see that as one more argument for not using the return value here...
Hopefully, I've understood Serhiy and the docs correctly (I don't have access to Python 3.3 right now to test any of this).
I don't doubt it. But I think you're fighting windmills.

--
--Guido van Rossum (python.org/~guido)