[Python-ideas] Generator syntax hooks?

Nick Coghlan ncoghlan at gmail.com
Wed Aug 9 10:54:57 EDT 2017


On 9 August 2017 at 15:38, Guido van Rossum <guido at python.org> wrote:
> On Tue, Aug 8, 2017 at 10:06 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> The OP's proposal doesn't fit into that category though: rather it's
>> asking about the case where we have an infinite iterator (e.g.
>> itertools.count(0)), and want to drop items until they start meeting
>> some condition (i.e. itertools.dropwhile) and then terminate the
>> iterator as soon as another condition is no longer met (i.e.
>> itertools.takewhile).
>
> I don't think that's what the OP meant. The original proposal seemed to
> assume that it would be somehow reasonable for the input ("integers" in the
> example) to be able to see and parse the condition in the generator
> expression ("1000 <= x < 100000" in the example, with "x" somehow known to
> be bound to the iteration value). That's at least what I think the remark "I
> like mathy syntax" referred to.

Right, I was separating the original request to make "{x for x in
integers if 1000 <= x < 1000000}" work into the concrete proposal to
make exactly *that* syntax work (which I don't think is feasible), and
the slightly more general notion of offering a more math-like syntax
that allows finite sets to be built from infinite iterators by
defining a termination condition in addition to a filter condition.
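
For reference, that more general notion can already be spelled with
itertools today, just not with math-like syntax. A minimal sketch,
with the bounds shrunk so it runs instantly ("lower" and "upper" are
illustrative names, not part of any proposal):

```python
import itertools

lower, upper = 10, 20  # illustrative stand-ins for 1000 and 1000000

# dropwhile skips the leading items that fail the filter condition;
# takewhile stops pulling from the infinite iterator at the first
# item that fails the termination condition
candidates = itertools.dropwhile(lambda x: x < lower, itertools.count(0))
result = set(itertools.takewhile(lambda x: x < upper, candidates))
```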

>> There aren't any technical barriers I'm aware of to implementing that,
>> with the main historical objection being that instead of the
>> comprehension level while clause mapping to a while loop directly the
>> way the for and if clauses map to their statement level counterparts,
>> it would instead map to the conditional break in the expanded
>> loop-and-a-half form:
>>
>>     while True:
>>         if not condition:
>>             break
>>
>> While it's taken me a long time to come around to the idea, "Make
>> subtle infinite loops in mathematical code easier to avoid" *is* a
>> pretty compelling user-focused justification for incurring that extra
>> complexity at the language design level.
>
> I haven't come around to this yet. It looks like it will make explaining
> comprehensions more complex, since the translation of "while X" into "if not
> X: break" feels less direct than the translations of "for x in xs" or "if
> pred(x)". (In particular, your proposal seems to require more experience
> with mentally translating loops and conditions into jumps -- most regulars
> of this forum do that for a living, but I doubt it's second nature for the
> OP.)

Yeah, if we ever did add something like this, I suspect a translation
using takewhile would potentially be easier for at least some users to
understand than the one to a break condition:

    {x for x in itertools.count(0) if 1000 <= x while x < 1000000}

    <=>

    result = set()
    for x in itertools.count(0):
        if 1000 <= x:
            # If you've never used the loop-and-a-half idiom, it's
            # not obvious why "while <expr>" means "if not <expr>: break"
            if not x < 1000000:
                break
            result.add(x)

    is roughly

    {x for x in itertools.takewhile(lambda x: x < 1000000,
                                    itertools.count(0)) if 1000 <= x}

    <=>

    result = set()
    for x in itertools.takewhile(lambda x: x < 1000000, itertools.count(0)):
        if 1000 <= x:
            result.add(x)

However, the break condition is the translation that would make sense
at a language *implementation* level (and would hence be the one that
determined the relative location of the while clause in the expression
form).
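
For this particular example, the agreement between the two
translations can be checked in current Python by writing the
break-based expansion out by hand. This is only a sketch: the "while"
clause is not valid syntax, the function names are made up for
illustration, and the bounds are shrunk so it runs instantly.

```python
import itertools

LOW, HIGH = 10, 20  # small illustrative bounds in place of 1000 and 1000000


def via_break():
    # Hand-written expansion of the hypothetical
    # {x for x in itertools.count(0) if LOW <= x while x < HIGH}
    result = set()
    for x in itertools.count(0):
        if LOW <= x:
            if not x < HIGH:
                break
            result.add(x)
    return result


def via_takewhile():
    # The spelling that already works today
    return {x for x in itertools.takewhile(lambda x: x < HIGH,
                                           itertools.count(0)) if LOW <= x}
```

Both functions return the integers from LOW up to, but not including,
HIGH.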

While that discrepancy *still* sets off alarm bells for me (since
it's a clear sign that "how people would think this works" and "how
it would actually work" probably wouldn't match), I'm also conscious
of the amount of syntactic noise that "takewhile" introduces vs the
"while" keyword.
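
One concrete way the two readings could diverge, assuming the clauses
would nest left to right the way for and if clauses do today (that
nesting, and the helper names below, are assumptions of this sketch
rather than anything specified): under the break translation, a
"while" clause placed after an "if" clause is only evaluated for items
the filter lets through, whereas takewhile tests every item.

```python
import itertools


def nested_break(source, keep, go_on):
    # Assumed expansion of {x for x in source if keep(x) while go_on(x)}:
    # the break test is only reached for items that pass the filter
    result = set()
    for x in source:
        if keep(x):
            if not go_on(x):
                break
            result.add(x)
    return result


def with_takewhile(source, keep, go_on):
    # takewhile tests every item, whether or not the filter keeps it
    return {x for x in itertools.takewhile(go_on, source) if keep(x)}


def is_even(x):
    return x % 2 == 0


def not_five(x):
    return x != 5


a = nested_break(range(10), is_even, not_five)    # never tests 5
b = with_takewhile(range(10), is_even, not_five)  # stops on reaching 5
```

Here the filter hides 5 from the nested break test, so "a" runs to the
end of the input while "b" stops early.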

The counter-argument (which remains valid even against my own change
of heart) is that adding a new comprehension clause doesn't actually
fix the "accidental infinite loop" problem: "{x for x in
itertools.count(0) if 1000 <= x < 1000000}" will still loop forever,
it would just have a nicer fix to get it to terminate (adding " while
x" to turn the second filter condition into a termination condition).
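
The difference between filtering and terminating can be made visible
by counting how many items each form pulls from the source. A sketch
under stated assumptions: "counted" is a made-up helper, the bounds
are shrunk to 10 and 20, and the filter-only version is given an
artificial cap via islice purely so that it finishes at all.

```python
import itertools


def counted(iterable, tally):
    # Yield items unchanged while recording how many were pulled
    for item in iterable:
        tally[0] += 1
        yield item


CAP = 1000  # artificial cap; without it the filter-only form never ends

filter_pulled = [0]
capped = itertools.islice(itertools.count(0), CAP)
filter_only = {x for x in counted(capped, filter_pulled) if 10 <= x < 20}

stop_pulled = [0]
source = counted(itertools.count(0), stop_pulled)
with_stop = {x for x in itertools.takewhile(lambda x: x < 20, source)
             if 10 <= x}
```

Both forms build the same set, but the filter-only form keeps
consuming items right up to the cap, while the takewhile form stops
as soon as it sees the first item that fails the termination
condition.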

So while I'm +0 where I used to be a firm -1, it's still only a +0 :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

