[Python-ideas] "continue with" for dynamic iterable injection

Thu Sep 25 13:31:15 CEST 2014

I appreciate the comparison to coroutines; it helps frame some of the
use-cases. However (forgive me for saying), I find that Python's
coroutine model isn't intuitive, certainly not for newcomers. I often find
it hard to envisage use-cases for Python coroutines that wouldn't be
better served by a class, for the sake of readability; Python's
overall philosophy leans toward readability, which suggests it
leans away from coroutines.

Now, that's an aside, and I don't want to drift off-topic; I also realise
Python is moving toward more uniform and useful coroutine-based code,
built on asyncio. So, if `continue with` were useful for building better
or more intuitive coroutines, then by all means that's a valid application.

What I had more in mind was to have it as something that would stand
independent of the custom-class spaghetti normally required to implement
"clean" alternative looping or coroutines. That is, everyone can accept
that it's not necessary; but does it clean up real-world code?

So, for clarity, this is the kind of thing I had in mind, which I've
encountered in various forms and which I think `continue with` would
help with, at its most basic:

Ingredient (unit) (unit/100g):
Protein g 5
Carbohydrate g 50
Fibre g 10
Insoluble Fibre g 5
Soluble Fibre g 5
Starches g 20
Sugars g 20
Sucrose g 10
Glucose g 5
Fructose g 5
Vitamins mg 100
Ascorbic Acid mg 50
Niacin mg 50

The above is invented, but I was actually parsing an
ingredient/nutrition list when the idea occurred to me. As you can see,
there are "totals" followed by sub-categories, some of which are
subtotals which form their own category. When parsed, I might want
(pseudo-json):

{
    Protein: 5g
    Carbohydrates: {
        total: 50g
        fibre: {
            total: 10g,
            soluble: 5
            insoluble: 5
        }
 <...>
}

To parse this, I create code like this, with some obviously-named
functions that aren't given. With recursive subtables, obviously this
isn't going to work as-is, but it illustrates the point:

```
table = {}
subtable = ''
for line in raw_table:
    name, unit, quant = line.strip().rsplit(None, 2)
    if subtable:
        if is_valid_subtable_element(subtable, name):
            table[subtable][name] = quant + unit
        else:
            subtable = ''
            table[name] = quant + unit  # DRY!
    else:
        if is_subtable_leader(name):
            subtable = name
            table[subtable] = {'total': quant + unit}
        else:
            table[name] = quant + unit  # DRY!
```

Now, if I have to maintain this code, which will quickly become
nontrivial for enough branches, I have several locations that need
parallel fixes and modifications.

One solution is to functionalise this and build functions to which the
container (table) and the tokens are passed; changes are then made in
the functions, and the repeated calls in different code branches become
more maintainable. Another is to make an object instead of a dict-table,
and the object performs some magic to handle things correctly.

However, with `continue with` the solution is more straightforward.
Because the problem, essentially, is that some tokens cause a
state-change in addition to presenting data in their own right, by using
`continue with` you can handle them in one code branch first, then
repeat to handle them in the other branch:

```
table = {}
subtable = ''
for line in raw_table.splitlines():
    name, unit, quant = line.rsplit(None, 2)
    if subtable:
        if is_valid_subtable_element(subtable, name):
            table[subtable][name] = quant + unit
        else:
            subtable = ''
            continue with line
    else:
        if is_subtable_leader(name):
            subtable = name
            table[subtable] = {}
            continue with 'total {} {}'.format(unit, quant)
        else:
            table[name] = quant + unit
```

The result is a single table-entry statement per branch: one for subtables, one
for the base table. Tokens that present flow-control issues,
like titles of subtables or values that indicate the subtable should end
(like a vitamin, when we were parsing carbohydrates), are handled first
as flow-control issues and then again as data. (In this case, assume
that the function is_valid_subtable_element always accepts "total" as a valid
subtable element, and judges the rest according to predefined
valid items for categories like "carbohydrates", "fibres", "vitamins",
etcetera.)
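
For comparison, the same flow can be approximated in today's Python with
a small pushback wrapper, roughly along the lines of the adaptor quoted
below. This is only a sketch: the `Pushback` class, the `SUBTABLES` data,
the sample input and the bodies of both helper functions are invented for
illustration, and it handles one level of subtable only.

```python
class Pushback:
    """Iterator wrapper; push(item) makes item the next value yielded."""

    def __init__(self, iterable):
        self._base = iter(iterable)
        self._pending = []

    def __iter__(self):
        return self

    def __next__(self):
        if self._pending:
            return self._pending.pop()
        return next(self._base)

    def push(self, item):
        self._pending.append(item)


# Hypothetical category data; the helpers are not given in the post.
SUBTABLES = {
    "Carbohydrate": {"Fibre", "Starches", "Sugars"},
    "Vitamins": {"Ascorbic Acid", "Niacin"},
}


def is_subtable_leader(name):
    return name in SUBTABLES


def is_valid_subtable_element(subtable, name):
    # "total" is always accepted, as assumed in the text above.
    return name == "total" or name in SUBTABLES[subtable]


def parse(raw_table):
    table = {}
    subtable = ''
    lines = Pushback(raw_table.splitlines())
    for line in lines:
        name, unit, quant = line.strip().rsplit(None, 2)
        if subtable:
            if is_valid_subtable_element(subtable, name):
                table[subtable][name] = quant + unit
            else:
                subtable = ''
                lines.push(line)  # stands in for: continue with line
                continue
        else:
            if is_subtable_leader(name):
                subtable = name
                table[subtable] = {}
                # stands in for: continue with 'total <unit> <quant>'
                lines.push('total {} {}'.format(unit, quant))
                continue
            else:
                table[name] = quant + unit
    return table


RAW = """Protein g 5
Carbohydrate g 50
Fibre g 10
Starches g 20
Sugars g 20
Vitamins mg 100
Ascorbic Acid mg 50
Niacin mg 50"""

parsed = parse(RAW)
```

The push-then-continue pair is exactly the two-step dance that
`continue with` would fold into one statement, without needing the
wrapper class or a named iterator.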

The flow control is cleaner, more intuitive to read IMO, and there is
less call for the definition of special flow-control classes, functions
or coroutines. In my opinion, anything that removes the need for custom
classes and functions *for the purpose of flow control and readability*
is an improvement to the language.

Now, as indicated above, `continue with` does not merely repeat the
current iteration; you can dynamically generate the item for the next
iteration. In the above example, that changed a line like "Carbohydrate g
50" into "total g 50" for use in the subtable iteration. More creative
uses of dynamic iterable injection will surely present themselves with
further thought.
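
To make the injection idea concrete, here is a rough emulation using a
generator and an explicit send(), which is one possible reading of the
semantics and not a settled design. The name `injectable` and the sample
lines are made up for this sketch, and it supports at most one injection
per loop iteration.

```python
def injectable(iterable):
    """Yield items from iterable; a value passed to send() is yielded next."""
    for item in iterable:
        injected = yield item
        while injected is not None:
            # This bare yield is what the send() call itself returns;
            # the caller ignores it.
            yield
            # The loop's following next() call receives the injected item.
            injected = yield injected


seen = []
lines = injectable(["Carbohydrate g 50", "Fibre g 10"])
for line in lines:
    seen.append(line)
    name, unit, quant = line.rsplit(None, 2)
    if name == "Carbohydrate":
        # stands in for: continue with 'total <unit> <quant>'
        lines.send("total {} {}".format(unit, quant))
        continue
```

Here `seen` ends up holding the original leader line, then the injected
"total g 50" line, then the next real line, which is the ordering
`continue with` is meant to give.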

Sorry for the poor clarity last night, and perhaps today; I'm recovering
from illness and distracted by various other things. :)
Thanks for your feedback and thoughts!

Cathal

On 25/09/14 04:51, Nathaniel Smith wrote:
> On Thu, Sep 25, 2014 at 2:50 AM, Nathaniel Smith <njs at pobox.com> wrote:
>> The most elegant solution I know is:
>>
>> class PushbackAdaptor:
>>     def __init__(self, iterable):
>>         self.base = iter(iterable)
>>         self.stack = []
>>
>>     def __iter__(self):
>>         return self
>>
>>     def __next__(self):
>>         if self.stack:
>>             return self.stack.pop()
>>         else:
>>             return next(self.base)
>>
>>     def pushback(self, obj):
>>         self.stack.append(obj)
>>
>> it = PushbackAdaptor(character_source)
>> for char in it:
>>     ...
>>     if state is IDENTIFIER and char not in IDENT_CHARS:
>>         state = NEW_TOKEN
>>         it.pushback(char)
>>         continue
>>     ...
>>
>> In modern Python, I think the natural meaning for 'continue with' wouldn't
>> be to special-case something like this. Instead, where 'continue' triggers a
>> call to 'it.next()', I'd expect 'continue with x' to trigger a call to
>> 'it.send(x)'. I suspect this might enable some nice idioms in coroutiney
>> code, though I'm not very familiar with such.
> 
> In fact, given the 'send' definition of 'continue with x', the above
> tokenization code would become simply:
> 
> def redoable(iterable):
>     for obj in iterable:
>         while (yield obj) == "redo":
>             pass
> 
> for char in redoable(character_source):
>     ...
>     if state is IDENTIFIER and char not in IDENT_CHARS:
>         state = NEW_TOKEN
>         continue with "redo"
>     ...
> 
> which I have to admit is fairly sexy.
> 

-- 
Twitter: @onetruecathal, @formabiolabs
Phone: +353876363185
Blog: http://indiebiotech.com
miniLock.io: JjmYYngs7akLZUjkvFkuYdsZ3PyPHSZRBKNm6qTYKZfAM