Jacob Holm wrote:
I would like to split off a function for parsing a single element. And I would like it to look like this:
    def parse_elem():
        opening_tag = yield
        name = opening_tag[1:-1]
        items = yield from parse_items("</%s>" % name)
        return (name, items)
I don't see what you gain by writing it like that, though. You don't even know whether you want to call this function until you've seen the first token and realized that it's a tag.
In other words, you need a one-token lookahead. A more conventional parser would use a scanner that lets you peek at the next token without absorbing it, but that's not an option when you're receiving the tokens via yield, so another solution must be found.
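For comparison, here is a minimal sketch of the conventional arrangement, where the parser pulls tokens from a scanner and can peek at the next one without consuming it. The Scanner class and its peek()/next() methods are hypothetical names made up for this illustration, and the sketch assumes well-formed input with tokens such as "<a>", "text" and "</a>".

```python
class Scanner:
    """Hypothetical pull-style scanner with one-token lookahead."""

    _EMPTY = object()  # marker meaning "no token buffered"

    def __init__(self, tokens):
        self._tokens = iter(tokens)
        self._peeked = Scanner._EMPTY

    def peek(self):
        # Fetch the next token if needed, but keep it buffered.
        if self._peeked is Scanner._EMPTY:
            self._peeked = next(self._tokens)
        return self._peeked

    def next(self):
        # Return the buffered (or freshly fetched) token and consume it.
        token = self.peek()
        self._peeked = Scanner._EMPTY
        return token


def parse_elem(scanner):
    # The parser decides for itself when to consume the opening tag.
    opening_tag = scanner.next()
    name = opening_tag[1:-1]
    closing_tag = "</%s>" % name
    items = []
    while scanner.peek() != closing_tag:
        if scanner.peek().startswith("<"):
            # Peeking showed a nested tag; it is still unconsumed,
            # so the recursive call can read it as its opening tag.
            items.append(parse_elem(scanner))
        else:
            items.append(scanner.next())
    scanner.next()  # consume the closing tag
    return (name, items)


tree = parse_elem(Scanner(["<a>", "x", "<b>", "y", "</b>", "</a>"]))
```

The point of the contrast is that peek() only works because the parser is the one asking for tokens. When tokens arrive via yield, a token that has been received cannot be put back.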
The solution I chose was to keep the lookahead token as state in the parsing functions, and pass it to wherever it's needed. Your parse_elem() function clearly needs it, so it should take it as a parameter.
If there's some circumstance in which you know for certain that there's an elem coming up, you can always write another parsing function for dealing with that, e.g.
    def expect_elem():
        first = yield
        return (yield from parse_elem(opening_tag = first))
I don't think there's anything inconvenient about that.
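Put together, the approach described above might look roughly like the following sketch. The names parse_elem, parse_items and expect_elem come from this thread; is_opening_tag() and run() are hypothetical helpers added only so the example is self-contained, and the token format ("<a>", "text", "</a>") is an assumption.

```python
def is_opening_tag(token):
    # Hypothetical helper: "<a>" is an opening tag, "</a>" is not.
    return token.startswith("<") and not token.startswith("</")


def parse_elem(opening_tag):
    # The caller has already received the opening tag (the lookahead
    # token), so it is passed in rather than fetched with a yield.
    name = opening_tag[1:-1]
    items = yield from parse_items("</%s>" % name)
    return (name, items)


def parse_items(closing_tag):
    items = []
    while True:
        token = yield
        if token == closing_tag:
            return items
        if is_opening_tag(token):
            # Only now do we know an elem is coming; hand the token on.
            items.append((yield from parse_elem(token)))
        else:
            items.append(token)


def expect_elem():
    # For contexts where an elem is known to be next.
    first = yield
    return (yield from parse_elem(opening_tag=first))


def run(parser, tokens):
    # Hypothetical driver: prime the generator, then push tokens in.
    next(parser)
    try:
        for token in tokens:
            parser.send(token)
    except StopIteration as stop:
        return stop.value
    raise ValueError("input ended before the parser finished")


tree = run(expect_elem(), ["<a>", "x", "<b>", "y", "</b>", "</a>"])
# tree is ('a', ['x', ('b', ['y'])])
```

Note that parse_items() has to look at each token before it can tell whether to delegate, which is why parse_elem() takes the token as a parameter instead of receiving it itself.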
A convention like Nick suggested where all coroutines take an optional "start" argument with the first value to yield doesn't help, because it is not the value to yield that is the problem.
I think you've confused the issue a bit yourself, because you started out by asking for a way of specifying the first value to yield in the yield-from expression. But it seems that what you really want is to specify the first value to send into the subiterator.
I haven't seen anything so far that convinces me it would be a serious inconvenience not to have such a feature.
Also, it doesn't seem to generalize. What if your parser needs a two-token lookahead? Then you'll be asking for a way to specify the first two values to send in. Where does it end?