Simple text parsing gets difficult when line continues to next line
John Machin
sjmachin at lexicon.net
Tue Nov 28 16:29:16 EST 2006
Tim Hochberg wrote:
[snip]
> I agree that mixing the line assembly and parsing is probably a mistake,
> although using next explicitly is fine as long as you're careful with it.
> For instance, I would be wary of using the mixed for-loop/next strategy
> that some of the previous posts suggested. Here's a different,
> generator-based implementation of the same idea that, for better or for
> worse, is considerably less verbose:
>
[snip]
Here's a somewhat less verbose version of the state machine gadget.
def continue_join_3(linesin):
    linesout = []
    buff = ""
    pending = 0
    for line in linesin:
        # remove *all* trailing whitespace
        line = line.rstrip()
        if line.endswith('_'):
            buff += line[:-1]
            pending = 1
        else:
            linesout.append(buff + line)
            buff = ""
            pending = 0
    if pending:
        raise ValueError("last line is continued: %r" % line)
    return linesout
FWIW, it works all the way back to Python 2.1
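For comparison, the same state machine could also be written as a generator, so that joined lines come out one at a time instead of being collected in a list. This is only a sketch: the name continue_join_gen and the sample input are made up for illustration, and since it uses yield it will not run on Pythons as old as 2.1.

```python
def continue_join_gen(linesin):
    """Yield logical lines, joining physical lines that end in '_'.

    Hypothetical generator variant of continue_join_3 (illustration only).
    """
    buff = ""
    pending = 0
    for line in linesin:
        # remove *all* trailing whitespace, including the newline
        line = line.rstrip()
        if line.endswith('_'):
            # continuation: keep the text before the '_' and wait for more
            buff += line[:-1]
            pending = 1
        else:
            # logical line is complete: hand it to the caller
            yield buff + line
            buff = ""
            pending = 0
    if pending:
        raise ValueError("last line is continued: %r" % line)


# Made-up sample input: the first logical line spans two physical lines.
sample = ["x = 1 + _\n", "2\n", "y = 3\n"]
result = list(continue_join_gen(sample))
print(result)  # ['x = 1 + 2', 'y = 3']
```

One difference worth noting: with the generator, the ValueError for a dangling continuation is only raised once the caller has consumed everything before it, whereas continue_join_3 raises before returning anything.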
Cheers,
John