In need of a binge-and-purge idiom
Magnus Lie Hetland
mlh at furu.idi.ntnu.no
Sun Mar 23 13:20:57 EST 2003
Maybe not the best name, but it somehow describes what's going on...
So...
I've noticed that I use the following in several contexts:
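
  chunk = []
  for element in iterable:
      if isSeparator(element) and chunk:
          doSomething(chunk)
          chunk = []
      elif not isSeparator(element):
          # collect ordinary elements into the current chunk
          chunk.append(element)
  if chunk:
      # flush whatever is left after the last separator
      doSomething(chunk)
      chunk = []
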
If the iterable above is a file, isSeparator(element) is simply
defined as not element.strip(), and doSomething(chunk) is
yield(''.join(chunk)), then you have a paragraph splitter. I've been
using the same approach for slightly more complicated parsing recently.
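
To make the paragraph-splitter case concrete, here is a minimal sketch of it as a generator. The name paragraphs and the sample input are just for illustration, and the sketch assumes a current Python where generators need no special import:

```python
def paragraphs(lines):
    """Yield each run of non-blank lines as one joined string."""
    chunk = []
    for line in lines:
        if not line.strip():      # a blank line acts as the separator
            if chunk:
                yield ''.join(chunk)
                chunk = []
        else:
            chunk.append(line)
    if chunk:                     # the duplicated check at the end
        yield ''.join(chunk)


sample = ["first\n", "para\n", "\n", "\n", "second\n"]
print(list(paragraphs(sample)))   # ['first\npara\n', 'second\n']
```

Any iterable of lines works here, including an open file object.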
However, the extra check at the end (i.e. the duplication) is a bit
ugly. A solution would be:

  ...
  for element in iterable + separator:
      ...

but that isn't possible, of course. (It could be possible with some
fiddling with itertools etc., I guess.)
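
As a rough sketch of that itertools fiddling: itertools.chain can tack one extra separator onto the end of any iterable, so the final flush happens inside the loop. The names SEPARATOR and chunks are hypothetical, and the sentinel object is my own assumption rather than anything the idiom requires:

```python
from itertools import chain

SEPARATOR = object()  # hypothetical sentinel marking the end of input


def chunks(iterable, is_separator):
    """Yield lists of consecutive non-separator elements."""
    chunk = []
    # chain() appends one trailing separator to the iterable
    for element in chain(iterable, [SEPARATOR]):
        if element is SEPARATOR or is_separator(element):
            if chunk:
                yield chunk
                chunk = []
        else:
            chunk.append(element)


print(list(chunks(["a", "b", "", "c"], lambda s: not s)))  # [['a', 'b'], ['c']]
```

For a file of lines, chaining on a single blank line, as in chain(f, ['\n']), would probably do the same job without a dedicated sentinel.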
If it were possible to check whether the iterator extracted from the
iterable was at an end, that could help too -- but I see no elegant
way of doing it.
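
One possible way to get at the "is the iterator at an end" information is a small wrapper that looks one element ahead and pairs each element with a flag. This is only a sketch; with_last_flag and chunks_with_flag are made-up names, and it assumes a current Python with the built-in next():

```python
def with_last_flag(iterable):
    """Yield (element, is_last) pairs using one element of lookahead."""
    it = iter(iterable)
    try:
        previous = next(it)
    except StopIteration:
        return                    # empty input: nothing to yield
    for element in it:
        yield previous, False
        previous = element
    yield previous, True


def chunks_with_flag(iterable, is_separator):
    """Yield chunks, flushing on a separator or on the last element."""
    chunk = []
    for element, is_last in with_last_flag(iterable):
        if not is_separator(element):
            chunk.append(element)
        if (is_separator(element) or is_last) and chunk:
            yield chunk
            chunk = []


print(list(chunks_with_flag(["a", "", "b", "c"], lambda s: not s)))
# [['a'], ['b', 'c']]
```

Whether this counts as elegant is debatable, since the duplication has only moved into the wrapper.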
I can't really see any good way of using the while/break idiom either,
without resorting to explicit iterator pumping and a Boolean flag
(which isn't really all that elegant...):
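
  it = iter(iterable)
  chunk = []
  done = False
  while not done:
      try:
          element = it.next()
      except StopIteration:
          # input exhausted: feed in one artificial separator
          done = True
          element = SomeSeparator()
      if isSeparator(element) and chunk:
          doSomething(chunk)
          chunk = []
      elif not isSeparator(element):
          chunk.append(element)
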
This seems far too wordy and clunky.
An alternative is:
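
  it = iter(iterable)
  chunk = []
  while True:
      try:
          try:
              element = it.next()
          except StopIteration:
              element = SomeSeparator()
              break
      finally:
          # runs on every pass, including the one that breaks
          if isSeparator(element) and chunk:
              doSomething(chunk)
              chunk = []
          elif not isSeparator(element):
              chunk.append(element)
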
But this stuff is really just as bad as (or even quite a bit worse
than) the version with duplication.
I just thought I'd ask whether anyone can think of a more elegant way
of handling this sort of thing.
--
Magnus Lie Hetland "Nothing shocks me. I'm a scientist."
http://hetland.org -- Indiana Jones