In need of a binge-and-purge idiom

Magnus Lie Hetland mlh at furu.idi.ntnu.no
Sun Mar 23 13:20:57 EST 2003


Maybe not the best name, but it somehow describes what's going on...
So...

I've noticed that I use the following in several contexts:

  chunk = []
  for element in iterable:
      if isSeparator(element):
          if chunk:
              doSomething(chunk)
              chunk = []
      else:
          chunk.append(element)   # collect non-separator elements
  if chunk:
      doSomething(chunk)          # flush whatever is left at the end
      chunk = []

If the iterable above is a file, isSeparator(element) is simply
defined as not element.strip(), and doSomething(chunk) is
yield ''.join(chunk), then you have a paragraph splitter. I've been
using the same approach for slightly more complicated parsing recently.
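
A minimal sketch of the paragraph splitter described above, written as
a generator. The function name paragraphs and the sample input are
illustrative assumptions, not part of the original code:

```python
def paragraphs(lines):
    """Yield each run of non-blank lines as one joined string."""
    chunk = []
    for line in lines:
        if not line.strip():      # a blank line acts as the separator
            if chunk:
                yield ''.join(chunk)
                chunk = []
        else:
            chunk.append(line)
    if chunk:                     # the duplicated flush at the end
        yield ''.join(chunk)


# Hypothetical input: any iterable of lines works, e.g. an open file.
sample = ["first\n", "para\n", "\n", "\n", "second\n"]
result = list(paragraphs(sample))
```

Consecutive blank lines are collapsed here, because a separator only
triggers a flush when the chunk is non-empty.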

However, the extra check at the end (i.e. the duplication) is a bit
ugly. A solution would be:

  ...
  for element in iterable + separator:
      ...

but that isn't possible for an arbitrary iterable, of course, since
iterators don't support +. (It could be possible with some fiddling
with itertools etc., I guess.)
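
One way the itertools fiddling might look, as a sketch only: it uses
itertools.chain to append a single separator to the input so that the
last chunk is flushed inside the loop. The names chunks, is_separator
and sentinel are assumptions made for illustration, and the caller has
to supply a sentinel for which is_separator returns true:

```python
from itertools import chain


def chunks(iterable, is_separator, sentinel):
    """Yield lists of consecutive non-separator elements.

    `sentinel` must satisfy is_separator(sentinel); it is chained onto
    the end of the input so no flush is needed after the loop.
    """
    chunk = []
    for element in chain(iterable, [sentinel]):
        if is_separator(element):
            if chunk:
                yield chunk
                chunk = []
        else:
            chunk.append(element)
```

The cost of this approach is that a suitable separator value has to
exist and be easy to construct, which may not hold for every parser.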

If it were possible to check whether the iterator extracted from the
iterable was at an end, that could help too -- but I see no elegant
way of doing it.
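
For what it's worth, an end-of-iterator check can be approximated by
reading one element ahead. The following is a rough sketch under that
assumption; with_last_flag and chunks_lookahead are made-up names, not
an existing library API:

```python
def with_last_flag(iterable):
    """Yield (element, is_last) pairs by reading one element ahead."""
    it = iter(iterable)
    try:
        previous = next(it)
    except StopIteration:
        return                    # empty input: nothing to yield
    for current in it:
        yield previous, False
        previous = current
    yield previous, True


def chunks_lookahead(iterable, is_separator):
    """Yield chunks, flushing on a separator or on the last element."""
    chunk = []
    for element, is_last in with_last_flag(iterable):
        if not is_separator(element):
            chunk.append(element)
        if chunk and (is_last or is_separator(element)):
            yield chunk
            chunk = []
```

This removes the duplicated flush, but only by moving the bookkeeping
into a helper, so whether it counts as elegant is debatable.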

I can't really see any good way of using the while/break idiom either,
without resorting to explicit iterator pumping and a Boolean flag
(which isn't really all that elegant...):

  it = iter(iterable)
  chunk = []
  done = False
  while not done:
      try:
          element = next(it)
      except StopIteration:
          done = True
          element = SomeSeparator()   # fake separator to force a flush
      if isSeparator(element):
          if chunk:
              doSomething(chunk)
              chunk = []
      else:
          chunk.append(element)

This seems far too wordy and clunky.

An alternative is:

  it = iter(iterable)
  chunk = []
  while True:
      try:
          try:
              element = next(it)
          except StopIteration:
              element = SomeSeparator()
              break               # the finally clause still runs first
      finally:
          if isSeparator(element):
              if chunk:
                  doSomething(chunk)
                  chunk = []
          else:
              chunk.append(element)

But this stuff is really just as bad as (or even quite a bit worse
than) the version with duplication.

I just thought I'd ask whether anyone can think of a more elegant way
of handling this sort of thing.

-- 
Magnus Lie Hetland               "Nothing shocks me. I'm a scientist." 
http://hetland.org                                   -- Indiana Jones



