How to do this in Python...

Martin Maney maney at pobox.com
Mon Jan 27 00:06:16 CET 2003


Carl Banks <imbosol at vt.edu> wrote:
> If you had said "Python is incomplete because you can't do this
> set-and-test idiom" instead of "Python is incomplete because it
> doesn't allow assignment inside if," you would have gotten less
> ideology.

That proves not to be the case.  This is an issue which reliably causes
knees to jerk, and chins to wag without benefit of much if any thought
process.  For example, I see the "operator= is a proven source of
trouble" meme even in the incomplete sample of the current crop that
I've read, even though the ONLY citation for this that anyone could
come up with last time proved not to be saying that at all, at all.

I do believe this is a religious issue.  Nevertheless, I shall attempt
to summarize my understanding of the issue in as secular a fashion as I
can manage.


There were some reasonable ideas that came out; for this specific case
wrapping the re in a lightweight class that can run the match, saving
the result as it is returned and making it available later is IMO
probably the most natural.  It adds a fair bit of overhead if you
hadn't been pre-compiling the regexps, though.

  pat1 = REwrap(r'some pattern')
  pat2 = REwrap(r'another pattern')


  if pat1.match(targetString):
      ...  # access results as pat1.group(..), etc.
  elif pat2.match(targetString):
      ...
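
For concreteness, a minimal sketch of what such a wrapper might look
like.  Only the name REwrap and the calls match() and group() appear
above; the class body, the classify() helper and its two patterns are
assumptions made up for illustration.

```python
import re


class REwrap:
    """Hypothetical wrapper: compiles a pattern and remembers the last match."""

    def __init__(self, pattern, flags=0):
        self._regex = re.compile(pattern, flags)
        self._last = None  # result of the most recent match() call

    def match(self, text):
        # Run the match, save the result, and return it so it can be tested.
        self._last = self._regex.match(text)
        return self._last

    def group(self, *args):
        # Delegate to the saved match object; complain if the last match failed.
        if self._last is None:
            raise ValueError("no successful match to read groups from")
        return self._last.group(*args)


def classify(target):
    # Illustrative patterns only; any pair of regexps would do.
    pat1 = REwrap(r'(\d+)\s*$')
    pat2 = REwrap(r'([A-Za-z]+)\s*$')
    if pat1.match(target):
        return ('number', pat1.group(1))
    elif pat2.match(target):
        return ('word', pat2.group(1))
    return ('unknown', None)
```

Each wrapper carries its own saved result, so nothing is shared between
patterns.  Compiling in the constructor is where the overhead mentioned
above comes from if the patterns were not being pre-compiled already.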


The hack that's a recipe in the Cookbook seems to me to be a very bad
choice.  It's convenient, but if you're worried about the risk of
making a mess with operator=, it seems to me inconsistent to the point
of lunacy to adopt a supposed fix that introduces a global, shared
state like that.  (Recipe 1.9 in the book; I don't find it easily in
the online cookbook, but that chapter is online at O'Reilly's site.)
The danger can be reduced by using it without the shared global state,
but then it loses much of its convenience.  All in all, a design I
think we'd do best to forget.
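
To make the shared-state risk concrete, here is a sketch using a
hypothetical holder class.  It stands in for the general approach and is
not copied from the book; the names DataHolder, shared, has_digits and
the two first_word functions are all assumptions for illustration.

```python
import re


class DataHolder:
    """Holds one value; set() stores it and returns it so it can be tested."""

    def __init__(self, value=None):
        self.value = value

    def set(self, value):
        self.value = value
        return value

    def get(self):
        return self.value


shared = DataHolder()  # module-level instance: the shared state at issue


def has_digits(text):
    # An unrelated helper that happens to use the same shared holder.
    return shared.set(re.search(r'\d+', text)) is not None


def first_word_shared(text):
    if shared.set(re.match(r'(\w+)', text)):
        has_digits(text)          # silently overwrites shared.value
        m = shared.get()          # no longer the match that was tested
        return m.group(0) if m else None
    return None


def first_word_local(text):
    holder = DataHolder()         # private holder: safe, but one more line per use
    if holder.set(re.match(r'(\w+)', text)):
        has_digits(text)
        return holder.get().group(0)
    return None
```

With the shared holder, first_word_shared('abc 123') hands back the
digits found by the helper rather than the first word.  The local
variant avoids that, at the cost of creating a holder at every use.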


The suggestion to make a list of (pattern, action) pairs might be a
good one.  It seems to me to be overelaborate if there are only a few
cases, and of course it's much harder to apply in the general case
where the switch needs more variety in its tests.  If the actions are
mostly simple but rely on the surrounding context then that may also
make the list of lists less attractive.  Even when it's a good fit,
it's hard to believe anyone could argue that this will be obvious to
someone who comes on the code cold.

One drawback to this is that the action must either fit into a lambda
(and some folks will prefer not to use lambda under any circumstances,
which seems odd to me but if it makes them uncomfortable then it's best
not to push) or else it must be written at a point physically distant
from the pattern with which it is associated.  This is a real step
backwards: one of the sterling virtues of the if:elif:...else structure
is that it makes the connection immediately obvious.  As Kernighan and
Plauger said of one such construct, "For all its apparent complexity,
it is simply a seven-way case statement, with the code for each case
in-line instead of in a separate routine." (1)  In place of the simple
case structure, the list of lists requires, in general, something like

  def act1(match_result): ...
  def act2(match_result): ...

  cases = (
      (r'pattern one', act1),
      (r'pattern two', act2),
      ...
  )

I can't honestly imagine a situation where this would be an improvement
by any rational measure.  It disperses the pieces of the puzzle and
makes everything indirect and therefore at least a little mysterious. 
One might suggest that this is advantageous when sets of cases lead to
the same action, but the if:elif:... form can call shared functions as
easily as use inline code, so I can't see much motivation there, either.
I want there to be some circumstances in which this is clearly the best
approach, but so far I have not managed to think of any.  :-(
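
For completeness, the table also needs a loop to drive it.  A minimal
sketch, assuming made-up patterns and actions: the dispatch() function,
its default argument and the pattern strings are illustrative, and the
patterns are pre-compiled here, which the fragment above does not do.

```python
import re


def act1(match_result):
    return 'one: ' + match_result.group(1)


def act2(match_result):
    return 'two: ' + match_result.group(1)


# Each entry pairs a compiled pattern with the function that handles it.
cases = (
    (re.compile(r'one (\w+)'), act1),
    (re.compile(r'two (\w+)'), act2),
)


def dispatch(target, default=None):
    # Try each pattern in order; the first one that matches picks the action.
    for regex, action in cases:
        match_result = regex.match(target)
        if match_result:
            return action(match_result)
    return default
```

Note that three separate pieces (the actions, the table and the loop)
have to be read together to see what happens for any one pattern.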


This seems to me to be, very simply, an area where Python's
intentionally chosen limitations make it awkward to compose a clear and
robust solution, at least in the general case.  For the specific use
case suggested here, I favor either the simple Pythonic idiom when there
are only a few possibilities, or the lightweight wrapper class when
there are enough to make the rightward drift annoying.  In practice, in
the one application where I needed this I adopted the lightweight class
in all places, even those where there were only one or two patterns,
mostly to avoid a pointless variation in style in one body of code.
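
As a sketch of the rightward drift in question, here is the plain idiom
with three patterns.  The function name classify_nested and the patterns
are assumptions chosen only to show the shape of the code.

```python
import re


def classify_nested(target):
    # Each further pattern adds one more level of indentation.
    m = re.match(r'(\d+)$', target)
    if m:
        return ('int', m.group(1))
    else:
        m = re.match(r'([a-z]+)$', target)
        if m:
            return ('word', m.group(1))
        else:
            m = re.match(r'\s*$', target)
            if m:
                return ('blank', '')
            else:
                return ('unknown', target)
```

With two or three patterns this stays readable; each added case pushes
the remaining code one level further to the right.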


(1) _Software Tools_, 1976, p. 269




