[Python-Dev] Switch statement

Thu Jun 22 21:24:30 CEST 2006

At 11:52 AM 6/22/2006 -0700, Guido van Rossum wrote:
>On 6/22/06, Phillip J. Eby <pje at telecommunity.com> wrote:
>>I think one of the problems I sometimes have in communicating with you is
>>that I think out stuff from top to bottom of an email, and sometimes
>>discard working assumptions once they're no longer needed.  We then end up
>>having arguments over ideas I already discarded, because you find the
>>problems with them faster than I do, and you assume that those problems
>>carry through to the end of my message.  :)
>
>You *do* have a text editor that lets you go back to the top of the
>draft to remove discarded ideas, don't you? :-)

Well, usually the previous idea seems an essential part of figuring out the 
new idea, and showing why the new idea is better.  At least the way I think 
about it.  But now that I've noticed this seems to be a recurring theme in 
our discussions, I'll try to be more careful.

>It's a reasonable form of discourse to propose an idea only to shoot
>it down, but usually this is introduced by some phrase that hints to
>the reader what's going to happen. You can't expect the reader to read
>the entire email before turning on their brain. :)

Well, you can't expect me to know ahead of time what ideas I'm going to 
discard before I've had the ideas that will replace them.  ;-)  But again, 
I'll be more careful in future about retroactively adding such warnings or 
removing the old ideas entirely.

>>1. "case (literal|NAME)" is the syntax for equality testing -- you can't
>>use an arbitrary expression, not even a dotted name.
>
>But dotted names are important! E.g. case re.DOTALL. And sometimes
>compile-time constant expressions are too. Example: case sys.maxint-1.

True - but at least you *can* use them, with "from re import DOTALL" and 
"maxint_less_1 = sys.maxint-1".  You're just required to disambiguate 
*when* the calculation of these values is to be performed.
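
A rough sketch of that disambiguation in ordinary Python, assuming a plain dict as a stand-in for the proposed case table (no switch statement exists to run). The names `maxsize_less_1`, `CASES` and `describe` are hypothetical, and `sys.maxsize` stands in for the `sys.maxint` of the Python 2 under discussion:

```python
import sys
from re import DOTALL  # the dotted name re.DOTALL, bound to a simple name

# The constant expression is computed here, visibly, exactly once.
maxsize_less_1 = sys.maxsize - 1

# Stand-in for "case NAME:" arms: every key is a plain name, never an expression.
CASES = {
    DOTALL: "re.DOTALL flag",
    maxsize_less_1: "one below the maximum",
}

def describe(value):
    """Look the value up in the case table; None plays the role of 'else'."""
    return CASES.get(value)
```

The point of the sketch is only that the import and the assignment show the reader where and when each value is computed.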

>>2. NAME, if used, must be bound at most once in its defining scope
>
>That's fine -- but doesn't extend to dotted names.

Right, hence #1.

>>3. Dictionary optimization can occur only for literals and names not bound
>>in the local scope, others must use if-then.
>
>So this wouldn't be optimized?!
>
>NL = "\n"
>for line in sys.stdin:
>  switch line:
>    "abc\n": ...
>    NL: ...

This would result in a switch dictionary with "abc\n" in it, preceded by an 
if line==NL test.  So it's half-optimized.  The more literals, the more 
optimized.  If you put the same switch in a function body, it becomes fully 
optimized if the NL binding stays outside the function definition.
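
A minimal sketch of what "half-optimized" could look like if written out by hand, assuming the switch sits in a function where NL is a local binding. The names `_LITERAL_CASES` and `classify` and the returned strings are hypothetical; this is not what any compiler actually emits:

```python
# Only the literal arm can go into the precomputed dictionary.
_LITERAL_CASES = {"abc\n": "abc line"}

def classify(line):
    NL = "\n"  # bound in the same scope as the switch

    # Same-scope name: an ordinary comparison, made on every call.
    if line == NL:
        return "newline only"

    # Literals: one dict lookup, however many literal arms there are.
    return _LITERAL_CASES.get(line, "no match")
```

Moving the NL binding out of the function would, under the proposal, let it join the dictionary as well.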

Note that you previously proposed a switch at top level not be optimized at 
all, so this is an improvement over that.

>I like it better than const declarations, but I don't like it as much
>as the def-time-switch-freezing proposal; I find the limitation to
>simple literals and names too restrictive, and there isn't anything
>else like that in Python.

Well, you can't "def" a dotted name, but I realize this isn't a binding.

>I also don't like the possibility that it
>degenerates to if/elif. I like predictability.

It is predictable: anything defined in the same scope will be if/elif, 
anything defined outside will be dict-switched.

>I like to be able to switch on dotted names.
>Also, when using a set in a case, one should be able to use an
>expression like s1|s2 in a case.

...which then gets us back to the question of when the dots or "|" are 
evaluated.  My proposal forces you to make the evaluation time explicit, 
visible, and unquestionably obvious in the source, rather than relying on 
invisible knowledge about the function definition time.

"First time use" is also a more visible approach, because it does not 
contradict the user's assumption that evaluation takes place where the 
expression appears.  The "invisible" assumption is only that subsequent 
execution will reuse the same expression results without recalculating them 
-- it doesn't *move* the evaluation somewhere else.

I seem to recall that in general, Python prefers to evaluate expressions in 
the order that they appear in source code, and that we try to preserve that 
property as much as possible.  Both the "names and literals only" and 
"first-time use" approaches preserve that property; "function definition 
time" does not.
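
To illustrate the difference in evaluation order, here is a small sketch that uses a default argument as an analogy for "function definition time" and a cache for "first-time use". Both are stand-ins chosen for illustration, not the proposed switch semantics, and every name in it (`log`, `case_expr`, `at_def_time`, `at_first_use`) is hypothetical:

```python
log = []  # records when the "case expression" is actually evaluated

def case_expr(tag):
    log.append(tag)
    return {1, 2} | {3}  # something like s1|s2 in a case

log.append("before defs")

def at_def_time(x, _case=case_expr("def-time")):
    # The default was evaluated when the def statement ran,
    # not here where it appears to be used.
    return x in _case

_cache = {}

def at_first_use(x):
    # Evaluated where it appears, on the first call only, then reused.
    if "case" not in _cache:
        _cache["case"] = case_expr("first-use")
    return x in _cache["case"]

log.append("after defs")

at_def_time(1)
at_first_use(1)
at_first_use(2)
```

After this runs, the log shows the def-time evaluation happening between the two marker entries, before any call is made, while the first-use evaluation happens only once a call reaches the expression.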

Of course, it's up to you to weigh the cost and benefit; I just wanted to 
bring this one specific factor (transparency of the source) to your 
attention.  This whole "const" thread was just me trying to find another 
approach besides "first-time use" that preserves that visibility property 
for readers of the code.