[Python-Dev] Switch statement

Mon Jun 19 21:47:26 CEST 2006

On 6/19/06, Raymond Hettinger <rhettinger at ewtllc.com> wrote:
> [...] Can we conclude that arbitrary expressions are
> fine for the switch value but that the case values must be constants?

That's too strong I believe. If some or all of the cases are arbitrary
expressions the compiler should try to deal. (Although we might have
to add a rule that if more than one case matches there's no guarantee
which branch is taken.)

In particular I expect that named constants are an important use case
(e.g. all of sre_compile.py uses names to compare the op with). The
compiler can't really tell with any degree of certainty that a name
won't ever be rebound (it would take a pretty smart global code
analyzer to prove that).
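
To make the rebinding concern concrete, here is a minimal sketch in
today's Python, with a dict standing in for a compiler-built lookup
table. The names LITERAL and BRANCH and the handlers are hypothetical
stand-ins in the style of sre_compile.py, not its actual code. It
only shows how a table frozen at build time and an if/elif chain can
disagree once a name is rebound.

```python
# Hypothetical opcode names (assumptions for illustration only).
LITERAL = 1
BRANCH = 2

def handle_literal():
    return "literal"

def handle_branch():
    return "branch"

# The table captures the *values* the names had when it was built.
TABLE = {LITERAL: handle_literal, BRANCH: handle_branch}

def dispatch_frozen(op):
    # Hash-table style dispatch: later rebinding of LITERAL is invisible here.
    handler = TABLE.get(op)
    return handler() if handler is not None else "default"

def dispatch_chain(op):
    # if/elif style dispatch: the global names are looked up on every call.
    if op == LITERAL:
        return handle_literal()
    elif op == BRANCH:
        return handle_branch()
    return "default"

before = (dispatch_frozen(1), dispatch_chain(1))
LITERAL = 99  # rebinding a "constant" after the table was built
after = (dispatch_frozen(1), dispatch_chain(1))
```

Before the rebinding both functions agree; afterwards the frozen
table still maps 1 to the literal handler while the chain no longer
does.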

> That would neatly dispense with some proposed hypergeneralizations and
> keep the discussion focused.
>
>
> >>  Given:
> >>
> >>  switch x:
> >>     case 1:  one()
> >>     case 2:  two()
> >>     case 3:  three()
> >>     default:  too_many()
> >>
> >> Do we require that x be hashable so that the compiler can use a lookup
> >> table?
> >
> >
> > That's a good question. We could define switch/case in terms of a hash
> > table created by the compiler, and then raising an exception if x is
> > unhashable is fair game.
>
>
> +1
>
> > Or we could define it in terms of successive
> > '==' comparisons, and then the compiler would have to create code for
> > a slow path in case x is unhashable.
>
> Too perilous.  I would not like to put us in a position of generating
> duplicate code or funky new opcodes for the case suites.  Also, it is
> better for the user to know that __hash__ is going to be called, that
> the default-clause will execute when the key is not found, and that a
> KeyError would be raised if x is unhashable.  This is simple,
> explainable, consistent behavior.  Besides, if we've agreed that the
> case values are required to be constants, then there isn't much in the
> way of use cases for x being unhashable.

Well, the hypothetical use case is one where we have an arbitrary
object of unknown origin or type, and we want to special-case
treatment for a few known values.

I wonder if there should be two default clauses, or some other
syntactic way to indicate whether we expect all x to be hashable?

OTOH maybe doing the simplest thing that could possibly work is the
right thing here, so I'm not going to push back hard. I guess
practicality beats purity and all that.

Actually there are quite a few Zen of Python rules that endorse the
view that requiring x to be hashable is Pythonic, so I'm being swayed
as I write this. ;-)
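
The two candidate semantics can be compared with a rough sketch in
today's Python. A dict stands in for the proposed compiler-built
table; switch_hashed, switch_compared, IntLike and outcome are
hypothetical names invented for this illustration, and none of this
is a proposed implementation.

```python
def one():
    return "one"

def two():
    return "two"

def too_many():
    return "too many"

# Stand-in for the lookup table a compiler might build from the case values.
CASES = {1: one, 2: two}

def switch_hashed(x):
    # Hash-table semantics: x is hashed first, so an unhashable x raises.
    return CASES.get(x, too_many)()

def switch_compared(x):
    # Successive '==' semantics: no hashing, the first equal case wins.
    for value, suite in CASES.items():
        if x == value:
            return suite()
    return too_many()

class IntLike:
    """Hypothetical object that compares equal to an int but is unhashable."""
    def __init__(self, n):
        self.n = n
    def __eq__(self, other):
        return self.n == other
    __hash__ = None  # explicitly unhashable

def outcome(fn, x):
    # Report the exception type instead of propagating it, for comparison.
    try:
        return fn(x)
    except TypeError:
        return "TypeError"
```

With these definitions, outcome(switch_hashed, IntLike(2)) reports a
TypeError (that is what a dict lookup raises for an unhashable key in
current Python), while outcome(switch_compared, IntLike(2)) runs the
suite for 2. An unhashable list falls through to the default under
comparison semantics and raises under hash semantics.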

> > I don't think I'm in favor of
> > always taking the default path when x is unhashable; that would cause
> > some surprises if an object defines __eq__ to be equal to ints (say)
> > but not __hash__.
>
>
> That would be unpleasant.
>
>
> >
> > Note that we currently don't have a strong test for hashable; it's
> > basically "if hash(x) doesn't raise an exception" which means that we
> > would have to catch this exception (or perhaps only TypeError) in
> > order to implement the slow path for the successive-comparisons
> > semantics.
> >
> > I note that C doesn't require any particular implementation for
> > switch/case; there's no rule that says the numbers must fit in an
> > array of pointers or anything like that. So I would be careful before
> > we define this in terms of hash tables. OTOH the hash table semantics
> > don't require us to commit to a definition of hashable, which is an
> > advantage.
> >
> > How's that for a wishy-washy answer. :-)
> >
> Perfect.  Wishy-washy answers reflect an open mind and they contain the
> seeds of complete agreement.

Thanks. Lawyers have different reasons for being wishy-washy but among
geeks there can be clarity in wishy-washiness. :-)

> My thought is that we *should* define switching in terms of hash
> tables.  It builds off of existing knowledge and therefore has a near
> zero learning curve.  The implementation is straight-forward and there
> are none of the hidden surprises that we would have with
> fastpath/slowpath approaches which use different underlying magic
> methods and do not guarantee order of execution.

I'm not so sure about there being no hidden surprises. I betcha that
there are quite a few bits of code that currently use the if/elif
style and seem to beg for a switch statement, yet depend on the
ordering of the tests. A typical example would be to have one of the
earlier tests express an exception to a later test that is a range
test. (Surely we're going to support range tests... sre_compile.py
uses 'in' almost as often as 'is'.)
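
A small sketch of that kind of order dependence, using made-up status
codes rather than anything from sre_compile.py (the function names
and values are assumptions for illustration): the specific test has
to come before the range test, and swapping the two silently changes
the result.

```python
def classify_chain(code):
    # The earlier test is an exception to the later range test.
    if code == 404:
        return "not found"
    elif 400 <= code < 500:
        return "client error"
    elif code in (500, 503):
        return "server error"
    return "other"

def classify_reordered(code):
    # Same tests, different order: the range test now shadows the exception.
    if 400 <= code < 500:
        return "client error"
    elif code == 404:
        return "not found"  # never reached
    elif code in (500, 503):
        return "server error"
    return "other"
```

Here classify_chain(404) gives "not found" while
classify_reordered(404) gives "client error", so a switch that does
not guarantee the order of its tests could not replace the first
function as written.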

> If use cases eventually emerge for an alternative path using successive
> == comparisons, then it can always be considered and added later.  For
> now, YAGNI (neither the functionality, nor the implementation headaches,
> nor the complexity of explaining what it does under all the various cases).

I say, let someone give a complete implementation a try, and then try
to modify as much standard library code as possible to use it. Then
report back. That would be a very interesting experiment to do. (And
thanks for the pointer to sre_compile.py as a use case!)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)