[Python-Dev] Py3K: indirect coupling between raise and exception handler

Skip Montanaro skip@mojam.com (Skip Montanaro)
Fri, 10 Mar 2000 14:17:30 -0600 (CST)

    Guido> Skip, I'm not familiar with MySQLdb.py, and I have no idea what
    Guido> your example is about.  From the rest of the message I feel it's
    Guido> not about MySQLdb at all, but about string formatting, 

My apologies.  You're correct, it's really not about MySQLdb.  It's about
handling multiple error cases that are all raised as the same exception.

First, a more concrete example that just uses simple string formats:

    code                 exception
    "%s" % ("a", "b")    TypeError: 'not all arguments converted'
    "%s %s" % "a"        TypeError: 'not enough arguments for format string'
    "%(a)s" % ("a",)     TypeError: 'format requires a mapping'
    "%d" % {"a": 1}      TypeError: 'illegal argument type for built-in operation'

Let's presume hypothetically that it's possible to recover from some subset
of the TypeErrors that are raised, but not all of them.  Now, also presume
that the format strings and the tuple, string or dict literals I've given
above can be stored in variables (which they can).
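As a rough illustration, the four cases can be run from variables as
sketched below.  This assumes a current Python 3 interpreter rather than
the one under discussion, so the message text printed may not match the
strings in the table; only the exception class is expected to be the same.
The names CASES and try_format are made up for this sketch.

```python
# Format string / right-hand operand pairs taken from the table above.
CASES = [
    ("%s", ("a", "b")),
    ("%s %s", "a"),
    ("%(a)s", ("a",)),
    ("%d", {"a": 1}),
]


def try_format(fmt, arg):
    """Return (exception class name, message), or (None, result) on success."""
    try:
        return None, fmt % arg
    except Exception as exc:
        return type(exc).__name__, str(exc)


results = [try_format(fmt, arg) for fmt, arg in CASES]

for (fmt, arg), (name, text) in zip(CASES, results):
    # Message wording depends on the interpreter version.
    print("%r %% %r -> %s: %s" % (fmt, arg, name, text))
```

All four land in the same except clause, which is why the class alone
cannot tell them apart.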

If we wrap the code in a try/except statement, we can catch the TypeError
exception and try to do something sensible.  This is precisely the trick
that Andy Dustman uses in MySQLdb: first try expanding the format string
using a tuple as the RH operand, then try with a dict if that fails.

Unfortunately, as you can see from the above examples, there are four cases
that need to be handled.  To distinguish them currently, you have to compare
the message you get with the exception to string literals that are generally
defined in C code in the interpreter.  Here's what Andy's original code
looked like stripped of the MySQLdb-ese:

    try:
        x = format % tuple_generating_function(...)
    except TypeError:
        x = format % dict_generating_function(...)

That doesn't handle the first two cases above.  You have to inspect the
message that raise sends out:

    try:
        x = format % tuple_generating_function(...)
    except TypeError, m:
        if m.args[0] == "not all arguments converted": raise
        if m.args[0] == "not enough arguments for format string": raise
        x = format % dict_generating_function(...)

This comparison of except arguments with hard-coded strings (especially ones
the programmer has no direct control over) seems fragile to me.  If you
decide to reword the error message strings, you break someone's code.
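For anyone who wants to try the pattern, here is a minimal self-contained
sketch of the tuple-then-dict fallback.  It is written in Python 3 syntax
("except TypeError as m"), and expand, UNRECOVERABLE and the two argument
parameters are hypothetical stand-ins for the generating functions above.
The comparison is exactly as fragile as described: if an interpreter
release words a message differently from the literals listed here, the
re-raise is silently skipped and the dict is tried anyway.

```python
# Messages we treat as unrecoverable; these literals mirror the interpreter's
# wording at the time of writing and are not guaranteed to stay the same.
UNRECOVERABLE = (
    "not all arguments converted",
    "not enough arguments for format string",
)


def expand(fmt, tuple_args, dict_args):
    """Try positional expansion first, then fall back to a mapping."""
    try:
        return fmt % tuple_args
    except TypeError as m:
        # Fragile: depends on the exact message text.
        if m.args and m.args[0] in UNRECOVERABLE:
            raise
        return fmt % dict_args


print(expand("%s and %s", ("a", "b"), {}))
print(expand("%(a)s", ("x",), {"a": "x"}))
```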

In my previous message I suggested collecting this fragility in the
exceptions module where it can be better isolated.  My solution is a bit
cumbersome and could probably be cleaned up somewhat, but it basically looks
like this:

    try:
        x = format % tuple_generating_function(...)
    except TypeError, m:
        import exceptions
        msg_case = exceptions.message_map.get((TypeError, m.args),
                                              exceptions.UNKNOWN_ERROR_CATEGORY)
        # punt on the cases we can't recover from
        if msg_case == exceptions.TYP_SHORT_FORMAT: raise
        if msg_case == exceptions.TYP_LONG_FORMAT: raise
        if msg_case == exceptions.UNKNOWN_ERROR_CATEGORY: raise
        # handle the one we can
        x = format % dict_generating_function(...)
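To make the idea concrete, here is one way such a table might look,
written as a standalone sketch rather than as part of the real exceptions
module (which has no message_map).  TYP_NEEDS_MAPPING and categorize are
hypothetical names added for illustration, and which message pairs with
TYP_SHORT_FORMAT versus TYP_LONG_FORMAT is a guess on my part.

```python
# Category constants.  TYP_NEEDS_MAPPING is a hypothetical addition.
TYP_SHORT_FORMAT = "TYP_SHORT_FORMAT"
TYP_LONG_FORMAT = "TYP_LONG_FORMAT"
TYP_NEEDS_MAPPING = "TYP_NEEDS_MAPPING"
UNKNOWN_ERROR_CATEGORY = "UNKNOWN_ERROR_CATEGORY"

# Keyed on (exception class, args tuple).  The message literals live in one
# place, so a rewording only has to be tracked here.
message_map = {
    (TypeError, ("not all arguments converted",)): TYP_SHORT_FORMAT,
    (TypeError, ("not enough arguments for format string",)): TYP_LONG_FORMAT,
    (TypeError, ("format requires a mapping",)): TYP_NEEDS_MAPPING,
}


def categorize(exc):
    """Map an exception instance to a category constant."""
    return message_map.get((type(exc), exc.args), UNKNOWN_ERROR_CATEGORY)


print(categorize(TypeError("format requires a mapping")))
print(categorize(TypeError("some message we have never seen")))
```

The caller compares against symbolic constants only; anything the table
does not know about falls into UNKNOWN_ERROR_CATEGORY and can be re-raised.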

In private email that crossed my original message, Andy suggested defining
more standard exceptions, e.g.:

    class FormatError(TypeError): pass
    class TooManyElements(FormatError): pass
    class TooFewElements(FormatError): pass

then raising the appropriate error based on the circumstance.  Code that
catches TypeError exceptions would still work.
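A small sketch of how that would play out, in Python 3 syntax.  The three
classes are Andy's; format_positional, classify and legacy_handler are
hypothetical helpers, and counting "%s" occurrences is a deliberate
simplification (it ignores "%%", other conversion types and mapping keys)
standing in for whatever the interpreter would really do.

```python
class FormatError(TypeError): pass
class TooManyElements(FormatError): pass
class TooFewElements(FormatError): pass


def format_positional(fmt, args):
    """Hypothetical formatter: handles only plain %s with a tuple of args."""
    wanted = fmt.count("%s")  # simplification, see the note above
    if len(args) > wanted:
        raise TooManyElements("not all arguments converted")
    if len(args) < wanted:
        raise TooFewElements("not enough arguments for format string")
    return fmt % args


def classify(fmt, args):
    """New-style caller: distinguishes cases by class, not by message."""
    try:
        return format_positional(fmt, args)
    except TooManyElements:
        return "too many"
    except TooFewElements:
        return "too few"


def legacy_handler(fmt, args):
    """Old-style caller: only knows about TypeError, and still works."""
    try:
        return format_positional(fmt, args)
    except TypeError:
        return "type error"


print(classify("%s", ("a", "b")))
print(classify("%s %s", ("a",)))
print(legacy_handler("%s", ("a", "b")))
```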

So there are two possible changes on the table:

    1. define more standard exceptions so you can distinguish classes of
       errors on a more fine-grained basis using just the first argument of
       the except clause.

    2. provide some machinery in exceptions.py to allow programmers a
       measure of uncoupling from using hard-coded strings to distinguish