[Python-Dev] PEP 463: Exception-catching expressions

Chris Angelico rosuav at gmail.com
Sat Feb 22 04:42:30 CET 2014


On Sat, Feb 22, 2014 at 6:04 AM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>> * seq[index] - no way to handle a bounds error
>
> We can add a 'list.get(index, default)' method, similar to
> 'Mapping.get'. It's far easier than introducing new
> syntax.

That fixes it for the list. Square brackets notation works for any
sequence. Are you proposing adding magic to the language so that any
class that defines a __getitem__ automatically has a .get() with these
semantics? Or, conversely, that defining __getitem__ requires that you
also explicitly define get()?
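To make that concrete, here's a minimal sketch. The class name Countdown is made up for this example; it stands in for any class that supports square brackets through __getitem__ alone, which a new list.get() would do nothing for.

```python
class Countdown:
    """Hypothetical sequence-like class: defines __getitem__ and __len__ only."""

    def __init__(self, start):
        self.start = start

    def __len__(self):
        return self.start

    def __getitem__(self, index):
        if not 0 <= index < self.start:
            raise IndexError(index)
        return self.start - index


seq = Countdown(3)
first = seq[0]                  # square brackets work
has_get = hasattr(seq, "get")   # but nothing gave it a .get() method

# The only generic way to supply a default today is the statement form.
try:
    missing = seq[10]
except IndexError:
    missing = None
```

Every class like this would need its own get() written and tested before the method-based approach covers it.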

> Inconvenience of dict[] raising KeyError was solved by
> introducing the dict.get() method. And I think that
>
> dct.get('a', 'b')
>
> is 1000 times better than
>
> dct['a'] except KeyError: 'b'
>
> I don't want to see this (or any other syntax) used by
> anyone.

Every separate method has to be written. That's code that has to be
tested, etc. Also, when you go searching for usage of something, you
have to cope with the fact that it can be spelled two different ways.
This is more noticeable with attributes:

print(spam.eggs)
# versus
print(getattr(spam, "eggs", "(no eggs)"))

The second one definitely doesn't look like it's retrieving spam.eggs,
either in plain text search or in an AST walk. I would like to see
getattr used primarily where the second argument isn't a literal, and
an exception-catching rig used when a default is wanted; that keeps
the "x.y" form predictable.
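A quick sketch of the AST side of that claim, using the stdlib ast module. The helper name attribute_names is just something I made up for this illustration.

```python
import ast


def attribute_names(source):
    """Return the names used in attribute-access (x.y) nodes in the source."""
    tree = ast.parse(source)
    return [node.attr for node in ast.walk(tree)
            if isinstance(node, ast.Attribute)]


direct = attribute_names("print(spam.eggs)")
via_getattr = attribute_names('print(getattr(spam, "eggs", "(no eggs)"))')
```

The first call finds an Attribute node for "eggs"; the second finds none, because the name is only a string argument to a function call.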

> Moreover, I think that explicit handling of IndexError is
> rather ugly and error prone; using len() usually
> reads better.
>>
>> Retrieve an argument, defaulting to None::
>>          cond = args[1] except IndexError: None
>>
>>          # Lib/pdb.py:803:
>>          try:
>>              cond = args[1]
>>          except IndexError:
>>              cond = None
>
>
> cond = None if (len(args) < 2) else args[1]

There's a distinct difference between those: one is LBYL and the other
is EAFP. Maybe it won't matter with retrieving arguments, but it will
if you're trying to pop from a queue in a multi-threaded program.
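Roughly what I mean, sketched with the stdlib queue module. The function names pop_lbyl and pop_eafp are invented for this example, and the race itself isn't reproduced here since that would need real threads.

```python
import queue


def pop_lbyl(q, default=None):
    # Look before you leap: another thread could empty the queue between
    # the check and the get, in which case get_nowait() raises queue.Empty.
    if q.empty():
        return default
    return q.get_nowait()


def pop_eafp(q, default=None):
    # Easier to ask forgiveness: a single operation, with the failure handled.
    try:
        return q.get_nowait()
    except queue.Empty:
        return default


q = queue.Queue()
q.put("job")
results = [pop_eafp(q), pop_eafp(q, "nothing")]
```

Single-threaded, the two behave identically; the difference only shows up when something else can touch the queue between the check and the get.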

>> Attempt a translation, falling back on the original::
>>          e.widget = self._nametowidget(W) except KeyError: W
>>
>>          # Lib/tkinter/__init__.py:1222:
>>          try:
>>              e.widget = self._nametowidget(W)
>>          except KeyError:
>>              e.widget = W
>
> I'm not sure this is a good example either.
> I presume '_nametowidget' is some function,
> that might raise a KeyError because of a bug in
> its implementation, or to signify that there is
> no widget 'W'. Your new syntax just helps to work
> with this error prone api.

I don't know the details; this is exactly what you can see in tkinter
at the file and line I point to. Maybe some of these are highlighting
other problems to be solved, I don't know, but certainly there will be
cases where the API is exactly like this.

>> Read from an iterator, continuing with blank lines once it's
>> exhausted::
>>          line = readline() except StopIteration: ''
>>
>>          # Lib/lib2to3/pgen2/tokenize.py:370:
>>          try:
>>              line = readline()
>>          except StopIteration:
>>              line = ''
>
> Handling StopIteration exception is more common in standard
> library than IndexError (although not much more), but again,
> not all of that code is suitable for your syntax. I'd say
> about 30%, which is about 20-30 spots (correct me if I'm
> wrong).

I haven't counted them up, but it wouldn't be hard to. Probably not
terribly many cases of this in the stdlib, but a reasonable few.

>> Retrieve platform-specific information (note the DRY improvement);
>> this particular example could be taken further, turning a series of
>> separate assignments into a single large dict initialization::
>>          # sys.abiflags may not be defined on all platforms.
>>          _CONFIG_VARS['abiflags'] = sys.abiflags except AttributeError: ''
>>
>>          # Lib/sysconfig.py:529:
>>          try:
>>              _CONFIG_VARS['abiflags'] = sys.abiflags
>>          except AttributeError:
>>              # sys.abiflags may not be defined on all platforms.
>>              _CONFIG_VARS['abiflags'] = ''
>
> Ugly.
> _CONFIG_VARS['abiflags'] = getattr(sys, 'abiflags', '')
> Much more readable.

Go ahead and make that change, if you prefer it. The try/except block
is exactly how the code currently reads. The downside is as I mentioned
above: it no longer looks like "sys.abiflags", and won't come up when
you search for that.

>> Retrieve an indexed item, defaulting to None (similar to dict.get)::
>>      def getNamedItem(self, name):
>>          return self._attrs[name] except KeyError: None
>>
>>      # Lib/xml/dom/minidom.py:573:
>>      def getNamedItem(self, name):
>>          try:
>>              return self._attrs[name]
>>          except KeyError:
>>              return None
>
> _attrs there is a dict (or at least it's something that quacks
> like a dict, and has [] and keys()), so
>
> return self._attrs.get(name)

To what extent does it have to quack like a dict? In this particular
example, I traced through a few levels of "where did _attrs come
from", and got bogged down. Does "quacks like a dict" have to include
a .get() method?

>> Translate numbers to names, falling back on the numbers::
>>              g = grp.getgrnam(tarinfo.gname)[2] except KeyError: tarinfo.gid
>>              u = pwd.getpwnam(tarinfo.uname)[2] except KeyError: tarinfo.uid
>>
>>              # Lib/tarfile.py:2198:
>>              try:
>>                  g = grp.getgrnam(tarinfo.gname)[2]
>>              except KeyError:
>>                  g = tarinfo.gid
>>              try:
>>                  u = pwd.getpwnam(tarinfo.uname)[2]
>>              except KeyError:
>>                  u = tarinfo.uid
>
> This one is a valid example, but totally unparseable by
> humans. Moreover, it promotes a bad pattern, as you
> mask KeyErrors in 'grp.getgrnam(tarinfo.gname)' call.

My translation masks nothing that the original didn't mask. The
KeyError will come from the function call; it would be IndexError if
the function returns a too-short tuple, and that one's allowed to
bubble up.

>>> import pwd
>>> pwd.getpwnam("rosuav")
pwd.struct_passwd(pw_name='rosuav', pw_passwd='x', pw_uid=1000,
pw_gid=1000, pw_gecos='Chris Angelico,,,', pw_dir='/home/rosuav',
pw_shell='/bin/bash')
>>> pwd.getpwnam("spam")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'getpwnam(): name not found: spam'

(Note that it's possible for 'import pwd' to fail, in which case pwd
is set to None early in the script. But this particular bit of code
checks "if pwd" before continuing anyway, so we don't expect
AttributeError here.)
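On the masking question, here's a small sketch of which exceptions get caught and which don't. The lookup function and its table are hypothetical stand-ins for grp.getgrnam, so that this runs on any platform.

```python
def lookup(name):
    """Hypothetical stand-in for grp.getgrnam: KeyError if name is unknown."""
    table = {"staff": ("staff", "x", 50), "broken": ("broken",)}
    return table[name]


def gid_for(name, fallback):
    try:
        return lookup(name)[2]
    except KeyError:
        return fallback


known = gid_for("staff", -1)      # found: third field of the tuple
unknown = gid_for("nobody", -1)   # KeyError from the lookup: fallback used

# A too-short tuple raises IndexError, which is not caught and bubbles up.
try:
    gid_for("broken", -1)
except IndexError:
    short_tuple_propagated = True
else:
    short_tuple_propagated = False
```

The expression form would catch exactly the same KeyError and nothing more.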

>> Calculate the mean of a series of numbers, falling back on zero::
>>
>>      value = statistics.mean(lst) except statistics.StatisticsError: 0
>>
>>      try:
>>          value = statistics.mean(lst)
>>      except statistics.StatisticsError:
>>          value = 0
>
> I think all of the above more readable with try statement.

Readability is a matter of personal preference, to some extent. I find
it clearer to use the shorter form, for the same reason as I'd use
this:

def query(prompt, default):
    return input("%s [%s]: "%(prompt, default)) or default

I wouldn't use long-hand:

def query(prompt, default):
    s = input("%s [%s]: "%(prompt, default))
    if not s: s = default
    return s

It's easier to see that it's calling something, and defaulting to
something else.

>> Retrieving a message from either a cache or the internet, with auth
>> check::
>>
>>      logging.info("Message shown to user: %s",((cache[k]
>>          except LookupError:
>>              (backend.read(k) except OSError: 'Resource not available')
>>          )
>>          if check_permission(k) else 'Access denied'
>>      ) except BaseException: "This is like a bare except clause")
>>
>>      try:
>>          if check_permission(k):
>>              try:
>>                  _ = cache[k]
>>              except LookupError:
>>                  try:
>>                      _ = backend.read(k)
>>                  except OSError:
>>                      _ = 'Resource not available'
>>          else:
>>              _ = 'Access denied'
>>      except BaseException:
>>          _ = "This is like a bare except clause"
>>      logging.info("Message shown to user: %s", _)
>
>
> If you replace '_' with a 'msg' (why did you use '_'??)
> then try statements are *much* more readable.

I've removed that example. The reason for using _ was because I wanted
it to have the "feel" of still being an expression, where nothing's
named. But it's not a very helpful example anyway; part of the
confusion comes from the if/else in the middle, which completely
wrecks evaluation order expectations.

>> Lib/ipaddress.py:343::
>>              try:
>>                  ips.append(ip.ip)
>>              except AttributeError:
>>                  ips.append(ip.network_address)
>> Becomes::
>>              ips.append(ip.ip except AttributeError: ip.network_address)
>
> or it may become:
>
> ips.append(getattr(ip, 'ip', ip.network_address))
>
> or
>
> address = getattr(ip, 'ip', ip.network_address)
> ips.append(address)

There's a subtle difference here that makes that not equivalent. With
the original try/except statement, evaluation proceeds thus:

1) Attempt to look up ip.ip. If that succeeds, call ips.append().
2) If AttributeError is not thrown in #1, done. Otherwise, proceed.
3) Attempt to look up ip.network_address. If that succeeds, call ips.append.
4) Any exception raised will propagate.

This means that if neither ip nor network_address is defined, an
AttributeError will come up, but if ip is defined, network_address
won't even be looked at.

My version narrows the scope slightly, but is functionally similar.

1) Attempt to look up ip.ip.
2) If AttributeError is thrown in #1, attempt to look up ip.network_address.
3) If either #1 or #2 succeeds, call ips.append.
4) Any exception raised anywhere else will propagate.

Your version, however, inverts the evaluation order:

1) Attempt to look up ip.network_address
2) If AttributeError is thrown in #1, propagate it up and stop evaluating.
3) Retrieve ip.ip, defaulting to the looked-up network address.
4) Pass that to ips.append().

It's probably not safe to use 'or' here, but if you can be sure ip.ip
will never be blank, you could get lazy evaluation this way:

ips.append(getattr(ip, 'ip', '') or ip.network_address)

But at this point, the clarity advantage over the except clause is
diminishing, plus it conflates AttributeError and ''.
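Here's a sketch that shows the difference in evaluation order. Tracked is a made-up class that records attribute lookups, and the three functions return the value rather than appending it, to keep things short.

```python
class Tracked:
    """Hypothetical object that records which attributes get looked up."""

    def __init__(self, **attrs):
        self._attrs = attrs
        self.seen = []

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for the fake attributes.
        self.seen.append(name)
        try:
            return self._attrs[name]
        except KeyError:
            raise AttributeError(name) from None


def with_try(ip):
    try:
        return ip.ip
    except AttributeError:
        return ip.network_address


def with_getattr(ip):
    # The default argument is evaluated before getattr() is even called.
    return getattr(ip, "ip", ip.network_address)


def with_or(ip):
    return getattr(ip, "ip", "") or ip.network_address


iface = Tracked(ip="192.0.2.1")   # has .ip but no .network_address
try_value = with_try(iface)
try_seen = list(iface.seen)

iface = Tracked(ip="192.0.2.1")
try:
    with_getattr(iface)
except AttributeError:
    getattr_failed = True
else:
    getattr_failed = False
getattr_seen = list(iface.seen)

iface = Tracked(ip="192.0.2.1")
or_value = with_or(iface)
or_seen = list(iface.seen)
```

With an object that has ip but no network_address, the try/except and 'or' forms only ever look at ip, while the plain getattr form looks up network_address first and fails.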

> Yes, some examples look neat. But your syntax is much easier
> to abuse than the 'if..else' expression, and if people start
> abusing it, Python will simply lose its readability
> advantage.

If people start abusing it, style guides can tell them off. Unlike the
if/else operator, evaluation of "expr except Exception: default"
happens in strict left-to-right order, so in that sense it's _less_
confusing. I'll be adding a paragraph to the PEP shortly explaining
that.
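The if/else half of that comparison can be shown in today's Python (the except expression can't, since it's only a proposal). The note helper is made up for this sketch; it just records the order in which the operands are evaluated.

```python
order = []


def note(label, value):
    """Record that this operand was evaluated, then return its value."""
    order.append(label)
    return value


result = note("then", "yes") if note("condition", True) else note("else", "no")
```

Although "then" is written first, the condition in the middle is evaluated before it, so the recorded order is condition followed by then.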

ChrisA

