[Python-Dev] PEP 463: Exception-catching expressions

Yury Selivanov yselivanov.ml at gmail.com
Sat Feb 22 05:25:48 CET 2014


On 2/21/2014, 10:42 PM, Chris Angelico wrote:
> On Sat, Feb 22, 2014 at 6:04 AM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>>> * seq[index] - no way to handle a bounds error
>> We can add 'list.get(index, default)' method, similar to
>> 'Mapping.get'. It's far easier than introducing new
>> syntax.
> That fixes it for the list. Square brackets notation works for any
> sequence. Are you proposing adding magic to the language so that any
> class that defines a __getitem__ automatically has a .get() with these
> semantics? Or, conversely, that defining __getitem__ requires that you
> also explicitly define get()?

No, I proposed to consider adding 'list.get()' and
'collections.abc.Sequence.get()' (or a better name),
if catching IndexError is such a popular pattern.

We can also fix the stdlib. Authors of Python libraries and
frameworks will likely follow.

No magic in the language, no requirements on __getitem__,
obviously.
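
To make the idea concrete, here is a minimal sketch of the
behaviour I have in mind. 'seq_get' is a hypothetical name used
only for illustration; no 'list.get()' or 'Sequence.get()' exists
today, and a real method could well differ in name and signature.

```python
def seq_get(seq, index, default=None):
    """Return seq[index], or default when index is out of range.

    Hypothetical helper: mirrors Mapping.get for sequences.
    """
    try:
        return seq[index]
    except IndexError:
        return default


args = ["break", "10"]
cond = seq_get(args, 2)               # index 2 is out of range, so None
lineno = seq_get(args, 1, "0")        # index 1 exists, so "10"
```

The IndexError handling lives in one place, and every call site
stays a plain function call.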

>> Inconvenience of dict[] raising KeyError was solved by
>> introducing the dict.get() method. And I think that
>>
>> dct.get('a', 'b')
>>
>> is 1000 times better than
>>
>> dct['a'] except KeyError: 'b'
>>
>> I don't want to see this (or any other syntax) used by
>> anyone.
> Every separate method has to be written. That's code that has to be
> tested, etc. Also, when you go searching for usage of something, you
> have to cope with the fact that it can be spelled two different ways.
> This is more noticeable with attributes:
>
> print(spam.eggs)
> # versus
> print(getattr(spam,"eggs","(no eggs)"))
>
> The second one definitely doesn't look like it's retrieving spam.eggs,
> either in plain text search or in an AST walk.
And catching AttributeError (or using getattr) isn't that
common a thing to do at all. It is common in frameworks, yes,
but not that much in client code.
>   I would like to see
> getattr used primarily where the second argument isn't a literal, and
> an exception-catching rig used when a default is wanted; that keeps
> the "x.y" form predictable.
I understand. But I still think that using getattr is better
than 'a.b except AttributeError: c'
>> Moreover, I think that explicit handling of IndexError is
>> rather ugly and error-prone; using len() usually
>> reads better.
>>> Retrieve an argument, defaulting to None::
>>>           cond = args[1] except IndexError: None
>>>
>>>           # Lib/pdb.py:803:
>>>           try:
>>>               cond = args[1]
>>>           except IndexError:
>>>               cond = None
>>
>> cond = None if (len(args) < 2) else args[1]
> There's a distinct difference between those: one is LBYL and the other
> is EAFP. Maybe it won't matter with retrieving arguments, but it will
> if you're trying to pop from a queue in a multi-threaded program.
Using a method that modifies its underlying object deep in an
expression is also questionable, in terms of readability and
maintainability of the code.  I find the try..except statement
preferable here.
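
A rough sketch of what I mean, using the stdlib 'queue' module.
The helper name 'pop_or_default' and the sample data are made up
for illustration. The mutating call sits on its own line inside a
try statement, so it is easy to spot, and it is still EAFP (no
separate emptiness check that could race with another thread).

```python
import queue


def pop_or_default(q, default=None):
    """Take one item from q without blocking, or return default if q is empty."""
    try:
        # The call that mutates the queue is the only thing in the try body.
        item = q.get_nowait()
    except queue.Empty:
        item = default
    return item


jobs = queue.Queue()
jobs.put("job-1")
first = pop_or_default(jobs)            # the queue had one item
second = pop_or_default(jobs, "idle")   # the queue is now empty
```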

>>> Attempt a translation, falling back on the original::
>>>           e.widget = self._nametowidget(W) except KeyError: W
>>>
>>>           # Lib/tkinter/__init__.py:1222:
>>>           try:
>>>               e.widget = self._nametowidget(W)
>>>           except KeyError:
>>>               e.widget = W
>> I'm not sure this is a good example either.
>> I presume '_nametowidget' is some function,
>> that might raise a KeyError because of a bug in
>> its implementation, or to signify that there is
>> no widget 'W'. Your new syntax just helps to work
>> with this error-prone API.
> I don't know the details; this is exactly what you can see in tkinter
> at the file and line I point to. Maybe some of these are highlighting
> other problems to be solved, I don't know, but certainly there will be
> cases where the API is exactly like this.
>
>>> Read from an iterator, continuing with blank lines once it's
>>> exhausted::
>>>           line = readline() except StopIteration: ''
>>>
>>>           # Lib/lib2to3/pgen2/tokenize.py:370:
>>>           try:
>>>               line = readline()
>>>           except StopIteration:
>>>               line = ''
>> Handling StopIteration exception is more common in standard
>> library than IndexError (although not much more), but again,
>> not all of that code is suitable for your syntax. I'd say
>> about 30%, which is about 20-30 spots (correct me if I'm
>> wrong).
> I haven't counted them up, but it wouldn't be hard to. Probably not
> terribly many cases of this in the stdlib, but a reasonable few.

Just having a "reasonable few" use cases in the huge stdlib
doesn't warrant new syntax.
>
>>> Retrieve platform-specific information (note the DRY improvement);
>>> this particular example could be taken further, turning a series of
>>> separate assignments into a single large dict initialization::
>>>           # sys.abiflags may not be defined on all platforms.
>>>           _CONFIG_VARS['abiflags'] = sys.abiflags except AttributeError: ''
>>>
>>>           # Lib/sysconfig.py:529:
>>>           try:
>>>               _CONFIG_VARS['abiflags'] = sys.abiflags
>>>           except AttributeError:
>>>               # sys.abiflags may not be defined on all platforms.
>>>               _CONFIG_VARS['abiflags'] = ''
>> Ugly.
>> _CONFIG_VARS['abiflags'] = getattr(sys, 'abiflags', '')
>> Much more readable.
> Go ahead and make that change, if you prefer it. That's exactly how it
> really is currently - the try/except block. Downside is as I mentioned
> above: it no longer looks like "sys.abiflags", and won't come up when
> you search for that.
I'd search for 'abiflags' since it's not a common name ;)
But again, I get your point here.
>>> Retrieve an indexed item, defaulting to None (similar to dict.get)::
>>>       def getNamedItem(self, name):
>>>           return self._attrs[name] except KeyError: None
>>>
>>>       # Lib/xml/dom/minidom.py:573:
>>>       def getNamedItem(self, name):
>>>           try:
>>>               return self._attrs[name]
>>>           except KeyError:
>>>               return None
>> _attrs there is a dict (or at least it's something that quacks
>> like a dict, and has [] and keys()), so
>>
>> return self._attrs.get(name)
> To what extent does it have to quack like a dict? In this particular
> example, I traced through a few levels of "where did _attrs come
> from", and got bogged down. Does "quacks like a dict" have to include
> a .get() method?
The point is that whoever wrote that code knew what _attrs is.
Likely it's a dict, since they use '.keys()' on it. And likely,
that person just prefers try..except instead of using '.get()'.
And I just want to say that this particular example isn't a
good use case for the new syntax you propose.
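
For illustration only, assuming _attrs really is a plain dict
(which neither of us has confirmed), the two spellings behave the
same. The function names and sample data below are made up.

```python
def get_named_item_try(attrs, name):
    """try/except spelling, as in the minidom excerpt."""
    try:
        return attrs[name]
    except KeyError:
        return None


def get_named_item_get(attrs, name):
    """dict.get spelling; returns None when the key is missing."""
    return attrs.get(name)


attrs = {"id": "main"}
found = (get_named_item_try(attrs, "id"), get_named_item_get(attrs, "id"))
missing = (get_named_item_try(attrs, "class"), get_named_item_get(attrs, "class"))
```

One caveat: for a dict subclass that defines __missing__ (such as
collections.defaultdict), attrs[name] calls __missing__ but
attrs.get(name) does not, so the two are not interchangeable for
every dict-like object.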

>>> Translate numbers to names, falling back on the numbers::
>>>               g = grp.getgrnam(tarinfo.gname)[2] except KeyError: tarinfo.gid
>>>               u = pwd.getpwnam(tarinfo.uname)[2] except KeyError: tarinfo.uid
>>>
>>>               # Lib/tarfile.py:2198:
>>>               try:
>>>                   g = grp.getgrnam(tarinfo.gname)[2]
>>>               except KeyError:
>>>                   g = tarinfo.gid
>>>               try:
>>>                   u = pwd.getpwnam(tarinfo.uname)[2]
>>>               except KeyError:
>>>                   u = tarinfo.uid
>> This one is a valid example, but totally unparseable by
>> humans. Moreover, it promotes a bad pattern, as you
>> mask KeyErrors in 'grp.getgrnam(tarinfo.gname)' call.
> My translation masks nothing that the original didn't mask. The
> KeyError will come from the function call; it would be IndexError if
> the function returns a too-short tuple, and that one's allowed to
> bubble up.
Right
>
> >>> import pwd
> >>> pwd.getpwnam("rosuav")
> pwd.struct_passwd(pw_name='rosuav', pw_passwd='x', pw_uid=1000,
> pw_gid=1000, pw_gecos='Chris Angelico,,,', pw_dir='/home/rosuav',
> pw_shell='/bin/bash')
> >>> pwd.getpwnam("spam")
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> KeyError: 'getpwnam(): name not found: spam'
>
> (Note that it's possible for 'import pwd' to fail, in which case pwd
> is set to None early in the script. But this particular bit of code
> checks "if pwd" before continuing anyway, so we don't expect
> AttributeError here.)
>
>>> Calculate the mean of a series of numbers, falling back on zero::
>>>
>>>       value = statistics.mean(lst) except statistics.StatisticsError: 0
>>>
>>>       try:
>>>           value = statistics.mean(lst)
>>>       except statistics.StatisticsError:
>>>           value = 0
>> I think all of the above are more readable with a try statement.
> Readability is a matter of personal preference, to some extent. I find
> it clearer to use the shorter form, for the same reason as I'd use
> this:
>
> def query(prompt, default):
>      return input("%s [%s]: "%(prompt, default)) or default
>
> I wouldn't use long-hand:
>
> def query(prompt, default):
>      s = input("%s [%s]: "%(prompt, default))
>      if not s: s = default
>      return s
>
> It's easier to see that it's calling something, and defaulting to
> something else.
>
>>> Retrieving a message from either a cache or the internet, with auth
>>> check::
>>>
>>>       logging.info("Message shown to user: %s",((cache[k]
>>>           except LookupError:
>>>               (backend.read(k) except OSError: 'Resource not available')
>>>           )
>>>           if check_permission(k) else 'Access denied'
>>>       ) except BaseException: "This is like a bare except clause")
>>>
>>>       try:
>>>           if check_permission(k):
>>>               try:
>>>                   _ = cache[k]
>>>               except LookupError:
>>>                   try:
>>>                       _ = backend.read(k)
>>>                   except OSError:
>>>                       _ = 'Resource not available'
>>>           else:
>>>               _ = 'Access denied'
>>>       except BaseException:
>>>           _ = "This is like a bare except clause"
>>>       logging.info("Message shown to user: %s", _)
>>
>> If you replace '_' with a 'msg' (why did you use '_'??)
>> then try statements are *much* more readable.
> I've removed that example. The reason for using _ was because I wanted
> it to have the "feel" of still being an expression, where nothing's
> named. But it's not a very helpful example anyway; part of the
> confusion comes from the if/else in the middle, which completely
> wrecks evaluation order expectations.

I think this example is worth keeping ;) But it's up to you.
While each PEP wants to be Final, it's still good to see
how people can abuse it. And this example is excellent in that
regard.

>>> Lib/ipaddress.py:343::
>>>               try:
>>>                   ips.append(ip.ip)
>>>               except AttributeError:
>>>                   ips.append(ip.network_address)
>>> Becomes::
>>>               ips.append(ip.ip except AttributeError: ip.network_address)
>> or it may become:
>>
>> ips.append(getattr(ip, 'ip', ip.network_address))
>>
>> or
>>
>> address = getattr(ip, 'ip', ip.network_address)
>> ips.append(address)
> There's a subtle difference here that makes that not equivalent. With
> the original try/except statement, evaluation proceeds thus:
>
> 1) Attempt to look up ip.ip. If that succeeds, call ips.append().
> 2) If AttributeError is not thrown in #1, done. Otherwise, proceed.
> 3) Attempt to look up ip.network_address. If that succeeds, call ips.append.
> 4) Any exception raised will propagate.
>
> This means that, if neither ip nor network_address is defined, an
> AttributeError will come up, but that if ip is, network_address won't
> even be looked at.
>
> My version narrows the scope slightly, but is functionally similar.
>
> 1) Attempt to look up ip.ip.
> 2) If AttributeError is thrown in #1, attempt to look up ip.network_address.
> 3) If either #1 or #2 succeeds, call ips.append.
> 4) Any exception raised anywhere else will propagate.
>
> Your version, however, inverts the evaluation order:
>
> 1) Attempt to look up ip.network_address
> 2) If AttributeError is thrown in #1, propagate it up and stop evaluating.
> 3) Retrieve ip.ip, defaulting to the looked-up network address.
> 4) Pass that to ips.append().
>
> It's probably not safe to use 'or' here, but if you can be sure ip.ip
> will never be blank, you could get lazy evaluation this way:
>
> ips.append(getattr(ip, 'ip', '') or ip.network_address)
>
> But at this point, the clarity advantage over the except clause is
> diminishing, plus it conflates AttributeError and ''.

Yes, conditional evaluation is a good part of the new syntax (but
again, this *particular* example doesn't show it; better to
use some ORM maybe, where each __getattr__ is potentially
a query).

>> Yes, some examples look neat. But your syntax is much easier
>> to abuse than the 'if..else' expression, and if people start
>> abusing it, Python will simply lose its readability
>> advantage.
> If people start abusing it, style guides can tell them off. Unlike the
> if/else operator, evaluation of "expr except Exception: default"
> happens in strict left-to-right order, so in that sense it's _less_
> confusing. I'll be adding a paragraph to the PEP shortly explaining
> that.
>

There is one more thing: "There should be one-- and preferably
only one --obvious way to do it." -- so far it's true for all
Python language constructs. They all solve a real pain
point. Maybe it's just me, but I fail to see the real pain
point this proposal solves.


To conclude:

1. I'm still not convinced that the new syntax is really
necessary.

2. You should probably invest some time in finding better
examples for the PEP: examples that focus on its
strengths and show the real need.

3. Syntax. Using ':' in it makes it look bad. I know there
are many other alternatives, and 400+ emails on python-ideas,
but still, we couldn't find a viable alternative. Perhaps
we need to keep looking for a while longer.

My 2 cents.

Yury

