[Python-Dev] if-syntax for regular for-loops

Sat Oct 4 04:26:30 CEST 2008

Greg Ewing wrote:
> Vitor Bosshard wrote:
>> The exact same argument could be used for list comprehensions themselves.
> No, an LC saves more than newlines -- it saves the code
> to set up and append to a list. This is a substantial
> improvement when this code would otherwise swamp the
> essentials of what's being done.
> 
> This doesn't apply to a plain for-loop that's not
> building a list.

Not only do LCs make it obvious to the reader that "all this loop does
is build a list", but the speed increases from doing the iteration in
native code rather than pure Python are also non-trivial - every pass
through the main eval loop that can be safely avoided leads to a fairly
substantial time saving.
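
A rough way to check the speed claim on a given interpreter is a small
timeit comparison. This is only a sketch: the function names and the
sample data are made up for illustration, and the actual numbers depend
heavily on the Python version and the machine.

```python
import timeit

def with_loop(source):
    # Explicit loop: the append call runs through the eval loop each pass.
    target = []
    for x in source:
        target.append(x + 1)
    return target

def with_comprehension(source):
    # Same result, built by a list comprehension.
    return [x + 1 for x in source]

data = list(range(1000))  # hypothetical sample input
loop_time = timeit.timeit(lambda: with_loop(data), number=200)
comp_time = timeit.timeit(lambda: with_comprehension(data), number=200)
print("loop: %.4fs  comprehension: %.4fs" % (loop_time, comp_time))
```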

Generally speaking, syntactic sugar (or a new builtin) needs to take a
construct in idiomatic Python that is fairly obvious to an experienced
Python user and make it obvious to even new users, or else take an idiom
that is easy to get wrong when writing (or miss when reading) and make
it trivial to use correctly.

Providing significant performance improvements (usually in the form of
reduced memory usage or increased speed) also counts heavily in favour
of new constructs.

I strongly suggest browsing through past PEPs (both accepted and
rejected ones) before proposing syntax changes, but here are some
examples of syntactic sugar proposals that were accepted.

List/set/dict comprehensions
============================
(and the reduction builtins any(), all(), min(), max(), sum())

  target = [op(x) for x in source]

instead of:
  target = []
  for x in source:
    target.append(op(x))

The transformation ("op(x)") is far more prominent in the comprehension
version, as is the fact that all the loop does is produce a new list. I
include the various reduction builtins here, since they serve exactly
the same purpose of taking an idiomatic looping construct and turning it
into a single expression.
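
As a small illustration of the same point for the reduction builtins,
here is a sketch comparing an explicit search loop with any(). The
sample data and variable names are invented for the example.

```python
# Hypothetical sample data, purely for illustration.
source = [3, -1, 4, 1, -5, 9]

# Explicit loop: the question being asked is buried in the loop body.
found_negative = False
for x in source:
    if x < 0:
        found_negative = True
        break

# Reduction builtin: the question is the whole expression.
has_negative = any(x < 0 for x in source)

# Comprehension: the transformation is the first thing the reader sees.
squares = [x * x for x in source]
```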

Generator expressions
=====================

  total = sum(x*x for x in source)

instead of:

  def _g(seq):
    for x in seq:
      yield x*x
  total = sum(_g(source))

or:

  total = sum([x*x for x in source])

Here, the GE version has obvious readability gains over the generator
function version (as with comprehensions, it brings the operation being
applied to each element front and centre instead of burying it in the
middle of the code, as well as allowing reduction operations like sum()
to retain their prominence), but doesn't actually improve readability
significantly over the second LC-based version. The gain over the
latter, of course, is that the GE-based version needs a lot less
*memory* than the LC version. Because it consumes the source data
incrementally, it can also work on source iterators of arbitrary (even
infinite) length, and can cope with source iterators with large time
gaps between items (e.g. reading from a socket), as each item is
processed as soon as it becomes available.
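
The incremental consumption can be sketched with an infinite source. The
helper name squares() is hypothetical; itertools.count and
itertools.islice are the standard library functions.

```python
from itertools import count, islice

def squares():
    # Infinite source: count(1) never ends, so a list comprehension over
    # it would never finish building its list.
    return (x * x for x in count(1))

# Only the first five items are ever computed.
first_five = list(islice(squares(), 5))

# any() stops at the first true item, so this terminates as well.
found = any(sq > 50 for sq in squares())
```

Replacing the generator expression in squares() with a list
comprehension would hang, since the whole list has to exist before
anything can be read from it.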

With statements
===============

  with lock:
    # perform synchronised operations

instead of:

  lock.acquire()
  try:
    # perform synchronised operations
  finally:
    lock.release()

This change was a gain for both readability and writability - there were
plenty of ways to get this kind of code wrong (e.g. leave out the
try-finally altogether, acquire the resource inside the try block
instead of before it, call the wrong method or spell the variable name
wrong when attempting to release the resource in the finally block), and
it wasn't easy to audit because the lock acquisition and release could
be separated by an arbitrary number of lines of code. By combining all
of that into a single line of code at the beginning of the block, the
with statement eliminated a lot of those issues, making the code much
easier to write correctly in the first place, and also easier to audit
for correctness later (just make sure the code is using the correct
context manager for the task at hand).
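
A minimal runnable sketch of the idea, using threading.Lock and
contextlib.contextmanager from the standard library. The tracked()
helper and the log list are invented for illustration; the point is that
the try/finally is written once, inside the context manager, rather than
at every call site.

```python
import threading
from contextlib import contextmanager

lock = threading.Lock()
counter = 0

def increment():
    global counter
    with lock:  # acquired here, released on exit even if the body raises
        counter += 1

@contextmanager
def tracked(log, name):
    # Hypothetical helper: records entry and exit around a block.
    log.append("enter " + name)
    try:
        yield
    finally:
        log.append("exit " + name)

log = []
try:
    with tracked(log, "job"):
        raise ValueError("boom")
except ValueError:
    pass  # the exit entry has already been recorded at this point

increment()
```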

Function decorators
===================

  @classmethod
  def f(cls):
    # Method body

instead of:

  def f(cls):
    # Method body
  f = classmethod(f)

Easier to write (function name only written once instead of three
times), and easier to read (decorator names up top with the function
signature instead of buried after the function body). Some folks still
dislike the use of the @ symbol, but compared to the drawbacks of the
old approach, the dedicated function decorator syntax is a huge improvement.

Conditional expressions
=======================

  x = A if C else B

instead of:

  x = C and A or B

The addition of conditional expressions arguably wasn't a particularly
big win for readability, but it *was* a big win for correctness. The
and/or based workaround for lack of a true conditional expression was
not only hard to read if you weren't already familiar with the
construct, but using it was also potentially buggy if A could ever be
false while C was true (in such a case, B would be returned from the
expression instead of A).
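
The failure mode is easy to reproduce. In this sketch the function names
pick_old and pick_new are made up; the two bodies are the two spellings
being compared.

```python
def pick_old(c, a, b):
    # and/or workaround: gives the wrong answer whenever `a` is false.
    return c and a or b

def pick_new(c, a, b):
    # Conditional expression: only the truth of `c` matters.
    return a if c else b

wrong = pick_old(True, 0, 99)  # condition is true, yet `b` comes back
right = pick_new(True, 0, 99)
```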

Except clause
=============

  except Exception as ex:

instead of:

  except Exception, ex:

Another example of changing the syntax to eliminate potential bugs. In
this case, the problem was except clauses like "except TypeError,
AttributeError:", which would never actually catch AttributeError, and
would instead bind the caught TypeError instance to the local name
AttributeError.
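
With the "as" form, catching several exception types has only one
spelling: a parenthesised tuple. A small sketch (the describe() helper is
hypothetical and exists only to make the behaviour checkable):

```python
def describe(func):
    # Call func and report which of the two exception types it raised.
    try:
        func()
    except (TypeError, AttributeError) as ex:
        return type(ex).__name__
    return "ok"

type_error = describe(lambda: len(5))         # len() of an int
attr_error = describe(lambda: None.missing)   # no such attribute
no_error = describe(lambda: None)
```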

Cheers,
Nick.

P.S. There's a fractionally better argument to be used in favour of
allowing an if condition on the for loop header line: it doesn't just
save a newline or improve consistency with comprehensions and generator
expressions, it saves an *indentation level*. And that gain is exactly
the rationale that was used to begin allowing:

  try:
    ...
  except:
    ...
  else:
    ...
  finally:
    ...

instead of requiring the extra indentation level:

  try:
    try:
      ...
    except:
      ...
    else:
      ...
  finally:
    ...

However, even that argument is greatly weakened in the for/if case by
the fact that the indentation level is being saved by moving the if
condition up and to the right after the for loop details, whereas in the
try-statement case there were absolutely no downsides (the redundant try
keyword was simply dropped entirely).

So I'm personally still -1 when it comes to incorporating an if clause
directly into the for loop syntax - it's only necessary in the GE/LC
case due to the fact that those don't support statement-based nesting.
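
For comparison, here is a sketch of the spellings available with
existing syntax. The data and the log list are invented, and the
log.append() call just stands in for an arbitrary loop body. The last
two forms avoid the extra indentation level without any new syntax.

```python
source = [1, -2, 3, -4, 5]  # hypothetical sample data

# Nested if: one extra indentation level for the body.
nested = []
for x in source:
    if x > 0:
        nested.append(x * 10)

# Guard clause: body stays at the loop's own indentation level.
guarded = []
for x in source:
    if x <= 0:
        continue
    guarded.append(x * 10)

# Generator expression in the loop header: filter moves into the iterable.
filtered = []
for x in (x for x in source if x > 0):
    filtered.append(x * 10)
```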

(Tangent: the above two try/except examples are perfectly legal Py3k
code. Do we really need the "pass" statement anymore?)

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
            http://www.boredomandlaziness.org