[Python-Dev] PEP 30XZ: Simplified Parsing

Michael Foord fuzzyman at voidspace.org.uk
Wed May 2 17:42:09 CEST 2007


Jim Jewett wrote:
> PEP: 30xz
> Title: Simplified Parsing
> Version: $Revision$
> Last-Modified: $Date$
> Author: Jim J. Jewett <JimJJewett at gmail.com>
> Status: Draft
> Type: Standards Track
> Content-Type: text/plain
> Created: 29-Apr-2007
> Post-History: 29-Apr-2007
>
>
> Abstract
>
>     Python initially inherited its parsing from C.  While this has
>     been generally useful, there are some remnants which have been
>     less useful for Python, and should be eliminated.
>
>     + Implicit String concatenation
>
>     + Line continuation with "\"
>
>     + 034 as an octal number (== decimal 28).  Note that this is
>       listed only for completeness; the decision to raise an
>       Exception for leading zeros has already been made in the
>       context of PEP XXX, about adding a binary literal.
>
>
> Rationale for Removing Implicit String Concatenation
>
>     Implicit String concatenation can lead to confusing, or even
>     silent, errors. [1]
>
>         def f(arg1, arg2=None): pass
>
>         f("abc" "def")  # forgot the comma, no warning ...
>                         # silently becomes f("abcdef", None)
>
>   
Implicit string concatenation is massively useful for creating long 
strings in a readable way though:

    call_something("first part\n"
                   "second line\n"
                   "third line\n")

I find it an elegant way of building strings and would be sad to see it 
go. Adding trailing '+' signs is ugly.
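
For comparison, a rough sketch of the two spellings side by side. The
names implicit and explicit are illustrative only, not taken from any
real code. Both build the same string, so the disagreement is about
readability rather than behaviour:

```python
# Implicit concatenation: adjacent literals are joined by the parser.
implicit = ("first part\n"
            "second line\n"
            "third line\n")

# Explicit concatenation: the spelling the PEP would require instead.
explicit = ("first part\n" +
            "second line\n" +
            "third line\n")

print(implicit == explicit)
```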

Michael Foord


>     or, using the scons build framework,
>
>         sourceFiles = [
>         'foo.c',
>         'bar.c',
>         #...many lines omitted...
>         'q1000x.c']
>
>     It's a common mistake to leave off a comma, and then scons complains
>     that it can't find 'foo.cbar.c'.  This is pretty bewildering behavior
>     even if you *are* a Python programmer, and not everyone here is.
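
A minimal sketch of the missing-comma case described above. The file
names and the source_files variable are hypothetical, and no scons API
is involved; the point is only that the list silently ends up one
element short:

```python
# Hypothetical file list; the comma after 'foo.c' has been left out.
source_files = [
    'foo.c'
    'bar.c',
    'baz.c']

# The parser joins the first two literals into a single string.
print(len(source_files))
print(source_files[0])
```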
>
>     Note that in C, the implicit concatenation is more justified; there
>     is no other way to join strings without (at least) a function call.
>
>     In Python, strings are objects which support the __add__ operator;
>     it is possible to write:
>
>         "abc" + "def"
>
>     Because these are literals, this addition can still be optimized
>     away by the compiler.
>
>     Guido indicated [2] that this change should be handled by PEP, because
>     there were a few edge cases with other string operators, such as the %.
>     The resolution is to treat them the same as today.
>
>         ("abc %s def" + "ghi" % var)  # fails like today.
>                                       # raises TypeError because of
>                                       # precedence.  (% before +)
>
>         ("abc" + "def %s ghi" % var)  # works like today; precedence makes
>                                       # the optimization more difficult to
>                                       # recognize, but does not change the
>                                       # semantics.
>
>         ("abc %s def" + "ghi") % var  # works like today, because of
>                                       # precedence:  () before %
>                                       # CPython compiler can already
>                                       # add the literals at compile-time.
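
The three cases above can be checked directly. This is only a sketch,
assuming var is a plain string; the names result, second and third are
illustrative:

```python
var = "X"

# % binds tighter than +, so "ghi" % var is evaluated first and fails:
# "ghi" has no conversion specifier to consume var.
try:
    result = "abc %s def" + "ghi" % var
except TypeError as exc:
    result = None
    print("TypeError:", exc)

# Here % applies only to the right-hand literal, which contains the %s.
second = "abc" + "def %s ghi" % var

# Parentheses force the concatenation first, then the formatting.
third = ("abc %s def" + "ghi") % var

print(second)
print(third)
```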
>
>
> Rationale for Removing Explicit Line Continuation
>
>     A terminal "\" indicates that the logical line is continued on the
>     following physical line (after whitespace).
>
>     Note that a non-terminal "\" does not have this meaning, even if the
>     only additional characters are invisible whitespace.  (Python depends
>     heavily on *visible* whitespace at the beginning of a line; it does
>     not otherwise depend on *invisible* terminal whitespace.)  Adding
>     whitespace after a "\" will typically cause a syntax error rather
>     than a silent bug, but it still isn't desirable.
>
>     The reason to keep "\" is that occasionally code looks better with
>     a "\" than with a () pair.
>
>         assert True, (
>             "This Paren is goofy")
>
>     But realistically, that paren is no worse than a "\".  The only
>     advantage of "\" is that it is slightly more familiar to users of
>     C-based languages.  These same languages all also support line
>     continuation with (), so reading code will not be a problem, and
>     there will be one less rule to learn for people entirely new to
>     programming.
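
A small sketch of the two continuation styles, plus the trailing
whitespace problem mentioned above. The variable names and values are
placeholders; compile() is used only so the failing case can be shown
without breaking the example itself:

```python
first_value, second_value, third_value = 1, 2, 3  # placeholder values

# Backslash continuation: nothing may follow the "\" on its line.
total_backslash = first_value + \
                  second_value + \
                  third_value

# Parenthesized continuation: line breaks are free inside the brackets.
total_parens = (first_value +
                second_value +
                third_value)

# A space after the backslash turns the continuation into a syntax error.
try:
    compile("x = 1 + \\ \n    2\n", "<example>", "exec")
    trailing_space_ok = True
except SyntaxError:
    trailing_space_ok = False

print(total_backslash == total_parens, trailing_space_ok)
```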
>
>
> Rationale for Removing Implicit Octal Literals
>
>     This decision should be covered by PEP ???, on numeric literals.
>     It is mentioned here only for completeness.
>
>     C treats integers beginning with "0" as octal, rather than decimal.
>     Historically, Python has inherited this usage.  This has caused
>     quite a few annoying bugs for people who forgot the rule, and
>     tried to line up their constants.
>
>         a = 123
>         b = 024   # really only 20, because octal
>         c = 245
>
>     In Python 3.0, the second line will instead raise a SyntaxError,
>     because of the ambiguity.  Instead, the line should be written
>     in one of the following ways:
>
>         b = 24    # PEP 8
>         b =  24   # columns line up, for quick scanning
>         b = 0t24  # really did want an Octal!
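
A sketch of the same arithmetic in a form that runs on Python 3. Note
the assumption: the "0t" prefix above is what this draft proposes,
whereas released Python 3 spells an octal literal with "0o", so that
is what the code uses. The variable names are illustrative:

```python
# An explicit base makes the intended reading unambiguous.
from_octal = int("024", 8)     # 2*8 + 4
from_decimal = int("024", 10)  # the leading zero is ignored in base 10

# Octal literal as spelled in released Python 3 (not the draft's "0t").
octal_literal = 0o24

# The bare leading-zero literal is rejected at compile time.
try:
    compile("b = 024\n", "<example>", "exec")
    leading_zero_ok = True
except SyntaxError:
    leading_zero_ok = False

print(from_octal, from_decimal, octal_literal, leading_zero_ok)
```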
>
>
> References
>
>     [1] Implicit String Concatenation, Jewett, Orendorff
>         http://mail.python.org/pipermail/python-ideas/2007-April/000397.html
>
>     [2] PEP 12, Sample reStructuredText PEP Template, Goodger, Warsaw
>         http://www.python.org/peps/pep-0012
>
>     [3] http://www.opencontent.org/openpub/
>
>
>
> Copyright
>
>     This document has been placed in the public domain.
>
>
> 
> Local Variables:
> mode: indented-text
> indent-tabs-mode: nil
> sentence-end-double-space: t
> fill-column: 70
> coding: utf-8
> End: