[Python-Dev] PEP 30XZ: Simplified Parsing
Michael Foord
fuzzyman at voidspace.org.uk
Wed May 2 17:42:09 CEST 2007
Jim Jewett wrote:
> PEP: 30xz
> Title: Simplified Parsing
> Version: $Revision$
> Last-Modified: $Date$
> Author: Jim J. Jewett <JimJJewett at gmail.com>
> Status: Draft
> Type: Standards Track
> Content-Type: text/plain
> Created: 29-Apr-2007
> Post-History: 29-Apr-2007
>
>
> Abstract
>
> Python initially inherited its parsing from C. While this has
> been generally useful, there are some remnants which have been
> less useful for Python, and should be eliminated.
>
> + Implicit String concatenation
>
> + Line continuation with "\"
>
> + 034 as an octal number (== decimal 28). Note that this is
> listed only for completeness; the decision to raise an
> Exception for leading zeros has already been made in the
> context of PEP XXX, about adding a binary literal.
>
>
> Rationale for Removing Implicit String Concatenation
>
> Implicit String concatenation can lead to confusing, or even
> silent, errors. [1]
>
>     def f(arg1, arg2=None): pass
>
>     f("abc" "def")  # forgot the comma, no warning ...
>                     # silently becomes f("abcdef", None)
>
>
Implicit string concatenation is massively useful for creating long
strings in a readable way though:

    call_something("first part\n"
                   "second line\n"
                   "third line\n")

I find it an elegant way of building strings and would be sad to see it
go. Adding trailing '+' signs is ugly.
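
To make the comparison concrete, here is a minimal sketch of the three
spellings under discussion. The variable names are only illustrative,
and nothing here is taken from the PEP itself:

```python
# Three ways to spell the same multi-line string.
implicit = ("first part\n"
            "second line\n"
            "third line\n")

explicit = ("first part\n" +
            "second line\n" +
            "third line\n")

joined = "".join([
    "first part\n",
    "second line\n",
    "third line\n",
])

# All three build identical strings; only the spelling differs.
print(implicit == explicit == joined)
```

Which of these reads best is a matter of taste; the sketch only shows
that the results are interchangeable.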
Michael Foord
> or, using the scons build framework,
>
>     sourceFiles = [
>         'foo.c',
>         'bar.c',
>         #...many lines omitted...
>         'q1000x.c']
>
> It's a common mistake to leave off a comma, and then scons complains
> that it can't find 'foo.cbar.c'. This is pretty bewildering behavior
> even if you *are* a Python programmer, and not everyone here is.
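
A small sketch of the failure described above, assuming a hypothetical
file list (the names `source_files` and 'baz.c' are made up for
illustration, and scons itself is not involved):

```python
# The comma after 'foo.c' is deliberately missing, so the two adjacent
# literals are joined into a single string by the compiler.
source_files = [
    'foo.c'
    'bar.c',
    'baz.c',
]

print(source_files)
print(len(source_files))  # one entry fewer than the author intended
```

No error or warning is raised at this point; the problem only shows up
later, when something goes looking for a file called 'foo.cbar.c'.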
>
> Note that in C, the implicit concatenation is more justified; there
> is no other way to join strings without (at least) a function call.
>
> In Python, strings are objects which support the __add__ operator;
> it is possible to write:
>
> "abc" + "def"
>
> Because these are literals, this addition can still be optimized
> away by the compiler.
>
> Guido indicated [2] that this change should be handled by a PEP, because
> there were a few edge cases with other string operators, such as %.
> The resolution is to treat them the same as today.
>
> ("abc %s def" + "ghi" % var) # fails like today.
> # raises TypeError because of
> # precedence. (% before +)
>
> ("abc" + "def %s ghi" % var) # works like today; precedence makes
> # the optimization more difficult to
> # recognize, but does not change the
> # semantics.
>
> ("abc %s def" + "ghi") % var # works like today, because of
> # precedence: () before %
> # CPython compiler can already
> # add the literals at compile-time.
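
The three precedence cases above can be checked directly. This is only
a sketch: `var` is given an arbitrary illustrative value, and the
results would differ for values such as an empty tuple or a dict.

```python
var = "X"  # illustrative value, not from the PEP

# % binds tighter than +, so "ghi" % var is evaluated first and fails
# because "ghi" has no format specifier to consume var.
try:
    first = "abc %s def" + "ghi" % var
except TypeError as exc:
    first = exc

# Here the formatting applies to "def %s ghi" alone, then + joins.
second = "abc" + "def %s ghi" % var

# Parentheses force the + to happen before the %.
third = ("abc %s def" + "ghi") % var

print(type(first).__name__)
print(second)
print(third)
```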
>
>
> Rationale for Removing Explicit Line Continuation
>
> A terminal "\" indicates that the logical line is continued on the
> following physical line (after whitespace).
>
> Note that a non-terminal "\" does not have this meaning, even if the
> only additional characters are invisible whitespace. (Python depends
> heavily on *visible* whitespace at the beginning of a line; it does
> not otherwise depend on *invisible* terminal whitespace.) Adding
> whitespace after a "\" will typically cause a syntax error rather
> than a silent bug, but it still isn't desirable.
>
> The reason to keep "\" is that occasionally code looks better with
> a "\" than with a () pair.
>
>     assert True, (
>         "This Paren is goofy")
>
> But realistically, that paren is no worse than a "\". The only
> advantage of "\" is that it is slightly more familiar to users of
> C-based languages. These same languages all also support line
> continuation with (), so reading code will not be a problem, and
> there will be one less rule to learn for people entirely new to
> programming.
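
The whitespace point can be illustrated by compiling small source
strings. This is a sketch only; the helper `compiles` and the sample
strings are made up for the example.

```python
def compiles(source):
    """Return True if source parses as a Python module, False otherwise."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

# "\" is the last character on the line: a valid continuation.
backslash_ok = "total = 1 + \\\n    2\n"

# A single space follows the "\": no longer a continuation.
backslash_bad = "total = 1 + \\ \n    2\n"

# Inside parentheses, trailing whitespace is harmless.
parens_ok = "total = (1 + \n    2)\n"

print(compiles(backslash_ok), compiles(backslash_bad), compiles(parens_ok))
```

The second string fails with a SyntaxError, which matches the claim that
stray whitespace after "\" is usually a loud error, not a silent bug.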
>
>
> Rationale for Removing Implicit Octal Literals
>
> This decision should be covered by PEP ???, on numeric literals.
> It is mentioned here only for completeness.
>
> C treats integers beginning with "0" as octal, rather than decimal.
> Historically, Python has inherited this usage. This has caused
> quite a few annoying bugs for people who forgot the rule, and
> tried to line up their constants.
>
> a = 123
> b = 024 # really only 20, because octal
> c = 245
>
> In Python 3.0, the second line will instead raise a SyntaxError,
> because of the ambiguity. Instead, the line should be written
> in one of the following ways:
>
>     b = 24    # PEP 8
>     b =  24   # columns line up, for quick scanning
>     b = 0t24  # really did want an Octal!
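
For reference, a sketch of how this behaves on a Python 3 interpreter.
Note that the octal prefix Python 3 eventually adopted is `0o`, not the
`0t` shown in the draft above; the helper `compiles` and the variable
names are made up for the example.

```python
def compiles(source):
    """Return True if source parses as a Python module, False otherwise."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

# A bare leading zero on a non-zero decimal literal is rejected.
leading_zero_ok = compiles("b = 024\n")

# An explicit octal literal, using the 0o prefix.
explicit_octal = 0o24

# Parsing strings: the base has to be stated to get octal.
parsed_octal = int("024", 8)
parsed_decimal = int("024")

print(leading_zero_ok, explicit_octal, parsed_octal, parsed_decimal)
```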
>
>
> References
>
> [1] Implicit String Concatenation, Jewett, Orendorff
> http://mail.python.org/pipermail/python-ideas/2007-April/000397.html
>
> [2] PEP 12, Sample reStructuredText PEP Template, Goodger, Warsaw
> http://www.python.org/peps/pep-0012
>
> [3] http://www.opencontent.org/openpub/
>
>
>
> Copyright
>
> This document has been placed in the public domain.
>
>
>
> Local Variables:
> mode: indented-text
> indent-tabs-mode: nil
> sentence-end-double-space: t
> fill-column: 70
> coding: utf-8
> End: