[Python-3000] Revised PEPs 30XZ: remove implicit string concatenation and backslash continuation

Jim Jewett jimjjewett at gmail.com
Tue May 1 00:37:57 CEST 2007


On 4/30/07, Guido van Rossum <guido at python.org> wrote:
> I think these should be two separate proposals, with more specific
> names (e.g. "remove implicit string concatenation" and "remove
> backslash continuation"). There's no need to mention the octal thing
> if it's already a separate PEP.

Revised versions attached, as David Goodger seemed to prefer attachments.

-jJ
-------------- next part --------------
PEP: 30XZA
Title: Remove Backslash Continuation
Version: $Revision$
Last-Modified: $Date$
Author: Jim J. Jewett <JimJJewett at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 29-Apr-2007
Post-History: 29-Apr-2007, 30-Apr-2007


Abstract

    Python initially inherited its parsing from C.  While this has
    been generally useful, there are some remnants that have been
    less useful for Python, and should be eliminated.

    This PEP proposes elimination of terminal "\" as a marker for
    line continuation.

    
Rationale for Removing Explicit Line Continuation

    A terminal "\" indicates that the logical line is continued on the
    following physical line (after whitespace).

    Note that a non-terminal "\" does not have this meaning, even if the
    only additional characters are invisible whitespace.  (Python depends
    heavily on *visible* whitespace at the beginning of a line; it does
    not otherwise depend on *invisible* terminal whitespace.)  Adding
    whitespace after a "\" will typically cause a syntax error rather
    than a silent bug, but it still isn't desirable.
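
    The difference between a terminal and a non-terminal "\" can be
    checked with the built-in compile().  The following is only an
    illustrative sketch; the names compiles, ok_source, and bad_source
    are made up for this example and are not part of the proposal.

```python
def compiles(source):
    """Return True if source compiles as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

# The backslash is the last character on the physical line, so the
# logical line continues onto the next physical line.
ok_source = "total = 1 + \\\n    2\n"

# The same code with a single space after the backslash: the backslash
# is no longer terminal, so it does not continue the line.
bad_source = "total = 1 + \\ \n    2\n"

print(compiles(ok_source))
print(compiles(bad_source))
```

    The two sources look identical in most editors, which is the
    readability problem described above.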

    The reason to keep "\" is that occasionally code looks better with
    a "\" than with a () pair.

        assert True, (
            "This Paren is goofy")

    But realistically, that parenthesis is no worse than a "\".  The
    only advantage of "\" is that it is slightly more familiar to users of
    C-based languages.  These same languages all also support line
    continuation with (), so reading code will not be a problem, and
    there will be one less rule to learn for people entirely new to
    programming.
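
    As a rough check that the parenthesized spelling loses nothing, the
    two forms can be parsed and compared with the standard ast module.
    This is a sketch only; backslash_form, paren_form, and same_tree
    are names invented for the example.

```python
import ast

# One statement written with a terminal backslash ...
backslash_form = "total = 1 + \\\n    2\n"

# ... and the same statement written with a () pair instead.
paren_form = "total = (1 +\n    2)\n"

# ast.dump() omits line and column information by default, and grouping
# parentheses do not appear in the tree, so only the structure is compared.
same_tree = ast.dump(ast.parse(backslash_form)) == ast.dump(ast.parse(paren_form))
print(same_tree)
```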


Alternate proposal

    Several people have suggested alternative ways of marking the line
    end.  Most of these were rejected for not actually simplifying things.

    The one exception was to let any unfinished expression signify a line
    continuation, possibly in conjunction with increased indentation:

        assert True,            # comma implies tuple implies continue
            "No goofy parens"

    The objections to this are:

        - The amount of whitespace may be contentious; expression
          continuation should not be confused with opening a new
          suite.

        - The "expression continuation" markers are not as visually
          distinct in Python as the grouping punctuation "(), [], {}".

              "abc" +   # Plus needs another operand, so it continues
                  "def"

              "abc"       # String ends an expression, so
                  + "def" # this is a syntax error.

        - Guido says so.  [1]  His reasoning is that it may not even be
          feasible.  (See next reason.)

        - As a technical concern, supporting this would require allowing
          INDENT or DEDENT tokens anywhere, or at least in a widely
          expanded (and ill-defined) set of locations.  While this is
          in some sense a concern only for the internal parsing
          implementation, it would be a major new source of complexity.  [1]
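
    The tokenizer behavior behind the last objection can be observed
    with the standard tokenize module.  The sketch below assumes the
    current CPython tokenizer and uses invented names (token_types,
    continued, suite); it shows only that indentation inside brackets
    produces no INDENT token today, not how an alternative would work.

```python
import io
import tokenize

def token_types(source):
    """Return the list of token type codes produced for source."""
    readline = io.StringIO(source).readline
    return [tok.type for tok in tokenize.generate_tokens(readline)]

# A continuation line inside parentheses: the extra indentation is ignored.
continued = "total = (1 +\n        2)\n"

# An indented suite: here the indentation is significant.
suite = "if True:\n    total = 3\n"

print(tokenize.INDENT in token_types(continued))
print(tokenize.INDENT in token_types(suite))
```

    Letting an unfinished expression continue a line would mean the
    tokenizer could no longer decide this from bracket depth alone.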


References

    [1] PEP 30XZ: Simplified Parsing, van Rossum
        http://mail.python.org/pipermail/python-3000/2007-April/007063.html


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
-------------- next part --------------
PEP: 30XZB
Title: Remove Implicit String Concatenation
Version: $Revision$
Last-Modified: $Date$
Author: Jim J. Jewett <JimJJewett at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 29-Apr-2007
Post-History: 29-Apr-2007, 30-Apr-2007


Abstract

    Python initially inherited its parsing from C.  While this has
    been generally useful, there are some remnants that have been
    less useful for Python, and should be eliminated.

    This PEP proposes to eliminate implicit string concatenation
    based on adjacency of literals.

    Instead of

        "abc" "def" == "abcdef"

    authors will need to be explicit, and join the strings with "+"

        "abc" + "def" == "abcdef"


Rationale for Removing Implicit String Concatenation

    Implicit string concatenation can lead to confusing, or even
    silent, errors.

        def f(arg1, arg2=None): pass

        f("abc" "def")  # forgot the comma, no warning ...
                        # silently becomes f("abcdef", None)

    or, using the scons build framework,

        sourceFiles = [
        'foo.c'
        'bar.c',
        #...many lines omitted...
        'q1000x.c']

    It's a common mistake to leave off a comma, and then scons complains
    that it can't find 'foo.cbar.c'.  This is pretty bewildering behavior
    even if you *are* a Python programmer, and not everyone here is.  [1]
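
    The effect of the missing comma is easy to reproduce under the
    current rules.  In this sketch the file names and the variable
    source_files are placeholders chosen for illustration.

```python
# Three names were intended, but the comma after 'foo.c' is missing,
# so the first two literals are silently joined into one string.
source_files = [
    'foo.c'
    'bar.c',
    'baz.c',
]

print(len(source_files))
print(source_files[0])
```

    No warning is issued; the list simply has one element fewer than
    the author intended.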

    Note that in C, the implicit concatenation is more justified; there
    is no other way to join strings without (at least) a function call.

    In Python, strings are objects which support the "+" operator
    (through the __add__ method); it is possible to write:

        "abc" + "def"

    Because these are literals, this addition can still be optimized
    away by the compiler.  (The CPython compiler already does.  [2])
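
    One way to observe the optimization is to inspect the constants of
    a compiled expression.  This is a sketch that assumes a CPython
    interpreter with constant folding enabled; other implementations
    and versions may store the constants differently.

```python
# Compile an explicit concatenation of two literals.
code = compile('"abc" + "def"', "<example>", "eval")

# On CPython the folded result is stored as a constant, so no addition
# is performed when the expression is evaluated.
print("abcdef" in code.co_consts)
print(eval(code))
```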

    Guido indicated [2] that this change should be handled by a PEP,
    because there were a few edge cases with other string operators,
    such as the %.  (Assuming that str % stays -- it may be eliminated
    in favor of PEP 3101 -- Advanced String Formatting.  [3] [4])
    
    The resolution is to treat them the same as today.

        ("abc %s def" + "ghi" % var)  # fails like today.
                                      # raises TypeError because of
                                      # precedence.  (% before +)
        
        ("abc" + "def %s ghi" % var)  # works like today; precedence makes
                                      # the optimization more difficult to
                                      # recognize, but does not change the
                                      # semantics.

        ("abc %s def" + "ghi") % var  # works like today, because of
                                      # precedence:  () before % 
                                      # CPython compiler can already 
                                      # add the literals at compile-time.
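
    The three cases can be run as written.  The sketch below assumes
    var is a plain string; the names var, first, second, and third are
    chosen only for the example.

```python
var = "X"

# Case 1: % binds tighter than +, so "ghi" % var is evaluated first and
# fails because "ghi" contains no format specifier.
try:
    "abc %s def" + "ghi" % var
    first = "no error"
except TypeError:
    first = "TypeError"

# Case 2: "def %s ghi" % var is evaluated first, then added to "abc".
second = "abc" + "def %s ghi" % var

# Case 3: the parentheses force the addition to happen before the %.
third = ("abc %s def" + "ghi") % var

print(first)
print(second)
print(third)
```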
    
    
References

    [1] Implicit String Concatenation, Jewett, Orendorff
        http://mail.python.org/pipermail/python-ideas/2007-April/000397.html

    [2] Reminder: Py3k PEPs due by April, Hettinger, van Rossum
        http://mail.python.org/pipermail/python-3000/2007-April/006563.html

    [3] PEP 3101, Advanced String Formatting, Talin
        http://www.python.org/peps/pep-3101.html

    [4] ps to question Re: Need help completing ABC pep, van Rossum
        http://mail.python.org/pipermail/python-3000/2007-April/006737.html

Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

