[Python-Dev] PEP 259: Omit printing newline after newline

Guido van Rossum guido@digicool.com
Mon, 11 Jun 2001 16:07:40 -0400


Please comment on the following.  This came up a while ago in
python-dev and I decided to follow through.  I'm making this a PEP
because of the risk of breaking code (which everybody on Python-dev
seemed to think was acceptable).

--Guido van Rossum (home page: http://www.python.org/~guido/)

PEP: 259
Title: Omit printing newline after newline
Version: $Revision: 1.1 $
Author: guido@python.org (Guido van Rossum)
Status: Draft
Type: Standards Track
Python-Version: 2.2
Created: 11-Jun-2001
Post-History: 11-Jun-2001

Abstract

    Currently, the print statement always appends a newline, unless a
    trailing comma is used.  This means that if we want to print data
    that already ends in a newline, we get two newlines, unless
    special precautions are taken.

    I propose to skip printing the newline when it follows a newline
    that came from data.

    In order to avoid having to add yet another magic variable to file
    objects, I propose to give the existing 'softspace' variable an
    extra meaning: a negative value will mean "the last data written
    ended in a newline so no space *or* newline is required."


Problem

    When printing data that resembles the lines read from a file using
    a simple loop, double-spacing occurs unless special care is taken:

        >>> for line in open("/etc/passwd").readlines():           
        ... print line 
        ... 
        root:x:0:0:root:/root:/bin/bash

        bin:x:1:1:bin:/bin:

        daemon:x:2:2:daemon:/sbin:

        (etc.)

        >>>

    While there are easy work-arounds, this is often noticed only
    during testing and requires an extra edit-test roundtrip; the
    fixed code is uglier and harder to maintain.


Proposed Solution

    In the PRINT_ITEM opcode in ceval.c, when a string object is
    printed, a check is already made that looks at the last character
    of that string.  Currently, if that last character is a whitespace
    character other than space, the softspace flag is reset to zero;
    this suppresses the space between two items if the first item is a
    string ending in newline, tab, etc. (but not when it ends in a
    space).  Otherwise the softspace flag is set to one.

    The proposal changes this test slightly so that softspace is set
    to:

        -1 -- if the last object written is a string ending in a
              newline

         0 -- if the last object written is a string ending in a
              whitespace character that's neither space nor newline

         1 -- in all other cases (including the case when the last
              object written is an empty string or not a string)

    Then, the PRINT_NEWLINE opcode, printing of the newline is
    suppressed if the value of softspace is negative; in any case the
    softspace flag is reset to zero.


Scope

    This only affects printing of 8-bit strings.  It doesn't affect
    Unicode, although that could be considered a bug in the Unicode
    implementation.  It doesn't affect other objects whose string
    representation happens to end in a newline character.


Risks

    This change breaks some existing code.  For example:

        print "Subject: PEP 259\n"
        print message_body

    In current Python, this produces a blank line separating the
    subject from the message body; with the proposed change, the body
    begins immediately below the subject.  This is not very robust
    code anyway; it is better written as

        print "Subject: PEP 259"
        print
        print message_body

    In the test suite, only test_StringIO (which explicitly tests for
    this feature) breaks.


Implementation

    A patch relative to current CVS is here:

        http://sourceforge.net/tracker/index.php?func=detail&aid=432183&group_id=5470&atid=305470


Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: