PEP 259: Revise to remove context-driven magic from print
Robin Thomas
robin900 at yahoo.com
Wed Jun 13 18:06:23 EDT 2001
Feedback, please. If you all are positive about the proposal below,
I'll work it up into a new PEP.
Summary:
PEP 259 should be reworked to remove contextual magic from the print
statement, which was the main demand made by the Python community when
*rejecting* the PEP. Then it could be resurrected from rejection.
Opinions:
0) We don't need another built-in function that echoes what print
does. Please.
1) We can't get rid of "print". It is a very common idiom in many
languages.
2) Python coders learn to avoid print because a print statement's
behavior changes with a run-time context that the coder cannot
necessarily control. To rehabilitate print, remove the context magic.
3) Whenever I have a string of bytes that I wish to write to
sys.stdout, without any formatting magic such as space characters or
newlines, I cannot use the print statement, because it offers me no
way to "turn off" all the formatting magic. Thus I am required to
absorb the conceptual overhead of the sys module, sys.stdout and
"standard output", file objects, and the write() method, just so that
I can avoid having Python prepend a space character to my output in
certain cases. For a novice Python coder, that sucks.
4) "softspace" has always seemed to me quite a lame thing to be part
of the official file object API. What should file objects care about
some cheesy report-printing operation like "print"?
5) The "am I at the start of a line" context only helps me when I want
    print 1,2,
    print 3,4,
    print 5
to be equivalent to
    print 1,2,3,4,5
...and I have never exploited that feature of Python print, even in
lazy report scripting, the application for which the feature seems to
be intended.
6) print-users and write-users operate at two different conceptual
levels. print is a novice feature, for coders who may not yet know of
sys, let alone "stdout" or even the concept of standard output. print
is also a pretty formatting feature, so that the coder does not have
to understand string concatenation, format strings, string
conversion, or even the newline escape sequence. Thus I think that it
is OK for print and write() to have feature overlap, without violating
the Pythonic principle of "there should be only one way to do it".
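To make the complaints in 3) and 5) concrete, here is a rough sketch of the current softspace bookkeeping. It is an emulation written as an ordinary function in Python 3 syntax, not the print statement itself, and it simplifies what the interpreter really does. The names SoftspaceStream and legacy_print are made up for illustration.

```python
import io


class SoftspaceStream:
    """Hypothetical file-like object that carries the softspace flag."""

    def __init__(self):
        self._buffer = io.StringIO()
        self.softspace = False

    def write(self, text):
        self._buffer.write(text)

    def getvalue(self):
        return self._buffer.getvalue()


def legacy_print(stream, *items, trailing_comma=False):
    """Emulate the context-dependent behavior of the current print statement."""
    for item in items:
        # A space is written only if the stream remembers pending output.
        if stream.softspace:
            stream.write(" ")
        text = str(item)
        stream.write(text)
        # Strings ending in whitespace other than " " (such as "\n")
        # clear the flag; everything else sets it.
        ends_in_other_whitespace = (
            isinstance(item, str)
            and text != ""
            and text[-1].isspace()
            and text[-1] != " "
        )
        stream.softspace = not ends_in_other_whitespace
    if not trailing_comma:
        stream.write("\n")
        stream.softspace = False
```

With this emulation, three calls standing in for "print 1,2," then "print 3,4," then "print 5" produce the same text as a single "print 1,2,3,4,5". Two trailing-comma calls with "abc" and "def" produce "abc def": the second call prepends a space purely because of what was written before it.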
Proposed "new" PEP 259
---------------------------------
1) softspace attribute of file objects now ignored by print statement
(and rest of Python core, if applicable). print statement no longer
cares about "am i at the start of a line" context. Instead, the print
statement writes the same bytes every time to the file object. If the
file object wants to play softspace-style games, it can track its own
context. Thus any write operation to the file object affects softspace
context if the file object wishes to implement softspace context.
However, Python core no longer cares about the softspace attribute,
and neither reads it nor calls the softspace API functions.
"softspace behavior" then becomes like any other encapsulated behavior
of a stream writer, such as zlib compression or similar. If the file
object wants to do that, fine, but callers to write() such as print
are appropriately ignorant of it.
2) When commas separate expressions as in "print 'a','b','c'", the
print statement writes a single space character " " for each comma.
(Just to repeat from 1: softspace no longer has any effect. print
will never write a leading space character for any statement
"print foo".)
3) If the print statement ends in a trailing comma as in
"print 'a','b','c',", the trailing comma means "do not print either a
final space character or a final newline character":
"print 'a','b','c'," prints "a b c". And "print 'a',; print 'b'"
prints "ab\n", not "a b\n".
4) If the print statement does not end in a trailing comma as in
"print a,b,c", the print statement will print a single newline
character after printing the result of all expressions: "print a,b,c"
prints "a b c\n". The statement "print", with no printing operands,
prints a single newline character.
5) It is suggested that the values for "space separator" and "newline"
be system-dependent, settable at run-time, and available in modules
sys or os. Your feedback is welcome on this issue.
It is *not* suggested that file objects take on this aspect of print
behavior, such as by implementing a method print(self, items,
print_newline=1, softsep=' ', linesep='\n'), and if not implemented,
falling back to stream.write(string.join(map(str, items), ' ') +
"\n"). Again, print is not important enough to demand attention from
designers of file-like objects.
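The proposed rules are small enough to sketch as one function. This is only an illustration in Python 3 syntax, assuming the statement's operands arrive as positional arguments and the trailing comma as a flag. The name proposed_print and the trailing_comma parameter are invented here; softsep and linesep follow the names floated above.

```python
import io


def proposed_print(stream, *items, trailing_comma=False,
                   softsep=" ", linesep="\n"):
    """Write items under the proposed rules: no softspace, no context."""
    # One separator per comma between expressions, never a leading one.
    stream.write(softsep.join(str(item) for item in items))
    # A trailing comma suppresses the final newline and nothing else.
    if not trailing_comma:
        stream.write(linesep)


out = io.StringIO()
proposed_print(out, "a", trailing_comma=True)   # print 'a',
proposed_print(out, "b")                        # print 'b'
two_statements = out.getvalue()                 # "ab\n"
```

The function never inspects the stream, so the same call always writes the same text no matter what was written earlier.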
Examples
------------
1) Just print a line with a newline at end, without messy newline
escape:
    print 1
2) Print some stuff, and let commas put in nice separating spaces:
    print 1,2,3
3a) Print some stuff with nice spaces, but don't print a newline. Then
continue with another print statement, making sure that nice spaces
separate all strings on the same line:
    print 1,2,3,
    print "",4,5,6
(This is the workaround for code that exploits the current softspace
context feature for "lazy report printing", but does the print
statements in an order clear from the source code. This way, current
lazy scripts can avoid code breakage with minor edits.)
3b) Print some stuff with nice spaces, and always make sure that when
you print with a trailing comma, that there is always a nice space
separating your strings in case you issue another trailing-comma print
statement:
    print a,b,c,"",
(Second workaround, for code that wants to print at will, without the
coder having to think about things previously printed since the last
newline. Only tradeoff: if the code wishes to end lines with an empty
"print", the output lines will have trailing spaces, which changes the
return value of split() on the line.)
4) Print a string of bytes, with no spaces or newlines added:
    print byte_string,
5) Print a series of objects with no space separators or newline,
accepting the fact that string formatting has a performance cost:
    print "%s%s%s" % (a,b,c),
6) Print a list of objects with no space separators or newline,
without string formatting cost, but with iteration cost:
    for i in a,b,c: print i,
    # or, to echo a file to sys.stdout...
    for x in open('foo').readlines(): print x,
7) Write bytes to a file object because you hate print:
    fileobject.write(str(object))
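For anyone who wants to check workarounds 3a and 3b without a patched interpreter, the snippet below replays them through a tiny stand-in for the proposed statement. The helper emit is hypothetical and implements only the proposed rules (a space per separating comma, a newline unless there is a trailing comma). It is Python 3 syntax, so each print statement becomes a call.

```python
import io


def emit(stream, *items, trailing_comma=False):
    # Proposed rules only: spaces between items, newline unless trailing comma.
    stream.write(" ".join(str(item) for item in items))
    if not trailing_comma:
        stream.write("\n")


out = io.StringIO()

# 3a) print 1,2,3,   followed by   print "",4,5,6
emit(out, 1, 2, 3, trailing_comma=True)
emit(out, "", 4, 5, 6)

# 3b) print a,b,c,"",   then   print d,"",   then a bare print
emit(out, "a", "b", "c", "", trailing_comma=True)
emit(out, "d", "", trailing_comma=True)
emit(out)

result = out.getvalue()
```

Here result is "1 2 3 4 5 6\na b c d \n". The second line shows the tradeoff noted in 3b: it ends with a space, so splitting it on " " yields an extra empty string at the end.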
Implementation Notes
-----------------------
- Something has to change in the bytecode to achieve the above.
PRINT_ITEM and PRINT_ITEM_TO could take an oparg indicating whether to
print a trailing space. Or new opcodes, PRINT_ITEMS and
PRINT_ITEMS_TO, could take an oparg indicating the number of items on
the stack to pop and print in sequence, printing a space between each.
Or else two new opcodes, PRINT_ITEM_SPACE and PRINT_ITEM_TO_SPACE,
which would suck.
- I don't see a problem in having an opcode like PRINT_ITEMS, which
requires that all of the expressions to print are evaluated and put on
the stack, and the print statement itself (minus the PRINT_NEWLINE) is
a single opcode printing *all* of the items. If this is a problem,
please enlighten me. (At the very least, it avoids all those
DUP_TOP/ROT_TWO combinations preceding every PRINT_ITEM_TO.)
- In fact, it might be best to refactor all print operations into a
single opcode, PRINT_STMT, which takes an oparg encoding a vector of
(num_items_on_stack, output_stream_also_on_stack_boolean,
print_newline_at_end_boolean). You still get 14 bits to hold the
number of objects to print; we perhaps could bring in EXTENDED_ARG if
we ever found the need to handle more than 2**14 - 1 expressions in a
print statement in source code. :)
All existing print opcodes can stick around unchanged, even with the
softspace crap. PRINT_STMT will ignore all softspace crap. Or at
least, while softspace must still be supported, PRINT_STMT will ensure
that softspace is set to false after it completes printing.
- "from __future__ import print_stmt" can handle the migration
process.
- Two system values, represented by os.softsep and os.linesep, or
sys.softsep and sys.linesep, or whatever you think is good, could be
used as the run-time values of the "space character" to separate print
items and the "newline" character to terminate a line, respectively.
These values, if changed at runtime by assigning to sys.xx or os.xxx,
can be stored as system globals a la Py_None and others, for speed, to
avoid constant attr lookups on sys or os.
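The PRINT_STMT oparg idea can be sketched as plain bit packing. The layout below is an assumption made for illustration: a 16-bit oparg with the item count in the low 14 bits and the two booleans in the top two bits. The notes above fix only the 14-bit count, not which bit means what, and the function names are invented.

```python
COUNT_BITS = 14
COUNT_MASK = (1 << COUNT_BITS) - 1   # low 14 bits: number of items
TO_STREAM_FLAG = 1 << 14             # assumed bit: output stream is on the stack
NEWLINE_FLAG = 1 << 15               # assumed bit: print a newline at the end


def pack_print_stmt(num_items, to_stream, newline):
    """Pack the three fields into one 16-bit oparg."""
    if not 0 <= num_items <= COUNT_MASK:
        raise ValueError("item count does not fit in 14 bits")
    oparg = num_items
    if to_stream:
        oparg |= TO_STREAM_FLAG
    if newline:
        oparg |= NEWLINE_FLAG
    return oparg


def unpack_print_stmt(oparg):
    """Recover (num_items, to_stream, newline) from a packed oparg."""
    return (
        oparg & COUNT_MASK,
        bool(oparg & TO_STREAM_FLAG),
        bool(oparg & NEWLINE_FLAG),
    )
```

Under this layout "print a,b,c" would pack as (3, False, True) and "print a,b,c," as (3, False, False). A count above 16383 is rejected, which is the point where something like EXTENDED_ARG would be needed.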