[Python-Dev] str() for interpreter output

Tim Peters tim_one@email.msn.com
Sun, 9 Apr 2000 00:39:09 -0400


[Guido van Rossum]
> However, it may be time to switch so that "immediate expression"
> values are printed as str() instead of as repr()...

[Ka-Ping Yee]
> You do NOT want this.
>
> I'm against this change -- quite strongly, in fact.

Relax, nobody wants that.  The fact is that neither str() nor repr() is
reasonable today for use at the interactive prompt.  repr() *appears*
adequate only so long as you stick to the builtin types, where the
difference between repr() and str() is most often non-existent(!).  But
repr() has driven me (& not only me) mad for years at the interactive prompt
with my own (and extension) types, since a *faithful* representation of a
"large" object is exactly what you *don't* want to see scrolling by.  You
later say (echoing Donn Cave)

> repr() is for the human, not for the machine

but that contradicts the docs and the design.  What you mean <wink> to say
is "the thing that the interactive prompt uses by default *should* be for
the human, not for the machine" -- which repr() is not.  That's why repr()
sucks here, even though it's certainly "more for the human" than a pickle
is.

str() isn't suitable either, alas, even though (by design and by the docs)
it was *intended* to be, if only because str() on a container invokes
repr() on the containees.  Neither str() nor repr() can be used to get a
human-friendly string form of nested objects today (unless, as is
increasingly the *practice*, people misuse __repr__() to do what __str__()
was *intended* to do -- cf. Guido's complaint about that).
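
A minimal sketch of the "str() on a container invokes repr() on the
containees" behavior, which a modern Python 3 still exhibits.  The class
name Money is hypothetical, made up only for this illustration.

```python
class Money:
    """Hypothetical type with distinct human-friendly and faithful forms."""

    def __init__(self, cents):
        self.cents = cents

    def __str__(self):
        # Human-friendly form.
        return "$%d.%02d" % divmod(self.cents, 100)

    def __repr__(self):
        # Faithful form that could be typed back in.
        return "Money(%d)" % self.cents


m = Money(1999)
alone = str(m)        # uses __str__
in_list = str([m])    # the list formats its items with repr(), not str()
print(alone)
print(in_list)
```

The friendly form is lost as soon as the object sits inside a list, which
is the "not passed down" complaint in a nutshell.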

> ...
> Have repr() use triple-quotes when strings contain newlines
> if you like, but do *not* hide the fact that the thing being
> displayed is a string.

Nobody wants to hide this (or, if someone does, set yourself up for
merciless poking before it's too late).

> ...
> Getting the representation of objects from the interpreter provides
> a very important visual cue: you can usually tell just by looking
> at the first character what kind of animal you've got.  A digit means
> it's a number; a quote means a string; "[" means a list; "(" means a
> tuple; "{" means a dictionary; "<" means an instance or a special
> kind of object.  Switching to str() instead of repr() completely
> breaks this property so you have no idea what you are getting.
> Intuitions go out the window.

This is way oversold:  str() also supplies "[" for lists, "(" for tuples,
"{" for dicts, and "<" for instances of classes that don't override __str__.
The only difference between repr() and str() in this listing of faux terror
<wink> is when they're applied to strings.
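
A quick check of that claim, as a sketch run against a modern Python 3
(the sample values are arbitrary): among a list, a tuple, a dict, a number
and a string, only the string has a str() that differs from its repr().

```python
samples = [[1, 2], (1, 2), {"a": 1}, 42, "spam"]

# Keep only the values whose str() and repr() differ.
differ = [x for x in samples if str(x) != repr(x)]

# First character of the str() form of each sample.
first_chars = [str(x)[0] for x in samples]

print(differ)
print(first_chars)
```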

> Granted, repr() cannot always produce an exact reconstruction of an
> object.  repr() is not a serialization mechanism!

To the contrary, many classes and types implement repr() for that very
purpose.  It's not universal but doesn't need to be.

> We have 'pickle' for that.

Pickles are unreadable by humans; that's why repr() is often preferred.

> ...
> As a corollary, here is an important property of repr() that
> i think ought to be documented and preserved:
>
>     eval(repr(x)) should produce an object with the same value
>     and state as x, or it should cause a SyntaxError.
>
> We should avoid ever having it *succeed* and produce the *wrong* x.

Fine by me.
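
The proposed property can be sketched as a small check.  The helper
round_trips and the class Opaque are hypothetical names; the sketch only
covers the two outcomes named above (an equal object, or a SyntaxError).

```python
class Opaque:
    """Hypothetical class that keeps the default <...> repr."""


def round_trips(x):
    # True if eval(repr(x)) rebuilds an equal object,
    # False if the repr is not valid Python source.
    try:
        return eval(repr(x)) == x
    except SyntaxError:
        return False


results = [round_trips(v)
           for v in ([1, "two", (3.0,)], {"k": None}, Opaque())]
print(results)
```

The default "<... object at 0x...>" form fails loudly rather than quietly
producing the wrong object, which is the behavior being asked for.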

> ...
> Honestly i'm really surprised that such a convoluted hack as the
> suggestion to "special-case the snot out of strings" would come
> from Tim, and more surprised that it actually got so much airtime.

That thread tapped into real and widespread unhappiness with what's
displayed at an interactive prompt today.  That's why it got so much
airtime -- no mystery there.

As above, your objections to str() reduce to its behavior for strings
specifically (I have more objections than just that -- str() should "get
passed down" too), hence "str() special-casing the snot out of strings" was
a direct hack to address that specific complaint.

> Doing this special-case mumbo-jumbo would be even worse!  Look:
>
> (in a hypothetical Python-with-snotless-str()...)
>
>     >>> a = '\\'
>     >>> b = '\''

I'd actually like to use euroquotes for str(string) -- don't throw the
Latin-1 away with your outrage <wink>.  Whatever, examples with backslashes
are non-starters, since newbies can't make any sense out of their doubling
under repr() today either (if it's not a FAQ, it should be -- I've certainly
had to explain it often enough!).

> ...much later...
>
>     >>> a
>     '\'
>     >>> '\'
>       File "<stdin>", line 1
>         '\'
>           ^
>     SyntaxError: invalid token
>
> (at this point i am envisioning the user screaming, "But that's
> what YOU said!")

Nobody ever promised that eval(str(x)) == x -- if they want that, they
should use repr() or backticks.  Today they get

>>> a
'\\'

and scream "Huh?! I thought that was only supposed to be ONE backslash!".
Or someone in Europe tries to look at a list of strings, or a simple dict
keyed by names, and gets back a god-awful mish-mash of octal backslash
escapes (and str() can't be used today to stop that either, since str()
"isn't passed down").  Compared to that, confusion over explicit backslashes
strikes me as trivial.
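
Both complaints can be reproduced in a sketch on a modern Python 3, with
one loud assumption: today's repr() leaves printable non-ASCII characters
alone, so the builtin ascii() stands in here for the escaping described
above, and it emits hex escapes rather than octal ones.  The names are
arbitrary sample data.

```python
a = '\\'                   # a single backslash character
length = len(a)
shown_by_repr = repr(a)    # what the prompt echoes: quotes plus a doubled backslash
shown_by_str = str(a)      # just the one backslash

names = ['José', 'Zoë']
escaped = ascii(names)     # every non-ASCII character becomes a backslash escape

print(length, shown_by_repr, shown_by_str)
print(escaped)
```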

> [various examples of ambiguous output]

That's why it's called a hack <wink>.  Last time I corresponded with Guido
about it, he was leaning toward using angle brackets (<>) instead.  That
would take away the temptation to believe you should be able to type the
same thing back in and have it do something reasonable.

> Tim's snot-removal algorithm forces the user to *infer* the rules
> of snot removal, remember them, and tentatively apply them to
> everything they see (since they still can't be sure whether snot
> has been removed from what they are seeing).

Not at all.  "Tim's snot-removal algorithm" didn't remove anything
("removal" is a word I don't believe I've seen applied to it before).
At the time it simply did str() and stuck a pair of quotes around the
result.  The (passed down) str() was the important part; how it's decorated
to say "and, btw, it's a string" is the teensy tail of a flea that's killing
the whole dog <0.9 wink>.  If we had Latin-1, we could use euroquotes for
this.  If we had control over the display, we could use a different color or
font.  If we stick to 7-bit ASCII, we have to do *something* irritating.

So here's a different idea for SSCTSOOS ("str() special-casing the snot out
of strings"):  escape quote chars and backslashes (like repr()) as needed,
but leave everything else alone (like str()).  Then you can have fun
stringing N adjacent backslashes together <wink>, and other people can use
non-ASCII characters without going mad.
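
A minimal sketch of that idea for the string case alone.  The function
name ssctsoos and the choice of single quotes as the decoration are
assumptions for illustration, not a settled design.

```python
def ssctsoos(s):
    """Hypothetical display form: quote s, escaping only backslashes and quotes."""
    # Backslashes first, so the ones added for quotes are not doubled again.
    body = s.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + body + "'"


plain = ssctsoos("café")            # non-ASCII text is left alone
tricky = ssctsoos("it's a \\ here")  # quote and backslash get escaped
print(tricky)
```

Newlines and other control characters pass through untouched in this
sketch, so the result is meant for reading, not for typing back in.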

What I want *most*, though, is for ssctsoos() to get passed down (from
container to containee), and for it to be the default action.
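
"Passed down" could look something like the following sketch.  The
function display and the class Temp are hypothetical, and only list,
tuple and dict are handled; everything else falls back to str().

```python
def display(obj):
    """Hypothetical prompt formatter that passes itself down to containees."""
    if isinstance(obj, str):
        # Mark strings with quotes; escape only backslashes and quotes.
        return "'" + obj.replace("\\", "\\\\").replace("'", "\\'") + "'"
    if isinstance(obj, list):
        return "[" + ", ".join(display(x) for x in obj) + "]"
    if isinstance(obj, tuple):
        inner = ", ".join(display(x) for x in obj)
        return "(" + inner + ("," if len(obj) == 1 else "") + ")"
    if isinstance(obj, dict):
        items = (display(k) + ": " + display(v) for k, v in obj.items())
        return "{" + ", ".join(items) + "}"
    # Anything else uses its own human-friendly str().
    return str(obj)


class Temp:
    """Hypothetical type that defines only a friendly __str__."""

    def __init__(self, deg):
        self.deg = deg

    def __str__(self):
        return "%d C" % self.deg


out = display({"readings": [Temp(21), Temp(19)], "site": "roof"})
print(out)
```

The containers still announce themselves with their brackets and the
strings with their quotes, but the Temp objects show their friendly form
instead of a "<... object at 0x...>" blob.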

> ...
> As for the suggestion to add an interpreter hook to __builtins__
> such that you can supply your own display routine, i'm all for it.
> Great idea there.

Same here!  But I reject going on from there to say "and since Python lets
you do it yourself, Python isn't obligated to try harder itself".
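
For reference, later Python releases expose a hook of this kind as
sys.displayhook.  A minimal sketch of a do-it-yourself display routine
follows; the function name friendly_displayhook is hypothetical, and the
choice to echo str() is just one possible policy.

```python
import builtins
import sys


def friendly_displayhook(value):
    """Hypothetical hook: echo str(value) instead of repr(value)."""
    if value is None:
        return
    builtins._ = value      # keep the usual "_" convenience
    print(str(value))


# Once installed, the interactive prompt calls this for expression values.
sys.displayhook = friendly_displayhook
```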

anything-to-keep-octal-escapes-out-of-a-unicode-world<wink>-ly y'rs  - tim