[Python-3000] PEP 3138- String representation in Python 3000

Stephen J. Turnbull stephen at xemacs.org
Tue May 27 05:02:45 CEST 2008


Jim Jewett writes:

 > The only reason for this change is that __repr__ gets used when
 > __str__ *should* be used instead.

That's not what the advocates say.

Now, repr() is supposed to return something that is acceptable to eval
(but doesn't always, especially for recursive objects), while str() is
supposed to be more "user-friendly" (but can be horrible if you need
to see precisely what the contents are or on an output device that's
not prepared for it).
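
To make the contrast concrete, here is a minimal sketch, assuming a
Python 3 interpreter; the sample values are illustrative and not taken
from the thread.  It shows repr() round-tripping through eval() for a
simple string, and failing to do so for a recursive list:

```python
# repr() aims at something eval() can read back; str() aims at readability.
s = "tab\there"
round_trip_ok = eval(repr(s)) == s   # the escaped form evaluates back to s
shown = repr(s)                      # the tab is displayed as the escape \t

# A recursive object is where the eval() promise breaks down.
lst = [1, 2]
lst.append(lst)
text = repr(lst)                     # '[1, 2, [...]]'
rebuilt = eval(text)                 # in Python 3, '...' is just Ellipsis
same_shape = rebuilt[2] is rebuilt   # False: the self-reference is lost
```

Note that under Python 3 the eval() call does not even raise; it
quietly produces a different object, a list containing Ellipsis.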

As far as I can tell, which one should be used is a "beauty in the eye
of the beholder" issue.  In the case of repr(), Spanish and Chinese
users are going to feel more or less differently from Americans about
which characters should be escaped.

 > repr is not for normal UI; it is in explicit contrast to str.  I
 > therefore believe it should default to the safest possible
 > representation.

Well, in `String Conversions', the manual says """In particular,
converting a string adds quotes around it and converts "funny"
characters to escape sequences that are safe to print."""

Now, I agree with you about what's "safe".  However, in a text-
processing application in a Japanese environment, that's hardly
useful, and our Japanese programmer can argue that in his environment,
printing all of Unicode *is* safe.  Furthermore, most people run in
environments where printing Unicode is safe.
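
The two notions of "safe" can be put side by side.  This is only a
sketch, assuming a Python 3 interpreter in which repr() leaves
printable non-ASCII characters alone and the ascii() builtin escapes
them; the sample string is my own:

```python
# Three Japanese characters with a tab in the middle, written with
# escapes so the example does not depend on the source file encoding.
s = "\u65e5\u672c\t\u8a9e"

as_repr = repr(s)    # keeps the printable characters, escapes the tab
as_ascii = ascii(s)  # escapes the tab and every non-ASCII character

# Both forms are still Python literals that evaluate back to s.
both_round_trip = eval(as_repr) == s and eval(as_ascii) == s
```

The first form is what the Japanese programmer wants to read; the
second is the "safest possible representation" Jim is asking for.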

 >>> I just want it to be very easy to say "on my system, repr is ASCII".
 > 
 >> That is in all proposals.
 > 
 > Then I sometimes missed it.

I should say, "that was in Guido's desiderata, so I assume anything
still on the table has it".  Viz:

    2. If you don't want any non-ASCII printed to a file, set the file's
    encoding to ASCII and the error handler to backslashreplace.
    (In <ca471dc20805221055j52594fd2offb7fa3fcf936629 at mail.gmail.com>)

If that's not easy enough for you (I sympathize!), then you need to
get Guido's ear.
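
For reference, here is roughly what Guido's item 2 looks like in
practice.  This is a sketch under assumptions: it uses Python 3's
io.TextIOWrapper over an in-memory buffer standing in for a real file,
and the standard codec error handler, which is spelled
"backslashreplace".  The sample string is illustrative:

```python
import io

# An in-memory byte buffer stands in for the real output file.
raw = io.BytesIO()

# ASCII encoding plus backslashreplace: anything outside ASCII is
# written as a backslash escape instead of raising UnicodeEncodeError.
out = io.TextIOWrapper(raw, encoding="ascii",
                       errors="backslashreplace", newline="\n")

print(repr("caf\u00e9"), file=out)   # repr() itself keeps the e-acute
out.flush()

data = raw.getvalue()                # b"'caf\\xe9'\n"
```

So repr() can stay readable by default, and the escaping happens at
the file boundary for those who ask for it.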

 > And I'll note that it didn't happen for identifiers.

That's on input, which is very much a different question.

 > Again -- *why* is repr used instead of str?

I don't use it myself other than as a way of diagnosing bugs in
programs I write or maintain; in personal practice, I'm in your camp.
But my understanding is that there is often an intermediate level,
such as a website admin, who needs *some* of the precision of repr()
such as escaped representation of whitespace, but also needs to be
able to read most of the output.  It so happens that repr() works as
designed for ASCII and acceptably so for ISO Latin, precisely because
it *was* designed for ASCII!  It sucks for non-Western-European
scripts, though, including the ISO 8859 scripts for Cyrillic, Greek,
Arabic, and Hebrew.

My understanding is that there are more use-cases than there are
stringifying functions and methods.  Something's got to give.


