[Python-Dev] Should repr() of string observe locale?

Tim Peters tim_one@email.msn.com
Sun, 30 Jul 2000 03:53:15 -0400


[Guido]
> There have been some requests (Francois Pinard, recently Alexander
> Voinov) that repr() of a string should not use escape sequences for
> characters that are printable according to the locale.
>
> I don't have the time to write a PEP or implement it (although it
> should be simple enough) but it needs to be recorded as a feature
> that I think is good (that's at least a +0).

[Moshe Zadka]
> I don't.  Most people who are requesting it are requesting it for the
> purpose of the interactive Python session. I think there is general
> agreement that there should be a way to better control the REPL from
> Python (my, now lost, sys.display patch, for example). Wouldn't that
> solve the problem?

Because str(list) and str(dict) and str(tuple) end up calling repr() on the
items they contain, even simple statements such as

    print list_of_goofy_foreign_i.e._not_plain_ascii_strings

produce unreadable octal escapes instead of the goofy foreign non-ascii
characters goofy foreigners want <wink>.  That's one of the Lost Pythonic
Theses, btw:

    Goofy is better than unreadable.

Hooking the REPL loop doesn't help with that, in part because an explicit
print would sidestep the hook, and the rest because it's a real problem in
non-interactive mode too.
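
A minimal sketch of why a display hook cannot reach an explicit print.
It assumes a modern Python 3, where sys.displayhook exists (no such hook
was available when this message was written); the name recording_hook is
hypothetical and only records what it is handed.

```python
import io
import sys
from contextlib import redirect_stdout

seen_by_hook = []

def recording_hook(value):
    # Hypothetical hook: just record what the interactive loop hands over.
    seen_by_hook.append(value)

saved_hook = sys.displayhook
sys.displayhook = recording_hook
try:
    captured = io.StringIO()
    with redirect_stdout(captured):
        # An explicit print writes straight to sys.stdout and never calls
        # sys.displayhook, so the hook has no chance to reformat anything.
        print(["spam"])
    printed = captured.getvalue()
finally:
    sys.displayhook = saved_hook
```

The quotes around spam in the captured output come from repr being applied
to the list item, and the hook's record stays empty.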

So there are problems other than just Fran\347ois's, including your desire
to hook the P in REPL, and including that str(list) and str(dict) and
str(tuple) applying repr to their containees causes horrible output for many
a user-defined class too (so much so that many classes code __repr__ to do
what __str__ is *supposed* to do) -- but they're arguably all distinct
problems.
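
The container behavior is easy to see with a user-defined class.  This is
only an illustrative sketch; the Money class is hypothetical and exists
just to give str and repr visibly different results.

```python
class Money:
    """Hypothetical user-defined class with distinct str and repr forms."""

    def __init__(self, amount):
        self.amount = amount

    def __str__(self):
        # The "friendly" form a user would want to see printed.
        return "$%.2f" % self.amount

    def __repr__(self):
        # The unambiguous form meant for debugging.
        return "Money(%r)" % self.amount

price = Money(3.5)
alone = str(price)               # uses __str__
in_list = str([price])           # the list applies repr to its item
in_tuple = str((price,))         # so does the tuple
in_dict = str({"price": price})  # and the dict, for keys and values
```

Only the bare str(price) call reaches __str__; every container form shows
the __repr__ text instead.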

That said, I'm -1 too, because Guido once sensibly agreed that strings
produced by repr should restrict themselves to the characters C guarantees
can be written and read faithfully in text-mode I/O, excluding the tab
character (or, iow, each character c in a string produced by repr should
have ord(c) in range(32, 128)).  Give that up and text-mode pickles (plus
anything else repr is used deliberately for in a text file) lose their
guarantee of cross-platform portability at the C level (not to mention
losing cross-platform human readability); etc.
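
A rough sketch of that contract, assuming Python 3, where the builtin
ascii() escapes non-ASCII characters much as repr did for strings when
this was written.  The helper name is_portable_text is hypothetical, and
the round trip through ast.literal_eval merely stands in for reading the
text back from a file.

```python
import ast

def is_portable_text(text):
    # True when every character falls in the range named above.
    return all(ord(c) in range(32, 128) for c in text)

original = "Fran\xe7ois\tsays\nhi"

# ascii() escapes the tab, the newline, and the non-ASCII character.
escaped = ascii(original)

# The escaped form can be read back to recover the original string.
restored = ast.literal_eval(escaped)
```

Here is_portable_text(original) is false while is_portable_text(escaped)
is true, and restored compares equal to original.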

The problem isn't that repr sticks in backslash escapes, the problem is that
repr gets called when repr is inappropriate.  There was extensive debate
about that in Python-Dev earlier this year (and the year before, and ...).
Thanks to the lack of PEPs in those benighted days, I bet we get to do it
all over again <wink>.  I can't make time for this round, though.

In brief:  Breaking repr's contract to produce printable ASCII is
unacceptable damage to me, no matter what the short-term perceived benefit.
A principled solution appeared to require a combination of (at least) making
the P in the REPL loop hookable, and making the builtin container types apply
to their items whichever of {str, repr} was applied to the container.  The
latter is problematic when the containee is a string, though:  str(string)
produces a string without delimiters to say "hey, I'm a string!", which makes
the containee pretty unreadable in the context of its container.  Further
fiddling of some sort is needed.
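
The delimiter problem can be sketched with a hypothetical helper, show,
that formats a list by applying whichever of str or repr it is given to
each item.  This is not a proposed implementation, just an illustration
of the ambiguity, and it handles lists only.

```python
def show(obj, fmt):
    # Hypothetical helper: format a list by applying fmt (str or repr)
    # to every item instead of always using repr.
    if isinstance(obj, list):
        return "[" + ", ".join(show(item, fmt) for item in obj) + "]"
    return fmt(obj)

two_items = ["a, b", "c"]
three_items = ["a", "b", "c"]

# With str passed through, both lists come out as the same text.
ambiguous_two = show(two_items, str)
ambiguous_three = show(three_items, str)

# With repr passed through, the quotes keep the items apart.
clear_two = show(two_items, repr)
```

Both str-mode results are the text [a, b, c], so a reader cannot tell a
two-item list from a three-item one, while the repr-mode result keeps the
quotes.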

if-the-current-repr-didn't-exist-we'd-have-to-reinvent-it-and-
    we-still-wouldn't-want-to-invoke-either-repr-much-of-the-
    time-anyway-ly y'rs  - tim