Should repr() of string observe locale?

There have been some requests (Francois Pinard, recently Alexander Voinov) that repr() of a string should not use escape sequences for characters that are printable according to the locale. I don't have the time to write a PEP or implement it (although it should be simple enough) but it needs to be recorded as a feature that I think is good (that's at least a +0).
--Guido van Rossum (home page: http://www.pythonlabs.com/~guido/)

On Sat, 29 Jul 2000, Guido van Rossum wrote:
I don't. Most people who are requesting it are requesting it for the purpose of the interactive Python session. I think there is general agreement that there should be a way to better control the REPL from Python (my, now lost, sys.display patch, for example). Wouldn't that solve the problem?
--
Moshe Zadka <moshez@math.huji.ac.il>
There is no IGLU cabal.
http://advogato.org/person/moshez

[Guido]
[Moshe Zadka]
Because str(list) and str(dict) and str(tuple) end up calling repr() on the items they contain, even simple stmts like e.g. print list_of_goofy_foreign_i.e._not_plain_ascii_strings produce unreadable octal escapes instead of the goofy foreign non-ascii characters goofy foreigners want <wink>. That's one of the Lost Pythonic Theses, btw: Goofy is better than unreadable.
Hooking the REPL loop doesn't help with that, in part because an explicit print would sidestep the hook, and the rest because it's a real problem in non-interactive mode too.
So there are problems other than just Fran\347ois's, including your desire to hook the P in REPL, and including that str(list) and str(dict) and str(tuple) applying repr to their containees causes horrible output for many a user-defined class too (so much so that many classes code __repr__ to do what __str__ is *supposed* to do) -- but they're arguably all distinct problems.
That said, I'm -1 too, because Guido once sensibly agreed that strings produced by repr should restrict themselves to the characters C guarantees can be written and read faithfully in text-mode I/O, excluding the tab character (or, iow, each character c in a string produced by repr should have ord(c) in range(32, 128)). Give that up and text-mode pickles (plus anything else repr is used deliberately for in a text file) lose their guarantee of cross-platform portability at the C level (not to mention losing x-platform human readability); etc.
The problem isn't that repr sticks in backslash escapes, the problem is that repr gets called when repr is inappropriate. There was extensive debate about that in Python-Dev earlier this year (and the year before, and ...). Thanks to the lack of PEPs in those benighted days, I bet we get to do it all over again <wink>.
I can't make time for this round, though. In brief: Breaking repr's contract to produce printable ASCII is unacceptable damage to me, no matter what the short-term perceived benefit.
A principled solution appeared to require a combination of (at least) making the P in the REPL loop hookable, and making the builtin container types pass on whichever of {str, repr} they were passed *to*; the latter is problematic when the containee is a string, though, because str(string) produces a string without delimiters to say "hey, I'm a string!", making the output pretty unreadable in the context of the containee; further fiddling of some sort is needed.
if-the-current-repr-didn't-exist-we'd-have-to-reinvent-it-and-we-still-wouldn't-want-to-invoke-either-repr-much-of-the-time-anyway-ly y'rs - tim
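A small sketch of the two points above, for readers on a current Python 3. It is an illustration under stated assumptions, not code from the thread: the Goofy class and the printable_ascii() helper are hypothetical names, and the builtin ascii() stands in for the repr() of 2000 (it emits hex escapes such as \xe7 where Python 1.5.2 emitted octal ones such as \347).

```python
class Goofy:
    """Hypothetical class with distinct str() and repr() conversions."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        # Human-oriented form: the raw text, non-ASCII characters and all.
        return self.text

    def __repr__(self):
        # ascii() is an assumed stand-in for the old escaping repr().
        return "Goofy(%s)" % ascii(self.text)


def printable_ascii(s):
    # The contract described above: every character has ord(c) in range(32, 128).
    return all(ord(c) in range(32, 128) for c in s)


item = Goofy("Fran\u00e7ois")

as_str = str(item)       # uses __str__
in_list = str([item])    # the list applies repr() to its item, even under str()

print(as_str)
print(in_list)
print(printable_ascii(as_str), printable_ascii(in_list))
```

The container output stays inside printable ASCII precisely because the item's repr() was used, which is the trade-off being argued over: the portable form is also the one a reader of the list did not ask for.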

On Sun, 30 Jul 2000, Tim Peters identified loads of problems with Python, some of which I wish to address:
A principled solution appeared to require a combination of (at least) making the P in the REPL loop hookable
Oh yes. I think I want to PEP on this one. Barry, that's two numbers you owe me.
still-committed-to-dumping-work-on-barry-ly y'rs, Z.
--
Moshe Zadka <moshez@math.huji.ac.il>
There is no IGLU cabal.
http://advogato.org/person/moshez
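For illustration only: Pythons released after this thread expose sys.displayhook, which the interactive loop calls to print each expression result. The sketch below assumes that hook. The names format_result and str_displayhook are hypothetical, and nothing here is claimed to match the lost sys.display patch.

```python
import builtins
import sys


def format_result(value):
    # Hypothetical policy: show results the way print would, via str().
    return str(value)


def str_displayhook(value):
    # Follow the default hook's conventions: skip None, remember the value in "_".
    if value is None:
        return
    sys.stdout.write(format_result(value) + "\n")
    builtins._ = value


# Install the hook; the interactive loop calls it for each expression result.
sys.displayhook = str_displayhook
```

As noted earlier in the thread, a hook like this only affects values echoed at the prompt. An explicit print of a list never goes through it, so the container problem remains.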

Tim Peters:
The problem isn't that repr sticks in backslash escapes, the problem is that repr gets called when repr is inappropriate.
Seems like we need another function that does something in between str() and repr(). It would be just like repr() except that it wouldn't put escape sequences in strings unless absolutely necessary, and it would apply this recursively to sub-objects. Not sure what to call it -- goofy() perhaps :-)
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+
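A rough sketch of the kind of function Greg describes, under assumptions that are not in the thread: Python 3, only str, list, tuple and dict are special-cased, and "absolutely necessary" is taken to mean the backslash, the quote character, and anything str.isprintable() rejects. The name goofy() is Greg's own tongue-in-cheek suggestion; the implementation is hypothetical.

```python
def goofy(obj):
    """Repr-like conversion that escapes only what it must (hypothetical)."""
    if isinstance(obj, str):
        pieces = []
        for ch in obj:
            if ch in ("\\", "'"):
                # Escape the delimiter and the escape character themselves.
                pieces.append("\\" + ch)
            elif ch.isprintable():
                # Printable characters, ASCII or not, pass through untouched.
                pieces.append(ch)
            else:
                # Control characters and the like still get a backslash escape.
                pieces.append(ch.encode("unicode_escape").decode("ascii"))
        return "'" + "".join(pieces) + "'"
    if isinstance(obj, list):
        return "[" + ", ".join(goofy(x) for x in obj) + "]"
    if isinstance(obj, tuple):
        inner = ", ".join(goofy(x) for x in obj)
        return "(" + inner + ("," if len(obj) == 1 else "") + ")"
    if isinstance(obj, dict):
        items = (goofy(k) + ": " + goofy(v) for k, v in obj.items())
        return "{" + ", ".join(items) + "}"
    # Anything else falls back to the ordinary repr().
    return repr(obj)


print(goofy(["Fran\u00e7ois", "a\tb", ("x",), {"k": 1}]))
```

Strings keep their quotes, so a string inside a container is still recognizable as a string, which is the difficulty Tim raised with simply passing str() down to containees.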

[Barry A. Warsaw]
I'd bet most people don't even understand why there has to be two functions that do almost the same thing.
Indeed they do not. The docs are too vague about the intended differences between str and repr; in 1.5.2 and earlier, string was just about the only builtin type that actually had distinct str() and repr() implementations, so it was easy to believe that strings were somehow a special case with unique behavior; 1.6 extends that (just) to floats, where repr(float) now displays enough digits so that the output can be faithfully converted back to the float you started with.
This is starting to bother people in the way that distinct __str__ and __repr__ functions have long frustrated me in my own classes: the default (repr) at the prompt leads to bloated output that's almost always not what I want to see. Picture repr() applied to a matrix object! If it meets the goal of producing a string sufficient to reproduce the object when eval'ed, it may spray megabytes of string at the prompt. Many classes implement __repr__ to do what __str__ was intended to do as a result, just to get bearable at-the-prompt behavior. So "learn by example" too often teaches the wrong lesson too.
I'm not surprised that users are confused! Python is *unusual* in trying to cater to more than one form of to-string conversion across the board. It's a mondo cool idea that hasn't got the praise it deserves, but perhaps that's just because the current implementation doesn't really work outside the combo of the builtin types + plain-ASCII strings. Unescaping locale printables in repr() is the wrong solution to a small corner of the right problem.
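A quick way to see the float round-trip property mentioned above, sketched for a current Python 3. The 12-digit format is an assumed stand-in for a shorter, str()-style display; in current Python 3 releases str() and repr() of a float agree, so the contrast has to be spelled out with explicit format strings.

```python
x = 0.1 + 0.2

short = "%.12g" % x   # assumed stand-in for a shorter, human-friendly display
full = "%.17g" % x    # 17 significant digits are enough for any IEEE 754 double

# Does converting the text back give exactly the float we started with?
short_round_trips = float(short) == x
full_round_trips = float(full) == x
repr_round_trips = float(repr(x)) == x

print(short, short_round_trips)
print(full, full_round_trips)
print(repr(x), repr_round_trips)
```

The shorter form is nicer to read and loses information; the longer form is faithful and noisier. That is the same tension as with strings, only with digits in place of escapes.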

[Greg Ewing]
In the previous incarnation of this debate, a related (more str-like than repr-like) intermediate was named ssctsoos(). Meaning, of course <wink>, "str() special casing the snot out of strings". It was purely a hack, and I was too busy working at Dragon at the time to give it the thought it needed. Now I'm too busy working at PythonLabs <0.5 wink>. not-a-priority-ly y'rs - tim
participants (5)
- bwarsaw@beopen.com
- Greg Ewing
- Guido van Rossum
- Moshe Zadka
- Tim Peters