[Python-Dev] repr vs. str and locales again
M.-A. Lemburg
mal@lemburg.com
Fri, 19 May 2000 14:30:08 +0200
Guido van Rossum wrote:
>
> The email below suggests a simple solution to a problem that
> e.g. François Pinard brought up long ago; repr() of a string turns
> all non-ASCII chars into \oct escapes. Jyrki's solution: use
> isprint(), which makes it locale-dependent. I can live with this.
>
> It needs a Py_CHARMASK() call but otherwise seems to be fine.
>
> Anybody got an opinion on this? I'm +0. I would even be +0 on a
> similar patch for unicode strings (once the ASCII proposal is
> implemented).
The subject line is a bit misleading: the patch only touches
tp_print, not repr() output. And this is good, IMHO, since
otherwise eval(repr(string)) wouldn't necessarily result
in string.
Unicode objects don't implement a tp_print slot... perhaps
they should?
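
To illustrate the round-trip concern above, here is a minimal sketch.
It assumes a present-day Python 3 interpreter, not the interpreter
under discussion in this thread, and the helper name roundtrips() is
made up for the example. It shows that as long as repr() escapes
non-ASCII bytes independently of the locale, eval(repr(x)) gives back
x.

```python
# Sketch only: runs on present-day Python 3, not the 2000-era interpreter.

def roundtrips(value):
    """Return True if evaluating repr(value) reproduces value."""
    return eval(repr(value)) == value

# A Latin-1 encoded byte string containing one non-ASCII byte (0xE7).
latin1_bytes = "Fran\u00e7ois".encode("latin-1")

# repr() escapes the non-ASCII byte, whatever the current locale is.
escaped = repr(latin1_bytes)
print(escaped)

# Because the escaped form is plain ASCII, it evaluates back to the
# same byte string.
print(roundtrips(latin1_bytes))

# ascii() gives the same kind of locale-independent form for text.
print(eval(ascii("Fran\u00e7ois")) == "Fran\u00e7ois")
```

If the escaping depended on isprint() and the current locale, the
output of repr() could differ between environments, which is why
keeping the change confined to tp_print looks like the safer choice.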
--
About the ASCII proposal:
Would you be satisfied with what

    import sys
    sys.set_string_encoding('ascii')

currently implements?
There are several places where an encoding comes into play in
the Unicode implementation. The above API currently changes the
results of str(unicode) and print unicode, as well as the encoding
the implementation assumes when coercing strings to Unicode.
It does not change the encoding used to implement the "s"
or "t" parser markers, nor does it change the way the
Unicode hash value is computed (both are currently still
hard-coded as UTF-8).
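
As a rough illustration of what an ASCII default means for conversion
and coercion, here is a sketch. It is written for present-day
Python 3, where sys.set_string_encoding() is not available, so it
passes the codec explicitly as a stand-in for the default. The helper
names to_bytes() and coerce_to_text() are invented for the example.

```python
# Sketch only: explicit codec arguments stand in for the interpreter-wide
# default encoding discussed in the message.

def to_bytes(text, encoding):
    """Encode text, returning None if the codec cannot represent it."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

def coerce_to_text(data, encoding):
    """Decode a byte string, returning None if the codec rejects it."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

text = "Fran\u00e7ois"

# With ASCII as the assumed encoding, non-ASCII text cannot be converted.
ascii_result = to_bytes(text, "ascii")

# With UTF-8, the same text converts to a multi-byte sequence.
utf8_result = to_bytes(text, "utf-8")

# Coercing those UTF-8 bytes back fails under an ASCII assumption
# and succeeds under a UTF-8 assumption.
coerced_ascii = coerce_to_text(utf8_result, "ascii")
coerced_utf8 = coerce_to_text(utf8_result, "utf-8")

print(ascii_result, utf8_result, coerced_ascii, coerced_utf8)
```

Pure ASCII data behaves the same under both assumptions; only
strings with non-ASCII characters expose the difference.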
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/