[Python-3000] Unicode and OS strings
Greg Ewing
greg.ewing at canterbury.ac.nz
Fri Sep 14 07:08:04 CEST 2007
Stephen J. Turnbull wrote:
> You can't win that, because Unicode is the only encoding that attempts
> to guarantee even the possibility of round-tripping.
Rubbish -- I can do print [ord(c) for c in my_unicode_string]
and get perfect round-trippability if I want.
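To make the code-point listing above concrete, here is a minimal sketch in Python 3 syntax. The helper names `to_code_points` and `from_code_points` and the sample string are illustrative assumptions, not anything from the thread.

```python
# Sketch: round-trip a string through a plain list of integer code points.
# Helper names and the sample text are hypothetical, chosen for illustration.

def to_code_points(text):
    """Return the code point of each character as a list of ints."""
    return [ord(c) for c in text]

def from_code_points(points):
    """Rebuild the string from a list of integer code points."""
    return "".join(chr(n) for n in points)

sample = "na\u00efve \u2603"          # contains non-ASCII characters
points = to_code_points(sample)       # a home-made "encoding"
restored = from_code_points(points)   # decodes back to the same string

print(points)
print(restored == sample)
```

The list of integers is not an officially sanctioned encoding, but it loses no information, which is the point being made.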
You can ask people to use pre-existing officially-sanctioned
encodings for their unicode data, but you can't force them to.
> The main problem with this scheme that I know of is that if you have a
> Python string that contains such a code point, you'll need to somehow
> include the information about the original encoding when pickling and
> the like.
That's exactly the sort of thing I'm talking about. It
would be surprising if pickling worked reliably for all
strings *except* ones that happened to come in as a
command line argument.
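As a rough illustration of the expectation described above, the following sketch checks that a string survives a pickle round trip. It assumes, purely for illustration, that an undecodable byte was represented by the private-use code point U+E0FF; the thread does not specify which code points such a scheme would use, and `survives_pickling` is a hypothetical helper.

```python
import pickle

# Hypothetical inputs: one ordinary string and one carrying a private-use
# code point (U+E0FF) standing in for a byte that could not be decoded.
ordinary = "plain argument"
marked = "file\ue0ffname"

def survives_pickling(text):
    """Return True if pickling and unpickling gives back an equal string."""
    return pickle.loads(pickle.dumps(text)) == text

print(survives_pickling(ordinary))
print(survives_pickling(marked))
```

Note that pickle stores only the code points. Any record of which encoding the marker came from would have to be carried separately, which is the concern raised in the quoted text.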
--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury, | Carpe post meridiem! |
Christchurch, New Zealand | (I'm not a morning person.) |
greg.ewing at canterbury.ac.nz +--------------------------------------+