[XML-SIG] [OT] working unicode strings
M.-A. Lemburg
mal@lemburg.com
Thu, 28 Jun 2001 18:47:06 +0200
Alexandre Fayolle wrote:
>
> Ahem, sorry for posting this here, since it is slightly off-topic, but
> this list is a place where I know I'll find people who have played with
> unicode strings in Python.
>
> So, I have a UTF-8 encoded unicode string that I got by reading a text
> node in a DOM. I want to write it somewhere on the disk.
>
> s= u'été'
> f=open('/tmp/foo','w')
> f.write(s)
>
> This gets me
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> UnicodeError: ASCII encoding error: ordinal not in range(128)
>
> By experimenting a bit, it appears that I can call f.write(s) when f is a
> StringIO.StringIO, but not when it is a cStringIO.StringIO, a file, or a
> standard stream (sys.stdout). Needless to say, I'm quite disappointed.
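Python's I/O will try to write your Unicode string as an 8-bit string,
and to do so it first converts it to the default encoding, which is
ASCII. Since 'é' lies outside the ASCII range, that conversion fails
with the UnicodeError you are seeing.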
f.write(s.encode('latin-1'))
should get you the results you are probably looking for.
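As a rough sketch of that explicit encode step: the scratch path below is an assumption standing in for /tmp/foo, the string is spelled with escapes so the snippet does not depend on the source file's encoding, and the file is opened in binary mode so the same code also runs on later Python versions.

```python
import os
import tempfile

# Sample text; the escapes spell u'été'.
s = u'\xe9t\xe9'

# Hypothetical scratch file standing in for /tmp/foo.
path = os.path.join(tempfile.mkdtemp(), 'foo')

# Encode explicitly, then write the resulting 8-bit string.
data = s.encode('latin-1')
f = open(path, 'wb')
try:
    f.write(data)
finally:
    f.close()

# Latin-1 uses one byte per character for this text.
size = os.path.getsize(path)
```

Latin-1 only covers Western European characters; for arbitrary text an encoding such as UTF-8 is the safer choice.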
Alternatively, open the file using codecs.open() -- it lets
you specify the encoding of the file and does the conversion
for you.
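A minimal sketch of the codecs.open() route, assuming UTF-8 as the file encoding (any codec name Python knows would do) and a hypothetical scratch path in place of /tmp/foo:

```python
import codecs
import os
import tempfile

s = u'\xe9t\xe9'  # u'été', written with escapes

# Hypothetical scratch file standing in for /tmp/foo.
path = os.path.join(tempfile.mkdtemp(), 'foo')

# The codec-aware file object encodes Unicode strings on write...
f = codecs.open(path, 'w', encoding='utf-8')
try:
    f.write(s)
finally:
    f.close()

# ...and decodes the bytes back to a Unicode string on read.
f = codecs.open(path, 'r', encoding='utf-8')
try:
    roundtrip = f.read()
finally:
    f.close()
```

The calling code only ever handles Unicode strings; the encoding is named once, where the file is opened.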
> So the first question is, how do you manage to work with unicode strings
> in PyXML/4Suite (a pointer to the right source file is a valuable answer).
I'll punt on this one...
> The second question is, where is the right place to discuss these issues ?
Sure.
> The third question is, does anyone know if this state of things is likely
> to change in python 2.2 ?
Things always change from one Python release to the next ;-)
Seriously, there are no plans for changing this behaviour;
otherwise Python would have to guess your encoding in some
way (one way is by looking at your locale settings;
see site.py for details).
IMO, explicit is better than implicit. So the codecs.open()
approach should be preferred.
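To illustrate why guessing is fragile, here is a sketch of what a locale-based guess could look like. The helper name guess_encoding is hypothetical, and this is not what site.py does verbatim; it only shows that the bytes written would depend on the machine's settings.

```python
import locale

def guess_encoding():
    # Hypothetical helper: ask the locale for its preferred encoding,
    # falling back to ASCII when the locale does not report one.
    return locale.getpreferredencoding(False) or 'ascii'

s = u'\xe9t\xe9'  # u'été'
guessed = guess_encoding()

# errors='replace' keeps the sketch from raising when the guessed
# encoding cannot represent the accented characters.
data = s.encode(guessed, 'replace')
```

The same string can come out as different bytes on different machines, which is the argument for naming the encoding explicitly.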
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/