[Tutor] Python and unicode
Ferry Dave Jäckel
dave.jaeckel at arcor.de
Fri Mar 10 08:55:35 CET 2006
Hello list,
I am trying hard to understand Python's unicode support, but I don't really
get it.
Here is what I thought about this until yesterday :)
If I write my script in a unicode encoding and put the magic
# -*- coding: utf-8 -*- at its start, I can just use unicode everywhere
without problems. When reading strings in different encodings, I have to
decode them, specifying their source encoding, and when writing them in a
different encoding I have to encode them, giving the target encoding.
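
The decode-on-input, encode-on-output idea described above might look
roughly like the following. This is only a sketch: the sample bytes and the
choice of Latin-1 as the source encoding are made up for illustration, and
the b"..." literals need Python 2.6 or later (on 2.4 a plain string literal
would play the same role).

```python
# -*- coding: utf-8 -*-

# Bytes as they might arrive from a Latin-1 encoded source
# (made-up sample data, not taken from the original post).
raw_latin1 = b"J\xe4ckel"

# Decode on input: bytes -> unicode text, naming the source encoding.
text = raw_latin1.decode("latin-1")

# Work with the unicode text inside the program.
greeting = u"Hello " + text

# Encode on output: unicode text -> bytes, naming the target encoding.
raw_utf8 = greeting.encode("utf-8")
```

Note that the coding line only tells Python how the source file itself is
encoded; it does not change how strings are encoded when they are read or
written at run time.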
But I have problems printing my strings with print >> sys.stderr,
mystring. I get "ASCII codec encoding errors". I'm on Linux with Python 2.4.
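
One likely cause of such errors is that a unicode string gets encoded with
the ASCII codec when it is written to a byte stream that has no encoding
set. The sketch below reproduces the failure explicitly and then shows one
possible workaround, wrapping the stream in a UTF-8 writer. The BytesIO
object is only a stand-in for a byte stream such as Python 2's sys.stderr,
and the sample text is made up.

```python
import codecs
import io

name = u"J\xe4ckel"  # sample non-ASCII text (illustrative)

# Encoding with the ASCII codec fails for non-ASCII characters,
# which is the kind of error reported above.
try:
    name.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Stand-in for a byte stream (assumption for illustration only).
byte_stream = io.BytesIO()

# Wrap the byte stream so unicode text is encoded as UTF-8 on write.
utf8_writer = codecs.getwriter("utf-8")(byte_stream)
utf8_writer.write(name + u"\n")

written = byte_stream.getvalue()
```

On Python 2 the same wrapper could presumably be applied to sys.stderr
itself, or the string could be encoded by hand before printing, as in
print >> sys.stderr, mystring.encode("utf-8"), assuming the terminal
expects utf-8.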
Here is the programming problem where I stumbled over this:
I have an XML file from OO.org Writer (encoded in utf-8), and I parse it
with sax, getting some values from it. This data should go into a MySQL db
(as utf-8, too). I think this works quite well, but debug printing gives
these errors.
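
A minimal sketch of the parsing step described above, assuming a tiny
made-up XML document rather than a real OO.org Writer file. The
TextCollector class and the variable names are hypothetical. The sax parser
hands character data to the handler as unicode text, so the value only
needs to be encoded again when it leaves the program, for example if the
database driver expects utf-8 bytes.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, content):
        # May be called several times for one text node, so collect pieces.
        self.chunks.append(content)


# Made-up utf-8 encoded input standing in for the real file contents.
xml_bytes = u'<?xml version="1.0" encoding="utf-8"?><p>J\xe4ckel</p>'.encode("utf-8")

handler = TextCollector()
xml.sax.parseString(xml_bytes, handler)

# Unicode text as delivered by the parser.
value = u"".join(handler.chunks)

# Encode only at the output boundary (here: utf-8 bytes for storage).
db_bytes = value.encode("utf-8")
```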
What is the right way to handle unicode and maybe different encodings in
Python?
What encoding should be put into the header of the file, and when should I
use the string encode and decode methods? Are there modules (such as sax,
maybe) which require special treatment because they lack full unicode
support?
In general I'd like to keep all strings as unicode in utf-8, and just
convert strings from/to other encodings upon input/output.
Regards,
Dave
--
If you're using anything besides US-ASCII, I *stringly* suggest Python 2.0.
-- Uche Ogbuji (A fortuitous typo?), 29 Jan 2001