Totally confused by Python's string thing.
Doru-Catalin Togea
doru-cat at ifi.uio.no
Mon Dec 16 11:34:30 EST 2002
Hi!
I am doing basic string manipulation with ActivePython 2.2 on Win2000 Pro.
getdefaultlocale() returns: "('no_NO', 'cp1252')"
getlocale() returns: "['Norwegian_Norway', '1252']"
When trying
    locale.setlocale(locale.LC_ALL, 'latin-1')
I get
    locale.Error: locale setting not supported
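As far as I can tell, 'latin-1' is the name of a codec rather than of a locale, which may be why setlocale rejects it. A minimal sketch, assuming the goal is only to activate whatever locale the environment defines: an empty string asks for the user's default. The helper name set_default_locale is hypothetical.

```python
import locale

def set_default_locale():
    """Hypothetical helper: activate the environment's default locale.

    Returns the new locale string, or None if the platform rejects it.
    """
    try:
        # An empty string means "use the user's default settings".
        return locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        return None

current = set_default_locale()
```

The locale setting is a separate matter from which codec is used to convert between byte strings and Unicode strings.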
I am so totally confused, as calling doEncode, which is defined as
follows,
    def doEncode(str):
        strCopy = str.encode('latin-1')
        for tag in myTags:
            strCopy = string.replace(strCopy, tag[0], tag[1])
        return strCopy
crashes when 'str' contains Norwegian letters (åøæÅØÆ), with the following
error message:
...
File ... , line 53, in doEncode
strCopy = str.encode('latin-1')
UnicodeError: ASCII decoding error: ordinal not in range(128)
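The message mentions decoding although the code asks for encoding. A minimal sketch of one likely explanation, assuming 'str' is an 8-bit string: in Python 2, calling .encode() on an 8-bit string first converts it to Unicode with the default ASCII codec, and that hidden step is the one that fails. The sketch uses Python 3 syntax, where the two steps have to be written out; implicit_encode is a hypothetical helper, not part of any library.

```python
# The six letters from this post, written as escapes: åøæÅØÆ
NORWEGIAN = "\u00e5\u00f8\u00e6\u00c5\u00d8\u00c6"

def implicit_encode(raw, target="latin-1", default="ascii"):
    """Hypothetical helper mimicking .encode() on a Python 2 8-bit string:
    decode with the default codec first, then encode to the target."""
    return raw.decode(default).encode(target)

# Bytes as they would arrive from a Latin-1 source.
raw_bytes = NORWEGIAN.encode("latin-1")

try:
    implicit_encode(raw_bytes)
    failed = False
except UnicodeDecodeError:
    # Every one of these bytes is above 127, so ASCII cannot decode them.
    failed = True

# Naming the real source encoding lets the round trip succeed.
round_trip = implicit_encode(raw_bytes, default="latin-1")
```

Under this assumption, decoding explicitly with the codec the data really uses, before doing anything else, avoids the implicit ASCII step.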
Can you help me understand how Python deals with strings?
1) According to
http://www.cl.cam.ac.uk/~mgk25/ucs/CP1252.html, code page 1252
extends ISO 8859-1. Now ISO 8859-1 already contains the Norwegian
characters, at least according to
http://www.ramsch.org/martin/uni/fmi-hp/iso8859-1.html
So what is my problem, actually?
2) How do I set up my system to deal correctly and robustly with the ISO
8859-1 character set? How about the ISO 8859-2 character set?
3) Is there any INTRODUCTORY documentation about Python's internal string
thing?
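Regarding questions 1) and 2), the overlap between the character sets can be checked directly. A small sketch in Python 3 syntax, assuming the six letters from this post as sample text: they encode to the same bytes in ISO 8859-1 and cp1252, while the same bytes read as ISO 8859-2 yield other characters.

```python
# The six letters from this post, written as escapes: åøæÅØÆ
NORWEGIAN = "\u00e5\u00f8\u00e6\u00c5\u00d8\u00c6"

latin1_bytes = NORWEGIAN.encode("latin-1")
cp1252_bytes = NORWEGIAN.encode("cp1252")

# cp1252 and ISO 8859-1 agree on these letters.
same_bytes = latin1_bytes == cp1252_bytes

# Interpreting the same bytes as ISO 8859-2 gives different characters,
# so the codec has to match the data actually being read.
as_latin2 = latin1_bytes.decode("iso8859-2")
differs_in_latin2 = as_latin2 != NORWEGIAN
```

This suggests that the character set itself is not the issue; what matters is naming the right codec whenever bytes are converted to or from Unicode.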
One last curiosity and its mandatory question:
4) What kind of string objects does PyXML employ, since I can parse XML
with Norwegian content and call doEncode on strings returned from my XML
file, without any Unicode crash?
Thank you, if you can help.
Catalin
<<<< ================================== >>>>
<< We are what we repeatedly do. >>
<< Excellence, therefore, is not an act >>
<< but a habit. >>
<<<< ================================== >>>>