[ python-Bugs-1324237 ] ISO8859-9 broken
SourceForge.net
noreply at sourceforge.net
Fri Oct 21 16:18:36 CEST 2005
Bugs item #1324237, was opened at 2005-10-11 23:35
Message generated for change (Comment added) made by lemburg
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1324237&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Unicode
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Eray Ozkural (exa)
Assigned to: M.-A. Lemburg (lemburg)
Summary: ISO8859-9 broken
Initial Comment:
Probably not limited to ISO8859-9.
The problem is that the encodings returned by getlocale()
and getpreferredencoding() are not guaranteed to work
with, say, the encode method of a string.
I'm on MDK10.2 and I switch to the Turkish locale:
>>> locale.setlocale(locale.LC_ALL, '')
'tr_TR'
There is nothing in sys.stdout.encoding!
>>> sys.stdout.encoding
>>>
So I take a look at the encoding:
>>> locale.getlocale()
['tr_TR', 'ISO8859-9']
>>> locale.getpreferredencoding()
'ISO-8859-9'
Too bad I cannot use either encoding to encode innocent
unicode strings
>>> a = unicode('André','latin-1')
>>> print a.encode(locale.getpreferredencoding())
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
LookupError: unknown encoding: ISO-8859-9
>>> print a.encode(locale.getlocale()[1])
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
LookupError: unknown encoding: ISO8859-9
So I take a look at the Python page and I see that all encoding
names are in lowercase. That's no good, because:
>>> locale.getpreferredencoding().lower()
'\xfdso-8859-9'
(see bug 1193061)
So I have to do this by hand! But of course this is
unacceptable for any locale aware application.
>>> print a.encode('iso-8859-9')
André
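A minimal sketch of the by-hand workaround, assuming the only problem is that lower() follows the active locale (under tr_TR, 'I' is not mapped to 'i'). The helper name ascii_lower is hypothetical and not part of the standard library; it lowercases the ASCII letters A-Z only and leaves every other character untouched, so the result does not depend on setlocale().

```python
def ascii_lower(name):
    """Lowercase ASCII letters only, independent of the current locale.

    Hypothetical helper for illustration; not a standard library function.
    """
    out = []
    for ch in name:
        if 'A' <= ch <= 'Z':
            # Shift A-Z onto a-z by the fixed ASCII offset.
            out.append(chr(ord(ch) + 32))
        else:
            out.append(ch)
    return ''.join(out)


print(ascii_lower('ISO-8859-9'))
```

The result could then be passed to encode() in place of the raw value from locale.getpreferredencoding(). Whether that is enough on an affected system is not something this sketch can confirm.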
Expected:
1. I expect the encoding string returned by
getpreferredencoding and getlocale to be *identical*
2. I expect the encoding string returned to *work* with
encode method and in general *any* function that accepts
locales.
Got:
1. Different, ad hoc strings
2. Not all aliases present, only lowercase names present, no
reliable way to find a canonical locale name.
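One way to compare the two spellings above is to ask the codec registry which codec each one resolves to. This is only a sketch under assumptions: the helper name canonical_codec_name is hypothetical, and the .name attribute on the object returned by codecs.lookup() exists in Python 2.5 and later, not in the 2.4 release this report was filed against.

```python
import codecs


def canonical_codec_name(name):
    """Return the registry's name for an encoding, or None if unknown.

    Hypothetical helper; relies on CodecInfo.name (Python 2.5 and later).
    """
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


for spelling in ('ISO-8859-9', 'ISO8859-9', 'iso-8859-9'):
    print("%s -> %s" % (spelling, canonical_codec_name(spelling)))
```

On an installation where the codec can be found, all three spellings should resolve to the same name. A None result points at the lookup failure described in this report.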
Recommendations:
a. Please consider the Java-like solution to make Locale
into a class or an enum, something reliable, rather than
just a string.
b. Please test the locale functions in locales other than
US (that is not really a locale anyway)
----------------------------------------------------------------------
>Comment By: M.-A. Lemburg (lemburg)
Date: 2005-10-21 16:18
Message:
Logged In: YES
user_id=38388
Something in your installation must be broken: it seems the
system cannot find the ISO-8859-9 codec.
Note that the .encode() method uses the codec registry for
the lookup of the codec. The lookup itself is
case-insensitive and subject to a few other normalizations
(see encodings/__init__.py).
Please check your system and then report back whether you
still see the reported error.
Thanks.
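A quick way to run the check requested above might look like the following sketch. It goes through codecs.lookup(), the public entry point to the codec registry; the helper name codec_available is hypothetical.

```python
import codecs
import locale


def codec_available(name):
    """Return True if the codec registry can resolve this encoding name.

    Hypothetical helper for illustration only.
    """
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


# Check the spellings from the report plus whatever the current locale reports.
names = ['ISO-8859-9', 'ISO8859-9', locale.getpreferredencoding()]
for name in names:
    print("%s -> %s" % (name, codec_available(name)))
```

If any of these print False on the affected machine, that would support the broken-installation theory. If they all print True while encode() still fails, the problem lies elsewhere.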
----------------------------------------------------------------------
Comment By: M.-A. Lemburg (lemburg)
Date: 2005-10-21 16:12
Message:
Logged In: YES
user_id=38388
Something in your installation must be broken: it seems the
system cannot find the ISO-8859-9 codec. Note that the
.encode() method uses the codec registry for the lookup of
the codec. The lookup itself is case-insensitive and
subject to a few other normalizations (see
encodings/__init__.py).
Please check your system and then report back whether you
still see the reported error.
Thanks.
----------------------------------------------------------------------
Comment By: Eray Ozkural (exa)
Date: 2005-10-11 23:46
Message:
Logged In: YES
user_id=1454
BTW, I put this into Unicode category, because the bugs in it
seemed relevant to localization. Thank you very much for your
consideration.
----------------------------------------------------------------------