[ python-Feature Requests-1001895 ] Adding missing ISO 8859 codecs,
especially Thai
SourceForge.net
noreply at sourceforge.net
Thu Aug 5 13:41:15 CEST 2004
Feature Requests item #1001895, was opened at 2004-08-02 11:48
Message generated for change (Comment added) made by loewis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1001895&group_id=5470
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Peter Jacobi (peter_jacobi)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Adding missing ISO 8859 codecs, especially Thai
Initial Comment:
As the missing ISO 8859 codecs (11: Thai, 16: Romanian)
can be generated automatically from the Unicode
mapping files (via gencodec.py), I'd like to ask for
their inclusion in the next version.
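
For readers unfamiliar with the mapping files: each data line gives a byte value and a Unicode code point in hex, followed by a comment. The following is only a minimal sketch of reading that format, not the gencodec.py implementation; `parse_mapping` and `SAMPLE` are hypothetical names, and the sample text is a short illustrative excerpt in the style of 8859-11.TXT.

```python
def parse_mapping(text):
    """Return {byte value: Unicode code point} parsed from mapping-file text."""
    table = {}
    for line in text.splitlines():
        # Everything after '#' is a comment (character name, boilerplate).
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # byte listed without a code point: treat as undefined
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table


# Illustrative excerpt only (assumption: same column layout as 8859-11.TXT).
SAMPLE = (
    "# sample lines in the style of a Unicode mapping file\n"
    "0xA0\t0x00A0\t#\tNO-BREAK SPACE\n"
    "0xA1\t0x0E01\t#\tTHAI CHARACTER KO KAI\n"
)

table = parse_mapping(SAMPLE)
for byte, code_point in sorted(table.items()):
    print(f"0x{byte:02X} -> U+{code_point:04X}")
```

A table built this way could also be compared byte by byte against an existing codec to check that the two agree.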
----------------------------------------------------------------------
>Comment By: Martin v. Löwis (loewis)
Date: 2004-08-05 13:41
Message:
Logged In: YES
user_id=21627
The unfortunate problem is that ISO-8859-11 is not an
IANA-registered character set. For ISO-8859-16,
http://www.iana.org/assignments/character-sets
lists:
Name: ISO-8859-16
MIBenum: 112
Source: ISO
Alias: iso-ir-226
Alias: ISO_8859-16:2001
Alias: ISO_8859-16
Alias: latin10
Alias: l10
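
As a rough illustration of what such aliases mean on the Python side, the sketch below asks the codec registry what each name resolves to. It assumes a Python release that already ships an ISO-8859-16 codec; `resolve` is a hypothetical helper, and which names actually resolve depends on the installed version.

```python
import codecs


def resolve(alias):
    """Return the registered codec name for alias, or None if it is unknown."""
    try:
        return codecs.lookup(alias).name
    except LookupError:
        return None


# Names taken from the IANA entry quoted above.
for alias in ("ISO-8859-16", "iso-ir-226", "ISO_8859-16", "latin10", "l10"):
    print(alias, "->", resolve(alias))
```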
I believe ISO-8859-11 does not have any aliases. Some people
may claim TIS-620 is an alias, but it is not (as it does not
contain \xa0).
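
The \xa0 difference can be checked directly, assuming a Python release that ships both an `iso8859_11` and a `tis_620` codec (neither is a given at the time of this report). This is only a sketch; `decode_byte` is a hypothetical helper.

```python
def decode_byte(value, codec):
    """Decode a single byte with codec; return None if the byte is undefined."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None


# Bytes on which the two codecs disagree (including defined vs. undefined).
differences = [
    b for b in range(256)
    if decode_byte(b, "iso8859_11") != decode_byte(b, "tis_620")
]

print("0xA0 in ISO-8859-11:", repr(decode_byte(0xA0, "iso8859_11")))
print("0xA0 in TIS-620:    ", repr(decode_byte(0xA0, "tis_620")))
print("differing bytes:", [hex(b) for b in differences])
```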
----------------------------------------------------------------------
Comment By: M.-A. Lemburg (lemburg)
Date: 2004-08-05 13:15
Message:
Logged In: YES
user_id=38388
Thank you.
Please also provide suitable aliases (I couldn't find any on
the IANA site), then I'll add them to Python 2.4.
----------------------------------------------------------------------
Comment By: Peter Jacobi (peter_jacobi)
Date: 2004-08-04 00:58
Message:
Logged In: YES
user_id=845149
Attached is the output of gencodec.py for ISO-8859-11 and
ISO-8859-16 and, for reference, the original mapping files.
Peter
----------------------------------------------------------------------
Comment By: M.-A. Lemburg (lemburg)
Date: 2004-08-03 16:34
Message:
Logged In: YES
user_id=38388
Peter, could you attach the generated codecs to this report?
Thanks.
----------------------------------------------------------------------
Comment By: M.-A. Lemburg (lemburg)
Date: 2004-08-02 13:14
Message:
Logged In: YES
user_id=38388
Martin, I think it's a good idea to add the codecs for
completeness.
We should probably also review the mapping files posted on
the unicode.org site every now and then and update the
codecs in Python accordingly. Sticking to the Unicode
Consortium's view of things is a good way to ensure
compatibility with other applications, IMO.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2004-08-02 12:30
Message:
Logged In: YES
user_id=21627
Marc-Andre, should we add these?
----------------------------------------------------------------------
Comment By: Peter Jacobi (peter_jacobi)
Date: 2004-08-02 12:16
Message:
Logged In: YES
user_id=845149
In a thread on news://comp.lang.python I was asked by
Martin v. Löwis to provide evidence on the correctness of the
ISO 8859-11 Unicode mapping file, as found on
ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-11.TXT
(due to the disclaimer boilerplate in these files).
So far I can provide these three points:
a) ISO 8859-n vs ISO-8859-n
If the information at
http://en.wikipedia.org/wiki/ISO_8859-1#ISO_8859-1_vs_ISO-8859-1
is correct, Python 8859-n
codecs do implement the ISO standard charsets ISO 8859-n
in the specialized IANA forms ISO-8859-n (and in agreement
with the Unicode mapping files). So any difficult C0/C1
wording in the original ISO standard can be disregarded.
b) libiconv ISO 8859-11
The implementation by Bruno Haible in libiconv does agree
with the Unicode mapping file:
http://cvs.sourceforge.net/viewcvs.py/libiconv/libiconv/lib/
c) IBM ICU4C
The implementation in ICU4C does agree with the Unicode
mapping file:
http://oss.software.ibm.com/cvs/icu/charset/data/ucm/
----------------------------------------------------------------------