[Patches] [ python-Patches-670715 ] Universal Unicode Codec for POSIX iconv

SourceForge.net noreply@sourceforge.net
Tue, 21 Jan 2003 16:21:45 -0800


Patches item #670715, was opened at 2003-01-20 01:51
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=670715&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Hye-Shik Chang (perky)
Assigned to: Martin v. Löwis (loewis)
Summary: Universal Unicode Codec for POSIX iconv

Initial Comment:
Here's the unicode codec using POSIX iconv(3) library.

Tested on these platforms and seems to work:
  FreeBSD/i386, FreeBSD/alpha, FreeBSD/ia64,
  FreeBSD/sparc64, MacOS X/ppc, HP-UX/pa-risc2

This codec implementation supports PEP293, also.


----------------------------------------------------------------------

>Comment By: Hye-Shik Chang (perky)
Date: 2003-01-22 09:21

Message:
Logged In: YES 
user_id=55188

iconv implementations on commercial UNIXes are very scary. 

Solaris implementation:
  no support for CJK <-> UCS conversion.
  They support UCS[24] only for iso-8859 and UTFs.

HP-UX implementation:
  They have useless iconv. HP-UX iconv has no unicode support.

BSD implementation (Konstantin Chuguev's):
  compatible with this patch (provides ucs-[24]-internal)

GLIBC implementation:
  provides ucs-[24] and they are same with GNU iconv's
ucs-[24]-internal.
  Because ucs-[24] of GNU/BSD implentation is big endian
always. We can't use ucs-[24] for every platform.

In conclusion, we must use 3rd party iconv on Solaris or HP-UX.
And, we need to detect whether the linked iconv is GNU/BSD
iconv or GLIBC iconv. (how?)

I'll investigate how to detect them, but ... :)

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-01-22 07:27

Message:
Logged In: YES 
user_id=21627

I'm quite happy with this patch, and will apply it shortly.
However, I am concerned that it is specific for GNU iconv.
IMO, there should be machinery to find out the "internal"
encoding, in case native the native iconv implementation is
used instead of GNU iconv.

----------------------------------------------------------------------

Comment By: Hye-Shik Chang (perky)
Date: 2003-01-20 22:33

Message:
Logged In: YES 
user_id=55188

Thank you for comments. :->
I uploaded a new revised patch with unittest and some code
style fixes.

I saw Martin v. Loewis's iconvcodecs about a years ago.
His implementation is very neat, but it had a limit on error
handling due to recursive call.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-01-20 07:56

Message:
Logged In: YES 
user_id=38388

The patch looks good, but you'll need to add some form
of testing to underline the "seems to work" :-)

Some docs on how to use the codec would also be needed.

Martin von Loewis has written a similar codec some months ago.
Perhaps you two could get in touch and sort out the details ?!

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=670715&group_id=5470