[Python-Dev] Re: [Python-checkins] python/dist/src/Include unicodeobject.h, 2.42, 2.43

Hye-Shik Chang perky at i18n.org
Wed Jun 2 13:31:49 EDT 2004

On Thu, Jun 03, 2004 at 02:22:50AM +0900, Hye-Shik Chang wrote:
> On Wed, Jun 02, 2004 at 12:01:39PM -0500, Skip Montanaro wrote:
> > 
> >     perky> - SF #962502: Add two more methods for unicode type; width() and
> >     perky> iswide() for east asian width manipulation.
> > 
> > Should strings grow these methods as well for symmetry?
> > 
> I think there'll be two possible behaviors for strings:
> 1) regard all characters as non-wide.
> 2) decode the string to unicode with the system default encoding
>    and call its methods.


3) leave them available on the unicode type only, because east
   asian width is a concept that comes from Unicode.  POSIX likewise
   supports width manipulation (wcswidth, wcwidth) for wide
   characters only.
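
To make the idea of "width" concrete, here is a minimal sketch in
Python.  It does not use the width() and iswide() methods from the
checkin; as a stand-in it uses unicodedata.east_asian_width(), and the
helper names display_width and is_wide are hypothetical.  It assumes
that Wide (W) and Fullwidth (F) characters take two columns and that
everything else takes one, which ignores combining characters and
ambiguous-width characters.

```python
import unicodedata


def display_width(text):
    """Approximate column count: 2 for Wide/Fullwidth characters, 1 otherwise.

    Simplification: combining and ambiguous-width characters count as 1.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def is_wide(text):
    """True if text is non-empty and every character is Wide or Fullwidth."""
    return bool(text) and all(
        unicodedata.east_asian_width(ch) in ("W", "F") for ch in text
    )


hangul = "\ud55c\uae00"  # two Hangul syllables
print(display_width("abc"), display_width(hangul), is_wide(hangul))
```

The property lookup is defined per Unicode character, which is why it
has no natural meaning for a byte string of unknown encoding.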

> 1) is simple and cheap and can work for non-unicode builds. It
> even works nicely for most east asian encodings, too. (The only
> encodings in which len() and screen width differ are euc-jp,
> euc-tw and gb18030, but they aren't such major encodings in real life.)
> 2) is somewhat expensive and will not work in many CJK environments,
> because a major portion of their users aren't aware of
> sys.setdefaultencoding() and how to play with it.  But it would be
> more accurate, and it works for all encodings that have a unicode
> codec in Python, including the iso-2022 instances.
> I haven't made up my mind between these two yet.  What do you think?
> Hye-Shik
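
The difference between behaviors 1) and 2) can be sketched as below.
This is only an illustration under stated assumptions: the function
names width_as_bytes and width_via_decode are hypothetical, the
encoding is passed explicitly instead of coming from the system
default encoding, and width is approximated with
unicodedata.east_asian_width() (two columns for W and F, one
otherwise).

```python
import unicodedata


def width_as_bytes(data):
    """Behavior 1: treat every byte as one non-wide column."""
    return len(data)


def width_via_decode(data, encoding):
    """Behavior 2: decode first, then sum per-character widths."""
    text = data.decode(encoding)
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


# Two Hangul syllables in euc-kr: two bytes each, two columns each.
hangul = "\ud55c\uae00".encode("euc_kr")

# One halfwidth katakana in euc-jp: two bytes, but only one column.
katakana = "\uff76".encode("euc_jp")

print(width_as_bytes(hangul), width_via_decode(hangul, "euc_kr"))
print(width_as_bytes(katakana), width_via_decode(katakana, "euc_jp"))
```

For the euc-kr string both behaviors agree, while the halfwidth
katakana in euc-jp is a case where len() and screen width differ.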