[Python-Dev] Re: [Python-checkins] python/dist/src/Include unicodeobject.h, 2.42, 2.43

Hye-Shik Chang perky at i18n.org
Wed Jun 2 13:22:50 EDT 2004

On Wed, Jun 02, 2004 at 12:01:39PM -0500, Skip Montanaro wrote:
>     perky> - SF #962502: Add two more methods for unicode type; width() and
>     perky> iswide() for east asian width manipulation.
> Should strings grow these methods as well for symmetry?

I think there'll be two possible behaviors for strings:

1) regard all characters as non-wide.
2) decode the string to unicode with the system default encoding
   and call its methods.

1) is simple and cheap, and it works for non-unicode builds. It even
works nicely for most east asian encodings. (The only encodings in
which len() and screen width differ are euc-jp, euc-tw and gb18030,
and they aren't such major encodings in real life.)

2) is somewhat expensive and will not work in many CJK environments,
because a major portion of their users aren't aware of
sys.setdefaultencoding() and how to play with it.  But it would be
more accurate, and it works for all encodings that have a unicode
codec in Python, including the iso-2022 variants.
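
To make the difference between the two behaviors concrete, here is a
rough sketch. It is not the SF #962502 patch: the helper names
width_assume_narrow() and width_via_decode() are made up for
illustration, bytes objects stand in for encoded strings, the encoding
is passed explicitly instead of being taken from the system default,
and unicodedata.east_asian_width() is used to classify characters.
Combining characters are counted as one column, which is a
simplification.

```python
import unicodedata


def width_assume_narrow(data: bytes) -> int:
    """Behavior 1 (hypothetical helper): every byte is one narrow column."""
    return len(data)


def width_via_decode(data: bytes, encoding: str) -> int:
    """Behavior 2 (hypothetical helper): decode, then count Wide and
    Fullwidth characters as two columns and everything else as one."""
    text = data.decode(encoding)
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


# Two Hangul syllables in euc-kr: 2 bytes each, 2 columns each,
# so both behaviors agree.
hangul = "\ud55c\uae00".encode("euc_kr")

# One halfwidth katakana letter in euc-jp: 2 bytes but only 1 column,
# so the two behaviors disagree.
halfwidth_ka = "\uff76".encode("euc_jp")

for label, data, enc in [
    ("euc-kr hangul", hangul, "euc_kr"),
    ("euc-jp halfwidth kana", halfwidth_ka, "euc_jp"),
]:
    print(label, width_assume_narrow(data), width_via_decode(data, enc))
```

The euc-jp case is the kind of input where behavior 1 gives the wrong
answer, because the byte count no longer matches the screen width.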

I haven't made up my mind between these two yet.  What do you think?
