[Python-Dev] Re: [Python-checkins] python/dist/src/Include
unicodeobject.h, 2.42, 2.43
Hye-Shik Chang
perky at i18n.org
Wed Jun 2 13:31:49 EDT 2004
On Thu, Jun 03, 2004 at 02:22:50AM +0900, Hye-Shik Chang wrote:
> On Wed, Jun 02, 2004 at 12:01:39PM -0500, Skip Montanaro wrote:
> >
> > perky> - SF #962502: Add two more methods for unicode type; width() and
> > perky> iswide() for east asian width manipulation.
> >
> > Should strings grow these methods as well for symmetry?
> >
>
> I think there'll be two possible behaviors for strings:
>
> 1) regard all characters as non-wide.
> 2) decode the string to unicode with the system default encoding
> and call its methods.
Or,
3) leave them available on the unicode type only, because East
Asian width is a concept that comes from Unicode. POSIX likewise
supports width manipulation (wcswidth, wcwidth) for wide characters
only.
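To illustrate what width manipulation on the unicode type means, here is a minimal sketch. It is not the code from SF #962502. It assumes a modern Python 3 interpreter and the standard unicodedata.east_asian_width() function, and the helper names is_wide() and display_width() are hypothetical stand-ins for the proposed iswide() and width() methods. It also assumes every non-wide character takes one column, which ignores combining and control characters.

```python
import unicodedata


def is_wide(ch):
    """Return True if a single character is East Asian Wide (W) or Fullwidth (F)."""
    return unicodedata.east_asian_width(ch) in ("W", "F")


def display_width(text):
    """Estimate the number of terminal columns needed for text.

    Assumption: wide characters take two columns and everything else
    takes one; combining and control characters are not special-cased.
    """
    return sum(2 if is_wide(ch) else 1 for ch in text)


if __name__ == "__main__":
    print(display_width("abc"))           # three narrow characters
    print(display_width("\ud55c\uae00"))  # two Hangul syllables, both wide
```

Like wcwidth(), this works on characters rather than bytes, which is why the question of what to do for byte strings comes up at all.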
>
> 1) is simple and cheap and can work for non-unicode builds. And it
> even works nicely for most East Asian encodings, too. (The only
> encodings for which len() and screen width differ are euc-jp,
> euc-tw and gb18030. But they aren't such major encodings in real life.)
>
> 2) is somewhat expensive and will not work in many CJK environments
> because a major portion of them aren't aware of sys.setdefaultencoding()
> and how to work with it. But this would be more accurate, and it
> works for all encodings that have a unicode codec in Python, including
> the iso-2022 variants.
>
> I haven't made up my mind between these two yet. What do you think?
>
>
> Hye-Shik
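For comparison, behaviors 1) and 2) for byte strings could be sketched as follows. This is only an illustration under assumptions: it uses Python 3 bytes objects, the function names width_option1() and width_option2() are hypothetical, and the encoding is passed explicitly because sys.setdefaultencoding() does not exist in Python 3.

```python
import unicodedata


def width_option1(data):
    """Behavior 1: regard every byte as one non-wide column."""
    return len(data)


def width_option2(data, encoding):
    """Behavior 2: decode to unicode first, then sum East Asian widths.

    Assumption: the caller names the encoding explicitly, standing in
    for the system default encoding mentioned in the message.
    """
    text = data.decode(encoding)
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


if __name__ == "__main__":
    # euc-kr: each wide Hangul syllable is two bytes, so both behaviors agree.
    hangul = "\ud55c\uae00".encode("euc_kr")
    print(width_option1(hangul), width_option2(hangul, "euc_kr"))

    # euc-jp: a halfwidth katakana is two bytes but one column,
    # so len() and screen width differ.
    kana = "\uff76".encode("euc_jp")
    print(width_option1(kana), width_option2(kana, "euc_jp"))
```

The euc-jp case shows why 1) is only an approximation, while 2) depends on knowing the right codec for the data.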