Multibyte Character Surport for Python
Martin v. Loewis
martin at v.loewis.de
Thu May 9 04:22:11 EDT 2002
"Stephen J. Turnbull" <stephen at xemacs.org> writes:
> Martin> So in the end, the only acceptable strategy would be to
> Martin> allow identifiers that contain letters (or letterlike
> Martin> symbols) in arbitrary languages. For Python, that would
> Martin> mean that attributes must be Unicode objects, which could
> Martin> cause code breakage.
>
> This would actually be rather simple if you just declare that Python
> programs as submitted to the (internal) parser must be in UTF-8, and
> ensure that PEP 263 codecs do this in a way transparent to the user.
Indeed, parsing it would not be an issue (although it *would* be an
issue to define the set of acceptable identifier characters).
> I see no reason why this would cause code breakage, although I
> haven't tried it yet.
It would break introspective tools who suddenly find Unicode objects
in attribute dictionaries.
Regards,
Martin
More information about the Python-list
mailing list