[Python-Dev] Re: [Python-checkins] python/dist/src/Lib textwrap.py,1.18,1.19
Martin v. Löwis
martin@v.loewis.de
11 Dec 2002 17:53:56 +0100
Greg Ward <gward@python.net> writes:
> My attitude is that textwrap should work on European languages, whether
> they are encoded in 8-bit "ASCII" or Unicode.
Please, don't assume any specific encoding. Why is Latin-1 better than
KOI8-R? The only encoding that is truly better than all others is
ASCII, since virtually all other encodings have ASCII as a subset
(except for the EBCDIC ones, and, with limitations, the ISO-2022
ones).
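To make the subset argument concrete, here is a minimal sketch in present-day Python 3 (not the Python of this thread). The sample bytes and variable names are chosen purely for illustration, and cp500 stands in for "the EBCDIC ones".

```python
# Pure-ASCII bytes decode to the same text under every ASCII-compatible codec.
ascii_bytes = b"textwrap"
decoded = {
    enc: ascii_bytes.decode(enc)
    for enc in ("ascii", "latin-1", "koi8_r", "utf-8")
}
all_agree = len(set(decoded.values())) == 1

# A single byte above 0x7F means different things in different encodings.
high_byte = b"\xe4"
latin1_char = high_byte.decode("latin-1")  # a Latin letter
koi8_char = high_byte.decode("koi8_r")     # a Cyrillic letter

# EBCDIC code pages (cp500 here) are not ASCII supersets at all.
ebcdic_bytes = "textwrap".encode("cp500")
ebcdic_differs = ebcdic_bytes != ascii_bytes
```

Code that only gives special meaning to bytes below 0x80 behaves the same under any of the ASCII-compatible codecs; code that interprets higher bytes has to pick one encoding and is wrong for the others.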
Also, you'll find more and more European languages encoded in UTF-8,
so your support would be useless there and would give wrong results.
[If you meant to suggest no specific processing for disregard
this comment]
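A small sketch of the kind of wrong result I mean, again in present-day Python 3 and with a sample word picked only for illustration: UTF-8 bytes that are treated as if they were Latin-1.

```python
# One non-ASCII letter becomes two bytes in UTF-8.
word = "L\u00f6wis"                      # 5 characters
utf8_bytes = word.encode("utf-8")        # 6 bytes

# Code that assumes Latin-1 sees each byte as one character.
misread = utf8_bytes.decode("latin-1")   # 6 characters, two of them bogus

char_count = len(word)
misread_count = len(misread)
```

Any width calculation done on the misread string is off by one for every such letter, so lines would be wrapped earlier than they should be.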
> I suspect that passing an arbitrary Unicode string to it is
> meaningless -- what the heck does it even mean to wrap a string of
> Chinese or Hebrew or Devanagari characters? Beats me, and I think
> they're out of scope for textwrap.
Actually, the Unicode database has "line-breaking properties". Those
are not yet incorporated into unicodedata, but they could be used to
meaningfully extend the module to Unicode.
> So: do I even need to worry about the cornucopia of Unicode whitespace
> characters at all? Or can I sweep that can of worms under the rug?
> (Pardon the horribly mixed metaphor.)
Sweep away.
Regards,
Martin